Commit Graph

8574 Commits

Author SHA1 Message Date
Linus Torvalds
4d6fe79fde New x86 features:
* Guest API and guest kernel support for SEV live migration
 
 * SEV and SEV-ES intra-host migration
 
 Bugfixes and cleanups for x86:
 
 * Fix misuse of gfn-to-pfn cache when recording guest steal time / preempted status
 
 * Fix selftests on APICv machines
 
 * Fix sparse warnings
 
 * Fix detection of KVM features in CPUID
 
 * Cleanups for bogus writes to MSR_KVM_PV_EOI_EN
 
 * Fixes and cleanups for MSR bitmap handling
 
 * Cleanups for INVPCID
 
 * Make x86 KVM_SOFT_MAX_VCPUS consistent with other architectures
 
 Bugfixes for ARM:
 
 * Fix finalization of host stage2 mappings
 
 * Tighten the return value of kvm_vcpu_preferred_target()
 
 * Make sure the extraction of ESR_ELx.EC is limited to architected bits
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGO1usUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOsXgf/d2CgZmTxZii/pOt0JKGSDKwRdoti
 pfbmtXwTkagqg2tIJtEsxwvcc26Q2VLyg9mlEelX1I8KVlexSa8mDtbGWHLFilsc
 JIZY8lE96wtlr9AyuN06K44QingDMIbXzlxcwO+muS3zTlSPsNkPcaVDk5PL35sN
 Wy2GA6GCLWv6iTLCtlX5EzbcgkrR+Mypj4lGSdXGRqfBKVFOQjeVt+Q3YzOPuacS
 03NBlYOmRxTnAGXfeAVs0/zEa69u/dYjBmwDPe6ERma4u8thHv0U/fDAh5C/ES7q
 42Thes/JOYnEZiBQ1xZLsHqRII0achbJKZhmxqPwjRf/u6eXsfz0KY9FBg==
 =kN8U
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
 "New x86 features:

   - Guest API and guest kernel support for SEV live migration

   - SEV and SEV-ES intra-host migration

  Bugfixes and cleanups for x86:

   - Fix misuse of gfn-to-pfn cache when recording guest steal time /
     preempted status

   - Fix selftests on APICv machines

   - Fix sparse warnings

   - Fix detection of KVM features in CPUID

   - Cleanups for bogus writes to MSR_KVM_PV_EOI_EN

   - Fixes and cleanups for MSR bitmap handling

   - Cleanups for INVPCID

   - Make x86 KVM_SOFT_MAX_VCPUS consistent with other architectures

  Bugfixes for ARM:

   - Fix finalization of host stage2 mappings

   - Tighten the return value of kvm_vcpu_preferred_target()

   - Make sure the extraction of ESR_ELx.EC is limited to architected
     bits"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (34 commits)
  KVM: SEV: unify cgroup cleanup code for svm_vm_migrate_from
  KVM: x86: move guest_pv_has out of user_access section
  KVM: x86: Drop arbitrary KVM_SOFT_MAX_VCPUS
  KVM: Move INVPCID type check from vmx and svm to the common kvm_handle_invpcid()
  KVM: VMX: Add a helper function to retrieve the GPR index for INVPCID, INVVPID, and INVEPT
  KVM: nVMX: Clean up x2APIC MSR handling for L2
  KVM: VMX: Macrofy the MSR bitmap getters and setters
  KVM: nVMX: Handle dynamic MSR intercept toggling
  KVM: nVMX: Query current VMCS when determining if MSR bitmaps are in use
  KVM: x86: Don't update vcpu->arch.pv_eoi.msr_val when a bogus value was written to MSR_KVM_PV_EOI_EN
  KVM: x86: Rename kvm_lapic_enable_pv_eoi()
  KVM: x86: Make sure KVM_CPUID_FEATURES really are KVM_CPUID_FEATURES
  KVM: x86: Add helper to consolidate core logic of SET_CPUID{2} flows
  kvm: mmu: Use fast PF path for access tracking of huge pages when possible
  KVM: x86/mmu: Properly dereference rcu-protected TDP MMU sptep iterator
  KVM: x86: inhibit APICv when KVM_GUESTDBG_BLOCKIRQ active
  kvm: x86: Convert return type of *is_valid_rdpmc_ecx() to bool
  KVM: x86: Fix recording of guest steal time / preempted status
  selftest: KVM: Add intra host migration tests
  selftest: KVM: Add open sev dev helper
  ...
2021-11-13 10:01:10 -08:00
Jakub Kicinski
0cda7d4bac selftests: net: switch to socat in the GSO GRE test
Commit a985442fde ("selftests: net: properly support IPv6 in GSO GRE test")
is not compatible with:

  Ncat: Version 7.80 ( https://nmap.org/ncat )

(which is distributed with Fedora/Red Hat), tests fail with:

  nc: invalid option -- 'N'

Let's switch to socat which is far more dependable.

Fixes: 025efa0a82 ("selftests: add simple GSO GRE test")
Fixes: a985442fde ("selftests: net: properly support IPv6 in GSO GRE test")
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Link: https://lore.kernel.org/r/20211111162929.530470-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-12 19:59:01 -08:00
Jean-Philippe Brucker
e4ac80ef81 tools/runqslower: Fix cross-build
Commit be79505caf ("tools/runqslower: Install libbpf headers when
building") uses the target libbpf to build the host bpftool, which
doesn't work when cross-building:

  make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -C tools/bpf/runqslower O=/tmp/runqslower
  ...
    LINK    /tmp/runqslower/bpftool/bpftool
  /usr/bin/ld: /tmp/runqslower/libbpf/libbpf.a(libbpf-in.o): Relocations in generic ELF (EM: 183)
  /usr/bin/ld: /tmp/runqslower/libbpf/libbpf.a: error adding symbols: file in wrong format
  collect2: error: ld returned 1 exit status

When cross-building, the target architecture differs from the host. The
bpftool used for building runqslower is executed on the host, and thus
must use a different libbpf than that used for runqslower itself.
Remove the LIBBPF_OUTPUT and LIBBPF_DESTDIR parameters, so the bpftool
build makes its own library if necessary.

In the selftests, pass the host bpftool, already a prerequisite for the
runqslower recipe, as BPFTOOL_OUTPUT. The runqslower Makefile will use
the bpftool that's already built for selftests instead of making a new
one.

Fixes: be79505caf ("tools/runqslower: Install libbpf headers when building")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211112155128.565680-1-jean-philippe@linaro.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-11-12 17:23:16 -08:00
Lorenz Bauer
6af2e12374 selftests/bpf: Check map in map pruning
Ensure that two registers with a map_value loaded from a nested
map are considered equivalent for the purpose of state pruning
and don't cause the verifier to revisit a pruning point.

This uses a rather crude match on the number of insns visited by
the verifier, which might change in the future. I've therefore
tried to keep the code as "unpruneable" as possible by having
the code paths only converge on the second to last instruction.

Should you require to adjust the test in the future, reducing the
number of processed instructions should always be safe. Increasing
them could cause another regression, so proceed with caution.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/CACAyw99hVEJFoiBH_ZGyy=+oO-jyydoz6v1DeKPKs2HVsUH28w@mail.gmail.com
Link: https://lore.kernel.org/bpf/20211111161452.86864-1-lmb@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-11-12 17:23:04 -08:00
Yonghong Song
325d956d67 selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning
When using clang to build selftests with LLVM=1 in make commandline,
I hit the following compiler warning:

  benchs/bench_bloom_filter_map.c:84:46: warning: result of comparison of constant 256
    with expression of type '__u8' (aka 'unsigned char') is always false
    [-Wtautological-constant-out-of-range-compare]
                if (args.value_size < 2 || args.value_size > 256) {
                                           ~~~~~~~~~~~~~~~ ^ ~~~

The reason is arg.vaue_size has type __u8, so comparison "args.value_size > 256"
is always false.

This patch fixed the issue by doing proper comparison before assigning the
value to args.value_size. The patch also fixed the same issue in two
other places.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112204838.3579953-1-yhs@fb.com
2021-11-12 14:11:46 -08:00
Yonghong Song
21c6ec3d52 selftests/bpf: Fix an unused-but-set-variable compiler warning
When using clang to build selftests with LLVM=1 in make commandline,
I hit the following compiler warning:
  xdpxceiver.c:747:6: warning: variable 'total' set but not used [-Wunused-but-set-variable]
          u32 total = 0;
              ^

This patch fixed the issue by removing that declaration and its
assocatied unused operation.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112204833.3579457-1-yhs@fb.com
2021-11-12 14:11:46 -08:00
Yonghong Song
3f1d0dc0ba selftests/bpf: Clarify llvm dependency with btf_tag selftest
btf_tag selftest needs certain llvm versions (>= llvm14).
Make it clear in the selftests README.rst file.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012651.1508549-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
5698a42a73 selftests/bpf: Add a C test for btf_type_tag
The following is the main btf_type_tag usage in the
C test:
  #define __tag1 __attribute__((btf_type_tag("tag1")))
  #define __tag2 __attribute__((btf_type_tag("tag2")))
  struct btf_type_tag_test {
       int __tag1 * __tag1 __tag2 *p;
  } g;

The bpftool raw dump with related types:
  [4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
  [11] STRUCT 'btf_type_tag_test' size=8 vlen=1
          'p' type_id=14 bits_offset=0
  [12] TYPE_TAG 'tag1' type_id=16
  [13] TYPE_TAG 'tag2' type_id=12
  [14] PTR '(anon)' type_id=13
  [15] TYPE_TAG 'tag1' type_id=4
  [16] PTR '(anon)' type_id=15
  [17] VAR 'g' type_id=11, linkage=global

With format C dump, we have
  struct btf_type_tag_test {
        int __attribute__((btf_type_tag("tag1"))) * __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag2"))) *p;
  };
The result C code is identical to the original definition except macro's are gone.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012646.1508231-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
26c79fcbfa selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c
Rename progs/tag.c to progs/btf_decl_tag.c so we can introduce
progs/btf_type_tag.c in the next patch.

Also create a subtest for btf_decl_tag in prog_tests/btf_tag.c
so we can introduce btf_type_tag subtest in the next patch.

I also took opportunity to remove the check whether __has_attribute
is defined or not in progs/btf_decl_tag.c since all recent
clangs should already support this macro.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012641.1507144-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
846f4826d1 selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication
Add BTF_KIND_TYPE_TAG duplication unit tests.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012635.1506853-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
6aa5dabc9d selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests
Add BTF_KIND_TYPE_TAG unit tests.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012630.1506095-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Yonghong Song
0dc8587220 selftests/bpf: Test libbpf API function btf__add_type_tag()
Add unit tests for btf__add_type_tag().

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012625.1505748-1-yhs@fb.com
2021-11-11 17:41:11 -08:00
Andrii Nakryiko
60ba87bb6b selftests/bpf: Update btf_dump__new() uses to v1.0+ variant
Update to-be-deprecated forms of btf_dump__new().

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-8-andrii@kernel.org
2021-11-11 16:54:06 -08:00
Andrii Nakryiko
0b52a5f4b9 selftests/bpf: Migrate all deprecated perf_buffer uses
Migrate all old-style perf_buffer__new() and perf_buffer__new_raw()
calls to new v1.0+ variants.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-7-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
957d350a8b libbpf: Turn btf_dedup_opts into OPTS-based struct
btf__dedup() and struct btf_dedup_opts were added before we figured out
OPTS mechanism. As such, btf_dedup_opts is non-extensible without
breaking an ABI and potentially crashing user application.

Unfortunately, btf__dedup() and btf_dedup_opts are short and succinct
names that would be great to preserve and use going forward. So we use
___libbpf_override() macro approach, used previously for bpf_prog_load()
API, to define a new btf__dedup() variant that accepts only struct btf *
and struct btf_dedup_opts * arguments, and rename the old btf__dedup()
implementation into btf__dedup_deprecated(). This keeps both source and
binary compatibility with old and new applications.

The biggest problem was struct btf_dedup_opts, which wasn't OPTS-based,
and as such doesn't have `size_t sz;` as a first field. But btf__dedup()
is a pretty rarely used API and I believe that the only currently known
users (besides selftests) are libbpf's own bpf_linker and pahole.
Neither use case actually uses options and just passes NULL. So instead
of doing extra hacks, just rewrite struct btf_dedup_opts into OPTS-based
one, move btf_ext argument into those opts (only bpf_linker needs to
dedup btf_ext, so it's not a typical thing to specify), and drop never
used `dont_resolve_fwds` option (it was never used anywhere, AFAIK, it
makes BTF dedup much less useful and efficient).

Just in case, for old implementation, btf__dedup_deprecated(), detect
non-NULL options and error out with helpful message, to help users
migrate, if there are any user playing with btf__dedup().

The last remaining piece is dedup_table_size, which is another
anachronism from very early days of BTF dedup. Since then it has been
reduced to the only valid value, 1, to request forced hash collisions.
This is only used during testing. So instead introduce a bool flag to
force collisions explicitly.

This patch also adapts selftests to new btf__dedup() and btf_dedup_opts
use to avoid selftests breakage.

  [0] Closes: https://github.com/libbpf/libbpf/issues/281

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-4-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
de29e6bbb9 selftests/bpf: Minor cleanups and normalization of Makefile
Few clean ups and single-line simplifications. Also split CLEAN command
into multiple $(RM) invocations as it gets dangerously close to too long
argument list. Make sure that -o <output.o> is used always as the last
argument for saner verbose make output.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-3-andrii@kernel.org
2021-11-11 16:54:05 -08:00
Andrii Nakryiko
50dee7078b selftests/bpf: Fix bpf_prog_test_load() logic to pass extra log level
After recent refactoring bpf_prog_test_load(), used across multiple
selftests, lost ability to specify extra log_level 1 or 2 (for -vv and
-vvv, respectively). Fix that problem by using bpf_object__load_xattr()
API that supports extra log_level flags. Also restore
BPF_F_TEST_RND_HI32 prog_flags by utilizing new bpf_program__set_extra_flags()
API.

Fixes: f87c1930ac ("selftests/bpf: Merge test_stub.c into testing_helpers.c")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111051758.92283-3-andrii@kernel.org
2021-11-11 16:44:26 -08:00
Linus Torvalds
f54ca91fe6 Networking fixes for 5.16-rc1, including fixes from bpf, can
and netfilter.
 
 Current release - regressions:
 
  - bpf: do not reject when the stack read size is different
    from the tracked scalar size
 
  - net: fix premature exit from NAPI state polling in napi_disable()
 
  - riscv, bpf: fix RV32 broken build, and silence RV64 warning
 
 Current release - new code bugs:
 
  - net: fix possible NULL deref in sock_reserve_memory
 
  - amt: fix error return code in amt_init(); fix stopping the workqueue
 
  - ax88796c: use the correct ioctl callback
 
 Previous releases - always broken:
 
  - bpf: stop caching subprog index in the bpf_pseudo_func insn
 
  - security: fixups for the security hooks in sctp
 
  - nfc: add necessary privilege flags in netlink layer, limit operations
    to admin only
 
  - vsock: prevent unnecessary refcnt inc for non-blocking connect
 
  - net/smc: fix sk_refcnt underflow on link down and fallback
 
  - nfnetlink_queue: fix OOB when mac header was cleared
 
  - can: j1939: ignore invalid messages per standard
 
  - bpf, sockmap:
    - fix race in ingress receive verdict with redirect to self
    - fix incorrect sk_skb data_end access when src_reg = dst_reg
    - strparser, and tls are reusing qdisc_skb_cb and colliding
 
  - ethtool: fix ethtool msg len calculation for pause stats
 
  - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries
    to access an unregistering real_dev
 
  - udp6: make encap_rcv() bump the v6 not v4 stats
 
  - drv: prestera: add explicit padding to fix m68k build
 
  - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
 
  - drv: mvpp2: fix wrong SerDes reconfiguration order
 
 Misc & small latecomers:
 
  - ipvs: auto-load ipvs on genl access
 
  - mctp: sanity check the struct sockaddr_mctp padding fields
 
  - libfs: support RENAME_EXCHANGE in simple_rename()
 
  - avoid double accounting for pure zerocopy skbs
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGNQdwACgkQMUZtbf5S
 IrsiMQ//f66lTJ8PJ5Qj70hX9dC897olx7uGHB9eiKoyOcJI459hFlfXwRU2T4Tf
 fPNwPNUQ9Mynw9tX/jWEi+7zd6r6TSHGXK49U9/rIbQ95QjKY4LHowIE63x+vPl2
 5Cpf+80zXC3DUX1fijgyG1ujnU3kBaqopTxDLmlsHw2PGkwT5Ox1DUwkhc370eEL
 xlpq3PYGWA8/AQNyhSVBkG/UmoLaq0jYNP5yVcOj4jGjgcgLe1SLrqczENr35QHZ
 cRkuBsFBMBZF7wSX2f9qQIB/+b1pcLlD9IO+K3S7Ruq+rUd7qfL/tmwNxEh0axYK
 AyIun1Bxcy7QJGjtpGAz+Ku7jS9T3HxzyxhqilQo3co8jAW0WJ1YwHl+XPgQXyjV
 DLG6Vxt4syiwsoSXGn8MQugs4nlBT+0qWl8YamIR+o7KkAYPc2QWkXlzEDfNeIW8
 JNCZA3sy7VGi1ytorZGx16sQsEWnyRG9a6/WV20Dr+HVs1SKPcFzIfG6mVngR07T
 mQMHnbAF6Z5d8VTcPQfMxd7UH48s1bHtk5lcSTa3j0Cw+GkA6ytTmjPdJ1qRcdkH
 dl9jAfADe4O6frG+9XH7FEFqhmkghVI7bOCA4ZOhClVaIcDGgEZc2y7sY9/oZ7P4
 KXBD2R5X1caCUM0UtzwL7/8ddOtPtHIrFnhY+7+I6ijt9qmI0BY=
 =Ttgq
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, can and netfilter.

  Current release - regressions:

   - bpf: do not reject when the stack read size is different from the
     tracked scalar size

   - net: fix premature exit from NAPI state polling in napi_disable()

   - riscv, bpf: fix RV32 broken build, and silence RV64 warning

  Current release - new code bugs:

   - net: fix possible NULL deref in sock_reserve_memory

   - amt: fix error return code in amt_init(); fix stopping the
     workqueue

   - ax88796c: use the correct ioctl callback

  Previous releases - always broken:

   - bpf: stop caching subprog index in the bpf_pseudo_func insn

   - security: fixups for the security hooks in sctp

   - nfc: add necessary privilege flags in netlink layer, limit
     operations to admin only

   - vsock: prevent unnecessary refcnt inc for non-blocking connect

   - net/smc: fix sk_refcnt underflow on link down and fallback

   - nfnetlink_queue: fix OOB when mac header was cleared

   - can: j1939: ignore invalid messages per standard

   - bpf, sockmap:
      - fix race in ingress receive verdict with redirect to self
      - fix incorrect sk_skb data_end access when src_reg = dst_reg
      - strparser, and tls are reusing qdisc_skb_cb and colliding

   - ethtool: fix ethtool msg len calculation for pause stats

   - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to
     access an unregistering real_dev

   - udp6: make encap_rcv() bump the v6 not v4 stats

   - drv: prestera: add explicit padding to fix m68k build

   - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge

   - drv: mvpp2: fix wrong SerDes reconfiguration order

  Misc & small latecomers:

   - ipvs: auto-load ipvs on genl access

   - mctp: sanity check the struct sockaddr_mctp padding fields

   - libfs: support RENAME_EXCHANGE in simple_rename()

   - avoid double accounting for pure zerocopy skbs"

* tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (123 commits)
  selftests/net: udpgso_bench_rx: fix port argument
  net: wwan: iosm: fix compilation warning
  cxgb4: fix eeprom len when diagnostics not implemented
  net: fix premature exit from NAPI state polling in napi_disable()
  net/smc: fix sk_refcnt underflow on linkdown and fallback
  net/mlx5: Lag, fix a potential Oops with mlx5_lag_create_definer()
  gve: fix unmatched u64_stats_update_end()
  net: ethernet: lantiq_etop: Fix compilation error
  selftests: forwarding: Fix packet matching in mirroring selftests
  vsock: prevent unnecessary refcnt inc for nonblocking connect
  net: marvell: mvpp2: Fix wrong SerDes reconfiguration order
  net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory
  net: stmmac: allow a tc-taprio base-time of zero
  selftests: net: test_vxlan_under_vrf: fix HV connectivity test
  net: hns3: allow configure ETS bandwidth of all TCs
  net: hns3: remove check VF uc mac exist when set by PF
  net: hns3: fix some mac statistics is always 0 in device version V2
  net: hns3: fix kernel crash when unload VF while it is being reset
  net: hns3: sync rx ring head in echo common pull
  net: hns3: fix pfc packet number incorrect after querying pfc parameters
  ...
2021-11-11 09:49:36 -08:00
Paolo Bonzini
1f05833193 Merge branch 'kvm-sev-move-context' into kvm-master
Add support for AMD SEV and SEV-ES intra-host migration support.  Intra
host migration provides a low-cost mechanism for userspace VMM upgrades.

In the common case for intra host migration, we can rely on the normal
ioctls for passing data from one VMM to the next. SEV, SEV-ES, and other
confidential compute environments make most of this information opaque, and
render KVM ioctls such as "KVM_GET_REGS" irrelevant.  As a result, we need
the ability to pass this opaque metadata from one VMM to the next. The
easiest way to do this is to leave this data in the kernel, and transfer
ownership of the metadata from one KVM VM (or vCPU) to the next.  In-kernel
hand off makes it possible to move any data that would be
unsafe/impossible for the kernel to hand directly to userspace, and
cannot be reproduced using data that can be handed to userspace.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11 11:02:58 -05:00
Peter Gonda
6a58150859 selftest: KVM: Add intra host migration tests
Adds testcases for intra host migration for SEV and SEV-ES. Also adds
locking test to confirm no deadlock exists.

Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-6-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11 10:36:17 -05:00
Peter Gonda
7a6ab3cf39 selftest: KVM: Add open sev dev helper
Refactors out open path support from open_kvm_dev_path_or_exit() and
adds new helper for SEV device path.

Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-5-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11 10:35:27 -05:00
Willem de Bruijn
d336509cb9 selftests/net: udpgso_bench_rx: fix port argument
The below commit added optional support for passing a bind address.
It configures the sockaddr bind arguments before parsing options and
reconfigures on options -b and -4.

This broke support for passing port (-p) on its own.

Configure sockaddr after parsing all arguments.

Fixes: 3327a9c463 ("selftests: add functionals test for UDP GRO")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-11 12:24:26 +00:00
Mark Pashmfouroush
8b4fd2bf1f selftests/bpf: Add tests for accessing ingress_ifindex in bpf_sk_lookup
A new field was added to the bpf_sk_lookup data that users can access.
Add tests that validate that the new ingress_ifindex field contains the
right data.

Signed-off-by: Mark Pashmfouroush <markpash@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211110111016.5670-3-markpash@cloudflare.com
2021-11-10 16:29:59 -08:00
Petr Machata
af0a51113c selftests: forwarding: Fix packet matching in mirroring selftests
In commit 6de6e46d27 ("cls_flower: Fix inability to match GRE/IPIP
packets"), cls_flower was fixed to match an outer packet of a tunneled
packet as would be expected, rather than dissecting to the inner packet and
matching on that.

This fix uncovered several issues in packet matching in mirroring
selftests:

- in mirror_gre_bridge_1d_vlan.sh and mirror_gre_vlan_bridge_1q.sh, the
  vlan_ethtype match is copied around as "ip", even as some of the tests
  are running over ip6gretap. This is fixed by using an "ipv6" for
  vlan_ethtype in the ip6gretap tests.

- in mirror_gre_changes.sh, a filter to count GRE packets is set up to
  match TTL of 50. This used to trigger in the offloaded datapath, where
  the envelope TTL was matched, but not in the software datapath, which
  considered TTL of the inner packet. Now that both match consistently, all
  the packets were double-counted. This is fixed by marking the filter as
  skip_hw, leaving only the SW datapath component active.

Fixes: 6de6e46d27 ("cls_flower: Fix inability to match GRE/IPIP packets")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-10 14:38:44 +00:00
Andrea Righi
e7e4785fa3 selftests: net: test_vxlan_under_vrf: fix HV connectivity test
It looks like test_vxlan_under_vrf.sh is always failing to verify the
connectivity test during the ping between the two simulated VMs.

This is due to the fact that veth-hv in each VM should have a distinct
MAC address.

Fix by setting a unique MAC address on each simulated VM interface.

Without this fix:

 $ sudo ./tools/testing/selftests/net/test_vxlan_under_vrf.sh
 Checking HV connectivity                                           [ OK ]
 Check VM connectivity through VXLAN (underlay in the default VRF)  [FAIL]

With this fix applied:

 $ sudo ./tools/testing/selftests/net/test_vxlan_under_vrf.sh
 Checking HV connectivity                                           [ OK ]
 Check VM connectivity through VXLAN (underlay in the default VRF)  [ OK ]
 Check VM connectivity through VXLAN (underlay in a VRF)            [FAIL]

NOTE: the connectivity test with the underlay VRF is still failing; it
seems that ARP requests are blocked at the simulated hypervisor level,
probably due to some missing ARP forwarding rules. This requires more
investigation (in the meantime we may consider to set that test as
expected failure - XFAIL).

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-10 14:31:02 +00:00
Jakub Kicinski
fceb07950a Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2021-11-09

We've added 7 non-merge commits during the last 3 day(s) which contain
a total of 10 files changed, 174 insertions(+), 48 deletions(-).

The main changes are:

1) Various sockmap fixes, from John and Jussi.

2) Fix out-of-bound issue with bpf_pseudo_func, from Martin.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg
  bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding
  bpf, sockmap: Fix race in ingress receive verdict with redirect to self
  bpf, sockmap: Remove unhash handler for BPF sockmap usage
  bpf, sockmap: Use stricter sk state checks in sk_lookup_assign
  bpf: selftest: Trigger a DCE on the whole subprog
  bpf: Stop caching subprog index in the bpf_pseudo_func insn
====================

Link: https://lore.kernel.org/r/20211109215702.38350-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-09 15:44:48 -08:00
Linus Torvalds
59a2ceeef6 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "87 patches.

  Subsystems affected by this patch series: mm (pagecache and hugetlb),
  procfs, misc, MAINTAINERS, lib, checkpatch, binfmt, kallsyms, ramfs,
  init, codafs, nilfs2, hfs, crash_dump, signals, seq_file, fork,
  sysvfs, kcov, gdb, resource, selftests, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (87 commits)
  ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL
  ipc: check checkpoint_restore_ns_capable() to modify C/R proc files
  selftests/kselftest/runner/run_one(): allow running non-executable files
  virtio-mem: disallow mapping virtio-mem memory via /dev/mem
  kernel/resource: disallow access to exclusive system RAM regions
  kernel/resource: clean up and optimize iomem_is_exclusive()
  scripts/gdb: handle split debug for vmlinux
  kcov: replace local_irq_save() with a local_lock_t
  kcov: avoid enable+disable interrupts if !in_task()
  kcov: allocate per-CPU memory on the relevant node
  Documentation/kcov: define `ip' in the example
  Documentation/kcov: include types.h in the example
  sysv: use BUILD_BUG_ON instead of runtime check
  kernel/fork.c: unshare(): use swap() to make code cleaner
  seq_file: fix passing wrong private data
  seq_file: move seq_escape() to a header
  signal: remove duplicate include in signal.h
  crash_dump: remove duplicate include in crash_dump.h
  crash_dump: fix boolreturn.cocci warning
  hfs/hfsplus: use WARN_ON for sanity check
  ...
2021-11-09 10:11:53 -08:00
SeongJae Park
303f8e2d02 selftests/kselftest/runner/run_one(): allow running non-executable files
When running a test program, 'run_one()' checks if the program has the
execution permission and fails if it doesn't.  However, it's easy to
mistakenly lose the permissions, as some common tools like 'diff' don't
support the permission change well[1].  Compared to that, making mistakes
in the test program's path would only rare, as those are explicitly listed
in 'TEST_PROGS'.  Therefore, it might make more sense to resolve the
situation on our own and run the program.

For this reason, this commit makes the test program runner function still
print the warning message but to try parsing the interpreter of the
program and to explicitly run it with the interpreter, in this case.

[1] https://lore.kernel.org/mm-commits/YRJisBs9AunccCD4@kroah.com/

Link: https://lkml.kernel.org/r/20210810164534.25902-1-sj38.park@gmail.com
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:53 -08:00
Florian Weimer
0658a0961b procfs: do not list TID 0 in /proc/<pid>/task
If a task exits concurrently, task_pid_nr_ns may return 0.

[akpm@linux-foundation.org: coding style tweaks]
[adobriyan@gmail.com: test that /proc/*/task doesn't contain "0"]
  Link: https://lkml.kernel.org/r/YV88AnVzHxPafQ9o@localhost.localdomain

Link: https://lkml.kernel.org/r/8735pn5dx7.fsf@oldenburg.str.redhat.com
Signed-off-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:48 -08:00
Alan Maguire
c23551c9c3 selftests/bpf: Add exception handling selftests for tp_bpf program
Exception handling is triggered in BPF tracing programs when a NULL pointer
is dereferenced; the exception handler zeroes the target register and
execution of the BPF program progresses.

To test exception handling then, we need to trigger a NULL pointer dereference
for a field which should never be zero; if it is, the only explanation is the
exception handler ran. task->task_works is the NULL pointer chosen (for a new
task from fork() no work is associated), and the task_works->func field should
not be zero if task_works is non-NULL. The test verifies that task_works and
task_works->func are 0.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1636131046-5982-3-git-send-email-alan.maguire@oracle.com
2021-11-08 22:17:55 +01:00
Linus Torvalds
dd72945c43 cxl for v5.16
- Fix support for platforms that do not enumerate every ACPI0016 (CXL
   Host Bridge) in the CHBS (ACPI Host Bridge Structure).
 
 - Introduce a common pci_find_dvsec_capability() helper, clean up open
   coded implementations in various drivers.
 
 - Add 'cxl_test' for regression testing CXL subsystem ABIs. 'cxl_test'
   is a module built from tools/testing/cxl/ that mocks up a CXL topology
   to augment the nascent support for emulation of CXL devices in QEMU.
 
 - Convert libnvdimm to use the uuid API.
 
 - Complete the definition of CXL namespace labels in libnvdimm.
 
 - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
   mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.
 
 - Continue to sort and refactor functionality into distinct driver and
   core-infrastructure buckets. For example, mailbox handling is now a
   generic core capability consumed by the PCI and cxl_test drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYYRtyQAKCRDfioYZHlFs
 Z7UsAP9DzUN6IWWnYk1R95YXYVxFriRtRsBjujAqTg49EMghawEAoHaA9lxO3Hho
 l25TLYUOmB/zFTlUbe6YQptMJZ5YLwY=
 =im9j
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "More preparation and plumbing work in the CXL subsystem.

  From an end user perspective the highlight here is lighting up the CXL
  Persistent Memory related commands (label read / write) with the
  generic ioctl() front-end in LIBNVDIMM.

  Otherwise, the ability to instantiate new persistent and volatile
  memory regions is still on track for v5.17.

  Summary:

   - Fix support for platforms that do not enumerate every ACPI0016 (CXL
     Host Bridge) in the CHBS (ACPI Host Bridge Structure).

   - Introduce a common pci_find_dvsec_capability() helper, clean up
     open coded implementations in various drivers.

   - Add 'cxl_test' for regression testing CXL subsystem ABIs.
     'cxl_test' is a module built from tools/testing/cxl/ that mocks up
     a CXL topology to augment the nascent support for emulation of CXL
     devices in QEMU.

   - Convert libnvdimm to use the uuid API.

   - Complete the definition of CXL namespace labels in libnvdimm.

   - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
     mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.

   - Continue to sort and refactor functionality into distinct driver
     and core-infrastructure buckets. For example, mailbox handling is
     now a generic core capability consumed by the PCI and cxl_test
     drivers"

* tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (34 commits)
  ocxl: Use pci core's DVSEC functionality
  cxl/pci: Use pci core's DVSEC functionality
  PCI: Add pci_find_dvsec_capability to find designated VSEC
  cxl/pci: Split cxl_pci_setup_regs()
  cxl/pci: Add @base to cxl_register_map
  cxl/pci: Make more use of cxl_register_map
  cxl/pci: Remove pci request/release regions
  cxl/pci: Fix NULL vs ERR_PTR confusion
  cxl/pci: Remove dev_dbg for unknown register blocks
  cxl/pci: Convert register block identifiers to an enum
  cxl/acpi: Do not fail cxl_acpi_probe() based on a missing CHBS
  cxl/pci: Disambiguate cxl_pci further from cxl_mem
  Documentation/cxl: Add bus internal docs
  cxl/core: Split decoder setup into alloc + add
  tools/testing/cxl: Introduce a mock memory device + driver
  cxl/mbox: Move command definitions to common location
  cxl/bus: Populate the target list at decoder create
  tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  cxl/pmem: Add support for multiple nvdimm-bridge objects
  cxl/pmem: Translate NVDIMM label commands to CXL label commands
  ...
2021-11-08 11:49:48 -08:00
Phil Sutter
85c0c8b342 selftests: nft_nat: Simplify port shadow notrack test
The second rule in prerouting chain was probably a leftover: The router
listens on veth0, so not tracking connections via that interface is
sufficient. Likewise, the rule in output chain can be limited to that
interface as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-08 11:26:57 +01:00
Phil Sutter
e1f8bc06e4 selftests: nft_nat: Improve port shadow test stability
Setup phase in test_port_shadow() relied upon a race-condition:
Listening nc on port 1405 was started in background before attempting to
create the fake conntrack entry using the same source port. If listening
nc won, fake conntrack entry could not be created causing wrong
behaviour. Reorder nc calls to fix this and introduce a short delay
before testing the setup to wait for listening nc process startup.

Fixes: 465f15a6d1 ("selftests: nft_nat: add udp hole punch test case")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-08 11:26:57 +01:00
Florian Westphal
228c3fa054 selftests: netfilter: extend nfqueue tests to cover vrf device
VRF device calls the output/postrouting hooks so packet should be seeon
with oifname tvrf and once with eth0.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-08 11:24:01 +01:00
Florian Westphal
33b8aad21a selftests: netfilter: add a vrf+conntrack testcase
Rework the reproducer for the vrf+conntrack regression reported
by Eugene into a selftest and also add a test for ip masquerading
that Lahav fixed recently.

With net or net-next tree, the first test fails and the latter
two pass.

With 09e856d54b ("vrf: Reset skb conntrack connection on VRF rcv")
reverted first test passes but the last two fail.

A proper fix needs more work, for time being a revert seems to be
the best choice, snat/masquerade did not work before the fix.

Link: https://lore.kernel.org/netdev/378ca299-4474-7e9a-3d36-2350c8c98995@gmail.com/T/#m95358a31810df7392f541f99d187227bc75c9963
Reported-by: Eugene Crosser <crosser@average.org>
Cc: Lahav Schlesinger <lschlesinger@drivenets.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-08 11:24:01 +01:00
Song Liu
f108662b27 selftests/bpf: Add tests for bpf_find_vma
Add tests for bpf_find_vma in perf_event program and kprobe program. The
perf_event program is triggered from NMI context, so the second call of
bpf_find_vma() will return -EBUSY (irq_work busy). The kprobe program,
on the other hand, does not have this constraint.

Also add tests for illegal writes to task or vma from the callback
function. The verifier should reject both cases.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211105232330.1936330-3-songliubraving@fb.com
2021-11-07 11:54:51 -08:00
Anders Roxell
62b12ab5df selftests: net: tls: remove unused variable and code
When building selftests/net with clang, the compiler warn about the
function abs() see below:

tls.c:657:15: warning: variable 'len_compared' set but not used [-Wunused-but-set-variable]
        unsigned int len_compared = 0;
                     ^

Rework to remove the unused variable and the for-loop where the variable
'len_compared' was assinged.

Fixes: 7f657d5bf5 ("selftests: tls: add selftests for TLS sockets")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-07 19:35:08 +00:00
Andrii Nakryiko
8c7a955201 selftests/bpf: Fix bpf_object leak in skb_ctx selftest
skb_ctx selftest didn't close bpf_object implicitly allocated by
bpf_prog_test_load() helper. Fix the problem by explicitly calling
bpf_object__close() at the end of the test.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-10-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
f91231eeee selftests/bpf: Destroy XDP link correctly
bpf_link__detach() was confused with bpf_link__destroy() and leaves
leaked FD in the process. Fix the problem.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-9-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
f92321d706 selftests/bpf: Avoid duplicate btf__parse() call
btf__parse() is repeated after successful setup, leaving the first
instance leaked. Remove redundant and premature call.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-8-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
f79587520a selftests/bpf: Clean up btf and btf_dump in dump_datasec test
Free up used resources at the end and on error. Also make it more
obvious that there is btf__parse() call that creates struct btf
instance.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-7-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
5309b516bc selftests/bpf: Free inner strings index in btf selftest
Inner array of allocated strings wasn't freed on success. Now it's
always freed.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-6-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
b8b26e585f selftests/bpf: Free per-cpu values array in bpf_iter selftest
Array holding per-cpu values wasn't freed. Fix that.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211107165521.9240-5-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
8ba2858749 selftests/bpf: Fix memory leaks in btf_type_c_dump() helper
Free up memory and resources used by temporary allocated memstream and
btf_dump instance.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-4-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
2a2cb45b72 selftests/bpf: Pass sanitizer flags to linker through LDFLAGS
When adding -fsanitize=address to SAN_CFLAGS, it has to be passed both
to compiler through CFLAGS as well as linker through LDFLAGS. Add
SAN_CFLAGS into LDFLAGS to allow building selftests with ASAN.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-2-andrii@kernel.org
2021-11-07 09:14:15 -08:00
Andrii Nakryiko
f19ddfe036 selftests/bpf: Use explicit bpf_test_load_program() helper calls
Remove the second part of prog loading testing helper re-definition:

  -Dbpf_load_program=bpf_test_load_program

This completes the clean up of deprecated libbpf program loading APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-13-andrii@kernel.org
2021-11-07 08:34:24 -08:00
Andrii Nakryiko
cbdb1461dc selftests/bpf: Use explicit bpf_prog_test_load() calls everywhere
-Dbpf_prog_load_deprecated=bpf_prog_test_load trick is both ugly and
breaks when deprecation goes into effect due to macro magic. Convert all
the uses to explicit bpf_prog_test_load() calls which avoid deprecation
errors and makes everything less magical.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-12-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
f87c1930ac selftests/bpf: Merge test_stub.c into testing_helpers.c
Move testing prog and object load wrappers (bpf_prog_test_load and
bpf_test_load_program) into testing_helpers.{c,h} and get rid of
otherwise useless test_stub.c. Make testing_helpers.c available to
non-test_progs binaries as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-11-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
d8e86407e5 selftests/bpf: Convert legacy prog load APIs to bpf_prog_load()
Convert all the uses of legacy low-level BPF program loading APIs
(mostly bpf_load_program_xattr(), but also some bpf_verify_program()) to
bpf_prog_load() uses.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-10-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
3d1d62397f selftests/bpf: Fix non-strict SEC() program sections
Fix few more SEC() definitions that were previously missed.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-9-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Andrii Nakryiko
d10ef2b825 libbpf: Unify low-level BPF_PROG_LOAD APIs into bpf_prog_load()
Add a new unified OPTS-based low-level API for program loading,
bpf_prog_load() ([0]).  bpf_prog_load() accepts few "mandatory"
parameters as input arguments (program type, name, license,
instructions) and all the other optional (as in not required to specify
for all types of BPF programs) fields into struct bpf_prog_load_opts.

This makes all the other non-extensible APIs variant for BPF_PROG_LOAD
obsolete and they are slated for deprecation in libbpf v0.7:
  - bpf_load_program();
  - bpf_load_program_xattr();
  - bpf_verify_program().

Implementation-wise, internal helper libbpf__bpf_prog_load is refactored
to become a public bpf_prog_load() API. struct bpf_prog_load_params used
internally is replaced by public struct bpf_prog_load_opts.

Unfortunately, while conceptually all this is pretty straightforward,
the biggest complication comes from the already existing bpf_prog_load()
*high-level* API, which has nothing to do with BPF_PROG_LOAD command.

We try really hard to have a new API named bpf_prog_load(), though,
because it maps naturally to BPF_PROG_LOAD command.

For that, we rename old bpf_prog_load() into bpf_prog_load_deprecated()
and mark it as COMPAT_VERSION() for shared library users compiled
against old version of libbpf. Statically linked users and shared lib
users compiled against new version of libbpf headers will get "rerouted"
to bpf_prog_deprecated() through a macro helper that decides whether to
use new or old bpf_prog_load() based on number of input arguments (see
___libbpf_overload in libbpf_common.h).

To test that existing
bpf_prog_load()-using code compiles and works as expected, I've compiled
and ran selftests as is. I had to remove (locally) selftest/bpf/Makefile
-Dbpf_prog_load=bpf_prog_test_load hack because it was conflicting with
the macro-based overload approach. I don't expect anyone else to do
something like this in practice, though. This is testing-specific way to
replace bpf_prog_load() calls with special testing variant of it, which
adds extra prog_flags value. After testing I kept this selftests hack,
but ensured that we use a new bpf_prog_load_deprecated name for this.

This patch also marks bpf_prog_load() and bpf_prog_load_xattr() as deprecated.
bpf_object interface has to be used for working with struct bpf_program.
Libbpf doesn't support loading just a bpf_program.

The silver lining is that when we get to libbpf 1.0 all these
complication will be gone and we'll have one clean bpf_prog_load()
low-level API with no backwards compatibility hackery surrounding it.

  [0] Closes: https://github.com/libbpf/libbpf/issues/284

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-4-andrii@kernel.org
2021-11-07 08:34:23 -08:00
Linus Torvalds
0b707e572a s390 updates for the 5.16 merge window
- Add support for ftrace with direct call and ftrace direct call samples.
 
 - Add support for kernel command lines longer than current 896 bytes and
   make its length configurable.
 
 - Add support for BEAR enhancement facility to improve last breaking
   event instruction tracking.
 
 - Add kprobes sanity checks and testcases to prevent kprobe in the mid
   of an instruction.
 
 - Allow concurrent access to /dev/hwc for the CPUMF users.
 
 - Various ftrace / jump label improvements.
 
 - Convert unwinder tests to KUnit.
 
 - Add s390_iommu_aperture kernel parameter to tweak the limits on
   concurrently usable DMA mappings.
 
 - Add ap.useirq AP module option which can be used to disable interrupt
   use.
 
 - Add add_disk() error handling support to block device drivers.
 
 - Drop arch specific and use generic implementation of strlcpy and strrchr.
 
 - Several __pa/__va usages fixes.
 
 - Various cio, crypto, pci, kernel doc and other small fixes and
   improvements all over the code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmGFW6EACgkQjYWKoQLX
 FBg20Qf/UbohgnKnE6vxbbH3sNTlI2dk3Cw4z3IobcsZgqXAu6AFLgLQGLk/X07F
 DIyUdrgSgCzLIEKLqrLrFXIOMIK44zAGaurIltNt7IrnWWlA+/YVD+YeL2gHwccq
 wT7KXRcrVMZQ1z18djJQ45DpPUC8ErBdL6+P+ftHck90YGFZsfMA5S7jf8X1h08U
 IlqdPTmY8t4unKHWVpHbxx9b+xrUuV6KTEXADsllpMV2jQoTLdDECd3vmefYR6tR
 3lssgop1m/RzH5OCqvia5Sy2D5fOQObNWDMakwOkVMxOD43lmGCTHstzS2Uo2OFE
 QcY79lfZ5NrzKnenUdE5Fd0XJ9kSwQ==
 =k0Ab
 -----END PGP SIGNATURE-----

Merge tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add support for ftrace with direct call and ftrace direct call
   samples.

 - Add support for kernel command lines longer than current 896 bytes
   and make its length configurable.

 - Add support for BEAR enhancement facility to improve last breaking
   event instruction tracking.

 - Add kprobes sanity checks and testcases to prevent kprobe in the mid
   of an instruction.

 - Allow concurrent access to /dev/hwc for the CPUMF users.

 - Various ftrace / jump label improvements.

 - Convert unwinder tests to KUnit.

 - Add s390_iommu_aperture kernel parameter to tweak the limits on
   concurrently usable DMA mappings.

 - Add ap.useirq AP module option which can be used to disable interrupt
   use.

 - Add add_disk() error handling support to block device drivers.

 - Drop arch specific and use generic implementation of strlcpy and
   strrchr.

 - Several __pa/__va usages fixes.

 - Various cio, crypto, pci, kernel doc and other small fixes and
   improvements all over the code.

[ Merge fixup as per https://lore.kernel.org/all/YXAqZ%2FEszRisunQw@osiris/ ]

* tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (63 commits)
  s390: make command line configurable
  s390: support command lines longer than 896 bytes
  s390/kexec_file: move kernel image size check
  s390/pci: add s390_iommu_aperture kernel parameter
  s390/spinlock: remove incorrect kernel doc indicator
  s390/string: use generic strlcpy
  s390/string: use generic strrchr
  s390/ap: function rework based on compiler warning
  s390/cio: make ccw_device_dma_* more robust
  s390/vfio-ap: s390/crypto: fix all kernel-doc warnings
  s390/hmcdrv: fix kernel doc comments
  s390/ap: new module option ap.useirq
  s390/cpumf: Allow multiple processes to access /dev/hwc
  s390/bitops: return true/false (not 1/0) from bool functions
  s390: add support for BEAR enhancement facility
  s390: introduce nospec_uses_trampoline()
  s390: rename last_break to pgm_last_break
  s390/ptrace: add last_break member to pt_regs
  s390/sclp: sort out physical vs virtual pointers usage
  s390/setup: convert start and end initrd pointers to virtual
  ...
2021-11-06 14:48:06 -07:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
SeongJae Park
1dc90ccd15 selftests/damon: support watermarks
This updates DAMON selftests for 'schemes' debugfs file to reflect the
changes in the format.

Link: https://lkml.kernel.org/r/20211019150731.16699-14-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:45 -07:00
SeongJae Park
5a0d6a08b8 tools/selftests/damon: update for regions prioritization of schemes
This updates the DAMON selftests for 'schemes' debugfs file, as the file
format is updated.

Link: https://lkml.kernel.org/r/20211019150731.16699-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:45 -07:00
SeongJae Park
a2cb4dd0d4 mm/damon/selftests: support schemes quotas
This updates DAMON selftests to support updated schemes debugfs file
format for the quotas.

Link: https://lkml.kernel.org/r/20211019150731.16699-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:45 -07:00
SeongJae Park
8d5d4c6359 selftests/damon: add 'schemes' debugfs tests
This adds simple selftets for 'schemes' debugfs file of DAMON.

Link: https://lkml.kernel.org/r/20211001125604.29660-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:44 -07:00
David Hildenbrand
50f9481ed9 mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE
CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, so there is no need for
CONFIG_MEMORY_HOTPLUG_SPARSE anymore; adjust all instances to use
CONFIG_MEMORY_HOTPLUG and remove CONFIG_MEMORY_HOTPLUG_SPARSE.

Link: https://lkml.kernel.org/r/20210929143600.49379-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>	[kselftest]
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:42 -07:00
David Hildenbrand
39b2e5cae4 selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers
The madv_populate selftest currently builds with a warning when the
local installed headers (via the distribution) don't include
MADV_POPULATE_READ and MADV_POPULATE_WRITE.  The warning is correct,
because the test cannot locate the necessary header.

The reason is that the in-tree installed headers (usr/include) have a
"linux" instead of a "sys" subdirectory.

Including "linux/mman.h" instead of "sys/mman.h" doesn't work (e.g.,
mmap() and madvise() are not defined that way).  The only thing that
seems to work is including "linux/mman.h" in addition to "sys/mman.h".

We can get rid of our availability check and simplify.

Link: https://lkml.kernel.org/r/20211015165758.41374-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:42 -07:00
Pedro Demarchi Gomes
3252548996 selftests: vm: add KSM huge pages merging time test
Add test case of KSM merging time using mostly huge pages

Link: https://lkml.kernel.org/r/20211013044045.360251-1-pedrodemargomes@gmail.com
Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Cc: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Aneesh Kumar K.V
e3820ab252 selftest/vm: fix ksm selftest to run with different NUMA topologies
Platforms can have non-contiguous NUMA nodes like below

   #numactl  -H
  available: 2 nodes (0,8)
  .....
  node distances:
  node   0   8
    0:  10  40
    8:  40  10

   #numactl  -H
  available: 1 nodes (1)
  ....
  node distances:
  node   1
    1:  10

Hence update the test to not assume the presence of Node 0 and 1 and
also use numa_num_configured_nodes() instead of numa_max_node for
finding whether to skip the test.

Link: https://lkml.kernel.org/r/20210914141414.350759-1-aneesh.kumar@linux.ibm.com
Fixes: 82e717ad35 ("selftests: vm: add KSM merging across nodes test")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
George G. Davis
39cad8878a selftests/vm/transhuge-stress: fix ram size thinko
When executing transhuge-stress with an argument to specify the virtual
memory size for testing, the ram size is reported as 0, e.g.

  transhuge-stress 384
  thp-mmap: allocate 192 transhuge pages, using 384 MiB virtual memory and 0 MiB of ram
  thp-mmap: 0.184 s/loop, 0.957 ms/page,   2090.265 MiB/s  192 succeed,    0 failed

This appears to be due to a thinko in commit 0085d61fe0
("selftests/vm/transhuge-stress: stress test for memory compaction"),
where, at a guess, the intent was to base "xyz MiB of ram" on `ram`
size.

Here are results after using `ram` size:

  thp-mmap: allocate 192 transhuge pages, using 384 MiB virtual memory and 14 MiB of ram

Link: https://lkml.kernel.org/r/20210825135843.29052-1-george_davis@mentor.com
Fixes: 0085d61fe0 ("selftests/vm/transhuge-stress: stress test for memory compaction")
Signed-off-by: George G. Davis <davis.george@siemens.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Axel Rasmussen
ad0ce23ed0 userfaultfd/selftests: fix calculation of expected ioctls
Today, we assert that the ioctls the kernel reports as supported for a
registration match a precomputed list.  We decide which ioctls are
supported by examining the memory type.  Then, in several locations we
"fix up" this list by adding or removing things this initial decision
got wrong.

What ioctls the kernel reports is actually a function of several things:
- The memory type
- Kernel feature support (e.g., no writeprotect on aarch64)
- The registration type (e.g., CONTINUE only supported for MINOR mode)

So, we can't fully compute this at the start, in set_test_type.  It
varies per test, depending on what registration mode(s) those tests use.

Instead, introduce a new function which computes the correct list.  This
centralizes the add/remove of ioctls depending on these function inputs
in one place, so we don't have to repeat ourselves in various tests.

Not only is the resulting code a bit shorter, but it fixes a real bug in
the existing code: previously, we would incorrectly require the
writeprotect ioctl to be present on aarch64, where it isn't actually
supported.

Link: https://lkml.kernel.org/r/20210930212309.4001967-4-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Axel Rasmussen
1042a53d0e userfaultfd/selftests: fix feature support detection
Before any tests are run, in set_test_type, we decide what feature(s) we
are going to be testing, based upon our command line arguments.
However, the supported features are not just a function of the memory
type being used, so this is broken.

For instance, consider writeprotect support.  It is "normally" supported
for anonymous memory, but furthermore it requires that the kernel has
CONFIG_HAVE_ARCH_USERFAULTFD_WP.  So, it is *not* supported at all on
aarch64, for example.

So, this fixes this by querying the kernel for the set of features it
supports in set_test_type, by opening a userfaultfd and issuing a
UFFDIO_API ioctl.  Based upon the reported features, we toggle what
tests are enabled.

Link: https://lkml.kernel.org/r/20210930212309.4001967-3-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Axel Rasmussen
1c10e674b3 userfaultfd/selftests: don't rely on GNU extensions for random numbers
Patch series "Small userfaultfd selftest fixups", v2.

This patch (of 3):

Two arguments for doing this:

First, and maybe most importantly, the resulting code is significantly
shorter / simpler.

Then, we avoid using GNU libc extensions.  Why does this matter? It
makes testing userfaultfd with the selftest easier e.g.  on distros
which use something other than glibc (e.g., Alpine, which uses musl);
basically, it makes the test more portable.

Link: https://lkml.kernel.org/r/20210930212309.4001967-2-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Ran Jianping
b65c23f72e mm: remove duplicate include in hugepage-mremap.c
Remove duplicate includes 'unistd.h' included in
 '/tools/testing/selftests/vm/hugepage-mremap.c'  is duplicated.It is also
 included on 23 line.

Link: https://lkml.kernel.org/r/20211018102336.869726-1-ran.jianping@zte.com.cn
Signed-off-by: Ran Jianping <ran.jianping@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Mina Almasry
12b6132064 mm, hugepages: add hugetlb vma mremap() test
[almasrymina@google.com: v8]
  Link: https://lkml.kernel.org/r/20211014200542.4126947-2-almasrymina@google.com
[wanjiabing@vivo.com: remove duplicated include in hugepage-mremap]
  Link: https://lkml.kernel.org/r/20211021122944.8857-1-wanjiabing@vivo.com

Link: https://lkml.kernel.org/r/20211013195825.3058275-2-almasrymina@google.com
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Kirill Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
Martin KaFai Lau
d99341b373 bpf: selftest: Trigger a DCE on the whole subprog
This patch adds a test to trigger the DCE to remove
the whole subprog to ensure the verifier  does not
depend on a stable subprog index.  The DCE is done
by testing a global const.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211106014020.651638-1-kafai@fb.com
2021-11-06 12:54:12 -07:00
Jakub Kicinski
9bea6aa498 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-11-05

We've added 15 non-merge commits during the last 3 day(s) which contain
a total of 14 files changed, 199 insertions(+), 90 deletions(-).

The main changes are:

1) Fix regression from stack spill/fill of <8 byte scalars, from Martin KaFai Lau.

2) Fix perf's build of bpftool's bootstrap version due to missing libbpf
   headers, from Quentin Monnet.

3) Fix riscv{32,64} BPF exception tables build errors and warnings, from Björn Töpel.

4) Fix bpf fs to allow RENAME_EXCHANGE support for atomic upgrades on sk_lookup
   control planes, from Lorenz Bauer.

5) Fix libbpf's error reporting in bpf_map_lookup_and_delete_elem_flags() due to
   missing libbpf_err_errno(), from Mehrdad Arshad Rad.

6) Various fixes to make xdp_redirect_multi selftest more reliable, from Hangbin Liu.

7) Fix netcnt selftest to make it run serial and thus avoid conflicts with other
   cgroup/skb selftests run in parallel that could cause flakes, from Andrii Nakryiko.

8) Fix reuseport_bpf_numa networking selftest to skip unavailable NUMA nodes,
   from Kleber Sacilotto de Souza.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  riscv, bpf: Fix RV32 broken build, and silence RV64 warning
  selftests/bpf/xdp_redirect_multi: Limit the tests in netns
  selftests/bpf/xdp_redirect_multi: Give tcpdump a chance to terminate cleanly
  selftests/bpf/xdp_redirect_multi: Use arping to accurate the arp number
  selftests/bpf/xdp_redirect_multi: Put the logs to tmp folder
  libbpf: Fix lookup_and_delete_elem_flags error reporting
  bpftool: Install libbpf headers for the bootstrap version, too
  selftests/net: Fix reuseport_bpf_numa by skipping unavailable nodes
  selftests/bpf: Verifier test on refill from a smaller spill
  bpf: Do not reject when the stack read size is different from the tracked scalar size
  selftests/bpf: Make netcnt selftests serial to avoid spurious failures
  selftests/bpf: Test RENAME_EXCHANGE and RENAME_NOREPLACE on bpffs
  selftests/bpf: Convert test_bpffs to ASSERT macros
  libfs: Support RENAME_EXCHANGE in simple_rename()
  libfs: Move shmem_exchange to simple_rename_exchange
====================

Link: https://lore.kernel.org/r/20211105165803.29372-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-05 14:13:40 -07:00
Hangbin Liu
8955c1a329 selftests/bpf/xdp_redirect_multi: Limit the tests in netns
As I want to test both DEVMAP and DEVMAP_HASH in XDP multicast redirect, I
limited DEVMAP max entries to a small value for performace. When the test
runs after amount of interface creating/deleting tests. The interface index
will exceed the map max entries and xdp_redirect_multi will error out with
"Get interfacesInterface index to large".

Fix this issue by limit the tests in netns and specify the ifindex when
creating interfaces.

Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Reported-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-5-liuhangbin@gmail.com
2021-11-05 16:42:56 +01:00
Hangbin Liu
648c367706 selftests/bpf/xdp_redirect_multi: Give tcpdump a chance to terminate cleanly
No need to kill tcpdump with -9.

Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-4-liuhangbin@gmail.com
2021-11-05 16:38:26 +01:00
Hangbin Liu
f53ea9dbf7 selftests/bpf/xdp_redirect_multi: Use arping to accurate the arp number
The arp request number triggered by ping none exist address is not accurate,
which may lead the test false negative/positive. Change to use arping to
accurate the arp number. Also do not use grep pattern match for dot.

Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-3-liuhangbin@gmail.com
2021-11-05 16:38:02 +01:00
Hangbin Liu
8b4ac13abe selftests/bpf/xdp_redirect_multi: Put the logs to tmp folder
The xdp_redirect_multi test logs are created in selftest folder and not cleaned
after test. Let's creat a tmp dir and remove the logs after testing.

Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-2-liuhangbin@gmail.com
2021-11-05 16:37:51 +01:00
Linus Torvalds
5c0b0c676a powerpc updates for 5.16
- Enable STRICT_KERNEL_RWX for Freescale 85xx platforms.
 
  - Activate CONFIG_STRICT_KERNEL_RWX by default, while still allowing it to be disabled.
 
  - Add support for out-of-line static calls on 32-bit.
 
  - Fix oopses doing bpf-to-bpf calls when STRICT_KERNEL_RWX is enabled.
 
  - Fix boot hangs on e5500 due to stale value in ESR passed to do_page_fault().
 
  - Fix several bugs on pseries in handling of device tree cache information for hotplugged
    CPUs, and/or during partition migration.
 
  - Various other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Alistair Popple, Anatolij Gustschin, Andrew Donnellan,
 Athira Rajeev, Bixuan Cui, Bjorn Helgaas, Cédric Le Goater, Christophe Leroy, Daniel
 Axtens, Daniel Henrique Barboza, Denis Kirjanov, Fabiano Rosas, Frederic Barrat, Gustavo
 A. R. Silva, Hari Bathini, Jacques de Laval, Joel Stanley, Kai Song, Kajol Jain, Laurent
 Vivier, Leonardo Bras, Madhavan Srinivasan, Nathan Chancellor, Nathan Lynch, Naveen N.
 Rao, Nicholas Piggin, Nick Desaulniers, Niklas Schnelle, Oliver O'Halloran, Rob Herring,
 Russell Currey, Srikar Dronamraju, Stan Johnson, Tyrel Datwyler, Uwe Kleine-König, Vasant
 Hegde, Wan Jiabing, Xiaoming Ni,
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmGFDPoTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgNbsEACVczMVwMBEny5a7W1tqq1bnY9RFVw3
 K+/rE7/FpSLX+RrwgoMkmqfPvfyc9WISVLlIQDKz4XhkBjaXv0+Y4OMsymuDCbxL
 Qk7F1ff22UfLmPjjJk39gHJ8QZQqZk3wmFu2QzTO4yBZbz2SqqXFLxwyoLpZ0LJ8
 pdGl51+bIsTkDJzrdkhX9X4AKS/fYyjbQxq5u7FS89ZCCs+KvzjLcDRo0GZYaOK/
 hgDBa60DCCszL/9yjbh0ANZxmM2Z3+6AXkvAAXrtXzIGk4JzenZfiV+VEzmq8Tt0
 UpWSsUEe7VgykMR3MTrL9G8op70PpKX6OMUPegJq4iRQ24h4mpFCK4oV9OMKJqpF
 ifN9NO2ZZKOz1ke4l7Xe8SEHLX7rq5U/P7INh3AsKYNYwo6HkJhSPxiCBWUTlnZ3
 OYoZ7czyO4gMPHWP92z4CoSiTYVBFuyhYexRcnQskg60TIwbr+lMXziRyPRGI8b6
 taf2rD8eAiyUJnvkFUsyAHtYHpkSkuMeiVqY2CDQdh2SdtIFgwKzB2RjFL0gzaBZ
 XP9RWD+HernGQAJSlIk7cVthont3JHklcKk+ohhDbsWzPeUJcz6t4ChtgRq0x4q4
 Hpes1lsVfXpyxj5ouBK/E/t+diwPvBLM0dCcarQJE6ExgMzBC/C7Br7jCSgyVJA2
 VhtcZaCYh+vRlQ==
 =f7HE
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Enable STRICT_KERNEL_RWX for Freescale 85xx platforms.

 - Activate CONFIG_STRICT_KERNEL_RWX by default, while still allowing it
   to be disabled.

 - Add support for out-of-line static calls on 32-bit.

 - Fix oopses doing bpf-to-bpf calls when STRICT_KERNEL_RWX is enabled.

 - Fix boot hangs on e5500 due to stale value in ESR passed to
   do_page_fault().

 - Fix several bugs on pseries in handling of device tree cache
   information for hotplugged CPUs, and/or during partition migration.

 - Various other small features and fixes.

Thanks to Alexey Kardashevskiy, Alistair Popple, Anatolij Gustschin,
Andrew Donnellan, Athira Rajeev, Bixuan Cui, Bjorn Helgaas, Cédric Le
Goater, Christophe Leroy, Daniel Axtens, Daniel Henrique Barboza, Denis
Kirjanov, Fabiano Rosas, Frederic Barrat, Gustavo A.  R.  Silva, Hari
Bathini, Jacques de Laval, Joel Stanley, Kai Song, Kajol Jain, Laurent
Vivier, Leonardo Bras, Madhavan Srinivasan, Nathan Chancellor, Nathan
Lynch, Naveen N.  Rao, Nicholas Piggin, Nick Desaulniers, Niklas
Schnelle, Oliver O'Halloran, Rob Herring, Russell Currey, Srikar
Dronamraju, Stan Johnson, Tyrel Datwyler, Uwe Kleine-König, Vasant
Hegde, Wan Jiabing, and Xiaoming Ni,

* tag 'powerpc-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (73 commits)
  powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST
  powerpc/32e: Ignore ESR in instruction storage interrupt handler
  powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload
  powerpc: Don't provide __kernel_map_pages() without ARCH_SUPPORTS_DEBUG_PAGEALLOC
  MAINTAINERS: Update powerpc KVM entry
  powerpc/xmon: fix task state output
  powerpc/44x/fsp2: add missing of_node_put
  powerpc/dcr: Use cmplwi instead of 3-argument cmpli
  KVM: PPC: Tick accounting should defer vtime accounting 'til after IRQ handling
  powerpc/security: Use a mutex for interrupt exit code patching
  powerpc/83xx/mpc8349emitx: Make mcu_gpiochip_remove() return void
  powerpc/fsl_booke: Fix setting of exec flag when setting TLBCAMs
  powerpc/book3e: Fix set_memory_x() and set_memory_nx()
  powerpc/nohash: Fix __ptep_set_access_flags() and ptep_set_wrprotect()
  powerpc/bpf: Fix write protecting JIT code
  selftests/powerpc: Use date instead of EPOCHSECONDS in mitigation-patching.sh
  powerpc/64s/interrupt: Fix check_return_regs_valid() false positive
  powerpc/boot: Set LC_ALL=C in wrapper script
  powerpc/64s: Default to 64K pages for 64 bit book3s
  Revert "powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC"
  ...
2021-11-05 08:15:46 -07:00
Linus Torvalds
5c904c66ed Char/Misc driver update for 5.16-rc1
Here is the big set of char and misc and other tiny driver subsystem
 updates for 5.16-rc1.
 
 Loads of things in here, all of which have been in linux-next for a
 while with no reported problems (except for one called out below.)
 
 Included are:
 	- habanana labs driver updates, including dma_buf usage,
 	  reviewed and acked by the dma_buf maintainers
 	- iio driver update (going through this tree not staging as they
 	  really do not belong going through that tree anymore)
 	- counter driver updates
 	- hwmon driver updates that the counter drivers needed, acked by
 	  the hwmon maintainer
 	- xillybus driver updates
 	- binder driver updates
 	- extcon driver updates
 	- dma_buf module namespaces added (will cause a build error in
 	  arm64 for allmodconfig, but that change is on its way through
 	  the drm tree)
 	- lkdtm driver updates
 	- pvpanic driver updates
 	- phy driver updates
 	- virt acrn and nitr_enclaves driver updates
 	- smaller char and misc driver updates
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPX2A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymUUgCbB4EKysgLuXYdjUalZDx+vvZO4k0AniS14O4k
 F+2dVSZ5WX6wumUzCaA6
 =bXQM
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char and misc and other tiny driver subsystem
  updates for 5.16-rc1.

  Loads of things in here, all of which have been in linux-next for a
  while with no reported problems (except for one called out below.)

  Included are:

   - habanana labs driver updates, including dma_buf usage, reviewed and
     acked by the dma_buf maintainers

   - iio driver update (going through this tree not staging as they
     really do not belong going through that tree anymore)

   - counter driver updates

   - hwmon driver updates that the counter drivers needed, acked by the
     hwmon maintainer

   - xillybus driver updates

   - binder driver updates

   - extcon driver updates

   - dma_buf module namespaces added (will cause a build error in arm64
     for allmodconfig, but that change is on its way through the drm
     tree)

   - lkdtm driver updates

   - pvpanic driver updates

   - phy driver updates

   - virt acrn and nitr_enclaves driver updates

   - smaller char and misc driver updates"

* tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (386 commits)
  comedi: dt9812: fix DMA buffers on stack
  comedi: ni_usb6501: fix NULL-deref in command paths
  arm64: errata: Enable TRBE workaround for write to out-of-range address
  arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  coresight: trbe: Work around write to out of range
  coresight: trbe: Make sure we have enough space
  coresight: trbe: Add a helper to determine the minimum buffer size
  coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  coresight: trbe: Add infrastructure for Errata handling
  coresight: trbe: Allow driver to choose a different alignment
  coresight: trbe: Decouple buffer base from the hardware base
  coresight: trbe: Add a helper to pad a given buffer area
  coresight: trbe: Add a helper to calculate the trace generated
  coresight: trbe: Defer the probe on offline CPUs
  coresight: trbe: Fix incorrect access of the sink specific data
  coresight: etm4x: Add ETM PID for Kryo-5XX
  coresight: trbe: Prohibit trace before disabling TRBE
  coresight: trbe: End the AUX handle on truncation
  coresight: trbe: Do not truncate buffer on IRQ
  coresight: trbe: Fix handling of spurious interrupts
  ...
2021-11-04 08:21:47 -07:00
Kleber Sacilotto de Souza
a38bc45a08 selftests/net: Fix reuseport_bpf_numa by skipping unavailable nodes
In some platforms the numa node numbers are not necessarily consecutive,
meaning that not all nodes from 0 to the value returned by numa_max_node()
are available on the system. Using node numbers which are not available
results on errors from libnuma such as:

  ---- IPv4 UDP ----
  send node 0, receive socket 0
  libnuma: Warning: Cannot read node cpumask from sysfs
  ./reuseport_bpf_numa: failed to pin to node: No such file or directory

Fix it by checking if the node number bit is set on numa_nodes_ptr, which
is defined on libnuma as "Set with all nodes the kernel has exposed to
userspace".

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211101145317.286118-1-kleber.souza@canonical.com
2021-11-04 16:21:13 +01:00
Andrea Righi
a985442fde selftests: net: properly support IPv6 in GSO GRE test
Explicitly pass -6 to netcat when the test is using IPv6 to prevent
failures.

Also make sure to pass "-N" to netcat to close the socket after EOF on
the client side, otherwise we would always hit the timeout and the test
would fail.

Without this fix applied:

 TEST: GREv6/v4 - copy file w/ TSO                                   [FAIL]
 TEST: GREv6/v4 - copy file w/ GSO                                   [FAIL]
 TEST: GREv6/v6 - copy file w/ TSO                                   [FAIL]
 TEST: GREv6/v6 - copy file w/ GSO                                   [FAIL]

With this fix applied:

 TEST: GREv6/v4 - copy file w/ TSO                                   [ OK ]
 TEST: GREv6/v4 - copy file w/ GSO                                   [ OK ]
 TEST: GREv6/v6 - copy file w/ TSO                                   [ OK ]
 TEST: GREv6/v6 - copy file w/ GSO                                   [ OK ]

Fixes: 025efa0a82 ("selftests: add simple GSO GRE test")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-04 11:23:57 +00:00
Linus Torvalds
e1fd0b2acd Second set of tracing updates for 5.16:
- osnoise and timerlat updates that will work with the RTLA tool (Real-Time
   Linux Analysis). Specifically it disconnects the work load (threads
   that look for latency) from the tracing instances attached to them,
   allowing for more than one instance to retrieve data from the work load.
 
 - Optimization on division in the trace histogram trigger code to use shift
   and multiply when possible. Also added documentation.
 
 - Fix prototype to my_direct_func in direct ftrace trampoline sample code.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYYKWXxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqJEAP9czpSZ/nFvDjxdGHZAcKKXCFWbGcK5
 IF2cHDDwxXjZ/gD+NnpRhR1JPfA55fO52DUJPn2cOU5xOsP6DmJxu6mwDg0=
 =AKVv
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:

 - osnoise and timerlat updates that will work with the RTLA tool
   (Real-Time Linux Analysis).

   Specifically it disconnects the work load (threads that look for
   latency) from the tracing instances attached to them, allowing for
   more than one instance to retrieve data from the work load.

 - Optimization on division in the trace histogram trigger code to use
   shift and multiply when possible. Also added documentation.

 - Fix prototype to my_direct_func in direct ftrace trampoline sample
   code.

* tag 'trace-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace/samples: Add missing prototype for my_direct_func
  tracing/selftests: Add tests for hist trigger expression parsing
  tracing/histogram: Document hist trigger variables
  tracing/histogram: Update division by 0 documentation
  tracing/histogram: Optimize division by constants
  tracing/osnoise: Remove PREEMPT_RT ifdefs from inside functions
  tracing/osnoise: Remove STACKTRACE ifdefs from inside functions
  tracing/osnoise: Allow multiple instances of the same tracer
  tracing/osnoise: Remove TIMERLAT ifdefs from inside functions
  tracing/osnoise: Support a list of trace_array *tr
  tracing/osnoise: Use start/stop_per_cpu_kthreads() on osnoise_cpus_write()
  tracing/osnoise: Split workload start from the tracer start
  tracing/osnoise: Improve comments about barrier need for NMI callbacks
  tracing/osnoise: Do not follow tracing_cpumask
2021-11-03 09:08:47 -07:00
Martin KaFai Lau
c08455dec5 selftests/bpf: Verifier test on refill from a smaller spill
This patch adds a verifier test to ensure the verifier can read 8 bytes
from the stack after two 32bit write at fp-4 and fp-8. The test is similar
to the reported case from bcc [0].

  [0] https://github.com/iovisor/bcc/pull/3683

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211102064541.316414-1-kafai@fb.com
2021-11-03 15:55:43 +01:00
Andrii Nakryiko
401a33da3a selftests/bpf: Make netcnt selftests serial to avoid spurious failures
When running `./test_progs -j` test_netcnt fails with a very high
probability, undercounting number of packets received (9999 vs expected
10000). It seems to be conflicting with other cgroup/skb selftests. So
make it serial for now to make parallel mode more robust.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211103054113.2130582-1-andrii@kernel.org
2021-11-03 15:43:09 +01:00
Lorenz Bauer
7e5ad817ec selftests/bpf: Test RENAME_EXCHANGE and RENAME_NOREPLACE on bpffs
Add tests to exercise the behaviour of RENAME_EXCHANGE and RENAME_NOREPLACE
on bpffs. The former checks that after an exchange the inode of two
directories has changed. The latter checks that the source still exists
after a failed rename. Generally, having support for renameat2(RENAME_EXCHANGE)
in bpffs fixes atomic upgrades of our sk_lookup control plane.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028094724.59043-5-lmb@cloudflare.com
2021-11-03 15:43:08 +01:00
Lorenz Bauer
9fc23c22e5 selftests/bpf: Convert test_bpffs to ASSERT macros
Remove usage of deprecated CHECK macros.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028094724.59043-4-lmb@cloudflare.com
2021-11-03 15:43:08 +01:00
Hangbin Liu
17b67370c3 kselftests/net: add missed toeplitz.sh/toeplitz_client.sh to Makefile
When generating the selftests to another folder, the toeplitz.sh
and toeplitz_client.sh are missing as they are not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Making them under TEST_PROGS_EXTENDED as they test NIC hardware features
and are not intended to be run from kselftests.

Fixes: 5ebfb4cc30 ("selftests/net: toeplitz test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Hangbin Liu
8883deb50e kselftests/net: add missed vrf_strict_mode_test.sh test to Makefile
When generating the selftests to another folder, the
vrf_strict_mode_test.sh test will miss as it is not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Fixes: 8735e6eaa4 ("selftests: add selftest for the VRF strict mode")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Hangbin Liu
653e7f19b4 kselftests/net: add missed SRv6 tests
When generating the selftests to another folder, the SRv6 tests are
missing as they are not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Fixes: 03a0b567a0 ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior")
Fixes: 2195444e09 ("selftests: add selftest for the SRv6 End.DT4 behavior")
Fixes: 2bc035538e ("selftests: add selftest for the SRv6 End.DT6 (VRF) behavior")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Hangbin Liu
b99ac18411 kselftests/net: add missed setup_loopback.sh/setup_veth.sh to Makefile
When generating the selftests to another folder, the include file
setup_loopback.sh/setup_veth.sh for gro.sh/gre_gro.sh are missing as
they are not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Fixes: 7d1575014a ("selftests/net: GRO coalesce test")
Fixes: 9af771d2ec ("selftests/net: allow GRO coalesce test on veth")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Hangbin Liu
ca3676f94b kselftests/net: add missed icmp.sh test to Makefile
When generating the selftests to another folder, the icmp.sh test will
miss as it is not in Makefile, e.g.

  make -C tools/testing/selftests/ install \
      TARGETS="net" INSTALL_PATH=/tmp/kselftests

Fixes: 7e9838b791 ("selftests/net: Add icmp.sh for testing ICMP dummy address responses")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:07:02 +00:00
Linus Torvalds
313b6ffc8e linux-kselftest-kunit-5.16-rc1
This KUnit update for Linux 5.16-rc1 consist of several enhancements
 and fixes:
 
 - ability to run each test suite and test separately
 - support for timing test run
 - several fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmGBhpoACgkQCwJExA0N
 Qxx2xQ//bCql+nCJdowo2er7MgYV1Jw9bkjapriV7vaakHVBOQhMItMAD3lw2WNI
 SXuX4n4x4Ap4FMBd3gBQ3Flc1e7MdY/FHQIIcIE5+xDoU/ehl9ypbUMd7NrGOTI7
 KMN18TJawHHyMHjKz/fFFV5Bfi97YptQMy3VB/ujdgRCIF2/bPhra5F/rFYUwdSk
 GLql9CBPbomgaginzAuvnfPKoxWGjiiRZjNTlsXDyRihras0ezS7kfSYPLEV8f/i
 L/ydZK+Lc3QCrWCYadBeAtq/3hkb1pV7FGKfjypvEGhGcvqEQZW7irKxE4KoLB2D
 VFeVVZmK0WCfCFHtOQookTKPVwkgTzItwpXxl57ILMWo6FaB8O8tQGHqEvcF7Xfm
 NIgT/V0laiUWABpWWpQjf1jnY+X1qI+s+f8eZ2eI889AduBdlrNIfCRotYz2tB0i
 /GfnmVLp5pDIYZX0SxCERpu91297sDXSV+uw4mf9hjDs7189LmWppEpPENfjWebM
 geFlconrtrmnWx3E3Wvqxk8pEmJwChDYVAGUUhKFZXLxYO0YvWoBM4REb9v8sv3c
 0x3A1ZhpdeN5f7S6BHEAVKa3VrDHXLG9nZCK6wlaxYIiM2rOJDQN9h4BF5/RO9/2
 SebYQuMA95aQgAPXjJHYKQ9fdxrR3BOgFdn0Xfp/EShEWTl/usA=
 =UpDf
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:
 "Several enhancements and fixes:

   - ability to run each test suite and test separately

   - support for timing test run

   - several fixes and improvements"

* tag 'linux-kselftest-kunit-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: fix typecheck errors about loading qemu configs
  kunit: tool: continue past invalid utf-8 output
  kunit: Reset suite count after running tests
  kunit: tool: improve compatibility of kunit_parser with KTAP specification
  kunit: tool: yield output from run_kernel in real time
  kunit: tool: support running each suite/test separately
  kunit: tool: actually track how long it took to run tests
  kunit: tool: factor exec + parse steps into a function
  kunit: add 'kunit.action' param to allow listing out tests
  kunit: tool: show list of valid --arch options when invalid
  kunit: tool: misc fixes (unused vars, imports, leaked files)
  kunit: fix too small allocation when using suite-only kunit.filter_glob
  kunit: tool: allow filtering test cases via glob
  kunit: drop assumption in kunit-log-test about current suite
2021-11-02 22:06:20 -07:00
Linus Torvalds
84924e2e62 linux-kselftest-next-5.16-rc1
This Kselftest update for Linux 5.16-rc1 consists of fixes to compile
 time error and warnings.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmGBdeoACgkQCwJExA0N
 QxywbBAAuiQow4KPkTt39HGKazGrM6J6rVlxp2QJ+jN8xnIWHmwVO53qbcyED++X
 uC9ORwkKU8jr6sA9pL+uhFT4MauJO8+hlGVJ5A9edt1xer94EGOY41XAGmznHIhL
 KXtFJ/YbeMBZjjaFLiSePUxdOQhmHq4v3rWd9p/Do60PPhfAkLcob8DCt5W/CMXF
 FclEIZnaPlm0b/JSxljJDy4zl5QGDuTmL+Sk8ohtiWH6spwDfPs4eaDcxCjQryYH
 W5FcaBGo+ISh2xecl1Wa6RlmHOqmtM3WlIzT9vUTTbNGRR5b7titOepDz8kPQxtX
 Cz8swXfEI2ZVksXza7JJHkWPfKFw7BBFoUennuBq82FfjUlEaEv90YowIiXX0ibw
 INkYrB1e206tZdpcmS6BjcqqD/6bYf0Xwrz49950WyJYFLTJ3Fq3wUPK/jftS90+
 v8SDKEVnEalughVqzRDtdgM8WqXkDDZgCalMj6eH6nEuxaSLMoMm297y17LLpHPU
 4Bhqc2X9ObqLsEyLNmlbMw5qBg2xNRJQtTpGOEcf3S5eZ7eonnBnNhvtLZdKPHY4
 IMEybYyiiX27F+5eaVvRtNbLUKWXg9YAFLFsQ/7Gu8bcnShKH2ZpfW0saPihcOYH
 frGHyNMCZMW3y+CA3qGq5l7ddSYF8rUo5hIG8XwC1niRqyyuCWk=
 =j8/P
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-next-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest updates from Shuah Khan:
 "Fixes to compile time errors and warnings"

* tag 'linux-kselftest-next-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/core: fix conflicting types compile error for close_range()
  selftests: x86: fix [-Wstringop-overread] warn in test_process_vm_readv()
  selftests: kvm: fix mismatched fclose() after popen()
2021-11-02 22:00:17 -07:00
Linus Torvalds
d7e0a795bf ARM:
* More progress on the protected VM front, now with the full
   fixed feature set as well as the limitation of some hypercalls
   after initialisation.
 
 * Cleanup of the RAZ/WI sysreg handling, which was pointlessly
   complicated
 
 * Fixes for the vgic placement in the IPA space, together with a
   bunch of selftests
 
 * More memcg accounting of the memory allocated on behalf of a guest
 
 * Timer and vgic selftests
 
 * Workarounds for the Apple M1 broken vgic implementation
 
 * KConfig cleanups
 
 * New kvmarm.mode=none option, for those who really dislike us
 
 RISC-V:
 * New KVM port.
 
 x86:
 * New API to control TSC offset from userspace
 
 * TSC scaling for nested hypervisors on SVM
 
 * Switch masterclock protection from raw_spin_lock to seqcount
 
 * Clean up function prototypes in the page fault code and avoid
 repeated memslot lookups
 
 * Convey the exit reason to userspace on emulation failure
 
 * Configure time between NX page recovery iterations
 
 * Expose Predictive Store Forwarding Disable CPUID leaf
 
 * Allocate page tracking data structures lazily (if the i915
 KVM-GT functionality is not compiled in)
 
 * Cleanups, fixes and optimizations for the shadow MMU code
 
 s390:
 * SIGP Fixes
 
 * initial preparations for lazy destroy of secure VMs
 
 * storage key improvements/fixes
 
 * Log the guest CPNC
 
 Starting from this release, KVM-PPC patches will come from
 Michael Ellerman's PPC tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGBOiEUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNowwf/axlx3g9sgCwQHr12/6UF/7hL/RwP
 9z+pGiUzjl2YQE+RjSvLqyd6zXh+h4dOdOKbZDLSkSTbcral/8U70ojKnQsXM0XM
 1LoymxBTJqkgQBLm9LjYreEbzrPV4irk4ygEmuk3CPOHZu8xX1ei6c5LdandtM/n
 XVUkXsQY+STkmnGv4P3GcPoDththCr0tBTWrFWtxa0w9hYOxx0ay1AZFlgM4FFX0
 QFuRc8VBLoDJpIUjbkhsIRIbrlHc/YDGjuYnAU7lV/CIME8vf2BW6uBwIZJdYcDj
 0ejozLjodEnuKXQGnc8sXFioLX2gbMyQJEvwCgRvUu/EU7ncFm1lfs7THQ==
 =UxKM
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:

   - More progress on the protected VM front, now with the full fixed
     feature set as well as the limitation of some hypercalls after
     initialisation.

   - Cleanup of the RAZ/WI sysreg handling, which was pointlessly
     complicated

   - Fixes for the vgic placement in the IPA space, together with a
     bunch of selftests

   - More memcg accounting of the memory allocated on behalf of a guest

   - Timer and vgic selftests

   - Workarounds for the Apple M1 broken vgic implementation

   - KConfig cleanups

   - New kvmarm.mode=none option, for those who really dislike us

  RISC-V:

   - New KVM port.

  x86:

   - New API to control TSC offset from userspace

   - TSC scaling for nested hypervisors on SVM

   - Switch masterclock protection from raw_spin_lock to seqcount

   - Clean up function prototypes in the page fault code and avoid
     repeated memslot lookups

   - Convey the exit reason to userspace on emulation failure

   - Configure time between NX page recovery iterations

   - Expose Predictive Store Forwarding Disable CPUID leaf

   - Allocate page tracking data structures lazily (if the i915 KVM-GT
     functionality is not compiled in)

   - Cleanups, fixes and optimizations for the shadow MMU code

  s390:

   - SIGP Fixes

   - initial preparations for lazy destroy of secure VMs

   - storage key improvements/fixes

   - Log the guest CPNC

  Starting from this release, KVM-PPC patches will come from Michael
  Ellerman's PPC tree"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  RISC-V: KVM: fix boolreturn.cocci warnings
  RISC-V: KVM: remove unneeded semicolon
  RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions
  RISC-V: KVM: Factor-out FP virtualization into separate sources
  KVM: s390: add debug statement for diag 318 CPNC data
  KVM: s390: pv: properly handle page flags for protected guests
  KVM: s390: Fix handle_sske page fault handling
  KVM: x86: SGX must obey the KVM_INTERNAL_ERROR_EMULATION protocol
  KVM: x86: On emulation failure, convey the exit reason, etc. to userspace
  KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info
  KVM: x86: Clarify the kvm_run.emulation_failure structure layout
  KVM: s390: Add a routine for setting userspace CPU state
  KVM: s390: Simplify SIGP Set Arch handling
  KVM: s390: pv: avoid stalls when making pages secure
  KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm
  KVM: s390: pv: avoid double free of sida page
  KVM: s390: pv: add macros for UVC CC values
  s390/mm: optimize reset_guest_reference_bit()
  s390/mm: optimize set_guest_storage_key()
  s390/mm: no need for pte_alloc_map_lock() if we know the pmd is present
  ...
2021-11-02 11:24:14 -07:00
Linus Torvalds
cc0356d6a0 - Do not #GP on userspace use of CLI/STI but pretend it was a NOP to
keep old userspace from breaking. Adjust the corresponding iopl selftest
 to that.
 
 - Improve stack overflow warnings to say which stack got overflowed and
 raise the exception stack sizes to 2 pages since overflowing the single
 page of exception stack is very easy to do nowadays with all the tracing
 machinery enabled. With that, rip out the custom mapping of AMD SEV's
 too.
 
 - A bunch of changes in preparation for FGKASLR like supporting more
 than 64K section headers in the relocs tool, correct ORC lookup table
 size to cover the whole kernel .text and other adjustments.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF/uugACgkQEsHwGGHe
 VUroKw//e8BJ3Aun8bg00FHxfiMGbPYcozjLGDkaoMtMDZ8WlfCUrvtqYICEr8eB
 UU0eRyygAPI167dre1O9JvAcbilkNTKntaU6qbu/ZVyUwS3+Jkjwsotbqn3xKtkd
 QDDTDNiCU+beCJ2ZbspbrPgEh13+H0MwMHUfRxZB9Scpmo6aGSEaU3g295f6GX57
 VFGJ/LNov5MV1dTD7Pp/h6/Nb+R6WmflKcBzJmQxYuKyKX+g1xsSv0VSga+t+uf3
 M9pUkizqTiUxzC2eLgtcEZTqqBHu810E8M76FmhKBUMilsFJT5YAJTiqyahwHXds
 HYarOFRgcnFuJPd29vn8UHjqeeoi6ru8GtcZYzccEc7U3ku/gXPaDJ9ffmvhs7vU
 pJX5Um3GiiFm0w/ZZOKDqh78wRAsCKLN+jIoyszuhkkNchZSj/jKfOgdd3EmcZst
 6L6rxBA4oRHwNOgM7uVMp+jFeRe1/prR280OWWH0D4QmmuqybThOdO23Iuh/Deth
 W3qPUH3UQtfSWxGy2yODzJ1ciuGAr/AzJZ9zjg04e3Vl0DkEpyWtLKJiG3ClXZag
 Nj+3xc4xYH2Aw+M0HRaONk5XVKLpqVjuAfgU5iLQa0YSUbtrR+wCWvY8KgQNbAqK
 xZmzYzQ89stwVCuGKx10gPsL3jSJ3VCylMfqdHD2Ajmld1yApr0=
 =DOZU
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Borislav Petkov:

 - Do not #GP on userspace use of CLI/STI but pretend it was a NOP to
   keep old userspace from breaking. Adjust the corresponding iopl
   selftest to that.

 - Improve stack overflow warnings to say which stack got overflowed and
   raise the exception stack sizes to 2 pages since overflowing the
   single page of exception stack is very easy to do nowadays with all
   the tracing machinery enabled. With that, rip out the custom mapping
   of AMD SEV's too.

 - A bunch of changes in preparation for FGKASLR like supporting more
   than 64K section headers in the relocs tool, correct ORC lookup table
   size to cover the whole kernel .text and other adjustments.

* tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/x86/iopl: Adjust to the faked iopl CLI/STI usage
  vmlinux.lds.h: Have ORC lookup cover entire _etext - _stext
  x86/boot/compressed: Avoid duplicate malloc() implementations
  x86/boot: Allow a "silent" kaslr random byte fetch
  x86/tools/relocs: Support >64K section headers
  x86/sev: Make the #VC exception stacks part of the default stacks storage
  x86: Increase exception stack sizes
  x86/mm/64: Improve stack overflow warnings
  x86/iopl: Fake iopl(3) CLI/STI usage
2021-11-02 07:56:47 -07:00
Linus Torvalds
fc02cb2b37 Core:
- Remove socket skb caches
 
  - Add a SO_RESERVE_MEM socket op to forward allocate buffer space
    and avoid memory accounting overhead on each message sent
 
  - Introduce managed neighbor entries - added by control plane and
    resolved by the kernel for use in acceleration paths (BPF / XDP
    right now, HW offload users will benefit as well)
 
  - Make neighbor eviction on link down controllable by userspace
    to work around WiFi networks with bad roaming implementations
 
  - vrf: Rework interaction with netfilter/conntrack
 
  - fq_codel: implement L4S style ce_threshold_ect1 marking
 
  - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
 
 BPF:
 
  - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
    as implemented in LLVM14
 
  - Introduce bpf_get_branch_snapshot() to capture Last Branch Records
 
  - Implement variadic trace_printk helper
 
  - Add a new Bloomfilter map type
 
  - Track <8-byte scalar spill and refill
 
  - Access hw timestamp through BPF's __sk_buff
 
  - Disallow unprivileged BPF by default
 
  - Document BPF licensing
 
 Netfilter:
 
  - Introduce egress hook for looking at raw outgoing packets
 
  - Allow matching on and modifying inner headers / payload data
 
  - Add NFT_META_IFTYPE to match on the interface type either from
    ingress or egress
 
 Protocols:
 
  - Multi-Path TCP:
    - increase default max additional subflows to 2
    - rework forward memory allocation
    - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
 
  - MCTP flow support allowing lower layer drivers to configure msg
    muxing as needed
 
  - Automatic Multicast Tunneling (AMT) driver based on RFC7450
 
  - HSR support the redbox supervision frames (IEC-62439-3:2018)
 
  - Support for the ip6ip6 encapsulation of IOAM
 
  - Netlink interface for CAN-FD's Transmitter Delay Compensation
 
  - Support SMC-Rv2 eliminating the current same-subnet restriction,
    by exploiting the UDP encapsulation feature of RoCE adapters
 
  - TLS: add SM4 GCM/CCM crypto support
 
  - Bluetooth: initial support for link quality and audio/codec
    offload
 
 Driver APIs:
 
  - Add a batched interface for RX buffer allocation in AF_XDP
    buffer pool
 
  - ethtool: Add ability to control transceiver modules' power mode
 
  - phy: Introduce supported interfaces bitmap to express MAC
    capabilities and simplify PHY code
 
  - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
 
 New drivers:
 
  - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
 
  - Ethernet driver for ASIX AX88796C SPI device (x88796c)
 
 Drivers:
 
  - Broadcom PHYs
    - support 72165, 7712 16nm PHYs
    - support IDDQ-SR for additional power savings
 
  - PHY support for QCA8081, QCA9561 PHYs
 
  - NXP DPAA2: support for IRQ coalescing
 
  - NXP Ethernet (enetc): support for software TCP segmentation
 
  - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
    Gigabit-capable IP found on RZ/G2L SoC
 
  - Intel 100G Ethernet
    - support for eswitch offload of TC/OvS flow API, including
      offload of GRE, VxLAN, Geneve tunneling
    - support application device queues - ability to assign Rx and Tx
      queues to application threads
    - PTP and PPS (pulse-per-second) extensions
 
  - Broadcom Ethernet (bnxt)
    - devlink health reporting and device reload extensions
 
  - Mellanox Ethernet (mlx5)
    - offload macvlan interfaces
    - support HW offload of TC rules involving OVS internal ports
    - support HW-GRO and header/data split
    - support application device queues
 
  - Marvell OcteonTx2:
    - add XDP support for PF
    - add PTP support for VF
 
  - Qualcomm Ethernet switch (qca8k): support for QCA8328
 
  - Realtek Ethernet DSA switch (rtl8366rb)
    - support bridge offload
    - support STP, fast aging, disabling address learning
    - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
 
  - Mellanox Ethernet/IB switch (mlxsw)
    - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
    - offload root TBF qdisc as port shaper
    - support multiple routing interface MAC address prefixes
    - support for IP-in-IP with IPv6 underlay
 
  - MediaTek WiFi (mt76)
    - mt7921 - ASPM, 6GHz, SDIO and testmode support
    - mt7915 - LED and TWT support
 
  - Qualcomm WiFi (ath11k)
    - include channel rx and tx time in survey dump statistics
    - support for 80P80 and 160 MHz bandwidths
    - support channel 2 in 6 GHz band
    - spectral scan support for QCN9074
    - support for rx decapsulation offload (data frames in 802.3
      format)
 
  - Qualcomm phone SoC WiFi (wcn36xx)
    - enable Idle Mode Power Save (IMPS) to reduce power consumption
      during idle
 
  - Bluetooth driver support for MediaTek MT7922 and MT7921
 
  - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
    and Realtek 8822C/8852A
 
  - Microsoft vNIC driver (mana)
    - support hibernation and kexec
 
  - Google vNIC driver (gve)
    - support for jumbo frames
    - implement Rx page reuse
 
 Refactor:
 
  - Make all writes to netdev->dev_addr go thru helpers, so that we
    can add this address to the address rbtree and handle the updates
 
  - Various TCP cleanups and optimizations including improvements
    to CPU cache use
 
  - Simplify the gnet_stats, Qdisc stats' handling and remove
    qdisc->running sequence counter
 
  - Driver changes and API updates to address devlink locking
    deficiencies
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
 IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
 wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
 wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
 n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
 lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
 +qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
 6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
 QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
 0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
 QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
 84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
 =srde
 -----END PGP SIGNATURE-----

Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Remove socket skb caches

   - Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
     avoid memory accounting overhead on each message sent

   - Introduce managed neighbor entries - added by control plane and
     resolved by the kernel for use in acceleration paths (BPF / XDP
     right now, HW offload users will benefit as well)

   - Make neighbor eviction on link down controllable by userspace to
     work around WiFi networks with bad roaming implementations

   - vrf: Rework interaction with netfilter/conntrack

   - fq_codel: implement L4S style ce_threshold_ect1 marking

   - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()

  BPF:

   - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
     as implemented in LLVM14

   - Introduce bpf_get_branch_snapshot() to capture Last Branch Records

   - Implement variadic trace_printk helper

   - Add a new Bloomfilter map type

   - Track <8-byte scalar spill and refill

   - Access hw timestamp through BPF's __sk_buff

   - Disallow unprivileged BPF by default

   - Document BPF licensing

  Netfilter:

   - Introduce egress hook for looking at raw outgoing packets

   - Allow matching on and modifying inner headers / payload data

   - Add NFT_META_IFTYPE to match on the interface type either from
     ingress or egress

  Protocols:

   - Multi-Path TCP:
      - increase default max additional subflows to 2
      - rework forward memory allocation
      - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS

   - MCTP flow support allowing lower layer drivers to configure msg
     muxing as needed

   - Automatic Multicast Tunneling (AMT) driver based on RFC7450

   - HSR support the redbox supervision frames (IEC-62439-3:2018)

   - Support for the ip6ip6 encapsulation of IOAM

   - Netlink interface for CAN-FD's Transmitter Delay Compensation

   - Support SMC-Rv2 eliminating the current same-subnet restriction, by
     exploiting the UDP encapsulation feature of RoCE adapters

   - TLS: add SM4 GCM/CCM crypto support

   - Bluetooth: initial support for link quality and audio/codec offload

  Driver APIs:

   - Add a batched interface for RX buffer allocation in AF_XDP buffer
     pool

   - ethtool: Add ability to control transceiver modules' power mode

   - phy: Introduce supported interfaces bitmap to express MAC
     capabilities and simplify PHY code

   - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks

  New drivers:

   - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)

   - Ethernet driver for ASIX AX88796C SPI device (x88796c)

  Drivers:

   - Broadcom PHYs
      - support 72165, 7712 16nm PHYs
      - support IDDQ-SR for additional power savings

   - PHY support for QCA8081, QCA9561 PHYs

   - NXP DPAA2: support for IRQ coalescing

   - NXP Ethernet (enetc): support for software TCP segmentation

   - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
     Gigabit-capable IP found on RZ/G2L SoC

   - Intel 100G Ethernet
      - support for eswitch offload of TC/OvS flow API, including
        offload of GRE, VxLAN, Geneve tunneling
      - support application device queues - ability to assign Rx and Tx
        queues to application threads
      - PTP and PPS (pulse-per-second) extensions

   - Broadcom Ethernet (bnxt)
      - devlink health reporting and device reload extensions

   - Mellanox Ethernet (mlx5)
      - offload macvlan interfaces
      - support HW offload of TC rules involving OVS internal ports
      - support HW-GRO and header/data split
      - support application device queues

   - Marvell OcteonTx2:
      - add XDP support for PF
      - add PTP support for VF

   - Qualcomm Ethernet switch (qca8k): support for QCA8328

   - Realtek Ethernet DSA switch (rtl8366rb)
      - support bridge offload
      - support STP, fast aging, disabling address learning
      - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch

   - Mellanox Ethernet/IB switch (mlxsw)
      - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
      - offload root TBF qdisc as port shaper
      - support multiple routing interface MAC address prefixes
      - support for IP-in-IP with IPv6 underlay

   - MediaTek WiFi (mt76)
      - mt7921 - ASPM, 6GHz, SDIO and testmode support
      - mt7915 - LED and TWT support

   - Qualcomm WiFi (ath11k)
      - include channel rx and tx time in survey dump statistics
      - support for 80P80 and 160 MHz bandwidths
      - support channel 2 in 6 GHz band
      - spectral scan support for QCN9074
      - support for rx decapsulation offload (data frames in 802.3
        format)

   - Qualcomm phone SoC WiFi (wcn36xx)
      - enable Idle Mode Power Save (IMPS) to reduce power consumption
        during idle

   - Bluetooth driver support for MediaTek MT7922 and MT7921

   - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
     Realtek 8822C/8852A

   - Microsoft vNIC driver (mana)
      - support hibernation and kexec

   - Google vNIC driver (gve)
      - support for jumbo frames
      - implement Rx page reuse

  Refactor:

   - Make all writes to netdev->dev_addr go thru helpers, so that we can
     add this address to the address rbtree and handle the updates

   - Various TCP cleanups and optimizations including improvements to
     CPU cache use

   - Simplify the gnet_stats, Qdisc stats' handling and remove
     qdisc->running sequence counter

   - Driver changes and API updates to address devlink locking
     deficiencies"

* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
  Revert "net: avoid double accounting for pure zerocopy skbs"
  selftests: net: add arp_ndisc_evict_nocarrier
  net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
  net: arp: introduce arp_evict_nocarrier sysctl parameter
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
  net: avoid double accounting for pure zerocopy skbs
  tcp: rename sk_wmem_free_skb
  netdevsim: fix uninit value in nsim_drv_configure_vfs()
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  ...
2021-11-02 06:20:58 -07:00
Linus Torvalds
6fedc28076 RCU pull request for v5.16
This pull request contains the following branches:
 
 fixes.2021.10.07a: Miscellaneous fixes.
 
 scftorture.2021.09.16a: smp_call_function torture-test updates, most
 	notably better checking of module parameters.
 
 tasks.2021.09.15a: Tasks-trace RCU updates that fix a number of rare
 	but important race-condition bugs.
 
 torture.2021.09.13b: Other torture-test updates, most notably
 	better checking of module parameters.  In addition, rcutorture
 	may now be run on CONFIG_PREEMPT_RT kernels.
 
 torturescript.2021.09.16a: Torture-test scripting updates, most notably
 	specifying the new CONFIG_KCSAN_STRICT kconfig option rather
 	than maintaining an ever-changing list of individual KCSAN
 	kconfig options.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmGAVMMTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jGJBD/9ld6USOpedBLAbTYVMQYvIKoSqqDIG
 74ZFhKvZ5I6Y8OZAGxXjb5U06rh4V2brlTN7IJ7XLEA1t401ENffsGeQSCxEmpEf
 PqQN04dbmVvaWjD4jiLZCcl3oDp+w1gIKwmX6wh0Weogr3KZWu5aNvD5tl9qIz4a
 uPC1JqTBxf7WDrLhqNxG5N4MXs27+KvukCd9wftk3NTzRJ9tyLM/YNGOVArM8rW2
 QpEh8n6veB5dEoXBxmRHzuxYHN1k0Fhkbm3irMjcI0T5wj8TDod89zbg9mdFXMIj
 AjZ9CGpIBa4frThdu654ZNuEQHDCsPWtMi925xNOWxh5lkPGjeWnwYpcRrwfI2pj
 op0xVlur+Nam5CT/AJNT9+KogpZthAWXvwqCs5GbYNSU30Rlw99bw1vyAsJUD+af
 Mv08/z4o7Kuhr4cw2vkd2UfF9zuIQsJ1jWCIjMxfj4ctBnIpedrEnEISp8Y61fWk
 w9vXgCRhZCSkxoURoNss+nAUsiePUafptsvqKLu6Z53ufPA5yL0rVS778xq8vurP
 Xyd34TVlQ94ydZDC5pkSNpri1HGV1U7pztFwey5GloE66iV+7TSQCfMhzLd4CM0K
 wW96wimHrDtIxD6LedCZOHLHkS9AJd7F9uSoNodKspTH0tJowQztrzPW1eZifDE3
 iJP8xcJ+vL67Og==
 =nmaP
 -----END PGP SIGNATURE-----

Merge tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Miscellaneous fixes

 - Torture-test updates for smp_call_function(), most notably improved
   checking of module parameters.

 - Tasks-trace RCU updates that fix a number of rare but important
   race-condition bugs.

 - Other torture-test updates, most notably better checking of module
   parameters. In addition, rcutorture may once again be run on
   CONFIG_PREEMPT_RT kernels.

 - Torture-test scripting updates, most notably specifying the new
   CONFIG_KCSAN_STRICT kconfig option rather than maintaining an
   ever-changing list of individual KCSAN kconfig options.

* tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (46 commits)
  rcu: Fix rcu_dynticks_curr_cpu_in_eqs() vs noinstr
  rcu: Always inline rcu_dynticks_task*_{enter,exit}()
  torture: Make kvm-remote.sh print size of downloaded tarball
  torture: Allot 1G of memory for scftorture runs
  tools/rcu: Add an extract-stall script
  scftorture: Warn on individual scf_torture_init() error conditions
  scftorture: Count reschedule IPIs
  scftorture: Account for weight_resched when checking for all zeroes
  scftorture: Shut down if nonsensical arguments given
  scftorture: Allow zero weight to exclude an smp_call_function*() category
  rcu: Avoid unneeded function call in rcu_read_unlock()
  rcu-tasks: Update comments to cond_resched_tasks_rcu_qs()
  rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader
  rcu-tasks: Fix read-side primitives comment for call_rcu_tasks_trace
  rcu-tasks: Clarify read side section info for rcu_tasks_rude GP primitives
  rcu-tasks: Correct comparisons for CPU numbers in show_stalled_task_trace
  rcu-tasks: Correct firstreport usage in check_all_holdout_tasks_trace
  rcu-tasks: Fix s/rcu_add_holdout/trc_add_holdout/ typo in comment
  rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop
  rcu-tasks: Fix s/instruction/instructions/ typo in comment
  ...
2021-11-01 20:25:38 -07:00
Linus Torvalds
79ef0c0014 Tracing updates for 5.16:
- kprobes: Restructured stack unwinder to show properly on x86 when a stack
   dump happens from a kretprobe callback.
 
 - Fix to bootconfig parsing
 
 - Have tracefs allow owner and group permissions by default (only denying
   others). There's been pressure to allow non root to tracefs in a
   controlled fashion, and using groups is probably the safest.
 
 - Bootconfig memory managament updates.
 
 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.
 
 - Allow perf to be traced by function tracer.
 
 - Rewrite of function graph tracer to be a callback from the function tracer
   instead of having its own trampoline (this change will happen on an arch
   by arch basis, and currently only x86_64 implements it).
 
 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.
 
 - Allow histogram triggers to add variables that can perform calculations
   against the event's fields.
 
 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent warnings
   from the compiler.
 
 - Extend histogram triggers to key off of variables.
 
 - Have trace recursion use bit magic to determine preempt context over if
   branches.
 
 - Have trace recursion disable preemption as all use cases do anyway.
 
 - Added testing for verification of tracing utilities.
 
 - Various small clean ups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYYBdxhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qp1sAQD2oYFwaG3sx872gj/myBcHIBSKdiki
 Hry5csd8zYDBpgD+Poylopt5JIbeDuoYw/BedgEXmscZ8Qr7VzjAXdnv/Q4=
 =Loz8
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - kprobes: Restructured stack unwinder to show properly on x86 when a
   stack dump happens from a kretprobe callback.

 - Fix to bootconfig parsing

 - Have tracefs allow owner and group permissions by default (only
   denying others). There's been pressure to allow non root to tracefs
   in a controlled fashion, and using groups is probably the safest.

 - Bootconfig memory managament updates.

 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.

 - Allow perf to be traced by function tracer.

 - Rewrite of function graph tracer to be a callback from the function
   tracer instead of having its own trampoline (this change will happen
   on an arch by arch basis, and currently only x86_64 implements it).

 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.

 - Allow histogram triggers to add variables that can perform
   calculations against the event's fields.

 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent
   warnings from the compiler.

 - Extend histogram triggers to key off of variables.

 - Have trace recursion use bit magic to determine preempt context over
   if branches.

 - Have trace recursion disable preemption as all use cases do anyway.

 - Added testing for verification of tracing utilities.

 - Various small clean ups and fixes.

* tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits)
  tracing/histogram: Fix semicolon.cocci warnings
  tracing/histogram: Fix documentation inline emphasis warning
  tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together
  tracing: Show size of requested perf buffer
  bootconfig: Initialize ret in xbc_parse_tree()
  ftrace: do CPU checking after preemption disabled
  ftrace: disable preemption when recursion locked
  tracing/histogram: Document expression arithmetic and constants
  tracing/histogram: Optimize division by a power of 2
  tracing/histogram: Covert expr to const if both operands are constants
  tracing/histogram: Simplify handling of .sym-offset in expressions
  tracing: Fix operator precedence for hist triggers expression
  tracing: Add division and multiplication support for hist triggers
  tracing: Add support for creating hist trigger variables from literal
  selftests/ftrace: Stop tracing while reading the trace file by default
  MAINTAINERS: Update KPROBES and TRACING entries
  test_kprobes: Move it from kernel/ to lib/
  docs, kprobes: Remove invalid URL and add new reference
  samples/kretprobes: Fix return value if register_kretprobe() failed
  lib/bootconfig: Fix the xbc_get_info kerneldoc
  ...
2021-11-01 20:05:19 -07:00
Jakub Kicinski
8a33dcc2f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in the fixes we had queued in case there was another -rc.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 20:05:14 -07:00
Jakub Kicinski
b7b98f8689 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-11-01

We've added 181 non-merge commits during the last 28 day(s) which contain
a total of 280 files changed, 11791 insertions(+), 5879 deletions(-).

The main changes are:

1) Fix bpf verifier propagation of 64-bit bounds, from Alexei.

2) Parallelize bpf test_progs, from Yucong and Andrii.

3) Deprecate various libbpf apis including af_xdp, from Andrii, Hengqi, Magnus.

4) Improve bpf selftests on s390, from Ilya.

5) bloomfilter bpf map type, from Joanne.

6) Big improvements to JIT tests especially on Mips, from Johan.

7) Support kernel module function calls from bpf, from Kumar.

8) Support typeless and weak ksym in light skeleton, from Kumar.

9) Disallow unprivileged bpf by default, from Pawan.

10) BTF_KIND_DECL_TAG support, from Yonghong.

11) Various bpftool cleanups, from Quentin.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (181 commits)
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  bpf: Factor out helpers for ctx access checking
  bpf: Factor out a helper to prepare trampoline for struct_ops prog
  selftests, bpf: Fix broken riscv build
  riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h
  tools, build: Add RISC-V to HOSTARCH parsing
  riscv, bpf: Increase the maximum number of iterations
  selftests, bpf: Add one test for sockmap with strparser
  selftests, bpf: Fix test_txmsg_ingress_parser error
  ...
====================

Link: https://lore.kernel.org/r/20211102013123.9005-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:59:46 -07:00
James Prestwood
f86ca07eb5 selftests: net: add arp_ndisc_evict_nocarrier
This tests the sysctl options for ARP/ND:

/net/ipv4/conf/<iface>/arp_evict_nocarrier
/net/ipv4/conf/all/arp_evict_nocarrier
/net/ipv6/conf/<iface>/ndisc_evict_nocarrier
/net/ipv6/conf/all/ndisc_evict_nocarrier

Signed-off-by: James Prestwood <prestwoj@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:57:15 -07:00
Alexei Starovoitov
0869e5078a selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
./test_progs-no_alu32 -vv -t twfw

Before the 64-bit_into_32-bit fix:
19: (25) if r1 > 0x3f goto pc+6
 R1_w=inv(id=0,umax_value=63,var_off=(0x0; 0xff),s32_max_value=255,u32_max_value=255)

and eventually:

invalid access to map value, value_size=8 off=7 size=8
R6 max value is outside of the allowed memory range
libbpf: failed to load object 'no_alu32/twfw.o'

After the fix:
19: (25) if r1 > 0x3f goto pc+6
 R1_w=inv(id=0,umax_value=63,var_off=(0x0; 0x3f))

verif_twfw:OK

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211101222153.78759-3-alexei.starovoitov@gmail.com
2021-11-01 18:05:12 -07:00
Alexei Starovoitov
b9979db834 bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
Before this fix:
166: (b5) if r2 <= 0x1 goto pc+22
from 166 to 189: R2=invP(id=1,umax_value=1,var_off=(0x0; 0xffffffff))

After this fix:
166: (b5) if r2 <= 0x1 goto pc+22
from 166 to 189: R2=invP(id=1,umax_value=1,var_off=(0x0; 0x1))

While processing BPF_JLE the reg_set_min_max() would set true_reg->umax_value = 1
and call __reg_combine_64_into_32(true_reg).

Without the fix it would not pass the condition:
if (__reg64_bound_u32(reg->umin_value) && __reg64_bound_u32(reg->umax_value))

since umin_value == 0 at this point.
Before commit 10bf4e8316 the umin was incorrectly ingored.
The commit 10bf4e8316 fixed the correctness issue, but pessimized
propagation of 64-bit min max into 32-bit min max and corresponding var_off.

Fixes: 10bf4e8316 ("bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211101222153.78759-1-alexei.starovoitov@gmail.com
2021-11-01 18:05:11 -07:00
Kalesh Singh
4e9f63c9e5 tracing/selftests: Add tests for hist trigger expression parsing
Add tests for the parsing of hist trigger expressions; and to
validate expression evaluation.

Link: https://lkml.kernel.org/r/20211029183339.3216491-5-kaleshsingh@google.com

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-11-01 20:47:00 -04:00
Linus Torvalds
46f8763228 arm64 updates for 5.16
- Support for the Arm8.6 timer extensions, including a self-synchronising
   view of the system registers to elide some expensive ISB instructions.
 
 - Exception table cleanup and rework so that the fixup handlers appear
   correctly in backtraces.
 
 - A handful of miscellaneous changes, the main one being selection of
   CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
 
 - More mm and pgtable cleanups.
 
 - KASAN support for "asymmetric" MTE, where tag faults are reported
   synchronously for loads (via an exception) and asynchronously for
   stores (via a register).
 
 - Support for leaving the MMU enabled during kexec relocation, which
   significantly speeds up the operation.
 
 - Minor improvements to our perf PMU drivers.
 
 - Improvements to the compat vDSO build system, particularly when
   building with LLVM=1.
 
 - Preparatory work for handling some Coresight TRBE tracing errata.
 
 - Cleanup and refactoring of the SVE code to pave the way for SME
   support in future.
 
 - Ensure SCS pages are unpoisoned immediately prior to freeing them
   when KASAN is enabled for the vmalloc area.
 
 - Try moving to the generic pfn_valid() implementation again now that
   the DMA mapping issue from last time has been resolved.
 
 - Numerous improvements and additions to our FPSIMD and SVE selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmF74ZYQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNI/eB/UZYAtmNi6xC5StPaETyMLeZph9BV/IqIFq
 N71ds7MFzlX/agR6MwLbH2tBHezBtlQ90O732Jjz8zAec2cHd+7sx/w82JesX7PB
 IuOfqP78rvtU4ZkKe1Rcd96QtYvbtNAqcRhIo95OzfV9xwuzkvdXI+ZTYhtCfCuZ
 GozCqQoJtnNDayMtfzbDSXyJLNJc/qnIcUQhrt3vg12zbF3BcHxnmp0nBcHCqZEo
 lDJYufju7p87kCzaFYda2WhlI3t+NThqKOiZ332wQfqzNcr+rw1Y4jWbnCfrdLtI
 JfHT9yiuHDmFSYaJrk7NU8kftW31NV70bbhD7rZ+DQCVndl0lRc=
 =3R3j
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's the usual summary below, but the highlights are support for
  the Armv8.6 timer extensions, KASAN support for asymmetric MTE, the
  ability to kexec() with the MMU enabled and a second attempt at
  switching to the generic pfn_valid() implementation.

  Summary:

   - Support for the Arm8.6 timer extensions, including a
     self-synchronising view of the system registers to elide some
     expensive ISB instructions.

   - Exception table cleanup and rework so that the fixup handlers
     appear correctly in backtraces.

   - A handful of miscellaneous changes, the main one being selection of
     CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.

   - More mm and pgtable cleanups.

   - KASAN support for "asymmetric" MTE, where tag faults are reported
     synchronously for loads (via an exception) and asynchronously for
     stores (via a register).

   - Support for leaving the MMU enabled during kexec relocation, which
     significantly speeds up the operation.

   - Minor improvements to our perf PMU drivers.

   - Improvements to the compat vDSO build system, particularly when
     building with LLVM=1.

   - Preparatory work for handling some Coresight TRBE tracing errata.

   - Cleanup and refactoring of the SVE code to pave the way for SME
     support in future.

   - Ensure SCS pages are unpoisoned immediately prior to freeing them
     when KASAN is enabled for the vmalloc area.

   - Try moving to the generic pfn_valid() implementation again now that
     the DMA mapping issue from last time has been resolved.

   - Numerous improvements and additions to our FPSIMD and SVE
     selftests"

[ armv8.6 timer updates were in a shared branch and already came in
  through -tip in the timer pull  - Linus ]

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (85 commits)
  arm64: Select POSIX_CPU_TIMERS_TASK_WORK
  arm64: Document boot requirements for FEAT_SME_FA64
  arm64/sve: Fix warnings when SVE is disabled
  arm64/sve: Add stub for sve_max_virtualisable_vl()
  arm64: errata: Add detection for TRBE write to out-of-range
  arm64: errata: Add workaround for TSB flush failures
  arm64: errata: Add detection for TRBE overwrite in FILL mode
  arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
  selftests: arm64: Factor out utility functions for assembly FP tests
  arm64: vmlinux.lds.S: remove `.fixup` section
  arm64: extable: add load_unaligned_zeropad() handler
  arm64: extable: add a dedicated uaccess handler
  arm64: extable: add `type` and `data` fields
  arm64: extable: use `ex` for `exception_table_entry`
  arm64: extable: make fixup_exception() return bool
  arm64: extable: consolidate definitions
  arm64: gpr-num: support W registers
  arm64: factor out GPR numbering helpers
  arm64: kvm: use kvm_exception_table_entry
  arm64: lib: __arch_copy_to_user(): fold fixups into body
  ...
2021-11-01 16:33:53 -07:00
Andrii Nakryiko
a20eac0af0 selftests/bpf: Fix also no-alu32 strobemeta selftest
Previous fix aded bpf_clamp_umax() helper use to re-validate boundaries.
While that works correctly, it introduces more branches, which blows up
past 1 million instructions in no-alu32 variant of strobemeta selftests.

Switching len variable from u32 to u64 also fixes the issue and reduces
the number of validated instructions, so use that instead. Fix this
patch and bpf_clamp_umax() removed, both alu32 and no-alu32 selftests
pass.

Fixes: 0133c20480 ("selftests/bpf: Fix strobemeta selftest regression")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211101230118.1273019-1-andrii@kernel.org
2021-11-01 16:07:27 -07:00
Joanne Koong
7a67087250 selftests/bpf: Add bloom map success test for userspace calls
This patch has two changes:
1) Adds a new function "test_success_cases" to test
successfully creating + adding + looking up a value
in a bloom filter map from the userspace side.

2) Use bpf_create_map instead of bpf_create_map_xattr in
the "test_fail_cases" and test_inner_map to make the
code look cleaner.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-4-joannekoong@fb.com
2021-11-01 14:16:03 -07:00
Hou Tao
31122b2f76 selftests/bpf: Add test cases for struct_ops prog
Running a BPF_PROG_TYPE_STRUCT_OPS prog for dummy_st_ops::test_N()
through bpf_prog_test_run(). Four test cases are added:
(1) attach dummy_st_ops should fail
(2) function return value of bpf_dummy_ops::test_1() is expected
(3) pointer argument of bpf_dummy_ops::test_1() works as expected
(4) multiple arguments passed to bpf_dummy_ops::test_2() are correct

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211025064025.2567443-5-houtao1@huawei.com
2021-11-01 14:10:00 -07:00
Linus Torvalds
8cb1ae19bf x86/fpu updates:
- Cleanup of extable fixup handling to be more robust, which in turn
    allows to make the FPU exception fixups more robust as well.
 
  - Change the return code for signal frame related failures from explicit
    error codes to a boolean fail/success as that's all what the calling
    code evaluates.
 
  - A large refactoring of the FPU code to prepare for adding AMX support:
 
    - Distangle the public header maze and remove especially the misnomed
      kitchen sink internal.h which is despite it's name included all over
      the place.
 
    - Add a proper abstraction for the register buffer storage (struct
      fpstate) which allows to dynamically size the buffer at runtime by
      flipping the pointer to the buffer container from the default
      container which is embedded in task_struct::tread::fpu to a
      dynamically allocated container with a larger register buffer.
 
    - Convert the code over to the new fpstate mechanism.
 
    - Consolidate the KVM FPU handling by moving the FPU related code into
      the FPU core which removes the number of exports and avoids adding
      even more export when AMX has to be supported in KVM. This also
      removes duplicated code which was of course unnecessary different and
      incomplete in the KVM copy.
 
    - Simplify the KVM FPU buffer handling by utilizing the new fpstate
      container and just switching the buffer pointer from the user space
      buffer to the KVM guest buffer when entering vcpu_run() and flipping
      it back when leaving the function. This cuts the memory requirements
      of a vCPU for FPU buffers in half and avoids pointless memory copy
      operations.
 
      This also solves the so far unresolved problem of adding AMX support
      because the current FPU buffer handling of KVM inflicted a circular
      dependency between adding AMX support to the core and to KVM.  With
      the new scheme of switching fpstate AMX support can be added to the
      core code without affecting KVM.
 
    - Replace various variables with proper data structures so the extra
      information required for adding dynamically enabled FPU features (AMX)
      can be added in one place
 
  - Add AMX (Advanved Matrix eXtensions) support (finally):
 
     AMX is a large XSTATE component which is going to be available with
     Saphire Rapids XEON CPUs. The feature comes with an extra MSR (MSR_XFD)
     which allows to trap the (first) use of an AMX related instruction,
     which has two benefits:
 
     1) It allows the kernel to control access to the feature
 
     2) It allows the kernel to dynamically allocate the large register
        state buffer instead of burdening every task with the the extra 8K
        or larger state storage.
 
     It would have been great to gain this kind of control already with
     AVX512.
 
     The support comes with the following infrastructure components:
 
     1) arch_prctl() to
        - read the supported features (equivalent to XGETBV(0))
        - read the permitted features for a task
        - request permission for a dynamically enabled feature
 
        Permission is granted per process, inherited on fork() and cleared
        on exec(). The permission policy of the kernel is restricted to
        sigaltstack size validation, but the syscall obviously allows
        further restrictions via seccomp etc.
 
     2) A stronger sigaltstack size validation for sys_sigaltstack(2) which
        takes granted permissions and the potentially resulting larger
        signal frame into account. This mechanism can also be used to
        enforce factual sigaltstack validation independent of dynamic
        features to help with finding potential victims of the 2K
        sigaltstack size constant which is broken since AVX512 support was
        added.
 
     3) Exception handling for #NM traps to catch first use of a extended
        feature via a new cause MSR. If the exception was caused by the use
        of such a feature, the handler checks permission for that
        feature. If permission has not been granted, the handler sends a
        SIGILL like the #UD handler would do if the feature would have been
        disabled in XCR0. If permission has been granted, then a new fpstate
        which fits the larger buffer requirement is allocated.
 
        In the unlikely case that this allocation fails, the handler sends
        SIGSEGV to the task. That's not elegant, but unavoidable as the
        other discussed options of preallocation or full per task
        permissions come with their own set of horrors for kernel and/or
        userspace. So this is the lesser of the evils and SIGSEGV caused by
        unexpected memory allocation failures is not a fundamentally new
        concept either.
 
        When allocation succeeds, the fpstate properties are filled in to
        reflect the extended feature set and the resulting sizes, the
        fpu::fpstate pointer is updated accordingly and the trap is disarmed
        for this task permanently.
 
     4) Enumeration and size calculations
 
     5) Trap switching via MSR_XFD
 
        The XFD (eXtended Feature Disable) MSR is context switched with the
        same life time rules as the FPU register state itself. The mechanism
        is keyed off with a static key which is default disabled so !AMX
        equipped CPUs have zero overhead. On AMX enabled CPUs the overhead
        is limited by comparing the tasks XFD value with a per CPU shadow
        variable to avoid redundant MSR writes. In case of switching from a
        AMX using task to a non AMX using task or vice versa, the extra MSR
        write is obviously inevitable.
 
        All other places which need to be aware of the variable feature sets
        and resulting variable sizes are not affected at all because they
        retrieve the information (feature set, sizes) unconditonally from
        the fpstate properties.
 
     6) Enable the new AMX states
 
   Note, this is relatively new code despite the fact that AMX support is in
   the works for more than a year now.
 
   The big refactoring of the FPU code, which allowed to do a proper
   integration has been started exactly 3 weeks ago. Refactoring of the
   existing FPU code and of the original AMX patches took a week and has
   been subject to extensive review and testing. The only fallout which has
   not been caught in review and testing right away was restricted to AMX
   enabled systems, which is completely irrelevant for anyone outside Intel
   and their early access program. There might be dragons lurking as usual,
   but so far the fine grained refactoring has held up and eventual yet
   undetected fallout is bisectable and should be easily addressable before
   the 5.16 release. Famous last words...
 
   Many thanks to Chang Bae and Dave Hansen for working hard on this and
   also to the various test teams at Intel who reserved extra capacity to
   follow the rapid development of this closely which provides the
   confidence level required to offer this rather large update for inclusion
   into 5.16-rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/NkITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodDkEADH4+/nN/QoSUHIuuha5Zptj3g2b16a
 /3TxT9fhwPen/kzMGsUk70s3iWJMA+I5dCfkSZexJ2hfhcRe9cBzZIa1HCawKwf3
 YCISTsO/M+LpeORuZ+TpfFLJKnxNr1SEOl+EYffGhq0AkCjifb9Cnr0JZuoMUzGU
 jpfJZ2bj28ri5lG812DtzSMBM9E3SAwgJv+GNjmZbxZKb9mAfhbAMdBUXHirX7Ej
 jmx6koQjYOKwYIW8w1BrdC270lUKQUyJTbQgdRkN9Mh/HnKyFixQ18JqGlgaV2cT
 EtYePUfTEdaHdAhUINLIlEug1MfOslHU+HyGsdywnoChNB4GHPQuePC5Tz60VeFN
 RbQ9aKcBUu8r95rjlnKtAtBijNMA4bjGwllVxNwJ/ZoA9RPv1SbDZ07RX3qTaLVY
 YhVQl8+shD33/W24jUTJv1kMMexpHXIlv0gyfMryzpwI7uzzmGHRPAokJdbYKctC
 dyMPfdE90rxTiMUdL/1IQGhnh3awjbyfArzUhHyQ++HyUyzCFh0slsO0CD18vUy8
 FofhCugGBhjuKw3XwLNQ+KsWURz5qHctSzBc3qMOSyqFHbAJCVRANkhsFvWJo2qL
 75+Z7OTRebtsyOUZIdq26r4roSxHrps3dupWTtN70HWx2NhQG1nLEw986QYiQu1T
 hcKvDmehQLrUvg==
 =x3WL
 -----END PGP SIGNATURE-----

Merge tag 'x86-fpu-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu updates from Thomas Gleixner:

 - Cleanup of extable fixup handling to be more robust, which in turn
   allows to make the FPU exception fixups more robust as well.

 - Change the return code for signal frame related failures from
   explicit error codes to a boolean fail/success as that's all what the
   calling code evaluates.

 - A large refactoring of the FPU code to prepare for adding AMX
   support:

      - Distangle the public header maze and remove especially the
        misnomed kitchen sink internal.h which is despite it's name
        included all over the place.

      - Add a proper abstraction for the register buffer storage (struct
        fpstate) which allows to dynamically size the buffer at runtime
        by flipping the pointer to the buffer container from the default
        container which is embedded in task_struct::tread::fpu to a
        dynamically allocated container with a larger register buffer.

      - Convert the code over to the new fpstate mechanism.

      - Consolidate the KVM FPU handling by moving the FPU related code
        into the FPU core which removes the number of exports and avoids
        adding even more export when AMX has to be supported in KVM.
        This also removes duplicated code which was of course
        unnecessary different and incomplete in the KVM copy.

      - Simplify the KVM FPU buffer handling by utilizing the new
        fpstate container and just switching the buffer pointer from the
        user space buffer to the KVM guest buffer when entering
        vcpu_run() and flipping it back when leaving the function. This
        cuts the memory requirements of a vCPU for FPU buffers in half
        and avoids pointless memory copy operations.

        This also solves the so far unresolved problem of adding AMX
        support because the current FPU buffer handling of KVM inflicted
        a circular dependency between adding AMX support to the core and
        to KVM. With the new scheme of switching fpstate AMX support can
        be added to the core code without affecting KVM.

      - Replace various variables with proper data structures so the
        extra information required for adding dynamically enabled FPU
        features (AMX) can be added in one place

 - Add AMX (Advanced Matrix eXtensions) support (finally):

   AMX is a large XSTATE component which is going to be available with
   Saphire Rapids XEON CPUs. The feature comes with an extra MSR
   (MSR_XFD) which allows to trap the (first) use of an AMX related
   instruction, which has two benefits:

    1) It allows the kernel to control access to the feature

    2) It allows the kernel to dynamically allocate the large register
       state buffer instead of burdening every task with the the extra
       8K or larger state storage.

   It would have been great to gain this kind of control already with
   AVX512.

   The support comes with the following infrastructure components:

    1) arch_prctl() to
        - read the supported features (equivalent to XGETBV(0))
        - read the permitted features for a task
        - request permission for a dynamically enabled feature

       Permission is granted per process, inherited on fork() and
       cleared on exec(). The permission policy of the kernel is
       restricted to sigaltstack size validation, but the syscall
       obviously allows further restrictions via seccomp etc.

    2) A stronger sigaltstack size validation for sys_sigaltstack(2)
       which takes granted permissions and the potentially resulting
       larger signal frame into account. This mechanism can also be used
       to enforce factual sigaltstack validation independent of dynamic
       features to help with finding potential victims of the 2K
       sigaltstack size constant which is broken since AVX512 support
       was added.

    3) Exception handling for #NM traps to catch first use of a extended
       feature via a new cause MSR. If the exception was caused by the
       use of such a feature, the handler checks permission for that
       feature. If permission has not been granted, the handler sends a
       SIGILL like the #UD handler would do if the feature would have
       been disabled in XCR0. If permission has been granted, then a new
       fpstate which fits the larger buffer requirement is allocated.

       In the unlikely case that this allocation fails, the handler
       sends SIGSEGV to the task. That's not elegant, but unavoidable as
       the other discussed options of preallocation or full per task
       permissions come with their own set of horrors for kernel and/or
       userspace. So this is the lesser of the evils and SIGSEGV caused
       by unexpected memory allocation failures is not a fundamentally
       new concept either.

       When allocation succeeds, the fpstate properties are filled in to
       reflect the extended feature set and the resulting sizes, the
       fpu::fpstate pointer is updated accordingly and the trap is
       disarmed for this task permanently.

    4) Enumeration and size calculations

    5) Trap switching via MSR_XFD

       The XFD (eXtended Feature Disable) MSR is context switched with
       the same life time rules as the FPU register state itself. The
       mechanism is keyed off with a static key which is default
       disabled so !AMX equipped CPUs have zero overhead. On AMX enabled
       CPUs the overhead is limited by comparing the tasks XFD value
       with a per CPU shadow variable to avoid redundant MSR writes. In
       case of switching from a AMX using task to a non AMX using task
       or vice versa, the extra MSR write is obviously inevitable.

       All other places which need to be aware of the variable feature
       sets and resulting variable sizes are not affected at all because
       they retrieve the information (feature set, sizes) unconditonally
       from the fpstate properties.

    6) Enable the new AMX states

   Note, this is relatively new code despite the fact that AMX support
   is in the works for more than a year now.

   The big refactoring of the FPU code, which allowed to do a proper
   integration has been started exactly 3 weeks ago. Refactoring of the
   existing FPU code and of the original AMX patches took a week and has
   been subject to extensive review and testing. The only fallout which
   has not been caught in review and testing right away was restricted
   to AMX enabled systems, which is completely irrelevant for anyone
   outside Intel and their early access program. There might be dragons
   lurking as usual, but so far the fine grained refactoring has held up
   and eventual yet undetected fallout is bisectable and should be
   easily addressable before the 5.16 release. Famous last words...

   Many thanks to Chang Bae and Dave Hansen for working hard on this and
   also to the various test teams at Intel who reserved extra capacity
   to follow the rapid development of this closely which provides the
   confidence level required to offer this rather large update for
   inclusion into 5.16-rc1

* tag 'x86-fpu-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits)
  Documentation/x86: Add documentation for using dynamic XSTATE features
  x86/fpu: Include vmalloc.h for vzalloc()
  selftests/x86/amx: Add context switch test
  selftests/x86/amx: Add test cases for AMX state management
  x86/fpu/amx: Enable the AMX feature in 64-bit mode
  x86/fpu: Add XFD handling for dynamic states
  x86/fpu: Calculate the default sizes independently
  x86/fpu/amx: Define AMX state components and have it used for boot-time checks
  x86/fpu/xstate: Prepare XSAVE feature table for gaps in state component numbers
  x86/fpu/xstate: Add fpstate_realloc()/free()
  x86/fpu/xstate: Add XFD #NM handler
  x86/fpu: Update XFD state where required
  x86/fpu: Add sanity checks for XFD
  x86/fpu: Add XFD state to fpstate
  x86/msr-index: Add MSRs for XFD
  x86/cpufeatures: Add eXtended Feature Disabling (XFD) feature bit
  x86/fpu: Reset permission and fpstate on exec()
  x86/fpu: Prepare fpu_clone() for dynamically enabled features
  x86/fpu/signal: Prepare for variable sigframe length
  x86/signal: Use fpu::__state_user_size for sigalt stack validation
  ...
2021-11-01 14:03:56 -07:00
Linus Torvalds
9a7e0a90a4 Scheduler updates:
- Revert the printk format based wchan() symbol resolution as it can leak
    the raw value in case that the symbol is not resolvable.
 
  - Make wchan() more robust and work with all kind of unwinders by
    enforcing that the task stays blocked while unwinding is in progress.
 
  - Prevent sched_fork() from accessing an invalid sched_task_group
 
  - Improve asymmetric packing logic
 
  - Extend scheduler statistics to RT and DL scheduling classes and add
    statistics for bandwith burst to the SCHED_FAIR class.
 
  - Properly account SCHED_IDLE entities
 
  - Prevent a potential deadlock when initial priority is assigned to a
    newly created kthread. A recent change to plug a race between cpuset and
    __sched_setscheduler() introduced a new lock dependency which is now
    triggered. Break the lock dependency chain by moving the priority
    assignment to the thread function.
 
  - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
 
  - Improve idle balancing in general and especially for NOHZ enabled
    systems.
 
  - Provide proper interfaces for live patching so it does not have to
    fiddle with scheduler internals.
 
  - Add cluster aware scheduling support.
 
  - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
    scheduler options and delaying mmdrop)
 
  - The usual small tweaks and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/OUkTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoR/5D/9ikdGNpKg9osNqJ3GjAmxsK6kVkB29
 iFe2k8pIpWDToWQf/wQRGih4Yj3Cl49QSnZcPIibh2/12EB1qrrW6iSPJkInz8Ec
 /1LS5/Vewn2OyoxyXZjdvGC5gTXEodSbIazASvX7nvdMeI4gsAsL5etzrMJirT/t
 aymqvr7zovvywrwMTQJrGjUMo9l4ewE8tafMNNhRu1BHU1U4ojM9yvThyRAAcmp7
 3Xy49A+Yq3IgrvYI4u8FMK5Zh08KaxSFjiLhePGm/bF+wSfYmWop2TP1jY05W2Uo
 ti8hfbJMUoFRYuMxAiEldkItnc0wV4M9PtWZZ/x+B71bs65Y4Zjt9cW+rxJv2+m1
 vzV31EsQwGnOti072dzWN4c/cZqngVXAjaNtErvDwJUr+Tw1ayv9KUvuodMQqZY6
 mu68bFUO2kV9EMe1CBOv51Uy1RGHyLj3rlNqrkw+Xp5ISE9Ad2vhUEiRp5bQx5Ci
 V/XFhGZkGUluh0vccrdFlNYZwhj8cZEzkOPCnPSeZ+bq8SyZE6xuHH/lTP1CJCOy
 s800rW1huM+kgV+zRN8adDkGXibAk9N3RtVGnQXmuEy8gB9LZmQg+JeM2wsc9B+6
 i0gdqZnsjNAfoK+BBAG4holxptSL8/eOJsFH8ZNIoxQ+iqooyPx9tFX7yXnRTBQj
 d2qWG7UvoseT+g==
 =fgtS
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Thomas Gleixner:

 - Revert the printk format based wchan() symbol resolution as it can
   leak the raw value in case that the symbol is not resolvable.

 - Make wchan() more robust and work with all kind of unwinders by
   enforcing that the task stays blocked while unwinding is in progress.

 - Prevent sched_fork() from accessing an invalid sched_task_group

 - Improve asymmetric packing logic

 - Extend scheduler statistics to RT and DL scheduling classes and add
   statistics for bandwith burst to the SCHED_FAIR class.

 - Properly account SCHED_IDLE entities

 - Prevent a potential deadlock when initial priority is assigned to a
   newly created kthread. A recent change to plug a race between cpuset
   and __sched_setscheduler() introduced a new lock dependency which is
   now triggered. Break the lock dependency chain by moving the priority
   assignment to the thread function.

 - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.

 - Improve idle balancing in general and especially for NOHZ enabled
   systems.

 - Provide proper interfaces for live patching so it does not have to
   fiddle with scheduler internals.

 - Add cluster aware scheduling support.

 - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
   scheduler options and delaying mmdrop)

 - The usual small tweaks and improvements all over the place

* tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
  sched/fair: Cleanup newidle_balance
  sched/fair: Remove sysctl_sched_migration_cost condition
  sched/fair: Wait before decaying max_newidle_lb_cost
  sched/fair: Skip update_blocked_averages if we are defering load balance
  sched/fair: Account update_blocked_averages in newidle_balance cost
  x86: Fix __get_wchan() for !STACKTRACE
  sched,x86: Fix L2 cache mask
  sched/core: Remove rq_relock()
  sched: Improve wake_up_all_idle_cpus() take #2
  irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT
  irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT
  irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.
  sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ
  sched: Add cluster scheduler level for x86
  sched: Add cluster scheduler level in core and related Kconfig for ARM64
  topology: Represent clusters of CPUs within a die
  sched: Disable -Wunused-but-set-variable
  sched: Add wrapper for get_wchan() to keep task blocked
  x86: Fix get_wchan() to support the ORC unwinder
  proc: Use task_is_running() for wchan in /proc/$pid/stat
  ...
2021-11-01 13:48:52 -07:00
Linus Torvalds
595b28fb0c Locking updates:
- Move futex code into kernel/futex/ and split up the kitchen sink into
    seperate files to make integration of sys_futex_waitv() simpler.
 
  - Add a new sys_futex_waitv() syscall which allows to wait on multiple
    futexes. The main use case is emulating Windows' WaitForMultipleObjects
    which allows Wine to improve the performance of Windows Games. Also
    native Linux games can benefit from this interface as this is a common
    wait pattern for this kind of applications.
 
  - Add context to ww_mutex_trylock() to provide a path for i915 to rework
    their eviction code step by step without making lockdep upset until the
    final steps of rework are completed. It's also useful for regulator and
    TTM to avoid dropping locks in the non contended path.
 
  - Lockdep and might_sleep() cleanups and improvements
 
  - A few improvements for the RT substitutions.
 
  - The usual small improvements and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/FTITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVNZD/9vIm3Bu1Coz8tbNXz58AiCYq9Y/vp5
 mzFgSzz+VJTkW5Vh8jo5Uel4rCKZyt+rL276EoaRPzYl8KFtWDbpK3qd3PrXKqTX
 At49JO4ttAMJUHIBQ6vblEkykmfEd9YPU1uSWk5roJ+s7Jmr5VWnu0FEWHP00As5
 tWOca/TM0ei9kof26V2fl5aecTGII4i4Zsvy+LPsXtI+TnmP0gSBcGAS/5UnZTtJ
 vQRWTR3ojoYvh5iTmNqbaURYoQLe2j8yscn1DSW1CABWVmP12eDWs+N7jRP4b5S9
 73xOv5P7vpva41wxrK2ir5iNkpsLE97VL2JOHTW8nm7orblfiuxHLTCkTjEdd2pO
 h8blI2IBizEB3JYn2BMkOAaZQOSjN8hd6Ye/b2B4AMEGWeXEoEv6eVy/orYKCluQ
 XDqGn47Vce/SYmo5vfTB8VMt6nANx8PKvOP3IvjHInYEQBgiT6QrlUw3RRkXBp5s
 clQkjYYwjAMVIXowcCrdhoKjMROzi6STShVwHwGL8MaZXqr8Vl6BUO9ckU0pY+4C
 F000Hzwxi8lGEQ9k+P+BnYOEzH5osCty8lloKiQ/7ciX6T+CZHGJPGK/iY4YL8P5
 C3CJWMsHCqST7DodNFJmdfZt99UfIMmEhshMDduU9AAH0tHCn8vOu0U6WvCtpyBp
 BvHj68zteAtlYg==
 =RZ4x
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:

 - Move futex code into kernel/futex/ and split up the kitchen sink into
   seperate files to make integration of sys_futex_waitv() simpler.

 - Add a new sys_futex_waitv() syscall which allows to wait on multiple
   futexes.

   The main use case is emulating Windows' WaitForMultipleObjects which
   allows Wine to improve the performance of Windows Games. Also native
   Linux games can benefit from this interface as this is a common wait
   pattern for this kind of applications.

 - Add context to ww_mutex_trylock() to provide a path for i915 to
   rework their eviction code step by step without making lockdep upset
   until the final steps of rework are completed. It's also useful for
   regulator and TTM to avoid dropping locks in the non contended path.

 - Lockdep and might_sleep() cleanups and improvements

 - A few improvements for the RT substitutions.

 - The usual small improvements and cleanups.

* tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  locking: Remove spin_lock_flags() etc
  locking/rwsem: Fix comments about reader optimistic lock stealing conditions
  locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able()
  locking/rwsem: Disable preemption for spinning region
  docs: futex: Fix kernel-doc references
  futex: Fix PREEMPT_RT build
  futex2: Documentation: Document sys_futex_waitv() uAPI
  selftests: futex: Test sys_futex_waitv() wouldblock
  selftests: futex: Test sys_futex_waitv() timeout
  selftests: futex: Add sys_futex_waitv() test
  futex,arm: Wire up sys_futex_waitv()
  futex,x86: Wire up sys_futex_waitv()
  futex: Implement sys_futex_waitv()
  futex: Simplify double_lock_hb()
  futex: Split out wait/wake
  futex: Split out requeue
  futex: Rename mark_wake_futex()
  futex: Rename: match_futex()
  futex: Rename: hb_waiter_{inc,dec,pending}()
  futex: Split out PI futex
  ...
2021-11-01 13:15:36 -07:00
Björn Töpel
36e70b9b06 selftests, bpf: Fix broken riscv build
This patch is closely related to commit 6016df8fe8 ("selftests/bpf:
Fix broken riscv build"). When clang includes the system include
directories, but targeting BPF program, __BITS_PER_LONG defaults to
32, unless explicitly set. Work around this problem, by explicitly
setting __BITS_PER_LONG to __riscv_xlen.

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028161057.520552-5-bjorn@kernel.org
2021-11-01 17:10:40 +01:00
Liu Jian
d69672147f selftests, bpf: Add one test for sockmap with strparser
Add the test to check sockmap with strparser is working well.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211029141216.211899-3-liujian56@huawei.com
2021-11-01 17:08:21 +01:00
Liu Jian
b556c3fd46 selftests, bpf: Fix test_txmsg_ingress_parser error
After "skmsg: lose offset info in sk_psock_skb_ingress", the test case
with ktls failed. This because ktls parser(tls_read_size) return value
is 285 not 256.

The case like this:

	tls_sk1 --> redir_sk --> tls_sk2

tls_sk1 sent out 512 bytes data, after tls related processing redir_sk
recved 570 btyes data, and redirect 512 (skb_use_parser) bytes data to
tls_sk2; but tls_sk2 needs 285 * 2 bytes data, receive timeout occurred.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211029141216.211899-2-liujian56@huawei.com
2021-11-01 17:08:21 +01:00
Andrii Nakryiko
0133c20480 selftests/bpf: Fix strobemeta selftest regression
After most recent nightly Clang update strobemeta selftests started
failing with the following error (relevant portion of assembly included):

  1624: (85) call bpf_probe_read_user_str#114
  1625: (bf) r1 = r0
  1626: (18) r2 = 0xfffffffe
  1628: (5f) r1 &= r2
  1629: (55) if r1 != 0x0 goto pc+7
  1630: (07) r9 += 104
  1631: (6b) *(u16 *)(r9 +0) = r0
  1632: (67) r0 <<= 32
  1633: (77) r0 >>= 32
  1634: (79) r1 = *(u64 *)(r10 -456)
  1635: (0f) r1 += r0
  1636: (7b) *(u64 *)(r10 -456) = r1
  1637: (79) r1 = *(u64 *)(r10 -368)
  1638: (c5) if r1 s< 0x1 goto pc+778
  1639: (bf) r6 = r8
  1640: (0f) r6 += r7
  1641: (b4) w1 = 0
  1642: (6b) *(u16 *)(r6 +108) = r1
  1643: (79) r3 = *(u64 *)(r10 -352)
  1644: (79) r9 = *(u64 *)(r10 -456)
  1645: (bf) r1 = r9
  1646: (b4) w2 = 1
  1647: (85) call bpf_probe_read_user_str#114

  R1 unbounded memory access, make sure to bounds check any such access

In the above code r0 and r1 are implicitly related. Clang knows that,
but verifier isn't able to infer this relationship.

Yonghong Song narrowed down this "regression" in code generation to
a recent Clang optimization change ([0]), which for BPF target generates
code pattern that BPF verifier can't handle and loses track of register
boundaries.

This patch works around the issue by adding an BPF assembly-based helper
that helps to prove to the verifier that upper bound of the register is
a given constant by controlling the exact share of generated BPF
instruction sequence. This fixes the immediate issue for strobemeta
selftest.

  [0] acabad9ff6

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029182907.166910-1-andrii@kernel.org
2021-11-01 17:08:21 +01:00
Taehee Yoo
c08e8baea7 selftests: add amt interface selftest script
This is selftest script for amt interface.
This script includes basic forwarding scenarion and torture scenario.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:36:09 +00:00
Paolo Abeni
b6ab64b074 selftests: mptcp: more stable simult_flows tests
Currently the simult_flows.sh self-tests are not very stable,
especially when running on slow VMs.

The tests measure runtime for transfers on multiple subflows
and check that the time is near the theoretical maximum.

The current test infra introduces a bit of jitter in test
runtime, due to multiple explicit delays. Additionally the
runtime is measured by the shell script wrapper. On a slow
VM, the script overhead is measurable and subject to relevant
jitter.

One solution to make the test more stable would be adding more
slack to the expected time; that could possibly hide real
regressions. Instead move the measurement inside the command
doing the transfer, and drop most unneeded sleeps.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:19:49 +00:00
Geliang Tang
7c909a9804 selftests: mptcp: fix proto type in link_failure tests
In listener_ns, we should pass srv_proto argument to mptcp_connect command,
not cl_proto.

Fixes: 7d1e6f1639 ("selftests: mptcp: add testcase for active-back")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:19:49 +00:00
Jakub Kicinski
b0ced8f290 selftests: udp: test for passing SO_MARK as cmsg
Before fix:
|  Case IPv6 rejection returned 0, expected 1
|FAIL - 1/4 cases failed

With the fix:
| OK

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:12:48 +00:00
Paolo Bonzini
4e33868433 KVM/arm64 updates for Linux 5.16
- More progress on the protected VM front, now with the full
   fixed feature set as well as the limitation of some hypercalls
   after initialisation.
 
 - Cleanup of the RAZ/WI sysreg handling, which was pointlessly
   complicated
 
 - Fixes for the vgic placement in the IPA space, together with a
   bunch of selftests
 
 - More memcg accounting of the memory allocated on behalf of a guest
 
 - Timer and vgic selftests
 
 - Workarounds for the Apple M1 broken vgic implementation
 
 - KConfig cleanups
 
 - New kvmarm.mode=none option, for those who really dislike us
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmF7u5YPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpD6w8QAIKDLJCTqkxv5Vh4ZSmtXxg4gTZMBlg8oSQ8
 sVL639aqBvFe3A6Vmz6IwBm+NT7Sm1zxkuH9qHzVR1gmXq0oLYNrIuyrzRW8PvqO
 hIkSRRoVsf03755TmkxwR7/2jAFxb6FhEVAy6VWdQyI44orihIPvMp8aTIq+jvU+
 XoNGb/rPf9HpSUtvuaHYvZhSZBhoi5dRnkr33R1+VR69n7Axs8lm905xcl6Pt0a0
 QqYZWQvFu/BXPyNflG7LUsegRF/iiV2vNTbNNowkzlV5suqxBpJAp6ApDL/gWrHv
 ya/6cMqicSjBIkWnawhXY98w6/5xfzK4IV/zc00FNWOlUdVP89Thqrgc8EkigS9R
 BGcxFFqj41snr+ensSBBIkNtV+dBX52H3rUE0F9seiTXm8QWI86JobdeNadT8tUP
 TXdOeCUcA+cp4Ngln18lsbOEaBkPA5H1po1nUFPHbKnVOxnqXScB7E/xF6rAbryV
 m+Z+oidU7MyS/Ev/Da0ww/XFx7cs2ez9EgeQvjcdFAvUMqS6kcXEExvgGYlm+KRQ
 GBMKPLCNHKdflMANoSpol7MZUmPJ45XoWKW1rntj2r9X+oJW2Z2hEx32xrWDJdqK
 ixnbjog5kNZb0CjLGsUC90lo2hpRJecaLhAjgTLYaNC1QxGPrt92eat6gnwuMTBc
 mpADqi7w
 =qBAO
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 5.16

- More progress on the protected VM front, now with the full
  fixed feature set as well as the limitation of some hypercalls
  after initialisation.

- Cleanup of the RAZ/WI sysreg handling, which was pointlessly
  complicated

- Fixes for the vgic placement in the IPA space, together with a
  bunch of selftests

- More memcg accounting of the memory allocated on behalf of a guest

- Timer and vgic selftests

- Workarounds for the Apple M1 broken vgic implementation

- KConfig cleanups

- New kvmarm.mode=none option, for those who really dislike us
2021-10-31 02:28:48 -04:00
Borislav Petkov
a72fdfd21e selftests/x86/iopl: Adjust to the faked iopl CLI/STI usage
Commit in Fixes changed the iopl emulation to not #GP on CLI and STI
because it would break some insane luserspace tools which would toggle
interrupts.

The corresponding selftest would rely on the fact that executing CLI/STI
would trigger a #GP and thus detect it this way but since that #GP is
not happening anymore, the detection is now wrong too.

Extend the test to actually look at the IF flag and whether executing
those insns had any effect on it. The STI detection needs to have the
fact that interrupts were previously disabled, passed in so do that from
the previous CLI test, i.e., STI test needs to follow a previous CLI one
for it to make sense.

Fixes: b968e84b50 ("x86/iopl: Fake iopl(3) CLI/STI usage")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20211030083939.13073-1-bp@alien8.de
2021-10-30 23:18:04 +02:00
Shuah Khan
f35dcaa0a8 selftests/core: fix conflicting types compile error for close_range()
close_range() test type conflicts with close_range() library call in
x86_64-linux-gnu/bits/unistd_ext.h. Fix it by changing the name to
core_close_range().

gcc -g -I../../../../usr/include/    close_range_test.c  -o ../tools/testing/selftests/core/close_range_test
In file included from close_range_test.c:16:
close_range_test.c:57:6: error: conflicting types for ‘close_range’; have ‘void(struct __test_metadata *)’
   57 | TEST(close_range)
      |      ^~~~~~~~~~~
../kselftest_harness.h:181:21: note: in definition of macro ‘__TEST_IMPL’
  181 |         static void test_name(struct __test_metadata *_metadata); \
      |                     ^~~~~~~~~
close_range_test.c:57:1: note: in expansion of macro ‘TEST’
   57 | TEST(close_range)
      | ^~~~
In file included from /usr/include/unistd.h:1204,
                 from close_range_test.c:13:
/usr/include/x86_64-linux-gnu/bits/unistd_ext.h:56:12: note: previous declaration of ‘close_range’ with type ‘int(unsigned int,  unsigned int,  int)’
   56 | extern int close_range (unsigned int __fd, unsigned int __max_fd,
      |            ^~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-29 13:09:42 -06:00
Daniel Latypov
52a5d80a22 kunit: tool: fix typecheck errors about loading qemu configs
Currently, we have these errors:
$ mypy ./tools/testing/kunit/*.py
tools/testing/kunit/kunit_kernel.py:213: error: Item "_Loader" of "Optional[_Loader]" has no attribute "exec_module"
tools/testing/kunit/kunit_kernel.py:213: error: Item "None" of "Optional[_Loader]" has no attribute "exec_module"
tools/testing/kunit/kunit_kernel.py:214: error: Module has no attribute "QEMU_ARCH"
tools/testing/kunit/kunit_kernel.py:215: error: Module has no attribute "QEMU_ARCH"

exec_module
===========

pytype currently reports no errors, but that's because there's a comment
disabling it on 213.

This is due to https://github.com/python/typeshed/pull/2626.
The fix is to assert the loaded module implements the ABC
(abstract base class) we want which has exec_module support.

QEMU_ARCH
=========

pytype is fine with this, but mypy is not:
https://github.com/python/mypy/issues/5059

Add a check that the loaded module does indeed have QEMU_ARCH.
Note: this is not enough to appease mypy, so we also add a comment to
squash the warning.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-29 13:05:47 -06:00
Andrea Righi
f48ad69097 selftests/bpf: Fix fclose/pclose mismatch in test_progs
Make sure to use pclose() to properly close the pipe opened by popen().

Fixes: 81f77fd0de ("bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211026143409.42666-1-andrea.righi@canonical.com
2021-10-29 17:48:43 +02:00
Nikolay Aleksandrov
34d7ecb3d4 selftests: net: bridge: update IGMP/MLD membership interval value
When I fixed IGMPv3/MLDv2 to use the bridge's multicast_membership_interval
value which is chosen by user-space instead of calculating it based on
multicast_query_interval and multicast_query_response_interval I forgot
to update the selftests relying on that behaviour. Now we have to
manually set the expected GMI value to perform the tests correctly and get
proper results (similar to IGMPv2 behaviour).

Fixes: fac3cb82a5 ("net: bridge: mcast: use multicast_membership_interval for IGMPv3")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 13:58:21 +01:00
Shuah Khan
e300a85db1 selftests/net: update .gitignore with newly added tests
Update .gitignore with newly added tests:
	tools/testing/selftests/net/af_unix/test_unix_oob
	tools/testing/selftests/net/gro
	tools/testing/selftests/net/ioam6_parser
	tools/testing/selftests/net/toeplitz

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 13:14:58 +01:00
Petr Machata
2b11e24eba selftests: mlxsw: Test port shaper
TBF can be used as a root qdisc, in which case it is supposed to configure
port shaper. Add a test that verifies that this is so by installing a root
TBF with a ETS or PRIO below it, and then expecting individual bands to all
be shaped according to the root TBF configuration.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 19:47:50 -07:00
Petr Machata
3d5290ea1d selftests: mlxsw: Test offloadability of root TBF
TBF can be used as a root qdisc, with the usual ETS/RED/TBF hierarchy below
it. This use should now be offloaded. Add a test that verifies that it is.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 19:47:49 -07:00
David Yang
9c7516d669 tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer
The coccinelle check report:

  ./tools/testing/selftests/vm/split_huge_page_test.c:344:36-42:
  ERROR: application of sizeof to pointer

Use "strlen" to fix it.

Link: https://lkml.kernel.org/r/20211012030116.184027-1-davidcomponentone@gmail.com
Signed-off-by: David Yang <davidcomponentone@gmail.com>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Kumar Kartikeya Dwivedi
efadf2ad17 selftests/bpf: Fix memory leak in test_ima
The allocated ring buffer is never freed, do so in the cleanup path.

Fixes: f446b570ac ("bpf/selftests: Update the IMA test to use BPF ring buffer")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-9-memxor@gmail.com
2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi
c3fc706e94 selftests/bpf: Fix fd cleanup in sk_lookup test
Similar to the fix in commit:
e31eec77e4 ("bpf: selftests: Fix fd cleanup in get_branch_snapshot")

We use designated initializer to set fds to -1 without breaking on
future changes to MAX_SERVER constant denoting the array size.

The particular close(0) occurs on non-reuseport tests, so it can be seen
with -n 115/{2,3} but not 115/4. This can cause problems with future
tests if they depend on BTF fd never being acquired as fd 0, breaking
internal libbpf assumptions.

Fixes: 0ab5539f85 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-8-memxor@gmail.com
2021-10-28 16:30:07 -07:00
Kumar Kartikeya Dwivedi
087cba799c selftests/bpf: Add weak/typeless ksym test for light skeleton
Also, avoid using CO-RE features, as lskel doesn't support CO-RE, yet.
Include both light and libbpf skeleton in same file to test both of them
together.

In c48e51c8b0 ("bpf: selftests: Add selftests for module kfunc support"),
I added support for generating both lskel and libbpf skel for a BPF
object, however the name parameter for bpftool caused collisions when
included in same file together. This meant that every test needed a
separate file for a libbpf/light skeleton separation instead of
subtests.

Change that by appending a "_lskel" suffix to the name for files using
light skeleton, and convert all existing users.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-7-memxor@gmail.com
2021-10-28 16:30:07 -07:00
Joanne Koong
f44bc543a0 bpf/benchs: Add benchmarks for comparing hashmap lookups w/ vs. w/out bloom filter
This patch adds benchmark tests for comparing the performance of hashmap
lookups without the bloom filter vs. hashmap lookups with the bloom filter.

Checking the bloom filter first for whether the element exists should
overall enable a higher throughput for hashmap lookups, since if the
element does not exist in the bloom filter, we can avoid a costly lookup in
the hashmap.

On average, using 5 hash functions in the bloom filter tended to perform
the best across the widest range of different entry sizes. The benchmark
results using 5 hash functions (running on 8 threads on a machine with one
numa node, and taking the average of 3 runs) were roughly as follows:

value_size = 4 bytes -
	10k entries: 30% faster
	50k entries: 40% faster
	100k entries: 40% faster
	500k entres: 70% faster
	1 million entries: 90% faster
	5 million entries: 140% faster

value_size = 8 bytes -
	10k entries: 30% faster
	50k entries: 40% faster
	100k entries: 50% faster
	500k entres: 80% faster
	1 million entries: 100% faster
	5 million entries: 150% faster

value_size = 16 bytes -
	10k entries: 20% faster
	50k entries: 30% faster
	100k entries: 35% faster
	500k entres: 65% faster
	1 million entries: 85% faster
	5 million entries: 110% faster

value_size = 40 bytes -
	10k entries: 5% faster
	50k entries: 15% faster
	100k entries: 20% faster
	500k entres: 65% faster
	1 million entries: 75% faster
	5 million entries: 120% faster

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-6-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
57fd1c63c9 bpf/benchs: Add benchmark tests for bloom filter throughput + false positive
This patch adds benchmark tests for the throughput (for lookups + updates)
and the false positive rate of bloom filter lookups, as well as some
minor refactoring of the bash script for running the benchmarks.

These benchmarks show that as the number of hash functions increases,
the throughput and the false positive rate of the bloom filter decreases.
>From the benchmark data, the approximate average false-positive rates
are roughly as follows:

1 hash function = ~30%
2 hash functions = ~15%
3 hash functions = ~5%
4 hash functions = ~2.5%
5 hash functions = ~1%
6 hash functions = ~0.5%
7 hash functions  = ~0.35%
8 hash functions = ~0.15%
9 hash functions = ~0.1%
10 hash functions = ~0%

For reference data, the benchmarks run on one thread on a machine
with one numa node for 1 to 5 hash functions for 8-byte and 64-byte
values are as follows:

1 hash function:
  50k entries
	8-byte value
	    Lookups - 51.1 M/s operations
	    Updates - 33.6 M/s operations
	    False positive rate: 24.15%
	64-byte value
	    Lookups - 15.7 M/s operations
	    Updates - 15.1 M/s operations
	    False positive rate: 24.2%
  100k entries
	8-byte value
	    Lookups - 51.0 M/s operations
	    Updates - 33.4 M/s operations
	    False positive rate: 24.04%
	64-byte value
	    Lookups - 15.6 M/s operations
	    Updates - 14.6 M/s operations
	    False positive rate: 24.06%
  500k entries
	8-byte value
	    Lookups - 50.5 M/s operations
	    Updates - 33.1 M/s operations
	    False positive rate: 27.45%
	64-byte value
	    Lookups - 15.6 M/s operations
	    Updates - 14.2 M/s operations
	    False positive rate: 27.42%
  1 mil entries
	8-byte value
	    Lookups - 49.7 M/s operations
	    Updates - 32.9 M/s operations
	    False positive rate: 27.45%
	64-byte value
	    Lookups - 15.4 M/s operations
	    Updates - 13.7 M/s operations
	    False positive rate: 27.58%
  2.5 mil entries
	8-byte value
	    Lookups - 47.2 M/s operations
	    Updates - 31.8 M/s operations
	    False positive rate: 30.94%
	64-byte value
	    Lookups - 15.3 M/s operations
	    Updates - 13.2 M/s operations
	    False positive rate: 30.95%
  5 mil entries
	8-byte value
	    Lookups - 41.1 M/s operations
	    Updates - 28.1 M/s operations
	    False positive rate: 31.01%
	64-byte value
	    Lookups - 13.3 M/s operations
	    Updates - 11.4 M/s operations
	    False positive rate: 30.98%

2 hash functions:
  50k entries
	8-byte value
	    Lookups - 34.1 M/s operations
	    Updates - 20.1 M/s operations
	    False positive rate: 9.13%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.9 M/s operations
	    False positive rate: 9.21%
  100k entries
	8-byte value
	    Lookups - 33.7 M/s operations
	    Updates - 18.9 M/s operations
	    False positive rate: 9.13%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.7 M/s operations
	    False positive rate: 9.19%
  500k entries
	8-byte value
	    Lookups - 32.7 M/s operations
	    Updates - 18.1 M/s operations
	    False positive rate: 12.61%
	64-byte value
	    Lookups - 8.4 M/s operations
	    Updates - 7.5 M/s operations
	    False positive rate: 12.61%
  1 mil entries
	8-byte value
	    Lookups - 30.6 M/s operations
	    Updates - 18.9 M/s operations
	    False positive rate: 12.54%
	64-byte value
	    Lookups - 8.0 M/s operations
	    Updates - 7.0 M/s operations
	    False positive rate: 12.52%
  2.5 mil entries
	8-byte value
	    Lookups - 25.3 M/s operations
	    Updates - 16.7 M/s operations
	    False positive rate: 16.77%
	64-byte value
	    Lookups - 7.9 M/s operations
	    Updates - 6.5 M/s operations
	    False positive rate: 16.88%
  5 mil entries
	8-byte value
	    Lookups - 20.8 M/s operations
	    Updates - 14.7 M/s operations
	    False positive rate: 16.78%
	64-byte value
	    Lookups - 7.0 M/s operations
	    Updates - 6.0 M/s operations
	    False positive rate: 16.78%

3 hash functions:
  50k entries
	8-byte value
	    Lookups - 25.1 M/s operations
	    Updates - 14.6 M/s operations
	    False positive rate: 7.65%
	64-byte value
	    Lookups - 5.8 M/s operations
	    Updates - 5.5 M/s operations
	    False positive rate: 7.58%
  100k entries
	8-byte value
	    Lookups - 24.7 M/s operations
	    Updates - 14.1 M/s operations
	    False positive rate: 7.71%
	64-byte value
	    Lookups - 5.8 M/s operations
	    Updates - 5.3 M/s operations
	    False positive rate: 7.62%
  500k entries
	8-byte value
	    Lookups - 22.9 M/s operations
	    Updates - 13.9 M/s operations
	    False positive rate: 2.62%
	64-byte value
	    Lookups - 5.6 M/s operations
	    Updates - 4.8 M/s operations
	    False positive rate: 2.7%
  1 mil entries
	8-byte value
	    Lookups - 19.8 M/s operations
	    Updates - 12.6 M/s operations
	    False positive rate: 2.60%
	64-byte value
	    Lookups - 5.3 M/s operations
	    Updates - 4.4 M/s operations
	    False positive rate: 2.69%
  2.5 mil entries
	8-byte value
	    Lookups - 16.2 M/s operations
	    Updates - 10.7 M/s operations
	    False positive rate: 4.49%
	64-byte value
	    Lookups - 4.9 M/s operations
	    Updates - 4.1 M/s operations
	    False positive rate: 4.41%
  5 mil entries
	8-byte value
	    Lookups - 18.8 M/s operations
	    Updates - 9.2 M/s operations
	    False positive rate: 4.45%
	64-byte value
	    Lookups - 5.2 M/s operations
	    Updates - 3.9 M/s operations
	    False positive rate: 4.54%

4 hash functions:
  50k entries
	8-byte value
	    Lookups - 19.7 M/s operations
	    Updates - 11.1 M/s operations
	    False positive rate: 1.01%
	64-byte value
	    Lookups - 4.4 M/s operations
	    Updates - 4.0 M/s operations
	    False positive rate: 1.00%
  100k entries
	8-byte value
	    Lookups - 19.5 M/s operations
	    Updates - 10.9 M/s operations
	    False positive rate: 1.00%
	64-byte value
	    Lookups - 4.3 M/s operations
	    Updates - 3.9 M/s operations
	    False positive rate: 0.97%
  500k entries
	8-byte value
	    Lookups - 18.2 M/s operations
	    Updates - 10.6 M/s operations
	    False positive rate: 2.05%
	64-byte value
	    Lookups - 4.3 M/s operations
	    Updates - 3.7 M/s operations
	    False positive rate: 2.05%
  1 mil entries
	8-byte value
	    Lookups - 15.5 M/s operations
	    Updates - 9.6 M/s operations
	    False positive rate: 1.99%
	64-byte value
	    Lookups - 4.0 M/s operations
	    Updates - 3.4 M/s operations
	    False positive rate: 1.99%
  2.5 mil entries
	8-byte value
	    Lookups - 13.8 M/s operations
	    Updates - 7.7 M/s operations
	    False positive rate: 3.91%
	64-byte value
	    Lookups - 3.7 M/s operations
	    Updates - 3.6 M/s operations
	    False positive rate: 3.78%
  5 mil entries
	8-byte value
	    Lookups - 13.0 M/s operations
	    Updates - 6.9 M/s operations
	    False positive rate: 3.93%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.7 M/s operations
	    False positive rate: 3.39%

5 hash functions:
  50k entries
	8-byte value
	    Lookups - 16.4 M/s operations
	    Updates - 9.1 M/s operations
	    False positive rate: 0.78%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.2 M/s operations
	    False positive rate: 0.77%
  100k entries
	8-byte value
	    Lookups - 16.3 M/s operations
	    Updates - 9.0 M/s operations
	    False positive rate: 0.79%
	64-byte value
	    Lookups - 3.5 M/s operations
	    Updates - 3.2 M/s operations
	    False positive rate: 0.78%
  500k entries
	8-byte value
	    Lookups - 15.1 M/s operations
	    Updates - 8.8 M/s operations
	    False positive rate: 1.82%
	64-byte value
	    Lookups - 3.4 M/s operations
	    Updates - 3.0 M/s operations
	    False positive rate: 1.78%
  1 mil entries
	8-byte value
	    Lookups - 13.2 M/s operations
	    Updates - 7.8 M/s operations
	    False positive rate: 1.81%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.8 M/s operations
	    False positive rate: 1.80%
  2.5 mil entries
	8-byte value
	    Lookups - 10.5 M/s operations
	    Updates - 5.9 M/s operations
	    False positive rate: 0.29%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.4 M/s operations
	    False positive rate: 0.28%
  5 mil entries
	8-byte value
	    Lookups - 9.6 M/s operations
	    Updates - 5.7 M/s operations
	    False positive rate: 0.30%
	64-byte value
	    Lookups - 3.2 M/s operations
	    Updates - 2.7 M/s operations
	    False positive rate: 0.30%

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-5-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Joanne Koong
ed9109ad64 selftests/bpf: Add bloom filter map test cases
This patch adds test cases for bpf bloom filter maps. They include tests
checking against invalid operations by userspace, tests for using the
bloom filter map as an inner map, and a bpf program that queries the
bloom filter map for values added by a userspace program.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-4-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Jakub Kicinski
7df621a3ee Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/net/sock.h
  7b50ecfcc6 ("net: Rename ->stream_memory_read to ->sock_is_readable")
  4c1e34c0db ("vsock: Enable y2038 safe timeval for timeout")

drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
  0daa55d033 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table")
  e77bcdd1f6 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.")

Adjacent code addition in both cases, keep both.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 10:43:58 -07:00
Chang S. Bae
101c669d16 selftests/x86/amx: Add context switch test
XSAVE state is thread-local.  The kernel switches between thread
state at context switch time.  Generally, running a selftest for
a while will naturally expose it to some context switching and
and will test the XSAVE code.

Instead of just hoping that the tests get context-switched at
random times, force context-switches on purpose.  Spawn off a few
userspace threads and force context-switches between them.
Ensure that the kernel correctly context switches each thread's
unique AMX state.

 [ dhansen: bunches of cleanups ]

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211026122525.6EFD5758@davehans-spike.ostc.intel.com
2021-10-28 14:35:27 +02:00
Chang S. Bae
6a3e0651b4 selftests/x86/amx: Add test cases for AMX state management
AMX TILEDATA is a very large XSAVE feature.  It could have caused
nasty XSAVE buffer space waste in two places:

 * Signal stacks
 * Kernel task_struct->fpu buffers

To avoid this waste, neither of these buffers have AMX state by
default.  The non-default features are called "dynamic" features.

There is an arch_prctl(ARCH_REQ_XCOMP_PERM) which allows a task
to declare that it wants to use AMX or other "dynamic" XSAVE
features.  This arch_prctl() ensures that sufficient sigaltstack
space is available before it will succeed.  It also expands the
task_struct buffer.

Functions of this test:
 * Test arch_prctl(ARCH_REQ_XCOMP_PERM).  Ensure that it checks for
   proper sigaltstack sizing and that the sizing is enforced for
   future sigaltstack calls.
 * Ensure that ARCH_REQ_XCOMP_PERM is inherited across fork()
 * Ensure that TILEDATA use before the prctl() is fatal
 * Ensure that TILEDATA is cleared across fork()

Note: Generally, compiler support is needed to do something with
AMX.  Instead, directly load AMX state from userspace with a
plain XSAVE.  Do not depend on the compiler.

 [ dhansen: bunches of cleanups ]

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211026122524.7BEDAA95@davehans-spike.ostc.intel.com
2021-10-28 14:35:27 +02:00
Yucong Sun
e1ef62a4dd selftests/bpf: Adding a namespace reset for tc_redirect
This patch delete ns_src/ns_dst/ns_redir namespaces before recreating
them, making the test more robust.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-5-fallentree@fb.com
2021-10-27 11:59:02 -07:00
Yucong Sun
9e7240fb2d selftests/bpf: Fix attach_probe in parallel mode
This patch makes attach_probe uses its own method as attach point,
avoiding conflict with other tests like bpf_cookie.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-4-fallentree@fb.com
2021-10-27 11:59:02 -07:00
Yucong Sun
547208a386 selfetests/bpf: Update vmtest.sh defaults
Increase memory to 4G, 8 SMP core with host cpu passthrough. This
make it run faster in parallel mode and more likely to succeed.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-2-fallentree@fb.com
2021-10-27 11:58:44 -07:00
Russell Currey
cb662608e5 selftests/powerpc: Use date instead of EPOCHSECONDS in mitigation-patching.sh
The EPOCHSECONDS environment variable was added in bash 5.0 (released
2019).  Some distributions of the "stable" and "long-term" variety ship
older versions of bash than this, so swap to using the date command
instead.

"%s" was added to coreutils `date` in 1993 so we should be good, but who
knows, it is a GNU extension and not part of the POSIX spec for `date`.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211025102436.19177-1-ruscur@russell.cc
2021-10-27 22:34:02 +11:00
Masami Hiramatsu
25b9513872 selftests/ftrace: Stop tracing while reading the trace file by default
Stop tracing while reading the trace file by default, to prevent
the test results while checking it and to avoid taking a long time
to check the result.
If there is any testcase which wants to test the tracing while reading
the trace file, please override this setting inside the test case.

This also recovers the pause-on-trace when clean it up.

Link: https://lkml.kernel.org/r/163529053143.690749.15365238954175942026.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-26 20:27:19 -04:00
Jakub Kicinski
440ffcdd9d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-10-26

We've added 12 non-merge commits during the last 7 day(s) which contain
a total of 23 files changed, 118 insertions(+), 98 deletions(-).

The main changes are:

1) Fix potential race window in BPF tail call compatibility check, from Toke Høiland-Jørgensen.

2) Fix memory leak in cgroup fs due to missing cgroup_bpf_offline(), from Quanyang Wang.

3) Fix file descriptor reference counting in generic_map_update_batch(), from Xu Kuohai.

4) Fix bpf_jit_limit knob to the max supported limit by the arch's JIT, from Lorenz Bauer.

5) Fix BPF sockmap ->poll callbacks for UDP and AF_UNIX sockets, from Cong Wang and Yucong Sun.

6) Fix BPF sockmap concurrency issue in TCP on non-blocking sendmsg calls, from Liu Jian.

7) Fix build failure of INODE_STORAGE and TASK_STORAGE maps on !CONFIG_NET, from Tejun Heo.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix potential race in tail call compatibility check
  bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET
  selftests/bpf: Use recv_timeout() instead of retries
  net: Implement ->sock_is_readable() for UDP and AF_UNIX
  skmsg: Extract and reuse sk_msg_is_readable()
  net: Rename ->stream_memory_read to ->sock_is_readable
  tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
  cgroup: Fix memory leak caused by missing cgroup_bpf_offline
  bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch()
  bpf: Prevent increasing bpf_jit_limit above max
  bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT
  bpf: Define bpf_jit_alloc_exec_limit for riscv JIT
====================

Link: https://lore.kernel.org/r/20211026201920.11296-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-26 14:38:55 -07:00
Yucong Sun
67b821502d selftests/bpf: Use recv_timeout() instead of retries
We use non-blocking sockets in those tests, retrying for
EAGAIN is ugly because there is no upper bound for the packet
arrival time, at least in theory. After we fix poll() on
sockmap sockets, now we can switch to select()+recv().

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-5-xiyou.wangcong@gmail.com
2021-10-26 12:29:33 -07:00
Danielle Ratson
c24dbf3d4f selftests: mlxsw: Remove deprecated test cases
After adding the previous patches, the constraint that all the router
interface MAC addresses have the same prefix is no longer relevant.

Remove the test cases that validated that this constraint is honored.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:35:58 +01:00
Danielle Ratson
20d446db61 selftests: Add an occupancy test for RIF MAC profiles
When all the RIF MAC profiles are in use, test that it is possible to
change the MAC of a netdev (i.e., a RIF) when its MAC profile is not
shared with other RIFs. Test that replacement fails when the MAC profile
is shared.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:35:58 +01:00
Danielle Ratson
a10b7bacde selftests: mlxsw: Add forwarding test for RIF MAC profiles
Verify that MAC profile changes are indeed applied and that packets are
forwarded with the correct source MAC.

Output example:

$ ./rif_mac_profiles.sh
TEST: h1->h2: new mac profile                                       [ OK ]
TEST: h2->h1: new mac profile                                       [ OK ]
TEST: h1->h2: edit mac profile                                      [ OK ]
TEST: h2->h1: edit mac profile                                      [ OK ]

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:35:58 +01:00
Danielle Ratson
152f98e7c5 selftests: mlxsw: Add a scale test for RIF MAC profiles
Query the maximum number of supported RIF MAC profiles using
devlink-resource and verify that all available MAC profiles can be utilized
and that an error is generated when user space tries to exceed this number.

Output example in Spectrum-2:

$ TESTS='rif_mac_profile' ./resource_scale.sh
TEST: 'rif_mac_profile' 4                                           [ OK ]
TEST: 'rif_mac_profile' overflow 5                                  [ OK ]

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 13:35:58 +01:00
Song Liu
20d1b54a52 selftests/bpf: Guess function end for test_get_branch_snapshot
Function in modules could appear in /proc/kallsyms in random order.

ffffffffa02608a0 t bpf_testmod_loop_test
ffffffffa02600c0 t __traceiter_bpf_testmod_test_writable_bare
ffffffffa0263b60 d __tracepoint_bpf_testmod_test_write_bare
ffffffffa02608c0 T bpf_testmod_test_read
ffffffffa0260d08 t __SCT__tp_func_bpf_testmod_test_writable_bare
ffffffffa0263300 d __SCK__tp_func_bpf_testmod_test_read
ffffffffa0260680 T bpf_testmod_test_write
ffffffffa0260860 t bpf_testmod_test_mod_kfunc

Therefore, we cannot reliably use kallsyms_find_next() to find the end of
a function. Replace it with a simple guess (start + 128). This is good
enough for this test.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022234814.318457-1-songliubraving@fb.com
2021-10-25 21:43:05 -07:00
Song Liu
b4e8707276 selftests/bpf: Skip all serial_test_get_branch_snapshot in vm
Skipping the second half of the test is not enough to silent the warning
in dmesg. Skip the whole test before we can either properly silent the
warning in kernel, or fix LBR snapshot for VM.

Fixes: 025bd7c753 ("selftests/bpf: Add test for bpf_get_branch_snapshot")
Fixes: aa67fdb464 ("selftests/bpf: Skip the second half of get_branch_snapshot in vm")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026000733.477714-1-songliubraving@fb.com
2021-10-25 20:41:00 -07:00
Ilya Leoshkevich
2e2c6d3fb3 selftests/bpf: Fix test_core_reloc_mods on big-endian machines
This is the same as commit d164dd9a5c ("selftests/bpf: Fix
test_core_autosize on big-endian machines"), but for
test_core_reloc_mods.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-7-iii@linux.ibm.com
2021-10-25 20:39:42 -07:00
Ilya Leoshkevich
3e7ed9cebb selftests/seccomp: Use __BYTE_ORDER__
Use the compiler-defined __BYTE_ORDER__ instead of the libc-defined
__BYTE_ORDER for consistency.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-6-iii@linux.ibm.com
2021-10-25 20:39:42 -07:00
Ilya Leoshkevich
06fca841fb selftests/bpf: Use __BYTE_ORDER__
Use the compiler-defined __BYTE_ORDER__ instead of the libc-defined
__BYTE_ORDER for consistency.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-4-iii@linux.ibm.com
2021-10-25 20:39:42 -07:00
Andrii Nakryiko
3762a39ce8 selftests/bpf: Split out bpf_verif_scale selftests into multiple tests
Instead of using subtests in bpf_verif_scale selftest, turn each scale
sub-test into its own test. Each subtest is compltely independent and
just reuses a bit of common test running logic, so the conversion is
trivial. For convenience, keep all of BPF verifier scale tests in one
file.

This conversion shaves off a significant amount of time when running
test_progs in parallel mode. E.g., just running scale tests (-t verif_scale):

BEFORE
======
Summary: 24/0 PASSED, 0 SKIPPED, 0 FAILED

real    0m22.894s
user    0m0.012s
sys     0m22.797s

AFTER
=====
Summary: 24/0 PASSED, 0 SKIPPED, 0 FAILED

real    0m12.044s
user    0m0.024s
sys     0m27.869s

Ten second saving right there. test_progs -j is not yet ready to be
turned on by default, unfortunately, and some tests fail almost every
time, but this is a good improvement nevertheless. Ignoring few
failures, here is sequential vs parallel run times when running all
tests now:

SEQUENTIAL
==========
Summary: 206/953 PASSED, 4 SKIPPED, 0 FAILED

real    1m5.625s
user    0m4.211s
sys     0m31.650s

PARALLEL
========
Summary: 204/952 PASSED, 4 SKIPPED, 2 FAILED

real    0m35.550s
user    0m4.998s
sys     0m39.890s

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022223228.99920-5-andrii@kernel.org
2021-10-25 14:45:46 -07:00
Andrii Nakryiko
2c0f51ac32 selftests/bpf: Mark tc_redirect selftest as serial
It seems to cause a lot of harm to kprobe/tracepoint selftests. Yucong
mentioned before that it does manipulate sysfs, which might be the
reason. So let's mark it as serial, though ideally it would be less
intrusive on the system at test.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022223228.99920-4-andrii@kernel.org
2021-10-25 14:45:46 -07:00
Andrii Nakryiko
8ea688e7f4 selftests/bpf: Support multiple tests per file
Revamp how test discovery works for test_progs and allow multiple test
entries per file. Any global void function with no arguments and
serial_test_ or test_ prefix is considered a test.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022223228.99920-3-andrii@kernel.org
2021-10-25 14:45:46 -07:00
Andrii Nakryiko
6972dc3b87 selftests/bpf: Normalize selftest entry points
Ensure that all test entry points are global void functions with no
input arguments. Mark few subtest entry points as static.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022223228.99920-2-andrii@kernel.org
2021-10-25 14:45:45 -07:00
Daniel Latypov
2ab5d5e67f kunit: tool: continue past invalid utf-8 output
kunit.py currently crashes and fails to parse kernel output if it's not
fully valid utf-8.

This can come from memory corruption or just inadvertently printing
out binary data as strings.

E.g. adding this line into a kunit test
  pr_info("\x80")
will cause this exception
  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position
  1961: invalid start byte

We can tell Python how to handle errors, see
https://docs.python.org/3/library/codecs.html#error-handlers

Unfortunately, it doesn't seem like there's a way to specify this in
just one location, so we need to repeat ourselves quite a bit.

Specify `errors='backslashreplace'` so we instead:
* print out the offending byte as '\x80'
* try and continue parsing the output.
  * as long as the TAP lines themselves are valid, we're fine.

Fixed spelling/grammar in commit log:
Shuah Khan <<skhan@linuxfoundation.org>

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-25 13:06:45 -06:00
Shuah Khan
dd40f44eab selftests: x86: fix [-Wstringop-overread] warn in test_process_vm_readv()
Fix the following [-Wstringop-overread] by passing in the variable
instead of the value.

test_vsyscall.c: In function ‘test_process_vm_readv’:
test_vsyscall.c:500:22: warning: ‘__builtin_memcmp_eq’ specified bound 4096 exceeds source size 0 [-Wstringop-overread]
  500 |                 if (!memcmp(buf, (const void *)0xffffffffff600000, 4096)) {
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-25 09:27:20 -06:00
Ido Schimmel
e860419684 selftests: mlxsw: Reduce test run time
Instead of iterating over all the available trap policers, only perform
the tests with three policers: The first, the last and the one in the
middle of the range. On a Spectrum-3 system, this reduces the run time
from almost an hour to a few minutes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:10:11 +01:00
Ido Schimmel
535ac9a5fb selftests: mlxsw: Use permanent neighbours instead of reachable ones
The nexthop objects tests configure dummy reachable neighbours so that
the nexthops will have a MAC address and be programmed to the device.

Since these are dummy reachable neighbours, they can be transitioned by
the kernel to a failed state if they are around for too long. This can
happen, for example, if the "TIMEOUT" variable is configured with a too
high value.

Make the tests more robust by configuring the neighbours as permanent,
so that the tests do not depend on the configured timeout value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:10:11 +01:00
Petr Machata
b8bfafe434 selftests: mlxsw: Add helpers for skipping selftests
A number of mlxsw-specific selftests currently detect whether they are run
on a compatible machine, and bail out silently when not. These tests are
however done in a somewhat impenetrable manner by directly comparing PCI
IDs against a blacklist or a whitelist, and bailing out silently if the
machine is not compatible.

Instead, add a helper, mlxsw_only_on_spectrum(), which allows specifying
the supported machines in a human-readable manner. If the current machine
is incompatible, the helper emits a SKIP message and returns an error code,
based on which the caller can gracefully bail out in a suitable way. This
allows a more readable conditions such as:

	mlxsw_only_on_spectrum 2+ || return

Convert all existing open-coded guards to the new helper. Also add two new
guards to do_mark_test() and do_drop_test(), which are supported only on
Spectrum-2+, but the corresponding check was not there.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:10:11 +01:00
Vladimir Oltean
eccd0a80dc selftests: net: dsa: add a stress test for unlocked FDB operations
This test is a bit strange in that it is perhaps more manual than
others: it does not transmit a clear OK/FAIL verdict, because user space
does not have synchronous feedback from the kernel. If a hardware access
fails, it is in deferred context.

Nonetheless, on sja1105 I have used it successfully to find and solve a
concurrency issue, so it can be used as a starting point for other
driver maintainers too.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 12:59:42 +01:00
Vladimir Oltean
d70b51f284 selftests: lib: forwarding: allow tests to not require mz and jq
These programs are useful, but not all selftests require them.

Additionally, on embedded boards without package management (things like
buildroot), installing mausezahn or jq is not always as trivial as
downloading a package from the web.

So it is actually a bit annoying to require programs that are not used.
Introduce options that can be set by scripts to not enforce these
dependencies. For compatibility, default to "yes".

Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Cc: Guillaume Nault <gnault@redhat.com>
Cc: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 12:59:42 +01:00
David S. Miller
2d7e73f09f Revert "Merge branch 'dsa-rtnl'"
This reverts commit 965e6b262f, reversing
changes made to 4d98bb0d7e.
2021-10-25 12:59:25 +01:00
Kees Cook
d46e58ef77 lkdtm/bugs: Check that a per-task stack canary exists
Introduce REPORT_STACK_CANARY to check for differing stack canaries
between two processes (i.e. that an architecture is correctly implementing
per-task stack canaries), using the task_struct canary as the hint to
locate in the stack. Requires that one of the processes being tested
not be pid 1.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211022223826.330653-3-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-25 09:13:46 +02:00
Kees Cook
149538cd55 selftests/lkdtm: Add way to repeat a test
Some LKDTM tests need to be run more than once (usually to setup and
then later trigger). Until now, the only case was the SOFT_LOCKUP test,
which wasn't useful to run in the bulk selftests. The coming stack canary
checking needs to run twice, so support this with a new test output prefix
"repeat".

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211022223826.330653-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-25 09:13:46 +02:00
Vladimir Oltean
edc90d1585 selftests: net: dsa: add a stress test for unlocked FDB operations
This test is a bit strange in that it is perhaps more manual than
others: it does not transmit a clear OK/FAIL verdict, because user space
does not have synchronous feedback from the kernel. If a hardware access
fails, it is in deferred context.

Nonetheless, on sja1105 I have used it successfully to find and solve a
concurrency issue, so it can be used as a starting point for other
driver maintainers too.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-24 13:47:45 +01:00
Vladimir Oltean
016748961b selftests: lib: forwarding: allow tests to not require mz and jq
These programs are useful, but not all selftests require them.

Additionally, on embedded boards without package management (things like
buildroot), installing mausezahn or jq is not always as trivial as
downloading a package from the web.

So it is actually a bit annoying to require programs that are not used.
Introduce options that can be set by scripts to not enforce these
dependencies. For compatibility, default to "yes".

Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Cc: Guillaume Nault <gnault@redhat.com>
Cc: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-24 13:47:45 +01:00
Yonghong Song
8c18ea2d2c selftests/bpf: Add BTF_KIND_DECL_TAG typedef example in tag.c
Change value type in progs/tag.c to a typedef with a btf_decl_tag.
With `bpftool btf dump file tag.o`, we have
  ...
  [14] TYPEDEF 'value_t' type_id=17
  [15] DECL_TAG 'tag1' type_id=14 component_idx=-1
  [16] DECL_TAG 'tag2' type_id=14 component_idx=-1
  [17] STRUCT '(anon)' size=8 vlen=2
        'a' type_id=2 bits_offset=0
        'b' type_id=2 bits_offset=32
  ...

The btf_tag selftest also succeeded:
  $ ./test_progs -t tag
    #21 btf_tag:OK
    Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021195643.4020315-1-yhs@fb.com
2021-10-22 17:04:44 -07:00
Yonghong Song
557c8c4804 selftests/bpf: Test deduplication for BTF_KIND_DECL_TAG typedef
Add unit tests for deduplication of BTF_KIND_DECL_TAG to typedef types.
Also changed a few comments from "tag" to "decl_tag" to match
BTF_KIND_DECL_TAG enum value name.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021195638.4019770-1-yhs@fb.com
2021-10-22 17:04:44 -07:00
Yonghong Song
9d19a12b02 selftests/bpf: Add BTF_KIND_DECL_TAG typedef unit tests
Test good and bad variants of typedef BTF_KIND_DECL_TAG encoding.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021195633.4019472-1-yhs@fb.com
2021-10-22 17:04:43 -07:00
Stanislav Fomichev
d1321207b1 selftests/bpf: Fix flow dissector tests
- update custom loader to search by name, not section name
- update bpftool commands to use proper pin path

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211021214814.1236114-4-sdf@google.com
2021-10-22 16:53:38 -07:00
Hengqi Chen
487ef148cf selftests/bpf: Switch to new btf__type_cnt/btf__raw_data APIs
Replace the calls to btf__get_nr_types/btf__get_raw_data in
selftests with new APIs btf__type_cnt/btf__raw_data. The old
APIs will be deprecated in libbpf v0.7+.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022130623.1548429-6-hengqi.chen@gmail.com
2021-10-22 16:09:14 -07:00
Andrii Nakryiko
57385ae31f selftests/bpf: Make perf_buffer selftests work on 4.9 kernel again
Recent change to use tp/syscalls/sys_enter_nanosleep for perf_buffer
selftests causes this selftest to fail on 4.9 kernel in libbpf CI ([0]):

  libbpf: prog 'handle_sys_enter': failed to attach to perf_event FD 6: Invalid argument
  libbpf: prog 'handle_sys_enter': failed to attach to tracepoint 'syscalls/sys_enter_nanosleep': Invalid argument

It's not exactly clear why, because perf_event itself is created for
this tracepoint, but I can't even compile 4.9 kernel locally, so it's
hard to figure this out. If anyone has better luck and would like to
help investigating this, I'd really appreciate this.

For now, unblock CI by switching back to raw_syscalls/sys_enter, but reduce
amount of unnecessary samples emitted by filter by process ID. Use
explicit ARRAY map for that to make it work on 4.9 as well, because
global data isn't yet supported there.

Fixes: aa274f98b2 ("selftests/bpf: Fix possible/online index mismatch in perf_buffer test")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022201342.3490692-1-andrii@kernel.org
2021-10-22 14:26:33 -07:00
Florian Westphal
1f83b835a3 fcnal-test: kill hanging ping/nettest binaries on cleanup
On my box I see a bunch of ping/nettest processes hanging
around after fcntal-test.sh is done.

Clean those up before netns deletion.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211021140247.29691-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 14:03:18 -07:00
Jim Mattson
ed290e1c20 KVM: selftests: Fix nested SVM tests when built with clang
Though gcc conveniently compiles a simple memset to "rep stos," clang
prefers to call the libc version of memset. If a test is dynamically
linked, the libc memset isn't available in L1 (nor is the PLT or the
GOT, for that matter). Even if the test is statically linked, the libc
memset may choose to use some CPU features, like AVX, which may not be
enabled in L1. Note that __builtin_memset doesn't solve the problem,
because (a) the compiler is free to call memset anyway, and (b)
__builtin_memset may also choose to use features like AVX, which may
not be available in L1.

To avoid a myriad of problems, use an explicit "rep stos" to clear the
VMCB in generic_svm_setup(), which is called both from L0 and L1.

Reported-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Fixes: 20ba262f86 ("selftests: KVM: AMD Nested test infrastructure")
Message-Id: <20210930003649.4026553-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-22 12:46:37 -04:00
David S. Miller
bdfa75ad70 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Lots of simnple overlapping additions.

With a build fix from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-22 11:41:16 +01:00
Michael Roth
413eaa4ecd KVM: selftests: set CPUID before setting sregs in vcpu creation
Recent kernels have checks to ensure the GPA values in special-purpose
registers like CR3 are within the maximum physical address range and
don't overlap with anything in the upper/reserved range. In the case of
SEV kselftest guests booting directly into 64-bit mode, CR3 needs to be
initialized to the GPA of the page table root, with the encryption bit
set. The kernel accounts for this encryption bit by removing it from
reserved bit range when the guest advertises the bit position via
KVM_SET_CPUID*, but kselftests currently call KVM_SET_SREGS as part of
vm_vcpu_add_default(), before KVM_SET_CPUID*.

As a result, KVM_SET_SREGS will return an error in these cases.
Address this by moving vcpu_set_cpuid() (which calls KVM_SET_CPUID*)
ahead of vcpu_setup() (which calls KVM_SET_SREGS).

While there, address a typo in the assertion that triggers when
KVM_SET_SREGS fails.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20211006203617.13045-1-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Nathan Tempelman <natet@google.com>
2021-10-22 05:19:29 -04:00
Linus Torvalds
6c2c712767 Networking fixes for 5.15-rc7, including fixes from netfilter, and can.
Current release - regressions:
 
  - revert "vrf: reset skb conntrack connection on VRF rcv",
    there are valid uses for previous behavior
 
  - can: m_can: fix iomap_read_fifo() and iomap_write_fifo()
 
 Current release - new code bugs:
 
  - mlx5: e-switch, return correct error code on group creation failure
 
 Previous releases - regressions:
 
  - sctp: fix transport encap_port update in sctp_vtag_verify
 
  - stmmac: fix E2E delay mechanism (in PTP timestamping)
 
 Previous releases - always broken:
 
  - netfilter: ip6t_rt: fix out-of-bounds read of ipv6_rt_hdr
 
  - netfilter: xt_IDLETIMER: fix out-of-bound read caused by lack of init
 
  - netfilter: ipvs: make global sysctl read-only in non-init netns
 
  - tcp: md5: fix selection between vrf and non-vrf keys
 
  - ipv6: count rx stats on the orig netdev when forwarding
 
  - bridge: mcast: use multicast_membership_interval for IGMPv3
 
  - can:
    - j1939: fix UAF for rx_kref of j1939_priv
             abort sessions on receiving bad messages
 
    - isotp: fix TX buffer concurrent access in isotp_sendmsg()
             fix return error on FC timeout on TX path
 
  - ice: fix re-init of RDMA Tx queues and crash if RDMA was not inited
 
  - hns3: schedule the polling again when allocation fails,
    prevent stalls
 
  - drivers: add missing of_node_put() when aborting
    for_each_available_child_of_node()
 
  - ptp: fix possible memory leak and UAF in ptp_clock_register()
 
  - e1000e: fix packet loss in burst mode on Tiger Lake and later
 
  - mlx5e: ipsec: fix more checksum offload issues
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFxgHwACgkQMUZtbf5S
 IrvFgw//T73aR3B2Xvz5/1rtglfmtcUqFQsyGDXGD5HnfbAbsRcz8vcQ/mTsExl7
 +mJY/ZuQefsD7UQDyg3GNhbgf1+pEjHC81ryeNsfET7+JxgYLD3NEYSBYUqIFZUo
 gStAStGBG+ClQUaqlkGFyyf6GrqwpmxZKRr6F9fUsufQ14m9tvcT/QPcrXL4q7qX
 Fz644yUe/IvKnuJDHJVZsc8UXR9NTPCyCNJT9kVewwPMIMEc/xMOg5QONLZT0TlC
 Zk4XJIqlBBEQWrN/QwrGXm82aO+3gQZyD5K9AvpczgcBjOr6FJOmN6zkQrqNNWaC
 2wPAfWi7DALPtOZR6lCxoeWfLRfdn1ZOn5x2z5xrtAXCV2FTaMg8in9TzJ57qmcb
 /l43QzcNGSj1ytyny8pqgdsX2MSqs0O5VSG4egMtz7TeU/rs7uAx2IVHbPT8CHop
 PvhVHeUeu9lGu+FUK8piQbb5aVpbA9qlOj/rXNrHDIxdA9McQgVs+tljNG4X5KtX
 L7BR84wNg98HtIINVx6RjYz9lOpG1qBuw5RCiqiAaN1RBY7lYAhMaAE6U3azjgC+
 AIz/MacNuAz/oTuutQB6/0WZDDJhy4WEy3TrDLlpQNz6yIrpKFN+ftyF6DuVUSMH
 PmtZ4E/DLooQL5KwuoDdYDH1gSMlggBejeGHTFJ+RUMuvRePZQ8=
 =Hwqr
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, and can.

  We'll have one more fix for a socket accounting regression, it's still
  getting polished. Otherwise things look fine.

  Current release - regressions:

   - revert "vrf: reset skb conntrack connection on VRF rcv", there are
     valid uses for previous behavior

   - can: m_can: fix iomap_read_fifo() and iomap_write_fifo()

  Current release - new code bugs:

   - mlx5: e-switch, return correct error code on group creation failure

  Previous releases - regressions:

   - sctp: fix transport encap_port update in sctp_vtag_verify

   - stmmac: fix E2E delay mechanism (in PTP timestamping)

  Previous releases - always broken:

   - netfilter: ip6t_rt: fix out-of-bounds read of ipv6_rt_hdr

   - netfilter: xt_IDLETIMER: fix out-of-bound read caused by lack of
     init

   - netfilter: ipvs: make global sysctl read-only in non-init netns

   - tcp: md5: fix selection between vrf and non-vrf keys

   - ipv6: count rx stats on the orig netdev when forwarding

   - bridge: mcast: use multicast_membership_interval for IGMPv3

   - can:
      - j1939: fix UAF for rx_kref of j1939_priv abort sessions on
        receiving bad messages

      - isotp: fix TX buffer concurrent access in isotp_sendmsg() fix
        return error on FC timeout on TX path

   - ice: fix re-init of RDMA Tx queues and crash if RDMA was not inited

   - hns3: schedule the polling again when allocation fails, prevent
     stalls

   - drivers: add missing of_node_put() when aborting
     for_each_available_child_of_node()

   - ptp: fix possible memory leak and UAF in ptp_clock_register()

   - e1000e: fix packet loss in burst mode on Tiger Lake and later

   - mlx5e: ipsec: fix more checksum offload issues"

* tag 'net-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits)
  usbnet: sanity check for maxpacket
  net: enetc: make sure all traffic classes can send large frames
  net: enetc: fix ethtool counter name for PM0_TERR
  ptp: free 'vclock_index' in ptp_clock_release()
  sfc: Don't use netif_info before net_device setup
  sfc: Export fibre-specific supported link modes
  net/mlx5e: IPsec: Fix work queue entry ethernet segment checksum flags
  net/mlx5e: IPsec: Fix a misuse of the software parser's fields
  net/mlx5e: Fix vlan data lost during suspend flow
  net/mlx5: E-switch, Return correct error code on group creation failure
  net/mlx5: Lag, change multipath and bonding to be mutually exclusive
  ice: Add missing E810 device ids
  igc: Update I226_K device ID
  e1000e: Fix packet loss on Tiger Lake and later
  e1000e: Separate TGP board type from SPT
  ptp: Fix possible memory leak in ptp_clock_register()
  net: stmmac: Fix E2E delay mechanism
  nfc: st95hf: Make spi remove() callback return zero
  net: hns3: disable sriov before unload hclge layer
  net: hns3: fix vf reset workqueue cannot exit
  ...
2021-10-21 15:36:50 -10:00
Andrii Nakryiko
4f2511e199 selftests/bpf: Switch to ".bss"/".rodata"/".data" lookups for internal maps
Utilize libbpf's feature of allowing to lookup internal maps by their
ELF section names. No need to guess or calculate the exact truncated
prefix taken from the object name.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-11-andrii@kernel.org
2021-10-21 17:10:11 -07:00
Andrii Nakryiko
30c5bd9647 selftests/bpf: Demonstrate use of custom .rodata/.data sections
Enhance existing selftests to demonstrate the use of custom
.data/.rodata sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-9-andrii@kernel.org
2021-10-21 17:10:10 -07:00
Jiri Olsa
99d099757a selftests/bpf: Use nanosleep tracepoint in perf buffer test
The perf buffer tests triggers trace with nanosleep syscall,
but monitors all syscalls, which results in lot of data in the
buffer and makes it harder to debug. Let's lower the trace
traffic and monitor just nanosleep syscall.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-4-jolsa@kernel.org
2021-10-21 15:59:31 -07:00
Jiri Olsa
aa274f98b2 selftests/bpf: Fix possible/online index mismatch in perf_buffer test
The perf_buffer fails on system with offline cpus:

  # test_progs -t perf_buffer
  serial_test_perf_buffer:PASS:nr_cpus 0 nsec
  serial_test_perf_buffer:PASS:nr_on_cpus 0 nsec
  serial_test_perf_buffer:PASS:skel_load 0 nsec
  serial_test_perf_buffer:PASS:attach_kprobe 0 nsec
  serial_test_perf_buffer:PASS:perf_buf__new 0 nsec
  serial_test_perf_buffer:PASS:epoll_fd 0 nsec
  skipping offline CPU #4
  serial_test_perf_buffer:PASS:perf_buffer__poll 0 nsec
  serial_test_perf_buffer:PASS:seen_cpu_cnt 0 nsec
  serial_test_perf_buffer:PASS:buf_cnt 0 nsec
  ...
  serial_test_perf_buffer:PASS:fd_check 0 nsec
  serial_test_perf_buffer:PASS:drain_buf 0 nsec
  serial_test_perf_buffer:PASS:consume_buf 0 nsec
  serial_test_perf_buffer:FAIL:cpu_seen cpu 5 not seen
  #88 perf_buffer:FAIL
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

If the offline cpu is from the middle of the possible set,
we get mismatch with possible and online cpu buffers.

The perf buffer test calls perf_buffer__consume_buffer for
all 'possible' cpus, but the library holds only 'online'
cpu buffers and perf_buffer__consume_buffer returns them
based on index.

Adding extra (online) index to keep track of online buffers,
we need the original (possible) index to trigger trace on
proper cpu.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-3-jolsa@kernel.org
2021-10-21 15:59:28 -07:00
Jiri Olsa
d4121376ac selftests/bpf: Fix perf_buffer test on system with offline cpus
The perf_buffer fails on system with offline cpus:

  # test_progs -t perf_buffer
  test_perf_buffer:PASS:nr_cpus 0 nsec
  test_perf_buffer:PASS:nr_on_cpus 0 nsec
  test_perf_buffer:PASS:skel_load 0 nsec
  test_perf_buffer:PASS:attach_kprobe 0 nsec
  test_perf_buffer:PASS:perf_buf__new 0 nsec
  test_perf_buffer:PASS:epoll_fd 0 nsec
  skipping offline CPU #24
  skipping offline CPU #25
  skipping offline CPU #26
  skipping offline CPU #27
  skipping offline CPU #28
  skipping offline CPU #29
  skipping offline CPU #30
  skipping offline CPU #31
  test_perf_buffer:PASS:perf_buffer__poll 0 nsec
  test_perf_buffer:PASS:seen_cpu_cnt 0 nsec
  test_perf_buffer:FAIL:buf_cnt got 24, expected 32
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

Changing the test to check online cpus instead of possible.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-2-jolsa@kernel.org
2021-10-21 15:59:20 -07:00
Dave Marchevsky
e1b9023fc7 selftests/bpf: Add verif_stats test
verified_insns field was added to response of bpf_obj_get_info_by_fd
call on a prog. Confirm that it's being populated by loading a simple
program and asking for its info.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211020074818.1017682-3-davemarchevsky@fb.com
2021-10-21 15:51:47 -07:00
Hengqi Chen
b6c4e71516 selftests/bpf: Test bpf_skc_to_unix_sock() helper
Add a new test which triggers unix_listen kernel function
to test bpf_skc_to_unix_sock helper.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021134752.1223426-3-hengqi.chen@gmail.com
2021-10-21 15:11:06 -07:00
Shuah Khan
c3867ab592 selftests: kvm: fix mismatched fclose() after popen()
get_warnings_count() does fclose() using File * returned from popen().
Fix it to call pclose() as it should.

tools/testing/selftests/kvm/x86_64/mmio_warning_test
x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:87:9: warning: ‘fclose’ called on pointer returned from a mismatched allocation function [-Wmismatched-dealloc]
   87 |         fclose(f);
      |         ^~~~~~~~~
x86_64/mmio_warning_test.c:84:13: note: returned from ‘popen’
   84 |         f = popen("dmesg | grep \"WARNING:\" | wc -l", "r");
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-21 15:16:28 -06:00
David S. Miller
1439caa1d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter fixes for net:

1) Crash due to missing initialization of timer data in
   xt_IDLETIMER, from Juhee Kang.

2) NF_CONNTRACK_SECMARK should be bool in Kconfig, from Vegard Nossum.

3) Skip netdev events on netns removal, from Florian Westphal.

4) Add testcase to show port shadowing via UDP, also from Florian.

5) Remove pr_debug() code in ip6t_rt, this fixes a crash due to
   unsafe access to non-linear skbuff, from Xin Long.

6) Make net/ipv4/vs/debug_level read-only from non-init netns,
   from Antoine Tenart.

7) Remove bogus invocation to bash in selftests/netfilter/nft_flowtable.sh
   also from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21 12:32:41 +01:00
Marc Zyngier
5a2acbbb01 Merge branch kvm/selftests/memslot into kvmarm-master/next
* kvm/selftests/memslot:
  : .
  : Enable KVM memslot selftests on arm64, making them less
  : x86 specific.
  : .
  KVM: selftests: Build the memslot tests for arm64
  KVM: selftests: Make memslot_perf_test arch independent

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-10-21 11:40:03 +01:00
Ricardo Koller
358928fd52 KVM: selftests: Build the memslot tests for arm64
Add memslot_perf_test and memslot_modification_stress_test to the list
of aarch64 selftests.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210907180957.609966-3-ricarkol@google.com
2021-10-21 11:36:41 +01:00
Ricardo Koller
ffb4ce3c49 KVM: selftests: Make memslot_perf_test arch independent
memslot_perf_test uses ucalls for synchronization between guest and
host. Ucalls API is architecture independent: tests do not need to know
details like what kind of exit they generate on a specific arch.  More
specifically, there is no need to check whether an exit is KVM_EXIT_IO
in x86 for the host to know that the exit is ucall related, as
get_ucall() already makes that check.

Change memslot_perf_test to not require specifying what exit does a
ucall generate. Also add a missing ucall_init.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210907180957.609966-2-ricarkol@google.com
2021-10-21 11:36:34 +01:00
Mark Brown
260ea4ba94 selftests: arm64: Factor out utility functions for assembly FP tests
The various floating point test programs written in assembly have a bunch
of helper functions and macros which are cut'n'pasted between them. Factor
them out into a separate source file which is linked into all of them.

We don't include memcmp() since it isn't as generic as it should be and
directly branches to report an error in the programs.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019181851.3341232-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 11:11:27 +01:00
Brendan Jackman
7960d02ddd selftests/bpf: Some more atomic tests
Some new verifier tests that hit some important gaps in the parameter
space for atomic ops.

There are already exhaustive tests for the JIT part in
lib/test_bpf.c, but these exercise the verifier too.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211015093318.1273686-1-jackmanb@google.com
2021-10-20 18:09:08 -07:00
Ilya Leoshkevich
b16d12f390 selftests/bpf: Use cpu_number only on arches that have it
cpu_number exists only on Intel and aarch64, so skip the test involing
it on other arches. An alternative would be to replace it with an
exported non-ifdefed primitive-typed percpu variable from the common
code, but there appears to be none.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-2-iii@linux.ibm.com
2021-10-20 11:40:01 -07:00
Wan Jiabing
b8f49dce79 selftests/bpf: Remove duplicated include in cgroup_helpers
Fix following checkincludes.pl warning:
./scripts/checkincludes.pl tools/testing/selftests/bpf/cgroup_helpers.c
tools/testing/selftests/bpf/cgroup_helpers.c: unistd.h is included more
than once.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211012023231.19911-1-wanjiabing@vivo.com
2021-10-20 10:45:09 -07:00
Rae Moar
d65d07cb5b kunit: tool: improve compatibility of kunit_parser with KTAP specification
Update to kunit_parser to improve compatibility with KTAP
specification including arbitrarily nested tests. Patch accomplishes
three major changes:

- Use a general Test object to represent all tests rather than TestCase
and TestSuite objects. This allows for easier implementation of arbitrary
levels of nested tests and promotes the idea that both test suites and test
cases are tests.

- Print errors incrementally rather than all at once after the
parsing finishes to maximize information given to the user in the
case of the parser given invalid input and to increase the helpfulness
of the timestamps given during printing. Note that kunit.py parse does
not print incrementally yet. However, this fix brings us closer to
this feature.

- Increase compatibility for different formats of input. Arbitrary levels
of nested tests supported. Also, test cases and test suites are now
supported to be present on the same level of testing.

This patch now implements the draft KTAP specification here:
https://lore.kernel.org/linux-kselftest/CA+GJov6tdjvY9x12JsJT14qn6c7NViJxqaJk+r-K1YJzPggFDQ@mail.gmail.com/
We'll update the parser as the spec evolves.

This patch adjusts the kunit_tool_test.py file to check for
the correct outputs from the new parser and adds a new test to check
the parsing for a KTAP result log with correct format for multiple nested
subtests (test_is_test_passed-all_passed_nested.log).

This patch also alters the kunit_json.py file to allow for arbitrarily
nested tests.

Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:22:02 -06:00
Daniel Latypov
7d7c48df81 kunit: tool: yield output from run_kernel in real time
Currently, `run_kernel()` dumps all the kernel output to a file
(.kunit/test.log) and then opens the file and yields it to callers.
This made it easier to respect the requested timeout, if any.

But it means that we can't yield the results in real time, either to the
parser or to stdout (if --raw_output is set).

This change spins up a background thread to enforce the timeout, which
allows us to yield the kernel output in real time, while also copying it
to the .kunit/test.log file.
It's also careful to ensure that the .kunit/test.log file is complete,
even in the kunit_parser throws an exception/otherwise doesn't consume
every line, see the new `finally` block and unit test.

For example:

$ ./tools/testing/kunit/kunit.py run --arch=x86_64 --raw_output
<configure + build steps>
...
<can now see output from QEMU in real time>

This does not currently have a visible effect when --raw_output is not
passed, as kunit_parser.py currently only outputs everything at the end.
But that could change, and this patch is a necessary step towards
showing parsed test results in real time.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:22:02 -06:00
Daniel Latypov
ff9e09a376 kunit: tool: support running each suite/test separately
The new --run_isolated flag makes the tool boot the kernel once per
suite or test, preventing leftover state from one suite to impact the
other. This can be useful as a starting point to debugging test
hermeticity issues.

Note: it takes a lot longer, so people should not use it normally.

Consider the following very simplified example:

  bool disable_something_for_test = false;
  void function_being_tested() {
    ...
    if (disable_something_for_test) return;
    ...
  }

  static void test_before(struct kunit *test)
  {
    disable_something_for_test = true;
    function_being_tested();
    /* oops, we forgot to reset it back to false */
  }

  static void test_after(struct kunit *test)
  {
    /* oops, now "fixing" test_before can cause test_after to fail! */
    function_being_tested();
  }

Presented like this, the issues are obvious, but it gets a lot more
complicated to track down as the amount of test setup and helper
functions increases.

Another use case is memory corruption. It might not be surfaced as a
failure/crash in the test case or suite that caused it. I've noticed in
kunit's own unit tests, the 3rd suite after might be the one to finally
crash after an out-of-bounds write, for example.

Example usage:

Per suite:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit --run_isolated=suite
...
Starting KUnit Kernel (1/7)...
============================================================
======== [PASSED] kunit_executor_test ========
....
Testing complete. 5 tests run. 0 failed. 0 crashed. 0 skipped.
Starting KUnit Kernel (2/7)...
============================================================
======== [PASSED] kunit-try-catch-test ========
...

Per test:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit --run_isolated=test
Starting KUnit Kernel (1/23)...
============================================================
======== [PASSED] kunit_executor_test ========
[PASSED] parse_filter_test
============================================================
Testing complete. 1 tests run. 0 failed. 0 crashed. 0 skipped.
Starting KUnit Kernel (2/23)...
============================================================
======== [PASSED] kunit_executor_test ========
[PASSED] filter_subsuite_test
...

It works with filters as well:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit --run_isolated=suite example
...
Starting KUnit Kernel (1/1)...
============================================================
======== [PASSED] example ========
...

It also handles test filters, '*.*skip*' runs these 3 tests:
  kunit_status.kunit_status_mark_skipped_test
  example.example_skip_test
  example.example_mark_skipped_test

Fixed up merge conflict between:
  d8c23ead70 ("kunit: tool: better handling of quasi-bool args (--json, --raw_output)") and
  6710951ee039 ("kunit: tool: support running each suite/test separately")
    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Shuah Khan <skhan@linuxfoundation.org>

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:21:08 -06:00
Daniel Latypov
5f6aa6d82e kunit: tool: actually track how long it took to run tests
This is a long standing bug in kunit tool.
Since these files were added, run_kernel() has always yielded lines.

That means, the call to run_kernel() returns before the kernel finishes
executing tests, potentially before a single line of output is even
produced.

So code like this
  time_start = time.time()
  result = linux.run_kernel(...)
  time_end = time.time()

would only measure the time taken for python to give back the generator
object.

From a caller's perspective, the only way to know the kernel has exited
is for us to consume all the output from the `result` generator object.
Alternatively, we could change run_kernel() to try and do its own book
keeping and return the total time, but that doesn't seem worth it.

This change makes us record `time_end` after we're done parsing all the
output (which should mean we've consumed all of it, or errored out).
That means we're including in the parsing time as well, but that should
be quite small, and it's better than claiming it took 0s to run tests.

Let's use this as an example:
$ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit example

Before:
Elapsed time: 7.684s total, 0.001s configuring, 4.692s building, 0.000s running

After:
Elapsed time: 6.283s total, 0.001s configuring, 3.202s building, 3.079s running

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:50 -06:00
Daniel Latypov
7ef925ea81 kunit: tool: factor exec + parse steps into a function
Currently this code is copy-pasted between the normal "run" subcommand
and the "exec" subcommand.

Given we don't have any interest in just executing the tests without
giving the user any indication what happened (i.e. parsing the output),
make a function that does both this things and can be reused.

This will be useful when we allow more complicated ways of running
tests, e.g. invoking the kernel multiple times instead of just once,
etc.

We remove input_data from the ParseRequest so the callers don't have to
pass in a dummy value for this field. Named tuples are also immutable,
so if they did pass in a dummy, exec_tests() would need to make a copy
to call parse_tests().

Removing it also makes KunitParseRequest match the other *Request types,
as they only contain user arguments/flags, not data.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:50 -06:00
Daniel Latypov
fe678fed2c kunit: tool: show list of valid --arch options when invalid
Consider this attempt to run KUnit in QEMU:
$ ./tools/testing/kunit/kunit.py run --arch=x86

Before you'd get this error message:
kunit_kernel.ConfigError: x86 is not a valid arch

After:
kunit_kernel.ConfigError: x86 is not a valid arch, options are ['alpha', 'arm', 'arm64', 'i386', 'powerpc', 'riscv', 's390', 'sparc', 'x86_64']

This should make it a bit easier for people to notice when they make
typos, etc. Currently, one would have to dive into the python code to
figure out what the valid set is.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:50 -06:00
Daniel Latypov
a54ea2e057 kunit: tool: misc fixes (unused vars, imports, leaked files)
Drop some variables in unit tests that were unused and/or add assertions
based on them.

For ExitStack, it was imported, but the `es` variable wasn't used so it
didn't do anything, and we were leaking the file objects.
Refactor it to just use nested `with` statements to properly close them.

And drop the direct use of .close() on file objects in the kunit tool
unit test, as these can be leaked if test assertions fail.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-19 14:18:49 -06:00