Commit Graph

752416 Commits

Author SHA1 Message Date
Wang Sheng-Hui
c0885f61bb samples, bpf: remove redundant ret assignment in bpf_load_program()
2 redundant ret assignments removed:

* 'ret = 1' before the logic 'if (data_maps)', and if any errors jump to
  label 'done'. No 'ret = 1' needed before the error jump.

* After the '/* load programs */' part, if everything goes well, then
  the BPF code will be loaded and 'ret' set to 0 by load_and_attach().
  If something goes wrong, 'ret' set to none-O, the redundant 'ret = 0'
  after the for clause will make the error skipped.

  For example, if some BPF code cannot provide supported program types
  in ELF SEC("unknown"), the for clause will not call load_and_attach()
  to load the BPF code. 1 should be returned to callees instead of 0.

Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:47:06 +02:00
Daniel Borkmann
a6712d45b0 Merge branch 'bpf-uapi-helper-doc'
Quentin Monnet says:

====================
eBPF helper functions can be called from within eBPF programs to perform
a variety of tasks that would be otherwise hard or impossible to do with
eBPF itself. There is a growing number of such helper functions in the
kernel, but documentation is scarce. The main user space header file
does contain a short commented description of most helpers, but it is
somewhat outdated and not complete. It is more a "cheat sheet" than a
real documentation accessible to new eBPF developers.

This commit attempts to improve the situation by replacing the existing
overview for the helpers with a more developed description. Furthermore,
a Python script is added to generate a manual page for eBPF helpers. The
workflow is the following, and requires the rst2man utility:

    $ ./scripts/bpf_helpers_doc.py \
            --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst
    $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7
    $ man /tmp/bpf-helpers.7

The objective is to keep all documentation related to the helpers in a
single place, and to be able to generate from here a manual page that
could be packaged in the man-pages repository and shipped with most
distributions.

Additionally, parsing the prototypes of the helper functions could
hopefully be reused, with a different Printer object, to generate
header files needed in some eBPF-related projects.

Regarding the description of each helper, it comprises several items:

- The function prototype.
- A description of the function and of its arguments (except for a
  couple of cases, when there are no arguments and the return value
  makes the function usage really obvious).
- A description of return values (if not void).

Additional items such as the list of compatible eBPF program and map
types for each helper, Linux kernel version that introduced the helper,
GPL-only restriction, and commit hash could be added in the future, but
it was decided on the mailing list to leave them aside for now.

For several helpers, descriptions are inspired (at times, nearly copied)
from the commit logs introducing them in the kernel--Many thanks to
their respective authors! Some sentences were also adapted from comments
from the reviews, thanks to the reviewers as well. Descriptions were
completed as much as possible, the objective being to have something easily
accessible even for people just starting with eBPF. There is probably a bit
more work to do in this direction for some helpers.

Some RST formatting is used in the descriptions (not in function
prototypes, to keep them readable, but the Python script provided in
order to generate the RST for the manual page does add formatting to
prototypes, to produce something pretty) to get "bold" and "italics" in
manual pages. Hopefully, the descriptions in bpf.h file remains
perfectly readable. Note that the few trailing white spaces are
intentional, removing them would break paragraphs for rst2man.

The descriptions should ideally be updated each time someone adds a new
helper, or updates the behaviour (new socket option supported, ...) or
the interface (new flags available, ...) of existing ones.

To ease the review process, the documentation has been split into several
patches.

v3 -> v4:
- Add a patch (#9) for newly added BPF helpers.
- Add a patch (#10) to update UAPI bpf.h version under tools/.
- Use SPDX tag in Python script.
- Several fixes on man page header and footer, and helpers documentation.
  Please refer to individual patches for details.

RFC v2 -> PATCH v3:
Several fixes on man page header and footer, and helpers documentation.
Please refer to individual patches for details.

RFC v1 -> RFC v2:
- Remove "For" (compatible program and map types), "Since" (minimal
  Linux kernel version required), "GPL only" sections and commit hashes
  for the helpers.
- Add comment on top of the description list to explain how this
  documentation is supposed to be processed.
- Update Python script accordingly (remove the same sections, and remove
  paragraphs on program types and GPL restrictions from man page
  header).
- Split series into several patches.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: linux-doc@vger.kernel.org
Cc: linux-man@vger.kernel.org
2018-04-27 00:22:01 +02:00
Quentin Monnet
9cde0c8892 bpf: update bpf.h uapi header for tools
Update tools/include/uapi/linux/bpf.h file in order to reflect the
changes for BPF helper functions documentation introduced in previous
commits.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:59 +02:00
Quentin Monnet
2d020dd771 bpf: add documentation for eBPF helpers (65-66)
Add documentation for eBPF helper functions to bpf.h user header file.
This documentation can be parsed with the Python script provided in
another commit of the patch series, in order to provide a RST document
that can later be converted into a man page.

The objective is to make the documentation easily understandable and
accessible to all eBPF developers, including beginners.

This patch contains descriptions for the following helper functions:

Helper from Nikita:
- bpf_xdp_adjust_tail()

Helper from Eyal:
- bpf_skb_get_xfrm_state()

v4:
- New patch (helpers did not exist yet for previous versions).

Cc: Nikita V. Shirokov <tehnerd@tehnerd.com>
Cc: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:59 +02:00
Quentin Monnet
ab12704099 bpf: add documentation for eBPF helpers (58-64)
Add documentation for eBPF helper functions to bpf.h user header file.
This documentation can be parsed with the Python script provided in
another commit of the patch series, in order to provide a RST document
that can later be converted into a man page.

The objective is to make the documentation easily understandable and
accessible to all eBPF developers, including beginners.

This patch contains descriptions for the following helper functions, all
written by John:

- bpf_redirect_map()
- bpf_sk_redirect_map()
- bpf_sock_map_update()
- bpf_msg_redirect_map()
- bpf_msg_apply_bytes()
- bpf_msg_cork_bytes()
- bpf_msg_pull_data()

v4:
- bpf_redirect_map(): Fix typos: "XDP_ABORT" changed to "XDP_ABORTED",
  "his" to "this". Also add a paragraph on performance improvement over
  bpf_redirect() helper.

v3:
- bpf_sk_redirect_map(): Improve description of BPF_F_INGRESS flag.
- bpf_msg_redirect_map(): Improve description of BPF_F_INGRESS flag.
- bpf_redirect_map(): Fix note on CPU redirection, not fully implemented
  for generic XDP but supported on native XDP.
- bpf_msg_pull_data(): Clarify comment about invalidated verifier
  checks.

Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:59 +02:00
Quentin Monnet
7aa79a869d bpf: add documentation for eBPF helpers (51-57)
Add documentation for eBPF helper functions to bpf.h user header file.
This documentation can be parsed with the Python script provided in
another commit of the patch series, in order to provide a RST document
that can later be converted into a man page.

The objective is to make the documentation easily understandable and
accessible to all eBPF developers, including beginners.

This patch contains descriptions for the following helper functions:

Helpers from Lawrence:
- bpf_setsockopt()
- bpf_getsockopt()
- bpf_sock_ops_cb_flags_set()

Helpers from Yonghong:
- bpf_perf_event_read_value()
- bpf_perf_prog_read_value()

Helper from Josef:
- bpf_override_return()

Helper from Andrey:
- bpf_bind()

v4:
- bpf_perf_event_read_value(): State that this helper should be
  preferred over bpf_perf_event_read().

v3:
- bpf_perf_event_read_value(): Fix time of selection for perf event type
  in description. Remove occurences of "cores" to avoid confusion with
  "CPU".
- bpf_bind(): Remove last paragraph of description, which was off topic.

Cc: Lawrence Brakmo <brakmo@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Yonghong Song <yhs@fb.com>
[for bpf_perf_event_read_value(), bpf_perf_prog_read_value()]
Acked-by: Andrey Ignatov <rdna@fb.com>
[for bpf_bind()]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:59 +02:00
Quentin Monnet
c6b5fb8690 bpf: add documentation for eBPF helpers (42-50)
Add documentation for eBPF helper functions to bpf.h user header file.
This documentation can be parsed with the Python script provided in
another commit of the patch series, in order to provide a RST document
that can later be converted into a man page.

The objective is to make the documentation easily understandable and
accessible to all eBPF developers, including beginners.

This patch contains descriptions for the following helper functions:

Helper from Kaixu:
- bpf_perf_event_read()

Helpers from Martin:
- bpf_skb_under_cgroup()
- bpf_xdp_adjust_head()

Helpers from Sargun:
- bpf_probe_write_user()
- bpf_current_task_under_cgroup()

Helper from Thomas:
- bpf_skb_change_head()

Helper from Gianluca:
- bpf_probe_read_str()

Helpers from Chenbo:
- bpf_get_socket_cookie()
- bpf_get_socket_uid()

v4:
- bpf_perf_event_read(): State that bpf_perf_event_read_value() should
  be preferred over this helper.
- bpf_skb_change_head(): Clarify comment about invalidated verifier
  checks.
- bpf_xdp_adjust_head(): Clarify comment about invalidated verifier
  checks.
- bpf_probe_write_user(): Add that dst must be a valid user space
  address.
- bpf_get_socket_cookie(): Improve description by making clearer that
  the cockie belongs to the socket, and state that it remains stable for
  the life of the socket.

v3:
- bpf_perf_event_read(): Fix time of selection for perf event type in
  description. Remove occurences of "cores" to avoid confusion with
  "CPU".

Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Gianluca Borello <g.borello@gmail.com>
Cc: Chenbo Feng <fengc@google.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
[for bpf_skb_under_cgroup(), bpf_xdp_adjust_head()]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:59 +02:00
Quentin Monnet
fa15601ab3 bpf: add documentation for eBPF helpers (33-41)
Add documentation for eBPF helper functions to bpf.h user header file.
This documentation can be parsed with the Python script provided in
another commit of the patch series, in order to provide a RST document
that can later be converted into a man page.

The objective is to make the documentation easily understandable and
accessible to all eBPF developers, including beginners.

This patch contains descriptions for the following helper functions, all
written by Daniel:

- bpf_get_hash_recalc()
- bpf_skb_change_tail()
- bpf_skb_pull_data()
- bpf_csum_update()
- bpf_set_hash_invalid()
- bpf_get_numa_node_id()
- bpf_set_hash()
- bpf_skb_adjust_room()
- bpf_xdp_adjust_meta()

v4:
- bpf_skb_change_tail(): Clarify comment about invalidated verifier
  checks.
- bpf_skb_pull_data(): Clarify the motivation for using this helper or
  bpf_skb_load_bytes(), on non-linear buffers. Fix RST formatting for
  *skb*. Clarify comment about invalidated verifier checks.
- bpf_csum_update(): Fix description of checksum (entire packet, not IP
  checksum). Fix a typo: "header" instead of "helper".
- bpf_set_hash_invalid(): Mention bpf_get_hash_recalc().
- bpf_get_numa_node_id(): State that the helper is not restricted to
  programs attached to sockets.
- bpf_skb_adjust_room(): Clarify comment about invalidated verifier
  checks.
- bpf_xdp_adjust_meta(): Clarify comment about invalidated verifier
  checks.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:59 +02:00
Quentin Monnet
1fdd08bedc bpf: add documentation for eBPF helpers (23-32)
Add documentation for eBPF helper functions to bpf.h user header file.
This documentation can be parsed with the Python script provided in
another commit of the patch series, in order to provide a RST document
that can later be converted into a man page.

The objective is to make the documentation easily understandable and
accessible to all eBPF developers, including beginners.

This patch contains descriptions for the following helper functions, all
written by Daniel:

- bpf_get_prandom_u32()
- bpf_get_smp_processor_id()
- bpf_get_cgroup_classid()
- bpf_get_route_realm()
- bpf_skb_load_bytes()
- bpf_csum_diff()
- bpf_skb_get_tunnel_opt()
- bpf_skb_set_tunnel_opt()
- bpf_skb_change_proto()
- bpf_skb_change_type()

v4:
- bpf_get_prandom_u32(): Warn that the prng is not cryptographically
  secure.
- bpf_get_smp_processor_id(): Fix a typo (case).
- bpf_get_cgroup_classid(): Clarify description. Add notes on the helper
  being limited to cgroup v1, and to egress path.
- bpf_get_route_realm(): Add comparison with bpf_get_cgroup_classid().
  Add a note about usage with TC and advantage of clsact. Fix a typo in
  return value ("sdb" instead of "skb").
- bpf_skb_load_bytes(): Make explicit loading large data loads it to the
  eBPF stack.
- bpf_csum_diff(): Add a note on seed that can be cascaded. Link to
  bpf_l3|l4_csum_replace().
- bpf_skb_get_tunnel_opt(): Add a note about usage with "collect
  metadata" mode, and example of this with Geneve.
- bpf_skb_set_tunnel_opt(): Add a link to bpf_skb_get_tunnel_opt()
  description.
- bpf_skb_change_proto(): Mention that the main use case is NAT64.
  Clarify comment about invalidated verifier checks.

v3:
- bpf_get_prandom_u32(): Fix helper name :(. Add description, including
  a note on the internal random state.
- bpf_get_smp_processor_id(): Add description, including a note on the
  processor id remaining stable during program run.
- bpf_get_cgroup_classid(): State that CONFIG_CGROUP_NET_CLASSID is
  required to use the helper. Add a reference to related documentation.
  State that placing a task in net_cls controller disables cgroup-bpf.
- bpf_get_route_realm(): State that CONFIG_CGROUP_NET_CLASSID is
  required to use this helper.
- bpf_skb_load_bytes(): Fix comment on current use cases for the helper.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:59 +02:00
Quentin Monnet
c456dec4d2 bpf: add documentation for eBPF helpers (12-22)
Add documentation for eBPF helper functions to bpf.h user header file.
This documentation can be parsed with the Python script provided in
another commit of the patch series, in order to provide a RST document
that can later be converted into a man page.

The objective is to make the documentation easily understandable and
accessible to all eBPF developers, including beginners.

This patch contains descriptions for the following helper functions, all
written by Alexei:

- bpf_get_current_pid_tgid()
- bpf_get_current_uid_gid()
- bpf_get_current_comm()
- bpf_skb_vlan_push()
- bpf_skb_vlan_pop()
- bpf_skb_get_tunnel_key()
- bpf_skb_set_tunnel_key()
- bpf_redirect()
- bpf_perf_event_output()
- bpf_get_stackid()
- bpf_get_current_task()

v4:
- bpf_redirect(): Fix typo: "XDP_ABORT" changed to "XDP_ABORTED". Add
  note on bpf_redirect_map() providing better performance. Replace "Save
  for" with "Except for".
- bpf_skb_vlan_push(): Clarify comment about invalidated verifier
  checks.
- bpf_skb_vlan_pop(): Clarify comment about invalidated verifier
  checks.
- bpf_skb_get_tunnel_key(): Add notes on tunnel_id, "collect metadata"
  mode, and example tunneling protocols with which it can be used.
- bpf_skb_set_tunnel_key(): Add a reference to the description of
  bpf_skb_get_tunnel_key().
- bpf_perf_event_output(): Specify that, and for what purpose, the
  helper can be used with programs attached to TC and XDP.

v3:
- bpf_skb_get_tunnel_key(): Change and improve description and example.
- bpf_redirect(): Improve description of BPF_F_INGRESS flag.
- bpf_perf_event_output(): Fix first sentence of description. Delete
  wrong statement on context being evaluated as a struct pt_reg. Remove
  the long yet incomplete example.
- bpf_get_stackid(): Add a note about PERF_MAX_STACK_DEPTH being
  configurable.

Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:58 +02:00
Quentin Monnet
ad4a522349 bpf: add documentation for eBPF helpers (01-11)
Add documentation for eBPF helper functions to bpf.h user header file.
This documentation can be parsed with the Python script provided in
another commit of the patch series, in order to provide a RST document
that can later be converted into a man page.

The objective is to make the documentation easily understandable and
accessible to all eBPF developers, including beginners.

This patch contains descriptions for the following helper functions, all
written by Alexei:

- bpf_map_lookup_elem()
- bpf_map_update_elem()
- bpf_map_delete_elem()
- bpf_probe_read()
- bpf_ktime_get_ns()
- bpf_trace_printk()
- bpf_skb_store_bytes()
- bpf_l3_csum_replace()
- bpf_l4_csum_replace()
- bpf_tail_call()
- bpf_clone_redirect()

v4:
- bpf_map_lookup_elem(): Add "const" qualifier for key.
- bpf_map_update_elem(): Add "const" qualifier for key and value.
- bpf_map_lookup_elem(): Add "const" qualifier for key.
- bpf_skb_store_bytes(): Clarify comment about invalidated verifier
  checks.
- bpf_l3_csum_replace(): Mention L3 instead of just IP, and add a note
  about bpf_csum_diff().
- bpf_l4_csum_replace(): Mention L4 instead of just TCP/UDP, and add a
  note about bpf_csum_diff().
- bpf_tail_call(): Bring minor edits to description.
- bpf_clone_redirect(): Add a note about the relation with
  bpf_redirect(). Also clarify comment about invalidated verifier
  checks.

v3:
- bpf_map_lookup_elem(): Fix description of restrictions for flags
  related to the existence of the entry.
- bpf_trace_printk(): State that trace_pipe can be configured. Fix
  return value in case an unknown format specifier is met. Add a note on
  kernel log notice when the helper is used. Edit example.
- bpf_tail_call(): Improve comment on stack inheritance.
- bpf_clone_redirect(): Improve description of BPF_F_INGRESS flag.

Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:58 +02:00
Quentin Monnet
56a092c895 bpf: add script and prepare bpf.h for new helpers documentation
Remove previous "overview" of eBPF helpers from user bpf.h header.
Replace it by a comment explaining how to process the new documentation
(to come in following patches) with a Python script to produce RST, then
man page documentation.

Also add the aforementioned Python script under scripts/. It is used to
process include/uapi/linux/bpf.h and to extract helper descriptions, to
turn it into a RST document that can further be processed with rst2man
to produce a man page. The script takes one "--filename <path/to/file>"
option. If the script is launched from scripts/ in the kernel root
directory, it should be able to find the location of the header to
parse, and "--filename <path/to/file>" is then optional. If it cannot
find the file, then the option becomes mandatory. RST-formatted
documentation is printed to standard output.

Typical workflow for producing the final man page would be:

    $ ./scripts/bpf_helpers_doc.py \
            --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst
    $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7
    $ man /tmp/bpf-helpers.7

Note that the tool kernel-doc cannot be used to document eBPF helpers,
whose signatures are not available directly in the header files
(pre-processor directives are used to produce them at the beginning of
the compilation process).

v4:
- Also remove overviews for newly added bpf_xdp_adjust_tail() and
  bpf_skb_get_xfrm_state().
- Remove vague statement about what helpers are restricted to GPL
  programs in "LICENSE" section for man page footer.
- Replace license boilerplate with SPDX tag for Python script.

v3:
- Change license for man page.
- Remove "for safety reasons" from man page header text.
- Change "packets metadata" to "packets" in man page header text.
- Move and fix comment on helpers introducing no overhead.
- Remove "NOTES" section from man page footer.
- Add "LICENSE" section to man page footer.
- Edit description of file include/uapi/linux/bpf.h in man page footer.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:21:58 +02:00
Daniel Borkmann
3f13de6d6f Merge branch 'bpf-tunnel-metadata-selftests'
William Tu says:

====================
The patch series provide end-to-end eBPF tunnel testsute.  A common topology
is created below for all types of tunnels:

Topology:
---------
     root namespace   |     at_ns0 namespace
                      |
      -----------     |     -----------
      | tnl dev |     |     | tnl dev |  (overlay network)
      -----------     |     -----------
      metadata-mode   |     native-mode
       with bpf       |
                      |
      ----------      |     ----------
      |  veth1  | --------- |  veth0  |  (underlay network)
      ----------    peer    ----------

Device Configuration
--------------------
 Root namespace with metadata-mode tunnel + BPF
 Device names and addresses:
       veth1 IP: 172.16.1.200, IPv6: 00::22 (underlay)
       tunnel dev <type>11, ex: gre11, IPv4: 10.1.1.200 (overlay)

 Namespace at_ns0 with native tunnel
 Device names and addresses:
       veth0 IPv4: 172.16.1.100, IPv6: 00::11 (underlay)
       tunnel dev <type>00, ex: gre00, IPv4: 10.1.1.100 (overlay)

End-to-end ping packet flow
---------------------------
 Most of the tests start by namespace creation, device configuration,
 then ping the underlay and overlay network.  When doing 'ping 10.1.1.100'
 from root namespace, the following operations happen:
 1) Route lookup shows 10.1.1.100/24 belongs to tnl dev, fwd to tnl dev.
 2) Tnl device's egress BPF program is triggered and set the tunnel metadata,
    with remote_ip=172.16.1.200 and others.
 3) Outer tunnel header is prepended and route the packet to veth1's egress
 4) veth0's ingress queue receive the tunneled packet at namespace at_ns0
 5) Tunnel protocol handler, ex: vxlan_rcv, decap the packet
 6) Forward the packet to the overlay tnl dev

Test Cases
-----------------------------
 Tunnel Type |  BPF Programs
-----------------------------
 GRE:          gre_set_tunnel, gre_get_tunnel
 IP6GRE:       ip6gretap_set_tunnel, ip6gretap_get_tunnel
 ERSPAN:       erspan_set_tunnel, erspan_get_tunnel
 IP6ERSPAN:    ip4ip6erspan_set_tunnel, ip4ip6erspan_get_tunnel
 VXLAN:        vxlan_set_tunnel, vxlan_get_tunnel
 IP6VXLAN:     ip6vxlan_set_tunnel, ip6vxlan_get_tunnel
 GENEVE:       geneve_set_tunnel, geneve_get_tunnel
 IP6GENEVE:    ip6geneve_set_tunnel, ip6geneve_get_tunnel
 IPIP:         ipip_set_tunnel, ipip_get_tunnel
 IP6IP:        ipip6_set_tunnel, ipip6_get_tunnel,
               ip6ip6_set_tunnel, ip6ip6_get_tunnel
 XFRM:         xfrm_get_state
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:11:16 +02:00
William Tu
b05cd74043 samples/bpf: remove the bpf tunnel testsuite.
Move the testsuite to
selftests/bpf/{test_tunnel_kern.c, test_tunnel.sh}

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:11:15 +02:00
William Tu
933a741e3b selftests/bpf: bpf tunnel test.
The patch migrates the original tests at samples/bpf/tcbpf2_kern.c
and samples/bpf/test_tunnel_bpf.sh to selftests.  There are a couple
changes from the original:
    1) add ipv6 vxlan, ipv6 geneve, ipv6 ipip tests
    2) simplify the original ipip tests (remove iperf tests)
    3) improve documentation
    4) use bpf_ntoh* and bpf_hton* api

In summary, 'test_tunnel_kern.o' contains the following bpf program:
  GRE: gre_set_tunnel, gre_get_tunnel
  IP6GRE: ip6gretap_set_tunnel, ip6gretap_get_tunnel
  ERSPAN: erspan_set_tunnel, erspan_get_tunnel
  IP6ERSPAN: ip4ip6erspan_set_tunnel, ip4ip6erspan_get_tunnel
  VXLAN: vxlan_set_tunnel, vxlan_get_tunnel
  IP6VXLAN: ip6vxlan_set_tunnel, ip6vxlan_get_tunnel
  GENEVE: geneve_set_tunnel, geneve_get_tunnel
  IP6GENEVE: ip6geneve_set_tunnel, ip6geneve_get_tunnel
  IPIP: ipip_set_tunnel, ipip_get_tunnel
  IP6IP: ipip6_set_tunnel, ipip6_get_tunnel,
         ip6ip6_set_tunnel, ip6ip6_get_tunnel
  XFRM: xfrm_get_state

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-27 00:11:14 +02:00
Nikita V. Shirokov
f761312023 bpf: fix xdp_generic for bpf_adjust_tail usecase
When bpf_adjust_tail was introduced for generic xdp, it changed skb's tail
pointer, so it was pointing to the new "end of the packet". However skb's
len field wasn't properly modified, so on the wire ethernet frame had
original (or even bigger, if adjust_head was used) size. This diff is
fixing this.

Fixes: 198d83bb3 (" bpf: make generic xdp compatible w/ bpf_xdp_adjust_tail")
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-26 22:56:40 +02:00
Jiri Olsa
9b984a20ca tools, bpftool: Display license GPL compatible in prog show/list
Display the license "gpl" string in bpftool prog command, like:

  # bpftool prog list
  5: tracepoint  name func  tag 57cd311f2e27366b  gpl
          loaded_at Apr 26/09:37  uid 0
          xlated 16B  not jited  memlock 4096B

  # bpftool --json --pretty prog show
  [{
          "id": 5,
          "type": "tracepoint",
          "name": "func",
          "tag": "57cd311f2e27366b",
          "gpl_compatible": true,
          "loaded_at": "Apr 26/09:37",
          "uid": 0,
          "bytes_xlated": 16,
          "jited": false,
          "bytes_memlock": 4096
      }
  ]

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-26 22:39:22 +02:00
Jiri Olsa
fb6ef42b5c tools, bpf: Sync bpf.h uapi header
Syncing the bpf.h uapi header with tools.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-26 22:38:38 +02:00
Jiri Olsa
b85fab0e67 bpf: Add gpl_compatible flag to struct bpf_prog_info
Adding gpl_compatible flag to struct bpf_prog_info
so it can be dumped via bpf_prog_get_info_by_fd and
displayed via bpftool progs dump.

Alexei noticed 4-byte hole in struct bpf_prog_info,
so we put the u32 flags field in there, and we can
keep adding bit fields in there without breaking
user space.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-26 22:36:11 +02:00
Daniel Borkmann
af487c5777 Merge branch 'bpf-optimize-neg-sums'
Jakub Kicinski says:

====================
This set adds an optimization run to the NFP jit to turn ADD and SUB
instructions with negative immediate into the opposite operation with
a positive immediate. NFP can fit small immediates into the instructions
but it can't ever fit negative immediates. Addition of small negative
immediates is quite common in BPF programs for stack address calculations,
therefore this optimization gives us non-negligible savings in instruction
count (up to 4%).
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 09:56:12 +02:00
Jakub Kicinski
7bdc97be90 nfp: bpf: optimize comparisons to negative constants
Comparison instruction requires a subtraction.  If the constant
is negative we are more likely to fit it into a NFP instruction
directly if we change the sign and use addition.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 09:56:10 +02:00
Jakub Kicinski
61dd8f0007 nfp: bpf: tabularize generations of compare operations
There are quite a few compare instructions now, use a table
to translate BPF instruction code to NFP instruction parameters
instead of parameterizing helpers.  This saves LOC and makes
future extensions easier.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 09:56:10 +02:00
Jakub Kicinski
6c59500c2d nfp: bpf: optimize add/sub of a negative constant
NFP instruction set can fit small immediates into the instruction.
Negative integers, however, will never fit because they will have
highest bit set.  If we swap the ALU op between ADD and SUB and
negate the constant we have a better chance of fitting small negative
integers into the instruction itself and saving one or two cycles.

immed[gprB_21, 0xfffffffc]
alu[gprA_4, gprA_4, +, gprB_21], gpr_wrboth
immed[gprB_21, 0xffffffff]
alu[gprA_5, gprA_5, +carry, gprB_21], gpr_wrboth

now becomes:

alu[gprA_4, gprA_4, -, 4], gpr_wrboth
alu[gprA_5, gprA_5, -carry, 0], gpr_wrboth

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 09:56:10 +02:00
Jakub Kicinski
9c9e53233c nfp: bpf: remove double space
Whitespace cleanup - remove double space.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 09:56:10 +02:00
John Fastabend
a18fda1a62 bpf: reduce runtime of test_sockmap tests
When test_sockmap was running outside of selftests and was not being
run by build bots it was reasonable to spend significant amount of
time running various tests. The number of tests is high because many
different I/O iterators are run.

However, now that test_sockmap is part of selftests rather than
iterate through all I/O sides only test a minimal set of min/max
values along with a few "normal" I/O ops. Also remove the long
running tests. They can be run from other test frameworks on a regular
cadence.

This significanly reduces runtime of test_sockmap.

Before:

$ time sudo ./test_sockmap  > /dev/null

real    4m47.521s
user    0m0.370s
sys     0m3.131s

After:

$ time sudo ./test_sockmap  > /dev/null

real    0m0.514s
user    0m0.104s
sys     0m0.430s

The CLI is still available for users that want to test the long
running tests that do the larger send/recv tests.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 02:13:01 +02:00
Daniel Borkmann
bf91a76b8f Merge branch 'bpf-sockmap-selftests'
John Fastabend says:

====================
This series moves ./samples/sockmap into BPF selftests. There are a
few good reasons to do this. First, by pushing this into selftests
the tests will be run automatically. Second, sockmap was not really
a sample of anything anymore, but rather a large set of tests.

Note: There are three recent fixes outstanding against bpf branch
that can be detected occasionally by the automated tests here.

https://patchwork.ozlabs.org/patch/903138/
https://patchwork.ozlabs.org/patch/903139/
https://patchwork.ozlabs.org/patch/903140/
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 00:07:05 +02:00
John Fastabend
2e04eb1dd1 bpf: sockmap, remove samples program
The BPF sample sockmap is redundant now that equivelant tests exist
in the BPF selftests. Lets remove this sample and only keep the
selftest version that will be run as part of the selftest suite.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 00:06:21 +02:00
John Fastabend
16962b2404 bpf: sockmap, add selftests
This adds a new test program test_sockmap which is the old sample
sockmap program. By moving the sample program here we can now run it
as part of the self tests suite. To support this a populate_progs()
routine is added to load programs and maps which was previously done
with load_bpf_file(). This is needed because self test libs do not
provide a similar routine. Also we now use the cgroup_helpers
routines to manage cgroup use instead of manually creating one and
supplying it to the CLI.

Notice we keep the CLI around though because it is useful for dbg
and specialized testing.

To run use ./test_sockmap and the result should be,

Summary 660 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 00:06:20 +02:00
John Fastabend
5d9ffeae5e bpf: sockmap, add a set of tests to run by default
If no options are passed to sockmap after this patch we run a set of
tests using various options and sendmsg/sendpage sizes. This replaces
the sockmap_test.sh script.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 00:04:49 +02:00
John Fastabend
15f66a91a6 bpf: sockmap, code sockmap_test in C
By moving sockmap_test from shell script into C we can run it directly
from selftests, but we can also push the input/output around in proper
structures.

However, keep the CLI options around because they are useful for
debugging when a paticular pattern of msghdr or sockmap options
trips up the sockmap code path.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 00:04:49 +02:00
Yonghong Song
6595c7426a tools/bpf: remove test_sock_addr from TEST_GEN_PROGS
Since test_sock_addr is not supposed to run by itself,
remove it from TEST_GEN_PROGS and add it to
TEST_GEN_PROGS_EXTENDED. This way, run_tests will
not run test_sock_addr. The corresponding test to run
is test_sock_addr.sh.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25 00:00:06 +02:00
Anders Roxell
b6fd9cf796 selftests: bpf: update .gitignore with missing file
Fixes: c0fa1b6c3e ("bpf: btf: Add BTF tests")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24 23:51:45 +02:00
Daniel Borkmann
68db35b125 Merge branch 'bpf-map-val-as-key'
Paul Chaignon says:

====================
Currently, helpers that expect ARG_PTR_TO_MAP_KEY and ARG_PTR_TO_MAP_VALUE
can only access stack and packet memory.  This patchset allows these
helpers to directly access map values by passing registers of type
PTR_TO_MAP_VALUE.

The first patch changes the verifier; the second adds new test cases.

The first three versions of this patchset were sent on the iovisor-dev
mailing list only.

Changelogs:
  Changes in v5:
    - Refactor using check_helper_mem_access.
  Changes in v4:
    - Rebase.
  Changes in v3:
    - Bug fixes.
    - Negative test cases.
  Changes in v2:
    - Additional test cases for adjusted maps.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24 22:39:14 +02:00
Paul Chaignon
5f90dd6aae tools/bpf: add verifier tests for accesses to map values
This patch adds new test cases for accesses to map values from map
helpers.

Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24 22:39:13 +02:00
Paul Chaignon
d71962f3e6 bpf: allow map helpers access to map values directly
Helpers that expect ARG_PTR_TO_MAP_KEY and ARG_PTR_TO_MAP_VALUE can only
access stack and packet memory.  Allow these helpers to directly access
map values by passing registers of type PTR_TO_MAP_VALUE.

This change removes the need for an extra copy to the stack when using a
map value to perform a second map lookup, as in the following:

struct bpf_map_def SEC("maps") infobyreq = {
    .type = BPF_MAP_TYPE_HASHMAP,
    .key_size = sizeof(struct request *),
    .value_size = sizeof(struct info_t),
    .max_entries = 1024,
};
struct bpf_map_def SEC("maps") counts = {
    .type = BPF_MAP_TYPE_HASHMAP,
    .key_size = sizeof(struct info_t),
    .value_size = sizeof(u64),
    .max_entries = 1024,
};
SEC("kprobe/blk_account_io_start")
int bpf_blk_account_io_start(struct pt_regs *ctx)
{
    struct info_t *info = bpf_map_lookup_elem(&infobyreq, &ctx->di);
    u64 *count = bpf_map_lookup_elem(&counts, info);
    (*count)++;
}

Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24 22:39:13 +02:00
Daniel Borkmann
0e25d14e5f Merge branch 'bpf-xfrm-states'
Eyal Birger says:

====================
This patchset adds support for fetching XFRM state information from
an eBPF program called from TC.

The first patch introduces a helper for fetching an XFRM state from the
skb's secpath. The XFRM state is modeled using a new virtual struct which
contains the SPI, peer address, and reqid values of the state; This struct
can be extended in the future to provide additional state information.

The second patch adds a test example in test_tunnel_bpf.sh. The sample
validates the correct extraction of state information by the eBPF program.

v3:
  - Kept SPI and peer IPv4 address in state in network byte order
    following suggestion from Alexei Starovoitov
v2:
  - Fixed two comments by Daniel Borkmann:
    - disallow reserved flags in helper call
    - avoid compiling in helper code when CONFIG_XFRM is off
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24 22:27:00 +02:00
Eyal Birger
29a36f9eef samples/bpf: extend test_tunnel_bpf.sh with xfrm state test
Add a test for fetching xfrm state parameters from a tc program running
on ingress.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24 22:26:58 +02:00
Eyal Birger
12bed760a7 bpf: add helper for getting xfrm states
This commit introduces a helper which allows fetching xfrm state
parameters by eBPF programs attached to TC.

Prototype:
bpf_skb_get_xfrm_state(skb, index, xfrm_state, size, flags)

skb: pointer to skb
index: the index in the skb xfrm_state secpath array
xfrm_state: pointer to 'struct bpf_xfrm_state'
size: size of 'struct bpf_xfrm_state'
flags: reserved for future extensions

The helper returns 0 on success. Non zero if no xfrm state at the index
is found - or non exists at all.

struct bpf_xfrm_state currently includes the SPI, peer IPv4/IPv6
address and the reqid; it can be further extended by adding elements to
its end - indicating the populated fields by the 'size' argument -
keeping backwards compatibility.

Typical usage:

struct bpf_xfrm_state x = {};
bpf_skb_get_xfrm_state(skb, 0, &x, sizeof(x), 0);
...

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24 22:26:58 +02:00
Martin KaFai Lau
fbcf93ebca bpf: btf: Clean up btf.h in uapi
This patch cleans up btf.h in uapi:
1) Rename "name" to "name_off" to better reflect it is an offset to the
   string section instead of a char array.
2) Remove unused value BTF_FLAGS_COMPR and BTF_MAGIC_SWAP

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-23 11:32:01 +02:00
Nikita V. Shirokov
aaa6452763 bpf: fix virtio-net's length calc for XDP_PASS
In commit 6870de435b ("bpf: make virtio compatible w/
bpf_xdp_adjust_tail") i didn't account for vi->hdr_len during new
packet's length calculation after bpf_prog_run in receive_mergeable.
because of this all packets, if they were passed to the kernel,
were truncated by 12 bytes.

Fixes:6870de435b90 ("bpf: make virtio compatible w/ bpf_xdp_adjust_tail")
Reported-by: David Ahern <dsahern@gmail.com>

Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-23 11:28:34 +02:00
Stephen Hemminger
48d7a07ba3 hv_netvsc: select needed ucs2_string routine
The conversion of rndis friendly name to utf8 uses a standard
kernel routine which is optional in config. Therefore build
would fail for some configurations. Resolve by selecting needed
library.

Fixes: 0fe554a46a ("hv_netvsc: propogate Hyper-V friendly name into interface alias")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 21:07:03 -04:00
David S. Miller
e0ada51db9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were simple overlapping changes in microchip
driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:32:48 -04:00
David S. Miller
0638eb573c Merge branch 'ipv6-Another-followup-to-the-fib6_info-change'
David Ahern says:

====================
net/ipv6: Another followup to the fib6_info change

Last one - for this week.

Patches 1, 2 and 7 are more cleanup patches - removing dead code,
moving code from a header to near its single caller, and updating
function name.

Patches 3-5 do some refactoring leading up to patch 6 which fixes
a NULL dereference. I have only managed to trigger a panic once, so
I can not definitively confirm it addresses the problem but it seems
pretty clear that it is a race on removing a 'from' reference on
an rt6_info and another path using that 'from' value to do
cookie checking.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:06:14 -04:00
David Ahern
8ae869714b net/ipv6: Remove unncessary check on f6i in fib6_check
Dan reported an imbalance in fib6_check on use of f6i and checking
whether it is null. Since fib6_check is only called if f6i is non-null,
remove the unnecessary check.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:06:14 -04:00
David Ahern
a68886a691 net/ipv6: Make from in rt6_info rcu protected
When a dst entry is created from a fib entry, the 'from' in rt6_info
is set to the fib entry. The 'from' reference is used most notably for
cookie checking - making sure stale dst entries are updated if the
fib entry is changed.

When a fib entry is deleted, the pcpu routes on it are walked releasing
the fib6_info reference. This is needed for the fib6_info cleanup to
happen and to make sure all device references are released in a timely
manner.

There is a race window when a FIB entry is deleted and the 'from' on the
pcpu route is dropped and the pcpu route hits a cookie check. Handle
this race using rcu on from.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:06:14 -04:00
David Ahern
5bcaa41b96 net/ipv6: Move release of fib6_info from pcpu routes to helper
Code move only; no functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:06:13 -04:00
David Ahern
a87b7dc9f7 net/ipv6: Move rcu locking to callers of fib6_get_cookie_safe
A later patch protects 'from' in rt6_info and this simplifies the
locking needed by it.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:06:13 -04:00
David Ahern
4d85cd0c2a net/ipv6: Move rcu_read_lock to callers of ip6_rt_cache_alloc
A later patch protects 'from' in rt6_info and this simplifies the
locking needed by it.

With the move, the fib6_info_hold for the uncached_rt is no longer
needed since the rcu_lock is still held.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:06:13 -04:00
David Ahern
a269f1a764 net/ipv6: Rename rt6_get_cookie_safe
rt6_get_cookie_safe takes a fib6_info and checks the sernum of
the node. Update the name to reflect its purpose.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:06:13 -04:00
David Ahern
6a3e030f08 net/ipv6: Clean up rt expires helpers
rt6_clean_expires and rt6_set_expires are no longer used. Removed them.
rt6_update_expires has 1 caller in route.c, so move it from the header.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-21 16:06:13 -04:00