linux/kernel/bpf
Shmulik Ladkani 98589a0998 netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
Commit 2c16d60332 ("netfilter: xt_bpf: support ebpf") introduced
support for attaching an eBPF object by an fd, with the
'bpf_mt_check_v1' ABI expecting the '.fd' to be specified upon each
IPT_SO_SET_REPLACE call.

However this breaks subsequent iptables calls:

 # iptables -A INPUT -m bpf --object-pinned /sys/fs/bpf/xxx -j ACCEPT
 # iptables -A INPUT -s 5.6.7.8 -j ACCEPT
 iptables: Invalid argument. Run `dmesg' for more information.

That's because iptables works by loading existing rules using
IPT_SO_GET_ENTRIES to userspace, then issuing IPT_SO_SET_REPLACE with
the replacement set.

However, the loaded 'xt_bpf_info_v1' has an arbitrary '.fd' number
(from the initial "iptables -m bpf" invocation) - so when 2nd invocation
occurs, userspace passes a bogus fd number, which leads to
'bpf_mt_check_v1' to fail.

One suggested solution [1] was to hack iptables userspace, to perform a
"entries fixup" immediatley after IPT_SO_GET_ENTRIES, by opening a new,
process-local fd per every 'xt_bpf_info_v1' entry seen.

However, in [2] both Pablo Neira Ayuso and Willem de Bruijn suggested to
depricate the xt_bpf_info_v1 ABI dealing with pinned ebpf objects.

This fix changes the XT_BPF_MODE_FD_PINNED behavior to ignore the given
'.fd' and instead perform an in-kernel lookup for the bpf object given
the provided '.path'.

It also defines an alias for the XT_BPF_MODE_FD_PINNED mode, named
XT_BPF_MODE_PATH_PINNED, to better reflect the fact that the user is
expected to provide the path of the pinned object.

Existing XT_BPF_MODE_FD_ELF behavior (non-pinned fd mode) is preserved.

References: [1] https://marc.info/?l=netfilter-devel&m=150564724607440&w=2
            [2] https://marc.info/?l=netfilter-devel&m=150575727129880&w=2

Reported-by: Rafael Buchbinder <rafi@rbk.ms>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-10-09 15:18:04 +02:00
..
arraymap.c bpf: inline map in map lookup functions for array and htab 2017-08-19 21:56:34 -07:00
bpf_lru_list.c bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4 2017-04-17 13:55:52 -04:00
bpf_lru_list.h bpf: Only set node->ref = 1 if it has not been set 2017-09-01 09:57:39 -07:00
cgroup.c bpf: BPF support for sock_ops 2017-07-01 16:15:13 -07:00
core.c bpf: sock_map fixes for !CONFIG_BPF_SYSCALL and !STREAM_PARSER 2017-08-16 15:34:13 -07:00
devmap.c bpf: devmap: pass on return value of bpf_map_precharge_memlock 2017-09-18 16:53:30 -07:00
hashtab.c bpf: Only set node->ref = 1 if it has not been set 2017-09-01 09:57:39 -07:00
helpers.c bpf: rename ARG_PTR_TO_STACK 2017-01-09 16:56:27 -05:00
inode.c netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1' 2017-10-09 15:18:04 +02:00
lpm_trie.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
Makefile bpf: sock_map fixes for !CONFIG_BPF_SYSCALL and !STREAM_PARSER 2017-08-16 15:34:13 -07:00
map_in_map.c bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
map_in_map.h bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
percpu_freelist.c bpf: introduce percpu_freelist 2016-03-08 15:28:31 -05:00
percpu_freelist.h bpf: introduce percpu_freelist 2016-03-08 15:28:31 -05:00
sockmap.c bpf: add support for sockmap detach programs 2017-09-08 21:11:00 -07:00
stackmap.c bpf: Allow selecting numa node during map creation 2017-08-19 21:35:43 -07:00
syscall.c bpf: do not disable/enable BH in bpf_map_free_id() 2017-09-19 15:42:54 -07:00
tnum.c bpf/verifier: track signed and unsigned min/max values 2017-08-08 17:51:34 -07:00
verifier.c bpf: fix ri->map_owner pointer on bpf_prog_realloc 2017-09-19 16:38:53 -07:00