linux/net
Andrey Ignatov d74bad4e74 bpf: Hooks for sys_connect
== The problem ==

See description of the problem in the initial patch of this patch set.

== The solution ==

The patch provides much more reliable in-kernel solution for the 2nd
part of the problem: making outgoing connecttion from desired IP.

It adds new attach types `BPF_CGROUP_INET4_CONNECT` and
`BPF_CGROUP_INET6_CONNECT` for program type
`BPF_PROG_TYPE_CGROUP_SOCK_ADDR` that can be used to override both
source and destination of a connection at connect(2) time.

Local end of connection can be bound to desired IP using newly
introduced BPF-helper `bpf_bind()`. It allows to bind to only IP though,
and doesn't support binding to port, i.e. leverages
`IP_BIND_ADDRESS_NO_PORT` socket option. There are two reasons for this:
* looking for a free port is expensive and can affect performance
  significantly;
* there is no use-case for port.

As for remote end (`struct sockaddr *` passed by user), both parts of it
can be overridden, remote IP and remote port. It's useful if an
application inside cgroup wants to connect to another application inside
same cgroup or to itself, but knows nothing about IP assigned to the
cgroup.

Support is added for IPv4 and IPv6, for TCP and UDP.

IPv4 and IPv6 have separate attach types for same reason as sys_bind
hooks, i.e. to prevent reading from / writing to e.g. user_ip6 fields
when user passes sockaddr_in since it'd be out-of-bound.

== Implementation notes ==

The patch introduces new field in `struct proto`: `pre_connect` that is
a pointer to a function with same signature as `connect` but is called
before it. The reason is in some cases BPF hooks should be called way
before control is passed to `sk->sk_prot->connect`. Specifically
`inet_dgram_connect` autobinds socket before calling
`sk->sk_prot->connect` and there is no way to call `bpf_bind()` from
hooks from e.g. `ip4_datagram_connect` or `ip6_datagram_connect` since
it'd cause double-bind. On the other hand `proto.pre_connect` provides a
flexible way to add BPF hooks for connect only for necessary `proto` and
call them at desired time before `connect`. Since `bpf_bind()` is
allowed to bind only to IP and autobind in `inet_dgram_connect` binds
only port there is no chance of double-bind.

bpf_bind() sets `force_bind_address_no_port` to bind to only IP despite
of value of `bind_address_no_port` socket field.

bpf_bind() sets `with_lock` to `false` when calling to __inet_bind()
and __inet6_bind() since all call-sites, where bpf_bind() is called,
already hold socket lock.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-31 02:15:54 +02:00
..
6lowpan
9p virtio: bugfixes 2018-02-15 14:29:27 -08:00
802 treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
appletalk net: make getname() functions return length rather than use int* parameter 2018-02-12 14:15:04 -05:00
atm net: make getname() functions return length rather than use int* parameter 2018-02-12 14:15:04 -05:00
ax25 net: make getname() functions return length rather than use int* parameter 2018-02-12 14:15:04 -05:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
bpf bpf: fix null pointer deref in bpf_prog_test_run_xdp 2018-02-01 07:43:56 -08:00
bridge bridge: Allow max MTU when multiple VLANs present 2018-03-23 12:17:30 -04:00
caif net: Convert caif_net_ops 2018-03-05 10:48:27 -05:00
can net: Convert can_pernet_ops 2018-03-22 11:11:29 -04:00
ceph libceph, ceph: avoid memory leak when specifying same option several times 2018-02-26 16:19:30 +01:00
core bpf: Hooks for sys_connect 2018-03-31 02:15:54 +02:00
dcb
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
decnet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-02-19 18:46:11 -05:00
dns_resolver afs: Support the AFS dynamic root 2018-02-06 14:43:37 +00:00
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
ethernet
hsr
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
ife
ipv4 bpf: Hooks for sys_connect 2018-03-31 02:15:54 +02:00
ipv6 bpf: Hooks for sys_connect 2018-03-31 02:15:54 +02:00
iucv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
kcm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
key net: Convert /proc creating and destroying pernet_operations 2018-02-27 11:01:35 -05:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
l3mdev
lapb treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
llc net: llc: drop VLA in llc_sap_mcast() 2018-03-12 11:14:06 -04:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
mac802154 net/mac802154: disambiguate mac80215 vs mac802154 trace events 2018-03-28 22:55:18 +02:00
mpls net: Convert mpls_net_ops 2018-03-17 17:07:39 -04:00
ncsi net/ncsi: unlock on error in ncsi_set_interface_nl() 2018-03-08 21:49:58 -05:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
netlabel net/netlabel: Add list_next_rcu() in rcu_dereference(). 2017-11-18 10:32:41 +09:00
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
netrom net: make getname() functions return length rather than use int* parameter 2018-02-12 14:15:04 -05:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-02-19 18:46:11 -05:00
nsh
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
packet net: Convert packet_net_ops 2018-02-13 10:36:08 -05:00
phonet net: Convert /proc creating and destroying pernet_operations 2018-02-27 11:01:35 -05:00
psample
qrtr Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-06 01:20:46 -05:00
rds rds: tcp: remove register_netdevice_notifier infrastructure. 2018-03-22 11:21:45 -04:00
rfkill vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
rose net: make getname() functions return length rather than use int* parameter 2018-02-12 14:15:04 -05:00
rxrpc rxrpc: remove redundant initialization of variable 'len' 2018-03-16 09:48:39 -04:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
smc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
strparser strparser: Call sock_owned_by_user_nocheck 2017-12-28 14:28:22 -05:00
sunrpc net: make getname() functions return length rather than use int* parameter 2018-02-12 14:15:04 -05:00
switchdev
tipc tipc: step sk->sk_drops when rcv buffer is full 2018-03-22 14:43:37 -04:00
tls net: generalize sk_alloc_sg to work with scatterlist rings 2018-03-19 21:14:38 +01:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-02-19 18:46:11 -05:00
vmw_vsock net: make getname() functions return length rather than use int* parameter 2018-02-12 14:15:04 -05:00
wimax
wireless treewide: remove large struct-pass-by-value from tracepoint arguments 2018-03-28 22:55:18 +02:00
x25 x25: use %*ph to print small buffer 2018-02-20 13:51:47 -05:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
compat.c
Kconfig Staging/IIO patches for 4.16-rc1 2018-02-01 09:51:57 -08:00
Makefile ipx: move Novell IPX protocol support into staging 2017-11-28 13:55:00 +01:00
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
sysctl_net.c net: Convert sysctl_pernet_ops 2018-02-13 10:36:05 -05:00