linux/net
Nikolay Aleksandrov 4b9b1cdf83 net: fix wrong mac_len calculation for vlans
After 1e785f48d2 ("net: Start with correct mac_len in
skb_network_protocol") skb->mac_len is used as a start of the
calculation in skb_network_protocol() but that is not always correct. If
skb->protocol == 8021Q/AD, usually the vlan header is already inserted
in the skb (i.e. vlan reorder hdr == 0). Usually when the packet enters
dev_hard_xmit it has mac_len == 0 so we take 2 bytes from the
destination mac address (skb->data + VLAN_HLEN) as a type in
skb_network_protocol() and return vlan_depth == 4. In the case where TSO is
off, then the mac_len is set but it's == 18 (ETH_HLEN + VLAN_HLEN), so
skb_network_protocol() returns a type from inside the packet and
offset == 22. Also make vlan_depth unsigned as suggested before.
As suggested by Eric Dumazet, move the while() loop in the if() so we
can avoid additional testing in fast path.

Here are few netperf tests + debug printk's to illustrate:
cat netperf.tso-on.reorder-on.bugged
- Vlan -> device (reorder on, default, this case is okay)
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    7111.54
[   81.605435] skb->len 65226 skb->gso_size 1448 skb->proto 0x800
skb->mac_len 0 vlan_depth 0 type 0x800

- Vlan -> device (reorder off, bad)
cat netperf.tso-on.reorder-off.bugged
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00     241.35
[  204.578332] skb->len 1518 skb->gso_size 0 skb->proto 0x8100
skb->mac_len 0 vlan_depth 4 type 0x5301
0x5301 are the last two bytes of the destination mac.

And if we stop TSO, we may get even the following:
[   83.343156] skb->len 2966 skb->gso_size 1448 skb->proto 0x8100
skb->mac_len 18 vlan_depth 22 type 0xb84
Because mac_len already accounts for VLAN_HLEN.

After the fix:
cat netperf.tso-on.reorder-off.fixed
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.01    5001.46
[   81.888489] skb->len 65230 skb->gso_size 1448 skb->proto 0x8100
skb->mac_len 0 vlan_depth 18 type 0x800

CC: Vlad Yasevich <vyasevic@redhat.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Daniel Borkman <dborkman@redhat.com>
CC: David S. Miller <davem@davemloft.net>

Fixes:1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol")
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-01 19:39:13 -07:00
..
9p A bunch of updates and cleanup within the transport layer, 2014-04-11 14:14:57 -07:00
802 neigh: use NEIGH_VAR_INIT in ndo_neigh_setup functions. 2014-01-16 11:31:58 -08:00
8021q vlan: Fix lockdep warning with stacked vlan devices. 2014-05-16 22:14:49 -04:00
appletalk appletalk: fix checkpatch error with indent 2014-02-14 16:18:32 -05:00
atm net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
ax25 net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
batman-adv batman-adv: fix NULL pointer dereferences 2014-05-31 10:07:14 +02:00
bluetooth Bluetooth: Fix L2CAP LE debugfs entries permissions 2014-05-14 09:07:07 -07:00
bridge bridge: superfluous skb->nfct check in br_nf_dev_queue_xmit 2014-05-05 16:05:43 +02:00
caif net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
can net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
ceph crush: decode and initialize chooseleaf_vary_r 2014-05-16 21:29:55 +04:00
core net: fix wrong mac_len calculation for vlans 2014-06-01 19:39:13 -07:00
dcb net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
dccp ipv4: add a sock pointer to ip_queue_xmit() 2014-04-15 12:58:34 -04:00
decnet net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
dns_resolver net/*: Fix FSF address in file headers 2013-12-06 12:37:57 -05:00
dsa net/dsa/dsa.c: increment chip_index during of_node handling on dsa_of_probe() 2014-05-16 16:56:33 -04:00
ethernet net: eth_type_trans() should use skb_header_pointer() 2014-01-16 15:30:31 -08:00
hsr hsr: replace del_timer by del_timer_sync 2014-03-27 15:28:06 -04:00
ieee802154 mac802154: make csma/cca parameters per-wpan 2014-04-01 16:25:51 -04:00
ipv4 ipv4: initialise the itag variable in __mkroute_input 2014-05-22 15:57:36 -04:00
ipv6 ipv6: gro: fix CHECKSUM_COMPLETE support 2014-05-21 17:18:47 -04:00
ipx ipx: implement shutdown() 2014-02-12 19:26:32 -05:00
irda net: add build-time checks for msg->msg_name size 2014-01-18 23:04:16 -08:00
iucv af_iucv: wrong mapping of sent and confirmed skbs 2014-05-14 15:38:39 -04:00
key net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
l2tp ipv4: add a sock pointer to ip_queue_xmit() 2014-04-15 12:58:34 -04:00
lapb net/lapb: re-send packets on timeout 2013-09-23 16:52:45 -04:00
llc llc: remove noisy WARN from llc_mac_hdr_init 2014-01-28 18:01:32 -08:00
mac80211 mac80211: fix on-channel remain-on-channel 2014-05-14 15:48:38 +02:00
mac802154 mac802154: fix duplicate #include headers 2014-04-07 13:18:44 -04:00
mpls ipip: add GSO/TSO support 2013-10-19 19:36:19 -04:00
netfilter ipvs: Fix panic due to non-linear skb 2014-05-26 10:22:46 +09:00
netlabel netlabel: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
netlink net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
netrom net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
nfc net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
openvswitch ipv4: add a sock pointer to dst->output() path. 2014-04-15 13:47:15 -04:00
packet net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
phonet net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
rds net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
rfkill net: rfkill: move poll work to power efficient workqueue 2014-02-04 21:58:16 +01:00
rose net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
rxrpc af_rxrpc: Fix XDR length check in rxrpc key demarshalling. 2014-05-16 15:24:47 -04:00
sched net_sched: fix an oops in tcindex filter 2014-05-21 16:47:13 -04:00
sctp net: sctp: Don't transition to PF state when transport has exhausted 'Path.Max.Retrans'. 2014-04-27 23:41:14 -04:00
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-04-12 17:31:22 -07:00
tipc net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
unix net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
vmw_vsock vsock: Make transport the proto owner 2014-05-05 13:13:50 -04:00
wimax wimax: remove dead code 2013-11-21 13:09:42 -05:00
wireless cfg80211: add cfg80211_sched_scan_stopped_rtnl 2014-05-05 15:14:57 +02:00
x25 net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
xfrm net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
compat.c net/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types 2014-03-06 16:30:45 +01:00
Kconfig Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-04-03 13:05:42 -07:00
Makefile net: move 6lowpan compression code to separate module 2014-01-15 15:36:38 -08:00
nonet.c
socket.c net: use SYSCALL_DEFINEx for sys_recv 2014-04-16 15:15:05 -04:00
sysctl_net.c net: Update the sysctl permissions handler to test effective uid/gid 2013-10-07 15:57:56 -04:00