linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-19 02:34:01 +08:00

Author	SHA1	Message	Date
Eric Dumazet	34cbe27e81	net: napi_hash_del() returns a boolean status napi_hash_del() will soon be used from both drivers (if they want) or core networking stack. Callers are responsibles to ensure an RCU grace period is respected before freeing napi structure : napi_hash_del() can signal if this RCU grace period is needed or not. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:42 -05:00
Eric Dumazet	6180d9de61	net: move napi_hash[] into read mostly section We do not often add/delete a napi context. Moving napi_hash[] into read mostly section avoids potential false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:42 -05:00
Eric Dumazet	d64b5e85bf	net: add netif_tx_napi_add() netif_tx_napi_add() is a variant of netif_napi_add() It should be used by drivers that use a napi structure to exclusively poll TX. We do not want to add this kind of napi in napi_hash[] in following patches, adding generic busy polling to all NAPI drivers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:41 -05:00
Eric Dumazet	93f93a4404	net: move skb_mark_napi_id() into core networking stack We would like to automatically provide busy polling support to all NAPI drivers, without them having to implement anything. skb_mark_napi_id() can be called from napi_gro_receive() and napi_get_frags(). Few drivers are still calling skb_mark_napi_id() because they use netif_receive_skb(). They should eventually call napi_gro_receive() instead. I will leave this to drivers maintainers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:41 -05:00
Eric Dumazet	868fdb0606	mlx4: remove mlx4_en_low_latency_recv() Busy polling can now be handled in generic NAPI poll infrastructure. This removes complexity and fast path overhead : mlx4 used two spin_lock()/spin_unlock() pair per napi->poll() call in mlx4_en_cq_lock_napi()/mlx4_en_cq_unlock_napi() Tested: Without busy polling : lpaa23:~# echo 0 >/proc/sys/net/core/busy_read lpaa24:~# echo 0 >/proc/sys/net/core/busy_read lpaa23:~# ./netperf -H lpaa24 -t TCP_RR MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpaa24.prod.google.com () port 0 AF_INET : first burst 0 Local /Remote Socket Size Request Resp. Elapsed Trans. Send Recv Size Size Time Rate bytes Bytes bytes bytes secs. per sec 16384 87380 1 1 10.00 47330.78 With busy polling : lpaa23:~# echo 70 >/proc/sys/net/core/busy_read lpaa24:~# echo 70 >/proc/sys/net/core/busy_read lpaa23:~# ./netperf -H lpaa24 -t TCP_RR MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpaa24.prod.google.com () port 0 AF_INET : first burst 0 Local /Remote Socket Size Request Resp. Elapsed Trans. Send Recv Size Size Time Rate bytes Bytes bytes bytes secs. per sec 16384 87380 1 1 10.00 97643.55 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:40 -05:00
Eric Dumazet	b59768c6b4	bnx2x: remove bnx2x_low_latency_recv() support Switch to native NAPI polling, as this reduces overhead and complexity. Normal path is faster, since one cmpxchg() is not anymore requested, and busy polling with the NAPI polling has same performance. Tested: lpk50:~# cat /proc/sys/net/core/busy_read 70 lpk50:~# nstat >/dev/null;./netperf -H lpk55 -t TCP_RR;nstat MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpk55.prod.google.com () port 0 AF_INET : first burst 0 Local /Remote Socket Size Request Resp. Elapsed Trans. Send Recv Size Size Time Rate bytes Bytes bytes bytes secs. per sec 16384 87380 1 1 10.00 40095.07 16384 87380 IpInReceives 401062 0.0 IpInDelivers 401062 0.0 IpOutRequests 401079 0.0 TcpActiveOpens 7 0.0 TcpPassiveOpens 3 0.0 TcpAttemptFails 3 0.0 TcpEstabResets 5 0.0 TcpInSegs 401036 0.0 TcpOutSegs 401052 0.0 TcpOutRsts 38 0.0 UdpInDatagrams 26 0.0 UdpOutDatagrams 27 0.0 Ip6OutNoRoutes 1 0.0 TcpExtDelayedACKs 1 0.0 TcpExtTCPPrequeued 98 0.0 TcpExtTCPDirectCopyFromPrequeue 98 0.0 TcpExtTCPHPHits 4 0.0 TcpExtTCPHPHitsToUser 98 0.0 TcpExtTCPPureAcks 5 0.0 TcpExtTCPHPAcks 101 0.0 TcpExtTCPAbortOnData 6 0.0 TcpExtBusyPollRxPackets 400832 0.0 TcpExtTCPOrigDataSent 400983 0.0 IpExtInOctets 21273867 0.0 IpExtOutOctets 21261254 0.0 IpExtInNoECTPkts 401064 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:40 -05:00
Eric Dumazet	44fb6fbbac	mlx5: support napi_complete_done() A NAPI poll handler should return number of RX packets processed, instead of 0 / budget. This allows proper busy poll accounting through LINUX_MIB_BUSYPOLLRXPACKETS SNMP counter. napi_complete_done() allows /sys/class/net/ethX/gro_flush_timeout to be used for finer GRO aggregation control. Tested: Enabled busy polling, and checked TcpExtBusyPollRxPackets counter is increasing. echo 70 >/proc/sys/net/core/busy_read nstat >/dev/null netperf -H target -t TCP_RR >/dev/null nstat \| grep TcpExtBusyPollRxPackets TcpExtBusyPollRxPackets 490958 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:39 -05:00
Eric Dumazet	7ae92ae588	mlx5: add busy polling support It is now easy to add busy polling support to a NAPI driver, with very little impact on normal input path. This patch serves as a reference implementation. Note: A followup patch will add proper napi_complete_done() in mlx5, so that LINUX_MIB_BUSYPOLLRXPACKETS snmp counter is properly handled. Tested: Normal TCP_RR results without busy polling : lpk51:~# echo 0 >/proc/sys/net/core/busy_read lpk52:~# echo 0 >/proc/sys/net/core/busy_read lpk51:~# ./netperf -H 192.168.4.52 -t TCP_RR -l 10 MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.4.52 () port 0 AF_INET : first burst 0 Local /Remote Socket Size Request Resp. Elapsed Trans. Send Recv Size Size Time Rate bytes Bytes bytes bytes secs. per sec 16384 87380 1 1 10.00 53509.49 16384 87380 Now enable busy polling : lpk51:~# echo 70 >/proc/sys/net/core/busy_read lpk52:~# echo 70 >/proc/sys/net/core/busy_read lpk51:~# ./netperf -H 192.168.4.52 -t TCP_RR -l 10 MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.4.52 () port 0 AF_INET : first burst 0 Local /Remote Socket Size Request Resp. Elapsed Trans. Send Recv Size Size Time Rate bytes Bytes bytes bytes secs. per sec 16384 87380 1 1 10.00 97530.92 16384 87380 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:39 -05:00
Eric Dumazet	ce6aea93f7	net: network drivers no longer need to implement ndo_busy_poll() Instead of having to implement complex ndo_busy_poll() method, drivers can simply rely on NAPI poll logic. Busy polling gains are mainly coming from polling itself, not on exact details on how we poll the device. ndo_busy_poll() if implemented can avoid touching napi state, but it adds extra synchronization between normal napi->poll() and busy poll handler, slowing down the common path (non busy polling) with extra atomic operations. In practice few drivers ever got busy poll because of the complexity. We could go one step further, and make busy polling available for all NAPI drivers, but this would require that all netif_napi_del() calls are done in process context so that we can call synchronize_rcu(). Full audit would be required. Before this is done, a driver still needs to call : - skb_mark_napi_id() for each skb provided to the stack. - napi_hash_add() and napi_hash_del() to allocate a napi_id per napi struct. - Make sure RCU grace period is respected after napi_hash_del() before memory containing napi structure is freed. Followup patch implements busy poll for mlx5 driver as an example. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:39 -05:00
Eric Dumazet	2a028ecb76	net: allow BH servicing in sk_busy_loop() Instead of blocking BH in whole sk_busy_loop(), block them only around ->ndo_busy_poll() calls. This has many benefits. 1) allow tunneled traffic to use busy poll as well as native traffic. Tunnels handlers usually call netif_rx() and depend on net_rx_action() being run (from sofirq handler) 2) allow RFS/RPS being used (sending IPI to other cpus if needed) 3) use the 'lets burn cpu cycles' budget to do useful work (like TX completions, timers, RCU callbacks...) 4) reduce BH latencies, making busy poll a better citizen. Tested: Tested with SIT tunnel lpaa5:~# echo 0 >/proc/sys/net/core/busy_read lpaa5:~# ./netperf -H 2002:af6:786::1 -t TCP_RR MIGRATED TCP REQUEST/RESPONSE TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:786::1 () port 0 AF_INET6 : first burst 0 Local /Remote Socket Size Request Resp. Elapsed Trans. Send Recv Size Size Time Rate bytes Bytes bytes bytes secs. per sec 16384 87380 1 1 10.00 37373.93 16384 87380 Now enable busy poll on both hosts lpaa5:~# echo 70 >/proc/sys/net/core/busy_read lpaa6:~# echo 70 >/proc/sys/net/core/busy_read lpaa5:~# ./netperf -H 2002:af6:786::1 -t TCP_RR MIGRATED TCP REQUEST/RESPONSE TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:786::1 () port 0 AF_INET6 : first burst 0 Local /Remote Socket Size Request Resp. Elapsed Trans. Send Recv Size Size Time Rate bytes Bytes bytes bytes secs. per sec 16384 87380 1 1 10.00 58314.77 16384 87380 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:38 -05:00
Eric Dumazet	02d62e86fe	net: un-inline sk_busy_loop() There is really little gain from inlining this big function. We'll soon make it even bigger in following patches. This means we no longer need to export napi_by_id() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:38 -05:00
Eric Dumazet	5865316c9d	mlx4: mlx4_en_low_latency_recv() called with BH disabled mlx4_en_low_latency_recv() is called with BH disabled, as other ndo_busy_poll() methods. No need for spin_lock_bh()/spin_unlock_bh() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:38 -05:00
Eric Dumazet	52bd2d62ce	net: better skb->sender_cpu and skb->napi_id cohabitation skb->sender_cpu and skb->napi_id share a common storage, and we had various bugs about this. We had to call skb_sender_cpu_clear() in some places to not leave a prior skb->napi_id and fool netdev_pick_tx() As suggested by Alexei, we could split the space so that these errors can not happen. 0 value being reserved as the common (not initialized) value, let's reserve [1 .. NR_CPUS] range for valid sender_cpu, and [NR_CPUS+1 .. ~0U] for valid napi_id. This will allow proper busy polling support over tunnels. Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:37 -05:00
Ivan Vecera	d37b4c0a36	be2net: remove local variable 'status' The lancer_cmd_get_file_len() uses lancer_cmd_read_object() to get the current size of registers for ethtool registers dump. Returned status value is stored but not checked. The check itself is not necessary as the data_read output variable is initialized to 0 and status variable can be removed. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 15:21:41 -05:00
huangdaode	6fbaa57076	net: hisilicon: fix binding document of mdio This patch explains the occasion of "hisilcon,mdio" and "hisilicon,hns-mdio" according to Arnd's comments. and reformat it according to comments from Rob<robh@kernel.org>. Signed-off-by: huangdaode <huangdaode@hisilicon.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 15:08:22 -05:00
Bastian Stender	09605cc12c	net ipv4: use preferred log methods Replace printk calls with preferred unconditional log method calls to keep kernel messages clean. Added newline to "too small MTU" message. Signed-off-by: Bastian Stender <bst@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 13:37:20 -05:00
Linus Torvalds	34258a32d9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Assorted bug fixes, the mlock2 system call gets added, and one improvement. The boot from dasd devices is now possible from a wider range of devices" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: remove SALIPL loader s390: wire up mlock2 system call s390: remove g5 elf platform support s390: avoid cache aliasing under z/VM and KVM s390/sclp: _sclp_wait_int(): retain full PSW mask s390/zcrypt: Fix initialisation when zcrypt is built-in s390/zcrypt: Fix kernel crash on systems without AP bus support s390: add support for ipl devices in subchannel sets > 0 s390/ipl: fix out of bounds access in scpdata_write s390/pci_dma: improve debugging of errors during dma map s390/pci_dma: handle dma table failures s390/pci_dma: unify label of invalid translation table entries s390/syscalls: remove system call number calculation s390/cio: simplify css_generate_pgid s390/diag: add a s390 prefix to the diagnose trace point s390/head: fix error message on unsupported hardware	2015-11-18 08:59:29 -08:00
Linus Torvalds	0d77a123ff	hwmon fixes for v4.4-rc2 Fix build issues in scpi and ina2xx drivers, update scpi driver to support recent firmware, and fix an uninitialized variable warning in applesmc driver. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWTKdFAAoJEMsfJm/On5mBOVQP/RUuwr/ugCSxX4FalU6CdUoW bKmtTKNQhSFtY/4w+hOzqcJu0WcpLA8CbHkwG8bB3wZqEcHMASzd9ajNZb1vVh5v Aj61PCJTelv36w7Nt6NZ2dkymYSjkW72YPAt+dzMIn7HIJ2cxjJtvUdH87BE9rDq oGeIQ246z5ytXNJvl/MHRHGrf3RXMPrsTcTvvPMXWZovUD4x+HQ1CvYY5VGPYA8N dhOZGwy3S6/WOnSnKqUxLJCNj/Bi/GyXWHAzLKLl9dGeUp3wbrWJnve1uLNMwSS3 w7ImPKkYGsUN2Z2uJHiyW967gXjpfVav1545eaP/Co52tzIeAbvyXU0EacjM0fU0 UQ4MMyFglNrBJd1yDiVeZQsjeHfqOMVDJFwF6UIXFuKWx5R4RlEe0R5YKf4oBNtK 4ad+n2kHnEGQFdWwirbeiwCBEB6M7Vrv58OZypJRmYxkfgR8x77Tq8tceCPYf6g4 fimcWRr4uuTKooQyhHDrej+/2RFH0acvw2xwypjWj2FkBwV045ZvDogY8syPIB+0 1BJs40ZlV6WXaffHHco//ZLXtOgZ62UKHkLng9knMBjAcKmkPbn4+amym/Y18yVr fTVBt8iiCIVhtXIAHG34fvUDec3EyjYgc2FvtpuHqb2vUF6T75xOzxvpDW5fgYQo tttvfy8fKOUyF+4aanMf =JYMV -----END PGP SIGNATURE----- Merge tag 'hwmon-for-linus-v4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: "Fix build issues in scpi and ina2xx drivers, update scpi driver to support recent firmware, and fix an uninitialized variable warning in applesmc driver" * tag 'hwmon-for-linus-v4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (scpi) skip unsupported sensors properly hwmon: (scpi) add thermal-of dependency hwmon : (applesmc) Fix uninitialized variables warnings hwmon: (ina2xx) Fix build issue by selecting REGMAP_I2C	2015-11-18 08:43:29 -08:00
Linus Torvalds	7f151f1d8a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix list tests in netfilter ingress support, from Florian Westphal. 2) Fix reversal of input and output interfaces in ingress hook invocation, from Pablo Neira Ayuso. 3) We have a use after free in r8169, caught by Dave Jones, fixed by Francois Romieu. 4) Splice use-after-free fix in AF_UNIX frmo Hannes Frederic Sowa. 5) Three ipv6 route handling bug fixes from Martin KaFai Lau: a) Don't create clone routes not managed by the fib6 tree b) Don't forget to check expiration of DST_NOCACHE routes. c) Handle rt->dst.from == NULL properly. 6) Several AF_PACKET fixes wrt transport header setting and SKB protocol setting, from Daniel Borkmann. 7) Fix thunder driver crash on shutdown, from Pavel Fedin. 8) Several Mellanox driver fixes (max MTU calculations, use of correct DMA unmap in TX path, etc.) from Saeed Mahameed, Tariq Toukan, Doron Tsur, Achiad Shochat, Eran Ben Elisha, and Noa Osherovich. 9) Several mv88e6060 DSA driver fixes (wrong bit definitions for certain registers, etc.) from Neil Armstrong. 10) Make sure to disable preemption while updating per-cpu stats of ip tunnels, from Jason A. Donenfeld. 11) Various ARM64 bpf JIT fixes, from Yang Shi. 12) Flush icache properly in ARM JITs, from Daniel Borkmann. 13) Fix masking of RX and TX interrupts in ravb driver, from Masaru Nagai. 14) Fix netdev feature propagation for devices not implementing ->ndo_set_features(). From Nikolay Aleksandrov. 15) Big endian fix in vmxnet3 driver, from Shrikrishna Khare. 16) RAW socket code increments incorrect SNMP counters, fix from Ben Cartwright-Cox. 17) IPv6 multicast SNMP counters are bumped twice, fix from Neil Horman. 18) Fix handling of VLAN headers on stacked devices when REORDER is disabled. From Vlad Yasevich. 19) Fix SKB leaks and use-after-free in ipvlan and macvlan drivers, from Sabrina Dubroca. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits) MAINTAINERS: Update Mellanox's Eth NIC driver entries net/core: revert "net: fix __netdev_update_features return.." and add comment af_unix: take receive queue lock while appending new skb rtnetlink: fix frame size warning in rtnl_fill_ifinfo net: use skb_clone to avoid alloc_pages failure. packet: Use PAGE_ALIGNED macro packet: Don't check frames_per_block against negative values net: phy: Use interrupts when available in NOLINK state phy: marvell: Add support for 88E1540 PHY arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS macvlan: fix leak in macvlan_handle_frame ipvlan: fix use after free of skb ipvlan: fix leak in ipvlan_rcv_frame vlan: Do not put vlan headers back on bridge and macvlan ports vlan: Fix untag operations of stacked vlans with REORDER_HEADER off via-velocity: unconditionally drop frames with bad l2 length ipg: Remove ipg driver dl2k: Add support for IP1000A-based cards snmp: Remove duplicate OUTMCAST stat increment net: thunder: Check for driver data in nicvf_remove() ...	2015-11-17 13:52:59 -08:00
Or Gerlitz	e7523a497d	MAINTAINERS: Update Mellanox's Eth NIC driver entries Eugenia (Jenny) Emantayev is replacing Amir Vadai as the mlx4 Ethernet driver maintainer. Saeed Mahameed is assigned to maintain mlx5 Eth functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:45 -05:00
Nikolay Aleksandrov	17b85d29e8	net/core: revert "net: fix __netdev_update_features return.." and add comment This reverts commit `00ee592717` ("net: fix __netdev_update_features return on ndo_set_features failure") and adds a comment explaining why it's okay to return a value other than 0 upon error. Some drivers might actually change flags and return an error so it's better to fire a spurious notification rather than miss these. CC: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:45 -05:00
Hannes Frederic Sowa	a3a116e04c	af_unix: take receive queue lock while appending new skb While possibly in future we don't necessarily need to use sk_buff_head.lock this is a rather larger change, as it affects the af_unix fd garbage collector, diag and socket cleanups. This is too much for a stable patch. For the time being grab sk_buff_head.lock without disabling bh and irqs, so don't use locked skb_queue_tail. Fixes: `869e7c6248` ("net: af_unix: implement stream sendpage support") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Reported-by: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:45 -05:00
Hannes Frederic Sowa	b22b941b2c	rtnetlink: fix frame size warning in rtnl_fill_ifinfo Fix the following warning: CC net/core/rtnetlink.o net/core/rtnetlink.c: In function ‘rtnl_fill_ifinfo’: net/core/rtnetlink.c:1308:1: warning: the frame size of 2864 bytes is larger than 2048 bytes [-Wframe-larger-than=] } ^ by splitting up the huge rtnl_fill_ifinfo into some smaller ones, so we don't have the huge frame allocations at the same time. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:44 -05:00
Martin Zhang	19125c1a4f	net: use skb_clone to avoid alloc_pages failure. 1. new skb only need dst and ip address(v4 or v6). 2. skb_copy may need high order pages, which is very rare on long running server. Signed-off-by: Junwei Zhang <linggao.zjw@alibaba-inc.com> Signed-off-by: Martin Zhang <martinbj2008@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:44 -05:00
Tobias Klauser	90836b67e2	packet: Use PAGE_ALIGNED macro Use PAGE_ALIGNED(...) instead of open-coding it. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:44 -05:00
Tobias Klauser	4194b4914a	packet: Don't check frames_per_block against negative values rb->frames_per_block is an unsigned int, thus can never be negative. Also fix spacing in the calculation of frames_per_block. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:44 -05:00
Andrew Lunn	321beec504	net: phy: Use interrupts when available in NOLINK state The NOLINK state will poll the phy once a second to see if the link has come up. If the phy has an interrupt line, this polling can be skipped, since the phy should interrupt when the link returns. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:44 -05:00
Andrew Lunn	819ec8e1f3	phy: marvell: Add support for 88E1540 PHY The 88E1540 can be found embedded in the Marvell 88E6352 switch. It is compatible with the 88E1510, so add support for it, using the 88E1510 specific functions. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 15:25:43 -05:00
Yang Shi	ec0738db8d	arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP in prologue in order to get the correct stack backtrace. However, ARM64 JIT used FP (x29) as eBPF fp register, FP is subjected to change during function call so it may cause the BPF prog stack base address change too. Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee saved register, so it will keep intact during function call. It is initialized in BPF prog prologue when BPF prog is started to run everytime. Save and restore x25/x26 in BPF prologue and epilogue to keep them intact for the outside of BPF. Actually, x26 is unnecessary, but SP requires 16 bytes alignment. So, the BPF stack layout looks like: high original A64_SP => 0:+-----+ BPF prologue \|FP/LR\| current A64_FP => -16:+-----+ \| ... \| callee saved registers +-----+ \| \| x25/x26 BPF fp register => -80:+-----+ \| \| \| ... \| BPF prog stack \| \| \| \| current A64_SP => +-----+ \| \| \| ... \| Function call stack \| \| +-----+ low CC: Zi Shen Lim <zlim.lnx@gmail.com> CC: Xi Wang <xi.wang@gmail.com> Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 14:44:39 -05:00
Sabrina Dubroca	e639b8d8a7	macvlan: fix leak in macvlan_handle_frame Reset pskb in macvlan_handle_frame in case skb_share_check returned a clone. Fixes: `8a4eb5734e` ("net: introduce rx_handler results and logic around that") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 14:39:29 -05:00
Sabrina Dubroca	a534dc5298	ipvlan: fix use after free of skb ipvlan_handle_frame is a rx_handler, and when it returns a value other than RX_HANDLER_CONSUMED (here, NET_RX_DROP aka RX_HANDLER_ANOTHER), __netif_receive_skb_core expects that the skb still exists and will process it further, but we just freed it. Fixes: `2ad7bf3638` ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 14:39:29 -05:00
Sabrina Dubroca	cf554ada0b	ipvlan: fix leak in ipvlan_rcv_frame Pass a *skb to ipvlan_rcv_frame so that if skb_share_check returns a new skb, we actually use it during further processing. It's safe to ignore the new skb in the ipvlan_xmit_ functions, because they call ipvlan_rcv_frame with local == true, so that dev_forward_skb is called and always takes ownership of the skb. Fixes: `2ad7bf3638` ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 14:39:28 -05:00
David S. Miller	eb3f8b42aa	Merge branch 'vlan-reorder' Vladislav Yasevich says: ==================== Fix issues with vlans without REORDER_HEADER A while ago Phil Sutter brought up an issue with vlans without REORDER_HEADER and bridges. The problem was that if a vlan without REORDER_HEADER was a port in the bridge, the bridge ended up forwarding corrupted packets that still contained the vlan header. The same issue exists for bridge mode macvlan/macvtap devices. An additional issue with vlans without REORDER_HEADER is that stacking them also doesn't work. The reason here is that skb_reorder_vlan_header() function assumes that it on ETH_HLEN bytes deep into the packet. That is not the case, when you a vlan without REORRDER_HEADER flag set. This series attempts to correct these 2 issues. 1) To solve the stacked vlans problem, the patch simply use skb->mac_len as an offset to start copying mac addresses that is part of header reordering. 2) To fix the issue with bridge/macvlan/macvtap, the second patch simply doesn't write the vlan header back to the packet if the vlan device is either a bridge or a macvlan port. This ends up being the simplest and least performance intrussive solution. I've considered extending patch 2 to all stacked devices (essentially checked for the presense of rx_handler), but that feels like a broader restriction and _may_ break existing uses. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 14:38:36 -05:00
Vlad Yasevich	28f9ee22bc	vlan: Do not put vlan headers back on bridge and macvlan ports When a vlan is configured with REORDER_HEADER set to 0, the vlan header is put back into the packet and makes it appear that the vlan header is still there even after it's been processed. This posses a problem for bridge and macvlan ports. The packets passed to those device may be forwarded and at the time of the forward, vlan headers end up being unexpectedly present. With the patch, we make sure that we do not put the vlan header back (when REORDER_HEADER is 0) if a bridge or macvlan has been configured on top of the vlan device. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 14:38:35 -05:00
Vlad Yasevich	a6e18ff111	vlan: Fix untag operations of stacked vlans with REORDER_HEADER off When we have multiple stacked vlan devices all of which have turned off REORDER_HEADER flag, the untag operation does not locate the ethernet addresses correctly for nested vlans. The reason is that in case of REORDER_HEADER flag being off, the outer vlan headers are put back and the mac_len is adjusted to account for the presense of the header. Then, the subsequent untag operation, for the next level vlan, always use VLAN_ETH_HLEN to locate the begining of the ethernet header and that ends up being a multiple of 4 bytes short of the actuall beginning of the mac header (the multiple depending on the how many vlan encapsulations ethere are). As a reslult, if there are multiple levles of vlan devices with REODER_HEADER being off, the recevied packets end up being dropped. To solve this, we use skb->mac_len as the offset. The value is always set on receive path and starts out as a ETH_HLEN. The value is also updated when the vlan header manupations occur so we know it will be correct. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 14:38:35 -05:00
Timo Teräs	6c606fa32c	via-velocity: unconditionally drop frames with bad l2 length By default the driver allowed incorrect frames to be received. What is worse the code does not handle very short frames correctly. The FCS length is unconditionally subtracted, and the underflow can cause skb_put to be called with large number after implicit cast to unsigned. And indeed, an skb_over_panic() was observed with via-velocity. This removes the module parameter as it does not work in it's current state, and should be implemented via NETIF_F_RXALL if needed. Suggested-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-17 14:37:16 -05:00
Linus Torvalds	a18ab2f6cb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A fs-cache regression fix, and adding a warning about obnoxiou^W moderation of list given in MAINTAINERS" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: MAINTAINERS: linux-cachefs@redhat.com is moderated for non-subscribers FS-Cache: Add missing initialization of ret in cachefiles_write_page()	2015-11-17 10:11:08 -08:00
Linus Torvalds	864f83a1f6	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a bug in the qat driver where a user-space pointer is dereferenced" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: qat - don't use userspace pointer	2015-11-17 09:40:05 -08:00
Geert Uytterhoeven	e62d6e244f	MAINTAINERS: linux-cachefs@redhat.com is moderated for non-subscribers Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-11-16 20:38:44 -05:00
Geert Uytterhoeven	cf89752645	FS-Cache: Add missing initialization of ret in cachefiles_write_page() fs/cachefiles/rdwr.c: In function ‘cachefiles_write_page’: fs/cachefiles/rdwr.c:882: warning: ‘ret’ may be used uninitialized in this function If the jump to label "error" is taken, "ret" will indeed be uninitialized, and random stack data may be printed by the debug code. Fixes: `102f4d900c` ("FS-Cache: Handle a write to the page immediately beyond the EOF marker") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-11-16 20:38:43 -05:00
Ondrej Zary	f1a454a376	ipg: Remove ipg driver Now that IP1000A chips are supported by dl2k driver, the buggy ipg driver can be removed. Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 17:11:31 -05:00
Ondrej Zary	c3f45d322c	dl2k: Add support for IP1000A-based cards Add support for IP1000A chips to dl2k driver. IP1000A chip looks like a TC9020 with integrated PHY. This allows IP1000A chips to work reliably because the ipg driver is buggy - it loses packets under load and then completely stops transmitting data. Tested with Asus NX1101 v2.0 at 10, 100 and 1000Mbps: vendor=0x13f0 device=0x1023 (rev 0x41) subsystem vendor=0x1043 device=0x8180 MAC address registers access needed to be changed from 8-bit to 16-bit because 8-bit does not work on IP1000A. 8-bit access is not even allowed in the TC9020 datasheet (although it worked). 16-bit access works on both. Tested that it does not break D-Link DGE-550T (DL-2000 chip, probably a rebranded TC9020): vendor=0x1186 device=0x4000 (rev 0x0c) subsystem vendor=0x1186 device=0x4000 Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 17:11:31 -05:00
Neil Horman	41033f029e	snmp: Remove duplicate OUTMCAST stat increment the OUTMCAST stat is double incremented, getting bumped once in the mcast code itself, and again in the common ip output path. Remove the mcast bump, as its not needed Validated by the reporter, with good results Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Claus Jensen <claus.jensen@microsemi.com> CC: Claus Jensen <claus.jensen@microsemi.com> CC: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 16:36:32 -05:00
Pavel Fedin	7750130d93	net: thunder: Check for driver data in nicvf_remove() In some cases the crash is caused by nicvf_remove() being called from outside. For example, if we try to feed the device to vfio after the probe has failed for some reason. So, move the check to better place. Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 16:24:44 -05:00
Bjørn Mork	88ad4175b2	net/core: use netdev name in warning if no parent A recent flaw in the netdev feature setting resulted in warnings like this one from VLAN interfaces: WARNING: CPU: 1 PID: 4975 at net/core/dev.c:2419 skb_warn_bad_offload+0xbc/0xcb() : caps=(0x00000000001b5820, 0x00000000001b5829) len=2782 data_len=0 gso_size=1348 gso_type=16 ip_summed=3 The ":" is supposed to be preceded by a driver name, but in this case it is an empty string since the device has no parent. There are many types of network devices without a parent. The anonymous warnings for these devices can be hard to debug. Log the network device name instead in these cases to assist further debugging. This is mostly similar to how __netdev_printk() handles orphan devices. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 16:21:48 -05:00
Hannes Frederic Sowa	8844f97238	af_unix: don't append consumed skbs to sk_receive_queue In case multiple writes to a unix stream socket race we could end up in a situation where we pre-allocate a new skb for use in unix_stream_sendpage but have to free it again in the locked section because another skb has been appended meanwhile, which we must use. Accidentally we didn't clear the pointer after consuming it and so we touched freed memory while appending it to the sk_receive_queue. So, clear the pointer after consuming the skb. This bug has been found with syzkaller (http://github.com/google/syzkaller) by Dmitry Vyukov. Fixes: `869e7c6248` ("net: af_unix: implement stream sendpage support") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 15:39:35 -05:00
Dragos Tatulea	24cb7055a3	net: switchdev: fix return code of fdb_dump stub rtnl_fdb_dump always expects an index to be returned by the ndo_fdb_dump op, but when CONFIG_NET_SWITCHDEV is off, it returns an error. Fix that by returning the given unmodified idx. A similar fix was `0890cf6cb6` ("switchdev: fix return value of switchdev_port_fdb_dump in case of error") but for the CONFIG_NET_SWITCHDEV=y case. Fixes: `45d4122ca7` ("switchdev: add support for fdb add/del/dump via switchdev_port_obj ops.") Signed-off-by: Dragos Tatulea <dragos@endocode.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 15:24:37 -05:00
Yuval Mintz	ab6d7846cf	bnx2x: Fix VLANs null-pointer for 57710, 57711 Commit `05cc5a39dd` "bnx2x: add vlan filtering offload" introduced a regression in regard for vlans for 57710, 57711 adapters - Loading 8021q module on a machine with such an adapter would cause a null pointer dereference, as the driver mistakenly publishes it has capabilities for vlan CTAG filtering. Reported-by: Otto Sabart <osabart@redhat.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 15:13:01 -05:00
Masaru Nagai	d60cf616ec	ravb: remove unhandle int cause This driver does not handle the AVB-DMAC Receive FIFO Warning interrupt now, so the interrupt should not be enabled. Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 15:12:25 -05:00
Ben Cartwright-Cox	027ac58e3c	raw: increment correct SNMP counters for ICMP messages Sending ICMP packets with raw sockets ends up in the SNMP counters logging the type as the first byte of the IPv4 header rather than the ICMP header. This is fixed by adding the IP Header Length to the casting into a icmphdr struct. Signed-off-by: Ben Cartwright-Cox <ben@benjojo.co.uk> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-16 15:08:48 -05:00

1 2 3 4 5 ...

560653 Commits