reqsk_queue_destroy() and reqsk_queue_unlink() should use
del_timer_sync() instead of del_timer() before calling reqsk_put(),
otherwise we could free a req still used by another cpu.
But before doing so, reqsk_queue_destroy() must release syn_wait_lock
spinlock or risk a dead lock, as reqsk_timer_handler() might
need to take this same spinlock from reqsk_queue_unlink() (called from
inet_csk_reqsk_queue_drop())
Fixes: fa76ce7328 ("inet: get rid of central tcp/dccp listener timer")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If register_netdev() fails we are not propagating the error and
we return success because ax_open() succeeded previously.
Fix this by checking the return value of ax_open() and
register_netdev() and propagate the error in case of failure.
Reported-by: RUC_Soft_Sec <zy900702@163.com>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains five Netfilter fixes for your net tree,
they are:
1) Silence a warning on falling back to vmalloc(). Since 88eab472ec, we can
easily hit this warning message, that gets users confused. So let's get rid
of it.
2) Recently when porting the template object allocation on top of kmalloc to
fix the netns dependencies between x_tables and conntrack, the error
checks where left unchanged. Remove IS_ERR() and check for NULL instead.
Patch from Dan Carpenter.
3) Don't ignore gfp_flags in the new nf_ct_tmpl_alloc() function, from
Joe Stringer.
4) Fix a crash due to NULL pointer dereference in ip6t_SYNPROXY, patch from
Phil Sutter.
5) The sequence number of the Syn+ack that is sent from SYNPROXY to clients is
not adjusted through our NAT infrastructure, as a result the client may
ignore this TCP packet and TCP flow hangs until the client probes us. Also
from Phil Sutter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz says:
====================
bnx2x: small fixes
This adds 2 small fixes, one to error flows during memory release
and the other to flash writes via ethtool API.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Writing each 4Kb page into flash might take up-to ~100 miliseconds,
during which time management firmware cannot acces the nvram for its
own uses.
Firmware upgrade utility use the ethtool API to burn new flash images
for the device via the ethtool API, doing so by writing several page-worth
of data on each command. Such action might create problems for the
management firmware, as the nvram might not be accessible for a long time.
This patch changes the write implementation, releasing the nvram lock on
the completion of each page, allowing the management firmware time to
claim it and perform its own required actions.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On error flows its possible to free an SKB even if it was not allocated.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were missing curly braces so it means we call add_debugfs_mem()
unintentionally.
Fixes: 3ccc6cf74d ('cxgb4: Adds support for T6 adapter')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After "62bccb8 net-timestamp: Make the clone operation stand-alone from phy
timestamping" the hwtstamps parameter of skb_complete_tx_timestamp() may no
longer be NULL.
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
48ed7b26fa ("ipv6: reject locally assigned nexthop addresses") is too
strict; it rejects following corner-case:
ip -6 route add default via fe80::1:2:3 dev eth1
[ where fe80::1:2:3 is assigned to a local interface, but not eth1 ]
Fix this by restricting search to given device if nh is linklocal.
Joint work with Hannes Frederic Sowa.
Fixes: 48ed7b26fa ("ipv6: reject locally assigned nexthop addresses")
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus reports the following deadlock on rtnl_mutex; triggered only
once so far (extract):
[12236.694209] NetworkManager D 0000000000013b80 0 1047 1 0x00000000
[12236.694218] ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018
[12236.694224] ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff
[12236.694230] ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80
[12236.694235] Call Trace:
[12236.694250] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694257] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694263] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694271] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694280] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694291] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694299] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694309] [<ffffffff814f5ad3>] ? rtnl_getlink+0x113/0x190
[12236.694319] [<ffffffff814f202a>] ? rtnetlink_rcv_msg+0x7a/0x210
[12236.694331] [<ffffffff8124565c>] ? sock_has_perm+0x5c/0x70
[12236.694339] [<ffffffff814f1fb0>] ? rtnetlink_rcv+0x30/0x30
[12236.694346] [<ffffffff8150d62c>] ? netlink_rcv_skb+0x9c/0xc0
[12236.694354] [<ffffffff814f1f9f>] ? rtnetlink_rcv+0x1f/0x30
[12236.694360] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694367] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694376] [<ffffffff810a236f>] ? __wake_up+0x2f/0x50
[12236.694387] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694396] [<ffffffff814cb05e>] ? ___sys_sendmsg+0x22e/0x240
[12236.694405] [<ffffffff814cab75>] ? ___sys_recvmsg+0x135/0x1a0
[12236.694415] [<ffffffff811a9d12>] ? eventfd_write+0x82/0x210
[12236.694423] [<ffffffff811a0f9e>] ? fsnotify+0x32e/0x4c0
[12236.694429] [<ffffffff8108cb70>] ? wake_up_q+0x60/0x60
[12236.694434] [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70
[12236.694440] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
It seems so far plausible that the recursive call into rtnetlink_rcv()
looks suspicious. One way, where this could trigger is that the senders
NETLINK_CB(skb).portid was wrongly 0 (which is rtnetlink socket), so
the rtnl_getlink() request's answer would be sent to the kernel instead
to the actual user process, thus grabbing rtnl_mutex() twice.
One theory would be that netlink_autobind() triggered via netlink_sendmsg()
internally overwrites the -EBUSY error to 0, but where it is wrongly
originating from __netlink_insert() instead. That would reset the
socket's portid to 0, which is then filled into NETLINK_CB(skb).portid
later on. As commit d470e3b483 ("[NETLINK]: Fix two socket hashing bugs.")
also puts it, -EBUSY should not be propagated from netlink_insert().
It looks like it's very unlikely to reproduce. We need to trigger the
rhashtable_insert_rehash() handler under a situation where rehashing
currently occurs (one /rare/ way would be to hit ht->elasticity limits
while not filled enough to expand the hashtable, but that would rather
require a specifically crafted bind() sequence with knowledge about
destination slots, seems unlikely). It probably makes sense to guard
__netlink_insert() in any case and remap that error. It was suggested
that EOVERFLOW might be better than an already overloaded ENOMEM.
Reference: http://thread.gmane.org/gmane.linux.network/372676
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit "e29aa33 bna: Enable Multi Buffer RX" moved packets counter
increment from the beginning of the NAPI processing loop after the check
for erroneous packets so they are never accounted. This counter is used
to inform firmware about number of processed completions (packets).
As these packets are never acked the firmware fires IRQs for them again
and again.
Fixes: e29aa33 ("bna: Enable Multi Buffer RX")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcin Wojtas says:
====================
Fixes for the network driver of Marvell Armada 375 SoC
This is a set of three patches that fix long-lasting problems implemented in
the initial support for the Armada 375 network controller.
Due to an inappropriate concept of handling the per-CPU sent packets'
processing on TX path the driver numerous problems occured, such as RCU
stalls. Those have been fixed, of which details you can find in the commit
logs. The patches were intensively tested on top of v4.2-rc5.
I'm looking forward to any comments or remarks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The PP2 controller is capable of per-CPU TX processing, which means there are
per-CPU banked register sets and queues. Current version of the driver supports
TX packet coalescing - once on given CPU sent packets amount reaches a threshold
value, an IRQ occurs. However, there is a single interrupt line responsible for
CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not
support RSS).
When the top-half executes the interrupt cause is not known. This is why in
NAPI poll function, along with RX processing, IRQ cause register on both
CPU's is accessed in order to determine on which of them the TX coalescing
threshold might have been reached. Thus the egress processing and releasing the
buffers is able to take place on the corresponding CPU. Hitherto approach lead
to an illegal usage of on_each_cpu function in softirq context.
The problem is solved by resigning from TX coalescing interrupts and separating
egress finalization from NAPI processing. For that purpose a method of using
hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released
once a software coalescing threshold is reached. In case not all the data is
processed a timer is set on this CPU - in its interrupt context a tasklet is
scheduled in which all queues are processed. At once only one timer per-CPU can
be running, which is controlled by a dedicated flag.
This commit removes TX processing from NAPI polling function, disables hardware
coalescing and enables hrtimer with tasklet, using new per-CPU port structure
(mvpp2_port_pcpu).
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mvpp2 driver allows usage of per-CPU TX processing. Once the packets are
prepared independetly on each CPU, the hardware enqueues the descriptors in
common TX queue. After they are sent, the buffers and associated sk_buffs
should be released on the corresponding CPU.
This is why a special index is maintained in order to point to the right data to
be released after transmission takes place. Each per-CPU TX queue comprise an
array of sent sk_buffs, freed in mvpp2_txq_bufs_free function. However, the
index was used there also for obtaining a descriptor (and therefore a buffer to
be DMA-unmapped) from common TX queue, which was wrong, because it was not
referring to the current CPU.
This commit enables proper unmapping of sent data buffers by indexing them in
per-CPU queues using a dedicated array for keeping their physical addresses.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using spinlocks protection during one-time driver initialization is not
necessary. Moreover it resulted in invalid GFP_KERNEL allocation under the lock.
This commit removes redundant spinlocks from buffer manager part of mvpp2
initialization.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Reported-by: Alexandre Fournier <alexandre.fournier@wisp-e.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon receipt of SYNACK from the server, ipt_SYNPROXY first sends back an ACK to
finish the server handshake, then calls nf_ct_seqadj_init() to initiate
sequence number adjustment of forwarded packets to the client and finally sends
a window update to the client to unblock it's TX queue.
Since synproxy_send_client_ack() does not set synproxy_send_tcp()'s nfct
parameter, no sequence number adjustment happens and the client receives the
window update with incorrect sequence number. Depending on client TCP
implementation, this leads to a significant delay (until a window probe is
being sent).
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This happens when networking namespaces are enabled.
Suggested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
failed to configure the page size for architectures with page size
different than 4K.
Fixes: 938fe83 ("net/mlx5_core: New device capabilities handling")
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- prevent DAT from replying on behalf of local clients and confuse L2
bridges
- fix crash on double list removal of TT objects (tt_local_entry)
- fix crash due to missing NULL checks
- initialize bw values for new GWs objects to prevent memory leak
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJVwdK+AAoJEOb/4TMchkvfO9UP/i8vIBZ0XabuYlG9r5GnYh8s
1WvO7jbe4zYhe3WrfAibcfcxMT6v0aI6hc/VRCb36zTNT3VJeh76+fKGwLC4ZU08
UdefhzCavV/gg2fX0b67aA/SsC6I9FrZAG8VOt5GXhMWqjZYSwbxMKLrNe/QRJks
hUzfcQbmFjIMSTK2Zf4/MV0AHzzPKbHixE8t3to2DmlHhZ6lCWV5l42nv/H/s8js
uxNI06rnQBY7nY06rZEualznEZpN7AFtFeIw9zXDZ0PgkIPrsjShwu/ZyzVK9wg1
4ld1JQjf1uKbrOqvdY4bAx8EUUQQ3n68513IXuKLgAdPcBGYVFo8atbk9K24QPue
KzO5tqg7ELf6EO1/haH/VLOTCrMxiC1C62CuNxQn1rDIUAIUrtX9Rqt2f2bS75KG
tGNJTUfQfpNCY4IlqaUE8cdzABiY6cLIf68gSmXl/FfBzVNWDc1Totsi/B6C2WEM
PYjXnmPc97tz0/iWRpvOMd1DxoMXnM0odH4gifUt3Osl0hKMbelhnGKACCvHM8jK
3EOOz1yZIEoX9OkwtFnCji+web884ASUJjnyrNJgbyOY3juj2B0hSrWgK++trUMS
+5oZMkFtHaar0l+JRc2Xhf/GqtkY50lGlwcPlOukDyKJ+RvPTRsShIYhns75Z2us
dWqAlJEdf3kRZuDri8O3
=JfaB
-----END PGP SIGNATURE-----
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Antonio Quartulli says:
====================
Included changes:
- prevent DAT from replying on behalf of local clients and confuse L2
bridges
- fix crash on double list removal of TT objects (tt_local_entry)
- fix crash due to missing NULL checks
- initialize bw values for new GWs objects to prevent memory leak
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla says:
====================
be2net: patch set
This patch set contains 2 driver fixes to a Lancer HW issue and a fix
to a double free bug. Pls apply to the "net" tree. Thanks!
Patch 1 now enables filters only after creating RXQs. This is done as
HW issues were observed on Lancer adapters if filters
(flags, mac addrs etc) are enabled *before* creating RXQs. This patch
changes the driver design by enabling filters in be_open() --
instead of be_setup() -- after RXQs are created and buffers posted.
Patch 2 fixes an RX stall issue that was seen on Lancer adapters when
RXQs are destroyed while they are in an "out of buffer" state.
This patch fixes this issue by posting 64 buffers to each RXQ before
destroying them in the close path. This is done after ensuring that no
more new packets are selected for transfer to the RXQs by disabling
interface filters.
Patch 3 protects eqo->affinity_mask variable from being freed twice and
resulting in a crash. It's now freed only when EQs haven't yet been
destroyed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There are paths in the driver such as an unrecoverable error (UE) detection
followed by a driver unload wherein be_clear() is invoked twice.
Individual data structures are reset so that they are not cleaned/freed
twice. This patch does the same for eqo->affinity_mask. It is freed only
if EQs haven't yet been destroyed. This fixes a possible crash when
affinity_mask is freed twice.
Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An RX stall issue was seen on Lancer adapters, when RXQs are destroyed
while they are in an "out of buffer" state. This patch fixes this issue
by posting 64 buffers to each RXQ before destroying them in the close path.
This is done after ensuring that no more new packets are selected for
transfer to the RXQs by disabling interface filters.
Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW issues were observed on Lancer adapters if IFACE filters
(flags, mac addrs etc) are enabled *before* creating RXQs. This patch
changes the driver design by enabling filters in be_open() --
instead of be_setup() -- after RXQs are created and buffers posted.
Two new wrapper functions, be_enable_if_filters() and
be_disable_if_filters() are introduced to enable/disable IFACE filters in
be_open()/be_close() respectively. In be_setup() the IFACE is now created
only with the RSS flag.
Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
virtio declares support for NETIF_F_FRAGLIST, but assumes
that there are at most MAX_SKB_FRAGS + 2 fragments which isn't
always true with a fraglist.
A longer fraglist in the skb will make the call to skb_to_sgvec overflow
the sg array, leading to memory corruption.
Drop NETIF_F_FRAGLIST so we only get what we can handle.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch b1c17215d7: "stmmac: add ipq806x glue layer", leads to the
following static checker warning:
.../stmmac/dwmac-ipq806x.c:314 ipq806x_gmac_probe()
warn: double left shift '1 << (1 << gmac->id)'
The NSS_COMMON_CLK_SRC_CTRL_OFFSET macro is used once as an offset, and
once as a mask, which is a bug indeed. We'll fix it by defining the
offset as the real offset value and computing the mask from it when
required.
Tested on IPQ806x ref designs AP148 & DB149.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to this patch, rx buffer size for each rx queue
of an interface is configurable through dts bindings.
But for an interface, the first rx queue's rx buffer
size is always the usual MTU size (plus usual overhead)
and page size for the remaining rx queues (if they are
enabled by specifying a non-zero rx queue depth dts
binding of the corresponding interface). This patch
removes the rx buffer size configuration capability.
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Acked-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As well as for kernels built only for ThunderX ARCH_THUNDERX is also enabled
for kernels which support multiple platforms (such as distro kernels). Thus
"default ARCH_THUNDER" is inappropriate.
I believe default m is equally frowned upon, so remove the line completely
rather than "default m if ARCH_THUNDER".
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Robert Richter <rric@kernel.org>
Cc: Derek Chickles <derek.chickles@caviumnetworks.com>
Cc: Satanand Burla <satananda.burla@caviumnetworks.com>
Cc: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Cc: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Enforcing this flag in RxConfig for the mentioned chips fixes netdev
watchdog issues prepended with AMD IOMMU message(s) like:
AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x001d address=0x0000000000003000 flags=0x0050]
Note that this flag is also set in Realtek's own driver for these chips.
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Alexander Lindqvist <alexander@bitspace.se>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The attribute size wasn't accounted for in the get_slave_size() callback
(br_port_get_slave_size) when it was introduced, so fix it now. Also add
a policy entry for it in br_port_policy.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 842a9ae08a ("bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi")
Signed-off-by: David S. Miller <davem@davemloft.net>
The attribute size wasn't accounted for in the get_slave_size() callback
(br_port_get_slave_size) when it was introduced, so fix it now. Also add
a policy entry for it in br_port_policy.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 958501163d ("bridge: Add support for IEEE 802.11 Proxy ARP")
Signed-off-by: David S. Miller <davem@davemloft.net>
* a fix for the stuck TFD queue mechanism - it was producing
noisy false alarms
* a fix for the NIC prepare flow that prevented the driver
from being able to access the device on certain systems
* a fix for the scan prority handling which allows the
regular scan to run even if a scheduled scan is already
running
rsi:
* fix firmware load DMA regression
b43:
* fix extpa_gain check for 2GHz
rtlwifi:
* fix NULL dereference when PCI driver used as an AP
* add missing module parameter declaration for rtl8723be_mod_params.msi_support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJVwOsmAAoJEG4XJFUm622bWSAIAJDlW0ZKnxI+tmsbU2CYuoQS
yfiK+oTuQE5+eB+5bNaJqSI2QSzVo11JBxnf4014wu+/gjKy2OHQ48ufXbkFfHB2
t2o+rIx7WL5zvoy67fIifgIQSdg542e68xPc8Wz6MV58szuYscBT78dy2ZWvthqB
S6DK8K+ohHzHYMV5fw65xkcmsXB8BEKEoUckm8ZxjsLF4Pj+ZzAgpqf+i85JYkFM
LwktfKRxg2un9u51IhBCPZUUxe7MhLgZbSuHSP/7ltxWIGctseMAFMlCMrmi9+2B
zXPR+2EMWLy0rgmQ04/sG6LjNdW6tmF1rCtcuKnS7hiOngubAj3UEj9d5r98/LI=
=SqUX
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-for-davem-2015-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
iwlwifi:
* a fix for the stuck TFD queue mechanism - it was producing
noisy false alarms
* a fix for the NIC prepare flow that prevented the driver
from being able to access the device on certain systems
* a fix for the scan prority handling which allows the
regular scan to run even if a scheduled scan is already
running
rsi:
* fix firmware load DMA regression
b43:
* fix extpa_gain check for 2GHz
rtlwifi:
* fix NULL dereference when PCI driver used as an AP
* add missing module parameter declaration for rtl8723be_mod_params.msi_support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 1fbe4b46ca "net: pktgen: kill the Wait for kthread_stop
code in pktgen_thread_worker()" removed (in particular) the final
__set_current_state(TASK_RUNNING) and I didn't notice the previous
set_current_state(TASK_INTERRUPTIBLE). This triggers the warning
in __might_sleep() after return.
Afaics, we can simply remove both set_current_state()'s, and we
could do this a long ago right after ef87979c27 "pktgen: better
scheduler friendliness" which changed pktgen_thread_worker() to
use wait_event_interruptible_timeout().
Reported-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Waking the dealloc thread before decrementing inflight_packets is racy
because it means the thread may go to sleep before inflight_packets is
decremented. If kthread_stop() has already been called, the dealloc
thread may wait forever with nothing to wake it. Instead, wake the
thread only after decrementing inflight_packets.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 738ac1ebb9 ("net: Clone
skb before setting peeked flag") introduced a use-after-free bug
in skb_recv_datagram. This is because skb_set_peeked may create
a new skb and free the existing one. As it stands the caller will
continue to use the old freed skb.
This patch fixes it by making skb_set_peeked return the new skb
(or the old one if unchanged).
Fixes: 738ac1ebb9 ("net: Clone skb before setting peeked flag")
Reported-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Brenden Blanco <bblanco@plumgrid.com>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The clocks are initially active and thus the device is marked active.
This still keeps the PM refcount at 0, the pm_runtime_put_autosuspend()
call at the end of probe then leaves us with an invalid refcount of -1,
which in turn leads to the device staying in suspended state even though
netdev open had been called.
Fix this by initializing the refcount to be coherent with the initial
device status.
Fixes:
8fff755e9f (net: fec: Ensure clocks are enabled while using mdio bus)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flags were ignored for this function when it was introduced. Also
fix the style problem in kzalloc.
Fixes: 0838aa7fc (netfilter: fix netns dependencies with conntrack
templates)
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Without this initialization, gateways which actually announce up/down
bandwidth of 0/0 could be added. If these nodes get purged via
_batadv_purge_orig() later, the gw_node structure does not get removed
since batadv_gw_node_delete() updates the gw_node with up/down
bandwidth of 0/0, and the updating function then discards the change
and does not free gw_node.
This results in leaking the gw_node structures, which references other
structures: gw_node -> orig_node -> orig_node_ifinfo -> hardif. When
removing the interface later, the open reference on the hardif may cause
hangs with the infamous "unregister_netdevice: waiting for mesh1 to
become free. Usage count = 1" message.
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
The tt_local_entry deletion performed in batadv_tt_local_remove() was neither
protecting against simultaneous deletes nor checking whether the element was
still part of the list before calling hlist_del_rcu().
Replacing the hlist_del_rcu() call with batadv_hash_remove() provides adequate
protection via hash spinlocks as well as an is-element-still-in-hash check to
avoid 'blind' hash removal.
Fixes: 068ee6e204 ("batman-adv: roaming handling mechanism redesign")
Reported-by: alfonsname@web.de
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
batadv_softif_vlan_get() may return NULL which has to be verified
by the caller.
Fixes: 35df3b298f ("batman-adv: fix TT VLAN inconsistency on VLAN re-add")
Reported-by: Ryan Thompson <ryan@eero.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
When a node running DAT receives an ARP request from the LAN for the
first time, it is likely that this node will request the ARP entry
through the distributed ARP table (DAT) in the mesh.
Once a DAT reply is received the asking node must check if the MAC
address for which the IP address has been asked is local. If it is, the
node must drop the ARP reply bceause the client should have replied on
its own locally.
Forwarding this reply means fooling any L2 bridge (e.g. Ethernet
switches) lying between the batman-adv node and the LAN. This happens
because the L2 bridge will think that the client sending the ARP reply
lies somewhere in the mesh, while this node is sitting in the same LAN.
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Determine if a fraglist is needed in the tx path, and allocate it if
necessary before setting up the copy and map operations.
Otherwise, undoing the copy and map operations is tricky.
This fixes a use-after-free: if allocating the fraglist failed, the copy
and map operations that had been set up were still executed, writing
over the data area of a freed skb.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Multicast dst are not cached. They carry DST_NOCACHE.
As mentioned in commit f886497212 ("ipv4: fix dst race in
sk_dst_get()"), these dst need special care before caching them
into a socket.
Caching them is allowed only if their refcnt was not 0, ie we
must use atomic_inc_not_zero()
Also, we must use READ_ONCE() to fetch sk->sk_rx_dst, as mentioned
in commit d0c294c53a ("tcp: prevent fetching dst twice in early demux
code")
Fixes: 421b3885bf ("udp: ipv4: Add udp early demux")
Tested-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
Reported-by: Alex Gartrell <agartrell@fb.com>
Cc: Michal Kubeček <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
When vortex_up is failed, the skb buffers allocated by __netdev_alloc_skb
in vortex_open are not released, which may cause resource leaks.
This bug has been submitted before.
This patch modifies the error handling code to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"len" is a signed integer. We check that len is not negative, so it
goes from zero to INT_MAX. PAGE_SIZE is unsigned long so the comparison
is type promoted to unsigned long. ULONG_MAX - 4095 is a higher than
INT_MAX so the condition can never be true.
I don't know if this is harmful but it seems safe to limit "len" to
INT_MAX - 4095.
Fixes: a8c879a7ee ('RDS: Info and stats')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we share an action within a filter, the bind refcnt
should increase, therefore we should not call tcf_hash_release().
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
openvswitch modifies the L4 checksum of a packet when modifying
the ip address. When an IP packet is fragmented only the first
fragment contains an L4 header and checksum. Prior to this change
openvswitch would modify all fragments, modifying application data
in non-first fragments, causing checksum failures in the
reassembled packet.
Signed-off-by: Glenn Griffin <ggriffin.kernel@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver code allows for the disabling of MSI interrupts; however the
module_parm line was missed and the option fails to show with modinfo.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@vger.kernel.org> [3.15+]
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Alex reported the following crash when using fq_codel
with htb:
crash> bt
PID: 630839 TASK: ffff8823c990d280 CPU: 14 COMMAND: "tc"
[... snip ...]
#8 [ffff8820ceec17a0] page_fault at ffffffff8160a8c2
[exception RIP: htb_qlen_notify+24]
RIP: ffffffffa0841718 RSP: ffff8820ceec1858 RFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88241747b400
RDX: ffff88241747b408 RSI: 0000000000000000 RDI: ffff8811fb27d000
RBP: ffff8820ceec1868 R8: ffff88120cdeff24 R9: ffff88120cdeff30
R10: 0000000000000bd4 R11: ffffffffa0840919 R12: ffffffffa0843340
R13: 0000000000000000 R14: 0000000000000001 R15: ffff8808dae5c2e8
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [...] qdisc_tree_decrease_qlen at ffffffff81565375
#10 [...] fq_codel_dequeue at ffffffffa084e0a0 [sch_fq_codel]
#11 [...] fq_codel_reset at ffffffffa084e2f8 [sch_fq_codel]
#12 [...] qdisc_destroy at ffffffff81560d2d
#13 [...] htb_destroy_class at ffffffffa08408f8 [sch_htb]
#14 [...] htb_put at ffffffffa084095c [sch_htb]
#15 [...] tc_ctl_tclass at ffffffff815645a3
#16 [...] rtnetlink_rcv_msg at ffffffff81552cb0
[... snip ...]
As Jamal pointed out, there is actually no need to call dequeue
to purge the queued skb's in reset, data structures can be just
reset explicitly. Therefore, we reset everything except config's
and stats, so that we would have a fresh start after device flipping.
Fixes: 4b549a2ef4 ("fq_codel: Fair Queue Codel AQM")
Reported-by: Alex Gartrell <agartrell@fb.com>
Cc: Alex Gartrell <agartrell@fb.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
[xiyou.wangcong@gmail.com: added codel_vars_init() and qdisc_qstats_backlog_dec()]
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When removing a port's netdevice in 'rocker_remove_ports', we should
also free the allocated 'net_device' structure. Do that by calling
'free_netdev' after unregistering it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Fixes: 4b8ac9660a ("rocker: introduce rocker switch driver")
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Must teardown SR-IOV before unregistering netdev in igb driver, from
Alex Williamson.
2) Fix ipv6 route unreachable crash in IPVS, from Alex Gartrell.
3) Default route selection in ipv4 should take the prefix length, table
ID, and TOS into account, from Julian Anastasov.
4) sch_plug must have a reset method in order to purge all buffered
packets when the qdisc is reset, likewise for sch_choke, from WANG
Cong.
5) Fix deadlock and races in slave_changelink/br_setport in bridging.
From Nikolay Aleksandrov.
6) mlx4 bug fixes (wrong index in port even propagation to VFs,
overzealous BUG_ON assertion, etc.) from Ido Shamay, Jack
Morgenstein, and Or Gerlitz.
7) Turn off klog message about SCTP userspace interface compat that
makes no sense at all, from Daniel Borkmann.
8) Fix unbounded restarts of inet frag eviction process, causing NMI
watchdog soft lockup messages, from Florian Westphal.
9) Suspend/resume fixes for r8152 from Hayes Wang.
10) Fix busy loop when MSG_WAITALL|MSG_PEEK is used in TCP recv, from
Sabrina Dubroca.
11) Fix performance regression when removing a lot of routes from the
ipv4 routing tables, from Alexander Duyck.
12) Fix device leak in AF_PACKET, from Lars Westerhoff.
13) AF_PACKET also has a header length comparison bug due to signedness,
from Alexander Drozdov.
14) Fix bug in EBPF tail call generation on x86, from Daniel Borkmann.
15) Memory leaks, TSO stats, watchdog timeout and other fixes to
thunderx driver from Sunil Goutham and Thanneeru Srinivasulu.
16) act_bpf can leak memory when replacing programs, from Daniel
Borkmann.
17) WOL packet fixes in gianfar driver, from Claudiu Manoil.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (79 commits)
stmmac: fix missing MODULE_LICENSE in stmmac_platform
gianfar: Enable device wakeup when appropriate
gianfar: Fix suspend/resume for wol magic packet
gianfar: Fix warning when CONFIG_PM off
act_pedit: check binding before calling tcf_hash_release()
net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket
net: sched: fix refcount imbalance in actions
r8152: reset device when tx timeout
r8152: add pre_reset and post_reset
qlcnic: Fix corruption while copying
act_bpf: fix memory leaks when replacing bpf programs
net: thunderx: Fix for crash while BGX teardown
net: thunderx: Add PCI driver shutdown routine
net: thunderx: Fix crash when changing rss with mutliple traffic flows
net: thunderx: Set watchdog timeout value
net: thunderx: Wakeup TXQ only if CQE_TX are processed
net: thunderx: Suppress alloc_pages() failure warnings
net: thunderx: Fix TSO packet statistic
net: thunderx: Fix memory leak when changing queue count
net: thunderx: Fix RQ_DROP miscalculation
...