Intiyaz Basha says:
====================
liquidio: Removed droq lock from Rx path
Series of patches for removing droq lock from Rx Path.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With the changes in patch 1 and 2, droq lock is not required.
So removing droq lock.
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed oom task unconditional rescheduling every 250ms and created per
queue oom work queue for refilling buffers.
The oom task refills only if the available descriptors is fallen to 64.
There will be no packets coming in after hitting this level. So NAPI will
not run until oom task refills the buffers.
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Control packets are processed in tasklet when interface is down and in
NAPI when interface is up. So tasklet can be disabled when interface up
and re-enabled when interface is down.
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert pr_info to net_info_ratelimited to limit the total number of
synflood warnings.
Commit 946cedccbd ("tcp: Change possible SYN flooding messages")
rate limits synflood warnings to one per listener.
Workloads that open many listener sockets can still see a high rate of
log messages. Syzkaller is one frequent example.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dma_zalloc_coherent() now crashes if no dev pointer is given.
Add a dev pointer to the ltq_dma_channel structure and fill it in the
driver using it.
This fixes a bug introduced in kernel 4.19.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
debugfs_remove_recursive has taken IS_ERR_OR_NULL into account. So just
remove the condition check before debugfs_remove_recursive.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
debugfs_remove_recursive has taken the IS_ERR_OR_NULL into account. Just
remove the unnecessary condition check.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "pause" variable is only initialized on BCM5301x.
Fixes: 5e004460f8 ("net: dsa: b53: Add helper to set link parameters")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for you net tree:
1) Remove duplicated include at the end of UDP conntrack, from Yue Haibing.
2) Restore conntrack dependency on xt_cluster, from Martin Willi.
3) Fix splat with GSO skbs from the checksum target, from Florian Westphal.
4) Rework ct timeout support, the template strategy to attach custom timeouts
is not correct since it will not work in conjunction with conntrack zones
and we have a possible free after use when removing the rule due to missing
refcounting. To fix these problems, do not use conntrack template at all
and set custom timeout on the already valid conntrack object. This
fix comes with a preparation patch to simplify timeout adjustment by
initializating the first position of the timeout array for all of the
existing trackers. Patchset from Florian Westphal.
5) Fix missing dependency on from IPv4 chain NAT type, from Florian.
6) Release chain reference counter from the flush path, from Taehee Yoo.
7) After flushing an iptables ruleset, conntrack hooks are unregistered
and entries are left stale to be cleaned up by the timeout garbage
collector. No TCP tracking is done on established flows by this time.
If ruleset is reloaded, then hooks are registered again and TCP
tracking is restored, which considers packets to be invalid. Clear
window tracking to exercise TCP flow pickup from the middle given that
history is lost for us. Again from Florian.
8) Fix crash from netlink interface with CONFIG_NF_CONNTRACK_TIMEOUT=y
and CONFIG_NF_CT_NETLINK_TIMEOUT=n.
9) Broken CT target due to returning incorrect type from
ctnl_timeout_find_get().
10) Solve conntrack clash on NF_REPEAT verdicts too, from Michal Vaner.
11) Missing conversion of hashlimit sysctl interface to new API, from
Cong Wang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull HID fixes from Jiri Kosina:
- functional regression fix for sensor-hub driver from Hans de Goede
- stop doing device reset for i2c-hid devices, which unbreaks some of
them (and is in line with the specification), from Kai-Heng Feng
- error handling fix for hid-core from Gustavo A. R. Silva
- functional regression fix for some Elan panels from Benjamin
Tissoires
- a few new device ID additions and misc small fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: i2c-hid: Don't reset device upon system resume
HID: sensor-hub: Restore fixup for Lenovo ThinkPad Helix 2 sensor hub report
HID: core: fix NULL pointer dereference
HID: core: fix grouping by application
HID: multitouch: fix Elan panels with 2 input modes declaration
HID: hid-saitek: Add device ID for RAT 7 Contagion
HID: core: fix memory leak on probe
HID: input: fix leaking custom input node name
HID: add support for Apple Magic Keyboards
HID: i2c-hid: Fix flooded incomplete report after S3 on Rayd touchscreen
HID: intel-ish-hid: Enable Sunrise Point-H ish driver
- Fix possible FD type confusion crash
- Fix a user trigger-able crash in cxgb4
- Fix bad handling of IOMMU resources causing user controlled leaking in
bnxt
- Add missing locking in ipoib to fix a rare 'stuck tx' situation
- Add missing locking in cma
- Add two missing missing uverbs cleanups on failure paths, regressions
from this merge window
- Fix a regression from this merge window that caused RDMA NFS to not work
with the mlx4 driver due to the max_sg changes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAluW9yAACgkQOG33FX4g
mxqA7xAAkUb80Q+ssrAFSlGGkPU9ArWD3Bk6bqUHZG2/CbgA45gUuHm7GaYMnBxL
CGULlnL9ejt2LpQUTkWxyVXRwA67wdgfFf+W+bJSJdz1WZNzRZUjC2xNcfjSeVhN
YabxXcZHj4yiEokQpOo+ya2/ygL72Z9f9edIXzg0DjHkjyyGxbNJ+My7gN9odlzl
jcnE3kDTO7YcycwIg5K9a1H9SmioTgXYUioAFUkqB35L5Ye2M/laPJXCEAnKVOgy
HI6VXigphK53y/NbArjO8ou6Lq4gvbunyx1fbXReTwRfYjSRhuWqtvE6z26e4PSs
2BuyGOisLAhWOv9zJRRYXdknZYl9nOfGwraJ76JTbvO58GmORE1yxrIdF0WtuJnH
zDSeBRU5vDMDhrMrLmyfkNe1xugVlcsOur+RDg0ZGxakgXtvxJJiI76qzJpNFu29
T+uLj6j97AgTj/ztQ3Ujse2yaGj0Ldsv98VokBhDSo2oXQrRgvmgHpKzvrBLeAid
28BuER7F5DY6mqSdsBni+LO/Nzn6nSN+82aEy67ltqRLu2Ippu+T4aEo2ArdGyas
cwhJW+5kRL+oVU/X1/PkdlhsXy5w7h746tvklD+hItvMs+PZ4tZrbeEsLlzsMzQh
b15/TQ7RRNOB/NSFh9hW2M0e6ReeSaPuxspNYb1SLUVrft9eCtU=
=33No
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"This fixes one major regression with NFS and mlx4 due to the max_sg
rework in this merge window, tidies a few minor error_path
regressions, and various small fixes.
The HFI1 driver is broken this cycle due to a regression caused by a
PCI change, it is looking like Bjorn will merge a fix for this. Also,
the lingering ipoib issue I mentioned earlier still remains unfixed.
Summary:
- Fix possible FD type confusion crash
- Fix a user trigger-able crash in cxgb4
- Fix bad handling of IOMMU resources causing user controlled leaking
in bnxt
- Add missing locking in ipoib to fix a rare 'stuck tx' situation
- Add missing locking in cma
- Add two missing missing uverbs cleanups on failure paths,
regressions from this merge window
- Fix a regression from this merge window that caused RDMA NFS to not
work with the mlx4 driver due to the max_sg changes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx4: Ensure that maximal send/receive SGE less than supported by HW
RDMA/cma: Protect cma dev list with lock
RDMA/uverbs: Fix error cleanup path of ib_uverbs_add_one()
bnxt_re: Fix couple of memory leaks that could lead to IOMMU call traces
IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handler
iw_cxgb4: only allow 1 flush on user qps
IB/core: Release object lock if destroy failed
RDMA/ucma: check fd type in ucma_migrate_id()
After switching to the new procfs API, it is supposed to
retrieve the private pointer from PDE_DATA(file_inode(s->file)),
s->private is no longer referred.
Fixes: 1cd6718272 ("netfilter/x_tables: switch to proc_create_seq_private")
Reported-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
NF_REPEAT places the packet at the beginning of the iptables chain
instead of accepting or rejecting it right away. The packet however will
reach the end of the chain and continue to the end of iptables
eventually, so it needs the same handling as NF_ACCEPT and NF_DROP.
Fixes: 368982cd7d ("netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks")
Signed-off-by: Michal 'vorner' Vaner <michal.vaner@avast.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Compiler did not catch incorrect typing in the rcu hook assignment.
% nfct add timeout test-tcp inet tcp established 100 close 10 close_wait 10
% iptables -I OUTPUT -t raw -p tcp -j CT --timeout test-tcp
dmesg - xt_CT: Timeout policy `test-tcp' can only be used by L3 protocol number 25000
The CT target bails out with incorrect layer 3 protocol number.
Fixes: 6c1fd7dc48 ("netfilter: cttimeout: decouple timeout policy from nfnetlink_cttimeout object")
Reported-by: Harsha Sharma <harshasharmaiitr@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Now that cttimeout support for nft_ct is in place, these should depend
on CONFIG_NF_CONNTRACK_TIMEOUT otherwise we can crash when dumping the
policy if this option is not enabled.
[ 71.600121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[...]
[ 71.600141] CPU: 3 PID: 7612 Comm: nft Not tainted 4.18.0+ #246
[...]
[ 71.600188] Call Trace:
[ 71.600201] ? nft_ct_timeout_obj_dump+0xc6/0xf0 [nft_ct]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Doug Smythies says:
Sometimes it is desirable to temporarily disable, or clear,
the iptables rule set on a computer being controlled via a
secure shell session (SSH). While unwise on an internet facing
computer, I also do it often on non-internet accessible computers
while testing. Recently, this has become problematic, with the
SSH session being dropped upon re-load of the rule set.
The problem is that when all rules are deleted, conntrack hooks get
unregistered.
In case the rules are re-added later, its possible that tcp window
has moved far enough so that all packets are considered invalid (out of
window) until entry expires (which can take forever, default
established timeout is 5 days).
Fix this by clearing maxwin of existing tcp connections on register.
v2: don't touch entries on hook removal.
v3: remove obsolete expiry check.
Reported-by: Doug Smythies <dsmythies@telus.net>
Fixes: 4d3a57f23d ("netfilter: conntrack: do not enable connection tracking unless needed")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Quectel EP06 (and EM06/EG06) supports dynamic configuration of USB
interfaces, without the device changing VID/PID or configuration number.
When the configuration is updated and interfaces are added/removed, the
interface numbers change. This means that the current code for matching
EP06 does not work.
This patch removes the current EP06 interface number match, and replaces
it with a match on class, subclass and protocol. Unfortunately, matching
on those three alone is not enough, as the diag interface exports the
same values as QMI. The other serial interfaces + adb export different
values and do not match.
The diag interface only has two endpoints, while the QMI interface has
three. I have therefore added a check for number of interfaces, and we
ignore the interface if the number of endpoints equals two.
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cl->leaf.q is slightly more readable than cl->un.leaf.q.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We no longer take any spinlock on RX path for ingress qdisc,
so this lockdep annotation is no longer needed.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change flower in_hw_count type to fixed-size u32 and dump it as
TCA_FLOWER_IN_HW_COUNT. This change is necessary to properly test shared
blocks and re-offload functionality.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates license to use SPDX-License-Identifier
instead of verbose license text.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller says:
====================
SKB list handling cleanups
This is a preparatory patch series which cleans up various forms of
sloppy SKB list handling, and makes certain semantics explicit.
We are trying to eliminate code that directly accesses the SKB
list and SKB queue head next/prev members in any way. It is
impossible to convert SKB queue head over the struct list_head
while such code exists.
This patch series does not eliminate all such code, only the simplest
cases. A latter series will tackle the complicated ones.
A helper is added to make the "skb->next == NULL means not on a list"
rule explicit, and another is added to combine this with list_del().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate the assumption that SKBs and SKB list heads can
be cast to eachother in SKB list handling code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate code which assumes that SKBs and skb_queue_head objects
can be cast to eachother during list processing.
Signed-off-by: David S. Miller <davem@davemloft.net>
It documents what is happening, and eliminates the spurious list
pointer poisoning.
In the long term, in order to get proper list head debugging, we
might want to use the list poison value as the indicator that
an SKB is a singleton and not on a list.
Signed-off-by: David S. Miller <davem@davemloft.net>
An SKB is not on a list if skb->next is NULL.
Codify this convention into a helper function and use it
where we are dequeueing an SKB and need to mark it as such.
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of direct SKB list pointer accesses.
In these situations, we absolutely know that the SKB queue in question
is non-empty.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use skb_queue_walk() instead.
Adjust inner loop test to utilize and skb_queue_is_first().
Unfortunately we have to keep pkt_cnt around because it is
used by a latter loop in this function.
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead, adjust __qdisc_enqueue_tail() such that HTB can use it
instead.
The only other caller of __qdisc_enqueue_tail() is
qdisc_enqueue_tail() so we can move the backlog and return value
handling (which HTB doesn't need/want) to the latter.
Signed-off-by: David S. Miller <davem@davemloft.net>
After the conversion to fib6_info, rt6i_prefsrc has a single user that
reads the value and otherwise it is only set. The one reader can be
converted to use rt->from so rt6i_prefsrc can be removed, reducing
rt6_info by another 20 bytes.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A recent commit updated vlan_cmd.dropnovlan_fm but failed to remove
the older assignment. Fix this by removing the former redundant
assignment.
Detected by CoverityScan, CID#1473290 ("Unused value")
Fixes: a89cdd8e7c ("cxgb4: impose mandatory VLAN usage when non-zero TAG ID")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tls_sw_sendmsg() allocates plaintext and encrypted SG entries using
function sk_alloc_sg(). In case the number of SG entries hit
MAX_SKB_FRAGS, sk_alloc_sg() returns -ENOSPC and sets the variable for
current SG index to '0'. This leads to calling of function
tls_push_record() with 'sg_encrypted_num_elem = 0' and later causes
kernel crash. To fix this, set the number of SG elements to the number
of elements in plaintext/encrypted SG arrays in case sk_alloc_sg()
returns -ENOSPC.
Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal says:
====================
bug fixes for ENA Ethernet driver
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Added memory barriers where they were missing to support multiple
architectures, and removed redundant ones.
As part of removing the redundant memory barriers and improving
performance, we moved to more relaxed versions of memory barriers,
as well as to the more relaxed version of writel - writel_relaxed,
while maintaining correctness.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add READ_ONCE calls where necessary (for example when iterating
over a memory field that gets updated by the hardware).
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
acquire the rtnl_lock during device destruction to avoid
using partially destroyed device.
ena_remove() shares almost the same logic as ena_destroy_device(),
so use ena_destroy_device() and avoid duplications.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ena_destroy_device() can potentially be called twice.
To avoid this, check that the device is running and
only then proceed destroying it.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When ena_destroy_device() is called from ena_suspend(), the device is
still reachable from the driver. Therefore, the driver can send a command
to the device to free all resources.
However, in all other cases of calling ena_destroy_device(), the device is
potentially in an error state and unreachable from the driver. In these
cases the driver must not send commands to the device.
The current implementation does not request resource freeing from the
device even when possible. We add the graceful parameter to
ena_destroy_device() to enable resource freeing when possible, and
use it in ena_suspend().
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>