Commit Graph

62449 Commits

Author SHA1 Message Date
Vladimir Oltean
2e554a7a5d net: dsa: propagate switchdev vlan_filtering prepare phase to drivers
A driver may refuse to enable VLAN filtering for any reason beyond what
the DSA framework cares about, such as:
- having tc-flower rules that rely on the switch being VLAN-aware
- the particular switch does not support VLAN, even if the driver does
  (the DSA framework just checks for the presence of the .port_vlan_add
  and .port_vlan_del pointers)
- simply not supporting this configuration to be toggled at runtime

Currently, when a driver rejects a configuration it cannot support, it
does this from the commit phase, which triggers various warnings in
switchdev.

So propagate the prepare phase to drivers, to give them the ability to
refuse invalid configurations cleanly and avoid the warnings.

Since we need to modify all function prototypes and check for the
prepare phase from within the drivers, take that opportunity and move
the existing driver restrictions within the prepare phase where that is
possible and easy.

Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Woojung Huh <woojung.huh@microchip.com>
Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: Landen Chao <Landen.Chao@mediatek.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Jonathan McDowell <noodles@earth.li>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-05 05:56:48 -07:00
Rikard Falkeborn
b980b313e5 net: openvswitch: Constify static struct genl_small_ops
The only usage of these is to assign their address to the small_ops field
in the genl_family struct, which is a const pointer, and applying
ARRAY_SIZE() on them. Make them const to allow the compiler to put them
in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 21:13:36 -07:00
Rikard Falkeborn
674d3ab949 mptcp: Constify mptcp_pm_ops
The only usages of mptcp_pm_ops is to assign its address to the small_ops
field of the genl_family struct, which is a const pointer, and applying
ARRAY_SIZE() on it. Make it const to allow the compiler to put it in
read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 21:13:36 -07:00
Andrew Lunn
08156ba430 net: dsa: Add devlink port regions support to DSA
Allow DSA drivers to make use of devlink port regions, via simple
wrappers.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:38:53 -07:00
Andrew Lunn
544e7c33ec net: devlink: Add support for port regions
Allow regions to be registered to a devlink port. The same netlink API
is used, but the port index is provided to indicate when a region is a
port region as opposed to a device region.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:38:53 -07:00
Andrew Lunn
3122433eb5 net: dsa: Register devlink ports before calling DSA driver setup()
DSA drivers want to create regions on devlink ports as well as the
devlink device instance, in order to export registers and other tables
per port. To keep all this code together in the drivers, have the
devlink ports registered early, so the setup() method can setup both
device and port devlink regions.

v3:
Remove dp->setup
Move common code out of switch statement.
Fix wrong goto

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:38:53 -07:00
Andrew Lunn
f15ec13a96 net: dsa: Make use of devlink port flavour unused
If a port is unused, still create a devlink port for it, but set the
flavour to unused. This allows us to attach devlink regions to the
port, etc.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:38:52 -07:00
Andrew Lunn
cf1166349c net: devlink: Add unused port flavour
Not all ports of a switch need to be used, particularly in embedded
systems. Add a port flavour for ports which physically exist in the
switch, but are not connected to the front panel etc, and so are
unused. By having unused ports present in devlink, it gives a more
accurate representation of the hardware. It also allows regions to be
associated to such ports, so allowing, for example, to determine
unused ports are correctly powered off, or to compare probable reset
defaults of unused ports to used ports experiences issues.

Actually registering unused ports and setting the flavour to unused is
optional. The DSA core will register all such switch ports, but such
ports are expected to be limited in number. Bigger ASICs may decide
not to list unused ports.

v2:
Expand the description about why it is useful

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:38:52 -07:00
David S. Miller
321e921daa Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Rename 'searched' column to 'clashres' in conntrack /proc/ stats
   to amend a recent patch, from Florian Westphal.

2) Remove unused nft_data_debug(), from YueHaibing.

3) Remove unused definitions in IPVS, also from YueHaibing.

4) Fix user data memleak in tables and objects, this is also amending
   a recent patch, from Jose M. Guisado.

5) Use nla_memdup() to allocate user data in table and objects, also
   from Jose M. Guisado

6) User data support for chains, from Jose M. Guisado

7) Remove unused definition in nf_tables_offload, from YueHaibing.

8) Use kvzalloc() in ip_set_alloc(), from Vasily Averin.

9) Fix false positive reported by lockdep in nfnetlink mutexes,
   from Florian Westphal.

10) Extend fast variant of cmp for neq operation, from Phil Sutter.

11) Implement fast bitwise variant, also from Phil Sutter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:35:53 -07:00
Phil Sutter
10fdd6d80e netfilter: nf_tables: Implement fast bitwise expression
A typical use of bitwise expression is to mask out parts of an IP
address when matching on the network part only. Optimize for this common
use with a fast variant for NFT_BITWISE_BOOL-type expressions operating
on 32bit-sized values.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-10-04 21:08:33 +02:00
Phil Sutter
5f48846daf netfilter: nf_tables: Enable fast nft_cmp for inverted matches
Add a boolean indicating NFT_CMP_NEQ. To include it into the match
decision, it is sufficient to XOR it with the data comparison's result.

While being at it, store the mask that is calculated during expression
init and free the eval routine from having to recalculate it each time.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-10-04 21:08:32 +02:00
Florian Westphal
ab6c41eefd netfilter: nfnetlink: place subsys mutexes in distinct lockdep classes
From time to time there are lockdep reports similar to this one:

 WARNING: possible circular locking dependency detected
 ------------------------------------------------------
 000000004f61aa56 (&table[i].mutex){+.+.}, at: nfnl_lock [nfnetlink]
 but task is already holding lock:
 [..] (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid [nf_tables]
 which lock already depends on the new lock.
 the existing dependency chain (in reverse order) is:
 -> #1 (&net->nft.commit_mutex){+.+.}:
 [..]
        nf_tables_valid_genid+0x18/0x60 [nf_tables]
        nfnetlink_rcv_batch+0x24c/0x620 [nfnetlink]
        nfnetlink_rcv+0x110/0x140 [nfnetlink]
        netlink_unicast+0x12c/0x1e0
 [..]
        sys_sendmsg+0x18/0x40
        linux_sparc_syscall+0x34/0x44
 -> #0 (&table[i].mutex){+.+.}:
 [..]
        nfnl_lock+0x24/0x40 [nfnetlink]
        ip_set_nfnl_get_byindex+0x19c/0x280 [ip_set]
        set_match_v1_checkentry+0x14/0xc0 [xt_set]
        xt_check_match+0x238/0x260 [x_tables]
        __nft_match_init+0x160/0x180 [nft_compat]
 [..]
        sys_sendmsg+0x18/0x40
        linux_sparc_syscall+0x34/0x44
 other info that might help us debug this:
  Possible unsafe locking scenario:
        CPU0                    CPU1
        ----                    ----
   lock(&net->nft.commit_mutex);
                                lock(&table[i].mutex);
                                lock(&net->nft.commit_mutex);
   lock(&table[i].mutex);

Lockdep considers this an ABBA deadlock because the different nfnl subsys
mutexes reside in the same lockdep class, but this is a false positive.

CPU1 table[i] refers to the nftables subsys mutex, whereas CPU1 locks
the ipset subsys mutex.

Yi Che reported a similar lockdep splat, this time between ipset and
ctnetlink subsys mutexes.

Time to place them in distinct classes to avoid these warnings.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-10-04 21:08:32 +02:00
Vasily Averin
9446ab34ac netfilter: ipset: enable memory accounting for ipset allocations
Currently netadmin inside non-trusted container can quickly allocate
whole node's memory via request of huge ipset hashtable.
Other ipset-related memory allocations should be restricted too.

v2: fixed typo ALLOC -> ACCOUNT

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-10-04 21:08:25 +02:00
YueHaibing
82ec6630f9 netfilter: nf_tables_offload: Remove unused macro FLOW_SETUP_BLOCK
commit 9a32669fec ("netfilter: nf_tables_offload: support indr block call")
left behind this.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-10-04 21:07:55 +02:00
Matthieu Baerts
456afe01b1 mptcp: ADD_ADDRs with echo bit are smaller
The MPTCP ADD_ADDR suboption with echo-flag=1 has no HMAC, the size is
smaller than the one initially sent without echo-flag=1. We then need to
use the correct size everywhere when we need this echo bit.

Before this patch, the wrong size was reserved but the correct amount of
bytes were written (and read): the remaining bytes contained garbage.

Fixes: 6a6c05a8b0 ("mptcp: send out ADD_ADDR with echo flag")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/95
Reported-and-tested-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:36:37 -07:00
Guillaume Nault
a45294af9e net/sched: act_mpls: Add action to push MPLS LSE before Ethernet header
Define the MAC_PUSH action which pushes an MPLS LSE before the mac
header (instead of between the mac and the network headers as the
plain PUSH action does).

The only special case is when the skb has an offloaded VLAN. In that
case, it has to be inlined before pushing the MPLS header.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:28:45 -07:00
Guillaume Nault
19fbcb36a3 net/sched: act_vlan: Add {POP,PUSH}_ETH actions
Implement TCA_VLAN_ACT_POP_ETH and TCA_VLAN_ACT_PUSH_ETH, to
respectively pop and push a base Ethernet header at the beginning of a
frame.

POP_ETH is just a matter of pulling ETH_HLEN bytes. VLAN tags, if any,
must be stripped before calling POP_ETH.

PUSH_ETH is restricted to skbs with no mac_header, and only the MAC
addresses can be configured. The Ethertype is automatically set from
skb->protocol. These restrictions ensure that all skb's fields remain
consistent, so that this action can't confuse other part of the
networking stack (like GSO).

Since openvswitch already had these actions, consolidate the code in
skbuff.c (like for vlan and mpls push/pop).

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:28:45 -07:00
Karsten Graul
fd6ebb6fb2 net/smc: use an array to check fields in system EID
The check for old hardware versions that did not have SMCDv2 support was
using suspicious pointer magic. Address the fields using an array.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:04:48 -07:00
Karsten Graul
839d696ffb net/smc: send ISM devices with unique chid in CLC proposal
When building a CLC proposal message then the list of ISM devices does
not need to contain multiple devices that have the same chid value,
all these devices use the same function at the end.
Improve smc_find_ism_v2_device_clnt() to collect only ISM devices that
have unique chid values.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:04:48 -07:00
Yuchung Cheng
9cd8b6c905 tcp: account total lost packets properly
The retransmission refactoring patch
686989700c ("tcp: simplify tcp_mark_skb_lost")
does not properly update the total lost packet counter which may
break the policer mode in BBR. This patch fixes it.

Fixes: 686989700c ("tcp: simplify tcp_mark_skb_lost")
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 16:57:18 -07:00
Julian Wiedmann
a29f245ec9 net/iucv: fix indentation in __iucv_message_receive()
smatch complains about
net/iucv/iucv.c:1119 __iucv_message_receive() warn: inconsistent indenting

While touching this line, also make the return logic consistent and thus
get rid of a goto label.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 16:51:07 -07:00
Julian Wiedmann
398999bac6 net/af_iucv: right-size the uid variable in iucv_sock_bind()
smatch complains about
net/iucv/af_iucv.c:624 iucv_sock_bind() error: memcpy() 'sa->siucv_user_id' too small (8 vs 9)

Which is absolutely correct - the memcpy() takes 9 bytes (sizeof(uid))
from an 8-byte field (sa->siucv_user_id).
Luckily the sockaddr_iucv struct contains more data after the
.siucv_user_id field, and we checked the size of the passed data earlier
on. So the memcpy() won't accidentally read from an invalid location.

Fix the warning by reducing the size of the uid variable to what's
actually needed, and thus reducing the amount of copied data.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 16:51:07 -07:00
Jakub Kicinski
e992a6eda9 genetlink: allow dumping command-specific policy
Right now CTRL_CMD_GETPOLICY can only dump the family-wide
policy. Support dumping policy of a specific op.

v3:
 - rebase after per-op policy export and handle that
v2:
 - make cmd U32, just in case.
v1:
 - don't echo op in the output in a naive way, this should
   make it cleaner to extend the output format for dumping
   policies for all the commands at once in the future.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20201001225933.1373426-11-kuba@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 14:18:29 -07:00
Johannes Berg
50a896cf2d genetlink: properly support per-op policy dumping
Add support for per-op policy dumping. The data is pretty much
as before, except that now the assumption that the policy with
index 0 is "the" policy no longer holds - you now need to look
at the new CTRL_ATTR_OP_POLICY attribute which is a nested attr
(indexed by op) containing attributes for do and dump policies.

When a single op is requested, the CTRL_ATTR_OP_POLICY will be
added in the same way, since do and dump policies may differ.

v2:
 - conditionally advertise per-command policies only if there
   actually is a policy being used for the do/dump and it's
   present at all

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 14:18:29 -07:00
Johannes Berg
aa85ee5f95 genetlink: factor skb preparation out of ctrl_dumppolicy()
We'll need this later for the per-op policy index dump.

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 14:18:29 -07:00
Johannes Berg
04a351a62b netlink: rework policy dump to support multiple policies
Rework the policy dump code a bit to support adding multiple
policies to a single dump, in order to e.g. support per-op
policies in generic netlink.

v2:
 - move kernel-doc to implementation [Jakub]
 - squash the first patch to not flip-flop on the prototype
   [Jakub]
 - merge netlink_policy_dump_get_policy_idx() with the old
   get_policy_idx() we already had
 - rebase without Jakub's patch to have per-op dump

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 14:18:29 -07:00
Johannes Berg
899b07c578 netlink: compare policy more accurately
The maxtype is really an integral part of the policy, and while we
haven't gotten into a situation yet where this happens, it seems
that some developer might eventually have two places pointing to
identical policies, with different maxattr to exclude some attrs
in one of the places.

Even if not, it's really the right thing to compare both since the
two data items fundamentally belong together.

v2:
 - also do the proper comparison in get_policy_idx()

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 14:18:29 -07:00
Jakub Kicinski
a4bb4f5fc8 genetlink: switch control commands to per-op policies
In preparation for adding a new attribute to CTRL_CMD_GETPOLICY
split the policies for getpolicy and getfamily apart.

This will cause a slight user-visible change in that dumping
the policies will switch from per family to per op, but
supposedly sniffer-type applications (which are the main use
case for policy dumping thus far) should support both, anyway.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 19:11:12 -07:00
Jakub Kicinski
8e1ed28fd8 genetlink: use parsed attrs in dumppolicy
Attributes are already parsed based on the policy specified
in the family and ready-to-use in info->attrs. No need to
call genlmsg_parse() again.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 19:11:12 -07:00
Jakub Kicinski
48526a0f4c genetlink: bring back per op policy
Add policy to the struct genl_ops structure, this time
with maxattr, so it can be used properly.

Propagate .policy and .maxattr from the family
in genl_get_cmd() if needed, this way the rest of the
code does not have to worry if the policy is per op
or global.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 19:11:12 -07:00
Jakub Kicinski
78ade619c1 genetlink: use .start callback for dumppolicy
The structure of ctrl_dumppolicy() is clearly split into
init and dumping. Move the init to a .start callback
for clarity, it's a more idiomatic netlink dump code structure.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 19:11:12 -07:00
Jakub Kicinski
adc848450f genetlink: add a structure for dump state
Whenever netlink dump uses more than 2 cb->args[] entries
code gets hard to read. We're about to add more state to
ctrl_dumppolicy() so create a structure.

Since the structure is typed and clearly named we can remove
the local fam_id variable and use ctx->fam_id directly.

v3:
 - rebase onto explicit free fix
v1:
 - s/nl_policy_dump/netlink_policy_dump_state/
 - forward declare struct netlink_policy_dump_state,
   and move from passing unsigned long to actual pointer type
 - add build bug on
 - u16 fam_id
 - s/args/ctx/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 19:11:12 -07:00
Jakub Kicinski
66a9b9287d genetlink: move to smaller ops wherever possible
Bulk of the genetlink users can use smaller ops, move them.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 19:11:11 -07:00
Jakub Kicinski
0b588afdd1 genetlink: add small version of ops
We want to add maxattr and policy back to genl_ops, to enable
dumping per command policy to user space. This, however, would
cause bloat for all the families with global policies. Introduce
smaller version of ops (half the size of genl_ops). Translate
these smaller ops into a full blown struct before use in the
core.

v1:
 - use struct assignment
 - put a full copy of the op in struct genl_dumpit_info
 - s/light/small/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 19:11:11 -07:00
Ioana Ciornei
c50bf2be73 devlink: add .trap_group_action_set() callback
Add a new devlink callback, .trap_group_action_set(), which can be used
by device drivers which do not support controlling the action (drop,
trap) on each trap but rather on the entire group trap.
If this new callback is populated, it will take precedence over the
.trap_action_set() callback when the user requests a change of all the
traps in a group.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 16:31:56 -07:00
Ioana Ciornei
10c24eb23d devlink: add parser error drop packet traps
Add parser error drop packet traps, so that capable device driver could
register them with devlink. The new packet trap group holds any drops of
packets which were marked by the device as erroneous during header
parsing. Add documentation for every added packet trap and packet trap
group.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 16:31:55 -07:00
David S. Miller
26d0a8edca Another set of changes, this time with:
* lots more S1G band support
  * 6 GHz scanning, finally
  * kernel-doc fixes
  * non-split wiphy dump fixes in nl80211
  * various other small cleanups/features
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl92/IsACgkQB8qZga/f
 l8QrBxAAi8HJdFyZtCIuLrXL3KHzey5AjmrYuHdsFcdk8NJZkEco17I3l05D9ek6
 76VvqjiYzDwdmgoHr3yz0K7pOAoTRpBKlaecvZLPXWf2bVhebWSU5EPcrTZHolrJ
 JBoBj4FU6Im/MnFbeiKxPj3M+NTQrLdekODSeaC5hFhi/oSF9lap6RMC8sz4YrVp
 9yKzB8zjz+eL4wL3EsztEzpTxbvHTaVMe0XBVou7Fg2ZauJGwqMxpIukpMWUmmNr
 EequhVFpdlXbVMle8wP4ZR58c4+O1kbRoYL9WhAILtdDhCKfLccWXnlUjzuQlCeB
 RH/jzG7AlVhm972oUuqG9szAcU8hEgWdsNEML7pilXmFk/ZSNLpUZfZCAILn+Gd3
 8oMQnXp2br+DLzf1SO7cxpL2KrTNjrb4gcJVBJ9eBlDjK/64N22MqZkpOKcMxq51
 ocmf1MJ1TbAbZn/kY2hsoaPYt2+bm1umMa/t/Pwuds+xKZEOOuPNgZQILcAsfJZB
 2OWDDT+RNLo/K4mPETtyQQZoCxAWB9n/CcnU+UTsmnUmMsEnCEbPnbYKBrc6jX1l
 jSP6XUD8fxhB2lfW+SPtQPnAi86+gblXVvEO8zm0+ez3juItlIsVcRY8ey/j9N1F
 uorpycvrfl6+Q1+mmhe6et2r+TLdGs73I0PJ44HRA9JZexKBrDg=
 =B9Xl
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-net-next-2020-10-02' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Another set of changes, this time with:
 * lots more S1G band support
 * 6 GHz scanning, finally
 * kernel-doc fixes
 * non-split wiphy dump fixes in nl80211
 * various other small cleanups/features
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:33:13 -07:00
Florian Fainelli
3a68844dd2 net: dsa: Utilize __vlan_find_dev_deep_rcu()
Now that we are guaranteed that dsa_untag_bridge_pvid() is called after
eth_type_trans() we can utilize __vlan_find_dev_deep_rcu() which will
take care of finding an 802.1Q upper on top of a bridge master.

A common use case, prior to 12a1526d067 ("net: dsa: untag the bridge
pvid from rx skbs") was to configure a bridge 802.1Q upper like this:

ip link add name br0 type bridge vlan_filtering 0
ip link add link br0 name br0.1 type vlan id 1

in order to pop the default_pvid VLAN tag.

With this change we restore that behavior while still allowing the DSA
receive path to automatically pop the VLAN tag.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 13:36:07 -07:00
Florian Fainelli
a348292b63 net: dsa: Obtain VLAN protocol from skb->protocol
Now that dsa_untag_bridge_pvid() is called after eth_type_trans() we are
guaranteed that skb->protocol will be set to a correct value, thus
allowing us to avoid calling vlan_eth_hdr().

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 13:36:07 -07:00
Florian Fainelli
1c5ad5a940 net: dsa: b53: Set untag_bridge_pvid
Indicate to the DSA receive path that we need to untage the bridge PVID,
this allows us to remove the dsa_untag_bridge_pvid() calls from
net/dsa/tag_brcm.c.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 13:36:07 -07:00
Florian Fainelli
1dc0408cdf net: dsa: Call dsa_untag_bridge_pvid() from dsa_switch_rcv()
When a DSA switch driver needs to call dsa_untag_bridge_pvid(), it can
set dsa_switch::untag_brige_pvid to indicate this is necessary.

This is a pre-requisite to making sure that we are always calling
dsa_untag_bridge_pvid() after eth_type_trans() has been called.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 13:36:07 -07:00
David S. Miller
c16bcd70a1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2020-10-02

1) Add a full xfrm compatible layer for 32-bit applications on
   64-bit kernels. From Dmitry Safonov.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 13:16:15 -07:00
Johannes Berg
949ca6b82e netlink: fix policy dump leak
[ Upstream commit a95bc734e6 ]

If userspace doesn't complete the policy dump, we leak the
allocated state. Fix this.

Fixes: d07dcf9aad ("netlink: add infrastructure to expose policies to userspace")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 13:07:42 -07:00
Thomas Pedersen
75f87eaeac mac80211: avoid processing non-S1G elements on S1G band
In ieee80211_determine_chantype(), the sband->ht_cap was
being processed before S1G Operation element.  Since the
HT capability element should not be present on the S1G
band, avoid processing potential garbage by moving the
call to ieee80211_apply_htcap_overrides() to after the S1G
block.

Also, in case of a missing S1G Operation element, we would
continue trying to process non-S1G elements (and return
with a channel width of 20MHz). Instead, just assume
primary channel is equal to operating and infer the
operating width from the BSS channel, then return.

Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com>
Link: https://lore.kernel.org/r/20201001174748.24520-1-thomas@adapt-ip.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-10-02 12:07:24 +02:00
Johannes Berg
ab10c22bc3 nl80211: fix non-split wiphy information
When dumping wiphy information, we try to split the data into
many submessages, but for old userspace we still support the
old mode where this doesn't happen.

However, in this case we were not resetting our state correctly
and dumping multiple messages for each wiphy, which would have
broken such older userspace.

This was broken pretty much immediately afterwards because it
only worked in the original commit where non-split dumps didn't
have any more data than split dumps...

Fixes: fe1abafd94 ("nl80211: re-add channel width and extended capa advertising")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20200928130717.3e6d9c6bada2.Ie0f151a8d0d00a8e1e18f6a8c9244dd02496af67@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-10-02 12:07:09 +02:00
Johannes Berg
f8d504caa9 nl80211: reduce non-split wiphy dump size
When wiphy dumps cannot be split, such as in events or with
older userspace that doesn't support it, the size can today
be too big.

Reduce it, by doing two things:

 1) remove data that couldn't have been present before the
    split capability was introduced since it's new, such as
    HE capabilities

 2) as suggested by Martin Willi, remove management frame
    subtypes from the split dumps, as just (1) isn't even
    enough due to other new code capabilities. This is fine
    as old consumers (really just wpa_supplicant) didn't
    check this data before they got support for split dumps.

Reported-by: Martin Willi <martin@strongswan.org>
Suggested-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Martin Willi <martin@strongswan.org>
Link: https://lore.kernel.org/r/20200928130655.53bce7873164.I71f06c9a221cd0630429a1a56eeae68a13beca61@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-10-02 12:06:49 +02:00
Ye Bin
000fe2685b net-sysfs: Fix inconsistent of format with argument type in net-sysfs.c
Fix follow warnings:
[net/core/net-sysfs.c:1161]: (warning) %u in format string (no. 1)
	requires 'unsigned int' but the argument type is 'int'.
[net/core/net-sysfs.c:1162]: (warning) %u in format string (no. 1)
	requires 'unsigned int' but the argument type is 'int'.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-01 18:45:23 -07:00
Ye Bin
32be425b45 pktgen: Fix inconsistent of format with argument type in pktgen.c
Fix follow warnings:
[net/core/pktgen.c:925]: (warning) %u in format string (no. 1)
	requires 'unsigned int' but the argument type is 'signed int'.
[net/core/pktgen.c:942]: (warning) %u in format string (no. 1)
	requires 'unsigned int' but the argument type is 'signed int'.
[net/core/pktgen.c:962]: (warning) %u in format string (no. 1)
	requires 'unsigned int' but the argument type is 'signed int'.
[net/core/pktgen.c:984]: (warning) %u in format string (no. 1)
	requires 'unsigned int' but the argument type is 'signed int'.
[net/core/pktgen.c:1149]: (warning) %d in format string (no. 1)
	requires 'int' but the argument type is 'unsigned int'.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-01 18:45:23 -07:00
David S. Miller
23a1f682a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-10-01

The following pull-request contains BPF updates for your *net-next* tree.

We've added 90 non-merge commits during the last 8 day(s) which contain
a total of 103 files changed, 7662 insertions(+), 1894 deletions(-).

Note that once bpf(/net) tree gets merged into net-next, there will be a small
merge conflict in tools/lib/bpf/btf.c between commit 1245008122 ("libbpf: Fix
native endian assumption when parsing BTF") from the bpf tree and the commit
3289959b97 ("libbpf: Support BTF loading and raw data output in both endianness")
from the bpf-next tree. Correct resolution would be to stick with bpf-next, it
should look like:

  [...]
        /* check BTF magic */
        if (fread(&magic, 1, sizeof(magic), f) < sizeof(magic)) {
                err = -EIO;
                goto err_out;
        }
        if (magic != BTF_MAGIC && magic != bswap_16(BTF_MAGIC)) {
                /* definitely not a raw BTF */
                err = -EPROTO;
                goto err_out;
        }

        /* get file size */
  [...]

The main changes are:

1) Add bpf_snprintf_btf() and bpf_seq_printf_btf() helpers to support displaying
   BTF-based kernel data structures out of BPF programs, from Alan Maguire.

2) Speed up RCU tasks trace grace periods by a factor of 50 & fix a few race
   conditions exposed by it. It was discussed to take these via BPF and
   networking tree to get better testing exposure, from Paul E. McKenney.

3) Support multi-attach for freplace programs, needed for incremental attachment
   of multiple XDP progs using libxdp dispatcher model, from Toke Høiland-Jørgensen.

4) libbpf support for appending new BTF types at the end of BTF object, allowing
   intrusive changes of prog's BTF (useful for future linking), from Andrii Nakryiko.

5) Several BPF helper improvements e.g. avoid atomic op in cookie generator and add
   a redirect helper into neighboring subsys, from Daniel Borkmann.

6) Allow map updates on sockmaps from bpf_iter context in order to migrate sockmaps
   from one to another, from Lorenz Bauer.

7) Fix 32 bit to 64 bit assignment from latest alu32 bounds tracking which caused
   a verifier issue due to type downgrade to scalar, from John Fastabend.

8) Follow-up on tail-call support in BPF subprogs which optimizes x64 JIT prologue
   and epilogue sections, from Maciej Fijalkowski.

9) Add an option to perf RB map to improve sharing of event entries by avoiding remove-
   on-close behavior. Also, add BPF_PROG_TEST_RUN for raw_tracepoint, from Song Liu.

10) Fix a crash in AF_XDP's socket_release when memory allocation for UMEMs fails,
    from Magnus Karlsson.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-01 14:29:01 -07:00
Ido Schimmel
93e155967c drop_monitor: Filter control packets in drop monitor
Previously, devlink called into drop monitor in order to report hardware
originated drops / exceptions. devlink intentionally filtered control
packets and did not pass them to drop monitor as they were not dropped
by the underlying hardware.

Now drop monitor registers its probe on a generic 'devlink_trap_report'
tracepoint and should therefore perform this filtering itself instead of
having devlink do that.

Add the trap type as metadata and have drop monitor ignore control
packets.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 18:01:26 -07:00