Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make struct pernet_operations::id unsigned.
There are 2 reasons to do so:
1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.
2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.
"int" being used as an array index needs to be sign-extended
to 64-bit before being used.
void f(long *p, int i)
{
g(p[i]);
}
roughly translates to
movsx rsi, esi
mov rdi, [rsi+...]
call g
MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.
Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:
static inline void *net_generic(const struct net *net, int id)
{
...
ptr = ng->ptr[id - 1];
...
}
And this function is used a lot, so those sign extensions add up.
Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):
add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]
However, overall balance is in negative direction:
add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
function old new delta
nfsd4_lock 3886 3959 +73
tipc_link_build_proto_msg 1096 1140 +44
mac80211_hwsim_new_radio 2776 2808 +32
tipc_mon_rcv 1032 1058 +26
svcauth_gss_legacy_init 1413 1429 +16
tipc_bcbase_select_primary 379 392 +13
nfsd4_exchange_id 1247 1260 +13
nfsd4_setclientid_confirm 782 793 +11
...
put_client_renew_locked 494 480 -14
ip_set_sockfn_get 730 716 -14
geneve_sock_add 829 813 -16
nfsd4_sequence_done 721 703 -18
nlmclnt_lookup_host 708 686 -22
nfsd4_lockt 1085 1063 -22
nfs_get_client 1077 1050 -27
tcf_bpf_init 1106 1076 -30
nfsd4_encode_fattr 5997 5930 -67
Total: Before=154856051, After=154854321, chg -0.00%
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mostly simple overlapping changes.
For example, David Ahern's adjacency list revamp in 'net-next'
conflicted with an adjacency list traversal bug fix in 'net'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, GRO can do unlimited recursion through the gro_receive
handlers. This was fixed for tunneling protocols by limiting tunnel GRO
to one level with encap_mark, but both VLAN and TEB still have this
problem. Thus, the kernel is vulnerable to a stack overflow, if we
receive a packet composed entirely of VLAN headers.
This patch adds a recursion counter to the GRO layer to prevent stack
overflow. When a gro_receive function hits the recursion limit, GRO is
aborted for this skb and it is processed normally. This recursion
counter is put in the GRO CB, but could be turned into a percpu counter
if we run out of space in the CB.
Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
Fixes: CVE-2016-7039
Fixes: 9b174d88c2 ("net: Add Transparent Ethernet Bridging GRO support.")
Fixes: 66e5133f19 ("vlan: Add GRO support for non hardware accelerated vlan")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use sizeof variable instead of literal number to enhance the readability.
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
args.u.name_type is of type unsigned int and is always >= 0.
This fixes the following GCC warning:
net/8021q/vlan.c: In function ‘vlan_ioctl_handler’:
net/8021q/vlan.c:574:14: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The idea for type_check in dev_get_nest_level() was to count the number
of nested devices of the same type (currently, only macvlan or vlan
devices).
This prevented the false positive lockdep warning on configurations such
as:
eth0 <--- macvlan0 <--- vlan0 <--- macvlan1
However, this doesn't prevent a warning on a configuration such as:
eth0 <--- macvlan0 <--- vlan0
eth1 <--- vlan1 <--- macvlan1
In this case, all the locks end up with a nesting subclass of 1, so
lockdep thinks that there is still a deadlock:
- in the first case we have (macvlan_netdev_addr_lock_key, 1) and then
take (vlan_netdev_xmit_lock_key, 1)
- in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then
take (macvlan_netdev_addr_lock_key, 1)
By removing the linktype check in dev_get_nest_level() and always
incrementing the nesting depth, lockdep considers this configuration
valid.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MAC address of the physical interface is only copied to the VLAN
when it is first created, resulting in an inconsistency after MAC
address changes of only newly created VLANs having an up-to-date MAC.
The VLANs should continue inheriting the MAC address of the physical
interface until the VLAN MAC address is explicitly set to any value.
This allows IPv6 EUI64 addresses for the VLAN to reflect any changes
to the MAC of the physical interface and thus for DAD to behave as
expected.
Signed-off-by: Mike Manning <mmanning@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
vlan drivers lack proper propagation of gso_max_segs from
lower device.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently vlan device inherits unicast filtering flag from underlying
device. If underlying device doesn't support unicast filter, this will
put vlan device into promiscuous mode when it's stacked.
Tun on IFF_UNICAST_FLT on the vlan device in any case so that it does
not go into promiscuous mode needlessly. If underlying device does not
support unicast filtering, that device will enter promiscuous mode.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently vlan notifier handler will try to update all vlans
for a device when that device comes up. A problem occurs,
however, when the vlan device was set to promiscuous, but not
by the user (ex: a bridge). In that case, dev->gflags are
not updated. What results is that the lower device ends
up with an extra promiscuity count. Here are the
backtraces that prove this:
[62852.052179] [<ffffffff814fe248>] __dev_set_promiscuity+0x38/0x1e0
[62852.052186] [<ffffffff8160bcbb>] ? _raw_spin_unlock_bh+0x1b/0x40
[62852.052188] [<ffffffff814fe4be>] ? dev_set_rx_mode+0x2e/0x40
[62852.052190] [<ffffffff814fe694>] dev_set_promiscuity+0x24/0x50
[62852.052194] [<ffffffffa0324795>] vlan_dev_open+0xd5/0x1f0 [8021q]
[62852.052196] [<ffffffff814fe58f>] __dev_open+0xbf/0x140
[62852.052198] [<ffffffff814fe88d>] __dev_change_flags+0x9d/0x170
[62852.052200] [<ffffffff814fe989>] dev_change_flags+0x29/0x60
The above comes from the setting the vlan device to IFF_UP state.
[62852.053569] [<ffffffff814fe248>] __dev_set_promiscuity+0x38/0x1e0
[62852.053571] [<ffffffffa032459b>] ? vlan_dev_set_rx_mode+0x2b/0x30
[8021q]
[62852.053573] [<ffffffff814fe8d5>] __dev_change_flags+0xe5/0x170
[62852.053645] [<ffffffff814fe989>] dev_change_flags+0x29/0x60
[62852.053647] [<ffffffffa032334a>] vlan_device_event+0x18a/0x690
[8021q]
[62852.053649] [<ffffffff8161036c>] notifier_call_chain+0x4c/0x70
[62852.053651] [<ffffffff8109d456>] raw_notifier_call_chain+0x16/0x20
[62852.053653] [<ffffffff814f744d>] call_netdevice_notifiers+0x2d/0x60
[62852.053654] [<ffffffff814fe1a3>] __dev_notify_flags+0x33/0xa0
[62852.053656] [<ffffffff814fe9b2>] dev_change_flags+0x52/0x60
[62852.053657] [<ffffffff8150cd57>] do_setlink+0x397/0xa40
And this one comes from the notification code. What we end
up with is a vlan with promiscuity count of 1 and and a physical
device with a promiscuity count of 2. They should both have
a count 1.
To resolve this issue, vlan code can use dev_get_flags() api
which correctly masks promiscuity and allmulti flags.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a networking device is taken down that has a non-trivial number
of VLAN devices configured under it, we eat a full synchronize_net()
for every such VLAN device.
This is because of the call chain:
NETDEV_DOWN notifier
--> vlan_device_event()
--> dev_change_flags()
--> __dev_change_flags()
--> __dev_close()
--> __dev_close_many()
--> dev_deactivate_many()
--> synchronize_net()
This is kind of rediculous because we already have infrastructure for
batching doing operation X to a list of net devices so that we only
incur one sync.
So make use of that by exporting dev_close_many() and adjusting it's
interfaace so that the caller can fully manage the batch list. Use
this in vlan_device_event() and all the overhead goes away.
Reported-by: Salam Noureddine <noureddine@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similarly, vlan will create /proc/net/vlan/<dev>, so when we
create dev with name "config", it will confict with
/proc/net/vlan/config.
Reported-by: Stephane Chazelas <stephane.chazelas@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit dc8eaaa006.
vlan: Fix lockdep warning when vlan dev handle notification
Instead we use the new new API to find the lock subclass of
our vlan device. This way we can support configurations where
vlans are interspersed with other devices:
bond -> vlan -> macvlan -> vlan
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, if the card supports CTAG acceleration we do not
account for the vlan header even if we are configuring an
8021AD vlan. This may not be best since we'll do software
tagging for 8021AD which will cause data copy on skb head expansion
Configure the length based on available hw offload capabilities and
vlan protocol.
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use ether_addr_copy instead of memcpy(a, b, ETH_ALEN) to
save some cycles on arm and powerpc.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On netdev unregister we're removing also all of its sysfs-associated stuff,
including the sysfs symlinks that are controlled by netdev neighbour code.
Also, it's a subtle race condition - cause we can still access it after
unregistering.
Move the unlinking right before the unregistering to fix both.
CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise users might access it without being fully registered, as per
sysfs - it only inits in register_netdevice(), so is unusable till it is
called.
CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch cleanup 2 points for the usage of vlan_dev_priv(dev):
* In vlan_dev.c/vlan_dev_hard_header, we should use the var *vlan directly
after grabing the pointer at the beginning with
*vlan = vlan_dev_priv(dev);
when we need to access the fields of *vlan.
* In vlan.c/register_vlan_device, add the var *vlan pointer
struct vlan_dev_priv *vlan;
to cleanup the code to access the fields of vlan_dev_priv(new_dev).
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, bond_resend_igmp_join_requests() looks for vlans attached to
bonding device, bridge where bonding act as port manually. It does not
care of other scenarios, like stacked bonds or team device above. Make
this more generic and use netdev notifier to propagate the event to
upper devices and to actually call ip_mc_rejoin_groups().
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far, only net_device * could be passed along with netdevice notifier
event. This patch provides a possibility to pass custom structure
able to provide info that event listener needs to know.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
v2->v3: fix typo on simeth
shortened dev_getter
shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the encapsulation protocol value a property of VLAN devices and change
the device lookup functions to take the protocol value into account.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the rx_{add,kill}_vid callbacks to take a protocol argument in
preparation of 802.1ad support. The protocol argument used so far is
always htons(ETH_P_8021Q).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the hardware VLAN acceleration features to include "CTAG" to indicate
that they only support CTAGs. Follow up patches will introduce 802.1ad
server provider tagging (STAGs) and require the distinction for hardware not
supporting acclerating both.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
vlan_vid_del() could possibly free ->vlan_info after a RCU grace
period, however, we may still refer to the freed memory area
by 'grp' pointer. Found by code inspection.
This patch moves vlan_vid_del() as behind as possible.
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initial implementation of the Multiple VLAN Registration Protocol
(MVRP) from IEEE 802.1Q-2011, based on the existing implementation
of the GARP VLAN Registration Protocol (GVRP).
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of jumping aroung bugs that are easily fixed just don't let them in:
affected drivers should be either fixed or have NETIF_F_HW_VLAN_FILTER
removed from advertised features.
Quick grep in drivers/net shows two drivers that have NETIF_F_HW_VLAN_FILTER
but not ndo_vlan_rx_add/kill_vid(), but those are false-positives (features
are commented out).
OTOH two drivers have ndo_vlan_rx_add/kill_vid() implemented but don't
advertise NETIF_F_HW_VLAN_FILTER. Those are:
+ethernet/cisco/enic/enic_main.c
+ethernet/qlogic/qlcnic/qlcnic_main.c
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
This bug is observed on running FCoE over a VLAN device associated w/
a real device that has IFF_UNICAST_FLT set since FCoE would add unicast
address such as FLOGI MAC to the VLAN interface that FCoE is on. Since
currently, VLAN device is not inheriting the IFF_UNICAST_FLT flag from the
parent real device even though the real device is capable of doing unicast
filtering. This forces the VLAN device and its real device go to promiscuous
mode unnecessarily even the added address is actually being added to the
available unicast filter table in real device.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Cc: devel@open-fcoe.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
Allow the vlan ioctls:
SET_VLAN_INGRESS_PRIORITY_CMD
SET_VLAN_EGRESS_PRIORITY_CMD
SET_VLAN_FLAG_CMD
SET_VLAN_NAME_TYPE_CMD
ADD_VLAN_CMD
DEL_VLAN_CMD
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
#if defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)
can be replaced by
#if IS_ENABLED(CONFIG_FOO)
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
vlan_info might be present but still no vlan devices might be there.
That is in case of vlan0 automatically added.
So in that case, allow to change netdev type.
Reported-by: Jon Stanley <jstanley@rmrf.net>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
In driver reload test there is a memory leak.
The structure vlan_info was not freed when the driver was removed.
It was not released since the nr_vids var is one after last vlan was removed.
The nr_vids is one, since vlan zero is added to the interface when the interface
is being set, but the vlan zero is not deleted at unregister.
Fix - delete vlan zero when we unregister the device.
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new bool function ether_addr_equal to add
some clarity and reduce the likelihood for misuse
of compare_ether_addr for sorting.
Done via cocci script:
$ cat compare_ether_addr.cocci
@@
expression a,b;
@@
- !compare_ether_addr(a, b)
+ ether_addr_equal(a, b)
@@
expression a,b;
@@
- compare_ether_addr(a, b)
+ !ether_addr_equal(a, b)
@@
expression a,b;
@@
- !ether_addr_equal(a, b) == 0
+ ether_addr_equal(a, b)
@@
expression a,b;
@@
- !ether_addr_equal(a, b) != 0
+ !ether_addr_equal(a, b)
@@
expression a,b;
@@
- ether_addr_equal(a, b) == 0
+ !ether_addr_equal(a, b)
@@
expression a,b;
@@
- ether_addr_equal(a, b) != 0
+ ether_addr_equal(a, b)
@@
expression a,b;
@@
- !!ether_addr_equal(a, b)
+ ether_addr_equal(a, b)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows to keep track of vids needed to be in rx vlan filters of
devices even if they are used in bond/team etc.
vlan_info as well as vlan_group previously was, is allocated when first
vid is added and dealocated whan last vid is deleted.
vlan_group definition is moved to private header.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds wrapper for ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid
functions. Check for NETIF_F_HW_VLAN_FILTER feature is done in this
wrapper.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As this structure is priv, name it approprietely. Also for pointer to it
use name "vlan".
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When assigning a NULL value to an RCU protected pointer, no barrier
is needed. The rcu_assign_pointer, used to handle that but will soon
change to not handle the special case.
Convert all rcu_assign_pointer of NULL value.
//smpl
@@ expression P; @@
- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)
// </smpl>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the call to ndo_vlan_rx_register if the underlying
device doesn't have hardware support for VLAN.
Signed-off-by: Antoine Reversat <a.reversat@gmail.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Use the current logging style.
Add #define pr_fmt and remove embedded prefix from formats.
Not converting the current pr_<level> uses to netdev_<level>
because all the output here is nicely prefaced with "8021q: ".
Remove __func__ use from proc registration failure message.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The below patch removes vlan_buggyright and vlan_copyright from vlan_proto_init,
so that it prints out just the fullname of vlan and the version number.
before:
[ 30.438203] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
[ 30.441542] All bugs added by David S. Miller <davem@redhat.com>
after:
[ 31.513910] 802.1Q VLAN Support v1.8
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
CC: Joe Perches <joe@perches.com>
CC: David S. Miller <davem@davemloft.net>
CC: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At VLAN dismantle phase, unregister_vlan_dev() makes one
synchronize_net() call after vlan_group_set_device(grp, vlan_id, NULL).
This call can be safely removed because we are calling
unregister_netdevice_queue() to queue device for deletion, and this
process needs at least one rcu grace period to complete.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>