korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2025-01-07 22:34:18 +08:00

Author	SHA1	Message	Date
Veaceslav Falico	d3ab3ffd1d	bonding: use rlb_client_info->vlan_id instead of ->tag Store VID in ->vlan_id (if any), and remove the useless ->tag. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-03 22:02:32 -04:00
Veaceslav Falico	6f477d4201	bonding: remove bond_vlan_used() We're using it currently to verify if we have vlans before getting the tag from the skb we're about to send. It's useless because the vlan_get_tag() verifies if the skb has the tag (and returns an error if not), and we can receive tagged skbs only if we already have vlans. Plus, the current RCUed implementation is kind of useless anyway - we can remove the last vlan at the moment we return from the function. So remove the only usage of it and the whole function. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-03 22:02:32 -04:00
Veaceslav Falico	3e32582f7d	bonding: pr_debug instead of pr_warn in bond_arp_send_all They're simply annoying and will spam dmesg constantly if we hit them, so convert to pr_debug so that we still can access them in case of debugging. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 16:19:43 -04:00
Veaceslav Falico	e868b0c938	bonding: remove vlan_list/current_alb_vlan Currently there are no real users of vlan_list/current_alb_vlan, only the helpers which maintain them, so remove them. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 16:19:43 -04:00
Veaceslav Falico	5bf94b839a	bonding: make alb_send_learning_packets() use upper dev list Currently, if there are vlans on top of bond, alb_send_learning_packets() will never send LPs from the bond itself (i.e. untagged), which might leave untagged clients unupdated. Also, the 'circular vlan' logic (i.e. update only MAX_LP_BURST vlans at a time, and save the last vlan for the next update) is really suboptimal - in case of lots of vlans it will take a lot of time to update every vlan. It is also never called in any hot path and sends only a few small packets - thus the optimization by itself is useless. So remove the whole current_alb_vlan/MAX_LP_BURST logic from alb_send_learning_packets(). Instead, we'll first send a packet untagged and then traverse the upper dev list, sending a tagged packet for each vlan found. Also, remove the MAX_LP_BURST define - we already don't need it. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 16:19:43 -04:00
Veaceslav Falico	7aa6498123	bonding: split alb_send_learning_packets() Create alb_send_lp_vid(), which will handle the skb/lp creation, vlan tagging and sending, and use it in alb_send_learning_packets(). This way all the logic remains in alb_send_learning_packets(), which becomes a lot cleaner and easier to understand. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 16:19:43 -04:00
Veaceslav Falico	a59d3d21ea	bonding: use vlan_uses_dev() in __bond_release_one() We always hold the rtnl_lock() in __bond_release_one(), so use vlan_uses_dev() instead of bond_vlan_used(). CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 16:19:43 -04:00
Veaceslav Falico	50223ce4be	bonding: convert bond_has_this_ip() to use upper devices Currently, bond_has_this_ip() is aware only of vlan upper devices, and thus will return false if the address is associated with the upper bridge or any other device, and thus will break the arp logic. Fix this by using the upper device list. For every upper device we verify if the address associated with it is our address, and if yes - return true. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 16:19:42 -04:00
Veaceslav Falico	27bc11e638	bonding: make bond_arp_send_all use upper device list Currently, bond_arp_send_all() is aware only of vlans, which breaks configurations like bond <- bridge (or any other 'upper' device) with IP (which is quite a common scenario for virt setups). To fix this we convert the bond_arp_send_all() to first verify if the rt device is the bond itself, and if not - to go through its list of upper vlans and their respective upper devices (if the vlan's upper device matches - tag the packet), if still not found - go through all of our upper list devices to see if any of them match the route device for the target. If the match is a vlan device - we also save its vlan_id and tag it in bond_arp_send(). Also, clean the function a bit to be more readable. CC: Vlad Yasevich <vyasevic@redhat.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 16:19:42 -04:00
Veaceslav Falico	c752af2c55	bonding: use netdev_upper list in bond_vlan_used Convert bond_vlan_used() to traverse the upper device list to see if we have any vlans above us. It's protected by rcu, and in case we are holding rtnl_lock we should call vlan_uses_dev() instead - it's faster. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 16:19:42 -04:00
Wei Yongjun	b8e2fde466	bonding: fix error return code in bond_enslave() Fix to return a negative error code in the add bond vlan ids error handling case instead of 0, as done elsewhere in this function. Introduced by commit `1ff412ad77`. (bonding: change the bond's vlan syncing functions with the standard ones) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-25 18:37:35 -04:00
nikolay@redhat.com	b20903f2a9	bonding: unwind on bond_add_vlan failure In case of bond_add_vlan() failure currently we'll have the vlan's refcnt bumped up in all slaves, but it will never go down because it failed to get added to the bond, so properly unwind the added vlan if bond_add_vlan fails. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-08 22:31:21 -07:00
nikolay@redhat.com	1ff412ad77	bonding: change the bond's vlan syncing functions with the standard ones Now we have vlan_vids_add/del_by_dev() which serve the same purpose as bond's bond_add/del_vlans_on_slave() with the good side effect of reverting the changes if one of the additions fails. There's only 1 change in the behaviour of enslave: if adding of the vlans to the slave fails, we'll fail the enslaving because otherwise we might delete some vlan that wasn't added by the bonding. The only way this may happen is with ENOMEM currently, so we're in trouble anyway. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-08 22:31:21 -07:00
Veaceslav Falico	7864a1adf7	bonding: remove locking from bond_set_rx_mode() We're already protected by RTNL lock, so nothing can happen to bond/its slaves, and thus the locking is useless here (both bond->lock and bond->curr_active_slave). Also, add ASSERT_RTNL() both to bond_set_rx_mode() and bond_hw_addr_swap() to catch possible uses of it without RTNL locking. This patch also saves us from a lockdep false-positive in bond_set_rx_mode() vs bond_hw_addr_swap(). CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-05 12:22:53 -07:00
Veaceslav Falico	e7f63f1dc4	bonding: add bond_time_in_interval() and use it for time comparison Currently we use a lot of time comparison math for arp_interval comparisons, which are sometimes quite hard to read and understand. All the time comparisons have one pattern: (time - arp_interval_jiffies) <= jiffies <= (time + mod * arp_interval_jiffies + arp_interval_jiffies/2) Introduce a new helper - bond_time_in_interval(), which will do the math in one place and, thus, will clean up the logical code. This helper introduces a bit of overhead (by always calculating the jiffies from arp_interval), however it's really not visible, considering that functions using it usually run once in arp_interval milliseconds. There are several lines slightly over 80 chars, however breaking them would result in more hard-to-read code than several character after the 80 mark. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-05 12:19:45 -07:00
Veaceslav Falico	def4460cdb	bonding: call slave_last_rx() only once per slave Simple cleanup to not call slave_last_rx() on every time function. It won't give any measurable boost - but looks cleaner and easier to understand. There are no time-consuming functions in between these calls, so it's safe to call it in the beginning only once. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-05 12:19:44 -07:00
Veaceslav Falico	9918d5bf32	bonding: modify only neigh_parms owned by us Otherwise, on neighbour creation, bond_neigh_init() will be called with a foreign netdev. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-02 15:44:23 -07:00
nikolay@redhat.com	278b208375	bonding: initial RCU conversion This patch does the initial bonding conversion to RCU. After it the following modes are protected by RCU alone: roundrobin, active-backup, broadcast and xor. Modes ALB/TLB and 3ad still acquire bond->lock for reading, and will be dealt with later. curr_active_slave needs to be dereferenced via rcu in the converted modes because the only thing protecting the slave after this patch is rcu_read_lock, so we need the proper barrier for weakly ordered archs and to make sure we don't have stale pointer. It's not tagged with __rcu yet because there's still work to be done to remove the curr_slave_lock, so sparse will complain when rcu_assign_pointer and rcu_dereference are used, but the alternative to use rcu_dereference_protected would've created much bigger code churn which is more difficult to test and review. That will be converted in time. 1. Active-backup mode 1.1 Perf recording while doing iperf -P 4 - old bonding: iperf spent 0.55% in bonding, system spent 0.29% CPU in bonding - new bonding: iperf spent 0.29% in bonding, system spent 0.15% CPU in bonding 1.2. Bandwidth measurements - old bonding: 16.1 gbps consistently - new bonding: 17.5 gbps consistently 2. Round-robin mode 2.1 Perf recording while doing iperf -P 4 - old bonding: iperf spent 0.51% in bonding, system spent 0.24% CPU in bonding - new bonding: iperf spent 0.16% in bonding, system spent 0.11% CPU in bonding 2.2 Bandwidth measurements - old bonding: 8 gbps (variable due to packet reorderings) - new bonding: 10 gbps (variable due to packet reorderings) Of course the latency has improved in all converted modes, and moreover while doing enslave/release (since it doesn't affect tx anymore). Also I've stress tested all modes doing enslave/release in a loop while transmitting traffic. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 16:42:02 -07:00
Nikolay Aleksandrov	15077228ca	bonding: factor out slave id tx code and simplify xmit paths I factored out the tx xmit code which relies on slave id in bond_xmit_slave_id. It is global because later it can be used also in 3ad mode xmit. Unnecessary obvious comments are removed. Active-backup mode is simplified because bond_dev_queue_xmit always consumes the skb. bond_xmit_xor becomes one line because of bond_xmit_slave_id. bond_for_each_slave_from is not used in bond_xmit_slave_id because later when RCU is used we can avoid important race condition by using standard rculist routines. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 16:42:02 -07:00
Nikolay Aleksandrov	78a646ced8	bonding: simplify broadcast_xmit function We don't need to start from the curr_active_slave as the frame will be sent to all eligible slaves anyway, so we remove the unnecessary local variables, checks and comments, and make it use the standard list API. This has the nice side-effect that later when it's converted to RCU a race condition will be avoided which could lead to double packet tx. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 16:42:02 -07:00
nikolay@redhat.com	71bc3b2dc5	bonding: remove unnecessary read_locks of curr_slave_lock In all the cases we already hold bond->lock for reading, so the slave can't get away and the check != NULL is sufficient. curr_active_slave can still change after the read_lock is unlocked prior to use of the dereferenced value, so there's no need for it. It either contains a valid slave which we use (and can't get away), or it is NULL which is checked. In some places the read_lock of curr_slave_lock was left because we need it not to change while performing some action (e.g. syncing current active slave's addresses, sending ARP requests through the active slave) such cases will be dealt with individually while converting to RCU. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 16:42:02 -07:00
nikolay@redhat.com	dec1e90e8c	bonding: convert to list API and replace bond's custom list This patch aims to remove struct bonding's first_slave and struct slave's next and prev pointers, and replace them with the standard Linux list API. The old macros are converted to list API as well and some new primitives are available now. The checks if there're slaves that used slave_cnt have been replaced by the list_empty macro. Also a few small style fixes, changing longest -> shortest line in local variable declarations, leaving an empty line before return and removing unnecessary brackets. This is the first step to gradual RCU conversion. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 16:42:01 -07:00
Nikolay Aleksandrov	4beac0293f	bonding: fix system hang due to fast igmp timer rescheduling After commit `4aa5dee4d9` ("net: convert resend IGMP to notifier event") we try to acquire rtnl in bond_resend_igmp_join_requests but it can be scheduled with rtnl already held (e.g. when bond_change_active_slave is called with rtnl) causing a loop of immediate reschedules + calls because rtnl_trylock fails each time since it's being already held. For me this issue leads to system hangs very easy: modprobe bonding; ifconfig bond0 up; ifenslave bond0 eth0; rmmod bonding; The fix is to introduce a small (1 jiffy) delay which is enough for the sections holding rtnl to finish without putting any strain on the system. Also adjust the timer in bond_change_active_slave to be 1 jiffy, since most of the time it's called with rtnl already held. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 15:52:49 -07:00
nikolay@redhat.com	dcfe8048de	bonding: remove bond_resend_igmp_join_requests read_unlock leftover After commit `4aa5dee4d9` ("net: convert resend IGMP to notifier event") we have 1 read_unlock in bond_resend_igmp_join_requests which isn't paired with a read_lock because it's removed by that commit. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-28 01:08:04 -07:00
stephen hemminger	10eccb46b5	bond: cleanup netpoll code This started out with fixing a sparse warning, then I realized that the wrapper function bond_netpoll_info could just be removed by rolling it into the enable code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-26 15:24:47 -07:00
Wang Sheng-Hui	f52809483c	bonding: use pre-defined macro in bond_mode_name instead of magic number 0 We have BOND_MODE_ROUNDROBIN pre-defined as 0, and it's the lowest mode number. Use it to check the arg lower bound instead of magic number 0 in bond_mode_name. Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-26 13:53:49 -07:00
dingtianhong	b07ea07bd0	bonding: Fixed up a error "do not initialise statics to 0 or NULL" in bond_main.c The error is found by the checkpatch.pl tools. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:45:23 -07:00
dingtianhong	9402b746e7	bonding: add rtnl protection for bonding_store_fail_over_mac We need rtnl protection while reading slave_cnt and updating the .fail_over_mac, and it also follows the logic "don't change anything slave-related without rtnl". :) Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:45:23 -07:00
dingtianhong	38c4916a78	bonding: bond_sysfs.c checkpatch cleanup net/bonding/bond_sysfs.c:1302: ERROR: else should follow close brace '}' net/bonding/bond_sysfs.c:1314: ERROR: else should follow close brace '}' Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:45:23 -07:00
dingtianhong	c4cdef9b71	bonding: don't call slave_xxx_netpoll under spinlocks The slave_xxx_netpoll will call synchronize_rcu_bh(), so the function may schedule and sleep; it shouldn't be called under spinlocks. bond_netpoll_setup() and bond_netpoll_cleanup() are always protected by rtnl lock, so there is no need to take the read lock, as the slave list can't be changed outside rtnl lock. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:45:23 -07:00
Jiri Pirko	4aa5dee4d9	net: convert resend IGMP to notifier event Until now, bond_resend_igmp_join_requests() looks for vlans attached to bonding device, bridge where bonding act as port manually. It does not care of other scenarios, like stacked bonds or team device above. Make this more generic and use netdev notifier to propagate the event to upper devices and to actually call ip_mc_rejoin_groups(). Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-23 16:52:47 -07:00
David S. Miller	0c1072ae02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/freescale/fec_main.c drivers/net/ethernet/renesas/sh_eth.c net/ipv4/gre.c The GRE conflict is between a bug fix (kfree_skb --> kfree_skb_list) and the splitting of the gre.c code into separate files. The FEC conflict was two sets of changes adding ethtool support code in an "!CONFIG_M5272" CPP protected block. Finally the sh_eth.c conflict was between one commit adding bits set in the .eesr_err_check mask whilst another commit removed the .tx_error_check member and assignments. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 14:55:13 -07:00
Nikolay Aleksandrov	008aebde9b	bonding: combine pr_debugs in bond_set_dev_addr into one Combine the multiple pr_debugs in bond_set_dev_addr into one pr_debug. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-29 12:37:08 -07:00
nikolay@redhat.com	ae0d67505c	bonding: when cloning a MAC use NET_ADDR_STOLEN A simple semantic change, when a slave's MAC is cloned by the bond master then set addr_assign_type to NET_ADDR_STOLEN instead of NET_ADDR_SET. Also use bond_set_dev_addr() in BOND_FOM_ACTIVE mode to change the bond's MAC address because the assign_type has to be set properly. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-27 22:50:15 -07:00
nikolay@redhat.com	97a1e6396b	bonding: remove unnecessary dev_addr_from_first member In struct bonding there's a member called dev_addr_from_first which is used to denote when the bond dev should clone the first slave's MAC address but since we have netdev's addr_assign_type variable that is not necessary. We clone the first slave's MAC each time we have a random MAC set to the bond device. This has the nice side-effect of also fixing an inconsistency - when the MAC address of the bond dev is set after its creation, but prior to having slaves, it's not kept and the first slave's MAC is cloned. The only way to keep the MAC was to create the bond device with the MAC address set (e.g. through ip link). In all cases if the bond device is left without any slaves - its MAC gets reset to a random one as before. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-27 22:50:15 -07:00
nikolay@redhat.com	8d2ada77f8	bonding: remove unnecessary setup_by_slave member We have a member called setup_by_slave in struct bonding to denote if the bond dev has different type than ARPHRD_ETHER, but that is already denoted in bond's netdev type variable if it was setup by the slave, so use that instead of the member. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-27 22:50:15 -07:00
Veaceslav Falico	8599b52e14	bonding: add an option to fail when any of arp_ip_target is inaccessible Currently, we fail only when all of the ips in arp_ip_target are gone. However, in some situations we might need to fail if even one host from arp_ip_target becomes unavailable. All situations, obviously, rely on the idea that we need completely functional network, with all interfaces/addresses working correctly. One real world example might be: vlans on top on bond (hybrid port). If bond and vlans have ips assigned and we have their peers monitored via arp_ip_target - in case of switch misconfiguration (trunk/access port), slave driver malfunction or tagged/untagged traffic dropped on the way - we will be able to switch to another slave. Though any other configuration needs that if we need to have access to all arp_ip_targets. This patch adds this possibility by adding a new parameter - arp_all_targets (both as a module parameter and as a sysfs knob). It can be set to: 0 or any (the default) - which works exactly as it's working now - the slave is up if any of the arp_ip_targets are up. 1 or all - the slave is up if all of the arp_ip_targets are up. This parameter can be changed on the fly (via sysfs), and requires the mode to be active-backup and arp_validate to be enabled (it obeys the arp_validate config on which slaves to validate). Internally it's done through: 1) Add target_last_arp_rx[BOND_MAX_ARP_TARGETS] array to slave struct. It's an array of jiffies, meaning that slave->target_last_arp_rx[i] is the last time we've received arp from bond->params.arp_targets[i] on this slave. 2) If we successfully validate an arp from bond->params.arp_targets[i] in bond_validate_arp() - update the slave->target_last_arp_rx[i] with the current jiffies value. 3) When getting slave's last_rx via slave_last_rx(), we return the oldest time when we've received an arp from any address in bond->params.arp_targets[]. 
If the value of arp_all_targets == 0 - we still work the same way as before. Also, update the documentation to reflect the new parameter. v3->v4: Kill the forgotten rtnl_unlock(), rephrase the documentation part to be more clear, don't fail setting arp_all_targets if arp_validate is not set - it has no effect anyway but can be easier to set up. Also, print a warning if the last arp_ip_target is removed while the arp_interval is on, but not the arp_validate. v2->v3: Use _bh spinlock, remove useless rtnl_lock() and use jiffies for new arp_ip_target last arp, instead of slave_last_rx(). On bond_enslave(), use the same initialization value for target_last_arp_rx[] as is used for the default last_arp_rx, to avoid useless interface flaps. Also, instead of failing to remove the last arp_ip_target just print a warning - otherwise it might break existing scripts. v1->v2: Correctly handle adding/removing hosts in arp_ip_target - we need to shift/initialize all slave's target_last_arp_rx. Also, don't fail module loading on arp_all_targets misconfiguration, just disable it, and some minor style fixes. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	aeea64ac71	bonding: don't trust arp requests unless active slave really works Currently, if we receive any arp packet on a backup slave in active-backup mode and arp_validate enabled, we suppose that it's an arp request, swap source/target ip and try to validate it. This optimization gives us virtually no downtime in the most common situation (active and backup slaves are in the same broadcast domain and the active slave failed). However, if we can't reach the arp_ip_target(s), we end up in an endless loop of reselecting slaves, because we receive our arp requests, sent by the active slave, and think that backup slaves are up, thus selecting them as active and, again, sending arp requests, which fool our backup slaves. Fix this by not validating the swapped arp packets if the current active slave didn't receive any arp reply after it was selected as active. This way we will only accept arp requests if we know that the current active slave can actually reach arp_ip_target. v3->v4: Obey 80 lines and make checkpatch.pl happy, per Sergei's suggestion. v1->v3: No change. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	2c14610210	bonding: don't validate arp if we don't have to Currently, we validate all the incoming arps if arp_validate not 0. However, we don't have to validate backup slaves if arp_validate == active and vice versa, so return early in bond_arp_rcv() in these cases. It works correctly now because we verify arp_validate in slave_last_rx(), however we're just doing useless work in bond_arp_rcv(). Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	0afee4e8b9	bonding: don't add duplicate targets to arp_ip_target Print a warning and skip them. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	87a7b84b58	bonding: add helper function bond_get_targets_ip(targets, ip) Add function bond_get_targets_ip(targets, ip) which searches through targets array of ips (arp_targets) and returns the position of first match. If ip == 0, returns the first free slot. On failure to find the ip or free slot, return -1. Use it to verify if the arp we've received is valid and in sysfs. v1->v2: Fix "[2/6] bonding: add helper function bond_get_targets_ip(targets, ip)", per Nikolay's advice, to verify if source ip != 0.0.0.0, otherwise we might update 'null' arp_ip_targets' last_rx. Also, address style. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:37 -07:00
Nikolay Aleksandrov	db4e9b2b98	bonding: fix slave speed reporting in bond_miimon_commit When we have BOND_LINK_UP the speed is reported unconditionally with %u format although it can be SPEED_UNKNOWN (-1). After this patch it returns 0 in that case in an attempt to keep the existing scripts happy. One line is intentionally left 81 chars because it gets ugly if broken. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 00:04:55 -07:00
Veaceslav Falico	b88ec38d13	bonding: trivial: make alb use bond_slave_has_mac() Also, cleanup bond_alb_handle_active_change() from 2 identical ifs. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 22:20:08 -07:00
David S. Miller	d98cae64e4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/wireless/ath/ath9k/Kconfig drivers/net/xen-netback/netback.c net/batman-adv/bat_iv_ogm.c net/wireless/nl80211.c The ath9k Kconfig conflict was a change of a Kconfig option name right next to the deletion of another option. The xen-netback conflict was overlapping changes involving the handling of the notify list in xen_netbk_rx_action(). Batman conflict resolution provided by Antonio Quartulli, basically keep everything in both conflict hunks. The nl80211 conflict is a little more involved. In 'net' we added a dynamic memory allocation to nl80211_dump_wiphy() to fix a race that Linus reported. Meanwhile in 'net-next' the handlers were converted to use pre and post doit handlers which use a flag to determine whether to hold the RTNL mutex around the operation. However, the dump handlers do not use this logic. Instead they have to explicitly do the locking. There were apparent bugs in the conversion of nl80211_dump_wiphy() in that we were not dropping the RTNL mutex in all the return paths, and it seems we very much should be doing so. So I fixed that whilst handling the overlapping changes. To simplify the initial returns, I take the RTNL mutex after we try to allocate 'tb'. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 16:49:39 -07:00
Veaceslav Falico	cedb743f3e	bonding: don't call alb_set_slave_mac_addr() while atomic alb_set_slave_mac_addr() sets the mac address in alb mode via dev_set_mac_address(), which might sleep. It's called from alb_handle_addr_collision_on_attach() in atomic context (under read_lock(bond->lock)), thus triggering a bug. Fix this by moving the lock inside alb_handle_addr_collision_on_attach(). v1->v2: As Nikolay Aleksandrov noticed, we can drop the bond->lock completely. Also, use bond_slave_has_mac(), when possible. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 16:27:24 -07:00
Nikolay Aleksandrov	4f5474e7fd	bonding: fix igmp_retrans type and two related races First the type of igmp_retrans (which is the actual counter of igmp_resend parameter) is changed to u8 to be able to store values up to 255 (as per documentation). There are two races that were hidden there and which are easy to trigger after the previous fix, the first is between bond_resend_igmp_join_requests and bond_change_active_slave where igmp_retrans is set and can be altered by the periodic. The second race condition is between multiple running instances of the periodic (upon execution it can be scheduled again for immediate execution which can cause the counter to go < 0 which in the unsigned case leads to unnecessary igmp retransmissions). Since in bond_change_active_slave bond->lock is held for reading and curr_slave_lock for writing, we use curr_slave_lock for mutual exclusion. We can't drop them as there're cases where RTNL is not held when bond_change_active_slave is called. RCU is unlocked in bond_resend_igmp_join_requests before getting curr_slave_lock since we don't need it there and it's pointless to delay. The decrement is moved inside the "if" block because if we decrement unconditionally there's still a possibility for a race condition although it is much more difficult to hit (many changes have to happen in a very short period in order to trigger) which in the case of 3 parallel running instances of this function and igmp_retrans == 1 (with check bond->igmp_retrans-- > 1) is: f1 passes, doesn't re-schedule, but decrements - igmp_retrans = 0 f2 then passes, doesn't re-schedule, but decrements - igmp_retrans = 255 f3 does the unnecessary retransmissions. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 02:33:37 -07:00
Nikolay Aleksandrov	b8fad459f9	bonding: reset master mac on first enslave failure If the bond device is supposed to get the first slave's MAC address and the first enslavement fails then we need to reset the master's MAC otherwise it will stay the same as the failed slave device. We do it after err_undo_flags since that is the first place where the MAC can be changed and we check if it should've been the first slave and if the bond's MAC was set to it because that err place is used by multiple locations prior to changing the master's MAC address. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 02:33:37 -07:00
Jay Vosburgh	1b5acd2923	bonding: disallow change of MAC if fail_over_mac enabled Currently, if fail_over_mac is set to active, then attempts to change the MAC of the bond itself silently fail. However, if fail_over_mac is set to follow, changes are permitted. Permitting the bond's MAC to change with fail_over_mac=follow will disrupt the follow functionality, which normally controls the assignment of MAC address to the bond and its slaves, and can cause multiple ports to be assigned the same MAC address, which will interfere with the functioning of the device (where the device here is a virtualization-aware card for s390, qeth). Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 15:05:51 -07:00
Jay Vosburgh	303d1cbf61	bonding: Convert hw addr handling to sync/unsync, support ucast addresses This patch converts bonding to use the dev_uc/mc_sync and dev_uc/mc_sync_multiple functions for updating the hardware addresses of bonding slaves. The existing functions to add or remove addresses are removed, and their functionality is replaced with calls to dev_mc_sync or dev_mc_sync_multiple, depending upon the bonding mode. Calls to dev_uc_sync and dev_uc_sync_multiple are also added, so that unicast addresses added to a bond will be properly synced with its slaves. Various functions are renamed to better reflect the new situation, and relevant comments are updated. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 15:05:51 -07:00
Veaceslav Falico	d6641ccff9	bonding: trivial: update the comments to reflect the reality Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:57:23 -07:00
Veaceslav Falico	43547ea669	bonding: trivial: remove unused parameter from alb_swap_mac_addr() After `b924551` ("bonding: fix enslaving in alb mode when link down") we don't need the bond parameter in alb_swap_mac_addr(), so remove it. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:57:23 -07:00
Jiri Pirko	351638e7de	net: pass info struct via netdevice notifier So far, only net_device * could be passed along with netdevice notifier event. This patch provides a possibility to pass custom structure able to provide info that event listener needs to know. Signed-off-by: Jiri Pirko <jiri@resnulli.us> v2->v3: fix typo on simeth shortened dev_getter shortened notifier_info struct name v1->v2: fix notifier_call parameter in call_netdevice_notifier() Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 13:11:01 -07:00
Nikolay Aleksandrov	53edee2cfb	bonding: allow xmit hash policy change while bond dev is up Since the xmit_hash_policy pointer is always valid and not dependent on anything, we can change it while the bond device is up and running. The only downside would be the out of order packets but that is a small price to pay. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-27 23:27:14 -07:00
nikolay@redhat.com	318debd897	bonding: fix multiple 3ad mode sysfs race conditions When bond_3ad_get_active_agg_info() is used in all show_ad_ functions it is not protected against slave manipulation and since it walks over the slaves and uses them, this can easily result in NULL pointer dereference or use of freed memory. Both the new wrapper and the internal function are exported to the bonding as they're needed in different places. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 23:25:49 -07:00
nikolay@redhat.com	5a5c5fd48e	bonding: arp_ip_count and arp_targets can be wrong When getting arp_ip_targets if we encounter a bad IP, arp_ip_count still gets increased and all the targets after the wrong one will not be probed if arp_interval is enabled after that (unless a new IP target is added through sysfs) because of the zero entry, in this case reading arp_ip_target through sysfs will show valid targets even if there's a zero entry. Example: 1.2.3.4,4.5.6.7,blah,5.6.7.8 When retrieving the list from arp_ip_target the output would be: 1.2.3.4,4.5.6.7,5.6.7.8 but there will be a 0 entry between 4.5.6.7 and 5.6.7.8. If arp_interval is enabled after that 5.6.7.8 will never be checked because of that. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 23:25:49 -07:00
nikolay@redhat.com	acca2674a7	bonding: replace %x with %pI4 for IPv4 addresses There are a few pr_debug() places that can print the IPv4 address in dotted decimal format instead, which is more helpful. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 23:25:49 -07:00
nikolay@redhat.com	ea6836dd7e	bonding: fix set mode race conditions Changing the mode without any locking can result in multiple races (e.g. upping a bond, enslaving/releasing). Depending on which race is hit the impact can vary from inconsistent bond state to kernel crash. Use RTNL to synchronize the mode setting with the dangerous races. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 23:25:49 -07:00
Eric Dumazet	b0ce3508b2	bonding: allow TSO being set on bonding master In some situations, we need to disable TSO on bonding slaves. bonding device automatically unset TSO in bond_fix_features(), and performance is not good because : 1) We consume more cpu cycles. 2) GSO segmentation has some bugs leading to out of order TCP packets if this segmentation is done before virtual device. This particular problem will be addressed in a separate patch. This patch allows TSO being set/unset on the bonding master, so that GSO segmentation is done after bonding layer. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Michał Mirosław <mirqus@gmail.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Maciej Żenczykowski <maze@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-16 15:02:01 -07:00
Linus Torvalds	20b4fb4852	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...	2013-05-01 17:51:54 -07:00
David S. Miller	58717686cf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/net/ethernet/emulex/benet/be.h include/net/tcp.h net/mac802154/mac802154.h Most conflicts were minor overlapping stuff. The be2net driver brought in some fixes that added __vlan_put_tag calls, which in net-next take an additional argument. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-30 03:55:20 -04:00
nikolay@redhat.com	c6cdcf6d82	bonding: fix locking in enslave failure path In commit `3c5913b53f` ("bonding: primary_slave & curr_active_slave are not cleaned on enslave failure") I didn't account for the use of curr_active_slave without curr_slave_lock and since there are such users, we should hold bond->lock for writing while setting it to NULL (in the NULL case we don't need the curr_slave_lock). Keeping the bond lock as to avoid the extra release/acquire cycle. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-25 04:03:21 -04:00
David S. Miller	6e0895c2ea	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be_main.c drivers/net/ethernet/intel/igb/igb_main.c drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c include/net/scm.h net/batman-adv/routing.c net/ipv4/tcp_input.c The e{uid,gid} --> {uid,gid} credentials fix conflicted with the cleanup in net-next to now pass cred structs around. The be2net driver had a bug fix in 'net' that overlapped with the VLAN interface changes by Patrick McHardy in net-next. An IGB conflict existed because in 'net' the build_skb() support was reverted, and in 'net-next' there was a comment style fix within that code. Several batman-adv conflicts were resolved by making sure that all calls to batadv_is_my_mac() are changed to have a new bat_priv first argument. Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO rewrite in 'net-next', mostly overlapping changes. Thanks to Stephen Rothwell and Antonio Quartulli for help with several of these merge resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-22 20:32:51 -04:00
nikolay@redhat.com	d632ce989c	bonding: in bond_mc_swap() bond's mc addr list is walked without lock Use netif_addr_lock_bh() to acquire the appropriate lock before walking. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 17:48:19 -04:00
nikolay@redhat.com	fc7a72ac86	bonding: disable netpoll on enslave failure slave_disable_netpoll() is not called upon enslave failure which would lead to a memory leak. Call slave_disable_netpoll() after err_detach as that's the first error path after enabling netpoll on that slave. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 17:48:19 -04:00
nikolay@redhat.com	3c5913b53f	bonding: primary_slave & curr_active_slave are not cleaned on enslave failure On enslave failure primary_slave can point to new_slave which is to be freed, and the same applies to curr_active_slave. So check if this is the case and clean up properly after err_detach because that's the first error code path after they're set. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 17:48:19 -04:00
nikolay@redhat.com	a506e7b479	bonding: vlans don't get deleted on enslave failure The main problem is with vid refcount which only gets bumped up. Delete the vlans after err_detach as that's the first error path after the vlans are added. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 17:48:18 -04:00
nikolay@redhat.com	25e40305d4	bonding: mc addresses don't get deleted on enslave failure Add bond_mc_list_flush() after err_detach as that's the first error path after the addresses are added. The main issue is the mc addresses' refcount which only gets bumped up. v2: update log message and don't move code unnecessarily Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 17:48:18 -04:00
Andy Gospodarek	bb5b052f75	bond: add support to read speed and duplex via ethtool This patch adds support for the get_settings ethtool op to the bonding driver. This was motivated by users who wanted to get the speed of the bond and compare that against throughput to understand utilization. The behavior before this patch was added was problematic when computing line utilization after trying to get link-speed and throughput via SNMP. Output from ethtool looks like this for a round-robin bond: Settings for bond0: Supported ports: [ ] Supported link modes: Not reported Supported pause frame use: No Supports auto-negotiation: No Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Speed: 11000Mb/s Duplex: Full Port: Other PHYAD: 0 Transceiver: internal Auto-negotiation: off MDI-X: Unknown Link detected: yes I tested this and verified it works as expected. A test was also done on a version backported to an older kernel and it worked well there. v2: Switch to using ethtool_cmd_speed_set to set speed, added check to SLAVE_IS_OK for each slave in bond, dropped mode-specific calculations as they were not needed, and set port type to 'Other.' v3: Fix useless assignment and checkpatch warning. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 16:39:50 -04:00
Patrick McHardy	86a9bad3ab	net: vlan: add protocol argument to packet tagging functions Add a protocol argument to the VLAN packet tagging functions. In case of HW tagging, we need that protocol available in the ndo_start_xmit functions, so it is stored in a new field in the skb. The new field fits into a hole (on 64 bit) and doesn't increase the skb's size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 14:46:06 -04:00
Patrick McHardy	1fd9b1fc31	net: vlan: prepare for 802.1ad support Make the encapsulation protocol value a property of VLAN devices and change the device lookup functions to take the protocol value into account. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 14:45:27 -04:00
Patrick McHardy	80d5c3689b	net: vlan: prepare for 802.1ad VLAN filtering offload Change the rx_{add,kill}_vid callbacks to take a protocol argument in preparation of 802.1ad support. The protocol argument used so far is always htons(ETH_P_8021Q). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 14:45:27 -04:00
Patrick McHardy	f646968f8f	net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_* Rename the hardware VLAN acceleration features to include "CTAG" to indicate that they only support CTAGs. Follow up patches will introduce 802.1ad service provider tagging (STAGs) and require the distinction for hardware not supporting accelerating both. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 14:45:26 -04:00
Eric Dumazet	4394542ca4	bonding: fix l23 and l34 load balancing in forwarding path Since commit `6b923cb718` (bonding: support for IPv6 transmit hashing) bonding doesn't properly hash traffic in forwarding setups. Vitaly V. Bursov diagnosed that skb_network_header_len() returned 0 in this case. More generally, the transport header might not be in the skb head. Use pskb_may_pull() & skb_header_pointer() to get it right, and use proto_ports_offset() in bond_xmit_hash_policy_l34() to get support for more protocols than TCP and UDP. Reported-by: Vitaly V. Bursov <vitalyb@telenet.dn.ua> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: John Eaglesham <linux@8192.net> Tested-by: Vitaly V. Bursov <vitalyb@telenet.dn.ua> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-18 15:07:29 -04:00
nikolay@redhat.com	b6a5a7b9a5	bonding: IFF_BONDING is not stripped on enslave failure While enslaving a new device and after IFF_BONDING flag is set, in case of failure it is not stripped from the device's priv_flags while cleaning up, which could lead to other problems. Cleaning at err_close because the flag is set after dev_open(). v2: no change Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-11 16:01:47 -04:00
nikolay@redhat.com	6101391d4a	bonding: fix netdev event NULL pointer dereference In commit `471cb5a33d` ("bonding: remove usage of dev->master") a bug was introduced which causes a NULL pointer dereference. If a bond device is in mode 6 (ALB) and a slave is added it will dereference a NULL pointer in bond_slave_netdev_event(). This is because in bond_enslave we have bond_alb_init_slave() which changes the MAC address of the slave and causes a NETDEV_CHANGEADDR. Then we have in bond_slave_netdev_event(): struct slave *slave = bond_slave_get_rtnl(slave_dev); struct bonding *bond = slave->bond; bond_slave_get_rtnl() dereferences slave_dev->rx_handler_data which at that time is NULL since netdev_rx_handler_register() is called later. This is fixed by checking if slave is NULL before dereferencing it. v2: Comment style changed. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-11 16:01:47 -04:00
Al Viro	d9dda78bad	procfs: new helper - PDE_DATA(inode) The only part of proc_dir_entry the code outside of fs/proc really cares about is PDE(inode)->data. Provide a helper for that; static inline for now, eventually will be moved to fs/proc, along with the knowledge of struct proc_dir_entry layout. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-04-09 14:13:32 -04:00
nikolay@redhat.com	69b0216ac2	bonding: fix bonding_masters race condition in bond unloading While the bonding module is unloading, it is considered that after rtnl_link_unregister all bond devices are destroyed but since no synchronization mechanism exists, a new bond device can be created via bonding_masters before unregister_pernet_subsys which would lead to multiple problems (e.g. NULL pointer dereference, wrong RIP, list corruption). This patch fixes the issue by removing any bond devices left in the netns after bonding_masters is removed from sysfs. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-08 16:45:09 -04:00
nikolay@redhat.com	ffcdedb667	Revert "bonding: remove sysfs before removing devices" This reverts commit `4de79c737b`. This patch introduces a new bug which causes access to freed memory. In bond_uninit: list_del(&bond->bond_list); bond_list is linked in bond_net's dev_list which is freed by unregister_pernet_subsys. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-08 16:45:09 -04:00
David S. Miller	d978a6361a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/nfc/microread/mei.c net/netfilter/nfnetlink_queue_core.c Pull in 'net' to get Eric Biederman's AF_UNIX fix, upon which some cleanups are going to go on-top. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-07 18:37:01 -04:00
Veaceslav Falico	4de79c737b	bonding: remove sysfs before removing devices We have a race condition if we try to rmmod bonding and simultaneously add a bond master through sysfs. In bonding_exit() we first remove the devices (through rtnl_link_unregister() ) and only after that we remove the sysfs. If we manage to add a device through sysfs after that the devices were removed - we'll end up with that device/sysfs structure and with the module unloaded. Fix this by first removing the sysfs and only after that calling rtnl_link_unregister(). Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-05 00:46:13 -04:00
David S. Miller	d662483264	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull net into net-next to get the synchronize_net() bug fix in bonding. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-03 01:31:54 -04:00
Veaceslav Falico	fcd99434fb	bonding: get netdev_rx_handler_unregister out of locks Now that netdev_rx_handler_unregister contains synchronize_net(), we need to call it outside of bond->lock, because it might sleep. Also, remove the already unneeded synchronize_net(). Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-02 12:05:28 -04:00
David S. Miller	a210576cf8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/mac80211/sta_info.c net/wireless/core.h Two minor conflicts in wireless. Overlapping additions of extern declarations in net/wireless/core.h and a bug fix overlapping with the addition of a boolean parameter to __ieee80211_key_free(). Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-01 13:36:50 -04:00
nikolay@redhat.com	1bc7db1678	bonding: fix disabling of arp_interval and miimon Currently if either arp_interval or miimon is disabled, they both get disabled, and upon disabling they get executed once more which is not the proper behaviour. Also when doing a no-op and disabling an already disabled one, the other again gets disabled. Also fix the error messages with the proper valid ranges, and a small typo fix in the up delay error message (outputting "down delay", instead of "up delay"). Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-29 15:02:49 -04:00
David S. Miller	e2a553dbf1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: include/net/ipip.h The changes made to ipip.h in 'net' were already included in 'net-next' before that header was moved to another location. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-27 13:52:49 -04:00
Veaceslav Falico	9fe16b78ee	bonding: remove already created master sysfs link on failure If the slave sysfs symlink fails to be created, we end up without removing the master sysfs symlink. Remove it in case of failure. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-26 13:00:02 -04:00
Veaceslav Falico	ad999eee66	bonding: cleanup unneeded rcu_read_lock() bond_resend_igmp_join_requests_delayed() calls _resend_igmp_join_requests() under rcu_read_lock(), while it gets its own rcu_read_lock() for the whole function. Remove the lock from the _delayed function. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-26 12:49:40 -04:00
Veaceslav Falico	876254ae27	bonding: don't call update_speed_duplex() under spinlocks bond_update_speed_duplex() might sleep while calling underlying slave's routines. Move it out of atomic context in bond_enslave() and remove it from bond_miimon_commit() - it was introduced by commit `546add79`, however when the slave interfaces go up/change state it's their responsibility to fire NETDEV_UP/NETDEV_CHANGE events so that bonding can properly update their speed. I've tested it on all combinations of ifup/ifdown, autoneg/speed/duplex changes, remote-controlled and local, on (not) MII-based cards. All changes are visible. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-13 04:53:17 -04:00
Veaceslav Falico	80028ea1c0	bonding: fire NETDEV_RELEASE event only on 0 slaves Currently, if we set up netconsole over bonding and release a slave, netconsole will stop logging on the whole bonding device. Change the behavior to stop the netconsole only when the last slave is released. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:15:18 -05:00
Jiri Pirko	6c8c4e4c24	bond: check if slave count is 0 in case when deciding to take slave's mac in bond_enslave(), check slave_cnt before actually using slave address. introduced by: commit `409cc1f8a4` (bond: have random dev address by default instead of zeroes) Reported-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-26 17:30:38 -05:00
Doug Goldstein	b3f92b63c4	bonding: set sysfs device_type to 'bond' Sets the sysfs device_type to 'bond' for udev. This allows udev rules to be created for bond devices. This is similar to how other network devices set their device_type. Signed-off-by: Doug Goldstein <cardoe@cardoe.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-19 00:51:09 -05:00
nikolay@redhat.com	0896341a44	bonding: fix bond_release_all inconsistencies This patch fixes the following inconsistencies in bond_release_all: - IFF_BONDING flag is not stripped from slaves - MTU is not restored - no netdev notifiers are sent Instead of trying to keep bond_release and bond_release_all in sync I think we can re-use bond_release as the environment for calling it is correct (RTNL is held). I have been running tests for the past week and they came out successful. The only way for bond_release to fail is for the slave to be attached in a different bond or to not be a slave but that cannot happen as RTNL is held and no slave manipulations can be achieved. V2: As suggested bond_release is renamed to __bond_release_one with a new parameter "all" introduced so to avoid calling unnecessary code while destroying a bond, and a wrapper for it called bond_release is created because of ndo_del_link. bond_release_all() is removed. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-19 00:51:09 -05:00
nikolay@redhat.com	e0809dbc47	bonding: Fix initialize after use for 3ad machine state spinlock The 3ad machine state spinlock can be used before it is initialized while doing bond_enslave() (and the port is being initialized) since port->slave is set before the lock is prepared, thus causing soft lock-ups and a multitude of other nasty bugs. [ Rename __initialize_port_locks() variable name to 'slave' -DaveM ] Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-19 00:51:08 -05:00
nikolay@redhat.com	b59340c2c0	bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate() port->slave can be NULL since it's being initialized in bond_enslave thus dereferencing a NULL pointer in bond_3ad_update_lacp_rate() Also fix a minor bug, which could cause a port not to have AD_STATE_LACP_TIMEOUT since there's no sync between bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(), by changing the read_lock to a write_lock_bh in bond_3ad_update_lacp_rate(). Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-19 00:51:08 -05:00
Neil Horman	2cde6acd49	netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock __netpoll_rcu_free is used to free netpoll structures when the rtnl_lock is already held. The mechanism is used to asynchronously call __netpoll_cleanup outside of the holding of the rtnl_lock, so as to avoid deadlock. Unfortunately, __netpoll_cleanup modifies pointers (dev->np), which means the rtnl_lock must be held while calling it. Further, it cannot be held, because rcu callbacks may be issued in softirq contexts, which cannot sleep. Fix this by converting the rcu callback to a work queue that is guaranteed to get scheduled in process context, so that we can hold the rtnl properly while calling __netpoll_cleanup Tested successfully by myself. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> CC: Cong Wang <amwang@redhat.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-11 19:19:33 -05:00
David S. Miller	188d1f76d0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/intel/e1000e/ethtool.c drivers/net/vmxnet3/vmxnet3_drv.c drivers/net/wireless/iwlwifi/dvm/tx.c net/ipv6/route.c The ipv6 route.c conflict is simple, just ignore the 'net' side change as we fixed the same problem in 'net-next' by eliminating cached neighbours from ipv6 routes. The e1000e conflict is an addition of a new statistic in the ethtool code, trivial. The vmxnet3 conflict is about one change in 'net' removing a guarding conditional, whilst in 'net-next' we had a netdev_info() conversion. The iwlwifi conflict is dealing with a WARN_ON() conversion in 'net-next' vs. a revert happening in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-05 14:12:20 -05:00
Gao feng	387ff91184	netns: bond: allow unprivileged users to control bond device reduce the permission check of bond device's ioctl. allow the userns root to control the bond device. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-04 13:12:16 -05:00
Jiri Pirko	409cc1f8a4	bond: have random dev address by default instead of zeroes Makes more sense to have randomly generated address by default than to have all zeroes. It also allows user to for example put the bond into bridge without need to have any slaves in it. Also note that this changes only behaviour of bonds with no slaves. Once the first slave device is enslaved, its address will be used (no change here). Also, fix dev_assign_type values on the way. Reported-by: Pavel Šimerda <psimerda@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-30 15:34:00 -05:00
Milos Vyletel	eb492f7443	bonding: unset primary slave via sysfs When bonding module is loaded with primary parameter and one decides to unset primary slave using sysfs these settings are not preserved during bond device restart. Primary slave is only unset once and it's not remembered in bond->params structure. Below is example of recreation. grep OPTS /etc/sysconfig/network-scripts/ifcfg-bond0 BONDING_OPTS="mode=active-backup miimon=100 primary=eth01" grep "Primary Slave" /proc/net/bonding/bond0 Primary Slave: eth01 (primary_reselect always) echo "" > /sys/class/net/bond0/bonding/primary grep "Primary Slave" /proc/net/bonding/bond0 Primary Slave: None sed -i -e 's/primary=eth01//' /etc/sysconfig/network-scripts/ifcfg-bond0 grep OPTS /etc/sysconfig/network-scripts/ifcfg-bond BONDING_OPTS="mode=active-backup miimon=100 " ifdown bond0 && ifup bond0 without patch: grep "Primary Slave" /proc/net/bonding/bond0 Primary Slave: eth01 (primary_reselect always) with patch: grep "Primary Slave" /proc/net/bonding/bond0 Primary Slave: None Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-29 15:43:35 -05:00
Jiri Pirko	7826d43f2d	ethtool: fix drvinfo strings set in drivers Use strlcpy where possible to ensure the string is \0 terminated. Use always sizeof(string) instead of 32, ETHTOOL_BUSINFO_LEN and custom defines. Use snprintf instead of sprint. Remove unnecessary inits of ->fw_version Remove unnecessary inits of drvinfo struct. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-06 21:06:31 -08:00

723 Commits