linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-18 09:44:18 +08:00

Author	SHA1	Message	Date
Mitch Williams	905770fa3e	i40evf: lower message level We see this message regularly on VF reset or unload (which invokes a reset). It's essentially meaningless unless it's happening constantly. To prevent consternation, lower the log level to debug so it's not seen under normal circumstance. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:51:00 -07:00
Mariusz Stachura	0dc8692e91	i40e: fix for flow director counters not wrapping as expected An errata with GLQF_PCNT causes it to not wrap as expected. This can cause an error in flow director statistics. This patch resets affected counters just after reading. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Mariusz Stachura	e04ea00217	i40e: relax warning message in case of version mismatch Fortville and Fort Park devices are often on different firmware release schedules. This change relaxes the minor version warning message, so it is only displayed for older FW warning version for old firmware Fortville 3 or earlier. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Sudheer Mogilappagari	3fded4663b	i40e: simplify member variable accesses This commit replaces usage of vsi->back in i40e_print_link_message() (which is actually a PF pointer) with temp variable. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Sudheer Mogilappagari	9a03449d3e	i40e: Fix link down message when interface is brought up i40e_print_link_message() is intended to compare new link state with current link state and print log message only if the new state is different from current state. However in current driver the new state does not get updated when link is going down because of the if condition. When an interface is brought down, vsi->state is set to I40E_VSI_DOWN in i40e_vsi_close() and later i40e_print_link_message() does not get invoked in i40e_link_event due to if condition. Hence link down message doesn't appear when link is going down. The down state is seen later during i40e_open() and old state gets printed. The actual link state doesn't get updated in i40e_close() or i40e_open() but when i40e_handle_link_event is called inside i40e_clean_adminq_subtask. This change allows i40e_print_link_message() to be called when interface is going down and keeps the state information updated. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Sudheer Mogilappagari	16badf758b	i40e: Fix unqualified module message while bringing link up In current driver, when ifconfig ethx up is done, the link state doesn't transition to UP inside i40e_open(). It changes after AQ command response is handled in i40e_handle_link_event(). When pf->hw.phy.link_info.link_info is DOWN inside i40e_open(), The state is transient and invalid. So log message gets printed based on incorrect info (i.e link_info and an_info). This commit removes check for unqualified module inside i40e_up_complete(). The existing check in i40e_handle_link_event() logs the error message based on correct link state information. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:59 -07:00
Jacob Keller	2b634bb068	i40e/i40evf: rename bytes_per_int to bytes_per_usec This value is not calculating bytes_per_int, which would actually just be bytes/ITR_COUNTDOWN_START, but rather it's calculating bytes/usecs. Rename the variable for clarity so that future developers understand what the value is actually calculating. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-09-29 12:50:58 -07:00
David Ahern	fa8fefaa67	net: ipv4: remove fib_info arg to fib_check_nh fib_check_nh does not use the fib_info arg; remove t. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:19:32 +01:00
David Ahern	c7c3e5913b	net: ipv4: remove fib_weight fib_weight in fib_info is set but not used. Remove it and the helpers for setting it. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:19:32 +01:00
David S. Miller	fadad670a8	Merge branch 'bpf-extend-info' Martin KaFai Lau says: ==================== bpf: Extend bpf_{prog,map}_info This patch series adds more fields to bpf_prog_info and bpf_map_info. Please see individual patch for details. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:17:06 +01:00
Martin KaFai Lau	3a8ad560a9	bpf: Test new fields in bpf_attr and bpf_{prog, map}_info This patch tests newly added fields of the bpf_attr, bpf_prog_info and bpf_map_info. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:17:05 +01:00
Martin KaFai Lau	6e525d0667	bpf: Swap the order of checking prog_info and map_info This patch swaps the checking order. It now checks the map_info first and then prog_info. It is a prep work for adding test to the newly added fields (the map_ids of prog_info field in particular). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:17:05 +01:00
Martin KaFai Lau	88cda1c9da	bpf: libbpf: Provide basic API support to specify BPF obj name This patch extends the libbpf to provide API support to allow specifying BPF object name. In tools/lib/bpf/libbpf, the C symbol of the function and the map is used. Regarding section name, all maps are under the same section named "maps". Hence, section name is not a good choice for map's name. To be consistent with map, bpf_prog also follows and uses its function symbol as the prog's name. This patch adds logic to collect function's symbols in libbpf. There is existing codes to collect the map's symbols and no change is needed. The bpf_load_program_name() and bpf_map_create_name() are added to take the name argument. For the other bpf_map_create_xxx() variants, a name argument is directly added to them. In samples/bpf, bpf_load.c in particular, the symbol is also used as the map's name and the map symbols has already been collected in the existing code. For bpf_prog, bpf_load.c does not collect the function symbol name. We can consider to collect them later if there is a need to continue supporting the bpf_load.c. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:17:05 +01:00
Martin KaFai Lau	ad5b177bd7	bpf: Add map_name to bpf_map_info This patch allows userspace to specify a name for a map during BPF_MAP_CREATE. The map's name can later be exported to user space via BPF_OBJ_GET_INFO_BY_FD. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:17:05 +01:00
Martin KaFai Lau	cb4d2b3f03	bpf: Add name, load_time, uid and map_ids to bpf_prog_info The patch adds name and load_time to struct bpf_prog_aux. They are also exported to bpf_prog_info. The bpf_prog's name is passed by userspace during BPF_PROG_LOAD. The kernel only stores the first (BPF_PROG_NAME_LEN - 1) bytes and the name stored in the kernel is always \0 terminated. The kernel will reject name that contains characters other than isalnum() and '_'. It will also reject name that is not null terminated. The existing 'user->uid' of the bpf_prog_aux is also exported to the bpf_prog_info as created_by_uid. The existing 'used_maps' of the bpf_prog_aux is exported to the newly added members 'nr_map_ids' and 'map_ids' of the bpf_prog_info. On the input, nr_map_ids tells how big the userspace's map_ids buffer is. On the output, nr_map_ids tells the exact user_map_cnt and it will only copy up to the userspace's map_ids buffer is allowed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:17:05 +01:00
Hoang Tran	cf5d74b85e	tcp: fix under-evaluated ssthresh in TCP Vegas With the commit `76174004a0` (tcp: do not slow start when cwnd equals ssthresh), the comparison to the reduced cwnd in tcp_vegas_ssthresh() would under-evaluate the ssthresh. Signed-off-by: Hoang Tran <hoang.tran@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:07:00 +01:00
Nikolay Aleksandrov	5af48b59f3	net: bridge: add per-port group_fwd_mask with less restrictions We need to be able to transparently forward most link-local frames via tunnels (e.g. vxlan, qinq). Currently the bridge's group_fwd_mask has a mask which restricts the forwarding of STP and LACP, but we need to be able to forward these over tunnels and control that forwarding on a per-port basis thus add a new per-port group_fwd_mask option which only disallows mac pause frames to be forwarded (they're always dropped anyway). The patch does not change the current default situation - all of the others are still restricted unless configured for forwarding. We have successfully tested this patch with LACP and STP forwarding over VxLAN and qinq tunnels. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-29 06:02:55 +01:00
David S. Miller	de9c8a6a5f	Merge branch 'hns3-dcb' Yunsheng Lin says: ==================== Add support for DCB feature in hns3 driver The patchset contains some enhancement related to DCB before adding support for DCB feature. This patchset depends on the following patchset: https://patchwork.ozlabs.org/cover/815646/ https://patchwork.ozlabs.org/cover/816145/ High Level Architecture: [ lldpad ] \| \| \| [ hns3_dcbnl ] \| \| \| [ hclge_dcb ] / \ / \ / \ [ hclge_main ] [ hclge_tm ] Current patch-set support following functionality: Use of lldptool to configure the tc schedule mode, tc bandwidth(if schedule mode is ETS), prio_tc_map and PFC parameter. V3: Drop mqprio support V2: Fix for not defining variables in local loop. V1: Initial Submit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:12 -07:00
Yunsheng Lin	9df8f79a4d	net: hns3: Add DCB support when interacting with network stack When using lldptool to configure DCB parameter, hclge_dcb module call the client_ops->setup_tc to tell network stack which queue and priority is using for specific tc. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:12 -07:00
Yunsheng Lin	7979a22330	net: hns3: Setting for fc_mode and dcb enable flag in TM module After the DCB feature is supported, fc_mode and dcb enable flag must be set according to the DCB parameter. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:12 -07:00
Yunsheng Lin	986743dbf0	net: hns3: Add dcb netlink interface for the support of DCB feature This patch add dcb netlink interface by calling the interface from hclge_dcb module. This patch also update Makefile in order to build hns3_dcbnl module. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:12 -07:00
Yunsheng Lin	cacde272dd	net: hns3: Add hclge_dcb module for the support of DCB feature The hclge_dcb module calls the interface from hclge_main/tm and provide interface for the dcb netlink interface. This patch also update Makefiles required to build the DCB supported code in HNS3 Ethernet driver and update the existing Kconfig file in the hisilicon folder. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:12 -07:00
Yunsheng Lin	77f255c1c6	net: hns3: Add some interface for the support of DCB feature This patch add some interface and export some interface from hclge_tm and hclgc_main to support the upcoming DCB feature. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:11 -07:00
Yunsheng Lin	cc9bb43ab3	net: hns3: Add tc-based TM support for sriov enabled port When sriov is enabled and TM is in tc-based mode, vf's TM parameters is not set in TM initialization process. This patch add the tc_based TM support for sriov enabled using the information in vport struct. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:11 -07:00
Yunsheng Lin	0a5677d39e	net: hns3: Add support for port shaper setting in TM module This patch add a tm_port_shaper cmd and set port shaper to HCLGE_ETHER_MAX_RATE on TM initialization process. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:11 -07:00
Yunsheng Lin	9dc2145d91	net: hns3: Add support for PFC setting in TM module This patch add a pfc_pause_en cmd, and use it to configure PFC option according to fc_mode in hdev->tm_info. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:11 -07:00
Yunsheng Lin	acf61ecd44	net: hns3: Add support for dynamically buffer reallocation Current buffer allocation can only happen at init, when doing buffer reallocation after init, care must be taken care of memory which priv_buf points to. This patch fixes it by using a dynamic allocated temporary memory. Because we only do buffer reallocation at init or when setting up the DCB parameter, and priv_buf is only used at buffer allocation process, so it is ok to use a dynamic allocated temporary memory. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:11 -07:00
Yunsheng Lin	9ffe79a9c2	net: hns3: Support for dynamically assigning tx buffer to TC This patch add support of dynamically assigning tx buffer to TC when the TC is enabled. It will save buffer for rx direction to avoid packet loss. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:35:11 -07:00
Alexey Dobriyan	6ade97da60	arp: make arp_hdr_len() return unsigned int Negative ARP header length are not a thing. Constify arguments while I'm at it. Space savings: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-3 (-3) function old new delta arpt_do_table 1163 1160 -3 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:29:36 -07:00
Aviad Krawczyk	bbdc9e687f	net-next/hinic: Fix a case of Tx Queue is Stopped forever Fix the following scenario: 1. tx_free_poll is running on cpu X 2. xmit function is running on cpu Y and fails to get sq wqe 3. tx_free_poll frees wqes on cpu X and checks the queue is not stopped 4. xmit function stops the queue after failed to get sq wqe 5. The queue is stopped forever Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:26:50 -07:00
Aviad Krawczyk	352f58b0d9	net-next/hinic: Set Rxq irq to specific cpu for NUMA Set Rxq irq to specific cpu for allocating and receiving the skb from the same node. Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:26:50 -07:00
David S. Miller	93771b0160	Merge branch 'bpf-verifier-disassembly-improvements' Edward Cree says: ==================== bpf/verifier: disassembly improvements Fix the output of print_bpf_insn() for ALU ops that don't look like compound assignment (i.e. BPF_END and BPF_NEG). Sample output for a short test program: 0: (b4) (u32) r0 = (u32) 0 1: (dc) r0 = be32 r0 2: (84) r0 = (u32) -r0 3: (95) exit processed 4 insns, stack depth 0 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:23:19 -07:00
Edward Cree	73c864b383	bpf/verifier: improve disassembly of BPF_NEG instructions BPF_NEG takes only one operand, unlike the bulk of BPF_ALU[64] which are compound-assignments. So give it its own format in print_bpf_insn(). Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:23:18 -07:00
Edward Cree	2b7c6ba945	bpf/verifier: improve disassembly of BPF_END instructions print_bpf_insn() was treating all BPF_ALU[64] the same, but BPF_END has a different structure: it has a size in insn->imm (even if it's BPF_X) and uses the BPF_SRC (X or K) to indicate which endianness to use. So it needs different code to print it. Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:23:18 -07:00
Colin Ian King	4d8806fd14	cxgb4: make function ch_flower_stats_cb, fixes warning The function ch_flower_stats_cb is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warnings: symbol 'ch_flower_stats_cb' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:21:37 -07:00
David S. Miller	f46d963286	Merge branch 'rtnl-pushdown-prep' Florian Westphal says: ==================== rtnetlink: preparation patches for further rtnl lock pushdown/removal Patches split large rtnl_fill_ifinfo into smaller chunks to better see which parts 1. require rtnl 2. do not require it at all 3. rely on rtnl locking now but could be converted Changes since v3: I dropped the 'ifalias' patch, I have a change to decouple ifalias and rtnl mutex, I will send it once this series has been merged. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:20:49 -07:00
Florian Westphal	4c82a95e52	rtnetlink: rtnl_have_link_slave_info doesn't need rtnl it can be switched to rcu. Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:20:49 -07:00
Florian Westphal	b1e66b9a67	rtnetlink: add helpers to dump netnsid information Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:20:49 -07:00
Florian Westphal	250fc3dfdb	rtnetlink: add helpers to dump vf information similar to earlier patches, split out more parts of this function to better see what is happening and where we assume rtnl is locked. Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:20:49 -07:00
Florian Westphal	79110a0426	rtnetlink: add helper to put master and link ifindexes rtnl_fill_ifinfo currently requires caller to hold the rtnl mutex. Unfortunately the function is quite large which makes it harder to see which spots require the lock, which spots assume it and which ones could do without. Add helpers to factor out the ifindex dumping, one can use rcu to avoid rtnl dependency. Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:20:49 -07:00
Florian Westphal	61f26d9251	selftests: rtnetlink.sh: add rudimentary vrf test Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 10:14:16 -07:00
Cong Wang	e7614370d6	net_sched: use idr to allocate u32 filter handles Instead of calling u32_lookup_ht() in a loop to find a unused handle, just switch to idr API to allocate new handles. u32 filters are special as the handle could contain a hash table id and a key id, so we need two IDR to allocate each of them. Cc: Chris Mi <chrism@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 09:43:41 -07:00
Cong Wang	1d8134fea2	net_sched: use idr to allocate basic filter handles Instead of calling basic_get() in a loop to find a unused handle, just switch to idr API to allocate new handles. Cc: Chris Mi <chrism@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 09:43:41 -07:00
Cong Wang	76cf546c28	net_sched: use idr to allocate bpf filter handles Instead of calling cls_bpf_get() in a loop to find a unused handle, just switch to idr API to allocate new handles. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Chris Mi <chrism@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 09:43:41 -07:00
Eric Dumazet	8f1975e31d	inetpeer: speed up inetpeer_invalidate_tree() As measured in my prior patch ("sch_netem: faster rb tree removal"), rbtree_postorder_for_each_entry_safe() is nice looking but much slower than using rb_next() directly, except when tree is small enough to fit in CPU caches (then the cost is the same) From: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-28 09:40:59 -07:00
David S. Miller	a2e4a21906	Merge branch 'mlxsw-Add-support-for-offloading-IPv4-multicast-routes' Jiri Pirko says: ==================== mlxsw: Add support for offloading IPv4 multicast routes Yotam says: This patch-set introduces offloading of the kernel IPv4 multicast router logic in the Spectrum driver. The first patch makes the Spectrum driver ignore FIB notifications that are not of address family IPv4 or IPv6. This is needed in order to prevent crashes while the next patches introduce the RTNL_FAMILY_IPMR FIB notifications. Patches 2-5 update ipmr to use the FIB notification chain for both MFC and VIF notifications, and patches 8-12 update the Spectrum driver to register to these notifications and offload the routes. Similarly to IPv4 and IPv6, any failure will trigger the abort mechanism which is updated in this patch-set to eject multicast route tables too. At this stage, the following limitations apply: - A multicast MFC route will be offloaded by the driver if all the output interfaces are Spectrum router interfaces (RIFs). In any other case (which includes pimreg device, tunnel devices and management ports) the route will be trapped to the CPU and the packets will be forwarded by software. - ipmr proxy routes are not supported and will trigger the abort mechanism. - The MFC TTL values are currently treated as boolean: if the value is different than 255, the traffic is forwarded to the interface and if the value is 255 it is not forwarded. Dropping packets based on their TTL isn't currently supported. To allow users to have visibility on which of the routes are offloaded and which are not, patch 6 introduces a per-route offload indication similar to IPv4 and IPv6 routes which is sent to the user via the RTNetlink interface. The Spectrum driver multicast router offloading support, which is introduced in patches 8 and 9, is divided into two parts: - The hardware logic which abstracts the Spectrum hardware and provides a simple API for the upper levels. - The offloading logic which gets the MFC and VIF notifications from the kernel and updates the hardware using the hardware logic part. Finally, the last patch makes the Spectrum router logic not ignore the multicast FIB notifications and call the corresponding functions in the multicast router offloading logic. --- v2->v3: - Move the ipmr_rule_default function definition to be inside the already existing CONFIG_IP_MROUTE_MULTIPLE_TABLES ifdef block (patch 6) - Remove double =0 initialization in spectrum_mr.c (patch 7) - Fix route4 allocation size (patch 7) v1->v2: - Add comments for struct fields in mroute.h - Take the mrt_lock while dumping VIFs in the fib_notifier dump callback - Update the MFC lastuse field too ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-27 11:33:28 -07:00
Yotam Gigi	664375e956	mlxsw: spectrum: router: Don't ignore IPMR notifications Make the Spectrum router logic not ignore the RTNL_FAMILY_IPMR FIB notifications. Past commits added the IPMR VIF and MFC add/del notifications via the fib_notifier chain. In addition, a code for handling these notifications in the Spectrum router logic was added. Make the Spectrum router logic not ignore these notifications and forward the requests to the Spectrum multicast router offloading logic. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-27 11:33:28 -07:00
Yotam Gigi	fd890fe98f	mlxsw: spectrum: Notify multicast router on RIF MTU changes Due to the fact that multicast routes hold the minimum MTU of all the egress RIFs and trap packets that don't meet it, notify the mulitcast router code on RIF MTU changes. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-27 11:33:28 -07:00
Yotam Gigi	d42b0965b1	mlxsw: spectrum_router: Add multicast routes notification handling functionality Add functionality for calling the multicast routing offloading logic upon MFC and VIF add and delete notifications. In addition, call the multicast routing upon RIF addition and deletion events. As the multicast routing offload logic may sleep, the actual calls are done in a deferred work. To ensure the MFC object is not freed in that interval, a reference is held to it. In case of a failure, the abort mechanism is used, which ejects all the routes from the hardware and triggers the traffic to flow through the kernel. Note: At that stage, the FIB notifications are still ignored, and will be enabled in a further patch. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-27 11:33:28 -07:00
Yotam Gigi	7e50d43575	mlxsw: spectrum: router: Squash the default route table to main Currently, the mlxsw Spectrum driver offloads only either the RT_TABLE_MAIN FIB table or the VRF tables, so the RT_TABLE_LOCAL table is squashed to the RT_TABLE_MAIN table to allow local routes to be offloaded too. By default, multicast MFC routes which are not assigned to any user requested table are put in the RT_TABLE_DEFAULT table. Due to the fact that offloading multicast MFC routes support in Spectrum router logic is going to be introduced soon, squash the default table to MAIN too. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-27 11:33:28 -07:00

1 2 3 4 5 ...

706696 Commits