Commit Graph

918626 Commits

Author SHA1 Message Date
Roopa Prabhu
0534c5489c selftests: net: add fdb nexthop tests
This commit adds ipv4 and ipv6 fdb nexthop api tests to fib_nexthops.sh.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:00:38 -07:00
Roopa Prabhu
c7cdbe2efc vxlan: support for nexthop notifiers
vxlan driver registers for nexthop add/del notifiers to
cleanup fdb entries pointing to such nexthops.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:00:38 -07:00
Roopa Prabhu
8590ceedb7 nexthop: add support for notifiers
This patch adds nexthop add/del notifiers. To be used by
vxlan driver in a later patch. Could possibly be used by
switchdev drivers in the future.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:00:38 -07:00
Roopa Prabhu
1274e1cc42 vxlan: ecmp support for mac fdb entries
Todays vxlan mac fdb entries can point to multiple remote
ips (rdsts) with the sole purpose of replicating
broadcast-multicast and unknown unicast packets to those remote ips.

E-VPN multihoming [1,2,3] requires bridged vxlan traffic to be
load balanced to remote switches (vteps) belonging to the
same multi-homed ethernet segment (E-VPN multihoming is analogous
to multi-homed LAG implementations, but with the inter-switch
peerlink replaced with a vxlan tunnel). In other words it needs
support for mac ecmp. Furthermore, for faster convergence, E-VPN
multihoming needs the ability to update fdb ecmp nexthops independent
of the fdb entries.

New route nexthop API is perfect for this usecase.
This patch extends the vxlan fdb code to take a nexthop id
pointing to an ecmp nexthop group.

Changes include:
- New NDA_NH_ID attribute for fdbs
- Use the newly added fdb nexthop groups
- makes vxlan rdsts and nexthop handling code mutually
  exclusive
- since this is a new use-case and the requirement is for ecmp
nexthop groups, the fdb add and update path checks that the
nexthop is really an ecmp nexthop group. This check can be relaxed
in the future, if we want to introduce replication fdb nexthop groups
and allow its use in lieu of current rdst lists.
- fdb update requests with nexthop id's only allowed for existing
fdb's that have nexthop id's
- learning will not override an existing fdb entry with nexthop
group
- I have wrapped the switchdev offload code around the presence of
rdst

[1] E-VPN RFC https://tools.ietf.org/html/rfc7432
[2] E-VPN with vxlan https://tools.ietf.org/html/rfc8365
[3] http://vger.kernel.org/lpc_net2018_talks/scaling_bridge_fdb_database_slidesV3.pdf

Includes a null check fix in vxlan_xmit from Nikolay

v2 - Fixed build issue:
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:00:38 -07:00
Roopa Prabhu
38428d6871 nexthop: support for fdb ecmp nexthops
This patch introduces ecmp nexthops and nexthop groups
for mac fdb entries. In subsequent patches this is used
by the vxlan driver fdb entries. The use case is
E-VPN multihoming [1,2,3] which requires bridged vxlan traffic
to be load balanced to remote switches (vteps) belonging to
the same multi-homed ethernet segment (This is analogous to
a multi-homed LAG but over vxlan).

Changes include new nexthop flag NHA_FDB for nexthops
referenced by fdb entries. These nexthops only have ip.
This patch includes appropriate checks to avoid routes
referencing such nexthops.

example:
$ip nexthop add id 12 via 172.16.1.2 fdb
$ip nexthop add id 13 via 172.16.1.3 fdb
$ip nexthop add id 102 group 12/13 fdb

$bridge fdb add 02:02:00:00:00:13 dev vxlan1000 nhid 101 self

[1] E-VPN https://tools.ietf.org/html/rfc7432
[2] E-VPN VxLAN: https://tools.ietf.org/html/rfc8365
[3] LPC talk with mention of nexthop groups for L2 ecmp
http://vger.kernel.org/lpc_net2018_talks/scaling_bridge_fdb_database_slidesV3.pdf

v4 - fixed uninitialized variable reported by kernel test robot
Reported-by: kernel test robot <rong.a.chen@intel.com>

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:00:38 -07:00
David S. Miller
7b1b843a1e Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2020-05-21

This series contains updates to igc and e1000.

Andre cleans up code that was left over from the igb driver that handled
MAC address filters based on the source address, which is not currently
supported.  Simplifies the MAC address filtering code and prepare the
igc driver for future source address support.  Updated the MAC address
filter internal APIs to support filters based on source address.  Added
support for Network Flow Classification (NFC) rules based on source MAC
address.  Cleaned up the 'cookie' field which is not used anywhere in
the code and cleaned up a wrapper function that was not needed.
Simplified the filtering code for readability and aligned the ethtool
functions, so that function names were consistent.

Alex provides a fix for e1000 to resolve a deadlock issue when NAPI is
being disabled.

Sasha does additional cleanup of the igc driver of dead code that is not
used or needed.

v2: Fix the function header comment in patch 3 of the series, based on
    the feedback from Jakub Kicinski.
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 13:48:50 -07:00
David S. Miller
2a330b5334 Merge branch 'provide-KAPI-for-SQI'
Oleksij Rempel says:

====================
provide KAPI for SQI

This patches are extending ethtool netlink interface to export Signal
Quality Index (SQI). SQI provided by 100Base-T1 PHYs and can be used for
cable diagnostic. Compared to a typical cable tests, this value can be
only used after link is established.

changes v3:
- rename __ethtool_get_sqi* to linkstate_get_sqi*. And move this
  functions to the net/ethtool/linkstate.c
- protect linkstate_get_sqi* with locking

changes v2:
- use u32 instead of u8 for SQI
- add SQI_MAX field and callbacks
- some style fixes in the rst.
- do not convert index to shifted index.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:18:00 -07:00
Oleksij Rempel
68ff5e1475 net: phy: tja11xx: add SQI support
This patch implements reading of the Signal Quality Index for better
cable/link troubleshooting.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:18:00 -07:00
Oleksij Rempel
8066021915 ethtool: provide UAPI for PHY Signal Quality Index (SQI)
Signal Quality Index is a mandatory value required by "OPEN Alliance
SIG" for the 100Base-T1 PHYs [1]. This indicator can be used for cable
integrity diagnostic and investigating other noise sources and
implement by at least two vendors: NXP[2] and TI[3].

[1] http://www.opensig.org/download/document/218/Advanced_PHY_features_for_automotive_Ethernet_V1.0.pdf
[2] https://www.nxp.com/docs/en/data-sheet/TJA1100.pdf
[3] https://www.ti.com/product/DP83TC811R-Q1

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:18:00 -07:00
David S. Miller
b0301a5a28 Merge branch 'qed-next'
Yuval Basson says:

====================
qed: Add xrc core support for RoCE

This patch adds support for configuring XRC and provides the necessary
APIs for rdma upper layer driver (qedr) to enable the XRC feature.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:08:25 -07:00
Yuval Basson
7bfb399eca qed: Add XRC to RoCE
Add support for XRC-SRQ's and XRC-QP's for upper layer driver.

We maintain separate bitmaps for resource management for srq and
xrc-srq, However, the range in FW is one, The xrc-srq's are first
and then the srq's follow. Therefore we maintain a srq-id offset.

v2: perform cleanups if XRC bitmpas allocation fail.

Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Bason <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:08:25 -07:00
Yuval Basson
b8204ad878 qed: changes to ILT to support XRC
First ILT page for TSDM client is allocated for XRC-SRQ's.
For regular SRQ's skip first ILT page that is reserved for
XRC-SRQ's.

Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Bason <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:08:25 -07:00
Chris Mi
d8bed686ab net: psample: Add tunnel support
Currently, psample can only send the packet bits after decapsulation.
The tunnel information is lost. Add the tunnel support.

If the sampled packet has no tunnel info, the behavior is the same as
before. If it has, add a nested metadata field named PSAMPLE_ATTR_TUNNEL
and include the tunnel subfields if applicable.

Increase the metadata length for sampled packet with the tunnel info.
If new subfields of tunnel info should be included, update the metadata
length accordingly.

Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21 17:04:07 -07:00
Andre Guedes
c983e32719 igc: Change byte order in struct igc_nfc_filter
Every time we access the 'etype' and 'vlan_tci' fields from struct
igc_nfc_filter to enable or disable filters in hardware we have to
convert them from big endian to host order so it makes more sense to
simply have these fields in host order.

The byte order conversion should take place in igc_ethtool_get_nfc_
rule() and igc_ethtool_add_nfc_rule(), which are called by .get_rxnfc
and .set_rxnfc ethtool ops, since ethtool subsystem is the one who deals
with them in big endian order.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:28 -07:00
Andre Guedes
97700bc86d igc: Align terms used in NFC support code
The Network Flow Classification (NFC) support code from IGC driver uses
terms such as 'rule', 'filter', 'entry', 'input' interchangeably when
referring to NFC rules, making it harder to follow the code. This patch
renames IGC's internal APIs, structs, and variables so we stick with the
term 'rule' since this is the term used in ethtool APIs. It also removes
some not applicable comments along the way. No functionality is changed
by this patch.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:23 -07:00
Andre Guedes
7df76bd191 igc: Add 'igc_ethtool_' prefix to functions in igc_ethtool.c
This patch adds the prefix 'igc_ethtool_' to all functions defined in
igc_ethtool.c so they align with the name convention already followed by
other parts of the driver (e.g. igc_tsn, igc_ptp). Also, this avoids
some name clashing with functions added to igc_main.c by upcoming
patches in this series. No functionality is changed by this patch, just
function renaming.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:19 -07:00
Andre Guedes
876ea04db7 igc: Early return in igc_get_ethtool_nfc_entry()
This patch re-writes the second half of igc_ethtool_get_nfc_entry() to
follow the 'return early' pattern seen in other parts of the driver and
removes some duplicate comments.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:15 -07:00
Andre Guedes
8b9c23cdf0 igc: Cleanup _get|set_rxnfc ethtool ops
This patch does a trivial change in igc_ethtool_get_rxnfc() and
igc_ethtool_set_rxnfc() to simplify their logic.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:11 -07:00
Andre Guedes
4d0710c241 igc: Get rid of igc_max_channels()
The local function igc_max_channels() is a pointless wrapper around
igc_get_max_rss_queues(). This patch removes it and updates the callers
accordingly. It also does some cleanup on igc_get_max_rss_queues().

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:07 -07:00
Andre Guedes
8e34cad167 igc: Remove unused field from igc_nfc_filter
The 'cookie' field is not used anywhere in the code so this patch
removes it from struct igc_nfc_filter.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:19:00 -07:00
Sasha Neftin
281380a6fd igc: Remove per queue good transmited counter register
Per queue good transmitted packet counter not applicable for i225 device.
This patch comes to clean up this register.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:56 -07:00
Sasha Neftin
d1fe569f51 igc: Remove header redirection register
Header redirection missed packet counter not applicable for i225 device.
This patch comes to clean up this register.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:52 -07:00
Sasha Neftin
3b5fc88f78 igc: Remove obsolete circuit breaker registers
Part of circuit breaker registers is obsolete
and not applicable for i225 device.
This patch comes to clean up these registers.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:48 -07:00
Alexander Duyck
49ee3c2ab5 e1000: Do not perform reset in reset_task if we are already down
We are seeing a deadlock in e1000 down when NAPI is being disabled. Looking
over the kernel function trace of the system it appears that the interface
is being closed and then a reset is hitting which deadlocks the interface
as the NAPI interface is already disabled.

To prevent this from happening I am disabling the reset task when
__E1000_DOWN is already set. In addition code has been added so that we set
the __E1000_DOWN while holding the __E1000_RESET flag in e1000_close in
order to guarantee that the reset task will not run after we have started
the close call.

Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Tested-by: Maxim Zhukov <mussitantesmortem@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:42 -07:00
Andre Guedes
8eb2449d83 igc: Enable NFC rules based source MAC address
This patch adds support for Network Flow Classification (NFC) rules
based on source MAC address. Note that the controller doesn't support
rules with both source and destination addresses set, so this special
case is checked in igc_add_ethtool_nfc_entry().

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:37 -07:00
Andre Guedes
750433d0aa igc: Add support for source address filters in core
This patch extends MAC address filter internal APIs igc_add_mac_filter()
and igc_del_mac_filter(), as well as local helpers, to support filters
based on source address.

A new parameters 'type' is added to the APIs to indicate if the filter
type is source or destination. In case it is source type, the RAH
register is configured accordingly in igc_set_mac_filter_hw().

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 16:18:30 -07:00
Andre Guedes
d66358cae2 igc: Remove mac_table from igc_adapter
In igc_adapter we keep a sort of shadow copy of RAL and RAH registers.
There is not much benefit in keeping it, at the cost of maintainability,
since adding/removing MAC address filters is not hot path, and we
already keep filters information in adapter->nfc_filter_list for cleanup
and restoration purposes.

So in order to simplify the MAC address filtering code and prepare it
for source address support, this patch removes the mac_table from
igc_adapter.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-20 22:28:49 -07:00
Andre Guedes
1c3739cb6e igc: Remove IGC_MAC_STATE_SRC_ADDR flag
MAC address filters based on source address are not currently supported
by the IGC driver. Despite of that, the driver have some dangling code
to handle it, inherited from IGB driver. This patch removes that code to
prepare for a follow up patch that adds proper source MAC address filter
support.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-20 22:23:30 -07:00
David S. Miller
de1b99ef2a Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2020-05-19

This series contains updates to igc only.

Sasha cleans up the igc driver code that is not used or needed.

Vitaly cleans up driver code that was used to support Virtualization on
a device that is not supported by igc, so remove the dead code.

Andre renames a few macros to align with register and field names
described in the data sheet.  Also adds the VLAN Priority Queue Fliter
and EType Queue Filter registers to the list of registers dumped by
igc_get_regs().  Added additional debug messages and updated return codes
for unsupported features.  Refactored the VLAN priority filtering code to
move the core logic into igc_main.c.  Cleaned up duplicate code and
useless code.
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20 19:27:57 -07:00
David S. Miller
c536fc74b4 Merge branch 'uaccess.net' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Al Viro says:

====================
uaccess-related stuff in net/*

Assorted uaccess-related work in net/*.  First, there's
getting rid of compat_alloc_user_space() mess in MCAST_...
[gs]etsockopt() - no need to play with copying to/from temporary
object on userland stack, etc., when ->compat_[sg]etsockopt()
instances in question can easly do everything without that.
That's the first 13 patches.  Then there's a trivial bit in
net/batman-adv (completely unrelated to everything else) and
finally getting the atm compat ioctls into simpler shape.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20 19:07:25 -07:00
Al Viro
0edecc020b atm: switch do_atmif_sioc() to direct use of atm_dev_ioctl()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:36 -04:00
Al Viro
8cacb41659 atm: lift copyin from atm_dev_ioctl()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:35 -04:00
Al Viro
36085049bc atm: switch do_atm_iobuf() to direct use of atm_getnames()
... and sod the compat_alloc_user_space() with its complications

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:35 -04:00
Al Viro
a3929484af atm: move copyin from atm_getnames() into the caller
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:34 -04:00
Al Viro
8c2348e36a atm: separate ATM_GETNAMES handling from the rest of atm_dev_ioctl()
atm_dev_ioctl() does copyin in two different ways - one for
ATM_GETNAMES, another for everything else.  Start with separating
the former into a new helper (atm_getnames()).  The next step
will be to lift the copyin into the callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:33 -04:00
Al Viro
38c53ca3c1 batadv_socket_read(): get rid of pointless access_ok()
address is passed only to copy_to_user()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:33 -04:00
Al Viro
bbced07d99 get rid of compat_mc_setsockopt()
not used anymore

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:32 -04:00
Al Viro
b212c322c8 handle the group_source_req options directly
Native ->setsockopt() handling of these options (MCAST_..._SOURCE_GROUP
and MCAST_{,UN}BLOCK_SOURCE) consists of copyin + call of a helper that
does the actual work.  The only change needed for ->compat_setsockopt()
is a slightly different copyin - the helpers can be reused as-is.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:32 -04:00
Al Viro
fcfa0b09d3 ipv6: take handling of group_source_req options into a helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:31 -04:00
Al Viro
2bbf8c1ead ipv4: take handling of group_source_req options into a helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:31 -04:00
Al Viro
2f984f11fd ipv[46]: do compat setsockopt for MCAST_{JOIN,LEAVE}_GROUP directly
direct parallel to the way these two are handled in the native
->setsockopt() instances - the helpers that do the real work
are already separated and can be reused as-is in this case.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:30 -04:00
Al Viro
168a2cca81 ipv6: do compat setsockopt for MCAST_MSFILTER directly
similar to the ipv4 counterpart of that patch - the same
trick used to align the tail array properly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:29 -04:00
Al Viro
d59eb177c8 ip6_mc_msfilter(): pass the address list separately
that way we'll be able to reuse it for compat case

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:29 -04:00
Al Viro
2e04172875 ipv4: do compat setsockopt for MCAST_MSFILTER directly
Parallel to what the native setsockopt() does, except that unlike
the native setsockopt() we do not use memdup_user() - we want
the sockaddr_storage fields properly aligned, so we allocate
4 bytes more and copy compat_group_filter at the offset 4,
which yields the proper alignments.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:28 -04:00
Al Viro
e986d4dabc set_mcast_msfilter(): take the guts of setsockopt(MCAST_MSFILTER) into a helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:28 -04:00
Al Viro
0dfe6581a7 get rid of compat_mc_getsockopt()
now we can do MCAST_MSFILTER in compat ->getsockopt() without
playing silly buggers with copying things back and forth.
We can form a native struct group_filter (sans the variable-length
tail) on stack, pass that + pointer to the tail of original request
to the helper doing the bulk of the work, then do the rest of
copyout - same as the native getsockopt() does.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:27 -04:00
Al Viro
931ca7ab7f ip*_mc_gsfget(): lift copyout of struct group_filter into callers
pass the userland pointer to the array in its tail, so that part
gets copied out by our functions; copyout of everything else is
done in the callers.  Rationale: reuse for compat; the array
is the same in native and compat, the layout of parts before it
is different for compat.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:27 -04:00
Al Viro
e9c375fb5e compat_ip{,v6}_setsockopt(): enumerate MCAST_... options explicitly
We want to check if optname is among the MCAST_... ones; do that as
an explicit switch.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:26 -04:00
Al Viro
63287de66d lift compat definitions of mcast [sg]etsockopt requests into net/compat.h
We want to get rid of compat_mc_[sg]etsockopt() and to have that stuff
handled without compat_alloc_user_space(), extra copying through
userland, etc.  To do that we'll need ipv4 and ipv6 instances of
->compat_[sg]etsockopt() to manipulate the 32bit variants of mcast
requests, so we need to move the definitions of those out of net/compat.c
and into a public header.

This patch just does a mechanical move to include/net/compat.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 20:31:25 -04:00
John Hubbard
f78cdbd75a rds: fix crash in rds_info_getsockopt()
The conversion to pin_user_pages() had a bug: it overlooked
the case of allocation of pages failing. Fix that by restoring
an equivalent check.

Reported-by: syzbot+118ac0af4ac7f785a45b@syzkaller.appspotmail.com
Fixes: dbfe7d7437 ("rds: convert get_user_pages() --> pin_user_pages()")

Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: rds-devel@oss.oracle.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20 14:08:06 -07:00