2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-15 09:03:59 +08:00
Commit Graph

32900 Commits

Author SHA1 Message Date
Phoebe Buckheister
6f3eabcd04 mac802154: llsec: fix incorrect lock pairing
In encrypt, sec->lock is taken with read_lock_bh, so in the error path,
we must read_unlock_bh.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 15:24:13 -04:00
Michal Kubeček
da08143b85 vlan: more careful checksum features handling
When combining real_dev's features and vlan_features, simple
bitwise AND is used. This doesn't work well for checksum
offloading features as if one set has NETIF_F_HW_CSUM and the
other NETIF_F_IP_CSUM and/or NETIF_F_IPV6_CSUM, we end up with
no checksum offloading. However, from the logical point of view
(how can_checksum_protocol() works), NETIF_F_HW_CSUM contains
the functionality of NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM so
that the result should be IP/IPV6.

Add helper function netdev_intersect_features() implementing
this logic and use it in vlan_dev_fix_features().

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 15:07:23 -04:00
Ezequiel Garcia
e876f208af net: Add a software TSO helper API
Although the implementation probably needs a lot of work, this initial API
allows to implement software TSO in mvneta and mv643xx_eth drivers in a not
so intrusive way.

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 14:57:15 -04:00
John W. Linville
40a10fd740 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2014-05-22 13:58:36 -04:00
John W. Linville
99abe65ff1 NFC: 3.16: First pull request
This is the NFC pull request for 3.16. We have:
 
 - STMicroeectronics st21nfca support. The st21nfca is an HCI chipset and
   thus relies on the HCI stack. This submission provides support for tag
   redaer/writer mode (including Type 5) and device tree bindings.
 
 - PM runtime support and a bunch of bug fixes for TI's trf7970a.
 
 - Device tree support for NXP's pn544. Legacy platform data support is
   obviously kept intact.
 
 - NFC Tag type 4B support to the NFC Digital stack.
 
 - SOCK_RAW type support to the raw NFC socket, and allow NCI
   sniffing from that. This can be extended to report HCI frames and also
   proprietarry ones like e.g. the pn533 ones.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTepRlAAoJEIqAPN1PVmxKnF0P/RvfrZs6CbGNJC+dkEbk90p1
 nsngy4+4MmPwJYVzObnLz4Br0k1kmFKiOKske6drjMpgzDWeuQelw3B7bd3FYfxD
 YkQsc5RC984xrDoDH5pn8mA6VJqmn7whrmcibTYAixrDqTvo8gw6uja4ryAnSdZm
 n7cRbh/A5F/sa7O4mPA0bCTdp4jAS/vOP9rGFDOth/b5yJVs99XmC+AZp/Ad9BUx
 +/osWGmBV5jshtX7aPTSxIQB4BUaP/lP1DW8yF5whKDjsHC9QyJcAtw9HfZ4tv2h
 YNteZZ8yjM+rSjnDw/LvDc2Gp8Z8P1GYf8D3QN3cWhw1ZvXi7CnqKjEnm41sbfaH
 L5esIfsRBUdmk6Ika7zALqmOQFI3PzH+ag96punl29qb2gyBDRSnXKVLirv3xxFG
 h7vYtQL43Rosn/4pSilRbYReRwyKbSCxW3un/tUJy0Faafs6q+9oMC2aWbIfTT6l
 40n4H9EmzYy2OaaXSFckiIIYYgVDAji8GLXTf+dPHb+NrH3QQOR3m27WzHc4rmYk
 kUrv0lKoFswA+VLlIcJTrSKNF21FDjwuImzIWiPz6Fx/+rWJ0b4GlQyIynD72LpR
 2LkUhTrxuRuRtxVCtvTdkPlL6Bdp3HO7t4qZ0EirgnpmGK6NScBgABoqFJSbz9uS
 UUvZbHVIjLrDU9zzoyz8
 =cSl+
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-3.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz <sameo@linux.intel.com> says:

"NFC: 3.16: First pull request

This is the NFC pull request for 3.16. We have:

- STMicroeectronics st21nfca support. The st21nfca is an HCI chipset and
  thus relies on the HCI stack. This submission provides support for tag
  redaer/writer mode (including Type 5) and device tree bindings.

- PM runtime support and a bunch of bug fixes for TI's trf7970a.

- Device tree support for NXP's pn544. Legacy platform data support is
  obviously kept intact.

- NFC Tag type 4B support to the NFC Digital stack.

- SOCK_RAW type support to the raw NFC socket, and allow NCI
  sniffing from that. This can be extended to report HCI frames and also
  proprietarry ones like e.g. the pn533 ones."

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-05-22 13:56:46 -04:00
David S. Miller
8af750d739 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables
Pablo Neira Ayuso says:

====================
Netfilter/nftables updates for net-next

The following patchset contains Netfilter/nftables updates for net-next,
most relevantly they are:

1) Add set element update notification via netlink, from Arturo Borrero.

2) Put all object updates in one single message batch that is sent to
   kernel-space. Before this patch only rules where included in the batch.
   This series also introduces the generic transaction infrastructure so
   updates to all objects (tables, chains, rules and sets) are applied in
   an all-or-nothing fashion, these series from me.

3) Defer release of objects via call_rcu to reduce the time required to
   commit changes. The assumption is that all objects are destroyed in
   reverse order to ensure that dependencies betweem them are fulfilled
   (ie. rules and sets are destroyed first, then chains, and finally
   tables).

4) Allow to match by bridge port name, from Tomasz Bursztyka. This series
   include two patches to prepare this new feature.

5) Implement the proper set selection based on the characteristics of the
   data. The new infrastructure also allows you to specify your preferences
   in terms of memory and computational complexity so the underlying set
   type is also selected according to your needs, from Patrick McHardy.

6) Several cleanup patches for nft expressions, including one minor possible
   compilation breakage due to missing mark support, also from Patrick.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 12:06:23 -04:00
Neal Cardwell
ca8a226343 tcp: make cwnd-limited checks measurement-based, and gentler
Experience with the recent e114a710aa ("tcp: fix cwnd limited
checking to improve congestion control") has shown that there are
common cases where that commit can cause cwnd to be much larger than
necessary. This leads to TSO autosizing cooking skbs that are too
large, among other things.

The main problems seemed to be:

(1) That commit attempted to predict the future behavior of the
connection by looking at the write queue (if TSO or TSQ limit
sending). That prediction sometimes overestimated future outstanding
packets.

(2) That commit always allowed cwnd to grow to twice the number of
outstanding packets (even in congestion avoidance, where this is not
needed).

This commit improves both of these, by:

(1) Switching to a measurement-based approach where we explicitly
track the largest number of packets in flight during the past window
("max_packets_out"), and remember whether we were cwnd-limited at the
moment we finished sending that flight.

(2) Only allowing cwnd to grow to twice the number of outstanding
packets ("max_packets_out") in slow start. In congestion avoidance
mode we now only allow cwnd to grow if it was fully utilized.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 12:04:49 -04:00
Emmanuel Grumbach
67af981153 cfg80211: allow RSSI compensation
Channels in 2.4GHz band overlap, this means that if we
send a probe request on channel 1 and then move to channel
2, we will hear the probe response on channel 2. In this
case, the RSSI will be lower than if we had heard it on
the channel on which it was sent (1 in this case).

The firmware / low level driver can parse the channel in
the DS IE or HT IE and compensate the RSSI so that it will
still have a valid value even if we heard the frame on an
adjacent channel. This can be done up to a certain offset.

Add this offset as a configuration for the low level driver.
A low level driver that can compensate the low RSSI in this
case should assign the maximal offset for which the RSSI
value is still valid.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-22 09:58:49 +02:00
Eric Dumazet
4de462ab63 ipv6: gro: fix CHECKSUM_COMPLETE support
When GRE support was added in linux-3.14, CHECKSUM_COMPLETE handling
broke on GRE+IPv6 because we did not update/use the appropriate csum :

GRO layer is supposed to use/update NAPI_GRO_CB(skb)->csum instead of
skb->csum

Tested using a GRE tunnel and IPv6 traffic. GRO aggregation now happens
at the first level (ethernet device) instead of being done in gre
tunnel. Native IPv6+TCP is still properly aggregated.

Fixes: bf5a755f5e ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21 17:18:47 -04:00
Alexei Starovoitov
5fe821a9de net: filter: cleanup invocation of internal BPF
Kernel API for classic BPF socket filters is:

sk_unattached_filter_create() - validate classic BPF, convert, JIT
SK_RUN_FILTER() - run it
sk_unattached_filter_destroy() - destroy socket filter

Cleanup internal BPF kernel API as following:

sk_filter_select_runtime() - final step of internal BPF creation.
  Try to JIT internal BPF program, if JIT is not available select interpreter
SK_RUN_FILTER() - run it
sk_filter_free() - free internal BPF program

Disallow direct calls to BPF interpreter. Execution of the BPF program should
be done with SK_RUN_FILTER() macro.

Example of internal BPF create, run, destroy:

  struct sk_filter *fp;

  fp = kzalloc(sk_filter_size(prog_len), GFP_KERNEL);
  memcpy(fp->insni, prog, prog_len * sizeof(fp->insni[0]));
  fp->len = prog_len;

  sk_filter_select_runtime(fp);

  SK_RUN_FILTER(fp, ctx);

  sk_filter_free(fp);

Sockets, seccomp, testsuite, tracing are using different ways to populate
sk_filter, so first steps of program creation are not common.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21 17:07:17 -04:00
Cong Wang
bf63ac73b3 net_sched: fix an oops in tcindex filter
Kelly reported the following crash:

        IP: [<ffffffff817a993d>] tcf_action_exec+0x46/0x90
        PGD 3009067 PUD 300c067 PMD 11ff30067 PTE 800000011634b060
        Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
        CPU: 1 PID: 639 Comm: dhclient Not tainted 3.15.0-rc4+ #342
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        task: ffff8801169ecd00 ti: ffff8800d21b8000 task.ti: ffff8800d21b8000
        RIP: 0010:[<ffffffff817a993d>]  [<ffffffff817a993d>] tcf_action_exec+0x46/0x90
        RSP: 0018:ffff8800d21b9b90  EFLAGS: 00010283
        RAX: 00000000ffffffff RBX: ffff88011634b8e8 RCX: ffff8800cf7133d8
        RDX: ffff88011634b900 RSI: ffff8800cf7133e0 RDI: ffff8800d210f840
        RBP: ffff8800d21b9bb0 R08: ffffffff8287bf60 R09: 0000000000000001
        R10: ffff8800d2b22b24 R11: 0000000000000001 R12: ffff8800d210f840
        R13: ffff8800d21b9c50 R14: ffff8800cf7133e0 R15: ffff8800cad433d8
        FS:  00007f49723e1840(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffff88011634b8f0 CR3: 00000000ce469000 CR4: 00000000000006e0
        Stack:
         ffff8800d2170188 ffff8800d210f840 ffff8800d2171b90 0000000000000000
         ffff8800d21b9be8 ffffffff817c55bb ffff8800d21b9c50 ffff8800d2171b90
         ffff8800d210f840 ffff8800d21b0300 ffff8800d21b9c50 ffff8800d21b9c18
        Call Trace:
         [<ffffffff817c55bb>] tcindex_classify+0x88/0x9b
         [<ffffffff817a7f7d>] tc_classify_compat+0x3e/0x7b
         [<ffffffff817a7fdf>] tc_classify+0x25/0x9f
         [<ffffffff817b0e68>] htb_enqueue+0x55/0x27a
         [<ffffffff817b6c2e>] dsmark_enqueue+0x165/0x1a4
         [<ffffffff81775642>] __dev_queue_xmit+0x35e/0x536
         [<ffffffff8177582a>] dev_queue_xmit+0x10/0x12
         [<ffffffff818f8ecd>] packet_sendmsg+0xb26/0xb9a
         [<ffffffff810b1507>] ? __lock_acquire+0x3ae/0xdf3
         [<ffffffff8175cf08>] __sock_sendmsg_nosec+0x25/0x27
         [<ffffffff8175d916>] sock_aio_write+0xd0/0xe7
         [<ffffffff8117d6b8>] do_sync_write+0x59/0x78
         [<ffffffff8117d84d>] vfs_write+0xb5/0x10a
         [<ffffffff8117d96a>] SyS_write+0x49/0x7f
         [<ffffffff8198e212>] system_call_fastpath+0x16/0x1b

This is because we memcpy struct tcindex_filter_result which contains
struct tcf_exts, obviously struct list_head can not be simply copied.
This is a regression introduced by commit 33be627159
(net_sched: act: use standard struct list_head).

It's not very easy to fix it as the code is a mess:

       if (old_r)
               memcpy(&cr, r, sizeof(cr));
       else {
               memset(&cr, 0, sizeof(cr));
               tcf_exts_init(&cr.exts, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE);
       }
       ...
       tcf_exts_change(tp, &cr.exts, &e);
       ...
       memcpy(r, &cr, sizeof(cr));

the above code should equal to:

        tcindex_filter_result_init(&cr);
        if (old_r)
               cr.res = r->res;
        ...
        if (old_r)
               tcf_exts_change(tp, &r->exts, &e);
        else
               tcf_exts_change(tp, &cr.exts, &e);
        ...
        r->res = cr.res;

after this change, since there is no need to copy struct tcf_exts.

And it also fixes other places zero'ing struct's contains struct tcf_exts.

Fixes: commit 33be627159 (net_sched: act: use standard struct list_head)
Reported-by: Kelly Anderson <kelly@xilka.com>
Tested-by: Kelly Anderson <kelly@xilka.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21 16:47:13 -04:00
Li RongQing
1495664355 ipv6: slight optimization in ip6_dst_gc
entries is always greater than rt_max_size here, since if entries is less
than rt_max_size, the fib6_run_gc function will be skipped

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21 15:52:23 -04:00
Tom Gundersen
f98f89a010 net: tunnels - enable module autoloading
Enable the module alias hookup to allow tunnel modules to be autoloaded on demand.

This is in line with how most other netdev kinds work, and will allow userspace
to create tunnels without having CAP_SYS_MODULE.

Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21 15:46:52 -04:00
Arik Nemtsov
4d3df547e8 cfg80211: don't set reg timeout for user-handled hint
Otherwise every "indoor" setting by usermode will cause a regdomain reset.

Acked-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-21 09:15:18 +02:00
Antonio Quartulli
7406353d43 cfg80211: implement cfg80211_get_station cfg80211 API
Implement and export the new cfg80211_get_station() API.
This utility can be used by other kernel modules to obtain
detailed information about a given wireless station.

It will be in particular useful to batman-adv which will
implement a wireless rate based metric.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-21 09:15:17 +02:00
Antonio Quartulli
cca674d47e mac80211: export the expected throughput
Add get_expected_throughput() API to mac80211 so that each
driver can implement its own version based on the RC
algorithm they are using (might be using an HW RC algo).
The API returns a value expressed in Kbps.

Also, add the new get_expected_throughput() member
to the rate_control_ops structure in order to be
able to query the RC algorithm (this patch provides an
implementation of this API for both minstrel and
minstrel_ht).

The related member in the station_info object is now
filled accordingly when dumping a station.

Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-21 09:15:16 +02:00
Steffen Klassert
78ff4be45a ip_tunnel: Initialize the fallback device properly
We need to initialize the fallback device to have a correct mtu
set on this device. Otherwise the mtu is set to null and the device
is unusable.

Fixes: fd58156e45 ("IPIP: Use ip-tunneling code.")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21 02:08:32 -04:00
David S. Miller
d050de607f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/nftables fixes for net

The following patchset contains nftables fixes for your net tree, they
are:

1) Fix crash when using the goto action in a rule by making sure that
   we always fall back on the base chain. Otherwise, this may try to
   access the counter memory area of non-base chains, which does not
   exists.

2) Fix several aspects of the rule tracing that are currently broken:

   * Reset rule number counter after goto/jump action, otherwise the
     tracing reports a bogus rule number.
   * Fix tracing of the goto action.
   * Fix bogus rule number counter after goto.
   * Fix missing return trace after finishing the walk through the
     non-base chain.
   * Fix missing trace when matching non-terminal rule.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21 01:24:19 -04:00
Antonio Quartulli
867d849fc8 cfg80211: export expected throughput through get_station()
Users may need information about the expected throughput
towards a given peer.
This value is supposed to consider the size overhead
generated by the 802.11 header.

This value is exported in kbps through the get_station() API
by including it into the station_info object.
Moreover, it is sent to user space when replying to the
nl80211 GET_STATION command.

This information will be useful to the batman-adv module
which will use it for its new metric computation.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-20 15:13:32 +02:00
Hiren Tandel
0515829642 NFC: NCI: Send all NCI frames to raw sockets
So that anyone listening on SOCKPROTO_RAW for raw frames will get all
NCI frames, in both directions. This actually implements userspace NFC
NCI sniffing.
It's now up to userspace to decode those frames.

Signed-off-by: Hiren Tandel <hirent@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-05-20 00:23:59 +02:00
Hiren Tandel
57be1f3f3e NFC: Add RAW socket type support for SOCKPROTO_RAW
This allows for a more generic NFC sniffing by using SOCKPROTO_RAW
SOCK_RAW to read RAW NFC frames. This is for sniffing anything but LLCP
(HCI, NCI, etc...).

Signed-off-by: Hiren Tandel <hirent@marvell.com>
Signed-off-by: Rahul Tank <rahult@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-05-20 00:06:04 +02:00
Hiren Tandel
c79d9f9ef8 NFC: NCI: No need to reverse ATR_RES Response
ATR_RES response received within Activation Parameters is already
in correct order. Reversing it fails LLCP magic number check and
so P2P functionality fails.

Signed-off-by: Hiren Tandel <hirent@marvell.com>
Signed-off-by: Rahul Tank <rahult@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-05-19 23:58:08 +02:00
Mark A. Greer
4b8b6267be NFC: digital: Handle multiple SENSF_REQ frames
According to section 5.15.1.3 of the NFC Activity
Specification, multiple SENSF_REQ commands can be
received by a target before it receives an ATR_REQ
command.  To handle this, add a routine that checks
whether a SENSF_REQ or ATR_REQ has been recieved.
If its a SENSF_REQ, respond appropriately and
continue waiting for a ATR_REQ.  If its an ATR_REQ,
handle it as before.

CC: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-05-19 23:52:40 +02:00
Mark A. Greer
96e829b433 NFC: digital: SENSF_RES excludes RD when SENSF_REQ RC is zero
The check in digital_tg_send_sensf_res() that excludes
the 'RD' field from the SENSF_RES is inverted.  The 'RD'
field should be excluded when the SENSF_REQ 'RC' field
is equal to DIGITAL_SENSF_REQ_RC_NONE instead of when
its not equal.  This is described in section 6.6.2.11
of the NFC Digital Specification.

CC: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-05-19 23:52:37 +02:00
John W. Linville
20b4f9c73f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2014-05-19 16:34:27 -04:00
Johannes Berg
922bd80fc3 cfg80211: constify wowlan/coalesce mask/pattern pointers
This requires changing the nl80211 parsing code a bit to use
intermediate pointers for the allocation, but clarifies the
API towards the drivers.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 18:06:50 +02:00
Johannes Berg
c1e5f4714d cfg80211: constify more pointers in the cfg80211 API
This also propagates through the drivers.

The orinoco driver uses the cfg80211 API structs for internal
bookkeeping, and so needs a (void *) cast that removes the
const - but that's OK because it allocates those pointers.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 17:53:16 +02:00
Johannes Berg
3b3a0162fa cfg80211: constify MAC addresses in cfg80211 ops
This propagates through all the drivers and mac80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 17:34:42 +02:00
Johannes Berg
00591cea31 mac80211: minstrel-ht: small clarifications
Antonio and I were looking over this code and some things
didn't immediately make sense, so we came up with two small
clarifications.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 14:30:37 +02:00
Pablo Neira Ayuso
c7c32e72cb netfilter: nf_tables: defer all object release via rcu
Now that all objects are released in the reverse order via the
transaction infrastructure, we can enqueue the release via
call_rcu to save one synchronize_rcu. For small rule-sets loaded
via nft -f, it now takes around 50ms less here.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso
128ad3322b netfilter: nf_tables: remove skb and nlh from context structure
Instead of caching the original skbuff that contains the netlink
messages, this stores the netlink message sequence number, the
netlink portID and the report flag. This helps to prepare the
introduction of the object release via call_rcu.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso
35151d840c netfilter: nf_tables: simplify nf_tables_*_notify
Now that all these function are called from the commit path, we can
pass the context structure to reduce the amount of parameters in all
of the nf_tables_*_notify functions. This patch also removes unneeded
branches to check for skb, nlh and net that should be always set in
the context structure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso
60319eb1ca netfilter: nf_tables: use new transaction infrastructure to handle elements
Leave the set content in consistent state if we fail to load the
batch. Use the new generic transaction infrastructure to achieve
this.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso
55dd6f9307 netfilter: nf_tables: use new transaction infrastructure to handle table
This patch speeds up rule-set updates and it also provides a way
to revert updates and leave things in consistent state in case that
the batch needs to be aborted.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso
e1aaca93ee netfilter: nf_tables: pass context to nf_tables_updtable()
So nf_tables_uptable() only takes one single parameter.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
f75edf5e9c netfilter: nf_tables: disabling table hooks always succeeds
nf_tables_table_disable() always succeeds, make this function void.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
91c7b38dc9 netfilter: nf_tables: use new transaction infrastructure to handle chain
This patch speeds up rule-set updates and it also introduces a way to
revert chain updates if the batch is aborted. The idea is to store the
changes in the transaction to apply that in the commit step.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
ff3cd7b3c9 netfilter: nf_tables: refactor chain statistic routines
Add new routines to encapsulate chain statistics allocation and
replacement.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
958bee14d0 netfilter: nf_tables: use new transaction infrastructure to handle sets
This patch reworks the nf_tables API so set updates are included in
the same batch that contains rule updates. This speeds up rule-set
updates since we skip a dialog of four messages between kernel and
user-space (two on each direction), from:

 1) create the set and send netlink message to the kernel
 2) process the response from the kernel that contains the allocated name.
 3) add the set elements and send netlink message to the kernel.
 4) process the response from the kernel (to check for errors).

To:

 1) add the set to the batch.
 2) add the set elements to the batch.
 3) add the rule that points to the set.
 4) send batch to the kernel.

This also introduces an internal set ID (NFTA_SET_ID) that is unique
in the batch so set elements and rules can refer to new sets.

Backward compatibility has been only retained in userspace, this
means that new nft versions can talk to the kernel both in the new
and the old fashion.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
b380e5c733 netfilter: nf_tables: add message type to transactions
The patch adds message type to the transaction to simplify the
commit the and abort routines. Yet another step forward in the
generalisation of the transaction infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
37082f930b netfilter: nf_tables: relocate commit and abort routines in the source file
Move the commit and abort routines to the bottom of the source code
file. This change is required by the follow up patches that add the
set, chain and table transaction support.

This patch is just a cleanup to access several functions without
having to declare their prototypes.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
1081d11b08 netfilter: nf_tables: generalise transaction infrastructure
This patch generalises the existing rule transaction infrastructure
so it can be used to handle set, table and chain object transactions
as well. The transaction provides a data area that stores private
information depending on the transaction type.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
7c95f6d866 netfilter: nf_tables: deconstify table and chain in context structure
The new transaction infrastructure updates the family, table and chain
objects in the context structure, so let's deconstify them. While at it,
move the context structure initialization routine to the top of the
source file as it will be also used from the table and chain routines.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:09 +02:00
Oliver Hartkopp
45c700291a can: add hash based access to single EFF frame filters
In contrast to the direct access to the single SFF frame filters (which are
indexed by the SFF CAN ID itself) the single EFF frame filters are arranged
in a single linked hlist. To reduce the hlist traversal in the case of many
filter subscriptions a hash based access is introduced for single EFF filters.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-05-19 09:38:24 +02:00
Oliver Hartkopp
e3d3917f3d can: proc: make array printing function indenpendent from sff frames
The can_rcvlist_sff_proc_show_one() function which prints the array of filters
for the single SFF CAN identifiers is prepared to be used by a second caller.
Therefore it is also renamed to properly describe its future functionality.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-05-19 09:38:24 +02:00
David S. Miller
b6052af61a Included changes:
- fix codestyle to respect new checkpatch warnings
 - increase internal version number
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTeLMvAAoJEEKTMo6mOh1VKtgP/RuR34USuUbY/xMZ9/Rn2/E7
 z1qn6hh8hlw+Hd+Vn+9BvDJzwn+Baneu1c3SMP08kE+pAst0n788y/f/pVzfToJk
 Gll0sOVHiSm05M0QQ0Vq57H+rxoFv2KACM1t2+NMW+pB+PsSYG5y87b6I+0hR4Pv
 lbBCNmgIxY2alxM8qab2Zlt+cCUdkKUnI67P0LtVnMh91JuKwsheOdR+Smxz2+2g
 J+2Bzcz+NIHhJP9c+QmJipV+gtIRjFr7+bebaXDm/eEBq/3f6cEhFtwa76CmCpI/
 cAIMDFORCHB27qNMgKSuzFDdhF1qQJnZh8FX0dfRBXvH8NwxBOkjFh1CBJ3iwjm1
 T7GBTLTKiv/JqdNjqrWJ9OxChl8I2jppevZdimq1VUjhv9117Jc73TnzazjULTST
 xr5PpZ1gRfruUVXl362otrtzm0N/hdqez+mYlkZEx/ERTDedLCZZAnjTsx5PPMG+
 GXlbc1BWuQZuHpvs8uWMcnXDaWtNyNKKpvfRPuvLIST80F1Bw/KRd2FDH/AiO2tL
 2eACn9ughC5XO9E+/iyfWm1MQMEwo/w9+EfWpnRWV9HtDuHepVGy59x3mCYH/bN0
 7FP23lbaFw05i/UpsRRneqkzMJLk/16qLCiNoC8u2hEiqKzu0/celPwl7B16Fs4Z
 CU65LSN/QNU9q+AXVQOd
 =tdAQ
 -----END PGP SIGNATURE-----

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- fix codestyle to respect new checkpatch warnings
- increase internal version number
2014-05-18 21:27:09 -04:00
Manuel Schölling
71fd762f2e net: rds: Use time_after() for time comparison
To be future-proof and for better readability the time comparisons are modified
to use time_after() instead of raw math.

Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-18 21:24:52 -04:00
stephen hemminger
614d056c8e ipv4: minor spelling fix
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-18 21:10:29 -04:00
stephen hemminger
025559eec8 bridge: fix spelling of promiscuous
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-18 21:10:08 -04:00
Alexei Starovoitov
d4f0e0958d net: bridge: fix build
fix build when BRIDGE_VLAN_FILTERING is not set

Fixes: 2796d0c648 ("bridge: Automatically manage port promiscuous mode")

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-18 20:09:50 -04:00
Simon Wunderlich
871d3d9fdf batman-adv: Start new development cycle
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2014-05-18 15:04:00 +02:00
Antonio Quartulli
2b64df2058 batman-adv: remove semi-colon after macro definition
Reported by checkpatch with the following warning:
"WARNING: macros should not use a trailing semicolon"

Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-05-18 15:04:00 +02:00
Antonio Quartulli
f138694b15 batman-adv: add blank line between declarations and the rest of the code
Reported by checkpatch with the following message:
"WARNING: Missing a blank line after declarations"

Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-05-18 15:03:52 +02:00
Vlad Yasevich
44a4085538 bonding: Fix stacked device detection in arp monitoring
Prior to commit fbd929f2dc
	bonding: support QinQ for bond arp interval

the arp monitoring code allowed for proper detection of devices
stacked on top of vlans.  Since the above commit, the
code can still detect a device stacked on top of single
vlan, but not a device stacked on top of Q-in-Q configuration.
The search will only set the inner vlan tag if the route
device is the vlan device.  However, this is not always the
case, as it is possible to extend the stacked configuration.

With this patch it is possible to provision devices on
top Q-in-Q vlan configuration that should be used as
a source of ARP monitoring information.

For example:
ip link add link bond0 vlan10 type vlan proto 802.1q id 10
ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
ip link add link vlan100 type macvlan

Note:  This patch limites the number of stacked VLANs to 2,
just like before.  The original, however had another issue
in that if we had more then 2 levels of VLANs, we would end
up generating incorrectly tagged traffic.  This is no longer
possible.

Fixes: fbd929f2dc (bonding: support QinQ for bond arp interval)
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@redhat.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: Patric McHardy <kaber@trash.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 22:29:05 -04:00
Vlad Yasevich
d38569ab2b vlan: Fix lockdep warning with stacked vlan devices.
This reverts commit dc8eaaa006.
	vlan: Fix lockdep warning when vlan dev handle notification

Instead we use the new new API to find the lock subclass of
our vlan device.  This way we can support configurations where
vlans are interspersed with other devices:
  bond -> vlan -> macvlan -> vlan

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 22:14:49 -04:00
Vlad Yasevich
4085ebe8c3 net: Find the nesting level of a given device by type.
Multiple devices in the kernel can be stacked/nested and they
need to know their nesting level for the purposes of lockdep.
This patch provides a generic function that determines a nesting
level of a particular device by its type (ex: vlan, macvlan, etc).
We only care about nesting of the same type of devices.

For example:
  eth0 <- vlan0.10 <- macvlan0 <- vlan1.20

The nesting level of vlan1.20 would be 1, since there is another vlan
in the stack under it.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 22:14:49 -04:00
Thomas Graf
97dc48e220 pktgen: Use seq_puts() where seq_printf() is not needed
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:30:30 -04:00
Eric Dumazet
29e9824278 net: gro: make sure skb->cb[] initial content has not to be zero
Starting from linux-3.13, GRO attempts to build full size skbs.

Problem is the commit assumed one particular field in skb->cb[]
was clean, but it is not the case on some stacked devices.

Timo reported a crash in case traffic is decrypted before
reaching a GRE device.

Fix this by initializing NAPI_GRO_CB(skb)->last at the right place,
this also removes one conditional.

Thanks a lot to Timo for providing full reports and bisecting this.

Fixes: 8a29111c7c ("net: gro: allow to build full sized skb")
Bisected-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:24:54 -04:00
Phoebe Buckheister
f0f77dc6be ieee802154, mac802154: implement devkey record option
The 802.15.4-2011 standard states that for each key, a list of devices
that use this key shall be kept. Previous patches have only considered
two options:

 * a device "uses" (or may use) all keys, rendering the list useless
 * a device is restricted to a certain set of keys

Another option would be that a device *may* use all keys, but need not
do so, and we are interested in the actual set of keys the device uses.
Recording keys used by any given device may have a noticable performance
impact and might not be needed as often. The common case, in which a
device will not switch keys too often, should still perform well.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:42 -04:00
Phoebe Buckheister
3e9c156e2c ieee802154: add netlink interfaces for llsec
This patch adds user-visible interfaces for the llsec infrastructure.
For the added methods, the only major difference between all add/remove
implementation lies in how the specific object is parsed, and for dump
requests, how objects are written into netlink messages.

To save on boilerplate code, table dumps are routed through a helper
function that handles netlink dump state, leaving the actual dumping
code to care only about iterating over the table to be dumped and
filling netlink messages. For add/remove methods, the boilerplate
required to work is not quite as large, but still enough to also move
into a local helper.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
9b0bb4a83f mac802154: propagate device address changes to llsec
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
29e023746a mac802154: add llsec configuration functions
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
af9eed5bbf ieee802154: add dgram sockopts for security control
Allow datagram sockets to override the security settings of the device
they send from on a per-socket basis. Requires CAP_NET_ADMIN or
CAP_NET_RAW, since raw sockets can send arbitrary packets anyway.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
f30be4d53c mac802154: integrate llsec with wpan devices
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
4c14a2fb5d mac802154: add llsec decryption method
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
03556e4d0d mac802154: add llsec encryption method
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:40 -04:00
Phoebe Buckheister
5d637d5aab mac802154: add llsec structures and mutators
This patch adds containers and mutators for the major ieee802154_llsec
structures to mac802154. Most of the (rather simple) ieee802154_llsec
structs are wrapped only to provide an rcu_head for orderly disposal,
but some structs - llsec keys notably - require more complex
bookkeeping.

Since each llsec key may be referenced by a number of llsec key table
entries (with differing key ids, but the same actual key), we want to
save memory and not allocate crypto transforms for each entry in the
table. Thus, the mac802154 llsec key is reference-counted instead.
Further, each key will have four associated crypto transforms - three
CCM transforms for the authsizes 4/8/16 and one CTR transform for
unauthenticated encryption. If we had a CCM* transform that allowed
authsize 0, and authsize as part of requests instead of transforms, this
would not be necessary.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:40 -04:00
Phoebe Buckheister
87de726c9b mac802154: update Kconfig
Link-layer security requires AES CCM for authenticated modes and AES CTR
for the unauthenticated encryption mode.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:40 -04:00
David S. Miller
e54740e6d7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Jesse Gross says:

====================
A set of OVS changes for net-next/3.16.

The major change here is a switch from per-CPU to per-NUMA flow
statistics. This improves scalability by reducing kernel overhead
in flow setup and maintenance.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:21:51 -04:00
Vlad Yasevich
2796d0c648 bridge: Automatically manage port promiscuous mode.
There exist configurations where the administrator or another management
entity has the foreknowledge of all the mac addresses of end systems
that are being bridged together.

In these environments, the administrator can statically configure known
addresses in the bridge FDB and disable flooding and learning on ports.
This makes it possible to turn off promiscuous mode on the interfaces
connected to the bridge.

Here is why disabling flooding and learning allows us to control
promiscuity:
 Consider port X.  All traffic coming into this port from outside the
bridge (ingress) will be either forwarded through other ports of the
bridge (egress) or dropped.  Forwarding (egress) is defined by FDB
entries and by flooding in the event that no FDB entry exists.
In the event that flooding is disabled, only FDB entries define
the egress.  Once learning is disabled, only static FDB entries
provided by a management entity define the egress.  If we provide
information from these static FDBs to the ingress port X, then we'll
be able to accept all traffic that can be successfully forwarded and
drop all the other traffic sooner without spending CPU cycles to
process it.
 Another way to define the above is as following equations:
    ingress = egress + drop
 expanding egress
    ingress = static FDB + learned FDB + flooding + drop
 disabling flooding and learning we a left with
    ingress = static FDB + drop

By adding addresses from the static FDB entries to the MAC address
filter of an ingress port X, we fully define what the bridge can
process without dropping and can thus turn off promiscuous mode,
thus dropping packets sooner.

There have been suggestions that we may want to allow learning
and update the filters with learned addresses as well.  This
would require mac-level authentication similar to 802.1x to
prevent attacks against the hw filters as they are limited
resource.

Additionally, if the user places the bridge device in promiscuous mode,
all ports are placed in promiscuous mode regardless of the changes
to flooding and learning.

Since the above functionality depends on full static configuration,
we have also require that vlan filtering be enabled to take
advantage of this.  The reason is that the bridge has to be
able to receive and process VLAN-tagged frames and the there
are only 2 ways to accomplish this right now: promiscuous mode
or vlan filtering.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Vlad Yasevich
145beee8d6 bridge: Add addresses from static fdbs to non-promisc ports
When a static fdb entry is created, add the mac address
from this fdb entry to any ports that are currently running
in non-promiscuous mode.  These ports need this data so that
they can receive traffic destined to these addresses.
By default ports start in promiscuous mode, so this feature
is disabled.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Vlad Yasevich
f3a6ddf152 bridge: Introduce BR_PROMISC flag
Introduce a BR_PROMISC per-port flag that will help us track if the
current port is supposed to be in promiscuous mode or not.  For now,
always start in promiscuous mode.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Vlad Yasevich
8db24af71b bridge: Add functionality to sync static fdb entries to hw
Add code that allows static fdb entires to be synced to the
hw list for a specified port.  This will be used later to
program ports that can function in non-promiscuous mode.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Vlad Yasevich
e028e4b8dc bridge: Keep track of ports capable of automatic discovery.
By default, ports on the bridge are capable of automatic
discovery of nodes located behind the port.  This is accomplished
via flooding of unknown traffic (BR_FLOOD) and learning the
mac addresses from these packets (BR_LEARNING).
If the above functionality is disabled by turning off these
flags, the port requires static configuration in the form
of static FDB entries to function properly.

This patch adds functionality to keep track of all ports
capable of automatic discovery.  This will later be used
to control promiscuity settings.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Vlad Yasevich
63c3a622dd bridge: Turn flag change macro into a function.
Turn the flag change macro into a function to allow
easier updates and to reduce space.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:32 -04:00
Timo Teräs
22fb22eaeb ipv4: ip_tunnels: disable cache for nbma gre tunnels
The connected check fails to check for ip_gre nbma mode tunnels
properly. ip_gre creates temporary tnl_params with daddr specified
to pass-in the actual target on per-packet basis from neighbor
layer. Detect these tunnels by inspecting the actual tunnel
configuration.

Minimal test case:
 ip route add 192.168.1.1/32 via 10.0.0.1
 ip route add 192.168.1.2/32 via 10.0.0.2
 ip tunnel add nbma0 mode gre key 1 tos c0
 ip addr add 172.17.0.0/16 dev nbma0
 ip link set nbma0 up
 ip neigh add 172.17.0.1 lladdr 192.168.1.1 dev nbma0
 ip neigh add 172.17.0.2 lladdr 192.168.1.2 dev nbma0
 ping 172.17.0.1
 ping 172.17.0.2

The second ping should be going to 192.168.1.2 and head 10.0.0.2;
but cached gre tunnel level route is used and it's actually going
to 192.168.1.1 via 10.0.0.1.

The lladdr's need to go to separate dst for the bug to trigger.
Test case uses separate route entries, but this can also happen
when the route entry is same: if there is a nexthop exception or
the GRE tunnel is IPsec'ed in which case the dst points to xfrm
bundle unique to the gre lladdr.

Fixes: 7d442fab0a ("ipv4: Cache dst in tunnels")
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Cc: Tom Herbert <therbert@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 16:58:41 -04:00
Duan Jiong
ee30ef4d45 ip_tunnel: don't add tunnel twice
When using command "ip tunnel add" to add a tunnel, the tunnel will be added twice,
through ip_tunnel_create() and ip_tunnel_update().

Because the second is unnecessary, so we can just break after adding tunnel
through ip_tunnel_create().

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 16:57:44 -04:00
Fabian Godehardt
d1c0b471b3 net/dsa/dsa.c: increment chip_index during of_node handling on dsa_of_probe()
Adding more than one chip on device-tree currently causes the probing
routine to always use the first chips data pointer.

Signed-off-by: Fabian Godehardt <fg@emlix.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 16:56:33 -04:00
Lorenzo Colitti
2e47b29195 net: ipv6: make "ip -6 route get mark xyz" work.
Currently, "ip -6 route get mark xyz" ignores the mark passed in
by userspace. Make it honour the mark, just like IPv4 does.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 16:50:30 -04:00
Monam Agarwal
944df8ae84 net/openvswitch: Use with RCU_INIT_POINTER(x, NULL) in vport-gre.c
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL)

The rcu_assign_pointer() ensures that the initialization of a structure
is carried out before storing a pointer to that structure.
And in the case of the NULL pointer, there is no structure to initialize.
So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL)

Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:29 -07:00
Jarno Rajahalme
88d73f6c41 openvswitch: Use TCP flags in the flow key for stats.
We already extract the TCP flags for the key, might as well use that
for stats.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:29 -07:00
Jarno Rajahalme
d92ab13558 openvswitch: Fix output of SCTP mask.
The 'output' argument of the ovs_nla_put_flow() is the one from which
the bits are written to the netlink attributes.  For SCTP we
accidentally used the bits from the 'swkey' instead.  This caused the
mask attributes to include the bits from the actual flow key instead
of the mask.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:29 -07:00
Jarno Rajahalme
63e7959c4b openvswitch: Per NUMA node flow stats.
Keep kernel flow stats for each NUMA node rather than each (logical)
CPU.  This avoids using the per-CPU allocator and removes most of the
kernel-side OVS locking overhead otherwise on the top of perf reports
and allows OVS to scale better with higher number of threads.

With 9 handlers and 4 revalidators netperf TCP_CRR test flow setup
rate doubles on a server with two hyper-threaded physical CPUs (16
logical cores each) compared to the current OVS master.  Tested with
non-trivial flow table with a TCP port match rule forcing all new
connections with unique port numbers to OVS userspace.  The IP
addresses are still wildcarded, so the kernel flows are not considered
as exact match 5-tuple flows.  This type of flows can be expected to
appear in large numbers as the result of more effective wildcarding
made possible by improvements in OVS userspace flow classifier.

Perf results for this test (master):

Events: 305K cycles
+   8.43%     ovs-vswitchd  [kernel.kallsyms]   [k] mutex_spin_on_owner
+   5.64%     ovs-vswitchd  [kernel.kallsyms]   [k] __ticket_spin_lock
+   4.75%     ovs-vswitchd  ovs-vswitchd        [.] find_match_wc
+   3.32%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_lock
+   2.61%     ovs-vswitchd  [kernel.kallsyms]   [k] pcpu_alloc_area
+   2.19%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask_range
+   2.03%          swapper  [kernel.kallsyms]   [k] intel_idle
+   1.84%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_unlock
+   1.64%     ovs-vswitchd  ovs-vswitchd        [.] classifier_lookup
+   1.58%     ovs-vswitchd  libc-2.15.so        [.] 0x7f4e6
+   1.07%     ovs-vswitchd  [kernel.kallsyms]   [k] memset
+   1.03%          netperf  [kernel.kallsyms]   [k] __ticket_spin_lock
+   0.92%          swapper  [kernel.kallsyms]   [k] __ticket_spin_lock
...

And after this patch:

Events: 356K cycles
+   6.85%     ovs-vswitchd  ovs-vswitchd        [.] find_match_wc
+   4.63%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_lock
+   3.06%     ovs-vswitchd  [kernel.kallsyms]   [k] __ticket_spin_lock
+   2.81%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask_range
+   2.51%     ovs-vswitchd  libpthread-2.15.so  [.] pthread_mutex_unlock
+   2.27%     ovs-vswitchd  ovs-vswitchd        [.] classifier_lookup
+   1.84%     ovs-vswitchd  libc-2.15.so        [.] 0x15d30f
+   1.74%     ovs-vswitchd  [kernel.kallsyms]   [k] mutex_spin_on_owner
+   1.47%          swapper  [kernel.kallsyms]   [k] intel_idle
+   1.34%     ovs-vswitchd  ovs-vswitchd        [.] flow_hash_in_minimask
+   1.33%     ovs-vswitchd  ovs-vswitchd        [.] rule_actions_unref
+   1.16%     ovs-vswitchd  ovs-vswitchd        [.] hindex_node_with_hash
+   1.16%     ovs-vswitchd  ovs-vswitchd        [.] do_xlate_actions
+   1.09%     ovs-vswitchd  ovs-vswitchd        [.] ofproto_rule_ref
+   1.01%          netperf  [kernel.kallsyms]   [k] __ticket_spin_lock
...

There is a small increase in kernel spinlock overhead due to the same
spinlock being shared between multiple cores of the same physical CPU,
but that is barely visible in the netperf TCP_CRR test performance
(maybe ~1% performance drop, hard to tell exactly due to variance in
the test results), when testing for kernel module throughput (with no
userspace activity, handful of kernel flows).

On flow setup, a single stats instance is allocated (for the NUMA node
0).  As CPUs from multiple NUMA nodes start updating stats, new
NUMA-node specific stats instances are allocated.  This allocation on
the packet processing code path is made to never block or look for
emergency memory pools, minimizing the allocation latency.  If the
allocation fails, the existing preallocated stats instance is used.
Also, if only CPUs from one NUMA-node are updating the preallocated
stats instance, no additional stats instances are allocated.  This
eliminates the need to pre-allocate stats instances that will not be
used, also relieving the stats reader from the burden of reading stats
that are never used.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:29 -07:00
Jarno Rajahalme
23dabf88ab openvswitch: Remove 5-tuple optimization.
The 5-tuple optimization becomes unnecessary with a later per-NUMA
node stats patch.  Remove it first to make the changes easier to
grasp.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:29 -07:00
Joe Perches
8c63ff09bd openvswitch: Use ether_addr_copy
It's slightly smaller/faster for some architectures.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:29 -07:00
Joe Perches
2235ad1c3a openvswitch: flow_netlink: Use pr_fmt to OVS_NLERR output
Add "openvswitch: " prefix to OVS_NLERR output
to match the other OVS_NLERR output of datapath.c

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:28 -07:00
Joe Perches
1815a8831f openvswitch: Use net_ratelimit in OVS_NLERR
Each use of pr_<level>_once has a per-site flag.

Some of the OVS_NLERR messages look as if seeing them
multiple times could be useful, so use net_ratelimit()
instead of pr_info_once.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:28 -07:00
Daniele Di Proietto
cc23ebf3bb openvswitch: Added (unsigned long long) cast in printf
This is necessary, since u64 is not unsigned long long
in all architectures: u64 could be also uint64_t.

Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:28 -07:00
Daniele Di Proietto
07dc0602c5 openvswitch: avoid cast-qual warning in vport_priv
This function must cast a const value to a non const value.
By adding an uintptr_t cast the warning is suppressed.
To avoid the cast (proper solution) several function signatures
must be changed.

Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:28 -07:00
Daniele Di Proietto
d0b4da1375 openvswitch: avoid warnings in vport_from_priv
This change, firstly, avoids declaring the formal parameter const,
since it is treated as non const. (to avoid -Wcast-qual)
Secondly, it cast the pointer from void* to u8*, since it is used
in arithmetic (to avoid -Wpointer-arith)

Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:28 -07:00
Daniele Di Proietto
7085130bab openvswitch: use const in some local vars and casts
In few functions, const formal parameters are assigned or cast to
non-const.
These changes suppress warnings if compiled with -Wcast-qual.

Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-05-16 13:40:28 -07:00
David S. Miller
2f67cc87d6 Include changes:
- fix NULL dereference in batadv_orig_hardif_seq_print_text()
 - fix reference counting imbalance when using fragmentation
 - avoid access to orig_node objects after they have been free'd
 - fix local TT check for outgoing arp requests in DAT
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTdQmwAAoJEEKTMo6mOh1VGLAP/A1nHDzPMUOcDttR49Cs38w0
 oD2Ox66xSJh2Yn8qRg9k7CshG65pU70J77bQjkPvMtTlwsgwgLFcHP1b/RJQl7Cz
 aFSJY5tKvLL41TwqxLSAmvUyPMfvagXvxH65bLBIQ9+dLNkDiHNH/IjdnYKWHYi9
 0tqUi7/pLaCfWXMkDVeWn0P2M8baDyU1HUTuRX3ctE4l9PKF9ZVgxsxaPrhTYlXY
 J61KT+VXs19rdAnYQlFiaDk64Q6meMjuNjxuLkViTmqKi6pSDGi9skeKWZXaKOjT
 UmLLygVyf9Sh36TWDKinSV09r/s+TeU35o6bCgrmshZebSmFEUkEDA7oxNJ5JW+Q
 Lh2Y2SrX/+F0+9yhxhDd0fHP3PAwt2XNKjIQjurE85Gw84ZoMyBsVIpF8LD3IS+I
 T5CSAB0fEyeS0ZFyChbgWSLZzFjcowRHwK1iO8SJC5LHRtYerEqnvgP/V3ej0dt9
 A4nq8eO8N9AorQc1G9qMosLNLheMCmFenU2nb8MbC5yDvq2X9jxsmgYm0fvr/y47
 f667bowPr0afhsLvTqy6ezYma9EV40F8jW2/OovyBRUuytavJ4xcbCz/FUlWfNRU
 xx68e15t49iOFJynGXt62LJnEmBzRaE2uUagZaMNms18gmsL10y5pECAmi9zhQWK
 smkfqmsVWU8nB9UsDIT7
 =+DKS
 -----END PGP SIGNATURE-----

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Include changes:
- fix NULL dereference in batadv_orig_hardif_seq_print_text()
- fix reference counting imbalance when using fragmentation
- avoid access to orig_node objects after they have been free'd
- fix local TT check for outgoing arp requests in DAT
2014-05-16 16:28:53 -04:00
Kirill Tkhai
31ff6aa5c8 net: unix: Align send data_len up to PAGE_SIZE
Using whole of allocated pages reduces requested skb->data size.
This is just a little more thriftily allocation.

netperf does not show difference with the current performance.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 16:04:03 -04:00
David S. Miller
202630b445 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
pull request: wireless 2014-05-15

Please pull this batch of fixes for the 3.15 stream...

For the mac80211 bits, Johannes says:

"One fix is to get better VHT performance and the other fixes tracing
garbage or other potential issues with the interface name tracing."

And...

"This has a fix from Emmanuel for a problem I failed to fix - when
association is in progress then it needs to be cancelled while
suspending (I had fixed the same for authentication). Also included a
fix from myself for a userspace API problem that hit the iw tool and a
fix to the remain-on-channel framework."

For the iwlwifi bits, Emmanuel says:

"Alex fixes the scan by disabling the fragmented scan. David prevents
scan offload while associated, the firmware seems not to like it. I
fix a stupid bug I made in BT Coex, and fix a bad #ifdef clause in rate
scaling.  Along with that there is a fix for a NULL pointer exception
that can happen if we load the driver and our ISR gets called because
the interrupt line is shared. The fix has been tested by the reporter."

And...

"We have here a fix from David Spinadel that makes a previous fix more
complete, and an off-by-one issue fixed by Eliad in the same area.
I fix the monitor that broke on the way."

Beyond that...

Daniel Kim's one-liner fixes a brcmfmac regression caused by a typo
in an earlier commit..

Rajkumar Manoharan fixes an ath9k oops reported by David Herrmann.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 15:45:56 -04:00
Nathaniel W Filardo
fde0133b9c af_rxrpc: Fix XDR length check in rxrpc key demarshalling.
There may be padding on the ticket contained in the key payload, so just ensure
that the claimed token length is large enough, rather than exactly the right
size.

Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 15:24:47 -04:00
Ilya Dryomov
f140662f35 crush: decode and initialize chooseleaf_vary_r
Commit e2b149cc4b ("crush: add chooseleaf_vary_r tunable") added the
crush_map::chooseleaf_vary_r field but missed the decode part.  This
lead to misdirected requests caused by incorrect raw crush mapping
sets.

Fixes: http://tracker.ceph.com/issues/8226

Reported-and-Tested-by: Dmitry Smirnov <onlyjob@member.fsf.org>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-05-16 21:29:55 +04:00
Chunwei Chen
178eda29ca libceph: fix corruption when using page_count 0 page in rbd
It has been reported that using ZFSonLinux on rbd will result in memory
corruption. The bug report can be found here:

https://github.com/zfsonlinux/spl/issues/241
http://tracker.ceph.com/issues/7790

The reason is that ZFS will send pages with page_count 0 into rbd, which in
turns send them to tcp_sendpage. However, tcp_sendpage cannot deal with
page_count 0, as it will do get_page and put_page, and erroneously free the
page.

This type of issue has been noted before, and handled in iscsi, drbd,
etc. So, rbd should also handle this. This fix address this issue by fall back
to slower sendmsg when page_count 0 detected.

Cc: Sage Weil <sage@inktank.com>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-05-16 21:29:26 +04:00
Duan Jiong
be7a010d6f ipv6: update Destination Cache entries when gateway turn into host
RFC 4861 states in 7.2.5:

	The IsRouter flag in the cache entry MUST be set based on the
         Router flag in the received advertisement.  In those cases
         where the IsRouter flag changes from TRUE to FALSE as a result
         of this update, the node MUST remove that router from the
         Default Router List and update the Destination Cache entries
         for all destinations using that neighbor as a router as
         specified in Section 7.3.3.  This is needed to detect when a
         node that is used as a router stops forwarding packets due to
         being configured as a host.

Currently, when dealing with NA Message which IsRouter flag changes from
TRUE to FALSE, the kernel only removes router from the Default Router List,
and don't update the Destination Cache entries.

Now in order to update those Destination Cache entries, i introduce
function rt6_clean_tohost().

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 23:26:27 -04:00
David S. Miller
f895f0cfbb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Conflicts:
	net/ipv4/ip_vti.c

Steffen Klassert says:

====================
pull request (net): ipsec 2014-05-15

This pull request has a merge conflict in net/ipv4/ip_vti.c
between commit 8d89dcdf80 ("vti: don't allow to add the same
tunnel twice") and commit a32452366b  ("vti4:Don't count header
length twice"). It can be solved like it is done in linux-next.

1) Fix a ipv6 xfrm output crash when a packet is rerouted
   by netfilter to not use IPsec.

2) vti4 counts some header lengths twice leading to an incorrect
   device mtu. Fix this by counting these headers only once.

3) We don't catch the case if an unsupported protocol is submitted
   to the xfrm protocol handlers, this can lead to NULL pointer
   dereferences. Fix this by adding the appropriate checks.

4) vti6 may unregister pernet ops twice on init errors.
   Fix this by removing one of the calls to do it only once.
   From Mathias Krause.

5) Set the vti tunnel mark before doing a lookup in the error
   handlers. Otherwise we don't find the correct xfrm state.
====================

The conflict in ip_vti.c was simple, 'net' had a commit
removing a line from vti_tunnel_init() and this tree
being merged had a commit adding a line to the same
location.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 23:23:48 -04:00
Julia Lawall
112a3513b5 vti6: delete unneeded call to netdev_priv
Netdev_priv is an accessor function, and has no purpose if its result is
not used.

A simplified version of the semantic match that fixes this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@ local idexpression x; @@
-x = netdev_priv(...);
... when != x
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 16:57:47 -04:00