Commit Graph

8354 Commits

Author SHA1 Message Date
Madhu Challa
93a714d6b5 multicast: Extend ip address command to enable multicast group join/leave on
Joining multicast group on ethernet level via "ip maddr" command would
not work if we have an Ethernet switch that does igmp snooping since
the switch would not replicate multicast packets on ports that did not
have IGMP reports for the multicast addresses.

Linux vxlan interfaces created via "ip link add vxlan" have the group option
that enables then to do the required join.

By extending ip address command with option "autojoin" we can get similar
functionality for openvswitch vxlan interfaces as well as other tunneling
mechanisms that need to receive multicast traffic. The kernel code is
structured similar to how the vxlan driver does a group join / leave.

example:
ip address add 224.1.1.10/24 dev eth5 autojoin
ip address del 224.1.1.10/24 dev eth5

Signed-off-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 16:25:25 -05:00
Madhu Challa
46a4dee074 igmp v6: add __ipv6_sock_mc_join and __ipv6_sock_mc_drop
Based on the igmp v4 changes from Eric Dumazet.
959d10f6bbf6("igmp: add __ip_mc_{join|leave}_group()")

These changes are needed to perform igmp v6 join/leave while
RTNL is held.

Make ipv6_sock_mc_join and ipv6_sock_mc_drop wrappers around
__ipv6_sock_mc_join and  __ipv6_sock_mc_drop to avoid
proliferation of work queues.

Signed-off-by: Madhu Challa <challa@noironetworks.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 16:25:24 -05:00
Tom Herbert
723b8e460d udp: In udp_flow_src_port use random hash value if skb_get_hash fails
In the unlikely event that skb_get_hash is unable to deduce a hash
in udp_flow_src_port we use a consistent random value instead.
This is specified in GRE/UDP draft section 3.2.1:
https://tools.ietf.org/html/draft-ietf-tsvwg-gre-in-udp-encap-04

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 16:00:01 -05:00
Johan Hedberg
4cd3928a8b Bluetooth: Update New CSRK event to match latest specification
The 'master' parameter of the New CSRK event was recently renamed to
'type', with the old values kept for backwards compatibility as
unauthenticated local/remote keys. This patch updates the code to take
into account the two new (authenticated) values and ensures they get
used based on the security level of the connection that the respective
keys get distributed over.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-27 18:25:48 +01:00
Guenter Roeck
d79d210736 net: dsa: Introduce dsa_is_port_initialized
To avoid race conditions when using the ds->ports[] array,
we need to check if the accessed port has been initialized.
Introduce and use helper function dsa_is_port_initialized
for that purpose and use it where needed.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-25 17:57:48 -05:00
Florian Fainelli
b73adef677 net: dsa: integrate with SWITCHDEV for HW bridging
In order to support bridging offloads in DSA switch drivers, select
NET_SWITCHDEV to get access to the port_stp_update and parent_get_id
NDOs that we are required to implement.

To facilitate the integratation at the DSA driver level, we implement 3
types of operations:

- port_join_bridge
- port_leave_bridge
- port_stp_update

DSA will resolve which switch ports that are currently bridge port
members as some Switch hardware/drivers need to know about that to limit
the register programming to just the relevant registers (especially for
slow MDIO buses).

We also take care of setting the correct STP state when slave network
devices are brought up/down while being bridge members.

Finally, when a port is leaving the bridge, we make sure we set in
BR_STATE_FORWARDING state, otherwise the bridge layer would leave it
disabled as a result of having left the bridge.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-25 17:03:38 -05:00
Marcelo Ricardo Leitner
d752c36457 ipvs: allow rescheduling of new connections when port reuse is detected
Currently, when TCP/SCTP port reusing happens, IPVS will find the old
entry and use it for the new one, behaving like a forced persistence.
But if you consider a cluster with a heavy load of small connections,
such reuse will happen often and may lead to a not optimal load
balancing and might prevent a new node from getting a fair load.

This patch introduces a new sysctl, conn_reuse_mode, that allows
controlling how to proceed when port reuse is detected. The default
value will allow rescheduling of new connections only if the old entry
was in TIME_WAIT state for TCP or CLOSED for SCTP.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-25 13:46:35 +09:00
Mahesh Bandewar
14c9551a32 bonding: Implement port churn-machine (AD standard 43.4.17).
The Churn Detection machines detect the situation where a port is operable,
but the Actor and Partner have not attached the link to an Aggregator and
brought the link into operation within a bound time period. Under normal
operation of the LACP, agreement between Actor and Partner should be reached
very rapidly. Continued failure to reach agreement can be symptomatic of
device failure.

Actor-churn-detection state-machine
Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com>

===================================

BEGIN=True + PortEnable=False
           |
           v
 +------------------------+   ActorPort.Sync=True  +------------------+
 |   ACTOR_CHURN_MONITOR  | ---------------------> |  NO_ACTOR_CHURN  |
 |========================|                        |==================|
 |    ActorChurn=False    |  ActorPort.Sync=False  | ActorChurn=False |
 | ActorChurn.Timer=Start | <--------------------- |                  |
 +------------------------+                        +------------------+
           |                                                ^
           |                                                |
  ActorChurn.Timer=Expired                                  |
           |                                       ActorPort.Sync=True
           |                                                |
           |                +-----------------+             |
           |                |   ACTOR_CHURN   |             |
           |                |=================|             |
           +--------------> | ActorChurn=True | ------------+
                            |                 |
                            +-----------------+

Similar for the Partner-churn-detection.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-24 16:05:48 -05:00
Dan Carpenter
278f7b4fff caif: fix a signedness bug in cfpkt_iterate()
The cfpkt_iterate() function can return -EPROTO on error, but the
function is a u16 so the negative value gets truncated to a positive
unsigned short.  This causes a static checker warning.

The only caller which might care is cffrml_receive(), when it's checking
the frame checksum.  I modified cffrml_receive() so that it never says
-EPROTO is a valid checksum.

Also this isn't ever going to be inlined so I removed the "inline".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:35:14 -05:00
Johan Hedberg
7129069e84 Bluetooth: Rename hci_send_to_control to hci_send_to_channel
The hci_send_to_control() can be made more general purpose with a small
change of passing the desired HCI channel as a parameter to it. This
allows using it for the monitor channel as well as e.g. 6lowpan in the
future.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-20 18:20:17 +01:00
Johan Hedberg
3a6d576be9 Bluetooth: Convert disconn_cfm to be triggered through hci_cb
This patch moves all the disconn_cfm callbacks to be based on the hci_cb
list. This means making l2cap_disconn_cfm private to l2cap_core.c and
sco_conn_cb private to sco.c respectively. Since the hci_conn type
filtering isn't done any more on the wrapper level the callbacks
themselves need to check that they were passed a relevant type of
connection.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-19 08:44:29 +01:00
Johan Hedberg
539c496d88 Bluetooth: Convert connect_cfm to be triggered through hci_cb
This patch moves all the connect_cfm callbacks to be based on the hci_cb
list. This means making l2cap_connect_cfm private to l2cap_core.c and
sco_connect_cb private to sco.c respectively. Since the hci_conn type
filtering isn't done any more on the wrapper level the callbacks
themselves need to check that they were passed a relevant type of
connection.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-19 08:44:29 +01:00
Johan Hedberg
354fe804ed Bluetooth: Convert L2CAP security callback to use hci_cb
There's no reason to have the custom hci_proto_auth/encrypt_cfm helpers
when the hci_cb list works equally well. This patch adds L2CAP to the
hci_cb list and makes l2cap_security_cfm a private function of
l2cap_core.c.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-19 08:44:28 +01:00
Johan Hedberg
fba7ecf09b Bluetooth: Convert hci_cb_list_lock to a mutex
We'll soon need to be able to sleep inside the loops that iterate the
hci_cb list, so neither a spinlock, rwlock or rcu are usable. This patch
changes the lock to a mutex which permits sleeping while holding the
lock.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-19 08:44:28 +01:00
Linus Torvalds
f5af19d10d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:

 1) Missing netlink attribute validation in nft_lookup, from Patrick
    McHardy.

 2) Restrict ipv6 partial checksum handling to UDP, since that's the
    only case it works for.  From Vlad Yasevich.

 3) Clear out silly device table sentinal macros used by SSB and BCMA
    drivers.  From Joe Perches.

 4) Make sure the remote checksum code never creates a situation where
    the remote checksum is applied yet the tunneling metadata describing
    the remote checksum transformation is still present.  Otherwise an
    external entity might see this and apply the checksum again.  From
    Tom Herbert.

 5) Use msecs_to_jiffies() where applicable, from Nicholas Mc Guire.

 6) Don't explicitly initialize timer struct fields, use setup_timer()
    and mod_timer() instead.  From Vaishali Thakkar.

 7) Don't invoke tg3_halt() without the tp->lock held, from Jun'ichi
    Nomura.

 8) Missing __percpu annotation in ipvlan driver, from Eric Dumazet.

 9) Don't potentially perform skb_get() on shared skbs, also from Eric
    Dumazet.

10) Fix COW'ing of metrics for non-DST_HOST routes in ipv6, from Martin
    KaFai Lau.

11) Fix merge resolution error between the iov_iter changes in vhost and
    some bug fixes that occurred at the same time.  From Jason Wang.

12) If rtnl_configure_link() fails we have to perform a call to
    ->dellink() before unregistering the device.  From WANG Cong.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (39 commits)
  net: dsa: Set valid phy interface type
  rtnetlink: call ->dellink on failure when ->newlink exists
  com20020-pci: add support for eae single card
  vhost_net: fix wrong iter offset when setting number of buffers
  net: spelling fixes
  net/core: Fix warning while make xmldocs caused by dev.c
  net: phy: micrel: disable NAND-tree for KSZ8021, KSZ8031, KSZ8051, KSZ8081
  ipv6: fix ipv6_cow_metrics for non DST_HOST case
  openvswitch: Fix key serialization.
  r8152: restore hw settings
  hso: fix rx parsing logic when skb allocation fails
  tcp: make sure skb is not shared before using skb_get()
  bridge: netfilter: Move sysctl-specific error code inside #ifdef
  ipv6: fix possible deadlock in ip6_fl_purge / ip6_fl_gc
  ipvlan: add a missing __percpu pcpu_stats
  tg3: Hold tp->lock before calling tg3_halt() from tg3_init_one()
  bgmac: fix device initialization on Northstar SoCs (condition typo)
  qlcnic: Delete existing multicast MAC list before adding new
  net/mlx5_core: Fix configuration of log_uar_page_sz
  sunvnet: don't change gso data on clones
  ...
2015-02-17 17:41:19 -08:00
Tedd Ho-Jeong An
a44fecbd52 Bluetooth: Add shutdown callback before closing the device
This callback allows a vendor to send the vendor specific commands
before cloing the hci interface.

Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-15 00:37:52 +01:00
Alexander Aring
b976796950 ieee802154: cleanup ieee802154_le64_to_be64
This patch cleanups the ieee802154_le64_to_be64 function. This patch
removes an unnecessary temporary variable.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-14 05:19:58 +01:00
Alexander Aring
ba5bf17e83 ieee802154: cleanup ieee802154_be64_to_le64
This patch cleanups the ieee802154_be64_to_le64 function. This patch
removes an unnecessary temporary variable.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-14 05:19:58 +01:00
Vladimir Davydov
f48b80a5e2 memcg: cleanup static keys decrement
Move memcg_socket_limit_enabled decrement to tcp_destroy_cgroup (called
from memcg_destroy_kmem -> mem_cgroup_sockets_destroy) and zap a bunch of
wrapper functions.

Although this patch moves static keys decrement from __mem_cgroup_free to
mem_cgroup_css_free, it does not introduce any functional changes, because
the keys are incremented on setting the limit (tcp or kmem), which can
only happen after successful mem_cgroup_css_online.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:10 -08:00
huaibin Wang
ac37e2515c xfrm: release dst_orig in case of error in xfrm_lookup()
dst_orig should be released on error. Function like __xfrm_route_forward()
expects that behavior.
Since a recent commit, xfrm_lookup() may also be called by xfrm_lookup_route(),
which expects the opposite.
Let's introduce a new flag (XFRM_LOOKUP_KEEP_DST_REF) to tell what should be
done in case of error.

Fixes: f92ee61982d("xfrm: Generate blackhole routes only from route lookup functions")
Signed-off-by: huaibin Wang <huaibin.wang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-02-12 07:10:56 +01:00
Linus Torvalds
8cc748aa76 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
 "Highlights:

   - Smack adds secmark support for Netfilter
   - /proc/keys is now mandatory if CONFIG_KEYS=y
   - TPM gets its own device class
   - Added TPM 2.0 support
   - Smack file hook rework (all Smack users should review this!)"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (64 commits)
  cipso: don't use IPCB() to locate the CIPSO IP option
  SELinux: fix error code in policydb_init()
  selinux: add security in-core xattr support for pstore and debugfs
  selinux: quiet the filesystem labeling behavior message
  selinux: Remove unused function avc_sidcmp()
  ima: /proc/keys is now mandatory
  Smack: Repair netfilter dependency
  X.509: silence asn1 compiler debug output
  X.509: shut up about included cert for silent build
  KEYS: Make /proc/keys unconditional if CONFIG_KEYS=y
  MAINTAINERS: email update
  tpm/tpm_tis: Add missing ifdef CONFIG_ACPI for pnp_acpi_device
  smack: fix possible use after frees in task_security() callers
  smack: Add missing logging in bidirectional UDS connect check
  Smack: secmark support for netfilter
  Smack: Rework file hooks
  tpm: fix format string error in tpm-chip.c
  char/tpm/tpm_crb: fix build error
  smack: Fix a bidirectional UDS connect check typo
  smack: introduce a special case for tmpfs in smack_d_instantiate()
  ...
2015-02-11 20:25:11 -08:00
Tom Herbert
0ace2ca89c vxlan: Use checksum partial with remote checksum offload
Change remote checksum handling to set checksum partial as default
behavior. Added an iflink parameter to configure not using
checksum partial (calling csum_partial to update checksum).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 15:12:13 -08:00
Tom Herbert
26c4f7da3e net: Fix remcsum in GRO path to not change packet
Remote checksum offload processing is currently the same for both
the GRO and non-GRO path. When the remote checksum offload option
is encountered, the checksum field referred to is modified in
the packet. So in the GRO case, the packet is modified in the
GRO path and then the operation is skipped when the packet goes
through the normal path based on skb->remcsum_offload. There is
a problem in that the packet may be modified in the GRO path, but
then forwarded off host still containing the remote checksum option.
A remote host will again perform RCO but now the checksum verification
will fail since GRO RCO already modified the checksum.

To fix this, we ensure that GRO restores a packet to it's original
state before returning. In this model, when GRO processes a remote
checksum option it still changes the checksum per the algorithm
but on return from lower layer processing the checksum is restored
to its original value.

In this patch we add define gro_remcsum structure which is passed
to skb_gro_remcsum_process to save offset and delta for the checksum
being changed. After lower layer processing, skb_gro_remcsum_cleanup
is called to restore the checksum before returning from GRO.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 15:12:09 -08:00
Paul Moore
04f81f0154 cipso: don't use IPCB() to locate the CIPSO IP option
Using the IPCB() macro to get the IPv4 options is convenient, but
unfortunately NetLabel often needs to examine the CIPSO option outside
of the scope of the IP layer in the stack.  While historically IPCB()
worked above the IP layer, due to the inclusion of the inet_skb_param
struct at the head of the {tcp,udp}_skb_cb structs, recent commit
971f10ec ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
reordered the tcp_skb_cb struct and invalidated this IPCB() trick.

This patch fixes the problem by creating a new function,
cipso_v4_optptr(), which locates the CIPSO option inside the IP header
without calling IPCB().  Unfortunately, this isn't as fast as a simple
lookup so some additional tweaks were made to limit the use of this
new function.

Cc: <stable@vger.kernel.org> # 3.18
Reported-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2015-02-11 14:46:37 -05:00
Fan Du
b0f9ca53cb ipv4: Namespecify TCP PMTU mechanism
Packetization Layer Path MTU Discovery works separately beside
Path MTU Discovery at IP level, different net namespace has
various requirements on which one to chose, e.g., a virutalized
container instance would require TCP PMTU to probe an usable
effective mtu for underlying tunnel, while the host would
employ classical ICMP based PMTU to function.

Hence making TCP PMTU mechanism per net namespace to decouple
two functionality. Furthermore the probe base MSS should also
be configured separately for each namespace.

Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-09 18:45:00 -08:00
David S. Miller
2573beec56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-09 14:35:57 -08:00
Vlad Yasevich
8381eacf5c ipv6: Make __ipv6_select_ident static
Make __ipv6_select_ident() static as it isn't used outside
the file.

Fixes: 0508c07f5e (ipv6: Select fragment id during UFO segmentation if not set.)
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-09 14:21:03 -08:00
Moni Shoua
92e584fe44 net/bonding: Fix potential bad memory access during bonding events
When queuing work to send the NETDEV_BONDING_INFO netdev event, it's
possible that when the work is executed, the pointer to the slave
becomes invalid. This can happen if between queuing the event and the
execution of the work, the net-device was un-ensvaled and re-enslaved.

Fix that by queuing a work with the data of the slave instead of the
slave structure.

Fixes: 69e6113343 ('net/bonding: Notify state change on slaves')
Reported-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-09 14:03:53 -08:00
Julian Anastasov
cd67cd5eb2 ipvs: use 64-bit rates in stats
IPVS stats are limited to 2^(32-10) conns/s and packets/s,
2^(32-5) bytes/s. It is time to use 64 bits:

* Change all conn/packet kernel counters to 64-bit and update
them in u64_stats_update_{begin,end} section

* In kernel use struct ip_vs_kstats instead of the user-space
struct ip_vs_stats_user and use new func ip_vs_export_stats_user
to export it to sockopt users to preserve compatibility with
32-bit values

* Rename cpu counters "ustats" to "cnt"

* To netlink users provide additionally 64-bit stats:
IPVS_SVC_ATTR_STATS64 and IPVS_DEST_ATTR_STATS64. Old stats
remain for old binaries.

* We can use ip_vs_copy_stats in ip_vs_stats_percpu_show

Thanks to Chris Caputo for providing initial patch for ip_vs_est.c

Signed-off-by: Chris Caputo <ccaputo@alt.net>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-09 16:59:03 +09:00
Eric Dumazet
567e4b7973 net: rfs: add hash collision detection
Receive Flow Steering is a nice solution but suffers from
hash collisions when a mix of connected and unconnected traffic
is received on the host, when flow hash table is populated.

Also, clearing flow in inet_release() makes RFS not very good
for short lived flows, as many packets can follow close().
(FIN , ACK packets, ...)

This patch extends the information stored into global hash table
to not only include cpu number, but upper part of the hash value.

I use a 32bit value, and dynamically split it in two parts.

For host with less than 64 possible cpus, this gives 6 bits for the
cpu number, and 26 (32-6) bits for the upper part of the hash.

Since hash bucket selection use low order bits of the hash, we have
a full hash match, if /proc/sys/net/core/rps_sock_flow_entries is big
enough.

If the hash found in flow table does not match, we fallback to RPS (if
it is enabled for the rxqueue).

This means that a packet for an non connected flow can avoid the
IPI through a unrelated/victim CPU.

This also means we no longer have to clear the table at socket
close time, and this helps short lived flows performance.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 16:53:57 -08:00
Neal Cardwell
a9b2c06dbe tcp: mitigate ACK loops for connections as tcp_request_sock
In the SYN_RECV state, where the TCP connection is represented by
tcp_request_sock, we now rate-limit SYNACKs in response to a client's
retransmitted SYNs: we do not send a SYNACK in response to client SYN
if it has been less than sysctl_tcp_invalid_ratelimit (default 500ms)
since we last sent a SYNACK in response to a client's retransmitted
SYN.

This allows the vast majority of legitimate client connections to
proceed unimpeded, even for the most aggressive platforms, iOS and
MacOS, which actually retransmit SYNs 1-second intervals for several
times in a row. They use SYN RTO timeouts following the progression:
1,1,1,1,1,2,4,8,16,32.

Reported-by: Avery Fay <avery@mixpanel.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 01:03:12 -08:00
Neal Cardwell
032ee42369 tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks
Helpers for mitigating ACK loops by rate-limiting dupacks sent in
response to incoming out-of-window packets.

This patch includes:

- rate-limiting logic
- sysctl to control how often we allow dupacks to out-of-window packets
- SNMP counter for cases where we rate-limited our dupack sending

The rate-limiting logic in this patch decides to not send dupacks in
response to out-of-window segments if (a) they are SYNs or pure ACKs
and (b) the remote endpoint is sending them faster than the configured
rate limit.

We rate-limit our responses rather than blocking them entirely or
resetting the connection, because legitimate connections can rely on
dupacks in response to some out-of-window segments. For example, zero
window probes are typically sent with a sequence number that is below
the current window, and ZWPs thus expect to thus elicit a dupack in
response.

We allow dupacks in response to TCP segments with data, because these
may be spurious retransmissions for which the remote endpoint wants to
receive DSACKs. This is safe because segments with data can't
realistically be part of ACK loops, which by their nature consist of
each side sending pure/data-less ACKs to each other.

The dupack interval is controlled by a new sysctl knob,
tcp_invalid_ratelimit, given in milliseconds, in case an administrator
needs to dial this upward in the face of a high-rate DoS attack. The
name and units are chosen to be analogous to the existing analogous
knob for ICMP, icmp_ratelimit.

The default value for tcp_invalid_ratelimit is 500ms, which allows at
most one such dupack per 500ms. This is chosen to be 2x faster than
the 1-second minimum RTO interval allowed by RFC 6298 (section 2, rule
2.4). We allow the extra 2x factor because network delay variations
can cause packets sent at 1 second intervals to be compressed and
arrive much closer.

Reported-by: Avery Fay <avery@mixpanel.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 01:03:12 -08:00
David S. Miller
3c09e92fb6 NFC: 3.20 second pull request
This is the second NFC pull request for 3.20.
 
 It brings:
 
 - NCI NFCEE (NFC Execution Environment, typically an embedded or
   external secure element) discovery and enabling/disabling support.
   In order to communicate with an NFCEE, we also added NCI's logical
   connections support to the NCI stack.
 
 - HCI over NCI protocol support. Some secure elements only understand
   HCI and thus we need to send them HCI frames when they're part of
   an NCI chipset.
 
 - NFC_EVT_TRANSACTION userspace API addition. Whenever an application
   running on a secure element needs to notify its host counterpart,
   we send an NFC_EVENT_SE_TRANSACTION event to userspace through the
   NFC netlink socket.
 
 - Secure element and HCI transaction event support for the st21nfcb
   chipset.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU06ozAAoJEIqAPN1PVmxKdtkP/2CeQ0+xJnJWgyh4xjkVxfPb
 wPYrz3cHeHdnVX36mnIdRoQZRjxh/Ef/vgK8RsXohcldzHA9D8GjxRpj0GrkVGQ9
 xoDkiHsFDdsZHeezSrlnXNnObvZrkeMsmnUDJxfwLtcX1nzFKzBgW2GaDiNjUazC
 r5Ip4YIfoCaVLq5MSqYRqJ6uhplXd/ePdtPoo54L/E81y4ptQi8pUsQBy4K3GrFn
 FbXoemcymhi9AVBGl+OlppZ93t6bdOV1D5tcOwpytAbfa3jp2oUnJPZay7EoWruR
 aj8l5v3P+SSphAJwaGSbny0L83tTS7sq+N4QG+VxxdMkw2YjpmxgN3AGguNBtPGG
 SUThpZV6QhrkAOnwzP2BJnh68WSU1eJrGvk8zUWW0SanA8k2jaHO/VzXCA3/ZkC4
 IFcXl4Jr6R3iUxDFgS/6MB5E9EGedzFTacsWoHVyZPtaDAQp4WVdAIRQ4QKgJCLw
 1yRmCeVunTsrs6O0aenU1xHinPlnx+tkuiNh4H64APQYTz54WwXH1/S04OL1SkqI
 mTzUvFgWqJvSIOuU19g0HBw+xZ/9vvX8LLkm9kSTbe7tcmFH5CI7ZCN/hNDHvMSd
 sjNYOyag/SAHJ01tTKcSPAgWO49bHWPodI04zP0lK0gMLjKvRknVMdTgIDfu49HN
 2da9E23iBmNNgG79iVHN
 =8caD
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-3.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

NFC: 3.20 second pull request

This is the second NFC pull request for 3.20.

It brings:

- NCI NFCEE (NFC Execution Environment, typically an embedded or
  external secure element) discovery and enabling/disabling support.
  In order to communicate with an NFCEE, we also added NCI's logical
  connections support to the NCI stack.

- HCI over NCI protocol support. Some secure elements only understand
  HCI and thus we need to send them HCI frames when they're part of
  an NCI chipset.

- NFC_EVT_TRANSACTION userspace API addition. Whenever an application
  running on a secure element needs to notify its host counterpart,
  we send an NFC_EVENT_SE_TRANSACTION event to userspace through the
  NFC netlink socket.

- Secure element and HCI transaction event support for the st21nfcb
  chipset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-07 22:22:25 -08:00
Erik Kline
c58da4c659 net: ipv6: allow explicitly choosing optimistic addresses
RFC 4429 ("Optimistic DAD") states that optimistic addresses
should be treated as deprecated addresses.  From section 2.1:

   Unless noted otherwise, components of the IPv6 protocol stack
   should treat addresses in the Optimistic state equivalently to
   those in the Deprecated state, indicating that the address is
   available for use but should not be used if another suitable
   address is available.

Optimistic addresses are indeed avoided when other addresses are
available (i.e. at source address selection time), but they have
not heretofore been available for things like explicit bind() and
sendmsg() with struct in6_pktinfo, etc.

This change makes optimistic addresses treated more like
deprecated addresses than tentative ones.

Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 15:37:41 -08:00
David S. Miller
6e03f896b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/vxlan.c
	drivers/vhost/net.c
	include/linux/if_vlan.h
	net/core/dev.c

The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.

In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.

In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endainness fix for VHOST 1.0 in 'net'.

In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 14:33:28 -08:00
Eric Dumazet
677651462c ipv6: fix sparse errors in ip6_make_flowlabel()
include/net/ipv6.h:713:22: warning: incorrect type in assignment (different base types)
include/net/ipv6.h:713:22:    expected restricted __be32 [usertype] hash
include/net/ipv6.h:713:22:    got unsigned int
include/net/ipv6.h:719:25: warning: restricted __be32 degrades to integer
include/net/ipv6.h:719:22: warning: invalid assignment: ^=
include/net/ipv6.h:719:22:    left side has type restricted __be32
include/net/ipv6.h:719:22:    right side has type unsigned int

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 00:42:28 -08:00
Eric Dumazet
f4575d3534 flow_keys: n_proto type should be __be16
(struct flow_keys)->n_proto is in network order, use
proper type for this.

Fixes following sparse errors :

net/core/flow_dissector.c:139:39: warning: incorrect type in assignment (different base types)
net/core/flow_dissector.c:139:39:    expected unsigned short [unsigned] [usertype] n_proto
net/core/flow_dissector.c:139:39:    got restricted __be16 [assigned] [usertype] proto
net/core/flow_dissector.c:237:23: warning: incorrect type in assignment (different base types)
net/core/flow_dissector.c:237:23:    expected unsigned short [unsigned] [usertype] n_proto
net/core/flow_dissector.c:237:23:    got restricted __be16 [assigned] [usertype] proto

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: e0f31d8498 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 00:40:22 -08:00
David S. Miller
f2683b743f Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
More iov_iter work from Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 20:46:55 -08:00
Eric Dumazet
9878196578 tcp: do not pace pure ack packets
When we added pacing to TCP, we decided to let sch_fq take care
of actual pacing.

All TCP had to do was to compute sk->pacing_rate using simple formula:

sk->pacing_rate = 2 * cwnd * mss / rtt

It works well for senders (bulk flows), but not very well for receivers
or even RPC :

cwnd on the receiver can be less than 10, rtt can be around 100ms, so we
can end up pacing ACK packets, slowing down the sender.

Really, only the sender should pace, according to its own logic.

Instead of adding a new bit in skb, or call yet another flow
dissection, we tweak skb->truesize to a small value (2), and
we instruct sch_fq to use new helper and not pace pure ack.

Note this also helps TCP small queue, as ack packets present
in qdisc/NIC do not prevent sending a data packet (RPC workload)

This helps to reduce tx completion overhead, ack packets can use regular
sock_wfree() instead of tcp_wfree() which is a bit more expensive.

This has no impact in the case packets are sent to loopback interface,
as we do not coalesce ack packets (were we would detect skb->truesize
lie)

In case netem (with a delay) is used, skb_orphan_partial() also sets
skb->truesize to 1.

This patch is a combination of two patches we used for about one year at
Google.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 20:36:31 -08:00
Moni Shoua
69e6113343 net/bonding: Notify state change on slaves
Use notifier chain to dispatch an event upon a change in slave state.
Event is dispatched with slave specific info.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 16:14:24 -08:00
Moni Shoua
69a2338e05 net/bonding: Move slave state changes to a helper function
Move slave state changes to a helper function, this is a pre-step for adding
functionality of dispatching an event when this helper is called.

This commit doesn't add new functionality.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 16:14:24 -08:00
David S. Miller
940288b6a5 Last round of updates for net-next:
* revert a patch that caused a regression with mesh userspace (Bob)
  * fix a number of suspend/resume related races
    (from Emmanuel, Luca and myself - we'll look at backporting later)
  * add software implementations for new ciphers (Jouni)
  * add a new ACPI ID for Broadcom's rfkill (Mika)
  * allow using netns FD for wireless (Vadim)
  * some other cleanups (various)
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJU0NEaAAoJEDBSmw7B7bqr2DAP/3nbk6WB2Lz4Vwi8fh9C4y6X
 4ZzN2v7NHaimC+Wpxg3wP2wCdX2VG3ZWwF3yLm6qgVHGnE35RFLxURen1eqNeW77
 Yf5gvZ066nCGEQ8l8J6YK9vrLX4qp5c4lyE00bbxpZTA4Qq71SgTg+rmGC1be8uX
 vacfaLwfDWffuOOnjBAPfanj7f4AQaUEY2uN1WkBFC7iEeOtPcWVkHAFVIeGjJfQ
 vfgQJcwOjgWjYbwZdQQi7Aj+k1Yzda4pg1yEWn3CkZ8zyeCYGs1gk2ovoBgPgXBP
 yT9ypIdFsN242VJvy7nkFnCKA8mhKyltMQ1Xjs0Q9lAxWdaq9U+iqt5cUn4jxVIG
 T9Vi3PCbx/nVOqcfR81dBTQ3uDU5AyosPsmh2YTxi5lpRBrsjNY2FtaKE0sm2Om3
 wRiSPOdPrXBeEnU0KssI9e6euXgS4JQV78Naq85OWZDd2yZ1fT5U2fi8y4drRGlz
 rucbbobdVhQch5L4FStPz1uW5pNuJrhekXeZIE8MruTNg2A2oBAK3ApO7hxn68sE
 RnbAnkxVLwgedC9042JF5eiS1PDIU46w4e782j/+XskVKMEakqd23iJycx3tmgHZ
 cxDi/qKZ2RCE74YsT61o/9ErSVPvCfNPL3+CVov918jidQQg2WHfKm/jFlIxHIxk
 4wBP7p2VGlENMgw/R8GJ
 =wOaR
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2015-02-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Last round of updates for net-next:
 * revert a patch that caused a regression with mesh userspace (Bob)
 * fix a number of suspend/resume related races
   (from Emmanuel, Luca and myself - we'll look at backporting later)
 * add software implementations for new ciphers (Jouni)
 * add a new ACPI ID for Broadcom's rfkill (Mika)
 * allow using netns FD for wireless (Vadim)
 * some other cleanups (various)

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 14:57:45 -08:00
David S. Miller
45e826fd57 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-02-03

Here's what's likely the last bluetooth-next pull request for 3.20.
Notable changes include:

 - xHCI workaround + a new id for the ath3k driver
 - Several new ids for the btusb driver
 - Support for new Intel Bluetooth controllers
 - Minor cleanups to ieee802154 code
 - Nested sleep warning fix in socket accept() code path
 - Fixes for Out of Band pairing handling
 - Support for LE scan restarting for HCI_QUIRK_STRICT_DUPLICATE_FILTER
 - Improvements to data we expose through debugfs
 - Proper handling of Hardware Error HCI events

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 13:56:37 -08:00
Christophe Ricard
15d4a8da0e NFC: nci: Move logical connection structure allocation
conn_info is currently allocated only after nfcee_discovery_ntf
which is not generic enough for logical connection other than
NFCEE. The corresponding conn_info is now created in
nci_core_conn_create_rsp().

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-04 09:14:09 +01:00
Christophe Ricard
3ba5c8466b NFC: nci: Change credits field to credits_cnt
For consistency sake change nci_core_conn_create_rsp structure
credits field to credits_cnt.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-04 09:13:15 +01:00
Christophe Ricard
b16ae7160a NFC: nci: Support all destinations type when creating a connection
The current implementation limits nci_core_conn_create_req()
to only manage NCI_DESTINATION_NFCEE.
Add new parameters to nci_core_conn_create() to support all
destination types described in the NCI specification.
Because there are some parameters with variable size dynamic
buffer allocation is needed.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-04 09:10:50 +01:00
Christophe Ricard
12bdf27d46 NFC: nci: Add reference to the RF logical connection
The NCI_STATIC_RF_CONN_ID logical connection is the most used
connection. Keeping it directly accessible in the nci_dev
structure will simplify and optimize the access.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-04 09:09:53 +01:00
Vlad Yasevich
0508c07f5e ipv6: Select fragment id during UFO segmentation if not set.
If the IPv6 fragment id has not been set and we perform
fragmentation due to UFO, select a new fragment id.
We now consider a fragment id of 0 as unset and if id selection
process returns 0 (after all the pertrubations), we set it to
0x80000000, thus giving us ample space not to create collisions
with the next packet we may have to fragment.

When doing UFO integrity checking, we also select the
fragment id if it has not be set yet.   This is stored into
the skb_shinfo() thus allowing UFO to function correclty.

This patch also removes duplicate fragment id generation code
and moves ipv6_select_ident() into the header as it may be
used during GSO.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03 23:06:43 -08:00
Al Viro
21226abb4e net: switch memcpy_fromiovec()/memcpy_fromiovecend() users to copy_from_iter()
That takes care of the majority of ->sendmsg() instances - most of them
via memcpy_to_msg() or assorted getfrag() callbacks.  One place where we
still keep memcpy_fromiovecend() is tipc - there we potentially read the
same data over and over; separate patch, that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:15 -05:00
Al Viro
57be5bdad7 ip: convert tcp_sendmsg() to iov_iter primitives
patch is actually smaller than it seems to be - most of it is unindenting
the inner loop body in tcp_sendmsg() itself...

the bit in tcp_input.c is going to get reverted very soon - that's what
memcpy_from_msg() will become, but not in this commit; let's keep it
reasonably contained...

There's one potentially subtle change here: in case of short copy from
userland, mainline tcp_send_syn_data() discards the skb it has allocated
and falls back to normal path, where we'll send as much as possible after
rereading the same data again.  This patch trims SYN+data skb instead -
that way we don't need to copy from the same place twice.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:14 -05:00
Al Viro
cacdc7d2f9 ip: stash a pointer to msghdr in struct ping_fakehdr
... instead of storing its ->mgs_iter.iov there

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:14 -05:00
David S. Miller
3ae55826ae Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:

1) Validate hooks for nf_tables NAT expressions, otherwise users can
   crash the kernel when using them from the wrong hook. We already
   got one user trapped on this when configuring masquerading.

2) Fix a BUG splat in nf_tables with CONFIG_DEBUG_PREEMPT=y. Reported
   by Andreas Schultz.

3) Avoid unnecessary reroute of traffic in the local input path
   in IPVS that triggers a crash in in xfrm. Reported by Florian
   Wiessner and fixes by Julian Anastasov.

4) Fix memory and module refcount leak from the error path of
   nf_tables_newchain().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:30:53 -08:00
Vlad Yasevich
6422398c2a ipv6: introduce ipv6_make_skb
This commit is very similar to
commit 1c32c5ad6f
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue Mar 1 02:36:47 2011 +0000

    inet: Add ip_make_skb and ip_finish_skb

It adds IPv6 version of the helpers ip6_make_skb and ip6_finish_skb.

The job of ip6_make_skb is to collect messages into an ipv6 packet
and poplulate ipv6 eader.  The job of ip6_finish_skb is to transmit
the generated skb.  Together they replicated the job of
ip6_push_pending_frames() while also provide the capability to be
called independently.  This will be needed to add lockless UDP sendmsg
support.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 19:28:04 -08:00
Willem de Bruijn
b245be1f4d net-timestamp: no-payload only sysctl
Tx timestamps are looped onto the error queue on top of an skb. This
mechanism leaks packet headers to processes unless the no-payload
options SOF_TIMESTAMPING_OPT_TSONLY is set.

Add a sysctl that optionally drops looped timestamp with data. This
only affects processes without CAP_NET_RAW.

The policy is checked when timestamps are generated in the stack.
It is possible for timestamps with data to be reported after the
sysctl is set, if these were queued internally earlier.

No vulnerability is immediately known that exploits knowledge
gleaned from packet headers, but it may still be preferable to allow
administrators to lock down this path at the cost of possible
breakage of legacy applications.

Signed-off-by: Willem de Bruijn <willemb@google.com>

----

Changes
  (v1 -> v2)
  - test socket CAP_NET_RAW instead of capable(CAP_NET_RAW)
  (rfc -> v1)
  - document the sysctl in Documentation/sysctl/net.txt
  - fix access control race: read .._OPT_TSONLY only once,
        use same value for permission check and skb generation.
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02 18:46:51 -08:00
Christophe Ricard
a41bb8448e NFC: nci: Add RF NFCEE action notification support
The NFCC sends an NCI_OP_RF_NFCEE_ACTION_NTF notification
to the host (DH) to let it know that for example an RF
transaction with a payment reader is done.
For now the notification handler is empty.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-02 21:50:41 +01:00
Christophe Ricard
447b27c4f2 NFC: Forward NFC_EVT_TRANSACTION to user space
NFC_EVT_TRANSACTION is sent through netlink in order for a
specific application running on a secure element to notify
userspace of an event. Typically the secure element application
counterpart on the host could interpret that event and act
upon it.

Forwarded information contains:
- SE host generating the event
- Application IDentifier doing the operation
- Applications parameters

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-02 21:50:40 +01:00
Christophe Ricard
11f54f2286 NFC: nci: Add HCI over NCI protocol support
According to the NCI specification, one can use HCI over NCI
to talk with specific NFCEE. The HCI network is viewed as one
logical NFCEE.
This is needed to support secure element running HCI only
firmwares embedded on an NCI capable chipset, like e.g. the
st21nfcb.
There is some duplication between this piece of code and the
HCI core code, but the latter would need to be abstracted even
more to be able to use NCI as a logical transport for HCP packets.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-02 21:50:40 +01:00
Christophe Ricard
736bb95774 NFC: nci: Support logical connections management
In order to communicate with an NFCEE, we need to open a logical
connection to it, by sending the NCI_OP_CORE_CONN_CREATE_CMD
command to the NFCC. It's left up to the drivers to decide when
to close an already opened logical connection.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-02 21:50:39 +01:00
Christophe Ricard
f7f793f313 NFC: nci: Add NFCEE enabling and disabling support
NFCEEs can be enabled or disabled by sending the
NCI_OP_NFCEE_MODE_SET_CMD command to the NFCC. This patch
provides an API for drivers to enable and disable e.g. their
NCI discoveredd secure elements.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-02 21:50:39 +01:00
Christophe Ricard
af9c8aa67d NFC: nci: Add NFCEE discover support
NFCEEs (NFC Execution Environment) have to be explicitly
discovered by sending the NCI_OP_NFCEE_DISCOVER_CMD
command. The NFCC will respond to this command by telling
us how many NFCEEs are connected to it. Then the NFCC sends
a notification command for each and every NFCEE connected.
Here we implement support for sending
NCI_OP_NFCEE_DISCOVER_CMD command, receiving the response
and the potential notifications.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-02 21:50:38 +01:00
Christophe Ricard
8277f6937a NFC: nci: Add NCI NFCEE constants
Add NFCEE NCI constant for:
- NFCEE Interface/Protocols
- Destination type
- Destination-specific parameters type
- NFCEE Discovery Action

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-02 21:50:38 +01:00
Christophe Ricard
4aeee6871e NFC: nci: Add dynamic logical connections support
The current NCI core only support the RF static connection.
For other NFC features such as Secure Element communication, we
may need to create logical connections to the NFCEE (Execution
Environment.

In order to track each logical connection ID dynamically, we add a
linked list of connection info pointers to the nci_dev structure.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-02-02 21:50:31 +01:00
Johan Hedberg
66f096f791 Bluetooth: Remove mgmt_rp_read_local_oob_ext_data struct
This extended return parameters struct conflicts with the new Read Local
OOB Extended Data command definition. To avoid the conflict simply
rename the old "extended" version to the normal one and update the code
appropriately to take into account the two possible response PDU sizes.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-02 18:27:56 +01:00
Jakub Pawlowski
4b0e0ceddf Bluetooth: Add restarting to service discovery
When using LE_SCAN_FILTER_DUP_ENABLE, some controllers would send
advertising report from each LE device only once. That means that we
don't get any updates on RSSI value, and makes Service Discovery very
slow. This patch adds restarting scan when in Service Discovery, and
device with filtered uuid is found, but it's not in RSSI range to send
event yet. This way if device moves into range, we will quickly get RSSI
update.

Signed-off-by: Jakub Pawlowski <jpawlowski@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-02 08:52:34 +01:00
Jakub Pawlowski
2d28cfe7aa Bluetooth: Add le_scan_restart work for LE scan restarting
Currently there is no way to restart le scan, and it's needed in
service scan method. The way it work: it disable, and then enable le
scan on controller.

During the restart, we must remember when the scan was started, and
it's duration, to later re-schedule the le_scan_disable work, that was
stopped during the stop scan phase.

Signed-off-by: Jakub Pawlowski <jpawlowski@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-02 08:52:33 +01:00
Roopa Prabhu
8a44dbb202 swdevice: add new apis to set and del bridge port attributes
This patch adds two new api's netdev_switch_port_bridge_setlink
and netdev_switch_port_bridge_dellink to offload bridge port attributes
to switch port

(The names of the apis look odd with 'switch_port_bridge',
but am more inclined to change the prefix of the api to something else.
Will take any suggestions).

The api's look at the NETIF_F_HW_SWITCH_OFFLOAD feature flag to
pass bridge port attributes to the port device.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:16:34 -08:00
Eric Dumazet
bdbbb8527b ipv4: tcp: get rid of ugly unicast_sock
In commit be9f4a44e7 ("ipv4: tcp: remove per net tcp_sock")
I tried to address contention on a socket lock, but the solution
I chose was horrible :

commit 3a7c384ffd ("ipv4: tcp: unicast_sock should not land outside
of TCP stack") addressed a selinux regression.

commit 0980e56e50 ("ipv4: tcp: set unicast_sock uc_ttl to -1")
took care of another regression.

commit b5ec8eeac4 ("ipv4: fix ip_send_skb()") fixed another regression.

commit 811230cd85 ("tcp: ipv4: initialize unicast_sock sk_pacing_rate")
was another shot in the dark.

Really, just use a proper socket per cpu, and remove the skb_orphan()
call, to re-enable flow control.

This solves a serious problem with FQ packet scheduler when used in
hostile environments, as we do not want to allocate a flow structure
for every RST packet sent in response to a spoofed packet.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-01 23:06:19 -08:00
Eric Dumazet
0d32ef8cef net: sched: fix panic in rate estimators
Doing the following commands on a non idle network device
panics the box instantly, because cpu_bstats gets overwritten
by stats.

tc qdisc add dev eth0 root <your_favorite_qdisc>
... some traffic (one packet is enough) ...
tc qdisc replace dev eth0 root est 1sec 4sec <your_favorite_qdisc>

[  325.355596] BUG: unable to handle kernel paging request at ffff8841dc5a074c
[  325.362609] IP: [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90
[  325.369158] PGD 1fa7067 PUD 0
[  325.372254] Oops: 0000 [#1] SMP
[  325.375514] Modules linked in: ...
[  325.398346] CPU: 13 PID: 14313 Comm: tc Not tainted 3.19.0-smp-DEV #1163
[  325.412042] task: ffff8800793ab5d0 ti: ffff881ff2fa4000 task.ti: ffff881ff2fa4000
[  325.419518] RIP: 0010:[<ffffffff81541c9e>]  [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90
[  325.428506] RSP: 0018:ffff881ff2fa7928  EFLAGS: 00010286
[  325.433824] RAX: 000000000000000c RBX: ffff881ff2fa796c RCX: 000000000000000c
[  325.440988] RDX: ffff8841dc5a0744 RSI: 0000000000000060 RDI: 0000000000000060
[  325.448120] RBP: ffff881ff2fa7948 R08: ffffffff81cd4f80 R09: 0000000000000000
[  325.455268] R10: ffff883ff223e400 R11: 0000000000000000 R12: 000000015cba0744
[  325.462405] R13: ffffffff81cd4f80 R14: ffff883ff223e460 R15: ffff883feea0722c
[  325.469536] FS:  00007f2ee30fa700(0000) GS:ffff88407fa20000(0000) knlGS:0000000000000000
[  325.477630] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  325.483380] CR2: ffff8841dc5a074c CR3: 0000003feeae9000 CR4: 00000000001407e0
[  325.490510] Stack:
[  325.492524]  ffff883feea0722c ffff883fef719dc0 ffff883feea0722c ffff883ff223e4a0
[  325.499990]  ffff881ff2fa79a8 ffffffff815424ee ffff883ff223e49c 000000015cba0744
[  325.507460]  00000000f2fa7978 0000000000000000 ffff881ff2fa79a8 ffff883ff223e4a0
[  325.514956] Call Trace:
[  325.517412]  [<ffffffff815424ee>] gen_new_estimator+0x8e/0x230
[  325.523250]  [<ffffffff815427aa>] gen_replace_estimator+0x4a/0x60
[  325.529349]  [<ffffffff815718ab>] tc_modify_qdisc+0x52b/0x590
[  325.535117]  [<ffffffff8155edd0>] rtnetlink_rcv_msg+0xa0/0x240
[  325.540963]  [<ffffffff8155ed30>] ? __rtnl_unlock+0x20/0x20
[  325.546532]  [<ffffffff8157f811>] netlink_rcv_skb+0xb1/0xc0
[  325.552145]  [<ffffffff8155b355>] rtnetlink_rcv+0x25/0x40
[  325.557558]  [<ffffffff8157f0d8>] netlink_unicast+0x168/0x220
[  325.563317]  [<ffffffff8157f47c>] netlink_sendmsg+0x2ec/0x3e0

Lets play safe and not use an union : percpu 'pointers' are mostly read
anyway, and we have typically few qdiscs per host.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Fixes: 22e0f8b932 ("net: sched: make bstats per cpu and estimator RCU safe")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31 17:49:37 -08:00
Eric Dumazet
349c9e3c73 ipv4: icmp: use percpu allocation
Get rid of nr_cpu_ids and use modern percpu allocation.

Note that the sockets themselves are not yet allocated
using NUMA affinity.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-31 17:48:18 -08:00
Marcel Holtmann
f7697b1602 Bluetooth: Store OOB data present value for each set of remote OOB data
Instead of doing complex calculation every time the OOB data is used,
just calculate the OOB data present value and store it with the OOB
data raw values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-31 09:59:45 +02:00
Christoph Hellwig
7cc0566268 net: remove sock_iocb
The sock_iocb structure is allocate on stack for each read/write-like
operation on sockets, and contains various fields of which only the
embedded msghdr and sometimes a pointer to the scm_cookie is ever used.
Get rid of the sock_iocb and put a msghdr directly on the stack and pass
the scm_cookie explicitly to netlink_mmap_sendmsg.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-28 23:15:07 -08:00
Jesse Gross
b8693877ae openvswitch: Add support for checksums on UDP tunnels.
Currently, it isn't possible to request checksums on the outer UDP
header of tunnels - the TUNNEL_CSUM flag is ignored. This adds
support for requesting that UDP checksums be computed on transmit
and properly reported if they are present on receive.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-28 23:04:15 -08:00
David S. Miller
b8f8be3f04 NFC: 3.20 first pull request
This is the first NFC pull request for 3.20.
 
 With this one we have:
 
 - Secure element support for the ST Micro st21nfca driver. This depends
   on a few HCI internal changes in order for example to support more
   than one secure element per controller.
 
 - ACPI support for NXP's pn544 HCI driver. This controller is found on
   many x86 SoCs and is typically enumerated on the ACPI bus there.
 
 - A few st21nfca and st21nfcb fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUyXy7AAoJEIqAPN1PVmxKOY4P/1WJtEZfzBOFMlh9qeA8YESv
 cS1xbZuAVeEJ3r/sgDc87iA/4MMNZDzfhDHQlQJs/pJWAXE3Am1/dZGWHJC6pbk/
 roTlbEh5OvQU8cRIdAvOcEgrBIWk+E30Mkd2OtOMpWyhbgChN7hKz0KUs7znVHBJ
 G3YCcZOPr7K2ra78UlYzApvGqoxiVaiEWQyj6rzx2HbVzxzICL6A5m9cRcZuGxYR
 sK3Y/DpKJKGwH3p1kkLIOqHy3nGhq2LttVSXF/f5xkOB8teERSkt4i8aJBiSb9ym
 a++iAWp2syH3sGh2rsb3G4KRYCYq2J9mJD7oBB3G/6UoNwyeSgW3GdxgH/yxt6C9
 KfzHm9T8jfvQIBlf4lGeboV8zu+ysQehJFNKtxdQDlvwqPiR14clT5JFejocr9+N
 SegCbF+LJaZW8boOZVvhxTxASpEZ0RTFwUIkKKxtXOpK4ha0s1gEAtuZUpTYmRtJ
 wGqds8wszkE0wOdZJi7dsCGpp5JNI0LZZCsuKa6Ko3E7LkTAor8Bbmob0RR51ClD
 srtoz6kJgoozAMRLMISXBimk0geOp38iGs26GPSpRHwSV/1loN6+abaOMYlGbDCq
 +oVpqMmJsS/z5US5+wEkWU+O4y9TS0O74d/TxxcwUM+CLNyKI+Ms1rMOyYi0domo
 2xa7MuDCrmztQbs6ROKT
 =SNWp
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

NFC: 3.20 first pull request

This is the first NFC pull request for 3.20.

With this one we have:

- Secure element support for the ST Micro st21nfca driver. This depends
  on a few HCI internal changes in order for example to support more
  than one secure element per controller.

- ACPI support for NXP's pn544 HCI driver. This controller is found on
  many x86 SoCs and is typically enumerated on the ACPI bus there.

- A few st21nfca and st21nfcb fixes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-28 22:49:55 -08:00
Neal Cardwell
e73ebb0881 tcp: stretch ACK fixes prep
LRO, GRO, delayed ACKs, and middleboxes can cause "stretch ACKs" that
cover more than the RFC-specified maximum of 2 packets. These stretch
ACKs can cause serious performance shortfalls in common congestion
control algorithms that were designed and tuned years ago with
receiver hosts that were not using LRO or GRO, and were instead
politely ACKing every other packet.

This patch series fixes Reno and CUBIC to handle stretch ACKs.

This patch prepares for the upcoming stretch ACK bug fix patches. It
adds an "acked" parameter to tcp_cong_avoid_ai() to allow for future
fixes to tcp_cong_avoid_ai() to correctly handle stretch ACKs, and
changes all congestion control algorithms to pass in 1 for the ACKed
count. It also changes tcp_slow_start() to return the number of packet
ACK "credits" that were not processed in slow start mode, and can be
processed by the congestion control module in additive increase mode.

In future patches we will fix tcp_cong_avoid_ai() to handle stretch
ACKs, and fix Reno and CUBIC handling of stretch ACKs in slow start
and additive increase mode.

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-28 22:18:37 -08:00
Marcel Holtmann
c7741d16a5 Bluetooth: Perform a power cycle when receiving hardware error event
When receiving a HCI Hardware Error event, the controller should be
assumed to be non-functional until issuing a HCI Reset command.

The Bluetooth hardware errors are vendor specific and so add a
new hdev->hw_error callback that drivers can provide to run extra
code to handle the hardware error.

After completing the vendor specific error handling perform a full
reset of the Bluetooth stack by closing and re-opening the transport.

Based-on-patch-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-28 21:26:24 +01:00
Jonathan Toppins
303691042d bonding: cleanup and remove dead code
fix sparse warning about non-static function

drivers/net/bonding/bond_main.c:3737:5: warning: symbol
'bond_3ad_xor_xmit' was not declared. Should it be static?

Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-27 17:09:04 -08:00
Jonathan Toppins
2477bc9a3d bonding: update bond carrier state when min_links option changes
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-27 17:09:03 -08:00
David S. Miller
95f873f2ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	arch/arm/boot/dts/imx6sx-sdb.dts
	net/sched/cls_bpf.c

Two simple sets of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-27 16:59:56 -08:00
Christophe Ricard
8409e4283c NFC: hci: Add cmd_received handler
When a command is received, it is sometime needed to let the CLF driver do
some additional operations. (ex: count remaining pipe notification...)

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-01-28 00:03:34 +01:00
Christophe Ricard
af77522320 NFC: hci: Change nfc_hci_send_response gate parameter to pipe
As there can be several pipes connected to the same gate, we need
to know which pipe ID to use when sending an HCI response. A gate
ID is not enough.

Instead of changing the nfc_hci_send_response() API to something
not aligned with the rest of the HCI API, we call nfc_hci_hcp_message_tx
directly.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-01-27 23:55:20 +01:00
Christophe Ricard
118278f20a NFC: hci: Add pipes table to reference them with a tuple {gate, host}
In order to keep host source information on specific hci event (such as
evt_connectivity or evt_transaction) and because 2 pipes can be connected
to the same gate, it is necessary to add a table referencing every pipe
with a {gate, host} tuple.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-01-27 23:39:32 +01:00
Christophe Ricard
fda7a49cb9 NFC: hci: Change event_received handler gate parameter to pipe
Several pipes may point to the same CLF gate, so getting the gate ID
as an input is not enough.
For example dual secure element may have 2 pipes (1 for uicc and
1 for eSE) pointing to the connectivity gate.

As resolving gate and host IDs can be done from a pipe, we now pass
the pipe ID to the event received handler.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-01-27 23:39:23 +01:00
Jouni Malinen
8ade538bf3 mac80111: Add BIP-GMAC-128 and BIP-GMAC-256 ciphers
This allows mac80211 to configure BIP-GMAC-128 and BIP-GMAC-256 to the
driver and also use software-implementation within mac80211 when the
driver does not support this with hardware accelaration.

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-27 11:10:13 +01:00
Jouni Malinen
00b9cfa3ff mac80111: Add GCMP and GCMP-256 ciphers
This allows mac80211 to configure GCMP and GCMP-256 to the driver and
also use software-implementation within mac80211 when the driver does
not support this with hardware accelaration.

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
[remove a spurious newline]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-27 11:06:09 +01:00
Hannes Frederic Sowa
df4d92549f ipv4: try to cache dst_entries which would cause a redirect
Not caching dst_entries which cause redirects could be exploited by hosts
on the same subnet, causing a severe DoS attack. This effect aggravated
since commit f886497212 ("ipv4: fix dst race in sk_dst_get()").

Lookups causing redirects will be allocated with DST_NOCACHE set which
will force dst_release to free them via RCU.  Unfortunately waiting for
RCU grace period just takes too long, we can end up with >1M dst_entries
waiting to be released and the system will run OOM. rcuos threads cannot
catch up under high softirq load.

Attaching the flag to emit a redirect later on to the specific skb allows
us to cache those dst_entries thus reducing the pressure on allocation
and deallocation.

This issue was discovered by Marcelo Leitner.

Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Marcelo Leitner <mleitner@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-26 17:28:27 -08:00
Joe Stringer
7b1883cefc genetlink: Add genlmsg_parse() helper function.
The first user will be the next patch.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-26 15:45:50 -08:00
Tom Herbert
af33c1adae vxlan: Eliminate dependency on UDP socket in transmit path
In the vxlan transmit path there is no need to reference the socket
for a tunnel which is needed for the receive side. We do, however,
need the vxlan_dev flags. This patch eliminate references
to the socket in the transmit path, and changes VXLAN_F_UNSHAREABLE
to be VXLAN_F_RCV_FLAGS. This mask is used to store the flags
applicable to receive (GBP, CSUM6_RX, and REMCSUM_RX) in the
vxlan_sock flags.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-24 23:15:40 -08:00
Tom Herbert
d998f8efa4 udp: Do not require sock in udp_tunnel_xmit_skb
The UDP tunnel transmit functions udp_tunnel_xmit_skb and
udp_tunnel6_xmit_skb include a socket argument. The socket being
passed to the functions (from VXLAN) is a UDP created for receive
side. The only thing that the socket is used for in the transmit
functions is to get the setting for checksum (enabled or zero).
This patch removes the argument and and adds a nocheck argument
for checksum setting. This eliminates the unnecessary dependency
on a UDP socket for UDP tunnel transmit.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-24 23:15:40 -08:00
Johan Hedberg
a1443f5a27 Bluetooth: Convert Set SC to use HCI Request
This patch converts the Set Secure Connection HCI handling to use a HCI
request instead of using a hard-coded callback in hci_event.c. This e.g.
ensures that we don't clear the flags incorrectly if something goes
wrong with the power up process (not related to a mgmt Set SC command).

The code can also be simplified a bit since only one pending Set SC
command is allowed, i.e. mgmt_pending_foreach usage is not needed.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-01-23 19:07:03 +01:00
Luciano Coelho
9c74893441 nl80211: add an attribute to allow delaying the first scheduled scan cycle
The userspace may want to delay the the first scheduled scan or
net-detect cycle.  Add an optional attribute to the scheduled scan
configuration to pass the delay to be (optionally) used by the driver.

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
[add the attribute to the policy to validate it]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-23 10:30:47 +01:00
Lorenzo Bianconi
db82d8a966 mac80211: enable TPC through mac80211 stack
Control per packet Transmit Power Control (TPC) in lower drivers
according to TX power settings configured by the user. In particular TPC is
enabled if value passed in enum nl80211_tx_power_setting is
NL80211_TX_POWER_LIMITED (allow using less than specified from userspace),
whereas TPC is disabled if nl80211_tx_power_setting is set to
NL80211_TX_POWER_FIXED (use value configured from userspace)

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-23 10:28:51 +01:00
Johannes Berg
fa7e1fbcb5 mac80211: allow drivers to control software crypto
Some drivers unfortunately cannot support software crypto, but
mac80211 currently assumes that they do.

This has the issue that if the hardware enabling fails for some
reason, the software fallback is used, which won't work. This
clearly isn't desirable, the error should be reported and the
key setting refused.

Support this in mac80211 by allowing drivers to set a new HW
flag IEEE80211_HW_SW_CRYPTO_CONTROL, in which case mac80211 will
only allow software fallback if the set_key() method returns 1.
The driver will also need to advertise supported cipher suites
so that mac80211 doesn't advertise any (future) software ciphers
that the driver can't actually do.

While at it, to make it easier to support this, refactor the
ieee80211_init_cipher_suites() code.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-22 22:01:01 +01:00
David S. Miller
0c49087462 Some further updates for net-next:
* fix network-manager which was broken by the previous changes
  * fix delete-station events, which were broken by me making the
    genlmsg_end() mistake
  * fix a timer left running during suspend in some race conditions
    that would cause an annoying (but harmless) warning
  * (less important, but in the tree already) remove 80+80 MHz rate
    reporting since the spec doesn't distinguish it from 160 MHz;
    as the bitrate they're both 160 MHz bandwidth
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUvUZlAAoJEDBSmw7B7bqrfNMQAJT5jjOSjmwW8Zdvujvda/qt
 bFpYa9t0NsN3izzMpjPSrCwPrHN5qE86ZA8TcZrIzejPH4rpltjaXB6JNHZardVo
 deCUWU9xotoPELoE0Xex9mHPEkYYvOaht/P8A/88qP1S2PykMmj9fqNeijyUvwuo
 Jlsh0wKe4Jq6bCmdxvy/bde84ceAQcuh2TKNov1S0tB38tRY9qSu2n6ZGpoMNcEe
 CWuW+23jL1uAvt6Ljk2fTKdLR8iyXykfM0UMX2/4R2PMnJXK/dQqV/eeXTjpxtMk
 UV4aIMcSS19D7HxICKOXOdZLdMMuXaFUnUMlGLBtXZz3n9lZoP7RtVIHoib8VBXZ
 tY7xQrK6YNvwZ4SZZPuv/yr6YWP2+vBM2FUfXjzD+or6uYsej203a5q0RsOY+3Tp
 6Yklr+mQNlrOtpMsHMSgrBUUZsAK1I95ALMVVqaq1hgf1cDvRIDHlOo4A7bjwNFw
 eA3L1o4O1/E/IGp4v6+2Iukc9rIwm11sNr/wuD8dDkZTradUPH1iY6J5sxJNb2Nq
 YdpCnQ/lNXj650+z9/G2omSA6DTTTOtXJPxKR+FOHZVKDpZYtF6TxKb0S79fINps
 6ZlWIna5bUiXF1b6xad1x+vtyjNMgTvkg6mR+WQnvF57Ri8hucbtpv5wpA5bhYUQ
 Fbz9VZF2nfMeIbXfTaWi
 =Bvmr
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2015-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Some further updates for net-next:
 * fix network-manager which was broken by the previous changes
 * fix delete-station events, which were broken by me making the
   genlmsg_end() mistake
 * fix a timer left running during suspend in some race conditions
   that would cause an annoying (but harmless) warning
 * (less important, but in the tree already) remove 80+80 MHz rate
   reporting since the spec doesn't distinguish it from 160 MHz;
   as the bitrate they're both 160 MHz bandwidth

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:22:19 -05:00
Felix Fietkau
22a5dc0e5e net: sched: Introduce connmark action
This tc action allows you to retrieve the connection tracking mark
This action has been used heavily by openwrt for a few years now.

There are known limitations currently:

doesn't work for initial packets, since we only query the ct table.
  Fine given use case is for returning packets

no implicit defrag.
  frags should be rare so fix later..

won't work for more complex tasks, e.g. lookup of other extensions
  since we have no means to store results

we still have a 2nd lookup later on via normal conntrack path.
This shouldn't break anything though since skb->nfct isn't altered.

V2:
remove unnecessary braces (Jiri)
change the action identifier to 14 (Jiri)
Fix some stylistic issues caught by checkpatch
V3:
Move module params to bottom (Cong)
Get rid of tcf_hashinfo_init and friends and conform to newer API (Cong)

Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:02:06 -05:00
Nicolas Dichtel
1728d4fabd tunnels: advertise link netns via netlink
Implement rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is
added to rtnetlink messages.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:32:03 -05:00
Nicolas Dichtel
d37512a277 rtnl: add link netns id to interface messages
This patch adds a new attribute (IFLA_LINK_NETNSID) which contains the 'link'
netns id when this netns is different from the netns where the interface
stands (for example for x-net interfaces like ip tunnels).
With this attribute, it's possible to interpret correctly all advertised
information (like IFLA_LINK, etc.).

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:21:26 -05:00
Nicolas Dichtel
0c7aecd4bd netns: add rtnl cmd to add and get peer netns ids
With this patch, a user can define an id for a peer netns by providing a FD or a
PID. These ids are local to the netns where it is added (ie valid only into this
netns).

The main function (ie the one exported to other module), peernet2id(), allows to
get the id of a peer netns. If no id has been assigned by the user, this
function allocates one.

These ids will be used in netlink messages to point to a peer netns, for example
in case of a x-netns interface.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:21:18 -05:00
Pablo Neira Ayuso
75e8d06d43 netfilter: nf_tables: validate hooks in NAT expressions
The user can crash the kernel if it uses any of the existing NAT
expressions from the wrong hook, so add some code to validate this
when loading the rule.

This patch introduces nft_chain_validate_hooks() which is based on
an existing function in the bridge version of the reject expression.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-01-19 14:52:39 +01:00
Jiri Pirko
27c0013285 switchdev: fix typo in inline function definition
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 12:24:11 -05:00
Martin KaFai Lau
88340160f3 ip_tunnel: Create percpu gro_cell
In the ipip tunnel, the skb->queue_mapping is lost in ipip_rcv().
All skb will be queued to the same cell->napi_skbs.  The
gro_cell_poll is pinned to one core under load.  In production traffic,
we also see severe rx_dropped in the tunl iface and it is probably due to
this limit: skb_queue_len(&cell->napi_skbs) > netdev_max_backlog.

This patch is trying to alloc_percpu(struct gro_cell) and schedule
gro_cell_poll to process the skb in the same core.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 01:56:32 -05:00
Johannes Berg
053c095a82 netlink: make nlmsg_end() and genlmsg_end() void
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.

This makes the very common pattern of

  if (genlmsg_end(...) < 0) { ... }

be a whole bunch of dead code. Many places also simply do

  return nlmsg_end(...);

and the caller is expected to deal with it.

This also commonly (at least for me) causes errors, because it is very
common to write

  if (my_function(...))
    /* error condition */

and if my_function() does "return nlmsg_end()" this is of course wrong.

Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.

Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did

-	return nlmsg_end(...);
+	nlmsg_end(...);
+	return 0;

I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with <= 0 in dump functionality, but that could just
be changed to < 0 with no change in behaviour, so I opted for the more
efficient version.

One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for <0 or <=0 and thus broke out of the loop every single time.
I've preserved this since it will (I think) have caused the messages to
userspace to be formatted differently with just a single message for
every SKB returned to userspace. It's possible that this isn't needed
for the tools that actually use this, but I don't even know what they
are so couldn't test that changing this behaviour would be acceptable.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 01:03:45 -05:00
David S. Miller
e445dd5f67 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-01-16

Here are some more bluetooth & ieee802154 patches intended for 3.20:

 - Refactoring & cleanups of ieee802154 & 6lowpan code
 - Various fixes to the btmrvl driver
 - Fixes for Bluetooth Low Energy Privacy feature handling
 - Added build-time sanity checks for sockaddr sizes
 - Fixes for Security Manager registration on LE-only controllers
 - Refactoring of broken inquiry mode handling to a generic quirk

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 00:25:30 -05:00
Jiri Pirko
3aeb66176f net: replace br_fdb_external_learn_* calls with switchdev notifier events
This patch benefits from newly introduced switchdev notifier and uses it
to propagate fdb learn events from rocker driver to bridge. That avoids
direct function calls and possible use by other listeners (ovs).

Suggested-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 00:23:57 -05:00
Jiri Pirko
03bf0c2812 switchdev: introduce switchdev notifier
This patch introduces new notifier for purposes of exposing events which happen
on switch driver side. The consumers of the event messages are mainly involved
masters, namely bridge and ovs.

Suggested-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 00:23:57 -05:00
Jiri Pirko
d23b8ad8ab tc: add BPF based action
This action provides a possibility to exec custom BPF code.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-17 23:51:10 -05:00
Johannes Berg
ee1c244219 genetlink: synchronize socket closing and family removal
In addition to the problem Jeff Layton reported, I looked at the code
and reproduced the same warning by subscribing and removing the genl
family with a socket still open. This is a fairly tricky race which
originates in the fact that generic netlink allows the family to go
away while sockets are still open - unlike regular netlink which has
a module refcount for every open socket so in general this cannot be
triggered.

Trying to resolve this issue by the obvious locking isn't possible as
it will result in deadlocks between unregistration and group unbind
notification (which incidentally lockdep doesn't find due to the home
grown locking in the netlink table.)

To really resolve this, introduce a "closing socket" reference counter
(for generic netlink only, as it's the only affected family) in the
core netlink code and use that in generic netlink to wait for all the
sockets that are being closed at the same time as a generic netlink
family is removed.

This fixes the race that when a socket is closed, it will should call
the unbind, but if the family is removed at the same time the unbind
will not find it, leading to the warning. The real problem though is
that in this case the unbind could actually find a new family that is
registered to have a multicast group with the same ID, and call its
mcast_unbind() leading to confusing.

Also remove the warning since it would still trigger, but is now no
longer a problem.

This also moves the code in af_netlink.c to before unreferencing the
module to avoid having the same problem in the normal non-genl case.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-16 17:04:25 -05:00
Johannes Berg
f555f3d76a genetlink: document parallel_ops
The kernel-doc for the parallel_ops family struct member is
missing, add it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-16 17:04:24 -05:00
Rickard Strandqvist
0026b6551b Bluetooth: Remove unused function
Remove the function hci_conn_change_link_key() that is not used anywhere.

This was partially found by using a static code analysis program called
cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-16 13:06:38 +02:00
David S. Miller
27f097177d Here's a big pile of changes for this round.
We have
  * a lot of regulatory code changes to deal with the
    way newer Intel devices handle this
  * a change to drop packets while disconnecting from
    an AP instead of trying to wait for them
  * a new attempt at improving the tailroom accounting
    to not kick in too much for performance reasons
  * improvements in wireless link statistics
  * many other small improvements and small fixes that
    didn't seem necessary for 3.19 (e.g. in hwsim which
    is testing only code)
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUt7WEAAoJEDBSmw7B7bqrVBoP/2EViE62HMmXdqG1SZWz8q9o
 Iigq8STC/sT2WCx1pYm+tKuVW4LD2O3mCriGNP8A3RwzDZ6H7sKJYb1gV6QCPV6f
 4+yT5VSAB3D3lHmp/bbyNsmKCBQ5uS4LVgDrokrkbGpacDu94PYS5Wv9t3x6PBVB
 5Xjky6g6A/pSuxTIstSO9k5xkzNjaB1TxvVRz/gJrGcFQVkDFSlVbuTHUVxs8p+p
 k6mwY/2WYijZkswWZVQTJLQlF9vRI7PYkKs5m8gz4pjNU48oFJoyu4IP3Z1Xj/Sm
 zgT1C9rgp0Du74HYO2niGAvLWgKajAZuW5hIacDndUPjYQQBLgGs/bCJGSntM+x9
 XoOdPixdFPT/58ijyYZlmHc8rxPOd2kHsVbwGplp8f195S4VO04D+ejfOaoAUFwX
 v/kMvO3XIFmEH1jjkDAC3OTcRMYVMuENyWl7pFzxHIzPeRiEpQUd9iSdM4yol0F2
 ZyWvKud4U75Sh+aCiDIIBETtdfCRFe12hgKs4COYbI/UYkGPTPrNei/uisopdubT
 JC+7pZOYdSgoX12yVi6ds6DmKE/ZpIQyhIK4wTWgVoszbnfdb9Mw7mJEThwNRjeK
 JJPsbuty7u8HWjXzEqHLoTV3BFv1cgRSJc5Wt0zfME+LzD7iQpEpv+QBAguwwChD
 Osn55Z3FnKEmBdGcOIje
 =vaEW
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2015-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Here's a big pile of changes for this round.

We have
 * a lot of regulatory code changes to deal with the
   way newer Intel devices handle this
 * a change to drop packets while disconnecting from
   an AP instead of trying to wait for them
 * a new attempt at improving the tailroom accounting
   to not kick in too much for performance reasons
 * improvements in wireless link statistics
 * many other small improvements and small fixes that
   didn't seem necessary for 3.19 (e.g. in hwsim which
   is testing only code)

Conflicts:
	drivers/staging/rtl8723au/os_dep/ioctl_cfg80211.c

Minor overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 19:16:56 -05:00
Eric Dumazet
5055c371bf ipv4: per cpu uncached list
RAW sockets with hdrinc suffer from contention on rt_uncached_lock
spinlock.

One solution is to use percpu lists, since most routes are destroyed
by the cpu that created them.

It is unclear why we even have to put these routes in uncached_list,
as all outgoing packets should be freed when a device is dismantled.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: caacf05e5a ("ipv4: Properly purge netdev references on uncached routes.")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 18:26:16 -05:00
Johannes Berg
b51f3beecf cfg80211: change bandwidth reporting to explicit field
For some reason, we made the bandwidth separate flags, which
is rather confusing - a single rate cannot have different
bandwidths at the same time.

Change this to no longer be flags but use a separate field
for the bandwidth ('bw') instead.

While at it, add support for 5 and 10 MHz rates - these are
reported as regular legacy rates with their real bitrate,
but tagged as 5/10 now to make it easier to distinguish them.

In the nl80211 API, the flags are preserved, but the code
now can also clearly only set a single one of the flags.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-15 22:41:32 +01:00
Johannes Berg
97d910d0aa cfg80211: remove 80+80 MHz rate reporting
These rates are treated the same as 160 MHz in the spec, so
it makes no sense to distinguish them. As no driver uses them
yet, this is also not a problem, just remove them.

In the userspace API the field remains reserved to preserve
API and ABI.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-15 16:05:21 +01:00
Johannes Berg
f89903d53f mac80211: remove 80+80 MHz rate reporting
These rates are treated the same as 160 MHz in the spec,
so it makes no sense to distinguish them. As no driver
uses them yet, this is also not a problem, just remove
them.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-15 16:02:46 +01:00
David S. Miller
4e7a84b1a5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
netfilter updates for net-next

The following patchset contains netfilter updates for net-next, just a
bunch of cleanups and small enhancement to selectively flush conntracks
in ctnetlink, more specifically the patches are:

1) Rise default number of buckets in conntrack from 16384 to 65536 in
   systems with >= 4GBytes, patch from Marcelo Leitner.

2) Small refactor to save one level on indentation in xt_osf, from
   Joe Perches.

3) Remove unnecessary sizeof(char) in nf_log, from Fabian Frederick.

4) Another small cleanup to remove redundant variable in nfnetlink,
   from Duan Jiong.

5) Fix compilation warning in nfnetlink_cthelper on parisc, from
   Chen Gang.

6) Fix wrong format in debugging for ctseqadj, from Gao feng.

7) Selective conntrack flushing through the mark for ctnetlink, patch
   from Kristian Evensen.

8) Remove nf_ct_conntrack_flush_report() exported symbol now that is
   not required anymore after the selective flushing patch, again from
   Kristian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 01:50:25 -05:00
Thomas Graf
1dd144cf5b openvswitch: Support VXLAN Group Policy extension
Introduces support for the group policy extension to the VXLAN virtual
port. The extension is disabled by default and only enabled if the user
has provided the respective configuration.

  ovs-vsctl add-port br0 vxlan0 -- \
     set Interface vxlan0 type=vxlan options:exts=gbp

The configuration interface to enable the extension is based on a new
attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION
which can carry additional extensions as needed in the future.

The group policy metadata is stored as binary blob (struct ovs_vxlan_opts)
internally just like Geneve options but transported as nested Netlink
attributes to user space.

Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the
binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced.

The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing
OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 01:11:41 -05:00
Thomas Graf
ac5132d1a0 vxlan: Only bind to sockets with compatible flags enabled
A VXLAN net_device looking for an appropriate socket may only consider
a socket which has a matching set of flags/extensions enabled. If
incompatible flags are enabled, return a conflict to have the caller
create a distinct socket with distinct port.

The OVS VXLAN port is kept unaware of extensions at this point.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 01:11:41 -05:00
Thomas Graf
3511494ce2 vxlan: Group Policy extension
Implements supports for the Group Policy VXLAN extension [0] to provide
a lightweight and simple security label mechanism across network peers
based on VXLAN. The security context and associated metadata is mapped
to/from skb->mark. This allows further mapping to a SELinux context
using SECMARK, to implement ACLs directly with nftables, iptables, OVS,
tc, etc.

The group membership is defined by the lower 16 bits of skb->mark, the
upper 16 bits are used for flags.

SELinux allows to manage label to secure local resources. However,
distributed applications require ACLs to implemented across hosts. This
is typically achieved by matching on L2-L4 fields to identify the
original sending host and process on the receiver. On top of that,
netlabel and specifically CIPSO [1] allow to map security contexts to
universal labels.  However, netlabel and CIPSO are relatively complex.
This patch provides a lightweight alternative for overlay network
environments with a trusted underlay. No additional control protocol
is required.

           Host 1:                       Host 2:

      Group A        Group B        Group B     Group A
      +-----+   +-------------+    +-------+   +-----+
      | lxc |   | SELinux CTX |    | httpd |   | VM  |
      +--+--+   +--+----------+    +---+---+   +--+--+
	  \---+---/                     \----+---/
	      |                              |
	  +---+---+                      +---+---+
	  | vxlan |                      | vxlan |
	  +---+---+                      +---+---+
	      +------------------------------+

Backwards compatibility:
A VXLAN-GBP socket can receive standard VXLAN frames and will assign
the default group 0x0000 to such frames. A Linux VXLAN socket will
drop VXLAN-GBP  frames. The extension is therefore disabled by default
and needs to be specifically enabled:

   ip link add [...] type vxlan [...] gbp

In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket
must run on a separate port number.

Examples:
 iptables:
  host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200
  host2# iptables -I INPUT -m mark --mark 0x200 -j DROP

 OVS:
  # ovs-ofctl add-flow br0 'in_port=1,actions=load:0x200->NXM_NX_TUN_GBP_ID[],NORMAL'
  # ovs-ofctl add-flow br0 'in_port=2,tun_gbp_id=0x200,actions=drop'

[0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy
[1] http://lwn.net/Articles/204905/

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 01:11:41 -05:00
Tom Herbert
dfd8645ea1 vxlan: Remote checksum offload
Add support for remote checksum offload in VXLAN. This uses a
reserved bit to indicate that RCO is being done, and uses the low order
reserved eight bits of the VNI to hold the start and offset values in a
compressed manner.

Start is encoded in the low order seven bits of VNI. This is start >> 1
so that the checksum start offset is 0-254 using even values only.
Checksum offset (transport checksum field) is indicated in the high
order bit in the low order byte of the VNI. If the bit is set, the
checksum field is for UDP (so offset = start + 6), else checksum
field is for TCP (so offset = start + 16). Only TCP and UDP are
supported in this implementation.

Remote checksum offload for VXLAN is described in:

https://tools.ietf.org/html/draft-herbert-vxlan-rco-00

Tested by running 200 TCP_STREAM connections with VXLAN (over IPv4).

With UDP checksums and Remote Checksum Offload
  IPv4
      Client
        11.84% CPU utilization
      Server
        12.96% CPU utilization
      9197 Mbps
  IPv6
      Client
        12.46% CPU utilization
      Server
        14.48% CPU utilization
      8963 Mbps

With UDP checksums, no remote checksum offload
  IPv4
      Client
        15.67% CPU utilization
      Server
        14.83% CPU utilization
      9094 Mbps
  IPv6
      Client
        16.21% CPU utilization
      Server
        14.32% CPU utilization
      9058 Mbps

No UDP checksums
  IPv4
      Client
        15.03% CPU utilization
      Server
        23.09% CPU utilization
      9089 Mbps
  IPv6
      Client
        16.18% CPU utilization
      Server
        26.57% CPU utilization
       8954 Mbps

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14 15:20:04 -05:00
Arik Nemtsov
2c3e861c94 cfg80211: introduce sync regdom set API for self-managed
A self-managed device will sometimes need to set its regdomain synchronously.
Notably it should be set before usermode has a chance to query it. Expose
a new API to accomplish this which requires the RTNL.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-14 09:43:44 +01:00
Jiri Pirko
df8a39defa net: rename vlan_tx_* helpers since "tx" is misleading there
The same macros are used for rx as well. So rename it.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13 17:51:08 -05:00
Jiri Pirko
d8b9605d26 net: sched: fix skb->protocol use in case of accelerated vlan path
tc code implicitly considers skb->protocol even in case of accelerated
vlan paths and expects vlan protocol type here. However, on rx path,
if the vlan header was already stripped, skb->protocol contains value
of next header. Similar situation is on tx path.

So for skbs that use skb->vlan_tci for tagging, use skb->vlan_proto instead.

Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13 17:51:08 -05:00
Tom Herbert
3bf3947526 vxlan: Improve support for header flags
This patch cleans up the header flags of VXLAN in anticipation of
defining some new ones:

- Move header related definitions from vxlan.c to vxlan.h
- Change VXLAN_FLAGS to be VXLAN_HF_VNI (only currently defined flag)
- Move check for unknown flags to after we find vxlan_sock, this
  assumes that some flags may be processed based on tunnel
  configuration
- Add a comment about why the stack treating unknown set flags as an
  error instead of ignoring them

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:05:01 -05:00
Marcel Holtmann
039d4e410c Bluetooth: Add missing response structure for HCI Delete Stored Link Key
This patch adds this missing structure for processing the result of the
HCI Delete Stored Link Key command.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-12 21:55:40 +02:00
Marcel Holtmann
c2f0f97927 Bluetooth: Handle command complete event for HCI Read Stored Link Keys
When the HCI Read Stored Link Keys command completes it gives useful
information of the current stored keys and maximum keys a controller
can actually store. So process this event and store these information
in hci_dev structure.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-12 21:54:16 +02:00
Marcel Holtmann
cb9627806c Bluetooth: Add defintions for HCI Read Stored Link Key command
This patch adds the missing commmand structure and command complete
structure for the HCI Read Store Link Key command.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-12 21:53:54 +02:00
Marcel Holtmann
1904a853fa Bluetooth: Add opcode parameter to hci_req_complete_t callback
When hci_req_run() calls its provided complete function and one of the
HCI commands in the sequence fails, then provide the opcode of failing
command. In case of success HCI_OP_NOP is provided since all commands
completed.

This patch fixes the prototype of hci_req_complete_t and all its users.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-12 11:16:31 +02:00
Johannes Berg
6de39808cf nl80211: support per-TID station statistics
The base for the current statistics is pretty mixed up, support
exporting RX/TX statistics for MSDUs per TID. This (currently)
covers received MSDUs, transmitted MSDUs and retries/failures
thereof.

Doing it per TID for MSDUs makes more sense than say only per AC
because it's symmetric - we could export per-AC statistics for all
frames (which AC we used for transmission can be determined also
for management frames) but per TID is better and usually data
frames are really the ones we care about. Also, on RX we can't
determine the AC - but we do know the TID for any QoS MPDU we
received.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:28:18 +01:00
Johannes Berg
8d791361a4 nl80211: clarify packet statistics descriptions
The current statistics we keep aren't very clear, some are on
MPDUs and some on MSDUs/MMPDUs. Clarify the descriptions based
on the counters mac80211 keeps.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:28:16 +01:00
Johannes Berg
a76b1942a1 cfg80211: add nl80211 beacon-only statistics
Add these two values:
 * BEACON_RX: number of beacons received from this peer
 * BEACON_SIGNAL_AVG: signal strength average for beacons only

These can then be used for Android Lollipop's statistics request.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:28:13 +01:00
Johannes Berg
319090bf6c cfg80211: remove enum station_info_flags
This is really just duplicating the list of information that's
already available in the nl80211 attribute, so remove the list.
Two small changes are needed:
 * remove STATION_INFO_ASSOC_REQ_IES complete, but the length
   (assoc_req_ies_len) can be used instead
 * add NL80211_STA_INFO_RX_DROP_MISC which exists internally
   but not in nl80211 yet

This gets rid of the duplicate maintenance of the two lists.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:28:10 +01:00
Johannes Berg
2b9a7e1bac mac80211: allow drivers to provide most station statistics
In many cases, drivers can filter things like beacons that will
skew statistics reported by mac80211. To get correct statistics
in these cases, call drivers to obtain statistics and let them
override all values, filling values from mac80211 if the driver
didn't provide them. Not all of them make sense for the driver
to fill, so some are still always done by mac80211.

Note that this doesn't currently allow a driver to say "I know
this value is wrong, don't report it at all", or to sum it up
with a mac80211 value (as could be useful for "dropped misc"),
that can be added if it turns out to be needed.

This also gets rid of the get_rssi() method as is can now be
implemented using sta_statistics().

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:28:06 +01:00
Johannes Berg
cf5ead822d cfg80211: allow including station info in delete event
When a station is removed, its statistics may be interesting to
userspace, for example for further aggregation of statistics of
all stations that ever connected to an AP.

Introduce a new cfg80211_del_sta_sinfo() function (and make the
cfg80211_del_sta() a static inline calling it) to allow passing
a struct station_info along with this, and send the data in the
nl80211 event message.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:28:00 +01:00
Johannes Berg
052536abfa cfg80211: add scan time to survey data
Add the time spent scanning to the survey data so it can be
reported by drivers that collect such information.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:27:58 +01:00
Johannes Berg
11f78ac32b cfg80211: allow survey data to return global data
Not all devices are able to report survey data (particularly
time spent for various operations) per channel. As all these
statistics already exist in survey data, allow such devices
to report them (if userspace requested it)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:27:54 +01:00
Johannes Berg
4ed20bebf5 cfg80211: remove "channel" from survey names
All of the survey data is (currently) per channel anyway,
so having the word "channel" in the name does nothing. In
the next patch I'll introduce global data to the survey,
where the word "channel" is actually confusing.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-08 15:27:52 +01:00
Kristian Evensen
ae406bd057 netfilter: conntrack: Remove nf_ct_conntrack_flush_report
The only user of nf_ct_conntrack_flush_report() was ctnetlink_del_conntrack().
After adding support for flushing connections with a given mark, this function
is no longer called.

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-01-08 12:16:58 +01:00
Ido Yariv
db12847ca8 mac80211: Re-fix accounting of the tailroom-needed counter
When hw acceleration is enabled, the GENERATE_IV or PUT_IV_SPACE flags
only require headroom space. Therefore, the tailroom-needed counter can
safely be decremented for most drivers.

The older incarnation of this patch (ca34e3b5) assumed that the above
holds true for all drivers. As reported by Christopher Chavez and
researched by Christian Lamparter and Larry Finger, this isn't a valid
assumption for p54 and cw1200.

Drivers that still require tailroom for ICV/MIC even when HW encryption
is enabled can use IEEE80211_KEY_FLAG_RESERVE_TAILROOM to indicate it.

Signed-off-by: Ido Yariv <idox.yariv@intel.com>
Cc: Christopher Chavez <chrischavez@gmx.us>
Cc: Christian Lamparter <chunkeey@googlemail.com>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Solomon Peachy <pizza@shaftnet.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-07 14:39:32 +01:00
Johannes Berg
3a4b0c948d Merge branch 'mac80211' into mac80211-next
Merge mac80211.git to get some changes that would otherwise
cause conflicts with new changes coming here.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-07 14:39:16 +01:00
David S. Miller
44d84d7272 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-01-06 22:29:20 -05:00
David S. Miller
15ecf7a063 Here's just a single fix - a revert of a patch that broke the
p54 and cw2100 drivers (arguably due to bad assumptions there.)
 Since this affects kernels since 3.17, I decided to revert for
 now and we'll revisit this optimisation properly for -next.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUq9EvAAoJEDBSmw7B7bqr7mMQAJRbdepVVqK5IFH1BH0NC7vi
 LkBE2lp8ZAz1Crg+OAQNUdlZUHtGyfYoXSfzezmrMG51i5xjHyOYQQikW7aJ2SQ0
 XsjJJ5TcqKe83NwoakXUrMpE7KmCt/LnbjKNXDsZIvLlUkqa7ksXaS7btK195aXy
 WlVrmUE+BqT9a16VjFLZ6wRjI43+3bGxhtFL+g1eXw6nZ4a2o4EbIXdc9SN+/bT4
 tAhWJfdAQqQc34jhesWGbMIvkXWhzy2R6Js+9gMIBNsmlAiYbFa4QZ/9tI3nBI/O
 yHSiDc7JnPNjkkC+3wTJxMl7mEd6fEKnAS1ryZ5L4XhPrQpV39iZuWSPvPGw6LLW
 kB6+wXkIyQdCSoyrQZxY75ibqOUKYYxhhkSYfMePXRKTYY6MlHYqiH8wPWFpPoqO
 iumLqx8/CtRW1q1t2EBAG6rZLRF8HqmfqtB+ptT0DWcAP8E81q8BImPoPFr+P9S2
 XfuuSw97xKCcilOcYJ0uYSBe4XNNhy1dtC/zJ8cA9nV4WNkofALga5Z/t8ARhDsM
 wvP1D2uIX3U9My17bXq+Xn/fSSS7yhpLZjEHj/JNRvpDCWGf/tQl6A3ydMy//Oqe
 lRSKfmiAGysqhXnmK12+YhfO+4ioTz8dA88tHs1AO8qasfQwx45eRsUPemWeExiL
 9Lntb0U6MhYvgiTdWqt6
 =CnbJ
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2015-01-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Here's just a single fix - a revert of a patch that broke the
p54 and cw2100 drivers (arguably due to bad assumptions there.)
Since this affects kernels since 3.17, I decided to revert for
now and we'll revisit this optimisation properly for -next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-06 13:29:27 -05:00
Gautam Kumar Shukla
d75bb06b61 cfg80211: add extensible feature flag attribute
With the wiphy::features flag being used up this patch adds a
new field wiphy::ext_features. Considering extensibility this
new field is declared as a byte array. This extensible flag is
exposed to user-space by NL80211_ATTR_EXT_FEATURES.

Cc: Avinash Patil <patila@marvell.com>
Signed-off-by: Gautam (Gautam Kumar) Shukla <gautams@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-06 12:10:24 +01:00
Daniel Borkmann
81164413ad net: tcp: add per route congestion control
This work adds the possibility to define a per route/destination
congestion control algorithm. Generally, this opens up the possibility
for a machine with different links to enforce specific congestion
control algorithms with optimal strategies for each of them based
on their network characteristics, even transparently for a single
application listening on all links.

For our specific use case, this additionally facilitates deployment
of DCTCP, for example, applications can easily serve internal
traffic/dsts in DCTCP and external one with CUBIC. Other scenarios
would also allow for utilizing e.g. long living, low priority
background flows for certain destinations/routes while still being
able for normal traffic to utilize the default congestion control
algorithm. We also thought about a per netns setting (where different
defaults are possible), but given its actually a link specific
property, we argue that a per route/destination setting is the most
natural and flexible.

The administrator can utilize this through ip-route(8) by appending
"congctl [lock] <name>", where <name> denotes the name of a
congestion control algorithm and the optional lock parameter allows
to enforce the given algorithm so that applications in user space
would not be allowed to overwrite that algorithm for that destination.

The dst metric lookups are being done when a dst entry is already
available in order to avoid a costly lookup and still before the
algorithms are being initialized, thus overhead is very low when the
feature is not being used. While the client side would need to drop
the current reference on the module, on server side this can actually
even be avoided as we just got a flat-copied socket clone.

Joint work with Florian Westphal.

Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Daniel Borkmann
ea69763999 net: tcp: add RTAX_CC_ALGO fib handling
This patch adds the minimum necessary for the RTAX_CC_ALGO congestion
control metric to be set up and dumped back to user space.

While the internal representation of RTAX_CC_ALGO is handled as a u32
key, we avoided to expose this implementation detail to user space, thus
instead, we chose the netlink attribute that is being exchanged between
user space to be the actual congestion control algorithm name, similarly
as in the setsockopt(2) API in order to allow for maximum flexibility,
even for 3rd party modules.

It is a bit unfortunate that RTAX_QUICKACK used up a whole RTAX slot as
it should have been stored in RTAX_FEATURES instead, we first thought
about reusing it for the congestion control key, but it brings more
complications and/or confusion than worth it.

Joint work with Florian Westphal.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Daniel Borkmann
c5c6a8ab45 net: tcp: add key management to congestion control
This patch adds necessary infrastructure to the congestion control
framework for later per route congestion control support.

For a per route congestion control possibility, our aim is to store
a unique u32 key identifier into dst metrics, which can then be
mapped into a tcp_congestion_ops struct. We argue that having a
RTAX key entry is the most simple, generic and easy way to manage,
and also keeps the memory footprint of dst entries lower on 64 bit
than with storing a pointer directly, for example. Having a unique
key id also allows for decoupling actual TCP congestion control
module management from the FIB layer, i.e. we don't have to care
about expensive module refcounting inside the FIB at this point.

We first thought of using an IDR store for the realization, which
takes over dynamic assignment of unused key space and also performs
the key to pointer mapping in RCU. While doing so, we stumbled upon
the issue that due to the nature of dynamic key distribution, it
just so happens, arguably in very rare occasions, that excessive
module loads and unloads can lead to a possible reuse of previously
used key space. Thus, previously stale keys in the dst metric are
now being reassigned to a different congestion control algorithm,
which might lead to unexpected behaviour. One way to resolve this
would have been to walk FIBs on the actually rare occasion of a
module unload and reset the metric keys for each FIB in each netns,
but that's just very costly.

Therefore, we argue a better solution is to reuse the unique
congestion control algorithm name member and map that into u32 key
space through jhash. For that, we split the flags attribute (as it
currently uses 2 bits only anyway) into two u32 attributes, flags
and key, so that we can keep the cacheline boundary of 2 cachelines
on x86_64 and cache the precalculated key at registration time for
the fast path. On average we might expect 2 - 4 modules being loaded
worst case perhaps 15, so a key collision possibility is extremely
low, and guaranteed collision-free on LE/BE for all in-tree modules.
Overall this results in much simpler code, and all without the
overhead of an IDR. Due to the deterministic nature, modules can
now be unloaded, the congestion control algorithm for a specific
but unloaded key will fall back to the default one, and on module
reload time it will switch back to the expected algorithm
transparently.

Joint work with Florian Westphal.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Florian Westphal
e715b6d3a5 net: fib6: convert cfg metric to u32 outside of table write lock
Do the nla validation earlier, outside the write lock.

This is needed by followup patch which needs to be able to call
request_module (which can sleep) if needed.

Joint work with Daniel Borkmann.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Tom Herbert
ad6f939ab1 ip: Add offset parameter to ip_cmsg_recv
Add ip_cmsg_recv_offset function which takes an offset argument
that indicates the starting offset in skb where data is being received
from. This will be useful in the case of UDP and provided checksum
to user space.

ip_cmsg_recv is an inline call to ip_cmsg_recv_offset with offset of
zero.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:46 -05:00
Tom Herbert
5961de9f19 ip: Add offset parameter to ip_cmsg_recv
Add ip_cmsg_recv_offset function which takes an offset argument
that indicates the starting offset in skb where data is being received
from. This will be useful in the case of UDP and provided checksum
to user space.

ip_cmsg_recv is an inline call to ip_cmsg_recv_offset with offset of
zero.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:46 -05:00
Tom Herbert
c44d13d6f3 ip: IP cmsg cleanup
Move the IP_CMSG_* constants from ip_sockglue.c to inet_sock.h so that
they can be referenced in other source files.

Restructure ip_cmsg_recv to not go through flags using shift, check
for flags by 'and'. This eliminates both the shift and a conditional
per flag check.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:46 -05:00
Tom Herbert
224d019c4f ip: Move checksum convert defines to inet
Move convert_csum from udp_sock to inet_sock. This allows the
possibility that we can use convert checksum for different types
of sockets and also allows convert checksum to be enabled from
inet layer (what we'll want to do when enabling IP_CHECKSUM cmsg).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:46 -05:00
Thomas Graf
149118d893 netlink: Warn on unordered or illegal nla_nest_cancel() or nlmsg_cancel()
Calling nla_nest_cancel() in a different order as the nesting was
built up can lead to negative offsets being calculated which
results in skb_trim() being called with an underflowed unsigned
int. Warn if mark < skb->data as it's definitely a bug.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:38:22 -05:00
Johannes Berg
1e359a5de8 Revert "mac80211: Fix accounting of the tailroom-needed counter"
This reverts commit ca34e3b5c8.

It turns out that the p54 and cw2100 drivers assume that there's
tailroom even when they don't say they really need it. However,
there's currently no way for them to explicitly say they do need
it, so for now revert this.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=90331.

Cc: stable@vger.kernel.org
Fixes: ca34e3b5c8 ("mac80211: Fix accounting of the tailroom-needed counter")
Reported-by: Christopher Chavez <chrischavez@gmx.us>
Bisected-by: Larry Finger <Larry.Finger@lwfinger.net>
Debugged-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-05 10:33:46 +01:00
Jesse Gross
df5dba8e52 geneve: Remove socket hash table.
The hash table for open Geneve ports is used only on creation and
deletion time. It is not performance critical and is not likely to
grow to a large number of items. Therefore, this can be changed
to use a simple linked list.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-04 22:21:33 -05:00
Jesse Gross
829a3ada9c geneve: Simplify locking.
The existing Geneve locking scheme was pulled over directly from
VXLAN. However, VXLAN has a number of built in mechanisms which make
the locking more complex and are unlikely to be necessary with Geneve.
This simplifies the locking to use a basic scheme of a mutex
when doing updates plus RCU on receive.

In addition to making the code easier to read, this also avoids the
possibility of a race when creating or destroying sockets since
UDP sockets and the list of Geneve sockets are protected by different
locks. After this change, the entire operation is atomic.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-04 22:21:33 -05:00
Jesse Gross
61f3cade76 geneve: Remove workqueue.
The work queue is used only to free the UDP socket upon destruction.
This is not necessary with Geneve and generally makes the code more
difficult to reason about. It also introduces nondeterministic
behavior such as when a socket is rapidly deleted and recreated, which
could fail as the the deletion happens asynchronously.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-04 22:21:33 -05:00
Marcel Holtmann
043ec9bf7b Bluetooth: Introduce HCI_QUIRK_FIXUP_INQUIRY_MODE option
The HCI_QUIRK_FIXUP_INQUIRY_MODE option allows to force Inquiry Result
with RSSI setting on controllers that do not indicate support for it,
but where it is known to be fully functional.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-03 22:31:09 +02:00
Marcel Holtmann
05b3c3e790 Bluetooth: Remove no longer needed force_sc_support debugfs option
The force_sc_support debugfs option was introduced to easily work with
pre-production Bluetooth 4.1 silicon. This option is no longer needed
since controllers supporting BR/EDR Secure Connections feature are now
available.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-02 22:22:04 +01:00
Marcel Holtmann
91389af67c Bluetooth: Remove broken force_lesc_support debugfs option
The force_lesc_support debugfs option never really worked. It has a race
condition between creating the debugfs entry and registering the L2CAP
fixed channel for BR/EDR SMP support.

Also this has been replaced with a working force_bredr_smp debugfs
switch that developers can use now.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-02 22:22:03 +01:00
Marcel Holtmann
300acfdec9 Bluetooth: Introduce force_bredr_smp debugfs option for testing
Testing cross-transport pairing that starts on BR/EDR is only valid when
using a controller with BR/EDR Secure Connections. Devices will indicate
this by providing BR/EDR SMP fixed channel over L2CAP. To allow testing
of this feature on Bluetooth 4.0 controller or controllers without the
BR/EDR Secure Connections features, introduce a force_bredr_smp debugfs
option that allows faking the required AES connection.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-01-02 22:22:03 +01:00
David S. Miller
6c032edc8a Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg say:

====================
pull request: bluetooth-next 2014-12-31

Here's the first batch of bluetooth patches for 3.20.

 - Cleanups & fixes to ieee802154  drivers
 - Fix synchronization of mgmt commands with respective HCI commands
 - Add self-tests for LE pairing crypto functionality
 - Remove 'BlueFritz!' specific handling from core using a new quirk flag
 - Public address configuration support for ath3012
 - Refactor debugfs support into a dedicated file
 - Initial support for LE Data Length Extension feature from Bluetooth 4.2

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-02 15:58:21 -05:00
Alexander Duyck
345e9b5426 fib_trie: Push rcu_read_lock/unlock to callers
This change is to start cleaning up some of the rcu_read_lock/unlock
handling.  I realized while reviewing the code there are several spots that
I don't believe are being handled correctly or are masking warnings by
locally calling rcu_read_lock/unlock instead of calling them at the correct
level.

A common example is a call to fib_get_table followed by fib_table_lookup.
The rcu_read_lock/unlock ought to wrap both but there are several spots where
they were not wrapped.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-31 18:25:54 -05:00
Johannes Berg
023e2cfa36 netlink/genetlink: pass network namespace to bind/unbind
Netlink families can exist in multiple namespaces, and for the most
part multicast subscriptions are per network namespace. Thus it only
makes sense to have bind/unbind notifications per network namespace.

To achieve this, pass the network namespace of a given client socket
to the bind/unbind functions.

Also do this in generic netlink, and there also make sure that any
bind for multicast groups that only exist in init_net is rejected.
This isn't really a problem if it is accepted since a client in a
different namespace will never receive any notifications from such
a group, but it can confuse the family if not rejected (it's also
possible to silently (without telling the family) accept it, but it
would also have to be ignored on unbind so families that take any
kind of action on bind/unbind won't do unnecessary work for invalid
clients like that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-27 03:07:50 -05:00
Johannes Berg
c380d9a7af genetlink: pass multicast bind/unbind to families
In order to make the newly fixed multicast bind/unbind
functionality in generic netlink, pass them down to the
appropriate family.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-27 02:20:23 -05:00
Johannes Berg
f8403a2e47 genetlink: pass only network namespace to genl_has_listeners()
There's no point to force the caller to know about the internal
genl_sock to use inside struct net, just have them pass the network
namespace. This doesn't really change code generation since it's
an inline, but makes the caller less magic - there's never any
reason to pass another socket.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-27 02:20:23 -05:00
Jesse Gross
5f35227ea3 net: Generalize ndo_gso_check to ndo_features_check
GSO isn't the only offload feature with restrictions that
potentially can't be expressed with the current features mechanism.
Checksum is another although it's a general issue that could in
theory apply to anything. Even if it may be possible to
implement these restrictions in other ways, it can result in
duplicate code or inefficient per-packet behavior.

This generalizes ndo_gso_check so that drivers can remove any
features that don't make sense for a given packet, similar to
netif_skb_features(). It also converts existing driver
restrictions to the new format, completing the work that was
done to support tunnel protocols since the issues apply to
checksums as well.

By actually removing features from the set that are used to do
offloading, it solves another problem with the existing
interface. In these cases, GSO would run with the original set
of features and not do anything because it appears that
segmentation is not required.

CC: Tom Herbert <therbert@google.com>
CC: Joe Stringer <joestringer@nicira.com>
CC: Eric Dumazet <edumazet@google.com>
CC: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by:  Tom Herbert <therbert@google.com>
Fixes: 04ffcb255f ("net: Add ndo_gso_check")
Tested-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-26 17:20:56 -05:00
Nicolas Dichtel
ef8f342b43 neigh: remove next ptr from struct neigh_table
After commit
d7480fd3b1 ("neigh: remove dynamic neigh table registration support"),
this field is not used anymore.

CC: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-26 17:07:08 -05:00
Marcel Holtmann
711ffa78f4 Bluetooth: Introduce HCI_QUIRK_BROKEN_LOCAL_COMMANDS constant
Some controllers advertise support for Bluetooth 1.2 specification,
but they do not support the HCI Read Local Supported Commands command.

If that is the case, then the driver can quirk the behavior and force
the core to skip this command. This will allow removing vendor specific
checks out of the core.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-26 20:16:10 +02:00
Marcel Holtmann
72e4a6bd02 Bluetooth: Remove duplicate constant for RFCOMM PSM
The RFCOMM_PSM constant is actually a duplicate. So remove it and
use the L2CAP_PSM_RFCOMM constant instead.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-20 19:55:04 +02:00
Marcel Holtmann
23b9ceb74f Bluetooth: Create debugfs directory for each connection handle
For every internal representation of a Bluetooth connection which is
identified by hci_conn, create a debugfs directory with the handle
number as directory name.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-20 19:54:24 +02:00
Marcel Holtmann
a8e1bfaa55 Bluetooth: Store default and maximum LE data length settings
When the controller supports the LE Data Length Extension feature, the
default and maximum data length are read and now stored.

For backwards compatibility all values are initialized to the data
length values from Bluetooth 4.1 and earlier specifications.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-20 17:52:21 +02:00
Marcel Holtmann
94a3bd02a6 Bluetooth: Add structures for LE Data Length Extension feature
This patch adds the structures for HCI commands and events of the
LE Data Length Extension feature from Bluetooth 4.2 specification.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-20 17:51:41 +02:00
Johan Hedberg
5a154e6f71 Bluetooth: Fix Add Device to wait for HCI before sending cmd_complete
This patch updates the Add Device mgmt command handler to use a
hci_request to wait for HCI command completion before notifying user
space of the mgmt command completion. To do this we need to add an extra
hci_request parameter to the hci_conn_params_set function. Since this
function has no other users besides mgmt.c it's moved there as a static
function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-19 22:06:37 +01:00
Johan Hedberg
2cf22218b0 Bluetooth: Add hci_request support for hci_update_background_scan
Many places using hci_update_background_scan() try to synchronize
whatever they're doing with the help of hci_request callbacks. However,
since the hci_update_background_scan() function hasn't so far accepted a
hci_request pointer any commands triggered by it have been left out by
the synchronization. This patch modifies the API in a similar way as was
done for hci_update_page_scan, i.e. there's a variant that takes a
hci_request and another one that takes a hci_dev.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-19 22:06:37 +01:00
Johan Hedberg
0857dd3bed Bluetooth: Split hci_request helpers to hci_request.[ch]
None of the hci_request related things in net/bluetooth/hci_core.h are
needed anywhere outside of the core bluetooth module. This patch creates
a new net/bluetooth/hci_request.c file with its corresponding h-file and
moves the functionality there from hci_core.c and hci_core.h.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-19 13:04:42 +01:00
Johan Hedberg
1d2dc5b7b3 Bluetooth: Split hci_update_page_scan into two functions
To keep the parameter list and its semantics clear it makes sense to
split the hci_update_page_scan function into two separate functions: one
taking a hci_dev and another taking a hci_request. The one taking a
hci_dev constructs its own hci_request and then calls the other
function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-19 12:52:18 +01:00
Jukka Rissanen
cab9e3a055 Bluetooth: 6lowpan: Add IPSP PSM value
The Internet Protocol Support Profile a.k.a BT 6LoWPAN specification
is ready so PSM value for it is now known.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-19 11:12:05 +01:00
Alexander Aring
ba2a9506a7 nl802154: introduce support for cca settings
This patch adds support for setting cca parameters via nl802154.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-19 00:19:23 +01:00
Alexander Aring
7fe9a3882b ieee802154: rework cca setting
The current cca setting handle is a driver specific call. We need to
introduce some 802.15.4 specific layer and mapping 802.15.4 cca modes to
driver specific ones inside the 802.15.4 driver. This patch will add
such 802.15.4 layer and mapping the cca settings to driver specific ones.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-19 00:19:23 +01:00
Alexander Aring
b40d6376ff nl802154: introduce cca mode enums
This patch adds enums for 802.15.4 specific CCA settings.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-19 00:19:23 +01:00
Jukka Rissanen
93a1e86ce1 nl80211: Stop scheduled scan if netlink client disappears
An attribute NL80211_ATTR_SOCKET_OWNER can be set by the scan initiator.
If present, the attribute will cause the scan to be stopped if the client
dies.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-12-18 14:38:44 +01:00
Jukka Rissanen
31a60ed1e9 nl80211: Convert sched_scan_req pointer to RCU pointer
Because of possible races when accessing sched_scan_req pointer in
rdev, the sched_scan_req is converted to RCU pointer.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-12-18 14:38:09 +01:00
Jonathan Doron
b0d7aa5959 cfg80211: allow wiphy specific regdomain management
Add a new regulatory flag that allows a driver to manage regdomain
changes/updates for its own wiphy.
A self-managed wiphys only employs regulatory information obtained from
the FW and driver and does not use other cfg80211 sources like
beacon-hints, country-code IEs and hints from other devices on the same
system. Conversely, a self-managed wiphy does not share its regulatory
hints with other devices in the system. If a system contains several
devices, one or more of which are self-managed, there might be
contradictory regulatory settings between them. Usage of flag is
generally discouraged. Only use it if the FW/driver is incompatible
with non-locally originated hints.

A new API lets the driver send a complete regdomain, to be applied on
its wiphy only.

After a wiphy-specific regdomain change takes place, usermode will get
a new type of change notification. The regulatory core also takes care
enforce regulatory restrictions, in case some interfaces are on
forbidden channels.

Signed-off-by: Jonathan Doron <jonathanx.doron@intel.com>
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-12-17 11:49:55 +01:00
Linus Torvalds
603ba7e41b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile #2 from Al Viro:
 "Next pile (and there'll be one or two more).

  The large piece in this one is getting rid of /proc/*/ns/* weirdness;
  among other things, it allows to (finally) make nameidata completely
  opaque outside of fs/namei.c, making for easier further cleanups in
  there"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  coda_venus_readdir(): use file_inode()
  fs/namei.c: fold link_path_walk() call into path_init()
  path_init(): don't bother with LOOKUP_PARENT in argument
  fs/namei.c: new helper (path_cleanup())
  path_init(): store the "base" pointer to file in nameidata itself
  make default ->i_fop have ->open() fail with ENXIO
  make nameidata completely opaque outside of fs/namei.c
  kill proc_ns completely
  take the targets of /proc/*/ns/* symlinks to separate fs
  bury struct proc_ns in fs/proc
  copy address of proc_ns_ops into ns_common
  new helpers: ns_alloc_inum/ns_free_inum
  make proc_ns_operations work with struct ns_common * instead of void *
  switch the rest of proc_ns_operations to working with &...->ns
  netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
  make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
  common object embedded into various struct ....ns
2014-12-16 15:53:03 -08:00
Johannes Berg
848955ccf0 mac80211: move U-APSD enablement to vif flags
In order to let drivers have more dynamic U-APSD support,
move the enablement flag to the virtual interface driver
flags. This lets drivers not only set it up differently
for different interfaces, but also enable/disable on the
fly if needed.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-12-15 12:34:45 +01:00
Linus Torvalds
e3aa91a7cb Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 - The crypto API is now documented :)
 - Disallow arbitrary module loading through crypto API.
 - Allow get request with empty driver name through crypto_user.
 - Allow speed testing of arbitrary hash functions.
 - Add caam support for ctr(aes), gcm(aes) and their derivatives.
 - nx now supports concurrent hashing properly.
 - Add sahara support for SHA1/256.
 - Add ARM64 version of CRC32.
 - Misc fixes.

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits)
  crypto: tcrypt - Allow speed testing of arbitrary hash functions
  crypto: af_alg - add user space interface for AEAD
  crypto: qat - fix problem with coalescing enable logic
  crypto: sahara - add support for SHA1/256
  crypto: sahara - replace tasklets with kthread
  crypto: sahara - add support for i.MX53
  crypto: sahara - fix spinlock initialization
  crypto: arm - replace memset by memzero_explicit
  crypto: powerpc - replace memset by memzero_explicit
  crypto: sha - replace memset by memzero_explicit
  crypto: sparc - replace memset by memzero_explicit
  crypto: algif_skcipher - initialize upon init request
  crypto: algif_skcipher - removed unneeded code
  crypto: algif_skcipher - Fixed blocking recvmsg
  crypto: drbg - use memzero_explicit() for clearing sensitive data
  crypto: drbg - use MODULE_ALIAS_CRYPTO
  crypto: include crypto- module prefix in template
  crypto: user - add MODULE_ALIAS
  crypto: sha-mb - remove a bogus NULL check
  crytpo: qat - Fix 64 bytes requests
  ...
2014-12-13 13:33:26 -08:00
Sujith Manoharan
5cf16616e1 mac80211: Fix accounting of multicast frames
Since multicast frames are marked as no-ack, using
IEEE80211_TX_STAT_ACK to check if they have been
successfully transmitted by the driver is incorrect
since a driver can choose to ignore transmission status
for no-ack frames. This results in incorrect accounting
for such frames.

To fix this issue, this patch introduces a new flag
that can be used by drivers to indicate error-free
transmission of no-ack frames.

Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
[add a note about not setting the flag for non-no-ack frames]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-12-12 13:48:43 +01:00
Sujith Manoharan
6b127c71fb mac80211: Move IEEE80211_TX_CTL_PS_RESPONSE
Move IEEE80211_TX_CTL_PS_RESPONSE to info->control.flags since
this is used only in the TX path (by ath9k). This frees up
a bit which can be used for other purposes.

Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-12-12 13:48:26 +01:00
Linus Torvalds
70e71ca0af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

 2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers.  Thanks to Al Viro
    and Herbert Xu.

 3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

 4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

 5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

 6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

 7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

 8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

 9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets.  From Alexei
    Starovoitov.

10) Support TSO/LSO in sunvnet driver, from David L Stevens.

11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

12) Remote checksum offload, from Tom Herbert.

13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

14) Add MPLS support to openvswitch, from Simon Horman.

15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet.  This tries to resolve the conflicting goals between the
    desired handling of bulk vs.  RPC-like traffic.

17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU.  From Eric Dumazet.

18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

21) Add VLAN packet scheduler action, from Jiri Pirko.

22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
  Fix race condition between vxlan_sock_add and vxlan_sock_release
  net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
  net/mlx4: Add support for A0 steering
  net/mlx4: Refactor QUERY_PORT
  net/mlx4_core: Add explicit error message when rule doesn't meet configuration
  net/mlx4: Add A0 hybrid steering
  net/mlx4: Add mlx4_bitmap zone allocator
  net/mlx4: Add a check if there are too many reserved QPs
  net/mlx4: Change QP allocation scheme
  net/mlx4_core: Use tasklet for user-space CQ completion events
  net/mlx4_core: Mask out host side virtualization features for guests
  net/mlx4_en: Set csum level for encapsulated packets
  be2net: Export tunnel offloads only when a VxLAN tunnel is created
  gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
  cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
  net: fec: only enable mdio interrupt before phy device link up
  net: fec: clear all interrupt events to support i.MX6SX
  net: fec: reset fep link status in suspend function
  net: sock: fix access via invalid file descriptor
  net: introduce helper macro for_each_cmsghdr
  ...
2014-12-11 14:27:06 -08:00
Linus Torvalds
b6da0076ba Merge branch 'akpm' (patchbomb from Andrew)
Merge first patchbomb from Andrew Morton:
 - a few minor cifs fixes
 - dma-debug upadtes
 - ocfs2
 - slab
 - about half of MM
 - procfs
 - kernel/exit.c
 - panic.c tweaks
 - printk upates
 - lib/ updates
 - checkpatch updates
 - fs/binfmt updates
 - the drivers/rtc tree
 - nilfs
 - kmod fixes
 - more kernel/exit.c
 - various other misc tweaks and fixes

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
  exit: pidns: fix/update the comments in zap_pid_ns_processes()
  exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
  exit: exit_notify: re-use "dead" list to autoreap current
  exit: reparent: call forget_original_parent() under tasklist_lock
  exit: reparent: avoid find_new_reaper() if no children
  exit: reparent: introduce find_alive_thread()
  exit: reparent: introduce find_child_reaper()
  exit: reparent: document the ->has_child_subreaper checks
  exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper()
  exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting
  exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting
  exit: proc: don't try to flush /proc/tgid/task/tgid
  exit: release_task: fix the comment about group leader accounting
  exit: wait: drop tasklist_lock before psig->c* accounting
  exit: wait: don't use zombie->real_parent
  exit: wait: cleanup the ptrace_reparented() checks
  usermodehelper: kill the kmod_thread_locker logic
  usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper()
  fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp
  nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races
  ...
2014-12-10 18:34:42 -08:00
Al Viro
707c5960f1 Merge branch 'nsfs' into for-next 2014-12-10 21:31:59 -05:00
Johannes Weiner
3e32cb2e0a mm: memcontrol: lockless page counters
Memory is internally accounted in bytes, using spinlock-protected 64-bit
counters, even though the smallest accounting delta is a page.  The
counter interface is also convoluted and does too many things.

Introduce a new lockless word-sized page counter API, then change all
memory accounting over to it.  The translation from and to bytes then only
happens when interfacing with userspace.

The removed locking overhead is noticable when scaling beyond the per-cpu
charge caches - on a 4-socket machine with 144-threads, the following test
shows the performance differences of 288 memcgs concurrently running a
page fault benchmark:

vanilla:

   18631648.500498      task-clock (msec)         #  140.643 CPUs utilized            ( +-  0.33% )
         1,380,638      context-switches          #    0.074 K/sec                    ( +-  0.75% )
            24,390      cpu-migrations            #    0.001 K/sec                    ( +-  8.44% )
     1,843,305,768      page-faults               #    0.099 M/sec                    ( +-  0.00% )
50,134,994,088,218      cycles                    #    2.691 GHz                      ( +-  0.33% )
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
 8,049,712,224,651      instructions              #    0.16  insns per cycle          ( +-  0.04% )
 1,586,970,584,979      branches                  #   85.176 M/sec                    ( +-  0.05% )
     1,724,989,949      branch-misses             #    0.11% of all branches          ( +-  0.48% )

     132.474343877 seconds time elapsed                                          ( +-  0.21% )

lockless:

   12195979.037525      task-clock (msec)         #  133.480 CPUs utilized            ( +-  0.18% )
           832,850      context-switches          #    0.068 K/sec                    ( +-  0.54% )
            15,624      cpu-migrations            #    0.001 K/sec                    ( +- 10.17% )
     1,843,304,774      page-faults               #    0.151 M/sec                    ( +-  0.00% )
32,811,216,801,141      cycles                    #    2.690 GHz                      ( +-  0.18% )
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
 9,999,265,091,727      instructions              #    0.30  insns per cycle          ( +-  0.10% )
 2,076,759,325,203      branches                  #  170.282 M/sec                    ( +-  0.12% )
     1,656,917,214      branch-misses             #    0.08% of all branches          ( +-  0.55% )

      91.369330729 seconds time elapsed                                          ( +-  0.45% )

On top of improved scalability, this also gets rid of the icky long long
types in the very heart of memcg, which is great for 32 bit and also makes
the code a lot more readable.

Notable differences between the old and new API:

- res_counter_charge() and res_counter_charge_nofail() become
  page_counter_try_charge() and page_counter_charge() resp. to match
  the more common kernel naming scheme of try_do()/do()

- res_counter_uncharge_until() is only ever used to cancel a local
  counter and never to uncharge bigger segments of a hierarchy, so
  it's replaced by the simpler page_counter_cancel()

- res_counter_set_limit() is replaced by page_counter_limit(), which
  expects its callers to serialize against themselves

- res_counter_memparse_write_strategy() is replaced by
  page_counter_limit(), which rounds down to the nearest page size -
  rather than up.  This is more reasonable for explicitely requested
  hard upper limits.

- to keep charging light-weight, page_counter_try_charge() charges
  speculatively, only to roll back if the result exceeds the limit.
  Because of this, a failing bigger charge can temporarily lock out
  smaller charges that would otherwise succeed.  The error is bounded
  to the difference between the smallest and the biggest possible
  charge size, so for memcg, this means that a failing THP charge can
  send base page charges into reclaim upto 2MB (4MB) before the limit
  would have been reached.  This should be acceptable.

[akpm@linux-foundation.org: add includes for WARN_ON_ONCE and memparse]
[akpm@linux-foundation.org: add includes for WARN_ON_ONCE, memparse, strncmp, and PAGE_SIZE]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 17:41:04 -08:00
Linus Torvalds
cbfe0de303 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS changes from Al Viro:
 "First pile out of several (there _definitely_ will be more).  Stuff in
  this one:

   - unification of d_splice_alias()/d_materialize_unique()

   - iov_iter rewrite

   - killing a bunch of ->f_path.dentry users (and f_dentry macro).

     Getting that completed will make life much simpler for
     unionmount/overlayfs, since then we'll be able to limit the places
     sensitive to file _dentry_ to reasonably few.  Which allows to have
     file_inode(file) pointing to inode in a covered layer, with dentry
     pointing to (negative) dentry in union one.

     Still not complete, but much closer now.

   - crapectomy in lustre (dead code removal, mostly)

   - "let's make seq_printf return nothing" preparations

   - assorted cleanups and fixes

  There _definitely_ will be more piles"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  copy_from_iter_nocache()
  new helper: iov_iter_kvec()
  csum_and_copy_..._iter()
  iov_iter.c: handle ITER_KVEC directly
  iov_iter.c: convert copy_to_iter() to iterate_and_advance
  iov_iter.c: convert copy_from_iter() to iterate_and_advance
  iov_iter.c: get rid of bvec_copy_page_{to,from}_iter()
  iov_iter.c: convert iov_iter_zero() to iterate_and_advance
  iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds
  iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds
  iov_iter.c: convert iov_iter_npages() to iterate_all_kinds
  iov_iter.c: iterate_and_advance
  iov_iter.c: macros for iterating over iov_iter
  kill f_dentry macro
  dcache: fix kmemcheck warning in switch_names
  new helper: audit_file()
  nfsd_vfs_write(): use file_inode()
  ncpfs: use file_inode()
  kill f_dentry uses
  lockd: get rid of ->f_path.dentry->d_sb
  ...
2014-12-10 16:10:49 -08:00
David S. Miller
22f10923dd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/amd/xgbe/xgbe-desc.c
	drivers/net/ethernet/renesas/sh_eth.c

Overlapping changes in both conflict cases.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10 15:48:20 -05:00
Joe Perches
785c20a08b irda: Convert function pointer arrays and uses to const
Making things const is a good thing.

(x86-64 defconfig with all irda)
$ size net/irda/built-in.o*
   text	   data	    bss	    dec	    hex	filename
 109276	   1868	    244	 111388	  1b31c	net/irda/built-in.o.new
 108828	   2316	    244	 111388	  1b31c	net/irda/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10 15:33:16 -05:00
Joe Perches
22bbf5f3e4 llc: Make llc_sap_action_t function pointer arrays const
It's better when function pointer arrays aren't modifiable.

Net change:

$ size net/llc/built-in.o.*
   text	   data	    bss	    dec	    hex	filename
  61193	  12758	   1344	  75295	  1261f	net/llc/built-in.o.new
  47113	  27030	   1344	  75487	  126df	net/llc/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10 15:21:24 -05:00
Joe Perches
9b37306935 llc: Make llc_conn_ev_qfyr_t function pointer arrays const
It's better when function pointer arrays aren't modifiable.

Net change from original:

$ size net/llc/built-in.o.*
   text	   data	    bss	    dec	    hex	filename
  61065	  12886	   1344	  75295	  1261f	net/llc/built-in.o.new
  47113	  27030	   1344	  75487	  126df	net/llc/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10 15:21:24 -05:00
Joe Perches
14b7d95fd2 llc: Make function pointer arrays const
It's better when function pointer arrays aren't modifiable.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10 15:21:24 -05:00
David S. Miller
6e5f59aacb Merge branch 'for-davem-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
More iov_iter work for the networking from Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10 13:17:23 -05:00
Linus Torvalds
86c6a2fddf Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle are:

   - 'Nested Sleep Debugging', activated when CONFIG_DEBUG_ATOMIC_SLEEP=y.

     This instruments might_sleep() checks to catch places that nest
     blocking primitives - such as mutex usage in a wait loop.  Such
     bugs can result in hard to debug races/hangs.

     Another category of invalid nesting that this facility will detect
     is the calling of blocking functions from within schedule() ->
     sched_submit_work() -> blk_schedule_flush_plug().

     There's some potential for false positives (if secondary blocking
     primitives themselves are not ready yet for this facility), but the
     kernel will warn once about such bugs per bootup, so the warning
     isn't much of a nuisance.

     This feature comes with a number of fixes, for problems uncovered
     with it, so no messages are expected normally.

   - Another round of sched/numa optimizations and refinements, for
     CONFIG_NUMA_BALANCING=y.

   - Another round of sched/dl fixes and refinements.

  Plus various smaller fixes and cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  sched: Add missing rcu protection to wake_up_all_idle_cpus
  sched/deadline: Introduce start_hrtick_dl() for !CONFIG_SCHED_HRTICK
  sched/numa: Init numa balancing fields of init_task
  sched/deadline: Remove unnecessary definitions in cpudeadline.h
  sched/cpupri: Remove unnecessary definitions in cpupri.h
  sched/deadline: Fix rq->dl.pushable_tasks bug in push_dl_task()
  sched/fair: Fix stale overloaded status in the busiest group finding logic
  sched: Move p->nr_cpus_allowed check to select_task_rq()
  sched/completion: Document when to use wait_for_completion_io_*()
  sched: Update comments about CLONE_NEWUTS and CLONE_NEWIPC
  sched/fair: Kill task_struct::numa_entry and numa_group::task_list
  sched: Refactor task_struct to use numa_faults instead of numa_* pointers
  sched/deadline: Don't check CONFIG_SMP in switched_from_dl()
  sched/deadline: Reschedule from switched_from_dl() after a successful pull
  sched/deadline: Push task away if the deadline is equal to curr during wakeup
  sched/deadline: Add deadline rq status print
  sched/deadline: Fix artificial overrun introduced by yield_task_dl()
  sched/rt: Clean up check_preempt_equal_prio()
  sched/core: Use dl_bw_of() under rcu_read_lock_sched()
  sched: Check if we got a shallowest_idle_cpu before searching for least_loaded_cpu
  ...
2014-12-09 21:21:34 -08:00
David S. Miller
b5f185f33d Merge tag 'master-2014-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-12-08

Please pull this last batch of pending wireless updates for the 3.19 tree...

For the wireless bits, Johannes says:

"This time I have Felix's no-status rate control work, which will allow
drivers to work better with rate control even if they don't have perfect
status reporting. In addition to this, a small hwsim fix from Patrik,
one of the regulatory patches from Arik, and a number of cleanups and
fixes I did myself.

Of note is a patch where I disable CFG80211_WEXT so that compatibility
is no longer selectable - this is intended as a wake-up call for anyone
who's still using it, and is still easily worked around (it's a one-line
patch) before we fully remove the code as well in the future."

For the Bluetooth bits, Johan says:

"Here's one more bluetooth-next pull request for 3.19:

 - Minor cleanups for ieee802154 & mac802154
 - Fix for the kernel warning with !TASK_RUNNING reported by Kirill A.
   Shutemov
 - Support for another ath3k device
 - Fix for tracking link key based security level
 - Device tree bindings for btmrvl + a state update fix
 - Fix for wrong ACL flags on LE links"

And...

"In addition to the previous one this contains two more cleanups to
mac802154 as well as support for some new HCI features from the
Bluetooth 4.2 specification.

From the original request:

'Here's what should be the last bluetooth-next pull request for 3.19.
It's rather large but the majority of it is the Low Energy Secure
Connections feature that's part of the Bluetooth 4.2 specification. The
specification went public only this week so we couldn't publish the
corresponding code before that. The code itself can nevertheless be
considered fairly mature as it's been in development for over 6 months
and gone through several interoperability test events.

Besides LE SC the pull request contains an important fix for command
complete events for mgmt sockets which also fixes some leaks of hci_conn
objects when powering off or unplugging Bluetooth adapters.

A smaller feature that's part of the pull request is service discovery
support. This is like normal device discovery except that devices not
matching specific UUIDs or strong enough RSSI are filtered out.

Other changes that the pull request contains are firmware dump support
to the btmrvl driver, firmware download support for Broadcom BCM20702A0
variants, as well as some coding style cleanups in 6lowpan &
ieee802154/mac802154 code.'"

For the NFC bits, Samuel says:

"With this one we get:

- NFC digital improvements for DEP support: Chaining, NACK and ATN
  support added.

- NCI improvements: Support for p2p target, SE IO operand addition,
  SE operands extensions to support proprietary implementations, and
  a few fixes.

- NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support,
  and SE IO operand addition.

- A bunch of minor improvements and fixes for STMicro st21nfcb and
  st21nfca"

For the iwlwifi bits, Emmanuel says:

"Major works are CSA and TDLS. On top of that I have a new
firmware API for scan and a few rate control improvements.
Johannes find a few tricks to improve our CPU utilization
and adds support for a new spin of 7265 called 7265D.
Along with this a few random things that don't stand out."

And...

"I deprecate here -8.ucode since -9 has been published long ago.
Along with that I have a new activity, we have now better
a infrastructure for firmware debugging. This will allow to
have configurable probes insides the firmware.
Luca continues his work on NetDetect, this feature is now
complete. All the rest is minor fixes here and there."

For the Atheros bits, Kalle says:

"Only ath10k changes this time and no major changes. Most visible are:

o new debugfs interface for runtime firmware debugging (Yanbo)

o fix shared WEP (Sujith)

o don't rebuild whenever kernel version changes (Johannes)

o lots of refactoring to make it easier to add new hw support (Michal)

There's also smaller fixes and improvements with no point of listing
here."

In addition, there are a few last minute updates to ath5k,
ath9k, brcmfmac, brcmsmac, mwifiex, rt2x00, rtlwifi, and wil6210.
Also included is a pull of the wireless tree to pick-up the fixes
originally included in "pull request: wireless 2014-12-03"...

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09 18:12:03 -05:00
Al Viro
17836394e5 first fruits - kill l2cap ->memcpy_fromiovec()
Just use copy_from_iter().  That's what this method is trying to do
in all cases, in a very convoluted fashion.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09 16:29:10 -05:00
Al Viro
c0371da604 put iov_iter into msghdr
Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09 16:29:03 -05:00
Al Viro
56c39fb67c switch l2cap ->memcpy_fromiovec() to msghdr
it'll die soon enough - now that kvec-backed iov_iter works regardless
of set_fs(), both instances will become copy_from_iter() as soon as
we introduce ->msg_iter...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09 16:28:23 -05:00
Al Viro
f69e6d131f ip_generic_getfrag, udplite_getfrag: switch to passing msghdr
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09 16:28:22 -05:00
Jiri Pirko
57d743a3de net: sched: cls: remove unused op put from tcf_proto_ops
It is never called and implementations are void. So just remove it.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09 14:49:02 -05:00
David S. Miller
6db70e3e1d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2014-12-03

1) Fix a set but not used warning. From Fabian Frederick.

2) Currently we make sequence number values available to userspace
   only if we use ESN. Make the sequence number values also available
   for non ESN states. From Zhi Ding.

3) Remove socket policy hashing. We don't need it because socket
   policies are always looked up via a linked list. From Herbert Xu.

4) After removing socket policy hashing, we can use __xfrm_policy_link
   in xfrm_policy_insert. From Herbert Xu.

5) Add a lookup method for vti6 tunnels with wildcard endpoints.
   I forgot this when I initially implemented vti6.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-08 21:30:21 -05:00
Alexander Duyck
a5a519b271 fib_trie: Fix /proc/net/fib_trie when CONFIG_IP_MULTIPLE_TABLES is not defined
In recent testing I had disabled CONFIG_IP_MULTIPLE_TABLES and as a result
when I ran "cat /proc/net/fib_trie" the main trie was displayed multiple
times.  I found that the problem line of code was in the function
fib_trie_seq_next.  Specifically the line below caused the indexes to go in
the opposite direction of our traversal:

	h = tb->tb_id & (FIB_TABLE_HASHSZ - 1);

This issue was that the RT tables are defined such that RT_TABLE_LOCAL is ID
255, while it is located at TABLE_LOCAL_INDEX of 0, and RT_TABLE_MAIN is 254
with a TABLE_MAIN_INDEX of 1.  This means that the above line will return 1
for the local table and 0 for main.  The result is that fib_trie_seq_next
will return NULL at the end of the local table, fib_trie_seq_start will
return the start of the main table, and then fib_trie_seq_next will loop on
main forever as h will always return 0.

The fix for this is to reverse the ordering of the two tables.  It has the
advantage of making it so that the tables now print in the same order
regardless of if multiple tables are enabled or not.  In order to make the
definition consistent with the multiple tables case I simply masked the to
RT_TABLE_XXX values by (FIB_TABLE_HASHSZ - 1).  This way the two table
layouts should always stay consistent.

Fixes: 93456b6 ("[IPV4]: Unify access to the routing tables")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-08 21:14:32 -05:00
Al Viro
ba00410b81 Merge branch 'iov_iter' into for-next 2014-12-08 20:39:29 -05:00
David S. Miller
244ebd9f8f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following batch contains netfilter updates for net-next. Basically,
enhancements for xt_recent, skip zeroing of timer in conntrack, fix
linking problem with recent redirect support for nf_tables, ipset
updates and a couple of cleanups. More specifically, they are:

1) Rise maximum number per IP address to be remembered in xt_recent
   while retaining backward compatibility, from Florian Westphal.

2) Skip zeroing timer area in nf_conn objects, also from Florian.

3) Inspect IPv4 and IPv6 traffic from the bridge to allow filtering using
   using meta l4proto and transport layer header, from Alvaro Neira.

4) Fix linking problems in the new redirect support when CONFIG_IPV6=n
   and IP6_NF_IPTABLES=n.

And ipset updates from Jozsef Kadlecsik:

5) Support updating element extensions when the set is full (fixes
   netfilter bugzilla id 880).

6) Fix set match with 32-bits userspace / 64-bits kernel.

7) Indicate explicitly when /0 networks are supported in ipset.

8) Simplify cidr handling for hash:*net* types.

9) Allocate the proper size of memory when /0 networks are supported.

10) Explicitly add padding elements to hash:net,net and hash:net,port,
    because the elements must be u32 sized for the used hash function.

Jozsef is also cooking ipset RCU conversion which should land soon if
they reach the merge window in time.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05 20:56:46 -08:00
John W. Linville
f700076a9d Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-12-05 14:12:24 -05:00
Marcel Holtmann
4b71bba45c Bluetooth: Enabled LE Direct Advertising Report event if supported
When the controller supports the Extended Scanner Filter Policies, it
supports the LE Direct Advertising Report event. However by default
that event is blocked by the LE event mask. It is required to enable
it during controller setup.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-05 18:15:33 +02:00
Marcel Holtmann
32c9d43fa5 Bluetooth: Add definitions for LE Direct Advertising Report event
This patch adds the event id and data structures for the LE Direct
Advertising Report event.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-05 18:15:18 +02:00
Marcel Holtmann
2331fd46f5 Bluetooth: Move LE advertising report defines to the right location
All Bluetooth commands and events are ordered by its opcode or event
id, but for some reason this one now stands out. So move it to its
correct spot in the list.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-05 18:15:02 +02:00
Marcel Holtmann
da25cf6a98 Bluetooth: Report invalid RSSI for service discovery and background scan
When using Start Service Discovery and when background scanning is used
to report devices, the RSSI is reported or the value 127 is provided in
case RSSI in unavailable.

For Start Discovery the value 0 is reported to keep backwards
compatibility with the existing users.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-05 14:14:28 +02:00
Marcel Holtmann
0256325ed6 Bluetooth: Add helper function for clearing the discovery filter
The discovery filter allocates memory for its UUID list. So use
a helper function to free it and reset it to default states.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-05 13:12:58 +02:00
Jakub Pawlowski
37eab042be Bluetooth: Add extra discovery fields for storing filter information
With the upcoming addition of support for Start Service Discovery, the
discovery handling needs to filter on RSSI and UUID values. For that
they need to be stored in the discovery handling. This patch adds the
appropiate fields and also make sure they are reset when discovery
has been stopped.

Signed-off-by: Jakub Pawlowski <jpawlowski@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-05 12:37:33 +02:00
Jakub Pawlowski
7e61df5423 Bluetooth: Add definitions for MGMT_OP_START_SERVICE_DISCOVERY
This patch adds the opcode and structure for Start Service Discovery
operation.

Signed-off-by: Jakub Pawlowski <jpawlowski@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-05 12:37:32 +02:00
Marcel Holtmann
a89612aac3 Bluetooth: Add HCI_RSSI_INVALID for unknown RSSI value
The Bluetooth core specification defines the value 127 as invalid for
RSSI values. So instead of hard coding it, lets add a constant for it.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-05 12:37:30 +02:00
Al Viro
435d5f4bb2 common object embedded into various struct ....ns
for now - just move corresponding ->proc_inum instances over there

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-04 14:31:00 -05:00
John W. Linville
de51f1649a Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg <johannes@sipsolutions.net> says:

"This time I have Felix's no-status rate control work, which will allow
drivers to work better with rate control even if they don't have perfect
status reporting. In addition to this, a small hwsim fix from Patrik,
one of the regulatory patches from Arik, and a number of cleanups and
fixes I did myself.

Of note is a patch where I disable CFG80211_WEXT so that compatibility
is no longer selectable - this is intended as a wake-up call for anyone
who's still using it, and is still easily worked around (it's a one-line
patch) before we fully remove the code as well in the future."

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-12-04 11:29:10 -05:00
John W. Linville
04bb7ecf88 NFC: 3.19 pull request
This is the NFC pull request for 3.19.
 
 With this one we get:
 
 - NFC digital improvements for DEP support: Chaining, NACK and ATN
   support added.
 
 - NCI improvements: Support for p2p target, SE IO operand addition,
   SE operands extensions to support proprietary implementations, and
   a few fixes.
 
 - NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support,
   and SE IO operand addition.
 
 - A bunch of minor improvements and fixes for STMicro st21nfcb and
   st21nfca
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUfjvWAAoJEIqAPN1PVmxK46MP/1QFb093QmAK3AfMS7XBZdwg
 vgJkZIyFmTeRlDTonRJpv1MXZhPL+JHYK0wGANts1Ba04nZIgK9jzdcqjvd1niD1
 Hk9ggHpfxsIl35FfRVcWAUpyb71Ykug2V396gJH2iiHvt2jHr8IsRWOWCVaItWMO
 ZpVpV58xkROEu115wjKdrmKijL0xf7/F70RlReV4tOLGip3E/bP7EhutF2s5D8Lt
 A5hfauQS3lUJIhlJ1IT59XtdnBSIS5D8OuNYjuJuFkPfIcXAHacEUBrta3KahQPy
 kFGYYjj+V0vZA9JQG/j6vomP4AI0pP4DPkwWqgmioaPshkOOktwf1l21IKQrC7aw
 h7EFEG6q5AxnL5JFbm4y+b5IEp8ALeVi5rv1xdRZ7/AW/n6FWGrkBHatsJceQgX+
 wFhZ52USGHQUzWoJ6SC0O82U831/dgxWm5M42CV5/Bl9d69XX7Om9MWytdjQ94yB
 ERoSJcivc+Q30bAiZmvb44ghv3qMGNArlBVVlcN7+3rbz4XOWLhduqCJaUs12fgY
 lyGdNHxVVmEeEY9ByOjr6RD1pYpSDgEQSYvn5pbKadX+DqpXHQuGC4JHtbBkzKn0
 Se9c442HRBjts0VprJ7wjLgbf5VfqFjtgsfqubD/mcTn61B4/t9zyGgV1VuI7UgL
 QxK0m0rB1nOxpdWFesl1
 =WHHp
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz <sameo@linux.intel.com> says:

"NFC: 3.19 pull request

This is the NFC pull request for 3.19.

With this one we get:

- NFC digital improvements for DEP support: Chaining, NACK and ATN
  support added.

- NCI improvements: Support for p2p target, SE IO operand addition,
  SE operands extensions to support proprietary implementations, and
  a few fixes.

- NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support,
  and SE IO operand addition.

- A bunch of minor improvements and fixes for STMicro st21nfcb and
  st21nfca"

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-12-04 11:27:40 -05:00
Johan Hedberg
6928a9245f Bluetooth: Store address type with OOB data
To be able to support OOB data for LE pairing we need to store the
address type of the remote device. This patch extends the relevant
functions and data types with a bdaddr_type variable.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:21 +01:00
Johan Hedberg
81328d5cca Bluetooth: Unify remote OOB data functions
There's no need to duplicate code for the 192 vs 192+256 variants of the
OOB data functions. This is also helpful to pave the way to support LE
SC OOB data where only 256 bit data is provided.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:20 +01:00
Johan Hedberg
ef8efe4bf8 Bluetooth: Add skeleton for BR/EDR SMP channel
This patch adds the very basic code for creating and destroying SMP
L2CAP channels for BR/EDR connections.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:20 +01:00
Johan Hedberg
858cdc78be Bluetooth: Add debugfs switch for forcing SMP over BR/EDR
To make it possible to use LE SC functionality over BR/EDR with pre-4.1
controllers (that do not support BR/EDR SC links) it's useful to be able
to force LE SC operations even over a traditional SSP protected link.
This patch adds a debugfs switch to force a special debug flag which is
used to skip the checks for BR/EDR SC support.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:20 +01:00
Johan Hedberg
fe8bc5ac67 Bluetooth: Add hci_conn flag for new link key generation
For LE Secure Connections we want to trigger cross transport key
generation only if a new link key was actually created during the BR/EDR
connection. This patch adds a new flag to track this information.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:20 +01:00
Johan Hedberg
f3a73d97b3 Bluetooth: Rename hci_find_ltk_by_addr to hci_find_ltk
Now that hci_find_ltk_by_addr is the only LTK lookup function there's no
need to keep the long name anymore. This patch shortens the function
name to simply hci_find_ltk.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:16 +01:00
Johan Hedberg
0ac3dbf999 Bluetooth: Remove unused hci_find_ltk function
Now that LTKs are always looked up based on bdaddr (with EDiv/Rand
checks done after a successful lookup) the hci_find_ltk function is not
needed anymore. This patch removes the function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:16 +01:00
Johan Hedberg
710f11c08e Bluetooth: Use custom macro for testing BR/EDR SC enabled
Since the HCI_SC_ENABLED flag will also be used for controllers without
BR/EDR Secure Connections support whenever we need to check specifically
for SC for BR/EDR we also need to check that the controller actually
supports it. This patch adds a convenience macro for check all the
necessary conditions and converts the places in the code that need it to
use it.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:15 +01:00
Johan Hedberg
23fb8de376 Bluetooth: Add mgmt support for LE Secure Connections LTK types
We need a dedicated LTK type for LTK resulting from a Secure Connections
based SMP pairing. This patch adds a new define for it and ensures that
both the New LTK event as well as the Load LTKs command supports it.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-03 16:51:15 +01:00
Scott Feldman
38dcf357ae bridge: call netdev_sw_port_stp_update when bridge port STP status changes
To notify switch driver of change in STP state of bridge port, add new
.ndo op and provide switchdev wrapper func to call ndo op. Use it in bridge
code then.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 20:01:22 -08:00
Jiri Pirko
007f790c82 net: introduce generic switch devices support
The goal of this is to provide a possibility to support various switch
chips. Drivers should implement relevant ndos to do so. Now there is
only one ndo defined:
- for getting physical switch id is in place.

Note that user can use random port netdevice to access the switch.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 20:01:20 -08:00
Christophe Ricard
b3a55b9c5d NFC: hci: Add specific hci macro to not create a pipe
Some pipe are only created by other host (different than the
Terminal Host).
The pipe values will for example be notified by
NFC_HCI_ADM_NOTIFY_PIPE_CREATED.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-12-02 22:48:13 +01:00
Johan Hedberg
0bd49fc75a Bluetooth: Track both local and remote L2CAP fixed channel mask
To pave the way for future fixed channels to be added easily we should
track both the local and remote mask on a per-L2CAP connection (struct
l2cap_conn) basis. So far the code has used a global variable in a racy
way which anyway needs fixing.

This patch renames the existing conn->fixed_chan_mask that tracked
the remote mask to conn->remote_fixed_chan and adds a new variable
conn->local_fixed_chan to track the local mask. Since the HS support
info is now available in the local mask we can remove the
conn->hs_enabled variable.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-12-02 09:26:50 +01:00
Christophe Ricard
a688bf55c5 NFC: nci: Add se_io NCI operand
se_io allows to send apdu over the CLF to the embedded Secure Element.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-12-02 02:01:21 +01:00
Christophe Ricard
e9ef9431a3 NFC: nci: Update nci_disable_se to run proprietary commands to disable a secure element
Some NFC controller using NCI protocols may need a proprietary commands
flow to disable a secure element

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-12-02 02:01:21 +01:00
Christophe Ricard
93bca2bfa4 NFC: nci: Update nci_enable_se to run proprietary commands to enable a secure element
Some NFC controller using NCI protocols may need a proprietary commands
flow to enable a secure element

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-12-02 02:01:21 +01:00
Christophe Ricard
ba4db551bb NFC: nci: Update nci_discover_se to run proprietary commands to discover all available secure element
Some NFC controller using NCI protocols may need a proprietary commands
flow to discover all available secure element

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-12-02 02:01:21 +01:00
Christophe Ricard
9b8d32b7ac NFC: hci: Add se_io HCI operand
se_io allows to send apdu over the CLF to the embedded Secure Element.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-12-02 01:49:58 +01:00
John W. Linville
992066c8d3 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-12-01 15:49:58 -05:00
David S. Miller
60b7379dc5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-11-29 20:47:48 -08:00
Felix Fietkau
f027c2aca0 mac80211: add ieee80211_tx_status_noskb
This can be used by drivers that cannot reliably map tx status
information onto specific skbs.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-28 15:01:51 +01:00
Felix Fietkau
f684565e0a mac80211: add tx_status_noskb to rate_control_ops
This op works like .tx_status, except it does not need access to the
skb. This will be used by drivers that cannot match tx status
information to specific packets.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-28 15:01:50 +01:00
Arik Nemtsov
ad932f046f cfg80211: leave invalid channels on regdomain change
When the regulatory settings change, some channels might become invalid.
Disconnect interfaces acting on these channels, after giving userspace
code a grace period to leave them.

This mode is currently opt-in, and not all interface operating modes are
supported for regulatory-enforcement checks. A wiphy that wishes to use
the new enforcement code must specify an appropriate regulatory flag,
and all its supported interface modes must be supported by the checking
code.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com>
[fix some indentation, typos]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-28 14:33:41 +01:00
Julien Lefrique
529ee06682 NFC: NCI: Configure ATR_RES general bytes
The Target responds to the ATR_REQ with the ATR_RES. Configure the General
Bytes in ATR_RES with the first three octets equal to the NFC Forum LLCP
magic number, followed by some LLC Parameters TLVs described in section
4.5 of [LLCP].

Signed-off-by: Julien Lefrique <lefrique@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-11-28 14:07:51 +01:00
Julien Lefrique
a99903ec45 NFC: NCI: Handle Target mode activation
Changes:

 * Extract the Listen mode activation parameters from RF_INTF_ACTIVATED_NTF.

 * Store the General Bytes of ATR_REQ.

 * Signal that Target mode is activated in case of an activation in NFC-DEP.

 * Update the NCI state accordingly.

 * Use the various constants defined in nfc.h.

 * Fix the ATR_REQ and ATR_RES maximum size. As per NCI 1.0 and NCI 1.1, the
   Activation Parameters for both Poll and Listen mode contain all the bytes of
   ATR_REQ/ATR_RES starting and including Byte 3 as defined in [DIGITAL].
   In [DIGITAL], the maximum size of ATR_REQ/ATR_RES is 64 bytes and they are
   numbered starting from Byte 1.

Signed-off-by: Julien Lefrique <lefrique@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-11-28 14:07:51 +01:00
Julien Lefrique
90d78c1396 NFC: NCI: Enable NFC-DEP in Listen A and Listen F
Send LA_SEL_INFO and LF_PROTOCOL_TYPE with NFC-DEP protocol enabled.
Configure 212 Kbit/s and 412 Kbit/s bit rates for Listen F.

Signed-off-by: Julien Lefrique <lefrique@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-11-28 14:07:51 +01:00
Mark A. Greer
384ab1d174 NFC: digital: Add NFC-DEP Initiator-side ATN Support
When an NFC-DEP Initiator times out when waiting for
a DEP_RES from the Target, its supposed to send an
ATN to the Target.  The Target should respond to the
ATN with a similar ATN PDU and the Initiator can then
resend the last non-ATN PDU that it sent.  No more
than 'N(retry,atn)' are to be send where
2 <= 'N(retry,atn)' <= 5.  If the Initiator had just
sent a NACK PDU when the timeout occurred, it is to
continue sending NACKs until 'N(retry,nack)' NACKs
have been send.  This is described in section
14.12.5.6 of the NFC-DEP Digital Protocol Spec.

The digital layer's NFC-DEP code doesn't implement
this so add that support.

The value chosen for 'N(retry,atn)' is 2.

Reviewed-by: Thierry Escande <thierry.escande@linux.intel.com>
Tested-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-11-28 12:39:55 +01:00
Mark A. Greer
49dbb14e30 NFC: digital: Add NFC-DEP Target-side NACK Support
When an NFC-DEP Target receives a NACK PDU with
a PNI equal to 1 less than the current PNI, it
is supposed to re-send the last PDU.  This is
implied in section 14.12.5.4 of the NFC Digital
Protocol Spec.

The digital layer's NFC-DEP code doesn't implement
Target-side NACK handing so add it.  The last PDU
that was sent is saved in the 'nfc_digital_dev'
structure's 'saved_skb' member.  The skb will have
an additional reference taken to ensure that the skb
isn't freed when the driver performs a kfree_skb()
on the skb.  The length of the skb/PDU is also saved
so the length can be restored when re-sending the PDU
in the skb (the driver will perform an skb_pull() so
an skb_push() needs to be done to restore the skb's
data pointer/length).

Reviewed-by: Thierry Escande <thierry.escande@linux.intel.com>
Tested-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-11-28 12:39:47 +01:00
Mark A. Greer
a80509c76b NFC: digital: Add NFC-DEP Initiator-side NACK Support
When an NFC-DEP Initiator receives a frame with
an incorrect CRC or with a parity error, and the
frame is at least 4 bytes long, its supposed to
send a NACK to the Target.  The Initiator can
send up to 'N(retry,nack)' consecutive NACKs
where 2 <= 'N(retry,nack)' <= 5.  When the limit
is exceeded, a PROTOCOL EXCEPTION is raised.
Any other type of transmission error is to be
ignored and the Initiator should continue
waiting for a new frame.  This is described
in section 14.12.5.4 of the NFC Digital Protocol
Spec.

The digital layer's NFC-DEP code doesn't implement
any of this so add it.  This support diverges from
the spec in two significant ways:

a) NACKs will be sent for ANY error reported by the
   driver except a timeout.  This is done because
   there is currently no way for the digital layer
   to distinguish a CRC or parity error from any
   other type of error reported by the driver.

b) All other errors will cause a PROTOCOL EXCEPTION
   even frames with CRC errors that are less than 4
   bytes.

The value chosen for 'N(retry,nack)' is 2.

Targets do not send NACK PDUs.

Reviewed-by: Thierry Escande <thierry.escande@linux.intel.com>
Tested-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-11-28 12:39:33 +01:00
Mark A. Greer
3bd2a5bcc6 NFC: digital: Add NFC-DEP Send Chaining Support
When the NFC-DEP code is given a packet to send
that is larger than the peer's maximum payload,
its supposed to set the 'MI' bit in the 'I' PDU's
Protocol Frame Byte (PFB).  Setting this bit
indicates that NFC-DEP chaining is to occur.

When NFC-DEP chaining is progress, sender 'I' PDUs
are acknowledged with 'ACK' PDUs until the last 'I'
PDU in the chain (which has the 'MI' bit cleared)
is responded to with a normal 'I' PDU.  This can
occur while in Initiator mode or in Target mode.

Sender NFC-DEP chaining is currently not implemented
in the digital layer so add that support.  Unfortunately,
since sending a frame may require writing the CRC to the
end of the data, the relevant data part of the original
skb must be copied for each intermediate frame.

Reviewed-by: Thierry Escande <thierry.escande@linux.intel.com>
Tested-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-11-28 12:39:10 +01:00