2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-19 18:53:52 +08:00
Commit Graph

21374 Commits

Author SHA1 Message Date
Ben Hutchings
278bc4296b ethtool: Define and apply a default policy for RX flow hash indirection
All drivers that support modification of the RX flow hash indirection
table initialise it in the same way: RX rings are assigned to table
entries in rotation.  Make that default policy explicit by having them
call a ethtool_rxfh_indir_default() function.

In the ethtool core, add support for a zero size value for
ETHTOOL_SRXFHINDIR, which resets the table to this default.

Partly-suggested-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:53:18 -05:00
Ben Hutchings
7850f63f16 ethtool: Centralise validation of ETHTOOL_{G, S}RXFHINDIR parameters
Add a new ethtool operation (get_rxfh_indir_size) to get the
indirectional table size.  Use this to validate the user buffer size
before calling get_rxfh_indir or set_rxfh_indir.  Use get_rxnfc to get
the number of RX rings, and validate the contents of the new
indirection table before calling set_rxfh_indir.  Remove this
validation from drivers.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:52:47 -05:00
Pavel Emelyanov
5d531aaa64 unix_diag: Write it into kbuild
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:29 -05:00
Pavel Emelyanov
cbf391958a unix_diag: Receive queue lenght NLA
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:29 -05:00
Pavel Emelyanov
2aac7a2cb0 unix_diag: Pending connections IDs NLA
When establishing a unix connection on stream sockets the
server end receives an skb with socket in its receive queue.

Report who is waiting for these ends to be accepted for
listening sockets via NLA.

There's a lokcing issue with this -- the unix sk state lock is
required to access the peer, and it is taken under the listening
sk's queue lock. Strictly speaking the queue lock should be taken
inside the state lock, but since in this case these two sockets
are different it shouldn't lead to deadlock.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
ac02be8d96 unix_diag: Unix peer inode NLA
Report the peer socket inode ID as NLA. With this it's finally
possible to find out the other end of an interesting unix connection.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
5f7b056946 unix_diag: Unix inode info NLA
Actually, the socket path if it's not anonymous doesn't give
a clue to which file the socket is bound to. Even if the path
is absolute, it can be unlinked and then new socket can be
bound to it.

With this NLA it's possible to check which file a particular
socket is really bound to.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
f5248b48a6 unix_diag: Unix socket name NLA
Report the sun_path when requested as NLA. With leading '\0' if
present but without the leading AF_UNIX bits.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
5d3cae8bc3 unix_diag: Dumping exact socket core
The socket inode is used as a key for lookup. This is effectively
the only really unique ID of a unix socket, but using this for
search currently has one problem -- it is O(number of sockets) :(

Does it worth fixing this lookup or inventing some other ID for
unix sockets?

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
45a96b9be6 unix_diag: Dumping all sockets core
Walk the unix sockets table and fill the core response structure,
which includes type, state and inode.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:28 -05:00
Pavel Emelyanov
22931d3b90 unix_diag: Basic module skeleton
Includes basic module_init/_exit functionality, dump/get_exact stubs
and declares the basic API structures for request and response.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:27 -05:00
Pavel Emelyanov
fa7ff56f75 af_unix: Export stuff required for diag module
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:27 -05:00
Pavel Emelyanov
f65c1b534b sock_diag: Generalize requests cookies managements
The sk address is used as a cookie between dump/get_exact calls.
It will be required for unix socket sdumping, so move it from
inet_diag to sock_diag.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:27 -05:00
Pavel Emelyanov
aec8dc62f6 sock_diag: Fix module netlink aliases
I've made a mistake when fixing the sock_/inet_diag aliases :(

1. The sock_diag layer should request the family-based alias,
   not just the IPPROTO_IP one;
2. The inet_diag layer should request for AF_INET+protocol alias,
   not just the protocol one.

Thus fix this.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-16 13:48:27 -05:00
Rajkumar Manoharan
5ce543d148 cfg80211: Restore orig channel values upon disconnect
When we restore regulatory settings the world regulatory domain
is properly reset on cfg80211 (or user prefered regulatory domain)
but we were never setting back channel values for drivers that use
WIPHY_FLAG_CUSTOM_REGULATORY. Set these values up again by using
the orig_ channel parameters.

This fixes restoring custom regulatory settings upon disconnect
events.

Cc: compat@orbit-lab.org
Cc: Paul Stewart <pstew@google.com>
Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Cc: Senthilkumar Balasubramanian <senthilb@qca.qualcomm.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-16 09:30:43 -05:00
Luis R. Rodriguez
061acaae76 cfg80211: allow following country IE power for custom regdom cards
By definition WIPHY_FLAG_STRICT_REGULATORY was intended to allow the
wiphy to adjust itself to the country IE power information if the
card had no regulatory data but we had no way to tell cfg80211 that if
the card also had its own custom regulatory domain (these are typically
custom world regulatory domains) that we want to follow the country IE's
noted values for power for each channel. We add support for this and
document it.

This is not a critical fix but a performance optimization for cards
with custom regulatory domains that associate to an AP with sends
out country IEs with a higher EIRP than the one on the custom
regulatory domain. In practice the only driver affected right now
are the Atheros drivers as they are the only drivers using both
WIPHY_FLAG_STRICT_REGULATORY and WIPHY_FLAG_CUSTOM_REGULATORY --
used on cards that have an Atheros world regulatory domain. Cards
that have been programmed to follow a country specifically will not
follow the country IE power. So although not a stable fix distributions
should consider cherry picking this.

Cc: compat@orbit-lab.org
Cc: Paul Stewart <pstew@google.com>
Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Cc: Senthilkumar Balasubramanian <senthilb@qca.qualcomm.com>
Reported-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-16 09:30:42 -05:00
David S. Miller
b26e478f8f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/freescale/fsl_pq_mdio.c
	net/batman-adv/translation-table.c
	net/ipv6/route.c
2011-12-16 02:11:14 -05:00
Mohammed Shafi Shajakhan
1478acb392 mac80211: Fix power save in change interface
we found that power save is not getting enabled when we do
change interface in this order STA->IBSS->STA. this is
because ieee80211_setup_sdata clears type-dependent union

Reported-by: Leela Kella <leela@qca.qualcomm.com>
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:35 -05:00
Mohammed Shafi Shajakhan
9c38a8b491 mac80211: remove an unnecessary paraenthesis
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:35 -05:00
Helmut Schaa
cf6bb79ad8 mac80211: Use appropriate TID for sending BAR, ADDBA and DELBA frames
Currently BAR, ADDBA and DELBA frames are always sent using AC_VO. If
the TID for which a BA session is established is assigned to a different
queue BAR, ADDBA and DELBA frames can "overtake" frames of the according
BA session.

Hence, always put BA session related frames into the same queue as the
BA sessions data frames.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:35 -05:00
Johannes Berg
4d33960bf9 mac80211: reduce station management complexity
Now that IBSS no longer needs to insert stations
from atomic context, we can get rid of all the
special cases for that, and even get rid of the
sta_lock (though it needs to stay as tim_lock.)

This makes the station management code much more
straight-forward.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:35 -05:00
Johannes Berg
8bf11d8d08 mac80211: delay IBSS station insertion
In order to notify drivers and simplify the station
management code, defer IBSS station insertion to a
work item and don't do it directly while receiving
a frame.

This increases the complexity in IBSS a little bit,
but it's pretty straight forward and it allows us
to reduce the station management complexity (next
patch) considerably.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:34 -05:00
Johannes Berg
56544160d4 mac80211: make address arguments to sta_info_alloc const
No real changes, just note that they are const.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:34 -05:00
Johannes Berg
29623892e1 mac80211: count authorized stations per BSS
Currently, each AP interface will send multicast
traffic if any interface has a station entry even
if that station entry is allocated only. With the
new station state management we can easily fix it
by adding a counter that counts each authorized
station only and send multicast traffic only when
the correct interface has at least one authorized
station.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:34 -05:00
Johannes Berg
d9a7ddb05e mac80211: refactor station state transitions
Station entries can have various states, the most
important ones being auth, assoc and authorized.
This patch prepares us for telling the driver about
these states, we don't want to confuse drivers with
strange transitions, so with this we enforce that
they move in the right order between them (back and
forth); some transitions might happen before the
driver even knows about the station, but at least
runtime transitions will be ordered correctly.

As a consequence, IBSS and MESH stations will now
have the ASSOC flag set (so they can transition to
AUTHORIZED), and we can get rid of a special case
in TX processing.

When freeing a station, unwind the state so that
other parts of the code (or drivers later) can rely
on the transitions.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:46:34 -05:00
Johannes Berg
87be1e1e00 mac80211: use station mutex in configuration
There's no need to use RCU here, we can just lock
the station mutex instead. This allows the code
to sleep, which is necessary for later patches.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:45:46 -05:00
Johannes Berg
92b62f28d0 mac80211: remove duplicate TDLS peer verification
This is already checked in cfg80211, so no need
to repeat the checks here.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:45:45 -05:00
Johannes Berg
bdd90d5e36 cfg80211: validate nl80211 station handling better
The nl80211 station handling code is a bit messy
and doesn't do a lot of validation. It seems like
this could be an issue for drivers that don't use
mac80211 to validate everything.

As cfg80211 doesn't keep station state, move the
validation of allowing supported_rates to change
for TDLS only in station mode to mac80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:45:45 -05:00
Johannes Berg
d83023daa2 nl80211: add TDLS peer flag to policy
This was evidently missed in the TDLS patch (07ba55d7).

Cc: Arik Nemtsov <arik@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-15 14:45:45 -05:00
John W. Linville
42a3b63bb2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2011-12-15 13:47:58 -05:00
Dan Carpenter
c48e074c7c tcp_memcontrol: fix reversed if condition
We should only dereference the pointer if it's valid, not the other way
round.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-15 11:59:44 -05:00
Samuel Ortiz
d646960f79 NFC: Initial LLCP support
This patch is an initial implementation for the NFC Logical Link Control
protocol. It's also known as NFC peer to peer mode.
This is a basic implementation as it lacks SDP (services Discovery
Protocol), frames aggregation support, and frame rejecion parsing.
Follow up patches will implement those missing features.
This code has been tested against a Nexus S phone implementing LLCP 1.0.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:13 -05:00
Samuel Ortiz
541d920b05 NFC: Set and get DEP general bytes
Without an API for setting and getting the local and remote general bytes,
drivers won't be able to properly establish a DEP link.
This API also allows them to propagate the remote general bytes they get
from the DEP link establishment up to the LLCP layer.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:13 -05:00
Samuel Ortiz
1ed28f6106 NFC: Add a DEP link control netlink command
NFC-DEP (Data Exchange Protocol) is an NFC MAC layer.
This command allows to enable and disable the DEP link on to which e.g.
LLCP can run.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Samuel Ortiz
db81a62451 NFC: Atomic socket allocation
rawsock_create() is called with preemption disabled, so we should not
sleep.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Samuel Ortiz
94a098da42 NFC: Do not take the genl mutex from the netlink release notifier
The netlink notifier is atomic so we must not sleep in that context.
Also we know that Any netlink packets arriving to us will be purged when
the notifier is called, so we don't need to take the mutex.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Samuel Ortiz
7c7cd3bfec NFC: Add tx skb allocation routine
This is a factorization of the current rawsock tx skb allocation routine,
as it will be used by the LLCP code.
We also rename nfc_alloc_skb to nfc_alloc_recv_skb for consistency sake.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Samuel Ortiz
52858b51b2 NFC: Add function name to the NFC pr_fmt() routine
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:12 -05:00
Simon Wunderlich
cb71b8d803 mac80211: free skb on error path of ieee80211_ibss_join()
Our new return also created a memleak. The skb should be freed before
returning an error.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:11 -05:00
Johannes Berg
00918d33c0 nl80211: accept testmode dump with netdev
All nl80211 commands that need only the wiphy
still allow identifying it by giving an interface
index, except, as Kenny pointed out, the testmode
dump support.

Fix this by looking up the wiphy via the ifidx in
this case as well.

Tested-by: Kenny Hsu <kenny.hsu@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-14 14:50:11 -05:00
John W. Linville
5d22df200b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn.c
2011-12-14 14:35:41 -05:00
Eric Dumazet
e6560d4dfe net: ping: remove some sparse errors
net/ipv4/sysctl_net_ipv4.c:78:6: warning: symbol 'inet_get_ping_group_range_table'
was not declared. Should it be static?

net/ipv4/sysctl_net_ipv4.c:119:31: warning: incorrect type in argument 2
(different signedness)
net/ipv4/sysctl_net_ipv4.c:119:31: expected int *range
net/ipv4/sysctl_net_ipv4.c:119:31: got unsigned int *<noident>

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 13:34:55 -05:00
Eric Dumazet
3a53943b5a cls_flow: remove one dynamic array
Its better to use a predefined size for this small automatic variable.

Removes a sparse error as well :

net/sched/cls_flow.c:288:13: error: bad constant expression

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 13:34:55 -05:00
Eric Dumazet
de93cb2eaf vlan: static functions
commit 6d4cdf47d2 (vlan: add 802.1q netpoll support) forgot to declare
as static some private functions.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 02:39:30 -05:00
Dan Carpenter
f9586f79bf vlan: add rtnl_dereference() annotations
The original code generates a Sparse warning:
net/8021q/vlan_core.c:336:9:
	error: incompatible types in comparison expression (different address spaces)

It's ok to dereference __rcu pointers here because we are holding the
RTNL lock.  I've added some calls to rtnl_dereference() to silence the
warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 02:39:30 -05:00
Eric Dumazet
c63044f0d2 rtnetlink: rtnl_link_register() sanity test
Before adding a struct rtnl_link_ops into link_ops list, check it doesnt
clash with a prior one.

Based on a previous patch from Alexander Smirnov

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-14 02:39:29 -05:00
Linus Torvalds
653f42f6b6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: add missing spin_unlock at ceph_mdsc_build_path()
  ceph: fix SEEK_CUR, SEEK_SET regression
  crush: fix mapping calculation when force argument doesn't exist
  ceph: use i_ceph_lock instead of i_lock
  rbd: remove buggy rollback functionality
  rbd: return an error when an invalid header is read
  ceph: fix rasize reporting by ceph_show_options
2011-12-13 14:59:42 -08:00
David S. Miller
bb3c36863e ipv6: Check dest prefix length on original route not copied one in rt6_alloc_cow().
After commit 8e2ec63917 ("ipv6: don't
use inetpeer to store metrics for routes.") the test in rt6_alloc_cow()
for setting the ANYCAST flag is now wrong.

'rt' will always now have a plen of 128, because it is set explicitly
to 128 by ip6_rt_copy.

So to restore the semantics of the test, check the destination prefix
length of 'ort'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 17:35:06 -05:00
David S. Miller
b43faac690 ipv6: If neigh lookup fails during icmp6 dst allocation, propagate error.
Don't just succeed with a route that has a NULL neighbour attached.
This follows the behavior of addrconf_dst_alloc().

Allowing this kind of route to end up with a NULL neigh attached will
result in packet drops on output until the route is somehow
invalidated, since nothing will meanwhile try to lookup the neigh
again.

A statistic is bumped for the case where we see a neigh-less route on
output, but the resulting packet drop is otherwise silent in nature,
and frankly it's a hard error for this to happen and ipv6 should do
what ipv4 does which is say something in the kernel logs.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 16:51:51 -05:00
David S. Miller
5c3ddec73d net: Remove unused neighbour layer ops.
It's simpler to just keep these things out until there is a real user
of them, so we can see what the needs actually are, rather than keep
these things around as useless overhead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 16:44:22 -05:00
Eliad Peller
53d69c399a mac80211: don't check sdata_running in vif notifier
The ip address of the vif can be set even before the
vif is up. requiring the vif to be up in the vif
notifier makes the notifer ignore this event, which
causes wrong arp filter configuration later on.

Reported-by: Eyal Shapira <eyal@wizery.com>
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:34:15 -05:00
Eliad Peller
0d392e938b mac80211: configure BSS_CHANGED_ARP_FILTER on reconfiguration
Configure arp filtering on sta reconfiguration.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:34:11 -05:00
Dmitry TARNYAGIN
cd6c524e9e mac80211: Do not request FIF_BCN_PRBRESP_PROMISC for HW scan.
ieee80211_configure_filter code used local->scanning as a boolean
value when it was a bit mask. Bits SCAN_COMPLETED, SCAN_ABORTED
should not set FIF_BCN_PRBRESP_PROMISC filter.

SCAN_HW_SCANNING should not set FIF_BCN_PRBRESP_PROMISC either,
as there is no explicit filter configuration request from
scan code. If a driver requires FIF_BCN_PRBRESP_PROMISC mode
during HW scanning, it's up to the driver to temporary enable it.

Similar mistake was fixed also in ieee80211_hw_config (power
configuration code).

Verified-by: Vitaly Wool <vitaly.wool@sonyericsson.com>
Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:34:08 -05:00
Rajkumar Manoharan
4a38994f1c cfg80211: notify core hints that helps to restore regd settings
Regulatory updates set by CORE are ignored for custom regulatory cards.
Let us notify the changes to the driver, as some drivers uses core hint
to restore its orig_* reg domain setting.

Cc: Paul Stewart <pstew@google.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Acked-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:31:04 -05:00
Helmut Schaa
adf5ace5d8 mac80211: Make use of ieee80211_is_* functions in tx status path
Use ieee80211_is_data, ieee80211_is_mgmt and ieee80211_is_first_frag
in the tx status path. This makes the code easier to read and allows us
to remove two local variables: frag and type.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:30:46 -05:00
Yogesh Ashok Powar
42624d4913 mac80211: Purge A-MPDU TX queues before station destructions
When a station leaves suddenly while ampdu traffic to that station is still
running, there is a possibility that the ampdu pending queues are not freed due
to a race condition leading to memory leaks. In '__sta_info_destroy' when we
attempt to destroy the ampdu sessions in 'ieee80211_sta_tear_down_BA_sessions',
the driver calls 'ieee80211_stop_tx_ba_cb_irqsafe' to delete the ampdu
structures (tid_tx) and splice the pending queues and this job gets queued in
sdata workqueue. However, the sta entry can get destroyed before the above work
gets scheduled and hence the race.

Purging the queues and freeing the tid_tx to avoid the leak. The better solution
would be to fix the race, but that can be taken up in a separate patch.

Signed-off-by: Nishant Sarmukadam <nishants@marvell.com>
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:30:33 -05:00
Vasanthakumar Thiagarajan
adbde344dc cfg80211: Fix race in bss timeout
It is quite possible to run into a race in bss timeout where
the drivers see the bss entry just before notifying cfg80211
of a roaming event but it got timed out by the time rdev->event_work
got scehduled from cfg80211_wq. This would result in the following
WARN-ON() along with the failure to notify the user space of
the roaming. The other situation which is happening with ath6kl
that runs into issue is when the driver reports roam to same AP
event where the AP bss entry already got expired. To fix this,
move cfg80211_get_bss() from __cfg80211_roamed() to cfg80211_roamed().

[158645.538384] WARNING: at net/wireless/sme.c:586
__cfg80211_roamed+0xc2/0x1b1()
[158645.538810] Call Trace:
[158645.538838]  [<c1033527>] warn_slowpath_common+0x65/0x7a
[158645.538917]  [<c14cfacf>] ? __cfg80211_roamed+0xc2/0x1b1
[158645.538946]  [<c103354b>] warn_slowpath_null+0xf/0x13
[158645.539055]  [<c14cfacf>] __cfg80211_roamed+0xc2/0x1b1
[158645.539086]  [<c14beb5b>] cfg80211_process_rdev_events+0x153/0x1cc
[158645.539166]  [<c14bd57b>] cfg80211_event_work+0x26/0x36
[158645.539195]  [<c10482ae>] process_one_work+0x219/0x38b
[158645.539273]  [<c14bd555>] ? wiphy_new+0x419/0x419
[158645.539301]  [<c10486cb>] worker_thread+0xf6/0x1bf
[158645.539379]  [<c10485d5>] ? rescuer_thread+0x1b5/0x1b5
[158645.539407]  [<c104b3e2>] kthread+0x62/0x67
[158645.539484]  [<c104b380>] ? __init_kthread_worker+0x42/0x42
[158645.539514]  [<c151309a>] kernel_thread_helper+0x6/0xd

Reported-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:30:28 -05:00
Dan Carpenter
fb03c5eb8c mac80211: unlock on error path in ieee80211_ibss_join()
We recently introduced a new return here but it needs an unlock first.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-13 15:30:25 -05:00
Hagen Paul Pfeifer
90b41a1cd4 netem: add cell concept to simulate special MAC behavior
This extension can be used to simulate special link layer
characteristics. Simulate because packet data is not modified, only the
calculation base is changed to delay a packet based on the original
packet size and artificial cell information.

packet_overhead can be used to simulate a link layer header compression
scheme (e.g. set packet_overhead to -20) or with a positive
packet_overhead value an additional MAC header can be simulated. It is
also possible to "replace" the 14 byte Ethernet header with something
else.

cell_size and cell_overhead can be used to simulate link layer schemes,
based on cells, like some TDMA schemes. Another application area are MAC
schemes using a link layer fragmentation with a (small) header each.
Cell size is the maximum amount of data bytes within one cell. Cell
overhead is an additional variable to change the per-cell-overhead
(e.g.  5 byte header per fragment).

Example (5 kbit/s, 20 byte per packet overhead, cell-size 100 byte, per
cell overhead 5 byte):

  tc qdisc add dev eth0 root netem rate 5kbit 20 100 5

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:44:48 -05:00
David S. Miller
c7c6575f25 Merge branch 'batman-adv/next' of git://git.open-mesh.org/linux-merge 2011-12-12 19:26:07 -05:00
Eric Dumazet
3f1e6d3fd3 sch_gred: should not use GFP_KERNEL while holding a spinlock
gred_change_vq() is called under sch_tree_lock(sch).

This means a spinlock is held, and we are not allowed to sleep in this
context.

We might pre-allocate memory using GFP_KERNEL before taking spinlock,
but this is not suitable for stable material.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:08:54 -05:00
Glauber Costa
0850f0f5c5 Display maximum tcp memory allocation in kmem cgroup
This patch introduces kmem.tcp.max_usage_in_bytes file, living in the
kmem_cgroup filesystem. The root cgroup will display a value equal
to RESOURCE_MAX. This is to avoid introducing any locking schemes in
the network paths when cgroups are not being actively used.

All others, will see the maximum memory ever used by this cgroup.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
ffea59e504 Display current tcp failcnt in kmem cgroup
This patch introduces kmem.tcp.failcnt file, living in the
kmem_cgroup filesystem. Following the pattern in the other
memcg resources, this files keeps a counter of how many times
allocation failed due to limits being hit in this cgroup.
The root cgroup will always show a failcnt of 0.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
5a6dd34377 Display current tcp memory allocation in kmem cgroup
This patch introduces kmem.tcp.usage_in_bytes file, living in the
kmem_cgroup filesystem. It is a simple read-only file that displays the
amount of kernel memory currently consumed by the cgroup.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
3aaabe2342 tcp buffer limitation: per-cgroup limit
This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
effectively control the amount of kernel memory pinned by a cgroup.

This value is ignored in the root cgroup, and in all others,
caps the value specified by the admin in the net namespaces'
view of tcp_sysctl_mem.

If namespaces are being used, the admin is allowed to set a
value bigger than cgroup's maximum, the same way it is allowed
to set pretty much unlimited values in a real box.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
3dc43e3e4d per-netns ipv4 sysctl_tcp_mem
This patch allows each namespace to independently set up
its levels for tcp memory pressure thresholds. This patch
alone does not buy much: we need to make this values
per group of process somehow. This is achieved in the
patches that follows in this patchset.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:11 -05:00
Glauber Costa
d1a4c0b37c tcp memory pressure controls
This patch introduces memory pressure controls for the tcp
protocol. It uses the generic socket memory pressure code
introduced in earlier patches, and fills in the
necessary data in cg_proto struct.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:10 -05:00
Glauber Costa
e1aab161e0 socket: initial cgroup code.
The goal of this work is to move the memory pressure tcp
controls to a cgroup, instead of just relying on global
conditions.

To avoid excessive overhead in the network fast paths,
the code that accounts allocated memory to a cgroup is
hidden inside a static_branch(). This branch is patched out
until the first non-root cgroup is created. So when nobody
is using cgroups, even if it is mounted, no significant performance
penalty should be seen.

This patch handles the generic part of the code, and has nothing
tcp-specific.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtsu.com>
CC: Kirill A. Shutemov <kirill@shutemov.name>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:10 -05:00
Glauber Costa
180d8cd942 foundations of per-cgroup memory pressure controlling.
This patch replaces all uses of struct sock fields' memory_pressure,
memory_allocated, sockets_allocated, and sysctl_mem to acessor
macros. Those macros can either receive a socket argument, or a mem_cgroup
argument, depending on the context they live in.

Since we're only doing a macro wrapping here, no performance impact at all is
expected in the case where we don't have cgroups disabled.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 19:04:10 -05:00
Ted Feng
72b36015ba ipip, sit: copy parms.name after register_netdevice
Same fix as 731abb9cb2 for ipip and sit tunnel.
Commit 1c5cae815d removed an explicit call to dev_alloc_name in
ipip_tunnel_locate and ipip6_tunnel_locate, because register_netdevice
will now create a valid name, however the tunnel keeps a copy of the
name in the private parms structure. Fix this by copying the name back
after register_netdevice has successfully returned.

This shows up if you do a simple tunnel add, followed by a tunnel show:

$ sudo ip tunnel add mode ipip remote 10.2.20.211
$ ip tunnel
tunl0: ip/ip  remote any  local any  ttl inherit  nopmtudisc
tunl%d: ip/ip  remote 10.2.20.211  local any  ttl inherit
$ sudo ip tunnel add mode sit remote 10.2.20.212
$ ip tunnel
sit0: ipv6/ip  remote any  local any  ttl 64  nopmtudisc 6rd-prefix 2002::/16
sit%d: ioctl 89f8 failed: No such device
sit%d: ipv6/ip  remote 10.2.20.212  local any  ttl inherit

Cc: stable@vger.kernel.org
Signed-off-by: Ted Feng <artisdom@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 18:50:51 -05:00
Li Wei
4af04aba93 ipv6: Fix for adding multicast route for loopback device automatically.
There is no obvious reason to add a default multicast route for loopback
devices, otherwise there would be a route entry whose dst.error set to
-ENETUNREACH that would blocking all multicast packets.

====================

[ more detailed explanation ]

The problem is that the resulting routing table depends on the sequence
of interface's initialization and in some situation, that would block all
muticast packets. Suppose there are two interfaces on my computer
(lo and eth0), if we initailize 'lo' before 'eth0', the resuting routing
table(for multicast) would be

# ip -6 route show | grep ff00::
unreachable ff00::/8 dev lo metric 256 error -101
ff00::/8 dev eth0 metric 256

When sending multicasting packets, routing subsystem will return the first
route entry which with a error set to -101(ENETUNREACH).

I know the kernel will set the default ipv6 address for 'lo' when it is up
and won't set the default multicast route for it, but there is no reason to
stop 'init' program from setting address for 'lo', and that is exactly what
systemd did.

I am sure there is something wrong with kernel or systemd, currently I preferred
kernel caused this problem.

====================

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-12 18:48:18 -05:00
Dan Carpenter
f8c141c3e9 nfc: signedness bug in __nci_request()
wait_for_completion_interruptible_timeout() returns -ERESTARTSYS if
interrupted so completion_rc needs to be signed.  The current code
probably returns -ETIMEDOUT if we hit this situation, but after this
patch is applied it will return -ERESTARTSYS.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-12-12 14:23:27 -05:00
John W. Linville
f2abba4921 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2011-12-12 14:19:43 -05:00
Sage Weil
f1932fc1a6 crush: fix mapping calculation when force argument doesn't exist
If the force argument isn't valid, we should continue calculating a
mapping as if it weren't specified.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-12 09:09:45 -08:00
Sven Eckelmann
b5a1eeef04 batman-adv: Only write requested number of byte to user buffer
Don't write more than the requested number of bytes of an batman-adv icmp
packet to the userspace buffer. Otherwise unrelated userspace memory might get
overridden by the kernel.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-12-12 19:11:07 +08:00
Sven Eckelmann
d18eb45332 batman-adv: Directly check read of icmp packet in copy_from_user
The access_ok read check can be directly done in copy_from_user since a failure
of access_ok is handled the same way as an error in __copy_from_user.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-12-12 19:11:06 +08:00
Paul Kot
c00b6856fc batman-adv: bat_socket_read missing checks
Writing a icmp_packet_rr and then reading icmp_packet can lead to kernel
memory corruption, if __user *buf is just below TASK_SIZE.

Signed-off-by: Paul Kot <pawlkt@gmail.com>
[sven@narfation.org: made it checkpatch clean]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-12-12 19:11:06 +08:00
Eric Dumazet
dfd56b8b38 net: use IS_ENABLED(CONFIG_IPV6)
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-11 18:25:16 -05:00
Pavel Emelyanov
86e62ad6b2 udp_diag: Fix the !ipv6 case
Wrap the udp6 lookup into the proper ifdef-s.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-10 13:14:59 -05:00
Pavel Emelyanov
b872a2371f udp_diag: Make it module when ipv6 is a module
Eric Dumazet reported, that when inet_diag is built-in the udp_diag also goes
built-in and when ipv6 is a module the udp6 lookup symbol is not found.

  LD      .tmp_vmlinux1
net/built-in.o: In function `udp_dump_one':
udp_diag.c:(.text+0xa2b40): undefined reference to `__udp6_lib_lookup'
make: *** [.tmp_vmlinux1] Erreur 1

Fix this by making udp diag build mode depend on both -- inet diag and ipv6.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-10 13:14:59 -05:00
Pavel Emelyanov
507dd7961e udp_diag: Wire the udp_diag module into kbuild
Copy-s/tcp/udp/-paste from TCP bits.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:15:00 -05:00
Pavel Emelyanov
b6d640c228 udp_diag: Implement the dump-all functionality
Do the same as TCP does -- iterate the given udp_table, filter
sockets with bytecode and dump sockets into reply message.

The same filtering as for TCP applies, though only some of the
state bits really matter.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:15:00 -05:00
Pavel Emelyanov
a925aa00a5 udp_diag: Implement the get_exact dumping functionality
Do the same as TCP does -- lookup a socket in the given udp_table,
check cookie, fill the reply message with existing inet socket dumping
helper and send one back.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:15:00 -05:00
Pavel Emelyanov
52b7c59bc3 udp_diag: Basic skeleton
Introduce the transport level diag handler module for UDP (and UDP-lite)
sockets and register (empty for now) callbacks in the inet_diag module.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:15:00 -05:00
Pavel Emelyanov
fce823381e udp: Export code sk lookup routines
The UDP diag get_exact handler will require them to find a
socket by provided net, [sd]addr-s, [sd]ports and device.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov
1942c518ca inet_diag: Generalize inet_diag dump and get_exact calls
Introduce two callbacks in inet_diag_handler -- one for dumping all
sockets (with filters) and the other one for dumping a single sk.

Replace direct calls to icsk handlers with indirect calls to callbacks
provided by handlers.

Make existing TCP and DCCP handlers use provided helpers for icsk-s.

The UDP diag module will provide its own.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov
3c4d05c805 inet_diag: Introduce the inet socket dumping routine
The existing inet_csk_diag_fill dumps the inet connection sock info
into the netlink inet_diag_message. Prepare this routine to be able
to dump only the inet_sock part of a socket if the icsk part is missing.

This will be used by UDP diag module when dumping UDP sockets.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov
8d07d1518a inet_diag: Introduce the byte-code run on an inet socket
The upcoming UDP module will require exactly this ability, so just
move the existing code to provide one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov
efb3cb428d inet_diag: Split inet_diag_get_exact into parts
Similar to previous patch: the 1st part locks the inet handler
and will get generalized and the 2nd one dumps icsk-s and will
be used by TCP and DCCP handlers.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov
476f7dbff3 inet_diag: Split inet_diag_get_exact into parts
The 1st part locks the inet handler and the 2nd one dump the
inet connection sock.

In the next patches the 1st part will be generalized to call
the socket dumping routine indirectly (i.e. TCP/UDP/DCCP) and
the 2nd part will be used by TCP and DCCP handlers.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov
b005ab4ef8 inet_diag: Export inet diag cookie checking routine
The netlink diag susbsys stores sk address bits in the nl message
as a "cookie" and uses one when dumps details about particular
socket.

The same will be required for udp diag module, so introduce a heler
in inet_diag module

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:08 -05:00
Pavel Emelyanov
87c22ea52e inet_diag: Reduce the number of args for bytecode run routine
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:07 -05:00
Pavel Emelyanov
7b35eadd7e inet_diag: Remove indirect sizeof from inet diag handlers
There's an info_size value stored on inet_diag_handler, but for existing
code this value is effectively constant, so just use sizeof(struct tcp_info)
where required.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 14:14:07 -05:00
John W. Linville
e7ab5f1c32 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2011-12-09 14:07:12 -05:00
Eric Dumazet
a73ed26bba sch_red: generalize accurate MAX_P support to RED/GRED/CHOKE
Now RED uses a Q0.32 number to store max_p (max probability), allow
RED/GRED/CHOKE to use/report full resolution at config/dump time.

Old tc binaries are non aware of new attributes, and still set/get Plog.

New tc binary set/get both Plog and max_p for backward compatibility,
they display "probability value" if they get max_p from new kernels.

# tc -d  qdisc show dev ...
...
qdisc red 10: parent 1:1 limit 360Kb min 30Kb max 90Kb ecn ewma 5
probability 0.09 Scell_log 15

Make sure we avoid potential divides by 0 in reciprocal_value(), if
(max_th - min_th) is big.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 13:46:15 -05:00
John Fastabend
0221cd5154 Revert "net: netprio_cgroup: make net_prio_subsys static"
This reverts commit 865d9f9f74.

This commit breaks the build with CONFIG_NETPRIO_CGROUP=y so
revert it. It does build as a module though. The SUBSYS macro
in the cgroup core code automatically defines a subsys structure
as extern. Long term we should fix the macro. And I need to
fully build test things.

Tested with CONFIG_NETPRIO_CGROUP={y|m|n} with and without
CONFIG_CGROUPS defined.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Reported-By: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-09 13:39:27 -05:00
Dan Carpenter
6f8e4ad0ef sock_diag: off by one checks
These tests are off by one because sock_diag_handlers[] only has AF_MAX
elements.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:58:35 -05:00
John Fastabend
865d9f9f74 net: netprio_cgroup: make net_prio_subsys static
net_prio_subsys can be made static this removes the sparse
warning it was throwing.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:58:35 -05:00
Benjamin LaHaise
6d4cdf47d2 vlan: add 802.1q netpoll support
Add netpoll support to 802.1q vlan devices.  Based on the netpoll support
in the bridging code.  Tested on a forced_eth device with netconsole.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:58:30 -05:00
Eric Dumazet
8af2a218de sch_red: Adaptative RED AQM
Adaptative RED AQM for linux, based on paper from Sally FLoyd,
Ramakrishna Gummadi, and Scott Shenker, August 2001 :

http://icir.org/floyd/papers/adaptiveRed.pdf

Goal of Adaptative RED is to make max_p a dynamic value between 1% and
50% to reach the target average queue : (max_th - min_th) / 2

Every 500 ms:
 if (avg > target and max_p <= 0.5)
  increase max_p : max_p += alpha;
 else if (avg < target and max_p >= 0.01)
  decrease max_p : max_p *= beta;

target :[min_th + 0.4*(min_th - max_th),
          min_th + 0.6*(min_th - max_th)].
alpha : min(0.01, max_p / 4)
beta : 0.9
max_P is a Q0.32 fixed point number (unsigned, with 32 bits mantissa)

Changes against our RED implementation are :

max_p is no longer a negative power of two (1/(2^Plog)), but a Q0.32
fixed point number, to allow full range described in Adatative paper.

To deliver a random number, we now use a reciprocal divide (thats really
a multiply), but this operation is done once per marked/droped packet
when in RED_BETWEEN_TRESH window, so added cost (compared to previous
AND operation) is near zero.

dump operation gives current max_p value in a new TCA_RED_MAX_P
attribute.

Example on a 10Mbit link :

tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec red \
   limit 400000 min 30000 max 90000 avpkt 1000 \
   burst 55 ecn adaptative bandwidth 10Mbit

# tc -s -d qdisc show dev eth3
...
qdisc red 10: parent 1:1 limit 400000b min 30000b max 90000b ecn
adaptative ewma 5 max_p=0.113335 Scell_log 15
 Sent 50414282 bytes 34504 pkt (dropped 35, overlimits 1392 requeues 0)
 rate 9749Kbit 831pps backlog 72056b 16p requeues 0
  marked 1357 early 35 pdrop 0 other 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-08 19:52:43 -05:00