Commit Graph

12146 Commits

Author SHA1 Message Date
David S. Miller
75a241f959 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-04-17 15:54:40 -07:00
David S. Miller
e18e37e509 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-04-17 15:38:38 -07:00
Johannes Berg
60375541f7 mac80211: validate TIM IE length
The TIM IE must not be shorter than 4 bytes, so verify that
when parsing it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-04-17 15:27:13 -04:00
Johannes Berg
cd1658f592 cfg80211: do not replace BSS structs
Instead, allocate extra IE memory if necessary. Normally,
this isn't even necessary since there's enough space.

This is a better way of correcting the "held BSS can
disappear" issue, but also a lot more code. It is also
necessary for proper auth/assoc BSS handling in the
future.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-04-17 15:27:13 -04:00
Johannes Berg
160002fe84 cfg80211: copy hold when replacing BSS
When we receive a probe response frame we can replace the
BSS struct in our list -- but if that struct is held then
we need to hold the new one as well.

We really should fix this completely and not replace the
struct, but this is a bandaid for now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-04-17 15:27:13 -04:00
Johannes Berg
7181d46737 mac80211: avoid crashing when no scan sdata
Using the scan_sdata variable here is terribly wrong,
if there has never been a scan then we fail. However,
we need a bandaid...

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-04-17 15:27:13 -04:00
Pablo Neira Ayuso
a0142733a7 netfilter: nfnetlink: return ENOMEM if we fail to create netlink socket
With this patch, nfnetlink returns -ENOMEM instead of -EPERM if we
fail to create the nfnetlink netlink socket during the module
loading. This is exactly what rtnetlink does in this case.

Ideally, it would be better if we propagate the error that has
happened in netlink_kernel_create(), however, this function still
does not implement this yet.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-17 17:48:44 +02:00
Pablo Neira Ayuso
150ace0db3 netfilter: ctnetlink: report error if event message allocation fails
This patch fixes an inconsistency that results in no error reports
to user-space listeners if we fail to allocate the event message.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-17 17:47:31 +02:00
Herbert Xu
a0a69a0106 gro: Fix use after free in tcp_gro_receive
After calling skb_gro_receive skb->len can no longer be relied
on since if the skb was merged using frags, then its pages will
have been removed and the length reduced.

This caused tcp_gro_receive to prematurely end merging which
resulted in suboptimal performance with ixgbe.

The fix is to store skb->len on the stack.

Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-17 02:34:38 -07:00
Oliver Hartkopp
62bcaa1303 can: Network Drop Monitor: Make use of consume_skb() in af_can.c
Since commit ead2ceb0ec ("Network Drop
Monitor: Adding kfree_skb_clean for non-drops and modifying
end-of-line points for skbs") so called end-of-line points for skb's
should use consume_skb() to free the socket buffer.

In opposite to consume_skb() the function kfree_skb() is intended to
be used for unexpected skb drops e.g. in error conditions that now can
trigger the network drop monitor if enabled.

This patch moves the skb end-of-line point in af_can.c to use
consume_skb().

Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-17 01:38:46 -07:00
David S. Miller
134ffb4cad Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-04-16 16:32:29 -07:00
Patrick McHardy
98d500d66c netfilter: nf_nat: add support for persistent mappings
The removal of the SAME target accidentally removed one feature that is
not available from the normal NAT targets so far, having multi-range
mappings that use the same mapping for each connection from a single
client. The current behaviour is to choose the address from the range
based on source and destination IP, which breaks when communicating
with sites having multiple addresses that require all connections to
originate from the same IP address.

Introduce a IP_NAT_RANGE_PERSISTENT option that controls whether the
destination address is taken into account for selecting addresses.

http://bugzilla.kernel.org/show_bug.cgi?id=12954

Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-16 18:33:01 +02:00
Gerrit Renker
23a99840d5 mac80211: Fragmentation threshold (typo)
mac80211: Fragmentation threshold (typo)

ieee80211_ioctl_siwfrag() sets the fragmentation_threshold to 2352
when frame fragmentation is to be disabled, yet the corresponding
'get' function tests for 2353 bytes instead.

This causes user-space tools to display a fragmentation threshold
of 2352 bytes even if fragmentation has been disabled.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-04-16 10:39:16 -04:00
Michael Buesch
a860402d8f mac80211: quiet beacon loss messages
On Sunday 05 April 2009 11:29:38 Michael Buesch wrote:
> On Sunday 05 April 2009 11:23:59 Jaswinder Singh Rajput wrote:
> > With latest linus tree I am getting, .config file attached:
> >
> > [   22.895051] r8169: eth0: link down
> > [   22.897564] ADDRCONF(NETDEV_UP): eth0: link is not ready
> > [   22.928047] ADDRCONF(NETDEV_UP): wlan0: link is not ready
> > [   22.982292] libvirtd used greatest stack depth: 4200 bytes left
> > [   63.709879] wlan0: authenticate with AP 00:11:95:9e:df:f6
> > [   63.712096] wlan0: authenticated
> > [   63.712127] wlan0: associate with AP 00:11:95:9e:df:f6
> > [   63.726831] wlan0: RX AssocResp from 00:11:95:9e:df:f6 (capab=0x471 status=0 aid=1)
> > [   63.726855] wlan0: associated
> > [   63.730093] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
> > [   74.296087] wlan0: no IPv6 routers present
> > [   79.349044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  119.358200] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  179.354292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  259.366044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  359.348292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  361.953459] packagekitd used greatest stack depth: 4160 bytes left
> > [  478.824258] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  598.813343] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  718.817292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  838.824567] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [  958.815402] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1078.848434] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1198.822913] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1318.824931] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1438.814157] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1558.827336] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1678.823011] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1798.830589] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1918.828044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2038.827224] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2116.517152] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2158.840243] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2278.827427] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
>
>
> I think this message should only show if CONFIG_MAC80211_VERBOSE_DEBUG is set.
> It's kind of expected that we lose a beacon once in a while, so we shouldn't print
> verbose messages to the kernel log (even if they are KERN_DEBUG).
>
> And besides that, I think one can easily remotely trigger this message and flood the logs.
> So it should probably _also_ be ratelimited.

Something like this:

Signed-off-by: Michael Buesch <mb@bu3sch.de>
2009-04-16 10:39:14 -04:00
Johannes Berg
47afbaf5af mac80211: correct wext transmit power handler
Wext makes no assumptions about the contents of
data->txpower.fixed and data->txpower.value when
data->txpower.disabled is set, so do not update
the user-requested power level while disabling.

Also, when wext configures a really _fixed_ power
output [1], we should reject it instead of limiting it
to the regulatory constraint. If the user wants to set
a _limit_ [2] then we should honour that.

[1] iwconfig wlan0 txpower 20dBm fixed
[2] iwconfig wlan0 txpower 10dBm

This fixes
http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1942

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-04-16 10:39:08 -04:00
Vasanthakumar Thiagarajan
b3631286ac mac80211: Fix bug in getting rx status for frames pending in reorder buffer
Currently rx status for frames which are completed from reorder buffer
is taken from it's cb area which is not always right, cb is not holding
the rx status when driver uses mac80211's non-irq rx handler to pass it's
received frames. This results in dropping almost all frames from reorder
buffer when security is enabled by doing double decryption (first in hw,
second in sw because of wrong rx status). This patch copies rx status into
cb area before the frame is put into reorder buffer. After this patch,
there is a significant improvement in throughput with ath9k + WPA2(AES).

Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-04-16 10:39:02 -04:00
Luis R. Rodriguez
0ad8acaf43 cfg80211: fix NULL pointer deference in reg_device_remove()
We won't ever get here as regulatory_hint_core() can only fail
on -ENOMEM and in that case we don't initialize cfg80211 but this is
technically correct code.

This is actually good for stable, where we don't check for -ENOMEM
failure on __regulatory_hint()'s failure.

Cc: stable@kernel.org
Reported-by: Quentin Armitage <Quentin@armitage.org.uk>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-04-16 10:39:01 -04:00
Patrick McHardy
38fb0afcd8 netfilter: nf_conntrack: fix crash when unloading helpers
Commit ea781f197d (netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and)
get rid of call_rcu() was missing one conversion to the hlist_nulls
functions, causing a crash when unloading conntrack helper modules.

Reported-and-tested-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-15 12:45:08 +02:00
Eric Dumazet
719bfeaae8 packet: avoid warnings when high-order page allocation fails
Latest tcpdump/libpcap triggers annoying messages because of high order page
allocation failures (when lowmem exhausted or fragmented)

These allocation errors are correctly handled so could be silent.

[22660.208901] tcpdump: page allocation failure. order:5, mode:0xc0d0
[22660.208921] Pid: 13866, comm: tcpdump Not tainted 2.6.30-rc2 #170
[22660.208936] Call Trace:
[22660.208950]  [<c04e2b46>] ? printk+0x18/0x1a
[22660.208965]  [<c02760f7>] __alloc_pages_internal+0x357/0x460
[22660.208980]  [<c0276251>] __get_free_pages+0x21/0x40
[22660.208995]  [<c04cc835>] packet_set_ring+0x105/0x3d0
[22660.209009]  [<c04ccd1d>] packet_setsockopt+0x21d/0x4d0
[22660.209025]  [<c0270400>] ? filemap_fault+0x0/0x450
[22660.209040]  [<c0449e34>] sys_setsockopt+0x54/0xa0
[22660.209053]  [<c044b97f>] sys_socketcall+0xef/0x270
[22660.209067]  [<c0202e34>] sysenter_do_call+0x12/0x26

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-15 03:39:52 -07:00
Eric Dumazet
b6f0a3652e netfilter: nf_log regression fix
commit ca735b3aaa
'netfilter: use a linked list of loggers'
introduced an array of list_head in "struct nf_logger", but
forgot to initialize it in nf_log_register(). This resulted
in oops when calling nf_log_unregister() at module unload time.

Reported-and-tested-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-15 12:16:19 +02:00
David S. Miller
6fd4777a1f Revert "rose: zero length frame filtering in af_rose.c"
This reverts commit 244f46ae6e.

Alan Cox did the research, and just like the other radio protocols
zero-length frames have meaning because at the top level ROSE is
X.25 PLP.

So this zero-length filtering is invalid.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-14 20:28:00 -07:00
Herbert Xu
fc59f9a3bf gro: Restore correct value to gso_size
Since everybody has been focusing on baremetal GRO performance
no one noticed when I added a bug that zapped gso_size for all
GRO packets.  This only gets picked up when you forward the skb
out of an interface.

Thanks to Mark Wagner for noticing this bug when testing kvm.

Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-14 15:11:06 -07:00
Yang Hongyang
ce8632ba6b ipv6:remove useless check
After switch (rthdr->type) {...},the check below is completely useless.Because:
if the type is 2,then hdrlen must be 2 and segments_left must be 1,clearly the
check is redundant;if the type is not 2,then goto sticky_done,the check is useless
too.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-14 02:21:41 -07:00
Ilpo Järvinen
86bcebafc5 tcp: fix >2 iw selection
A long-standing feature in tcp_init_metrics() is such that
any of its goto reset prevents call to tcp_init_cwnd().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-14 02:08:53 -07:00
Stephen Hemminger
1a31f2042e netsched: Allow meta match on vlan tag on receive
When vlan acceleration is used on receive, the vlan tag is maintained
outside of the skb data. The existing vlan tag match only works on TX
path because it uses vlan_get_tag which tests for VLAN_HW_TX_ACCEL.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-13 18:12:57 -07:00
Herbert Xu
1db9e29bb0 gro: Normalise skb before bypassing GRO on netpoll VLAN path
Hi:

gro: Normalise skb before bypassing GRO on netpoll VLAN path

When we detect netpoll RX on the GRO VLAN path we bail out and
call the normal VLAN receive handler.  However, the packet needs
to be normalised by calling eth_type_trans since that's what the
normal path expects (normally the GRO path does the fixup).

This patch adds the necessary call to vlan_gro_frags.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-13 15:44:50 -07:00
Vlad Yasevich
499923c7a3 ipv6: Fix NULL pointer dereference with time-wait sockets
Commit b2f5e7cd3d
(ipv6: Fix conflict resolutions during ipv6 binding)
introduced a regression where time-wait sockets were
not treated correctly.  This resulted in the following:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000062
IP: [<ffffffff805d7d61>] ipv4_rcv_saddr_equal+0x61/0x70
...
Call Trace:
[<ffffffffa033847b>] ipv6_rcv_saddr_equal+0x1bb/0x250 [ipv6]
[<ffffffffa03505a8>] inet6_csk_bind_conflict+0x88/0xd0 [ipv6]
[<ffffffff805bb18e>] inet_csk_get_port+0x1ee/0x400
[<ffffffffa0319b7f>] inet6_bind+0x1cf/0x3a0 [ipv6]
[<ffffffff8056d17c>] ? sockfd_lookup_light+0x3c/0xd0
[<ffffffff8056ed49>] sys_bind+0x89/0x100
[<ffffffff80613ea2>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[<ffffffff8020bf9b>] system_call_fastpath+0x16/0x1b

Tested-by: Brian Haley <brian.haley@hp.com>
Tested-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-11 01:53:06 -07:00
Wei Yongjun
3384901f1b tr: fix leakage of device in net/802/tr.c
Add dev_put() after dev_get_by_index() to avoid leakage
of device.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-11 01:43:17 -07:00
Alexander Duyck
d543103a0c net: netif_device_attach/detach should start/stop all queues
Currently netif_device_attach/detach are only stopping one queue.  They
should be starting and stopping all the queues on a given device.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-11 01:43:10 -07:00
David S. Miller
fd1cc48024 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-04-08 13:39:54 -07:00
Linus Torvalds
3989203290 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  b44: Use kernel DMA addresses for the kernel DMA API
  forcedeth: Fix resume from hibernation regression.
  xfrm: fix fragmentation on inter family tunnels
  ibm_newemac: Fix dangerous struct assumption
  gigaset: documentation update
  gigaset: in file ops, check for device disconnect before anything else
  bas_gigaset: use tasklet_hi_schedule for timing critical tasklets
  net/802/fddi.c: add MODULE_LICENSE
  smsc911x: remove unused #include <linux/version.h>
  axnet_cs: fix phy_id detection for bogus Asix chip.
  bnx2: Use request_firmware()
  b44: Fix sizes passed to b44_sync_dma_desc_for_{device,cpu}()
  socket: use percpu_add() while updating sockets_in_use
  virtio_net: Set the mac config only when VIRITO_NET_F_MAC
  myri_sbus: use request_firmware
  e1000: fix loss of multicast packets
  vxge: should include tcp.h

Conflict in firmware/WHENCE (SCSI vs net firmware)
2009-04-06 18:05:43 -07:00
Steffen Klassert
d1d88e5de4 xfrm: fix fragmentation on inter family tunnels
If an ipv4 packet (not locally generated with IP_DF flag not set) bigger
than mtu size is supposed to go via a xfrm ipv6 tunnel, the packetsize
check in xfrm4_tunnel_check_size() is omited and ipv6 drops the packet
without sending a notice to the original sender of the ipv4 packet.

Another issue is that ipv4 connection tracking does reassembling of
incomming fragmented packets. If such a reassembled packet is supposed to
go via a xfrm ipv6 tunnel it will be droped, even if the original sender
did proper fragmentation.

According to RFC 2473 (section 7) tunnel ipv6 packets resulting from the
encapsulation of an original packet are considered as locally generated
packets. If such a packet passed the checks in xfrm{4,6}_tunnel_check_size()
fragmentation is allowed according to RFC 2473 (section 7.1/7.2).

This patch sets skb->local_df in xfrm6_prepare_output() to achieve
fragmentation in this case.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-06 17:07:59 -07:00
Adrian Bunk
d9677a45cf net/802/fddi.c: add MODULE_LICENSE
This patch adds the missing MODULE_LICENSE("GPL").

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-06 17:07:55 -07:00
Linus Torvalds
a63856252d Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits)
  nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4
  nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
  nfsd41: Documentation/filesystems/nfs41-server.txt
  nfsd41: CREATE_EXCLUSIVE4_1
  nfsd41: SUPPATTR_EXCLCREAT attribute
  nfsd41: support for 3-word long attribute bitmask
  nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
  nfsd41: pass writable attrs mask to nfsd4_decode_fattr
  nfsd41: provide support for minor version 1 at rpc level
  nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
  nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
  nfsd41: access_valid
  nfsd41: clientid handling
  nfsd41: check encode size for sessions maxresponse cached
  nfsd41: stateid handling
  nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
  nfsd41: destroy_session operation
  nfsd41: non-page DRC for solo sequence responses
  nfsd41: Add a create session replay cache
  nfsd41: create_session operation
  ...
2009-04-06 13:25:56 -07:00
Pablo Neira Ayuso
83731671d9 netfilter: ctnetlink: fix regression in expectation handling
This patch fixes a regression (introduced by myself in commit 19abb7b:
netfilter: ctnetlink: deliver events for conntracks changed from
userspace) that results in an expectation re-insertion since
__nf_ct_expect_check() may return 0 for expectation timer refreshing.

This patch also removes a unnecessary refcount bump that
pretended to avoid a possible race condition with event delivery
and expectation timers (as said, not needed since we hold a
reference to the object since until we finish the expectation
setup). This also merges nf_ct_expect_related_report() and
nf_ct_expect_related() which look basically the same.

Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-06 17:47:20 +02:00
Alex Riesen
3ae16f1302 netfilter: fix selection of "LED" target in netfilter
It's plural, not LED_TRIGGERS.

Signed-off-by: Alex Riesen <fork0@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-06 17:09:43 +02:00
Eric Dumazet
49a88d18a1 netfilter: ip6tables regression fix
Commit 7845447 (netfilter: iptables: lock free counters) broke
ip6_tables by unconditionally returning ENOMEM in alloc_counters(),

Reported-by: Graham Murray <graham@gmurray.org.uk>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-06 17:06:55 +02:00
Linus Torvalds
90975ef712 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits)
  cpumask: remove cpumask allocation from idle_balance, fix
  numa, cpumask: move numa_node_id default implementation to topology.h, fix
  cpumask: remove cpumask allocation from idle_balance
  x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus
  x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
  x86: microcode: cleanup
  x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
  cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
  numa, cpumask: move numa_node_id default implementation to topology.h
  cpumask: convert node_to_cpumask_map[] to cpumask_var_t
  cpumask: remove x86 cpumask_t uses.
  cpumask: use cpumask_var_t in uv_flush_tlb_others.
  cpumask: remove cpumask_t assignment from vector_allocation_domain()
  cpumask: make Xen use the new operators.
  cpumask: clean up summit's send_IPI functions
  cpumask: use new cpumask functions throughout x86
  x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
  cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t
  cpumask: convert node_to_cpumask_map[] to cpumask_var_t
  x86: unify 32 and 64-bit node_to_cpumask_map
  ...
2009-04-05 10:33:07 -07:00
Eric Dumazet
4e69489a0a socket: use percpu_add() while updating sockets_in_use
sock_alloc() currently uses following code to update sockets_in_use

get_cpu_var(sockets_in_use)++;
put_cpu_var(sockets_in_use);

This translates to :

c0436274:       b8 01 00 00 00          mov    $0x1,%eax
c0436279:       e8 42 40 df ff          call   c022a2c0 <add_preempt_count>
c043627e:       bb 20 4f 6a c0          mov    $0xc06a4f20,%ebx
c0436283:       e8 18 ca f0 ff          call   c0342ca0 <debug_smp_processor_id>
c0436288:       03 1c 85 60 4a 65 c0    add    -0x3f9ab5a0(,%eax,4),%ebx
c043628f:       ff 03                   incl   (%ebx)
c0436291:       b8 01 00 00 00          mov    $0x1,%eax
c0436296:       e8 75 3f df ff          call   c022a210 <sub_preempt_count>
c043629b:       89 e0                   mov    %esp,%eax
c043629d:       25 00 e0 ff ff          and    $0xffffe000,%eax
c04362a2:       f6 40 08 08             testb  $0x8,0x8(%eax)
c04362a6:       75 07                   jne    c04362af <sock_alloc+0x7f>
c04362a8:       8d 46 d8                lea    -0x28(%esi),%eax
c04362ab:       5b                      pop    %ebx
c04362ac:       5e                      pop    %esi
c04362ad:       c9                      leave
c04362ae:       c3                      ret
c04362af:       e8 cc 5d 09 00          call   c04cc080 <preempt_schedule>
c04362b4:       8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi
c04362b8:       eb ee                   jmp    c04362a8 <sock_alloc+0x78>

While percpu_add(sockets_in_use, 1) translates to a single instruction :

c0436275:   64 83 05 20 5f 6a c0    addl   $0x1,%fs:0xc06a5f20

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-04 16:41:09 -07:00
Andy Adamson
2f425878b6 nfsd: don't use the deferral service, return NFS4ERR_DELAY
On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be
returned. It is up to the NFSv4.1 client to resend only the operations that
have not been processed.

Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in
nfsd4_proc_compound() only when NFSv4.1 Sessions are used.

Note: this isn't an adequate solution on its own. It's acceptable as a way
to get some minimal 4.1 up and working, but we're going to have to find a
way to avoid returning DELAY in all common cases before 4.1 can really be
considered ready.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse rq_nodeferral negative logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[sunrpc: initialize rq_usedeferral]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:12 -07:00
Linus Torvalds
811158b147 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
  trivial: Update my email address
  trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
  trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
  trivial: Fix misspelling of "Celsius".
  trivial: remove unused variable 'path' in alloc_file()
  trivial: fix a pdlfush -> pdflush typo in comment
  trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
  trivial: wusb: Storage class should be before const qualifier
  trivial: drivers/char/bsr.c: Storage class should be before const qualifier
  trivial: h8300: Storage class should be before const qualifier
  trivial: fix where cgroup documentation is not correctly referred to
  trivial: Give the right path in Documentation example
  trivial: MTD: remove EOL from MODULE_DESCRIPTION
  trivial: Fix typo in bio_split()'s documentation
  trivial: PWM: fix of #endif comment
  trivial: fix typos/grammar errors in Kconfig texts
  trivial: Fix misspelling of firmware
  trivial: cgroups: documentation typo and spelling corrections
  trivial: Update contact info for Jochen Hein
  trivial: fix typo "resgister" -> "register"
  ...
2009-04-03 15:24:35 -07:00
Linus Torvalds
8fe74cf053 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  Remove two unneeded exports and make two symbols static in fs/mpage.c
  Cleanup after commit 585d3bc06f
  Trim includes of fdtable.h
  Don't crap into descriptor table in binfmt_som
  Trim includes in binfmt_elf
  Don't mess with descriptor table in load_elf_binary()
  Get rid of indirect include of fs_struct.h
  New helper - current_umask()
  check_unsafe_exec() doesn't care about signal handlers sharing
  New locking/refcounting for fs_struct
  Take fs_struct handling to new file (fs/fs_struct.c)
  Get rid of bumping fs_struct refcount in pivot_root(2)
  Kill unsharing fs_struct in __set_personality()
2009-04-02 21:09:10 -07:00
Linus Torvalds
ef8a97bbc9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (54 commits)
  glge: remove unused #include <version.h>
  dnet: remove unused #include <version.h>
  tcp: miscounts due to tcp_fragment pcount reset
  tcp: add helper for counter tweaking due mid-wq change
  hso: fix for the 'invalid frame length' messages
  hso: fix for crash when unplugging the device
  fsl_pq_mdio: Fix compile failure
  fsl_pq_mdio: Revive UCC MDIO support
  ucc_geth: Pass proper device to DMA routines, otherwise oops happens
  i.MX31: Fixing cs89x0 network building to i.MX31ADS
  tc35815: Fix build error if NAPI enabled
  hso: add Vendor/Product ID's for new devices
  ucc_geth: Remove unused header
  gianfar: Remove unused header
  kaweth: Fix locking to be SMP-safe
  net: allow multiple dev per napi with GRO
  r8169: reset IntrStatus after chip reset
  ixgbe: Fix potential memory leak/driver panic issue while setting up Tx & Rx ring parameters
  ixgbe: fix ethtool -A|a behavior
  ixgbe: Patch to fix driver panic while freeing up tx & rx resources
  ...
2009-04-02 21:05:30 -07:00
Ilpo Järvinen
9eb9362e56 tcp: miscounts due to tcp_fragment pcount reset
It seems that trivial reset of pcount to one was not sufficient
in tcp_retransmit_skb. Multiple counters experience a positive
miscount when skb's pcount gets lowered without the necessary
adjustments (depending on skb's sacked bits which exactly), at
worst a packets_out miscount can crash at RTO if the write queue
is empty!

Triggering this requires mss change, so bidir tcp or mtu probe or
like.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Uwe Bugla <uwe.bugla@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02 16:31:45 -07:00
Ilpo Järvinen
797108d134 tcp: add helper for counter tweaking due mid-wq change
We need full-scale adjustment to fix a TCP miscount in the next
patch, so just move it into a helper and call for that from the
other places.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02 16:31:44 -07:00
Stephen Hemminger
f2bde73286 net: allow multiple dev per napi with GRO
GRO assumes that there is a one-to-one relationship between NAPI
structure and network device. Some devices like sky2 share multiple
devices on a single interrupt so only have one NAPI handler. Rather than
split GRO from NAPI, just have GRO assume if device changes that
it is a different flow.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02 01:07:37 -07:00
Eric Dumazet
fa9a86ddc8 netfilter: use rcu_read_bh() in ipt_do_table()
Commit 784544739a
(netfilter: iptables: lock free counters) forgot to disable BH
in arpt_do_table(), ipt_do_table() and  ip6t_do_table()

Use rcu_read_lock_bh() instead of rcu_read_lock() cures the problem.

Reported-and-bisected-by: Roman Mindalev <r000n@r000n.net>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02 00:54:43 -07:00
Andy Grover
8cbd9606a6 RDS: Use spinlock to protect 64b value update on 32b archs
We have a 64bit value that needs to be set atomically.
This is easy and quick on all 64bit archs, and can also be done
on x86/32 with set_64bit() (uses cmpxchg8b). However other
32b archs don't have this.

I actually changed this to the current state in preparation for
mainline because the old way (using a spinlock on 32b) resulted in
unsightly #ifdefs in the code. But obviously, being correct takes
precedence.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02 00:52:22 -07:00
Andy Grover
745cbccac3 RDS: Rewrite connection cleanup, fixing oops on rmmod
This fixes a bug where a connection was unexpectedly
not on *any* list while being destroyed. It also
cleans up some code duplication and regularizes some
function names.

* Grab appropriate lock in conn_free() and explain in comment
* Ensure via locking that a conn is never not on either
  a dev's list or the nodev list
* Add rds_xx_remove_conn() to match rds_xx_add_conn()
* Make rds_xx_add_conn() return void
* Rename remove_{,nodev_}conns() to
  destroy_{,nodev_}conns() and unify their implementation
  in a helper function
* Document lock ordering as nodev conn_lock before
  dev_conn_lock

Reported-by: Yosef Etigin <yosefe@voltaire.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02 00:52:22 -07:00
Andy Grover
f1cffcbfcc RDS: Fix m_rs_lock deadlock
rs_send_drop_to() is called during socket close. If it takes
m_rs_lock without disabling interrupts, then
rds_send_remove_from_sock() can run from the rx completion
handler and thus deadlock.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02 00:52:21 -07:00