Move the global initial codes to the module_init/exit context.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move the global initial codes to the module_init/exit context.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move the global initial codes to the module_init/exit context.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move the global initial codes to the module_init/exit context.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move the global initial codes to the module_init/exit context.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move the global initial codes to the module_init/exit context.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move the global initial codes to the module_init/exit context.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move the global initial codes to the module_init/exit context.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nf_conntrack initialization and cleanup codes happens in pernet
operations function. This task should be done in module_init/exit.
We can't use init_net to identify if it's the right time to initialize
or cleanup since we cannot make assumption on the order netns are
created/destroyed.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fengguang reported:
net/core/netpoll.c: In function 'netpoll_setup':
net/core/netpoll.c:1049:6: warning: 'err' may be used uninitialized in this function [-Wmaybe-uninitialized]
in !CONFIG_IPV6 case, we may error out without initializing
'err'.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is declared in:
include/net/ip6_route.h:187:int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *));
and net/ip6_route.h is already included.
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we have removed NCE (Neighbour Cache Entry) reference from
routing entries, the only refcnt holders of an NCE are its timer
(if running) and its owner table, in usual cases. As a result,
neigh_periodic_work() purges NCEs over and over again even for
gateways.
It does not make sense to purge entries, if number of them is
very small, so keep them. The minimum number of entries to keep
is specified by gc_thresh1.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
mfc_mcastgrp and mfc_origin are __be32, thus we need to convert INADDR_ANY.
Because INADDR_ANY is 0, this patch just fix sparse warnings.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
git commit 9cb3a50c (ipv4: Invalidate the socket cached route on
pmtu events if possible) introduced a refcount problem. We don't
get a refcount on the route if we get it from__sk_dst_get(), but
we need one if we want to reuse this route because __sk_dst_set()
releases the refcount of the old route. This patch adds proper
refcount handling for that case. We introduce a 'new' flag to
indicate that we are going to use a new route and we release the
old route only if we replace it by a new one.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
1) The transport header did not point to the right place after
esp/ah processing on tunnel mode in the receive path. As a
result, the ECN field of the inner header was not set correctly,
fixes from Li RongQing.
2) We did a null check too late in one of the xfrm_replay advance
functions. This can lead to a division by zero, fix from
Nickolai Zeldovich.
3) The size calculation of the hash table missed the muiltplication
with the actual struct size when the hash table is freed.
We might call the wrong free function, fix from Michal Kubecek.
4) On IPsec pmtu events we can't access the transport headers of
the original packet, so force a relookup for all routes
to notify about the pmtu event.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6a328d8c6f changed the update
logic for the socket but it does not update the SCM_RIGHTS update
as well. This patch is based on the net_prio fix commit
48a87cc26c
net: netprio: fd passed in SCM_RIGHTS datagram not set correctly
A socket fd passed in a SCM_RIGHTS datagram was not getting
updated with the new tasks cgrp prioidx. This leaves IO on
the socket tagged with the old tasks priority.
To fix this add a check in the scm recvmsg path to update the
sock cgrp prioidx with the new tasks value.
Let's apply the same fix for net_cls.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Reported-by: Li Zefan <lizefan@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: netdev@vger.kernel.org
Cc: cgroups@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2152caea ("ipv6: Do not depend on rt->n in rt6_probe().")
introduce a bug to try to update "updated" time in neighbour
structure.
Update the "updated" time only if neighbour is available.
Bug was found by Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes dsa_switch_setup() to ensure that at least one valid
valid port name is specified and will bail out with an error in case we
walked the maximum number of port with a valid port name found.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The slave MII bus registered by the DSA code is using the parent MII bus
as part of its name (ds->master_mii_bus_id), in case the parent MII bus
name is already 16 characters long (such as d0072004.mdio-mi) we will
get the following WARN_ON in dsa_switch_setup() when calling
mdiobus_register():
[ 79.088782] ------------[ cut here ]------------
[ 79.093448] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0x80/0xa0()
[ 79.099831] sysfs: cannot create duplicate filename
'/class/mdio_bus/d0072004.mdio-mi'
This is a genuine warning, because the DSA slave MII bus will also be
named d0072004.mdio-mi, and since MII_BUS_ID_SIZE is 17 characters long
(with null-terminator) the following will truncate the slave MII bus id:
snprintf(ds->slave_mii_bus->id, MII_BUS_ID_SIZE, "%s-%d:%.2x",
ds->master_mii_bus->id, ds->pd->sw_addr);
Fix this by using dsa-<switch index->:<sw_add> which is guaranteed to be
unique.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
__skb_tx_hash() and __skb_get_rxhash() are all for calculating hash
value based by some fields in skb, mostly used for selecting queues
by device drivers.
Meanwhile, net/core/dev.c is bloating.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This implements a socket release callback function to check
if the socket cached route got invalid during the time
we owned the socket. The function is used from udp, raw
and ping sockets.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The route lookup in ipv4_sk_update_pmtu() might return a route
different from the route we cached at the socket. This is because
standart routes are per cpu, so each cpu has it's own struct rtable.
This means that we do not invalidate the socket cached route if the
NET_RX_SOFTIRQ is not served by the same cpu that the sending socket
uses. As a result, the cached route reused until we disconnect.
With this patch we invalidate the socket cached route if possible.
If the socket is owened by the user, we can't update the cached
route directly. A followup patch will implement socket release
callback functions for datagram sockets to handle this case.
Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we set mac address, software mac address in system and hardware mac
address all need to be updated. Current eth_mac_addr() doesn't allow
callers to implement error handling nicely.
This patch split eth_mac_addr() to prepare part and real commit part,
then we can prepare first, and try to change hardware address, then do
the real commit if hardware address is set successfully.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add the support of proxy multicast, ie being able to build a static
multicast tree. It adds the support of (*,*) and (*,G) entries.
The user should define an (*,*) entry which is not used for real forwarding.
This entry defines the upstream in iif and contains all interfaces from the
static tree in its oifs. It will be used to forward packet upstream when they
come from an interface belonging to the static tree.
Hence, the user should define (*,G) entries to build its static tree. Note that
upstream interface must be part of oifs: packets are sent to all oifs
interfaces except the input interface. This ensures to always join the whole
static tree, even if the packet is not coming from the upstream interface.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Construct NS/NA/RS message directly using C99 compound literals.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_transport_header() (thus icmp6_hdr()) is available here,
use it.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Build ICMPv6 message first and make buffer management easier;
we can use skb->len when filling checksum in ICMPv6 header,
and then build IP header with length field.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- move ip6_nd_hdr() to its users' source files.
In net/ipv6/mcast.c, it will be called ip6_mc_hdr().
- make return type to void since this function never fails.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suggested by Eric Dumazet <edumazet@google.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This also makes ndisc_opt_addr_data() and ndisc_fill_addr_option()
use ndisc_opt_addr_space().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add pointer to struct net_device (dev) and remove
data_len (= dev->addr_len) and addr_type (= dev->type).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On IPsec pmtu events we can't access the transport headers of
the original packet, so we can't find the socket that sent
the packet. The only chance to notify the socket about the
pmtu change is to force a relookup for all routes. This
patch implenents this for the IPsec protocols.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Support arbitrary linux socket filter (BPF) programs as x_tables
match rules. This allows for very expressive filters, and on
platforms with BPF JIT appears competitive with traditional
hardcoded iptables rules using the u32 match.
The size of the filter has been artificially limited to 64
instructions maximum to avoid bloating the size of each rule
using this new match.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Missing multiplication of block size by sizeof(struct hlist_head)
can cause xfrm_hash_free() to be called with wrong second argument
so that kfree() is called on a block allocated with vzalloc() or
__get_free_pages() or free_pages() is called with wrong order when
a namespace with enough policies is removed.
Bug introduced by commit a35f6c5d, i.e. versions >= 2.6.29 are
affected.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
commit 9ca1b22d6d (net: splice: avoid high order page splitting)
forgot that skb->head could need a copy into several page frags.
This could be the case for loopback traffic mostly.
Also remove now useless skb argument from linear_to_page()
and __splice_segment() prototypes.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
splice() can handle pages of any order, but network code tries hard to
split them in PAGE_SIZE units. Not quite successfully anyway, as
__splice_segment() assumed poff < PAGE_SIZE. This is true for
the skb->data part, not necessarily for the fragments.
This patch removes this logic to give the pages as they are in the skb.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 563d34d057 (tcp: dont drop MTU reduction indications)
added an error leading to incorrect accounting of
LINUX_MIB_LOCKDROPPEDICMPS
If socket is owned by the user, we want to increment
this SNMP counter, unless the message is a
(ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED) one.
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When processing the unregister notify for a hard interface, removing
the sysfs files may lead to a circular deadlock (rtnl mutex <->
s_active).
To overcome this problem, postpone the sysfs removal in a worker.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Use more preferable function name which implies using a pseudo-random
number generator.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Cc: Antonio Quartulli <ordex@autistici.org>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Thanks to Sven Eckelmann and Simon Wunderlich for their support.
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
A delayed_work struct does not need to be initialized each
every time before being enqueued. Therefore the
INIT_DELAYED_WORK() macro should be used during the
initialization process only.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
This is already checked by the caller (tunnel64_rcv) and brings ipip6_rcv
in line with ipip_rcv.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check message length before accessing "target" field,
as we do for other types.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because of rt->n removal, we do not need neigh argument any more.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
pmtu and redirect events are now handled in the protocols error handler,
so add an error handler for icmp6 to do this. It is needed in the case
when we have no socket context. Based on a patch by Duan Jiong.
Reported-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle the reception of uncompressed packets (dispatch type = IPv6).
Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor the handing of the skb's to the individual lowpan devices into a
function.
Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
All of the xfrm_replay->advance functions in xfrm_replay.c check if
x->replay_esn->replay_window is zero (and return if so). However,
one of them, xfrm_replay_advance_bmp(), divides by that value (in the
'%' operator) before doing the check, which can potentially trigger
a divide-by-zero exception. Some compilers will also assume that the
earlier division means the value cannot be zero later, and thus will
eliminate the subsequent zero check as dead code.
This patch moves the division to after the check.
Signed-off-by: Nickolai Zeldovich <nickolai@csail.mit.edu>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Jamie Parsons reported a problem recently, in which the re-initalization of an
association (The duplicate init case), resulted in a loss of receive window
space. He tracked down the root cause to sctp_outq_teardown, which discarded
all the data on an outq during a re-initalization of the corresponding
association, but never reset the outq->outstanding_data field to zero. I wrote,
and he tested this fix, which does a proper full re-initalization of the outq,
fixing this problem, and hopefully future proofing us from simmilar issues down
the road.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Jamie Parsons <Jamie.Parsons@metaswitch.com>
Tested-by: Jamie Parsons <Jamie.Parsons@metaswitch.com>
CC: Jamie Parsons <Jamie.Parsons@metaswitch.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
neigh->nud_state and neigh->updated are under protection of
neigh->lock.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prototype for nf_nat_snmp_hook is in nf_conntrack_snmp.h therefore
it should be included to get type checking. Found by sparse.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add the ability to set/clear labels assigned to a conntrack
via ctnetlink.
To allow userspace to only alter specific bits, Pablo suggested to add
a new CTA_LABELS_MASK attribute:
The new set of active labels is then determined via
active = (active & ~mask) ^ changeset
i.e., the mask selects those bits in the existing set that should be
changed.
This follows the same method already used by MARK and CONNMARK targets.
Omitting CTA_LABELS_MASK is the same as setting all bits in CTA_LABELS_MASK
to 1: The existing set is replaced by the one from userspace.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Introduce CTA_LABELS attribute to send a bit-vector of currently active labels
to userspace.
Future patch will permit userspace to also set/delete active labels.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
similar to connmarks, except labels are bit-based; i.e.
all labels may be attached to a flow at the same time.
Up to 128 labels are supported. Supporting more labels
is possible, but requires increasing the ct offset delta
from u8 to u16 type due to increased extension sizes.
Mapping of bit-identifier to label name is done in userspace.
The extension is enabled at run-time once "-m connlabel" netfilter
rules are added.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Most SIP devices use a source port of 5060/udp on SIP requests, so the
response automatically comes back to port 5060:
phone_ip:5060 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:5060 100 Trying
The newer Cisco IP phones, however, use a randomly chosen high source
port for the SIP request but expect the response on port 5060:
phone_ip:49173 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:5060 100 Trying
Standard Linux NAT, with or without nf_nat_sip, will send the reply back
to port 49173, not 5060:
phone_ip:49173 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:49173 100 Trying
But the phone is not listening on 49173, so it will never see the reply.
This patch modifies nf_*_sip to work around this quirk by extracting
the SIP response port from the Via: header, iff the source IP in the
packet header matches the source IP in the SIP request.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Increase the amount of memory usage limits for incomplete
IP fragments.
Arguing for new thresh high/low values:
High threshold = 4 MBytes
Low threshold = 3 MBytes
The fragmentation memory accounting code, tries to account for the
real memory usage, by measuring both the size of frag queue struct
(inet_frag_queue (ipv4:ipq/ipv6:frag_queue)) and the SKB's truesize.
We want to be able to handle/hold-on-to enough fragments, to ensure
good performance, without causing incomplete fragments to hurt
scalability, by causing the number of inet_frag_queue to grow too much
(resulting longer searches for frag queues).
For IPv4, how much memory does the largest frag consume.
Maximum size fragment is 64K, which is approx 44 fragments with
MTU(1500) sized packets. Sizeof(struct ipq) is 200. A 1500 byte
packet results in a truesize of 2944 (not 2048 as I first assumed)
(44*2944)+200 = 129736 bytes
The current default high thresh of 262144 bytes, is obviously
problematic, as only two 64K fragments can fit in the queue at the
same time.
How many 64K fragment can we fit into 4 MBytes:
4*2^20/((44*2944)+200) = 32.34 fragment in queues
An attacker could send a separate/distinct fake fragment packets per
queue, causing us to allocate one inet_frag_queue per packet, and thus
attacking the hash table and its lists.
How many frag queue do we need to store, and given a current hash size
of 64, what is the average list length.
Using one MTU sized fragment per inet_frag_queue, each consuming
(2944+200) 3144 bytes.
4*2^20/(2944+200) = 1334 frag queues -> 21 avg list length
An attack could send small fragments, the smallest packet I could send
resulted in a truesize of 896 bytes (I'm a little surprised by this).
4*2^20/(896+200) = 3827 frag queues -> 59 avg list length
When increasing these number, we also need to followup with
improvements, that is going to help scalability. Simply increasing
the hash size, is not enough as the current implementation does not
have a per hash bucket locking.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
afinfo->type_map and afinfo->mode_map deserve separated locks,
they are different things.
We should just take RCU read lock to protect afinfo itself,
but not for the inner pointers.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Routes with locked mtu should not use learned pmtu informations,
so do not update the pmtu on these routes.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The output route check was introduced with git commit 261663b0
(ipv4: Don't use the cached pmtu informations for input routes)
during times when we cached the pmtu informations on the
inetpeer. Now the pmtu informations are back in the routes,
so this check is obsolete. It also had some unwanted side effects,
as reported by Timo Teras and Lukas Tribus.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 299b0767 (ipv6: Fix IPsec slowpath fragmentation problem)
has introduced a error in the header length calculation that
provokes corrupted packets when non-fragmentable extensions
headers (Destination Option or Routing Header Type 2) are used.
rt->rt6i_nfheader_len is the length of the non-fragmentable
extension header, and it should be substracted to
rt->dst.header_len, and not to exthdrlen, as it was done before
commit 299b0767.
This patch reverts to the original and correct behavior. It has
been successfully tested with and without IPsec on packets
that include non-fragmentable extensions headers.
Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While a privileged program can open a raw socket, attach some
restrictive filter and drop its privileges (or send the socket to an
unprivileged program through some Unix socket), the filter can still
be removed or modified by the unprivileged program. This commit adds a
socket option to lock the filter (SO_LOCK_FILTER) preventing any
modification of a socket filter program.
This is similar to OpenBSD BIOCLOCK ioctl on bpf sockets, except even
root is not allowed change/drop the filter.
The state of the lock can be read with getsockopt(). No error is
triggered if the state is not changed. -EPERM is returned when a user
tries to remove the lock or to change/remove the filter while the lock
is active. The check is done directly in sk_attach_filter() and
sk_detach_filter() and does not affect only setsockopt() syscall.
Signed-off-by: Vincent Bernat <bernat@luffy.cx>
Signed-off-by: David S. Miller <davem@davemloft.net>
__dev_get_by_name() doesn't refcount the network device,
so we have to do this by ourselves. Noticed by Eric.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some new APIs will require tracing a chandef without
it being part of a channel context, so separate out
the tracing macros for that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
To ease further DFS development regarding interface combinations, use
the interface combinations structure to test for radar capabilities.
Drivers can specify which channel widths they support, and in which
modes. Right now only a single AP interface is allowed, but as the
DFS code evolves other combinations can be enabled.
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The NL80211_ATTR_USE_MFP attribute was originally added for
NL80211_CMD_ASSOCIATE, but it is actually as useful (if not even more
useful) with NL80211_CMD_CONNECT, so process that attribute with the
connect command, too.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Mesh PERR action frames are robust and thus may be encrypted, so add
proper head/tailroom to allow this. Fixes this warning when operating
a Mesh STA on ath5k:
WARNING: at net/mac80211/wpa.c:427 ccmp_encrypt_skb.isra.5+0x7b/0x1a0 [mac80211]()
Call Trace:
[<c011c5e7>] warn_slowpath_common+0x63/0x78
[<c011c60b>] warn_slowpath_null+0xf/0x13
[<e090621d>] ccmp_encrypt_skb.isra.5+0x7b/0x1a0 [mac80211]
[<e090685c>] ieee80211_crypto_ccmp_encrypt+0x1f/0x37 [mac80211]
[<e0917113>] invoke_tx_handlers+0xcad/0x10bd [mac80211]
[<e0917665>] ieee80211_tx+0x87/0xb3 [mac80211]
[<e0918932>] ieee80211_tx_pending+0xcc/0x170 [mac80211]
[<c0121c43>] tasklet_action+0x3e/0x65
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A user reported warnings in ath5k due to transmitting frames with no
rates set up. The frames were Mesh PERR frames, and some debugging
showed an empty control block with just the vif pointer:
> [ 562.522682] XXX txinfo: 00000000: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 ................
> [ 562.522688] XXX txinfo: 00000010: 00 00 00 00 00 00 00 00 54 b8 f2
> db 00 00 00 00 ........T.......
> [ 562.522693] XXX txinfo: 00000020: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 ................
Set the IEEE80211_TX_INTFL_NEED_TXPROCESSING flag to ensure that
rate control gets run before the frame is sent.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
As per email discussion Jouni Malinen pointed out that:
"P2P message exchanges can be executed on the current operating channel
of any operation (both P2P and non-P2P station). These can be on 5 GHz
and even on 60 GHz (so yes, you _can_ do GO Negotiation on 60 GHz).
As an example, it would be possible to receive a GO Negotiation Request
frame on a 5 GHz only radio and then to complete GO Negotiation on that
band. This can happen both when connected to a P2P group (through client
discoverability mechanism) and when connected to a legacy AP (assuming
the station receive Probe Request frame from full scan in the beginning
of P2P device discovery)."
This means that P2P messages can be sent over different radio devices.
However, these should use the same P2P device address so it should be
able to provision this from user-space. This patch adds a parameter for
this to struct vif_params which should only be used during creation of
the P2P device interface.
Cc: Jouni Malinen <j@w1.fi>
Cc: Greg Goldman <ggoldman@broadcom.com>
Cc: Jithu Jance <jithu@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
[add error checking]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add the nl80211_mesh_power_mode enumeration which holds possible
values for the mesh power mode. These modes are unknown, active,
light sleep and deep sleep.
Add power_mode entry to the mesh config structure to hold the
user-configured default mesh power mode. This value will be used
for new peer links.
Add the dot11MeshAwakeWindowDuration value to the mesh config.
The awake window is a duration in TU describing how long the STA
will stay awake after transmitting its beacon in PS mode.
Add access routines to:
- get/set local link-specific power mode (STA)
- get remote STA's link-specific power mode (STA)
- get remote STA's non-peer power mode (STA)
- get/set default mesh power mode (mesh config)
- get/set mesh awake window duration (mesh config)
All config changes may be done at mesh runtime and take effect
immediately.
Signed-off-by: Marco Porsch <marco@cozybit.com>
Signed-off-by: Ivan Bezyazychnyy <ivan.bezyazychnyy@gmail.com>
Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com>
[fix commit message line length, error handling in set station]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Move the default mesh beacon interval and DTIM period to cfg80211
and make them accessible to nl80211. This enables setting both
values when joining an MBSS.
Previously the DTIM parameter was not set by mac80211 so the
driver's default value was used.
Signed-off-by: Marco Porsch <marco@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This functions will be used for mesh beacons, too.
Signed-off-by: Marco Porsch <marco@cozybit.com>
[some formatting fixes]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The established peer link count is indicated in mesh beacons and
used for other internal tasks. Previously it was not updated when
authenticated peering is performed in userspace.
Signed-off-by: Marco Porsch <marco@cozybit.com>
Acked-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ranges are taken from IEEE 802.11-2012, common sense or current
implementation requirements.
Signed-off-by: Marco Porsch <marco@cozybit.com>
Acked-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
v4: hold rtnl lock for the whole netpoll_setup()
v3: remove the comment
v2: use RCU read lock
This patch fixes the following warning:
[ 72.013864] RTNL: assertion failed at net/core/dev.c (4955)
[ 72.017758] Pid: 668, comm: netpoll-prep-v6 Not tainted 3.8.0-rc1+ #474
[ 72.019582] Call Trace:
[ 72.020295] [<ffffffff8176653d>] netdev_master_upper_dev_get+0x35/0x58
[ 72.022545] [<ffffffff81784edd>] netpoll_setup+0x61/0x340
[ 72.024846] [<ffffffff815d837e>] store_enabled+0x82/0xc3
[ 72.027466] [<ffffffff815d7e51>] netconsole_target_attr_store+0x35/0x37
[ 72.029348] [<ffffffff811c3479>] configfs_write_file+0xe2/0x10c
[ 72.030959] [<ffffffff8115d239>] vfs_write+0xaf/0xf6
[ 72.032359] [<ffffffff81978a05>] ? sysret_check+0x22/0x5d
[ 72.033824] [<ffffffff8115d453>] sys_write+0x5c/0x84
[ 72.035328] [<ffffffff819789d9>] system_call_fastpath+0x16/0x1b
In case of other races, hold rtnl lock for the entire netpoll_setup() function.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow mesh interface to disable the power save which is by default
turn on in certain chipset. Testing with 2 units of ZCN-1523H-5-16
featuring AR9280 chipset which have power save enabled by default.
Constant reset if the average signal of the peer mesh STA is below
-80 dBm and power save is enabled.
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When the driver's resume function can't completely
restore the configuration in the device, it returns
1 from the callback which will be treated like a HW
restart request, but done directly.
In this case, also call the driver's restart_complete()
function so it can finish the reconfiguration there.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
My commit 529ba6e931
("mac80211: clean up association better in suspend")
introduced a bug when resuming from WoWLAN when a
device reset is desired. This case must not use the
suspend_bss_conf as it hasn't been stored.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Channel contexts are not always used with monitor interfaces. If no channel
context is set, use the oper channel, otherwise tx fails.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
[check local->use_chanctx]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since:
commit b23b025fe2
Author: Ben Greear <greearb@candelatech.com>
Date: Fri Feb 4 11:54:17 2011 -0800
mac80211: Optimize scans on current operating channel.
we do not disable PS while going back to operational channel (on
ieee80211_scan_state_suspend) and deffer that until scan finish.
But since we are allowed to send frames, we can send a frame to AP
without PM bit set, so disable PS on AP side. Then when we switch
to off-channel (in ieee80211_scan_state_resume) we do not enable PS.
Hence we are off-channel with PS disabled, frames are not buffered
by AP.
To fix remove offchannel_ps_disable argument and always enable PS when
going off-channel and disable it when going on-channel, like it was
before.
Cc: stable@vger.kernel.org # 2.6.39+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
During FT roaming, wpa_supplicant attempts to set the
key before association. This used to be rejected, but
as a side effect of my commit 66e67e4189
("mac80211: redesign auth/assoc") the key was accepted
causing hardware crypto to not be used for it as the
station isn't added to the driver yet.
It would be possible to accept the key and then add it
to the driver when the station has been added. However,
this may run into issues with drivers using the state-
based station adding if they accept the key only after
association like it used to be.
For now, revert to the behaviour from before the auth
and assoc change.
Cc: stable@vger.kernel.org
Reported-by: Cédric Debarge <cedric.debarge@acksys.fr>
Tested-by: Cédric Debarge <cedric.debarge@acksys.fr>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Similar to commit 418a99ac6a
(Replace rwlock on xfrm_policy_afinfo with rcu), the rwlock
on xfrm_state_afinfo can be replaced by RCU too.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
commit 1def9238d4 (net_sched: more precise pkt_len computation)
does a wrong computation of mac + network headers length, as it includes
the padding before the frame.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
Documentation/networking/ip-sysctl.txt
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
Both conflicts were simply overlapping context.
A build fix for qlcnic is in here too, simply removing the added
devinit annotations which no longer exist.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contains netfilter fixes for 3.8-rc3,
they are:
* fix possible BUG_ON if several netns are in use and the nf_conntrack
module is removed, initial patch from Gao feng, final patch from myself.
* fix unset return value if conntrack zone are disabled at
compile-time, reported by Borislav Petkov, fix from myself.
* fix display error message via dmesg for arp_tables, from Jan Engelhardt.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
spin_is_locked() on a non !SMP build is kind of useless.
BUG_ON(!spin_is_locked(xx)) is guaranteed to crash.
Just remove this check in reqsk_fastopen_remove() as
the callers do hold the socket lock.
Reported-by: Ketan Kulkarni <ketkulka@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Dave Taht <dave.taht@gmail.com>
Acked-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flags argument of the phy_{attach,connect,connect_direct} functions
is then used to assign a struct phy_device dev_flags with its value.
All callers but the tg3 driver pass the flag 0, which results in the
underlying PHY drivers in drivers/net/phy/ not being able to actually
use any of the flags they would set in dev_flags. This patch gets rid of
the flags argument, and passes phydev->dev_flags to the internal PHY
library call phy_attach_direct() such that drivers which actually modify
a phy device dev_flags get the value preserved for use by the underlying
phy driver.
Acked-by: Kosta Zertsekel <konszert@marvell.com>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet pointed out that act_mirred needs to find the current net_ns,
and struct net pointer is not provided in the call chain. His original
patch made use of current->nsproxy->net_ns to find the network namespace,
but this fails to work correctly for userspace code that makes use of
netlink sockets in different network namespaces. Instead, pass the
"struct net *" down along the call chain to where it is needed.
This version removes the ifb changes as Eric has submitted that patch
separately, but is otherwise identical to the previous version.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It brings the following goodies:
- LLCP socket timestamping (To be used e.g with the recently released nfctool
application for a more efficient skb timestamping when sniffing).
- A pretty big pn533 rework from Waldemar, preparing the driver to support
more flavours of pn533 based devices.
- HCI changes from Eric in preparation for the microread driver support.
- Some LLCP memory leak fixes, cleanups and slight improvements.
- pn544 and nfcwilink move to the devm_kzalloc API.
- An initial Secure Element (SE) API.
- An nfc.h license change from the original author, allowing non GPL
application code to safely include it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQ8+7TAAoJEIqAPN1PVmxKHQIP/3dfFPQsOxQRj6sIkFVE/Yzh
AomeiBh5oGsZkxWzGEolWvHU+DEYTZFz/TKyhneHtWIENTj8+ueo1dh5i35DcKvL
NiZJT3ASqyJV1ipwQG102y6J511pJsVoQkFSh0Xb/yTDNjwZnL9Jp2N3vsb3rJyN
DzqNHOx+oCZvjeoGaUzRyjgndcWzeVw0f5IuyRJlCUdh9bj3Er1uP6ugCMiUkMBH
FcY3Qwdc4WbgtpyYv+Y79/vny1kQ+JPf0Rk9VlylcFZ5RsLEc7K3G3rrTQZktlAT
/fCVxURotu8XdFf6lj0qRHLnrnTICj1sDcApVOm2XtoXicOtw0q9GaUJVvgPChkc
vJ2bAYrWMeQ1FZJQt5DaQdsfsglq64ROiAlI8/s9upKP3+Pt0HNnKqUXEZVYTnxZ
wgFVj20nO2vl5tmI3Z65ZyA1cJ1uSsOcCH8sB7V+OJYQoSKWVyxJw5AOHHh3tHz7
+JfNrfTvIYG5woUivFmpdVslOHXMCe+lUfrXbvNCfF1PFsTUaQWa/dpxq6/pD991
eTn5uP+UoJqL5oiYQJzwbKMvQ+3qGPxQuaVhbicRQRiCYA3yflg43iN8aSe4ARzs
5dxY66WZgmAG1gamKxx7tJiPmkmgrrd/jAztLHdEZLmYnDUr+yQoazkaTOcWUCBt
J7RoY3HcZsiDYwuB/D14
=YeWq
-----END PGP SIGNATURE-----
Merge tag 'nfc-next-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next
Samuel Ortiz <sameo@linux.intel.com> says:
"This is the first NFC patchset targeted at the 3.9 merge window.
It brings the following goodies:
- LLCP socket timestamping (To be used e.g with the recently released nfctool
application for a more efficient skb timestamping when sniffing).
- A pretty big pn533 rework from Waldemar, preparing the driver to support
more flavours of pn533 based devices.
- HCI changes from Eric in preparation for the microread driver support.
- Some LLCP memory leak fixes, cleanups and slight improvements.
- pn544 and nfcwilink move to the devm_kzalloc API.
- An initial Secure Element (SE) API.
- An nfc.h license change from the original author, allowing non GPL
application code to safely include it."
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The only user is cxgb3 driver.
old_neigh is used to check device change, but it must not happen
on redirect. In this sense, we can remove old_neigh argument.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix regression allowing IP_TTL setting of zero, fix from Cong Wang.
2) Fix leak regressions in tunap, from Jason Wang.
3) be2net driver always returns IRQ_HANDLED in INTx handler, fix from
Sathya Perla.
4) qlge doesn't really support NETIF_F_TSO6, don't set that flag. Fix
from Amerigo Wang.
5) Add 802.11ad Atheros wil6210 driver, from Vladimir Kondratiev.
6) Fix MTU calculations in mac80211 layer, from T Krishna Chaitanya.
7) Station info layer of mac80211 needs to use del_timer_sync(), from
Johannes Berg.
8) tcp_read_sock() can loop forever, because we don't immediately stop
when recv_actor() returns zero. Fix from Eric Dumazet.
9) Fix WARN_ON() in tcp_cleanup_rbuf(). We have to use sk_eat_skb() in
tcp_recv_skb() to handle the case where a large GRO packet is split
up while it is use by a splice() operation. Fix also from Eric
Dumazet.
10) addrconf_get_prefix_route() in ipv6 tests flags incorrectly, it
does:
if (X && (p->flags & Y) != 0)
when it really meant to go:
if (X && (p->flags & X) != 0)
fix from Romain Kuntz.
11) Fix lost Kconfig dependency for bfin_mac driver hardware
timestamping. From Lars-Peter Clausen.
12) Fix regression in handling of RST without ACK in TCP, from Eric
Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits)
be2net: fix unconditionally returning IRQ_HANDLED in INTx
tuntap: fix leaking reference count
tuntap: forbid calling TUNSETIFF when detached
tuntap: switch to use rtnl_dereference()
net, wireless: overwrite default_ethtool_ops
qlge: remove NETIF_F_TSO6 flag
tcp: accept RST without ACK flag
net: ethernet: xilinx: Do not use NO_IRQ in axienet
net: ethernet: xilinx: Do not use axienet on PPC
bnx2x: Allow management traffic after boot from SAN
bnx2x: Fix fastpath structures when memory allocation fails
bfin_mac: Restore hardware time-stamping dependency on BF518
tun: avoid owner checks on IFF_ATTACH_QUEUE
bnx2x: move debugging code before the return
tuntap: refuse to re-attach to different tun_struct
ipv6: use addrconf_get_prefix_route for prefix route lookup [v2]
ipv6: fix the noflags test in addrconf_get_prefix_route
tcp: fix splice() and tcp collapsing interaction
tcp: splice: fix an infinite loop in tcp_read_sock()
net: prevent setting ttl=0 via IP_TTL
...
- use per_cpu_add when possible
- prevent the TT component to add multicast address as "mesh clients"
- some debug output improvements
- proper lockdeps class initializations
- new style fixes (space before/after brackets)
- other minor fixes and refactoring
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJQ8y1GAAoJEADl0hg6qKeOgegQAKq0TMD2XVDpFthqwszcEZ7o
1zunQ/UGp1+8ZZ+GiJ3B8VxbFpq4MYybC/J2eow7WmlDPMGmRCS5dTsYn1tkYiwY
MgBGhzuoDLcSpKZsJTrfzu8l6CkgdRHf3hLVON/UNu5SDueEXxPO7cRCrOtfNR/g
auFxNIfCderWaBiRmSQ2BZdAhSwdsh0igqg87YVETuXJrFBG4yLTAEerxOU2m+eC
v2YrNI+tRIlCoE8N8o3cJwkpCml+teFElA0Z7KxswHzmTIipkALd/LWBaU/GDKWS
Pto5n9bH2oIS4VywxH3OB5Gd8NNnEPlxUblupZljlcHLJiVD06qFYCQx0/iVon5j
Dbcfc9P9RSL4pp1zrOR6VvtOmrrtn06HXVdIT0Fse9ZjgaHX0WzdMV2ZM830xmdq
AslxOcZsuQMXmiPr8pwmHvIwSfVnWPmQpkXyP8NCYIqM6NYzznAuLdnbT6bix1wC
sPPtNEip2HW0d524qWiyVDrSuJ44tndk6t6Ycjw0uv/KnGX4XWUi4XaR6/qVOCES
EmGVt+cQhqPTl+qzXsyfGoyE5cieU0BfTyvQ//5JS/At9H4VGNGsBvJx3T2HD0+0
fkP3uu9EvChZWDKGQbJE3FMeBG8v+xtQIIS81UCRrryIpkCY4Txl264wdvDcjbup
DzS6fJgDo2WnRNs8JPT8
=G3Vh
-----END PGP SIGNATURE-----
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
Included changes:
- use per_cpu_add when possible
- prevent the TT component to add multicast address as "mesh clients"
- some debug output improvements
- proper lockdeps class initializations
- new style fixes (space before/after brackets)
- other minor fixes and refactoring
Signed-off-by: David S. Miller <davem@davemloft.net>
Router Alert option is very small and we can store the value
itself in the skb.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move generalized version of ipv6_is_mld() to header,
and use it from ip6_mc_input().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7a3198a8 ("ipv6: helper function to get tclass") introduced
ipv6_tclass(), but similar function is already available as
ipv6_get_dsfield().
We might be able to call ipv6_tclass() from ipv6_get_dsfield(),
but it is confusing to have two versions.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is not only for readability but also for optimization.
What we do here is to build the 32bit word at the beginning of the ipv6
header (the "ip6_flow" virtual member of struct ip6_hdr in RFC3542) and
we do not need to read the tclass portion of the target buffer.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
arptables 0.0.4 (released on 10th Jan 2013) supports calling the
CLASSIFY target, but on adding a rule to the wrong chain, the
diagnostic is as follows:
# arptables -A INPUT -j CLASSIFY --set-class 0:0
arptables: Invalid argument
# dmesg | tail -n1
x_tables: arp_tables: CLASSIFY target: used from hooks
PREROUTING, but only usable from INPUT/FORWARD
This is incorrect, since xt_CLASSIFY.c does specify
(1 << NF_ARP_OUT) | (1 << NF_ARP_FORWARD).
This patch corrects the x_tables diagnostic message to print the
proper hook names for the NFPROTO_ARP case.
Affects all kernels down to and including v2.6.31.
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
canqun zhang reported that we're hitting BUG_ON in the
nf_conntrack_destroy path when calling kfree_skb while
rmmod'ing the nf_conntrack module.
Currently, the nf_ct_destroy hook is being set to NULL in the
destroy path of conntrack.init_net. However, this is a problem
since init_net may be destroyed before any other existing netns
(we cannot assume any specific ordering while releasing existing
netns according to what I read in recent emails).
Thanks to Gao feng for initial patch to address this issue.
Reported-by: canqun zhang <canqunzhang@gmail.com>
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
in bat_iv_ogm.c a debug message should print "tq" instead of "td"
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The data argument in each hash function should carry the
"const" qualifier as it is never modified.
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
When the Bridge Loop Avoidance component is not compiled-in, its boolean switch
should be not compiled as well. This patch surrounds the switch with a proper
ifdef.
This behaviour was introduced by 9fd6b0615b5499b270d39a92b8790e206cf75833
("batman-adv: add bridge loop avoidance compile option")
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
debugfs_remove_recursive() checks whether its argument is not null
on its own, therefore it is possible to remove the external check.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Different hashes have the same class key because they get
initialised with the same one. For this reason lockdep can create
false warning when they are used recursively.
Re-initialise the key for each hash after the invocation to hash_new()
to avoid this problem.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Tested-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The flag field of the tt_local_entry->common structure in
tt_local_add() is first assigned NO_FLAGS and then TT_CLIENT_NEW so
nullifying the first operation. For this reason it is safe to remove
the first assignment.
This was introuduced by ("batman-adv: keep local table consistency for
further TT_RESPONSE")
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Values are printed in hexadecimal format in several points in the
code, but they are not printed using the same format string.
This patches unifies the format used for such numbers so that they
look the same everywhere.
Given the fact that all the variables printed as hexadecimal are 16
bit long, this is the chosen printing format: %#.4x
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
To simplify debugging operations, it is better to print the related
CRC together with the translation table (local CRC for the local
table and global CRC for each entry in the global table)
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>