Commit Graph

1626 Commits

Author SHA1 Message Date
Florian Westphal
97d32cf944 netfilter: nfnetlink_queue: batch verdict support
Introduces a new nfnetlink type that applies a given
verdict to all queued packets with an id <= the id in the verdict
message.

If a mark is provided it is applied to all matched packets.

This reduces the number of verdicts that have to be sent.
Applications that make use of this feature need to maintain
a timeout to send a batchverdict periodically to avoid starvation.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-19 11:46:33 +02:00
Eric Dumazet
5863702a34 netfilter: nfnetlink_queue: assert monotonic packet ids
Packet identifier is currently setup in nfqnl_build_packet_message(),
using one atomic_inc_return().

Problem is that since several cpus might concurrently call
nfqnl_enqueue_packet() for the same queue, we can deliver packets to
consumer in non monotonic way (packet N+1 being delivered after packet
N)

This patch moves the packet id setup from nfqnl_build_packet_message()
to nfqnl_enqueue_packet() to guarantee correct delivery order.

This also removes one atomic operation.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-19 11:44:17 +02:00
Eric Dumazet
84a797dd0b netfilter: nfnetlink_queue: provide rcu enabled callbacks
nenetlink_queue operations on SMP are not efficent if several queues are
used, because of nfnl_mutex contention when applications give packet
verdict.

Use new call_rcu field in struct nfnl_callback to advertize a callback
that is called under rcu_read_lock instead of nfnl_mutex.

On my 2x4x2 machine, I was able to reach 2.000.000 pps going through
user land returning NF_ACCEPT verdicts without losses, instead of less
than 500.000 pps before patch.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-18 16:08:27 +02:00
Eric Dumazet
6b75e3e8d6 netfilter: nfnetlink: add RCU in nfnetlink_rcv_msg()
Goal of this patch is to permit nfnetlink providers not mandate
nfnl_mutex being held while nfnetlink_rcv_msg() calls them.

If struct nfnl_callback contains a non NULL call_rcu(), then
nfnetlink_rcv_msg() will use it instead of call() field, holding
rcu_read_lock instead of nfnl_mutex

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-18 16:08:07 +02:00
Mr Dash Four
131ad62d8f netfilter: add SELinux context support to AUDIT target
In this revision the conversion of secid to SELinux context and adding it
to the audit log is moved from xt_AUDIT.c to audit.c with the aid of a
separate helper function - audit_log_secctx - which does both the conversion
and logging of SELinux context, thus also preventing internal secid number
being leaked to userspace. If conversion is not successful an error is raised.

With the introduction of this helper function the work done in xt_AUDIT.c is
much more simplified. It also opens the possibility of this helper function
being used by other modules (including auditd itself), if desired. With this
addition, typical (raw auditd) output after applying the patch would be:

type=NETFILTER_PKT msg=audit(1305852240.082:31012): action=0 hook=1 len=52 inif=? outif=eth0 saddr=10.1.1.7 daddr=10.1.2.1 ipid=16312 proto=6 sport=56150 dport=22 obj=system_u:object_r:ssh_client_packet_t:s0
type=NETFILTER_PKT msg=audit(1306772064.079:56): action=0 hook=3 len=48 inif=eth0 outif=? smac=00:05:5d:7c:27:0b dmac=00:02:b3:0a:7f:81 macproto=0x0800 saddr=10.1.2.1 daddr=10.1.1.7 ipid=462 proto=6 sport=22 dport=3561 obj=system_u:object_r:ssh_server_packet_t:s0

Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-30 13:31:57 +02:00
Jozsef Kadlecsik
15b4d93f03 netfilter: ipset: whitespace and coding fixes detected by checkpatch.pl
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 19:01:26 +02:00
Jozsef Kadlecsik
e385357a2f netfilter: ipset: hash:net,iface type introduced
The hash:net,iface type makes possible to store network address and
interface name pairs in a set. It's mostly suitable for egress
and ingress filtering. Examples:

        # ipset create test hash:net,iface
        # ipset add test 192.168.0.0/16,eth0
        # ipset add test 192.168.0.0/24,eth1

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 19:00:48 +02:00
Jozsef Kadlecsik
9b03a5ef49 netfilter: ipset: use the stored first cidr value instead of '1'
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:58:20 +02:00
Jozsef Kadlecsik
9d8832320f netfilter: ipset: fix return code for destroy when sets are in use
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:57:44 +02:00
Jozsef Kadlecsik
b66554cf03 netfilter: ipset: add xt_action_param to the variant level kadt functions, ipset API change
With the change the sets can use any parameter available for the match
and target extensions, like input/output interface. It's required for
the hash:net,iface set type.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:56:47 +02:00
Jozsef Kadlecsik
e6146e8684 netfilter: ipset: use unified from/to address masking and check the usage
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:55:58 +02:00
Jozsef Kadlecsik
f3dfd1538f netfilter: ipset: take into account cidr value for the from address when creating the set
When creating a set from a range expressed as a network like
10.1.1.172/29, the from address was taken as the IP address part and
not masked with the netmask from the cidr.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:54:43 +02:00
Jozsef Kadlecsik
d0d9e0a5a8 netfilter: ipset: support range for IPv4 at adding/deleting elements for hash:*net* types
The range internally is converted to the network(s) equal to the range.
Example:

	# ipset new test hash:net
	# ipset add test 10.2.0.0-10.2.1.12
	# ipset list test
	Name: test
	Type: hash:net
	Header: family inet hashsize 1024 maxelem 65536
	Size in memory: 16888
	References: 0
	Members:
	10.2.1.12
	10.2.1.0/29
	10.2.0.0/24
	10.2.1.8/30

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:52:41 +02:00
Jozsef Kadlecsik
f1e00b3979 netfilter: ipset: set type support with multiple revisions added
A set type may have multiple revisions, for example when syntax is
extended. Support continuous revision ranges in set types.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:51:41 +02:00
Jozsef Kadlecsik
3d14b171f0 netfilter: ipset: fix adding ranges to hash types
When ranges are added to hash types, the elements may trigger rehashing
the set. However, the last successfully added element was not kept track
so the adding started again with the first element after the rehashing.

Bug reported by Mr Dash Four.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:49:17 +02:00
Jozsef Kadlecsik
c1e2e04388 netfilter: ipset: support listing setnames and headers too
Current listing makes possible to list sets with full content only.
The patch adds support partial listings, i.e. listing just
the existing setnames or listing set headers, without set members.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:47:07 +02:00
Jozsef Kadlecsik
ac8cc925d3 netfilter: ipset: options and flags support added to the kernel API
The support makes possible to specify the timeout value for
the SET target and a flag to reset the timeout for already existing
entries.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:42:40 +02:00
Jozsef Kadlecsik
483e9ea357 netfilter: ipset: whitespace fixes: some space before tab slipped in
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:41:53 +02:00
Jozsef Kadlecsik
5416219e5c netfilter: ipset: timeout can be modified for already added elements
When an element to a set with timeout added, one can change the timeout
by "readding" the element with the "-exist" flag. That means the timeout
value is reset to the specified one (or to the default from the set
specification if the "timeout n" option is not used). Example

ipset add foo 1.2.3.4 timeout 10
ipset add foo 1.2.3.4 timeout 600 -exist

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:40:55 +02:00
Patrick McHardy
619c15171f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next-2.6 2011-06-16 17:05:24 +02:00
Patrick McHardy
1f2d9c9dd8 Merge branch 'master' of /repos/git/net-next-2.6 2011-06-16 17:01:10 +02:00
Hans Schillstrom
6c8f794993 IPVS: remove unused init and cleanup functions.
After restructuring, there is some unused or empty functions
left to be removed.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-14 09:07:32 +09:00
Hans Schillstrom
552ad65aa5 IPVS: labels at pos 0
Put goto labels at the beginig of row
acording to coding style example.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-14 09:07:25 +09:00
Hans Schillstrom
503cf15a5e IPVS: rename of netns init and cleanup functions.
Make it more clear what the functions does,
on request by Julian.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13 17:10:09 +09:00
Julian Anastasov
c3aa1bd376 ipvs: support more FTP PASV responses
Change the parsing of FTP commands and responses to
support skip character. It allows to detect variations in
the 227 PASV response.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13 10:03:01 +09:00
Greg Rose
c7ac8679be rtnetlink: Compute and store minimum ifinfo dump size
The message size allocated for rtnl ifinfo dumps was limited to
a single page.  This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-06-09 20:38:07 -07:00
Hans Schillstrom
c74c0bfe0b IPVS: bug in ip_vs_ftp, same list heaad used in all netns.
When ip_vs was adapted to netns the ftp application was not adapted
in a correct way.
However this is a fix to avoid kernel errors. In the long term another solution
might be chosen.  I.e the ports that the ftp appl, uses should be per netns.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-27 13:37:46 +02:00
Jozsef Kadlecsik
9184a9cba6 netfilter: ipset: fix ip_set_flush return code
ip_set_flush returned -EPROTO instead of -IPSET_ERR_PROTOCOL, fixed

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-26 19:08:11 +02:00
Linus Torvalds
57d19e80f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  b43: fix comment typo reqest -> request
  Haavard Skinnemoen has left Atmel
  cris: typo in mach-fs Makefile
  Kconfig: fix copy/paste-ism for dell-wmi-aio driver
  doc: timers-howto: fix a typo ("unsgined")
  perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
  md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
  treewide: fix a few typos in comments
  regulator: change debug statement be consistent with the style of the rest
  Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
  audit: acquire creds selectively to reduce atomic op overhead
  rtlwifi: don't touch with treewide double semicolon removal
  treewide: cleanup continuations and remove logging message whitespace
  ath9k_hw: don't touch with treewide double semicolon removal
  include/linux/leds-regulator.h: fix syntax in example code
  tty: fix typo in descripton of tty_termios_encode_baud_rate
  xtensa: remove obsolete BKL kernel option from defconfig
  m68k: fix comment typo 'occcured'
  arch:Kconfig.locks Remove unused config option.
  treewide: remove extra semicolons
  ...
2011-05-23 09:12:26 -07:00
Linus Torvalds
06f4e926d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
  macvlan: fix panic if lowerdev in a bond
  tg3: Add braces around 5906 workaround.
  tg3: Fix NETIF_F_LOOPBACK error
  macvlan: remove one synchronize_rcu() call
  networking: NET_CLS_ROUTE4 depends on INET
  irda: Fix error propagation in ircomm_lmp_connect_response()
  irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
  irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
  be2net: Kill set but unused variable 'req' in lancer_fw_download()
  irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
  atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
  rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
  pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
  isdn: capi: Use pr_debug() instead of ifdefs.
  tg3: Update version to 3.119
  tg3: Apply rx_discards fix to 5719/5720
  ...

Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
as per Davem.
2011-05-20 13:43:21 -07:00
Linus Torvalds
eb04f2f04e Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits)
  Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
  net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
  batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
  batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
  batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
  net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
  net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
  net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
  perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
  perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
  net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
  security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
  net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
  net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
  net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
  ...
2011-05-19 18:14:34 -07:00
David S. Miller
9cbc94eabb Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/vmxnet3/vmxnet3_ethtool.c
	net/core/dev.c
2011-05-17 17:33:11 -04:00
David S. Miller
30b9284db3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-05-17 14:18:25 -04:00
Patrick McHardy
e6e4d9ed11 netfilter: nf_ct_sip: fix SDP parsing in TCP SIP messages for some Cisco phones
Some Cisco phones do not place the Content-Length field at the end of the
SIP message. This is valid, due to a misunderstanding of the specification
the parser expects the SDP body to start directly after the Content-Length
field. Fix the parser to scan for \r\n\r\n to locate the beginning of the
SDP body.

Reported-by: Teresa Kang <teresa_kang@gemtek.com.tw>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-05-16 14:45:39 +02:00
Patrick McHardy
274ea0e2a4 netfilter: nf_ct_sip: validate Content-Length in TCP SIP messages
Verify that the message length of a single SIP message, which is calculated
based on the Content-Length field contained in the SIP message, does not
exceed the packet boundaries.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-05-16 14:42:26 +02:00
Hans Schillstrom
0f08190fe8 IPVS: fix netns if reading ip_vs_* procfs entries
Without this patch every access to ip_vs in procfs will increase
the netns count i.e. an unbalanced get_net()/put_net().
(ipvsadm commands also use procfs.)
The result is you can't exit a netns if reading ip_vs_* procfs entries.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-15 17:27:18 +02:00
Julian Anastasov
c92f5ca2e5 ipvs: Remove all remaining references to rt->rt_{src,dst}
Remove all remaining references to rt->rt_{src,dst}
by using dest->dst_saddr to cache saddr (used for TUN mode).
For ICMP in FORWARD hook just restrict the rt_mode for NAT
to disable LOCALNODE. All other modes do not allow
IP_VS_RT_MODE_RDR, so we should be safe with the ICMP
forwarding. Using cp->daddr as replacement for rt_dst
is safe for all modes except BYPASS, even when cp->dest is
NULL because it is cp->daddr that is used to assign cp->dest
for sync-ed connections.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12 18:24:46 -04:00
David S. Miller
44e3125ccd ipvs: Eliminate rt->rt_dst usage in __ip_vs_get_out_rt().
We can simply track what destination address is used based upon which
code block is taken at the top of the function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12 18:23:23 -04:00
David S. Miller
e58b34425b ipvs: Use IP_VS_RT_MODE_* instead of magic constants.
[ Add some cases I missed, from Julian Anastasov ]

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12 18:22:34 -04:00
David S. Miller
3c709f8fb4 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-3.6
Conflicts:
	drivers/net/benet/be_main.c
2011-05-11 14:26:58 -04:00
Pablo Neira Ayuso
93bbce1ad0 netfilter: revert a2361c8735
This patch reverts a2361c8735:
"[PATCH] netfilter: xt_conntrack: warn about use in raw table"

Florian Wesphal says:
"... when the packet was sent from the local machine the skb
already has ->nfct attached, and -m conntrack seems to do
the right thing."

Acked-by: Jan Engelhardt <jengelh@medozas.de>
Reported-by: Florian Wesphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-10 12:13:36 +02:00
Fernando Luis Vazquez Cao
1ed2f73d90 netfilter: IPv6: fix DSCP mangle code
The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-10 10:00:21 +02:00
Hans Schillstrom
7a4f0761fc IPVS: init and cleanup restructuring
DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.

IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.

This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.

The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-05-10 09:52:47 +02:00
Hans Schillstrom
1ae132b034 IPVS: Change of socket usage to enable name space exit.
If the sync daemons run in a name space while it crashes
or get killed, there is no way to stop them except for a reboot.
When all patches are there, ip_vs_core will handle register_pernet_(),
i.e. ip_vs_sync_init() and ip_vs_sync_cleanup() will be removed.

Kernel threads should not increment the use count of a socket.
By calling sk_change_net() after creating a socket this is avoided.
sock_release cant be used intead sk_release_kernel() should be used.

Thanks Eric W Biederman for your advices.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-05-10 09:52:33 +02:00
Eric Dumazet
5a6351eecf netfilter: fix ebtables compat support
commit 255d0dc340 (netfilter: x_table: speedup compat operations)
made ebtables not working anymore.

1) xt_compat_calc_jump() is not an exact match lookup
2) compat_table_info() has a typo in xt_compat_init_offsets() call
3) compat_do_replace() misses a xt_compat_init_offsets() call

Reported-by: dann frazier <dannf@dannf.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-05-10 09:48:59 +02:00
Pablo Neira Ayuso
315c34dae0 netfilter: ctnetlink: fix timestamp support for new conntracks
This patch fixes the missing initialization of the start time if
the timestamp support is enabled.

libnetfilter_conntrack/utils# conntrack -E &
libnetfilter_conntrack/utils# ./conntrack_create
tcp      6 109 ESTABLISHED src=1.1.1.1 dst=2.2.2.2 sport=1025 dport=21 packets=0 bytes=0 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=21 dport=1025 packets=0 bytes=0 mark=0 delta-time=1303296401 use=2

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-05-10 09:48:06 +02:00
Hans Schillstrom
74973f6fbf IPVS: init and cleanup restructuring
DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.

IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.

This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.

The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-05-09 07:08:30 +09:00
Hans Schillstrom
421eab4cf3 IPVS: Change of socket usage to enable name space exit.
If the sync daemons run in a name space while it crashes
or get killed, there is no way to stop them except for a reboot.
When all patches are there, ip_vs_core will handle register_pernet_(),
i.e. ip_vs_sync_init() and ip_vs_sync_cleanup() will be removed.

Kernel threads should not increment the use count of a socket.
By calling sk_change_net() after creating a socket this is avoided.
sock_release cant be used intead sk_release_kernel() should be used.

Thanks Eric W Biederman for your advices.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-05-09 07:05:35 +09:00
Lai Jiangshan
88b4a0347a net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
The rcu callback xt_osf_finger_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(xt_osf_finger_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-07 22:51:12 -07:00
Lai Jiangshan
1f8d36a186 net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
The rcu callback __nf_ct_ext_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(__nf_ct_ext_free_rcu).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-07 22:51:07 -07:00