/*
 * net/tipc/name_table.c: TIPC name table code
 *
 * Copyright (c) 2000-2006, 2014-2018, Ericsson AB
 * Copyright (c) 2004-2008, 2010-2014, Wind River Systems
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. Neither the names of the copyright holders nor the names of its
 *    contributors may be used to endorse or promote products derived from
 *    this software without specific prior written permission.
 *
 * Alternatively, this software may be distributed under the terms of the
 * GNU General Public License ("GPL") version 2 as published by the Free
 * Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 */

#include <net/sock.h>
#include <linux/list_sort.h>
#include <linux/rbtree_augmented.h>
#include "core.h"
#include "netlink.h"
#include "name_table.h"
#include "name_distr.h"
#include "subscr.h"
#include "bcast.h"
#include "addr.h"
#include "node.h"
#include "group.h"

/**
 * struct service_range - container for all bindings of a service range
 * @lower: service range lower bound
 * @upper: service range upper bound
 * @tree_node: member of service range RB tree
 * @max: largest 'upper' in this node subtree
 * @local_publ: list of identical publications made from this node
 *   Used by closest_first lookup and multicast lookup algorithm
 * @all_publ: all publications identical to this one, whatever node and scope
 *   Used by round-robin lookup algorithm
 */
struct service_range {
	u32 lower;
	u32 upper;
	struct rb_node tree_node;
	u32 max;
	struct list_head local_publ;
	struct list_head all_publ;
};

/**
 * struct tipc_service - container for all published instances of a service type
 * @type: 32 bit 'type' value for service
 * @publ_cnt: increasing counter for publications in this service
 * @ranges: rb tree containing all service ranges for this service
 * @service_list: links to adjacent name ranges in hash chain
 * @subscriptions: list of subscriptions for this service type
 * @lock: spinlock controlling access to pertaining service ranges/publications
 * @rcu: RCU callback head used for deferred freeing
 */
struct tipc_service {
	u32 type;
	u32 publ_cnt;
	struct rb_root ranges;
	struct hlist_node service_list;
	struct list_head subscriptions;
	spinlock_t lock; /* Covers service range list */
	struct rcu_head rcu;
};

#define service_range_upper(sr) ((sr)->upper)
RB_DECLARE_CALLBACKS_MAX(static, sr_callbacks,
			 struct service_range, tree_node, u32, max,
			 service_range_upper)

#define service_range_entry(rbtree_node)	\
	(container_of(rbtree_node, struct service_range, tree_node))

#define service_range_overlap(sr, start, end)	\
	((sr)->lower <= (end) && (sr)->upper >= (start))

/**
 * service_range_foreach_match - iterate over tipc service rbtree for each
 *                               range match
 * @sr: the service range pointer as a loop cursor
 * @sc: the pointer to tipc service which holds the service range rbtree
 * @start, end: the range (end >= start) for matching
 */
#define service_range_foreach_match(sr, sc, start, end)	\
	for (sr = service_range_match_first((sc)->ranges.rb_node,	\
					    start,	\
					    end);	\
	     sr;	\
	     sr = service_range_match_next(&(sr)->tree_node,	\
					   start,	\
					   end))

/**
 * service_range_match_first - find first service range matching a range
 * @n: the root node of service range rbtree for searching
 * @start, end: the range (end >= start) for matching
 *
 * Return: the leftmost service range node in the rbtree that overlaps the
 * specific range if any. Otherwise, returns NULL.
 */
static struct service_range *service_range_match_first(struct rb_node *n,
						       u32 start, u32 end)
{
	struct service_range *sr;
	struct rb_node *l, *r;

	/* No overlap anywhere in the tree? */
	if (!n || service_range_entry(n)->max < start)
		return NULL;

	while (n) {
		l = n->rb_left;
		if (l && service_range_entry(l)->max >= start) {
			/* A leftmost overlapping node, if any, must be in the
			 * left subtree. If it is not, that node has
			 * lower > end, so no node on the right side can
			 * satisfy the condition either.
			 */
			n = l;
			continue;
		}

		/* No node in the left subtree can match; return this node if
		 * it overlaps, since it is then the leftmost match.
		 */
		sr = service_range_entry(n);
		if (service_range_overlap(sr, start, end))
			return sr;

		/* Otherwise, try to look up on the right side */
		r = n->rb_right;
		if (sr->lower <= end &&
		    r && service_range_entry(r)->max >= start) {
			n = r;
			continue;
		}
		break;
	}

	return NULL;
}

/**
 * service_range_match_next - find next service range matching a range
 * @n: a node in service range rbtree from which the searching starts
 * @start, end: the range (end >= start) for matching
 *
 * Return: the next service range node to the given node in the rbtree that
 * overlaps the specific range if any. Otherwise, returns NULL.
 */
static struct service_range *service_range_match_next(struct rb_node *n,
						      u32 start, u32 end)
{
	struct service_range *sr;
	struct rb_node *p, *r;

	while (n) {
		r = n->rb_right;
		if (r && service_range_entry(r)->max >= start)
			/* A next overlap range node must be one in the right
			 * subtree. If not, it has lower > end, then any next
			 * successor (- an ancestor) of this node cannot
			 * satisfy the condition either.
			 */
			return service_range_match_first(r, start, end);

		/* No one in the right subtree can match, go up to find an
		 * ancestor of this node which is parent of a left-hand child.
		 */
		while ((p = rb_parent(n)) && n == p->rb_right)
			n = p;
		if (!p)
			break;

		/* Return if this ancestor is an overlap */
		sr = service_range_entry(p);
		if (service_range_overlap(sr, start, end))
			return sr;

		/* Ok, try to lookup more from this ancestor */
		if (sr->lower <= end) {
			n = p;
			continue;
		}
		break;
	}

	return NULL;
}

static int hash(int x)
{
	return x & (TIPC_NAMETBL_SIZE - 1);
}

/**
 * tipc_publ_create - create a publication structure
 */
static struct publication *tipc_publ_create(u32 type, u32 lower, u32 upper,
					    u32 scope, u32 node, u32 port,
					    u32 key)
{
	struct publication *publ = kzalloc(sizeof(*publ), GFP_ATOMIC);

	if (!publ)
		return NULL;

	publ->type = type;
	publ->lower = lower;
	publ->upper = upper;
	publ->scope = scope;
	publ->node = node;
	publ->port = port;
	publ->key = key;
	INIT_LIST_HEAD(&publ->binding_sock);
	INIT_LIST_HEAD(&publ->binding_node);
	INIT_LIST_HEAD(&publ->local_publ);
	INIT_LIST_HEAD(&publ->all_publ);
	INIT_LIST_HEAD(&publ->list);
	return publ;
}

/**
 * tipc_service_create - create a service structure for the specified 'type'
 *
 * Allocates a zero-initialized service structure and adds it to the hash
 * chain @hd.
 */
static struct tipc_service *tipc_service_create(u32 type, struct hlist_head *hd)
{
	struct tipc_service *service = kzalloc(sizeof(*service), GFP_ATOMIC);

	if (!service) {
		pr_warn("Service creation failed, no memory\n");
		return NULL;
	}

	spin_lock_init(&service->lock);
	service->type = type;
	service->ranges = RB_ROOT;
	INIT_HLIST_NODE(&service->service_list);
	INIT_LIST_HEAD(&service->subscriptions);
	hlist_add_head_rcu(&service->service_list, hd);
	return service;
}

/* tipc_service_find_range - find service range matching publication parameters
|
|
|
|
*/
|
|
|
|
static struct service_range *tipc_service_find_range(struct tipc_service *sc,
|
|
|
|
u32 lower, u32 upper)
|
|
|
|
{
|
|
|
|
struct service_range *sr;
|
|
|
|
|
tipc: fix name table rbtree issues
The current rbtree for service ranges in the name table is built based
on the 'lower' & 'upper' range values resulting in a flaw in the rbtree
searching. Some issues have been observed in case of range overlapping:
Case #1: unable to withdraw a name entry:
After some name services are bound, all of them are withdrawn by user
but one remains in the name table forever. This corrupts the table and
that service becomes dummy i.e. no real port.
E.g.
/
{22, 22}
/
/
---> {10, 50}
/ \
/ \
{10, 30} {20, 60}
The node {10, 30} cannot be removed since the rbtree searching stops at
the node's ancestor i.e. {10, 50}, so starting from it will never reach
the finding node.
Case #2: failed to send data in some cases:
E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for
this service will be one of the two cases below depending on the order
of the bindings:
{20, 60} {10, 50} <--
/ \ / \
/ \ / \
{10, 50} NIL <-- NIL {20, 60}
(a) (b)
Now, try to send some data to service {30}, there will be two results:
(a): Failed, no route to host.
(b): Ok.
The reason is that the rbtree searching will stop at the pointing node
as shown above.
Case #3: Same as case #2b above but if the data sending's scope is
local and the {10, 50} is published by a peer node, then it will result
in 'no route to host' even though the other {20, 60} is for example on
the local node which should be able to get the data.
The issues are actually due to the way we built the rbtree. This commit
fixes it by introducing an additional field to each node - named 'max',
which is the largest 'upper' of that node subtree. The 'max' value for
each subtrees will be propagated correctly whenever a node is inserted/
removed or the tree is rebalanced by the augmented rbtree callbacks.
By this way, we can change the rbtree searching appoarch to solve the
issues above. Another benefit from this is that we can now improve the
searching for a next range matching e.g. in case of multicast, so get
rid of the unneeded looping over all nodes in the tree.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-10 16:21:02 +08:00
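The effect of the 'max' field described in the commit message above can be illustrated with a small user-space sketch. It is not the kernel implementation: it uses a plain, unbalanced binary search tree instead of the augmented rbtree, and the names range_node, range_insert and range_first_match are hypothetical. The insert descends the same way as tipc_service_create_range() (raise 'max' on the way down, go left when lower <= node's lower), and the lookup prunes any subtree whose 'max' lies below the requested start.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct service_range */
struct range_node {
	unsigned int lower, upper;
	unsigned int max;		/* largest 'upper' in this subtree */
	struct range_node *left, *right;
};

/* Insert [lower, upper]; ancestors get their 'max' raised on the way down.
 * Assumption: no rebalancing, so no augmented callbacks are needed here.
 */
static struct range_node *range_insert(struct range_node **root,
				       unsigned int lower, unsigned int upper)
{
	struct range_node **link = root;
	struct range_node *n;

	assert(lower <= upper);
	n = calloc(1, sizeof(*n));
	if (!n)
		return NULL;
	n->lower = lower;
	n->upper = upper;
	n->max = upper;

	while (*link) {
		struct range_node *cur = *link;

		if (cur->max < upper)
			cur->max = upper;
		link = (lower <= cur->lower) ? &cur->left : &cur->right;
	}
	*link = n;
	return n;
}

/* Return the leftmost node overlapping [start, end], or NULL */
static struct range_node *range_first_match(struct range_node *n,
					    unsigned int start,
					    unsigned int end)
{
	struct range_node *hit;

	/* Nothing in this subtree reaches up to 'start' */
	if (!n || n->max < start)
		return NULL;

	hit = range_first_match(n->left, start, end);
	if (hit)
		return hit;

	if (n->lower <= end && n->upper >= start)
		return n;

	/* Every node to the right has an even larger 'lower' */
	if (n->lower > end)
		return NULL;

	return range_first_match(n->right, start, end);
}
```

With the two ranges from case #2, a lookup of instance 30 via range_first_match(root, 30, 30) returns a node for both binding orders, because the decision to descend depends on 'max' rather than only on the 'lower'/'upper' values of the node where the search happens to stand.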
|
|
|
service_range_foreach_match(sr, sc, lower, upper) {
|
|
|
|
/* Look for exact match */
|
|
|
|
if (sr->lower == lower && sr->upper == upper)
|
|
|
|
return sr;
|
2018-05-09 08:59:41 +08:00
|
|
|
}
|
|
|
|
|
2019-12-10 16:21:02 +08:00
|
|
|
return NULL;
|
2018-05-09 08:59:41 +08:00
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
static struct service_range *tipc_service_create_range(struct tipc_service *sc,
|
|
|
|
u32 lower, u32 upper)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct rb_node **n, *parent = NULL;
|
2019-12-10 16:21:02 +08:00
|
|
|
struct service_range *sr;
|
2018-03-30 05:20:41 +08:00
|
|
|
|
|
|
|
n = &sc->ranges.rb_node;
|
|
|
|
while (*n) {
|
|
|
|
parent = *n;
|
2019-12-10 16:21:02 +08:00
|
|
|
sr = service_range_entry(parent);
|
|
|
|
if (lower == sr->lower && upper == sr->upper)
|
|
|
|
return sr;
|
|
|
|
if (sr->max < upper)
|
|
|
|
sr->max = upper;
|
|
|
|
if (lower <= sr->lower)
|
|
|
|
n = &parent->rb_left;
|
2006-01-03 02:04:38 +08:00
|
|
|
else
|
2019-12-10 16:21:02 +08:00
|
|
|
n = &parent->rb_right;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
sr = kzalloc(sizeof(*sr), GFP_ATOMIC);
|
|
|
|
if (!sr)
|
|
|
|
return NULL;
|
|
|
|
sr->lower = lower;
|
|
|
|
sr->upper = upper;
|
2019-12-10 16:21:02 +08:00
|
|
|
sr->max = upper;
|
2018-03-30 05:20:41 +08:00
|
|
|
INIT_LIST_HEAD(&sr->local_publ);
|
|
|
|
INIT_LIST_HEAD(&sr->all_publ);
|
|
|
|
rb_link_node(&sr->tree_node, parent, n);
|
2019-12-10 16:21:02 +08:00
|
|
|
rb_insert_augmented(&sr->tree_node, &sc->ranges, &sr_callbacks);
|
2018-03-30 05:20:41 +08:00
|
|
|
return sr;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
static struct publication *tipc_service_insert_publ(struct net *net,
|
|
|
|
struct tipc_service *sc,
|
2015-01-09 15:27:10 +08:00
|
|
|
u32 type, u32 lower,
|
|
|
|
u32 upper, u32 scope,
|
2018-03-30 05:20:41 +08:00
|
|
|
u32 node, u32 port,
|
|
|
|
u32 key)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_subscription *sub, *tmp;
|
|
|
|
struct service_range *sr;
|
|
|
|
struct publication *p;
|
|
|
|
bool first = false;
|
2011-05-30 21:44:38 +08:00
|
|
|
|
2018-03-30 05:20:43 +08:00
|
|
|
sr = tipc_service_create_range(sc, lower, upper);
|
|
|
|
if (!sr)
|
|
|
|
goto err;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-30 05:20:43 +08:00
|
|
|
first = list_empty(&sr->all_publ);
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
/* Return if the publication already exists */
|
|
|
|
list_for_each_entry(p, &sr->all_publ, all_publ) {
|
|
|
|
if (p->key == key && (!p->node || p->node == node))
|
|
|
|
return NULL;
|
|
|
|
}
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
/* Create and insert publication */
|
|
|
|
p = tipc_publ_create(type, lower, upper, scope, node, port, key);
|
|
|
|
if (!p)
|
|
|
|
goto err;
|
tipc: support in-order name publication events
It is observed that TIPC service binding order will not be kept in the
publication event report to user if the service is subscribed after the
bindings.
For example, services are bound by application in the following order:
Server: bound port A to {18888,66,66} scope 2
Server: bound port A to {18888,33,33} scope 2
Now, if a client subscribes to the service range (e.g. {18888, 0-100}),
it will get the 'TIPC_PUBLISHED' events in that binding order only when
the subscription is started before the bindings.
Otherwise, if started after the bindings, the events will arrive in the
opposite order:
Client: received event for published {18888,33,33}
Client: received event for published {18888,66,66}
For the latter case, the bindings already exist in the name table, so
when they are reported, the order of the events follows the order of
the rbtree binding nodes (a node with a smaller 'lower'/'upper' range
value comes first).
This is correct as we provide the tracking on a specific service status
(available or not), not the relationship between multiple services.
However, some users expect to see the same order of arriving events
irrespective of when the subscription is issued. This turns out to be
easy to fix. We now add functionality to ensure that publication events
are always issued in the same temporal order as the corresponding
bindings were performed.
v2: replace the unnecessary macro - 'publication_after()' with inline
function.
v3: reuse 'time_after32()' instead of reinventing the same exact code.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 16:34:58 +08:00
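The ordering idea in the commit message above can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: id_after() mirrors the wraparound-safe comparison that time_after32() performs, while struct pub, pub_cmp and pub_sort are hypothetical names, and qsort() stands in for the kernel's list sorting. It also assumes, as the code comment in the source does, that the ids being compared never lie more than INT_MAX apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for struct publication */
struct pub {
	uint32_t id;			/* per-service binding counter */
	unsigned int lower, upper;
};

/* True if id 'a' was assigned after id 'b', tolerating 32-bit wraparound.
 * Assumption: the two ids are less than 2^31 apart.
 */
static bool id_after(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(b - a) < 0;
}

/* qsort() comparator: earlier bindings sort first */
static int pub_cmp(const void *pa, const void *pb)
{
	const struct pub *a = pa, *b = pb;

	return (int)id_after(a->id, b->id) - (int)id_after(b->id, a->id);
}

/* Put publications into the temporal order of their bindings */
static void pub_sort(struct pub *v, size_t n)
{
	assert(v || n == 0);
	qsort(v, n, sizeof(*v), pub_cmp);
}
```

For the example in the commit message, a publication for {66, 66} with id 0 and one for {33, 33} with id 1 sort as 66 first and 33 second, regardless of the order in which the tree walk collected them. A plain `a->id > b->id` comparison would give the wrong answer once the counter wraps from 0xFFFFFFFF to 0, which is why the signed-difference form is used.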
|
|
|
/* Assume the gap between publication ids never exceeds INT_MAX */
|
|
|
|
p->id = sc->publ_cnt++;
|
2018-03-15 23:48:53 +08:00
|
|
|
if (in_own_node(net, node))
|
2018-03-30 05:20:41 +08:00
|
|
|
list_add(&p->local_publ, &sr->local_publ);
|
|
|
|
list_add(&p->all_publ, &sr->all_publ);
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2012-05-01 03:29:02 +08:00
|
|
|
/* Any subscriptions waiting for notification? */
|
2018-03-30 05:20:41 +08:00
|
|
|
list_for_each_entry_safe(sub, tmp, &sc->subscriptions, service_list) {
|
|
|
|
tipc_sub_report_overlap(sub, p->lower, p->upper, TIPC_PUBLISHED,
|
|
|
|
p->port, p->node, p->scope, first);
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
return p;
|
|
|
|
err:
|
|
|
|
pr_warn("Failed to bind to %u,%u,%u, no memory\n", type, lower, upper);
|
|
|
|
return NULL;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2018-03-30 05:20:41 +08:00
|
|
|
* tipc_service_remove_publ - remove a publication from a service
|
2006-01-03 02:04:38 +08:00
|
|
|
*/
|
2018-05-09 08:59:41 +08:00
|
|
|
static struct publication *tipc_service_remove_publ(struct service_range *sr,
|
|
|
|
u32 node, u32 key)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct publication *p;
|
2011-05-30 21:44:38 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
list_for_each_entry(p, &sr->all_publ, all_publ) {
|
|
|
|
if (p->key != key || (node && node != p->node))
|
|
|
|
continue;
|
2018-05-09 08:59:41 +08:00
|
|
|
list_del(&p->all_publ);
|
|
|
|
list_del(&p->local_publ);
|
|
|
|
return p;
|
2011-05-30 22:48:48 +08:00
|
|
|
}
|
2018-05-09 08:59:41 +08:00
|
|
|
return NULL;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2019-11-21 16:34:58 +08:00
|
|
|
/**
|
|
|
|
* Reuse time_after32() to compare publication ids, tolerating wraparound
|
|
|
|
*/
|
|
|
|
#define publication_after(pa, pb) time_after32((pa)->id, (pb)->id)
|
|
|
|
static int tipc_publ_sort(void *priv, struct list_head *a,
|
|
|
|
struct list_head *b)
|
|
|
|
{
|
|
|
|
struct publication *pa, *pb;
|
|
|
|
|
|
|
|
pa = container_of(a, struct publication, list);
|
|
|
|
pb = container_of(b, struct publication, list);
|
|
|
|
return publication_after(pa, pb);
|
|
|
|
}
|
|
|
|
|
2006-01-03 02:04:38 +08:00
|
|
|
/**
|
2018-03-30 05:20:41 +08:00
|
|
|
* tipc_service_subscribe - attach a subscription, and optionally
|
|
|
|
* issue the prescribed number of events if there is any service
|
|
|
|
* range overlapping with the requested range
|
2006-01-03 02:04:38 +08:00
|
|
|
*/
|
2018-03-30 05:20:41 +08:00
|
|
|
static void tipc_service_subscribe(struct tipc_service *service,
|
2018-02-15 17:40:46 +08:00
|
|
|
struct tipc_subscription *sub)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_subscr *sb = &sub->evt.s;
|
2019-11-21 16:34:58 +08:00
|
|
|
struct publication *p, *first, *tmp;
|
|
|
|
struct list_head publ_list;
|
2018-03-30 05:20:41 +08:00
|
|
|
struct service_range *sr;
|
2016-02-02 17:52:10 +08:00
|
|
|
struct tipc_name_seq ns;
|
2019-11-21 16:34:58 +08:00
|
|
|
u32 filter;
|
2016-02-02 17:52:10 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
ns.type = tipc_sub_read(sb, seq.type);
|
|
|
|
ns.lower = tipc_sub_read(sb, seq.lower);
|
|
|
|
ns.upper = tipc_sub_read(sb, seq.upper);
|
2019-11-21 16:34:58 +08:00
|
|
|
filter = tipc_sub_read(sb, filter);
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-02-15 17:40:48 +08:00
|
|
|
tipc_sub_get(sub);
|
2018-03-30 05:20:41 +08:00
|
|
|
list_add(&sub->service_list, &service->subscriptions);
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2019-11-21 16:34:58 +08:00
|
|
|
if (filter & TIPC_SUB_NO_STATUS)
|
2006-01-03 02:04:38 +08:00
|
|
|
return;
|
|
|
|
|
2019-11-21 16:34:58 +08:00
|
|
|
INIT_LIST_HEAD(&publ_list);
|
tipc: fix name table rbtree issues
The current rbtree for service ranges in the name table is built based
on the 'lower' & 'upper' range values resulting in a flaw in the rbtree
searching. Some issues have been observed in case of range overlapping:
Case #1: unable to withdraw a name entry:
After some name services are bound, all of them are withdrawn by user
but one remains in the name table forever. This corrupts the table and
that service becomes dummy i.e. no real port.
E.g.
                 /
            {22, 22}
               /
              /
   --->   {10, 50}
           /     \
          /       \
    {10, 30}     {20, 60}
The node {10, 30} cannot be removed since the rbtree search stops at
the node's ancestor, i.e. {10, 50}, so a search starting from there
never reaches the node being looked for.
Case #2: failed to send data in some cases:
E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for
this service will be one of the two cases below depending on the order
of the bindings:
        {20, 60}                       {10, 50} <--
          /  \                           /  \
         /    \                         /    \
    {10, 50}   NIL <--               NIL    {20, 60}
          (a)                            (b)
Now, try to send some data to service {30}, there will be two results:
(a): Failed, no route to host.
(b): Ok.
The reason is that the rbtree search stops at the node marked with the
arrow above.
Case #3: Same as case #2b above but if the data sending's scope is
local and the {10, 50} is published by a peer node, then it will result
in 'no route to host' even though the other {20, 60} is for example on
the local node which should be able to get the data.
The issues are actually due to the way we built the rbtree. This commit
fixes it by introducing an additional field in each node, named 'max',
which is the largest 'upper' in that node's subtree. The 'max' value of
each subtree is propagated correctly whenever a node is inserted or
removed, or the tree is rebalanced, by the augmented rbtree callbacks.
This way, we can change the rbtree searching approach to solve the
issues above. Another benefit is that we can now improve the search for
the next matching range, e.g. in case of multicast, and so get rid of
the unneeded looping over all nodes in the tree.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-10 16:21:02 +08:00
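The 'max' idea from the commit message above can be sketched outside the kernel. This is a minimal illustration only: it uses a plain unbalanced binary search tree instead of the kernel's augmented rbtree, and the names range_node, range_insert() and range_match_first() are hypothetical, not the TIPC implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a service range node; not the kernel struct. */
struct range_node {
	uint32_t lower, upper;
	uint32_t max;			/* largest 'upper' in this subtree */
	struct range_node *left, *right;
};

/* Insert into an unbalanced BST ordered by (lower, upper), raising 'max'
 * on every node passed on the way down. Returns the (possibly new) root.
 */
static struct range_node *range_insert(struct range_node *root,
				       struct range_node *n)
{
	struct range_node **link = &root;

	assert(n->lower <= n->upper);
	n->left = n->right = NULL;
	n->max = n->upper;
	while (*link) {
		struct range_node *cur = *link;

		if (cur->max < n->upper)
			cur->max = n->upper;
		if (n->lower < cur->lower ||
		    (n->lower == cur->lower && n->upper < cur->upper))
			link = &cur->left;
		else
			link = &cur->right;
	}
	*link = n;
	return root;
}

/* Return the leftmost node overlapping [start, end], or NULL if none. */
static struct range_node *range_match_first(struct range_node *n,
					    uint32_t start, uint32_t end)
{
	while (n) {
		/* No range in this subtree reaches 'start' */
		if (n->max < start)
			return NULL;
		/* The left subtree may hold an overlap: look there first */
		if (n->left && n->left->max >= start) {
			n = n->left;
			continue;
		}
		if (n->lower <= end && n->upper >= start)
			return n;
		/* Everything to the right starts at or after n->lower */
		if (n->lower > end)
			return NULL;
		n = n->right;
	}
	return NULL;
}
```

With the two ranges from Case #2, {20, 60} and {10, 50}, a lookup of instance 30 in this sketch finds {10, 50} regardless of insertion order, because the descent is steered by 'max' rather than stopping at the first node whose key compares unequal.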
|
|
|
service_range_foreach_match(sr, service, ns.lower, ns.upper) {
|
2019-11-21 16:34:58 +08:00
|
|
|
first = NULL;
|
2018-03-30 05:20:41 +08:00
|
|
|
list_for_each_entry(p, &sr->all_publ, all_publ) {
|
2019-11-21 16:34:58 +08:00
|
|
|
if (filter & TIPC_SUB_PORTS)
|
|
|
|
list_add_tail(&p->list, &publ_list);
|
|
|
|
else if (!first || publication_after(first, p))
|
|
|
|
/* Pick this range's *first* publication */
|
|
|
|
first = p;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
2019-11-21 16:34:58 +08:00
|
|
|
if (first)
|
|
|
|
list_add_tail(&first->list, &publ_list);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Sort the publications before reporting */
|
|
|
|
list_sort(NULL, &publ_list, tipc_publ_sort);
|
|
|
|
list_for_each_entry_safe(p, tmp, &publ_list, list) {
|
|
|
|
tipc_sub_report_overlap(sub, p->lower, p->upper,
|
|
|
|
TIPC_PUBLISHED, p->port, p->node,
|
|
|
|
p->scope, true);
|
|
|
|
list_del_init(&p->list);
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
static struct tipc_service *tipc_service_find(struct net *net, u32 type)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct name_table *nt = tipc_name_table(net);
|
|
|
|
struct hlist_head *service_head;
|
|
|
|
struct tipc_service *service;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
service_head = &nt->services[hash(type)];
|
|
|
|
hlist_for_each_entry_rcu(service, service_head, service_list) {
|
|
|
|
if (service->type == type)
|
|
|
|
return service;
|
|
|
|
}
|
2006-03-21 14:36:47 +08:00
|
|
|
return NULL;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2015-01-09 15:27:09 +08:00
|
|
|
struct publication *tipc_nametbl_insert_publ(struct net *net, u32 type,
|
2018-03-30 05:20:41 +08:00
|
|
|
u32 lower, u32 upper,
|
|
|
|
u32 scope, u32 node,
|
|
|
|
u32 port, u32 key)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct name_table *nt = tipc_name_table(net);
|
|
|
|
struct tipc_service *sc;
|
|
|
|
struct publication *p;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-15 23:48:51 +08:00
|
|
|
if (scope > TIPC_NODE_SCOPE || lower > upper) {
|
2018-03-30 05:20:41 +08:00
|
|
|
pr_debug("Failed to bind illegal {%u,%u,%u} with scope %u\n",
|
2012-06-29 12:16:37 +08:00
|
|
|
type, lower, upper, scope);
|
2006-03-21 14:36:47 +08:00
|
|
|
return NULL;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
sc = tipc_service_find(net, type);
|
|
|
|
if (!sc)
|
|
|
|
sc = tipc_service_create(type, &nt->services[hash(type)]);
|
|
|
|
if (!sc)
|
2006-03-21 14:36:47 +08:00
|
|
|
return NULL;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_lock_bh(&sc->lock);
|
|
|
|
p = tipc_service_insert_publ(net, sc, type, lower, upper,
|
|
|
|
scope, node, port, key);
|
|
|
|
spin_unlock_bh(&sc->lock);
|
|
|
|
return p;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2015-01-09 15:27:09 +08:00
|
|
|
struct publication *tipc_nametbl_remove_publ(struct net *net, u32 type,
|
2018-03-30 05:20:43 +08:00
|
|
|
u32 lower, u32 upper,
|
|
|
|
u32 node, u32 key)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_service *sc = tipc_service_find(net, type);
|
2018-05-09 08:59:41 +08:00
|
|
|
struct tipc_subscription *sub, *tmp;
|
2018-04-18 03:25:42 +08:00
|
|
|
struct service_range *sr = NULL;
|
2018-03-30 05:20:41 +08:00
|
|
|
struct publication *p = NULL;
|
2018-05-09 08:59:41 +08:00
|
|
|
bool last;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
if (!sc)
|
2006-03-21 14:36:47 +08:00
|
|
|
return NULL;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_lock_bh(&sc->lock);
|
2018-05-09 08:59:41 +08:00
|
|
|
sr = tipc_service_find_range(sc, lower, upper);
|
|
|
|
if (!sr)
|
|
|
|
goto exit;
|
|
|
|
p = tipc_service_remove_publ(sr, node, key);
|
|
|
|
if (!p)
|
|
|
|
goto exit;
|
|
|
|
|
|
|
|
/* Notify any waiting subscriptions */
|
|
|
|
last = list_empty(&sr->all_publ);
|
|
|
|
list_for_each_entry_safe(sub, tmp, &sc->subscriptions, service_list) {
|
|
|
|
tipc_sub_report_overlap(sub, lower, upper, TIPC_WITHDRAWN,
|
|
|
|
p->port, node, p->scope, last);
|
|
|
|
}
|
2018-04-18 03:25:42 +08:00
|
|
|
|
|
|
|
/* Remove service range item if this was its last publication */
|
2018-05-09 08:59:41 +08:00
|
|
|
if (list_empty(&sr->all_publ)) {
|
2019-12-10 16:21:02 +08:00
|
|
|
rb_erase_augmented(&sr->tree_node, &sc->ranges, &sr_callbacks);
|
2018-04-18 03:25:42 +08:00
|
|
|
kfree(sr);
|
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
|
|
|
|
/* Delete service item if there are no more publications or subscriptions */
|
|
|
|
if (RB_EMPTY_ROOT(&sc->ranges) && list_empty(&sc->subscriptions)) {
|
|
|
|
hlist_del_init_rcu(&sc->service_list);
|
|
|
|
kfree_rcu(sc, rcu);
|
2014-12-02 15:00:26 +08:00
|
|
|
}
|
2018-05-09 08:59:41 +08:00
|
|
|
exit:
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_unlock_bh(&sc->lock);
|
|
|
|
return p;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2012-07-10 18:55:09 +08:00
|
|
|
/**
|
2018-03-30 05:20:41 +08:00
|
|
|
* tipc_nametbl_translate - perform service instance to socket translation
|
2006-01-03 02:04:38 +08:00
|
|
|
*
|
2018-03-30 05:20:42 +08:00
|
|
|
* On entry, 'dnode' is the search domain used during translation.
|
2011-11-08 06:00:54 +08:00
|
|
|
*
|
|
|
|
* On exit:
|
2018-03-30 05:20:42 +08:00
|
|
|
* - if translation is deferred to another node, leave 'dnode' unchanged and
|
|
|
|
* return 0
|
|
|
|
* - if translation is attempted and succeeds, set 'dnode' to the publishing
|
|
|
|
* node and return the published (non-zero) port number
|
|
|
|
* - if translation is attempted and fails, set 'dnode' to 0 and return 0
|
|
|
|
*
|
|
|
|
* Note that for legacy users (node configured with Z.C.N address format) the
|
|
|
|
* 'closest-first' lookup algorithm must be maintained, i.e., if dnode is 0
|
|
|
|
* we must look in the local binding list first
|
2006-01-03 02:04:38 +08:00
|
|
|
*/
|
2018-03-30 05:20:42 +08:00
|
|
|
u32 tipc_nametbl_translate(struct net *net, u32 type, u32 instance, u32 *dnode)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
tipc: allow closest-first lookup algorithm when legacy address is configured
The removal of an internal structure of the node address has an unwanted
side effect.
- Currently, if a user is sending an anycast message with destination
domain 0, the tipc_nametbl_translate() function will use the
'closest-first' algorithm to first look for a node local destination,
and only when no such destination is found will it resort to the
cluster global 'round-robin' lookup algorithm.
- Current users can get around this, and enforce unconditional use of
global round-robin by indicating a destination as Z.0.0 or Z.C.0.
- This option disappears when we make the node address flat, since the
lookup algorithm has no way of recognizing this case. So, as long as
there are node local destinations, the algorithm will always select
one of those, and there is nothing the sender can do to change this.
We solve this by eliminating the 'closest-first' option, which was never
a good idea anyway, for non-legacy users, but only for those. To
distinguish between legacy users and non-legacy users we introduce a new
flag 'legacy_addr_format' in struct tipc_core, to be set when the user
configures a legacy-style Z.C.N node address. Hence, when a legacy user
indicates a zero lookup domain 'closest-first' is selected, and in all
other cases we use 'round-robin'.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23 03:42:48 +08:00
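The selection rule described in the commit message above, together with the explicit own-node case handled in the function body, can be summarized in a small standalone sketch. The enum and the function pick_lookup_algo() are hypothetical names introduced for illustration; the kernel code makes the same decision inline on the publication lists rather than through a helper like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three lookup behaviours */
enum lookup_algo {
	LOOKUP_LOCAL_ONLY,	/* destination domain is this node */
	LOOKUP_CLOSEST_FIRST,	/* legacy user, domain 0, local match exists */
	LOOKUP_ROUND_ROBIN	/* all other cases: cluster-wide rotation */
};

/* legacy:    node was configured with a Z.C.N style address
 * dnode:     lookup domain given by the sender
 * self:      this node's own address
 * has_local: the matching range has at least one node local publication
 */
static enum lookup_algo pick_lookup_algo(bool legacy, uint32_t dnode,
					 uint32_t self, bool has_local)
{
	if (dnode == self)
		return LOOKUP_LOCAL_ONLY;
	if (legacy && dnode == 0 && has_local)
		return LOOKUP_CLOSEST_FIRST;
	return LOOKUP_ROUND_ROBIN;
}

/* Example use with an arbitrary, made-up node address */
static void pick_lookup_algo_demo(void)
{
	const uint32_t self = 0x01001001;

	assert(pick_lookup_algo(true, 0, self, true) == LOOKUP_CLOSEST_FIRST);
	assert(pick_lookup_algo(false, 0, self, true) == LOOKUP_ROUND_ROBIN);
}
```

The point of the sketch is that a non-legacy sender with domain 0 always gets round-robin, even when local destinations exist.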
|
|
|
struct tipc_net *tn = tipc_net(net);
|
|
|
|
bool legacy = tn->legacy_addr_format;
|
|
|
|
u32 self = tipc_own_addr(net);
|
2018-03-30 05:20:41 +08:00
|
|
|
struct service_range *sr;
|
|
|
|
struct tipc_service *sc;
|
2018-03-30 05:20:42 +08:00
|
|
|
struct list_head *list;
|
2018-03-30 05:20:41 +08:00
|
|
|
struct publication *p;
|
2018-03-15 23:48:55 +08:00
|
|
|
u32 port = 0;
|
2011-11-08 06:00:54 +08:00
|
|
|
u32 node = 0;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2018-03-30 05:20:42 +08:00
|
|
|
if (!tipc_in_scope(legacy, *dnode, self))
|
2006-01-03 02:04:38 +08:00
|
|
|
return 0;
|
|
|
|
|
2014-12-02 15:00:30 +08:00
|
|
|
rcu_read_lock();
|
2018-03-30 05:20:41 +08:00
|
|
|
sc = tipc_service_find(net, type);
|
|
|
|
if (unlikely(!sc))
|
2019-12-10 16:21:02 +08:00
|
|
|
goto exit;
|
2018-03-30 05:20:41 +08:00
|
|
|
|
|
|
|
spin_lock_bh(&sc->lock);
|
2019-12-10 16:21:02 +08:00
|
|
|
service_range_foreach_match(sr, sc, instance, instance) {
|
|
|
|
/* Select lookup algo: local, closest-first or round-robin */
|
|
|
|
if (*dnode == self) {
|
|
|
|
list = &sr->local_publ;
|
|
|
|
if (list_empty(list))
|
|
|
|
continue;
|
|
|
|
p = list_first_entry(list, struct publication,
|
|
|
|
local_publ);
|
|
|
|
list_move_tail(&p->local_publ, &sr->local_publ);
|
|
|
|
} else if (legacy && !*dnode && !list_empty(&sr->local_publ)) {
|
|
|
|
list = &sr->local_publ;
|
|
|
|
p = list_first_entry(list, struct publication,
|
|
|
|
local_publ);
|
|
|
|
list_move_tail(&p->local_publ, &sr->local_publ);
|
|
|
|
} else {
|
|
|
|
list = &sr->all_publ;
|
|
|
|
p = list_first_entry(list, struct publication,
|
|
|
|
all_publ);
|
|
|
|
list_move_tail(&p->all_publ, &sr->all_publ);
|
|
|
|
}
|
|
|
|
port = p->port;
|
|
|
|
node = p->node;
|
|
|
|
/* Todo: as for legacy, pick the first matching range only, a
|
|
|
|
* "true" round-robin will be performed as needed.
|
|
|
|
*/
|
|
|
|
break;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_unlock_bh(&sc->lock);
|
2019-12-10 16:21:02 +08:00
|
|
|
|
|
|
|
exit:
|
2014-12-02 15:00:30 +08:00
|
|
|
rcu_read_unlock();
|
2018-03-30 05:20:42 +08:00
|
|
|
*dnode = node;
|
2018-03-15 23:48:55 +08:00
|
|
|
return port;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2018-01-09 04:03:30 +08:00
|
|
|
bool tipc_nametbl_lookup(struct net *net, u32 type, u32 instance, u32 scope,
|
2017-10-13 17:04:28 +08:00
|
|
|
struct list_head *dsts, int *dstcnt, u32 exclude,
|
|
|
|
bool all)
|
|
|
|
{
|
|
|
|
u32 self = tipc_own_addr(net);
|
2018-03-30 05:20:41 +08:00
|
|
|
struct service_range *sr;
|
|
|
|
struct tipc_service *sc;
|
|
|
|
struct publication *p;
|
2017-10-13 17:04:28 +08:00
|
|
|
|
|
|
|
*dstcnt = 0;
|
|
|
|
rcu_read_lock();
|
2018-03-30 05:20:41 +08:00
|
|
|
sc = tipc_service_find(net, type);
|
|
|
|
if (unlikely(!sc))
|
2017-10-13 17:04:28 +08:00
|
|
|
goto exit;
|
2018-03-30 05:20:41 +08:00
|
|
|
|
|
|
|
spin_lock_bh(&sc->lock);
|
|
|
|
|
2019-12-10 16:21:02 +08:00
|
|
|
/* Todo: a full search i.e. service_range_foreach_match() instead? */
|
|
|
|
sr = service_range_match_first(sc->ranges.rb_node, instance, instance);
|
2018-03-30 05:20:41 +08:00
|
|
|
if (!sr)
|
|
|
|
goto no_match;
|
|
|
|
|
|
|
|
list_for_each_entry(p, &sr->all_publ, all_publ) {
|
|
|
|
if (p->scope != scope)
|
|
|
|
continue;
|
|
|
|
if (p->port == exclude && p->node == self)
|
|
|
|
continue;
|
|
|
|
tipc_dest_push(dsts, p->node, p->port);
|
|
|
|
(*dstcnt)++;
|
|
|
|
if (all)
|
|
|
|
continue;
|
|
|
|
list_move_tail(&p->all_publ, &sr->all_publ);
|
|
|
|
break;
|
2017-10-13 17:04:28 +08:00
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
no_match:
|
|
|
|
spin_unlock_bh(&sc->lock);
|
2017-10-13 17:04:28 +08:00
|
|
|
exit:
|
|
|
|
rcu_read_unlock();
|
|
|
|
return !list_empty(dsts);
|
|
|
|
}
|
|
|
|
|
2018-03-15 23:48:53 +08:00
|
|
|
void tipc_nametbl_mc_lookup(struct net *net, u32 type, u32 lower, u32 upper,
|
|
|
|
u32 scope, bool exact, struct list_head *dports)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct service_range *sr;
|
|
|
|
struct tipc_service *sc;
|
2018-01-09 04:03:30 +08:00
|
|
|
struct publication *p;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2014-12-02 15:00:30 +08:00
|
|
|
rcu_read_lock();
|
2018-03-30 05:20:41 +08:00
|
|
|
sc = tipc_service_find(net, type);
|
|
|
|
if (!sc)
|
2006-01-03 02:04:38 +08:00
|
|
|
goto exit;
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_lock_bh(&sc->lock);
|
tipc: fix name table rbtree issues
The current rbtree for service ranges in the name table is built based
on the 'lower' & 'upper' range values, resulting in a flaw in the rbtree
search. Some issues have been observed when ranges overlap:
Case #1: unable to withdraw a name entry:
After some name services are bound, all of them are withdrawn by the user
but one remains in the name table forever. This corrupts the table and
that service becomes a dummy, i.e. it has no real port.
E.g.
                /
           {22, 22}
              /
             /
      --->  {10, 50}
             /  \
            /    \
      {10, 30}  {20, 60}
The node {10, 30} cannot be removed since the rbtree search stops at
the node's ancestor, i.e. {10, 50}, so a search starting from it never
reaches the target node.
Case #2: failed to send data in some cases:
E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for
this service will be one of the two cases below depending on the order
of the bindings:
        {20, 60}             {10, 50} <--
          /  \                 /  \
         /    \               /    \
    {10, 50}  NIL <--       NIL  {20, 60}
          (a)                    (b)
Now, try to send some data to service {30}; there will be two results:
(a): Failed, no route to host.
(b): Ok.
The reason is that the rbtree search stops at the node marked by the
arrow above.
Case #3: Same as case #2b above, but if the scope of the send is
local and {10, 50} is published by a peer node, then it will result
in 'no route to host' even though the other range {20, 60} is, for
example, on the local node, which should be able to get the data.
The issues are actually due to the way we built the rbtree. This commit
fixes it by introducing an additional field to each node - named 'max',
which is the largest 'upper' of that node's subtree. The 'max' value for
each subtree will be propagated correctly whenever a node is inserted/
removed or the tree is rebalanced by the augmented rbtree callbacks.
This way, we can change the rbtree search approach to solve the
issues above. Another benefit is that we can now improve the
search for the next matching range, e.g. in the multicast case, and get
rid of the unneeded looping over all nodes in the tree.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
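The 'max' augmentation described in the commit message above can be sketched in user-space C. This is only an illustration under stated assumptions: it uses a plain unbalanced binary search tree instead of the kernel's augmented rbtree, and `struct range_node`, `range_insert()` and `range_first_match()` are hypothetical names, not the kernel's types or helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a service range node; not the kernel type. */
struct range_node {
	unsigned int lower, upper;
	unsigned int max;	/* largest 'upper' in this node's subtree */
	struct range_node *left, *right;
};

/* Insert into an unbalanced BST ordered by 'lower', then 'upper'.
 * 'max' is updated on the way down. Returns the subtree root.
 */
static struct range_node *range_insert(struct range_node *root,
				       unsigned int lower, unsigned int upper)
{
	assert(lower <= upper);
	if (!root) {
		struct range_node *n = calloc(1, sizeof(*n));

		if (!n)
			abort();
		n->lower = lower;
		n->upper = upper;
		n->max = upper;
		return n;
	}
	if (upper > root->max)
		root->max = upper;
	if (lower < root->lower ||
	    (lower == root->lower && upper < root->upper))
		root->left = range_insert(root->left, lower, upper);
	else
		root->right = range_insert(root->right, lower, upper);
	return root;
}

/* Return the leftmost node overlapping [start, end], or NULL. */
static struct range_node *range_first_match(struct range_node *n,
					    unsigned int start,
					    unsigned int end)
{
	while (n) {
		/* The left subtree can hold a match only if its 'max'
		 * reaches 'start'. If it does but nothing there overlaps,
		 * nothing further right can overlap either.
		 */
		if (n->left && n->left->max >= start) {
			n = n->left;
			continue;
		}
		if (n->lower <= end && n->upper >= start)
			return n;
		if (n->lower > end)
			return NULL;	/* right side starts even later */
		n = n->right;
	}
	return NULL;
}
```

With the bindings from case #2(a), inserting {20, 60} and then {10, 50}, a lookup of instance 30 descends into the left child because its 'max' (50) is not below 30, and finds {10, 50} instead of failing.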
2019-12-10 16:21:02 +08:00
|
|
|
service_range_foreach_match(sr, sc, lower, upper) {
|
2018-03-30 05:20:41 +08:00
|
|
|
list_for_each_entry(p, &sr->local_publ, local_publ) {
|
2018-01-09 04:03:30 +08:00
|
|
|
if (p->scope == scope || (!exact && p->scope < scope))
|
2018-03-15 23:48:55 +08:00
|
|
|
tipc_dest_push(dports, 0, p->port);
|
2008-07-15 13:45:33 +08:00
|
|
|
}
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_unlock_bh(&sc->lock);
|
2006-01-03 02:04:38 +08:00
|
|
|
exit:
|
2014-12-02 15:00:30 +08:00
|
|
|
rcu_read_unlock();
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2017-01-19 02:50:51 +08:00
|
|
|
/* tipc_nametbl_lookup_dst_nodes - find broadcast destination nodes
|
|
|
|
* - Creates list of nodes that overlap the given multicast address
|
2018-03-30 05:20:41 +08:00
|
|
|
* - Determines if any node local destinations overlap
|
2017-01-19 02:50:51 +08:00
|
|
|
*/
|
|
|
|
void tipc_nametbl_lookup_dst_nodes(struct net *net, u32 type, u32 lower,
|
2018-01-13 03:56:50 +08:00
|
|
|
u32 upper, struct tipc_nlist *nodes)
|
2017-01-19 02:50:51 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct service_range *sr;
|
|
|
|
struct tipc_service *sc;
|
|
|
|
struct publication *p;
|
2017-01-19 02:50:51 +08:00
|
|
|
|
|
|
|
rcu_read_lock();
|
2018-03-30 05:20:41 +08:00
|
|
|
sc = tipc_service_find(net, type);
|
|
|
|
if (!sc)
|
2017-01-19 02:50:51 +08:00
|
|
|
goto exit;
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_lock_bh(&sc->lock);
|
2019-12-10 16:21:02 +08:00
|
|
|
service_range_foreach_match(sr, sc, lower, upper) {
|
2018-03-30 05:20:41 +08:00
|
|
|
list_for_each_entry(p, &sr->all_publ, all_publ) {
|
|
|
|
tipc_nlist_add(nodes, p->node);
|
2017-01-19 02:50:51 +08:00
|
|
|
}
|
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_unlock_bh(&sc->lock);
|
2017-01-19 02:50:51 +08:00
|
|
|
exit:
|
|
|
|
rcu_read_unlock();
|
|
|
|
}
|
|
|
|
|
tipc: introduce communication groups
As a preparation for introducing flow control for multicast and datagram
messaging we need a more strictly defined framework than we have now. A
socket must be able to keep track of exactly how many and which other
sockets it is allowed to communicate with at any moment, and keep the
necessary state for those.
We therefore introduce a new concept we have named Communication Group.
Sockets can join a group via a new setsockopt() call TIPC_GROUP_JOIN.
The call takes four parameters: 'type' serves as group identifier,
'instance' serves as a logical member identifier, and 'scope' indicates
the visibility of the group (node/cluster/zone). Finally, 'flags' makes
it possible to set certain properties for the member. For now, there is
only one flag, indicating if the creator of the socket wants to receive
a copy of broadcast or multicast messages it is sending via the socket,
and if it wants to be eligible as destination for its own anycasts.
A group is closed, i.e., sockets which have not joined a group will
not be able to send messages to or receive messages from members of
the group, and vice versa.
Any member of a group can send multicast ('group broadcast') messages
to all group members, optionally including itself, using the primitive
send(). The messages are received via the recvmsg() primitive. A socket
can only be a member of one group at a time.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-13 17:04:23 +08:00
|
|
|
/* tipc_nametbl_build_group - build list of communication group members
|
|
|
|
*/
|
|
|
|
void tipc_nametbl_build_group(struct net *net, struct tipc_group *grp,
|
2018-01-09 04:03:30 +08:00
|
|
|
u32 type, u32 scope)
|
2017-10-13 17:04:23 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct service_range *sr;
|
|
|
|
struct tipc_service *sc;
|
2017-10-13 17:04:23 +08:00
|
|
|
struct publication *p;
|
2018-03-30 05:20:41 +08:00
|
|
|
struct rb_node *n;
|
2017-10-13 17:04:23 +08:00
|
|
|
|
|
|
|
rcu_read_lock();
|
2018-03-30 05:20:41 +08:00
|
|
|
sc = tipc_service_find(net, type);
|
|
|
|
if (!sc)
|
2017-10-13 17:04:23 +08:00
|
|
|
goto exit;
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_lock_bh(&sc->lock);
|
|
|
|
for (n = rb_first(&sc->ranges); n; n = rb_next(n)) {
|
|
|
|
sr = container_of(n, struct service_range, tree_node);
|
|
|
|
list_for_each_entry(p, &sr->all_publ, all_publ) {
|
2018-01-09 04:03:30 +08:00
|
|
|
if (p->scope != scope)
|
2017-10-13 17:04:23 +08:00
|
|
|
continue;
|
2018-03-15 23:48:55 +08:00
|
|
|
tipc_group_add_member(grp, p->node, p->port, p->lower);
|
2017-10-13 17:04:23 +08:00
|
|
|
}
|
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_unlock_bh(&sc->lock);
|
2017-10-13 17:04:23 +08:00
|
|
|
exit:
|
|
|
|
rcu_read_unlock();
|
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
/* tipc_nametbl_publish - add service binding to name table
|
2006-01-03 02:04:38 +08:00
|
|
|
*/
|
2015-01-09 15:27:05 +08:00
|
|
|
struct publication *tipc_nametbl_publish(struct net *net, u32 type, u32 lower,
|
2018-03-30 05:20:41 +08:00
|
|
|
u32 upper, u32 scope, u32 port,
|
2015-01-09 15:27:05 +08:00
|
|
|
u32 key)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct name_table *nt = tipc_name_table(net);
|
|
|
|
struct tipc_net *tn = tipc_net(net);
|
|
|
|
struct publication *p = NULL;
|
|
|
|
struct sk_buff *skb = NULL;
|
tipc: update a binding service via broadcast
Currently, updates to the binding table (adding a service binding to the
name table or withdrawing one) are sent over replicast.
However, if we scale clusters up to > 100 nodes/containers this
method is less effective because it loops through the nodes in a cluster
one by one.
It is worthwhile to use broadcast to update a binding service. This way, the
binding table can be updated on all peer nodes in one shot.
Broadcast is used when all peer nodes, as indicated by a new capability
flag TIPC_NAMED_BCAST, support reception of this message type.
Four problems need to be considered when introducing this feature.
1) When establishing a link to a new peer node we still update this by a
unicast 'bulk' update. This may lead to race conditions, where a later
broadcast publication/withdrawal bypasses the 'bulk', resulting in
disordered publications, or even that a withdrawal may arrive before the
corresponding publication. We solve this by adding an 'is_last_bulk' bit
in the last bulk message so that it can be distinguished from all other
messages. Only when this message has arrived do we open up for reception
of broadcast publications/withdrawals.
2) When a first legacy node is added to the cluster all distribution
will switch over to use the legacy 'replicast' method, while the
opposite happens when the last legacy node leaves the cluster. This
entails another risk of message disordering that has to be handled. We
solve this by adding a sequence number to the broadcast/replicast
messages, so that disordering can be discovered and corrected. Note
however that we don't need to consider potential message loss or
duplication at this protocol level.
3) Bulk messages don't contain any sequence numbers, and will always
arrive in order. Hence we must exempt those from the sequence number
control and deliver them unconditionally. We solve this by adding a new
'is_bulk' bit in those messages so that they can be recognized.
4) Legacy messages, which don't contain any new bits or sequence
numbers, but neither can arrive out of order, also need to be exempt
from the initial synchronization and sequence number check, and
delivered unconditionally. Therefore, we add another 'is_not_legacy' bit
to all new messages so that those can be distinguished from legacy
messages and the latter delivered directly.
v1->v2:
- fix warning issue reported by kbuild test robot <lkp@intel.com>
- add sanity check to drop the publication message with a sequence
number that is lower than the agreed synch point
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
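The four receive-side rules listed in the commit message above can be pictured as a small decision function. This is a hedged sketch, not the kernel implementation: the struct layouts, the verdict enum, `named_classify()` and the 16-bit wrap-safe comparison are all assumptions made for illustration, and the "synch point" is modelled simply as the next expected sequence number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum named_verdict { NAMED_DELIVER, NAMED_HOLD, NAMED_DROP };

/* Hypothetical view of the header bits named in the commit message. */
struct named_hdr {
	bool is_not_legacy;
	bool is_bulk;
	bool is_last_bulk;
	uint16_t seqno;		/* only meaningful for non-bulk messages */
};

/* Hypothetical per-peer receive state. */
struct peer_state {
	bool bulk_done;		/* the last bulk message has arrived */
	uint16_t next_seqno;	/* next broadcast/replicast seqno expected */
};

/* Wrap-safe "a comes before b" for 16-bit sequence numbers. */
static bool seq_before(uint16_t a, uint16_t b)
{
	uint16_t d = (uint16_t)(b - a);

	return d != 0 && d < 0x8000;
}

static enum named_verdict named_classify(struct peer_state *ps,
					 const struct named_hdr *h)
{
	/* 4) legacy messages are delivered directly */
	if (!h->is_not_legacy)
		return NAMED_DELIVER;

	/* 3) bulk messages bypass the sequence number check */
	if (h->is_bulk || h->is_last_bulk) {
		/* 1) the last bulk message opens up for broadcasts */
		if (h->is_last_bulk)
			ps->bulk_done = true;
		return NAMED_DELIVER;
	}

	/* 1) hold broadcasts until the bulk update is complete */
	if (!ps->bulk_done)
		return NAMED_HOLD;

	/* v2 sanity check: older than the synch point, drop it */
	if (seq_before(h->seqno, ps->next_seqno))
		return NAMED_DROP;

	/* 2) arrived early, keep it until the gap is filled */
	if (h->seqno != ps->next_seqno)
		return NAMED_HOLD;

	ps->next_seqno++;
	return NAMED_DELIVER;
}
```

A caller would keep held messages on a deferred queue and re-run the classification after each delivery; that queue handling is left out here.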
2020-06-17 14:56:05 +08:00
|
|
|
u32 rc_dests;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_lock_bh(&tn->nametbl_lock);
|
2018-03-30 05:20:41 +08:00
|
|
|
|
|
|
|
if (nt->local_publ_count >= TIPC_MAX_PUBL) {
|
|
|
|
pr_warn("Bind failed, max limit %u reached\n", TIPC_MAX_PUBL);
|
|
|
|
goto exit;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
p = tipc_nametbl_insert_publ(net, type, lower, upper, scope,
|
|
|
|
tipc_own_addr(net), port, key);
|
|
|
|
if (p) {
|
|
|
|
nt->local_publ_count++;
|
|
|
|
skb = tipc_named_publish(net, p);
|
2011-11-10 03:22:52 +08:00
|
|
|
}
|
2020-06-17 14:56:05 +08:00
|
|
|
rc_dests = nt->rc_dests;
|
2018-03-30 05:20:41 +08:00
|
|
|
exit:
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_unlock_bh(&tn->nametbl_lock);
|
2014-04-28 18:00:10 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
if (skb)
|
2020-06-17 14:56:05 +08:00
|
|
|
tipc_node_broadcast(net, skb, rc_dests);
|
2018-03-30 05:20:41 +08:00
|
|
|
return p;
|
2020-06-17 14:56:05 +08:00
|
|
|
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2018-03-30 05:20:41 +08:00
|
|
|
* tipc_nametbl_withdraw - withdraw a service binding
|
2006-01-03 02:04:38 +08:00
|
|
|
*/
|
2018-03-30 05:20:41 +08:00
|
|
|
int tipc_nametbl_withdraw(struct net *net, u32 type, u32 lower,
|
2018-03-30 05:20:43 +08:00
|
|
|
u32 upper, u32 key)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct name_table *nt = tipc_name_table(net);
|
|
|
|
struct tipc_net *tn = tipc_net(net);
|
|
|
|
u32 self = tipc_own_addr(net);
|
2014-12-02 15:00:28 +08:00
|
|
|
struct sk_buff *skb = NULL;
|
2018-03-30 05:20:41 +08:00
|
|
|
struct publication *p;
|
tipc: update a binding service via broadcast
Currently, updating binding table (add service binding to
name table/withdraw a service binding) is being sent over replicast.
However, when clusters scale up to > 100 nodes/containers this
method becomes less efficient, because it loops through the nodes in a
cluster one by one.
It is worthwhile to use broadcast to update a binding service. This way, the
binding table can be updated on all peer nodes in one shot.
Broadcast is used when all peer nodes, as indicated by a new capability
flag TIPC_NAMED_BCAST, support reception of this message type.
Four problems need to be considered when introducing this feature.
1) When establishing a link to a new peer node we still update this by a
unicast 'bulk' update. This may lead to race conditions, where a later
broadcast publication/withdrawal bypasses the 'bulk', resulting in
disordered publications, or even in a withdrawal arriving before the
corresponding publication. We solve this by adding an 'is_last_bulk' bit
in the last bulk message so that it can be distinguished from all other
messages. Only when this message has arrived do we open up for reception
of broadcast publications/withdrawals.
2) When a first legacy node is added to the cluster all distribution
will switch over to use the legacy 'replicast' method, while the
opposite happens when the last legacy node leaves the cluster. This
entails another risk of message disordering that has to be handled. We
solve this by adding a sequence number to the broadcast/replicast
messages, so that disordering can be discovered and corrected. Note
however that we don't need to consider potential message loss or
duplication at this protocol level.
3) Bulk messages don't contain any sequence numbers, and will always
arrive in order. Hence we must exempt those from the sequence number
control and deliver them unconditionally. We solve this by adding a new
'is_bulk' bit in those messages so that they can be recognized.
4) Legacy messages, which don't contain any new bits or sequence
numbers, but neither can arrive out of order, also need to be exempt
from the initial synchronization and sequence number check, and
delivered unconditionally. Therefore, we add another 'is_not_legacy' bit
to all new messages so that those can be distinguished from legacy
messages and the latter delivered directly.
v1->v2:
- fix warning issue reported by kbuild test robot <lkp@intel.com>
- add sanity check to drop the publication message with a sequence
number that is lower than the agreed synch point
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
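The four rules in the commit message above can be read as a receive-side decision procedure. The following is only a minimal user-space sketch of that reading, not the kernel implementation: `struct named_msg`, `struct peer_state`, `named_classify()` and the `DELIVER`/`DEFER`/`DROP` verdicts are hypothetical names introduced for illustration, and the 16-bit sequence number width is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a name distribution message header. */
struct named_msg {
	bool is_not_legacy;	/* set in all new-style messages (rule 4) */
	bool is_bulk;		/* unicast bulk update, carries no seqno (rule 3) */
	bool is_last_bulk;	/* marks the final bulk message (rule 1) */
	uint16_t seqno;		/* broadcast/replicast sequence number (rule 2) */
};

/* Hypothetical per-peer receive state. */
struct peer_state {
	bool bulk_done;		/* last bulk message has arrived */
	uint16_t rcv_nxt;	/* next expected sequence number (synch point) */
};

enum verdict { DELIVER, DEFER, DROP };

/* Wrap-around safe "a is older than b" comparison. */
static bool seq_less(uint16_t a, uint16_t b)
{
	return (int16_t)(uint16_t)(a - b) < 0;
}

static enum verdict named_classify(struct peer_state *ps,
				   const struct named_msg *m)
{
	if (!m->is_not_legacy)
		return DELIVER;		/* legacy: always in order */
	if (m->is_bulk) {
		if (m->is_last_bulk)
			ps->bulk_done = true;	/* open up for broadcasts */
		return DELIVER;		/* bulk: exempt from seqno control */
	}
	if (!ps->bulk_done)
		return DEFER;		/* must not bypass the bulk update */
	if (seq_less(m->seqno, ps->rcv_nxt))
		return DROP;		/* older than the agreed synch point */
	if (m->seqno != ps->rcv_nxt)
		return DEFER;		/* arrived early, wait for the gap */
	ps->rcv_nxt++;
	return DELIVER;
}
```

A caller would typically keep deferred messages on a queue and re-run the classification whenever a message is delivered; that queue handling is left out here.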
2020-06-17 14:56:05 +08:00
|
|
|
u32 rc_dests;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_lock_bh(&tn->nametbl_lock);
|
2018-03-30 05:20:41 +08:00
|
|
|
|
2018-03-30 05:20:43 +08:00
|
|
|
p = tipc_nametbl_remove_publ(net, type, lower, upper, self, key);
|
2018-03-30 05:20:41 +08:00
|
|
|
if (p) {
|
|
|
|
nt->local_publ_count--;
|
|
|
|
skb = tipc_named_withdraw(net, p);
|
|
|
|
list_del_init(&p->binding_sock);
|
|
|
|
kfree_rcu(p, rcu);
|
2014-12-02 15:00:28 +08:00
|
|
|
} else {
|
2018-03-30 05:20:41 +08:00
|
|
|
pr_err("Failed to remove local publication {%u,%u,%u}/%u\n",
|
2018-03-30 05:20:43 +08:00
|
|
|
type, lower, upper, key);
|
2014-12-02 15:00:28 +08:00
|
|
|
}
|
2020-06-17 14:56:05 +08:00
|
|
|
rc_dests = nt->rc_dests;
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_unlock_bh(&tn->nametbl_lock);
|
2014-04-28 18:00:10 +08:00
|
|
|
|
2014-12-02 15:00:28 +08:00
|
|
|
if (skb) {
|
2020-06-17 14:56:05 +08:00
|
|
|
tipc_node_broadcast(net, skb, rc_dests);
|
2006-01-03 02:04:38 +08:00
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2006-01-18 07:38:21 +08:00
|
|
|
* tipc_nametbl_subscribe - add a subscription object to the name table
|
2006-01-03 02:04:38 +08:00
|
|
|
*/
|
2018-04-12 04:52:09 +08:00
|
|
|
bool tipc_nametbl_subscribe(struct tipc_subscription *sub)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct name_table *nt = tipc_name_table(sub->net);
|
2018-02-15 17:40:49 +08:00
|
|
|
struct tipc_net *tn = tipc_net(sub->net);
|
2018-02-15 17:40:46 +08:00
|
|
|
struct tipc_subscr *s = &sub->evt.s;
|
|
|
|
u32 type = tipc_sub_read(s, seq.type);
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_service *sc;
|
2018-04-12 04:52:09 +08:00
|
|
|
bool res = true;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_lock_bh(&tn->nametbl_lock);
|
2018-03-30 05:20:41 +08:00
|
|
|
sc = tipc_service_find(sub->net, type);
|
|
|
|
if (!sc)
|
|
|
|
sc = tipc_service_create(type, &nt->services[hash(type)]);
|
|
|
|
if (sc) {
|
|
|
|
spin_lock_bh(&sc->lock);
|
|
|
|
tipc_service_subscribe(sc, sub);
|
|
|
|
spin_unlock_bh(&sc->lock);
|
2007-02-09 22:25:21 +08:00
|
|
|
} else {
|
2018-03-30 05:20:41 +08:00
|
|
|
pr_warn("Failed to subscribe for {%u,%u,%u}\n", type,
|
|
|
|
tipc_sub_read(s, seq.lower),
|
|
|
|
tipc_sub_read(s, seq.upper));
|
2018-04-12 04:52:09 +08:00
|
|
|
res = false;
|
2007-02-09 22:25:21 +08:00
|
|
|
}
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_unlock_bh(&tn->nametbl_lock);
|
2018-04-12 04:52:09 +08:00
|
|
|
return res;
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2006-01-18 07:38:21 +08:00
|
|
|
* tipc_nametbl_unsubscribe - remove a subscription object from name table
|
2006-01-03 02:04:38 +08:00
|
|
|
*/
|
2018-02-15 17:40:46 +08:00
|
|
|
void tipc_nametbl_unsubscribe(struct tipc_subscription *sub)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-02-15 17:40:49 +08:00
|
|
|
struct tipc_net *tn = tipc_net(sub->net);
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_subscr *s = &sub->evt.s;
|
2018-02-15 17:40:46 +08:00
|
|
|
u32 type = tipc_sub_read(s, seq.type);
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_service *sc;
|
2006-01-03 02:04:38 +08:00
|
|
|
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_lock_bh(&tn->nametbl_lock);
|
2018-03-30 05:20:41 +08:00
|
|
|
sc = tipc_service_find(sub->net, type);
|
|
|
|
if (!sc)
|
|
|
|
goto exit;
|
|
|
|
|
|
|
|
spin_lock_bh(&sc->lock);
|
|
|
|
list_del_init(&sub->service_list);
|
|
|
|
tipc_sub_put(sub);
|
|
|
|
|
|
|
|
/* Delete service item if no more publications and subscriptions */
|
|
|
|
if (RB_EMPTY_ROOT(&sc->ranges) && list_empty(&sc->subscriptions)) {
|
|
|
|
hlist_del_init_rcu(&sc->service_list);
|
|
|
|
kfree_rcu(sc, rcu);
|
2007-02-09 22:25:21 +08:00
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_unlock_bh(&sc->lock);
|
|
|
|
exit:
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_unlock_bh(&tn->nametbl_lock);
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
|
|
|
|
2015-01-09 15:27:09 +08:00
|
|
|
int tipc_nametbl_init(struct net *net)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_net *tn = tipc_net(net);
|
|
|
|
struct name_table *nt;
|
2014-12-02 15:00:24 +08:00
|
|
|
int i;
|
|
|
|
|
2018-07-27 17:28:25 +08:00
|
|
|
nt = kzalloc(sizeof(*nt), GFP_KERNEL);
|
2018-03-30 05:20:41 +08:00
|
|
|
if (!nt)
|
2006-01-03 02:04:38 +08:00
|
|
|
return -ENOMEM;
|
|
|
|
|
2014-12-02 15:00:24 +08:00
|
|
|
for (i = 0; i < TIPC_NAMETBL_SIZE; i++)
|
2018-03-30 05:20:41 +08:00
|
|
|
INIT_HLIST_HEAD(&nt->services[i]);
|
2014-12-02 15:00:24 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
INIT_LIST_HEAD(&nt->node_scope);
|
|
|
|
INIT_LIST_HEAD(&nt->cluster_scope);
|
tipc: eliminate message disordering during binding table update
We have seen the following race scenario:
1) named_distribute() builds a "bulk" message, containing a PUBLISH
item for a certain publication. This is based on the contents of
the binding table's 'cluster_scope' list.
2) tipc_named_withdraw() removes the same publication from the list,
builds a WITHDRAW message and distributes it to all cluster nodes.
3) tipc_named_node_up(), which was calling named_distribute(), sends
out the bulk message built under 1)
4) The WITHDRAW message arrives at the just detected node, finds
no corresponding publication, and is dropped.
5) The PUBLISH item arrives at the same node, is added to its binding
table, and remains there forever.
This arrival disordering was earlier taken care of by the backlog queue,
originally added for a different purpose, which was removed in the
commit referred to below, but we now need a different solution.
In this commit, we replace the rcu lock protecting the 'cluster_scope'
list with a regular RW lock which also covers the sending of the
bulk message. This guarantees both the list integrity and the
message sending order. We will later add a commit which cleans up
this code further.
Note that this commit needs recently added commit d3092b2efca1 ("tipc:
fix unsafe rcu locking when accessing publication list") to apply
cleanly.
Fixes: 37922ea4a310 ("tipc: permit overlapping service ranges in name table")
Reported-by: Tuong Lien Tong <tuong.t.lien@dektech.com.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-20 01:55:40 +08:00
|
|
|
rwlock_init(&nt->cluster_scope_lock);
|
2018-03-30 05:20:41 +08:00
|
|
|
tn->nametbl = nt;
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_lock_init(&tn->nametbl_lock);
|
2006-01-03 02:04:38 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2014-03-06 21:40:20 +08:00
|
|
|
/**
|
2018-03-30 05:20:41 +08:00
|
|
|
* tipc_service_delete - purge all publications for a service and delete it
|
2014-03-06 21:40:20 +08:00
|
|
|
*/
|
2018-03-30 05:20:41 +08:00
|
|
|
static void tipc_service_delete(struct net *net, struct tipc_service *sc)
|
2014-03-06 21:40:20 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct service_range *sr, *tmpr;
|
2018-04-18 03:25:42 +08:00
|
|
|
struct publication *p, *tmp;
|
2018-03-30 05:20:41 +08:00
|
|
|
|
|
|
|
spin_lock_bh(&sc->lock);
|
|
|
|
rbtree_postorder_for_each_entry_safe(sr, tmpr, &sc->ranges, tree_node) {
|
2018-04-18 03:25:42 +08:00
|
|
|
list_for_each_entry_safe(p, tmp, &sr->all_publ, all_publ) {
|
2018-05-09 08:59:41 +08:00
|
|
|
tipc_service_remove_publ(sr, p->node, p->key);
|
2018-03-30 05:20:41 +08:00
|
|
|
kfree_rcu(p, rcu);
|
|
|
|
}
|
tipc: fix name table rbtree issues
The current rbtree for service ranges in the name table is built based
on the 'lower' & 'upper' range values resulting in a flaw in the rbtree
searching. Some issues have been observed in case of range overlapping:
Case #1: unable to withdraw a name entry:
After some name services are bound, all of them are withdrawn by the user
but one remains in the name table forever. This corrupts the table, and
that service becomes a dummy, i.e. it has no real port.
E.g.
                   /
              {22, 22}
                 /
                /
   ---> {10, 50}
          /  \
         /    \
  {10, 30}  {20, 60}
The node {10, 30} cannot be removed since the rbtree search stops at
the node's ancestor, i.e. {10, 50}, so a search starting from there never
reaches the node being sought.
Case #2: failed to send data in some cases:
E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for
this service will be one of the two cases below depending on the order
of the bindings:
       {20, 60}               {10, 50} <--
         /  \                   /  \
        /    \                 /    \
  {10, 50}   NIL <--         NIL   {20, 60}
       (a)                       (b)
Now, try to send some data to service {30}, there will be two results:
(a): Failed, no route to host.
(b): Ok.
The reason is that the rbtree search stops at the node marked with an
arrow above.
Case #3: Same as case #2b above but if the data sending's scope is
local and the {10, 50} is published by a peer node, then it will result
in 'no route to host' even though the other {20, 60} is for example on
the local node which should be able to get the data.
The issues are actually due to the way we built the rbtree. This commit
fixes it by introducing an additional field to each node - named 'max',
which is the largest 'upper' of that node's subtree. The 'max' value for
each subtree will be propagated correctly whenever a node is inserted/
removed or the tree is rebalanced by the augmented rbtree callbacks.
This way, we can change the rbtree search approach to solve the
issues above. Another benefit is that we can now improve the
search for the next matching range, e.g. in case of multicast, and get
rid of the unneeded looping over all nodes in the tree.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
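The effect of the 'max' field described in the commit message above can be illustrated outside the kernel. The sketch below is an assumption-laden simplification: it uses a plain, unbalanced binary search tree keyed on 'lower' instead of the kernel's augmented rbtree, and `struct range_node`, `range_insert()` and `range_find()` are hypothetical names, not functions from this file. It only shows how a per-subtree maximum of 'upper' lets a lookup decide which subtree can still contain a matching range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a service range node. */
struct range_node {
	uint32_t lower, upper;
	uint32_t max;			/* largest 'upper' in this subtree */
	struct range_node *left, *right;
};

/* Insert keyed on 'lower', updating 'max' on the way down (no balancing). */
static struct range_node *range_insert(struct range_node *root,
				       struct range_node *n)
{
	if (!root) {
		n->left = n->right = NULL;
		n->max = n->upper;
		return n;
	}
	if (n->upper > root->max)
		root->max = n->upper;
	if (n->lower < root->lower)
		root->left = range_insert(root->left, n);
	else
		root->right = range_insert(root->right, n);
	return root;
}

/* Return the leftmost range containing 'instance', or NULL if none does. */
static struct range_node *range_find(struct range_node *n, uint32_t instance)
{
	while (n && n->max >= instance) {
		if (n->left && n->left->max >= instance)
			n = n->left;	/* a match, if any, is on the left */
		else if (n->lower > instance)
			return NULL;	/* all remaining ranges start later */
		else if (instance <= n->upper)
			return n;
		else
			n = n->right;
	}
	return NULL;
}
```

With the two ranges of case #2 inserted in order (a), a lookup for instance 30 now descends into the left subtree because its 'max' is 50, instead of stopping at the first node it visits.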
2019-12-10 16:21:02 +08:00
|
|
|
rb_erase_augmented(&sr->tree_node, &sc->ranges, &sr_callbacks);
|
2018-04-18 03:25:42 +08:00
|
|
|
kfree(sr);
|
2014-03-06 21:40:20 +08:00
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
hlist_del_init_rcu(&sc->service_list);
|
|
|
|
spin_unlock_bh(&sc->lock);
|
|
|
|
kfree_rcu(sc, rcu);
|
2014-03-06 21:40:20 +08:00
|
|
|
}
|
|
|
|
|
2015-01-09 15:27:09 +08:00
|
|
|
void tipc_nametbl_stop(struct net *net)
|
2006-01-03 02:04:38 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct name_table *nt = tipc_name_table(net);
|
|
|
|
struct tipc_net *tn = tipc_net(net);
|
|
|
|
struct hlist_head *service_head;
|
|
|
|
struct tipc_service *service;
|
2006-01-03 02:04:38 +08:00
|
|
|
u32 i;
|
|
|
|
|
2014-03-06 21:40:20 +08:00
|
|
|
/* Verify name table is empty and purge any lingering
|
|
|
|
* publications, then release the name table
|
|
|
|
*/
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_lock_bh(&tn->nametbl_lock);
|
2012-08-16 20:09:11 +08:00
|
|
|
for (i = 0; i < TIPC_NAMETBL_SIZE; i++) {
|
2018-03-30 05:20:41 +08:00
|
|
|
if (hlist_empty(&nt->services[i]))
|
2012-07-12 05:35:01 +08:00
|
|
|
continue;
|
2018-03-30 05:20:41 +08:00
|
|
|
service_head = &nt->services[i];
|
|
|
|
hlist_for_each_entry_rcu(service, service_head, service_list) {
|
|
|
|
tipc_service_delete(net, service);
|
2014-03-06 21:40:20 +08:00
|
|
|
}
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
2015-01-09 15:27:09 +08:00
|
|
|
spin_unlock_bh(&tn->nametbl_lock);
|
2014-12-02 15:00:24 +08:00
|
|
|
|
2014-12-02 15:00:30 +08:00
|
|
|
synchronize_net();
|
2018-03-30 05:20:41 +08:00
|
|
|
kfree(nt);
|
2006-01-03 02:04:38 +08:00
|
|
|
}
|
2014-11-20 17:29:20 +08:00
|
|
|
|
2014-11-24 18:10:29 +08:00
|
|
|
static int __tipc_nl_add_nametable_publ(struct tipc_nl_msg *msg,
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_service *service,
|
|
|
|
struct service_range *sr,
|
|
|
|
u32 *last_key)
|
2014-11-20 17:29:20 +08:00
|
|
|
{
|
|
|
|
struct publication *p;
|
2018-03-30 05:20:41 +08:00
|
|
|
struct nlattr *attrs;
|
|
|
|
struct nlattr *b;
|
|
|
|
void *hdr;
|
2014-11-20 17:29:20 +08:00
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
if (*last_key) {
|
|
|
|
list_for_each_entry(p, &sr->all_publ, all_publ)
|
|
|
|
if (p->key == *last_key)
|
2014-11-20 17:29:20 +08:00
|
|
|
break;
|
2018-03-30 05:20:41 +08:00
|
|
|
if (p->key != *last_key)
|
2014-11-20 17:29:20 +08:00
|
|
|
return -EPIPE;
|
|
|
|
} else {
|
2018-03-30 05:20:41 +08:00
|
|
|
p = list_first_entry(&sr->all_publ,
|
|
|
|
struct publication,
|
2018-03-15 23:48:55 +08:00
|
|
|
all_publ);
|
2014-11-20 17:29:20 +08:00
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
list_for_each_entry_from(p, &sr->all_publ, all_publ) {
|
|
|
|
*last_key = p->key;
|
2014-11-20 17:29:20 +08:00
|
|
|
|
|
|
|
hdr = genlmsg_put(msg->skb, msg->portid, msg->seq,
|
2015-02-09 16:50:03 +08:00
|
|
|
&tipc_genl_family, NLM_F_MULTI,
|
2014-11-20 17:29:20 +08:00
|
|
|
TIPC_NL_NAME_TABLE_GET);
|
|
|
|
if (!hdr)
|
|
|
|
return -EMSGSIZE;
|
|
|
|
|
2019-04-26 17:13:06 +08:00
|
|
|
attrs = nla_nest_start_noflag(msg->skb, TIPC_NLA_NAME_TABLE);
|
2014-11-20 17:29:20 +08:00
|
|
|
if (!attrs)
|
|
|
|
goto msg_full;
|
|
|
|
|
2019-04-26 17:13:06 +08:00
|
|
|
b = nla_nest_start_noflag(msg->skb, TIPC_NLA_NAME_TABLE_PUBL);
|
2018-03-30 05:20:41 +08:00
|
|
|
if (!b)
|
2014-11-20 17:29:20 +08:00
|
|
|
goto attr_msg_full;
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_TYPE, service->type))
|
2014-11-20 17:29:20 +08:00
|
|
|
goto publ_msg_full;
|
2018-03-30 05:20:41 +08:00
|
|
|
if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_LOWER, sr->lower))
|
2014-11-20 17:29:20 +08:00
|
|
|
goto publ_msg_full;
|
2018-03-30 05:20:41 +08:00
|
|
|
if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_UPPER, sr->upper))
|
2014-11-20 17:29:20 +08:00
|
|
|
goto publ_msg_full;
|
|
|
|
if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_SCOPE, p->scope))
|
|
|
|
goto publ_msg_full;
|
|
|
|
if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_NODE, p->node))
|
|
|
|
goto publ_msg_full;
|
2018-03-15 23:48:55 +08:00
|
|
|
if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_REF, p->port))
|
2014-11-20 17:29:20 +08:00
|
|
|
goto publ_msg_full;
|
|
|
|
if (nla_put_u32(msg->skb, TIPC_NLA_PUBL_KEY, p->key))
|
|
|
|
goto publ_msg_full;
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
nla_nest_end(msg->skb, b);
|
2014-11-20 17:29:20 +08:00
|
|
|
nla_nest_end(msg->skb, attrs);
|
|
|
|
genlmsg_end(msg->skb, hdr);
|
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
*last_key = 0;
|
2014-11-20 17:29:20 +08:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
publ_msg_full:
|
2018-03-30 05:20:41 +08:00
|
|
|
nla_nest_cancel(msg->skb, b);
|
2014-11-20 17:29:20 +08:00
|
|
|
attr_msg_full:
|
|
|
|
nla_nest_cancel(msg->skb, attrs);
|
|
|
|
msg_full:
|
|
|
|
genlmsg_cancel(msg->skb, hdr);
|
|
|
|
|
|
|
|
return -EMSGSIZE;
|
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
static int __tipc_nl_service_range_list(struct tipc_nl_msg *msg,
|
|
|
|
struct tipc_service *sc,
|
|
|
|
u32 *last_lower, u32 *last_key)
|
2014-11-20 17:29:20 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct service_range *sr;
|
|
|
|
struct rb_node *n;
|
2014-11-20 17:29:20 +08:00
|
|
|
int err;
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
for (n = rb_first(&sc->ranges); n; n = rb_next(n)) {
|
|
|
|
sr = container_of(n, struct service_range, tree_node);
|
|
|
|
if (sr->lower < *last_lower)
|
|
|
|
continue;
|
|
|
|
err = __tipc_nl_add_nametable_publ(msg, sc, sr, last_key);
|
2014-11-20 17:29:20 +08:00
|
|
|
if (err) {
|
2018-03-30 05:20:41 +08:00
|
|
|
*last_lower = sr->lower;
|
2014-11-20 17:29:20 +08:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
*last_lower = 0;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
static int tipc_nl_service_list(struct net *net, struct tipc_nl_msg *msg,
|
|
|
|
u32 *last_type, u32 *last_lower, u32 *last_key)
|
2014-11-20 17:29:20 +08:00
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct tipc_net *tn = tipc_net(net);
|
|
|
|
struct tipc_service *service = NULL;
|
|
|
|
struct hlist_head *head;
|
2014-11-20 17:29:20 +08:00
|
|
|
int err;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
if (*last_type)
|
|
|
|
i = hash(*last_type);
|
|
|
|
else
|
|
|
|
i = 0;
|
|
|
|
|
|
|
|
for (; i < TIPC_NAMETBL_SIZE; i++) {
|
2018-03-30 05:20:41 +08:00
|
|
|
head = &tn->nametbl->services[i];
|
2014-11-20 17:29:20 +08:00
|
|
|
|
2019-04-09 15:59:24 +08:00
|
|
|
if (*last_type ||
|
|
|
|
(!i && *last_key && (*last_lower == *last_key))) {
|
2018-03-30 05:20:41 +08:00
|
|
|
service = tipc_service_find(net, *last_type);
|
|
|
|
if (!service)
|
2014-11-20 17:29:20 +08:00
|
|
|
return -EPIPE;
|
|
|
|
} else {
|
2018-03-30 05:20:41 +08:00
|
|
|
hlist_for_each_entry_rcu(service, head, service_list)
|
2014-12-02 15:00:30 +08:00
|
|
|
break;
|
2018-03-30 05:20:41 +08:00
|
|
|
if (!service)
|
2014-11-20 17:29:20 +08:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2018-03-30 05:20:41 +08:00
|
|
|
hlist_for_each_entry_from_rcu(service, service_list) {
|
|
|
|
spin_lock_bh(&service->lock);
|
|
|
|
err = __tipc_nl_service_range_list(msg, service,
|
|
|
|
last_lower,
|
|
|
|
last_key);
|
2014-11-20 17:29:20 +08:00
|
|
|
|
|
|
|
if (err) {
|
2018-03-30 05:20:41 +08:00
|
|
|
*last_type = service->type;
|
|
|
|
spin_unlock_bh(&service->lock);
|
2014-11-20 17:29:20 +08:00
|
|
|
return err;
|
|
|
|
}
|
2018-03-30 05:20:41 +08:00
|
|
|
spin_unlock_bh(&service->lock);
|
2014-11-20 17:29:20 +08:00
|
|
|
}
|
|
|
|
*last_type = 0;
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
int tipc_nl_name_table_dump(struct sk_buff *skb, struct netlink_callback *cb)
|
|
|
|
{
|
2018-03-30 05:20:41 +08:00
|
|
|
struct net *net = sock_net(skb->sk);
|
2014-11-20 17:29:20 +08:00
|
|
|
u32 last_type = cb->args[0];
|
|
|
|
u32 last_lower = cb->args[1];
|
2018-03-30 05:20:41 +08:00
|
|
|
u32 last_key = cb->args[2];
|
|
|
|
int done = cb->args[3];
|
2014-11-20 17:29:20 +08:00
|
|
|
struct tipc_nl_msg msg;
|
2018-03-30 05:20:41 +08:00
|
|
|
int err;
|
2014-11-20 17:29:20 +08:00
|
|
|
|
|
|
|
if (done)
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
msg.skb = skb;
|
|
|
|
msg.portid = NETLINK_CB(cb->skb).portid;
|
|
|
|
msg.seq = cb->nlh->nlmsg_seq;
|
|
|
|
|
2014-12-02 15:00:30 +08:00
|
|
|
rcu_read_lock();
|
2018-03-30 05:20:41 +08:00
|
|
|
err = tipc_nl_service_list(net, &msg, &last_type,
|
|
|
|
&last_lower, &last_key);
|
2014-11-20 17:29:20 +08:00
|
|
|
if (!err) {
|
|
|
|
done = 1;
|
|
|
|
} else if (err != -EMSGSIZE) {
|
|
|
|
/* We never set seq or call nl_dump_check_consistent(), which
|
|
|
|
* means that setting prev_seq here will cause the consistency
|
|
|
|
* check to fail in the netlink callback handler, resulting in
|
|
|
|
* the NLMSG_DONE message having the NLM_F_DUMP_INTR flag set if
|
|
|
|
* we got an error.
|
|
|
|
*/
|
|
|
|
cb->prev_seq = 1;
|
|
|
|
}
|
2014-12-02 15:00:30 +08:00
|
|
|
rcu_read_unlock();
|
2014-11-20 17:29:20 +08:00
|
|
|
|
|
|
|
cb->args[0] = last_type;
|
|
|
|
cb->args[1] = last_lower;
|
2018-03-30 05:20:41 +08:00
|
|
|
cb->args[2] = last_key;
|
2014-11-20 17:29:20 +08:00
|
|
|
cb->args[3] = done;
|
|
|
|
|
|
|
|
return skb->len;
|
|
|
|
}
|
2015-02-05 21:36:43 +08:00
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
struct tipc_dest *tipc_dest_find(struct list_head *l, u32 node, u32 port)
|
2015-02-05 21:36:43 +08:00
|
|
|
{
|
2017-10-13 17:04:22 +08:00
|
|
|
struct tipc_dest *dst;
|
2015-02-05 21:36:43 +08:00
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
list_for_each_entry(dst, l, list) {
|
2018-08-27 09:32:26 +08:00
|
|
|
if (dst->node == node && dst->port == port)
|
|
|
|
return dst;
|
2015-02-05 21:36:43 +08:00
|
|
|
}
|
2017-10-13 17:04:22 +08:00
|
|
|
return NULL;
|
2017-01-03 23:55:10 +08:00
|
|
|
}
|
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
bool tipc_dest_push(struct list_head *l, u32 node, u32 port)
|
2017-01-03 23:55:10 +08:00
|
|
|
{
|
2017-10-13 17:04:22 +08:00
|
|
|
struct tipc_dest *dst;
|
2017-01-03 23:55:10 +08:00
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
if (tipc_dest_find(l, node, port))
|
2017-01-03 23:55:10 +08:00
|
|
|
return false;
|
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
dst = kmalloc(sizeof(*dst), GFP_ATOMIC);
|
|
|
|
if (unlikely(!dst))
|
|
|
|
return false;
|
2018-08-27 09:32:26 +08:00
|
|
|
dst->node = node;
|
|
|
|
dst->port = port;
|
2017-10-13 17:04:22 +08:00
|
|
|
list_add(&dst->list, l);
|
2017-01-03 23:55:10 +08:00
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
bool tipc_dest_pop(struct list_head *l, u32 *node, u32 *port)
|
2017-01-03 23:55:10 +08:00
|
|
|
{
|
2017-10-13 17:04:22 +08:00
|
|
|
struct tipc_dest *dst;
|
2017-01-03 23:55:10 +08:00
|
|
|
|
|
|
|
if (list_empty(l))
|
2017-10-13 17:04:22 +08:00
|
|
|
return false;
|
|
|
|
dst = list_first_entry(l, typeof(*dst), list);
|
|
|
|
if (port)
|
|
|
|
*port = dst->port;
|
|
|
|
if (node)
|
|
|
|
*node = dst->node;
|
|
|
|
list_del(&dst->list);
|
|
|
|
kfree(dst);
|
|
|
|
return true;
|
2017-01-03 23:55:10 +08:00
|
|
|
}
|
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
bool tipc_dest_del(struct list_head *l, u32 node, u32 port)
|
2017-01-03 23:55:10 +08:00
|
|
|
{
|
2017-10-13 17:04:22 +08:00
|
|
|
struct tipc_dest *dst;
|
2017-01-03 23:55:10 +08:00
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
dst = tipc_dest_find(l, node, port);
|
|
|
|
if (!dst)
|
|
|
|
return false;
|
|
|
|
list_del(&dst->list);
|
|
|
|
kfree(dst);
|
|
|
|
return true;
|
2017-01-03 23:55:10 +08:00
|
|
|
}
|
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
void tipc_dest_list_purge(struct list_head *l)
|
2017-01-03 23:55:10 +08:00
|
|
|
{
|
2017-10-13 17:04:22 +08:00
|
|
|
struct tipc_dest *dst, *tmp;
|
2017-01-03 23:55:10 +08:00
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
list_for_each_entry_safe(dst, tmp, l, list) {
|
|
|
|
list_del(&dst->list);
|
|
|
|
kfree(dst);
|
2015-02-05 21:36:43 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
int tipc_dest_list_len(struct list_head *l)
|
2015-02-05 21:36:43 +08:00
|
|
|
{
|
2017-10-13 17:04:22 +08:00
|
|
|
struct tipc_dest *dst;
|
2017-01-03 23:55:10 +08:00
|
|
|
int i = 0;
|
2015-02-05 21:36:43 +08:00
|
|
|
|
2017-10-13 17:04:22 +08:00
|
|
|
list_for_each_entry(dst, l, list) {
|
2017-01-03 23:55:10 +08:00
|
|
|
i++;
|
2015-02-05 21:36:43 +08:00
|
|
|
}
|
2017-01-03 23:55:10 +08:00
|
|
|
return i;
|
2015-02-05 21:36:43 +08:00
|
|
|
}
|