2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-21 11:44:01 +08:00
Commit Graph

3861 Commits

Author SHA1 Message Date
Matthew Wilcox
a8e4da25d3 radix-tree: tidy up range_tag_if_tagged
Convert radix_tree_range_tag_if_tagged to name the nodes parent, node
and child instead of node & slot.

Use parent->offset instead of playing games with 'upindex'.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
8c1244de00 radix-tree: tidy up next_chunk
Convert radix_tree_next_chunk to use 'child' instead of 'slot' as the
name of the child node.  Also use node_maxindex() where it makes sense.

The 'rnode' variable was unnecessary; it doesn't overlap in usage with
'node', so we can just use 'node' the whole way through the function.

Improve the testcase to start the walk from every index in the carefully
constructed tree, and to accept any index within the range covered by
the entry.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
af49a63e10 radix-tree: change naming conventions in radix_tree_shrink
Use the more standard 'node' and 'child' instead of 'to_free' and
'slot'.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
b194d16c27 radix-tree: rename radix_tree_is_indirect_ptr()
As with indirect_to_ptr(), ptr_to_indirect() and
RADIX_TREE_INDIRECT_PTR, change radix_tree_is_indirect_ptr() to
radix_tree_is_internal_node().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
4dd6c0987c radix-tree: rename indirect_to_ptr() to entry_to_node()
Mirrors the earlier commit introducing node_to_entry().

Also change the type returned to be a struct radix_tree_node pointer.
That lets us simplify a couple of places in the radix tree shrink &
extend paths where we could convert an entry into a pointer, modify the
node, then convert the pointer back into an entry.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
a4db4dcea1 radix-tree: rename ptr_to_indirect() to node_to_entry()
ptr_to_indirect() was a bad name.  What it really means is "Convert this
pointer to a node into an entry suitable for storing in the radix tree".
So node_to_entry() seemed like a better name.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
30ff46ccb3 radix-tree: rename INDIRECT_PTR to INTERNAL_NODE
The name RADIX_TREE_INDIRECT_PTR doesn't really match the meaning.
RADIX_TREE_INTERNAL_NODE is a better name.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
d0891265bb radix-tree: remove root->height
The only remaining references to root->height were in extend and shrink,
where it was updated.  Now we can remove it entirely.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
fb209019c9 radix-tree: remove a use of root->height from delete_node
If radix_tree_shrink returns whether it managed to shrink, then
__radix_tree_delete_node doesn't ned to query the tree to find out
whether it did any work or not.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
c12e51b07b radix-tree: replace node->height with node->shift
node->shift represents the shift necessary for looking in the slots
array at this level.  It is equal to the old (node->height - 1) *
RADIX_TREE_MAP_SHIFT.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
0c7fa0a841 radix-tree: split node->path into offset and height
Neither piece of information we're storing in node->path can be larger
than 64, so store each in its own unsigned char instead of shifting and
masking to store them both in an unsigned int.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
2fcd9005cc radix-tree: miscellaneous fixes
Typos, whitespace, grammar, line length, using the correct types, etc.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
6b053b8e5f radix-tree: add copyright statements
The multiorder support is a sufficiently large feature to be worth
adding copyrigt lines for.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Ross Zwisler
0796c58325 radix-tree: fix radix_tree_dump() for multi-order entries
- Print which indices are covered by every leaf entry
 - Print sibling entries
 - Print the node pointer instead of the slot entry
 - Build by default in userspace, and make it accessible to the test-suite

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
070c5ac274 radix-tree: fix radix_tree_range_tag_if_tagged() for multiorder entries
I had previously decided that tagging a single multiorder entry would
count as tagging 2^order entries for the purposes of 'nr_to_tag'.  I now
believe that decision to be a mistake, and it should count as a single
entry.  That's more likely to be what callers expect.

When walking back up the tree from a newly-tagged entry, the current
code assumed we were starting from the lowest level of the tree; if we
have a multiorder entry with an order at least RADIX_TREE_MAP_SHIFT in
size then we need to shift the index by 'shift' before we start walking
back up the tree, or we will end up not setting tags on higher entries,
and then mistakenly thinking that entries below a certain point in the
tree are not tagged.

If the first index we examine is a sibling entry of a tagged multiorder
entry, we were not tagging it.  We need to examine the canonical entry,
and the easiest way to do that is to use radix_tree_descend().  We then
have to skip over sibling slots when looking for the next entry in the
tree or we will end up walking back to the canonical entry.

Add several tests for radix_tree_range_tag_if_tagged().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
0a2efc6c80 radix-tree: rewrite radix_tree_locate_item
Use the new multi-order support functions to rewrite
radix_tree_locate_item().  Modify the locate tests to test multiorder
entries too.

[hughd@google.com: radix_tree_locate_item() is often returning the wrong index]
  Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1605012108490.1166@eggly.anvils
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
8a14f4d832 radix-tree: fix radix_tree_create for sibling entries
If the radix tree user attempted to insert a colliding entry with an
existing multiorder entry, then radix_tree_create() could encounter a
sibling entry when walking down the tree to look for a slot.  Use
radix_tree_descend() to fix the problem, and add a test-case to make
sure the problem doesn't come back in future.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Ross Zwisler
4589ba6d0f radix-tree: rewrite radix_tree_tag_get
Use the new multi-order support functions to rewrite
radix_tree_tag_get()

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Ross Zwisler
00f47b5811 radix-tree: rewrite radix_tree_tag_clear
Use the new multi-order support functions to rewrite
radix_tree_tag_clear()

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Ross Zwisler
fb969909dd radix-tree: rewrite radix_tree_tag_set
Use the new multi-order support functions to rewrite
radix_tree_tag_set()

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Ross Zwisler
21ef533931 radix-tree: add support for multi-order iterating
This enables the macros radix_tree_for_each_slot() and friends to be
used with multi-order entries.

The way that this works is that we treat all entries in a given slots[]
array as a single chunk.  If the index given to radix_tree_next_chunk()
happens to point us to a sibling entry, we will back up iter->index so
that it points to the canonical entry, and that will be the place where
we start our iteration.

As we're processing a chunk in radix_tree_next_slot(), we process
canonical entries, skip over sibling entries, and restart the chunk
lookup if we find a non-sibling indirect pointer.  This drops back to
the radix_tree_next_chunk() code, which will re-walk the tree and look
for another chunk.

This allows us to properly handle multi-order entries mixed with other
entries that are at various heights in the radix tree.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
7b60e9ad59 radix-tree: fix multiorder BUG_ON in radix_tree_insert
These BUG_ON tests are to ensure that all the tags are clear when
inserting a new entry.  If we insert a multiorder entry, we'll end up
looking at the tags for a different node, and so the BUG_ON can end up
triggering spuriously.

Also, we now have three tags, not two, so check all three are clear, and
check all the root tags with a single call to BUG_ON since the bits are
stored contiguously.

Include a test-case to ensure this problem does not reoccur.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
858299544e radix-tree: rewrite __radix_tree_lookup
Use the new multi-order support functions to rewrite __radix_tree_lookup()

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
afe0e395b6 radix-tree: fix several shrinking bugs with multiorder entries
Setting the indirect bit on the user data entry used to be unambiguous
because the tree walking code knew not to expect internal nodes in the
last level of the tree.  Multiorder entries can appear at any level of
the tree, and a leaf with the indirect bit set is indistinguishable from
a pointer to a node.

Introduce a special entry (RADIX_TREE_RETRY) which is neither a valid
user entry, nor a valid pointer to a node.  The radix_tree_deref_retry()
function continues to work the same way, but tree walking code can
distinguish it from a pointer to a node.

Also fix the condition for setting slot->parent to NULL; it does not
matter what height the tree is, it only matters whether slot is an
indirect pointer.  Move this code above the comment which is referring
to the assignment to root->rnode.

Also fix the condition for preventing the tree from shrinking to a
single entry if it's a multiorder entry.

Add a test-case to the test suite that checks that the tree goes back
down to its original height after an item is inserted & deleted from a
higher index in the tree.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
49ea6ebcd3 radix-tree: fix extending the tree for multi-order entries at offset 0
The current code will insert entries at each level, as if we're going to
add a new entry at the bottom level, so we then get an -EEXIST when we
try to insert the entry into the tree.  The best way to fix this is to
not check 'order' when inserting into an empty tree.

We still need to 'extend' the tree to the height necessary for the maximum
index corresponding to this entry, so pass that value to
radix_tree_extend() rather than the index we're asked to create, or we
won't create a tree that's deep enough.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
1456a439fc radix-tree: introduce radix_tree_load_root()
All the tree walking functions start with some variant of this code;
centralise it in one place so we're not chasing subtly different bugs
everywhere.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
aa54757602 radix-tree: remove restriction on multi-order entries
Now that sibling pointers are handled explicitly, there is no purpose
served by restricting the order to be >= RADIX_TREE_MAP_SHIFT.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
29e0967c2f radix-tree: fix deleting a multi-order entry through an alias
If we deleted an entry through an index which looked up a sibling
pointer, we'd end up zeroing out the wrong slots in the node.  Use
get_slot_offset() to find the right slot.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
3b8c00f684 radix-tree: fix sibling entry insertion
The subtraction was the wrong way round, leading to undefined behaviour
(shift by an amount larger than the size of the type).

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
db050f2924 radix-tree: add missing sibling entry functionality
The code I previously added to enable multiorder radix tree entries was
untested and therefore buggy.  This commit adds the support functions
that Ross and I decided were necessary over a four-week period of
iterating various designs.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox
57578c2ea2 raxix-tree: introduce CONFIG_RADIX_TREE_MULTIORDER
I've been receiving increasingly concerned notes from 0day about how
much my recent changes have been bloating the radix tree.  Make it
happier by only including multiorder support if
CONFIG_TRANSPARENT_HUGEPAGES is set.

This is an independent Kconfig option, so other radix tree users can
also set it if they have a need.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Andy Shevchenko
e3a93bce69 lib/uuid.c: remove FSF address
There is no point in keeping an address in the file since it's subject
to change.

While here, update Intel Copyright years.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Andy Shevchenko
2b1b0d6670 lib/uuid.c: introduce a few more generic helpers
There are new helpers in this patch:

  uuid_is_valid		checks if a UUID is valid
  uuid_be_to_bin	converts from string to binary (big endian)
  uuid_le_to_bin	converts from string to binary (little endian)

They will be used in future, i.e. in the following patches in the series.

This also moves the indices arrays to lib/uuid.c to be shared accross
modules.

[andriy.shevchenko@linux.intel.com: fix typo]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Andy Shevchenko
8da4b8c48e lib/uuid.c: move generate_random_uuid() to uuid.c
Let's gather the UUID related functions under one hood.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Andy Shevchenko
aa4ea1c3b3 lib/vsprintf: simplify UUID printing
There are few functions here and there along with type definitions that
provide UUID API.  This series consolidates everything under one hood
and converts current users.

This has been tested for a while internally, however it doesn't mean we
covered all possible cases (especially accuracy of UUID constants after
conversion).  So, please test this as much as you can and provide your
tag.  We appreciate the effort.

The ACPI conversion is postponed for now to sort more generic things out
first.

This patch (of 9):

Since we have hex_byte_pack_upper() we may use it directly and avoid
second loop.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Petr Mladek
42a0bb3f71 printk/nmi: generic solution for safe printk in NMI
printk() takes some locks and could not be used a safe way in NMI
context.

The chance of a deadlock is real especially when printing stacks from
all CPUs.  This particular problem has been addressed on x86 by the
commit a9edc88093 ("x86/nmi: Perform a safe NMI stack trace on all
CPUs").

The patchset brings two big advantages.  First, it makes the NMI
backtraces safe on all architectures for free.  Second, it makes all NMI
messages almost safe on all architectures (the temporary buffer is
limited.  We still should keep the number of messages in NMI context at
minimum).

Note that there already are several messages printed in NMI context:
WARN_ON(in_nmi()), BUG_ON(in_nmi()), anything being printed out from MCE
handlers.  These are not easy to avoid.

This patch reuses most of the code and makes it generic.  It is useful
for all messages and architectures that support NMI.

The alternative printk_func is set when entering and is reseted when
leaving NMI context.  It queues IRQ work to copy the messages into the
main ring buffer in a safe context.

__printk_nmi_flush() copies all available messages and reset the buffer.
Then we could use a simple cmpxchg operations to get synchronized with
writers.  There is also used a spinlock to get synchronized with other
flushers.

We do not longer use seq_buf because it depends on external lock.  It
would be hard to make all supported operations safe for a lockless use.
It would be confusing and error prone to make only some operations safe.

The code is put into separate printk/nmi.c as suggested by Steven
Rostedt.  It needs a per-CPU buffer and is compiled only on
architectures that call nmi_enter().  This is achieved by the new
HAVE_NMI Kconfig flag.

The are MN10300 and Xtensa architectures.  We need to clean up NMI
handling there first.  Let's do it separately.

The patch is heavily based on the draft from Peter Zijlstra, see

  https://lkml.org/lkml/2015/6/10/327

[arnd@arndb.de: printk-nmi: use %zu format string for size_t]
[akpm@linux-foundation.org: min_t->min - all types are size_t here]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>	[arm part]
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jiri Kosina <jkosina@suse.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Andrey Ryabinin
eae08dcab8 kasan/tests: add tests for user memory access functions
Add some tests for the newly-added user memory access API.

Link: http://lkml.kernel.org/r/1462538722-1574-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Andrey Ryabinin
1771c6e1a5 x86/kasan: instrument user memory access API
Exchange between user and kernel memory is coded in assembly language.
Which means that such accesses won't be spotted by KASAN as a compiler
instruments only C code.

Add explicit KASAN checks to user memory access API to ensure that
userspace writes to (or reads from) a valid kernel memory.

Note: Unlike others strncpy_from_user() is written mostly in C and KASAN
sees memory accesses in it.  However, it makes sense to add explicit
check for all @count bytes that *potentially* could be written to the
kernel.

[aryabinin@virtuozzo.com: move kasan check under the condition]
  Link: http://lkml.kernel.org/r/1462869209-21096-1-git-send-email-aryabinin@virtuozzo.com
Link: http://lkml.kernel.org/r/1462538722-1574-4-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Alexander Potapenko
96fe805fb6 mm, kasan: add a ksize() test
Add a test that makes sure ksize() unpoisons the whole chunk.

Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Konstantin Serebryany <kcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Linus Torvalds
a05a70db34 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - fsnotify fix

 - poll() timeout fix

 - a few scripts/ tweaks

 - debugobjects updates

 - the (small) ocfs2 queue

 - Minor fixes to kernel/padata.c

 - Maybe half of the MM queue

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits)
  mm, page_alloc: restore the original nodemask if the fast path allocation failed
  mm, page_alloc: uninline the bad page part of check_new_page()
  mm, page_alloc: don't duplicate code in free_pcp_prepare
  mm, page_alloc: defer debugging checks of pages allocated from the PCP
  mm, page_alloc: defer debugging checks of freed pages until a PCP drain
  cpuset: use static key better and convert to new API
  mm, page_alloc: inline pageblock lookup in page free fast paths
  mm, page_alloc: remove unnecessary variable from free_pcppages_bulk
  mm, page_alloc: pull out side effects from free_pages_check
  mm, page_alloc: un-inline the bad part of free_pages_check
  mm, page_alloc: check multiple page fields with a single branch
  mm, page_alloc: remove field from alloc_context
  mm, page_alloc: avoid looking up the first zone in a zonelist twice
  mm, page_alloc: shortcut watermark checks for order-0 pages
  mm, page_alloc: reduce cost of fair zone allocation policy retry
  mm, page_alloc: shorten the page allocator fast path
  mm, page_alloc: check once if a zone has isolated pageblocks
  mm, page_alloc: move __GFP_HARDWALL modifications out of the fastpath
  mm, page_alloc: simplify last cpupid reset
  mm, page_alloc: remove unnecessary initialisation from __alloc_pages_nodemask()
  ...
2016-05-19 20:00:06 -07:00
Andrew Morton
0edaf86cf1 include/linux/nodemask.h: create next_node_in() helper
Lots of code does

	node = next_node(node, XXX);
	if (node == MAX_NUMNODES)
		node = first_node(XXX);

so create next_node_in() to do this and use it in various places.

[mhocko@suse.com: use next_node_in() helper]
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Hui Zhu <zhuhui@xiaomi.com>
Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Du, Changbin
b9fdac7f66 debugobjects: insulate non-fixup logic related to static obj from fixup callbacks
When activating a static object we need make sure that the object is
tracked in the object tracker.  If it is a non-static object then the
activation is illegal.

In previous implementation, each subsystem need take care of this in
their fixup callbacks.  Actually we can put it into debugobjects core.
Thus we can save duplicated code, and have *pure* fixup callbacks.

To achieve this, a new callback "is_static_object" is introduced to let
the type specific code decide whether a object is static or not.  If
yes, we take it into object tracker, otherwise give warning and invoke
fixup callback.

This change has paassed debugobjects selftest, and I also do some test
with all debugobjects supports enabled.

At last, I have a concern about the fixups that can it change the object
which is in incorrect state on fixup? Because the 'addr' may not point
to any valid object if a non-static object is not tracked.  Then Change
such object can overwrite someone's memory and cause unexpected
behaviour.  For example, the timer_fixup_activate bind timer to function
stub_timer.

Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com
[changbin.du@intel.com: improve code comments where invoke the new is_static_object callback]
  Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Du, Changbin
d99b1d8912 percpu_counter: update debugobjects fixup callbacks return type
Update the return type to use bool instead of int, corresponding to
cheange (debugobjects: make fixup functions return bool instead of int).

Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Du, Changbin
e7a8e78bd4 debugobjects: correct the usage of fixup call results
If debug_object_fixup() return non-zero when problem has been fixed.
But the code got it backwards, it taks 0 as fixup successfully.  So fix
it.

Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Du, Changbin
b1e4d9d82d debugobjects: make fixup functions return bool instead of int
I am going to introduce debugobjects infrastructure to USB subsystem.
But before this, I found the code of debugobjects could be improved.
This patchset will make fixup functions return bool type instead of int.
Because fixup only need report success or no.  boolean is the 'real'
type.

This patch (of 7):

The object debugging infrastructure core provides some fixup callbacks
for the subsystem who use it.  These callbacks are called from the debug
code whenever a problem in debug_object_init is detected.  And
debugobjects core suppose them returns 1 when the fixup was successful,
otherwise 0.  So the return type is boolean.

A bad thing is that debug_object_fixup use the return value for
arithmetic operation.  It confused me that what is the reall return
type.

Reading over the whole code, I found some place do use the return value
incorrectly(see next patch).  So why use bool type instead?

Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Linus Torvalds
f4f27d0028 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - A new LSM, "LoadPin", from Kees Cook is added, which allows forcing
     of modules and firmware to be loaded from a specific device (this
     is from ChromeOS, where the device as a whole is verified
     cryptographically via dm-verity).

     This is disabled by default but can be configured to be enabled by
     default (don't do this if you don't know what you're doing).

   - Keys: allow authentication data to be stored in an asymmetric key.
     Lots of general fixes and updates.

   - SELinux: add restrictions for loading of kernel modules via
     finit_module().  Distinguish non-init user namespace capability
     checks.  Apply execstack check on thread stacks"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (48 commits)
  LSM: LoadPin: provide enablement CONFIG
  Yama: use atomic allocations when reporting
  seccomp: Fix comment typo
  ima: add support for creating files using the mknodat syscall
  ima: fix ima_inode_post_setattr
  vfs: forbid write access when reading a file into memory
  fs: fix over-zealous use of "const"
  selinux: apply execstack check on thread stacks
  selinux: distinguish non-init user namespace capability checks
  LSM: LoadPin for kernel file loading restrictions
  fs: define a string representation of the kernel_read_file_id enumeration
  Yama: consolidate error reporting
  string_helpers: add kstrdup_quotable_file
  string_helpers: add kstrdup_quotable_cmdline
  string_helpers: add kstrdup_quotable
  selinux: check ss_initialized before revalidating an inode label
  selinux: delay inode label lookup as long as possible
  selinux: don't revalidate an inode's label when explicitly setting it
  selinux: Change bool variable name to index.
  KEYS: Add KEYCTL_DH_COMPUTE command
  ...
2016-05-19 09:21:36 -07:00
Linus Torvalds
675e0655c1 SCSI misc on 20160517
This patch includes the usual quota of driver updates (bnx2fc, mp3sas,
 hpsa, ncr5380, lpfc, hisi_sas, snic, aacraid, megaraid_sas) there's
 also a multiqueue update for scsi_debug, assorted bug fixes and a few
 other minor updates (refactor of scsi_sg_pools into generic code, alua
 and VPD updates, and struct timeval conversions).
 
 Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJXO8W0AAoJEDeqqVYsXL0MW24H/jGWwfjsDUiSsLwbLca6DWu8
 ZCWZ7rSZ27CApwGPgZGpLvUg+vpW8Ykm2zdeBnlZ6ScXS+dT3uo/PHsnemsTextj
 6glQNIOFY0Ja2GwkkN00M6IZQhTJ628cqJKIEJxC68lIw16wiOwjZaK68GMrusDO
 Sl062rkuLR6Jb2T+YoT/sD8jQfWlSj2V9e9rqJoS/rIbS6B+hUipuybz2yQ2yK2u
 XFc30yal9oVz1fHEoh2O8aqckW3/iskukVXVuZ0MQzT/lV/bm9I6AnWVHw7d0Yhp
 ZELjXpjx5M2Z/d8k0Wvx1e25oL/ERwa96yLnTvRcqyF5Yt1EgAhT+jKvo4pnGr8=
 =L6y/
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "First round of SCSI updates for the 4.6+ merge window.

  This batch includes the usual quota of driver updates (bnx2fc, mp3sas,
  hpsa, ncr5380, lpfc, hisi_sas, snic, aacraid, megaraid_sas).  There's
  also a multiqueue update for scsi_debug, assorted bug fixes and a few
  other minor updates (refactor of scsi_sg_pools into generic code, alua
  and VPD updates, and struct timeval conversions)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (138 commits)
  mpt3sas: Used "synchronize_irq()"API to synchronize timed-out IO & TMs
  mpt3sas: Set maximum transfer length per IO to 4MB for VDs
  mpt3sas: Updating mpt3sas driver version to 13.100.00.00
  mpt3sas: Fix initial Reference tag field for 4K PI drives.
  mpt3sas: Handle active cable exception event
  mpt3sas: Update MPI header to 2.00.42
  Revert "lpfc: Delete unnecessary checks before the function call mempool_destroy"
  eata_pio: missing break statement
  hpsa: Fix type ZBC conditional checks
  scsi_lib: Decode T10 vendor IDs
  scsi_dh_alua: do not fail for unknown VPD identification
  scsi_debug: use locally assigned naa
  scsi_debug: uuid for lu name
  scsi_debug: vpd and mode page work
  scsi_debug: add multiple queue support
  bfa: fix bfa_fcb_itnim_alloc() error handling
  megaraid_sas: Downgrade two success messages to info
  cxlflash: Fix to resolve dead-lock during EEH recovery
  scsi_debug: rework resp_report_luns
  scsi_debug: use pdt constants
  ...
2016-05-18 16:38:59 -07:00
Linus Torvalds
69370471d0 Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull iov_iter cleanups from Al Viro.

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fold checks into iterate_and_advance()
  rw_verify_area(): saner calling conventions
  aio: remove a pointless assignment
2016-05-18 11:46:23 -07:00
James Bottomley
e7ca7f9fa2 Merge branch 'fixes' into misc 2016-05-17 21:12:50 -04:00
Linus Torvalds
a7fd20d1c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support SPI based w5100 devices, from Akinobu Mita.

   2) Partial Segmentation Offload, from Alexander Duyck.

   3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

   4) Allow cls_flower stats offload, from Amir Vadai.

   5) Implement bpf blinding, from Daniel Borkmann.

   6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
      actually using FASYNC these atomics are superfluous.  From Eric
      Dumazet.

   7) Run TCP more preemptibly, also from Eric Dumazet.

   8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
      driver, from Gal Pressman.

   9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

  10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

  11) Support tunneling offloads in qed, from Manish Chopra.

  12) aRFS offloading in mlx5e, from Maor Gottlieb.

  13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
      Leitner.

  14) Add MSG_EOR support to TCP, this allows controlling packet
      coalescing on application record boundaries for more accurate
      socket timestamp sampling.  From Martin KaFai Lau.

  15) Fix alignment of 64-bit netlink attributes across the board, from
      Nicolas Dichtel.

  16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

  17) Several conversions of drivers to ethtool ksettings, from Philippe
      Reynes.

  18) Checksum neutral ILA in ipv6, from Tom Herbert.

  19) Factorize all of the various marvell dsa drivers into one, from
      Vivien Didelot

  20) Add VF support to qed driver, from Yuval Mintz"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
  Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
  Revert "phy dp83867: Make rgmii parameters optional"
  r8169: default to 64-bit DMA on recent PCIe chips
  phy dp83867: Make rgmii parameters optional
  phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
  bpf: arm64: remove callee-save registers use for tmp registers
  asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
  switchdev: pass pointer to fib_info instead of copy
  net_sched: close another race condition in tcf_mirred_release()
  tipc: fix nametable publication field in nl compat
  drivers: net: Don't print unpopulated net_device name
  qed: add support for dcbx.
  ravb: Add missing free_irq() calls to ravb_close()
  qed: Remove a stray tab
  net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fec-mpc52xx: use phydev from struct net_device
  bpf, doc: fix typo on bpf_asm descriptions
  stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
  net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fs-enet: use phydev from struct net_device
  ...
2016-05-17 16:26:30 -07:00
Linus Torvalds
9a07a79684 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "API:

   - Crypto self tests can now be disabled at boot/run time.
   - Add async support to algif_aead.

  Algorithms:

   - A large number of fixes to MPI from Nicolai Stange.
   - Performance improvement for HMAC DRBG.

  Drivers:

   - Use generic crypto engine in omap-des.
   - Merge ppc4xx-rng and crypto4xx drivers.
   - Fix lockups in sun4i-ss driver by disabling IRQs.
   - Add DMA engine support to ccp.
   - Reenable talitos hash algorithms.
   - Add support for Hisilicon SoC RNG.
   - Add basic crypto driver for the MXC SCC.

  Others:

   - Do not allocate crypto hash tfm in NORECLAIM context in ecryptfs"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits)
  crypto: qat - change the adf_ctl_stop_devices to void
  crypto: caam - fix caam_jr_alloc() ret code
  crypto: vmx - comply with ABIs that specify vrsave as reserved.
  crypto: testmgr - Add a flag allowing the self-tests to be disabled at runtime.
  crypto: ccp - constify ccp_actions structure
  crypto: marvell/cesa - Use dma_pool_zalloc
  crypto: qat - make adf_vf_isr.c dependant on IOV config
  crypto: qat - Fix typo in comments
  lib: asn1_decoder - add MODULE_LICENSE("GPL")
  crypto: omap-sham - Use dma_request_chan() for requesting DMA channel
  crypto: omap-des - Use dma_request_chan() for requesting DMA channel
  crypto: omap-aes - Use dma_request_chan() for requesting DMA channel
  crypto: omap-des - Integrate with the crypto engine framework
  crypto: s5p-sss - fix incorrect usage of scatterlists api
  crypto: s5p-sss - Fix missed interrupts when working with 8 kB blocks
  crypto: s5p-sss - Use common BIT macro
  crypto: mxc-scc - fix unwinding in mxc_scc_crypto_register()
  crypto: mxc-scc - signedness bugs in mxc_scc_ablkcipher_req_init()
  crypto: talitos - fix ahash algorithms registration
  crypto: ccp - Ensure all dependencies are specified
  ...
2016-05-17 09:33:39 -07:00
Linus Torvalds
a3871bd434 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes are:

   - Documentation updates, including fixes to the design-level
     requirements documentation and a fixed version of the design-level
     data-structure documentation.  These fixes include removing
     cartoons and getting rid of the html/htmlx duplication.

   - Further improvements to the new-age expedited grace periods.

   - Miscellaneous fixes.

   - Torture-test changes, including a new rcuperf module for measuring
     RCU grace-period performance and scalability, which is useful for
     the expedited-grace-period changes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
  rcutorture: Add boot-time adjustment of leaf fanout
  rcutorture: Add irqs-disabled test for call_rcu()
  rcutorture: Dump trace buffer upon shutdown
  rcutorture: Don't rebuild identical kernel
  rcutorture: Add OS-jitter capability
  documentation: Add documentation for RCU's major data structures
  rcutorture: Convert test duration to seconds early
  torture: Kill qemu, not parent process
  torture: Clarify refusal to run more than one torture test
  rcutorture: Consider FROZEN hotplug notifier transitions
  rcutorture: Remove redundant initialization to zero
  rcuperf: Do not wake up shutdown wait queue if "shutdown" is false.
  rcutorture: Add largish-system rcuperf scenario
  rcutorture: Avoid RCU CPU stall warning and RT throttling
  rcutorture: Add rcuperf holdoff boot parameter to reduce interference
  rcutorture: Make scripts analyze rcuperf trace data, if present
  rcutorture: Make rcuperf collect expedited event-trace data
  rcutorture: Print measure of batching efficiency
  rcutorture: Set rcuperf writer kthreads to real-time priority
  rcutorture: Bind rcuperf reader/writer kthreads to CPUs
  ...
2016-05-16 12:02:08 -07:00
Linus Torvalds
0052af4411 Merge branch 'core-lib-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core/lib update from Ingo Molnar:
 "This contains a single commit that removes an unused facility that the
  scheduler used to make use of"

* 'core-lib-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/proportions: Remove unused code
2016-05-16 11:36:02 -07:00
Daniel Borkmann
d1c55ab5e4 bpf: prepare bpf_int_jit_compile/bpf_prog_select_runtime apis
Since the blinding is strictly only called from inside eBPF JITs,
we need to change signatures for bpf_int_jit_compile() and
bpf_prog_select_runtime() first in order to prepare that the
eBPF program we're dealing with can change underneath. Hence,
for call sites, we need to return the latest prog. No functional
change in this patch.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:49:32 -04:00
David S. Miller
909b27f706 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next'
because we no longer have a per-netns conntrack hash.

The ip_gre.c conflict as well as the iwlwifi ones were cases of
overlapping changes.

Conflicts:
	drivers/net/wireless/intel/iwlwifi/mvm/tx.c
	net/ipv4/ip_gre.c
	net/netfilter/nf_conntrack_core.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-15 13:32:48 -04:00
Linus Torvalds
6ba5b85fd4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Overlayfs fixes from Miklos, assorted fixes from me.

  Stable fodder of varying severity, all sat in -next for a while"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ovl: ignore permissions on underlying lookup
  vfs: add lookup_hash() helper
  vfs: rename: check backing inode being equal
  vfs: add vfs_select_inode() helper
  get_rock_ridge_filename(): handle malformed NM entries
  ecryptfs: fix handling of directory opening
  atomic_open(): fix the handling of create_error
  fix the copy vs. map logics in blk_rq_map_user_iov()
  do_splice_to(): cap the size before passing to ->splice_read()
2016-05-14 11:59:43 -07:00
David Howells
23c8a812dc KEYS: Fix ASN.1 indefinite length object parsing
This fixes CVE-2016-0758.

In the ASN.1 decoder, when the length field of an ASN.1 value is extracted,
it isn't validated against the remaining amount of data before being added
to the cursor.  With a sufficiently large size indicated, the check:

	datalen - dp < 2

may then fail due to integer overflow.

Fix this by checking the length indicated against the amount of remaining
data in both places a definite length is determined.

Whilst we're at it, make the following changes:

 (1) Check the maximum size of extended length does not exceed the capacity
     of the variable it's being stored in (len) rather than the type that
     variable is assumed to be (size_t).

 (2) Compare the EOC tag to the symbolic constant ASN1_EOC rather than the
     integer 0.

 (3) To reduce confusion, move the initialisation of len outside of:

	for (len = 0; n > 0; n--) {

     since it doesn't have anything to do with the loop counter n.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Peter Jones <pjones@redhat.com>
2016-05-12 12:01:49 +01:00
Al Viro
e4d35be584 Merge branch 'ovl-fixes' into for-linus 2016-05-11 00:00:29 -04:00
David S. Miller
e800072c18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
In netdevice.h we removed the structure in net-next that is being
changes in 'net'.  In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.

The mlx5 conflicts have to do with vxlan support dependencies.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 15:59:24 -04:00
Al Viro
dd254f5a38 fold checks into iterate_and_advance()
they are open-coded in all users except iov_iter_advance(), and there
they wouldn't be a bad idea either - as it is, iov_iter_advance(i, 0)
ends up dereferencing potentially past the end of iovec array.  It
doesn't do anything with the value it reads, and very unlikely to
trigger an oops on dereference, but it is not impossible.

Reported-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 14:04:29 -04:00
Joonsoo Kim
7c31190bcf lib/stackdepot: avoid to return 0 handle
Recently, we allow to save the stacktrace whose hashed value is 0.  It
causes the problem that stackdepot could return 0 even if in success.
User of stackdepot cannot distinguish whether it is success or not so we
need to solve this problem.  In this patch, 1 bit are added to handle
and make valid handle none 0 by setting this bit.  After that, valid
handle will not be 0 and 0 handle will represent failure correctly.

Fixes: 33334e2576 ("lib/stackdepot.c: allow the stack trace hash to be zero")
Link: http://lkml.kernel.org/r/1462252403-1106-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 17:38:53 -07:00
David S. Miller
cba6532100 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv4/ip_gre.c

Minor conflicts between tunnel bug fixes in net and
ipv6 tunnel cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 00:52:29 -04:00
Tudor Ambarus
ccab6058da lib: asn1_decoder - add MODULE_LICENSE("GPL")
A kernel taint results when loading the rsa_generic module:

root@(none):~# modprobe rsa_generic
asn1_decoder: module license 'unspecified' taints kernel.
Disabling lock debugging due to kernel taint

"Tainting" of the kernel is (usually) a way of indicating that
a proprietary module has been inserted, which is not the case here.

Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-05-03 16:10:12 +08:00
Alexander Potapenko
33334e2576 lib/stackdepot.c: allow the stack trace hash to be zero
Do not bail out from depot_save_stack() if the stack trace has zero hash.
Initially depot_save_stack() silently dropped stack traces with zero
hashes, however there's actually no point in reserving this zero value.

Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-28 19:34:04 -07:00
Ingo Molnar
41ed943d85 Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

 * Documentation updates, including fixes to the design-level
   requirements documentation and a fixed version of the design-level
   data-structure documentation.  These fixes include removing
   cartoons and getting rid of the html/htmlx duplication.

 * Further improvements to the new-age expedited grace periods.

 * Miscellaneous fixes.

 * Torture-test changes, including a new rcuperf module for measuring
   RCU grace-period performance and scalability, which is useful for
   the expedited-grace-period changes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-27 16:57:36 +02:00
Nicolas Dichtel
11a9957307 libnl: fix help of _64bit functions
Fix typo and describe 'padattr'.

Fixes: 089bf1a6a9 ("libnl: add more helpers to align attributes on 64-bit")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23 20:13:24 -04:00
David S. Miller
1602f49b58 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were two cases of simple overlapping changes,
nothing serious.

In the UDP case, we need to add a hlist_add_tail_rcu()
to linux/rculist.h, because we've moved UDP socket handling
away from using nulls lists.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23 18:51:33 -04:00
Nicolas Dichtel
089bf1a6a9 libnl: add more helpers to align attributes on 64-bit
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 14:22:12 -04:00
Kees Cook
21985319ad string_helpers: add kstrdup_quotable_file
Allocate a NULL-terminated file path with special characters escaped,
safe for logging.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-21 10:47:26 +10:00
Kees Cook
0d0443288f string_helpers: add kstrdup_quotable_cmdline
Provide an escaped (but readable: no inter-argument NULLs) commandline
safe for logging.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-21 10:47:26 +10:00
Kees Cook
b53f27e4fa string_helpers: add kstrdup_quotable
Handle allocating and escaping a string safe for logging.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-21 10:47:25 +10:00
Greg Kroah-Hartman
5614e77258 Merge 4.6-rc4 into driver-core-next
We want those fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-19 04:28:28 +09:00
Linus Torvalds
e1e22b27ec "driver core" fixes for 4.6-rc4
Here are 3 small fixes 4.6-rc4.  Two fix up some lz4 issues with big
 endian systems, and the remaining one resolves a minor debugfs issue
 that was reported.
 
 All have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlcS3ucACgkQMUfUDdst+ynw1QCbBGgbY7Xt08whNFcAP81z1Q5X
 fmsAn1fyrhfsxe+JnybzswsTOjFw99Xd
 =YX6h
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull misc fixes from Greg KH:
 "Here are three small fixes for 4.6-rc4.

  Two fix up some lz4 issues with big endian systems, and the remaining
  one resolves a minor debugfs issue that was reported.

  All have been in linux-next with no reported issues"

* tag 'driver-core-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  lib: lz4: cleanup unaligned access efficiency detection
  lib: lz4: fixed zram with lz4 on big endian machines
  debugfs: Make automount point inodes permanently empty
2016-04-16 20:53:50 -07:00
Ming Lin
9b1d6c8950 lib: scatterlist: move SG pool code from SCSI driver to lib/sg_pool.c
Now it's ready to move the mempool based SG chained allocator code from
SCSI driver to lib/sg_pool.c, which will be compiled only based on a Kconfig
symbol CONFIG_SG_POOL.

SCSI selects CONFIG_SG_POOL.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-15 16:53:14 -04:00
Rui Salvaterra
dea5c24a14 lib: lz4: cleanup unaligned access efficiency detection
These identifiers are bogus. The interested architectures should define
HAVE_EFFICIENT_UNALIGNED_ACCESS whenever relevant to do so. If this
isn't true for some arch, it should be fixed in the arch definition.

Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-13 09:22:49 -07:00
Rui Salvaterra
3e26a691fe lib: lz4: fixed zram with lz4 on big endian machines
Based on Sergey's test patch [1], this fixes zram with lz4 compression
on big endian cpus.

Note that the 64-bit preprocessor test is not a cleanup, it's part of
the fix, since those identifiers are bogus (for example, __ppc64__
isn't defined anywhere else in the kernel, which means we'd fall into
the 32-bit definitions on ppc64).

Tested on ppc64 with no regression on x86_64.

[1] http://marc.info/?l=linux-kernel&m=145994470805853&w=4

Cc: stable@vger.kernel.org
Suggested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-13 09:22:49 -07:00
James Morris
58976eef9d Merge tag 'keys-fixes-20160412' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into for-linus 2016-04-13 11:06:52 +10:00
Nicolai Stange
9fd4dcece4 debugfs: prevent access to possibly dead file_operations at file open
Nothing prevents a dentry found by path lookup before a return of
__debugfs_remove() to actually get opened after that return. Now, after
the return of __debugfs_remove(), there are no guarantees whatsoever
regarding the memory the corresponding inode's file_operations object
had been kept in.

Since __debugfs_remove() is seldomly invoked, usually from module exit
handlers only, the race is hard to trigger and the impact is very low.

A discussion of the problem outlined above as well as a suggested
solution can be found in the (sub-)thread rooted at

  http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk
  ("Yet another pipe related oops.")

Basically, Greg KH suggests to introduce an intermediate fops and
Al Viro points out that a pointer to the original ones may be stored in
->d_fsdata.

Follow this line of reasoning:
- Add SRCU as a reverse dependency of DEBUG_FS.
- Introduce a srcu_struct object for the debugfs subsystem.
- In debugfs_create_file(), store a pointer to the original
  file_operations object in ->d_fsdata.
- Make debugfs_remove() and debugfs_remove_recursive() wait for a
  SRCU grace period after the dentry has been delete()'d and before they
  return to their callers.
- Introduce an intermediate file_operations object named
  "debugfs_open_proxy_file_operations". It's ->open() functions checks,
  under the protection of a SRCU read lock, whether the dentry is still
  alive, i.e. has not been d_delete()'d and if so, tries to acquire a
  reference on the owning module.
  On success, it sets the file object's ->f_op to the original
  file_operations and forwards the ongoing open() call to the original
  ->open().
- For clarity, rename the former debugfs_file_operations to
  debugfs_noop_file_operations -- they are in no way canonical.

The choice of SRCU over "normal" RCU is justified by the fact, that the
former may also be used to protect ->i_private data from going away
during the execution of a file's readers and writers which may (and do)
sleep.

Finally, introduce the fs/debugfs/internal.h header containing some
declarations internal to the debugfs implementation.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12 14:14:21 -07:00
David S. Miller
ae95d71261 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-04-09 17:41:41 -04:00
Al Viro
357f435d8a fix the copy vs. map logics in blk_rq_map_user_iov()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-08 19:46:28 -04:00
Naveen N. Rao
9c94f6c8e0 lib/test_bpf: Add additional BPF_ADD tests
Some of these tests proved useful with the powerpc eBPF JIT port due to
sign-extended 16-bit immediate loads. Though some of these aspects get
covered in other tests, it is better to have explicit tests so as to
quickly tag the precise problem.

Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 16:47:51 -04:00
Naveen N. Rao
b64b50eac4 lib/test_bpf: Add test to check for result of 32-bit add that overflows
BPF_ALU32 and BPF_ALU64 tests for adding two 32-bit values that results in
32-bit overflow.

Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 16:47:51 -04:00
Naveen N. Rao
c7395d6bd7 lib/test_bpf: Add tests for unsigned BPF_JGT
Unsigned Jump-if-Greater-Than.

Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 16:47:51 -04:00
Naveen N. Rao
9f134c34fb lib/test_bpf: Fix JMP_JSET tests
JMP_JSET tests incorrectly used BPF_JNE. Fix the same.

Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 16:47:50 -04:00
Jerome Marchand
8d4a2ec1e0 assoc_array: don't call compare_object() on a node
Changes since V1: fixed the description and added KASan warning.

In assoc_array_insert_into_terminal_node(), we call the
compare_object() method on all non-empty slots, even when they're
not leaves, passing a pointer to an unexpected structure to
compare_object(). Currently it causes an out-of-bound read access
in keyring_compare_object detected by KASan (see below). The issue
is easily reproduced with keyutils testsuite.
Only call compare_object() when the slot is a leave.

KASan warning:
==================================================================
BUG: KASAN: slab-out-of-bounds in keyring_compare_object+0x213/0x240 at addr ffff880060a6f838
Read of size 8 by task keyctl/1655
=============================================================================
BUG kmalloc-192 (Not tainted): kasan: bad access detected
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in assoc_array_insert+0xfd0/0x3a60 age=69 cpu=1 pid=1647
	___slab_alloc+0x563/0x5c0
	__slab_alloc+0x51/0x90
	kmem_cache_alloc_trace+0x263/0x300
	assoc_array_insert+0xfd0/0x3a60
	__key_link_begin+0xfc/0x270
	key_create_or_update+0x459/0xaf0
	SyS_add_key+0x1ba/0x350
	entry_SYSCALL_64_fastpath+0x12/0x76
INFO: Slab 0xffffea0001829b80 objects=16 used=8 fp=0xffff880060a6f550 flags=0x3fff8000004080
INFO: Object 0xffff880060a6f740 @offset=5952 fp=0xffff880060a6e5d1

Bytes b4 ffff880060a6f730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f740: d1 e5 a6 60 00 88 ff ff 0e 00 00 00 00 00 00 00  ...`............
Object ffff880060a6f750: 02 cf 8e 60 00 88 ff ff 02 c0 8e 60 00 88 ff ff  ...`.......`....
Object ffff880060a6f760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7d0: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
CPU: 0 PID: 1655 Comm: keyctl Tainted: G    B           4.5.0-rc4-kasan+ #291
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 0000000000000000 000000001b2800b4 ffff880060a179e0 ffffffff81b60491
 ffff88006c802900 ffff880060a6f740 ffff880060a17a10 ffffffff815e2969
 ffff88006c802900 ffffea0001829b80 ffff880060a6f740 ffff880060a6e650
Call Trace:
 [<ffffffff81b60491>] dump_stack+0x85/0xc4
 [<ffffffff815e2969>] print_trailer+0xf9/0x150
 [<ffffffff815e9454>] object_err+0x34/0x40
 [<ffffffff815ebe50>] kasan_report_error+0x230/0x550
 [<ffffffff819949be>] ? keyring_get_key_chunk+0x13e/0x210
 [<ffffffff815ec62d>] __asan_report_load_n_noabort+0x5d/0x70
 [<ffffffff81994cc3>] ? keyring_compare_object+0x213/0x240
 [<ffffffff81994cc3>] keyring_compare_object+0x213/0x240
 [<ffffffff81bc238c>] assoc_array_insert+0x86c/0x3a60
 [<ffffffff81bc1b20>] ? assoc_array_cancel_edit+0x70/0x70
 [<ffffffff8199797d>] ? __key_link_begin+0x20d/0x270
 [<ffffffff8199786c>] __key_link_begin+0xfc/0x270
 [<ffffffff81993389>] key_create_or_update+0x459/0xaf0
 [<ffffffff8128ce0d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81992f30>] ? key_type_lookup+0xc0/0xc0
 [<ffffffff8199e19d>] ? lookup_user_key+0x13d/0xcd0
 [<ffffffff81534763>] ? memdup_user+0x53/0x80
 [<ffffffff819983ea>] SyS_add_key+0x1ba/0x350
 [<ffffffff81998230>] ? key_get_type_from_user.constprop.6+0xa0/0xa0
 [<ffffffff828bcf4e>] ? retint_user+0x18/0x23
 [<ffffffff8128cc7e>] ? trace_hardirqs_on_caller+0x3fe/0x580
 [<ffffffff81004017>] ? trace_hardirqs_on_thunk+0x17/0x19
 [<ffffffff828bc432>] entry_SYSCALL_64_fastpath+0x12/0x76
Memory state around the buggy address:
 ffff880060a6f700: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
 ffff880060a6f780: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
>ffff880060a6f800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                        ^
 ffff880060a6f880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880060a6f900: fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00
==================================================================

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
2016-04-06 14:06:48 +01:00
Nicolai Stange
0bb5c9ead6 lib/mpi: mpi_read_raw_from_sgl(): fix out-of-bounds buffer access
Within the copying loop in mpi_read_raw_from_sgl(), the last input SGE's
byte count gets artificially extended as follows:

  if (sg_is_last(sg) && (len % BYTES_PER_MPI_LIMB))
    len += BYTES_PER_MPI_LIMB - (len % BYTES_PER_MPI_LIMB);

Within the following byte copying loop, this causes reads beyond that
SGE's allocated buffer:

  BUG: KASAN: slab-out-of-bounds in mpi_read_raw_from_sgl+0x331/0x650
                                     at addr ffff8801e168d4d8
  Read of size 1 by task systemd-udevd/721
  [...]
  Call Trace:
   [<ffffffff818c4d35>] dump_stack+0xbc/0x117
   [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
   [<ffffffff814af5d1>] ? print_section+0x61/0xb0
   [<ffffffff814b1109>] print_trailer+0x179/0x2c0
   [<ffffffff814bc524>] object_err+0x34/0x40
   [<ffffffff814bfdc7>] kasan_report_error+0x307/0x8c0
   [<ffffffff814bf315>] ? kasan_unpoison_shadow+0x35/0x50
   [<ffffffff814bf38e>] ? kasan_kmalloc+0x5e/0x70
   [<ffffffff814c0ad1>] kasan_report+0x71/0xa0
   [<ffffffff81938171>] ? mpi_read_raw_from_sgl+0x331/0x650
   [<ffffffff814bf1a6>] __asan_load1+0x46/0x50
   [<ffffffff81938171>] mpi_read_raw_from_sgl+0x331/0x650
   [<ffffffff817f41b6>] rsa_verify+0x106/0x260
   [<ffffffff817f40b0>] ? rsa_set_pub_key+0xf0/0xf0
   [<ffffffff818edc79>] ? sg_init_table+0x29/0x50
   [<ffffffff817f4d22>] ? pkcs1pad_sg_set_buf+0xb2/0x2e0
   [<ffffffff817f5b74>] pkcs1pad_verify+0x1f4/0x2b0
   [<ffffffff81831057>] public_key_verify_signature+0x3a7/0x5e0
   [<ffffffff81830cb0>] ? public_key_describe+0x80/0x80
   [<ffffffff817830f0>] ? keyring_search_aux+0x150/0x150
   [<ffffffff818334a4>] ? x509_request_asymmetric_key+0x114/0x370
   [<ffffffff814b83f0>] ? kfree+0x220/0x370
   [<ffffffff818312c2>] public_key_verify_signature_2+0x32/0x50
   [<ffffffff81830b5c>] verify_signature+0x7c/0xb0
   [<ffffffff81835d0c>] pkcs7_validate_trust+0x42c/0x5f0
   [<ffffffff813c391a>] system_verify_data+0xca/0x170
   [<ffffffff813c3850>] ? top_trace_array+0x9b/0x9b
   [<ffffffff81510b29>] ? __vfs_read+0x279/0x3d0
   [<ffffffff8129372f>] mod_verify_sig+0x1ff/0x290
  [...]

The exact purpose of the len extension isn't clear to me, but due to
its form, I suspect that it's a leftover somehow accounting for leading
zero bytes within the most significant output limb.

Note however that without that len adjustement, the total number of bytes
ever processed by the inner loop equals nbytes and thus, the last output
limb gets written at this point. Thus the net effect of the len adjustement
cited above is just to keep the inner loop running for some more
iterations, namely < BYTES_PER_MPI_LIMB ones, reading some extra bytes from
beyond the last SGE's buffer and discarding them afterwards.

Fix this issue by purging the extension of len beyond the last input SGE's
buffer length.

Fixes: 2d4d1eea54 ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:51 +08:00
Nicolai Stange
85d541a3d1 lib/mpi: mpi_read_raw_from_sgl(): sanitize meaning of indices
Within the byte reading loop in mpi_read_raw_sgl(), there are two
housekeeping indices used, z and x.

At all times, the index z represents the number of output bytes covered
by the input SGEs for which processing has completed so far. This includes
any leading zero bytes within the most significant limb.

The index x changes its meaning after the first outer loop's iteration
though: while processing the first input SGE, it represents

  "number of leading zero bytes in most significant output limb" +
   "current position within current SGE"

For the remaining SGEs OTOH, x corresponds just to

  "current position within current SGE"

After all, it is only the sum of z and x that has any meaning for the
output buffer and thus, the

  "number of leading zero bytes in most significant output limb"

part can be moved away from x into z from the beginning, opening up the
opportunity for cleaner code.

Before the outer loop iterating over the SGEs, don't initialize z with
zero, but with the number of leading zero bytes in the most significant
output limb. For the inner loop iterating over a single SGE's bytes,
get rid of the buf_shift offset to x' bounds and let x run from zero to
sg->length - 1.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:51 +08:00
Nicolai Stange
64c09b0b59 lib/mpi: mpi_read_raw_from_sgl(): fix nbits calculation
The number of bits, nbits, is calculated in mpi_read_raw_from_sgl() as
follows:

  nbits = nbytes * 8;

Afterwards, the number of leading zero bits of the first byte get
subtracted:

  nbits -= count_leading_zeros(*(u8 *)(sg_virt(sgl) + lzeros));

However, count_leading_zeros() takes an unsigned long and thus,
the u8 gets promoted to an unsigned long.

Thus, the above doesn't subtract the number of leading zeros in the most
significant nonzero input byte from nbits, but the number of leading
zeros of the most significant nonzero input byte promoted to unsigned long,
i.e. BITS_PER_LONG - 8 too many.

Fix this by subtracting

  count_leading_zeros(...) - (BITS_PER_LONG - 8)

from nbits only.

Fixes: 2d4d1eea54 ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:51 +08:00
Nicolai Stange
60e1b74c22 lib/mpi: mpi_read_raw_from_sgl(): purge redundant clearing of nbits
In mpi_read_raw_from_sgl(), unsigned nbits is calculated as follows:

  nbits = nbytes * 8;

and redundantly cleared later on if nbytes == 0:

  if (nbytes > 0)
    ...
  else
    nbits = 0;

Purge this redundant clearing for the sake of clarity.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:51 +08:00
Nicolai Stange
ab1e912ec1 lib/mpi: mpi_read_raw_from_sgl(): don't include leading zero SGEs in nbytes
At the very beginning of mpi_read_raw_from_sgl(), the leading zeros of
the input scatterlist are counted:

  lzeros = 0;
  for_each_sg(sgl, sg, ents, i) {
    ...
    if (/* sg contains nonzero bytes */)
      break;

    /* sg contains nothing but zeros here */
    ents--;
    lzeros = 0;
  }

Later on, the total number of trailing nonzero bytes is calculated by
subtracting the number of leading zero bytes from the total number of input
bytes:

  nbytes -= lzeros;

However, since lzeros gets reset to zero for each completely zero leading
sg in the loop above, it doesn't include those.

Besides wasting resources by allocating a too large output buffer,
this mistake propagates into the calculation of x, the number of
leading zeros within the most significant output limb:

  x = BYTES_PER_MPI_LIMB - nbytes % BYTES_PER_MPI_LIMB;

What's more, the low order bytes of the output, equal in number to the
extra bytes in nbytes, are left uninitialized.

Fix this by adjusting nbytes for each completely zero leading scatterlist
entry.

Fixes: 2d4d1eea54 ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:50 +08:00
Nicolai Stange
b698538951 lib/mpi: mpi_read_raw_from_sgl(): replace len argument by nbytes
Currently, the nbytes local variable is calculated from the len argument
as follows:

  ... mpi_read_raw_from_sgl(..., unsigned int len)
  {
    unsigned nbytes;
    ...
    if (!ents)
      nbytes = 0;
    else
      nbytes = len - lzeros;
    ...
  }

Given that nbytes is derived from len in a trivial way and that the len
argument is shadowed by a local len variable in several loops, this is just
confusing.

Rename the len argument to nbytes and get rid of the nbytes local variable.
Do the nbytes calculation in place.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:49 +08:00
Nicolai Stange
462696fd0f lib/mpi: mpi_read_buffer(): fix buffer overflow
Currently, mpi_read_buffer() writes full limbs to the output buffer
and moves memory around to purge leading zero limbs afterwards.

However, with

  commit 9cbe21d8f8 ("lib/mpi: only require buffers as big as needed for
                        the integer")

the caller is only required to provide a buffer large enough to hold the
result without the leading zeros.

This might result in a buffer overflow for small MP numbers with leading
zeros.

Fix this by coping the result to its final destination within the output
buffer and not copying the leading zeros at all.

Fixes: 9cbe21d8f8 ("lib/mpi: only require buffers as big as needed for
                      the integer")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:49 +08:00
Nicolai Stange
90f864e200 lib/mpi: mpi_read_buffer(): replace open coded endian conversion
Currently, the endian conversion from CPU order to BE is open coded in
mpi_read_buffer().

Replace this by the centrally provided cpu_to_be*() macros.
Copy from the temporary storage on stack to the destination buffer
by means of memcpy().

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:49 +08:00
Nicolai Stange
f00fa2417b lib/mpi: mpi_read_buffer(): optimize skipping of leading zero limbs
Currently, if the number of leading zeros is greater than fits into a
complete limb, mpi_read_buffer() skips them by iterating over them
limb-wise.

Instead of skipping the high order zero limbs within the loop as shown
above, adjust the copying loop's bounds.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:49 +08:00
Nicolai Stange
d755290689 lib/mpi: mpi_write_sgl(): replace open coded endian conversion
Currently, the endian conversion from CPU order to BE is open coded in
mpi_write_sgl().

Replace this by the centrally provided cpu_to_be*() macros.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:48 +08:00
Nicolai Stange
cece762f6f lib/mpi: mpi_write_sgl(): fix out-of-bounds stack access
Within the copying loop in mpi_write_sgl(), we have

  if (lzeros) {
    mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
    mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
                               + lzeros;
    *limb1 = *limb2;
    ...
  }

where p points past the end of alimb2 which lives on the stack and contains
the current limb in BE order.

The purpose of the above is to shift the non-zero bytes of alimb2 to its
beginning in memory, i.e. to skip its leading zero bytes.

However, limb2 points somewhere into the middle of alimb2 and thus, reading
*limb2 pulls in lzero bytes from somewhere.

Indeed, KASAN splats:

  BUG: KASAN: stack-out-of-bounds in mpi_write_to_sgl+0x4e3/0x6f0
                                      at addr ffff8800cb04f601
  Read of size 8 by task systemd-udevd/391
  page:ffffea00032c13c0 count:0 mapcount:0 mapping:   (null) index:0x0
  flags: 0x3fff8000000000()
  page dumped because: kasan: bad access detected
  CPU: 3 PID: 391 Comm: systemd-udevd Tainted: G  B  L
                                              4.5.0-next-20160316+ #12
  [...]
  Call Trace:
   [<ffffffff8194889e>] dump_stack+0xdc/0x15e
   [<ffffffff819487c2>] ? _atomic_dec_and_lock+0xa2/0xa2
   [<ffffffff814892b5>] ? __dump_page+0x185/0x330
   [<ffffffff8150ffd6>] kasan_report_error+0x5e6/0x8b0
   [<ffffffff814724cd>] ? kzfree+0x2d/0x40
   [<ffffffff819c5bce>] ? mpi_free_limb_space+0xe/0x20
   [<ffffffff819c469e>] ? mpi_powm+0x37e/0x16f0
   [<ffffffff815109f1>] kasan_report+0x71/0xa0
   [<ffffffff819c0353>] ? mpi_write_to_sgl+0x4e3/0x6f0
   [<ffffffff8150ed34>] __asan_load8+0x64/0x70
   [<ffffffff819c0353>] mpi_write_to_sgl+0x4e3/0x6f0
   [<ffffffff819bfe70>] ? mpi_set_buffer+0x620/0x620
   [<ffffffff819c0e6f>] ? mpi_cmp+0xbf/0x180
   [<ffffffff8186e282>] rsa_verify+0x202/0x260

What's more, since lzeros can be anything from 1 to sizeof(mpi_limb_t)-1,
the above will cause unaligned accesses which is bad on non-x86 archs.

Fix the issue, by preparing the starting point p for the upcoming copy
operation instead of shifting the source memory, i.e. alimb2.

Fixes: 2d4d1eea54 ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:48 +08:00
Nicolai Stange
ea122be0b8 lib/mpi: mpi_write_sgl(): purge redundant pointer arithmetic
Within the copying loop in mpi_write_sgl(), we have

  if (lzeros) {
    ...
    p -= lzeros;
    y = lzeros;
  }
  p = p - (sizeof(alimb) - y);

If lzeros == 0, then y == 0, too. Thus, lzeros gets subtracted and added
back again to p.

Purge this redundancy.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:48 +08:00
Nicolai Stange
654842ef53 lib/mpi: mpi_write_sgl(): fix style issue with lzero decrement
Within the copying loop in mpi_write_sgl(), we have

  if (lzeros > 0) {
    ...
    lzeros -= sizeof(alimb);
  }

However, at this point, lzeros < sizeof(alimb) holds. Make this fact
explicit by rewriting the above to

  if (lzeros) {
    ...
    lzeros = 0;
  }

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:46 +08:00
Nicolai Stange
f2d1362ff7 lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs
Currently, if the number of leading zeros is greater than fits into a
complete limb, mpi_write_sgl() skips them by iterating over them limb-wise.

However, it fails to adjust its internal leading zeros tracking variable,
lzeros, accordingly: it does a

  p -= sizeof(alimb);
  continue;

which should really have been a

  lzeros -= sizeof(alimb);
  continue;

Since lzeros never decreases if its initial value >= sizeof(alimb), nothing
gets copied by mpi_write_sgl() in that case.

Instead of skipping the high order zero limbs within the loop as shown
above, fix the issue by adjusting the copying loop's bounds.

Fixes: 2d4d1eea54 ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-04-05 20:35:46 +08:00
Bob Copeland
8f6fd83c6c rhashtable: accept GFP flags in rhashtable_walk_init
In certain cases, the 802.11 mesh pathtable code wants to
iterate over all of the entries in the forwarding table from
the receive path, which is inside an RCU read-side critical
section.  Enable walks inside atomic sections by allowing
GFP_ATOMIC allocations for the walker state.

Change all existing callsites to pass in GFP_KERNEL.

Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
[also adjust gfs2/glock.c and rhashtable tests]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-05 10:56:32 +02:00