Currently on panic, kernel will lower the loglevel and print out pending
printk msg only with console_flush_on_panic().
Add an option for users to configure the "panic_print" to replay all
dmesg in buffer, some of which they may have never seen due to the
loglevel setting, which will help panic debugging .
[feng.tang@intel.com: keep the original console_flush_on_panic() inside panic()]
Link: http://lkml.kernel.org/r/1556199137-14163-1-git-send-email-feng.tang@intel.com
[feng.tang@intel.com: use logbuf lock to protect the console log index]
Link: http://lkml.kernel.org/r/1556269868-22654-1-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1556095872-36838-1-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Aaro Koskinen <aaro.koskinen@nokia.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit 54c7a8916a ("initramfs: free initrd memory if opening
/initrd.image fails"), the kernel has unconditionally attempted to free
the initrd even if it doesn't exist.
In the non-existent case this causes a boot-time splat if
CONFIG_DEBUG_VIRTUAL is enabled due to a call to virt_to_phys() with a
NULL address.
Instead we should check that the initrd actually exists and only attempt
to free it if it does.
Link: http://lkml.kernel.org/r/20190516143125.48948-1-steven.price@arm.com
Fixes: 54c7a8916a ("initramfs: free initrd memory if opening /initrd.image fails")
Signed-off-by: Steven Price <steven.price@arm.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
synchronize_rcu() didn't wait for call_rcu() callbacks, so inode wb
switch may not go to the workqueue after synchronize_rcu(). Thus
previous scheduled switches was not finished even flushing the
workqueue, which will cause a NULL pointer dereferenced followed below.
VFS: Busy inodes after unmount of vdd. Self-destruct in 5 seconds. Have a nice day...
BUG: unable to handle kernel NULL pointer dereference at 0000000000000278
evict+0xb3/0x180
iput+0x1b0/0x230
inode_switch_wbs_work_fn+0x3c0/0x6a0
worker_thread+0x4e/0x490
? process_one_work+0x410/0x410
kthread+0xe6/0x100
ret_from_fork+0x39/0x50
Replace the synchronize_rcu() call with a rcu_barrier() to wait for all
pending callbacks to finish. And inc isw_nr_in_flight after call_rcu()
in inode_switch_wbs() to make more sense.
Link: http://lkml.kernel.org/r/20190429024108.54150-1-jiufei.xue@linux.alibaba.com
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Suggested-by: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
syzbot reported the following error from a tree with a head commit of
baf76f0c58 ("slip: make slhc_free() silently accept an error pointer")
BUG: unable to handle kernel paging request at ffffea0003348000
#PF error: [normal kernel read fault]
PGD 12c3f9067 P4D 12c3f9067 PUD 12c3f8067 PMD 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 28916 Comm: syz-executor.2 Not tainted 5.1.0-rc6+ #89
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:constant_test_bit arch/x86/include/asm/bitops.h:314 [inline]
RIP: 0010:PageCompound include/linux/page-flags.h:186 [inline]
RIP: 0010:isolate_freepages_block+0x1c0/0xd40 mm/compaction.c:579
Code: 01 d8 ff 4d 85 ed 0f 84 ef 07 00 00 e8 29 00 d8 ff 4c 89 e0 83 85 38 ff
ff ff 01 48 c1 e8 03 42 80 3c 38 00 0f 85 31 0a 00 00 <4d> 8b 2c 24 31 ff 49
c1 ed 10 41 83 e5 01 44 89 ee e8 3a 01 d8 ff
RSP: 0018:ffff88802b31eab8 EFLAGS: 00010246
RAX: 1ffffd4000669000 RBX: 00000000000cd200 RCX: ffffc9000a235000
RDX: 000000000001ca5e RSI: ffffffff81988cc7 RDI: 0000000000000001
RBP: ffff88802b31ebd8 R08: ffff88805af700c0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffea0003348000
R13: 0000000000000000 R14: ffff88802b31f030 R15: dffffc0000000000
FS: 00007f61648dc700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffea0003348000 CR3: 0000000037c64000 CR4: 00000000001426e0
Call Trace:
fast_isolate_around mm/compaction.c:1243 [inline]
fast_isolate_freepages mm/compaction.c:1418 [inline]
isolate_freepages mm/compaction.c:1438 [inline]
compaction_alloc+0x1aee/0x22e0 mm/compaction.c:1550
There is no reproducer and it is difficult to hit -- 1 crash every few
days. The issue is very similar to the fix in commit 6b0868c820
("mm/compaction.c: correct zone boundary handling when resetting pageblock
skip hints"). When isolating free pages around a target pageblock, the
boundary handling is off by one and can stray into the next pageblock.
Triggering the syzbot error requires that the end of pageblock is section
or zone aligned, and that the next section is unpopulated.
A more subtle consequence of the bug is that pageblocks were being
improperly used as migration targets which potentially hurts fragmentation
avoidance in the long-term one page at a time.
A debugging patch revealed that it's definitely possible to stray outside
of a pageblock which is not intended. While syzbot cannot be used to
verify this patch, it was confirmed that the debugging warning no longer
triggers with this patch applied. It has also been confirmed that the THP
allocation stress tests are not degraded by this patch.
Link: http://lkml.kernel.org/r/20190510182124.GI18914@techsingularity.net
Fixes: e332f741a8 ("mm, compaction: be selective about what pageblocks to clear skip hints")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: syzbot+d84c80f9fe26a0f7a734@syzkaller.appspotmail.com
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> # v5.1+
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This macro adds some debug code to check that vmap allocations are
happened in ascending order.
By default this option is set to 0 and not active. It requires
recompilation of the kernel to activate it. Set to 1, compile the
kernel.
[urezki@gmail.com: v4]
Link: http://lkml.kernel.org/r/20190406183508.25273-4-urezki@gmail.com
Link: http://lkml.kernel.org/r/20190402162531.10888-4-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This macro adds some debug code to check that the augment tree is
maintained correctly, meaning that every node contains valid
subtree_max_size value.
By default this option is set to 0 and not active. It requires
recompilation of the kernel to activate it. Set to 1, compile the
kernel.
[urezki@gmail.com: v4]
Link: http://lkml.kernel.org/r/20190406183508.25273-3-urezki@gmail.com
Link: http://lkml.kernel.org/r/20190402162531.10888-3-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "improve vmap allocation", v3.
Objective
---------
Please have a look for the description at:
https://lkml.org/lkml/2018/10/19/786
but let me also summarize it a bit here as well.
The current implementation has O(N) complexity. Requests with different
permissive parameters can lead to long allocation time. When i say
"long" i mean milliseconds.
Description
-----------
This approach organizes the KVA memory layout into free areas of the
1-ULONG_MAX range, i.e. an allocation is done over free areas lookups,
instead of finding a hole between two busy blocks. It allows to have
lower number of objects which represent the free space, therefore to have
less fragmented memory allocator. Because free blocks are always as large
as possible.
It uses the augment tree where all free areas are sorted in ascending
order of va->va_start address in pair with linked list that provides
O(1) access to prev/next elements.
Since the tree is augment, we also maintain the "subtree_max_size" of VA
that reflects a maximum available free block in its left or right
sub-tree. Knowing that, we can easily traversal toward the lowest (left
most path) free area.
Allocation: ~O(log(N)) complexity. It is sequential allocation method
therefore tends to maximize locality. The search is done until a first
suitable block is large enough to encompass the requested parameters.
Bigger areas are split.
I copy paste here the description of how the area is split, since i
described it in https://lkml.org/lkml/2018/10/19/786
<snip>
A free block can be split by three different ways. Their names are
FL_FIT_TYPE, LE_FIT_TYPE/RE_FIT_TYPE and NE_FIT_TYPE, i.e. they
correspond to how requested size and alignment fit to a free block.
FL_FIT_TYPE - in this case a free block is just removed from the free
list/tree because it fully fits. Comparing with current design there is
an extra work with rb-tree updating.
LE_FIT_TYPE/RE_FIT_TYPE - left/right edges fit. In this case what we do
is just cutting a free block. It is as fast as a current design. Most of
the vmalloc allocations just end up with this case, because the edge is
always aligned to 1.
NE_FIT_TYPE - Is much less common case. Basically it happens when
requested size and alignment does not fit left nor right edges, i.e. it
is between them. In this case during splitting we have to build a
remaining left free area and place it back to the free list/tree.
Comparing with current design there are two extra steps. First one is we
have to allocate a new vmap_area structure. Second one we have to insert
that remaining free block to the address sorted list/tree.
In order to optimize a first case there is a cache with free_vmap objects.
Instead of allocating from slab we just take an object from the cache and
reuse it.
Second one is pretty optimized. Since we know a start point in the tree
we do not do a search from the top. Instead a traversal begins from a
rb-tree node we split.
<snip>
De-allocation. ~O(log(N)) complexity. An area is not inserted straight
away to the tree/list, instead we identify the spot first, checking if it
can be merged around neighbors. The list provides O(1) access to
prev/next, so it is pretty fast to check it. Summarizing. If merged then
large coalesced areas are created, if not the area is just linked making
more fragments.
There is one more thing that i should mention here. After modification of
VA node, its subtree_max_size is updated if it was/is the biggest area in
its left or right sub-tree. Apart of that it can also be populated back
to upper levels to fix the tree. For more details please have a look at
the __augment_tree_propagate_from() function and the description.
Tests and stressing
-------------------
I use the "test_vmalloc.sh" test driver available under
"tools/testing/selftests/vm/" since 5.1-rc1 kernel. Just trigger "sudo
./test_vmalloc.sh" to find out how to deal with it.
Tested on different platforms including x86_64/i686/ARM64/x86_64_NUMA.
Regarding last one, i do not have any physical access to NUMA system,
therefore i emulated it. The time of stressing is days.
If you run the test driver in "stress mode", you also need the patch that
is in Andrew's tree but not in Linux 5.1-rc1. So, please apply it:
http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/commit/?id=e0cf7749bade6da318e98e934a24d8b62fab512c
After massive testing, i have not identified any problems like memory
leaks, crashes or kernel panics. I find it stable, but more testing would
be good.
Performance analysis
--------------------
I have used two systems to test. One is i5-3320M CPU @ 2.60GHz and
another is HiKey960(arm64) board. i5-3320M runs on 4.20 kernel, whereas
Hikey960 uses 4.15 kernel. I have both system which could run on 5.1-rc1
as well, but the results have not been ready by time i an writing this.
Currently it consist of 8 tests. There are three of them which correspond
to different types of splitting(to compare with default). We have 3
ones(see above). Another 5 do allocations in different conditions.
a) sudo ./test_vmalloc.sh performance
When the test driver is run in "performance" mode, it runs all available
tests pinned to first online CPU with sequential execution test order. We
do it in order to get stable and repeatable results. Take a look at time
difference in "long_busy_list_alloc_test". It is not surprising because
the worst case is O(N).
# i5-3320M
How many cycles all tests took:
CPU0=646919905370(default) cycles vs CPU0=193290498550(patched) cycles
# See detailed table with results here:
ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/i5-3320M_performance_default.txt
ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/i5-3320M_performance_patched.txt
# Hikey960 8x CPUs
How many cycles all tests took:
CPU0=3478683207 cycles vs CPU0=463767978 cycles
# See detailed table with results here:
ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/HiKey960_performance_default.txt
ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/HiKey960_performance_patched.txt
b) time sudo ./test_vmalloc.sh test_repeat_count=1
With this configuration, all tests are run on all available online CPUs.
Before running each CPU shuffles its tests execution order. It gives
random allocation behaviour. So it is rough comparison, but it puts in
the picture for sure.
# i5-3320M
<default> vs <patched>
real 101m22.813s real 0m56.805s
user 0m0.011s user 0m0.015s
sys 0m5.076s sys 0m0.023s
# See detailed table with results here:
ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/i5-3320M_test_repeat_count_1_default.txt
ftp://vps418301.ovh.net/incoming/vmap_test_results_v2/i5-3320M_test_repeat_count_1_patched.txt
# Hikey960 8x CPUs
<default> vs <patched>
real unknown real 4m25.214s
user unknown user 0m0.011s
sys unknown sys 0m0.670s
I did not manage to complete this test on "default Hikey960" kernel
version. After 24 hours it was still running, therefore i had to cancel
it. That is why real/user/sys are "unknown".
This patch (of 3):
Currently an allocation of the new vmap area is done over busy list
iteration(complexity O(n)) until a suitable hole is found between two busy
areas. Therefore each new allocation causes the list being grown. Due to
over fragmented list and different permissive parameters an allocation can
take a long time. For example on embedded devices it is milliseconds.
This patch organizes the KVA memory layout into free areas of the
1-ULONG_MAX range. It uses an augment red-black tree that keeps blocks
sorted by their offsets in pair with linked list keeping the free space in
order of increasing addresses.
Nodes are augmented with the size of the maximum available free block in
its left or right sub-tree. Thus, that allows to take a decision and
traversal toward the block that will fit and will have the lowest start
address, i.e. it is sequential allocation.
Allocation: to allocate a new block a search is done over the tree until a
suitable lowest(left most) block is large enough to encompass: the
requested size, alignment and vstart point. If the block is bigger than
requested size - it is split.
De-allocation: when a busy vmap area is freed it can either be merged or
inserted to the tree. Red-black tree allows efficiently find a spot
whereas a linked list provides a constant-time access to previous and next
blocks to check if merging can be done. In case of merging of
de-allocated memory chunk a large coalesced area is created.
Complexity: ~O(log(N))
[urezki@gmail.com: v3]
Link: http://lkml.kernel.org/r/20190402162531.10888-2-urezki@gmail.com
[urezki@gmail.com: v4]
Link: http://lkml.kernel.org/r/20190406183508.25273-2-urezki@gmail.com
Link: http://lkml.kernel.org/r/20190321190327.11813-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ido Schimmel says:
====================
mlxsw: Two port module fixes
Patch #1 fixes driver initialization failure on old ASICs due to
unsupported register access. This is fixed by first testing if the
register is supported.
Patch #2 fixes reading of certain modules' EEPROM. The problem and
solution are explained in detail in the commit message.
Please consider both patches for stable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Prevent reading unsupported slave address from SFP EEPROM by testing
Diagnostic Monitoring Type byte in EEPROM. Read only page zero of
EEPROM, in case this byte is zero.
If some SFP transceiver does not support Digital Optical Monitoring
(DOM), reading SFP EEPROM slave address 0x51 could return an error.
Availability of DOM support is verified by reading from zero page
Diagnostic Monitoring Type byte describing how diagnostic monitoring is
implemented by transceiver. If bit 6 of this byte is set, it indicates
that digital diagnostic monitoring has been implemented. Otherwise it is
not and transceiver could fail to reply to transaction for slave address
0x51 [1010001X (A2h)], which is used to access measurements page.
Such issue has been observed when reading cable MCP2M00-xxxx,
MCP7F00-xxxx, and few others.
Fixes: 2ea109039c ("mlxsw: spectrum: Add support for access cable info via ethtool")
Fixes: 4400081b63 ("mlxsw: spectrum: Fix EEPROM access in case of SFP/SFP+")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Old Mellanox silicons, like switchx-2, switch-ib do not support reading
QSFP modules temperature through MTMP register. Attempt to access this
register on systems equipped with the this kind of silicon will cause
initialization flow failure.
Test for hardware resource capability is added in order to distinct
between old and new silicon - old silicons do not have such capability.
Fixes: 6a79507cfe ("mlxsw: core: Extend thermal module with per QSFP module thermal zones")
Fixes: 5c42eaa07b ("mlxsw: core: Extend hwmon interface with QSFP module temperature attributes")
Reported-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
perf.data:
Alexey Budankov:
- Streaming compression of perf ring buffer into PERF_RECORD_COMPRESSED
user space records, resulting in ~3-5x perf.data file size reduction
on variety of tested workloads what saves storage space on larger
server systems where perf.data size can easily reach several tens or
even hundreds of GiBs, especially when profiling with DWARF-based
stacks and tracing of context switches.
perf record:
Arnaldo Carvalho de Melo
- Improve -user-regs/intr-regs suggestions to overcome errors.
perf annotate:
Jin Yao:
- Remove hist__account_cycles() from callback, speeding up branch processing
(perf record -b).
perf stat:
- Add a 'percore' event qualifier, e.g.: -e cpu/event=0,umask=0x3,percore=1/,
that sums up the event counts for both hardware threads in a core.
We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
I.e. now its possible to do this per-event, and have it mixed with other
events not aggregated by core.
core libraries:
Donald Yandt:
- Check for errors when doing fgets(/proc/version).
Jiri Olsa:
- Speed up report for perf compiled with linbunwind.
tools headers:
Arnaldo Carvalho de Melo
- Update memcpy_64.S, x86's kvm.h and pt_regs.h.
arm64:
Florian Fainelli:
- Map Brahma-B53 CPUID to cortex-a53 events.
- Add Cortex-A57 and Cortex-A72 events.
csky:
Mao Han:
- Add DWARF register mappings for libdw, allowing --call-graph=dwarf to work
on the C-SKY arch.
x86:
Andi Kleen/Kan Liang:
- Add support for recording and printing XMM registers, available, for
instance, on Icelake.
Kan Liang:
- Add uncore_upi (Intel's "Ultra Path Interconnect" events) JSON support.
UPI replaced the Intel QuickPath Interconnect (QPI) in Xeon Skylake-SP.
Intel PT:
Adrian Hunter
. Fix instructions sampling rate.
. Timestamp fixes.
. Improve exported-sql-viewer GUI, allowing, for instance, to copy'n'paste
the trees, useful for e-mailing.
Documentation:
Thomas Richter:
- Add description for 'perf --debug stderr=1', which redirects stderr to stdout.
libtraceevent:
Tzvetomir Stoyanov:
- Add man pages for the various APIs.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXN8KEwAKCRCyPKLppCJ+
JxbAAPsE4bDD7UyTJgVWVXFyqMkaFURv9eGtmhbH/EhnZ/ADfwEAoUG6C+G2DLre
IMtgTXSlRShQucQbum5Rcp9lcXGL+gY=
=mGtw
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-5.2-20190517' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf.data:
Alexey Budankov:
- Streaming compression of perf ring buffer into PERF_RECORD_COMPRESSED
user space records, resulting in ~3-5x perf.data file size reduction
on variety of tested workloads what saves storage space on larger
server systems where perf.data size can easily reach several tens or
even hundreds of GiBs, especially when profiling with DWARF-based
stacks and tracing of context switches.
perf record:
Arnaldo Carvalho de Melo
- Improve -user-regs/intr-regs suggestions to overcome errors.
perf annotate:
Jin Yao:
- Remove hist__account_cycles() from callback, speeding up branch processing
(perf record -b).
perf stat:
- Add a 'percore' event qualifier, e.g.: -e cpu/event=0,umask=0x3,percore=1/,
that sums up the event counts for both hardware threads in a core.
We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
I.e. now its possible to do this per-event, and have it mixed with other
events not aggregated by core.
core libraries:
Donald Yandt:
- Check for errors when doing fgets(/proc/version).
Jiri Olsa:
- Speed up report for perf compiled with linbunwind.
tools headers:
Arnaldo Carvalho de Melo
- Update memcpy_64.S, x86's kvm.h and pt_regs.h.
arm64:
Florian Fainelli:
- Map Brahma-B53 CPUID to cortex-a53 events.
- Add Cortex-A57 and Cortex-A72 events.
csky:
Mao Han:
- Add DWARF register mappings for libdw, allowing --call-graph=dwarf to work
on the C-SKY arch.
x86:
Andi Kleen/Kan Liang:
- Add support for recording and printing XMM registers, available, for
instance, on Icelake.
Kan Liang:
- Add uncore_upi (Intel's "Ultra Path Interconnect" events) JSON support.
UPI replaced the Intel QuickPath Interconnect (QPI) in Xeon Skylake-SP.
Intel PT:
Adrian Hunter
. Fix instructions sampling rate.
. Timestamp fixes.
. Improve exported-sql-viewer GUI, allowing, for instance, to copy'n'paste
the trees, useful for e-mailing.
Documentation:
Thomas Richter:
- Add description for 'perf --debug stderr=1', which redirects stderr to stdout.
libtraceevent:
Tzvetomir Stoyanov:
- Add man pages for the various APIs.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In the recent build test of linux-next, Stephen saw a build error
caused by a broken .tmp_versions/*.mod file:
https://lkml.org/lkml/2019/5/13/991
drivers/net/phy/asix.ko and drivers/net/usb/asix.ko have the same
basename, and there is a race in generating .tmp_versions/asix.mod
Kbuild has not checked this before, and it suddenly shows up with
obscure error messages when this kind of race occurs.
Non-unique module names cause various sort of problems, but it is
not trivial to catch them by eyes.
Hence, this script.
It checks not only real modules, but also built-in modules (i.e.
controlled by tristate CONFIG option, but currently compiled with =y).
Non-unique names for built-in modules also cause problems because
/sys/modules/ would fall over.
For the latest kernel, I tested "make allmodconfig all" (or more
quickly "make allyesconfig modules"), and it detected the following:
warning: same basename if the following are built as modules:
drivers/regulator/88pm800.ko
drivers/mfd/88pm800.ko
warning: same basename if the following are built as modules:
drivers/gpu/drm/bridge/adv7511/adv7511.ko
drivers/media/i2c/adv7511.ko
warning: same basename if the following are built as modules:
drivers/net/phy/asix.ko
drivers/net/usb/asix.ko
warning: same basename if the following are built as modules:
fs/coda/coda.ko
drivers/media/platform/coda/coda.ko
warning: same basename if the following are built as modules:
drivers/net/phy/realtek.ko
drivers/net/dsa/realtek.ko
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Currently menu blocks start with a pretty header but end with nothing in
the generated config. So next config options stick together with the
options from the menu block.
Let's terminate menu blocks in the generated config with a comment and
a newline if needed. Example:
...
CONFIG_BPF_STREAM_PARSER=y
CONFIG_NET_FLOW_LIMIT=y
#
# Network testing
#
CONFIG_NET_PKTGEN=y
CONFIG_NET_DROP_MONITOR=y
# end of Network testing
# end of Networking options
CONFIG_HAMRADIO=y
...
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The 'addtree' and 'flags' in scripts/Kbuild.include are so compilecated
and ugly.
As I mentioned in [1], Kbuild should stop automatic prefixing of header
search path options.
I fixed up (almost) all Makefiles in the kernel. Now 'addtree' and
'flags' have been removed.
Kbuild still caters to add $(srctree)/$(src) and $(objtree)/$(obj)
to the header search path for O= building, but never touches extra
compiler options from ccflags-y etc.
[1]: https://patchwork.kernel.org/patch/9632347/
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Currently, the Kbuild core manipulates header search paths in a crazy
way [1].
To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
the search paths in the srctree. Some Makefiles are already written in
that way, but not all. The goal of this work is to make the notation
consistent, and finally get rid of the gross hacks.
Having whitespaces after -I does not matter since commit 48f6e3cf5b
("kbuild: do not drop -I without parameter").
[1]: https://patchwork.kernel.org/patch/9632347/
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Currently, the Kbuild core manipulates header search paths in a crazy
way [1].
To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
the search paths in the srctree. Some Makefiles are already written in
that way, but not all. The goal of this work is to make the notation
consistent, and finally get rid of the gross hacks.
Having whitespaces after -I does not matter since commit 48f6e3cf5b
("kbuild: do not drop -I without parameter").
[1]: https://patchwork.kernel.org/patch/9632347/
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
As of Linux 5.1, alpha and s390 are the last architectures that
have defconfig in arch/*/ instead of arch/*/configs/.
$ find arch -name defconfig | sort
arch/alpha/defconfig
arch/arm64/configs/defconfig
arch/csky/configs/defconfig
arch/nds32/configs/defconfig
arch/riscv/configs/defconfig
arch/s390/defconfig
The arch/$(ARCH)/defconfig is the hard-coded default in Kconfig,
and I want to deprecate it after evacuating the remaining defconfig
into the standard location, arch/*/configs/.
Define KBUILD_DEFCONFIG like other architectures, and move defconfig
into the configs/ subdirectory.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Paul Walmsley <paul@pwsan.com>
If the compiler specified by $(CC) is not present, the Kconfig stage
sprinkles 'not found' messages, then succeeds.
$ make CROSS_COMPILE=foo defconfig
/bin/sh: 1: foogcc: not found
/bin/sh: 1: foogcc: not found
*** Default configuration is based on 'x86_64_defconfig'
./scripts/gcc-version.sh: 17: ./scripts/gcc-version.sh: foogcc: not found
./scripts/gcc-version.sh: 18: ./scripts/gcc-version.sh: foogcc: not found
./scripts/gcc-version.sh: 19: ./scripts/gcc-version.sh: foogcc: not found
./scripts/gcc-version.sh: 17: ./scripts/gcc-version.sh: foogcc: not found
./scripts/gcc-version.sh: 18: ./scripts/gcc-version.sh: foogcc: not found
./scripts/gcc-version.sh: 19: ./scripts/gcc-version.sh: foogcc: not found
./scripts/clang-version.sh: 11: ./scripts/clang-version.sh: foogcc: not found
./scripts/gcc-plugin.sh: 11: ./scripts/gcc-plugin.sh: foogcc: not found
init/Kconfig:16:warning: 'GCC_VERSION': number is invalid
#
# configuration written to .config
#
Terminate parsing files immediately if $(CC) or $(LD) is not found.
"make *config" will fail more nicely.
$ make CROSS_COMPILE=foo defconfig
*** Default configuration is based on 'x86_64_defconfig'
scripts/Kconfig.include:34: compiler 'foogcc' not found
make[1]: *** [scripts/kconfig/Makefile;82: defconfig] Error 1
make: *** [Makefile;557: defconfig] Error 2
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
syncconfig is responsible for keeping auto.conf up-to-date, so if it
fails for any reason, the build must be terminated immediately.
However, since commit 9390dff66a ("kbuild: invoke syncconfig if
include/config/auto.conf.cmd is missing"), Kbuild continues running
even after syncconfig fails.
You can confirm this by intentionally making syncconfig error out:
diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
index 08ba146..307b9de 100644
--- a/scripts/kconfig/confdata.c
+++ b/scripts/kconfig/confdata.c
@@ -1023,6 +1023,9 @@ int conf_write_autoconf(int overwrite)
FILE *out, *tristate, *out_h;
int i;
+ if (overwrite)
+ return 1;
+
if (!overwrite && is_present(autoconf_name))
return 0;
Then, syncconfig fails, but Make would not stop:
$ make -s mrproper allyesconfig defconfig
$ make
scripts/kconfig/conf --syncconfig Kconfig
*** Error during sync of the configuration.
make[2]: *** [scripts/kconfig/Makefile;69: syncconfig] Error 1
make[1]: *** [Makefile;557: syncconfig] Error 2
make: *** [include/config/auto.conf.cmd] Deleting file 'include/config/tristate.conf'
make: Failed to remake makefile 'include/config/auto.conf'.
SYSTBL arch/x86/include/generated/asm/syscalls_32.h
SYSHDR arch/x86/include/generated/asm/unistd_32_ia32.h
SYSHDR arch/x86/include/generated/asm/unistd_64_x32.h
SYSTBL arch/x86/include/generated/asm/syscalls_64.h
[ continue running ... ]
The reason is in the behavior of a pattern rule with multi-targets.
%/auto.conf %/auto.conf.cmd %/tristate.conf: $(KCONFIG_CONFIG)
$(Q)$(MAKE) -f $(srctree)/Makefile syncconfig
GNU Make knows this rule is responsible for making all the three files
simultaneously. As far as examined, auto.conf.cmd is the target in
question when this rule is invoked. It is probably because auto.conf.cmd
is included below the inclusion of auto.conf.
The inclusion of auto.conf is mandatory, while that of auto.conf.cmd
is optional. GNU Make does not care about the failure in the process
of updating optional include files.
I filed this issue (https://savannah.gnu.org/bugs/?56301) in case this
behavior could be improved somehow in future releases of GNU Make.
Anyway, it is quite easy to fix our Makefile.
Given that auto.conf is already a mandatory include file, there is no
reason to stick auto.conf.cmd optional. Make it mandatory as well.
Cc: linux-stable <stable@vger.kernel.org> # 5.0+
Fixes: 9390dff66a ("kbuild: invoke syncconfig if include/config/auto.conf.cmd is missing")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Also, sort the patterns alphabetically. Update the comment since
we have non-git files here.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
We do not support old Clang versions. Upgrade your clang version
if any of these flags is unsupported.
Let's add all flags inside ifdef CONFIG_CC_IS_CLANG unconditionally.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
This is no longer a valid option in clang, it was removed in 3.5, which
we don't support.
cb3f812b6b
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
These flags are documented in the GCC 4.6 manual, and recognized by
Clang as well. Let's rip off the cc-option / cc-disable-warning switches.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
This flag is documented in the GCC 4.6 manual, and recognized by
Clang as well. Let's rip off the cc-option switch.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
These generic-y defines do not have the corresponding generic header
in include/asm-generic/, so they are definitely invalid.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Do not descend to sub-directories when unneeded.
I used subdir-$(CONFIG_...) for hidraw, seccomp, and vfs because
they only contain host programs.
While we are here, let's add SPDX License tag, and sort the directories
alphabetically.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
This warning was disabled by commit bd664f6b3e ("disable new
gcc-7.1.1 warnings for now") just because it was too noisy.
Thanks to Arnd Bergmann, all warnings have been fixed. Now, we are
ready to re-enable it.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Arnd Bergmann <arnd@arndb.de>
scripts/link-vmlinux.sh is part of kbuild so extend the pattern to match
any vmlinux related scripts.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
arch/sh/boot/.gitignore has the pattern "vmlinux*"; this is effective
not only for the current directory, but also for any sub-directories.
So, from the point of .gitignore grammar, the following check-in files
are also considered to be ignored:
arch/sh/boot/compressed/vmlinux.scr
arch/sh/boot/romimage/vmlinux.scr
As the manual gitignore(5) says "Files already tracked by Git are not
affected", this is not a problem as far as Git is concerned.
However, Git is not the only program that parses .gitignore because
.gitignore is useful to distinguish build artifacts from source files.
For example, tar(1) supports the --exclude-vcs-ignore option. As of
writing, this option does not work perfectly, but it intends to create
a tarball excluding files specified by .gitignore.
So, I believe it is better to fix this issue.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Towards the goal of removing cc-ldoption, it seems that --hash-style=
was added to binutils 2.17.50.0.2 in 2006. The minimal required version
of binutils for the kernel according to
Documentation/process/changes.rst is 2.20.
Link: https://gcc.gnu.org/ml/gcc/2007-01/msg01141.html
Cc: clang-built-linux@googlegroups.com
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Towards the goal of removing cc-ldoption, it seems that --hash-style=
was added to binutils 2.17.50.0.2 in 2006. The minimal required version
of binutils for the kernel according to
Documentation/process/changes.rst is 2.20.
Link: https://gcc.gnu.org/ml/gcc/2007-01/msg01141.html
Cc: clang-built-linux@googlegroups.com
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Having a symbolic link arch/*/boot/dts/include/dt-bindings was
deprecated by commit d5d332d3f7 ("devicetree: Move include
prefixes from arch to separate directory").
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Daniel Borkmann says:
====================
pull-request: bpf 2019-05-18
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix bpftool's raw BTF dump in relation to forward declarations of union/
structs, and another fix to unexport logging helpers, from Andrii.
2) Fix inode permission check for retrieving bpf programs, from Chenbo.
3) Fix bpftool to raise rlimit earlier as otherwise libbpf's feature probing
can fail and subsequently it refuses to load an object, from Yonghong.
4) Fix declaration of bpf_get_current_task() in kselftests, from Alexei.
5) Fix up BPF kselftest .gitignore to add generated files, from Stanislav.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAlzfFrEACgkQSD+KveBX
+j4zVggAmqybBgCmoz7+XwXK+Loxrd8PeKtDi8UgGTgkS3qf+dzVAlUHKnsx0nzX
/VDVCz12SzGZgL7xIdCbrVeD088gWroO+lbwK2ucAt2mBxEsNUAu9y4rXmMu1wQB
LPmao5j/v38Ru2lsNnXG8thH1ggGOytsWlBZc7ntC2QDrJp/h+9EOlW9TUFZ+LFr
aFrvrXpzctrOlbLRSrfwIwDhOcs6MW0fElDHMSd72Ax1xnw1PZ9/Tv3kBGA/MD3h
P0GAwu8jAp7ztfcXarr93J2iSfDjZpkBj6MH15P65Xv1gq3thmv8LgG41zYzw6L+
34hsPPFb8DBnGWc8usy9Qch2acA6ug==
=kTWK
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2019-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2019-05-17
This series introduces some fixes to mlx5 driver.
For more information please see tag log below.
Please pull and let me know if there is any problem.
For -stable v4.19
net/mlx5e: Fix ethtool rxfh commands when CONFIG_MLX5_EN_RXNFC is disabled
net/mlx5: Imply MLXFW in mlx5_core
For -stable v5.0
net/mlx5e: Add missing ethtool driver info for representors
net/mlx5e: Additional check for flow destination comparison
For -stable v5.1
net/mlx5: Fix peer pf disable hca command
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Handling of aborted journal is a special code path different from
standard ext4_error() one and it can call panic() as well. Commit
1dc1097ff6 ("ext4: avoid panic during forced reboot") forgot to update
this path so fix that omission.
Fixes: 1dc1097ff6 ("ext4: avoid panic during forced reboot")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 5.1
Just a few HD-audio fixes, most of which are specific to Realtek
codecs.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAlzepbsOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE82IQ//QpscaQxoFOf6qp3u6pr/F1GhFxCQYkKMMLOE
T4BckQXVS2V+Gqoc3rJWj7066Ik6a6bVVBBmBieQOXmXtHOVMoAHRII63xtDdIp7
uBAKofDZKljQzkm63INMZ7hk9IgZzVOpdshuuoenSMwZ0Ml6CeU0N9ehfvD4DRDa
bDC+q3hyrftNf/8Dujd4EO7noSRaz0qurBbjRGeeiXkiewSKEcY0a82KvXsGo3E1
rHhd56qW0cYjSaVZCDkRlbm8S5RunJTF/LtzBAeIvkyi1Bli1YVwVQrRKBUpIwdq
ySyoshQg5cZEk1GPv1jfLNyFX6sxdVVAqrFF2i2BpOn7qWCEMpWdfk4qlO4OItKz
jBonIT793lLuWkwPJ9UAWSMY4bnyAOcWev4ITGbNG9uQ//UcAoXSSG/XbG48HSvK
PyCCYnmVYZ5+wsEVysP5gSV0A1/08naSpmDhs5CSTexwL50TFdq5pFWjbBg0koda
cLpCFyDqFS0l1AJRHTv3s73aejcpp/c2gtmJ+TkflGhJsUCnW4fridvR+iiTZSJr
ZW+H18soQyfw2cfCVxCOc4VqlaBna+QYRLg8T6sRXVHZa8NVAGtbeUJyDIqIW7oK
iGTY7qQ18gtv8vCjP9xaGHnMbBn5/MlTnzJGy1ViyVDpw5ygdFZBnsCKUXQ0tAYQ
cEWfjO8=
=ypSg
-----END PGP SIGNATURE-----
Merge tag 'sound-fix-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a few HD-audio fixes, most of which are specific to Realtek
codecs"
* tag 'sound-fix-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug
ALSA: hda: Fix race between creating and refreshing sysfs entries
ALSA: hda/realtek - Corrected fixup for System76 Gazelle (gaze14)
ALSA: hda/realtek - Avoid superfluous COEF EAPD setups
ALSA: hda/realtek - Fixup headphone noise via runtime suspend
The cited commit could disable the modify header flag, but did not free
the allocated memory for the modify header actions. Fix it.
Fixes: 27c11b6b84 ("net/mlx5e: Do not rewrite fields with the same match")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
With commit 27c11b6b84 ("net/mlx5e: Do not rewrite fields with the
same match") there are no rewrites if the rewrite value is the same as
the matched value. However, if the field is not matched, the rewrite is
also wrongly skipped. Fix it.
Fixes: 27c11b6b84 ("net/mlx5e: Do not rewrite fields with the same match")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Flow destination comparison has an inaccuracy: code see no
difference between same vf ports, which belong to different pfs.
Example: If start ping from VF0 (PF1) to VF1 (PF1) and mirror
all traffic to VF0 (PF2), icmp reply to VF0 (PF1) and mirrored
flow to VF0 (PF2) would be determined as same destination. It lead
to creating flow handler with rule nodes, which not added to node
tree. When later driver try to delete this flow rules we got
kernel crash.
Add comparison of vhca_id field to avoid this.
Fixes: 1228e912c9 ("net/mlx5: Consider encapsulation properties when comparing destinations")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
For all representors added firmware version info to show in
ethtool driver info.
For uplink representor, because only it is tied to the pci device
sysfs, added pci bus info.
Fixes: ff9b85de5d ("net/mlx5e: Add some ethtool port control entries to the uplink rep netdev")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
With the cited commit, ACLs are configured for the VF ports. The loop
for the number of ports had the wrong number. Fix it.
Fixes: 184867373d ("net/mlx5e: ACLs for priority tag mode")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
ethtool user spaces needs to know ring count via ETHTOOL_GRXRINGS when
executing (ethtool -x) which is retrieved via ethtool get_rxnfc callback,
in mlx5 this callback is disabled when CONFIG_MLX5_EN_RXNFC=n.
This patch allows only ETHTOOL_GRXRINGS command on mlx5e_get_rxnfc() when
CONFIG_MLX5_EN_RXNFC is disabled, so ethtool -x will continue working.
Fixes: fe6d86b3c3 ("net/mlx5e: Add CONFIG_MLX5_EN_RXNFC for ethtool rx nfc")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Cited patch refactored the xmit_more indication while not preserving
its functionality. Fix it.
Fixes: 3c31ff22b2 ("drivers: mellanox: use netdev_xmit_more() helper")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The command was mistakenly using enable_hca in embedded CPU field.
Fixes: 22e939a91d (net/mlx5: Update enable HCA dependency)
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reported-by: Alex Rosenbaum <alexr@mellanox.com>
Signed-off-by: Alex Rosenbaum <alexr@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
To avoid any ambiguity between vport index and vport number,
rename functions that had vport, to vport_num or vport_index appropriately.
vport_num is u16 hence change mlx5_eswitch_index_to_vport_num() return
type to u16.
vport_index is an int in vport array. Hence change input type of vport
index in mlx5_eswitch_index_to_vport_num() to int.
Correct multiple eswitch representor interfaces use type u16 of
rep->vport as type int vport_index.
Send vport FW commands with correct eswitch u16 vport_num instead
host int vport_index.
Fixes: 5ae5162066 ("net/mlx5: E-Switch, Assign a different position for uplink rep and vport")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Current version of function status_to_err return -1 for any
status returned by mlx5_cmd_invoke function. In case status is
MLX5_DRIVER_STATUS_ABORTED we should return 0 to the caller as we
assume command completed successfully on FW. If error returned we are
getting confusing messages in dmesg. In addition, currently returned
value -1 is confusing with -EPERM.
New implementation actually fix original commit and return meaningful
codes for commands delivery status and print message in case of failure.
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlxfw can be compiled as external module while mlx5_core can be
builtin, in such case mlx5 will act like mlxfw is disabled.
Since mlxfw is just a service library for mlx* drivers,
imply it in mlx5_core to make it always reachable if it was enabled.
Fixes: 3ffaabecd1 ("net/mlx5e: Support the flash device ethtool callback")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>