Factor out a helper to set all the device specific queue limits and apply
them to the admin queue in addition to the I/O queues. Without this the
command size on the admin queue is arbitrarily low, and the missing
other limitations are just minefields waiting for victims.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
If cgroup writeback is in use, inodes can be scheduled for
asynchronous wb switching. Before 5ff8eaac16 ("writeback: keep
superblock pinned during cgroup writeback association switches"), this
could race with umount leading to super_block being destroyed while
inodes are pinned for wb switching. 5ff8eaac16 fixed it by bumping
s_active while wb switches are in flight; however, this allowed
in-flight wb switches to make umounts asynchronous when the userland
expected synchronosity - e.g. fsck immediately following umount may
fail because the device is still busy.
This patch removes the problematic super_block pinning and instead
makes generic_shutdown_super() flush in-flight wb switches. wb
switches are now executed on a dedicated isw_wq so that they can be
flushed and isw_nr_in_flight keeps track of the number of in-flight wb
switches so that flushing can be avoided in most cases.
v2: Move cgroup_writeback_umount() further below and add MS_ACTIVE
check in inode_switch_wbs() as Jan an Al suggested.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com
Fixes: 5ff8eaac16 ("writeback: keep superblock pinned during cgroup writeback association switches")
Cc: stable@vger.kernel.org #v4.5
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
A user could send a passthrough IO command with a metadata pointer to a
namespace without metadata. With metadata length of 0, kmalloc returns
ZERO_SIZE_PTR. Since that is not NULL, the driver would have set this as
the bio's integrity payload, which causes an access fault on completion.
This patch ignores the users metadata buffer if the namespace format
does not support separate metadata.
Reported-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
The command flags can change the meaning of other fields in the command
that the driver is not prepared to handle. Specifically, the user could
passthrough an SGL flag, causing the controller to misinterpret the PRP
list the driver created, potentially corrupting memory or data.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
This moves failed queue handling out of the namespace removal path and
into the reset failure path, fixing a hanging condition if the controller
fails or link down during del_gendisk. Previously the driver had to see
the controller as degraded prior to calling del_gendisk to setup the
queues to fail. But, if the controller happened to fail after this,
there was no task to end outstanding requests.
On failure, all namespace states are set to dead. This has capacity
revalidate to 0, and ends all new requests with error status.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
A reset failure schedules the device to unbind from the driver through
the pci driver's remove. This cleans up all intialization, so there is
no need to duplicate the potentially racy cleanup.
To help understand why a reset failed, the status is logged with the
existing warning message.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
This patch makes nvme namespace removal lockless. It is up to the caller
to ensure no active namespace scanning is occuring. To ensure no scan
work occurs, the nvme pci driver adds a removing state to the controller
device to avoid queueing scan work during removal. The work is flushed
after setting the state, so no new scan work can be queued.
The lockless removal allows the driver to cleanup a namespace
request_queue if the controller fails during removal. Previously this
could deadlock trying to acquire the namespace mutex in order to handle
such events.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
A namespace may be detached from a controller, but a user may be holding
a reference to it. Attaching a new namespace with the same NSID will create
duplicate names when using the NSID to name the disk.
This patch uses an IDA that is released only when the last reference is
released instead of using the namespace ID.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Unmapping the registers on reset or shutdown is not necessary. Keeping
the mapping simplifies reset handling.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
This patch applies the two introduced helpers to
figure out the 1st and last bvec.
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This patch applies the two introduced helpers to
figure out the 1st and last bvec, and fixes the
original way after bio splitting.
Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
In the following patch, the way for figuring out
the last bvec will be changed with a bit cost introduced,
so return immediately if the queue doesn't have virt
boundary limit. Actually most of devices have not
this limit.
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The bio passed to bio_will_gap() may be fast cloned from upper
layer(dm, md, bcache, fs, ...), or from bio splitting in block
core.
Unfortunately bio_will_gap() just figures out the last bvec via
'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way is obviously
wrong.
This patch introduces two helpers for getting the first and last
bvec of one bio for fixing the issue.
Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This change will also make Coverity happy by avoiding a theoretical NULL
pointer dereference; yet another reason is to use the above helper function
to tighten the code and make it more readable.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change will also make Coverity happy by avoiding a theoretical NULL
pointer dereference; yet another reason is to use the above helper function
to tighten the code and make it more readable.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ibmvnic_capability struct was defined incorrectly. The last two
elements of the struct are in the wrong order. In addition, the number
element should be 64-bit. Byteswapping functions are updated
as well.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When ipv6_find_hdr is used to find a fragment header
(caller specifies target NEXTHDR_FRAGMENT) we erronously return
-ENOENT for all fragments with nonzero offset.
Before commit 9195bb8e38, when target was specified, we did not
enter the exthdr walk loop as nexthdr == target so this used to work.
Now we do (so we can skip empty route headers). When we then stumble upon
a frag with nonzero frag_off we must return -ENOENT ("header not found")
only if the caller did not specifically request NEXTHDR_FRAGMENT.
This allows nfables exthdr expression to match ipv6 fragments, e.g. via
nft add rule ip6 filter input frag frag-off gt 0
Fixes: 9195bb8e38 ("ipv6: improve ipv6_find_hdr() to skip empty routing headers")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MC74xx and EM74xx modules use different IDs by default, according
to the Lenovo EM7455 driver for Windows.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
reverts commit 94153e36e7 ("tipc: use existing sk_write_queue for
outgoing packet chain")
In Commit 94153e36e7, we assume that we fill & empty the socket's
sk_write_queue within the same lock_sock() session.
This is not true if the link is congested. During congestion, the
socket lock is released while we wait for the congestion to cease.
This implementation causes a nullptr exception, if the user space
program has several threads accessing the same socket descriptor.
Consider two threads of the same program performing the following:
Thread1 Thread2
-------------------- ----------------------
Enter tipc_sendmsg() Enter tipc_sendmsg()
lock_sock() lock_sock()
Enter tipc_link_xmit(), ret=ELINKCONG spin on socket lock..
sk_wait_event() :
release_sock() grab socket lock
: Enter tipc_link_xmit(), ret=0
: release_sock()
Wakeup after congestion
lock_sock()
skb = skb_peek(pktchain);
!! TIPC_SKB_CB(skb)->wakeup_pending = tsk->link_cong;
In this case, the second thread transmits the buffers belonging to
both thread1 and thread2 successfully. When the first thread wakeup
after the congestion it assumes that the pktchain is intact and
operates on the skb's in it, which leads to the following exception:
[2102.439969] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[2102.440074] IP: [<ffffffffa005f330>] __tipc_link_xmit+0x2b0/0x4d0 [tipc]
[2102.440074] PGD 3fa3f067 PUD 3fa6b067 PMD 0
[2102.440074] Oops: 0000 [#1] SMP
[2102.440074] CPU: 2 PID: 244 Comm: sender Not tainted 3.12.28 #1
[2102.440074] RIP: 0010:[<ffffffffa005f330>] [<ffffffffa005f330>] __tipc_link_xmit+0x2b0/0x4d0 [tipc]
[...]
[2102.440074] Call Trace:
[2102.440074] [<ffffffff8163f0b9>] ? schedule+0x29/0x70
[2102.440074] [<ffffffffa006a756>] ? tipc_node_unlock+0x46/0x170 [tipc]
[2102.440074] [<ffffffffa005f761>] tipc_link_xmit+0x51/0xf0 [tipc]
[2102.440074] [<ffffffffa006d8ae>] tipc_send_stream+0x11e/0x4f0 [tipc]
[2102.440074] [<ffffffff8106b150>] ? __wake_up_sync+0x20/0x20
[2102.440074] [<ffffffffa006dc9c>] tipc_send_packet+0x1c/0x20 [tipc]
[2102.440074] [<ffffffff81502478>] sock_sendmsg+0xa8/0xd0
[2102.440074] [<ffffffff81507895>] ? release_sock+0x145/0x170
[2102.440074] [<ffffffff815030d8>] ___sys_sendmsg+0x3d8/0x3e0
[2102.440074] [<ffffffff816426ae>] ? _raw_spin_unlock+0xe/0x10
[2102.440074] [<ffffffff81115c2a>] ? handle_mm_fault+0x6ca/0x9d0
[2102.440074] [<ffffffff8107dd65>] ? set_next_entity+0x85/0xa0
[2102.440074] [<ffffffff816426de>] ? _raw_spin_unlock_irq+0xe/0x20
[2102.440074] [<ffffffff8107463c>] ? finish_task_switch+0x5c/0xc0
[2102.440074] [<ffffffff8163ea8c>] ? __schedule+0x34c/0x950
[2102.440074] [<ffffffff81504e12>] __sys_sendmsg+0x42/0x80
[2102.440074] [<ffffffff81504e62>] SyS_sendmsg+0x12/0x20
[2102.440074] [<ffffffff8164aed2>] system_call_fastpath+0x16/0x1b
In this commit, we maintain the skb list always in the stack.
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The argument structs are used in arrays for G_TOPOLOGY IOCTL. The
arguments themselves do not need to be aligned to a power of two, but
aligning them up to the largest basic type alignment (u64) on common ABIs
is a good thing to do.
The patch changes the size of the reserved fields to 5 or 6 u32's and
aligns the size of the struct to 8 bytes so we do no longer depend on the
compiler to perform the alignment.
While at it, add __attribute__ ((packed)) to these structs as well.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
The current reserved_tailroom calculation fails to take hlen and tlen into
account.
skb:
[__hlen__|__data____________|__tlen___|__extra__]
^ ^
head skb_end_offset
In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:
[__hlen__|__data____________|__extra__|__tlen___]
^ ^
head skb_end_offset
The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.
The min() in the third line can be expanded into:
if mtu < skb_tailroom - tlen:
reserved_tailroom = skb_tailroom - mtu
else:
reserved_tailroom = tlen
Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev->needed_tailroom or it may output more skbs than needed because not all
space available is used.
Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here are some new device ids and a patch removing the mxu11x0 driver,
which turned out not to be needed.
Signed-off-by: Johan Hovold <johan@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJW2GZ1AAoJEEEN5E/e4bSVY9gQAKhsqZAtF8qpEVJVhDdEu21L
o5bNW9d04PaBWIwg/ju00Itsbt0rxBnlCjgAZlcj0sZy+ZR14GStFqxrW4YrPkif
o/VHSMiYJUKqNsvkqdqCWRxgIOK0Wa/FW0b8E0TrYIDI7zopC6uIrS0yxeXks51D
Y15L2Qh0dm4+l9aA0f8IoMRT2pG2zcQELtwlhyQGVrrjNj/WoKuauuzeewXsd2NF
Oz16UMp3/WsiVTpcM39gSdlV0EXxsFweuQsfTN5Fysd8wDf5xMcD+BtX8ApPibsx
/qHGwlGmghpMcXQXUUooKtM85vOhRL7bqUwr8+Q/DIoJ1L3qLaoIu6Z1M+s/jue3
gdavgB3g6nm/fmUp1gF7+3KeLYgPUxTE8y64iE7C1/dJmet5qi4dbA9R2chs6582
w6UsPcXXMUIlGpxE+cffBiYKfTM5/Qe1Ry6D4VP7ZezdGzy0oUhuOXDWzkAL3yjf
ykdyKa5LD11gQInkCJJofBzUT18T9uxWuEB/N9XLc/Nq06WfLef2fCMyUqKaezOF
+fZRrrX3MIAjN9/PuaQACCBpuD2Mtw2Vob/GORE1wLX0TsfBUkOBhjnPlG7Bu3T0
NwETj/pxX8BGJjcJjLo76+4Fl+5FxDFzSTWmEWNv7VEwhDl9CrRkgbn5AigIR74J
WAIVm7omLibBcz5jWO+j
=f2SZ
-----END PGP SIGNATURE-----
Merge tag 'usb-serial-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Jonan writes:
USB-serial fixes for v4.5-rc7
Here are some new device ids and a patch removing the mxu11x0 driver,
which turned out not to be needed.
Signed-off-by: Johan Hovold <johan@kernel.org>
Currently, in a case of error, dev_err is using fman->dev
before its initialization and "(NULL device *)" is printed.
This patch fixes this issue.
Signed-off-by: Igal Liberman <igal.liberman@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch manages the case when you have an Ethernet MAC with
a "fixed link", and not connected to a normal MDIO-managed PHY device.
The test of phy_bus_name was not helpful because it was never affected
and replaced by the mdio test node.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix a runtime PM suspend/resume bug in the RCAR driver
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW2EMyAAoJEEEQszewGV1zyccQAITkJRDI4loy9EJLuOZqbYBG
zmKuLJeempbH3o5dtBIcEia7UI+c0jrVrFehfUTPfPAduuHx6Thg6awuOwnoeBBY
K6cVispZfXL6ZpeQW7I7ykcxSW3tSzLRMaNuzG75Uq7hkFuaRXJsi0K0TfB1B8Yn
A3qwrldlxP5y/rHff7Xz7sCfuwjaf60MYFEjQr0/pa9uVhgpXmEDWGWuT9WzoqTG
InX4D6YIrAYbG1RKIbZnc8e2BdwJsCD70ldKIPlHUtORu/iyfObcadl0AT6xD3uB
mHq9cooHwq+cMsH44DqF2i7KwzQjh2goujCO3eXW2E1CzYTjO9gRxXb3KoTgbkxn
BEj08+tByBMwuiiwqqcACiAMQk5MlWzM8qQwQehxo6bnYmxzS9N+NJOq7btnxwmC
yHIjyjYs1IWb0gwlSehkKFT6JLiP4pbCBt34dcJbfSUwpHAwzQzzsYymNHDkwQzJ
pqGIU9JX/dWc1uvp6tLSmXuOh8YQvzj7sowSNkIiW9aX1OQpH+7xCE3b7vKZRZgt
jibAfDKCPq/5bLEKbvGZggkTz6AqW56utaovTLnN7FiMbMEnATBXKP3MRrltDBj/
+rll1hDbPYc8wL6qvGEI5V5wcCGxr6vL8GBloQbjxQfywN0F/+wcnJKl9cWE5xOf
KCH97QpF+N9ZsuDPJIPZ
=5Ovl
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull late GPIO fix from Linus Walleij:
"Regressions never arrive when you want them to, so here is a late fix
for the Renesas RCAR GPIO driver. It only affects that driver on the
very specific Renesas platforms:
- Fix a runtime PM suspend/resume bug in the RCAR driver"
* tag 'gpio-v4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rcar: Add Runtime PM handling for interrupts
Including one fix for Intel VT-d:
* Use BUS_NOTIFY_REMOVED_DEVICE notifier to unbind a device from
its domain _after_ it has been unbound from its driver. This
fixes a BUG_ON being triggered in the PCI hotplug path.
And three for AMD IOMMU:
* Add a workaround for a hardware issue with ATS in use
* Fix ATS enable/disable balance when a device is removed
* Fix a boot warning being triggered when the system has IOMMU
performance counters and PCI device 00:00.0 is not covered by
the IOMMU
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJW2D+vAAoJECvwRC2XARrj3m8QAL8oDvVQp12r6MeT/Mre4JPb
5Y572a7kaA1Sp6b1ymnrMKsOATdxEJ7+P4OCkHDWjEZTGWvyDS/+latvPMTI0djf
fzof5eB+OeGc1nI3z050Y2hKcC17gNs+cy6RoEDmWA8WRCYOh2FV8pgAFAqz/RNt
qOA/8tdzbFyGk/A7GQV4N2ox30eGZwNHfMZDTNYkcAtf5BiKObyUDnkLk98hig5o
bv2oEGckMDv35rOlpe6ToVPKe8hwy+uJLxBSvQKn3+d0Bu/4ZNVq2FVmcn1+DWbY
sSqM6fFfEGyKK1moEOD2tuPJD5cr5qsqpN0GIbk8CRldSG6xYekJxUjY85aFeziS
pARSu9LB+lOdi7yYbZu8zpfvN/G+gbRByqWaXBI1SHvmrx2ZS1VGHvHPel8aBQBz
PLIF+z3ij3sCJwZnim4Tg2pyCAccz2sEiR7O0xBx5kDQKB8ZSeNKVhTz2d9cH5AZ
gP4I7BmlSWaYXXi5FX6BMin3dYzHYuE/wcprQ98v1MIAk4AcqZGtOgX7CeQWABuC
pBpdCYUiSIDQg+pDA6SSx7IJOxwVLhbQg/cxmJB5SWAMsNwnWqy7G2HWNvIJLrwT
Bq++MmOYEwGH7g5xd8Q+rXszmaoVjzMhGojaAZg3gRmIe9AsaFSb9umoLvO1lNy0
u3aOUVDZM3eIrhTHtmB7
=uPsu
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"One fix for Intel VT-d:
- Use BUS_NOTIFY_REMOVED_DEVICE notifier to unbind a device from its
domain _after_ it has been unbound from its driver. This fixes a
BUG_ON being triggered in the PCI hotplug path.
And three for AMD IOMMU:
- Add a workaround for a hardware issue with ATS in use
- Fix ATS enable/disable balance when a device is removed
- Fix a boot warning being triggered when the system has IOMMU
performance counters and PCI device 00:00.0 is not covered by the
IOMMU"
* tag 'iommu-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path
iommu/amd: Detach device from domain before removal
iommu/amd: Apply workaround for ATS write permission check
iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered
This fixes two minor bugs: error handling in vhost,
and capability processing in virtio.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJW1w/9AAoJECgfDbjSjVRpOwMH/10g1iyVAF7FR3RgAgcutROy
mt7RNRGhcJrccB3imWvA5llHfpEQe6b2IGRRjLpy3cHOqoytSJ8F9GbAv/Ya6rBY
ZNsapZkEacLX3Byi/Tll9C6pP7eCHswveBwreXVYpxerTuViorqU0RQXHQCY1nBa
jAHr7eHp7PlSjAKlwUX181/cDKF35VMchCE+NfSmgDFnkOMb8E3TVXqV1wuSZSaf
Ci2IgULrQQC2rzxFg/ZQePweSP9cBrhpX3c3VkVa/N0Io4DOQWLg2KUbwzrs/mel
i4rlKngwwRO0rIsWU+J5hMq4Vqg+Zv6mVQGm3FFAJWle03rUOZPYFCPwbKA6HNU=
=HeDc
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull minor virtio/vhost fixes from Michael Tsirkin:
"This fixes two minor bugs: error handling in vhost, and capability
processing in virtio"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: fix error path in vhost_init_used()
virtio-pci: read the right virtio_pci_notify_cap field
* fix hang caused by fbconsole blink timer
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW2CSsAAoJEPo9qoy8lh71hjcQAK071t3TPSzKc1+GmLy2yVnV
suaT3C+0uY1yha/V8kRUrwUA4p3wRSHDvr/jF1SSSTgc4J8mY/1mxXYIFmAxfCyb
t/6fKTvHqDFHtCTMCM5RXK+3cR/zTu74KqLmMRllkMa4da7Iyr5fBBnfdryo1yjg
Brj++PANYKXT1B7HR9vrUAL7sU+2y976/oGtAtqTSfRuo4rppXh3PFaNZHJFgdwI
T8wkVIOPJHt9WdSnmXkVzMYeeXRp4mHNev2+tD7OvcH/b21Y+b1aO/ebIbeXWk+T
Or3SWj/5Ckje368NMvIZKywuGcRA9QHMDNqDhBU+pIF8D8Z0TcSd9spV0v94/Z/l
1DK1NROcE0rjcJvCFjhLcgXX9upsVlEQKDR94SsbkgJAv/ToXP2umoUCu8VXlO7Y
tqvW8aU06cpi4/UXBq61riqVP2pOz6mYGWMFHw26tFpmAq3+NvFvW/Zp1UI0pMW/
k4hbWDetmxPJA0T1UJcRHLHeumkEsyHTUgMFbD9UcM6YGs6tbsHw2aWgbEAGYMIa
TMnM+vi+xRWThmUN558SBj7fZ17x/H7pJJllNvL6myn514BPBTmpmGCXQFxqOlc3
/6AQEdYOMhG5z3axqCT+Hbjg8SQMeY7gAROU/Ql7VETJZZsRtIdl2Q/bynsjO06S
8HOemQdC2DZbXRHCdb5t
=8zQX
-----END PGP SIGNATURE-----
Merge tag 'fbdev-fixes-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Pull fbdev fix from Tomi Valkeinen:
"Fix hang caused by fbconsole blink timer"
* tag 'fbdev-fixes-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
fbcon: set a default value to blink interval
The representation of external connections got some heated
discussions recently. As we're too close to the merge window,
let's not set those entities into a stone.
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Overlayfs must update uid/gid after chown, otherwise functions
like inode_owner_or_capable() will check user against stale uid.
Catched by xfstests generic/087, it chowns file and calls utimes.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
After rename file dentry still holds reference to lower dentry from
previous location. This doesn't matter for data access because data comes
from upper dentry. But this stale lower dentry taints dentry at new
location and turns it into non-pure upper. Such file leaves visible
whiteout entry after remove in directory which shouldn't have whiteouts at
all.
Overlayfs already tracks pureness of file location in oe->opaque. This
patch just uses that for detecting actual path type.
Comment from Vivek Goyal's patch:
Here are the details of the problem. Do following.
$ mkdir upper lower work merged upper/dir/
$ touch lower/test
$ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=
work merged
$ mv merged/test merged/dir/
$ rm merged/dir/test
$ ls -l merged/dir/
/usr/bin/ls: cannot access merged/dir/test: No such file or directory
total 0
c????????? ? ? ? ? ? test
Basic problem seems to be that once a file has been unlinked, a whiteout
has been left behind which was not needed and hence it becomes visible.
Whiteout is visible because parent dir is of not type MERGE, hence
od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
passes on iterate handling directly to underlying fs. Underlying fs does
not know/filter whiteouts so it becomes visible to user.
Why did we leave a whiteout to begin with when we should not have.
ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave
whiteout if file is pure upper. In this case file is not found to be pure
upper hence whiteout is left.
So why file was not PURE_UPPER in this case? I think because dentry is
still carrying some leftover state which was valid before rename. For
example, od->numlower was set to 1 as it was a lower file. After rename,
this state is not valid anymore as there is no such file in lower.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Viktor Stanchev <me@viktorstanchev.com>
Suggested-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
ovl_remove_upper() should do d_drop() only after it successfully
removes the dir, otherwise a subsequent getcwd() system call will
fail, breaking userspace programs.
This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491
Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
This adds missing .d_select_inode into alternative dentry_operations.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Fixes: 7c03b5d45b ("ovl: allow distributed fs as lower layer")
Fixes: 4bacc9c923 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Reviewed-by: Nikolay Borisov <kernel@kyup.com>
Tested-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org> # 4.2+
According to IBTA spec v1.3 section 12.7.19, QPs should use GRH when
the path returned by the SA has hop-limit > 0. Currently, we do that
only for the > 1 case, fix that.
Fixes: 6d969a471b ('IB/sa: Add ib_init_ah_from_path()')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
While testing audio with pxa2xx-ac97, underrun were happening while the
user application was correctly feeding the music. Debug proved that the
cyclic transfer is not cyclic, ie. the last descriptor did not loop on
the first.
Another issue is that the descriptor length was always set to 8192,
because of an trivial operator issue.
This was tested on a pxa27x platform.
Fixes: a57e16cf03 ("dmaengine: pxa: add pxa dmaengine driver")
Reported-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Currently, the inlen field of the vendor's part of the command
doesn't match the command buffer. This happens because the inlen
accommodates ib_uverbs_cmd_hdr which is deducted from the in buffer.
This is problematic since the vendor function could be called either
from the legacy verb (where the input length mismatches the actual
length) or by the extended verb (where the length matches). The vendor
has no idea which function calls it and therefore has no way to know
how the length variable should be treated.
Fixing this by aligning the inlen to the correct length.
All vendor drivers either assumed that inlen >= sizeof(vendor_uhw_cmd)
or just failed wrongly (mlx5) and fixed in this patch.
Fixes: cfb5e088e2 ('IB/mlx5: Add CQE version 1 support to user QPs and SRQs')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Normal SRQs, unlike XRC SRQs, don't have user-index, therefore
avoid verifying it and using it.
Fixes: cfb5e088e2 ('IB/mlx5: Add CQE version 1 support to user QPs and SRQs')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When destroying a hw_breakpoint event, the kernel oopses as follows:
Unable to handle kernel paging request for data at address 0x00000c07
NIP [c0000000000291d0] arch_unregister_hw_breakpoint+0x40/0x60
LR [c00000000020b6b4] release_bp_slot+0x44/0x80
Call chain:
hw_breakpoint_event_init()
bp->destroy = bp_perf_event_destroy;
do_exit()
perf_event_exit_task()
perf_event_exit_task_context()
WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
perf_event_exit_event()
free_event()
_free_event()
bp_perf_event_destroy() // event->destroy(event);
release_bp_slot()
arch_unregister_hw_breakpoint()
perf_event_exit_task_context() sets child_ctx->task as TASK_TOMBSTONE
which is (void *)-1. arch_unregister_hw_breakpoint() tries to fetch
'thread' attribute of 'task' resulting in oops.
Peterz points out that the code shouldn't be using bp->ctx anyway, but
fixing that will require a decent amount of rework. So for now to fix
the oops, check if bp->ctx->task has been set to (void *)-1, before
dereferencing it. We don't use TASK_TOMBSTONE, because that would
require exporting it and it's supposed to be an internal detail.
Fixes: 63b6da39bb ("perf: Fix perf_event_exit_task() race")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch applies the microphone-related fix created for the Acer
Aspire E1-572 to the E1-472 as well, as it uses the same Realtek ALC282
CODEC and demonstrates the same issues.
This patch allows an external, headset microphone to be used and limits
the gain on the (quite noisy) internal microphone.
Signed-off-by: Simon South <simon@simonsouth.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Make the base offset hexadecimal to simplify debugging since the base
addresses are hex too.
The offsets for connectors is also changed to start after the 'reserved'
range 0x10000-0x2ffff.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Fixes for radeon and amdgpu:
- Fix GPUVM flushing on CI and VI
- Misc DPM and Powerplay fixes
- VCE DPM fixes for CZ/ST
- DP hotplug fix
* 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: return from atombios_dp_get_dpcd only when error
drm/amdgpu/cz: remove commented out call to enable vce pg
drm/amdgpu/powerplay/cz: enable/disable vce dpm independent of vce pg
drm/amdgpu/cz: enable/disable vce dpm even if vce pg is disabled
drm/amdgpu/gfx8: specify which engine to wait before vm flush
drm/amdgpu: apply gfx_v8 fixes to gfx_v7 as well
drm/amd/powerplay: send event to notify powerplay all modules are initialized.
drm/amd/powerplay: export AMD_PP_EVENT_COMPLETE_INIT task to amdgpu.
drm/radeon/pm: update current crtc info after setting the powerstate
drm/amdgpu/pm: update current crtc info after setting the powerstate
Pause/unpause graph tracing around do_suspend_lowlevel as it has
inconsistent call/return info after it jumps to the wakeup vector.
The graph trace buffer will otherwise become misaligned and
may eventually crash and hang on suspend.
To reproduce the issue and test the fix:
Run a function_graph trace over suspend/resume and set the graph
function to suspend_devices_and_enter. This consistently hangs the
system without this fix.
Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lars Persson says:
====================
dwc_eth_qos: stability fixes and support for CMA
This series has bug fixes for the dwc_eth_qos ethernet driver.
Mainly two stability fixes for problems found by Rabin Vincent:
- Successive starts and stops of the interface would trigger a DMA reset timeout.
- A race condition in the TX DMA handling could trigger a netdev watchdog
timeout.
The memory allocation was improved to support use of the CMA as DMA allocator
backend.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts the changed init order from commit 3647bc35bd
("dwc_eth_qos: Reset hardware before PHY start") and makes another fix
for the race.
It turned out that the reset state machine of the dwceqos hardware
requires PHY clocks to be present in order to complete the reset
cycle.
To plug the race with the phy state machine we defer link speed
setting until the hardware init has finished.
Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since debug is hardcoded to 3, the defaults in the DWCEQOS_MSG_DEFAULT
macro are never used, which does not seem to be the intended behaviour
here. Set debug to -1 like other drivers so that DWCEQOS_MSG_DEFAULT is
actually used by default.
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>