The init functions do not return any error. They behave as the following:
- panic, thus leading to a kernel crash while another timer may work and
make the system boot up correctly
or
- print an error and let the caller unaware if the state of the system
Change that by converting the init functions to return an error conforming
to the CLOCKSOURCE_OF_RET prototype.
Proper error handling (rollback, errno value) will be changed later case
by case, thus this change just return back an error or success in the init
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The init functions do not return any error. They behave as the following:
- panic, thus leading to a kernel crash while another timer may work and
make the system boot up correctly
or
- print an error and let the caller unaware if the state of the system
Change that by converting the init functions to return an error conforming
to the CLOCKSOURCE_OF_RET prototype.
Proper error handling (rollback, errno value) will be changed later case
by case, thus this change just return back an error or success in the init
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The init functions do not return any error. They behave as the following:
- panic, thus leading to a kernel crash while another timer may work and
make the system boot up correctly
or
- print an error and let the caller unaware if the state of the system
Change that by converting the init functions to return an error conforming
to the CLOCKSOURCE_OF_RET prototype.
Proper error handling (rollback, errno value) will be changed later case
by case, thus this change just return back an error or success in the init
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Patrice Chotard <patrice.chotard@st.com>
Acked-by: Maxime Coquelin <maxime.coquelin@st.com>
The init functions do not return any error. They behave as the following:
- panic, thus leading to a kernel crash while another timer may work and
make the system boot up correctly
or
- print an error and let the caller unaware if the state of the system
Change that by converting the init functions to return an error conforming
to the CLOCKSOURCE_OF_RET prototype.
Proper error handling (rollback, errno value) will be changed later case
by case, thus this change just return back an error or success in the init
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
Since commit 7bb11dc9f5 ("bonding: unify all places where
actor-oper key needs to be updated."), the logic in bonding to handle
selection between multiple aggregators has not functioned.
This affects only configurations wherein the bonding slaves
connect to two discrete aggregators (e.g., two independent switches, each
with LACP enabled), thus creating two separate aggregation groups within a
single bond.
The cause is a change in 7bb11dc9f5 to no longer set
AD_PORT_BEGIN on a port after a link state change, which would cause the
port to be reselected for attachment to an aggregator as if were newly
added to the bond. We cannot restore the prior behavior, as it
contradicts IEEE 802.1AX 5.4.12, which requires ports that "become
inoperable" (lose carrier, setting port_enabled=false as per 802.1AX
5.4.7) to remain selected (i.e., assigned to the aggregator). As the port
now remains selected, the aggregator selection logic is not invoked.
A side effect of this change is that aggregators in bonding will
now contain ports that are link down. The aggregator selection logic
does not currently handle this situation correctly, causing incorrect
aggregator selection.
This patch makes two changes to repair the aggregator selection
logic in bonding to function as documented and within the confines of the
standard:
First, the aggregator selection and related logic now utilizes the
number of active ports per aggregator, not the number of selected ports
(as some selected ports may be down). The ad_select "bandwidth" and
"count" options only consider ports that are link up.
Second, on any carrier state change of any slave, the aggregator
selection logic is explicitly called to insure the correct aggregator is
active.
Reported-by: Veli-Matti Lintu <veli-matti.lintu@opinsys.fi>
Fixes: 7bb11dc9f5 ("bonding: unify all places where actor-oper key needs to be updated.")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The init functions do not return any error. They behave as the following:
- panic, thus leading to a kernel crash while another timer may work and
make the system boot up correctly
or
- print an error and let the caller unaware if the state of the system
Change that by converting the init functions to return an error conforming
to the CLOCKSOURCE_OF_RET prototype.
Proper error handling (rollback, errno value) will be changed later case
by case, thus this change just return back an error or success in the init
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The init functions do not return any error. They behave as the following:
- panic, thus leading to a kernel crash while another timer may work and
make the system boot up correctly
or
- print an error and let the caller unaware if the state of the system
Change that by converting the init functions to return an error conforming
to the CLOCKSOURCE_OF_RET prototype.
Proper error handling (rollback, errno value) will be changed later case
by case, thus this change just return back an error or success in the init
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
The init functions do not return any error. They behave as the following:
- panic, thus leading to a kernel crash while another timer may work and
make the system boot up correctly
or
- print an error and let the caller unaware if the state of the system
Change that by converting the init functions to return an error conforming
to the CLOCKSOURCE_OF_RET prototype.
Proper error handling (rollback, errno value) will be changed later case
by case, thus this change just return back an error or success in the init
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
The init functions do not return any error. They behave as the following:
- panic, thus leading to a kernel crash while another timer may work and
make the system boot up correctly
or
- print an error and let the caller unaware if the state of the system
Change that by converting the init functions to return an error conforming
to the CLOCKSOURCE_OF_RET prototype.
Proper error handling (rollback, errno value) will be changed later case
by case, thus this change just return back an error or success in the init
function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
on a rk3399-evb
Tested-by: Heiko Stuebner <heiko@sntech.de>
Currently, the clksrc-probe is not able to handle any error from the init
functions. There are different issues with the current code:
- the code is duplicated in the init functions by writing error
- every driver tends to panic in its own init function
- counting the number of clocksources is not reliable
This patch adds another table to store the functions returning an error.
The table is temporary while we convert all the drivers to return an error
and will disappear.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The macro OF_DECLARE_1 expect a void (*func)(struct device_node *) while the
OF_DECLARE_2 expect a int (*func)(struct device_node *, struct device_node *).
The second one allows to pass an init function returning a value, which make
possible to call the functions in the table and check the return value in order
to catch at a higher level the errors and handle them from there instead of
doing a panic in each driver (well at least this is the case for the clkevt).
Unfortunately the OF_DECLARE_1 does not allow that and that lead to some code
duplication and crappyness in the drivers.
The OF_DECLARE_1 is used by all the clk drivers and the clocksource/clockevent
drivers. It is not possible to do the change in one shot as we have to change
all the init functions.
The OF_DECLARE_2 specifies an init function prototype with two parameters with
the node and its parent. The latter won't be used, ever, in the timer drivers.
Introduce a OF_DECLARE_1_RET macro to be used, and hopefully we can smoothly
and iteratively change the users of OF_DECLARE_1 to use the new macro instead.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Add DT bindings for the Oxford Semiconductor RPS dual Timer.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Add clocksource and clockevent driver from dual RPS timer.
The HW provides a dual one-shot or periodic 24bit timers,
the drivers set the first one as tick event source and the
second as a continuous scheduler clock source.
The timer can use 1, 16 or 256 as pre-dividers, thus the
clocksource uses 16 by default.
CC: Ma Haijun <mahaijuns@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Add a 'rktimer' node in the device treee for the ARM64 rk3399 SoC.
Signed-off-by: Huang Tao <huangtao@rock-chips.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Heiko Stuebner <heiko@sntech.de>
Tested-by: Jianqun Xu <jay.xu@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The only difference between the rk3399 SoC and the other ones is the control
register offset which is different.
Add a new field to store the control register address depending on the SoC
and use it instead of the <base> + <control offset>.
Signed-off-by: Huang Tao <huangtao@rock-chips.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Heiko Stuebner <heiko@sntech.de>
Tested-by: Jianqun Xu <jay.xu@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The rockchip timer is a broadcast timer. Add the CLOCK_EVT_FEAT_DYNIRQ flag
and set the cpumask to all possible cpus to save power by avoiding
unnecessary wakeups and IPIs.
Signed-off-by: Huang Tao <huangtao@rock-chips.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Heiko Stuebner <heiko@sntech.de>
Tested-by: Jianqun Xu <jay.xu@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Add a compatible string for rk3399 SoC because the timer is slightly different
from the older SoCs. So rename the file name from rockchip,rk3288-timer.txt to
rockchip,rk-timer.txt and clarify rockchip,rk3288-timer supported SoCs.
Signed-off-by: Huang Tao <huangtao@rock-chips.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Correct the typo in "driver" word in the option description.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Fix the Samsung pwm timer access code to deal with kernels built for big
endian operation.
Signed-off-by: Matthew Leach <matthew@mattleach.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Change the dc_timer function to be static as it is not used outside
this driver. This fixes the following warning:
drivers/clocksource/timer-digicolor.c:66:24: warning: symbol 'dc_timer' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Baruch Siach <baruch@tkos.co.il>
The driver does not export armada_370_xp_timer_syscore_ops so
make it static to fix the following warning:
drivers/clocksource/time-armada-370-xp.c:249:20: warning: symbol 'armada_370_xp_timer_syscore_ops' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
This fixes wrong-interface signaling on 32-bit platforms for entries
created when jiffies > 2^31 + MFC_ASSERT_THRESH.
Signed-off-by: Tom Goff <thomas.goff@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test_fp_ctl function is used to test if a given value is a valid
floating-point control. The inline assembly in test_fp_ctl uses an
incorrect constraint for the 'orig_fpc' variable. If the compiler
chooses the same register for 'fpc' and 'orig_fpc' the test_fp_ctl()
function always returns true. This allows user space to trigger
kernel oopses with invalid floating-point control values on the
signal stack.
This problem has been introduced with git commit 4725c86055
"s390: fix save and restore of the floating-point-control register"
Cc: stable@vger.kernel.org # v3.13+
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This reverts commit 852ffd0f4e.
There are use cases where an intermediate boot kernel (1) uses kexec
to boot the final production kernel (2). For this scenario we should
provide the original boot information to the production kernel (2).
Therefore clearing the boot information during kexec() should not
be done.
Cc: stable@vger.kernel.org # v3.17+
Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If CONFIG_ARC_DW2_UNWIND is disabled every time arc_unwind_core()
gets called following message gets printed in debug console:
----------------->8---------------
CONFIG_ARC_DW2_UNWIND needs to be enabled
----------------->8---------------
That message makes sense if user indeed wants to see a backtrace or
get nice function call-graphs in perf but what if user disabled
unwinder for the purpose? Why pollute his debug console?
So instead we'll warn user about possibly missing feature once and
let him decide if that was what he or she really wanted.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
With recent binutils update to support dwarf CFI pseudo-ops in gas, we
now get .eh_frame vs. .debug_frame. Although the call frame info is
exactly the same in both, the CIE differs, which the current kernel
unwinder can't cope with.
This broke both the kernel unwinder as well as loadable modules (latter
because of a new unhandled relo R_ARC_32_PCREL from .rela.eh_frame in
the module loader)
The ideal solution would be to switch unwinder to .eh_frame.
For now however we can make do by just ensureing .debug_frame is
generated by removing -fasynchronous-unwind-tables
.eh_frame generated with -gdwarf-2 -fasynchronous-unwind-tables
.debug_frame generated with -gdwarf-2
Fixes STAR 9001058196
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Pull input fixes from Dmitry Torokhov.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: vmmouse - remove port reservation
Input: elantech - add more IC body types to the list
Input: wacom_w8001 - ignore invalid pen data packets
Input: wacom_w8001 - w8001_MAX_LENGTH should be 13
Input: xpad - fix oops when attaching an unknown Xbox One gamepad
MAINTAINERS: add Pali Rohár as reviewer of ALPS PS/2 touchpad driver
Input: add HDMI CEC specific keycodes
Input: add BUS_CEC type
Input: xpad - fix rumble on Xbox One controllers with 2015 firmware
CPU notifications from the firmware coming in when cpufreq is
suspended cause cpufreq_update_current_freq() to return 0 which
triggers the WARN_ON() in cpufreq_update_policy() for no reason.
Avoid that by checking cpufreq_suspended before calling
cpufreq_update_current_freq().
Fixes: c9d9c929e6 (cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
If of_match_node() fails, this init function bails out without
calling of_node_put().
Also change of_node_put(of_root) to of_node_put(np); both of them
hold the same pointer, but it seems better to call of_node_put()
against the node returned by of_find_node_by_path().
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
intel_pstate_set_policy() is invoked by the cpufreq core during
driver initialization, on changes of policy attributes (minimim and
maximum frequency, for example) via sysfs and via CPU notifications
from the platform firmware. On some platforms the latter may occur
relatively often.
Commit bb6ab52f2b (intel_pstate: Do not set utilization update hook
too early) made intel_pstate_set_policy() clear the CPU's utilization
update hook before updating the policy attributes for it (and set the
hook again after doind that), but that involves invoking
synchronize_sched() and adds overhead to the CPU notifications
mentioned above and to the sched-RCU handling in general.
That extra overhead is arguably not necessary, because updating
policy attributes when the CPU's utilization update hook is active
should not lead to any adverse effects, so drop the clearing of
the hook from intel_pstate_set_policy() and make it check if
the hook has been set already when attempting to set it.
Fixes: bb6ab52f2b (intel_pstate: Do not set utilization update hook too early)
Reported-by: Jisheng Zhang <jszhang@marvell.com>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
Tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull kbuild regression fix from Michal Marek:
"The problem is that commit 9c8fa9bc08 ("kbuild: fix if_change and
friends to consider argument order") fixed a potential missed rebuild,
but this results in unnnecessary rebuilds with the packaging targets.
Which is still more correct than the previous logic, but also very
annoying"
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: Initialize exported variables
'commpage_bak' is allocated with 'sizeof(struct echoaudio)' bytes.
We then copy 'sizeof(struct comm_page)' bytes in it.
On my system, smatch complains because one is 2960 and the other is 3072.
This would result in memory corruption or a oops.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This isn't functionally apparent for some reason, but
when we test io at extreme offsets at the end of the loff_t
rang, such as in fstests xfs/071, the calculation of
"max" in dax_io() can be wrong due to pos + size overflowing.
For example,
# xfs_io -c "pwrite 9223372036854771712 512" /mnt/test/file
enters dax_io with:
start 0x7ffffffffffff000
end 0x7ffffffffffff200
and the rounded up "size" variable is 0x1000. This yields:
pos + size 0x8000000000000000 (overflows loff_t)
end 0x7ffffffffffff200
Due to the overflow, the min() function picks the wrong
value for the "max" variable, and when we send (max - pos)
into i.e. copy_from_iter_pmem() it is also the wrong value.
This somehow(tm) gets magically absorbed without incident,
probably because iter->count is correct. But it seems best
to fix it up properly by comparing the two values as
unsigned.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Pull cifs fixes from Steve French:
"Various small cifs/smb3 fixes, include some for stable, and some from
the recent SMB3 test event"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
File names with trailing period or space need special case conversion
Fix reconnect to not defer smb3 session reconnect long after socket reconnect
cifs: check hash calculating succeeded
cifs: dynamic allocation of ntlmssp blob
cifs: use CIFS_MAX_DOMAINNAME_LEN when converting the domain name
cifs: stuff the fl_owner into "pid" field in the lock request
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- Missing length check for user-space GETALG request
- Bogus memmove length in ux500 driver
- Incorrect priority setting for vmx driver
- Incorrect ABI selection for vmx driver"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: user - re-add size check for CRYPTO_MSG_GETALG
crypto: ux500 - memmove the right size
crypto: vmx - Increase priority of aes-cbc cipher
crypto: vmx - Fix ABI detection
The USB core contains a bug that can show up when a USB-3 host
controller is removed. If the primary (USB-2) hcd structure is
released before the shared (USB-3) hcd, the core will try to do a
double-free of the common bandwidth_mutex.
The problem was described in graphical form by Chung-Geol Kim, who
first reported it:
=================================================
At *remove USB(3.0) Storage
sequence <1> --> <5> ((Problem Case))
=================================================
VOLD
------------------------------------|------------
(uevent)
________|_________
|<1> |
|dwc3_otg_sm_work |
|usb_put_hcd |
|peer_hcd(kref=2)|
|__________________|
________|_________
|<2> |
|New USB BUS #2 |
| |
|peer_hcd(kref=1) |
| |
--(Link)-bandXX_mutex|
| |__________________|
|
___________________ |
|<3> | |
|dwc3_otg_sm_work | |
|usb_put_hcd | |
|primary_hcd(kref=1)| |
|___________________| |
_________|_________ |
|<4> | |
|New USB BUS #1 | |
|hcd_release | |
|primary_hcd(kref=0)| |
| | |
|bandXX_mutex(free) |<-
|___________________|
(( VOLD ))
______|___________
|<5> |
| SCSI |
|usb_put_hcd |
|peer_hcd(kref=0) |
|*hcd_release |
|bandXX_mutex(free*)|<- double free
|__________________|
=================================================
This happens because hcd_release() frees the bandwidth_mutex whenever
it sees a primary hcd being released (which is not a very good idea
in any case), but in the course of releasing the primary hcd, it
changes the pointers in the shared hcd in such a way that the shared
hcd will appear to be primary when it gets released.
This patch fixes the problem by changing hcd_release() so that it
deallocates the bandwidth_mutex only when the _last_ hcd structure
referencing it is released. The patch also removes an unnecessary
test, so that when an hcd is released, both the shared_hcd and
primary_hcd pointers in the hcd's peer will be cleared.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Chung-Geol Kim <chunggeol.kim@samsung.com>
Tested-by: Chung-Geol Kim <chunggeol.kim@samsung.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are several places where the listener and pending or accept queue
child sockets are accessed at the same time. Lockdep is unhappy that
two locks from the same class are held.
Tell lockdep that it is safe and document the lock ordering.
Originally Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> sent a similar
patch asking whether this is safe. I have audited the code and also
covered the vsock_pending_work() function.
Suggested-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
with the commit 8c14586fc3 ("net: ipv6: Use passed in table for
nexthop lookups"), net hop lookup is first performed on route creation
in the passed-in table.
However device match is not enforced in table lookup, so the found
route can be later discarded due to egress device mismatch and no
global lookup will be performed.
This cause the following to fail:
ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link set dummy1 up
ip link set dummy2 up
ip route add 2001:db8:8086::/48 dev dummy1 metric 20
ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy1 metric 20
ip route add 2001:db8:8086::/48 dev dummy2 metric 21
ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy2 metric 21
RTNETLINK answers: No route to host
This change fixes the issue enforcing device lookup in
ip6_nh_lookup_table()
v1->v2: updated commit message title
Fixes: 8c14586fc3 ("net: ipv6: Use passed in table for nexthop lookups")
Reported-and-tested-by: Beniamino Galvani <bgalvani@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABCgAGBQJXa6kLAAoJED07qiWsqSVqRsIH/RiHvKa9VB7yYQaXV3YqUPIo
iizU6mQCeODqZsDw9bXce232RevKBteYDyr4YpC4f9mX54CrQI7WRN7ev5fKU49a
FB4M9uz8v3kS5XX8gADkuDvSwtrQ7pMz1fXM2rkEyHT/xf6egCOT/lpI/mWQuNcM
3mkMFLy5ZUAaVHAsfqu8TrDgeWMDXNxbVwGtB/AuoFJ62pqVf5M+TwzKrYaOFM4r
Rbl3NINKwFwk41KCOz20GiVvvahCp05SPHmK0OMwxsffKZmmkUOdHvusOZx7Zxnw
RY7Mc/j+OvvAHYnRaZmfdDEPXc2hKQP0ATjVsW/bju7PWoVpG+87mYqubIFuTSY=
=B2gO
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-4.7-20160623' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2016-06-23
this is a pull request of 3 patches for the upcoming linux-4.7 release.
The first two patches are by Oliver Hartkopp fixing oopes in the generic CAN
device netlink handling. Jimmy Assarsson's patch for the kvaser_usb driver adds
support for more devices by adding their USB product ids.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
I couldn't get Xen to boot a L2 HVM when it was nested under KVM - it was
getting a GP(0) on a rather unspecial vmread from Xen:
(XEN) ----[ Xen-4.7.0-rc x86_64 debug=n Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e008:[<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450
(XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor (d1v0)
(XEN) rax: ffff82d0801e6288 rbx: ffff83003ffbfb7c rcx: fffffffffffab928
(XEN) rdx: 0000000000000000 rsi: 0000000000000000 rdi: ffff83000bdd0000
(XEN) rbp: ffff83000bdd0000 rsp: ffff83003ffbfab0 r8: ffff830038813910
(XEN) r9: ffff83003faf3958 r10: 0000000a3b9f7640 r11: ffff83003f82d418
(XEN) r12: 0000000000000000 r13: ffff83003ffbffff r14: 0000000000004802
(XEN) r15: 0000000000000008 cr0: 0000000080050033 cr4: 00000000001526e0
(XEN) cr3: 000000003fc79000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen code around <ffff82d0801e629e> (vmx_get_segment_register+0x14e/0x450):
(XEN) 00 00 41 be 02 48 00 00 <44> 0f 78 74 24 08 0f 86 38 56 00 00 b8 08 68 00
(XEN) Xen stack trace from rsp=ffff83003ffbfab0:
...
(XEN) Xen call trace:
(XEN) [<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450
(XEN) [<ffff82d0801f3695>] get_page_from_gfn_p2m+0x165/0x300
(XEN) [<ffff82d0801bfe32>] hvmemul_get_seg_reg+0x52/0x60
(XEN) [<ffff82d0801bfe93>] hvm_emulate_prepare+0x53/0x70
(XEN) [<ffff82d0801ccacb>] handle_mmio+0x2b/0xd0
(XEN) [<ffff82d0801be591>] emulate.c#_hvm_emulate_one+0x111/0x2c0
(XEN) [<ffff82d0801cd6a4>] handle_hvm_io_completion+0x274/0x2a0
(XEN) [<ffff82d0801f334a>] __get_gfn_type_access+0xfa/0x270
(XEN) [<ffff82d08012f3bb>] timer.c#add_entry+0x4b/0xb0
(XEN) [<ffff82d08012f80c>] timer.c#remove_entry+0x7c/0x90
(XEN) [<ffff82d0801c8433>] hvm_do_resume+0x23/0x140
(XEN) [<ffff82d0801e4fe7>] vmx_do_resume+0xa7/0x140
(XEN) [<ffff82d080164aeb>] context_switch+0x13b/0xe40
(XEN) [<ffff82d080128e6e>] schedule.c#schedule+0x22e/0x570
(XEN) [<ffff82d08012c0cc>] softirq.c#__do_softirq+0x5c/0x90
(XEN) [<ffff82d0801602c5>] domain.c#idle_loop+0x25/0x50
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=0000]
(XEN) ****************************************
Tracing my host KVM showed it was the one injecting the GP(0) when
emulating the VMREAD and checking the destination segment permissions in
get_vmx_mem_address():
3) | vmx_handle_exit() {
3) | handle_vmread() {
3) | nested_vmx_check_permission() {
3) | vmx_get_segment() {
3) 0.074 us | vmx_read_guest_seg_base();
3) 0.065 us | vmx_read_guest_seg_selector();
3) 0.066 us | vmx_read_guest_seg_ar();
3) 1.636 us | }
3) 0.058 us | vmx_get_rflags();
3) 0.062 us | vmx_read_guest_seg_ar();
3) 3.469 us | }
3) | vmx_get_cs_db_l_bits() {
3) 0.058 us | vmx_read_guest_seg_ar();
3) 0.662 us | }
3) | get_vmx_mem_address() {
3) 0.068 us | vmx_cache_reg();
3) | vmx_get_segment() {
3) 0.074 us | vmx_read_guest_seg_base();
3) 0.068 us | vmx_read_guest_seg_selector();
3) 0.071 us | vmx_read_guest_seg_ar();
3) 1.756 us | }
3) | kvm_queue_exception_e() {
3) 0.066 us | kvm_multiple_exception();
3) 0.684 us | }
3) 4.085 us | }
3) 9.833 us | }
3) + 10.366 us | }
Cross-checking the KVM/VMX VMREAD emulation code with the Intel Software
Developper Manual Volume 3C - "VMREAD - Read Field from Virtual-Machine
Control Structure", I found that we're enforcing that the destination
operand is NOT located in a read-only data segment or any code segment when
the L1 is in long mode - BUT that check should only happen when it is in
protected mode.
Shuffling the code a bit to make our emulation follow the specification
allows me to boot a Xen dom0 in a nested KVM and start HVM L2 guests
without problems.
Fixes: f9eb4af67c ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions")
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Eugene Korenevsky <ekorenevsky@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The host timer which emulates the guest LAPIC TSC deadline
timer has its expiration diminished by lapic_timer_advance_ns
nanoseconds. Therefore if, at wait_lapic_expire, a difference
larger than lapic_timer_advance_ns is encountered, delay at most
lapic_timer_advance_ns.
This fixes a problem where the guest can cause the host
to delay for large amounts of time.
Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the inline function nsec_to_cycles from x86.c to x86.h, as
the next patch uses it from lapic.c.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is a generic function __pvclock_read_cycles to be used to get both
flags and cycles. For function pvclock_read_flags, it's useless to get
cycles value. To make this function be more effective, get this variable
flags directly in function.
Signed-off-by: Minfei Huang <mnghuan@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Function __pvclock_read_cycles is short enough, so there is no need to
have another function pvclock_get_nsec_offset to calculate tsc delta.
It's better to combine it into function __pvclock_read_cycles.
Remove useless variables in function __pvclock_read_cycles.
Signed-off-by: Minfei Huang <mnghuan@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Protocol for the "version" fields is: hypervisor raises it (making it
uneven) before it starts updating the fields and raises it again (making
it even) when it is done. Thus the guest can make sure the time values
it got are consistent by checking the version before and after reading
them.
Add CPU barries after getting version value just like what function
vread_pvclock does, because all of callees in this function is inline.
Fixes: 502dfeff23
Cc: stable@vger.kernel.org
Signed-off-by: Minfei Huang <mnghuan@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code"
unconditional d_drop() after the ->open_context() had been removed. It had
been correct for success cases (there ->open_context() itself had been doing
dcache manipulations), but not for error ones. Only one of those (ENOENT)
got a compensatory d_drop() added in that commit, but in fact it should've
been done for all errors. As it is, the case of O_CREAT non-exclusive open
on a hashed negative dentry racing with e.g. symlink creation from another
client ended up with ->open_context() getting an error and proceeding to
call nfs_lookup(). On a hashed dentry, which would've instantly triggered
BUG_ON() in d_materialise_unique() (or, these days, its equivalent in
d_splice_alias()).
Cc: stable@vger.kernel.org # v3.10+
Tested-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Commit 2a0cb4e2d4 ("iommu/amd: Add new map for storing IVHD dev entry
type HID") added a call to DUMP_printk in init_iommu_from_acpi() which
used the value of devid before this variable was initialized.
Fixes: 2a0cb4e2d4 ('iommu/amd: Add new map for storing IVHD dev entry type HID')
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The valid range of 'did' in get_iommu_domain(*iommu, did)
is 0..cap_ndoms(iommu->cap), so don't exceed that
range in free_all_cpu_cached_iovas().
The user-visible impact of the out-of-bounds access is the machine
hanging on suspend-to-ram. It is, in fact, a kernel panic, but due
to already suspended devices, that's often not visible to the user.
Fixes: 22e2f9fa63 ("iommu/vt-d: Use per-cpu IOVA caching")
Signed-off-by: Jan Niehusmann <jan@gondor.com>
Tested-By: Marius Vlad <marius.c.vlad@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
kvm provides kvm_vcpu_uninit(), which amongst other things, releases the
last reference to the struct pid of the task that was last running the vcpu.
On arm64 built with CONFIG_DEBUG_KMEMLEAK, starting a guest with kvmtool,
then killing it with SIGKILL results (after some considerable time) in:
> cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffff80007d5ea080 (size 128):
> comm "lkvm", pid 2025, jiffies 4294942645 (age 1107.776s)
> hex dump (first 32 bytes):
> 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<ffff8000001b30ec>] create_object+0xfc/0x278
> [<ffff80000071da34>] kmemleak_alloc+0x34/0x70
> [<ffff80000019fa2c>] kmem_cache_alloc+0x16c/0x1d8
> [<ffff8000000d0474>] alloc_pid+0x34/0x4d0
> [<ffff8000000b5674>] copy_process.isra.6+0x79c/0x1338
> [<ffff8000000b633c>] _do_fork+0x74/0x320
> [<ffff8000000b66b0>] SyS_clone+0x18/0x20
> [<ffff800000085cb0>] el0_svc_naked+0x24/0x28
> [<ffffffffffffffff>] 0xffffffffffffffff
On x86 kvm_vcpu_uninit() is called on the path from kvm_arch_destroy_vm(),
on arm no equivalent call is made. Add the call to kvm_arch_vcpu_free().
Signed-off-by: James Morse <james.morse@arm.com>
Fixes: 749cf76c5a ("KVM: ARM: Initial skeleton to compile KVM support")
Cc: <stable@vger.kernel.org> # 3.10+
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>