linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-17 17:24:17 +08:00

Author	SHA1	Message	Date
Linus Torvalds	c5f1ac5e9a	powerpc fixes for 5.0 #5 Just one fix, for pgd/pud_present() which were broken on big endian since v4.20, leading to possible data corruption. Thanks to: Aneesh Kumar K.V., Erhard F., Jan Kara. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJcaReiAAoJEFHr6jzI4aWAoy0P/09u2Vbj2vcOuFn/9BZ7JK5w Pw9lHPC2NHtoM3Wq1ZAK3GPELkU4Bl4xtorFgC1/f0Oe3Nt3wHs6tfu+jx/qTgtz +j1fR7Q0nKA62uJ53n9i4e3HLWJR80gFkczpWFMSgpbNdw/pvzZfW1YlXQs/iZTX A0lwfrMKc8ud1KkAr7S1rzWnF+55gwOmia4F6fkHBAV/vo2rj861LTY0FRz5OdW0 h4OyQEmw/LBRnZW0SJJBGFib8HtpANc4a35Lbq9x7PMAsAGCvNBpqbVx1fkgRzEt lVY/bUqFK8+KOQuao8T8FFN9y8upwayb5PZdlz3YlONSdZsDa3VbcQG2qLUhmJZQ 2NS0cuw2uJ7QP8iC26j1SH8EdcraQsYxl57nQZhtI38pP5RXT+C1+aZEwk2DNaPK BQM4asEd9YNCKRvU/cxhS5Gv2BnerUuktF72vEx/ul/wXIjJXO4buIZyGDiznVsk AImmdPA8yiGa8+0DN/TCuizFSMx3rwZEYPux6MqU40K/xp3f0eEiqCZD7xQ5kh+C Vi5TV6/epTqUYbeKkrqMyJ+0CmeTWF2YL3hZ3Na5+XwIhgSOGGiGGpPrXcVqwvA0 t+zhN/L99urBtg3ubwiVfRd8WbZS5/9kDEhAZwsYjGxboVg4cnhniHU4RHIO/VYE 0MlwXdiZMXTJolzpZfuF =8du7 -----END PGP SIGNATURE----- Merge tag 'powerpc-5.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: "Just one fix, for pgd/pud_present() which were broken on big endian since v4.20, leading to possible data corruption. Thanks to: Aneesh Kumar K.V., Erhard F., Jan Kara" * tag 'powerpc-5.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()	2019-02-17 08:36:21 -08:00
Michael Ellerman	a58007621b	powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present() In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT rather than just checking that the value is non-zero, e.g.: static inline int pgd_present(pgd_t pgd) { - return !pgd_none(pgd); + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT)); } Unfortunately this is broken on big endian, as the result of the bitwise & is truncated to int, which is always zero because _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and pud_present() are always false at compile time, and the compiler elides the subsequent code. Remarkably with that bug present we are still able to boot and run with few noticeable effects. However under some work loads we are able to trigger a warning in the ext4 code: WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0 CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1 ... NIP .ext4_set_page_dirty+0x70/0xb0 LR .set_page_dirty+0xa0/0x150 Call Trace: .set_page_dirty+0xa0/0x150 .unmap_page_range+0xbf0/0xe10 .unmap_vmas+0x84/0x130 .unmap_region+0xe8/0x190 .__do_munmap+0x2f0/0x510 .__vm_munmap+0x80/0x110 .__se_sys_munmap+0x14/0x30 system_call+0x5c/0x70 The fix is simple, we need to convert the result of the bitwise & to an int before returning it. Thanks to Erhard, Jan Kara and Aneesh for help with debugging. Fixes: `da7ad366b4` ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Erhard F. <erhard_f@mailbox.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-17 15:24:45 +11:00
David S. Miller	3313da8188	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The netfilter conflicts were rather simple overlapping changes. However, the cls_tcindex.c stuff was a bit more complex. On the 'net' side, Cong is fixing several races and memory leaks. Whilst on the 'net-next' side we have Vlad adding the rtnl-ness support. What I've decided to do, in order to resolve this, is revert the conversion over to using a workqueue that Cong did, bringing us back to pure RCU. I did it this way because I believe that either Cong's races don't apply with have Vlad did things, or Cong will have to implement the race fix slightly differently. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 12:38:38 -08:00
Christoph Hellwig	34e04eedd1	of: select OF_RESERVED_MEM automatically The OF_RESERVED_MEM can be used if we have either CMA or the generic declare coherent code built and we support the early flattened DT. So don't bother making it a user visible options that is selected by most configs that fit the above category, but just select it when the requirements are met. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Rob Herring <robh@kernel.org>	2019-02-13 19:19:47 +01:00
Greg Kroah-Hartman	9481caf39b	Merge 5.0-rc6 into driver-core-next We need the debugfs fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-11 09:09:02 +01:00
Greg Kroah-Hartman	5c07488d99	Merge 5.0-rc6 into char-misc-next We need the char-misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-11 09:05:58 +01:00
Thomas Gleixner	41ea39101d	y2038: Add time64 system calls This series finally gets us to the point of having system calls with 64-bit time_t on all architectures, after a long time of incremental preparation patches. There was actually one conversion that I missed during the summer, i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes and review comments. The following system calls are now added on all 32-bit architectures using the same system call numbers: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 Each one of these corresponds directly to an existing system call that includes a 'struct timespec' argument, or a structure containing a timespec or (in case of clock_adjtime) timeval. Not included here are new versions of getitimer/setitimer and getrusage/waitid, which are planned for the future but only needed to make a consistent API rather than for correct operation beyond y2038. These four system calls are based on 'timeval', and it has not been finally decided what the replacement kernel interface will use instead. So far, I have done a lot of build testing across most architectures, which has found a number of bugs. Runtime testing so far included testing LTP on 32-bit ARM with the existing system calls, to ensure we do not regress for existing binaries, and a test with a 32-bit x86 build of LTP against a modified version of the musl C library that has been adapted to the new system call interface [3]. This library can be used for testing on all architectures supported by musl-1.1.21, but it is not how the support is getting integrated into the official musl release. Official musl support is planned but will require more invasive changes to the library. Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/ Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/ Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2] Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJcXf7/AAoJEGCrR//JCVInPSUP/RhsQSCKMGtONB/vVICQhwep PybhzBSpHWFxszzTi6BEPN1zS9B069G9mDollRBYZCckyPqL/Bv6sI/vzQZdNk01 Q6Nw92OnNE1QP8owZ5TjrZhpbtopWdqIXjsbGZlloUemvuJP2JwvKovQUcn5CPTQ jbnqU04CVyFFJYVxAnGJ+VSeWNrjW/cm/m+rhLFjUcwW7Y3aodxsPqPP6+K9hY9P yIWfcH42WBeEWGm1RSBOZOScQl4SGCPUAhFydl/TqyEQagyegJMIyMOv9wZ5AuTT xK644bDVmNsrtJDZDpx+J8hytXCk1LrnKzkHR/uK80iUIraF/8D7PlaPgTmEEjko XcrywEkvkXTVU3owCm2/sbV+8fyFKzSPipnNfN1JNxEX71A98kvMRtPjDueQq/GA Yh81rr2YLF2sUiArkc2fNpENT7EGhrh1q6gviK3FB8YDgj1kSgPK5wC/X0uolC35 E7iC2kg4NaNEIjhKP/WKluCaTvjRbvV+0IrlJLlhLTnsqbA57ZKCCteiBrlm7wQN 4csUtCyxchR9Ac2o/lj+Mf53z68Zv74haIROp18K2dL7ZpVcOPnA3XHeauSAdoyp wy2Ek6ilNvlNB+4x+mRntPoOsyuOUGv7JXzB9JvweLWUd9G7tvYeDJQp/0YpDppb K4UWcKnhtEom0DgK08vY =IZVb -----END PGP SIGNATURE----- Merge tag 'y2038-new-syscalls' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038 Pull y2038 - time64 system calls from Arnd Bergmann: This series finally gets us to the point of having system calls with 64-bit time_t on all architectures, after a long time of incremental preparation patches. There was actually one conversion that I missed during the summer, i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes and review comments. The following system calls are now added on all 32-bit architectures using the same system call numbers: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 Each one of these corresponds directly to an existing system call that includes a 'struct timespec' argument, or a structure containing a timespec or (in case of clock_adjtime) timeval. Not included here are new versions of getitimer/setitimer and getrusage/waitid, which are planned for the future but only needed to make a consistent API rather than for correct operation beyond y2038. These four system calls are based on 'timeval', and it has not been finally decided what the replacement kernel interface will use instead. So far, I have done a lot of build testing across most architectures, which has found a number of bugs. Runtime testing so far included testing LTP on 32-bit ARM with the existing system calls, to ensure we do not regress for existing binaries, and a test with a 32-bit x86 build of LTP against a modified version of the musl C library that has been adapted to the new system call interface [3]. This library can be used for testing on all architectures supported by musl-1.1.21, but it is not how the support is getting integrated into the official musl release. Official musl support is planned but will require more invasive changes to the library. Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/ Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/ Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]	2019-02-10 21:24:43 +01:00
Thomas Gleixner	fd659cc095	arch: System call unification and cleanup The system call tables have diverged a bit over the years, and a number of the recent additions never made it into all architectures, for one reason or another. This is an attempt to clean it up as far as we can without breaking compatibility, doing a number of steps: - Add system calls that have not yet been integrated into all architectures but that we definitely want there. This includes {,f}statfs64() and get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally. - The s390 compat syscall handling is cleaned up to be more like what we do on other architectures, while keeping the 31-bit pointer extension. This was merged as a shared branch by the s390 maintainers and is included here in order to base the other patches on top. - Add the separate ipc syscalls on all architectures that traditionally only had sys_ipc(). This version is done without support for IPC_OLD that is we have in sys_ipc. The new semtimedop_time64 syscall will only be added here, not in sys_ipc - Add syscall numbers for a couple of syscalls that we probably don't need everywhere, in particular pkey_* and rseq, for the purpose of symmetry: if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I expect that any future system calls will get assigned on all platforms together, even when they appear to be specific to a single architecture. - Prepare for having the same system call numbers for any future calls. In combination with the generated tables, this hopefully makes it easier to add new calls across all architectures together. All of the above are technically separate from the y2038 work, but are done as preparation before we add the new 64-bit time_t system calls everywhere, providing a common baseline set of system calls. I expect that glibc and other libraries that want to use 64-bit time_t will require linux-5.1 kernel headers for building in the future, and at a much later point may also require linux-5.1 or a later version as the minimum kernel at runtime. Having a common baseline then allows the removal of many architecture or kernel version specific workarounds. Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJcXf6XAAoJEGCrR//JCVInIm4P/AlkMmQRa/B2ziWMW6PifPoI v18r44017rA1BPENyZvumJUdM5mDvNofOW8F2DYQ7Uiys2YtXenwe/Cf8LHn2n6c TMXGQryQpvNmfDCyU+0UjF8m2+poFMrL4aRTXtjODh1YTsPNgeDC+KFMCAAtZmZd cVbXFudtbdYKD/pgCX4SI1CWAMBiXe2e+ukPdJVr+iqusCMTApf+GOuyvDBZY9s/ vURb+tIS87HZ/jehWfZFSuZt+Gu7b3ijUXNC8v9qSIxNYekw62vBNl6F09HE79uB Bv4OujAODqKvI9gGyydBzLJNzaMo0ryQdusyqcJHT7MY/8s+FwcYAXyTlQ3DbbB4 2u/c+58OwJ9Zk12p4LXZRA47U+vRhQt2rO4+zZWs2txNNJY89ZvCm/Z04KOiu5Xz 1Nnj607KGzthYRs2gs68AwzGGyf0uykIQ3RcaJLIBlX1Nd8BWO0ZgAguCvkXbQMX XNXJTd92HmeuKKpiO0n/M4/mCeP0cafBRPCZbKlHyTl0Jeqd/HBQEO9Z8Ifwyju3 mXz9JCR9VlPCkX605keATbjtPGZf3XQtaXlQnezitDudXk8RJ33EpPcbhx76wX7M Rux37ByqEOzk4wMGX9YQyNU7z7xuVg4sJAa2LlJqYeKXHtym+u3gG7SGP5AsYjmk 6mg2+9O2yZuLhQtOtrwm =s4wf -----END PGP SIGNATURE----- Merge tag 'y2038-syscall-cleanup' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038 Pull preparatory work for y2038 changes from Arnd Bergmann: System call unification and cleanup The system call tables have diverged a bit over the years, and a number of the recent additions never made it into all architectures, for one reason or another. This is an attempt to clean it up as far as we can without breaking compatibility, doing a number of steps: - Add system calls that have not yet been integrated into all architectures but that we definitely want there. This includes {,f}statfs64() and get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally. - The s390 compat syscall handling is cleaned up to be more like what we do on other architectures, while keeping the 31-bit pointer extension. This was merged as a shared branch by the s390 maintainers and is included here in order to base the other patches on top. - Add the separate ipc syscalls on all architectures that traditionally only had sys_ipc(). This version is done without support for IPC_OLD that is we have in sys_ipc. The new semtimedop_time64 syscall will only be added here, not in sys_ipc - Add syscall numbers for a couple of syscalls that we probably don't need everywhere, in particular pkey_* and rseq, for the purpose of symmetry: if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I expect that any future system calls will get assigned on all platforms together, even when they appear to be specific to a single architecture. - Prepare for having the same system call numbers for any future calls. In combination with the generated tables, this hopefully makes it easier to add new calls across all architectures together. All of the above are technically separate from the y2038 work, but are done as preparation before we add the new 64-bit time_t system calls everywhere, providing a common baseline set of system calls. I expect that glibc and other libraries that want to use 64-bit time_t will require linux-5.1 kernel headers for building in the future, and at a much later point may also require linux-5.1 or a later version as the minimum kernel at runtime. Having a common baseline then allows the removal of many architecture or kernel version specific workarounds.	2019-02-10 20:44:19 +01:00
Linus Torvalds	820828bffe	powerpc fixes for 5.0 #4 Just two fixes, both going to stable. Our support for split pmd page table lock had a bug which could lead to a crash on mremap() when using the Radix MMU (Power9 only). A fix for the PAPR SCM driver (nvdimm) we added last release, which had a bug where we might mis-handle a hypervisor response leading to us failing to attach the memory region. Thanks to: Aneesh Kumar K.V, Oliver O'Halloran. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJcXXVeAAoJEFHr6jzI4aWA+18P/2EJHmTJ2XgXfQdz7XAEb5YJ AXUsg47rm1Cx83PNTOpY4uGVEQmvb+a4DkeSIucoISJeGdo3lDIkGluySYZNaT6E 1Z8Tm6v5j9WLSV7CQcx0p3jU2xR/iap4HpDa6IiPjT8/4v4SwJvDkZLnqflwA2Q5 yk8e7gfViWccCD3F+/MyDvOF+t/9PEHP8qd86NtVrUxjx57WN+LehW2D3gi6LdNv L4L42ZQndGYXjmF4WkoDVLB/AQLFD95XiO0FlP45nqJK/CPBhi6g9knwg1zI7Fgd 2yXJJWfxW52EqhFv9D9hmU/5SqgKb0vXgqrNW07HvqMp5WMWP69+gsvLughBb5SE KFos9EkVlCkKdBsjdC9nT2p/qxP0MXe8CrGuVSNXAGjw79Je9FM8byYpL45xB5Pm bqZnvrjktAhgLvyzz48eSNqMmX1c3GfVwQn3WIxBlO+k1Hd5hzndBD4PEeX8WOo1 /sTb8B0VHhYM+cyexUufXE5FwXzMWOgW/7GqcgHClHg5/PgaFKs7ZGVa5y0oRn+A KWnPjkwQQUNk712AZosEUjffHASNXzkM1ckMm36E8j11OgpplLea3nSoa9vqXZJi 92d5bqDgUZljQLjxeg52CrPOOr6RCDLQ9UdYgt72UOjhKk2f/Aim9jcFnxg71xOC mfEx7R64crDZFMYZT2Ij =anwQ -----END PGP SIGNATURE----- Merge tag 'powerpc-5.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Just two fixes, both going to stable. - Our support for split pmd page table lock had a bug which could lead to a crash on mremap() when using the Radix MMU (Power9 only). - A fix for the PAPR SCM driver (nvdimm) we added last release, which had a bug where we might mis-handle a hypervisor response leading to us failing to attach the memory region. Thanks to: Aneesh Kumar K.V, Oliver O'Halloran" * tag 'powerpc-5.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/papr_scm: Use the correct bind address powerpc/radix: Fix kernel crash with mremap()	2019-02-08 16:04:12 -08:00
Arnd Bergmann	48166e6ea4	y2038: add 64-bit time_t syscalls to all 32-bit architectures This adds 21 new system calls on each ABI that has 32-bit time_t today. All of these have the exact same semantics as their existing counterparts, and the new ones all have macro names that end in 'time64' for clarification. This gets us to the point of being able to safely use a C library that has 64-bit time_t in user space. There are still a couple of loose ends to tie up in various areas of the code, but this is the big one, and should be entirely uncontroversial at this point. In particular, there are four system calls (getitimer, setitimer, waitid, and getrusage) that don't have a 64-bit counterpart yet, but these can all be safely implemented in the C library by wrapping around the existing system calls because the 32-bit time_t they pass only counts elapsed time, not time since the epoch. They will be dealt with later. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2019-02-07 00:13:28 +01:00
Arnd Bergmann	d33c577ccc	y2038: rename old time and utime syscalls The time, stime, utime, utimes, and futimesat system calls are only used on older architectures, and we do not provide y2038 safe variants of them, as they are replaced by clock_gettime64, clock_settime64, and utimensat_time64. However, for consistency it seems better to have the 32-bit architectures that still use them call the "time32" entry points (leaving the traditional handlers for the 64-bit architectures), like we do for system calls that now require two versions. Note: We used to always define __ARCH_WANT_SYS_TIME and __ARCH_WANT_SYS_UTIME and only set __ARCH_WANT_COMPAT_SYS_TIME and __ARCH_WANT_SYS_UTIME32 for compat mode on 64-bit kernels. Now this is reversed: only 64-bit architectures set __ARCH_WANT_SYS_TIME/UTIME, while we need __ARCH_WANT_SYS_TIME32/UTIME32 for 32-bit architectures and compat mode. The resulting asm/unistd.h changes look a bit counterintuitive. This is only a cleanup patch and it should not change any behavior. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2019-02-07 00:13:28 +01:00
Arnd Bergmann	00bf25d693	y2038: use time32 syscall names on 32-bit This is the big flip, where all 32-bit architectures set COMPAT_32BIT_TIME and use the _time32 system calls from the former compat layer instead of the system calls that take __kernel_timespec and similar arguments. The temporary redirects for __kernel_timespec, __kernel_itimerspec and __kernel_timex can get removed with this. It would be easy to split this commit by architecture, but with the new generated system call tables, it's easy enough to do it all at once, which makes it a little easier to check that the changes are the same in each table. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2019-02-07 00:13:28 +01:00
Arnd Bergmann	8dabe7245b	y2038: syscalls: rename y2038 compat syscalls A lot of system calls that pass a time_t somewhere have an implementation using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have been reworked so that this implementation can now be used on 32-bit architectures as well. The missing step is to redefine them using the regular SYSCALL_DEFINEx() to get them out of the compat namespace and make it possible to build them on 32-bit architectures. Any system call that ends in 'time' gets a '32' suffix on its name for that version, while the others get a '_time32' suffix, to distinguish them from the normal version, which takes a 64-bit time argument in the future. In this step, only 64-bit architectures are changed, doing this rename first lets us avoid touching the 32-bit architectures twice. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2019-02-07 00:13:27 +01:00
Oliver O'Halloran	b174b4fb91	powerpc/powernv: Escalate reset when IODA reset fails The IODA reset is used to flush out any OS controlled state from the PHB. This reset can fail if a PHB fatal error has occurred in early boot, probably due to a because of a bad device. We already do a fundemental reset of the device in some cases, so this patch just adds a test to force a full reset if firmware reports an error when performing the IODA reset. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-07 00:29:20 +11:00
Breno Leitao	ebb0e13ead	powerpc/ptrace: Mitigate potential Spectre v1 'regno' is directly controlled by user space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. On PTRACE_SETREGS and PTRACE_GETREGS requests, user space passes the register number that would be read or written. This register number is called 'regno' which is part of the 'addr' syscall parameter. This 'regno' value is checked against the maximum pt_regs structure size, and then used to dereference it, which matches the initial part of a Spectre v1 (and Spectre v1.1) attack. The dereferenced value, then, is returned to userspace in the GETREGS case. This patch sanitizes 'regno' before using it to dereference pt_reg. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-07 00:29:20 +11:00
Joel Stanley	98ecc6768e	powerpc/32: Include .branch_lt in data section When building a 32 bit powerpc kernel with Binutils 2.31.1 this warning is emitted: powerpc-linux-gnu-ld: warning: orphan section `.branch_lt' from `arch/powerpc/kernel/head_44x.o' being placed in section `.branch_lt' As of binutils commit 2d7ad24e8726 ("Support PLT16 relocs against local symbols")[1], 32 bit targets can produce .branch_lt sections in their output. Include these symbols in the .data section as the ppc64 kernel does. [1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=2d7ad24e8726ba4c45c9e67be08223a146a837ce Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Alan Modra <amodra@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-05 15:47:16 +11:00
Sam Bobroff	195482c363	powerpc/eeh: Correct retries in eeh_pe_reset_full() Currently, eeh_pe_reset_full() will only attempt to reset a PE more than once if activating the reset state and deactivating it both succeed, but later polling shows that it hasn't become active. Change this so that it will try up to three times for any reason other than an unrecoverable slot error and adjust the message generation so that it's clear weather the reset has ultimately succeeded or failed. This allows the reset to succeed in some situations where it would currently fail. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-05 11:55:45 +11:00
Sam Bobroff	1ef52073fd	powerpc/eeh: Improve recovery of passed-through devices Currently, the EEH recovery process considers passed-through devices as if they were not EEH-aware, which can cause them to be removed as part of recovery. Because device removal requires cooperation from the guest, this may lead to the process stalling or deadlocking. Also, if devices are removed on the host side, they will be removed from their IOMMU group, making recovery in the guest impossible. Therefore, alter the recovery process so that passed-through devices are not removed but are instead left frozen (and marked isolated) until the guest performs it's own recovery. If firmware thaws a passed-through PE because it's parent PE has been thawed (because it was not passed through), re-freeze it. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-05 11:55:44 +11:00
Sam Bobroff	4d8e325d9d	powerpc/eeh: Add include_passed to eeh_clear_pe_frozen_state() Add a parameter to eeh_clear_pe_frozen_state() that allows passed-through PEs to be excluded. Update callers to always pass true so that there is no change in behaviour. This is to prepare for follow-up work for passed-through devices. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-05 11:55:44 +11:00
Sam Bobroff	9ed5ca66aa	powerpc/eeh: Add include_passed to eeh_pe_state_clear() Add a parameter to eeh_pe_state_clear() that allows passed-through PEs to be excluded. Update callers to always pass true so that there is no change in behaviour. Also refactor to use direct traversal, to allow the removal of some boilerplate. This is to prepare for follow-up work for passed-through devices. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-05 11:55:43 +11:00
Sam Bobroff	188fdea69f	powerpc/eeh: remove sw_state from eeh_unfreeze_pe() eeh_unfreeze_pe() performs two operations: unfreezing a PE (which may cause firmware to unfreeze child PEs as well) and de-isolating the PE and it's children. To simplify this and support future work, separate out the de-isolation and perform it at the call sites (when necessary). There should be no change in behaviour. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-05 11:55:42 +11:00
Sam Bobroff	3376cb91ed	powerpc/eeh: Cleanup eeh_pe_clear_frozen_state() The 'clear_sw_state' parameter for eeh_pe_clear_frozen_state() is redundant because it has no effect (except in the rare case of a hardware error part way through unfreezing a tree of PEs, where it would dangerously allow partial de-isolation before returning failure). It is passed down to __eeh_pe_clear_frozen_state(), and from there to eeh_unfreeze_pe(), where it causes EEH_PE_ISOLATED to be removed from the state of each PE during the traversal. However, when the traversal finishes, EEH_PE_ISOLATED is unconditionally removed by a call to eeh_pe_state_clear() regardless of the parameter's value. So remove the flag and pass false to eeh_unfreeze_pe() (to avoid the rare case described above, as it was before the flag was introduced). Also, perform the recursion directly in the function and eliminate a bit of boilerplate. There should be no change in functionality, except as mentioned above. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-05 11:55:41 +11:00
Christophe Leroy	26b523356f	powerpc: Drop page_is_ram() and walk_system_ram_range() Since commit `c40dd2f766` ("powerpc: Add System RAM to /proc/iomem") it is possible to use the generic walk_system_ram_range() and the generic page_is_ram(). To enable the use of walk_system_ram_range() by the IBM EHEA ethernet driver, we still need an export of the generic function. As powerpc was the only user of CONFIG_ARCH_HAS_WALK_MEMORY, the ifdef around the generic walk_system_ram_range() has become useless and can be dropped. Fixes: `c40dd2f766` ("powerpc: Add System RAM to /proc/iomem") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Keep the EXPORT_SYMBOL_GPL in powerpc code] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-04 21:22:06 +11:00
Ingo Molnar	98cb621081	Merge branch 'perf/urgent' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-02-04 08:45:42 +01:00
Deepa Dinamani	45bdc66159	socket: Rename SO_RCVTIMEO/ SO_SNDTIMEO with _OLD suffixes SO_RCVTIMEO and SO_SNDTIMEO socket options use struct timeval as the time format. struct timeval is not y2038 safe. The subsequent patches in the series add support for new socket timeout options with _NEW suffix that will use y2038 safe data structures. Although the existing struct timeval layout is sufficiently wide to represent timeouts, because of the way libc will interpret time_t based on user defined flag, these new flags provide a way of having a structure that is the same for all architectures consistently. Rename the existing options with _OLD suffix forms so that the right option is enabled for userspace applications according to the architecture and time_t definition of libc. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Cc: ccaulfie@redhat.com Cc: deller@gmx.de Cc: paulus@samba.org Cc: ralf@linux-mips.org Cc: rth@twiddle.net Cc: cluster-devel@redhat.com Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-03 11:17:31 -08:00
Mathieu Malaterre	8e0f973575	Move static keyword at beginning of declaration Move the static keyword around to remove the following warnings (W=1): arch/powerpc/platforms/ps3/os-area.c:212:1: error: 'static' is not at beginning of declaration [-Werror=old-style-declaration] arch/powerpc/platforms/ps3/system-bus.c:45:1: error: 'static' is not at beginning of declaration [-Werror=old-style-declaration] Signed-off-by: Mathieu Malaterre <malat@debian.org> Acked-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-03 20:44:19 +11:00
Mathieu Malaterre	e5c27ef7a5	powerpc: Remove trailing semicolon after curly brace There is not point in having a trailing semicolon after a closing curly brace. Remove it. Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-03 20:01:03 +11:00
Sandipan Das	6f20c71d85	bpf: powerpc64: add JIT support for bpf line info This adds support for generating bpf line info for JITed programs. Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-01 21:03:34 +01:00
Christian Lamparter	423bfc69d7	powerpc: Enable kernel XZ compression option on 44x Enable kernel XZ compression option on 44x. Tested on a Western Digital - MyBook Live NAS. It takes 22 seconds for the 800 MHz CPU to decompress and boot a 2.63 MiB XZ-compressed kernel simpleImage. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-01 22:04:55 +11:00
Christoph Hellwig	cfced78696	dma-mapping: remove the default map_resource implementation Instead provide a proper implementation in the direct mapping code, and also wire it up for arm and powerpc, leaving an error return for all the IOMMU or virtual mapping instances for which we'd have to wire up an actual implementation Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>	2019-02-01 09:56:15 +01:00
Oliver O'Halloran	5a3840a470	powerpc/papr_scm: Use the correct bind address When binding an SCM volume to a physical address the hypervisor has the option to return early with a continue token with the expectation that the guest will resume the bind operation until it completes. A quirk of this interface is that the bind address will only be returned by the first bind h-call and the subsequent calls will return 0xFFFF_FFFF_FFFF_FFFF for the bind address. We currently do not save the address returned by the first h-call. As a result we will use the junk address as the base of the bound region if the hypervisor decides to split the bind across multiple h-calls. This bug was found when testing with very large SCM volumes where the bind process would take more time than they hypervisor's internal h-call time limit would allow. This patch fixes the issue by saving the bind address from the first call. Cc: stable@vger.kernel.org Fixes: `b5beae5e22` ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-01 10:13:51 +11:00
Aneesh Kumar K.V	579b9239c1	powerpc/radix: Fix kernel crash with mremap() With support for split pmd lock, we use pmd page pmd_huge_pte pointer to store the deposited page table. In those config when we move page tables we need to make sure we move the deposited page table to the correct pmd page. Otherwise this can result in crash when we withdraw of deposited page table because we can find the pmd_huge_pte NULL. eg: __split_huge_pmd+0x1070/0x1940 __split_huge_pmd+0xe34/0x1940 (unreliable) vma_adjust_trans_huge+0x110/0x1c0 __vma_adjust+0x2b4/0x9b0 __split_vma+0x1b8/0x280 __do_munmap+0x13c/0x550 sys_mremap+0x220/0x7e0 system_call+0x5c/0x70 Fixes: `675d995297` ("powerpc/book3s64: Enable split pmd ptlock.") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 20:10:15 +11:00
Joe Lawrence	3de27dcf81	powerpc/livepatch: return -ERRNO values in save_stack_trace_tsk_reliable() To match its x86 counterpart, save_stack_trace_tsk_reliable() should return -EINVAL in cases that it is currently returning 1. No caller is currently differentiating non-zero error codes, but let's keep the arch-specific implementations consistent. Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 16:43:38 +11:00
Joe Lawrence	29a77bbb0c	powerpc/livepatch: small cleanups in save_stack_trace_tsk_reliable() Mostly cosmetic changes: - Group common stack pointer code at the top - Simplify the first frame logic - Code stackframe iteration into for...loop construct - Check for trace->nr_entries overflow before adding any into the array Suggested-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 16:43:38 +11:00
Joe Lawrence	18be37603d	powerpc/livepatch: relax reliable stack tracer checks for first-frame The bottom-most stack frame (the first to be unwound) may be largely uninitialized, for the "Power Architecture 64-Bit ELF V2 ABI" only requires its backchain pointer to be set. The reliable stack tracer should be careful when verifying this frame: skip checks on STACK_FRAME_LR_SAVE and STACK_FRAME_MARKER offsets that may contain uninitialized residual data. Fixes: `df78d3f614` ("powerpc/livepatch: Implement reliable stack tracing for the consistency model") Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 16:43:38 +11:00
Nicolai Stange	a50d3250d7	powerpc/64s: Make reliable stacktrace dependency clearer Make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies PPC_BOOK3S_64, there's no functional change here. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 16:43:29 +11:00
Nicolai Stange	eddd0b3323	powerpc/64s: Clear on-stack exception marker upon exception return The ppc64 specific implementation of the reliable stacktracer, save_stack_trace_tsk_reliable(), bails out and reports an "unreliable trace" whenever it finds an exception frame on the stack. Stack frames are classified as exception frames if the STACK_FRAME_REGS_MARKER magic, as written by exception prologues, is found at a particular location. However, as observed by Joe Lawrence, it is possible in practice that non-exception stack frames can alias with prior exception frames and thus, that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on the stack. It in turn falsely reports an unreliable stacktrace and blocks any live patching transition to finish. Said condition lasts until the stack frame is overwritten/initialized by function call or other means. In principle, we could mitigate this by making the exception frame classification condition in save_stack_trace_tsk_reliable() stronger: in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into account that for all exceptions executing on the kernel stack - their stack frames's backlink pointers always match what is saved in their pt_regs instance's ->gpr[1] slot and that - their exception frame size equals STACK_INT_FRAME_SIZE, a value uncommonly large for non-exception frames. However, while these are currently true, relying on them would make the reliable stacktrace implementation more sensitive towards future changes in the exception entry code. Note that false negatives, i.e. not detecting exception frames, would silently break the live patching consistency model. Furthermore, certain other places (diagnostic stacktraces, perf, xmon) rely on STACK_FRAME_REGS_MARKER as well. Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER for those exceptions running on the "normal" kernel stack and returning to kernelspace: because the topmost frame is ignored by the reliable stack tracer anyway, returns to userspace don't need to take care of clearing the marker. Furthermore, as I don't have the ability to test this on Book 3E or 32 bits, limit the change to Book 3S and 64 bits. Fixes: `df78d3f614` ("powerpc/livepatch: Implement reliable stack tracing for the consistency model") Reported-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 16:40:25 +11:00
Madhavan Srinivasan	ab4510e9ac	powerpc/perf: Add mem access events to sysfs Add mem-loads/mem-stores events to sysfs. The event is formed based on raw event encoding. Primary PMU event used here is PM_MRK_INST_CMPL along with MMCRA[SM] modes and Thresholding bit Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 10:38:27 +11:00
Reza Arbab	865a9432d1	powerpc/mm: Add _PAGE_SAO to _PAGE_CACHE_CTL mask In htab_convert_pte_flags(), _PAGE_CACHE_CTL is used to check for the _PAGE_SAO flag: else if ((pteflags & _PAGE_CACHE_CTL) == _PAGE_SAO) rflags \|= (HPTE_R_W \| HPTE_R_I \| HPTE_R_M); But, it isn't defined to include that flag: #define _PAGE_CACHE_CTL (_PAGE_NON_IDEMPOTENT \| _PAGE_TOLERANT) This happens to work, but only because of the flag values: #define _PAGE_SAO 0x00010 /* Strong access order / #define _PAGE_NON_IDEMPOTENT 0x00020 / non idempotent memory / #define _PAGE_TOLERANT 0x00030 / tolerant memory, cache inhibited */ To prevent any issues if these particulars ever change, add _PAGE_SAO to the mask. Suggested-by: Charles Johns <crjohns@us.ibm.com> Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 00:36:06 +11:00
Sabyasachi Gupta	45a202a3fe	powerpc/cell: Remove duplicate header Remove linux/syscalls.h which is included more than once Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-31 00:00:23 +11:00
Sabyasachi Gupta	f069a062ec	powerpc/powernv: Remove duplicate header Remove linux/printk.h which is included more than once. Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-30 23:59:28 +11:00
Brajeswar Ghosh	75f8a37580	powerpc/kernel/time: Remove duplicate header Remove linux/rtc.h which is included more than once Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-30 23:42:31 +11:00
Christophe Leroy	9bf3d3c4e4	powerpc/traps: Fix the message printed when stack overflows Today's message is useless: [ 42.253267] Kernel stack overflow in process (ptrval), r1=c65500b0 This patch fixes it: [ 66.905235] Kernel stack overflow in process sh[356], r1=c65560b0 Fixes: `ad67b74d24` ("printk: hash addresses printed with %p") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Use task_pid_nr()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-30 23:31:44 +11:00
Nathan Fontenot	81b6132492	powerpc/pseries: Perform full re-add of CPU for topology update post-migration On pseries systems, performing a partition migration can result in altering the nodes a CPU is assigned to on the destination system. For exampl, pre-migration on the source system CPUs are in node 1 and 3, post-migration on the destination system CPUs are in nodes 2 and 3. Handling the node change for a CPU can cause corruption in the slab cache if we hit a timing where a CPUs node is changed while cache_reap() is invoked. The corruption occurs because the slab cache code appears to rely on the CPU and slab cache pages being on the same node. The current dynamic updating of a CPUs node done in arch/powerpc/mm/numa.c does not prevent us from hitting this scenario. Changing the device tree property update notification handler that recognizes an affinity change for a CPU to do a full DLPAR remove and add of the CPU instead of dynamically changing its node resolves this issue. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com> Tested-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-30 23:28:56 +11:00
David S. Miller	ec7146db15	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2019-01-29 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Teach verifier dead code removal, this also allows for optimizing / removing conditional branches around dead code and to shrink the resulting image. Code store constrained architectures like nfp would have hard time doing this at JIT level, from Jakub. 2) Add JMP32 instructions to BPF ISA in order to allow for optimizing code generation for 32-bit sub-registers. Evaluation shows that this can result in code reduction of ~5-20% compared to 64 bit-only code generation. Also add implementation for most JITs, from Jiong. 3) Add support for __int128 types in BTF which is also needed for vmlinux's BTF conversion to work, from Yonghong. 4) Add a new command to bpftool in order to dump a list of BPF-related parameters from the system or for a specific network device e.g. in terms of available prog/map types or helper functions, from Quentin. 5) Add AF_XDP sock_diag interface for querying sockets from user space which provides information about the RX/TX/fill/completion rings, umem, memory usage etc, from Björn. 6) Add skb context access for skb_shared_info->gso_segs field, from Eric. 7) Add support for testing flow dissector BPF programs by extending existing BPF_PROG_TEST_RUN infrastructure, from Stanislav. 8) Split BPF kselftest's test_verifier into various subgroups of tests in order better deal with merge conflicts in this area, from Jakub. 9) Add support for queue/stack manipulations in bpftool, from Stanislav. 10) Document BTF, from Yonghong. 11) Dump supported ELF section names in libbpf on program load failure, from Taeung. 12) Silence a false positive compiler warning in verifier's BTF handling, from Peter. 13) Fix help string in bpftool's feature probing, from Prashant. 14) Remove duplicate includes in BPF kselftests, from Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-28 19:38:33 -08:00
Greg Kroah-Hartman	fdddcfd9c9	Merge 5.0-rc4 into char-misc-next We need the char-misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-28 08:13:52 +01:00
Masahiro Yamada	afa974b771	kbuild: add real-prereqs shorthand for $(filter-out FORCE,$^) In Kbuild, if_changed and friends must have FORCE as a prerequisite. Hence, $(filter-out FORCE,$^) or $(filter-out $(PHONY),$^) is a common idiom to get the names of all the prerequisites except phony targets. Add real-prereqs as a shorthand. Note: We cannot replace $(filter %.o,$^) in cmd_link_multi-m because $^ may include auto-generated dependencies from the .*.cmd file when a single object module is changed into a multi object module. Refer to commit `69ea912fda` ("kbuild: remove unneeded link_multi_deps"). I added some comment to avoid accidental breakage. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Rob Herring <robh@kernel.org>	2019-01-28 09:11:17 +09:00
Jiong Wang	5f6459966d	ppc: bpf: implement jitting of JMP32 This patch implements code-gen for new JMP32 instructions on ppc. For JMP32 \| JSET, instruction encoding for PPC_RLWINM_DOT is added to check the result of ANDing low 32-bit of operands. Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-26 13:33:02 -08:00
Arnd Bergmann	0d6040d468	arch: add split IPC system calls where needed The IPC system call handling is highly inconsistent across architectures, some use sys_ipc, some use separate calls, and some use both. We also have some architectures that require passing IPC_64 in the flags, and others that set it implicitly. For the addition of a y2038 safe semtimedop() system call, I chose to only support the separate entry points, but that requires first supporting the regular ones with their own syscall numbers. The IPC_64 is now implied by the new semctl/shmctl/msgctl system calls even on the architectures that require passing it with the ipc() multiplexer. I'm not adding the new semtimedop() or semop() on 32-bit architectures, those will get implemented using the new semtimedop_time64() version that gets added along with the other time64 calls. Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop(). Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2019-01-25 17:22:50 +01:00
Greg Kroah-Hartman	c1507ea834	pseries: ibmebus.c: convert to use BUS_ATTR_WO We are trying to get rid of BUS_ATTR() and the usage of that in ibmebus.c can be trivially converted to use BUS_ATTR_WO(), so use that instead. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-22 14:25:26 +01:00
Logan Gunthorpe	79bf0cbd86	iomap: introduce io{read\|write}64_{lo_hi\|hi_lo} In order to provide non-atomic functions for io{read\|write}64 that will use readq and writeq when appropriate. We define a number of variants of these functions in the generic iomap that will do non-atomic operations on pio but atomic operations on mmio. These functions are only defined if readq and writeq are defined. If they are not, then the wrappers that always use non-atomic operations from include/linux/io-64-nonatomic*.h will be used. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Suresh Warrier <warrier@linux.vnet.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-22 13:39:59 +01:00
Finn Thain	20e07af71f	powerpc: Adopt nvram module for PPC64 Adopt nvram module to reduce code duplication. This means CONFIG_NVRAM becomes available to PPC64 builds. Previously it was only available to PPC32 builds because it depended on CONFIG_GENERIC_NVRAM. The IOC_NVRAM_GET_OFFSET ioctl as implemented on PPC64 validates the offset returned by pmac_get_partition(). Do the same in the nvram module. Note that the old PPC32 generic_nvram module lacked this test. So when CONFIG_PPC32 && CONFIG_PPC_PMAC, the IOC_NVRAM_GET_OFFSET ioctl would have returned 0 (always). But when CONFIG_PPC64 && CONFIG_PPC_PMAC, the IOC_NVRAM_GET_OFFSET ioctl would have returned -1 (which is -EPERM) when the requested partition was not found. With this patch, the result is now -EINVAL on both PPC32 and PPC64 when the requested PowerMac NVRAM partition is not found. This is a userspace- visible change, in the non-existent partition case, which would be in an error path for an IOC_NVRAM_GET_OFFSET ioctl syscall. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-22 10:21:45 +01:00
Finn Thain	f9c3a570f5	powerpc: Enable HAVE_ARCH_NVRAM_OPS and disable GENERIC_NVRAM Switch PPC32 kernels from the generic_nvram module to the nvram module. Also fix a theoretical bug where CHRP omits the chrp_nvram_init() call when CONFIG_NVRAM_MODULE=m. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-22 10:21:45 +01:00
Finn Thain	ebcebc7f45	powerpc: Define missing ppc_md.nvram_size for CHRP and PowerMac Add the nvram_size() function to those PowerPC platforms that don't already have one: CHRP and PowerMac. This means that the ppc_md.nvram_size() function can be called by nvram_get_size(). Since we are addressing CHRP inconsistencies here, rename chrp_nvram_read and chrp_nvram_write, which break the naming convention used across powerpc platforms for NVRAM accessor functions. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-22 10:21:45 +01:00
Finn Thain	a156c7ba66	powerpc: Replace nvram_* extern declarations with standard header Remove the nvram_read_byte() and nvram_write_byte() declarations in powerpc/include/asm/nvram.h and use the cross-platform static functions in linux/nvram.h instead. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-22 10:21:43 +01:00
Andrew Murray	c2c9091d9e	perf/core, arch/powerpc: use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs For PowerPC PMUs that do not support context exclusion let's advertise the PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will prevent us from handling events where any exclusion flags are set. Let's also remove the now unnecessary check for exclusion flags. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <linux@armlinux.org.uk> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: robin.murphy@arm.com Cc: suzuki.poulose@arm.com Link: https://lkml.kernel.org/r/1547128414-50693-10-git-send-email-andrew.murray@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-01-21 11:01:26 +01:00
Linus Torvalds	6a0141a096	A single build fix for powerpc due to device_node.type removal -----BEGIN PGP SIGNATURE----- iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAlxDUjkQHHJvYmhAa2Vy bmVsLm9yZwAKCRD6+121jbxhwzTBD/9MfQbCzF5IcgW+alp7fKqoFH5MQUkwEQgO TJ7TO/CgrnT/uDjUe90tfixsF7A4orpK8h8FWczLF22Ksnirqaxv4Gcmj0ItGQlD TeTzCV7DW6Kgcv6weQjH5PohB3XUKzHQe8/MdZCXnn/ikQpKK5bmQEV3Kxw1s4zM jHIwMGcDy+EKbcLPpq/lhLyCR7Ce3SrzCtrY41rw+tpgi450A72g0/sOf5onb/Xz +/akwRl8lIfcIl21+wMA4zyLOHVeqio+xvXSiYB31z82Pv2v4hGLxjTqBVfPLhpq 1qJI1ua4v+424RLBNInuEf3+lnUkNizjT38INw0zw+E86TLtQYj1n/m2upoVsRE3 ggLWVnxlWEIebDhpLySnKW9R58fYxXSq+qaP2jOKkGjzx+0HCZDVIjZwZx3C2Vft sMAVSNKd+Mche6GGGEhcPxmKnMIh6P2XyIo/60hv8JqYY6CwqxBGyCGQvCck38AA OySmMnS7q31pwiS3m6/RnlLiL6JmHeI/P5FRmPUak68Gbsp7dddcw6dIFUuOBsnF QQpuq99cG4r3e9Az5/A45KNrR5MyZVJHoZj0CmjLPRnQuJsPvpEk5BQi2rmH5jSG F/BPcFU9/+r65tDVYVmOaaHfqOWk9oeoxGR+e8ve/MpfTktid7to/dLv9smqs0UB XdHeh+jepg== =6H3h -----END PGP SIGNATURE----- Merge tag 'devicetree-fixes-for-5.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull Devicetree fix from Rob Herring: "A single build fix for powerpc due to device_node.type removal" * tag 'devicetree-fixes-for-5.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: powerpc: chrp: Use of_node_is_type to access device_type	2019-01-20 10:28:46 +12:00
Rob Herring	75a080cde0	powerpc: chrp: Use of_node_is_type to access device_type Commit `8ce5f84157` ("of: Remove struct device_node.type pointer") removed struct device_node.type pointer, but the conversion to use of_node_is_type() accessor was missed in chrp_init_IRQ(). Fixes: `8ce5f84157` ("of: Remove struct device_node.type pointer") Reported-by: kbuild test robot <lkp@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Rob Herring <robh@kernel.org>	2019-01-19 10:35:42 -06:00
Linus Torvalds	c5b709804e	powerpc fixes for 5.0 #3 A couple of weeks of fixes. There's one fix for an oops on Power9 machines with Open CAPI adapters. And a fix for probable memory corruption in some of the new NPU code, caught by smatch though and not seen in the wild. Plus a few other minor fixes. There's one non-fix which is the perf_regs change. That was sent during the merge window but I accidentally only merged the first of two patches in the series. It's been in linux-next so hopefully doesn't conflict with anything in acme's tree. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Breno Leitao, Christian Lamparter, Christophe Leroy, Dan Carpenter, Frederic Barrat, Greg Kurz, Jason A. Donenfeld, Madhavan Srinivasan. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJcQcioAAoJEFHr6jzI4aWAMxoP/j2w/p1z5As/rMQRH9L0wTDV Z/69GkRnj+rkRSNBWJ2T/0KY6c1mXPH4R2nvmFNfdEYzXWLh+Ymn65RQ3ifQnb56 C5PPjVOPruiCjKAWiNYGr8S+Ev8IehZU0zXXToCwV1MKCMU0QcO6Q1HtEVI56WhV xtQfBJz1tkPJ4Ep9HZ7go7p6SKaFmmWh/Z8pg02s5DOlGN4bKFQ3Qc+XnNPw5vc8 LgjrwrOIQ7D+lXa6saQWbV16ktLzzpsxDfxXHXNTz0bOjyuQAXfdnfGJnEoDowYa Pqio5fm1rcjXcHtqwuSsRWeYi+dzO+AYj0WUrqevcPSAMM0RwmqREcfBGLvAlPWA fYfuMMB5zhf9HkDHkx4+8pvZ6io+VDP5k5YF7ZnQfz8tVYAboTmRvIiGAM8ks8hC 6DnNdV2WojBeoK2gWsgX+WAIc4Ynk+u0554kf884rtiK7TSCRq63JNTeTmIr8v/u 7g5qwlC99RDYsl/ZkY2eQviiQo6dWXTwRCZ9lbk/iLivc90ulN7P+8r3oaQNV6ja zYpiLz95fpL7g5G0caW3AZTzfnJxOGioaCGOQc/hZHzhdc7p9zWH+7sd9mPMGayu iTMn66h2v8cf6o6u2peAf15NQvR0jHe8mIccUpRJTXWwnlVMI2WAcXqlpE+9fj5V gBZ0MuitQtX0qLEtFpUa =hlZ7 -----END PGP SIGNATURE----- Merge tag 'powerpc-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A couple of weeks of fixes. There's one fix for an oops on Power9 machines with Open CAPI adapters. And a fix for probable memory corruption in some of the new NPU code, caught by smatch though and not seen in the wild. Plus a few other minor fixes. There's one non-fix which is the perf_regs change. That was sent during the merge window but I accidentally only merged the first of two patches in the series. It's been in linux-next so hopefully doesn't conflict with anything in acme's tree. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Breno Leitao, Christian Lamparter, Christophe Leroy, Dan Carpenter, Frederic Barrat, Greg Kurz, Jason A. Donenfeld, Madhavan Srinivasan" * tag 'powerpc-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/syscalls: Fix syscall tracing powerpc/pseries: Fix build break due to pnv_npu2_init() powerpc/4xx/ocm: Fix fix for phys_addr_t printf warnings powerpc/powernv/npu: Fix oops in pnv_try_setup_npu_table_group() powerpc/tm: Limit TM code inside PPC_TRANSACTIONAL_MEM powerpc/8xx: fix setting of pagetable for Abatron BDI debug tool. powerpc/powernv/npu: Allocate enough memory in pnv_try_setup_npu_table_group() powerpc/perf: Update perf_regs structure to include MMCRA	2019-01-19 05:55:42 +12:00
Michael Ellerman	7bea7ac0ca	powerpc/syscalls: Fix syscall tracing Recently in commit `fbf508da74` ("powerpc: split compat syscall table out from native table") we changed the layout of the system call table. Instead of having two entries for each syscall number, one for the regular entry point and one for the compat entry point, we now have separate tables for regular and compat entry points. This inadvertently broke syscall tracing (CONFIG_FTRACE_SYSCALLS), because our implementation of arch_syscall_addr() knew about the layout of the table (it did nr * 2). We can fix it just by dropping our version of arch_syscall_addr() and using the generic version which does: return (unsigned long)sys_call_table[nr]; Fixes: `fbf508da74` ("powerpc: split compat syscall table out from native table") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 21:32:25 +11:00
Jason A. Donenfeld	da727097a4	powerpc/pseries: Fix build break due to pnv_npu2_init() Commit `3be2df00e2` ("powerpc/pseries/npu: Enable platform support") added a call to pnv_npu2_init() in pseries code. This causes a build break if we build with CONFIG_PPC_PSERIES && !CONFIG_PPC_POWERNV: powerpc64le-pc-linux-gnu-ld: arch/powerpc/platforms/pseries/pci.o: in function `pSeries_final_fixup': pci.c:(.init.text+0x1b0): undefined reference to `pnv_npu2_init' This commit therefore wraps that line in an ifdef, so that pseries builds without powernv. Fixes: `3be2df00e2` ("powerpc/pseries/npu: Enable platform support") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Frob change log a bit to blame a different commit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 21:27:47 +11:00
Igor Stoppa	63da6caeb8	powerpc: remove unnecessary unlikely() WARN_ON() already contains an unlikely(), so it's not necessary to wrap it into another. Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com> Cc: Arseny Solokha <asolokha@kb.kras.ru> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:38:05 +11:00
Mathieu Malaterre	9bd10b6498	powerpc: Allow CPU selection of G4/74xx variant GCC supports -mcpu=G4 This patch gives the opportunity to select ALTIVEC for this variant. Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:19:45 +11:00
Michael Ellerman	16842516ea	powerpc/64s: Add MMU type to __die() output On Power9 machines (64-bit Book3S), we can be running with either the Hash table or Radix tree MMU enabled. So add some text to the __die() output to tell us which is enabled, for the case where all you have is the oops output and no other information. Example output: kernel BUG at drivers/misc/lkdtm/bugs.c:63! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: kvm vmx_crypto binfmt_misc ip_tables x_tables Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:17:10 +11:00
Michael Ellerman	184051396b	powerpc: Show PAGE_SIZE in __die() output The page size the kernel is built with is useful info when debugging a crash, so add it to the output in __die(). Result looks like eg: kernel BUG at drivers/misc/lkdtm/bugs.c:63! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K SMP NR_CPUS=2048 NUMA pSeries Modules linked in: vmx_crypto kvm binfmt_misc ip_tables Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:17:09 +11:00
Michael Ellerman	782274434d	powerpc: Stop using pr_cont() in __die() Using pr_cont() risks having our output interleaved with other output from other CPUs. Instead print everything in a single printk() call. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:17:09 +11:00
Jonathan Neuschäfer	8de7547e03	powerpc: wii.dts: Add GPIO keys The Wii has POWER and EJECT buttons, which are connected through normalization logic to the GPIO controller (the length of an assertion of these signals is always the same, regardless of how long the user pressed the buttons). Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:17:09 +11:00
Jonathan Neuschäfer	f4ddc19a71	powerpc: wii.dts: Add interrupt-related properties to GPIO node The Hollywood GPIO controller is connected to the Hollywood PIC (&PIC1) at IRQs 10 and 11; IRQ 10 for GPIO lines that are configured for access by the PPC, 11 for GPIO lines that are configured for access by the ARM926. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:17:09 +11:00
Alexey Kardashevskiy	797eadd9c8	powerpc/powernv/npu: Remove obsolete comment about TCE_KILL_INVAL_ALL TCE_KILL_INVAL_ALL has moved long ago but the comment was forgotted so finish the move and remove the comment. Fixes: `0bbcdb437d` "powerpc/powernv/npu: TCE Kill helpers cleanup" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:17:09 +11:00
Alexey Kardashevskiy	c35f78d7a4	powerpc/powernv: Remove never used pnv_power9_force_smt4 This removes never used symbol - pnv_power9_force_smt4. Note that we might still want to add stubs for: void pnv_power9_force_smt4_catch(void); void pnv_power9_force_smt4_release(void); Fixes: `7672691a08` "powerpc/powernv: Provide a way to force a core into SMT4 mode" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:17:09 +11:00
Alexey Kardashevskiy	cd6b8a631c	powerpc/mm: Fix compile when CONFIG_PPC_RADIX_MMU is not defined This adds some stubs for hash only configs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:17:09 +11:00
Joel Stanley	a652758ac1	powerpc: Use ALIGN instead of BLOCK In the ld documentation under Builtin Functions: BLOCK(exp) This is a synonym for ALIGN, for compatibility with older linker scripts. Clang's linker (lld) doesn't know about BLOCK so remove this use of it. Suggested-by: George Rimar <grimar@accesssoftek.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-15 11:12:10 +11:00
Corentin Labbe	acef5e0165	powerpc/dts: Build virtex dtbs I wanted to test the virtex440-ml507 qemu machine and found that the dtb for it was not built. All powerpc dtbs are only built when CONFIG_OF_ALL_DTBS is set which depend on COMPILE_TEST. This patch enables building of the virtex dtbs when CONFIG_XILINX_VIRTEX440_GENERIC_BOARD is enabled. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> [mpe: Put both targets on a single line] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Christophe Leroy	8acb88682c	powerpc/ipic: drop unused functions ipic_set_highest_priority(), ipic_enable_mcp() and ipic_disable_mcp() are unused. This patch drops them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Gustavo A. R. Silva	00def7130a	powerpc/spufs: use struct_size() in kmalloc() One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void entry[]; }; instance = kmalloc(sizeof(struct foo) + sizeof(void ) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Masahiro Yamada	fbe3ab014f	powerpc: math-emu: remove unneeded header search paths The header search path -I. in kernel Makefiles is very suspicious; it allows the compiler to search for headers in the top of $(srctree), where obviously no header file exists. -Iinclude/math-emu seems unnecessary because all files include headers in the form of #include <math-emu/...>. I was able to build without these header search paths. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Masahiro Yamada	b00899b895	powerpc: remove redundant header search path additions The same path -Iarch/$(ARCH) is passed to KBUILD_CPPFLAGS, KBUILD_AFLAGS, and KBUILD_CFLAGS. As you see in scripts/Makefile.lib, KBUILD_CPPFLAGS is passed to c_flags and a_flags as well. Passing it to KBUILD_CPPFLAGS is enough. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Masahiro Yamada	c142e9741e	KVM: powerpc: remove -I. header search paths The header search path -I. in kernel Makefiles is very suspicious; it allows the compiler to search for headers in the top of $(srctree), where obviously no header file exists. Commit `46f43c6ee0` ("KVM: powerpc: convert marker probes to event trace") first added these options, but they are completely useless. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
YueHaibing	7cd4774ff7	powerpc/mm: Fix debugfs_simple_attr.cocci warnings Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Christophe Leroy	607ea5090b	powerpc/irq: drop arch_early_irq_init() arch_early_irq_init() does nothing different than the weak arch_early_irq_init() in kernel/softirq.c Fixes: `089fb442f3` ("powerpc: Use ARCH_IRQ_INIT_FLAGS") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Gustavo A. R. Silva	31367b9a01	powerpc/ps3: Use struct_size() in kzalloc() One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void entry[]; }; instance = kzalloc(sizeof(struct foo) + sizeof(void ) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Matteo Croce	3b702ddd06	powerpc/hvsi: Fix spelling mistake: "lenght" should be "length" Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Sergey Senozhatsky	fae1383b38	powerpc: use a CONSOLE_LOGLEVEL_DEBUG macro Use a CONSOLE_LOGLEVEL_DEBUG macro for console_loglevel rather than a naked number. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-14 20:39:27 +11:00
Michael Ellerman	fcf5036f09	powerpc/4xx/ocm: Fix fix for phys_addr_t printf warnings My recent commit to fix the printf warnings in ocm.c got the format specifier wrong, because I copied it from the documentation without realising the square brackets are not meant as literals. This results in the address being suffixed with a literal "[p]". Actually tested this time: # cat info /sys/kernel/debug/ppc4xx_ocm PhysAddr : 0x0000000400040000 ... NC.PhysAddr : 0x0000000400040000 ... C.PhysAddr : 0x0000000000000000 Fixes: `52b88fa1e8` ("powerpc/4xx/ocm: Fix phys_addr_t printf warnings") Reported-by: Christian Lamparter <chunkeey@gmail.com> Tested-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-11 23:57:20 +11:00
Frederic Barrat	6bca515917	powerpc/powernv/npu: Fix oops in pnv_try_setup_npu_table_group() With a recent change around IOMMU group, a system with an opencapi adapter is no longer booting and we get a kernel oops: BUG: Kernel NULL pointer dereference at 0x00000028 Faulting instruction address: 0xc0000000000aa38c ... NIP pnv_try_setup_npu_table_group+0x1c/0x1a0 LR pnv_pci_ioda_fixup+0x1f8/0x660 Call Trace: pnv_try_setup_npu_table_group+0x60/0x pnv_pci_ioda_fixup+0x20c/0x660 pcibios_resource_survey+0x2c8/0x31c pcibios_init+0xb0/0xe4 do_one_initcall+0x64/0x264 kernel_init_freeable+0x36c/0x468 kernel_init+0x2c/0x148 ret_from_kernel_thread+0x5c/0x68 An opencapi device is using a device PE, so the current code breaks because pe->pbus is not defined. More generally, there's no need to define an IOMMU group for opencapi, as the device sends real addresses directly (admittedly, the virtualization story is yet to be written). So let's fix it by skipping the IOMMU group setup for opencapi PHBs. Fixes: `0bd971676e` ("powerpc/powernv/npu: Add compound IOMMU groups") Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-11 23:45:00 +11:00
Breno Leitao	897bc3df8c	powerpc/tm: Limit TM code inside PPC_TRANSACTIONAL_MEM Commit `e1c3743e1a` ("powerpc/tm: Set MSR[TS] just prior to recheckpoint") moved a code block around and this block uses a 'msr' variable outside of the CONFIG_PPC_TRANSACTIONAL_MEM, however the 'msr' variable is declared inside a CONFIG_PPC_TRANSACTIONAL_MEM block, causing a possible error when CONFIG_PPC_TRANSACTION_MEM is not defined. error: 'msr' undeclared (first use in this function) This is not causing a compilation error in the mainline kernel, because 'msr' is being used as an argument of MSR_TM_ACTIVE(), which is defined as the following when CONFIG_PPC_TRANSACTIONAL_MEM is not set: #define MSR_TM_ACTIVE(x) 0 This patch just fixes this issue avoiding the 'msr' variable usage outside the CONFIG_PPC_TRANSACTIONAL_MEM block, avoiding trusting in the MSR_TM_ACTIVE() definition. Cc: stable@vger.kernel.org Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Fixes: `e1c3743e1a` ("powerpc/tm: Set MSR[TS] just prior to recheckpoint") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-11 23:45:00 +11:00
Christophe Leroy	fb0bdec51a	powerpc/8xx: fix setting of pagetable for Abatron BDI debug tool. Commit `8c8c10b90d` ("powerpc/8xx: fix handling of early NULL pointer dereference") moved the loading of r6 earlier in the code. As some functions are called inbetween, r6 needs to be loaded again with the address of swapper_pg_dir in order to set PTE pointers for the Abatron BDI. Fixes: `8c8c10b90d` ("powerpc/8xx: fix handling of early NULL pointer dereference") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-11 23:45:00 +11:00
Dan Carpenter	d7b6cc199b	powerpc/powernv/npu: Allocate enough memory in pnv_try_setup_npu_table_group() There is a typo so we accidentally allocate enough memory for a pointer when we wanted to allocate enough for a struct. Fixes: `0bd971676e` ("powerpc/powernv/npu: Add compound IOMMU groups") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-11 23:44:54 +11:00
Luis Chamberlain	750afb08ca	cross-tree: phase out dma_zalloc_coherent() We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-01-08 07:58:37 -05:00
Madhavan Srinivasan	6529870cb0	powerpc/perf: Update perf_regs structure to include MMCRA On each sample, Monitor Mode Control Register A (MMCRA) content is saved in pt_regs. MMCRA does not have a entry as-is in the pt_regs but instead, MMCRA content is saved in the "dsisr" register of pt_regs. Patch adds another entry to the perf_regs structure to include the "MMCRA" printing which internally maps to the "dsisr" of pt_regs. It also check for the MMCRA availability in the platform and present value accordingly mpe: This was the 2nd patch in a series with commit `333804dc3b` ("powerpc/perf: Update perf_regs structure to include SIER") but I accidentally only merged the 1st patch, so merge this one now. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-08 19:22:47 +11:00
Dan Williams	8fc5c73554	acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node Persistent memory, as described by the ACPI NFIT (NVDIMM Firmware Interface Table), is the first known instance of a memory range described by a unique "target" proximity domain. Where "initiator" and "target" proximity domains is an approach that the ACPI HMAT (Heterogeneous Memory Attributes Table) uses to described the unique performance properties of a memory range relative to a given initiator (e.g. CPU or DMA device). Currently the numa-node for a /dev/pmemX block-device or /dev/daxX.Y char-device follows the traditional notion of 'numa-node' where the attribute conveys the closest online numa-node. That numa-node attribute is useful for cpu-binding and memory-binding processes near the device. However, when the memory range backing a 'pmem', or 'dax' device is onlined (memory hot-add) the memory-only-numa-node representing that address needs to be differentiated from the set of online nodes. In other words, the numa-node association of the device depends on whether you can bind processes near the cpu-numa-node in the offline device-case, or bind process on the memory-range directly after the backing address range is onlined. Allow for the case that platform firmware describes persistent memory with a unique proximity domain, i.e. when it is distinct from the proximity of DRAM and CPUs that are on the same socket. Plumb the Linux numa-node translation of that proximity through the libnvdimm region device to namespaces that are in device-dax mode. With this in place the proposed kmem driver [1] can optionally discover a unique numa-node number for the address range as it transitions the memory from an offline state managed by a device-driver to an online memory range managed by the core-mm. [1]: https://lore.kernel.org/lkml/20181022201317.8558C1D8@viggo.jf.intel.com Reported-by: Fan Du <fan.du@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2019-01-06 21:41:57 -08:00
Masahiro Yamada	d6e4b3e326	arch: remove redundant UAPI generic-y defines Now that Kbuild automatically creates asm-generic wrappers for missing mandatory headers, it is redundant to list the same headers in generic-y and mandatory-y. Suggested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Sam Ravnborg <sam@ravnborg.org>	2019-01-06 10:22:15 +09:00
Masahiro Yamada	d4ce5458ea	arch: remove stale comments "UAPI Header export list" These comments are leftovers of commit `fcc8487d47` ("uapi: export all headers under uapi directories"). Prior to that commit, exported headers must be explicitly added to header-y. Now, all headers under the uapi/ directories are exported. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-01-06 09:46:51 +09:00
Masahiro Yamada	e9666d10a5	jump_label: move 'asm goto' support test to Kconfig Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label". The jump label is controlled by HAVE_JUMP_LABEL, which is defined like this: #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) # define HAVE_JUMP_LABEL #endif We can improve this by testing 'asm goto' support in Kconfig, then make JUMP_LABEL depend on CC_HAS_ASM_GOTO. Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will match to the real kernel capability. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Sedat Dilek <sedat.dilek@gmail.com>	2019-01-06 09:46:51 +09:00
Linus Torvalds	f1c2f8857c	powerpc fixes for 4.21 #2 A fix for the recent access_ok() change, which broke the build. We recently added a use of type in order to squash a warning elsewhere about type being unused. A handful of other minor build fixes, and one defconfig update. Thanks to: Christian Lamparter, Christophe Leroy, Diana Craciun, Mathieu Malaterre. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJcL05pAAoJEFHr6jzI4aWAaQIP/iKoKuPvS94Y6DDny4JQTgB5 5NxOm40Wmylx7DyJQYY55Hdvs6EbitO+5YajW08Buc0EU6yCaSdDlH27ExTFI1c/ 14y59dWXI85NZBo3wTPa0229TDfA8ghw5jwwT8BGEvZe3xSzrKGlGcCuSJGnmEBH YOUpTdY34IHR4o8yiIqxEkg0QyPDimqx3fnK/PYRcbwnynxE/QQ7sRDDVtHXvseC MuOE5kooac78dDEXsOWrLqGoXpvMqRWdn8YZe17x+fO2dxMrWQbYYZT8IZC2DS1u yFxYYU8r4xfQS/EnGrzpn6P/+hNQ1Cj5zhR0mr8m45cDu9hk5py3h5LSp9uhJeqw PaYGGNp+v1jEWE8w2XOW3+bSszjUalTMRLV6xfLLGZm+pJrn+cWycN3e05OJuHcK Pzjqez+h/cJoahNL1Iw79iGxkS8bHitwdEvAGTv88Y4MzMGBlr1rOU9073yoiJZ2 iVNLoacdHicAicUZ5g4+H3TQBZ3OU4mQb3XaOwJFspPmsbN5jLZw/YqegqYX1zgT l1qpxqhnqPJDjl5bjcBxS5jfGv7mLS2Qyk/r3h9+BSqAMK50eGOBpNaqQs5Zcc2e mGEWA+Fd+xIHdGCLlgs3HyhhgoQC/o1VR1usAbncRgaFR4UbiKg+8nMAIpIiyZhK zrJIYMekNcuO+k+d/Gtr =0vml -----END PGP SIGNATURE----- Merge tag 'powerpc-4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A fix for the recent access_ok() change, which broke the build. We recently added a use of type in order to squash a warning elsewhere about type being unused. A handful of other minor build fixes, and one defconfig update. Thanks to: Christian Lamparter, Christophe Leroy, Diana Craciun, Mathieu Malaterre" * tag 'powerpc-4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Drop use of 'type' from access_ok() KVM: PPC: Book3S HV: radix: Fix uninitialized var build error powerpc/configs: Add PPC4xx_OCM to ppc40x_defconfig powerpc/4xx/ocm: Fix phys_addr_t printf warnings powerpc/4xx/ocm: Fix compilation error due to PAGE_KERNEL usage powerpc/fsl: Fixed warning: orphan section `__btb_flush_fixup'	2019-01-05 11:48:44 -08:00
Linus Torvalds	a65981109f	Merge branch 'akpm' (patches from Andrew) Merge more updates from Andrew Morton: - procfs updates - various misc bits - lib/ updates - epoll updates - autofs - fatfs - a few more MM bits * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (58 commits) mm/page_io.c: fix polled swap page in checkpatch: add Co-developed-by to signature tags docs: fix Co-Developed-by docs drivers/base/platform.c: kmemleak ignore a known leak fs: don't open code lru_to_page() fs/: remove caller signal_pending branch predictions mm/: remove caller signal_pending branch predictions arch/arc/mm/fault.c: remove caller signal_pending_branch predictions kernel/sched/: remove caller signal_pending branch predictions kernel/locking/mutex.c: remove caller signal_pending branch predictions mm: select HAVE_MOVE_PMD on x86 for faster mremap mm: speed up mremap by 20x on large regions mm: treewide: remove unused address argument from pte_alloc functions initramfs: cleanup incomplete rootfs scripts/gdb: fix lx-version string output kernel/kcov.c: mark write_comp_data() as notrace kernel/sysctl: add panic_print into sysctl panic: add options to print system info when panic happens bfs: extra sanity checking and static inode bitmap exec: separate MM_ANONPAGES and RLIMIT_STACK accounting ...	2019-01-05 09:16:18 -08:00
Joel Fernandes (Google)	4cf5892495	mm: treewide: remove unused address argument from pte_alloc functions Patch series "Add support for fast mremap". This series speeds up the mremap(2) syscall by copying page tables at the PMD level even for non-THP systems. There is concern that the extra 'address' argument that mremap passes to pte_alloc may do something subtle architecture related in the future that may make the scheme not work. Also we find that there is no point in passing the 'address' to pte_alloc since its unused. This patch therefore removes this argument tree-wide resulting in a nice negative diff as well. Also ensuring along the way that the enabled architectures do not do anything funky with the 'address' argument that goes unnoticed by the optimization. Build and boot tested on x86-64. Build tested on arm64. The config enablement patch for arm64 will be posted in the future after more testing. The changes were obtained by applying the following Coccinelle script. (thanks Julia for answering all Coccinelle questions!). Following fix ups were done manually: * Removal of address argument from pte_fragment_alloc * Removal of pte_alloc_one_fast definitions from m68k and microblaze. // Options: --include-headers --no-includes // Note: I split the 'identifier fn' line, so if you are manually // running it, please unsplit it so it runs for you. virtual patch @pte_alloc_func_def depends on patch exists@ identifier E2; identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; type T2; @@ fn(... - , T2 E2 ) { ... } @pte_alloc_func_proto_noarg depends on patch exists@ type T1, T2, T3, T4; identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; @@ ( - T3 fn(T1, T2); + T3 fn(T1); \| - T3 fn(T1, T2, T4); + T3 fn(T1, T2); ) @pte_alloc_func_proto depends on patch exists@ identifier E1, E2, E4; type T1, T2, T3, T4; identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; @@ ( - T3 fn(T1 E1, T2 E2); + T3 fn(T1 E1); \| - T3 fn(T1 E1, T2 E2, T4 E4); + T3 fn(T1 E1, T2 E2); ) @pte_alloc_func_call depends on patch exists@ expression E2; identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; @@ fn(... -, E2 ) @pte_alloc_macro depends on patch exists@ identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; identifier a, b, c; expression e; position p; @@ ( - #define fn(a, b, c) e + #define fn(a, b) e \| - #define fn(a, b) e + #define fn(a) e ) Link: http://lkml.kernel.org/r/20181108181201.88826-2-joelaf@google.com Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michal Hocko <mhocko@kernel.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-04 13:13:47 -08:00
Linus Torvalds	4caf4ebfe4	Fix access_ok() fallout for sparc32 and powerpc These two architectures actually had an intentional use of the 'type' argument to access_ok() just to avoid warnings. I had actually noticed the powerpc one, but forgot to then fix it up. And I missed the sparc32 case entirely. This is hopefully all of it. Reported-by: Mathieu Malaterre <malat@debian.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: `96d4f267e4` ("Remove 'type' argument from access_ok() function") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-04 09:58:25 -08:00
Mathieu Malaterre	074400a7be	powerpc: Drop use of 'type' from access_ok() In commit `05a4ab8239` ("powerpc/uaccess: fix warning/error with access_ok()") an attempt was made to remove a warning by referencing the variable `type`. However in commit `96d4f267e4` ("Remove 'type' argument from access_ok() function") the variable `type` has been removed, breaking the build: arch/powerpc/include/asm/uaccess.h:66:32: error: ‘type’ undeclared (first use in this function) This essentially reverts commit `05a4ab8239` ("powerpc/uaccess: fix warning/error with access_ok()") to fix the error. Fixes: `96d4f267e4` ("Remove 'type' argument from access_ok() function") Signed-off-by: Mathieu Malaterre <malat@debian.org> [mpe: Reword change log slightly.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-01-04 23:07:59 +11:00
Michael Ellerman	d538d94f0c	Merge branch 'master' into fixes We have a fix to apply on top of commit `96d4f267e4` ("Remove 'type' argument from access_ok() function"), so merge master to get it.	2019-01-04 22:07:47 +11:00
Linus Torvalds	96d4f267e4	Remove 'type' argument from access_ok() function Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-03 18:57:57 -08:00
Linus Torvalds	8e143b90e4	IOMMU Updates for Linux v4.21 Including (in no particular order): - Page table code for AMD IOMMU now supports large pages where smaller page-sizes were mapped before. VFIO had to work around that in the past and I included a patch to remove it (acked by Alex Williamson) - Patches to unmodularize a couple of IOMMU drivers that would never work as modules anyway. - Work to unify the the iommu-related pointers in 'struct device' into one pointer. This work is not finished yet, but will probably be in the next cycle. - NUMA aware allocation in iommu-dma code - Support for r8a774a1 and r8a774c0 in the Renesas IOMMU driver - Scalable mode support for the Intel VT-d driver - PM runtime improvements for the ARM-SMMU driver - Support for the QCOM-SMMUv2 IOMMU hardware from Qualcom - Various smaller fixes and improvements -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJcKkEoAAoJECvwRC2XARrjCCoQAJxsgaAF5Z0s7z8j2A9SkaGp SIMnUAI5mDOdyhTOAI+eehpRzg5UVYt/JjFYnHz8HWqbSc8YOvDvHafmhMFIwYvO hq5knbs6ns2jJNFO+M4dioDq+3THdqkGIF5xoHdGTP7cn9+XyQ8lAoHo0RuL122U PJGqX7Cp4XnFP4HMb3uQYhVeBV7mU+XqAdB+4aDnQkzI5LkQCRr74GcqOm+Rlnyc cmQWc2arUMjgc1TJIrex8dx9dT6lq8kOmhyEg/IjHeGaZyJ3HqA+30XDDLEExN0G MeVawuxJz40HgXlkXr+iZTQtIFYkXdKvJH6rptMbOfbDeDz+YZ01TbtAMMH9o4jX yxjjMjdcWTsWYQ/MHHdsoMP34cajCi/EYPMNksbycw+E3Y+X/bSReCoWC0HUK8/+ Z4TpZ9mZVygtJR+QNZ+pE9oiJpb4sroM10zTnbMoVHNnvfsO01FYk7FMPkolSKLw zB4MDswQYgchoFR9Z4ZB4PycYTzeafLKYgDPDoD1vIJgDavuidwvDWDRTDc+aMWM siIIewq19To9jDJkVjX4dsT/p99KVKgAR/Ps6jjWkAroha7g6GcmlYZHIJnyop04 jiaSXUsk8aRucP/CRz5xdMmaGoN7BsNmpUjcrquc6Povk/6gvXvpY04oCs1+gNMX ipL9E3GTFCVBubRFrksv =DT9A -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: - Page table code for AMD IOMMU now supports large pages where smaller page-sizes were mapped before. VFIO had to work around that in the past and I included a patch to remove it (acked by Alex Williamson) - Patches to unmodularize a couple of IOMMU drivers that would never work as modules anyway. - Work to unify the the iommu-related pointers in 'struct device' into one pointer. This work is not finished yet, but will probably be in the next cycle. - NUMA aware allocation in iommu-dma code - Support for r8a774a1 and r8a774c0 in the Renesas IOMMU driver - Scalable mode support for the Intel VT-d driver - PM runtime improvements for the ARM-SMMU driver - Support for the QCOM-SMMUv2 IOMMU hardware from Qualcom - Various smaller fixes and improvements * tag 'iommu-updates-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (78 commits) iommu: Check for iommu_ops == NULL in iommu_probe_device() ACPI/IORT: Don't call iommu_ops->add_device directly iommu/of: Don't call iommu_ops->add_device directly iommu: Consolitate ->add/remove_device() calls iommu/sysfs: Rename iommu_release_device() dmaengine: sh: rcar-dmac: Use device_iommu_mapped() xhci: Use device_iommu_mapped() powerpc/iommu: Use device_iommu_mapped() ACPI/IORT: Use device_iommu_mapped() iommu/of: Use device_iommu_mapped() driver core: Introduce device_iommu_mapped() function iommu/tegra: Use helper functions to access dev->iommu_fwspec iommu/qcom: Use helper functions to access dev->iommu_fwspec iommu/of: Use helper functions to access dev->iommu_fwspec iommu/mediatek: Use helper functions to access dev->iommu_fwspec iommu/ipmmu-vmsa: Use helper functions to access dev->iommu_fwspec iommu/dma: Use helper functions to access dev->iommu_fwspec iommu/arm-smmu: Use helper functions to access dev->iommu_fwspec ACPI/IORT: Use helper functions to access dev->iommu_fwspec iommu: Introduce wrappers around dev->iommu_fwspec ...	2019-01-01 15:55:29 -08:00
Linus Torvalds	fcf010449e	kgdb patches for 4.20-rc1 Mostly clean ups although whilst Doug's was chasing down a odd lockdep warning he also did some work to improved debugger resilience when some CPUs fail to respond to the round up request. The main changes are: * Fixing a lockdep warning on architectures that cannot use an NMI for the round up plus related changes to make CPU round up and all CPU backtrace more resilient. * Constify the arch ops tables * A couple of other small clean ups Two of the three patchsets here include changes that spill over into arch/. Changes in the arch space are relatively narrow in scope (and directly related to kgdb). Didn't get comprehensive acks but all impacted maintainers were Cc:ed in good time. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> -----BEGIN PGP SIGNATURE----- iQJPBAABCAA5FiEELzVBU1D3lWq6cKzwfOMlXTn3iKEFAlwonoUbHGRhbmllbC50 aG9tcHNvbkBsaW5hcm8ub3JnAAoJEHzjJV0594ihmooP/1uzSMGQIoQMB8XeU/jT Da2iILybi6hGp7ILA27d0yN3tsJBxWGWs8wzNdzMo3NQ3J0o4foAUnS/R0Vjkg9w uphe5EA4HDsIrH05OouNb984BeEgNaC9HSqtyr9fXuh024NboULFKIm7REYm+QHT C5SrBtmonL1xE7FmAhudLWjl7ZlvxM6DJeoVViH4kKq0raTiILt6VJaGl9JfcAdL m9GEf9r/nh0sCq3GNgyc0y4BvHed+Kxzy1fsIi3jE6t8elaYYR72gNRQ5LaFxcnQ F04/UtH75qB4rqYsqqV1q0rFi+tj+p9wYTmxixaGWsVDX4Gb5KXuLWJhaRb5IvwC bdq/0IAXRr4vUL3y0tFWfCj7pHGaVc/gfXi8aieRXLGAZG+tdfuu99NCiulIZTfc QqZz12Z+99/qi6dK7dBQtaN8SyPeB1QXKWefeGo2Bt5QqiBmcKHxsQYMUo3nkf3J UXHpj4LG6Ldsi/w8VZfvXmM0/vbO/jrus9m+X2v+4tJyisjrsyv0FRnREI4avfbC l09P1ajv7RrAaxtab0smV9krqWZ/mSn0zcgcaD6RdKe0+SwsiP/CEx1z1Wb1MH9c wjEiClXjdVB39YVT0YVfG2Ho7qH8WRErxVyNb/f4QKHMXL1Mu91hFWhBBpUOGUj2 7Jrq2zK1uWramtt7GBDpHYYH =Aqlc -----END PGP SIGNATURE----- Merge tag 'kgdb-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux Pull kgdb updates from Daniel Thompson: "Mostly clean ups although while Doug's was chasing down a odd lockdep warning he also did some work to improved debugger resilience when some CPUs fail to respond to the round up request. The main changes are: - Fixing a lockdep warning on architectures that cannot use an NMI for the round up plus related changes to make CPU round up and all CPU backtrace more resilient. - Constify the arch ops tables - A couple of other small clean ups Two of the three patchsets here include changes that spill over into arch/. Changes in the arch space are relatively narrow in scope (and directly related to kgdb). Didn't get comprehensive acks but all impacted maintainers were Cc:ed in good time" * tag 'kgdb-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux: kgdb/treewide: constify struct kgdb_arch arch_kgdb_ops mips/kgdb: prepare arch_kgdb_ops for constness kdb: use bool for binary state indicators kdb: Don't back trace on a cpu that didn't round up kgdb: Don't round up a CPU that failed rounding up before kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function() kgdb: Remove irq flags from roundup	2019-01-01 15:38:14 -08:00
Linus Torvalds	495d714ad1	Tracing changes for v4.21: - Rework of the kprobe/uprobe and synthetic events to consolidate all the dynamic event code. This will make changes in the future easier. - Partial rewrite of the function graph tracing infrastructure. This will allow for multiple users of hooking onto functions to get the callback (return) of the function. This is the ground work for having kprobes and function graph tracer using one code base. - Clean up of the histogram code that will facilitate adding more features to the histograms in the future. - Addition of str_has_prefix() and a few use cases. There currently is a similar function strstart() that is used in a few places, but only returns a bool and not a length. These instances will be removed in the future to use str_has_prefix() instead. - A few other various clean ups as well. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXCawlBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qhbcAQCFeT0fWWTUxofBQz5jqsHaRnVg21+9 X4sTldYRYEn4YgEAmWOyiwq7zvrsAu4ZwkNBMeqxn3tVymYHiGOGe3Y4BAw= =u96o -----END PGP SIGNATURE----- Merge tag 'trace-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Rework of the kprobe/uprobe and synthetic events to consolidate all the dynamic event code. This will make changes in the future easier. - Partial rewrite of the function graph tracing infrastructure. This will allow for multiple users of hooking onto functions to get the callback (return) of the function. This is the ground work for having kprobes and function graph tracer using one code base. - Clean up of the histogram code that will facilitate adding more features to the histograms in the future. - Addition of str_has_prefix() and a few use cases. There currently is a similar function strstart() that is used in a few places, but only returns a bool and not a length. These instances will be removed in the future to use str_has_prefix() instead. - A few other various clean ups as well. * tag 'trace-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits) tracing: Use the return of str_has_prefix() to remove open coded numbers tracing: Have the historgram use the result of str_has_prefix() for len of prefix tracing: Use str_has_prefix() instead of using fixed sizes tracing: Use str_has_prefix() helper for histogram code string.h: Add str_has_prefix() helper function tracing: Make function ‘ftrace_exports’ static tracing: Simplify printf'ing in seq_print_sym tracing: Avoid -Wformat-nonliteral warning tracing: Merge seq_print_sym_short() and seq_print_sym_offset() tracing: Add hist trigger comments for variable-related fields tracing: Remove hist trigger synth_var_refs tracing: Use hist trigger's var_ref array to destroy var_refs tracing: Remove open-coding of hist trigger var_ref management tracing: Use var_refs[] for hist trigger reference checking tracing: Change strlen to sizeof for hist trigger static strings tracing: Remove unnecessary hist trigger struct field tracing: Fix ftrace_graph_get_ret_stack() to use task and not current seq_buf: Use size_t for len in seq_buf_puts() seq_buf: Make seq_buf_puts() null-terminate the buffer arm64: Use ftrace_graph_get_ret_stack() instead of curr_ret_stack ...	2018-12-31 11:46:59 -08:00
Christophe Leroy	cc0282975b	kgdb/treewide: constify struct kgdb_arch arch_kgdb_ops checkpatch.pl reports the following: WARNING: struct kgdb_arch should normally be const #28: FILE: arch/mips/kernel/kgdb.c:397: +struct kgdb_arch arch_kgdb_ops = { This report makes sense, as all other ops struct, this one should also be const. This patch does the change. Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Acked-by: Paul Burton <paul.burton@mips.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>	2018-12-30 08:33:06 +00:00
Douglas Anderson	3cd99ac355	kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function() When I had lockdep turned on and dropped into kgdb I got a nice splat on my system. Specifically it hit: DEBUG_LOCKS_WARN_ON(current->hardirq_context) Specifically it looked like this: sysrq: SysRq : DEBUG ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(current->hardirq_context) WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 lockdep_hardirqs_on+0xf0/0x160 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27 pstate: 604003c9 (nZCv DAIF +PAN -UAO) pc : lockdep_hardirqs_on+0xf0/0x160 ... Call trace: lockdep_hardirqs_on+0xf0/0x160 trace_hardirqs_on+0x188/0x1ac kgdb_roundup_cpus+0x14/0x3c kgdb_cpu_enter+0x53c/0x5cc kgdb_handle_exception+0x180/0x1d4 kgdb_compiled_brk_fn+0x30/0x3c brk_handler+0x134/0x178 do_debug_exception+0xfc/0x178 el1_dbg+0x18/0x78 kgdb_breakpoint+0x34/0x58 sysrq_handle_dbg+0x54/0x5c __handle_sysrq+0x114/0x21c handle_sysrq+0x30/0x3c qcom_geni_serial_isr+0x2dc/0x30c ... ... irq event stamp: ...45 hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4 hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130 softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34 softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100 ---[ end trace adf21f830c46e638 ]--- Looking closely at it, it seems like a really bad idea to be calling local_irq_enable() in kgdb_roundup_cpus(). If nothing else that seems like it could violate spinlock semantics and cause a deadlock. Instead, let's use a private csd alongside smp_call_function_single_async() to round up the other CPUs. Using smp_call_function_single_async() doesn't require interrupts to be enabled so we can remove the offending bit of code. In order to avoid duplicating this across all the architectures that use the default kgdb_roundup_cpus(), we'll add a "weak" implementation to debug_core.c. Looking at all the people who previously had copies of this code, there were a few variants. I've attempted to keep the variants working like they used to. Specifically: * For arch/arc we passed NULL to kgdb_nmicallback() instead of get_irq_regs(). * For arch/mips there was a bit of extra code around kgdb_nmicallback() NOTE: In this patch we will still get into trouble if we try to round up a CPU that failed to round up before. We'll try to round it up again and potentially hang when we try to grab the csd lock. That's not new behavior but we'll still try to do better in a future patch. Suggested-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>	2018-12-30 08:28:02 +00:00
Douglas Anderson	9ef7fa507d	kgdb: Remove irq flags from roundup The function kgdb_roundup_cpus() was passed a parameter that was documented as: > the flags that will be used when restoring the interrupts. There is > local_irq_save() call before kgdb_roundup_cpus(). Nobody used those flags. Anyone who wanted to temporarily turn on interrupts just did local_irq_enable() and local_irq_disable() without looking at them. So we can definitely remove the flags. Signed-off-by: Douglas Anderson <dianders@chromium.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>	2018-12-30 08:24:21 +00:00
Michael Ellerman	f460772291	KVM: PPC: Book3S HV: radix: Fix uninitialized var build error Old GCCs (4.6.3 at least), aren't able to follow the logic in __kvmhv_copy_tofrom_guest_radix() and warn that old_pid is used uninitialized: arch/powerpc/kvm/book3s_64_mmu_radix.c:75:3: error: 'old_pid' may be used uninitialized in this function The logic is OK, we only use old_pid if quadrant == 1, and in that case it has definitely be initialised, eg: if (quadrant == 1) { old_pid = mfspr(SPRN_PID); ... if (quadrant == 1 && pid != old_pid) mtspr(SPRN_PID, old_pid); Annotate it to fix the error. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-30 14:05:13 +11:00
Michael Ellerman	42aee37298	powerpc/configs: Add PPC4xx_OCM to ppc40x_defconfig There was recently a compilation break to this driver, but we didn't notice because none of our defconfigs have it enabled. Fix that. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-30 14:00:47 +11:00
Michael Ellerman	52b88fa1e8	powerpc/4xx/ocm: Fix phys_addr_t printf warnings Currently the code produces several warnings, eg: arch/powerpc/platforms/4xx/ocm.c:240:38: error: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'phys_addr_t {aka unsigned int}' seq_printf(m, "PhysAddr : 0x%llx\n", ocm->phys); ~~~^ ~~~~~~~~~ Fix it by using the special %pa[p] format for printing phys_addr_t. Note we need to pass the value by reference for the special specifier to work. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-30 14:00:47 +11:00
Christian Lamparter	d0757237d7	powerpc/4xx/ocm: Fix compilation error due to PAGE_KERNEL usage This patch fixes a recent compilation regression in ocm: ocm.c: In function ‘ocm_init_node’: ocm.c:182:18: error: invalid operands to binary \| (have ‘int’ and ‘pgprot_t’ {aka ‘struct <anonymous>’}) _PAGE_EXEC \| PAGE_KERNEL_NCG); ^ ocm.c:197:17: error: invalid operands to binary \| (have ‘int’ and ‘pgprot_t’ {aka ‘struct <anonymous>’}) _PAGE_EXEC \| PAGE_KERNEL); ^ Fixes: `56f3c1413f` ("powerpc/mm: properly set PAGE_KERNEL flags in ioremap()") Cc: stable@vger.kernel.org # v4.20 Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-30 14:00:47 +11:00
Diana Craciun	039daac552	powerpc/fsl: Fixed warning: orphan section `__btb_flush_fixup' Fixed the following build warning: powerpc-linux-gnu-ld: warning: orphan section `__btb_flush_fixup' from `arch/powerpc/kernel/head_44x.o' being placed in section `__btb_flush_fixup'. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-30 14:00:47 +11:00
Linus Torvalds	195303136f	Kconfig file consolidation for v4.21 Consolidation of bus (PCI, PCMCIA, EISA, RapidIO) config entries by Christoph Hellwig. Currently, every architecture that wants to provide common peripheral busses needs to add some boilerplate code and include the right Kconfig files. This series instead just selects the presence (when needed) and then handles everything in the bus-specific Kconfig file under drivers/. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJcJilwAAoJED2LAQed4NsGt1YP/RMTEUqbCSwS/CnTLrE+aVTC O2aWwB80ZlVwpeBbHLW5/M88OvOev0UaCr+gyzgpFRl5ITzS7Jevb8VbpGzblbH7 bFxIEyZFGQiy9oEWw3Lfu9JRSsLm3jNo7hkmdBSn2Rw3KkEd/YF7K3q9GuA7BpCS ZxAirebvEpr4KYEzkuc57NqCYx2Tc8G+JWr5D7pZCFaq9vxYt3TddGqw/c7iQVSQ 1Og1809IdhGyCSlA/ExfaqaBMaJHMRAOHX5GgkqZw1EbFcizUFhAAsKCrGL5nBtX NiWF9jhgHR1M+L69jfctOstrmGQD2KicNgWQf1aS5RQkPfjuqIKGT/i9g6J1pVyX TaW1J36Hcl8PpsKoPBnnrixd1T41O3/PuqtEJRm7LCBYOQiwS9sEmLO09RDRjER8 SPAAyvkhE8oq+0RHiTYN4tm8dyJc1djZ5wzgLnwFPAnU6SR+mbN02RzBMsYZXD+x RNbBSGBRJFQDBw6Rn+ktcIQvcKYmUqe1k1YNHMy6kG3QqvhBaDy+8PA/YjIKPQYQ B/NNUAMEJMys1OQrRL2UDXb2ysaCpzwMmlrBW2IwYsQrX5OwbPkNuQ5Mbe1Lr+mc 4NXR+HubvojsHaAby+OhFbrUX2Jcz3wqYj7aannb9sMRmw0VJXV5dPYUqje3ZhPS P2AovKT8O9nWsEttqER5 =WxId -----END PGP SIGNATURE----- Merge tag 'kconfig-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig file consolidation from Masahiro Yamada: "Consolidation of bus (PCI, PCMCIA, EISA, RapidIO) config entries by Christoph Hellwig. Currently, every architecture that wants to provide common peripheral busses needs to add some boilerplate code and include the right Kconfig files. This series instead just selects the presence (when needed) and then handles everything in the bus-specific Kconfig file under drivers/" * tag 'kconfig-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: pcmcia: remove per-arch PCMCIA config entry eisa: consolidate EISA Kconfig entry in drivers/eisa rapidio: consolidate RAPIDIO config entry in drivers/rapidio pcmcia: allow PCMCIA support independent of the architecture PCI: consolidate the PCI_SYSCALL symbol PCI: consolidate the PCI_DOMAINS and PCI_DOMAINS_GENERIC config options PCI: consolidate PCI config entry in drivers/pci MIPS: remove the HT_PCI config option	2018-12-29 13:40:29 -08:00
Linus Torvalds	769e47094d	Kconfig updates for v4.21 - support -y option for merge_config.sh to avoid downgrading =y to =m - remove S_OTHER symbol type, and touch include/config/.h files correctly - fix file name and line number in lexer warnings - fix memory leak when EOF is encountered in quotation - resolve all shift/reduce conflicts of the parser - warn no new line at end of file - make 'source' statement more strict to take only string literal - rewrite the lexer and remove the keyword lookup table - convert to SPDX License Identifier - compile C files independently instead of including them from zconf.y - fix various warnings of gconfig - misc cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJcJieuAAoJED2LAQed4NsGHlIP/1s0fQ86XD9dIMyHzAO0gh2f 7rylfe2kEXJgIzJ0DyZdLu4iZtwbkEUqTQrRS1abriNGVemPkfBAnZdM5d92lOQX 3iREa700AJ2xo7V7gYZ6AbhZoG3p0S9U9Q2qE5S+tFTe8c2Gy4xtjnODF+Vel85r S0P8tF5sE1/d00lm+yfMI/CJVfDjyNaMm+aVEnL0kZTPiRkaktjWgo6Fc2p4z1L5 HFmMMP6/iaXmRZ+tHJGPQ2AT70GFVZw5ePxPcl50EotUP25KHbuUdzs8wDpYm3U/ rcESVsIFpgqHWmTsdBk6dZk0q8yFZNkMlkaP/aYukVZpUn/N6oAXgTFckYl8dmQL fQBkQi6DTfr9EBPVbj18BKm7xI3Y4DdQ2fzTfYkJ2XwNRGFA5r9N3sjd7ZTVGjxC aeeMHCwvGdSx1x8PeZAhZfsUHW8xVDMSQiT713+ljBY+6cwzA+2NF0kP7B6OAqwr ETFzd4Xu2/lZcL7gQRH8WU3L2S5iedmDG6RnZgJMXI0/9V4qAA+nlsWaCgnl1TgA mpxYlLUMrd6AUJevE34FlnyFdk8IMn9iKRFsvF0f3doO5C7QzTVGqFdJu5a0CuWO 4NBJvZjFT8/4amoWLfnDlfApWXzTfwLbKG+r6V2F30fLuXpYg5LxWhBoGRPYLZSq oi4xN1Mpx3TvXz6WcKVZ =r3Fl -----END PGP SIGNATURE----- Merge tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig updates from Masahiro Yamada: - support -y option for merge_config.sh to avoid downgrading =y to =m - remove S_OTHER symbol type, and touch include/config/.h files correctly - fix file name and line number in lexer warnings - fix memory leak when EOF is encountered in quotation - resolve all shift/reduce conflicts of the parser - warn no new line at end of file - make 'source' statement more strict to take only string literal - rewrite the lexer and remove the keyword lookup table - convert to SPDX License Identifier - compile C files independently instead of including them from zconf.y - fix various warnings of gconfig - misc cleanups * tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits) kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings kconfig: add static qualifiers to fix gconf warnings kconfig: split the lexer out of zconf.y kconfig: split some C files out of zconf.y kconfig: convert to SPDX License Identifier kconfig: remove keyword lookup table entirely kconfig: update current_pos in the second lexer kconfig: switch to ASSIGN_VAL state in the second lexer kconfig: stop associating kconf_id with yylval kconfig: refactor end token rules kconfig: stop supporting '.' and '/' in unquoted words treewide: surround Kconfig file paths with double quotes microblaze: surround string default in Kconfig with double quotes kconfig: use T_WORD instead of T_VARIABLE for variables kconfig: use specific tokens instead of T_ASSIGN for assignments kconfig: refactor scanning and parsing "option" properties kconfig: use distinct tokens for type and default properties kconfig: remove redundant token defines kconfig: rename depends_list to comment_option_list ...	2018-12-29 13:03:29 -08:00
Linus Torvalds	668c35f69c	Kbuild updates for v4.21 Kbuild core: - remove unneeded $(call cc-option,...) switches - consolidate Clang compiler flags into CLANG_FLAGS - announce the deprecation of SUBDIRS - fix single target build for external module - simplify the dependencies of 'prepare' stage targets - allow fixdep to directly write to ..cmd files - simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS - change if_changed_rule to accept multi-line recipe - move .SECONDARY special target to scripts/Kbuild.include - remove redundant 'set -e' - improve parallel execution for CONFIG_HEADERS_CHECK - misc cleanups Treewide fixes and cleanups - set Clang flags correctly for PowerPC boot images - fix UML build error with CONFIG_GCC_PLUGINS - remove unneeded patterns from .gitignore files - refactor firmware/Makefile - remove unneeded rules for offsets.s - avoid unneeded regeneration of intermediate .s files - clean up ./Kbuild Modpost: - remove unused -M, -K options - fix false positive warnings about section mismatch - use simple devtable lookup instead of linker magic - misc cleanups Coccinelle: - relax boolinit.cocci checks for overall consistency - fix warning messages of boolinit.cocci Other tools: - improve -dirty check of scripts/setlocalversion - add a tool to generate compile_commands.json from ..cmd files -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJcJiVeAAoJED2LAQed4NsGXEMP/1fdkBfZGxYlxp4w1tEBeRKZ GInxxMfHiHZlu7nFPQ3k1mncczZjeLbnKcLaybnPmqBSfQe1F7tN0ux1sHtWkVfI DvqoKaHObZ3lEMEVxCHXQ2bKrN/j9nB2xSxzr4dvc9DscW8dwKElZuKVE7nHKdhl z70xsalxHGEGY6hptJrucbv8KTBOSleZ8Gaat79sEDkDSLCTjxXB3WcVMWqDT0M/ IqN5QCwiPjZC3UCwuqq6+vnG1gyvDUORcbrVgHrBIKxLYAABYzugB5IlLpi8B31C ZUDmijSTandHd4SG5gw5uZoyYuK4YhZeI7g4yNyXSqnPurmcJxrGdpiuwRcE7xet 5yB2uaNbAFO6Fbz4gUdDEvryA9IZEPPn1Z0Btfpp7UOxiWEqE61oHReCNdkiad94 Oonl+ROZw5UOT3AZD3xCZTf9bTnQoCFccHTmnbaKqqSiDWxxLUXum7sNfnJ6GXEb fNQDuwuxtsPStaWoU93QNrGyl5e8jYTKFdphUCnZZ2/hE8ygoQ06PYmeBAk+UKfN UI2IqR8PHgmA97XJg+fHhbw6X4nQcPsg/usLqxGItAZNO2O4uXgaN24dOcHG74Bu 2Dk/gQQB2sVB/Dxfmlaxvvj98MVg7IMtpusAT2bQ7miiSS3EqPFT8KQMZYLICWuZ u7QQe20yCho3ZULmsRwH =0PCt -----END PGP SIGNATURE----- Merge tag 'kbuild-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: "Kbuild core: - remove unneeded $(call cc-option,...) switches - consolidate Clang compiler flags into CLANG_FLAGS - announce the deprecation of SUBDIRS - fix single target build for external module - simplify the dependencies of 'prepare' stage targets - allow fixdep to directly write to ..cmd files - simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS - change if_changed_rule to accept multi-line recipe - move .SECONDARY special target to scripts/Kbuild.include - remove redundant 'set -e' - improve parallel execution for CONFIG_HEADERS_CHECK - misc cleanups Treewide fixes and cleanups - set Clang flags correctly for PowerPC boot images - fix UML build error with CONFIG_GCC_PLUGINS - remove unneeded patterns from .gitignore files - refactor firmware/Makefile - remove unneeded rules for offsets.s - avoid unneeded regeneration of intermediate .s files - clean up ./Kbuild Modpost: - remove unused -M, -K options - fix false positive warnings about section mismatch - use simple devtable lookup instead of linker magic - misc cleanups Coccinelle: - relax boolinit.cocci checks for overall consistency - fix warning messages of boolinit.cocci Other tools: - improve -dirty check of scripts/setlocalversion - add a tool to generate compile_commands.json from ..cmd files" * tag 'kbuild-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (51 commits) kbuild: remove unused cmd_gentimeconst kbuild: remove $(obj)/ prefixes in ./Kbuild treewide: add intermediate .s files to targets treewide: remove explicit rules for offsets.s firmware: refactor firmware/Makefile firmware: remove unnecessary patterns from .gitignore scripts: remove unnecessary ihex2fw and check-lc_ctypes from .gitignore um: remove unused filechk_gen_header in Makefile scripts: add a tool to produce a compile_commands.json file kbuild: add -Werror=implicit-int flag unconditionally kbuild: add -Werror=strict-prototypes flag unconditionally kbuild: add -fno-PIE flag unconditionally scripts: coccinelle: Correct warning message scripts: coccinelle: only suggest true/false in files that already use them kbuild: handle part-of-module correctly for .ll and *.symtypes kbuild: refactor part-of-module kbuild: refactor quiet_modtag kbuild: remove redundant quiet_modtag for $(obj-m) kbuild: refactor Makefile.asm-generic user/Makefile: Fix typo and capitalization in comment section ...	2018-12-29 12:03:17 -08:00
Linus Torvalds	030672aea8	Merge tag 'devicetree-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull Devicetree updates from Rob Herring: "The biggest highlight here is the start of using json-schema for DT bindings. Being able to validate bindings has been discussed for years with little progress. - Initial support for DT bindings using json-schema language. This is the start of converting DT bindings from free-form text to a structured format. - Reworking of initrd address initialization. This moves to using the phys address instead of virt addr in the DT parsing code. This rework was motivated by CONFIG_DEV_BLK_INITRD causing unnecessary rebuilding of lots of files. - Fix stale phandle entries in phandle cache - DT overlay validation improvements. This exposed several memory leak bugs which have been fixed. - Use node name and device_type helper functions in DT code - Last remaining conversions to using %pOFn printk specifier instead of device_node.name directly - Create new common RTC binding doc and move all trivial RTC devices out of trivial-devices.txt. - New bindings for Freescale MAG3110 magnetometer, Cadence Sierra PHY, and Xen shared memory - Update dtc to upstream version v1.4.7-57-gf267e674d145" * tag 'devicetree-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (68 commits) of: __of_detach_node() - remove node from phandle cache of: of_node_get()/of_node_put() nodes held in phandle cache gpio-omap.txt: add reg and interrupts properties dt-bindings: mrvl,intc: fix a trivial typo dt-bindings: iio: magnetometer: add dt-bindings for freescale mag3110 dt-bindings: Convert trivial-devices.txt to json-schema dt-bindings: arm: mrvl: amend Browstone compatible string dt-bindings: arm: Convert Tegra board/soc bindings to json-schema dt-bindings: arm: Convert ZTE board/soc bindings to json-schema dt-bindings: arm: Add missing Xilinx boards dt-bindings: arm: Convert Xilinx board/soc bindings to json-schema dt-bindings: arm: Convert VIA board/soc bindings to json-schema dt-bindings: arm: Convert ST STi board/soc bindings to json-schema dt-bindings: arm: Convert SPEAr board/soc bindings to json-schema dt-bindings: arm: Convert CSR SiRF board/soc bindings to json-schema dt-bindings: arm: Convert QCom board/soc bindings to json-schema dt-bindings: arm: Convert TI nspire board/soc bindings to json-schema dt-bindings: arm: Convert TI davinci board/soc bindings to json-schema dt-bindings: arm: Convert Calxeda board/soc bindings to json-schema dt-bindings: arm: Convert Altera board/soc bindings to json-schema ...	2018-12-28 20:08:34 -08:00
Linus Torvalds	f346b0becb	Merge branch 'akpm' (patches from Andrew) Merge misc updates from Andrew Morton: - large KASAN update to use arm's "software tag-based mode" - a few misc things - sh updates - ocfs2 updates - just about all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (167 commits) kernel/fork.c: mark 'stack_vm_area' with __maybe_unused memcg, oom: notify on oom killer invocation from the charge path mm, swap: fix swapoff with KSM pages include/linux/gfp.h: fix typo mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization memory_hotplug: add missing newlines to debugging output mm: remove __hugepage_set_anon_rmap() include/linux/vmstat.h: remove unused page state adjustment macro mm/page_alloc.c: allow error injection mm: migrate: drop unused argument of migrate_page_move_mapping() blkdev: avoid migration stalls for blkdev pages mm: migrate: provide buffer_migrate_page_norefs() mm: migrate: move migrate_page_lock_buffers() mm: migrate: lock buffers before migrate_page_move_mapping() mm: migration: factor out code to compute expected number of page references mm, page_alloc: enable pcpu_drain with zone capability kmemleak: add config to select auto scan mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init ...	2018-12-28 16:55:46 -08:00
Linus Torvalds	af7ddd8a62	DMA mapping updates for Linux 4.21 A huge update this time, but a lot of that is just consolidating or removing code: - provide a common DMA_MAPPING_ERROR definition and avoid indirect calls for dma_map_* error checking - use direct calls for the DMA direct mapping case, avoiding huge retpoline overhead for high performance workloads - merge the swiotlb dma_map_ops into dma-direct - provide a generic remapping DMA consistent allocator for architectures that have devices that perform DMA that is not cache coherent. Based on the existing arm64 implementation and also used for csky now. - improve the dma-debug infrastructure, including dynamic allocation of entries (Robin Murphy) - default to providing chaining scatterlist everywhere, with opt-outs for the few architectures (alpha, parisc, most arm32 variants) that can't cope with it - misc sparc32 dma-related cleanups - remove the dma_mark_clean arch hook used by swiotlb on ia64 and replace it with the generic noncoherent infrastructure - fix the return type of dma_set_max_seg_size (Niklas Söderlund) - move the dummy dma ops for not DMA capable devices from arm64 to common code (Robin Murphy) - ensure dma_alloc_coherent returns zeroed memory to avoid kernel data leaks through userspace. We already did this for most common architectures, but this ensures we do it everywhere. dma_zalloc_coherent has been deprecated and can hopefully be removed after -rc1 with a coccinelle script. -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlwctQgLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYMxgQ//dBpAfS4/J76CdAbYry2zqgcOUU9hIrD6NHiEMWov ltJxyvEl3LsUmIdEj3aCrYL9jZN0qsnCzn5BVj2c3jDIVgD64fAr7HDf/PbEEfKb j6/GgEnVLPZV+sQMvhNA5jOzHrkseaqPa4/pNLFZ/l8jnuZ2d+btusDWJpMoVDer TXVwtIfgeIu0gTygYOShLYXd5qptWKWsZEpbTZOO2sE6+x+ZJX7yQYUxYDTlcOIj JWVO2l5QNHPc5T9o2at+6L5aNUvnZOxT79sWgyZLn0Kc+FagKAVwfLqUEl0v7foG 8k/xca5/8p3afB1DfrIrtplJqis7cVgdyGxriwuuoO8X4F0nPyWwpGmxsBhrWwwl xTqC4UorEJ7QwoP6Azopk/vYI2QXIUBLjuCJCuFXZj9+2BGf4IfvBY1S2cLM9qLs HMcxQonuXJii044KEFS96ePEuiT+igVINweIFBKWcgNCEG0UQtyL6RQ1U5297ipF JiWZAqD+p9X52UdKS+oKfAiZEekMXn6Xyo97+YCiNpfOo0GP5eEcwhL+JpY4AiRq apPXtsRy2o1s8yfjdraUIM2Mc2n62vFKb35oUbGCd/QO9piPrFQHl6T0HHcHk4YR XrUXcHieFZBCYqh7ZVa4RL8Msq1wvGuTL4Dxl43mXdsMoUFRR6eSNWLoAV4IpOLZ WgA= =in72 -----END PGP SIGNATURE----- Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping Pull DMA mapping updates from Christoph Hellwig: "A huge update this time, but a lot of that is just consolidating or removing code: - provide a common DMA_MAPPING_ERROR definition and avoid indirect calls for dma_map_* error checking - use direct calls for the DMA direct mapping case, avoiding huge retpoline overhead for high performance workloads - merge the swiotlb dma_map_ops into dma-direct - provide a generic remapping DMA consistent allocator for architectures that have devices that perform DMA that is not cache coherent. Based on the existing arm64 implementation and also used for csky now. - improve the dma-debug infrastructure, including dynamic allocation of entries (Robin Murphy) - default to providing chaining scatterlist everywhere, with opt-outs for the few architectures (alpha, parisc, most arm32 variants) that can't cope with it - misc sparc32 dma-related cleanups - remove the dma_mark_clean arch hook used by swiotlb on ia64 and replace it with the generic noncoherent infrastructure - fix the return type of dma_set_max_seg_size (Niklas Söderlund) - move the dummy dma ops for not DMA capable devices from arm64 to common code (Robin Murphy) - ensure dma_alloc_coherent returns zeroed memory to avoid kernel data leaks through userspace. We already did this for most common architectures, but this ensures we do it everywhere. dma_zalloc_coherent has been deprecated and can hopefully be removed after -rc1 with a coccinelle script" * tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping: (73 commits) dma-mapping: fix inverted logic in dma_supported dma-mapping: deprecate dma_zalloc_coherent dma-mapping: zero memory returned from dma_alloc_* sparc/iommu: fix ->map_sg return value sparc/io-unit: fix ->map_sg return value arm64: default to the direct mapping in get_arch_dma_ops PCI: Remove unused attr variable in pci_dma_configure ia64: only select ARCH_HAS_DMA_COHERENT_TO_PFN if swiotlb is enabled dma-mapping: bypass indirect calls for dma-direct vmd: use the proper dma_* APIs instead of direct methods calls dma-direct: merge swiotlb_dma_ops into the dma_direct code dma-direct: use dma_direct_map_page to implement dma_direct_map_sg dma-direct: improve addressability error reporting swiotlb: remove dma_mark_clean swiotlb: remove SWIOTLB_MAP_ERROR ACPI / scan: Refactor _CCA enforcement dma-mapping: factor out dummy DMA ops dma-mapping: always build the direct mapping code dma-mapping: move dma_cache_sync out of line dma-mapping: move various slow path functions out of line ...	2018-12-28 14:12:21 -08:00
Oscar Salvador	2c2a5af6fe	mm, memory_hotplug: add nid parameter to arch_remove_memory Patch series "Do not touch pages in hot-remove path", v2. This patchset aims for two things: 1) A better definition about offline and hot-remove stage 2) Solving bugs where we can access non-initialized pages during hot-remove operations [2] [3]. This is achieved by moving all page/zone handling to the offline stage, so we do not need to access pages when hot-removing memory. [1] https://patchwork.kernel.org/cover/10691415/ [2] https://patchwork.kernel.org/patch/10547445/ [3] https://www.spinics.net/lists/linux-mm/msg161316.html This patch (of 5): This is a preparation for the following-up patches. The idea of passing the nid is that it will allow us to get rid of the zone parameter afterwards. Link: http://lkml.kernel.org/r/20181127162005.15833-2-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-28 12:11:49 -08:00
Arun KS	ca79b0c211	mm: convert totalram_pages and totalhigh_pages variables to atomic totalram_pages and totalhigh_pages are made static inline function. Main motivation was that managed_page_count_lock handling was complicating things. It was discussed in length here, https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes better to remove the lock and convert variables to atomic, with preventing poteintial store-to-read tearing as a bonus. [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org Signed-off-by: Arun KS <arunks@codeaurora.org> Suggested-by: Michal Hocko <mhocko@suse.com> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-28 12:11:47 -08:00
Linus Torvalds	e0c38a4d1f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) New ipset extensions for matching on destination MAC addresses, from Stefano Brivio. 2) Add ipv4 ttl and tos, plus ipv6 flow label and hop limit offloads to nfp driver. From Stefano Brivio. 3) Implement GRO for plain UDP sockets, from Paolo Abeni. 4) Lots of work from Michał Mirosław to eliminate the VLAN_TAG_PRESENT bit so that we could support the entire vlan_tci value. 5) Rework the IPSEC policy lookups to better optimize more usecases, from Florian Westphal. 6) Infrastructure changes eliminating direct manipulation of SKB lists wherever possible, and to always use the appropriate SKB list helpers. This work is still ongoing... 7) Lots of PHY driver and state machine improvements and simplifications, from Heiner Kallweit. 8) Various TSO deferral refinements, from Eric Dumazet. 9) Add ntuple filter support to aquantia driver, from Dmitry Bogdanov. 10) Batch dropping of XDP packets in tuntap, from Jason Wang. 11) Lots of cleanups and improvements to the r8169 driver from Heiner Kallweit, including support for ->xmit_more. This driver has been getting some much needed love since he started working on it. 12) Lots of new forwarding selftests from Petr Machata. 13) Enable VXLAN learning in mlxsw driver, from Ido Schimmel. 14) Packed ring support for virtio, from Tiwei Bie. 15) Add new Aquantia AQtion USB driver, from Dmitry Bezrukov. 16) Add XDP support to dpaa2-eth driver, from Ioana Ciocoi Radulescu. 17) Implement coalescing on TCP backlog queue, from Eric Dumazet. 18) Implement carrier change in tun driver, from Nicolas Dichtel. 19) Support msg_zerocopy in UDP, from Willem de Bruijn. 20) Significantly improve garbage collection of neighbor objects when the table has many PERMANENT entries, from David Ahern. 21) Remove egdev usage from nfp and mlx5, and remove the facility completely from the tree as it no longer has any users. From Oz Shlomo and others. 22) Add a NETDEV_PRE_CHANGEADDR so that drivers can veto the change and therefore abort the operation before the commit phase (which is the NETDEV_CHANGEADDR event). From Petr Machata. 23) Add indirect call wrappers to avoid retpoline overhead, and use them in the GRO code paths. From Paolo Abeni. 24) Add support for netlink FDB get operations, from Roopa Prabhu. 25) Support bloom filter in mlxsw driver, from Nir Dotan. 26) Add SKB extension infrastructure. This consolidates the handling of the auxiliary SKB data used by IPSEC and bridge netfilter, and is designed to support the needs to MPTCP which could be integrated in the future. 27) Lots of XDP TX optimizations in mlx5 from Tariq Toukan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1845 commits) net: dccp: fix kernel crash on module load drivers/net: appletalk/cops: remove redundant if statement and mask bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw net/net_namespace: Check the return value of register_pernet_subsys() net/netlink_compat: Fix a missing check of nla_parse_nested ieee802154: lowpan_header_create check must check daddr net/mlx4_core: drop useless LIST_HEAD mlxsw: spectrum: drop useless LIST_HEAD net/mlx5e: drop useless LIST_HEAD iptunnel: Set tun_flags in the iptunnel_metadata_reply from src net/mlx5e: fix semicolon.cocci warnings staging: octeon: fix build failure with XFRM enabled net: Revert recent Spectre-v1 patches. can: af_can: Fix Spectre v1 vulnerability packet: validate address length if non-zero nfc: af_nfc: Fix Spectre v1 vulnerability phonet: af_phonet: Fix Spectre v1 vulnerability net: core: Fix Spectre v1 vulnerability net: minor cleanup in skb_ext_add() net: drop the unused helper skb_ext_get() ...	2018-12-27 13:04:52 -08:00
Linus Torvalds	c06e9ef691	pstore improvements and refactorings - Improve compression handling - Refactor argument handling during initialization - Avoid needless locking for saner EFI backend handling - Add more kern-doc and improve debugging output -----BEGIN PGP SIGNATURE----- Comment: Kees Cook <kees@outflux.net> iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlwYNbMWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJprSD/476vQkmv+3Q7sBexzi8lD2Qz4A PVmeKGtHc2SUEIeBSdTTCGd2LvcgMYvoPyBVOgKc7k4S2Tuyef5zafpLTuR/rQHb FClZ9uKdHFN+Yfr/NhAlwGecHjnYSlAE/dxpeGywUox3kqohVx2VOsb85sSpOV1C l41PaLJwdEJ1ShNqI9ohKFUOjjhSFfUvk8D+LDnO6CzroFNzt/wCE4I7sP0WQ6v4 9/4HzVdtLb9x5J21uLiYL6GavxWKF0qzLQtHlNF68a4Im4dtNkGJsSFqiblJ89N2 0AfZdKAUo9sDGkV2XJNg3pC3EjnPiLAY7vOJvMriT2tPiDCB+GZ0fu4rreN3IjhY CqXclUX/W72wQfQdwuQwCjFc+Clc8h6HC3HCYwWNoutpwX2s2pRT0plpl3frNELT Z1WcpXk5053ZlmAkNuSH3a8CeuDGGjZlACHXF/OH5Asx6RHKruGw1LckGUHCJ5EQ 8+2gJOmQk0jhStp9jPUbwVfFGdyMIS5Ns/hwcu0WIvjaeCfPb4jJKXk+aRc1U8qA I0eCJAyrU90QXf/yEUTWi0tTkGzB3xwRxX490MS2pgtlWHgHndpk6QIebZh9XWJV cxzGE7qyS5k/jji+9KaksYNXT5CoZnO/7EGAIWiNZ8hDhZVrcGdsfL2icDwQ7URO fhZGeQqPVZYN/fdZ7A== =woxs -----END PGP SIGNATURE----- Merge tag 'pstore-v4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore updates from Kees Cook: "Improvements and refactorings: - Improve compression handling - Refactor argument handling during initialization - Avoid needless locking for saner EFI backend handling - Add more kern-doc and improve debugging output" * tag 'pstore-v4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/ram: Avoid NULL deref in ftrace merging failure path pstore: Convert buf_lock to semaphore pstore: Fix bool initialization/comparison pstore/ram: Do not treat empty buffers as valid pstore/ram: Simplify ramoops_get_next_prz() arguments pstore: Map PSTORE_TYPE_* to strings pstore: Replace open-coded << with BIT() pstore: Improve and update some comments and status output pstore/ram: Add kern-doc for struct persistent_ram_zone pstore/ram: Report backend assignments with finer granularity pstore/ram: Standardize module name in ramoops pstore: Avoid duplicate call of persistent_ram_zap() pstore: Remove needless lock during console writes pstore: Do not use crash buffer for decompression	2018-12-27 11:15:21 -08:00
Linus Torvalds	8d6973327e	powerpc updates for 4.21 Notable changes: - Mitigations for Spectre v2 on some Freescale (NXP) CPUs. - A large series adding support for pass-through of Nvidia V100 GPUs to guests on Power9. - Another large series to enable hardware assistance for TLB table walk on MPC8xx CPUs. - Some preparatory changes to our DMA code, to make way for further cleanups from Christoph. - Several fixes for our Transactional Memory handling discovered by fuzzing the signal return path. - Support for generating our system call table(s) from a text file like other architectures. - A fix to our page fault handler so that instead of generating a WARN_ON_ONCE, user accesses of kernel addresses instead print a ratelimited and appropriately scary warning. - A cosmetic change to make our unhandled page fault messages more similar to other arches and also more compact and informative. - Freescale updates from Scott: "Highlights include elimination of legacy clock bindings use from dts files, an 83xx watchdog handler, fixes to old dts interrupt errors, and some minor cleanup." And many clean-ups, reworks and minor fixes etc. Thanks to: Alexandre Belloni, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Benjamin Herrenschmidt, Breno Leitao, Christian Lamparter, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Darren Stevens, David Gibson, Diana Craciun, Dmitry V. Levin, Firoz Khan, Geert Uytterhoeven, Greg Kurz, Gustavo Romero, Hari Bathini, Joel Stanley, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Michal Suchánek, Naveen N. Rao, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Ram Pai, Ravi Bangoria, Rob Herring, Russell Currey, Sabyasachi Gupta, Sam Bobroff, Satheesh Rajendran, Scott Wood, Segher Boessenkool, Stephen Rothwell, Tang Yuantian, Thiago Jung Bauermann, Yangtao Li, Yuantian Tang, Yue Haibing. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJcJLwZAAoJEFHr6jzI4aWAAv4P/jMvP52lA90i2E8G72LOVSF1 33DbE/Okib3VfmmMcXZpgpEfwIcEmJcIj86WWcLWzBfXLunehkgwh+AOfBLwqWch D08+RR9EZb7ppvGe91hvSgn4/28CWVKAxuDviSuoE1OK8lOTncu889r2+AxVFZiY f6Al9UPlB3FTJonNx8iO4r/GwrPigukjbzp1vkmJJg59LvNUrMQ1Fgf9D3cdlslH z4Ff9zS26RJy7cwZYQZI4sZXJZmeQ1DxOZ+6z6FL/nZp/O4WLgpw6C6o1+vxo1kE 9ZnO/3+zIRhoWiXd6OcOQXBv3NNCjJZlXh9HHAiL8m5ZqbmxrStQWGyKW/jjEZuK wVHxfUT19x9Qy1p+BH3XcUNMlxchYgcCbEi5yPX2p9ZDXD6ogNG7sT1+NO+FBTww ueCT5PCCB/xWOccQlBErFTMkFXFLtyPDNFK7BkV7uxbH0PQ+9guCvjWfBZti6wjD /6NK4mk7FpmCiK13Y1xjwC5OqabxLUYwtVuHYOMr5TOPh8URUPS4+0pIOdoYDM6z Ensrq1CC843h59MWADgFHSlZ78FRtZlG37JAXunjLbqGupLOvL7phC9lnwkylHga 2hWUWFeOV8HFQBP4gidZkLk64pkT9LzqHgdgIB4wUwrhc8r2mMZGdQTq5H7kOn3Q n9I48PWANvEC0PBCJ/KL =cr6s -----END PGP SIGNATURE----- Merge tag 'powerpc-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Mitigations for Spectre v2 on some Freescale (NXP) CPUs. - A large series adding support for pass-through of Nvidia V100 GPUs to guests on Power9. - Another large series to enable hardware assistance for TLB table walk on MPC8xx CPUs. - Some preparatory changes to our DMA code, to make way for further cleanups from Christoph. - Several fixes for our Transactional Memory handling discovered by fuzzing the signal return path. - Support for generating our system call table(s) from a text file like other architectures. - A fix to our page fault handler so that instead of generating a WARN_ON_ONCE, user accesses of kernel addresses instead print a ratelimited and appropriately scary warning. - A cosmetic change to make our unhandled page fault messages more similar to other arches and also more compact and informative. - Freescale updates from Scott: "Highlights include elimination of legacy clock bindings use from dts files, an 83xx watchdog handler, fixes to old dts interrupt errors, and some minor cleanup." And many clean-ups, reworks and minor fixes etc. Thanks to: Alexandre Belloni, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Benjamin Herrenschmidt, Breno Leitao, Christian Lamparter, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Darren Stevens, David Gibson, Diana Craciun, Dmitry V. Levin, Firoz Khan, Geert Uytterhoeven, Greg Kurz, Gustavo Romero, Hari Bathini, Joel Stanley, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Michal Suchánek, Naveen N. Rao, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Ram Pai, Ravi Bangoria, Rob Herring, Russell Currey, Sabyasachi Gupta, Sam Bobroff, Satheesh Rajendran, Scott Wood, Segher Boessenkool, Stephen Rothwell, Tang Yuantian, Thiago Jung Bauermann, Yangtao Li, Yuantian Tang, Yue Haibing" * tag 'powerpc-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (201 commits) Revert "powerpc/fsl_pci: simplify fsl_pci_dma_set_mask" powerpc/zImage: Also check for stdout-path powerpc: Fix HMIs on big-endian with CONFIG_RELOCATABLE=y macintosh: Use of_node_name_{eq, prefix} for node name comparisons ide: Use of_node_name_eq for node name comparisons powerpc: Use of_node_name_eq for node name comparisons powerpc/pseries/pmem: Convert to %pOFn instead of device_node.name powerpc/mm: Remove very old comment in hash-4k.h powerpc/pseries: Fix node leak in update_lmb_associativity_index() powerpc/configs/85xx: Enable CONFIG_DEBUG_KERNEL powerpc/dts/fsl: Fix dtc-flagged interrupt errors clk: qoriq: add more compatibles strings powerpc/fsl: Use new clockgen binding powerpc/83xx: handle machine check caused by watchdog timer powerpc/fsl-rio: fix spelling mistake "reserverd" -> "reserved" powerpc/fsl_pci: simplify fsl_pci_dma_set_mask arch/powerpc/fsl_rmu: Use dma_zalloc_coherent vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver vfio_pci: Allow regions to add own capabilities vfio_pci: Allow mapping extra regions ...	2018-12-27 10:43:24 -08:00
Linus Torvalds	6e54df001a	Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Ingo Molnar: - Resolve LLVM build bug by removing redundant GNU specific flag - Remove obsolete -funit-at-a-time and -fno-unit-at-a-time use from x86 PowerPC and UM. The UML change was seen and acked by UML maintainer Richard Weinberger. * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/um/vdso: Drop implicit common-page-size linker flag x86, powerpc: Remove -funit-at-a-time compiler option entirely x86/um: Remove -fno-unit-at-a-time workaround for pre-4.0 GCC	2018-12-26 16:57:27 -08:00
Linus Torvalds	792bf4d871	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The biggest RCU changes in this cycle were: - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar. - Replace calls of RCU-bh and RCU-sched update-side functions to their vanilla RCU counterparts. This series is a step towards complete removal of the RCU-bh and RCU-sched update-side functions. ( Note that some of these conversions are going upstream via their respective maintainers. ) - Documentation updates, including a number of flavor-consolidation updates from Joel Fernandes. - Miscellaneous fixes. - Automate generation of the initrd filesystem used for rcutorture testing. - Convert spin_is_locked() assertions to instead use lockdep. ( Note that some of these conversions are going upstream via their respective maintainers. ) - SRCU updates, especially including a fix from Dennis Krein for a bag-on-head-class bug. - RCU torture-test updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits) rcutorture: Don't do busted forward-progress testing rcutorture: Use 100ms buckets for forward-progress callback histograms rcutorture: Recover from OOM during forward-progress tests rcutorture: Print forward-progress test age upon failure rcutorture: Print time since GP end upon forward-progress failure rcutorture: Print histogram of CB invocation at OOM time rcutorture: Print GP age upon forward-progress failure rcu: Print per-CPU callback counts for forward-progress failures rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings rcutorture: Dump grace-period diagnostics upon forward-progress OOM rcutorture: Prepare for asynchronous access to rcu_fwd_startat torture: Remove unnecessary "ret" variables rcutorture: Affinity forward-progress test to avoid housekeeping CPUs rcutorture: Break up too-long rcu_torture_fwd_prog() function rcutorture: Remove cbflood facility torture: Bring any extra CPUs online during kernel startup rcutorture: Add call_rcu() flooding forward-progress tests rcutorture/formal: Replace synchronize_sched() with synchronize_rcu() tools/kernel.h: Replace synchronize_sched() with synchronize_rcu() net/decnet: Replace rcu_barrier_bh() with rcu_barrier() ...	2018-12-26 13:07:19 -08:00
Linus Torvalds	42b00f122c	* ARM: selftests improvements, large PUD support for HugeTLB, single-stepping fixes, improved tracing, various timer and vGIC fixes * x86: Processor Tracing virtualization, STIBP support, some correctness fixes, refactorings and splitting of vmx.c, use the Hyper-V range TLB flush hypercall, reduce order of vcpu struct, WBNOINVD support, do not use -ftrace for __noclone functions, nested guest support for PAUSE filtering on AMD, more Hyper-V enlightenments (direct mode for synthetic timers) * PPC: nested VFIO * s390: bugfixes only this time -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJcH0vFAAoJEL/70l94x66Dw/wH/2FZp1YOM5OgiJzgqnXyDbyf dNEfWo472MtNiLsuf+ZAfJojVIu9cv7wtBfXNzW+75XZDfh/J88geHWNSiZDm3Fe aM4MOnGG0yF3hQrRQyEHe4IFhGFNERax8Ccv+OL44md9CjYrIrsGkRD08qwb+gNh P8T/3wJEKwUcVHA/1VHEIM8MlirxNENc78p6JKd/C7zb0emjGavdIpWFUMr3SNfs CemabhJUuwOYtwjRInyx1y34FzYwW3Ejuc9a9UoZ+COahUfkuxHE8u+EQS7vLVF6 2VGVu5SA0PqgmLlGhHthxLqVgQYo+dB22cRnsLtXlUChtVAq8q9uu5sKzvqEzuE= =b4Jx -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "ARM: - selftests improvements - large PUD support for HugeTLB - single-stepping fixes - improved tracing - various timer and vGIC fixes x86: - Processor Tracing virtualization - STIBP support - some correctness fixes - refactorings and splitting of vmx.c - use the Hyper-V range TLB flush hypercall - reduce order of vcpu struct - WBNOINVD support - do not use -ftrace for __noclone functions - nested guest support for PAUSE filtering on AMD - more Hyper-V enlightenments (direct mode for synthetic timers) PPC: - nested VFIO s390: - bugfixes only this time" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: x86: Add CPUID support for new instruction WBNOINVD kvm: selftests: ucall: fix exit mmio address guessing Revert "compiler-gcc: disable -ftracer for __noclone functions" KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range() KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp() KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte() KVM: Make kvm_set_spte_hva() return int KVM: Replace old tlb flush function with new one to flush a specified range. KVM/MMU: Add tlb flush with range helper function KVM/VMX: Add hv tlb range flush support x86/hyper-v: Add HvFlushGuestAddressList hypercall support KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops KVM: x86: Disable Intel PT when VMXON in L1 guest KVM: x86: Set intercept for Intel PT MSRs read/write KVM: x86: Implement Intel PT MSRs read/write emulation ...	2018-12-26 11:46:28 -08:00
Linus Torvalds	5694cecdb0	arm64 festive updates for 4.21 In the end, we ended up with quite a lot more than I expected: - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and kernel-side support to come later) - Support for per-thread stack canaries, pending an update to GCC that is currently undergoing review - Support for kexec_file_load(), which permits secure boot of a kexec payload but also happens to improve the performance of kexec dramatically because we can avoid the sucky purgatory code from userspace. Kdump will come later (requires updates to libfdt). - Optimisation of our dynamic CPU feature framework, so that all detected features are enabled via a single stop_machine() invocation - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that they can benefit from global TLB entries when KASLR is not in use - 52-bit virtual addressing for userspace (kernel remains 48-bit) - Patch in LSE atomics for per-cpu atomic operations - Custom preempt.h implementation to avoid unconditional calls to preempt_schedule() from preempt_enable() - Support for the new 'SB' Speculation Barrier instruction - Vectorised implementation of XOR checksumming and CRC32 optimisations - Workaround for Cortex-A76 erratum #1165522 - Improved compatibility with Clang/LLD - Support for TX2 system PMUS for profiling the L3 cache and DMC - Reflect read-only permissions in the linear map by default - Ensure MMIO reads are ordered with subsequent calls to Xdelay() - Initial support for memory hotplug - Tweak the threshold when we invalidate the TLB by-ASID, so that mremap() performance is improved for ranges spanning multiple PMDs. - Minor refactoring and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJcE4TmAAoJELescNyEwWM0Nr0H/iaU7/wQSzHyNXtZoImyKTul Blu2ga4/EqUrTU7AVVfmkl/3NBILWlgQVpY6tH6EfXQuvnxqD7CizbHyLdyO+z0S B5PsFUH2GLMNAi48AUNqGqkgb2knFbg+T+9IimijDBkKg1G/KhQnRg6bXX32mLJv Une8oshUPBVJMsHN1AcQknzKariuoE3u0SgJ+eOZ9yA2ZwKxP4yy1SkDt3xQrtI0 lojeRjxcyjTP1oGRNZC+BWUtGOT35p7y6cGTnBd/4TlqBGz5wVAJUcdoxnZ6JYVR O8+ob9zU+4I0+SKt80s7pTLqQiL9rxkKZ5joWK1pr1g9e0s5N5yoETXKFHgJYP8= =sYdt -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 festive updates from Will Deacon: "In the end, we ended up with quite a lot more than I expected: - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and kernel-side support to come later) - Support for per-thread stack canaries, pending an update to GCC that is currently undergoing review - Support for kexec_file_load(), which permits secure boot of a kexec payload but also happens to improve the performance of kexec dramatically because we can avoid the sucky purgatory code from userspace. Kdump will come later (requires updates to libfdt). - Optimisation of our dynamic CPU feature framework, so that all detected features are enabled via a single stop_machine() invocation - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that they can benefit from global TLB entries when KASLR is not in use - 52-bit virtual addressing for userspace (kernel remains 48-bit) - Patch in LSE atomics for per-cpu atomic operations - Custom preempt.h implementation to avoid unconditional calls to preempt_schedule() from preempt_enable() - Support for the new 'SB' Speculation Barrier instruction - Vectorised implementation of XOR checksumming and CRC32 optimisations - Workaround for Cortex-A76 erratum #1165522 - Improved compatibility with Clang/LLD - Support for TX2 system PMUS for profiling the L3 cache and DMC - Reflect read-only permissions in the linear map by default - Ensure MMIO reads are ordered with subsequent calls to Xdelay() - Initial support for memory hotplug - Tweak the threshold when we invalidate the TLB by-ASID, so that mremap() performance is improved for ranges spanning multiple PMDs. - Minor refactoring and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits) arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset() arm64: sysreg: Use _BITUL() when defining register bits arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4 arm64: docs: document pointer authentication arm64: ptr auth: Move per-thread keys from thread_info to thread_struct arm64: enable pointer authentication arm64: add prctl control for resetting ptrauth keys arm64: perf: strip PAC when unwinding userspace arm64: expose user PAC bit positions via ptrace arm64: add basic pointer authentication support arm64/cpufeature: detect pointer authentication arm64: Don't trap host pointer auth use to EL2 arm64/kvm: hide ptrauth from guests arm64/kvm: consistently handle host HCR_EL2 flags arm64: add pointer authentication register bits arm64: add comments about EC exception levels arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field arm64: enable per-task stack canaries ...	2018-12-25 17:41:56 -08:00
Michael Ellerman	12526b0d6c	Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Freescale updates from Scott: "Highlights include elimination of legacy clock bindings use from dts files, an 83xx watchdog handler, fixes to old dts interrupt errors, and some minor cleanup."	2018-12-24 14:14:45 +11:00
Scott Wood	63d86876f3	Revert "powerpc/fsl_pci: simplify fsl_pci_dma_set_mask" This reverts commit `c6e5485e0c` due to failures such as: e1000e 2000:01:00.0: Tx DMA map failed Signed-off-by: Scott Wood <oss@buserror.net>	2018-12-23 20:11:20 -06:00
Steven Rostedt (VMware)	0fad8bfef7	powerpc/frace: Use ftrace_graph_get_ret_stack() instead of curr_ret_stack The structure of the ret_stack array on the task struct is going to change, and accessing it directly via the curr_ret_stack index will no longer give the ret_stack entry that holds the return address. To access that, architectures must now use ftrace_graph_get_ret_stack() to get the associated ret_stack that matches the saved return address. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-12-22 08:20:45 -05:00
Oliver O'Halloran	9bbc7e4ce4	powerpc/zImage: Also check for stdout-path The /chosen/linux,stdout-path is "deprecated" in favour of /chosen/stdout-path so we should be checking for both. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-22 21:47:23 +11:00
Benjamin Herrenschmidt	505a314fb2	powerpc: Fix HMIs on big-endian with CONFIG_RELOCATABLE=y HMIs will crash the kernel due to BRANCH_LINK_TO_FAR(hmi_exception_realmode) Calling into the OPD instead of the actual code. Fixes: `2337d20728` ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Use DOTSYM() rather than #ifdef] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-22 21:43:55 +11:00
Rob Herring	2c8e65b595	powerpc: Use of_node_name_eq for node name comparisons Convert string compares of DT node names to use of_node_name_eq helper instead. This removes direct access to the node name pointer. A couple of open coded iterating thru the child node names are converted to use for_each_child_of_node() instead. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-22 21:29:50 +11:00
Rob Herring	0d1223dd92	powerpc/pseries/pmem: Convert to %pOFn instead of device_node.name In preparation to remove the node name pointer from struct device_node, convert printf users to use the %pOFn format specifier. pmem.c was recently added and missed the initial conversion. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-22 21:05:03 +11:00
Michael Ellerman	423e2f9445	powerpc/mm: Remove very old comment in hash-4k.h This comment talks about PTEs being 64-bits and PMD/PGD being 32-bits, but that hasn't been true since 2005 when David Gibson implemented 4-level page tables in the commit titled "Four level pagetables for ppc64". Remove it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-22 21:04:27 +11:00
Michael Ellerman	47918bc68b	powerpc/pseries: Fix node leak in update_lmb_associativity_index() In update_lmb_associativity_index() we lookup dr_node using of_find_node_by_path() which takes a reference for us. In the non-error case we forget to drop the reference. Note that find_aa_index() does modify properties of the node, but doesn't need an extra reference held once it's returned. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-22 21:04:22 +11:00
Scott Wood	5f470b3638	powerpc/configs/85xx: Enable CONFIG_DEBUG_KERNEL This is required for CONFIG_DEBUG_INFO to work. Signed-off-by: Scott Wood <oss@buserror.net>	2018-12-21 22:07:54 -06:00
Scott Wood	ccdde478e8	powerpc/dts/fsl: Fix dtc-flagged interrupt errors mpc8641_hpcn was updated to 4-cell interrupt specifiers, but PCI interrupt-map was not updated. It was also missing #interrupt-cells on the outer PCI buses. p1020rdb-pc was updated to 4-cell interrupt specifiers, but the ethernet-phy nodes weren't updated. mpc832x_rdb had an invalid "interrupts = <0>" on the ethernet-phy nodes. Besides being the wrong number of cells, 0 is not a valid IPIC interrupt according to ipic.c. Presumably it was meant to indicate that these PHYs are not connected to an interrupt. Signed-off-by: Scott Wood <oss@buserror.net>	2018-12-21 21:37:05 -06:00
Scott Wood	54877957e9	powerpc/fsl: Use new clockgen binding The driver retains compatibility with old device trees, but we don't want the old nodes lying around to be copied, or used as a reference (some of the mux options are incorrect), or even just being clutter. Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Tang Yuantian <andy.tang@nxp.com> [scottwood: removed sysclk node added by Andy] Signed-off-by: Scott Wood <oss@buserror.net>	2018-12-21 20:58:34 -06:00
Christophe Leroy	0deae39cec	powerpc/83xx: handle machine check caused by watchdog timer When the watchdog timer is set in interrupt mode, it causes a machine check when it times out. The purpose of this mode is to ease debugging, not to crash the kernel and reboot the machine. This patch implements a special handling for that, in order to not crash the kernel if the watchdog times out while in interrupt or within the idle task. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [scottwood: added missing #include] Signed-off-by: Scott Wood <oss@buserror.net>	2018-12-21 20:56:41 -06:00
Alexandre Belloni	01f45c8fb8	powerpc/fsl-rio: fix spelling mistake "reserverd" -> "reserved" Fix a spelling mistake in a register description. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Scott Wood <oss@buserror.net>	2018-12-21 20:45:15 -06:00
Christoph Hellwig	c6e5485e0c	powerpc/fsl_pci: simplify fsl_pci_dma_set_mask swiotlb will only bounce buffer when the effective dma address for the device is smaller than the actual DMA range. Instead of flipping between the swiotlb and nommu ops for FSL SOCs that have the second outbound window just don't set the bus dma_mask in this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Scott Wood <oss@buserror.net>	2018-12-21 20:09:24 -06:00
Sabyasachi Gupta	7811eade24	arch/powerpc/fsl_rmu: Use dma_zalloc_coherent Replaced dma_alloc_coherent + memset with dma_zalloc_coherent Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Signed-off-by: Scott Wood <oss@buserror.net>	2018-12-21 19:53:20 -06:00
Masahiro Yamada	8636a1f967	treewide: surround Kconfig file paths with double quotes The Kconfig lexer supports special characters such as '.' and '/' in the parameter context. In my understanding, the reason is just to support bare file paths in the source statement. I do not see a good reason to complicate Kconfig for the room of ambiguity. The majority of code already surrounds file paths with double quotes, and it makes sense since file paths are constant string literals. Make it treewide consistent now. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Wolfram Sang <wsa@the-dreams.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Ingo Molnar <mingo@kernel.org>	2018-12-22 00:25:54 +09:00
Paolo Bonzini	c6ad459733	Second PPC KVM update for 4.21 This has 5 commits that fix page dirty tracking when running nested HV KVM guests, from Suraj Jitindar Singh. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJcHGnCAAoJEJ2a6ncsY3GfaMQIAKP1KQ/SzE38gORRjdf4aUTf 8X6+B6bAjVPwu0OgR2xoU4hPT8vpViG8gIjlspDm/v1igRlokz5GeCEXXlttQwK6 5JFfqS+psCQ/Z/bPFo0uW7cOojeDeE3s1Vd3XXjMH79T6Mvpg54fYutvxd+qbjW6 gAl/jGK6xxo1XsaQfySeSSLA3b8ibI77mjnPBgvbSHbrxBAIjoAfqKTVSjrPwR2b ZvHyCbaoX2uOGyWrw9O73CqCgsiXEOQvr8dLEjsT7a+brrJBO4mv20s6DzlpC1lS YVHd8GAfk44h1fxsNXsj9eEUIcRMEB4fYYOd/u0TECdeyFFWz+8JjBhm6u0TRck= =Zrya -----END PGP SIGNATURE----- Merge tag 'kvm-ppc-next-4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-next Second PPC KVM update for 4.21 This has 5 commits that fix page dirty tracking when running nested HV KVM guests, from Suraj Jitindar Singh.	2018-12-21 11:48:41 +01:00
Lan Tianyu	748c0e312f	KVM: Make kvm_set_spte_hva() return int The patch is to make kvm_set_spte_hva() return int and caller can check return value to determine flush tlb or not. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-12-21 11:28:41 +01:00
Alexey Kardashevskiy	58629c0dc3	powerpc/powernv/npu: Fault user page into the hypervisor's pagetable When a page fault happens in a GPU, the GPU signals the OS and the GPU driver calls the fault handler which populated a page table; this allows the GPU to complete an ATS request. On the bare metal get_user_pages() is enough as it adds a pte to the kernel page table but under KVM the partition scope tree does not get updated so ATS will still fail. This reads a byte from an effective address which causes HV storage interrupt and KVM updates the partition scope tree. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	135ef95405	powerpc/powernv/npu: Check mmio_atsd array bounds when populating A broken device tree might contain more than 8 values and introduce hard to debug memory corruption bug. This adds the boundary check. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	1b785611e1	powerpc/powernv/npu: Add release_ownership hook In order to make ATS work and translate addresses for arbitrary LPID and PID, we need to program an NPU with LPID and allow PID wildcard matching with a specific MSR mask. This implements a helper to assign a GPU to LPAR and program the NPU with a wildcard for PID and a helper to do clean-up. The helper takes MSR (only DR/HV/PR/SF bits are allowed) to program them into NPU2 for ATS checkout requests support. This exports pnv_npu2_unmap_lpar_dev() as following patches will use it from the VFIO driver. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	0bd971676e	powerpc/powernv/npu: Add compound IOMMU groups At the moment the powernv platform registers an IOMMU group for each PE. There is an exception though: an NVLink bridge which is attached to the corresponding GPU's IOMMU group making it a master. Now we have POWER9 systems with GPUs connected to each other directly bypassing PCI. At the moment we do not control state of these links so we have to put such interconnected GPUs to one IOMMU group which means that the old scheme with one GPU as a master won't work - there will be up to 3 GPUs in such group. This introduces a npu_comp struct which represents a compound IOMMU group made of multiple PEs - PCI PEs (for GPUs) and NPU PEs (for NVLink bridges). This converts the existing NVLink1 code to use the new scheme. >From now on, each PE must have a valid iommu_table_group_ops which will either be called directly (for a single PE group) or indirectly from a compound group handlers. This moves IOMMU group registration for NVLink-connected GPUs to npu-dma.c. For POWER8, this stores a new compound group pointer in the PE (so a GPU is still a master); for POWER9 the new group pointer is stored in an NPU (which is allocated per a PCI host controller). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Initialise npdev to NULL in pnv_try_setup_npu_table_group()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	83fb8ccf97	powerpc/powernv/npu: Convert NPU IOMMU helpers to iommu_table_group_ops At the moment NPU IOMMU is manipulated directly from the IODA2 PCI PE code; PCI PE acts as a master to NPU PE. Soon we will have compound IOMMU groups with several PEs from several different PHB (such as interconnected GPUs and NPUs) so there will be no single master but a one big IOMMU group. This makes a first step and converts an NPU PE with a set of extern function to a table group. This should cause no behavioral change. Note that pnv_npu_release_ownership() has never been implemented. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	b04149c2dd	powerpc/powernv/npu: Move single TVE handling to NPU PE Normal PCI PEs have 2 TVEs, one per a DMA window; however NPU PE has only one which points to one of two tables of the corresponding PCI PE. So whenever a new DMA window is programmed to PEs, the NPU PE needs to release old table in order to use the new one. Commit `d41ce7b1bc` ("powerpc/powernv/npu: Do not try invalidating 32bit table when 64bit table is enabled") did just that but in pci-ioda.c while it actually belongs to npu-dma.c. This moves the single TVE handling to npu-dma.c. This does not implement restoring though as it is highly unlikely that we can set the table to PCI PE and cannot to NPU PE and if that fails, we could only set 32bit table to NPU PE and this configuration is not really supported or wanted. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	847e6563aa	powerpc/powernv: Reference iommu_table while it is linked to a group The iommu_table pointer stored in iommu_table_group may get stale by accident, this adds referencing and removes a redundant comment about this. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	5eada8a3f0	powerpc/iommu_api: Move IOMMU groups setup to a single place Registering new IOMMU groups and adding devices to them are separated in code and the latter is dug in the DMA setup code which it does not really belong to. This moved IOMMU groups setup to a separate helper which registers a group and adds devices as before. This does not make a difference as IOMMU groups are not used anyway; the only dependency here is that iommu_add_device() requires a valid pointer to an iommu_table (set by set_iommu_table_base()). To keep the old behaviour, this does not add new IOMMU groups for PEs with no DMA weight and also skips NVLink bridges which do not have pci_controller_ops::setup_bridge (the normal way of adding PEs). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	c4e9d3c1e6	powerpc/powernv/pseries: Rework device adding to IOMMU groups The powernv platform registers IOMMU groups and adds devices to them from the pci_controller_ops::setup_bridge() hook except one case when virtual functions (SRIOV VFs) are added from a bus notifier. The pseries platform registers IOMMU groups from the pci_controller_ops::dma_bus_setup() hook and adds devices from the pci_controller_ops::dma_dev_setup() hook. The very same bus notifier used for powernv does not add devices for pseries though as __of_scan_bus() adds devices first, then it does the bus/dev DMA setup. Both platforms use iommu_add_device() which takes a device and expects it to have a valid IOMMU table struct with an iommu_table_group pointer which in turn points the iommu_group struct (which represents an IOMMU group). Although the helper seems easy to use, it relies on some pre-existing device configuration and associated data structures which it does not really need. This simplifies iommu_add_device() to take the table_group pointer directly. Pseries already has a table_group pointer handy and the bus notified is not used anyway. For powernv, this copies the existing bus notifier, makes it work for powernv only which means an easy way of getting to the table_group pointer. This was tested on VFs but should also support physical PCI hotplug. Since iommu_add_device() receives the table_group pointer directly, pseries does not do TCE cache invalidation (the hypervisor does) nor allow multiple groups per a VFIO container (in other words sharing an IOMMU table between partitionable endpoints), this removes iommu_table_group_link from pseries. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	c409c63161	powerpc/pseries: Remove IOMMU API support for non-LPAR systems The pci_dma_bus_setup_pSeries and pci_dma_dev_setup_pSeries hooks are registered for the pseries platform which does not have FW_FEATURE_LPAR; these would be pre-powernv platforms which we never supported PCI pass through for anyway so remove it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	3be2df00e2	powerpc/pseries/npu: Enable platform support We already changed NPU API for GPUs to not to call OPAL and the remaining bit is initializing NPU structures. This searches for POWER9 NVLinks attached to any device on a PHB and initializes an NPU structure if any found. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	68c0449ea1	powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculation We might have memory@ nodes with "linux,usable-memory" set to zero (for example, to replicate powernv's behaviour for GPU coherent memory) which means that the memory needs an extra initialization but since it can be used afterwards, the pseries platform will try mapping it for DMA so the DMA window needs to cover those memory regions too; if the window cannot cover new memory regions, the memory onlining fails. This walks through the memory nodes to find the highest RAM address to let a huge DMA window cover that too in case this memory gets onlined later. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	0e759bd752	powerpc/powernv/npu: Move OPAL calls away from context manipulation When introduced, the NPU context init/destroy helpers called OPAL which enabled/disabled PID (a userspace memory context ID) filtering in an NPU per a GPU; this was a requirement for P9 DD1.0. However newer chip revision added a PID wildcard support so there is no more need to call OPAL every time a new context is initialized. Also, since the PID wildcard support was added, skiboot does not clear wildcard entries in the NPU so these remain in the hardware till the system reboot. This moves LPID and wildcard programming to the PE setup code which executes once during the booting process so NPU2 context init/destroy won't need to do additional configuration. This replaces the check for FW_FEATURE_OPAL with a check for npu!=NULL as this is the way to tell if the NPU support is present and configured. This moves pnv_npu2_init() declaration as pseries should be able to use it. This keeps pnv_npu2_map_lpar() in powernv as pseries is not allowed to call that. This exports pnv_npu2_map_lpar_dev() as following patches will use it from the VFIO driver. While at it, replace redundant list_for_each_entry_safe() with a simpler list_for_each_entry(). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	46a1449d9e	powerpc/powernv: Move npu struct from pnv_phb to pci_controller The powernv PCI code stores NPU data in the pnv_phb struct. The latter is referenced by pci_controller::private_data. We are going to have NPU2 support in the pseries platform as well but it does not store any private_data in in the pci_controller struct; and even if it did, it would be a different data structure. This makes npu a pointer and stores it one level higher in the pci_controller struct. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	c10c21efa4	powerpc/vfio/iommu/kvm: Do not pin device memory This new memory does not have page structs as it is not plugged to the host so gup() will fail anyway. This adds 2 helpers: - mm_iommu_newdev() to preregister the "memory device" memory so the rest of API can still be used; - mm_iommu_is_devmem() to know if the physical address is one of thise new regions which we must avoid unpinning of. This adds @mm to tce_page_is_contained() and iommu_tce_xchg() to test if the memory is device memory to avoid pfn_to_page(). This adds a check for device memory in mm_iommu_ua_mark_dirty_rm() which does delayed pages dirtying. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	e0bf78b0f9	powerpc/mm/iommu/vfio_spapr_tce: Change mm_iommu_get to reference a region Normally mm_iommu_get() should add a reference and mm_iommu_put() should remove it. However historically mm_iommu_find() does the referencing and mm_iommu_get() is doing allocation and referencing. We are going to add another helper to preregister device memory so instead of having mm_iommu_new() (which pre-registers the normal memory and references the region), we need separate helpers for pre-registering and referencing. This renames: - mm_iommu_get to mm_iommu_new; - mm_iommu_find to mm_iommu_get. This changes mm_iommu_get() to reference the region so the name now reflects what it does. This removes the check for exact match from mm_iommu_new() as we want it to fail on existing regions; mm_iommu_get() should be used instead. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Alexey Kardashevskiy	ab7032e793	powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2 The skiboot firmware has a hot reset handler which fences the NVIDIA V100 GPU RAM on Witherspoons and makes accesses no-op instead of throwing HMIs: https://github.com/open-power/skiboot/commit/fca2b2b839a67 Now we are going to pass V100 via VFIO which most certainly involves KVM guests which are often terminated without getting a chance to offline GPU RAM so we end up with a running machine with misconfigured memory. Accessing this memory produces hardware management interrupts (HMI) which bring the host down. To suppress HMIs, this wires up this hot reset hook to vfio_pci_disable() via pci_disable_device() which switches NPU2 to a safe mode and prevents HMIs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:46 +11:00
Christophe Leroy	ffca395b11	powerpc/mm: Fix reporting of kernel execute faults on the 8xx On the 8xx, no-execute is set via PPP bits in the PTE. Therefore a no-exec fault generates DSISR_PROTFAULT error bits, not DSISR_NOEXEC_OR_G. This patch adds DSISR_PROTFAULT in the test mask. Fixes: `d3ca587404` ("powerpc/mm: Fix reporting of kernel execute faults") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 16:20:45 +11:00
Firoz Khan	ab66dcc76d	powerpc: generate uapi header and system call table files System call table generation script must be run to gener- ate unistd_32/64.h and syscall_table_32/64/c32/spu.h files. This patch will have changes which will invokes the script. This patch will generate unistd_32/64.h and syscall_table- _32/64/c32/spu.h files by the syscall table generation script invoked by parisc/Makefile and the generated files against the removed files must be identical. The generated uapi header file will be included in uapi/- asm/unistd.h and generated system call table header file will be included by kernel/systbl.S file. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Firoz Khan	aff8503932	powerpc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add or modify the system calls in the respective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implementation across all architectures. The system call table generation script is added in syscalls directory which contain the script to generate both uapi header file and system call table files. The syscall.tbl file will be the input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header- unistd_32/64.h and syscall_table_32/64/c32/spu.h files respectively. File syscall_table_32/64/c32/spu.h is incl- uded by syscall.S - the real system call table. Both *.sh files will parse the content syscall.tbl to generate the header and table files. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Firoz Khan	fbf508da74	powerpc: split compat syscall table out from native table PowerPC uses a syscall table with native and compat calls interleaved, which is a slightly simpler way to define two matching tables. As we move to having the tables generated, that advantage is no longer important, but the interleaved table gets in the way of using the same scripts as on the other archit- ectures. Split out a new compat_sys_call_table symbol that contains all the compat calls, and leave the main table for the nat- ive calls, to more closely match the method we use every- where else. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Firoz Khan	a11b763d61	powerpc: move macro definition from asm/systbl.h Move the macro definition for compat_sys_sigsuspend from asm/systbl.h to the file which it is getting included. One of the patch in this patch series is generating uapi header and syscall table files. In order to come up with a common implimentation across all architecture, we need to do this change. This change will simplify the implementation of system call table generation script and help to come up a common implementation across all architecture. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Firoz Khan	8a19eeeab6	powerpc: add __NR_syscalls along with NR_syscalls NR_syscalls macro holds the number of system call exist in powerpc architecture. We have to change the value of NR_syscalls, if we add or delete a system call. One of the patch in this patch series has a script which will generate a uapi header based on syscall.tbl file. The syscall.tbl file contains the number of system call information. So we have two option to update NR_syscalls value. 1. Update NR_syscalls in asm/unistd.h manually by count- ing the no.of system calls. No need to update NR_sys- calls until we either add a new system call or delete existing system call. 2. We can keep this feature in above mentioned script, that will count the number of syscalls and keep it in a generated file. In this case we don't need to expli- citly update NR_syscalls in asm/unistd.h file. The 2nd option will be the recommended one. For that, I added the __NR_syscalls macro in uapi/asm/unistd.h along with NR_syscalls asm/unistd.h. The macro __NR_syscalls also added for making the name convention same across all architecture. While __NR_syscalls isn't strictly part of the uapi, having it as part of the generated header to simplifies the implementation. We also need to enclose this macro with #ifdef __KERNEL__ to avoid side effects. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Ram Pai	2cd4bd192e	powerpc/pkeys: Fix handling of pkey state across fork() Protection key tracking information is not copied over to the mm_struct of the child during fork(). This can cause the child to erroneously allocate keys that were already allocated. Any allocated execute-only key is lost aswell. Add code; called by dup_mmap(), to copy the pkey state from parent to child explicitly. This problem was originally found by Dave Hansen on x86, which turns out to be a problem on powerpc aswell. Fixes: `cf43d3b264` ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Breno Leitao	6f5b9f018f	powerpc/tm: Unset MSR[TS] if not recheckpointing There is a TM Bad Thing bug that can be caused when you return from a signal context in a suspended transaction but with ucontext MSR[TS] unset. This forces regs->msr[TS] to be set at syscall entrance (since the CPU state is transactional). It also calls treclaim() to flush the transaction state, which is done based on the live (mfmsr) MSR state. Since user context MSR[TS] is not set, then restore_tm_sigcontexts() is not called, thus, not executing recheckpoint, keeping the CPU state as not transactional. When calling rfid, SRR1 will have MSR[TS] set, but the CPU state is non transactional, causing the TM Bad Thing with the following stack: [ 33.862316] Bad kernel stack pointer 3fffd9dce3e0 at c00000000000c47c cpu 0x8: Vector: 700 (Program Check) at [c00000003ff7fd40] pc: c00000000000c47c: fast_exception_return+0xac/0xb4 lr: 00003fff865f442c sp: 3fffd9dce3e0 msr: 8000000102a03031 current = 0xc00000041f68b700 paca = 0xc00000000fb84800 softe: 0 irq_happened: 0x01 pid = 1721, comm = tm-signal-sigre Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) WARNING: exception is not recoverable, can't continue The same problem happens on 32-bits signal handler, and the fix is very similar, if tm_recheckpoint() is not executed, then regs->msr[TS] should be zeroed. This patch also fixes a sparse warning related to lack of indentation when CONFIG_PPC_TRANSACTIONAL_MEM is set. Fixes: `2b0a576d15` ("powerpc: Add new transactional memory state to the signal context") CC: Stable <stable@vger.kernel.org> # 3.10+ Signed-off-by: Breno Leitao <leitao@debian.org> Tested-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Breno Leitao	11be39584a	powerpc/tm: Print scratch value Usually a TM Bad Thing exception is raised due to three different problems. a) touching SPRs in an active transaction; b) using TM instruction with the facility disabled and c) setting a wrong MSR/SRR1 at RFID. The two initial cases are easy to identify by looking at the instructions. The latter case is harder, because the MSR is masked after RFID, so, it is very useful to look at the previous MSR (SRR1) before RFID as also the current and masked MSR. Since MSR is saved at paca just before RFID, this patch prints it if a TM Bad thing happen, helping to understand what is the invalid TM transition that is causing the exception. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Breno Leitao	63a0d6b03b	powerpc/tm: Save MSR to PACA before RFID As other exit points, move SRR1 (MSR) into paca->tm_scratch, so, if there is a TM Bad Thing in RFID, it is easy to understand what was the SRR1 value being used. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Breno Leitao	e1c3743e1a	powerpc/tm: Set MSR[TS] just prior to recheckpoint On a signal handler return, the user could set a context with MSR[TS] bits set, and these bits would be copied to task regs->msr. At restore_tm_sigcontexts(), after current task regs->msr[TS] bits are set, several __get_user() are called and then a recheckpoint is executed. This is a problem since a page fault (in kernel space) could happen when calling __get_user(). If it happens, the process MSR[TS] bits were already set, but recheckpoint was not executed, and SPRs are still invalid. The page fault can cause the current process to be de-scheduled, with MSR[TS] active and without tm_recheckpoint() being called. More importantly, without TEXASR[FS] bit set also. Since TEXASR might not have the FS bit set, and when the process is scheduled back, it will try to reclaim, which will be aborted because of the CPU is not in the suspended state, and, then, recheckpoint. This recheckpoint will restore thread->texasr into TEXASR SPR, which might be zero, hitting a BUG_ON(). kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434! cpu 0xb: Vector: 700 (Program Check) at [c00000041f1576d0] pc: c000000000054550: restore_gprs+0xb0/0x180 lr: 0000000000000000 sp: c00000041f157950 msr: 8000000100021033 current = 0xc00000041f143000 paca = 0xc00000000fb86300 softe: 0 irq_happened: 0x01 pid = 1021, comm = kworker/11:1 kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434! Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) enter ? for help [c00000041f157b30] c00000000001bc3c tm_recheckpoint.part.11+0x6c/0xa0 [c00000041f157b70] c00000000001d184 __switch_to+0x1e4/0x4c0 [c00000041f157bd0] c00000000082eeb8 __schedule+0x2f8/0x990 [c00000041f157cb0] c00000000082f598 schedule+0x48/0xc0 [c00000041f157ce0] c0000000000f0d28 worker_thread+0x148/0x610 [c00000041f157d80] c0000000000f96b0 kthread+0x120/0x140 [c00000041f157e30] c00000000000c0e0 ret_from_kernel_thread+0x5c/0x7c This patch simply delays the MSR[TS] set, so, if there is any page fault in the __get_user() section, it does not have regs->msr[TS] set, since the TM structures are still invalid, thus avoiding doing TM operations for in-kernel exceptions and possible process reschedule. With this patch, the MSR[TS] will only be set just before recheckpointing and setting TEXASR[FS] = 1, thus avoiding an interrupt with TM registers in invalid state. Other than that, if CONFIG_PREEMPT is set, there might be a preemption just after setting MSR[TS] and before tm_recheckpoint(), thus, this block must be atomic from a preemption perspective, thus, calling preempt_disable/enable() on this code. It is not possible to move tm_recheckpoint to happen earlier, because it is required to get the checkpointed registers from userspace, with __get_user(), thus, the only way to avoid this undesired behavior is delaying the MSR[TS] set. The 32-bits signal handler seems to be safe this current issue, but, it might be exposed to the preemption issue, thus, disabling preemption in this chunk of code. Changes from v2: * Run the critical section with preempt_disable. Fixes: `87b4e5393a` ("powerpc/tm: Fix return of active 64bit signals") Cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 14:46:50 +11:00
Suraj Jitindar Singh	ae59a7e194	KVM: PPC: Book3S HV: Keep rc bits in shadow pgtable in sync with host The rc bits contained in ptes are used to track whether a page has been accessed and whether it is dirty. The accessed bit is used to age a page and the dirty bit to track whether a page is dirty or not. Now that we support nested guests there are three ptes which track the state of the same page: - The partition-scoped page table in the L1 guest, mapping L2->L1 address - The partition-scoped page table in the host for the L1 guest, mapping L1->L0 address - The shadow partition-scoped page table for the nested guest in the host, mapping L2->L0 address The idea is to attempt to keep the rc state of these three ptes in sync, both when setting and when clearing rc bits. When setting the bits we achieve consistency by: - Initially setting the bits in the shadow page table as the 'and' of the other two. - When updating in software the rc bits in the shadow page table we ensure the state is consistent with the other two locations first, and update these before reflecting the change into the shadow page table. i.e. only set the bits in the L2->L0 pte if also set in both the L2->L1 and the L1->L0 pte. When clearing the bits we achieve consistency by: - The rc bits in the shadow page table are only cleared when discarding a pte, and we don't need to record this as if either bit is set then it must also be set in the pte mapping L1->L0. - When L1 clears an rc bit in the L2->L1 mapping it __should__ issue a tlbie instruction - This means we will discard the pte from the shadow page table meaning the mapping will have to be setup again. - When setup the pte again in the shadow page table we will ensure consistency with the L2->L1 pte. - When the host clears an rc bit in the L1->L0 mapping we need to also clear the bit in any ptes in the shadow page table which map the same gfn so we will be notified if a nested guest accesses the page. This case is what this patch specifically concerns. - We can search the nest_rmap list for that given gfn and clear the same bit from all corresponding ptes in shadow page tables. - If a nested guest causes either of the rc bits to be set by software in future then we will update the L1->L0 pte and maintain consistency. With the process outlined above we aim to maintain consistency of the 3 pte locations where we track rc for a given guest page. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-21 14:42:07 +11:00
Suraj Jitindar Singh	90165d3da0	KVM: PPC: Book3S HV: Introduce kvmhv_update_nest_rmap_rc_list() Introduce a function kvmhv_update_nest_rmap_rc_list() which for a given nest_rmap list will traverse it, find the corresponding pte in the shadow page tables, and if it still maps the same host page update the rc bits accordingly. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-21 14:39:35 +11:00
Suraj Jitindar Singh	8b23eee4e5	KVM: PPC: Book3S HV: Apply combination of host and l1 pte rc for nested guest The shadow page table contains ptes for translations from nested guest address to host address. Currently when creating these ptes we take the rc bits from the pte for the L1 guest address to host address translation. This is incorrect as we must also factor in the rc bits from the pte for the nested guest address to L1 guest address translation (as contained in the L1 guest partition table for the nested guest). By not calculating these bits correctly L1 may not have been correctly notified when it needed to update its rc bits in the partition table it maintains for its nested guest. Modify the code so that the rc bits in the resultant pte for the L2->L0 translation are the 'and' of the rc bits in the L2->L1 pte and the L1->L0 pte, also accounting for whether this was a write access when setting the dirty bit. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-21 14:37:43 +11:00
Suraj Jitindar Singh	8400f87406	KVM: PPC: Book3S HV: Align gfn to L1 page size when inserting nest-rmap entry Nested rmap entries are used to store the translation from L1 gpa to L2 gpa when entries are inserted into the shadow (nested) page tables. This rmap list is located by indexing the rmap array in the memslot by L1 gfn. When we come to search for these entries we only know the L1 page size (which could be PAGE_SIZE, 2M or a 1G page) and so can only select a gfn aligned to that size. This means that when we insert the entry, so we can find it later, we need to align the gfn we use to select the rmap list in which to insert the entry to L1 page size as well. By not doing this we were missing nested rmap entries when modifying L1 ptes which were for a page also passed through to an L2 guest. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-21 14:37:43 +11:00
Suraj Jitindar Singh	bec6e03b5e	KVM: PPC: Book3S HV: Hold kvm->mmu_lock across updating nested pte rc bits We already hold the kvm->mmu_lock spin lock across updating the rc bits in the pte for the L1 guest. Continue to hold the lock across updating the rc bits in the pte for the nested guest as well to prevent invalidations from occurring. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-21 14:37:43 +11:00
Mahesh Salgaonkar	0db6896ff6	powerpc/fadump: Do not allow hot-remove memory from fadump reserved area. For fadump to work successfully there should not be any holes in reserved memory ranges where kernel has asked firmware to move the content of old kernel memory in event of crash. Now that fadump uses CMA for reserved area, this memory area is now not protected from hot-remove operations unless it is cma allocated. Hence, fadump service can fail to re-register after the hot-remove operation, if hot-removed memory belongs to fadump reserved region. To avoid this make sure that memory from fadump reserved area is not hot-removable if fadump is registered. However, if user still wants to remove that memory, he can do so by manually stopping fadump service before hot-remove operation. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Mahesh Salgaonkar	f86593be1e	powerpc/fadump: Throw proper error message on fadump registration failure fadump fails to register when there are holes in reserved memory area. This can happen if user has hot-removed a memory that falls in the fadump reserved memory area. Throw a meaningful error message to the user in such case. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: is_reserved_memory_area_contiguous() returns bool, unsplit string] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Mahesh Salgaonkar	a4e92ce8e4	powerpc/fadump: Reservationless firmware assisted dump One of the primary issues with Firmware Assisted Dump (fadump) on Power is that it needs a large amount of memory to be reserved. On large systems with TeraBytes of memory, this reservation can be quite significant. In some cases, fadump fails if the memory reserved is insufficient, or if the reserved memory was DLPAR hot-removed. In the normal case, post reboot, the preserved memory is filtered to extract only relevant areas of interest using the makedumpfile tool. While the tool provides flexibility to determine what needs to be part of the dump and what memory to filter out, all supported distributions default this to "Capture only kernel data and nothing else". We take advantage of this default and the Linux kernel's Contiguous Memory Allocator (CMA) to fundamentally change the memory reservation model for fadump. Instead of setting aside a significant chunk of memory nobody can use, this patch uses CMA instead, to reserve a significant chunk of memory that the kernel is prevented from using (due to MIGRATE_CMA), but applications are free to use it. With this fadump will still be able to capture all of the kernel memory and most of the user space memory except the user pages that were present in CMA region. Essentially, on a P9 LPAR with 2 cores, 8GB RAM and current upstream: [root@zzxx-yy10 ~]# free -m total used free shared buff/cache available Mem: 7557 193 6822 12 541 6725 Swap: 4095 0 4095 With this patch: [root@zzxx-yy10 ~]# free -m total used free shared buff/cache available Mem: 8133 194 7464 12 475 7338 Swap: 4095 0 4095 Changes made here are completely transparent to how fadump has traditionally worked. Thanks to Aneesh Kumar and Anshuman Khandual for helping us understand CMA and its usage. TODO: - Handle case where CMA reservation spans nodes. Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Mahesh Salgaonkar	08fb726df1	powerpc/powernv: Move opal_power_control_init() call in opal_init(). opal_power_control_init() depends on opal message notifier to be initialized, which is done in opal_init()->opal_message_init(). But both these initialization are called through machine initcalls and it all depends on in which order they being called. So far these are called in correct order (may be we got lucky) and never saw any issue. But it is clearer to control initialization order explicitly by moving opal_power_control_init() into opal_init(). Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Markus Elfring	ae6263cc33	powerpc/4xx: Delete an unnecessary return statement in two functions The script "checkpatch.pl" pointed information out like the following. WARNING: void function return statements are not generally useful Thus remove such a statement in the affected functions. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Markus Elfring	a8d5dadae5	powerpc/4xx: Delete error message for a ENOMEM in two functions Omit an extra message for a memory allocation failure in these functions. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Markus Elfring	52930bc6e8	powerpc/4xx: Use seq_putc() in ocm_debugfs_show() A single character (line break) should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Markus Elfring	b52106a040	powerpc/4xx: Combine four seq_printf() calls into two in ocm_debugfs_show() Some data were printed into a sequence by four separate function calls. Print the same data by two single function calls instead. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Christophe Leroy	96d19d70e1	powerpc/8xx: Allow pinning IMMR TLB when using early debug console CONFIG_EARLY_DEBUG_CPM requires IMMR area TLB to be pinned otherwise it doesn't survive MMU_init, and the boot fails. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Oliver O'Halloran	5f639e5fad	powerpc/powernv: Remove PCI_MSI ifdef checks CONFIG_PCI_MSI was made mandatory by commit `a311e738b6` ("powerpc/powernv: Make PCI non-optional") so the #ifdef checks around CONFIG_PCI_MSI here can be removed entirely. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Alexandre Belloni	a083787680	powerpc/fsl-rio: fix spelling mistake "reserverd" -> "reserved" Fix a spelling mistake in a register description. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
Ravi Bangoria	0c9108b083	Powerpc/perf: Wire up PMI throttling Commit `14c63f17b1` ("perf: Drop sample rate when sampling is too slow") introduced a way to throttle PMU interrupts if we're spending too much time just processing those. Wire up powerpc PMI handler to use this infrastructure. We have throttling of the rate of interrupts, but this adds throttling based on the time taken to process the interrupts. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-21 11:32:49 +11:00
David S. Miller	2be09de7d6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 11:53:36 -08:00
Radim Krčmář	cfdfaf4a86	PPC KVM update for 4.21 The main new feature this time is support in HV nested KVM for passing a device that is emulated by a level 0 hypervisor and presented to level 1 as a PCI device through to a level 2 guest using VFIO. Apart from that there are improvements for migration of radix guests under HV KVM and some other fixes and cleanups. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJcGFzEAAoJEJ2a6ncsY3GfKjoH/Azcf8QIO5ftyHrjazFZOSUh 5Lr24HZTYHheowp6obzuZWRAIyckHmflRmOkv8RVGuA8+Sp+m5pBxN3WTVPOwDUh WanOWVGJsuhl6qATmkm7xIxmYhQEyLxVNbnWva7WXuZ92rgGCNfHtByHWAx/7vTe q5Shr4fLIQ8HRzor8Xqqph1I0hQNTE9VsaK1hW/PxI0gsO8qjDwOR8SDpT/aaJrS Sir+lM0TwCbJREuObDxYAXn1OWy8rMYjlb9fEBv5tmPCQKiB9vJz4tV+ahR9eJ14 PEF57MoBOGwzQXo4geFLuo/Bu8fDygKsKQX1eYGcn6tRGA4pnTxzYl0+dHLBkOM= =3WkD -----END PGP SIGNATURE----- Merge tag 'kvm-ppc-next-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc PPC KVM update for 4.21 from Paul Mackerras The main new feature this time is support in HV nested KVM for passing a device that is emulated by a level 0 hypervisor and presented to level 1 as a PCI device through to a level 2 guest using VFIO. Apart from that there are improvements for migration of radix guests under HV KVM and some other fixes and cleanups.	2018-12-20 14:54:09 +01:00
YueHaibing	8c6c942d33	powerpc/eeh: Fix debugfs_simple_attr.cocci warnings Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Semantic patch information: Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() imposes some significant overhead as compared to DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe(). Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Alexey Kardashevskiy	c20577014f	powerpc/powernv/eeh/npu: Fix uninitialized variables in opal_pci_eeh_freeze_status The current implementation of the OPAL_PCI_EEH_FREEZE_STATUS call in skiboot's NPU driver does not touch the pci_error_type parameter so it might have garbage but the powernv code analyzes it nevertheless. This initializes pcierr and fstate to zero in all call sites. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Alexey Kardashevskiy	a25de7af34	powerpc/powernv/ioda: Reduce a number of hooks in pnv_phb fixup_phb() is never used, this removes it. pick_m64_pe() and reserve_m64_pe() are always defined for all powernv PHBs: they are initialized by pnv_ioda_parse_m64_window() which is called unconditionally from pnv_pci_init_ioda_phb() which initializes all known PHB types on powernv so we can open code them. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	dfa88658fb	powerpc/fsl: Update Spectre v2 reporting Report branch predictor state flush as a mitigation for Spectre variant 2. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Alexey Kardashevskiy	f21b0a45e4	powerpc/powernv/ioda1: Remove dead code for a single device PE At the moment PNV_IODA_PE_DEV is only used for NPU PEs which are not present on IODA1 machines (i.e. POWER7) so let's remove a piece of dead code. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	3bc8ea8603	powerpc/fsl: Enable runtime patching if nospectre_v2 boot arg is used If the user choses not to use the mitigations, replace the code sequence with nops. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	e7aa61f47b	powerpc/fsl: Flush branch predictor when entering KVM Switching from the guest to host is another place where the speculative accesses can be exploited. Flush the branch predictor when entering KVM. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Alexey Kardashevskiy	fa1ada7889	powerpc/powernv/npu: Remove unused headers and a macro. The macro and few headers are not used so remove them. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	7fef436295	powerpc/fsl: Flush the branch predictor at each kernel entry (32 bit) In order to protect against speculation attacks on indirect branches, the branch predictor is flushed at kernel entry to protect for the following situations: - userspace process attacking another userspace process - userspace process attacking the kernel Basically when the privillege level change (i.e.the kernel is entered), the branch predictor state is flushed. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Alexey Kardashevskiy	bdbf649efe	powerpc/powernv/ioda: Allocate indirect TCE levels of cached userspace addresses on demand The powernv platform maintains 2 TCE tables for VFIO - a hardware TCE table and a table with userspace addresses; the latter is used for marking pages dirty when corresponging TCEs are unmapped from the hardware table. `a68bd1267b` ("powerpc/powernv/ioda: Allocate indirect TCE levels on demand") enabled on-demand allocation of the hardware table, however it missed the other table so it has still been fully allocated at the boot time. This fixes the issue by allocating a single level, just like we do for the hardware table. Fixes: `a68bd1267b` ("powerpc/powernv/ioda: Allocate indirect TCE levels on demand") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	10c5e83afd	powerpc/fsl: Flush the branch predictor at each kernel entry (64bit) In order to protect against speculation attacks on indirect branches, the branch predictor is flushed at kernel entry to protect for the following situations: - userspace process attacking another userspace process - userspace process attacking the kernel Basically when the privillege level change (i.e. the kernel is entered), the branch predictor state is flushed. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	f633a8ad63	powerpc/fsl: Add nospectre_v2 command line argument When the command line argument is present, the Spectre variant 2 mitigations are disabled. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	98518c4d87	powerpc/fsl: Emulate SPRN_BUCSR register In order to flush the branch predictor the guest kernel performs writes to the BUCSR register which is hypervisor privilleged. However, the branch predictor is flushed at each KVM entry, so the branch predictor has been already flushed, so just return as soon as possible to guest. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> [mpe: Tweak comment formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	7d8bad99ba	powerpc/fsl: Fix spectre_v2 mitigations reporting Currently for CONFIG_PPC_FSL_BOOK3E the spectre_v2 file is incorrect: $ cat /sys/devices/system/cpu/vulnerabilities/spectre_v2 "Mitigation: Software count cache flush" Which is wrong. Fix it to report vulnerable for now. Fixes: `ee13cb249f` ("powerpc/64s: Add support for software count cache flush") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:59:03 +11:00
Diana Craciun	1cbf8990d7	powerpc/fsl: Add macro to flush the branch predictor The BUCSR register can be used to invalidate the entries in the branch prediction mechanisms. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:58:57 +11:00
Diana Craciun	76a5eaa38b	powerpc/fsl: Add infrastructure to fixup branch predictor flush In order to protect against speculation attacks (Spectre variant 2) on NXP PowerPC platforms, the branch predictor should be flushed when the privillege level is changed. This patch is adding the infrastructure to fixup at runtime the code sections that are performing the branch predictor flush depending on a boot arg parameter which is added later in a separate patch. Signed-off-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:53:39 +11:00
Christophe Leroy	f242e0ac95	powerpc/prom: move the device tree if not in declared memory. If the device tree doesn't reside in the memory which is declared inside it, it has to be moved as well as this memory will not be mapped by the kernel. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Michael Ellerman	2b874a5c7b	powerpc/configs: Don't enable PPC_EARLY_DEBUG in defconfigs This reverts the remains of commit `b9ef7d6b11` ("powerpc: Update default configurations"). That commit was proceeded by a commit which added a config option to control use of BOOTX for early debug, ie. PPC_EARLY_DEBUG_BOOTX, and then the update of the defconfigs was intended to not change behaviour by then enabling the new config option. However enabling PPC_EARLY_DEBUG had other consequences, notably causing us to register the udbg console at the end of udbg_early_init(). This means on a system which doesn't have anything that BOOTX can use (most systems), we register the udbg console very early but the bootx code just throws everything away, meaning early boot messages are never printed to the console. What we want to happen is for the udbg console to only be registered later (from setup_arch()) once we've setup udbg_putc, and then all early boot messages will be replayed. Fixes: `b9ef7d6b11` ("powerpc: Update default configurations") Reported-by: Torsten Duwe <duwe@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Arnd Bergmann	2fea82db11	powerpc: eeh_event: convert semaphore to completion For this use case, completions and semaphores are equivalent, but semaphores are an awkward interface that should generally be avoided, so use the completion instead. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Benjamin Herrenschmidt	3cfb9ebe90	powerpc/44x/bamboo: Fix PCI range The bamboo dts has a bug: it uses a non-naturally aligned range for PCI memory space. This isnt' supported by the code, thus causing PCI to break on this system. This is due to the fact that while the chip memory map has 1G reserved for PCI memory, it's only 512M aligned. The code doesn't know how to split that into 2 different PMMs and fails, so limit the region to 512M. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Darren Stevens	51f4cc2047	powerpc/pasemi: Add Nemo board IRQ initroutine Add a IRQ init routine for the Nemo board which inits and attatches the i8259 found in the SB600, and a cascade routine to dispatch the interrupts. Signed-off-by: Darren Stevens <darren@stevens-zone.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Darren Stevens	656fdf3ad8	powerpc/pasemi: Add Nemo board device init code. Add routines for Nemo specific devices to init at boot time, these being board level power-off and SB600's rtc. Also add a run time variable to prevent these being activated if we boot on a reference board. Signed-off-by: Darren Stevens <darren@stevens-zone.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Darren Stevens	0428a5f494	powerpc/pasemi: Add Nemo board IRQ initroutine Add a IRQ init routine for the Nemo board which inits and attatches the i8259 found in the SB600, and a cascade routine to dispatch the interrupts. Signed-off-by: Darren Stevens <darren@stevens-zone.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Darren Stevens	68f211a4d1	powerpc/pasemi: Add PCI initialisation for Nemo board. The A-Eon Amigaone X1000's Nemo motherboard has an AMD SB600 connected to one of the PCI-e root ports on its PaSemi Pwrficient 1628M SoC. Normally the SB600 southbridge would be connected to a hidden PCI-e port on the system's northbridge, and as a result doesn't fully comply with the PCI-e spec. Add code to relax the PCI-e detection in both the root port and the Linux kernel allowing on board devices to be detected. Signed-off-by: Darren Stevens <darren@stevens-zone.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christophe Leroy	49a502ea23	powerpc/mm: Make NULL pointer deferences explicit on bad page faults. As several other arches including x86, this patch makes it explicit that a bad page fault is a NULL pointer dereference when the fault address is lower than PAGE_SIZE In the mean time, this page makes all bad_page_fault() messages shorter so that they remain on one single line. And it prefixes them by "BUG: " so that they get easily grepped. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Avoid pr_cont()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Dmitry V. Levin	8dbdec0bcb	powerpc/ptrace: Combine SYSCALL_EMU & SYSCALL_TRACE handling Combine the SYSCALL_EMU and SYSCALL_TRACE handling so that we only call tracehook_report_syscall_entry() in one place. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> [mpe: Flesh out change log, s/cached_flags/flags/, reflow comments] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christoph Hellwig	25078dc1f7	powerpc: use mm zones more sensibly Powerpc has somewhat odd usage where ZONE_DMA is used for all memory on common 64-bit configfs, and ZONE_DMA32 is used for 31-bit schemes. Move to a scheme closer to what other architectures use (and I dare to say the intent of the system): - ZONE_DMA: optionally for memory < 31-bit (64-bit embedded only) - ZONE_NORMAL: everything addressable by the kernel - ZONE_HIGHMEM: memory > 32-bit for 32-bit kernels Also provide information on how ZONE_DMA is used by defining ARCH_ZONE_DMA_BITS. Contains various fixes from Benjamin Herrenschmidt. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christoph Hellwig	44a0337b32	powerpc/dma: split the two __dma_alloc_coherent implementations The implemementation for the CONFIG_NOT_COHERENT_CACHE case doesn't share any code with the one for systems with coherent caches. Split it off and merge it with the helpers in dma-noncoherent.c that have no other callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christoph Hellwig	9c15a87cfc	powerpc/dma: remove the unused dma_iommu_ops export Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christoph Hellwig	acddff9dc4	powerpc/dma: remove the unused ISA_DMA_THRESHOLD export Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christoph Hellwig	0e652390fb	powerpc/dma: remove the unused ARCH_HAS_DMA_MMAP_COHERENT define Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christoph Hellwig	0aeba2d0d2	powerpc/dma: properly wire up the unmap_page and unmap_sg methods The unmap methods need to transfer memory ownership back from the device to the cpu by identical means as dma_sync__to_cpu. I'm not sure powerpc needs to do any work in this transfer direction, but given that it does invalidate the caches in dma_sync__to_cpu already we should make sure we also do so on unmapping. Signed-off-by: Christoph Hellwig <hch@lst.de> [mpe: s/dir/direction in dma_nommu_unmap_page()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christoph Hellwig	9286356907	powerpc: allow NOT_COHERENT_CACHE for amigaone AMIGAONE selects NOT_COHERENT_CACHE, so we better allow it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Christophe Leroy	b18f0ae92b	powerpc/prom: fix early DEBUG messages This patch fixes early DEBUG messages in prom.c: - Use %px instead of %p to see the addresses - Cast memblock_phys_mem_size() with (unsigned long long) to avoid build failure when phys_addr_t is not 64 bits. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 22:21:20 +11:00
Joel Stanley	72e7bcc2cd	powerpc/32: Avoid unsupported flags with clang When building for ppc32 with clang these flags are unsupported: -ffixed-r2 and -mmultiple llvm's lib/Target/PowerPC/PPCRegisterInfo.cpp marks r2 as reserved on when building for SVR4ABI and !ppc64: // The SVR4 ABI reserves r2 and r13 if (Subtarget.isSVR4ABI()) { // We only reserve r2 if we need to use the TOC pointer. If we have no // explicit uses of the TOC pointer (meaning we're a leaf function with // no constant-pool loads, etc.) and we have no potential uses inside an // inline asm block, then we can treat r2 has an ordinary callee-saved // register. const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>(); if (!TM.isPPC64() \|\| FuncInfo->usesTOCBasePtr() \|\| MF.hasInlineAsm()) markSuperRegs(Reserved, PPC::R2); // System-reserved register markSuperRegs(Reserved, PPC::R13); // Small Data Area pointer register } This means we can safely omit -ffixed-r2 when building for 32-bit targets. The -mmultiple/-mno-multiple flags are not supported by clang, so platforms that might support multiple miss out on using multiple word instructions. We wrap these flags in cc-option so that when Clang gains support the kernel will be able use these flags. Clang 8 can then build a ppc44x_defconfig which boots in Qemu: make CC=clang-8 ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- ppc44x_defconfig ./scripts/config -e CONFIG_DEVTMPFS -d DEVTMPFS_MOUNT make CC=clang-8 ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- qemu-system-ppc -M bamboo \ -kernel arch/powerpc/boot/zImage \ -dtb arch/powerpc/boot/dts/bamboo.dtb \ -initrd ~/ppc32-440-rootfs.cpio \ -nographic -serial stdio -monitor pty -append "console=ttyS0" Link: https://github.com/ClangBuiltLinux/linux/issues/261 Link: https://bugs.llvm.org/show_bug.cgi?id=39556 Link: https://bugs.llvm.org/show_bug.cgi?id=39555 Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 20:53:11 +11:00
Madhavan Srinivasan	3757cba80a	powerpc/perf: Remove l2 bus events from HW cache event array Remove PM_L2_ST_MISS and PM_L2_ST from HW cache event array since these are bus events. And these needs to be programmed in groups. Hence remove them. Fixes: `f1fb60bfde` ('powerpc/perf: Export Power9 generic and cache events to sysfs') Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 20:53:11 +11:00
Madhavan Srinivasan	59029136d7	powerpc/perf: Add constraints for power9 l2/l3 bus events In previous generation processors, both bus events and direct events of performance monitoring unit can be individually programmabled and monitored in PMCs. But in Power9, L2/L3 bus events are always available as a "bank" of 4 events. To obtain the counts for any of the l2/l3 bus events in a given bank, the user will have to program PMC4 with corresponding l2/l3 bus event for that bank. Patch enforce two contraints incase of L2/L3 bus events. 1)Any L2/L3 event when programmed is also expected to program corresponding PMC4 event from that group. 2)PMC4 event should always been programmed first due to group constraint logic limitation For ex. consider these L3 bus events PM_L3_PF_ON_CHIP_MEM (0x460A0), PM_L3_PF_MISS_L3 (0x160A0), PM_L3_CO_MEM (0x260A0), PM_L3_PF_ON_CHIP_CACHE (0x360A0), 1) This is an INVALID group for L3 Bus event monitoring, since it is missing PMC4 event. perf stat -e "{r160A0,r260A0,r360A0}" < > And this is a VALID group for L3 Bus events: perf stat -e "{r460A0,r160A0,r260A0,r360A0}" < > 2) This is an INVALID group for L3 Bus event monitoring, since it is missing PMC4 event. perf stat -e "{r260A0,r360A0}" < > And this is a VALID group for L3 Bus events: perf stat -e "{r460A0,r260A0,r360A0}" < > 3) This is an INVALID group for L3 Bus event monitoring, since it is missing PMC4 event. perf stat -e "{r360A0}" < > And this is a VALID group for L3 Bus events: perf stat -e "{r460A0,r360A0}" < > Patch here implements group constraint logic suggested by Michael Ellerman. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 20:53:11 +11:00
Madhavan Srinivasan	2d46d4877b	powerpc/perf: Fix unit_sel/cache_sel checks Raw event code has couple of fields "unit" and "cache" in it, to capture the "unit" to monitor for a given pmcxsel and cache reload qualifier to program in MMCR1. isa207_get_constraint() refers "unit" field to update the MMCRC (L2/L3) Event bus control fields with "cache" bits of the raw event code. These are power8 specific and not supported by PowerISA v3.0 pmu. So wrap the checks to be power8 specific. Also, "cache" bit field is referred to update MMCR1[16:17] and this check can be power8 specific. Fixes: `7ffd948fae` ('powerpc/perf: factor out power8 pmu functions') Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 20:53:11 +11:00
Madhavan Srinivasan	8c31459d61	powerpc/perf: Cleanup cache_sel bits comment Update the raw event code comment in power9-pmu.c with respect to "cache" bits, since power9 MMCRC does not support these. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 20:53:11 +11:00
Madhavan Srinivasan	333804dc3b	powerpc/perf: Update perf_regs structure to include SIER On each sample, Sample Instruction Event Register (SIER) content is saved in pt_regs. SIER does not have a entry as-is in the pt_regs but instead, SIER content is saved in the "dar" register of pt_regs. Patch adds another entry to the perf_regs structure to include the "SIER" printing which internally maps to the "dar" of pt_regs. It also check for the SIER availability in the platform and present value accordingly Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 20:53:11 +11:00
Madhavan Srinivasan	17cfccc915	powerpc/perf: Fix thresholding counter data for unknown type MMCRA[34:36] and MMCRA[38:44] expose the thresholding counter value. Thresholding counter can be used to count latency cycles such as load miss to reload. But threshold counter value is not relevant when the sampled instruction type is unknown or reserved. Patch to fix the thresholding counter value to zero when sampled instruction type is unknown or reserved. Fixes: 170a315f41c6('powerpc/perf: Support to export MMCRA[TEC*] field to userspace') Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 20:53:11 +11:00
Aneesh Kumar K.V	374f3f5979	powerpc/mm/hash: Handle user access of kernel address gracefully In commit `2865d08dd9` ("powerpc/mm: Move the DSISR_PROTFAULT sanity check") we moved the protection fault access check before the vma lookup. That means we hit that WARN_ON when user space accesses a kernel address. Before that commit this was handled by find_vma() not finding vma for the kernel address and considering that access as bad area access. Avoid the confusing WARN_ON and convert that to a ratelimited printk. With the patch we now get: for load: a.out[5997]: User access of kernel address (c00000000000dea0) - exploit attempt? (uid: 1000) a.out[5997]: segfault (11) at c00000000000dea0 nip 1317c0798 lr 7fff80d6441c code 1 in a.out[1317c0000+10000] a.out[5997]: code: 60000000 60420000 3c4c0002 38427790 4bffff20 3c4c0002 38427784 fbe1fff8 a.out[5997]: code: f821ffc1 7c3f0b78 60000000 e9228030 <89290000> 993f002f 60000000 383f0040 for exec: a.out[6067]: User access of kernel address (c00000000000dea0) - exploit attempt? (uid: 1000) a.out[6067]: segfault (11) at c00000000000dea0 nip c00000000000dea0 lr 129d507b0 code 1 a.out[6067]: Bad NIP, not dumping instructions. Fixes: `2865d08dd9` ("powerpc/mm: Move the DSISR_PROTFAULT sanity check") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Tested-by: Breno Leitao <leitao@debian.org> [mpe: Don't split printk() string across lines] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-20 20:52:54 +11:00
Joerg Roedel	03ebe48e23	Merge branches 'iommu/fixes', 'arm/renesas', 'arm/mediatek', 'arm/tegra', 'arm/omap', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next	2018-12-20 10:05:20 +01:00
Christophe Leroy	385e89d5b2	powerpc/mm: add exec protection on powerpc 603 The 603 doesn't have a HASH table, TLB misses are handled by software. It is then possible to generate page fault when _PAGE_EXEC is not set like in nohash/32. There is one "reserved" PTE bit available, this patch uses it for _PAGE_EXEC. In order to support it, set_pte_filter() and set_access_flags_filter() are made common, and the handling is made dependent on MMU_FTR_HPTE_TABLE Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	badb9687ce	powerpc/mm: define an empty slice_init_new_context_exec() Define slice_init_new_context_exec() at all time to avoid Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	05a4ab8239	powerpc/uaccess: fix warning/error with access_ok() With the following piece of code, the following compilation warning is encountered: if (_IOC_DIR(ioc) != _IOC_NONE) { int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ; if (!access_ok(verify, ioarg, _IOC_SIZE(ioc))) { drivers/platform/test/dev.c: In function 'my_ioctl': drivers/platform/test/dev.c:219:7: warning: unused variable 'verify' [-Wunused-variable] int verify = _IOC_DIR(ioc) & _IOC_READ ? VERIFY_WRITE : VERIFY_READ; This patch fixes it by referencing 'type' in the macro allthough doing nothing with it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	c62ce9ef97	powerpc: remove remaining bits from CONFIG_APUS commit `f21f49ea63` ("[POWERPC] Remove the dregs of APUS support from arch/powerpc") removed CONFIG_APUS, but forgot to remove the logic which adapts tophys() and tovirt() for it. This patch removes the last stale pieces. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	32c8c4c621	powerpc/xmon: fix dump_segments() mfsrin() takes segment num from bits 31-28 (IBM bits 0-3). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Clarify bit numbering] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	0ed5b55884	powerpc/8xx: add exception frame marker This patch adds STACK_FRAME_REGS_MARKER in the stack at exception entry in order to see interrupts in call traces as below: [ 0.013964] Call Trace: [ 0.014014] [c0745db0] [c007a9d4] tick_periodic.constprop.5+0xd8/0x104 (unreliable) [ 0.014086] [c0745dc0] [c007aa20] tick_handle_periodic+0x20/0x9c [ 0.014181] [c0745de0] [c0009cd0] timer_interrupt+0xa0/0x264 [ 0.014258] [c0745e10] [c000e484] ret_from_except+0x0/0x14 [ 0.014390] --- interrupt: 901 at console_unlock.part.7+0x3f4/0x528 [ 0.014390] LR = console_unlock.part.7+0x3f0/0x528 [ 0.014455] [c0745ee0] [c0050334] console_unlock.part.7+0x114/0x528 (unreliable) [ 0.014542] [c0745f30] [c00524e0] register_console+0x3d8/0x44c [ 0.014625] [c0745f60] [c0675aac] cpm_uart_console_init+0x18/0x2c [ 0.014709] [c0745f70] [c06614f4] console_init+0x114/0x1cc [ 0.014795] [c0745fb0] [c0658b68] start_kernel+0x300/0x3d8 [ 0.014864] [c0745ff0] [c00022cc] start_here+0x44/0x98 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	e93ba1b7eb	powerpc/book3s/32: fix number of bats in p/v_block_mapped() This patch fixes the loop in p_block_mapped() and v_block_mapped() to scan the entire bat_addrs[] array. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	712877f874	powerpc/mm: Eliminate not possible mmu features at compile time Depending on the CONFIG selected, many of the MMU features are not possible. Lets only get the possible ones in MMU_FTRS_POSSIBLE. This allows gcc to get rid at compile time of code related to not possible features. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	8a01960fb5	powerpc/smp: Use code patching to restore reset vector Instead of hardcoding reset vector restore, use patch_instruction() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	6c16816b91	powerpc/44x: use patch_sites for TLB handlers patching Use patch sites and associated helpers to manage TLB handlers patching instead of hardcoding. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	d16952a629	powerpc/signal: Use code patching instead of hardcoding Instead of hardcoding code modifications, use code patching functions. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	002cdfc2c7	powerpc/8xx: use modify_instruction_site() Instead of hardcoding the TLB handlers patching, use the newly created modify_instruction_site() helper. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	9efc74ff52	powerpc/book3s/32: Use patch_site to patch hash functions Use patch_sites and the new modify_instruction_site() function instead of hardcoding hash functions patching. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	4a3a224c5a	powerpc/book3s/32: Use MMU_FTR_HPTE_TABLE in head_32.S Instead of manually patching a blr at hash_page() entry in MMU_init_hw(), this patch adds a features section in head_32.S Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	04b0a72f28	powerpc/32: use patch_site_addr() in machine_init() Use patch_site_addr() instead of hardcoding the address calculation in machine_init() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	36b08b431e	powerpc: add modify_instruction() and modify_instruction_site() Add two helpers to avoid hardcoding of instructions modifications. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	45090c2661	powerpc: simplify patch_instruction_site() and patch_branch_site() Using patch_site_addr() helper, patch_instruction_site() and patch_branch_site() can be simplified and inlined. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	584dbc7727	powerpc/mm: remove unused variable In file included from ./include/linux/hugetlb.h:445:0, from arch/powerpc/kernel/setup-common.c:37: ./arch/powerpc/include/asm/hugetlb.h: In function ‘huge_ptep_clear_flush’: ./arch/powerpc/include/asm/hugetlb.h:154:8: error: variable ‘pte’ set but not used [-Werror=unused-but-set-variable] Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:32 +11:00
Christophe Leroy	6bf752daca	powerpc: implement CONFIG_DEBUG_VIRTUAL This patch implements CONFIG_DEBUG_VIRTUAL to warn about incorrect use of virt_to_phys() and page_to_phys() Below is the result of test_debug_virtual: [ 1.438746] WARNING: CPU: 0 PID: 1 at ./arch/powerpc/include/asm/io.h:808 test_debug_virtual_init+0x3c/0xd4 [ 1.448156] CPU: 0 PID: 1 Comm: swapper Not tainted 4.20.0-rc5-00560-g6bfb52e23a00-dirty #532 [ 1.457259] NIP: c066c550 LR: c0650ccc CTR: c066c514 [ 1.462257] REGS: c900bdb0 TRAP: 0700 Not tainted (4.20.0-rc5-00560-g6bfb52e23a00-dirty) [ 1.471184] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 48000422 XER: 20000000 [ 1.477811] [ 1.477811] GPR00: c0650ccc c900be60 c60d0000 00000000 006000c0 c9000000 00009032 c7fa0020 [ 1.477811] GPR08: 00002400 00000001 09000000 00000000 c07b5d04 00000000 c00037d8 00000000 [ 1.477811] GPR16: 00000000 00000000 00000000 00000000 c0760000 c0740000 00000092 c0685bb0 [ 1.477811] GPR24: c065042c c068a734 c0685b8c 00000006 00000000 c0760000 c075c3c0 ffffffff [ 1.512711] NIP [c066c550] test_debug_virtual_init+0x3c/0xd4 [ 1.518315] LR [c0650ccc] do_one_initcall+0x8c/0x1cc [ 1.523163] Call Trace: [ 1.525595] [c900be60] [c0567340] 0xc0567340 (unreliable) [ 1.530954] [c900be90] [c0650ccc] do_one_initcall+0x8c/0x1cc [ 1.536551] [c900bef0] [c0651000] kernel_init_freeable+0x1f4/0x2cc [ 1.542658] [c900bf30] [c00037ec] kernel_init+0x14/0x110 [ 1.547913] [c900bf40] [c000e1d0] ret_from_kernel_thread+0x14/0x1c [ 1.553971] Instruction dump: [ 1.556909] 3ca50100 bfa10024 54a5000e 3fa0c076 7c0802a6 3d454000 813dc204 554893be [ 1.564566] 7d294010 7d294910 90010034 39290001 <0f090000> 7c3e0b78 955e0008 3fe0c062 [ 1.572425] ---[ end trace 6f6984225b280ad6 ]--- [ 1.577467] PA: 0x09000000 for VA: 0xc9000000 [ 1.581799] PA: 0x061e8f50 for VA: 0xc61e8f50 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-19 18:56:26 +11:00
Mathieu Malaterre	ebd1d3b74f	powerpc/32: Move the old 6xx -mcpu logic before the TARGET_CPU logic The code: ifdef CONFIG_6xx KBUILD_CFLAGS += -mcpu=powerpc endif was added in 2006 in commit `f48b8296b3` ("[PATCH] powerpc32: Set cpu explicitly in kernel compiles"). This change was acceptable since the TARGET_CPU logic was 64-bit only. Since commit `0e00a8c9fd` ("powerpc: Allow CPU selection also on PPC32") this logic is no longer acceptable after the TARGET_CPU specific. It currently appends -mcpu=powerpc at the end of the command line, after any TARGET_CPU specific: gcc -Wp,-MD,init/.do_mounts.o.d ... -mcpu=powerpc -mbig-endian -m32 ... -mcpu=e300c2 ... -mcpu=powerpc ... ../init/do_mounts.c Fixes: `0e00a8c9fd` ("powerpc: Allow CPU selection also on PPC32") Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-17 22:12:30 +11:00
Michael Ellerman	c7e900c05b	powerpc/ipic: Remove unused ipic_set_priority() ipic_set_priority() has been unused since 2006 when the last usage was removed in commit `b9f0f1bb2b` ("[POWERPC] Adapt ipic driver to new host_ops interface, add set_irq_type to set IRQ sense"). Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-17 22:12:30 +11:00
Michael Ellerman	4d6a198273	Merge branch 'fixes' into next Merge our fixes branch again, this has a couple of build fixes and also a change to do_syscall_trace_enter() that will conflict with a patch we want to apply in next.	2018-12-17 22:11:54 +11:00
Joerg Roedel	bf8763d8f8	powerpc/iommu: Use device_iommu_mapped() Use the new function to replace the open-coded iommu check. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell Currey <ruscur@russell.cc> Cc: Sam Bobroff <sbobroff@linux.ibm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2018-12-17 10:38:43 +01:00
Suraj Jitindar Singh	95d386c2d2	KVM: PPC: Book3S HV: Allow passthrough of an emulated device to an L3 guest Previously when a device was being emulated by an L1 guest for an L2 guest, that device couldn't then be passed through to an L3 guest. This was because the L1 guest had no method for accessing L3 memory. The hcall H_COPY_TOFROM_GUEST provides this access. Thus this setup for passthrough can now be allowed. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 11:33:50 +11:00
Suraj Jitindar Singh	6ff887b8bd	KVM: PPC: Book3S: Introduce new hcall H_COPY_TOFROM_GUEST to access quadrants 1 & 2 A guest cannot access quadrants 1 or 2 as this would result in an exception. Thus introduce the hcall H_COPY_TOFROM_GUEST to be used by a guest when it wants to perform an access to quadrants 1 or 2, for example when it wants to access memory for one of its nested guests. Also provide an implementation for the kvm-hv module. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 11:33:50 +11:00
Suraj Jitindar Singh	873db2cd9a	KVM: PPC: Book3S HV: Allow passthrough of an emulated device to an L2 guest Allow for a device which is being emulated at L0 (the host) for an L1 guest to be passed through to a nested (L2) guest. The existing kvmppc_hv_emulate_mmio function can be used here. The main challenge is that for a load the result must be stored into the L2 gpr, not an L1 gpr as would normally be the case after going out to qemu to complete the operation. This presents a challenge as at this point the L2 gpr state has been written back into L1 memory. To work around this we store the address in L1 memory of the L2 gpr where the result of the load is to be stored and use the new io_gpr value KVM_MMIO_REG_NESTED_GPR to indicate that this is a nested load for which completion must be done when returning back into the kernel. Then in kvmppc_complete_mmio_load() the resultant value is written into L1 memory at the location of the indicated L2 gpr. Note that we don't currently let an L1 guest emulate a device for an L2 guest which is then passed through to an L3 guest. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 11:33:50 +11:00
Suraj Jitindar Singh	cc6929cc84	KVM: PPC: Update kvmppc_st and kvmppc_ld to use quadrants The functions kvmppc_st and kvmppc_ld are used to access guest memory from the host using a guest effective address. They do so by translating through the process table to obtain a guest real address and then using kvm_read_guest or kvm_write_guest to make the access with the guest real address. This method of access however only works for L1 guests and will give the incorrect results for a nested guest. We can however use the store_to_eaddr and load_from_eaddr kvmppc_ops to perform the access for a nested guesti (and a L1 guest). So attempt this method first and fall back to the old method if this fails and we aren't running a nested guest. At this stage there is no fall back method to perform the access for a nested guest and this is left as a future improvement. For now we will return to the nested guest and rely on the fact that a translation should be faulted in before retrying the access. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 11:33:50 +11:00
Suraj Jitindar Singh	dceadcf91b	KVM: PPC: Add load_from_eaddr and store_to_eaddr to the kvmppc_ops struct The kvmppc_ops struct is used to store function pointers to kvm implementation specific functions. Introduce two new functions load_from_eaddr and store_to_eaddr to be used to load from and store to a guest effective address respectively. Also implement these for the kvm-hv module. If we are using the radix mmu then we can call the functions to access quadrant 1 and 2. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 11:33:50 +11:00
Suraj Jitindar Singh	d7b4561522	KVM: PPC: Book3S HV: Implement functions to access quadrants 1 & 2 The POWER9 radix mmu has the concept of quadrants. The quadrant number is the two high bits of the effective address and determines the fully qualified address to be used for the translation. The fully qualified address consists of the effective lpid, the effective pid and the effective address. This gives then 4 possible quadrants 0, 1, 2, and 3. When accessing these quadrants the fully qualified address is obtained as follows: Quadrant \| Hypervisor \| Guest -------------------------------------------------------------------------- \| EA[0:1] = 0b00 \| EA[0:1] = 0b00 0 \| effLPID = 0 \| effLPID = LPIDR \| effPID = PIDR \| effPID = PIDR -------------------------------------------------------------------------- \| EA[0:1] = 0b01 \| 1 \| effLPID = LPIDR \| Invalid Access \| effPID = PIDR \| -------------------------------------------------------------------------- \| EA[0:1] = 0b10 \| 2 \| effLPID = LPIDR \| Invalid Access \| effPID = 0 \| -------------------------------------------------------------------------- \| EA[0:1] = 0b11 \| EA[0:1] = 0b11 3 \| effLPID = 0 \| effLPID = LPIDR \| effPID = 0 \| effPID = 0 -------------------------------------------------------------------------- In the Guest; Quadrant 3 is normally used to address the operating system since this uses effPID=0 and effLPID=LPIDR, meaning the PID register doesn't need to be switched. Quadrant 0 is normally used to address user space since the effLPID and effPID are taken from the corresponding registers. In the Host; Quadrant 0 and 3 are used as above, however the effLPID is always 0 to address the host. Quadrants 1 and 2 can be used by the host to address guest memory using a guest effective address. Since the effLPID comes from the LPID register, the host loads the LPID of the guest it would like to access (and the PID of the process) and can perform accesses to a guest effective address. This means quadrant 1 can be used to address the guest user space and quadrant 2 can be used to address the guest operating system from the hypervisor, using a guest effective address. Access to the quadrants can cause a Hypervisor Data Storage Interrupt (HDSI) due to being unable to perform partition scoped translation. Previously this could only be generated from a guest and so the code path expects us to take the KVM trampoline in the interrupt handler. This is no longer the case so we modify the handler to call bad_page_fault() to check if we were expecting this fault so we can handle it gracefully and just return with an error code. In the hash mmu case we still raise an unknown exception since quadrants aren't defined for the hash mmu. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 11:33:50 +11:00
Suraj Jitindar Singh	d232afebf9	KVM: PPC: Book3S HV: Add function kvmhv_vcpu_is_radix() There exists a function kvm_is_radix() which is used to determine if a kvm instance is using the radix mmu. However this only applies to the first level (L1) guest. Add a function kvmhv_vcpu_is_radix() which can be used to determine if the current execution context of the vcpu is radix, accounting for if the vcpu is running a nested guest. Currently all nested guests must be radix but this may change in the future. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 11:33:49 +11:00
Suraj Jitindar Singh	693ac10a88	KVM: PPC: Book3S: Only report KVM_CAP_SPAPR_TCE_VFIO on powernv machines The kvm capability KVM_CAP_SPAPR_TCE_VFIO is used to indicate the availability of in kernel tce acceleration for vfio. However it is currently the case that this is only available on a powernv machine, not for a pseries machine. Thus make this capability dependent on having the cpu feature CPU_FTR_HVMODE. [paulus@ozlabs.org - fixed compilation for Book E.] Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 11:33:49 +11:00
Paul Mackerras	5af3e9d06d	KVM: PPC: Book3S HV: Flush guest mappings when turning dirty tracking on/off This adds code to flush the partition-scoped page tables for a radix guest when dirty tracking is turned on or off for a memslot. Only the guest real addresses covered by the memslot are flushed. The reason for this is to get rid of any 2M PTEs in the partition-scoped page tables that correspond to host transparent huge pages, so that page dirtiness is tracked at a system page (4k or 64k) granularity rather than a 2M granularity. The page tables are also flushed when turning dirty tracking off so that the memslot's address space can be repopulated with THPs if possible. To do this, we add a new function kvmppc_radix_flush_memslot(). Since this does what's needed for kvmppc_core_flush_memslot_hv() on a radix guest, we now make kvmppc_core_flush_memslot_hv() call the new kvmppc_radix_flush_memslot() rather than calling kvm_unmap_radix() for each page in the memslot. This has the effect of fixing a bug in that kvmppc_core_flush_memslot_hv() was previously calling kvm_unmap_radix() without holding the kvm->mmu_lock spinlock, which is required to be held. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 10:58:51 +11:00
Paul Mackerras	c43c3a8683	KVM: PPC: Book3S HV: Cleanups - constify memslots, fix comments This adds 'const' to the declarations for the struct kvm_memory_slot pointer parameters of some functions, which will make it possible to call those functions from kvmppc_core_commit_memory_region_hv() in the next patch. This also fixes some comments about locking. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 10:58:43 +11:00
Paul Mackerras	f460f6791a	KVM: PPC: Book3S HV: Map single pages when doing dirty page logging For radix guests, this makes KVM map guest memory as individual pages when dirty page logging is enabled for the memslot corresponding to the guest real address. Having a separate partition-scoped PTE for each system page mapped to the guest means that we have a separate dirty bit for each page, thus making the reported dirty bitmap more accurate. Without this, if part of guest memory is backed by transparent huge pages, the dirty status is reported at a 2MB granularity rather than a 64kB (or 4kB) granularity for that part, causing userspace to have to transmit more data when migrating the guest. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 10:58:33 +11:00
Bharata B Rao	f032b73459	KVM: PPC: Pass change type down to memslot commit function Currently, kvm_arch_commit_memory_region() gets called with a parameter indicating what type of change is being made to the memslot, but it doesn't pass it down to the platform-specific memslot commit functions. This adds the `change' parameter to the lower-level functions so that they can use it in future. [paulus@ozlabs.org - fix book E also.] Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-17 10:57:27 +11:00
Linus Torvalds	4645453cef	powerpc fixes for 4.20 #4 One notable fix for our change to split pt_regs between user/kernel, we forgot to update BPF to use the user-visible type which was an ABI break for BPF programs. A slightly ugly but minimal fix to do_syscall_trace_enter() so that we use tracehook_report_syscall_entry() properly. We'll rework the code in next to avoid the empty if body. Seven commits fixing bugs in the new papr_scm (Storage Class Memory) driver. The driver was finally able to be tested on the other hypervisor which exposed several bugs. The fixes are all fairly minimal at least. Fix a crash in our MSI code if an MSI-capable device is plugged into a non-MSI capable PHB, only seen on older hardware (MPC8378). Fix our legacy serial code to look for "stdout-path" since the device trees were updated to use that instead of "linux,stdout-path". A change to the COFF zImage code to fix booting old powermacs. A couple of minor build fixes. Thanks to: Benjamin Herrenschmidt, Daniel Axtens, Dmitry V. Levin, Elvira Khabirova, Oliver O'Halloran, Paul Mackerras, Radu Rendec, Rob Herring, Sandipan Das. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJcE7YGAAoJEFHr6jzI4aWA/rAP/2k8NqgbkfKMdY/nHQ/+Tvwu 6EdKZ5dMi875380TomrP80mIyAAwwtN/c1MmJT7plZRCwYcuhGk4UnQjTRcNzrfK Q7SokdMAzSjSDaj7VPiVdpJ6yVaedZYmsRe9m+uP7frmcHr9KL4vjgDn3z8IL8bV I46dvjuSCvgWhFvhXkxf9PHIsC+sP4Z6JRNVhyBjlzQHZiGparq238H8jJVsNHRs f/fsze0z6m3S6dVKEZKZyDlCb3TkP+DjuXpv4hbT9nvnOY132kqO53L++7rQ2YvV TNkabwFfj+p/DnrXXH7OHNkvnDW4cy3KjhyStOnTH5lxYYVYNd5vnjj1AnXa37xE GCBFiHExEHbdG6vifwBmQjvoNRPf6Kh4RGYuRMm8ci7W0WUmq00LKZhxeLbwoou9 1K7lNp0JwLHgqOxCSy9liwNV7YQp1upvC/caQ1qE/ZISnUOXMLxomVfmGQYW51Xc 0/OXCdmacViA0RGatrGSScMO+CTNG0Wa3pm+fo/ufLahxspTDMVdJgs/kEfCAGeN 6BruUoEvm6wAOsPF8+zDumYOpUN9cD7tOQ573B2pMbJYIv3HJxRMDHYGBSijEgRK xy8BbQ0ZtxgfILjIuA95QfGQ5V35ZjugQfM4mt33GPrQ/qIV9SJ+/zZJNISSqaKI Y1xgZdJeVbECA9YYswRK =wI/G -----END PGP SIGNATURE----- Merge tag 'powerpc-4.20-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One notable fix for our change to split pt_regs between user/kernel, we forgot to update BPF to use the user-visible type which was an ABI break for BPF programs. A slightly ugly but minimal fix to do_syscall_trace_enter() so that we use tracehook_report_syscall_entry() properly. We'll rework the code in next to avoid the empty if body. Seven commits fixing bugs in the new papr_scm (Storage Class Memory) driver. The driver was finally able to be tested on the other hypervisor which exposed several bugs. The fixes are all fairly minimal at least. Fix a crash in our MSI code if an MSI-capable device is plugged into a non-MSI capable PHB, only seen on older hardware (MPC8378). Fix our legacy serial code to look for "stdout-path" since the device trees were updated to use that instead of "linux,stdout-path". A change to the COFF zImage code to fix booting old powermacs. A couple of minor build fixes. Thanks to: Benjamin Herrenschmidt, Daniel Axtens, Dmitry V. Levin, Elvira Khabirova, Oliver O'Halloran, Paul Mackerras, Radu Rendec, Rob Herring, Sandipan Das" * tag 'powerpc-4.20-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/ptrace: replace ptrace_report_syscall() with a tracehook call powerpc/mm: Fallback to RAM if the altmap is unusable powerpc/papr_scm: Use ibm,unit-guid as the iset cookie powerpc/papr_scm: Fix DIMM device registration race powerpc/papr_scm: Remove endian conversions powerpc/papr_scm: Update DT properties powerpc/papr_scm: Fix resource end address powerpc/papr_scm: Use depend instead of select powerpc/bpf: Fix broken uapi for BPF_PROG_TYPE_PERF_EVENT powerpc/boot: Fix build failures with -j 1 powerpc: Look for "stdout-path" when setting up legacy consoles powerpc/msi: Fix NULL pointer access in teardown code powerpc/mm: Fix linux page tables build with some configs powerpc: Fix COFF zImage booting on old powermacs	2018-12-14 09:33:34 -08:00
Paolo Bonzini	e5d83c74a5	kvm: make KVM_CAP_ENABLE_CAP_VM architecture agnostic The first such capability to be handled in virt/kvm/ will be manual dirty page reprotection. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-12-14 12:34:18 +01:00
Suraj Jitindar Singh	6142236cd9	KVM: PPC: Book3S PR: Set hflag to indicate that POWER9 supports 1T segments When booting a kvm-pr guest on a POWER9 machine the following message is observed: "qemu-system-ppc64: KVM does not support 1TiB segments which guest expects" This is because the guest is expecting to be able to use 1T segments however we don't indicate support for it. This is because we don't set the BOOK3S_HFLAG_MULTI_PGSIZE flag in the hflags in kvmppc_set_pvr_pr() on POWER9. POWER9 does indeed have support for 1T segments, so add a case for POWER9 to the switch statement to ensure it is set. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-14 15:39:47 +11:00
Yangtao Li	0f6ddf34be	KVM: PPC: Book3S HV: Change to use DEFINE_SHOW_ATTRIBUTE macro Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-14 15:39:47 +11:00
Paul Mackerras	234ff0b729	KVM: PPC: Book3S HV: Fix race between kvm_unmap_hva_range and MMU mode switch Testing has revealed an occasional crash which appears to be caused by a race between kvmppc_switch_mmu_to_hpt and kvm_unmap_hva_range_hv. The symptom is a NULL pointer dereference in __find_linux_pte() called from kvm_unmap_radix() with kvm->arch.pgtable == NULL. Looking at kvmppc_switch_mmu_to_hpt(), it does indeed clear kvm->arch.pgtable (via kvmppc_free_radix()) before setting kvm->arch.radix to NULL, and there is nothing to prevent kvm_unmap_hva_range_hv() or the other MMU callback functions from being called concurrently with kvmppc_switch_mmu_to_hpt() or kvmppc_switch_mmu_to_radix(). This patch therefore adds calls to spin_lock/unlock on the kvm->mmu_lock around the assignments to kvm->arch.radix, and makes sure that the partition-scoped radix tree or HPT is only freed after changing kvm->arch.radix. This also takes the kvm->mmu_lock in kvmppc_rmap_reset() to make sure that the clearing of each rmap array (one per memslot) doesn't happen concurrently with use of the array in the kvm_unmap_hva_range_hv() or the other MMU callbacks. Fixes: `18c3640cef` ("KVM: PPC: Book3S HV: Add infrastructure for running HPT guests on radix host") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2018-12-14 15:33:15 +11:00
Christoph Hellwig	55897af630	dma-direct: merge swiotlb_dma_ops into the dma_direct code While the dma-direct code is (relatively) clean and simple we actually have to use the swiotlb ops for the mapping on many architectures due to devices with addressing limits. Instead of keeping two implementations around this commit allows the dma-direct implementation to call the swiotlb bounce buffering functions and thus share the guts of the mapping implementation. This also simplified the dma-mapping setup on a few architectures where we don't have to differenciate which implementation to use. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com>	2018-12-13 21:06:17 +01:00
Christoph Hellwig	7249c1a52d	dma-mapping: move various slow path functions out of line There is no need to have all setup and coherent allocation / freeing routines inline. Move them out of line to keep the implemeation nicely encapsulated and save some kernel text size. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com>	2018-12-13 21:06:10 +01:00
David S. Miller	addb067983	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-12-11 The following pull-request contains BPF updates for your net-next tree. It has three minor merge conflicts, resolutions: 1) tools/testing/selftests/bpf/test_verifier.c Take first chunk with alignment_prevented_execution. 2) net/core/filter.c [...] case bpf_ctx_range_ptr(struct __sk_buff, flow_keys): case bpf_ctx_range(struct __sk_buff, wire_len): return false; [...] 3) include/uapi/linux/bpf.h Take the second chunk for the two cases each. The main changes are: 1) Add support for BPF line info via BTF and extend libbpf as well as bpftool's program dump to annotate output with BPF C code to facilitate debugging and introspection, from Martin. 2) Add support for BPF_ALU \| BPF_ARSH \| BPF_{K,X} in interpreter and all JIT backends, from Jiong. 3) Improve BPF test coverage on archs with no efficient unaligned access by adding an "any alignment" flag to the BPF program load to forcefully disable verifier alignment checks, from David. 4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz. 5) Extend tc BPF programs to use a new __sk_buff field called wire_len for more accurate accounting of packets going to wire, from Petar. 6) Improve bpftool to allow dumping the trace pipe from it and add several improvements in bash completion and map/prog dump, from Quentin. 7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for kernel addresses and add a dedicated BPF JIT backend allocator, from Ard. 8) Add a BPF helper function for IR remotes to report mouse movements, from Sean. 9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info member naming consistent with existing conventions, from Yonghong and Song. 10) Misc cleanups and improvements in allowing to pass interface name via cmdline for xdp1 BPF example, from Matteo. 11) Fix a potential segfault in BPF sample loader's kprobes handling, from Daniel T. 12) Fix SPDX license in libbpf's README.rst, from Andrey. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-10 18:00:43 -08:00
David S. Miller	4cc1feeb6f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several conflicts, seemingly all over the place. I used Stephen Rothwell's sample resolutions for many of these, if not just to double check my own work, so definitely the credit largely goes to him. The NFP conflict consisted of a bug fix (moving operations past the rhashtable operation) while chaning the initial argument in the function call in the moved code. The net/dsa/master.c conflict had to do with a bug fix intermixing of making dsa_master_set_mtu() static with the fixing of the tagging attribute location. cls_flower had a conflict because the dup reject fix from Or overlapped with the addition of port range classifiction. __set_phy_supported()'s conflict was relatively easy to resolve because Andrew fixed it in both trees, so it was just a matter of taking the net-next copy. Or at least I think it was :-) Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup() intermixed with changes on how the sdif and caller_net are calculated in these code paths in net-next. The remaining BPF conflicts were largely about the addition of the __bpf_md_ptr stuff in 'net' overlapping with adjustments and additions to the relevant data structure where the MD pointer macros are used. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-09 21:43:31 -08:00
Elvira Khabirova	a225f15674	powerpc/ptrace: replace ptrace_report_syscall() with a tracehook call Arch code should use tracehook_*() helpers, as documented in include/linux/tracehook.h, ptrace_report_syscall() is not expected to be used outside that file. The patch does not look very nice, but at least it is correct and opens the way for PTRACE_GET_SYSCALL_INFO API. Co-authored-by: Dmitry V. Levin <ldv@altlinux.org> Fixes: `5521eb4bca` ("powerpc/ptrace: Add support for PTRACE_SYSEMU") Signed-off-by: Elvira Khabirova <lineprinter@altlinux.org> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> [mpe: Take this as a minimal fix for 4.20, we'll rework it later] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-10 15:19:58 +11:00
Linus Torvalds	d48f782e4f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "A decent batch of fixes here. I'd say about half are for problems that have existed for a while, and half are for new regressions added in the 4.20 merge window. 1) Fix 10G SFP phy module detection in mvpp2, from Baruch Siach. 2) Revert bogus emac driver change, from Benjamin Herrenschmidt. 3) Handle BPF exported data structure with pointers when building 32-bit userland, from Daniel Borkmann. 4) Memory leak fix in act_police, from Davide Caratti. 5) Check RX checksum offload in RX descriptors properly in aquantia driver, from Dmitry Bogdanov. 6) SKB unlink fix in various spots, from Edward Cree. 7) ndo_dflt_fdb_dump() only works with ethernet, enforce this, from Eric Dumazet. 8) Fix FID leak in mlxsw driver, from Ido Schimmel. 9) IOTLB locking fix in vhost, from Jean-Philippe Brucker. 10) Fix SKB truesize accounting in ipv4/ipv6/netfilter frag memory limits otherwise namespace exit can hang. From Jiri Wiesner. 11) Address block parsing length fixes in x25 from Martin Schiller. 12) IRQ and ring accounting fixes in bnxt_en, from Michael Chan. 13) For tun interfaces, only iface delete works with rtnl ops, enforce this by disallowing add. From Nicolas Dichtel. 14) Use after free in liquidio, from Pan Bian. 15) Fix SKB use after passing to netif_receive_skb(), from Prashant Bhole. 16) Static key accounting and other fixes in XPS from Sabrina Dubroca. 17) Partially initialized flow key passed to ip6_route_output(), from Shmulik Ladkani. 18) Fix RTNL deadlock during reset in ibmvnic driver, from Thomas Falcon. 19) Several small TCP fixes (off-by-one on window probe abort, NULL deref in tail loss probe, SNMP mis-estimations) from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits) net/sched: cls_flower: Reject duplicated rules also under skip_sw bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips. bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips. bnxt_en: Keep track of reserved IRQs. bnxt_en: Fix CNP CoS queue regression. net/mlx4_core: Correctly set PFC param if global pause is turned off. Revert "net/ibm/emac: wrong bit is used for STA control" neighbour: Avoid writing before skb->head in neigh_hh_output() ipv6: Check available headroom in ip6_xmit() even without options tcp: lack of available data can also cause TSO defer ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl mlxsw: spectrum_router: Relax GRE decap matching check mlxsw: spectrum_switchdev: Avoid leaking FID's reference count mlxsw: spectrum_nve: Remove easily triggerable warnings ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes sctp: frag_point sanity check tcp: fix NULL ref in tail loss probe tcp: Do not underestimate rwnd_limited net: use skb_list_del_init() to remove from RX sublists ...	2018-12-09 15:12:33 -08:00
Masahiro Yamada	63fea0af43	x86, powerpc: Remove -funit-at-a-time compiler option entirely GCC 4.6 manual says: -funit-at-a-time This option is left for compatibility reasons. -funit-at-a-time has no effect, while -fno-unit-at-a-time implies -fno-toplevel-reorder and -fno-section-anchors. Enabled by default. Remove it. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Weinberger <richard@sigma-star.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/1541990120-9643-3-git-send-email-yamada.masahiro@socionext.com	2018-12-09 11:55:32 +01:00
Oliver O'Halloran	9ef34630a4	powerpc/mm: Fallback to RAM if the altmap is unusable The "altmap" is used to provide a pool of memory that is reserved for the vmemmap backing of hot-plugged memory. This is useful when adding large amount of ZONE_DEVICE memory to a system with a limited amount of normal memory. On ppc64 we use huge pages to map the vmemmap which requires the backing storage to be contigious and aligned to the hugepage size. The altmap implementation allows for the altmap provider to reserve a few PFNs at the start of the range for it's own uses and when this occurs the first chunk of the altmap is not usable for hugepage mappings. On hash there is no sane way to fall back to a normal sized page mapping so we fail the allocation. This results in memory hotplug failing with ENOMEM when the new range doesn't fall into an existing vmemmap block. This patch handles this case by falling back to using system memory rather than failing if we cannot allocate from the altmap. This fallback should only ever be used for the first vmemmap block so it should not cause excess memory consumption. Fixes: `7b73d978a5` ("mm: pass the vmem_altmap to vmemmap_populate") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-09 21:33:21 +11:00
Oliver O'Halloran	43001c52b6	powerpc/papr_scm: Use ibm,unit-guid as the iset cookie The interleave set cookie is used to determine if a label stored in the metadata space should be applied to the current region. This is important in the case of NVDIMMs since the firmware may change the interleaving configuration of a DIMM which would invalidate the existing labels. In our case the hypervisor hides those details from us so we don't really care, but libnvdimm still requires the interleave set cookie to be non-zero. For our purposes we just need the set cookie to be unique and fixed for a given PAPR SCM region and using the unit-guid (really a UUID) is fine for this purpose. Fixes: `b5beae5e22` ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Use kernel types (u64)] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-09 21:32:51 +11:00
Oliver O'Halloran	b0d65a8cbc	powerpc/papr_scm: Fix DIMM device registration race When a new nvdimm device is registered with libnvdimm via nvdimm_create() it is added as a device on the nvdimm bus. The probe function for the DIMM driver is potentially quite slow so actually registering and probing the device is done in an async domain rather than immediately after device creation. This can result in a race where the region device (created 2nd) is probed first and fails to activate at boot. To fix this we use the same approach as the ACPI/NFIT driver which is to check that all the DIMM devices registered successfully. LibNVDIMM provides the nvdimm_bus_count_dimms() function which synchronises with the async domain and verifies that the dimm was successfully registered with the bus. If either of these does not occur then we bail. Fixes: `b5beae5e22` ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-09 21:32:30 +11:00
Oliver O'Halloran	409dd7dc83	powerpc/papr_scm: Remove endian conversions The return values of a h-call are returned in the CPU registers and written to the provided buffer by the plpar_hcall() wrapper. As a result the values written to memory are always in the native endian and should not be byte swapped. The inital implementation of the H-Call interface was done in qemu and the returned values were byte swapped unnecessarily in both the hypervisor and in the driver so this was only noticed when bringing up the PowerVM implementation. Fixes: `b5beae5e22` ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-09 21:32:30 +11:00
Oliver O'Halloran	683ec0e04a	powerpc/papr_scm: Update DT properties The ibm,unit-sizes property was originally specified as an array of two u32s corresponding to the memory block size, and the number of blocks available in that region. A fairly last-minute change to the SCM DT specification was splitting that into two seperate u64 properties: ibm,block-sizes and ibm,number-of-blocks that convey the same information. No firmware / hypervisor that emitted the ibm,unit-size property ever appeared in the wild. Fixes: `b5beae5e22` ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Use kernel types (u32/u64)] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-09 21:32:16 +11:00
Jiong Wang	44cf43c04b	ppc: bpf: implement jitting of BPF_ALU \| BPF_ARSH \| BPF_* This patch implements code-gen for BPF_ALU \| BPF_ARSH \| BPF_*. Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-12-07 13:30:48 -08:00
Oliver O'Halloran	5961352611	powerpc/papr_scm: Fix resource end address Fix an off-by-one error in the memory resource range. This resource is used to determine the address range of the memory to be hot-plugged as ZONE_DEVICE memory. The current end address results in the kernel attempting to map an additional memblock and the hypervisor may reject the mapping resulting in the entire hot-plug failing. Fixes: `b5beae5e22` ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-07 23:32:02 +11:00
Oliver O'Halloran	14ebfec071	powerpc/papr_scm: Use depend instead of select Making PAPR_SCM select LIBNVDIMM results in circular dependencies in Kconfig when another symbol depends on it. Fix this by replacing the select with a depends. Fixes: `b5beae5e22` ("powerpc/pseries: Add driver for PAPR SCM regions") Reported-by: Alastair D'Silva <alastair@d-silva.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-07 23:32:01 +11:00
Sandipan Das	a6460b03f9	powerpc/bpf: Fix broken uapi for BPF_PROG_TYPE_PERF_EVENT Now that there are different variants of pt_regs for userspace and kernel, the uapi for the BPF_PROG_TYPE_PERF_EVENT program type must be changed by exporting the user_pt_regs structure instead of the pt_regs structure that is in-kernel only. Fixes: `002af9391b` ("powerpc: Split user/kernel definitions of struct pt_regs") Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-07 23:19:04 +11:00
Christoph Hellwig	7c703e54cc	arch: switch the default on ARCH_HAS_SG_CHAIN These days architectures are mostly out of the business of dealing with struct scatterlist at all, unless they have architecture specific iommu drivers. Replace the ARCH_HAS_SG_CHAIN symbol with a ARCH_NO_SG_CHAIN one only enabled for architectures with horrible legacy iommu drivers like alpha and parisc, and conditionally for arm which wants to keep it disable for legacy platforms. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Palmer Dabbelt <palmer@sifive.com>	2018-12-06 07:04:56 -08:00
Christoph Hellwig	d11e3d3d03	powerpc/iommu: remove the mapping_error dma_map_ops method The powerpc iommu code already returns (~(dma_addr_t)0x0) on mapping failures, so we can switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping code handle the rest. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-06 06:56:38 -08:00
Christoph Hellwig	b0cbeae494	dma-direct: remove the mapping_error dma_map_ops method The dma-direct code already returns (~(dma_addr_t)0x0) on mapping failures, so we can switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping code handle the rest. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-06 06:56:36 -08:00
AKASHI Takahiro	735c2f90e3	powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem() Memblock list is another source for usable system memory layout. So move powerpc's arch_kexec_walk_mem() to common code so that other memblock-based architectures, particularly arm64, can also utilise it. A moved function is now renamed to kexec_walk_memblock() and integrated into kexec_locate_mem_hole(), which will now be usable for all architectures with no need for overriding arch_kexec_walk_mem(). With this change, arch_kexec_walk_mem() need no longer be a weak function, and was now renamed to kexec_walk_resources(). Since powerpc doesn't support kdump in its kexec_file_load(), the current kexec_walk_memblock() won't work for kdump either in this form, this will be fixed in the next patch. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-12-06 14:38:50 +00:00
Michael Ellerman	e41b93a6be	powerpc/boot: Fix build failures with -j 1 In commit `5e9dcb6188` ("powerpc/boot: Expose Kconfig symbols to wrapper") we added a dependency to serial.c on autoconf.h: $(obj)/serial.c: $(obj)/autoconf.h This works when building in-tree (ie. with KBUILD_OUTPUT unset) because the obj tree is the src tree. But when building with eg. O=build and -j 1 the build fails: gcc ... -I../arch/powerpc/boot -c -o arch/powerpc/boot/serial.o arch/powerpc/boot/serial.c gcc: error: arch/powerpc/boot/serial.c: No such file or directory Why this is only happening with -j 1 is not clear, when building with -j greater than 1 somehow we decide to look for serial.c in the src tree (../), eg: gcc -I../arch/powerpc/boot -c -o arch/powerpc/boot/serial.o ../arch/powerpc/boot/serial.c Regardless we shouldn't be specifying a dependency on serial.c in the build tree, we want to add a dependency to the version in $(srctree) so fix the rule to say that. Fixes: `5e9dcb6188` ("powerpc/boot: Expose Kconfig symbols to wrapper") Tested-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-06 16:10:15 +11:00
David S. Miller	e37d05a538	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Alexei Starovoitov says: ==================== pull-request: bpf 2018-12-05 The following pull-request contains BPF updates for your net tree. The main changes are: 1) fix bpf uapi pointers for 32-bit architectures, from Daniel. 2) improve verifer ability to handle progs with a lot of branches, from Alexei. 3) strict btf checks, from Yonghong. 4) bpf_sk_lookup api cleanup, from Joe. 5) other misc fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-05 16:30:30 -08:00
Christophe Leroy	7c91efce16	powerpc/mm: dump block address translation on book3s/32 This patch adds a debugfs file to dump block address translation: ~# cat /sys/kernel/debug/powerpc/block_address_translation ---[ Instruction Block Address Translations ]--- 0: - 1: - 2: 0xc0000000-0xcfffffff 0x00000000 Kernel EXEC coherent 3: 0xd0000000-0xdfffffff 0x10000000 Kernel EXEC coherent 4: - 5: - 6: - 7: - ---[ Data Block Address Translations ]--- 0: - 1: - 2: 0xc0000000-0xcfffffff 0x00000000 Kernel RW coherent 3: 0xd0000000-0xdfffffff 0x10000000 Kernel RW coherent 4: - 5: - 6: - 7: - Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-04 19:46:49 +11:00
Christophe Leroy	0261a508c9	powerpc/mm: dump segment registers on book3s/32 This patch creates a debugfs file to see content of segment registers # cat /sys/kernel/debug/segment_registers ---[ User Segments ]--- 0x00000000-0x0fffffff Kern key 1 User key 1 VSID 0xade2b0 0x10000000-0x1fffffff Kern key 1 User key 1 VSID 0xade3c1 0x20000000-0x2fffffff Kern key 1 User key 1 VSID 0xade4d2 0x30000000-0x3fffffff Kern key 1 User key 1 VSID 0xade5e3 0x40000000-0x4fffffff Kern key 1 User key 1 VSID 0xade6f4 0x50000000-0x5fffffff Kern key 1 User key 1 VSID 0xade805 0x60000000-0x6fffffff Kern key 1 User key 1 VSID 0xade916 0x70000000-0x7fffffff Kern key 1 User key 1 VSID 0xadea27 0x80000000-0x8fffffff Kern key 1 User key 1 VSID 0xadeb38 0x90000000-0x9fffffff Kern key 1 User key 1 VSID 0xadec49 0xa0000000-0xafffffff Kern key 1 User key 1 VSID 0xaded5a 0xb0000000-0xbfffffff Kern key 1 User key 1 VSID 0xadee6b ---[ Kernel Segments ]--- 0xc0000000-0xcfffffff Kern key 0 User key 1 VSID 0x000ccc 0xd0000000-0xdfffffff Kern key 0 User key 1 VSID 0x000ddd 0xe0000000-0xefffffff Kern key 0 User key 1 VSID 0x000eee 0xf0000000-0xffffffff Kern key 0 User key 1 VSID 0x000fff Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Move it under /sys/kernel/debug/powerpc, make sr_init() __init] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2018-12-04 19:45:54 +11:00

... 4 5 6 7 8 ...

19770 Commits