mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-12-15 06:55:13 +08:00
Linux 5.3-rc7
-----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl1tSg4eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG018IAJGV7SbXggW/iC+e cSMlo8kPnuU7dKCUW+ngXnZY1xuDYWPhXMX9+yDYf2NfMYGdDGYZ+GRjSFim816w HsNsovnYiyxhkh+wA/DmZPWKdTgYrIxbPRO+MlO5ZfbxWNaLgSjqirz0iBITSv3S r2XLmFw8GVACv/GkNGrWBM53wpkJLHzvwaV9hg6dr8HFDipaEn7vEY9/LAN3S3fw reVwW6Q4N4+RSofM1eIGgAZsTYbYBDfri94mRQZ3y+Q8EkRGkJ270WKA0OAVFYS7 KA6nrjvGSYVtmDK3HORjbINQn3bXwIKeMZHl15c+LGM9ePwoHbsN3+smBswRX+R3 JDQjkhY= =DV37 -----END PGP SIGNATURE----- Merge tag 'v5.3-rc7' into x86/mm, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit is contained in:
commit
ae1ad26388
3
.gitignore
vendored
3
.gitignore
vendored
@ -142,3 +142,6 @@ x509.genkey
|
||||
|
||||
# Kdevelop4
|
||||
*.kdev4
|
||||
|
||||
# Clang's compilation database file
|
||||
/compile_commands.json
|
||||
|
8
.mailmap
8
.mailmap
@ -64,6 +64,9 @@ Dengcheng Zhu <dzhu@wavecomp.com> <dengcheng.zhu@imgtec.com>
|
||||
Dengcheng Zhu <dzhu@wavecomp.com> <dczhu@mips.com>
|
||||
Dengcheng Zhu <dzhu@wavecomp.com> <dengcheng.zhu@gmail.com>
|
||||
Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
|
||||
Dmitry Safonov <0x7f454c46@gmail.com> <dsafonov@virtuozzo.com>
|
||||
Dmitry Safonov <0x7f454c46@gmail.com> <d.safonov@partner.samsung.com>
|
||||
Dmitry Safonov <0x7f454c46@gmail.com> <dima@arista.com>
|
||||
Domen Puncer <domen@coderock.org>
|
||||
Douglas Gilbert <dougg@torque.net>
|
||||
Ed L. Cashin <ecashin@coraid.com>
|
||||
@ -98,6 +101,7 @@ Jason Gunthorpe <jgg@ziepe.ca> <jgunthorpe@obsidianresearch.com>
|
||||
Javi Merino <javi.merino@kernel.org> <javi.merino@arm.com>
|
||||
<javier@osg.samsung.com> <javier.martinez@collabora.co.uk>
|
||||
Jean Tourrilhes <jt@hpl.hp.com>
|
||||
<jean-philippe@linaro.org> <jean-philippe.brucker@arm.com>
|
||||
Jeff Garzik <jgarzik@pretzel.yyz.us>
|
||||
Jeff Layton <jlayton@kernel.org> <jlayton@redhat.com>
|
||||
Jeff Layton <jlayton@kernel.org> <jlayton@poochiereds.net>
|
||||
@ -116,6 +120,7 @@ John Stultz <johnstul@us.ibm.com>
|
||||
Juha Yrjola <at solidboot.com>
|
||||
Juha Yrjola <juha.yrjola@nokia.com>
|
||||
Juha Yrjola <juha.yrjola@solidboot.com>
|
||||
Julien Thierry <julien.thierry.kdev@gmail.com> <julien.thierry@arm.com>
|
||||
Kay Sievers <kay.sievers@vrfy.org>
|
||||
Kenneth W Chen <kenneth.w.chen@intel.com>
|
||||
Konstantin Khlebnikov <koct9i@gmail.com> <k.khlebnikov@samsung.com>
|
||||
@ -132,6 +137,7 @@ Linus Lüssing <linus.luessing@c0d3.blue> <linus.luessing@ascom.ch>
|
||||
Li Yang <leoyang.li@nxp.com> <leo@zh-kernel.org>
|
||||
Li Yang <leoyang.li@nxp.com> <leoli@freescale.com>
|
||||
Maciej W. Rozycki <macro@mips.com> <macro@imgtec.com>
|
||||
Marc Zyngier <maz@kernel.org> <marc.zyngier@arm.com>
|
||||
Marcin Nowakowski <marcin.nowakowski@mips.com> <marcin.nowakowski@imgtec.com>
|
||||
Mark Brown <broonie@sirena.org.uk>
|
||||
Mark Yao <markyao0591@gmail.com> <mark.yao@rock-chips.com>
|
||||
@ -157,6 +163,8 @@ Matt Ranostay <mranostay@gmail.com> Matthew Ranostay <mranostay@embeddedalley.co
|
||||
Matt Ranostay <mranostay@gmail.com> <matt.ranostay@intel.com>
|
||||
Matt Ranostay <matt.ranostay@konsulko.com> <matt@ranostay.consulting>
|
||||
Matt Redfearn <matt.redfearn@mips.com> <matt.redfearn@imgtec.com>
|
||||
Maxime Ripard <mripard@kernel.org> <maxime.ripard@bootlin.com>
|
||||
Maxime Ripard <mripard@kernel.org> <maxime.ripard@free-electrons.com>
|
||||
Mayuresh Janorkar <mayur@ti.com>
|
||||
Michael Buesch <m@bues.ch>
|
||||
Michel Dänzer <michel@tungstengraphics.com>
|
||||
|
@ -9,7 +9,7 @@ Linux PCI Bus Subsystem
|
||||
:numbered:
|
||||
|
||||
pci
|
||||
picebus-howto
|
||||
pciebus-howto
|
||||
pci-iov-howto
|
||||
msi-howto
|
||||
acpi-info
|
||||
|
@ -403,7 +403,7 @@ That is, the recovery API only requires that:
|
||||
.. note::
|
||||
|
||||
Implementation details for the powerpc platform are discussed in
|
||||
the file Documentation/powerpc/eeh-pci-error-recovery.txt
|
||||
the file Documentation/powerpc/eeh-pci-error-recovery.rst
|
||||
|
||||
As of this writing, there is a growing list of device drivers with
|
||||
patches implementing error recovery. Not all of these patches are in
|
||||
@ -422,3 +422,6 @@ That is, the recovery API only requires that:
|
||||
- drivers/net/cxgb3
|
||||
- drivers/net/s2io.c
|
||||
- drivers/net/qlge
|
||||
|
||||
The End
|
||||
-------
|
||||
|
@ -1,7 +1,7 @@
|
||||
Using hlist_nulls to protect read-mostly linked lists and
|
||||
objects using SLAB_TYPESAFE_BY_RCU allocations.
|
||||
|
||||
Please read the basics in Documentation/RCU/listRCU.txt
|
||||
Please read the basics in Documentation/RCU/listRCU.rst
|
||||
|
||||
Using special makers (called 'nulls') is a convenient way
|
||||
to solve following problem :
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = 'Linux Kernel User Documentation'
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'linux-user.tex', 'Linux Kernel User Documentation',
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -41,10 +41,11 @@ Related CVEs
|
||||
|
||||
The following CVE entries describe Spectre variants:
|
||||
|
||||
============= ======================= =================
|
||||
============= ======================= ==========================
|
||||
CVE-2017-5753 Bounds check bypass Spectre variant 1
|
||||
CVE-2017-5715 Branch target injection Spectre variant 2
|
||||
============= ======================= =================
|
||||
CVE-2019-1125 Spectre v1 swapgs Spectre variant 1 (swapgs)
|
||||
============= ======================= ==========================
|
||||
|
||||
Problem
|
||||
-------
|
||||
@ -78,6 +79,13 @@ There are some extensions of Spectre variant 1 attacks for reading data
|
||||
over the network, see :ref:`[12] <spec_ref12>`. However such attacks
|
||||
are difficult, low bandwidth, fragile, and are considered low risk.
|
||||
|
||||
Note that, despite "Bounds Check Bypass" name, Spectre variant 1 is not
|
||||
only about user-controlled array bounds checks. It can affect any
|
||||
conditional checks. The kernel entry code interrupt, exception, and NMI
|
||||
handlers all have conditional swapgs checks. Those may be problematic
|
||||
in the context of Spectre v1, as kernel code can speculatively run with
|
||||
a user GS.
|
||||
|
||||
Spectre variant 2 (Branch Target Injection)
|
||||
-------------------------------------------
|
||||
|
||||
@ -132,6 +140,9 @@ not cover all possible attack vectors.
|
||||
1. A user process attacking the kernel
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Spectre variant 1
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
The attacker passes a parameter to the kernel via a register or
|
||||
via a known address in memory during a syscall. Such parameter may
|
||||
be used later by the kernel as an index to an array or to derive
|
||||
@ -144,7 +155,40 @@ not cover all possible attack vectors.
|
||||
potentially be influenced for Spectre attacks, new "nospec" accessor
|
||||
macros are used to prevent speculative loading of data.
|
||||
|
||||
Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
|
||||
Spectre variant 1 (swapgs)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
An attacker can train the branch predictor to speculatively skip the
|
||||
swapgs path for an interrupt or exception. If they initialize
|
||||
the GS register to a user-space value, if the swapgs is speculatively
|
||||
skipped, subsequent GS-related percpu accesses in the speculation
|
||||
window will be done with the attacker-controlled GS value. This
|
||||
could cause privileged memory to be accessed and leaked.
|
||||
|
||||
For example:
|
||||
|
||||
::
|
||||
|
||||
if (coming from user space)
|
||||
swapgs
|
||||
mov %gs:<percpu_offset>, %reg
|
||||
mov (%reg), %reg1
|
||||
|
||||
When coming from user space, the CPU can speculatively skip the
|
||||
swapgs, and then do a speculative percpu load using the user GS
|
||||
value. So the user can speculatively force a read of any kernel
|
||||
value. If a gadget exists which uses the percpu value as an address
|
||||
in another load/store, then the contents of the kernel value may
|
||||
become visible via an L1 side channel attack.
|
||||
|
||||
A similar attack exists when coming from kernel space. The CPU can
|
||||
speculatively do the swapgs, causing the user GS to get used for the
|
||||
rest of the speculative window.
|
||||
|
||||
Spectre variant 2
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
A spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
|
||||
target buffer (BTB) before issuing syscall to launch an attack.
|
||||
After entering the kernel, the kernel could use the poisoned branch
|
||||
target buffer on indirect jump and jump to gadget code in speculative
|
||||
@ -280,11 +324,18 @@ The sysfs file showing Spectre variant 1 mitigation status is:
|
||||
|
||||
The possible values in this file are:
|
||||
|
||||
======================================= =================================
|
||||
'Mitigation: __user pointer sanitation' Protection in kernel on a case by
|
||||
case base with explicit pointer
|
||||
sanitation.
|
||||
======================================= =================================
|
||||
.. list-table::
|
||||
|
||||
* - 'Not affected'
|
||||
- The processor is not vulnerable.
|
||||
* - 'Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers'
|
||||
- The swapgs protections are disabled; otherwise it has
|
||||
protection in the kernel on a case by case base with explicit
|
||||
pointer sanitation and usercopy LFENCE barriers.
|
||||
* - 'Mitigation: usercopy/swapgs barriers and __user pointer sanitization'
|
||||
- Protection in the kernel on a case by case base with explicit
|
||||
pointer sanitation, usercopy LFENCE barriers, and swapgs LFENCE
|
||||
barriers.
|
||||
|
||||
However, the protections are put in place on a case by case basis,
|
||||
and there is no guarantee that all possible attack vectors for Spectre
|
||||
@ -366,12 +417,27 @@ Turning on mitigation for Spectre variant 1 and Spectre variant 2
|
||||
1. Kernel mitigation
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Spectre variant 1
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
For the Spectre variant 1, vulnerable kernel code (as determined
|
||||
by code audit or scanning tools) is annotated on a case by case
|
||||
basis to use nospec accessor macros for bounds clipping :ref:`[2]
|
||||
<spec_ref2>` to avoid any usable disclosure gadgets. However, it may
|
||||
not cover all attack vectors for Spectre variant 1.
|
||||
|
||||
Copy-from-user code has an LFENCE barrier to prevent the access_ok()
|
||||
check from being mis-speculated. The barrier is done by the
|
||||
barrier_nospec() macro.
|
||||
|
||||
For the swapgs variant of Spectre variant 1, LFENCE barriers are
|
||||
added to interrupt, exception and NMI entry where needed. These
|
||||
barriers are done by the FENCE_SWAPGS_KERNEL_ENTRY and
|
||||
FENCE_SWAPGS_USER_ENTRY macros.
|
||||
|
||||
Spectre variant 2
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
For Spectre variant 2 mitigation, the compiler turns indirect calls or
|
||||
jumps in the kernel into equivalent return trampolines (retpolines)
|
||||
:ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` to go to the target
|
||||
@ -473,6 +539,12 @@ Mitigation control on the kernel command line
|
||||
Spectre variant 2 mitigation can be disabled or force enabled at the
|
||||
kernel command line.
|
||||
|
||||
nospectre_v1
|
||||
|
||||
[X86,PPC] Disable mitigations for Spectre Variant 1
|
||||
(bounds check bypass). With this option data leaks are
|
||||
possible in the system.
|
||||
|
||||
nospectre_v2
|
||||
|
||||
[X86] Disable all mitigations for the Spectre variant 2
|
||||
|
@ -2545,7 +2545,7 @@
|
||||
mem_encrypt=on: Activate SME
|
||||
mem_encrypt=off: Do not activate SME
|
||||
|
||||
Refer to Documentation/virtual/kvm/amd-memory-encryption.rst
|
||||
Refer to Documentation/virt/kvm/amd-memory-encryption.rst
|
||||
for details on when memory encryption can be activated.
|
||||
|
||||
mem_sleep_default= [SUSPEND] Default system suspend mode:
|
||||
@ -2604,7 +2604,7 @@
|
||||
expose users to several CPU vulnerabilities.
|
||||
Equivalent to: nopti [X86,PPC]
|
||||
kpti=0 [ARM64]
|
||||
nospectre_v1 [PPC]
|
||||
nospectre_v1 [X86,PPC]
|
||||
nobp=0 [S390]
|
||||
nospectre_v2 [X86,PPC,S390,ARM64]
|
||||
spectre_v2_user=off [X86]
|
||||
@ -2965,9 +2965,9 @@
|
||||
nosmt=force: Force disable SMT, cannot be undone
|
||||
via the sysfs control file.
|
||||
|
||||
nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds
|
||||
check bypass). With this option data leaks are possible
|
||||
in the system.
|
||||
nospectre_v1 [X86,PPC] Disable mitigations for Spectre Variant 1
|
||||
(bounds check bypass). With this option data leaks are
|
||||
possible in the system.
|
||||
|
||||
nospectre_v2 [X86,PPC_FSL_BOOK3E,ARM64] Disable all mitigations for
|
||||
the Spectre variant 2 (indirect branch prediction)
|
||||
@ -4090,6 +4090,13 @@
|
||||
Run specified binary instead of /init from the ramdisk,
|
||||
used for early userspace startup. See initrd.
|
||||
|
||||
rdrand= [X86]
|
||||
force - Override the decision by the kernel to hide the
|
||||
advertisement of RDRAND support (this affects
|
||||
certain AMD processors because of buggy BIOS
|
||||
support, specifically around the suspend/resume
|
||||
path).
|
||||
|
||||
rdt= [HW,X86,RDT]
|
||||
Turn on/off individual RDT features. List is:
|
||||
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
|
||||
|
@ -53,7 +53,7 @@ disabled, there is ``khugepaged`` daemon that scans memory and
|
||||
collapses sequences of basic pages into huge pages.
|
||||
|
||||
The THP behaviour is controlled via :ref:`sysfs <thp_sysfs>`
|
||||
interface and using madivse(2) and prctl(2) system calls.
|
||||
interface and using madvise(2) and prctl(2) system calls.
|
||||
|
||||
Transparent Hugepage Support maximizes the usefulness of free memory
|
||||
if compared to the reservation approach of hugetlbfs by allowing all
|
||||
|
@ -39,7 +39,6 @@ Table : Subdirectories in /proc/sys/net
|
||||
802 E802 protocol ax25 AX25
|
||||
ethernet Ethernet protocol rose X.25 PLP layer
|
||||
ipv4 IP version 4 x25 X.25 protocol
|
||||
ipx IPX token-ring IBM token ring
|
||||
bridge Bridging decnet DEC net
|
||||
ipv6 IP version 6 tipc TIPC
|
||||
========= =================== = ========== ==================
|
||||
@ -401,33 +400,7 @@ interface.
|
||||
(network) that the route leads to, the router (may be directly connected), the
|
||||
route flags, and the device the route is using.
|
||||
|
||||
|
||||
5. IPX
|
||||
------
|
||||
|
||||
The IPX protocol has no tunable values in proc/sys/net.
|
||||
|
||||
The IPX protocol does, however, provide proc/net/ipx. This lists each IPX
|
||||
socket giving the local and remote addresses in Novell format (that is
|
||||
network:node:port). In accordance with the strange Novell tradition,
|
||||
everything but the port is in hex. Not_Connected is displayed for sockets that
|
||||
are not tied to a specific remote address. The Tx and Rx queue sizes indicate
|
||||
the number of bytes pending for transmission and reception. The state
|
||||
indicates the state the socket is in and the uid is the owning uid of the
|
||||
socket.
|
||||
|
||||
The /proc/net/ipx_interface file lists all IPX interfaces. For each interface
|
||||
it gives the network number, the node number, and indicates if the network is
|
||||
the primary network. It also indicates which device it is bound to (or
|
||||
Internal for internal networks) and the Frame Type if appropriate. Linux
|
||||
supports 802.3, 802.2, 802.2 SNAP and DIX (Blue Book) ethernet framing for
|
||||
IPX.
|
||||
|
||||
The /proc/net/ipx_route table holds a list of IPX routes. For each route it
|
||||
gives the destination network, the router node (or Directly) and the network
|
||||
address of the router (or Connected) for internal networks.
|
||||
|
||||
6. TIPC
|
||||
5. TIPC
|
||||
-------
|
||||
|
||||
tipc_rmem
|
||||
|
@ -16,6 +16,8 @@ import sys
|
||||
import os
|
||||
import sphinx
|
||||
|
||||
from subprocess import check_output
|
||||
|
||||
# Get Sphinx version
|
||||
major, minor, patch = sphinx.version_info[:3]
|
||||
|
||||
@ -276,10 +278,21 @@ latex_elements = {
|
||||
\\setsansfont{DejaVu Sans}
|
||||
\\setromanfont{DejaVu Serif}
|
||||
\\setmonofont{DejaVu Sans Mono}
|
||||
|
||||
'''
|
||||
}
|
||||
|
||||
# At least one book (translations) may have Asian characters
|
||||
# with are only displayed if xeCJK is used
|
||||
|
||||
cjk_cmd = check_output(['fc-list', '--format="%{family[0]}\n"']).decode('utf-8', 'ignore')
|
||||
if cjk_cmd.find("Noto Sans CJK SC") >= 0:
|
||||
print ("enabling CJK for LaTeX builder")
|
||||
latex_elements['preamble'] += '''
|
||||
% This is needed for translations
|
||||
\\usepackage{xeCJK}
|
||||
\\setCJKmainfont{Noto Sans CJK SC}
|
||||
'''
|
||||
|
||||
# Fix reference escape troubles with Sphinx 1.4.x
|
||||
if major == 1 and minor > 3:
|
||||
latex_elements['preamble'] += '\\renewcommand*{\\DUrole}[2]{ #2 }\n'
|
||||
@ -410,6 +423,21 @@ latex_documents = [
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
||||
|
||||
# Add all other index files from Documentation/ subdirectories
|
||||
for fn in os.listdir('.'):
|
||||
doc = os.path.join(fn, "index")
|
||||
if os.path.exists(doc + ".rst"):
|
||||
has = False
|
||||
for l in latex_documents:
|
||||
if l[0] == doc:
|
||||
has = True
|
||||
break
|
||||
if not has:
|
||||
latex_documents.append((doc, fn + '.tex',
|
||||
'Linux %s Documentation' % fn.capitalize(),
|
||||
'The kernel development community',
|
||||
'manual'))
|
||||
|
||||
# The name of an image file (relative to this directory) to place at the top of
|
||||
# the title page.
|
||||
#latex_logo = None
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Core-API Documentation"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'core-api.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = 'Linux Kernel Crypto API'
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'crypto-api.tex', 'Linux Kernel Crypto API manual',
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Development tools for the kernel"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'dev-tools.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -19,7 +19,9 @@ quiet_cmd_mk_schema = SCHEMA $@
|
||||
|
||||
DT_DOCS = $(shell \
|
||||
cd $(srctree)/$(src) && \
|
||||
find * \( -name '*.yaml' ! -name $(DT_TMP_SCHEMA) \) \
|
||||
find * \( -name '*.yaml' ! \
|
||||
-name $(DT_TMP_SCHEMA) ! \
|
||||
-name '*.example.dt.yaml' \) \
|
||||
)
|
||||
|
||||
DT_SCHEMA_FILES ?= $(addprefix $(src)/,$(DT_DOCS))
|
||||
|
@ -703,4 +703,4 @@ cpus {
|
||||
https://www.devicetree.org/specifications/
|
||||
|
||||
[6] ARM Linux Kernel documentation - Booting AArch64 Linux
|
||||
Documentation/arm64/booting.txt
|
||||
Documentation/arm64/booting.rst
|
||||
|
@ -1,7 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/arm/shmobile.yaml#
|
||||
$id: http://devicetree.org/schemas/arm/renesas.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Renesas SH-Mobile, R-Mobile, and R-Car Platform Device Tree Bindings
|
||||
|
@ -1,7 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/arm/milbeaut.yaml#
|
||||
$id: http://devicetree.org/schemas/arm/socionext/milbeaut.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Milbeaut platforms device tree bindings
|
||||
|
@ -1,7 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/arm/ti/davinci.yaml#
|
||||
$id: http://devicetree.org/schemas/arm/ti/ti,davinci.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Texas Instruments DaVinci Platforms Device Tree Bindings
|
||||
|
@ -1,7 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/phy/allwinner,sun4i-a10-ccu.yaml#
|
||||
$id: http://devicetree.org/schemas/clock/allwinner,sun4i-a10-ccu.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Allwinner Clock Control Unit Device Tree Bindings
|
||||
|
@ -2,7 +2,7 @@
|
||||
# Copyright 2019 Linaro Ltd.
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: "http://devicetree.org/schemas/firmware/intel-ixp4xx-network-processing-engine.yaml#"
|
||||
$id: "http://devicetree.org/schemas/firmware/intel,ixp4xx-network-processing-engine.yaml#"
|
||||
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
|
||||
|
||||
title: Intel IXP4xx Network Processing Engine
|
||||
|
@ -1,7 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/iio/accelerometers/adi,adxl345.yaml#
|
||||
$id: http://devicetree.org/schemas/iio/accel/adi,adxl345.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Analog Devices ADXL345/ADXL375 3-Axis Digital Accelerometers
|
||||
|
@ -1,7 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/iio/accelerometers/adi,adxl372.yaml#
|
||||
$id: http://devicetree.org/schemas/iio/accel/adi,adxl372.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Analog Devices ADXL372 3-Axis, +/-(200g) Digital Accelerometer
|
||||
|
@ -5,21 +5,19 @@ Required properties:
|
||||
- compatible: should be "amazon,al-fic"
|
||||
- reg: physical base address and size of the registers
|
||||
- interrupt-controller: identifies the node as an interrupt controller
|
||||
- #interrupt-cells: must be 2.
|
||||
First cell defines the index of the interrupt within the controller.
|
||||
Second cell is used to specify the trigger type and must be one of the
|
||||
following:
|
||||
- bits[3:0] trigger type and level flags
|
||||
1 = low-to-high edge triggered
|
||||
4 = active high level-sensitive
|
||||
- interrupt-parent: specifies the parent interrupt controller.
|
||||
- #interrupt-cells : must be 2. Specifies the number of cells needed to encode
|
||||
an interrupt source. Supported trigger types are low-to-high edge
|
||||
triggered and active high level-sensitive.
|
||||
- interrupts: describes which input line in the interrupt parent, this
|
||||
fic's output is connected to. This field property depends on the parent's
|
||||
binding
|
||||
|
||||
Please refer to interrupts.txt in this directory for details of the common
|
||||
Interrupt Controllers bindings used by client devices.
|
||||
|
||||
Example:
|
||||
|
||||
amazon_fic: interrupt-controller@0xfd8a8500 {
|
||||
amazon_fic: interrupt-controller@fd8a8500 {
|
||||
compatible = "amazon,al-fic";
|
||||
interrupt-controller;
|
||||
#interrupt-cells = <2>;
|
||||
|
@ -2,7 +2,7 @@
|
||||
# Copyright 2018 Linaro Ltd.
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: "http://devicetree.org/schemas/interrupt/intel-ixp4xx-interrupt.yaml#"
|
||||
$id: "http://devicetree.org/schemas/interrupt-controller/intel,ixp4xx-interrupt.yaml#"
|
||||
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
|
||||
|
||||
title: Intel IXP4xx XScale Networking Processors Interrupt Controller
|
||||
|
@ -1,20 +1,30 @@
|
||||
* ARC-HS Interrupt Distribution Unit
|
||||
|
||||
This optional 2nd level interrupt controller can be used in SMP configurations for
|
||||
dynamic IRQ routing, load balancing of common/external IRQs towards core intc.
|
||||
This optional 2nd level interrupt controller can be used in SMP configurations
|
||||
for dynamic IRQ routing, load balancing of common/external IRQs towards core
|
||||
intc.
|
||||
|
||||
Properties:
|
||||
|
||||
- compatible: "snps,archs-idu-intc"
|
||||
- interrupt-controller: This is an interrupt controller.
|
||||
- #interrupt-cells: Must be <1>.
|
||||
- #interrupt-cells: Must be <1> or <2>.
|
||||
|
||||
Value of the cell specifies the "common" IRQ from peripheral to IDU. Number N
|
||||
of the particular interrupt line of IDU corresponds to the line N+24 of the
|
||||
core interrupt controller.
|
||||
Value of the first cell specifies the "common" IRQ from peripheral to IDU.
|
||||
Number N of the particular interrupt line of IDU corresponds to the line N+24
|
||||
of the core interrupt controller.
|
||||
|
||||
intc accessed via the special ARC AUX register interface, hence "reg" property
|
||||
is not specified.
|
||||
The (optional) second cell specifies any of the following flags:
|
||||
- bits[3:0] trigger type and level flags
|
||||
1 = low-to-high edge triggered
|
||||
2 = NOT SUPPORTED (high-to-low edge triggered)
|
||||
4 = active high level-sensitive <<< DEFAULT
|
||||
8 = NOT SUPPORTED (active low level-sensitive)
|
||||
When no second cell is specified, the interrupt is assumed to be level
|
||||
sensitive.
|
||||
|
||||
The interrupt controller is accessed via the special ARC AUX register
|
||||
interface, hence "reg" property is not specified.
|
||||
|
||||
Example:
|
||||
core_intc: core-interrupt-controller {
|
||||
|
@ -2,7 +2,7 @@
|
||||
# Copyright 2019 Linaro Ltd.
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: "http://devicetree.org/schemas/misc/intel-ixp4xx-ahb-queue-manager.yaml#"
|
||||
$id: "http://devicetree.org/schemas/misc/intel,ixp4xx-ahb-queue-manager.yaml#"
|
||||
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
|
||||
|
||||
title: Intel IXP4xx AHB Queue Manager
|
@ -1,7 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/net/allwinner,sun8i-a83t-gmac.yaml#
|
||||
$id: http://devicetree.org/schemas/net/allwinner,sun8i-a83t-emac.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Allwinner A83t EMAC Device Tree Bindings
|
||||
|
@ -12,6 +12,7 @@ Required properties:
|
||||
- "microchip,ksz8565"
|
||||
- "microchip,ksz9893"
|
||||
- "microchip,ksz9563"
|
||||
- "microchip,ksz8563"
|
||||
|
||||
Optional properties:
|
||||
|
||||
|
@ -7,18 +7,6 @@ Required properties:
|
||||
- phy-mode : See ethernet.txt file in the same directory
|
||||
|
||||
Optional properties:
|
||||
- phy-reset-gpios : Should specify the gpio for phy reset
|
||||
- phy-reset-duration : Reset duration in milliseconds. Should present
|
||||
only if property "phy-reset-gpios" is available. Missing the property
|
||||
will have the duration be 1 millisecond. Numbers greater than 1000 are
|
||||
invalid and 1 millisecond will be used instead.
|
||||
- phy-reset-active-high : If present then the reset sequence using the GPIO
|
||||
specified in the "phy-reset-gpios" property is reversed (H=reset state,
|
||||
L=operation state).
|
||||
- phy-reset-post-delay : Post reset delay in milliseconds. If present then
|
||||
a delay of phy-reset-post-delay milliseconds will be observed after the
|
||||
phy-reset-gpios has been toggled. Can be omitted thus no delay is
|
||||
observed. Delay is in range of 1ms to 1000ms. Other delays are invalid.
|
||||
- phy-supply : regulator that powers the Ethernet PHY.
|
||||
- phy-handle : phandle to the PHY device connected to this device.
|
||||
- fixed-link : Assume a fixed link. See fixed-link.txt in the same directory.
|
||||
@ -47,11 +35,27 @@ Optional properties:
|
||||
For imx6sx, "int0" handles all 3 queues and ENET_MII. "pps" is for the pulse
|
||||
per second interrupt associated with 1588 precision time protocol(PTP).
|
||||
|
||||
|
||||
Optional subnodes:
|
||||
- mdio : specifies the mdio bus in the FEC, used as a container for phy nodes
|
||||
according to phy.txt in the same directory
|
||||
|
||||
Deprecated optional properties:
|
||||
To avoid these, create a phy node according to phy.txt in the same
|
||||
directory, and point the fec's "phy-handle" property to it. Then use
|
||||
the phy's reset binding, again described by phy.txt.
|
||||
- phy-reset-gpios : Should specify the gpio for phy reset
|
||||
- phy-reset-duration : Reset duration in milliseconds. Should present
|
||||
only if property "phy-reset-gpios" is available. Missing the property
|
||||
will have the duration be 1 millisecond. Numbers greater than 1000 are
|
||||
invalid and 1 millisecond will be used instead.
|
||||
- phy-reset-active-high : If present then the reset sequence using the GPIO
|
||||
specified in the "phy-reset-gpios" property is reversed (H=reset state,
|
||||
L=operation state).
|
||||
- phy-reset-post-delay : Post reset delay in milliseconds. If present then
|
||||
a delay of phy-reset-post-delay milliseconds will be observed after the
|
||||
phy-reset-gpios has been toggled. Can be omitted thus no delay is
|
||||
observed. Delay is in range of 1ms to 1000ms. Other delays are invalid.
|
||||
|
||||
Example:
|
||||
|
||||
ethernet@83fec000 {
|
||||
|
@ -15,10 +15,10 @@ Required properties:
|
||||
Use "atmel,sama5d4-gem" for the GEM IP (10/100) available on Atmel sama5d4 SoCs.
|
||||
Use "cdns,zynq-gem" Xilinx Zynq-7xxx SoC.
|
||||
Use "cdns,zynqmp-gem" for Zynq Ultrascale+ MPSoC.
|
||||
Use "sifive,fu540-macb" for SiFive FU540-C000 SoC.
|
||||
Use "sifive,fu540-c000-gem" for SiFive FU540-C000 SoC.
|
||||
Or the generic form: "cdns,emac".
|
||||
- reg: Address and length of the register set for the device
|
||||
For "sifive,fu540-macb", second range is required to specify the
|
||||
For "sifive,fu540-c000-gem", second range is required to specify the
|
||||
address and length of the registers for GEMGXL Management block.
|
||||
- interrupts: Should contain macb interrupt
|
||||
- phy-mode: See ethernet.txt file in the same directory.
|
||||
|
@ -37,13 +37,13 @@ required:
|
||||
|
||||
examples:
|
||||
- |
|
||||
sid@1c23800 {
|
||||
efuse@1c23800 {
|
||||
compatible = "allwinner,sun4i-a10-sid";
|
||||
reg = <0x01c23800 0x10>;
|
||||
};
|
||||
|
||||
- |
|
||||
sid@1c23800 {
|
||||
efuse@1c23800 {
|
||||
compatible = "allwinner,sun7i-a20-sid";
|
||||
reg = <0x01c23800 0x200>;
|
||||
};
|
||||
|
45
Documentation/devicetree/bindings/nvmem/nvmem-consumer.yaml
Normal file
45
Documentation/devicetree/bindings/nvmem/nvmem-consumer.yaml
Normal file
@ -0,0 +1,45 @@
|
||||
# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/nvmem/nvmem-consumer.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: NVMEM (Non Volatile Memory) Consumer Device Tree Bindings
|
||||
|
||||
maintainers:
|
||||
- Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
|
||||
|
||||
select: true
|
||||
|
||||
properties:
|
||||
nvmem:
|
||||
$ref: /schemas/types.yaml#/definitions/phandle-array
|
||||
description:
|
||||
List of phandle to the nvmem providers.
|
||||
|
||||
nvmem-cells:
|
||||
$ref: /schemas/types.yaml#/definitions/phandle-array
|
||||
description:
|
||||
List of phandle to the nvmem data cells.
|
||||
|
||||
nvmem-names:
|
||||
$ref: /schemas/types.yaml#/definitions/string-array
|
||||
description:
|
||||
Names for the each nvmem provider.
|
||||
|
||||
nvmem-cell-names:
|
||||
$ref: /schemas/types.yaml#/definitions/string-array
|
||||
description:
|
||||
Names for each nvmem-cells specified.
|
||||
|
||||
dependencies:
|
||||
nvmem-names: [ nvmem ]
|
||||
nvmem-cell-names: [ nvmem-cells ]
|
||||
|
||||
examples:
|
||||
- |
|
||||
tsens {
|
||||
/* ... */
|
||||
nvmem-cells = <&tsens_calibration>;
|
||||
nvmem-cell-names = "calibration";
|
||||
};
|
@ -1,80 +1 @@
|
||||
= NVMEM(Non Volatile Memory) Data Device Tree Bindings =
|
||||
|
||||
This binding is intended to represent the location of hardware
|
||||
configuration data stored in NVMEMs like eeprom, efuses and so on.
|
||||
|
||||
On a significant proportion of boards, the manufacturer has stored
|
||||
some data on NVMEM, for the OS to be able to retrieve these information
|
||||
and act upon it. Obviously, the OS has to know about where to retrieve
|
||||
these data from, and where they are stored on the storage device.
|
||||
|
||||
This document is here to document this.
|
||||
|
||||
= Data providers =
|
||||
Contains bindings specific to provider drivers and data cells as children
|
||||
of this node.
|
||||
|
||||
Optional properties:
|
||||
read-only: Mark the provider as read only.
|
||||
|
||||
= Data cells =
|
||||
These are the child nodes of the provider which contain data cell
|
||||
information like offset and size in nvmem provider.
|
||||
|
||||
Required properties:
|
||||
reg: specifies the offset in byte within the storage device.
|
||||
|
||||
Optional properties:
|
||||
|
||||
bits: Is pair of bit location and number of bits, which specifies offset
|
||||
in bit and number of bits within the address range specified by reg property.
|
||||
Offset takes values from 0-7.
|
||||
|
||||
For example:
|
||||
|
||||
/* Provider */
|
||||
qfprom: qfprom@700000 {
|
||||
...
|
||||
|
||||
/* Data cells */
|
||||
tsens_calibration: calib@404 {
|
||||
reg = <0x404 0x10>;
|
||||
};
|
||||
|
||||
tsens_calibration_bckp: calib_bckp@504 {
|
||||
reg = <0x504 0x11>;
|
||||
bits = <6 128>
|
||||
};
|
||||
|
||||
pvs_version: pvs-version@6 {
|
||||
reg = <0x6 0x2>
|
||||
bits = <7 2>
|
||||
};
|
||||
|
||||
speed_bin: speed-bin@c{
|
||||
reg = <0xc 0x1>;
|
||||
bits = <2 3>;
|
||||
|
||||
};
|
||||
...
|
||||
};
|
||||
|
||||
= Data consumers =
|
||||
Are device nodes which consume nvmem data cells/providers.
|
||||
|
||||
Required-properties:
|
||||
nvmem-cells: list of phandle to the nvmem data cells.
|
||||
nvmem-cell-names: names for the each nvmem-cells specified. Required if
|
||||
nvmem-cells is used.
|
||||
|
||||
Optional-properties:
|
||||
nvmem : list of phandles to nvmem providers.
|
||||
nvmem-names: names for the each nvmem provider. required if nvmem is used.
|
||||
|
||||
For example:
|
||||
|
||||
tsens {
|
||||
...
|
||||
nvmem-cells = <&tsens_calibration>;
|
||||
nvmem-cell-names = "calibration";
|
||||
};
|
||||
This file has been moved to nvmem.yaml and nvmem-consumer.yaml.
|
||||
|
93
Documentation/devicetree/bindings/nvmem/nvmem.yaml
Normal file
93
Documentation/devicetree/bindings/nvmem/nvmem.yaml
Normal file
@ -0,0 +1,93 @@
|
||||
# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/nvmem/nvmem.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: NVMEM (Non Volatile Memory) Device Tree Bindings
|
||||
|
||||
maintainers:
|
||||
- Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
|
||||
|
||||
description: |
|
||||
This binding is intended to represent the location of hardware
|
||||
configuration data stored in NVMEMs like eeprom, efuses and so on.
|
||||
|
||||
On a significant proportion of boards, the manufacturer has stored
|
||||
some data on NVMEM, for the OS to be able to retrieve these
|
||||
information and act upon it. Obviously, the OS has to know about
|
||||
where to retrieve these data from, and where they are stored on the
|
||||
storage device.
|
||||
|
||||
properties:
|
||||
$nodename:
|
||||
pattern: "^(eeprom|efuse|nvram)(@.*|-[0-9a-f])*$"
|
||||
|
||||
"#address-cells":
|
||||
const: 1
|
||||
|
||||
"#size-cells":
|
||||
const: 1
|
||||
|
||||
read-only:
|
||||
$ref: /schemas/types.yaml#/definitions/flag
|
||||
description:
|
||||
Mark the provider as read only.
|
||||
|
||||
patternProperties:
|
||||
"^.*@[0-9a-f]+$":
|
||||
type: object
|
||||
|
||||
properties:
|
||||
reg:
|
||||
maxItems: 1
|
||||
description:
|
||||
Offset and size in bytes within the storage device.
|
||||
|
||||
bits:
|
||||
maxItems: 1
|
||||
items:
|
||||
items:
|
||||
- minimum: 0
|
||||
maximum: 7
|
||||
description:
|
||||
Offset in bit within the address range specified by reg.
|
||||
- minimum: 1
|
||||
description:
|
||||
Size in bit within the address range specified by reg.
|
||||
|
||||
required:
|
||||
- reg
|
||||
|
||||
additionalProperties: false
|
||||
|
||||
examples:
|
||||
- |
|
||||
qfprom: eeprom@700000 {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <1>;
|
||||
|
||||
/* ... */
|
||||
|
||||
/* Data cells */
|
||||
tsens_calibration: calib@404 {
|
||||
reg = <0x404 0x10>;
|
||||
};
|
||||
|
||||
tsens_calibration_bckp: calib_bckp@504 {
|
||||
reg = <0x504 0x11>;
|
||||
bits = <6 128>;
|
||||
};
|
||||
|
||||
pvs_version: pvs-version@6 {
|
||||
reg = <0x6 0x2>;
|
||||
bits = <7 2>;
|
||||
};
|
||||
|
||||
speed_bin: speed-bin@c{
|
||||
reg = <0xc 0x1>;
|
||||
bits = <2 3>;
|
||||
};
|
||||
};
|
||||
|
||||
...
|
@ -1,7 +1,7 @@
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/display/allwinner,sun6i-a31-mipi-dphy.yaml#
|
||||
$id: http://devicetree.org/schemas/phy/allwinner,sun6i-a31-mipi-dphy.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Allwinner A31 MIPI D-PHY Controller Device Tree Bindings
|
||||
|
@ -37,7 +37,8 @@ properties:
|
||||
hwlocks: true
|
||||
|
||||
st,syscfg:
|
||||
$ref: "/schemas/types.yaml#/definitions/phandle-array"
|
||||
allOf:
|
||||
- $ref: "/schemas/types.yaml#/definitions/phandle-array"
|
||||
description: Should be phandle/offset/mask
|
||||
items:
|
||||
- description: Phandle to the syscon node which includes IRQ mux selection.
|
||||
|
@ -1,162 +0,0 @@
|
||||
===================
|
||||
RISC-V CPU Bindings
|
||||
===================
|
||||
|
||||
The device tree allows to describe the layout of CPUs in a system through
|
||||
the "cpus" node, which in turn contains a number of subnodes (ie "cpu")
|
||||
defining properties for every cpu.
|
||||
|
||||
Bindings for CPU nodes follow the Devicetree Specification, available from:
|
||||
|
||||
https://www.devicetree.org/specifications/
|
||||
|
||||
with updates for 32-bit and 64-bit RISC-V systems provided in this document.
|
||||
|
||||
===========
|
||||
Terminology
|
||||
===========
|
||||
|
||||
This document uses some terminology common to the RISC-V community that is not
|
||||
widely used, the definitions of which are listed here:
|
||||
|
||||
* hart: A hardware execution context, which contains all the state mandated by
|
||||
the RISC-V ISA: a PC and some registers. This terminology is designed to
|
||||
disambiguate software's view of execution contexts from any particular
|
||||
microarchitectural implementation strategy. For example, my Intel laptop is
|
||||
described as having one socket with two cores, each of which has two hyper
|
||||
threads. Therefore this system has four harts.
|
||||
|
||||
=====================================
|
||||
cpus and cpu node bindings definition
|
||||
=====================================
|
||||
|
||||
The RISC-V architecture, in accordance with the Devicetree Specification,
|
||||
requires the cpus and cpu nodes to be present and contain the properties
|
||||
described below.
|
||||
|
||||
- cpus node
|
||||
|
||||
Description: Container of cpu nodes
|
||||
|
||||
The node name must be "cpus".
|
||||
|
||||
A cpus node must define the following properties:
|
||||
|
||||
- #address-cells
|
||||
Usage: required
|
||||
Value type: <u32>
|
||||
Definition: must be set to 1
|
||||
- #size-cells
|
||||
Usage: required
|
||||
Value type: <u32>
|
||||
Definition: must be set to 0
|
||||
|
||||
- cpu node
|
||||
|
||||
Description: Describes a hart context
|
||||
|
||||
PROPERTIES
|
||||
|
||||
- device_type
|
||||
Usage: required
|
||||
Value type: <string>
|
||||
Definition: must be "cpu"
|
||||
- reg
|
||||
Usage: required
|
||||
Value type: <u32>
|
||||
Definition: The hart ID of this CPU node
|
||||
- compatible:
|
||||
Usage: required
|
||||
Value type: <stringlist>
|
||||
Definition: must contain "riscv", may contain one of
|
||||
"sifive,rocket0"
|
||||
- mmu-type:
|
||||
Usage: optional
|
||||
Value type: <string>
|
||||
Definition: Specifies the CPU's MMU type. Possible values are
|
||||
"riscv,sv32"
|
||||
"riscv,sv39"
|
||||
"riscv,sv48"
|
||||
- riscv,isa:
|
||||
Usage: required
|
||||
Value type: <string>
|
||||
Definition: Contains the RISC-V ISA string of this hart. These
|
||||
ISA strings are defined by the RISC-V ISA manual.
|
||||
|
||||
Example: SiFive Freedom U540G Development Kit
|
||||
---------------------------------------------
|
||||
|
||||
This system contains two harts: a hart marked as disabled that's used for
|
||||
low-level system tasks and should be ignored by Linux, and a second hart that
|
||||
Linux is allowed to run on.
|
||||
|
||||
cpus {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
timebase-frequency = <1000000>;
|
||||
cpu@0 {
|
||||
clock-frequency = <1600000000>;
|
||||
compatible = "sifive,rocket0", "riscv";
|
||||
device_type = "cpu";
|
||||
i-cache-block-size = <64>;
|
||||
i-cache-sets = <128>;
|
||||
i-cache-size = <16384>;
|
||||
next-level-cache = <&L15 &L0>;
|
||||
reg = <0>;
|
||||
riscv,isa = "rv64imac";
|
||||
status = "disabled";
|
||||
L10: interrupt-controller {
|
||||
#interrupt-cells = <1>;
|
||||
compatible = "riscv,cpu-intc";
|
||||
interrupt-controller;
|
||||
};
|
||||
};
|
||||
cpu@1 {
|
||||
clock-frequency = <1600000000>;
|
||||
compatible = "sifive,rocket0", "riscv";
|
||||
d-cache-block-size = <64>;
|
||||
d-cache-sets = <64>;
|
||||
d-cache-size = <32768>;
|
||||
d-tlb-sets = <1>;
|
||||
d-tlb-size = <32>;
|
||||
device_type = "cpu";
|
||||
i-cache-block-size = <64>;
|
||||
i-cache-sets = <64>;
|
||||
i-cache-size = <32768>;
|
||||
i-tlb-sets = <1>;
|
||||
i-tlb-size = <32>;
|
||||
mmu-type = "riscv,sv39";
|
||||
next-level-cache = <&L15 &L0>;
|
||||
reg = <1>;
|
||||
riscv,isa = "rv64imafdc";
|
||||
status = "okay";
|
||||
tlb-split;
|
||||
L13: interrupt-controller {
|
||||
#interrupt-cells = <1>;
|
||||
compatible = "riscv,cpu-intc";
|
||||
interrupt-controller;
|
||||
};
|
||||
};
|
||||
};
|
||||
|
||||
Example: Spike ISA Simulator with 1 Hart
|
||||
----------------------------------------
|
||||
|
||||
This device tree matches the Spike ISA golden model as run with `spike -p1`.
|
||||
|
||||
cpus {
|
||||
cpu@0 {
|
||||
device_type = "cpu";
|
||||
reg = <0x00000000>;
|
||||
status = "okay";
|
||||
compatible = "riscv";
|
||||
riscv,isa = "rv64imafdc";
|
||||
mmu-type = "riscv,sv48";
|
||||
clock-frequency = <0x3b9aca00>;
|
||||
interrupt-controller {
|
||||
#interrupt-cells = <0x00000001>;
|
||||
interrupt-controller;
|
||||
compatible = "riscv,cpu-intc";
|
||||
}
|
||||
}
|
||||
}
|
@ -10,6 +10,18 @@ maintainers:
|
||||
- Paul Walmsley <paul.walmsley@sifive.com>
|
||||
- Palmer Dabbelt <palmer@sifive.com>
|
||||
|
||||
description: |
|
||||
This document uses some terminology common to the RISC-V community
|
||||
that is not widely used, the definitions of which are listed here:
|
||||
|
||||
hart: A hardware execution context, which contains all the state
|
||||
mandated by the RISC-V ISA: a PC and some registers. This
|
||||
terminology is designed to disambiguate software's view of execution
|
||||
contexts from any particular microarchitectural implementation
|
||||
strategy. For example, an Intel laptop containing one socket with
|
||||
two cores, each of which has two hyperthreads, could be described as
|
||||
having four harts.
|
||||
|
||||
properties:
|
||||
compatible:
|
||||
items:
|
||||
@ -50,6 +62,10 @@ properties:
|
||||
User-Level ISA document, available from
|
||||
https://riscv.org/specifications/
|
||||
|
||||
While the isa strings in ISA specification are case
|
||||
insensitive, letters in the riscv,isa string must be all
|
||||
lowercase to simplify parsing.
|
||||
|
||||
timebase-frequency:
|
||||
type: integer
|
||||
minimum: 1
|
||||
|
@ -19,7 +19,7 @@ properties:
|
||||
compatible:
|
||||
items:
|
||||
- enum:
|
||||
- sifive,freedom-unleashed-a00
|
||||
- sifive,hifive-unleashed-a00
|
||||
- const: sifive,fu540-c000
|
||||
- const: sifive,fu540
|
||||
...
|
||||
|
@ -73,7 +73,6 @@ patternProperties:
|
||||
Compatible of the SPI device.
|
||||
|
||||
reg:
|
||||
maxItems: 1
|
||||
minimum: 0
|
||||
maximum: 256
|
||||
description:
|
||||
|
@ -2,7 +2,7 @@
|
||||
# Copyright 2018 Linaro Ltd.
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: "http://devicetree.org/schemas/timer/intel-ixp4xx-timer.yaml#"
|
||||
$id: "http://devicetree.org/schemas/timer/intel,ixp4xx-timer.yaml#"
|
||||
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
|
||||
|
||||
title: Intel IXP4xx XScale Networking Processors Timers
|
||||
|
@ -64,10 +64,8 @@ Optional properties :
|
||||
- power-on-time-ms : Specifies the time it takes from the time the host
|
||||
initiates the power-on sequence to a port until the port has adequate
|
||||
power. The value is given in ms in a 0 - 510 range (default is 100ms).
|
||||
- swap-dx-lanes : Specifies the downstream ports which will swap the
|
||||
differential-pair (D+/D-), default is not-swapped.
|
||||
- swap-us-lanes : Selects the upstream port differential-pair (D+/D-)
|
||||
swapping (boolean, default is not-swapped)
|
||||
- swap-dx-lanes : Specifies the ports which will swap the differential-pair
|
||||
(D+/D-), default is not-swapped.
|
||||
|
||||
Examples:
|
||||
usb2512b@2c {
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = 'Linux Kernel Documentation Guide'
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'kernel-doc-guide.tex', 'Linux Kernel Documentation Guide',
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Linux 802.11 Driver Developer's Guide"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', '80211.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "The Linux driver implementer's API guide"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'driver-api.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -233,7 +233,7 @@ Userspace Interface
|
||||
Several sysfs attributes are generated by the Generic Counter interface,
|
||||
and reside under the /sys/bus/counter/devices/counterX directory, where
|
||||
counterX refers to the respective counter device. Please see
|
||||
Documentation/ABI/testing/sys-bus-counter-generic-sysfs for detailed
|
||||
Documentation/ABI/testing/sysfs-bus-counter for detailed
|
||||
information on each Generic Counter interface sysfs attribute.
|
||||
|
||||
Through these sysfs attributes, programs and scripts may interact with
|
||||
@ -325,7 +325,7 @@ sysfs attributes, where Y is the unique ID of the respective Count:
|
||||
|
||||
For a more detailed breakdown of the available Generic Counter interface
|
||||
sysfs attributes, please refer to the
|
||||
Documentation/ABI/testing/sys-bus-counter file.
|
||||
Documentation/ABI/testing/sysfs-bus-counter file.
|
||||
|
||||
The Signals and Counts associated with the Counter device are registered
|
||||
to the system as well by the counter_register function. The
|
||||
|
@ -179,8 +179,8 @@ PHY Mappings
|
||||
|
||||
In order to get reference to a PHY without help from DeviceTree, the framework
|
||||
offers lookups which can be compared to clkdev that allow clk structures to be
|
||||
bound to devices. A lookup can be made be made during runtime when a handle to
|
||||
the struct phy already exists.
|
||||
bound to devices. A lookup can be made during runtime when a handle to the
|
||||
struct phy already exists.
|
||||
|
||||
The framework offers the following API for registering and unregistering the
|
||||
lookups::
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Device Power Management"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'pm.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -13,7 +13,8 @@ a) SMB3 (and SMB3.1.1) missing optional features:
|
||||
- T10 copy offload ie "ODX" (copy chunk, and "Duplicate Extents" ioctl
|
||||
currently the only two server side copy mechanisms supported)
|
||||
|
||||
b) improved sparse file support
|
||||
b) improved sparse file support (fiemap and SEEK_HOLE are implemented
|
||||
but additional features would be supportable by the protocol).
|
||||
|
||||
c) Directory entry caching relies on a 1 second timer, rather than
|
||||
using Directory Leases, currently only the root file handle is cached longer
|
||||
@ -21,9 +22,13 @@ using Directory Leases, currently only the root file handle is cached longer
|
||||
d) quota support (needs minor kernel change since quota calls
|
||||
to make it to network filesystems or deviceless filesystems)
|
||||
|
||||
e) Additional use cases where we use "compoounding" (e.g. open/query/close
|
||||
and open/setinfo/close) to reduce the number of roundtrips, and also
|
||||
open to reduce redundant opens (using deferred close and reference counts more).
|
||||
e) Additional use cases can be optimized to use "compounding"
|
||||
(e.g. open/query/close and open/setinfo/close) to reduce the number
|
||||
of roundtrips to the server and improve performance. Various cases
|
||||
(stat, statfs, create, unlink, mkdir) already have been improved by
|
||||
using compounding but more can be done. In addition we could significantly
|
||||
reduce redundant opens by using deferred close (with handle caching leases)
|
||||
and better using reference counters on file handles.
|
||||
|
||||
f) Finish inotify support so kde and gnome file list windows
|
||||
will autorefresh (partially complete by Asser). Needs minor kernel
|
||||
@ -43,18 +48,17 @@ mount or a per server basis to client UIDs or nobody if no mapping
|
||||
exists. Also better integration with winbind for resolving SID owners
|
||||
|
||||
k) Add tools to take advantage of more smb3 specific ioctls and features
|
||||
(passthrough ioctl/fsctl for sending various SMB3 fsctls to the server
|
||||
is in progress, and a passthrough query_info call is already implemented
|
||||
in cifs.ko to allow smb3 info levels queries to be sent from userspace)
|
||||
(passthrough ioctl/fsctl is now implemented in cifs.ko to allow sending
|
||||
various SMB3 fsctls and query info and set info calls directly from user space)
|
||||
Add tools to make setting various non-POSIX metadata attributes easier
|
||||
from tools (e.g. extending what was done in smb-info tool).
|
||||
|
||||
l) encrypted file support
|
||||
|
||||
m) improved stats gathering tools (perhaps integration with nfsometer?)
|
||||
to extend and make easier to use what is currently in /proc/fs/cifs/Stats
|
||||
|
||||
n) allow setting more NTFS/SMB3 file attributes remotely (currently limited to compressed
|
||||
file attribute via chflags) and improve user space tools for managing and
|
||||
viewing them.
|
||||
n) Add support for claims based ACLs ("DAC")
|
||||
|
||||
o) mount helper GUI (to simplify the various configuration options on mount)
|
||||
|
||||
@ -82,6 +86,8 @@ so far).
|
||||
w) Add support for additional strong encryption types, and additional spnego
|
||||
authentication mechanisms (see MS-SMB2)
|
||||
|
||||
x) Finish support for SMB3.1.1 compression
|
||||
|
||||
KNOWN BUGS
|
||||
====================================
|
||||
See http://bugzilla.samba.org - search on product "CifsVFS" for
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Linux Filesystems API"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'filesystems.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Linux GPU Driver Developer's Guide"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'gpu.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -9,7 +9,7 @@ Supported chips:
|
||||
|
||||
Addresses scanned: PCI space
|
||||
|
||||
Datasheet: http://support.amd.com/us/Processor_TechDocs/32559.pdf
|
||||
Datasheet: http://www.amd.com/system/files/TechDocs/32559.pdf
|
||||
|
||||
Author: Rudolf Marek
|
||||
|
||||
|
@ -111,9 +111,11 @@ needed).
|
||||
netlabel/index
|
||||
networking/index
|
||||
pcmcia/index
|
||||
power/index
|
||||
target/index
|
||||
timers/index
|
||||
watchdog/index
|
||||
virtual/index
|
||||
input/index
|
||||
hwmon/index
|
||||
gpu/index
|
||||
@ -143,6 +145,7 @@ implementation.
|
||||
arm64/index
|
||||
ia64/index
|
||||
m68k/index
|
||||
powerpc/index
|
||||
riscv/index
|
||||
s390/index
|
||||
sh/index
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "The Linux input driver subsystem"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'linux-input.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Kernel Hacking Guides"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'kernel-hacking.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -82,7 +82,7 @@ itself. The read lock allows many concurrent readers. Anything that
|
||||
**changes** the list will have to get the write lock.
|
||||
|
||||
NOTE! RCU is better for list traversal, but requires careful
|
||||
attention to design detail (see Documentation/RCU/listRCU.txt).
|
||||
attention to design detail (see Documentation/RCU/listRCU.rst).
|
||||
|
||||
Also, you cannot "upgrade" a read-lock to a write-lock, so if you at _any_
|
||||
time need to do any changes (even if you don't do it every time), you have
|
||||
@ -90,7 +90,7 @@ to get the write-lock at the very beginning.
|
||||
|
||||
NOTE! We are working hard to remove reader-writer spinlocks in most
|
||||
cases, so please don't add a new one without consensus. (Instead, see
|
||||
Documentation/RCU/rcu.txt for complete information.)
|
||||
Documentation/RCU/rcu.rst for complete information.)
|
||||
|
||||
----
|
||||
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = 'Linux Kernel Development Documentation'
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'maintainer.tex', 'Linux Kernel Development Documentation',
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -1,12 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
# SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
project = 'Linux Media Subsystem Documentation'
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'media.tex', 'Linux Media Subsystem Documentation',
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -548,7 +548,7 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
|
||||
|
||||
[*] For information on bus mastering DMA and coherency please read:
|
||||
|
||||
Documentation/PCI/pci.rst
|
||||
Documentation/driver-api/pci/pci.rst
|
||||
Documentation/DMA-API-HOWTO.txt
|
||||
Documentation/DMA-API.txt
|
||||
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Linux Networking Documentation"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'networking.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -424,13 +424,24 @@ Statistics
|
||||
Following minimum set of TLS-related statistics should be reported
|
||||
by the driver:
|
||||
|
||||
* ``rx_tls_decrypted`` - number of successfully decrypted TLS segments
|
||||
* ``tx_tls_encrypted`` - number of in-order TLS segments passed to device
|
||||
for encryption
|
||||
* ``rx_tls_decrypted_packets`` - number of successfully decrypted RX packets
|
||||
which were part of a TLS stream.
|
||||
* ``rx_tls_decrypted_bytes`` - number of TLS payload bytes in RX packets
|
||||
which were successfully decrypted.
|
||||
* ``tx_tls_encrypted_packets`` - number of TX packets passed to the device
|
||||
for encryption of their TLS payload.
|
||||
* ``tx_tls_encrypted_bytes`` - number of TLS payload bytes in TX packets
|
||||
passed to the device for encryption.
|
||||
* ``tx_tls_ctx`` - number of TLS TX HW offload contexts added to device for
|
||||
encryption.
|
||||
* ``tx_tls_ooo`` - number of TX packets which were part of a TLS stream
|
||||
but did not arrive in the expected order
|
||||
* ``tx_tls_drop_no_sync_data`` - number of TX packets dropped because
|
||||
they arrived out of order and associated record could not be found
|
||||
but did not arrive in the expected order.
|
||||
* ``tx_tls_drop_no_sync_data`` - number of TX packets which were part of
|
||||
a TLS stream dropped, because they arrived out of order and associated
|
||||
record could not be found.
|
||||
* ``tx_tls_drop_bypass_req`` - number of TX packets which were part of a TLS
|
||||
stream dropped, because they contain both data that has been encrypted by
|
||||
software and data that expects hardware crypto offload.
|
||||
|
||||
Notable corner cases, exceptions and additional requirements
|
||||
============================================================
|
||||
@ -495,21 +506,3 @@ Drivers should ignore the changes to TLS the device feature flags.
|
||||
These flags will be acted upon accordingly by the core ``ktls`` code.
|
||||
TLS device feature flags only control adding of new TLS connection
|
||||
offloads, old connections will remain active after flags are cleared.
|
||||
|
||||
Known bugs
|
||||
==========
|
||||
|
||||
skb_orphan() leaks clear text
|
||||
-----------------------------
|
||||
|
||||
Currently drivers depend on the :c:member:`sk` member of
|
||||
:c:type:`struct sk_buff <sk_buff>` to identify segments requiring
|
||||
encryption. Any operation which removes or does not preserve the socket
|
||||
association such as :c:func:`skb_orphan` or :c:func:`skb_clone`
|
||||
will cause the driver to miss the packets and lead to clear text leaks.
|
||||
|
||||
Redirects leak clear text
|
||||
-------------------------
|
||||
|
||||
In the RX direction, if segment has already been decrypted by the device
|
||||
and it gets redirected or mirrored - clear text will be transmitted out.
|
||||
|
@ -204,8 +204,8 @@ Ethernet device, which instead of receiving packets from a physical
|
||||
media, receives them from user space program and instead of sending
|
||||
packets via physical media sends them to the user space program.
|
||||
|
||||
Let's say that you configured IPX on the tap0, then whenever
|
||||
the kernel sends an IPX packet to tap0, it is passed to the application
|
||||
Let's say that you configured IPv6 on the tap0, then whenever
|
||||
the kernel sends an IPv6 packet to tap0, it is passed to the application
|
||||
(VTun for example). The application encrypts, compresses and sends it to
|
||||
the other side over TCP or UDP. The application on the other side decompresses
|
||||
and decrypts the data received and writes the packet to the TAP device,
|
||||
|
@ -1,4 +1,4 @@
|
||||
:orphan:
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
================
|
||||
Power Management
|
||||
|
@ -1,5 +1,7 @@
|
||||
========================
|
||||
The PowerPC boot wrapper
|
||||
------------------------
|
||||
========================
|
||||
|
||||
Copyright (C) Secret Lab Technologies Ltd.
|
||||
|
||||
PowerPC image targets compresses and wraps the kernel image (vmlinux) with
|
||||
@ -21,6 +23,7 @@ it uses the wrapper script (arch/powerpc/boot/wrapper) to generate target
|
||||
image. The details of the build system is discussed in the next section.
|
||||
Currently, the following image format targets exist:
|
||||
|
||||
==================== ========================================================
|
||||
cuImage.%: Backwards compatible uImage for older version of
|
||||
U-Boot (for versions that don't understand the device
|
||||
tree). This image embeds a device tree blob inside
|
||||
@ -29,31 +32,36 @@ Currently, the following image format targets exist:
|
||||
with boot wrapper code that extracts data from the old
|
||||
bd_info structure and loads the data into the device
|
||||
tree before jumping into the kernel.
|
||||
Because of the series of #ifdefs found in the
|
||||
|
||||
Because of the series of #ifdefs found in the
|
||||
bd_info structure used in the old U-Boot interfaces,
|
||||
cuImages are platform specific. Each specific
|
||||
U-Boot platform has a different platform init file
|
||||
which populates the embedded device tree with data
|
||||
from the platform specific bd_info file. The platform
|
||||
specific cuImage platform init code can be found in
|
||||
arch/powerpc/boot/cuboot.*.c. Selection of the correct
|
||||
`arch/powerpc/boot/cuboot.*.c`. Selection of the correct
|
||||
cuImage init code for a specific board can be found in
|
||||
the wrapper structure.
|
||||
|
||||
dtbImage.%: Similar to zImage, except device tree blob is embedded
|
||||
inside the image instead of provided by firmware. The
|
||||
output image file can be either an elf file or a flat
|
||||
binary depending on the platform.
|
||||
dtbImages are used on systems which do not have an
|
||||
|
||||
dtbImages are used on systems which do not have an
|
||||
interface for passing a device tree directly.
|
||||
dtbImages are similar to simpleImages except that
|
||||
dtbImages have platform specific code for extracting
|
||||
data from the board firmware, but simpleImages do not
|
||||
talk to the firmware at all.
|
||||
PlayStation 3 support uses dtbImage. So do Embedded
|
||||
|
||||
PlayStation 3 support uses dtbImage. So do Embedded
|
||||
Planet boards using the PlanetCore firmware. Board
|
||||
specific initialization code is typically found in a
|
||||
file named arch/powerpc/boot/<platform>.c; but this
|
||||
can be overridden by the wrapper script.
|
||||
|
||||
simpleImage.%: Firmware independent compressed image that does not
|
||||
depend on any particular firmware interface and embeds
|
||||
a device tree blob. This image is a flat binary that
|
||||
@ -61,14 +69,16 @@ Currently, the following image format targets exist:
|
||||
Firmware cannot pass any configuration data to the
|
||||
kernel with this image type and it depends entirely on
|
||||
the embedded device tree for all information.
|
||||
The simpleImage is useful for booting systems with
|
||||
|
||||
The simpleImage is useful for booting systems with
|
||||
an unknown firmware interface or for booting from
|
||||
a debugger when no firmware is present (such as on
|
||||
the Xilinx Virtex platform). The only assumption that
|
||||
simpleImage makes is that RAM is correctly initialized
|
||||
and that the MMU is either off or has RAM mapped to
|
||||
base address 0.
|
||||
simpleImage also supports inserting special platform
|
||||
|
||||
simpleImage also supports inserting special platform
|
||||
specific initialization code to the start of the bootup
|
||||
sequence. The virtex405 platform uses this feature to
|
||||
ensure that the cache is invalidated before caching
|
||||
@ -81,9 +91,11 @@ Currently, the following image format targets exist:
|
||||
named (virtex405-<board>.dts). Search the wrapper
|
||||
script for 'virtex405' and see the file
|
||||
arch/powerpc/boot/virtex405-head.S for details.
|
||||
|
||||
treeImage.%; Image format for used with OpenBIOS firmware found
|
||||
on some ppc4xx hardware. This image embeds a device
|
||||
tree blob inside the image.
|
||||
|
||||
uImage: Native image format used by U-Boot. The uImage target
|
||||
does not add any boot code. It just wraps a compressed
|
||||
vmlinux in the uImage data structure. This image
|
||||
@ -91,12 +103,14 @@ Currently, the following image format targets exist:
|
||||
a device tree to the kernel at boot. If using an older
|
||||
version of U-Boot, then you need to use a cuImage
|
||||
instead.
|
||||
|
||||
zImage.%: Image format which does not embed a device tree.
|
||||
Used by OpenFirmware and other firmware interfaces
|
||||
which are able to supply a device tree. This image
|
||||
expects firmware to provide the device tree at boot.
|
||||
Typically, if you have general purpose PowerPC
|
||||
hardware then you want this image format.
|
||||
==================== ========================================================
|
||||
|
||||
Image types which embed a device tree blob (simpleImage, dtbImage, treeImage,
|
||||
and cuImage) all generate the device tree blob from a file in the
|
@ -1,3 +1,4 @@
|
||||
============
|
||||
CPU Families
|
||||
============
|
||||
|
||||
@ -8,8 +9,8 @@ and are supported by arch/powerpc.
|
||||
Book3S (aka sPAPR)
|
||||
------------------
|
||||
|
||||
- Hash MMU
|
||||
- Mix of 32 & 64 bit
|
||||
- Hash MMU
|
||||
- Mix of 32 & 64 bit::
|
||||
|
||||
+--------------+ +----------------+
|
||||
| Old POWER | --------------> | RS64 (threads) |
|
||||
@ -108,8 +109,8 @@ Book3S (aka sPAPR)
|
||||
IBM BookE
|
||||
---------
|
||||
|
||||
- Software loaded TLB.
|
||||
- All 32 bit
|
||||
- Software loaded TLB.
|
||||
- All 32 bit::
|
||||
|
||||
+--------------+
|
||||
| 401 |
|
||||
@ -155,8 +156,8 @@ IBM BookE
|
||||
Motorola/Freescale 8xx
|
||||
----------------------
|
||||
|
||||
- Software loaded with hardware assist.
|
||||
- All 32 bit
|
||||
- Software loaded with hardware assist.
|
||||
- All 32 bit::
|
||||
|
||||
+-------------+
|
||||
| MPC8xx Core |
|
||||
@ -166,9 +167,9 @@ Motorola/Freescale 8xx
|
||||
Freescale BookE
|
||||
---------------
|
||||
|
||||
- Software loaded TLB.
|
||||
- e6500 adds HW loaded indirect TLB entries.
|
||||
- Mix of 32 & 64 bit
|
||||
- Software loaded TLB.
|
||||
- e6500 adds HW loaded indirect TLB entries.
|
||||
- Mix of 32 & 64 bit::
|
||||
|
||||
+--------------+
|
||||
| e200 |
|
||||
@ -207,8 +208,8 @@ Freescale BookE
|
||||
IBM A2 core
|
||||
-----------
|
||||
|
||||
- Book3E, software loaded TLB + HW loaded indirect TLB entries.
|
||||
- 64 bit
|
||||
- Book3E, software loaded TLB + HW loaded indirect TLB entries.
|
||||
- 64 bit::
|
||||
|
||||
+--------------+ +----------------+
|
||||
| A2 core | --> | WSP |
|
@ -1,3 +1,7 @@
|
||||
============
|
||||
CPU Features
|
||||
============
|
||||
|
||||
Hollis Blanchard <hollis@austin.ibm.com>
|
||||
5 Jun 2002
|
||||
|
||||
@ -32,7 +36,7 @@ anyways).
|
||||
After detecting the processor type, the kernel patches out sections of code
|
||||
that shouldn't be used by writing nop's over it. Using cpufeatures requires
|
||||
just 2 macros (found in arch/powerpc/include/asm/cputable.h), as seen in head.S
|
||||
transfer_to_handler:
|
||||
transfer_to_handler::
|
||||
|
||||
#ifdef CONFIG_ALTIVEC
|
||||
BEGIN_FTR_SECTION
|
@ -1,3 +1,4 @@
|
||||
====================================
|
||||
Coherent Accelerator Interface (CXL)
|
||||
====================================
|
||||
|
||||
@ -21,6 +22,8 @@ Introduction
|
||||
Hardware overview
|
||||
=================
|
||||
|
||||
::
|
||||
|
||||
POWER8/9 FPGA
|
||||
+----------+ +---------+
|
||||
| | | |
|
||||
@ -59,14 +62,16 @@ Hardware overview
|
||||
the fault. The context to which this fault is serviced is based on
|
||||
who owns that acceleration function.
|
||||
|
||||
POWER8 <-----> PSL Version 8 is compliant to the CAIA Version 1.0.
|
||||
POWER9 <-----> PSL Version 9 is compliant to the CAIA Version 2.0.
|
||||
- POWER8 and PSL Version 8 are compliant to the CAIA Version 1.0.
|
||||
- POWER9 and PSL Version 9 are compliant to the CAIA Version 2.0.
|
||||
|
||||
This PSL Version 9 provides new features such as:
|
||||
|
||||
* Interaction with the nest MMU on the P9 chip.
|
||||
* Native DMA support.
|
||||
* Supports sending ASB_Notify messages for host thread wakeup.
|
||||
* Supports Atomic operations.
|
||||
* ....
|
||||
* etc.
|
||||
|
||||
Cards with a PSL9 won't work on a POWER8 system and cards with a
|
||||
PSL8 won't work on a POWER9 system.
|
||||
@ -147,7 +152,9 @@ User API
|
||||
master devices.
|
||||
|
||||
A userspace library libcxl is available here:
|
||||
|
||||
https://github.com/ibm-capi/libcxl
|
||||
|
||||
This provides a C interface to this kernel API.
|
||||
|
||||
open
|
||||
@ -165,7 +172,8 @@ open
|
||||
When all available contexts are allocated the open call will fail
|
||||
and return -ENOSPC.
|
||||
|
||||
Note: IRQs need to be allocated for each context, which may limit
|
||||
Note:
|
||||
IRQs need to be allocated for each context, which may limit
|
||||
the number of contexts that can be created, and therefore
|
||||
how many times the device can be opened. The POWER8 CAPP
|
||||
supports 2040 IRQs and 3 are used by the kernel, so 2037 are
|
||||
@ -186,7 +194,9 @@ ioctl
|
||||
updated as userspace allocates and frees memory. This ioctl
|
||||
returns once the AFU context is started.
|
||||
|
||||
Takes a pointer to a struct cxl_ioctl_start_work:
|
||||
Takes a pointer to a struct cxl_ioctl_start_work
|
||||
|
||||
::
|
||||
|
||||
struct cxl_ioctl_start_work {
|
||||
__u64 flags;
|
||||
@ -269,7 +279,7 @@ read
|
||||
The buffer passed to read() must be at least 4K bytes.
|
||||
|
||||
The result of the read will be a buffer of one or more events,
|
||||
each event is of type struct cxl_event, of varying size.
|
||||
each event is of type struct cxl_event, of varying size::
|
||||
|
||||
struct cxl_event {
|
||||
struct cxl_event_header header;
|
||||
@ -280,7 +290,9 @@ read
|
||||
};
|
||||
};
|
||||
|
||||
The struct cxl_event_header is defined as:
|
||||
The struct cxl_event_header is defined as
|
||||
|
||||
::
|
||||
|
||||
struct cxl_event_header {
|
||||
__u16 type;
|
||||
@ -307,7 +319,9 @@ read
|
||||
For future extensions and padding.
|
||||
|
||||
If the event type is CXL_EVENT_AFU_INTERRUPT then the event
|
||||
structure is defined as:
|
||||
structure is defined as
|
||||
|
||||
::
|
||||
|
||||
struct cxl_event_afu_interrupt {
|
||||
__u16 flags;
|
||||
@ -326,7 +340,9 @@ read
|
||||
For future extensions and padding.
|
||||
|
||||
If the event type is CXL_EVENT_DATA_STORAGE then the event
|
||||
structure is defined as:
|
||||
structure is defined as
|
||||
|
||||
::
|
||||
|
||||
struct cxl_event_data_storage {
|
||||
__u16 flags;
|
||||
@ -356,7 +372,9 @@ read
|
||||
For future extensions
|
||||
|
||||
If the event type is CXL_EVENT_AFU_ERROR then the event structure
|
||||
is defined as:
|
||||
is defined as
|
||||
|
||||
::
|
||||
|
||||
struct cxl_event_afu_error {
|
||||
__u16 flags;
|
||||
@ -393,15 +411,15 @@ open
|
||||
ioctl
|
||||
-----
|
||||
|
||||
CXL_IOCTL_DOWNLOAD_IMAGE:
|
||||
CXL_IOCTL_VALIDATE_IMAGE:
|
||||
CXL_IOCTL_DOWNLOAD_IMAGE / CXL_IOCTL_VALIDATE_IMAGE:
|
||||
Starts and controls flashing a new FPGA image. Partial
|
||||
reconfiguration is not supported (yet), so the image must contain
|
||||
a copy of the PSL and AFU(s). Since an image can be quite large,
|
||||
the caller may have to iterate, splitting the image in smaller
|
||||
chunks.
|
||||
|
||||
Takes a pointer to a struct cxl_adapter_image:
|
||||
Takes a pointer to a struct cxl_adapter_image::
|
||||
|
||||
struct cxl_adapter_image {
|
||||
__u64 flags;
|
||||
__u64 data;
|
||||
@ -442,7 +460,7 @@ Udev rules
|
||||
The following udev rules could be used to create a symlink to the
|
||||
most logical chardev to use in any programming mode (afuX.Yd for
|
||||
dedicated, afuX.Ys for afu directed), since the API is virtually
|
||||
identical for each:
|
||||
identical for each::
|
||||
|
||||
SUBSYSTEM=="cxl", ATTRS{mode}=="dedicated_process", SYMLINK="cxl/%b"
|
||||
SUBSYSTEM=="cxl", ATTRS{mode}=="afu_directed", \
|
@ -1,3 +1,7 @@
|
||||
================================
|
||||
Coherent Accelerator (CXL) Flash
|
||||
================================
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
@ -28,7 +32,7 @@ Introduction
|
||||
responsible for the initialization of the adapter, setting up the
|
||||
special path for user space access, and performing error recovery. It
|
||||
communicates directly the Flash Accelerator Functional Unit (AFU)
|
||||
as described in Documentation/powerpc/cxl.txt.
|
||||
as described in Documentation/powerpc/cxl.rst.
|
||||
|
||||
The cxlflash driver supports two, mutually exclusive, modes of
|
||||
operation at the device (LUN) level:
|
||||
@ -58,7 +62,7 @@ Overview
|
||||
|
||||
The CXL Flash Adapter Driver establishes a master context with the
|
||||
AFU. It uses memory mapped I/O (MMIO) for this control and setup. The
|
||||
Adapter Problem Space Memory Map looks like this:
|
||||
Adapter Problem Space Memory Map looks like this::
|
||||
|
||||
+-------------------------------+
|
||||
| 512 * 64 KB User MMIO |
|
||||
@ -375,7 +379,7 @@ CXL Flash Driver Host IOCTLs
|
||||
Each host adapter instance that is supported by the cxlflash driver
|
||||
has a special character device associated with it to enable a set of
|
||||
host management function. These character devices are hosted in a
|
||||
class dedicated for cxlflash and can be accessed via /dev/cxlflash/*.
|
||||
class dedicated for cxlflash and can be accessed via `/dev/cxlflash/*`.
|
||||
|
||||
Applications can be written to perform various functions using the
|
||||
host ioctl APIs below.
|
@ -1,10 +1,11 @@
|
||||
=====================
|
||||
DAWR issues on POWER9
|
||||
============================
|
||||
=====================
|
||||
|
||||
On POWER9 the Data Address Watchpoint Register (DAWR) can cause a checkstop
|
||||
if it points to cache inhibited (CI) memory. Currently Linux has no way to
|
||||
disinguish CI memory when configuring the DAWR, so (for now) the DAWR is
|
||||
disabled by this commit:
|
||||
disabled by this commit::
|
||||
|
||||
commit 9654153158d3e0684a1bdb76dbababdb7111d5a0
|
||||
Author: Michael Neuling <mikey@neuling.org>
|
||||
@ -12,7 +13,7 @@ disabled by this commit:
|
||||
powerpc: Disable DAWR in the base POWER9 CPU features
|
||||
|
||||
Technical Details:
|
||||
============================
|
||||
==================
|
||||
|
||||
DAWR has 6 different ways of being set.
|
||||
1) ptrace
|
||||
@ -37,7 +38,7 @@ DAWR on the migration.
|
||||
For xmon, the 'bd' command will return an error on P9.
|
||||
|
||||
Consequences for users
|
||||
============================
|
||||
======================
|
||||
|
||||
For GDB watchpoints (ie 'watch' command) on POWER9 bare metal , GDB
|
||||
will accept the command. Unfortunately since there is no hardware
|
||||
@ -57,8 +58,8 @@ trapped in GDB. The watchpoint is remembered, so if the guest is
|
||||
migrated back to the POWER8 host, it will start working again.
|
||||
|
||||
Force enabling the DAWR
|
||||
=============================
|
||||
Kernels (since ~v5.2) have an option to force enable the DAWR via:
|
||||
=======================
|
||||
Kernels (since ~v5.2) have an option to force enable the DAWR via::
|
||||
|
||||
echo Y > /sys/kernel/debug/powerpc/dawr_enable_dangerous
|
||||
|
||||
@ -86,5 +87,7 @@ dawr_enable_dangerous file will fail if the hypervisor doesn't support
|
||||
writing the DAWR.
|
||||
|
||||
To double check the DAWR is working, run this kernel selftest:
|
||||
|
||||
tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
|
||||
|
||||
Any errors/failures/skips mean something is wrong.
|
@ -1,5 +1,6 @@
|
||||
DSCR (Data Stream Control Register)
|
||||
================================================
|
||||
===================================
|
||||
DSCR (Data Stream Control Register)
|
||||
===================================
|
||||
|
||||
DSCR register in powerpc allows user to have some control of prefetch of data
|
||||
stream in the processor. Please refer to the ISA documents or related manual
|
||||
@ -10,14 +11,17 @@ user interface.
|
||||
|
||||
(A) Data Structures:
|
||||
|
||||
(1) thread_struct:
|
||||
(1) thread_struct::
|
||||
|
||||
dscr /* Thread DSCR value */
|
||||
dscr_inherit /* Thread has changed default DSCR */
|
||||
|
||||
(2) PACA:
|
||||
(2) PACA::
|
||||
|
||||
dscr_default /* per-CPU DSCR default value */
|
||||
|
||||
(3) sysfs.c:
|
||||
(3) sysfs.c::
|
||||
|
||||
dscr_default /* System DSCR default value */
|
||||
|
||||
(B) Scheduler Changes:
|
||||
@ -35,8 +39,8 @@ user interface.
|
||||
|
||||
(C) SYSFS Interface:
|
||||
|
||||
Global DSCR default: /sys/devices/system/cpu/dscr_default
|
||||
CPU specific DSCR default: /sys/devices/system/cpu/cpuN/dscr
|
||||
- Global DSCR default: /sys/devices/system/cpu/dscr_default
|
||||
- CPU specific DSCR default: /sys/devices/system/cpu/cpuN/dscr
|
||||
|
||||
Changing the global DSCR default in the sysfs will change all the CPU
|
||||
specific DSCR defaults immediately in their PACA structures. Again if
|
@ -1,10 +1,10 @@
|
||||
==========================
|
||||
PCI Bus EEH Error Recovery
|
||||
==========================
|
||||
|
||||
Linas Vepstas <linas@austin.ibm.com>
|
||||
|
||||
PCI Bus EEH Error Recovery
|
||||
--------------------------
|
||||
Linas Vepstas
|
||||
<linas@austin.ibm.com>
|
||||
12 January 2005
|
||||
12 January 2005
|
||||
|
||||
|
||||
Overview:
|
||||
@ -143,17 +143,17 @@ seen in /proc/ppc64/eeh (subject to change). Normally, almost
|
||||
all of these occur during boot, when the PCI bus is scanned, where
|
||||
a large number of 0xff reads are part of the bus scan procedure.
|
||||
|
||||
If a frozen slot is detected, code in
|
||||
arch/powerpc/platforms/pseries/eeh.c will print a stack trace to
|
||||
syslog (/var/log/messages). This stack trace has proven to be very
|
||||
useful to device-driver authors for finding out at what point the EEH
|
||||
error was detected, as the error itself usually occurs slightly
|
||||
If a frozen slot is detected, code in
|
||||
arch/powerpc/platforms/pseries/eeh.c will print a stack trace to
|
||||
syslog (/var/log/messages). This stack trace has proven to be very
|
||||
useful to device-driver authors for finding out at what point the EEH
|
||||
error was detected, as the error itself usually occurs slightly
|
||||
beforehand.
|
||||
|
||||
Next, it uses the Linux kernel notifier chain/work queue mechanism to
|
||||
allow any interested parties to find out about the failure. Device
|
||||
drivers, or other parts of the kernel, can use
|
||||
eeh_register_notifier(struct notifier_block *) to find out about EEH
|
||||
`eeh_register_notifier(struct notifier_block *)` to find out about EEH
|
||||
events. The event will include a pointer to the pci device, the
|
||||
device node and some state info. Receivers of the event can "do as
|
||||
they wish"; the default handler will be described further in this
|
||||
@ -162,10 +162,13 @@ section.
|
||||
To assist in the recovery of the device, eeh.c exports the
|
||||
following functions:
|
||||
|
||||
rtas_set_slot_reset() -- assert the PCI #RST line for 1/8th of a second
|
||||
rtas_configure_bridge() -- ask firmware to configure any PCI bridges
|
||||
rtas_set_slot_reset()
|
||||
assert the PCI #RST line for 1/8th of a second
|
||||
rtas_configure_bridge()
|
||||
ask firmware to configure any PCI bridges
|
||||
located topologically under the pci slot.
|
||||
eeh_save_bars() and eeh_restore_bars(): save and restore the PCI
|
||||
eeh_save_bars() and eeh_restore_bars():
|
||||
save and restore the PCI
|
||||
config-space info for a device and any devices under it.
|
||||
|
||||
|
||||
@ -191,7 +194,7 @@ events get delivered to user-space scripts.
|
||||
|
||||
Following is an example sequence of events that cause a device driver
|
||||
close function to be called during the first phase of an EEH reset.
|
||||
The following sequence is an example of the pcnet32 device driver.
|
||||
The following sequence is an example of the pcnet32 device driver::
|
||||
|
||||
rpa_php_unconfig_pci_adapter (struct slot *) // in rpaphp_pci.c
|
||||
{
|
||||
@ -241,53 +244,54 @@ The following sequence is an example of the pcnet32 device driver.
|
||||
}}}}}}
|
||||
|
||||
|
||||
in drivers/pci/pci_driver.c,
|
||||
struct device_driver->remove() is just pci_device_remove()
|
||||
which calls struct pci_driver->remove() which is pcnet32_remove_one()
|
||||
which calls unregister_netdev() (in net/core/dev.c)
|
||||
which calls dev_close() (in net/core/dev.c)
|
||||
which calls dev->stop() which is pcnet32_close()
|
||||
which then does the appropriate shutdown.
|
||||
in drivers/pci/pci_driver.c,
|
||||
struct device_driver->remove() is just pci_device_remove()
|
||||
which calls struct pci_driver->remove() which is pcnet32_remove_one()
|
||||
which calls unregister_netdev() (in net/core/dev.c)
|
||||
which calls dev_close() (in net/core/dev.c)
|
||||
which calls dev->stop() which is pcnet32_close()
|
||||
which then does the appropriate shutdown.
|
||||
|
||||
---
|
||||
Following is the analogous stack trace for events sent to user-space
|
||||
when the pci device is unconfigured.
|
||||
|
||||
rpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c
|
||||
calls
|
||||
pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
|
||||
Following is the analogous stack trace for events sent to user-space
|
||||
when the pci device is unconfigured::
|
||||
|
||||
rpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c
|
||||
calls
|
||||
pci_destroy_dev (struct pci_dev *) {
|
||||
pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
|
||||
calls
|
||||
device_unregister (&dev->dev) { // in /drivers/base/core.c
|
||||
pci_destroy_dev (struct pci_dev *) {
|
||||
calls
|
||||
device_del(struct device * dev) { // in /drivers/base/core.c
|
||||
device_unregister (&dev->dev) { // in /drivers/base/core.c
|
||||
calls
|
||||
kobject_del() { //in /libs/kobject.c
|
||||
device_del(struct device * dev) { // in /drivers/base/core.c
|
||||
calls
|
||||
kobject_uevent() { // in /libs/kobject.c
|
||||
kobject_del() { //in /libs/kobject.c
|
||||
calls
|
||||
kset_uevent() { // in /lib/kobject.c
|
||||
kobject_uevent() { // in /libs/kobject.c
|
||||
calls
|
||||
kset->uevent_ops->uevent() // which is really just
|
||||
a call to
|
||||
dev_uevent() { // in /drivers/base/core.c
|
||||
kset_uevent() { // in /lib/kobject.c
|
||||
calls
|
||||
dev->bus->uevent() which is really just a call to
|
||||
pci_uevent () { // in drivers/pci/hotplug.c
|
||||
which prints device name, etc....
|
||||
kset->uevent_ops->uevent() // which is really just
|
||||
a call to
|
||||
dev_uevent() { // in /drivers/base/core.c
|
||||
calls
|
||||
dev->bus->uevent() which is really just a call to
|
||||
pci_uevent () { // in drivers/pci/hotplug.c
|
||||
which prints device name, etc....
|
||||
}
|
||||
}
|
||||
}
|
||||
then kobject_uevent() sends a netlink uevent to userspace
|
||||
--> userspace uevent
|
||||
(during early boot, nobody listens to netlink events and
|
||||
kobject_uevent() executes uevent_helper[], which runs the
|
||||
event process /sbin/hotplug)
|
||||
then kobject_uevent() sends a netlink uevent to userspace
|
||||
--> userspace uevent
|
||||
(during early boot, nobody listens to netlink events and
|
||||
kobject_uevent() executes uevent_helper[], which runs the
|
||||
event process /sbin/hotplug)
|
||||
}
|
||||
}
|
||||
}
|
||||
kobject_del() then calls sysfs_remove_dir(), which would
|
||||
trigger any user-space daemon that was watching /sysfs,
|
||||
and notice the delete event.
|
||||
kobject_del() then calls sysfs_remove_dir(), which would
|
||||
trigger any user-space daemon that was watching /sysfs,
|
||||
and notice the delete event.
|
||||
|
||||
|
||||
Pro's and Con's of the Current Design
|
||||
@ -299,12 +303,12 @@ individual device drivers, so that the current design throws a wide net.
|
||||
The biggest negative of the design is that it potentially disturbs
|
||||
network daemons and file systems that didn't need to be disturbed.
|
||||
|
||||
-- A minor complaint is that resetting the network card causes
|
||||
- A minor complaint is that resetting the network card causes
|
||||
user-space back-to-back ifdown/ifup burps that potentially disturb
|
||||
network daemons, that didn't need to even know that the pci
|
||||
card was being rebooted.
|
||||
|
||||
-- A more serious concern is that the same reset, for SCSI devices,
|
||||
- A more serious concern is that the same reset, for SCSI devices,
|
||||
causes havoc to mounted file systems. Scripts cannot post-facto
|
||||
unmount a file system without flushing pending buffers, but this
|
||||
is impossible, because I/O has already been stopped. Thus,
|
||||
@ -322,7 +326,7 @@ network daemons and file systems that didn't need to be disturbed.
|
||||
from the block layer. It would be very natural to add an EEH
|
||||
reset into this chain of events.
|
||||
|
||||
-- If a SCSI error occurs for the root device, all is lost unless
|
||||
- If a SCSI error occurs for the root device, all is lost unless
|
||||
the sysadmin had the foresight to run /bin, /sbin, /etc, /var
|
||||
and so on, out of ramdisk/tmpfs.
|
||||
|
||||
@ -330,5 +334,3 @@ network daemons and file systems that didn't need to be disturbed.
|
||||
Conclusions
|
||||
-----------
|
||||
There's forward progress ...
|
||||
|
||||
|
@ -1,7 +1,8 @@
|
||||
======================
|
||||
Firmware-Assisted Dump
|
||||
======================
|
||||
|
||||
Firmware-Assisted Dump
|
||||
------------------------
|
||||
July 2011
|
||||
July 2011
|
||||
|
||||
The goal of firmware-assisted dump is to enable the dump of
|
||||
a crashed system, and to do so from a fully-reset system, and
|
||||
@ -27,11 +28,11 @@ in production use.
|
||||
Comparing with kdump or other strategies, firmware-assisted
|
||||
dump offers several strong, practical advantages:
|
||||
|
||||
-- Unlike kdump, the system has been reset, and loaded
|
||||
- Unlike kdump, the system has been reset, and loaded
|
||||
with a fresh copy of the kernel. In particular,
|
||||
PCI and I/O devices have been reinitialized and are
|
||||
in a clean, consistent state.
|
||||
-- Once the dump is copied out, the memory that held the dump
|
||||
- Once the dump is copied out, the memory that held the dump
|
||||
is immediately available to the running kernel. And therefore,
|
||||
unlike kdump, fadump doesn't need a 2nd reboot to get back
|
||||
the system to the production configuration.
|
||||
@ -40,17 +41,18 @@ The above can only be accomplished by coordination with,
|
||||
and assistance from the Power firmware. The procedure is
|
||||
as follows:
|
||||
|
||||
-- The first kernel registers the sections of memory with the
|
||||
- The first kernel registers the sections of memory with the
|
||||
Power firmware for dump preservation during OS initialization.
|
||||
These registered sections of memory are reserved by the first
|
||||
kernel during early boot.
|
||||
|
||||
-- When a system crashes, the Power firmware will save
|
||||
- When a system crashes, the Power firmware will save
|
||||
the low memory (boot memory of size larger of 5% of system RAM
|
||||
or 256MB) of RAM to the previous registered region. It will
|
||||
also save system registers, and hardware PTE's.
|
||||
|
||||
NOTE: The term 'boot memory' means size of the low memory chunk
|
||||
NOTE:
|
||||
The term 'boot memory' means size of the low memory chunk
|
||||
that is required for a kernel to boot successfully when
|
||||
booted with restricted memory. By default, the boot memory
|
||||
size will be the larger of 5% of system RAM or 256MB.
|
||||
@ -64,12 +66,12 @@ as follows:
|
||||
as fadump uses a predefined offset to reserve memory
|
||||
for boot memory dump preservation in case of a crash.
|
||||
|
||||
-- After the low memory (boot memory) area has been saved, the
|
||||
- After the low memory (boot memory) area has been saved, the
|
||||
firmware will reset PCI and other hardware state. It will
|
||||
*not* clear the RAM. It will then launch the bootloader, as
|
||||
normal.
|
||||
|
||||
-- The freshly booted kernel will notice that there is a new
|
||||
- The freshly booted kernel will notice that there is a new
|
||||
node (ibm,dump-kernel) in the device tree, indicating that
|
||||
there is crash data available from a previous boot. During
|
||||
the early boot OS will reserve rest of the memory above
|
||||
@ -77,17 +79,18 @@ as follows:
|
||||
size. This will make sure that the second kernel will not
|
||||
touch any of the dump memory area.
|
||||
|
||||
-- User-space tools will read /proc/vmcore to obtain the contents
|
||||
- User-space tools will read /proc/vmcore to obtain the contents
|
||||
of memory, which holds the previous crashed kernel dump in ELF
|
||||
format. The userspace tools may copy this info to disk, or
|
||||
network, nas, san, iscsi, etc. as desired.
|
||||
|
||||
-- Once the userspace tool is done saving dump, it will echo
|
||||
- Once the userspace tool is done saving dump, it will echo
|
||||
'1' to /sys/kernel/fadump_release_mem to release the reserved
|
||||
memory back to general use, except the memory required for
|
||||
next firmware-assisted dump registration.
|
||||
|
||||
e.g.
|
||||
e.g.::
|
||||
|
||||
# echo 1 > /sys/kernel/fadump_release_mem
|
||||
|
||||
Please note that the firmware-assisted dump feature
|
||||
@ -95,7 +98,7 @@ is only available on Power6 and above systems with recent
|
||||
firmware versions.
|
||||
|
||||
Implementation details:
|
||||
----------------------
|
||||
-----------------------
|
||||
|
||||
During boot, a check is made to see if firmware supports
|
||||
this feature on that particular machine. If it does, then
|
||||
@ -121,7 +124,7 @@ Allocator (CMA) for memory reservation if CMA is configured for kernel.
|
||||
With CMA reservation this memory will be available for applications to
|
||||
use it, while kernel is prevented from using it. With this fadump will
|
||||
still be able to capture all of the kernel memory and most of the user
|
||||
space memory except the user pages that were present in CMA region.
|
||||
space memory except the user pages that were present in CMA region::
|
||||
|
||||
o Memory Reservation during first kernel
|
||||
|
||||
@ -166,7 +169,7 @@ The tools to examine the dump will be same as the ones
|
||||
used for kdump.
|
||||
|
||||
How to enable firmware-assisted dump (fadump):
|
||||
-------------------------------------
|
||||
----------------------------------------------
|
||||
|
||||
1. Set config option CONFIG_FA_DUMP=y and build kernel.
|
||||
2. Boot into linux kernel with 'fadump=on' kernel cmdline option.
|
||||
@ -177,19 +180,20 @@ How to enable firmware-assisted dump (fadump):
|
||||
to specify size of the memory to reserve for boot memory dump
|
||||
preservation.
|
||||
|
||||
NOTE: 1. 'fadump_reserve_mem=' parameter has been deprecated. Instead
|
||||
use 'crashkernel=' to specify size of the memory to reserve
|
||||
for boot memory dump preservation.
|
||||
2. If firmware-assisted dump fails to reserve memory then it
|
||||
will fallback to existing kdump mechanism if 'crashkernel='
|
||||
option is set at kernel cmdline.
|
||||
3. if user wants to capture all of user space memory and ok with
|
||||
reserved memory not available to production system, then
|
||||
'fadump=nocma' kernel parameter can be used to fallback to
|
||||
old behaviour.
|
||||
NOTE:
|
||||
1. 'fadump_reserve_mem=' parameter has been deprecated. Instead
|
||||
use 'crashkernel=' to specify size of the memory to reserve
|
||||
for boot memory dump preservation.
|
||||
2. If firmware-assisted dump fails to reserve memory then it
|
||||
will fallback to existing kdump mechanism if 'crashkernel='
|
||||
option is set at kernel cmdline.
|
||||
3. if user wants to capture all of user space memory and ok with
|
||||
reserved memory not available to production system, then
|
||||
'fadump=nocma' kernel parameter can be used to fallback to
|
||||
old behaviour.
|
||||
|
||||
Sysfs/debugfs files:
|
||||
------------
|
||||
--------------------
|
||||
|
||||
Firmware-assisted dump feature uses sysfs file system to hold
|
||||
the control files and debugfs file to display memory reserved region.
|
||||
@ -197,20 +201,20 @@ the control files and debugfs file to display memory reserved region.
|
||||
Here is the list of files under kernel sysfs:
|
||||
|
||||
/sys/kernel/fadump_enabled
|
||||
|
||||
This is used to display the fadump status.
|
||||
0 = fadump is disabled
|
||||
1 = fadump is enabled
|
||||
|
||||
- 0 = fadump is disabled
|
||||
- 1 = fadump is enabled
|
||||
|
||||
This interface can be used by kdump init scripts to identify if
|
||||
fadump is enabled in the kernel and act accordingly.
|
||||
|
||||
/sys/kernel/fadump_registered
|
||||
|
||||
This is used to display the fadump registration status as well
|
||||
as to control (start/stop) the fadump registration.
|
||||
0 = fadump is not registered.
|
||||
1 = fadump is registered and ready to handle system crash.
|
||||
|
||||
- 0 = fadump is not registered.
|
||||
- 1 = fadump is registered and ready to handle system crash.
|
||||
|
||||
To register fadump echo 1 > /sys/kernel/fadump_registered and
|
||||
echo 0 > /sys/kernel/fadump_registered for un-register and stop the
|
||||
@ -219,13 +223,12 @@ Here is the list of files under kernel sysfs:
|
||||
easily integrated with kdump service start/stop.
|
||||
|
||||
/sys/kernel/fadump_release_mem
|
||||
|
||||
This file is available only when fadump is active during
|
||||
second kernel. This is used to release the reserved memory
|
||||
region that are held for saving crash dump. To release the
|
||||
reserved memory echo 1 to it:
|
||||
reserved memory echo 1 to it::
|
||||
|
||||
echo 1 > /sys/kernel/fadump_release_mem
|
||||
echo 1 > /sys/kernel/fadump_release_mem
|
||||
|
||||
After echo 1, the content of the /sys/kernel/debug/powerpc/fadump_region
|
||||
file will change to reflect the new memory reservations.
|
||||
@ -238,38 +241,39 @@ Here is the list of files under powerpc debugfs:
|
||||
(Assuming debugfs is mounted on /sys/kernel/debug directory.)
|
||||
|
||||
/sys/kernel/debug/powerpc/fadump_region
|
||||
|
||||
This file shows the reserved memory regions if fadump is
|
||||
enabled otherwise this file is empty. The output format
|
||||
is:
|
||||
<region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
|
||||
is::
|
||||
|
||||
<region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
|
||||
|
||||
e.g.
|
||||
Contents when fadump is registered during first kernel
|
||||
Contents when fadump is registered during first kernel::
|
||||
|
||||
# cat /sys/kernel/debug/powerpc/fadump_region
|
||||
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x0
|
||||
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x0
|
||||
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x0
|
||||
# cat /sys/kernel/debug/powerpc/fadump_region
|
||||
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x0
|
||||
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x0
|
||||
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x0
|
||||
|
||||
Contents when fadump is active during second kernel
|
||||
Contents when fadump is active during second kernel::
|
||||
|
||||
# cat /sys/kernel/debug/powerpc/fadump_region
|
||||
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x40020
|
||||
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x1000
|
||||
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x10000000
|
||||
: [0x00000010000000-0x0000006ffaffff] 0x5ffb0000 bytes, Dumped: 0x5ffb0000
|
||||
# cat /sys/kernel/debug/powerpc/fadump_region
|
||||
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x40020
|
||||
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x1000
|
||||
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x10000000
|
||||
: [0x00000010000000-0x0000006ffaffff] 0x5ffb0000 bytes, Dumped: 0x5ffb0000
|
||||
|
||||
NOTE: Please refer to Documentation/filesystems/debugfs.txt on
|
||||
NOTE:
|
||||
Please refer to Documentation/filesystems/debugfs.txt on
|
||||
how to mount the debugfs filesystem.
|
||||
|
||||
|
||||
TODO:
|
||||
-----
|
||||
o Need to come up with the better approach to find out more
|
||||
- Need to come up with the better approach to find out more
|
||||
accurate boot memory size that is required for a kernel to
|
||||
boot successfully when booted with restricted memory.
|
||||
o The fadump implementation introduces a fadump crash info structure
|
||||
- The fadump implementation introduces a fadump crash info structure
|
||||
in the scratch area before the ELF core header. The idea of introducing
|
||||
this structure is to pass some important crash info data to the second
|
||||
kernel which will help second kernel to populate ELF core header with
|
||||
@ -277,7 +281,9 @@ TODO:
|
||||
design implementation does not address a possibility of introducing
|
||||
additional fields (in future) to this structure without affecting
|
||||
compatibility. Need to come up with the better approach to address this.
|
||||
|
||||
The possible approaches are:
|
||||
|
||||
1. Introduce version field for version tracking, bump up the version
|
||||
whenever a new field is added to the structure in future. The version
|
||||
field can be used to find out what fields are valid for the current
|
||||
@ -285,8 +291,11 @@ TODO:
|
||||
2. Reserve the area of predefined size (say PAGE_SIZE) for this
|
||||
structure and have unused area as reserved (initialized to zero)
|
||||
for future field additions.
|
||||
|
||||
The advantage of approach 1 over 2 is we don't need to reserve extra space.
|
||||
---
|
||||
|
||||
Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
|
||||
|
||||
This document is based on the original documentation written for phyp
|
||||
|
||||
assisted dump by Linas Vepstas and Manish Ahuja.
|
@ -1,19 +1,22 @@
|
||||
===========================================================================
|
||||
HVCS
|
||||
IBM "Hypervisor Virtual Console Server" Installation Guide
|
||||
for Linux Kernel 2.6.4+
|
||||
Copyright (C) 2004 IBM Corporation
|
||||
===============================================================
|
||||
HVCS IBM "Hypervisor Virtual Console Server" Installation Guide
|
||||
===============================================================
|
||||
|
||||
===========================================================================
|
||||
NOTE:Eight space tabs are the optimum editor setting for reading this file.
|
||||
===========================================================================
|
||||
for Linux Kernel 2.6.4+
|
||||
|
||||
Author(s) : Ryan S. Arnold <rsa@us.ibm.com>
|
||||
Date Created: March, 02, 2004
|
||||
Last Changed: August, 24, 2004
|
||||
Copyright (C) 2004 IBM Corporation
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
Table of contents:
|
||||
.. ===========================================================================
|
||||
.. NOTE:Eight space tabs are the optimum editor setting for reading this file.
|
||||
.. ===========================================================================
|
||||
|
||||
|
||||
Author(s): Ryan S. Arnold <rsa@us.ibm.com>
|
||||
|
||||
Date Created: March, 02, 2004
|
||||
Last Changed: August, 24, 2004
|
||||
|
||||
.. Table of contents:
|
||||
|
||||
1. Driver Introduction:
|
||||
2. System Requirements
|
||||
@ -27,8 +30,8 @@ Table of contents:
|
||||
8. Questions & Answers:
|
||||
9. Reporting Bugs:
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
1. Driver Introduction:
|
||||
=======================
|
||||
|
||||
This is the device driver for the IBM Hypervisor Virtual Console Server,
|
||||
"hvcs". The IBM hvcs provides a tty driver interface to allow Linux user
|
||||
@ -38,8 +41,8 @@ ppc64 system. Physical hardware consoles per partition are not practical
|
||||
on this hardware so system consoles are accessed by this driver using
|
||||
firmware interfaces to virtual terminal devices.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
2. System Requirements:
|
||||
=======================
|
||||
|
||||
This device driver was written using 2.6.4 Linux kernel APIs and will only
|
||||
build and run on kernels of this version or later.
|
||||
@ -52,8 +55,8 @@ Sysfs must be mounted on the system so that the user can determine which
|
||||
major and minor numbers are associated with each vty-server. Directions
|
||||
for sysfs mounting are outside the scope of this document.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
3. Build Options:
|
||||
=================
|
||||
|
||||
The hvcs driver registers itself as a tty driver. The tty layer
|
||||
dynamically allocates a block of major and minor numbers in a quantity
|
||||
@ -65,11 +68,11 @@ If the default number of device entries is adequate then this driver can be
|
||||
built into the kernel. If not, the default can be over-ridden by inserting
|
||||
the driver as a module with insmod parameters.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
3.1 Built-in:
|
||||
-------------
|
||||
|
||||
The following menuconfig example demonstrates selecting to build this
|
||||
driver into the kernel.
|
||||
driver into the kernel::
|
||||
|
||||
Device Drivers --->
|
||||
Character devices --->
|
||||
@ -77,11 +80,11 @@ driver into the kernel.
|
||||
|
||||
Begin the kernel make process.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
3.2 Module:
|
||||
-----------
|
||||
|
||||
The following menuconfig example demonstrates selecting to build this
|
||||
driver as a kernel module.
|
||||
driver as a kernel module::
|
||||
|
||||
Device Drivers --->
|
||||
Character devices --->
|
||||
@ -89,11 +92,11 @@ driver as a kernel module.
|
||||
|
||||
The make process will build the following kernel modules:
|
||||
|
||||
hvcs.ko
|
||||
hvcserver.ko
|
||||
- hvcs.ko
|
||||
- hvcserver.ko
|
||||
|
||||
To insert the module with the default allocation execute the following
|
||||
commands in the order they appear:
|
||||
commands in the order they appear::
|
||||
|
||||
insmod hvcserver.ko
|
||||
insmod hvcs.ko
|
||||
@ -103,7 +106,7 @@ be inserted first, otherwise the hvcs module will not find some of the
|
||||
symbols it expects.
|
||||
|
||||
To override the default use an insmod parameter as follows (requesting 4
|
||||
tty devices as an example):
|
||||
tty devices as an example)::
|
||||
|
||||
insmod hvcs.ko hvcs_parm_num_devs=4
|
||||
|
||||
@ -115,31 +118,31 @@ source file before building.
|
||||
NOTE: The length of time it takes to insmod the driver seems to be related
|
||||
to the number of tty interfaces the registering driver requests.
|
||||
|
||||
In order to remove the driver module execute the following command:
|
||||
In order to remove the driver module execute the following command::
|
||||
|
||||
rmmod hvcs.ko
|
||||
|
||||
The recommended method for installing hvcs as a module is to use depmod to
|
||||
build a current modules.dep file in /lib/modules/`uname -r` and then
|
||||
execute:
|
||||
execute::
|
||||
|
||||
modprobe hvcs hvcs_parm_num_devs=4
|
||||
modprobe hvcs hvcs_parm_num_devs=4
|
||||
|
||||
The modules.dep file indicates that hvcserver.ko needs to be inserted
|
||||
before hvcs.ko and modprobe uses this file to smartly insert the modules in
|
||||
the proper order.
|
||||
|
||||
The following modprobe command is used to remove hvcs and hvcserver in the
|
||||
proper order:
|
||||
proper order::
|
||||
|
||||
modprobe -r hvcs
|
||||
modprobe -r hvcs
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
4. Installation:
|
||||
================
|
||||
|
||||
The tty layer creates sysfs entries which contain the major and minor
|
||||
numbers allocated for the hvcs driver. The following snippet of "tree"
|
||||
output of the sysfs directory shows where these numbers are presented:
|
||||
output of the sysfs directory shows where these numbers are presented::
|
||||
|
||||
sys/
|
||||
|-- *other sysfs base dirs*
|
||||
@ -164,7 +167,7 @@ output of the sysfs directory shows where these numbers are presented:
|
||||
|-- *other sysfs base dirs*
|
||||
|
||||
For the above examples the following output is a result of cat'ing the
|
||||
"dev" entry in the hvcs directory:
|
||||
"dev" entry in the hvcs directory::
|
||||
|
||||
Pow5:/sys/class/tty/hvcs0/ # cat dev
|
||||
254:0
|
||||
@ -184,7 +187,7 @@ systems running hvcs will already have the device entries created or udev
|
||||
will do it automatically.
|
||||
|
||||
Given the example output above, to manually create a /dev/hvcs* node entry
|
||||
mknod can be used as follows:
|
||||
mknod can be used as follows::
|
||||
|
||||
mknod /dev/hvcs0 c 254 0
|
||||
mknod /dev/hvcs1 c 254 1
|
||||
@ -195,15 +198,15 @@ Using mknod to manually create the device entries makes these device nodes
|
||||
persistent. Once created they will exist prior to the driver insmod.
|
||||
|
||||
Attempting to connect an application to /dev/hvcs* prior to insertion of
|
||||
the hvcs module will result in an error message similar to the following:
|
||||
the hvcs module will result in an error message similar to the following::
|
||||
|
||||
"/dev/hvcs*: No such device".
|
||||
|
||||
NOTE: Just because there is a device node present doesn't mean that there
|
||||
is a vty-server device configured for that node.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
5. Connection
|
||||
=============
|
||||
|
||||
Since this driver controls devices that provide a tty interface a user can
|
||||
interact with the device node entries using any standard tty-interactive
|
||||
@ -249,7 +252,7 @@ vty-server adapter is associated with which /dev/hvcs* node a special sysfs
|
||||
attribute has been added to each vty-server sysfs entry. This entry is
|
||||
called "index" and showing it reveals an integer that refers to the
|
||||
/dev/hvcs* entry to use to connect to that device. For instance cating the
|
||||
index attribute of vty-server adapter 30000004 shows the following.
|
||||
index attribute of vty-server adapter 30000004 shows the following::
|
||||
|
||||
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat index
|
||||
2
|
||||
@ -262,8 +265,8 @@ system the /dev/hvcs* entry that interacts with a particular vty-server
|
||||
adapter is not guaranteed to remain the same across system reboots. Look
|
||||
in the Q & A section for more on this issue.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
6. Disconnection
|
||||
================
|
||||
|
||||
As a security feature to prevent the delivery of stale data to an
|
||||
unintended target the Power5 system firmware disables the fetching of data
|
||||
@ -305,7 +308,7 @@ connection between the vty-server and target vty ONLY if the vterm_state
|
||||
previously read '1'. The write directive is ignored if the vterm_state
|
||||
read '0' or if any value other than '0' was written to the vterm_state
|
||||
attribute. The following example will show the method used for verifying
|
||||
the vty-server connection status and disconnecting a vty-server connection.
|
||||
the vty-server connection status and disconnecting a vty-server connection::
|
||||
|
||||
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat vterm_state
|
||||
1
|
||||
@ -318,12 +321,12 @@ the vty-server connection status and disconnecting a vty-server connection.
|
||||
All vty-server connections are automatically terminated when the device is
|
||||
hotplug removed and when the module is removed.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
7. Configuration
|
||||
================
|
||||
|
||||
Each vty-server has a sysfs entry in the /sys/devices/vio directory, which
|
||||
is symlinked in several other sysfs tree directories, notably under the
|
||||
hvcs driver entry, which looks like the following example:
|
||||
hvcs driver entry, which looks like the following example::
|
||||
|
||||
Pow5:/sys/bus/vio/drivers/hvcs # ls
|
||||
. .. 30000003 30000004 rescan
|
||||
@ -344,7 +347,7 @@ completed or was never executed.
|
||||
|
||||
Vty-server entries in this directory are a 32 bit partition unique unit
|
||||
address that is created by firmware. An example vty-server sysfs entry
|
||||
looks like the following:
|
||||
looks like the following::
|
||||
|
||||
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # ls
|
||||
. current_vty devspec name partner_vtys
|
||||
@ -352,21 +355,21 @@ looks like the following:
|
||||
|
||||
Each entry is provided, by default with a "name" attribute. Reading the
|
||||
"name" attribute will reveal the device type as shown in the following
|
||||
example:
|
||||
example::
|
||||
|
||||
Pow5:/sys/bus/vio/drivers/hvcs/30000003 # cat name
|
||||
vty-server
|
||||
|
||||
Each entry is also provided, by default, with a "devspec" attribute which
|
||||
reveals the full device specification when read, as shown in the following
|
||||
example:
|
||||
example::
|
||||
|
||||
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat devspec
|
||||
/vdevice/vty-server@30000004
|
||||
|
||||
Each vty-server sysfs dir is provided with two read-only attributes that
|
||||
provide lists of easily parsed partner vty data: "partner_vtys" and
|
||||
"partner_clcs".
|
||||
"partner_clcs"::
|
||||
|
||||
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat partner_vtys
|
||||
30000000
|
||||
@ -396,7 +399,7 @@ A vty-server can only be connected to a single vty at a time. The entry,
|
||||
read.
|
||||
|
||||
The current_vty can be changed by writing a valid partner clc to the entry
|
||||
as in the following example:
|
||||
as in the following example::
|
||||
|
||||
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # echo U5112.428.10304
|
||||
8A-V4-C0 > current_vty
|
||||
@ -408,9 +411,9 @@ currently open connection is freed.
|
||||
Information on the "vterm_state" attribute was covered earlier on the
|
||||
chapter entitled "disconnection".
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
8. Questions & Answers:
|
||||
===========================================================================
|
||||
=======================
|
||||
|
||||
Q: What are the security concerns involving hvcs?
|
||||
|
||||
A: There are three main security concerns:
|
||||
@ -429,6 +432,7 @@ A: There are three main security concerns:
|
||||
partition) will experience the previously logged in session.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: How do I multiplex a console that I grab through hvcs so that other
|
||||
people can see it:
|
||||
|
||||
@ -440,6 +444,7 @@ term type "screen" to others. This means that curses based programs may
|
||||
not display properly in screen sessions.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: Why are the colors all messed up?
|
||||
Q: Why are the control characters acting strange or not working?
|
||||
Q: Why is the console output all strange and unintelligible?
|
||||
@ -455,6 +460,7 @@ disconnect from the console. This will ensure that the next user gets
|
||||
their own TERM type set when they login.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: When I try to CONNECT kermit to an hvcs device I get:
|
||||
"Sorry, can't open connection: /dev/hvcs*"What is happening?
|
||||
|
||||
@ -490,6 +496,7 @@ A: There is not a corresponding vty-server device that maps to an existing
|
||||
/dev/hvcs* entry.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: When I try to CONNECT kermit to an hvcs device I get:
|
||||
"Sorry, write access to UUCP lockfile directory denied."
|
||||
|
||||
@ -497,6 +504,7 @@ A: The /dev/hvcs* entry you have specified doesn't exist where you said it
|
||||
does? Maybe you haven't inserted the module (on systems with udev).
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: If I already have one Linux partition installed can I use hvcs on said
|
||||
partition to provide the console for the install of a second Linux
|
||||
partition?
|
||||
@ -505,6 +513,7 @@ A: Yes granted that your are connected to the /dev/hvcs* device using
|
||||
kermit or cu or some other program that doesn't provide terminal emulation.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: Can I connect to more than one partition's console at a time using this
|
||||
driver?
|
||||
|
||||
@ -512,6 +521,7 @@ A: Yes. Of course this means that there must be more than one vty-server
|
||||
configured for this partition and each must point to a disconnected vty.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: Does the hvcs driver support dynamic (hotplug) addition of devices?
|
||||
|
||||
A: Yes, if you have dlpar and hotplug enabled for your system and it has
|
||||
@ -519,6 +529,7 @@ been built into the kernel the hvcs drivers is configured to dynamically
|
||||
handle additions of new devices and removals of unused devices.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: For some reason /dev/hvcs* doesn't map to the same vty-server adapter
|
||||
after a reboot. What happened?
|
||||
|
||||
@ -533,6 +544,7 @@ on how to determine which vty-server goes with which /dev/hvcs* node.
|
||||
Hint; look at the sysfs "index" attribute for the vty-server.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Q: Can I use /dev/hvcs* as a conduit to another partition and use a tty
|
||||
device on that partition as the other end of the pipe?
|
||||
|
||||
@ -554,7 +566,9 @@ read or write to /dev/hvcs*. Now you have a tty conduit between two
|
||||
partitions.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
9. Reporting Bugs:
|
||||
==================
|
||||
|
||||
The proper channel for reporting bugs is either through the Linux OS
|
||||
distribution company that provided your OS or by posting issues to the
|
34
Documentation/powerpc/index.rst
Normal file
34
Documentation/powerpc/index.rst
Normal file
@ -0,0 +1,34 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=======
|
||||
powerpc
|
||||
=======
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
bootwrapper
|
||||
cpu_families
|
||||
cpu_features
|
||||
cxl
|
||||
cxlflash
|
||||
dawr-power9
|
||||
dscr
|
||||
eeh-pci-error-recovery
|
||||
firmware-assisted-dump
|
||||
hvcs
|
||||
isa-versions
|
||||
mpc52xx
|
||||
pci_iov_resource_on_powernv
|
||||
pmu-ebb
|
||||
ptrace
|
||||
qe_firmware
|
||||
syscall64-abi
|
||||
transactional_memory
|
||||
|
||||
.. only:: subproject and html
|
||||
|
||||
Indices
|
||||
=======
|
||||
|
||||
* :ref:`genindex`
|
@ -1,13 +1,12 @@
|
||||
:orphan:
|
||||
|
||||
==========================
|
||||
CPU to ISA Version Mapping
|
||||
==========================
|
||||
|
||||
Mapping of some CPU versions to relevant ISA versions.
|
||||
|
||||
========= ====================
|
||||
========= ====================================================================
|
||||
CPU Architecture version
|
||||
========= ====================
|
||||
========= ====================================================================
|
||||
Power9 Power ISA v3.0B
|
||||
Power8 Power ISA v2.07
|
||||
Power7 Power ISA v2.06
|
||||
@ -24,7 +23,7 @@ PPC970 - PowerPC User Instruction Set Architecture Book I v2.01
|
||||
- PowerPC Virtual Environment Architecture Book II v2.01
|
||||
- PowerPC Operating Environment Architecture Book III v2.01
|
||||
- Plus Altivec/VMX ~= 2.03
|
||||
========= ====================
|
||||
========= ====================================================================
|
||||
|
||||
|
||||
Key Features
|
||||
@ -60,9 +59,9 @@ Power5 No
|
||||
PPC970 No
|
||||
========== ====
|
||||
|
||||
========== ====================
|
||||
========== ====================================
|
||||
CPU Transactional Memory
|
||||
========== ====================
|
||||
========== ====================================
|
||||
Power9 Yes (* see transactional_memory.txt)
|
||||
Power8 Yes
|
||||
Power7 No
|
||||
@ -73,4 +72,4 @@ Power5++ No
|
||||
Power5+ No
|
||||
Power5 No
|
||||
PPC970 No
|
||||
========== ====================
|
||||
========== ====================================
|
||||
|
@ -1,11 +1,13 @@
|
||||
=============================
|
||||
Linux 2.6.x on MPC52xx family
|
||||
-----------------------------
|
||||
=============================
|
||||
|
||||
For the latest info, go to http://www.246tNt.com/mpc52xx/
|
||||
|
||||
To compile/use :
|
||||
|
||||
- U-Boot:
|
||||
- U-Boot::
|
||||
|
||||
# <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
|
||||
if you wish to ).
|
||||
# make lite5200_defconfig
|
||||
@ -16,7 +18,8 @@ To compile/use :
|
||||
=> tftpboot 400000 pRamdisk
|
||||
=> bootm 200000 400000
|
||||
|
||||
- DBug:
|
||||
- DBug::
|
||||
|
||||
# <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
|
||||
if you wish to ).
|
||||
# make lite5200_defconfig
|
||||
@ -28,7 +31,8 @@ To compile/use :
|
||||
DBug> dn -i zImage.initrd.lite5200
|
||||
|
||||
|
||||
Some remarks :
|
||||
Some remarks:
|
||||
|
||||
- The port is named mpc52xxx, and config options are PPC_MPC52xx. The MGT5100
|
||||
is not supported, and I'm not sure anyone is interesting in working on it
|
||||
so. I didn't took 5xxx because there's apparently a lot of 5xxx that have
|
@ -1,6 +1,13 @@
|
||||
===================================================
|
||||
PCI Express I/O Virtualization Resource on Powerenv
|
||||
===================================================
|
||||
|
||||
Wei Yang <weiyang@linux.vnet.ibm.com>
|
||||
|
||||
Benjamin Herrenschmidt <benh@au1.ibm.com>
|
||||
|
||||
Bjorn Helgaas <bhelgaas@google.com>
|
||||
|
||||
26 Aug 2014
|
||||
|
||||
This document describes the requirement from hardware for PCI MMIO resource
|
||||
@ -10,6 +17,7 @@ Endpoints and the implementation on P8 (IODA2). The next two sections talks
|
||||
about considerations on enabling SRIOV on IODA2.
|
||||
|
||||
1. Introduction to Partitionable Endpoints
|
||||
==========================================
|
||||
|
||||
A Partitionable Endpoint (PE) is a way to group the various resources
|
||||
associated with a device or a set of devices to provide isolation between
|
||||
@ -35,6 +43,7 @@ is a completely separate HW entity that replicates the entire logic, so has
|
||||
its own set of PEs, etc.
|
||||
|
||||
2. Implementation of Partitionable Endpoints on P8 (IODA2)
|
||||
==========================================================
|
||||
|
||||
P8 supports up to 256 Partitionable Endpoints per PHB.
|
||||
|
||||
@ -149,6 +158,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
|
||||
sense, but we haven't done it yet.
|
||||
|
||||
3. Considerations for SR-IOV on PowerKVM
|
||||
========================================
|
||||
|
||||
* SR-IOV Background
|
||||
|
||||
@ -224,7 +234,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
|
||||
IODA supports 256 PEs, so segmented windows contain 256 segments, so if
|
||||
total_VFs is less than 256, we have the situation in Figure 1.0, where
|
||||
segments [total_VFs, 255] of the M64 window may map to some MMIO range on
|
||||
other devices:
|
||||
other devices::
|
||||
|
||||
0 1 total_VFs - 1
|
||||
+------+------+- -+------+------+
|
||||
@ -243,7 +253,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
|
||||
Figure 1.0 Direct map VF(n) BAR space
|
||||
|
||||
Our current solution is to allocate 256 segments even if the VF(n) BAR
|
||||
space doesn't need that much, as shown in Figure 1.1:
|
||||
space doesn't need that much, as shown in Figure 1.1::
|
||||
|
||||
0 1 total_VFs - 1 255
|
||||
+------+------+- -+------+------+- -+------+------+
|
||||
@ -269,6 +279,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
|
||||
responds to segments [total_VFs, 255].
|
||||
|
||||
4. Implications for the Generic PCI Code
|
||||
========================================
|
||||
|
||||
The PCIe SR-IOV spec requires that the base of the VF(n) BAR space be
|
||||
aligned to the size of an individual VF BAR.
|
@ -1,3 +1,4 @@
|
||||
========================
|
||||
PMU Event Based Branches
|
||||
========================
|
||||
|
156
Documentation/powerpc/ptrace.rst
Normal file
156
Documentation/powerpc/ptrace.rst
Normal file
@ -0,0 +1,156 @@
|
||||
======
|
||||
Ptrace
|
||||
======
|
||||
|
||||
GDB intends to support the following hardware debug features of BookE
|
||||
processors:
|
||||
|
||||
4 hardware breakpoints (IAC)
|
||||
2 hardware watchpoints (read, write and read-write) (DAC)
|
||||
2 value conditions for the hardware watchpoints (DVC)
|
||||
|
||||
For that, we need to extend ptrace so that GDB can query and set these
|
||||
resources. Since we're extending, we're trying to create an interface
|
||||
that's extendable and that covers both BookE and server processors, so
|
||||
that GDB doesn't need to special-case each of them. We added the
|
||||
following 3 new ptrace requests.
|
||||
|
||||
1. PTRACE_PPC_GETHWDEBUGINFO
|
||||
============================
|
||||
|
||||
Query for GDB to discover the hardware debug features. The main info to
|
||||
be returned here is the minimum alignment for the hardware watchpoints.
|
||||
BookE processors don't have restrictions here, but server processors have
|
||||
an 8-byte alignment restriction for hardware watchpoints. We'd like to avoid
|
||||
adding special cases to GDB based on what it sees in AUXV.
|
||||
|
||||
Since we're at it, we added other useful info that the kernel can return to
|
||||
GDB: this query will return the number of hardware breakpoints, hardware
|
||||
watchpoints and whether it supports a range of addresses and a condition.
|
||||
The query will fill the following structure provided by the requesting process::
|
||||
|
||||
struct ppc_debug_info {
|
||||
unit32_t version;
|
||||
unit32_t num_instruction_bps;
|
||||
unit32_t num_data_bps;
|
||||
unit32_t num_condition_regs;
|
||||
unit32_t data_bp_alignment;
|
||||
unit32_t sizeof_condition; /* size of the DVC register */
|
||||
uint64_t features; /* bitmask of the individual flags */
|
||||
};
|
||||
|
||||
features will have bits indicating whether there is support for::
|
||||
|
||||
#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
|
||||
#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2
|
||||
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
|
||||
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
|
||||
#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
|
||||
|
||||
2. PTRACE_SETHWDEBUG
|
||||
|
||||
Sets a hardware breakpoint or watchpoint, according to the provided structure::
|
||||
|
||||
struct ppc_hw_breakpoint {
|
||||
uint32_t version;
|
||||
#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1
|
||||
#define PPC_BREAKPOINT_TRIGGER_READ 0x2
|
||||
#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4
|
||||
uint32_t trigger_type; /* only some combinations allowed */
|
||||
#define PPC_BREAKPOINT_MODE_EXACT 0x0
|
||||
#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1
|
||||
#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2
|
||||
#define PPC_BREAKPOINT_MODE_MASK 0x3
|
||||
uint32_t addr_mode; /* address match mode */
|
||||
|
||||
#define PPC_BREAKPOINT_CONDITION_MODE 0x3
|
||||
#define PPC_BREAKPOINT_CONDITION_NONE 0x0
|
||||
#define PPC_BREAKPOINT_CONDITION_AND 0x1
|
||||
#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */
|
||||
#define PPC_BREAKPOINT_CONDITION_OR 0x2
|
||||
#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
|
||||
#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
|
||||
#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16))
|
||||
uint32_t condition_mode; /* break/watchpoint condition flags */
|
||||
|
||||
uint64_t addr;
|
||||
uint64_t addr2;
|
||||
uint64_t condition_value;
|
||||
};
|
||||
|
||||
A request specifies one event, not necessarily just one register to be set.
|
||||
For instance, if the request is for a watchpoint with a condition, both the
|
||||
DAC and DVC registers will be set in the same request.
|
||||
|
||||
With this GDB can ask for all kinds of hardware breakpoints and watchpoints
|
||||
that the BookE supports. COMEFROM breakpoints available in server processors
|
||||
are not contemplated, but that is out of the scope of this work.
|
||||
|
||||
ptrace will return an integer (handle) uniquely identifying the breakpoint or
|
||||
watchpoint just created. This integer will be used in the PTRACE_DELHWDEBUG
|
||||
request to ask for its removal. Return -ENOSPC if the requested breakpoint
|
||||
can't be allocated on the registers.
|
||||
|
||||
Some examples of using the structure to:
|
||||
|
||||
- set a breakpoint in the first breakpoint register::
|
||||
|
||||
p.version = PPC_DEBUG_CURRENT_VERSION;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
|
||||
p.addr = (uint64_t) address;
|
||||
p.addr2 = 0;
|
||||
p.condition_value = 0;
|
||||
|
||||
- set a watchpoint which triggers on reads in the second watchpoint register::
|
||||
|
||||
p.version = PPC_DEBUG_CURRENT_VERSION;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
|
||||
p.addr = (uint64_t) address;
|
||||
p.addr2 = 0;
|
||||
p.condition_value = 0;
|
||||
|
||||
- set a watchpoint which triggers only with a specific value::
|
||||
|
||||
p.version = PPC_DEBUG_CURRENT_VERSION;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL;
|
||||
p.addr = (uint64_t) address;
|
||||
p.addr2 = 0;
|
||||
p.condition_value = (uint64_t) condition;
|
||||
|
||||
- set a ranged hardware breakpoint::
|
||||
|
||||
p.version = PPC_DEBUG_CURRENT_VERSION;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
|
||||
p.addr = (uint64_t) begin_range;
|
||||
p.addr2 = (uint64_t) end_range;
|
||||
p.condition_value = 0;
|
||||
|
||||
- set a watchpoint in server processors (BookS)::
|
||||
|
||||
p.version = 1;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_RW;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
|
||||
or
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
|
||||
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
|
||||
p.addr = (uint64_t) begin_range;
|
||||
/* For PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE addr2 needs to be specified, where
|
||||
* addr2 - addr <= 8 Bytes.
|
||||
*/
|
||||
p.addr2 = (uint64_t) end_range;
|
||||
p.condition_value = 0;
|
||||
|
||||
3. PTRACE_DELHWDEBUG
|
||||
|
||||
Takes an integer which identifies an existing breakpoint or watchpoint
|
||||
(i.e., the value returned from PTRACE_SETHWDEBUG), and deletes the
|
||||
corresponding breakpoint or watchpoint..
|
@ -1,151 +0,0 @@
|
||||
GDB intends to support the following hardware debug features of BookE
|
||||
processors:
|
||||
|
||||
4 hardware breakpoints (IAC)
|
||||
2 hardware watchpoints (read, write and read-write) (DAC)
|
||||
2 value conditions for the hardware watchpoints (DVC)
|
||||
|
||||
For that, we need to extend ptrace so that GDB can query and set these
|
||||
resources. Since we're extending, we're trying to create an interface
|
||||
that's extendable and that covers both BookE and server processors, so
|
||||
that GDB doesn't need to special-case each of them. We added the
|
||||
following 3 new ptrace requests.
|
||||
|
||||
1. PTRACE_PPC_GETHWDEBUGINFO
|
||||
|
||||
Query for GDB to discover the hardware debug features. The main info to
|
||||
be returned here is the minimum alignment for the hardware watchpoints.
|
||||
BookE processors don't have restrictions here, but server processors have
|
||||
an 8-byte alignment restriction for hardware watchpoints. We'd like to avoid
|
||||
adding special cases to GDB based on what it sees in AUXV.
|
||||
|
||||
Since we're at it, we added other useful info that the kernel can return to
|
||||
GDB: this query will return the number of hardware breakpoints, hardware
|
||||
watchpoints and whether it supports a range of addresses and a condition.
|
||||
The query will fill the following structure provided by the requesting process:
|
||||
|
||||
struct ppc_debug_info {
|
||||
unit32_t version;
|
||||
unit32_t num_instruction_bps;
|
||||
unit32_t num_data_bps;
|
||||
unit32_t num_condition_regs;
|
||||
unit32_t data_bp_alignment;
|
||||
unit32_t sizeof_condition; /* size of the DVC register */
|
||||
uint64_t features; /* bitmask of the individual flags */
|
||||
};
|
||||
|
||||
features will have bits indicating whether there is support for:
|
||||
|
||||
#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
|
||||
#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2
|
||||
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
|
||||
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
|
||||
#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
|
||||
|
||||
2. PTRACE_SETHWDEBUG
|
||||
|
||||
Sets a hardware breakpoint or watchpoint, according to the provided structure:
|
||||
|
||||
struct ppc_hw_breakpoint {
|
||||
uint32_t version;
|
||||
#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1
|
||||
#define PPC_BREAKPOINT_TRIGGER_READ 0x2
|
||||
#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4
|
||||
uint32_t trigger_type; /* only some combinations allowed */
|
||||
#define PPC_BREAKPOINT_MODE_EXACT 0x0
|
||||
#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1
|
||||
#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2
|
||||
#define PPC_BREAKPOINT_MODE_MASK 0x3
|
||||
uint32_t addr_mode; /* address match mode */
|
||||
|
||||
#define PPC_BREAKPOINT_CONDITION_MODE 0x3
|
||||
#define PPC_BREAKPOINT_CONDITION_NONE 0x0
|
||||
#define PPC_BREAKPOINT_CONDITION_AND 0x1
|
||||
#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */
|
||||
#define PPC_BREAKPOINT_CONDITION_OR 0x2
|
||||
#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
|
||||
#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
|
||||
#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16))
|
||||
uint32_t condition_mode; /* break/watchpoint condition flags */
|
||||
|
||||
uint64_t addr;
|
||||
uint64_t addr2;
|
||||
uint64_t condition_value;
|
||||
};
|
||||
|
||||
A request specifies one event, not necessarily just one register to be set.
|
||||
For instance, if the request is for a watchpoint with a condition, both the
|
||||
DAC and DVC registers will be set in the same request.
|
||||
|
||||
With this GDB can ask for all kinds of hardware breakpoints and watchpoints
|
||||
that the BookE supports. COMEFROM breakpoints available in server processors
|
||||
are not contemplated, but that is out of the scope of this work.
|
||||
|
||||
ptrace will return an integer (handle) uniquely identifying the breakpoint or
|
||||
watchpoint just created. This integer will be used in the PTRACE_DELHWDEBUG
|
||||
request to ask for its removal. Return -ENOSPC if the requested breakpoint
|
||||
can't be allocated on the registers.
|
||||
|
||||
Some examples of using the structure to:
|
||||
|
||||
- set a breakpoint in the first breakpoint register
|
||||
|
||||
p.version = PPC_DEBUG_CURRENT_VERSION;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
|
||||
p.addr = (uint64_t) address;
|
||||
p.addr2 = 0;
|
||||
p.condition_value = 0;
|
||||
|
||||
- set a watchpoint which triggers on reads in the second watchpoint register
|
||||
|
||||
p.version = PPC_DEBUG_CURRENT_VERSION;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
|
||||
p.addr = (uint64_t) address;
|
||||
p.addr2 = 0;
|
||||
p.condition_value = 0;
|
||||
|
||||
- set a watchpoint which triggers only with a specific value
|
||||
|
||||
p.version = PPC_DEBUG_CURRENT_VERSION;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL;
|
||||
p.addr = (uint64_t) address;
|
||||
p.addr2 = 0;
|
||||
p.condition_value = (uint64_t) condition;
|
||||
|
||||
- set a ranged hardware breakpoint
|
||||
|
||||
p.version = PPC_DEBUG_CURRENT_VERSION;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
|
||||
p.addr = (uint64_t) begin_range;
|
||||
p.addr2 = (uint64_t) end_range;
|
||||
p.condition_value = 0;
|
||||
|
||||
- set a watchpoint in server processors (BookS)
|
||||
|
||||
p.version = 1;
|
||||
p.trigger_type = PPC_BREAKPOINT_TRIGGER_RW;
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
|
||||
or
|
||||
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
|
||||
|
||||
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
|
||||
p.addr = (uint64_t) begin_range;
|
||||
/* For PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE addr2 needs to be specified, where
|
||||
* addr2 - addr <= 8 Bytes.
|
||||
*/
|
||||
p.addr2 = (uint64_t) end_range;
|
||||
p.condition_value = 0;
|
||||
|
||||
3. PTRACE_DELHWDEBUG
|
||||
|
||||
Takes an integer which identifies an existing breakpoint or watchpoint
|
||||
(i.e., the value returned from PTRACE_SETHWDEBUG), and deletes the
|
||||
corresponding breakpoint or watchpoint..
|
@ -1,23 +1,23 @@
|
||||
Freescale QUICC Engine Firmware Uploading
|
||||
-----------------------------------------
|
||||
=========================================
|
||||
Freescale QUICC Engine Firmware Uploading
|
||||
=========================================
|
||||
|
||||
(c) 2007 Timur Tabi <timur at freescale.com>,
|
||||
Freescale Semiconductor
|
||||
|
||||
Table of Contents
|
||||
=================
|
||||
.. Table of Contents
|
||||
|
||||
I - Software License for Firmware
|
||||
I - Software License for Firmware
|
||||
|
||||
II - Microcode Availability
|
||||
II - Microcode Availability
|
||||
|
||||
III - Description and Terminology
|
||||
III - Description and Terminology
|
||||
|
||||
IV - Microcode Programming Details
|
||||
IV - Microcode Programming Details
|
||||
|
||||
V - Firmware Structure Layout
|
||||
V - Firmware Structure Layout
|
||||
|
||||
VI - Sample Code for Creating Firmware Files
|
||||
VI - Sample Code for Creating Firmware Files
|
||||
|
||||
Revision Information
|
||||
====================
|
||||
@ -39,7 +39,7 @@ http://opensource.freescale.com. For other firmware files, please contact
|
||||
your Freescale representative or your operating system vendor.
|
||||
|
||||
III - Description and Terminology
|
||||
================================
|
||||
=================================
|
||||
|
||||
In this document, the term 'microcode' refers to the sequence of 32-bit
|
||||
integers that compose the actual QE microcode.
|
||||
@ -89,7 +89,7 @@ being fixed in the RAM package utilizing they should be activated. This data
|
||||
structure signals the microcode which of these virtual traps is active.
|
||||
|
||||
This structure contains 6 words that the application should copy to some
|
||||
specific been defined. This table describes the structure.
|
||||
specific been defined. This table describes the structure::
|
||||
|
||||
---------------------------------------------------------------
|
||||
| Offset in | | Destination Offset | Size of |
|
||||
@ -119,7 +119,7 @@ Extended Modes
|
||||
This is a double word bit array (64 bits) that defines special functionality
|
||||
which has an impact on the software drivers. Each bit has its own impact
|
||||
and has special instructions for the s/w associated with it. This structure is
|
||||
described in this table:
|
||||
described in this table::
|
||||
|
||||
-----------------------------------------------------------------------
|
||||
| Bit # | Name | Description |
|
||||
@ -220,7 +220,8 @@ The 'model' field is a 16-bit number that matches the actual SOC. The
|
||||
'major' and 'minor' fields are the major and minor revision numbers,
|
||||
respectively, of the SOC.
|
||||
|
||||
For example, to match the 8323, revision 1.0:
|
||||
For example, to match the 8323, revision 1.0::
|
||||
|
||||
soc.model = 8323
|
||||
soc.major = 1
|
||||
soc.minor = 0
|
||||
@ -273,10 +274,10 @@ library and available to any driver that calles qe_get_firmware_info().
|
||||
'reserved'.
|
||||
|
||||
After the last microcode is a 32-bit CRC. It can be calculated using
|
||||
this algorithm:
|
||||
this algorithm::
|
||||
|
||||
u32 crc32(const u8 *p, unsigned int len)
|
||||
{
|
||||
u32 crc32(const u8 *p, unsigned int len)
|
||||
{
|
||||
unsigned int i;
|
||||
u32 crc = 0;
|
||||
|
||||
@ -286,7 +287,7 @@ u32 crc32(const u8 *p, unsigned int len)
|
||||
crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
|
||||
}
|
||||
return crc;
|
||||
}
|
||||
}
|
||||
|
||||
VI - Sample Code for Creating Firmware Files
|
||||
============================================
|
@ -5,12 +5,12 @@ Power Architecture 64-bit Linux system call ABI
|
||||
syscall
|
||||
=======
|
||||
|
||||
syscall calling sequence[*] matches the Power Architecture 64-bit ELF ABI
|
||||
syscall calling sequence\ [1]_ matches the Power Architecture 64-bit ELF ABI
|
||||
specification C function calling sequence, including register preservation
|
||||
rules, with the following differences.
|
||||
|
||||
[*] Some syscalls (typically low-level management functions) may have
|
||||
different calling sequences (e.g., rt_sigreturn).
|
||||
.. [1] Some syscalls (typically low-level management functions) may have
|
||||
different calling sequences (e.g., rt_sigreturn).
|
||||
|
||||
Parameters and return value
|
||||
---------------------------
|
||||
@ -33,12 +33,14 @@ Register preservation rules
|
||||
Register preservation rules match the ELF ABI calling sequence with the
|
||||
following differences:
|
||||
|
||||
r0: Volatile. (System call number.)
|
||||
r3: Volatile. (Parameter 1, and return value.)
|
||||
r4-r8: Volatile. (Parameters 2-6.)
|
||||
cr0: Volatile (cr0.SO is the return error condition)
|
||||
cr1, cr5-7: Nonvolatile.
|
||||
lr: Nonvolatile.
|
||||
=========== ============= ========================================
|
||||
r0 Volatile (System call number.)
|
||||
r3 Volatile (Parameter 1, and return value.)
|
||||
r4-r8 Volatile (Parameters 2-6.)
|
||||
cr0 Volatile (cr0.SO is the return error condition)
|
||||
cr1, cr5-7 Nonvolatile
|
||||
lr Nonvolatile
|
||||
=========== ============= ========================================
|
||||
|
||||
All floating point and vector data registers as well as control and status
|
||||
registers are nonvolatile.
|
||||
@ -90,9 +92,12 @@ The vsyscall may or may not use the caller's stack frame save areas.
|
||||
|
||||
Register preservation rules
|
||||
---------------------------
|
||||
r0: Volatile.
|
||||
cr1, cr5-7: Volatile.
|
||||
lr: Volatile.
|
||||
|
||||
=========== ========
|
||||
r0 Volatile
|
||||
cr1, cr5-7 Volatile
|
||||
lr Volatile
|
||||
=========== ========
|
||||
|
||||
Invocation
|
||||
----------
|
@ -1,3 +1,4 @@
|
||||
============================
|
||||
Transactional Memory support
|
||||
============================
|
||||
|
||||
@ -17,29 +18,29 @@ instructions are presented to delimit transactions; transactions are
|
||||
guaranteed to either complete atomically or roll back and undo any partial
|
||||
changes.
|
||||
|
||||
A simple transaction looks like this:
|
||||
A simple transaction looks like this::
|
||||
|
||||
begin_move_money:
|
||||
tbegin
|
||||
beq abort_handler
|
||||
begin_move_money:
|
||||
tbegin
|
||||
beq abort_handler
|
||||
|
||||
ld r4, SAVINGS_ACCT(r3)
|
||||
ld r5, CURRENT_ACCT(r3)
|
||||
subi r5, r5, 1
|
||||
addi r4, r4, 1
|
||||
std r4, SAVINGS_ACCT(r3)
|
||||
std r5, CURRENT_ACCT(r3)
|
||||
ld r4, SAVINGS_ACCT(r3)
|
||||
ld r5, CURRENT_ACCT(r3)
|
||||
subi r5, r5, 1
|
||||
addi r4, r4, 1
|
||||
std r4, SAVINGS_ACCT(r3)
|
||||
std r5, CURRENT_ACCT(r3)
|
||||
|
||||
tend
|
||||
tend
|
||||
|
||||
b continue
|
||||
b continue
|
||||
|
||||
abort_handler:
|
||||
... test for odd failures ...
|
||||
abort_handler:
|
||||
... test for odd failures ...
|
||||
|
||||
/* Retry the transaction if it failed because it conflicted with
|
||||
* someone else: */
|
||||
b begin_move_money
|
||||
/* Retry the transaction if it failed because it conflicted with
|
||||
* someone else: */
|
||||
b begin_move_money
|
||||
|
||||
|
||||
The 'tbegin' instruction denotes the start point, and 'tend' the end point.
|
||||
@ -123,7 +124,7 @@ Transaction-aware signal handlers can read the transactional register state
|
||||
from the second ucontext. This will be necessary for crash handlers to
|
||||
determine, for example, the address of the instruction causing the SIGSEGV.
|
||||
|
||||
Example signal handler:
|
||||
Example signal handler::
|
||||
|
||||
void crash_handler(int sig, siginfo_t *si, void *uc)
|
||||
{
|
||||
@ -133,9 +134,9 @@ Example signal handler:
|
||||
if (ucp_link) {
|
||||
u64 msr = ucp->uc_mcontext.regs->msr;
|
||||
/* May have transactional ucontext! */
|
||||
#ifndef __powerpc64__
|
||||
#ifndef __powerpc64__
|
||||
msr |= ((u64)transactional_ucp->uc_mcontext.regs->msr) << 32;
|
||||
#endif
|
||||
#endif
|
||||
if (MSR_TM_ACTIVE(msr)) {
|
||||
/* Yes, we crashed during a transaction. Oops. */
|
||||
fprintf(stderr, "Transaction to be restarted at 0x%llx, but "
|
||||
@ -176,6 +177,7 @@ Failure cause codes used by kernel
|
||||
These are defined in <asm/reg.h>, and distinguish different reasons why the
|
||||
kernel aborted a transaction:
|
||||
|
||||
====================== ================================
|
||||
TM_CAUSE_RESCHED Thread was rescheduled.
|
||||
TM_CAUSE_TLBI Software TLB invalid.
|
||||
TM_CAUSE_FAC_UNAV FP/VEC/VSX unavailable trap.
|
||||
@ -184,6 +186,7 @@ kernel aborted a transaction:
|
||||
TM_CAUSE_MISC Currently unused.
|
||||
TM_CAUSE_ALIGNMENT Alignment fault.
|
||||
TM_CAUSE_EMULATE Emulation that touched memory.
|
||||
====================== ================================
|
||||
|
||||
These can be checked by the user program's abort handler as TEXASR[0:7]. If
|
||||
bit 7 is set, it indicates that the error is consider persistent. For example
|
||||
@ -203,7 +206,7 @@ POWER9
|
||||
======
|
||||
|
||||
TM on POWER9 has issues with storing the complete register state. This
|
||||
is described in this commit:
|
||||
is described in this commit::
|
||||
|
||||
commit 4bb3c7a0208fc13ca70598efd109901a7cd45ae7
|
||||
Author: Paul Mackerras <paulus@ozlabs.org>
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = 'Linux Kernel Development Documentation'
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'process.tex', 'Linux Kernel Development Documentation',
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -119,3 +119,17 @@ array may exceed the remaining memory in the stack segment. This could
|
||||
lead to a crash, possible overwriting sensitive contents at the end of the
|
||||
stack (when built without `CONFIG_THREAD_INFO_IN_TASK=y`), or overwriting
|
||||
memory adjacent to the stack (when built without `CONFIG_VMAP_STACK=y`)
|
||||
|
||||
Implicit switch case fall-through
|
||||
---------------------------------
|
||||
The C language allows switch cases to "fall through" when
|
||||
a "break" statement is missing at the end of a case. This,
|
||||
however, introduces ambiguity in the code, as it's not always
|
||||
clear if the missing break is intentional or a bug. As there
|
||||
have been a long list of flaws `due to missing "break" statements
|
||||
<https://cwe.mitre.org/data/definitions/484.html>`_, we no longer allow
|
||||
"implicit fall-through". In order to identify an intentional fall-through
|
||||
case, we have adopted the marking used by static analyzers: a comment
|
||||
saying `/* Fall through */`. Once the C++17 `__attribute__((fallthrough))`
|
||||
is more widely handled by C compilers, static analyzers, and IDEs, we can
|
||||
switch to using that instead.
|
||||
|
279
Documentation/process/embargoed-hardware-issues.rst
Normal file
279
Documentation/process/embargoed-hardware-issues.rst
Normal file
@ -0,0 +1,279 @@
|
||||
Embargoed hardware issues
|
||||
=========================
|
||||
|
||||
Scope
|
||||
-----
|
||||
|
||||
Hardware issues which result in security problems are a different category
|
||||
of security bugs than pure software bugs which only affect the Linux
|
||||
kernel.
|
||||
|
||||
Hardware issues like Meltdown, Spectre, L1TF etc. must be treated
|
||||
differently because they usually affect all Operating Systems ("OS") and
|
||||
therefore need coordination across different OS vendors, distributions,
|
||||
hardware vendors and other parties. For some of the issues, software
|
||||
mitigations can depend on microcode or firmware updates, which need further
|
||||
coordination.
|
||||
|
||||
.. _Contact:
|
||||
|
||||
Contact
|
||||
-------
|
||||
|
||||
The Linux kernel hardware security team is separate from the regular Linux
|
||||
kernel security team.
|
||||
|
||||
The team only handles the coordination of embargoed hardware security
|
||||
issues. Reports of pure software security bugs in the Linux kernel are not
|
||||
handled by this team and the reporter will be guided to contact the regular
|
||||
Linux kernel security team (:ref:`Documentation/admin-guide/
|
||||
<securitybugs>`) instead.
|
||||
|
||||
The team can be contacted by email at <hardware-security@kernel.org>. This
|
||||
is a private list of security officers who will help you to coordinate an
|
||||
issue according to our documented process.
|
||||
|
||||
The list is encrypted and email to the list can be sent by either PGP or
|
||||
S/MIME encrypted and must be signed with the reporter's PGP key or S/MIME
|
||||
certificate. The list's PGP key and S/MIME certificate are available from
|
||||
https://www.kernel.org/....
|
||||
|
||||
While hardware security issues are often handled by the affected hardware
|
||||
vendor, we welcome contact from researchers or individuals who have
|
||||
identified a potential hardware flaw.
|
||||
|
||||
Hardware security officers
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The current team of hardware security officers:
|
||||
|
||||
- Linus Torvalds (Linux Foundation Fellow)
|
||||
- Greg Kroah-Hartman (Linux Foundation Fellow)
|
||||
- Thomas Gleixner (Linux Foundation Fellow)
|
||||
|
||||
Operation of mailing-lists
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The encrypted mailing-lists which are used in our process are hosted on
|
||||
Linux Foundation's IT infrastructure. By providing this service Linux
|
||||
Foundation's director of IT Infrastructure security technically has the
|
||||
ability to access the embargoed information, but is obliged to
|
||||
confidentiality by his employment contract. Linux Foundation's director of
|
||||
IT Infrastructure security is also responsible for the kernel.org
|
||||
infrastructure.
|
||||
|
||||
The Linux Foundation's current director of IT Infrastructure security is
|
||||
Konstantin Ryabitsev.
|
||||
|
||||
|
||||
Non-disclosure agreements
|
||||
-------------------------
|
||||
|
||||
The Linux kernel hardware security team is not a formal body and therefore
|
||||
unable to enter into any non-disclosure agreements. The kernel community
|
||||
is aware of the sensitive nature of such issues and offers a Memorandum of
|
||||
Understanding instead.
|
||||
|
||||
|
||||
Memorandum of Understanding
|
||||
---------------------------
|
||||
|
||||
The Linux kernel community has a deep understanding of the requirement to
|
||||
keep hardware security issues under embargo for coordination between
|
||||
different OS vendors, distributors, hardware vendors and other parties.
|
||||
|
||||
The Linux kernel community has successfully handled hardware security
|
||||
issues in the past and has the necessary mechanisms in place to allow
|
||||
community compliant development under embargo restrictions.
|
||||
|
||||
The Linux kernel community has a dedicated hardware security team for
|
||||
initial contact, which oversees the process of handling such issues under
|
||||
embargo rules.
|
||||
|
||||
The hardware security team identifies the developers (domain experts) who
|
||||
will form the initial response team for a particular issue. The initial
|
||||
response team can bring in further developers (domain experts) to address
|
||||
the issue in the best technical way.
|
||||
|
||||
All involved developers pledge to adhere to the embargo rules and to keep
|
||||
the received information confidential. Violation of the pledge will lead to
|
||||
immediate exclusion from the current issue and removal from all related
|
||||
mailing-lists. In addition, the hardware security team will also exclude
|
||||
the offender from future issues. The impact of this consequence is a highly
|
||||
effective deterrent in our community. In case a violation happens the
|
||||
hardware security team will inform the involved parties immediately. If you
|
||||
or anyone becomes aware of a potential violation, please report it
|
||||
immediately to the Hardware security officers.
|
||||
|
||||
|
||||
Process
|
||||
^^^^^^^
|
||||
|
||||
Due to the globally distributed nature of Linux kernel development,
|
||||
face-to-face meetings are almost impossible to address hardware security
|
||||
issues. Phone conferences are hard to coordinate due to time zones and
|
||||
other factors and should be only used when absolutely necessary. Encrypted
|
||||
email has been proven to be the most effective and secure communication
|
||||
method for these types of issues.
|
||||
|
||||
Start of Disclosure
|
||||
"""""""""""""""""""
|
||||
|
||||
Disclosure starts by contacting the Linux kernel hardware security team by
|
||||
email. This initial contact should contain a description of the problem and
|
||||
a list of any known affected hardware. If your organization builds or
|
||||
distributes the affected hardware, we encourage you to also consider what
|
||||
other hardware could be affected.
|
||||
|
||||
The hardware security team will provide an incident-specific encrypted
|
||||
mailing-list which will be used for initial discussion with the reporter,
|
||||
further disclosure and coordination.
|
||||
|
||||
The hardware security team will provide the disclosing party a list of
|
||||
developers (domain experts) who should be informed initially about the
|
||||
issue after confirming with the developers that they will adhere to this
|
||||
Memorandum of Understanding and the documented process. These developers
|
||||
form the initial response team and will be responsible for handling the
|
||||
issue after initial contact. The hardware security team is supporting the
|
||||
response team, but is not necessarily involved in the mitigation
|
||||
development process.
|
||||
|
||||
While individual developers might be covered by a non-disclosure agreement
|
||||
via their employer, they cannot enter individual non-disclosure agreements
|
||||
in their role as Linux kernel developers. They will, however, agree to
|
||||
adhere to this documented process and the Memorandum of Understanding.
|
||||
|
||||
|
||||
Disclosure
|
||||
""""""""""
|
||||
|
||||
The disclosing party provides detailed information to the initial response
|
||||
team via the specific encrypted mailing-list.
|
||||
|
||||
From our experience the technical documentation of these issues is usually
|
||||
a sufficient starting point and further technical clarification is best
|
||||
done via email.
|
||||
|
||||
Mitigation development
|
||||
""""""""""""""""""""""
|
||||
|
||||
The initial response team sets up an encrypted mailing-list or repurposes
|
||||
an existing one if appropriate. The disclosing party should provide a list
|
||||
of contacts for all other parties who have already been, or should be
|
||||
informed about the issue. The response team contacts these parties so they
|
||||
can name experts who should be subscribed to the mailing-list.
|
||||
|
||||
Using a mailing-list is close to the normal Linux development process and
|
||||
has been successfully used in developing mitigations for various hardware
|
||||
security issues in the past.
|
||||
|
||||
The mailing-list operates in the same way as normal Linux development.
|
||||
Patches are posted, discussed and reviewed and if agreed on applied to a
|
||||
non-public git repository which is only accessible to the participating
|
||||
developers via a secure connection. The repository contains the main
|
||||
development branch against the mainline kernel and backport branches for
|
||||
stable kernel versions as necessary.
|
||||
|
||||
The initial response team will identify further experts from the Linux
|
||||
kernel developer community as needed and inform the disclosing party about
|
||||
their participation. Bringing in experts can happen at any time of the
|
||||
development process and often needs to be handled in a timely manner.
|
||||
|
||||
Coordinated release
|
||||
"""""""""""""""""""
|
||||
|
||||
The involved parties will negotiate the date and time where the embargo
|
||||
ends. At that point the prepared mitigations are integrated into the
|
||||
relevant kernel trees and published.
|
||||
|
||||
While we understand that hardware security issues need coordinated embargo
|
||||
time, the embargo time should be constrained to the minimum time which is
|
||||
required for all involved parties to develop, test and prepare the
|
||||
mitigations. Extending embargo time artificially to meet conference talk
|
||||
dates or other non-technical reasons is creating more work and burden for
|
||||
the involved developers and response teams as the patches need to be kept
|
||||
up to date in order to follow the ongoing upstream kernel development,
|
||||
which might create conflicting changes.
|
||||
|
||||
CVE assignment
|
||||
""""""""""""""
|
||||
|
||||
Neither the hardware security team nor the initial response team assign
|
||||
CVEs, nor are CVEs required for the development process. If CVEs are
|
||||
provided by the disclosing party they can be used for documentation
|
||||
purposes.
|
||||
|
||||
Process ambassadors
|
||||
-------------------
|
||||
|
||||
For assistance with this process we have established ambassadors in various
|
||||
organizations, who can answer questions about or provide guidance on the
|
||||
reporting process and further handling. Ambassadors are not involved in the
|
||||
disclosure of a particular issue, unless requested by a response team or by
|
||||
an involved disclosed party. The current ambassadors list:
|
||||
|
||||
============= ========================================================
|
||||
ARM
|
||||
AMD
|
||||
IBM
|
||||
Intel
|
||||
Qualcomm
|
||||
|
||||
Microsoft
|
||||
VMware
|
||||
XEN
|
||||
|
||||
Canonical Tyler Hicks <tyhicks@canonical.com>
|
||||
Debian Ben Hutchings <ben@decadent.org.uk>
|
||||
Oracle Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
||||
Red Hat Josh Poimboeuf <jpoimboe@redhat.com>
|
||||
SUSE Jiri Kosina <jkosina@suse.cz>
|
||||
|
||||
Amazon
|
||||
Google
|
||||
============== ========================================================
|
||||
|
||||
If you want your organization to be added to the ambassadors list, please
|
||||
contact the hardware security team. The nominated ambassador has to
|
||||
understand and support our process fully and is ideally well connected in
|
||||
the Linux kernel community.
|
||||
|
||||
Encrypted mailing-lists
|
||||
-----------------------
|
||||
|
||||
We use encrypted mailing-lists for communication. The operating principle
|
||||
of these lists is that email sent to the list is encrypted either with the
|
||||
list's PGP key or with the list's S/MIME certificate. The mailing-list
|
||||
software decrypts the email and re-encrypts it individually for each
|
||||
subscriber with the subscriber's PGP key or S/MIME certificate. Details
|
||||
about the mailing-list software and the setup which is used to ensure the
|
||||
security of the lists and protection of the data can be found here:
|
||||
https://www.kernel.org/....
|
||||
|
||||
List keys
|
||||
^^^^^^^^^
|
||||
|
||||
For initial contact see :ref:`Contact`. For incident specific mailing-lists
|
||||
the key and S/MIME certificate are conveyed to the subscribers by email
|
||||
sent from the specific list.
|
||||
|
||||
Subscription to incident specific lists
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Subscription is handled by the response teams. Disclosed parties who want
|
||||
to participate in the communication send a list of potential subscribers to
|
||||
the response team so the response team can validate subscription requests.
|
||||
|
||||
Each subscriber needs to send a subscription request to the response team
|
||||
by email. The email must be signed with the subscriber's PGP key or S/MIME
|
||||
certificate. If a PGP key is used, it must be available from a public key
|
||||
server and is ideally connected to the Linux kernel's PGP web of trust. See
|
||||
also: https://www.kernel.org/signature.html.
|
||||
|
||||
The response team verifies that the subscriber request is valid and adds
|
||||
the subscriber to the list. After subscription the subscriber will receive
|
||||
email from the mailing-list which is signed either with the list's PGP key
|
||||
or the list's S/MIME certificate. The subscriber's email client can extract
|
||||
the PGP key or the S/MIME certificate from the signature so the subscriber
|
||||
can send encrypted email to the list.
|
||||
|
@ -45,6 +45,7 @@ Other guides to the community that are of interest to most developers are:
|
||||
submit-checklist
|
||||
kernel-docs
|
||||
deprecated
|
||||
embargoed-hardware-issues
|
||||
|
||||
These are some overall technical guides that have been put here for now for
|
||||
lack of a better place.
|
||||
|
@ -180,6 +180,13 @@ The process of how these work together.
|
||||
add it to an iommu_group and a vfio_group. Then we could pass through
|
||||
the mdev to a guest.
|
||||
|
||||
|
||||
VFIO-CCW Regions
|
||||
----------------
|
||||
|
||||
The vfio-ccw driver exposes MMIO regions to accept requests from and return
|
||||
results to userspace.
|
||||
|
||||
vfio-ccw I/O region
|
||||
-------------------
|
||||
|
||||
@ -205,6 +212,25 @@ irb_area stores the I/O result.
|
||||
|
||||
ret_code stores a return code for each access of the region.
|
||||
|
||||
This region is always available.
|
||||
|
||||
vfio-ccw cmd region
|
||||
-------------------
|
||||
|
||||
The vfio-ccw cmd region is used to accept asynchronous instructions
|
||||
from userspace::
|
||||
|
||||
#define VFIO_CCW_ASYNC_CMD_HSCH (1 << 0)
|
||||
#define VFIO_CCW_ASYNC_CMD_CSCH (1 << 1)
|
||||
struct ccw_cmd_region {
|
||||
__u32 command;
|
||||
__u32 ret_code;
|
||||
} __packed;
|
||||
|
||||
This region is exposed via region type VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD.
|
||||
|
||||
Currently, CLEAR SUBCHANNEL and HALT SUBCHANNEL use this region.
|
||||
|
||||
vfio-ccw operation details
|
||||
--------------------------
|
||||
|
||||
@ -306,9 +332,8 @@ Together with the corresponding work in QEMU, we can bring the passed
|
||||
through DASD/ECKD device online in a guest now and use it as a block
|
||||
device.
|
||||
|
||||
While the current code allows the guest to start channel programs via
|
||||
START SUBCHANNEL, support for HALT SUBCHANNEL or CLEAR SUBCHANNEL is
|
||||
not yet implemented.
|
||||
The current code allows the guest to start channel programs via
|
||||
START SUBCHANNEL, and to issue HALT SUBCHANNEL and CLEAR SUBCHANNEL.
|
||||
|
||||
vfio-ccw supports classic (command mode) channel I/O only. Transport
|
||||
mode (HPF) is not supported.
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "SuperH architecture implementation manual"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'sh.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "Linux Sound Subsystem Documentation"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'sound.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
@ -21,6 +21,29 @@ def loadConfig(namespace):
|
||||
and os.path.normpath(namespace["__file__"]) != os.path.normpath(config_file) ):
|
||||
config_file = os.path.abspath(config_file)
|
||||
|
||||
# Let's avoid one conf.py file just due to latex_documents
|
||||
start = config_file.find('Documentation/')
|
||||
if start >= 0:
|
||||
start = config_file.find('/', start + 1)
|
||||
|
||||
end = config_file.rfind('/')
|
||||
if start >= 0 and end > 0:
|
||||
dir = config_file[start + 1:end]
|
||||
|
||||
print("source directory: %s" % dir)
|
||||
new_latex_docs = []
|
||||
latex_documents = namespace['latex_documents']
|
||||
|
||||
for l in latex_documents:
|
||||
if l[0].find(dir + '/') == 0:
|
||||
has = True
|
||||
fn = l[0][len(dir) + 1:]
|
||||
new_latex_docs.append((fn, l[1], l[2], l[3], l[4]))
|
||||
break
|
||||
|
||||
namespace['latex_documents'] = new_latex_docs
|
||||
|
||||
# If there is an extra conf.py file, load it
|
||||
if os.path.isfile(config_file):
|
||||
sys.stdout.write("load additional sphinx-config: %s\n" % config_file)
|
||||
config = namespace.copy()
|
||||
@ -29,4 +52,6 @@ def loadConfig(namespace):
|
||||
del config['__file__']
|
||||
namespace.update(config)
|
||||
else:
|
||||
sys.stderr.write("WARNING: additional sphinx-config not found: %s\n" % config_file)
|
||||
config = namespace.copy()
|
||||
config['tags'].add("subproject")
|
||||
namespace.update(config)
|
||||
|
@ -242,8 +242,9 @@ del kernel:
|
||||
* Per inserire blocchi di testo con caratteri a dimensione fissa (codici di
|
||||
esempio, casi d'uso, eccetera): utilizzate ``::`` quando non è necessario
|
||||
evidenziare la sintassi, specialmente per piccoli frammenti; invece,
|
||||
utilizzate ``.. code-block:: <language>`` per blocchi di più lunghi che
|
||||
potranno beneficiare dell'avere la sintassi evidenziata.
|
||||
utilizzate ``.. code-block:: <language>`` per blocchi più lunghi che
|
||||
beneficeranno della sintassi evidenziata. Per un breve pezzo di codice da
|
||||
inserire nel testo, usate \`\`.
|
||||
|
||||
|
||||
Il dominio C
|
||||
@ -267,12 +268,14 @@ molto comune come ``open`` o ``ioctl``:
|
||||
|
||||
Il nome della funzione (per esempio ioctl) rimane nel testo ma il nome del suo
|
||||
riferimento cambia da ``ioctl`` a ``VIDIOC_LOG_STATUS``. Anche la voce
|
||||
nell'indice cambia in ``VIDIOC_LOG_STATUS`` e si potrà quindi fare riferimento
|
||||
a questa funzione scrivendo:
|
||||
nell'indice cambia in ``VIDIOC_LOG_STATUS``.
|
||||
|
||||
.. code-block:: rst
|
||||
|
||||
:c:func:`VIDIOC_LOG_STATUS`
|
||||
Notate che per una funzione non c'è bisogno di usare ``c:func:`` per generarne
|
||||
i riferimenti nella documentazione. Grazie a qualche magica estensione a
|
||||
Sphinx, il sistema di generazione della documentazione trasformerà
|
||||
automaticamente un riferimento ad una ``funzione()`` in un riferimento
|
||||
incrociato quando questa ha una voce nell'indice. Se trovate degli usi di
|
||||
``c:func:`` nella documentazione del kernel, sentitevi liberi di rimuoverli.
|
||||
|
||||
|
||||
Tabelle a liste
|
||||
|
@ -27,6 +27,7 @@ Di seguito le guide che ogni sviluppatore dovrebbe leggere.
|
||||
code-of-conduct
|
||||
development-process
|
||||
submitting-patches
|
||||
programming-language
|
||||
coding-style
|
||||
maintainer-pgp-guide
|
||||
email-clients
|
||||
|
@ -1,6 +1,7 @@
|
||||
.. include:: ../disclaimer-ita.rst
|
||||
|
||||
:Original: :ref:`Documentation/process/kernel-docs.rst <kernel_docs>`
|
||||
:Translator: Federico Vaga <federico.vaga@vaga.pv.it>
|
||||
|
||||
|
||||
.. _it_kernel_docs:
|
||||
@ -8,6 +9,10 @@
|
||||
Indice di documenti per le persone interessate a capire e/o scrivere per il kernel Linux
|
||||
========================================================================================
|
||||
|
||||
.. warning::
|
||||
|
||||
TODO ancora da tradurre
|
||||
.. note::
|
||||
Questo documento contiene riferimenti a documenti in lingua inglese; inoltre
|
||||
utilizza dai campi *ReStructuredText* di supporto alla ricerca e che per
|
||||
questo motivo è meglio non tradurre al fine di garantirne un corretto
|
||||
utilizzo.
|
||||
Per questi motivi il documento non verrà tradotto. Per favore fate
|
||||
riferimento al documento originale in lingua inglese.
|
||||
|
@ -248,7 +248,10 @@ possano ricevere la vostra nuova sottochiave::
|
||||
kernel.
|
||||
|
||||
Se per qualche ragione preferite rimanere con sottochiavi RSA, nel comando
|
||||
precedente, sostituite "ed25519" con "rsa2048".
|
||||
precedente, sostituite "ed25519" con "rsa2048". In aggiunta, se avete
|
||||
intenzione di usare un dispositivo hardware che non supporta le chiavi
|
||||
ED25519 ECC, come la Nitrokey Pro o la Yubikey, allora dovreste usare
|
||||
"nistp256" al posto di "ed25519".
|
||||
|
||||
Copia di riserva della chiave primaria per gestire il recupero da disastro
|
||||
--------------------------------------------------------------------------
|
||||
@ -449,23 +452,27 @@ implementi le funzionalità delle smartcard. Sul mercato ci sono diverse
|
||||
soluzioni disponibili:
|
||||
|
||||
- `Nitrokey Start`_: è Open hardware e Free Software, è basata sul progetto
|
||||
`GnuK`_ della FSIJ. Ha il supporto per chiavi ECC, ma meno funzionalità di
|
||||
sicurezza (come la resistenza alla manomissione o alcuni attacchi ad un
|
||||
canale laterale).
|
||||
`GnuK`_ della FSIJ. Questo è uno dei pochi dispositivi a supportare le chiavi
|
||||
ECC ED25519, ma offre meno funzionalità di sicurezza (come la resistenza
|
||||
alla manomissione o alcuni attacchi ad un canale laterale).
|
||||
- `Nitrokey Pro`_: è simile alla Nitrokey Start, ma è più resistente alla
|
||||
manomissione e offre più funzionalità di sicurezza, ma l'ECC.
|
||||
- `Yubikey 4`_: l'hardware e il software sono proprietari, ma è più economica
|
||||
manomissione e offre più funzionalità di sicurezza. La Pro 2 supporta la
|
||||
crittografia ECC (NISTP).
|
||||
- `Yubikey 5`_: l'hardware e il software sono proprietari, ma è più economica
|
||||
della Nitrokey Pro ed è venduta anche con porta USB-C il che è utile con i
|
||||
computer portatili più recenti. In aggiunta, offre altre funzionalità di
|
||||
sicurezza come FIDO, U2F, ma non l'ECC
|
||||
sicurezza come FIDO, U2F, e ora supporta anche le chiavi ECC (NISTP)
|
||||
|
||||
`Su LWN c'è una buona recensione`_ dei modelli elencati qui sopra e altri.
|
||||
La scelta dipenderà dal costo, dalla disponibilità nella vostra area
|
||||
geografica e vostre considerazioni sull'hardware aperto/proprietario.
|
||||
|
||||
Se volete usare chiavi ECC, la vostra migliore scelta sul mercato è la
|
||||
Nitrokey Start.
|
||||
|
||||
.. _`Nitrokey Start`: https://shop.nitrokey.com/shop/product/nitrokey-start-6
|
||||
.. _`Nitrokey Pro`: https://shop.nitrokey.com/shop/product/nitrokey-pro-3
|
||||
.. _`Yubikey 4`: https://www.yubico.com/product/yubikey-4-series/
|
||||
.. _`Nitrokey Pro 2`: https://shop.nitrokey.com/shop/product/nitrokey-pro-2-3
|
||||
.. _`Yubikey 5`: https://www.yubico.com/product/yubikey-5-overview/
|
||||
.. _Gnuk: http://www.fsij.org/doc-gnuk/
|
||||
.. _`Su LWN c'è una buona recensione`: https://lwn.net/Articles/736231/
|
||||
|
||||
|
@ -0,0 +1,51 @@
|
||||
.. include:: ../disclaimer-ita.rst
|
||||
|
||||
:Original: :ref:`Documentation/process/programming-language.rst <programming_language>`
|
||||
:Translator: Federico Vaga <federico.vaga@vaga.pv.it>
|
||||
|
||||
.. _it_programming_language:
|
||||
|
||||
Linguaggio di programmazione
|
||||
============================
|
||||
|
||||
Il kernel è scritto nel linguaggio di programmazione C [c-language]_.
|
||||
Più precisamente, il kernel viene compilato con ``gcc`` [gcc]_ usando
|
||||
l'opzione ``-std=gnu89`` [gcc-c-dialect-options]_: il dialetto GNU
|
||||
dello standard ISO C90 (con l'aggiunta di alcune funzionalità da C99)
|
||||
|
||||
Questo dialetto contiene diverse estensioni al linguaggio [gnu-extensions]_,
|
||||
e molte di queste vengono usate sistematicamente dal kernel.
|
||||
|
||||
Il kernel offre un certo livello di supporto per la compilazione con ``clang``
|
||||
[clang]_ e ``icc`` [icc]_ su diverse architetture, tuttavia in questo momento
|
||||
il supporto non è completo e richiede delle patch aggiuntive.
|
||||
|
||||
Attributi
|
||||
---------
|
||||
|
||||
Una delle estensioni più comuni e usate nel kernel sono gli attributi
|
||||
[gcc-attribute-syntax]_. Gli attributi permettono di aggiungere una semantica,
|
||||
definita dell'implementazione, alle entità del linguaggio (come le variabili,
|
||||
le funzioni o i tipi) senza dover fare importanti modifiche sintattiche al
|
||||
linguaggio stesso (come l'aggiunta di nuove parole chiave) [n2049]_.
|
||||
|
||||
In alcuni casi, gli attributi sono opzionali (ovvero un compilatore che non
|
||||
dovesse supportarli dovrebbe produrre comunque codice corretto, anche se
|
||||
più lento o che non esegue controlli aggiuntivi durante la compilazione).
|
||||
|
||||
Il kernel definisce alcune pseudo parole chiave (per esempio ``__pure``)
|
||||
in alternativa alla sintassi GNU per gli attributi (per esempio
|
||||
``__attribute__((__pure__))``) allo scopo di mostrare quali funzionalità si
|
||||
possono usare e/o per accorciare il codice.
|
||||
|
||||
Per maggiori informazioni consultate il file d'intestazione
|
||||
``include/linux/compiler_attributes.h``.
|
||||
|
||||
.. [c-language] http://www.open-std.org/jtc1/sc22/wg14/www/standards
|
||||
.. [gcc] https://gcc.gnu.org
|
||||
.. [clang] https://clang.llvm.org
|
||||
.. [icc] https://software.intel.com/en-us/c-compilers
|
||||
.. [gcc-c-dialect-options] https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html
|
||||
.. [gnu-extensions] https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html
|
||||
.. [gcc-attribute-syntax] https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
|
||||
.. [n2049] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2049.pdf
|
@ -569,7 +569,7 @@ ACQUIRE 는 해당 오퍼레이션의 로드 부분에만 적용되고 RELEASE
|
||||
|
||||
[*] 버스 마스터링 DMA 와 일관성에 대해서는 다음을 참고하시기 바랍니다:
|
||||
|
||||
Documentation/PCI/pci.rst
|
||||
Documentation/driver-api/pci/pci.rst
|
||||
Documentation/DMA-API-HOWTO.txt
|
||||
Documentation/DMA-API.txt
|
||||
|
||||
|
@ -1,10 +0,0 @@
|
||||
# -*- coding: utf-8; mode: python -*-
|
||||
|
||||
project = "The Linux kernel user-space API guide"
|
||||
|
||||
tags.add("subproject")
|
||||
|
||||
latex_documents = [
|
||||
('index', 'userspace-api.tex', project,
|
||||
'The kernel development community', 'manual'),
|
||||
]
|
Some files were not shown because too many files have changed in this diff Show More
Loading…
Reference in New Issue
Block a user