Commit Graph

7315 Commits

Author SHA1 Message Date
Linus Torvalds
a50243b1dd 5.1 Merge Window Pull Request
This has been a slightly more active cycle than normal with ongoing core
 changes and quite a lot of collected driver updates.
 
 - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe
 
 - A new data transfer mode for HFI1 giving higher performance
 
 - Significant functional and bug fix update to the mlx5 On-Demand-Paging MR
   feature
 
 - A chip hang reset recovery system for hns
 
 - Change mm->pinned_vm to an atomic64
 
 - Update bnxt_re to support a new 57500 chip
 
 - A sane netlink 'rdma link add' method for creating rxe devices and fixing
   the various unregistration race conditions in rxe's unregister flow
 
 - Allow lookup up objects by an ID over netlink
 
 - Various reworking of the core to driver interface:
   * Drivers should not assume umem SGLs are in PAGE_SIZE chunks
   * ucontext is accessed via udata not other means
   * Start to make the core code responsible for object memory
     allocation
   * Drivers should convert struct device to struct ib_device
     via a helper
   * Drivers have more tools to avoid use after unregister problems
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlyAJYYACgkQOG33FX4g
 mxrWwQ/+OyAx4Moru7Aix0C6GWxTJp/wKgw21CS3reZxgLai6x81xNYG/s2wCNjo
 IccObVd7mvzyqPdxOeyHBsJBbQDqWvoD6O2duH8cqGMgBRgh3CSdUep2zLvPpSAx
 2W1SvWYCLDnCuarboFrCA8c4AN3eCZiqD7z9lHyFQGjy3nTUWzk1uBaOP46uaiMv
 w89N8EMdXJ/iY6ONzihvE05NEYbMA8fuvosKLLNdghRiHIjbMQU8SneY23pvyPDd
 ZziPu9NcO3Hw9OVbkwtJp47U3KCBgvKHmnixyZKkikjiD+HVoABw2IMwcYwyBZwP
 Bic/ddONJUvAxMHpKRnQaW7znAiHARk21nDG28UAI7FWXH/wMXgicMp6LRcNKqKF
 vqXdxHTKJb0QUR4xrYI+eA8ihstss7UUpgSgByuANJ0X729xHiJtlEvPb1DPo1Dz
 9CB4OHOVRl5O8sA5Jc6PSusZiKEpvWoyWbdmw0IiwDF5pe922VLl5Nv88ta+sJ38
 v2Ll5AgYcluk7F3599Uh9D7gwp5hxW2Ph3bNYyg2j3HP4/dKsL9XvIJPXqEthgCr
 3KQS9rOZfI/7URieT+H+Mlf+OWZhXsZilJG7No0fYgIVjgJ00h3SF1/299YIq6Qp
 9W7ZXBfVSwLYA2AEVSvGFeZPUxgBwHrSZ62wya4uFeB1jyoodPk=
 =p12E
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This has been a slightly more active cycle than normal with ongoing
  core changes and quite a lot of collected driver updates.

   - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe

   - A new data transfer mode for HFI1 giving higher performance

   - Significant functional and bug fix update to the mlx5
     On-Demand-Paging MR feature

   - A chip hang reset recovery system for hns

   - Change mm->pinned_vm to an atomic64

   - Update bnxt_re to support a new 57500 chip

   - A sane netlink 'rdma link add' method for creating rxe devices and
     fixing the various unregistration race conditions in rxe's
     unregister flow

   - Allow lookup up objects by an ID over netlink

   - Various reworking of the core to driver interface:
       - drivers should not assume umem SGLs are in PAGE_SIZE chunks
       - ucontext is accessed via udata not other means
       - start to make the core code responsible for object memory
         allocation
       - drivers should convert struct device to struct ib_device via a
         helper
       - drivers have more tools to avoid use after unregister problems"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (280 commits)
  net/mlx5: ODP support for XRC transport is not enabled by default in FW
  IB/hfi1: Close race condition on user context disable and close
  RDMA/umem: Revert broken 'off by one' fix
  RDMA/umem: minor bug fix in error handling path
  RDMA/hns: Use GFP_ATOMIC in hns_roce_v2_modify_qp
  cxgb4: kfree mhp after the debug print
  IB/rdmavt: Fix concurrency panics in QP post_send and modify to error
  IB/rdmavt: Fix loopback send with invalidate ordering
  IB/iser: Fix dma_nents type definition
  IB/mlx5: Set correct write permissions for implicit ODP MR
  bnxt_re: Clean cq for kernel consumers only
  RDMA/uverbs: Don't do double free of allocated PD
  RDMA: Handle ucontext allocations by IB/core
  RDMA/core: Fix a WARN() message
  bnxt_re: fix the regression due to changes in alloc_pbl
  IB/mlx4: Increase the timeout for CM cache
  IB/core: Abort page fault handler silently during owning process exit
  IB/mlx5: Validate correct PD before prefetch MR
  IB/mlx5: Protect against prefetch of invalid MR
  RDMA/uverbs: Store PR pointer before it is overwritten
  ...
2019-03-09 15:53:03 -08:00
Linus Torvalds
96a6de1a54 media updates for v5.1-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcgUDlAAoJEAhfPr2O5OEV2kIP/AiHMkMGi/fXmwzN0tFjYkim
 39t6rodj6rT/oMib4XvW55GjQy5sdXwz+1jE+kZA5imbUvt6YzUXFBzIBOGOIF0n
 1MukKa7M6ragnm2yR+42ucBr3jcuc91/keeVzWgP2cgeZeKUlBHme+rECYnwqDdT
 9rcG4U2XL0Wolbm4lAispaWYIYoOURvPeryJ244vlPmch5/2nmXbG7AgNlfJsAw4
 NFmdHBWxLeyB8F95ToikhuNlTWrsvdVHPHbDaDPwioSulZ1vw+lu4CHRd1uZo2iH
 W0INE65ukgyenzTDbmnj5/oWCqV4KRTs8A2x6eimz+wG/60jWQjDiBLSzhxjBH7x
 alrwhxnW3bD31ZUCkmaGd1+3txvLf+Lup9lLX3GCBKA45dW9pzVCLfxSfNaKKlTL
 0xCYSMxl5xbl8TL6hHxK7/n+LsButgTRWIoJpqkM9uPrljwzznpgqJvARqSuHEKJ
 3Tvnkc2DZsmlM8L02i929BsrsoTncm6wBBVlCJzhL0VNaOuL7yJVzXhrw7b/dZZw
 IZu6cH5RrZhIQR4y1UPlaEZoidUGvR0+K997AsURIHJA0RolWE5eI2JHSE86EX8S
 bzG5SChkQmbpYt5OXQvg5VxvqVElx/5/tamcHe/rKwaAwaG9aI9HICgP2e0Zaoce
 YOMJUpcHtSY5Fedk8P1a
 =tD1x
 -----END PGP SIGNATURE-----

Merge tag 'media/v5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - remove sensor drivers that got converted from soc_camera

 - remaining soc_camera drivers got moved to staging

 - some documentation cleanups and improvements

 - the imx staging driver now supports imx7

 - the ov9640, mt9m001 and mt9m111 got converted from soc_camera

 - the vim2m driver now does what a m2m convert driver expects to do

 - epoll() fixes on media subsystems

 - several drivers fixes, typos, cleanups and improvements

* tag 'media/v5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (346 commits)
  media: dvb/earth-pt1: fix wrong initialization for demod blocks
  media: vim2m: Address some coding style issues
  media: vim2m: don't use BUG()
  media: vim2m: speedup passthrough copy
  media: vim2m: add an horizontal scaler
  media: vim2m: don't accept YUYV anymore as output format
  media: vim2m: add vertical linear scaler
  media: vim2m: better handle cap/out buffers with different sizes
  media: vim2m: use different framesizes for bayer formats
  media: vim2m: add support for VIDIOC_ENUM_FRAMESIZES
  media: vim2m: ensure that width is multiple of two
  media: vim2m: improve debug messages
  media: vim2m: add bayer capture formats
  media: a few more typos at staging, pci, platform, radio and usb
  media: Documentation: fix several typos
  media: staging: fix several typos
  media: include: fix several typos
  media: common: fix several typos
  media: v4l2-core: fix several typos
  media: usb: fix several typos
  ...
2019-03-09 14:45:54 -08:00
Linus Torvalds
38e7571c07 io_uring-2019-03-06
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlyAJvAQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgphb+EACFaKI2HIdjExQ5T7Cxebzwky+Qiro3FV55
 ziW00FZrkJ5g0h4ItBzh/5SDlcNQYZDMlA3s4xzWIMadWl5PjMPq1uJul0cITbSl
 WIJO5hpgNMXeUEhvcXUl6+f/WzpgYUxN40uW8N5V7EKlooaFVfudDqJGlvEv+UgB
 g8NWQYThSG+/e7r9OGwK0xDRVKfpjxVvmqmnDH3DrxKaDgSOwTf4xn1u41wKwfQ3
 3uPfQ+GBeTqt4a2AhOi7K6KQFNnj5Jz5CXYMiOZI2JGtLPcL6dmyBVD7K0a0HUr+
 rs4ghNdd1+puvPGNK4TX8qV0uiNrMctoRNVA/JDd1ZTYEKTmNLxeFf+olfYHlwuK
 K5FRs60/lgNzNkzcUpFvJHitPwYtxYJdB36PyswE1FZP1YviEeVoKNt9W8aIhEoA
 549uj90brfA74eCINGhq98pJqj9CNyCPw3bfi76f5Ej2utwYDb9S5Cp2gfSa853X
 qc/qNda9efEq7ikwCbPzhekRMXZo6TSXtaSmC2C+Vs5+mD1Scc4kdAvdCKGQrtr9
 aoy0iQMYO2NDZ/G5fppvXtMVuEPAZWbsGftyOe15IlMysjRze2ycJV8cFahKEVM9
 uBeXLyH1pqGU/j7ABP4+XRZ/sbHJTwjKJbnXhTgBsdU8XO/CR3U+kRQFTsidKMfH
 Wlo3uH2h2A==
 =p78E
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block

Pull io_uring IO interface from Jens Axboe:
 "Second attempt at adding the io_uring interface.

  Since the first one, we've added basic unit testing of the three
  system calls, that resides in liburing like the other unit tests that
  we have so far. It'll take a while to get full coverage of it, but
  we're working towards it. I've also added two basic test programs to
  tools/io_uring. One uses the raw interface and has support for all the
  various features that io_uring supports outside of standard IO, like
  fixed files, fixed IO buffers, and polled IO. The other uses the
  liburing API, and is a simplified version of cp(1).

  This adds support for a new IO interface, io_uring.

  io_uring allows an application to communicate with the kernel through
  two rings, the submission queue (SQ) and completion queue (CQ) ring.
  This allows for very efficient handling of IOs, see the v5 posting for
  some basic numbers:

    https://lore.kernel.org/linux-block/20190116175003.17880-1-axboe@kernel.dk/

  Outside of just efficiency, the interface is also flexible and
  extendable, and allows for future use cases like the upcoming NVMe
  key-value store API, networked IO, and so on. It also supports async
  buffered IO, something that we've always failed to support in the
  kernel.

  Outside of basic IO features, it supports async polled IO as well.
  This particular feature has already been tested at Facebook months ago
  for flash storage boxes, with 25-33% improvements. It makes polled IO
  actually useful for real world use cases, where even basic flash sees
  a nice win in terms of efficiency, latency, and performance. These
  boxes were IOPS bound before, now they are not.

  This series adds three new system calls. One for setting up an
  io_uring instance (io_uring_setup(2)), one for submitting/completing
  IO (io_uring_enter(2)), and one for aux functions like registrating
  file sets, buffers, etc (io_uring_register(2)). Through the help of
  Arnd, I've coordinated the syscall numbers so merge on that front
  should be painless.

  Jon did a writeup of the interface a while back, which (except for
  minor details that have been tweaked) is still accurate. Find that
  here:

    https://lwn.net/Articles/776703/

  Huge thanks to Al Viro for helping getting the reference cycle code
  correct, and to Jann Horn for his extensive reviews focused on both
  security and bugs in general.

  There's a userspace library that provides basic functionality for
  applications that don't need or want to care about how to fiddle with
  the rings directly. It has helpers to allow applications to easily set
  up an io_uring instance, and submit/complete IO through it without
  knowing about the intricacies of the rings. It also includes man pages
  (thanks to Jeff Moyer), and will continue to grow support helper
  functions and features as time progresses. Find it here:

    git://git.kernel.dk/liburing

  Fio has full support for the raw interface, both in the form of an IO
  engine (io_uring), but also with a small test application (t/io_uring)
  that can exercise and benchmark the interface"

* tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block:
  io_uring: add a few test tools
  io_uring: allow workqueue item to handle multiple buffered requests
  io_uring: add support for IORING_OP_POLL
  io_uring: add io_kiocb ref count
  io_uring: add submission polling
  io_uring: add file set registration
  net: split out functions related to registering inflight socket files
  io_uring: add support for pre-mapped user IO buffers
  block: implement bio helper to add iter bvec pages to bio
  io_uring: batch io_kiocb allocation
  io_uring: use fget/fput_many() for file references
  fs: add fget_many() and fput_many()
  io_uring: support for IO polling
  io_uring: add fsync support
  Add io_uring IO interface
2019-03-08 14:48:40 -08:00
Linus Torvalds
80201fe175 for-5.1/block-20190302
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlx63XIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpp2vEACfrrQsap7R+Av28mmXpmXi2FPa3g5Tev1t
 yYjK2qHvhlMZjPTYw3hCmbYdDDczlF7PEgSE2x2DjdcsYapb8Fy1lZ2X16c7ztBR
 HD/t9b5AVSQsczZzKgv3RqsNtTnjzS5V0A8XH8FAP2QRgiwDMwSN6G0FP0JBLbE/
 ZgxQrH1Iy1F33Wz4hI3Z7dEghKPZrH1IlegkZCEu47q9SlWS76qUetSy2GEtchOl
 3Lgu54mQZyVdI5/QZf9DyMDLF6dIz3tYU2qhuo01AHjGRCC72v86p8sIiXcUr94Q
 8pbegJhJ/g8KBol9Qhv3+pWG/QUAZwi/ZwasTkK+MJ4klRXfOrznxPubW1z6t9Vn
 QRo39Po5SqqP0QWAscDxCFjESIQlWlKa+LZurJL7DJDCUGrSgzTpnVwFqKwc5zTP
 HJa5MT2tEeL2TfUYRYCfh0ZV0elINdHA1y1klDBh38drh4EWr2gW8xdseGYXqRjh
 fLgEpoF7VQ8kTvxKN+E4jZXkcZmoLmefp0ZyAbblS6IawpPVC7kXM9Fdn2OU8f2c
 fjVjvSiqxfeN6dnpfeLDRbbN9894HwgP/LPropJOQ7KmjCorQq5zMDkAvoh3tElq
 qwluRqdBJpWT/F05KweY+XVW8OawIycmUWqt6JrVNoIDAK31auHQv47kR0VA4OvE
 DRVVhYpocw==
 =VBaU
 -----END PGP SIGNATURE-----

Merge tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block

Pull block layer updates from Jens Axboe:
 "Not a huge amount of changes in this round, the biggest one is that we
  finally have Mings multi-page bvec support merged. Apart from that,
  this pull request contains:

   - Small series that avoids quiescing the queue for sysfs changes that
     match what we currently have (Aleksei)

   - Series of bcache fixes (via Coly)

   - Series of lightnvm fixes (via Mathias)

   - NVMe pull request from Christoph. Nothing major, just SPDX/license
     cleanups, RR mp policy (Hannes), and little fixes (Bart,
     Chaitanya).

   - BFQ series (Paolo)

   - Save blk-mq cpu -> hw queue mapping, removing a pointer indirection
     for the fast path (Jianchao)

   - fops->iopoll() added for async IO polling, this is a feature that
     the upcoming io_uring interface will use (Christoph, me)

   - Partition scan loop fixes (Dongli)

   - mtip32xx conversion from managed resource API (Christoph)

   - cdrom registration race fix (Guenter)

   - MD pull from Song, two minor fixes.

   - Various documentation fixes (Marcos)

   - Multi-page bvec feature. This brings a lot of nice improvements
     with it, like more efficient splitting, larger IOs can be supported
     without growing the bvec table size, and so on. (Ming)

   - Various little fixes to core and drivers"

* tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block: (117 commits)
  block: fix updating bio's front segment size
  block: Replace function name in string with __func__
  nbd: propagate genlmsg_reply return code
  floppy: remove set but not used variable 'q'
  null_blk: fix checking for REQ_FUA
  block: fix NULL pointer dereference in register_disk
  fs: fix guard_bio_eod to check for real EOD errors
  blk-mq: use HCTX_TYPE_DEFAULT but not 0 to index blk_mq_tag_set->map
  block: optimize bvec iteration in bvec_iter_advance
  block: introduce mp_bvec_for_each_page() for iterating over page
  block: optimize blk_bio_segment_split for single-page bvec
  block: optimize __blk_segment_map_sg() for single-page bvec
  block: introduce bvec_nth_page()
  iomap: wire up the iopoll method
  block: add bio_set_polled() helper
  block: wire up block device iopoll method
  fs: add an iopoll method to struct file_operations
  loop: set GENHD_FL_NO_PART_SCAN after blkdev_reread_part()
  loop: do not print warn message if partition scan is successful
  block: bounce: make sure that bvec table is updated
  ...
2019-03-08 14:12:17 -08:00
Linus Torvalds
851ca779d1 drm next pull request for 5.1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcfzdFAAoJEAx081l5xIa+QxsP/A6QP+gx4vQ8XXikaJMNz89e
 59TAbXHW/2qFMHRtUesuB2bc1a2cw2ppFsrryG7c4HqjKDDHna7Yx2JzZYL0MmNh
 SpJYL4yMuu/2TmyCouaAYzzP+5Supdosfif4LRn3269DH0i5MWXL+NVrbeB47blG
 XwjQTu46yfn06IFAo5bI2jMqSuPCDd4Hzpyixpvmjt+r16XwoH5nGUrDCHG8t/jV
 +PUZCYAjn71in6Z66MKZv/EVCVFfTnaVJ2KEgw7e+vWxnERkRh/xnRO6KIXMD5O1
 vo2qc2vbxkGpjaE6pDzC/2e5pRJT8Ks0t50jYjbVF+6nHpP5XIPvAXH4R2QdTA7B
 Jiu8N0oz6wj0H3AJ/V38rEHWW8zgOfXkhbRBfmfQ9NfgiEfwxqCVgspIOwei4oVw
 hvMXYUBM1CU+JIfW6w7ZT4oHALUlnCpnr5DQRdCNRm8zjClyNfIAoJIJrOtqmX44
 qjEzSgxb89ZtS7c0yislSBaovgAmcM3I+aq5I4xokdY0hFEZ6QomuKunyuQ8pBYa
 3gsvMEReLxETffhhYpjBt5+b5IgB49nf3Y38CKFurv32Sp0p0YgK0qVo8qRQHclj
 QIJ+3+zQMCX20swYpCWXhOPUIwtQppdKhWzg12my8rL2VgTlYhjlEbL4EL+Wk+hv
 6Ipulthzn0RyrSK9Dojh
 =GlRQ
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-03-06' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for the 5.1 merge window.

  The big changes I'd highlight are:
   - nouveau has HMM support now, there is finally an in-tree user so we
     can quieten down the rip it out people.
   - i915 now enables fastboot by default on Skylake+
   - Displayport Multistream support has been refactored and should
     hopefully be more reliable.

  Core:
   - header cleanups aiming towards removing drmP.h
   - dma-buf fence seqnos to 64-bits
   - common helper for DP mst hotplug for radeon,i915,amdgpu + new
     refcounting scheme
   - MST i2c improvements
   - drm_syncobj_cb removal
   - ARM FB compression fourcc
   - P010 + P016 fourcc
   - allwinner tiled format modifier
   - i2c over aux I2C_M_STOP support
   - DRM_AUTH handling fixes

  TTM:
   - ref/unref renaming

  New driver:
   - ARM komeda display driver

  scheduler:
   - refactor mirror list handling
   - rework hw fence processing
   - 0 run queue entity fix

  bridge:
   - TI DS90C185 LVDS bridge
   - thc631lvdm83d bridge improvements
   - cadence + allwinner DSI ported to generic phy

  panels:
   - Sitronix ST7701 panel
   - Kingdisplay KD097D04
   - LeMaker BL035-RGB-002
   - PDA 91-00156-A0
   - Innolux EE101IA-01D

  i915:
   - Enable fastboot by default on SKL+/VLV/CHV
   - Export RPCS configuration for ICL media driver
   - Coffelake PCI ID
   - CNL clocks setup fixes
   - ACPI/PMIC support for MIPI/DSI
   - Per-engine WA init for all engines
   - Shrinker locking fixes
   - Kerneldoc updates
   - Lots of ring improvements and reset fixes
   - Coffeelake GVT Support
   - VFIO GVT EDID Region support
   - runtime PM wakeref tracking
   - ILK->IVB primary plane enable delays
   - userptr mutex locking fixes
   - DSI fixes
   - LVDS/TV cleanups
   - HW readout fixes
   - LUT robustness fixes
   - ICL display and watermark fixes
   - gem mmap race fix

  amdgpu:
   - add scheduled dependencies interface
   - DCC on scanout surfaces
   - vega10/20 BACO support
   - Multiple IH rings on soc15
   - XGMI locking fixes
   - DC i2c/aux cleanups
   - runtime SMU debug interface
   - Kexec improvmeents
   - SR-IOV fixes
   - DC freesync + ABM fixes
   - GDS fixes
   - GPUVM fixes
   - vega20 PCIE DPM switching fixes
   - Context priority handling fixes

  radeon:
   - fix missing break in evergreen parser

  nouveau:
   - SVM support via HMM

  msm:
   - QCOM Compressed modifier support

  exynos:
   - s5pv210 rotator support

  imx:
   - zpos property support
   - pending update fixes

  v3d:
   - cache flush improvments

  vc4:
   - reflection support
   - HDMI overscan support

  tegra:
   - CEC refactoring
   - HDMI audio fixes
   - Tegra186 prep work
   - SOR crossbar device tree fixes

  sun4i:
   - implicit fencing support
   - YUV and scalar support improvements
   - A23 support
   - tiling fixes

  atmel-hlcdc:
   - clipping and rotation property fixes

  qxl:
   - BO and PRIME improvements
   - generic fbdev emulation

  dw-hdmi:
   - HDMI 2.0 2160p
   - YUV420 ouput

  rockchip:
   - implicit fencing support
   - reflection proerties

  virtio-gpu:
   - use generic fbdev emulation

  tilcdc:
   - cpufreq vs crtc init fix

  rcar-du:
   - R8A774C0 support
   - D3/E3 RGB output routing fixes and DPAD0 support
   - RA87744 LVDS support

  bochs:
   - atomic and generic fbdev emulation
   - ID mismatch error on bochs load

  meson:
   - remove firmware fbs"

* tag 'drm-next-2019-03-06' of git://anongit.freedesktop.org/drm/drm: (1130 commits)
  drm/amd/display: Use vrr friendly pageflip throttling in DC.
  drm/imx: only send commit done event when all state has been applied
  drm/imx: allow building under COMPILE_TEST
  drm/imx: imx-tve: depend on COMMON_CLK
  drm/imx: ipuv3-plane: add zpos property
  drm/imx: ipuv3-plane: add function to query atomic update status
  gpu: ipu-v3: prg: add function to get channel configure status
  gpu: ipu-v3: pre: add double buffer status readback
  drm/amdgpu: Bump amdgpu version for context priority override.
  drm/amdgpu/powerplay: fix typo in BACO header guards
  drm/amdgpu/powerplay: fix return codes in BACO code
  drm/amdgpu: add missing license on baco files
  drm/bochs: Fix the ID mismatch error
  drm/nouveau/dmem: use dma addresses during migration copies
  drm/nouveau/dmem: use physical vram addresses during migration copies
  drm/nouveau/dmem: extend copy function to allow direct use of physical addresses
  drm/nouveau/svm: new ioctl to migrate process memory to GPU memory
  drm/nouveau/dmem: device memory helpers for SVM
  drm/nouveau/svm: initial support for shared virtual memory
  drm/nouveau: prepare for enabling svm with existing userspace interfaces
  ...
2019-03-08 08:23:15 -08:00
Linus Torvalds
b5dd0c658c Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - some of the rest of MM

 - various misc things

 - dynamic-debug updates

 - checkpatch

 - some epoll speedups

 - autofs

 - rapidio

 - lib/, lib/lzo/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
  samples/mic/mpssd/mpssd.h: remove duplicate header
  kernel/fork.c: remove duplicated include
  include/linux/relay.h: fix percpu annotation in struct rchan
  arch/nios2/mm/fault.c: remove duplicate include
  unicore32: stop printing the virtual memory layout
  MAINTAINERS: fix GTA02 entry and mark as orphan
  mm: create the new vm_fault_t type
  arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
  arch: simplify several early memory allocations
  openrisc: simplify pte_alloc_one_kernel()
  sh: prefer memblock APIs returning virtual address
  microblaze: prefer memblock API returning virtual address
  powerpc: prefer memblock APIs returning virtual address
  lib/lzo: separate lzo-rle from lzo
  lib/lzo: implement run-length encoding
  lib/lzo: fast 8-byte copy on arm64
  lib/lzo: 64-bit CTZ on arm64
  lib/lzo: tidy-up ifdefs
  ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
  ipc: annotate implicit fall through
  ...
2019-03-07 19:25:37 -08:00
Oleg Nesterov
6eb3c3d0a5 exec: increase BINPRM_BUF_SIZE to 256
Large enterprise clients often run applications out of networked file
systems where the IT mandated layout of project volumes can end up
leading to paths that are longer than 128 characters.  Bumping this up
to the next order of two solves this problem in all but the most
egregious case while still fitting into a 512b slab.

[oleg@redhat.com: update comment, per Kees]
Link: http://lkml.kernel.org/r/20181112160956.GA28472@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Ben Woodard <woodard@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Ian Kent
60d6d04ca3 autofs: add ignore mount option
Add an autofs file system mount option that can be used to provide a
generic indicator to applications that the mount entry should be ignored
when displaying mount information.

In other OSes that provide autofs and that provide a mount list to user
space based on the kernel mount list a no-op mount option ("ignore" is
the one use on the most common OS) is allowed so that autofs file system
users can optionally use it.

The idea is that it be used by user space programs to exclude autofs
mounts from consideration when reading the mounts list.

Prior to the change to link /etc/mtab to /proc/self/mounts all I needed
to do to achieve this was to use mount(2) and not update the mtab but
now that no longer works.

I know the symlinking happened a long time ago and I considered doing
this then but, at the time I couldn't remember the commonly used option
name and thought persuading the various utility maintainers would be too
hard.

But now I have a RHEL request to do this for compatibility for a widely
used product so I want to go ahead with it and try and enlist the help
of some utility package maintainers.

Clearly, without the option nothing can be done so it's at least a
start.

Link: http://lkml.kernel.org/r/154725123970.11260.6113771566924907275.stgit@pluto-themaw-net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Masahiro Yamada
54d50897d5 linux/kernel.h: split *_MAX and *_MIN macros into <linux/limits.h>
<linux/kernel.h> tends to be cluttered because we often put various sort
of unrelated stuff in it.  So, we have split out a sensible chunk of
code into a separate header from time to time.

This commit splits out the *_MAX and *_MIN defines.

The standard header <limits.h> contains various MAX, MIN constants
including numerial limits.  [1]

I think it makes sense to move in-kernel MAX, MIN constants into
include/linux/limits.h.

We already have include/uapi/linux/limits.h to contain some user-space
constants.  I changed its include guard to _UAPI_LINUX_LIMITS_H.  This
change has no impact to the user-space because
scripts/headers_install.sh rips off the '_UAPI' prefix from the include
guards of exported headers.

[1] http://pubs.opengroup.org/onlinepubs/009604499/basedefs/limits.h.html

Link: http://lkml.kernel.org/r/1549156242-20806-2-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Alex Elder <elder@linaro.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:31:59 -08:00
Linus Torvalds
b1e243957e for-5.1-part1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlx9czQACgkQxWXV+ddt
 WDvC9w/8CxJf1/eZBqb+b+aA38kgZhoaNixMud/IW/IFmIlicX0PoDxk6dh1ZTA+
 3uej/7fyfwjNCVvtrPVVxdT8zhZgyJouHrbhG1PlDWtmTEV2VqV5pBG1xQtCwmZy
 oinQI5oYYM5Le5EXxRGH8TQs6Z3tFuLx2kcrVWBLFKoZ2kZBZxe6KykGF9izve4a
 sVjtOL1CEL1e00vrNLzUmch8qss9Cu0i3qd3k8UANp3SgKIaOkJt4S/HeEcLfy5J
 kf6hVKlgPDuakVtAJKyhbLVQsfHVNkfiyvplta9lDot/iJchJITTRkadP6LblVeo
 knl8V+VO9kzQUvGauxtu66Q3DJ/7mqbzHUwPISetdKCV9ZXkuPFHnu0AEP577mrx
 e1JAPA/a8lF3up5QhqIb0uzH3sczOd8nNN/b1Xnxl7Kogyl8SUjhmX3FFy88borj
 /8Ptv/fFMQZs9IJ0QWlkh5TKRXAtSNAzVy2FpkvLaO0k0gJKQjyJuTKV5ezv/PGU
 +4m5kDtfpyz//KAOZxq4lERj4EMIEDhHhNbA8Qqmdeoj7oaKZ+gW57enOXohCTbi
 gVE6xDr2u4oQ85j3JuQo5W5mZA4uza35Gh4t43n5akdrrkLFVLW584hDxShGx9uS
 B0maToGbzOdGJTZXZ2SLHZ5Da14Lzb/TooCufF8GMAISb99vbjw=
 =D9zc
 -----END PGP SIGNATURE-----

Merge tag 'for-5.1-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "This contains usual mix of new features, core changes and fixes; full
  list below. I'm planning second pull request, with a few more fixes
  that arrived recently but too close to merge window, will send it next
  week.

  New features:

   - support zstd compression levels

   - new ioctl to unregister a device from the module (ie. reverse of
     device scan)

   - scrub prints a message to log when it's about to start or finish

  Core changes:

   - qgroups can now skip part of a tree that does not get updated
     during relocation, because this does not affect the quota
     accounting, estimated speedup in run time is about 20%

   - the compression workspace management had to be enhanced due to zstd
     requirements

   - various enospc fixes, when there's high fragmentation the
     over-reservation can cause ENOSPC that might not happen after a
     flush, in such cases try to wait if the situation improves

  Fixes:

   - various ioctls could overwrite previous return value if
     copy_to_user fails, fix this so the original error is reported

   - more reclaim vs GFP_KERNEL fixes

   - other cleanups and refactoring

   - fix a (valid) lockdep warning in a test when device replace is
     destroying worker threads

   - make qgroup async transaction commit more aggressive, this avoids
     some 'quota limit reached' errors if there are not enough data to
     trigger transaction in order to flush

   - fix deadlock between snapshot deletion and quotas when backref
     walking is called from context that already holds the same locks

   - fsync fixes:
      - fix fsync after succession of renames of different files
      - fix fsync after succession of renames and unlink/rmdir"

* tag 'for-5.1-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (92 commits)
  btrfs: Remove unnecessary casts in btrfs_read_root_item
  Btrfs: remove assertion when searching for a key in a node/leaf
  Btrfs: add missing error handling after doing leaf/node binary search
  btrfs: drop the lock on error in btrfs_dev_replace_cancel
  btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
  btrfs: init csum_list before possible free
  Btrfs: remove no longer needed range length checks for deduplication
  Btrfs: fix fsync after succession of renames and unlink/rmdir
  Btrfs: fix fsync after succession of renames of different files
  btrfs: honor path->skip_locking in backref code
  btrfs: qgroup: Make qgroup async transaction commit more aggressive
  btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record
  btrfs: scrub: remove unused nocow worker pointer
  btrfs: scrub: add assertions for worker pointers
  btrfs: scrub: convert scrub_workers_refcnt to refcount_t
  btrfs: scrub: add scrub_lock lockdep check in scrub_workers_get
  btrfs: scrub: fix circular locking dependency warning
  btrfs: fix comment its device list mutex not volume lock
  btrfs: extent_io: Kill the forward declaration of flush_write_bio
  btrfs: Fix grossly misleading argument names in extent io search
  ...
2019-03-07 09:07:30 -08:00
Linus Torvalds
0556161ff9 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlx5SzMACgkQnJ2qBz9k
 QNnP0AgAl9HDtk436P4QPPFhdeXBB6uRTYU8wgZSropXMxoyUzqZotWkeUqUusHs
 I8BAJeQojZeKUnMwET1/RA+dMLsgAlMcxkM3+3eDCLF9wh/+wO2B2G3ywh8KMVed
 Sa3C4V/NZvZzAossoGDV/yWmK+ZYrrW8l/DM3LU54GV1NfAL+Khn4FNwtgWiYiP5
 S4RkRflzwhIEZJSZByMlCLcsrHl/ehtMJuR1opUPY1c0CY8iAGcIobSSzVFqv4f5
 ScRB56rnXqTt6CBGBcDIkWxWqEE9XuTdAn1AVMC7327UezyFNocXUZaqDzJcGcR/
 rM3sf+seZftbA4nid8dIhSRoqAf4Yw==
 =/DQr
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fanotify updates from Jan Kara:
 "Support for fanotify directory events and changes to make waiting for
  fanotify permission event response killable"

* tag 'fsnotify_for_v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (25 commits)
  fanotify: Make waits for fanotify events only killable
  fanotify: Use interruptible wait when waiting for permission events
  fanotify: Track permission event state
  fanotify: Simplify cleaning of access_list
  fsnotify: Create function to remove event from notification list
  fanotify: Move locking inside get_one_event()
  fanotify: Fold dequeue_event() into process_access_response()
  fanotify: Select EXPORTFS
  fanotify: report FAN_ONDIR to listener with FAN_REPORT_FID
  fanotify: add support for create/attrib/move/delete events
  fanotify: support events with data type FSNOTIFY_EVENT_INODE
  fanotify: check FS_ISDIR flag instead of d_is_dir()
  fsnotify: report FS_ISDIR flag with MOVE_SELF and DELETE_SELF events
  fanotify: use vfs_get_fsid() helper instead of vfs_statfs()
  vfs: add vfs_get_fsid() helper
  fanotify: cache fsid in fsnotify_mark_connector
  fanotify: enable FAN_REPORT_FID init flag
  fanotify: copy event fid info to user
  fanotify: encode file identifier for FAN_REPORT_FID
  fanotify: open code fill_event_metadata()
  ...
2019-03-07 09:03:38 -08:00
Linus Torvalds
67e79a6dc2 TTY/Serial patches for 5.1-rc1
Here is the "big" patchset for the tty/serial driver layer for 5.1-rc1.
 
 It's really not all that big, nothing major here.
 
 There are a lot of tiny driver fixes and updates, combined with other
 cleanups for different serial drivers and the vt layer.  Full details
 are in the shortlog.
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+g7A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymoRgCgy8MZ6vayopxNdRBnp4mMmgtaVYMAoMIPYq0V
 XuhV/ZSENWdef2QuaJ3O
 =nSZk
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial updates from Greg KH:
 "Here is the "big" patchset for the tty/serial driver layer for
  5.1-rc1.

  It's really not all that big, nothing major here.

  There are a lot of tiny driver fixes and updates, combined with other
  cleanups for different serial drivers and the vt layer. Full details
  are in the shortlog.

  All of these have been in linux-next with no reported issues"

* tag 'tty-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (70 commits)
  tty: xilinx_uartps: Correct return value in probe
  serial: sprd: Modify the baud rate calculation formula
  dt-bindings: serial: Add Milbeaut serial driver description
  serial: 8250_of: assume reg-shift of 2 for mrvl,mmp-uart
  serial: 8250_pxa: honor the port number from devicetree
  tty: hvc_xen: Mark expected switch fall-through
  tty: n_gsm: Mark expected switch fall-throughs
  tty: serial: msm_serial: Remove __init from msm_console_setup()
  tty: serial: samsung: Enable baud clock during initialisation
  serial: uartps: Fix stuck ISR if RX disabled with non-empty FIFO
  tty: serial: remove redundant likely annotation
  tty/n_hdlc: mark expected switch fall-through
  serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
  serial: 8250_pci: Fix number of ports for ACCES serial cards
  vt: perform safe console erase in the right order
  tty/nozomi: use pci_iomap instead of ioremap_nocache
  tty/synclink: remove ISA support
  serial: 8250_pci: Replace custom code with pci_match_id()
  serial: max310x: Correction of the initial setting of the MODE1 bits for various supported ICs.
  serial: mps2-uart: Add parentheses around conditional in mps2_uart_shutdown
  ...
2019-03-06 16:35:12 -08:00
Linus Torvalds
e266ca36da Staging/IIO patches for 5.1-rc1
Here is the big staging/iio driver pull request for 5.1-rc1.
 
 Lots of good IIO driver updates and cleanups in here as always.
 Combined with the removal of the xgifb driver, we have a net "loss" of
 over 9000 lines in the pull request, always a nice thing.
 
 As the outreachy application process is currently happening, there are
 loads of tiny checkpatch cleanup fixes all over the staging tree, which
 accounts for the majority of the fixups.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+gLQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymBiQCeJpoBhG+W3r+kP8w65ZY8qU+/liIAn0Tkl4/k
 IX1dQzCsEpO1jA8AHj6n
 =7wCH
 -----END PGP SIGNATURE-----

Merge tag 'staging-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/IIO updates from Greg KH:
 "Here is the big staging/iio driver pull request for 5.1-rc1.

  Lots of good IIO driver updates and cleanups in here as always.
  Combined with the removal of the xgifb driver, we have a net "loss" of
  over 9000 lines in the pull request, always a nice thing.

  As the outreachy application process is currently happening, there are
  loads of tiny checkpatch cleanup fixes all over the staging tree,
  which accounts for the majority of the fixups"

* tag 'staging-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (341 commits)
  staging: mt7621-dma: remove license boilerplate text
  staging: mt7621-dma: add SPDX GPL-2.0+ license identifier
  Staging: ks7010: Replace typecast to int
  Staging: vt6655: Align a static function declaration
  staging: speakup: fix line over 80 characters.
  staging: mt7621-eth: Remove license boilerplate text
  staging: mt7621-eth: Add SPDX license identifier
  staging: ks7010: removed custom Michael MIC implementation.
  staging: rtl8192e: Fix space and suspect issue
  Staging: vt6655: Modify comment style of SPDX License Identifier
  Staging: vt6655: Modify comment style for SPDX-License-Identifier
  Staging: vt6655: Align a function declaration
  Staging: vt6655: Alignment of function declaration
  staging: rtl8712: Fix indentation issue
  staging: wilc1000: fix incorrent type in initializer
  staging: rtl8188eu: remove unused P2P_PRIVATE_IOCTL_SET_LEN
  staging: rtl8188eu: remove unused enum P2P_PROTO_WK_ID
  staging: rtl8723bs: Remove duplicated include from drv_types.h
  Staging: vt6655: Alignment should match open parenthesis
  staging: erofs: fix mis-acted TAIL merging behavior
  ...
2019-03-06 16:29:27 -08:00
Linus Torvalds
45763bf4bc Char/Misc driver patches for 5.1-rc1
Here is the big char/misc driver patch pull request for 5.1-rc1.
 
 The largest thing by far is the new habanalabs driver for their AI
 accelerator chip.  For now it is in the drivers/misc directory but will
 probably move to a new directory soon along with other drivers of this
 type.
 
 Other than that, just the usual set of individual driver updates and
 fixes.  There's an "odd" merge in here from the DRM tree that they asked
 me to do as the MEI driver is starting to interact with the i915 driver,
 and it needed some coordination.  All of those patches have been
 properly acked by the relevant subsystem maintainers.
 
 All of these have been in linux-next with no reported issues, most for
 quite some time.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+dPQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ym1fACgvpZAxjNzoRQJ6f06tc8ujtPk9rUAnR+tCtrZ
 9e3l7H76oe33o96Qjhor
 =8A2k
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big char/misc driver patch pull request for 5.1-rc1.

  The largest thing by far is the new habanalabs driver for their AI
  accelerator chip. For now it is in the drivers/misc directory but will
  probably move to a new directory soon along with other drivers of this
  type.

  Other than that, just the usual set of individual driver updates and
  fixes. There's an "odd" merge in here from the DRM tree that they
  asked me to do as the MEI driver is starting to interact with the i915
  driver, and it needed some coordination. All of those patches have
  been properly acked by the relevant subsystem maintainers.

  All of these have been in linux-next with no reported issues, most for
  quite some time"

* tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (219 commits)
  habanalabs: adjust Kconfig to fix build errors
  habanalabs: use %px instead of %p in error print
  habanalabs: use do_div for 64-bit divisions
  intel_th: gth: Fix an off-by-one in output unassigning
  habanalabs: fix little-endian<->cpu conversion warnings
  habanalabs: use NULL to initialize array of pointers
  habanalabs: fix little-endian<->cpu conversion warnings
  habanalabs: soft-reset device if context-switch fails
  habanalabs: print pointer using %p
  habanalabs: fix memory leak with CBs with unaligned size
  habanalabs: return correct error code on MMU mapping failure
  habanalabs: add comments in uapi/misc/habanalabs.h
  habanalabs: extend QMAN0 job timeout
  habanalabs: set DMA0 completion to SOB 1007
  habanalabs: fix validation of WREG32 to DMA completion
  habanalabs: fix mmu cache registers init
  habanalabs: disable CPU access on timeouts
  habanalabs: add MMU DRAM default page mapping
  habanalabs: Dissociate RAZWI info from event types
  misc/habanalabs: adjust Kconfig to fix build errors
  ...
2019-03-06 14:18:59 -08:00
Jens Axboe
221c5eb233 io_uring: add support for IORING_OP_POLL
This is basically a direct port of bfe4037e72, which implements a
one-shot poll command through aio. Description below is based on that
commit as well. However, instead of adding a POLL command and relying
on io_cancel(2) to remove it, we mimic the epoll(2) interface of
having a command to add a poll notification, IORING_OP_POLL_ADD,
and one to remove it again, IORING_OP_POLL_REMOVE.

To poll for a file descriptor the application should submit an sqe of
type IORING_OP_POLL. It will poll the fd for the events specified in the
poll_events field.

Unlike poll or epoll without EPOLLONESHOT this interface always works in
one shot mode, that is once the sqe is completed, it will have to be
resubmitted.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Based-on-code-from: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-06 13:00:12 -07:00
Linus Torvalds
8dcd175bc3 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few misc things

 - ocfs2 updates

 - most of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits)
  tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
  proc: more robust bulk read test
  proc: test /proc/*/maps, smaps, smaps_rollup, statm
  proc: use seq_puts() everywhere
  proc: read kernel cpu stat pointer once
  proc: remove unused argument in proc_pid_lookup()
  fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
  fs/proc/self.c: code cleanup for proc_setup_self()
  proc: return exit code 4 for skipped tests
  mm,mremap: bail out earlier in mremap_to under map pressure
  mm/sparse: fix a bad comparison
  mm/memory.c: do_fault: avoid usage of stale vm_area_struct
  writeback: fix inode cgroup switching comment
  mm/huge_memory.c: fix "orig_pud" set but not used
  mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
  mm/memcontrol.c: fix bad line in comment
  mm/cma.c: cma_declare_contiguous: correct err handling
  mm/page_ext.c: fix an imbalance with kmemleak
  mm/compaction: pass pgdat to too_many_isolated() instead of zone
  mm: remove zone_lru_lock() function, access ->lru_lock directly
  ...
2019-03-06 10:31:36 -08:00
Linus Torvalds
fa29f5ba42 asm-generic changes for v5.1
Only a few small changes this time:
 
 - Michael S. Tsirkin cleans up linux/mman.h
 - Mike Rapoport found a typo
 
 I had originally merged another cleanup series for I/O accessors from
 Hugo Lefeuvre as well, but dropped it after the discussion of the barrier
 semantics and some conflicts. I expect this series to get merged for a
 later release though.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcf9yrAAoJEGCrR//JCVInohIP+wcCZ3XEym0oyb9C5BgIdnMF
 qq69wbOT9ypAnB469PPjsvGhyCsqGo1Fm2Cludac0C/OfmB/lYJCqklAJrc6a7Cp
 jXD5rQkXQZGdmFHS81i24ejcNV9F5fh16vyDwqCgUQDCgg9MeDPirvwaDl958rhq
 5dUsoUu+CJRI35jZ8NPWbsSU5Wa2BpckWnuTs97PtlLnMH/RzEyxmSRysxtHNA7S
 PgYjvWBHbRK6mTNgcY/eojAoQuQmBgiOppYX1XTffHZgwY/VBIjYL3Wxwe1qbvSN
 vvLgE6nvvmRYjdq4VOoVbi9BOtWiCtXw29TWXfxSetN9CWCEwIQJdIm8lHn1CJRh
 6GnOdb8ADqAzraxo7GOf78kNsekl3mxh+QYN5gDaGS6eQASCNlGx70zUh7Y6JcE5
 6NJ8VMy0oEEFzHAsE0mxluu8LHL3F1hwS832D6mQ697Z71T+IoHgIcMT2jHqp9fw
 7/D2taoBAXdGqhToY3hhMMWsi4lFCvxjVVlxCxhp7Ik+cyOgW0O5vG2afLOHrtti
 vWEcn7M6nuKvb3MxBVjg8sK2ln6vIXNgZGjVmkxn70ZAmvZJX+KE+G1cvB2gMGc0
 S/ogtpbETMgMCwuAf2hdYgiBFJpZL725DEWFfS+p+02PTwjdKF+EWAS4swZM0sCE
 U1v1N7yCe/Wb/ijEm0M1
 =C9p7
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "Only a few small changes this time:

   - Michael S. Tsirkin cleans up linux/mman.h

   - Mike Rapoport found a typo

  I had originally merged another cleanup series for I/O accessors from
  Hugo Lefeuvre as well, but dropped it after the discussion of the
  barrier semantics and some conflicts. I expect this series to get
  merged for a later release though"

* tag 'asm-generic-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic/page.h: fix typo in #error text requiring a real asm/page.h
  arch: move common mmap flags to linux/mman.h
  drm: tweak header name
  x86/mpx: tweak header name
2019-03-06 09:18:43 -08:00
Linus Torvalds
203b6609e0 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Lots of tooling updates - too many to list, here's a few highlights:

   - Various subcommand updates to 'perf trace', 'perf report', 'perf
     record', 'perf annotate', 'perf script', 'perf test', etc.

   - CPU and NUMA topology and affinity handling improvements,

   - HW tracing and HW support updates:
      - Intel PT updates
      - ARM CoreSight updates
      - vendor HW event updates

   - BPF updates

   - Tons of infrastructure updates, both on the build system and the
     library support side

   - Documentation updates.

   - ... and lots of other changes, see the changelog for details.

  Kernel side updates:

   - Tighten up kprobes blacklist handling, reduce the number of places
     where developers can install a kprobe and hang/crash the system.

   - Fix/enhance vma address filter handling.

   - Various PMU driver updates, small fixes and additions.

   - refcount_t conversions

   - BPF updates

   - error code propagation enhancements

   - misc other changes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
  perf script python: Add Python3 support to syscall-counts-by-pid.py
  perf script python: Add Python3 support to syscall-counts.py
  perf script python: Add Python3 support to stat-cpi.py
  perf script python: Add Python3 support to stackcollapse.py
  perf script python: Add Python3 support to sctop.py
  perf script python: Add Python3 support to powerpc-hcalls.py
  perf script python: Add Python3 support to net_dropmonitor.py
  perf script python: Add Python3 support to mem-phys-addr.py
  perf script python: Add Python3 support to failed-syscalls-by-pid.py
  perf script python: Add Python3 support to netdev-times.py
  perf tools: Add perf_exe() helper to find perf binary
  perf script: Handle missing fields with -F +..
  perf data: Add perf_data__open_dir_data function
  perf data: Add perf_data__(create_dir|close_dir) functions
  perf data: Fail check_backup in case of error
  perf data: Make check_backup work over directories
  perf tools: Add rm_rf_perf_data function
  perf tools: Add pattern name checking to rm_rf
  perf tools: Add depth checking to rm_rf
  perf data: Add global path holder
  ...
2019-03-06 07:59:36 -08:00
Joel Fernandes (Google)
ab3948f58f mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd
Android uses ashmem for sharing memory regions.  We are looking forward
to migrating all usecases of ashmem to memfd so that we can possibly
remove the ashmem driver in the future from staging while also
benefiting from using memfd and contributing to it.  Note staging
drivers are also not ABI and generally can be removed at anytime.

One of the main usecases Android has is the ability to create a region
and mmap it as writeable, then add protection against making any
"future" writes while keeping the existing already mmap'ed
writeable-region active.  This allows us to implement a usecase where
receivers of the shared memory buffer can get a read-only view, while
the sender continues to write to the buffer.  See CursorWindow
documentation in Android for more details:

  https://developer.android.com/reference/android/database/CursorWindow

This usecase cannot be implemented with the existing F_SEAL_WRITE seal.
To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal
which prevents any future mmap and write syscalls from succeeding while
keeping the existing mmap active.

A better way to do F_SEAL_FUTURE_WRITE seal was discussed [1] last week
where we don't need to modify core VFS structures to get the same
behavior of the seal.  This solves several side-effects pointed by Andy.
self-tests are provided in later patch to verify the expected semantics.

[1] https://lore.kernel.org/lkml/20181111173650.GA256781@google.com/

Thanks a lot to Andy for suggestions to improve code.

Link: http://lkml.kernel.org/r/20190112203816.85534-2-joel@joelfernandes.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Marc-Andr Lureau <marcandre.lureau@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:19 -08:00
David Hildenbrand
ca215086b1 mm: convert PG_balloon to PG_offline
PG_balloon was introduced to implement page migration/compaction for
pages inflated in virtio-balloon.  Nowadays, it is only a marker that a
page is part of virtio-balloon and therefore logically offline.

We also want to make use of this flag in other balloon drivers - for
inflated pages or when onlining a section but keeping some pages offline
(e.g.  used right now by XEN and Hyper-V via set_online_page_callback()).

We are going to expose this flag to dump tools like makedumpfile.  But
instead of exposing PG_balloon, let's generalize the concept of marking
pages as logically offline, so it can be reused for other purposes later
on.

Rename PG_balloon to PG_offline.  This is an indicator that the page is
logically offline, the content stale and that it should not be touched
(e.g.  a hypervisor would have to allocate backing storage in order for
the guest to dump an unused page).  We can then e.g.  exclude such pages
from dumps.

We replace and reuse KPF_BALLOON (23), as this shouldn't really harm
(and for now the semantics stay the same).  In following patches, we
will make use of this bit also in other balloon drivers.  While at it,
document PGTABLE.

[akpm@linux-foundation.org: fix comment text, per David]
Link: http://lkml.kernel.org/r/20181119101616.8901-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Pankaj gupta <pagupta@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Freche <jfreche@vmware.com>
Cc: Kairui Song <kasong@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
Linus Torvalds
b1b988a6a0 Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull year 2038 updates from Thomas Gleixner:
 "Another round of changes to make the kernel ready for 2038. After lots
  of preparatory work this is the first set of syscalls which are 2038
  safe:

    403 clock_gettime64
    404 clock_settime64
    405 clock_adjtime64
    406 clock_getres_time64
    407 clock_nanosleep_time64
    408 timer_gettime64
    409 timer_settime64
    410 timerfd_gettime64
    411 timerfd_settime64
    412 utimensat_time64
    413 pselect6_time64
    414 ppoll_time64
    416 io_pgetevents_time64
    417 recvmmsg_time64
    418 mq_timedsend_time64
    419 mq_timedreceiv_time64
    420 semtimedop_time64
    421 rt_sigtimedwait_time64
    422 futex_time64
    423 sched_rr_get_interval_time64

  The syscall numbers are identical all over the architectures"

* 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  riscv: Use latest system call ABI
  checksyscalls: fix up mq_timedreceive and stat exceptions
  unicore32: Fix __ARCH_WANT_STAT64 definition
  asm-generic: Make time32 syscall numbers optional
  asm-generic: Drop getrlimit and setrlimit syscalls from default list
  32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
  compat ABI: use non-compat openat and open_by_handle_at variants
  y2038: add 64-bit time_t syscalls to all 32-bit architectures
  y2038: rename old time and utime syscalls
  y2038: remove struct definition redirects
  y2038: use time32 syscall names on 32-bit
  syscalls: remove obsolete __IGNORE_ macros
  y2038: syscalls: rename y2038 compat syscalls
  x86/x32: use time64 versions of sigtimedwait and recvmmsg
  timex: change syscalls to use struct __kernel_timex
  timex: use __kernel_timex internally
  sparc64: add custom adjtimex/clock_adjtime functions
  time: fix sys_timer_settime prototype
  time: Add struct __kernel_timex
  time: make adjtime compat handling available for 32 bit
  ...
2019-03-05 14:08:26 -08:00
Linus Torvalds
edaed168e1 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti update from Thomas Gleixner:
 "Just a single change from the anti-performance departement:

   - Add a new PR_SPEC_DISABLE_NOEXEC option which allows to apply the
     speculation protections on a process without inheriting the state
     on exec.

     This remedies a situation where a Java-launcher has speculation
     protections enabled because that's the default for JVMs which
     causes the launched regular harmless processes to inherit the
     protection state which results in unintended performance
     degradation"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Add PR_SPEC_DISABLE_NOEXEC
2019-03-05 12:50:34 -08:00
Christian Brauner
3eb39f4793
signal: add pidfd_send_signal() syscall
The kill() syscall operates on process identifiers (pid). After a process
has exited its pid can be reused by another process. If a caller sends a
signal to a reused pid it will end up signaling the wrong process. This
issue has often surfaced and there has been a push to address this problem [1].

This patch uses file descriptors (fd) from proc/<pid> as stable handles on
struct pid. Even if a pid is recycled the handle will not change. The fd
can be used to send signals to the process it refers to.
Thus, the new syscall pidfd_send_signal() is introduced to solve this
problem. Instead of pids it operates on process fds (pidfd).

/* prototype and argument /*
long pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags);

/* syscall number 424 */
The syscall number was chosen to be 424 to align with Arnd's rework in his
y2038 to minimize merge conflicts (cf. [25]).

In addition to the pidfd and signal argument it takes an additional
siginfo_t and flags argument. If the siginfo_t argument is NULL then
pidfd_send_signal() is equivalent to kill(<positive-pid>, <signal>). If it
is not NULL pidfd_send_signal() is equivalent to rt_sigqueueinfo().
The flags argument is added to allow for future extensions of this syscall.
It currently needs to be passed as 0. Failing to do so will cause EINVAL.

/* pidfd_send_signal() replaces multiple pid-based syscalls */
The pidfd_send_signal() syscall currently takes on the job of
rt_sigqueueinfo(2) and parts of the functionality of kill(2), Namely, when a
positive pid is passed to kill(2). It will however be possible to also
replace tgkill(2) and rt_tgsigqueueinfo(2) if this syscall is extended.

/* sending signals to threads (tid) and process groups (pgid) */
Specifically, the pidfd_send_signal() syscall does currently not operate on
process groups or threads. This is left for future extensions.
In order to extend the syscall to allow sending signal to threads and
process groups appropriately named flags (e.g. PIDFD_TYPE_PGID, and
PIDFD_TYPE_TID) should be added. This implies that the flags argument will
determine what is signaled and not the file descriptor itself. Put in other
words, grouping in this api is a property of the flags argument not a
property of the file descriptor (cf. [13]). Clarification for this has been
requested by Eric (cf. [19]).
When appropriate extensions through the flags argument are added then
pidfd_send_signal() can additionally replace the part of kill(2) which
operates on process groups as well as the tgkill(2) and
rt_tgsigqueueinfo(2) syscalls.
How such an extension could be implemented has been very roughly sketched
in [14], [15], and [16]. However, this should not be taken as a commitment
to a particular implementation. There might be better ways to do it.
Right now this is intentionally left out to keep this patchset as simple as
possible (cf. [4]).

/* naming */
The syscall had various names throughout iterations of this patchset:
- procfd_signal()
- procfd_send_signal()
- taskfd_send_signal()
In the last round of reviews it was pointed out that given that if the
flags argument decides the scope of the signal instead of different types
of fds it might make sense to either settle for "procfd_" or "pidfd_" as
prefix. The community was willing to accept either (cf. [17] and [18]).
Given that one developer expressed strong preference for the "pidfd_"
prefix (cf. [13]) and with other developers less opinionated about the name
we should settle for "pidfd_" to avoid further bikeshedding.

The  "_send_signal" suffix was chosen to reflect the fact that the syscall
takes on the job of multiple syscalls. It is therefore intentional that the
name is not reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the
fomer because it might imply that pidfd_send_signal() is a replacement for
kill(2), and not the latter because it is a hassle to remember the correct
spelling - especially for non-native speakers - and because it is not
descriptive enough of what the syscall actually does. The name
"pidfd_send_signal" makes it very clear that its job is to send signals.

/* zombies */
Zombies can be signaled just as any other process. No special error will be
reported since a zombie state is an unreliable state (cf. [3]). However,
this can be added as an extension through the @flags argument if the need
ever arises.

/* cross-namespace signals */
The patch currently enforces that the signaler and signalee either are in
the same pid namespace or that the signaler's pid namespace is an ancestor
of the signalee's pid namespace. This is done for the sake of simplicity
and because it is unclear to what values certain members of struct
siginfo_t would need to be set to (cf. [5], [6]).

/* compat syscalls */
It became clear that we would like to avoid adding compat syscalls
(cf. [7]).  The compat syscall handling is now done in kernel/signal.c
itself by adding __copy_siginfo_from_user_generic() which lets us avoid
compat syscalls (cf. [8]). It should be noted that the addition of
__copy_siginfo_from_user_any() is caused by a bug in the original
implementation of rt_sigqueueinfo(2) (cf. 12).
With upcoming rework for syscall handling things might improve
significantly (cf. [11]) and __copy_siginfo_from_user_any() will not gain
any additional callers.

/* testing */
This patch was tested on x64 and x86.

/* userspace usage */
An asciinema recording for the basic functionality can be found under [9].
With this patch a process can be killed via:

 #define _GNU_SOURCE
 #include <errno.h>
 #include <fcntl.h>
 #include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/stat.h>
 #include <sys/syscall.h>
 #include <sys/types.h>
 #include <unistd.h>

 static inline int do_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
                                         unsigned int flags)
 {
 #ifdef __NR_pidfd_send_signal
         return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
 #else
         return -ENOSYS;
 #endif
 }

 int main(int argc, char *argv[])
 {
         int fd, ret, saved_errno, sig;

         if (argc < 3)
                 exit(EXIT_FAILURE);

         fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
         if (fd < 0) {
                 printf("%s - Failed to open \"%s\"\n", strerror(errno), argv[1]);
                 exit(EXIT_FAILURE);
         }

         sig = atoi(argv[2]);

         printf("Sending signal %d to process %s\n", sig, argv[1]);
         ret = do_pidfd_send_signal(fd, sig, NULL, 0);

         saved_errno = errno;
         close(fd);
         errno = saved_errno;

         if (ret < 0) {
                 printf("%s - Failed to send signal %d to process %s\n",
                        strerror(errno), sig, argv[1]);
                 exit(EXIT_FAILURE);
         }

         exit(EXIT_SUCCESS);
 }

/* Q&A
 * Given that it seems the same questions get asked again by people who are
 * late to the party it makes sense to add a Q&A section to the commit
 * message so it's hopefully easier to avoid duplicate threads.
 *
 * For the sake of progress please consider these arguments settled unless
 * there is a new point that desperately needs to be addressed. Please make
 * sure to check the links to the threads in this commit message whether
 * this has not already been covered.
 */
Q-01: (Florian Weimer [20], Andrew Morton [21])
      What happens when the target process has exited?
A-01: Sending the signal will fail with ESRCH (cf. [22]).

Q-02:  (Andrew Morton [21])
       Is the task_struct pinned by the fd?
A-02:  No. A reference to struct pid is kept. struct pid - as far as I
       understand - was created exactly for the reason to not require to
       pin struct task_struct (cf. [22]).

Q-03: (Andrew Morton [21])
      Does the entire procfs directory remain visible? Just one entry
      within it?
A-03: The same thing that happens right now when you hold a file descriptor
      to /proc/<pid> open (cf. [22]).

Q-04: (Andrew Morton [21])
      Does the pid remain reserved?
A-04: No. This patchset guarantees a stable handle not that pids are not
      recycled (cf. [22]).

Q-05: (Andrew Morton [21])
      Do attempts to signal that fd return errors?
A-05: See {Q,A}-01.

Q-06: (Andrew Morton [22])
      Is there a cleaner way of obtaining the fd? Another syscall perhaps.
A-06: Userspace can already trivially retrieve file descriptors from procfs
      so this is something that we will need to support anyway. Hence,
      there's no immediate need to add another syscalls just to make
      pidfd_send_signal() not dependent on the presence of procfs. However,
      adding a syscalls to get such file descriptors is planned for a
      future patchset (cf. [22]).

Q-07: (Andrew Morton [21] and others)
      This fd-for-a-process sounds like a handy thing and people may well
      think up other uses for it in the future, probably unrelated to
      signals. Are the code and the interface designed to permit such
      future applications?
A-07: Yes (cf. [22]).

Q-08: (Andrew Morton [21] and others)
      Now I think about it, why a new syscall? This thing is looking
      rather like an ioctl?
A-08: This has been extensively discussed. It was agreed that a syscall is
      preferred for a variety or reasons. Here are just a few taken from
      prior threads. Syscalls are safer than ioctl()s especially when
      signaling to fds. Processes are a core kernel concept so a syscall
      seems more appropriate. The layout of the syscall with its four
      arguments would require the addition of a custom struct for the
      ioctl() thereby causing at least the same amount or even more
      complexity for userspace than a simple syscall. The new syscall will
      replace multiple other pid-based syscalls (see description above).
      The file-descriptors-for-processes concept introduced with this
      syscall will be extended with other syscalls in the future. See also
      [22], [23] and various other threads already linked in here.

Q-09: (Florian Weimer [24])
      What happens if you use the new interface with an O_PATH descriptor?
A-09:
      pidfds opened as O_PATH fds cannot be used to send signals to a
      process (cf. [2]). Signaling processes through pidfds is the
      equivalent of writing to a file. Thus, this is not an operation that
      operates "purely at the file descriptor level" as required by the
      open(2) manpage. See also [4].

/* References */
[1]:  https://lore.kernel.org/lkml/20181029221037.87724-1-dancol@google.com/
[2]:  https://lore.kernel.org/lkml/874lbtjvtd.fsf@oldenburg2.str.redhat.com/
[3]:  https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6fhva@brauner.io/
[4]:  https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru2ku@brauner.io/
[5]:  https://lore.kernel.org/lkml/20181121213946.GA10795@mail.hallyn.com/
[6]:  https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6nv4@brauner.io/
[7]:  https://lore.kernel.org/lkml/36323361-90BD-41AF-AB5B-EE0D7BA02C21@amacapital.net/
[8]:  https://lore.kernel.org/lkml/87tvjxp8pc.fsf@xmission.com/
[9]:  https://asciinema.org/a/IQjuCHew6bnq1cr78yuMv16cy
[11]: https://lore.kernel.org/lkml/F53D6D38-3521-4C20-9034-5AF447DF62FF@amacapital.net/
[12]: https://lore.kernel.org/lkml/87zhtjn8ck.fsf@xmission.com/
[13]: https://lore.kernel.org/lkml/871s6u9z6u.fsf@xmission.com/
[14]: https://lore.kernel.org/lkml/20181206231742.xxi4ghn24z4h2qki@brauner.io/
[15]: https://lore.kernel.org/lkml/20181207003124.GA11160@mail.hallyn.com/
[16]: https://lore.kernel.org/lkml/20181207015423.4miorx43l3qhppfz@brauner.io/
[17]: https://lore.kernel.org/lkml/CAGXu5jL8PciZAXvOvCeCU3wKUEB_dU-O3q0tDw4uB_ojMvDEew@mail.gmail.com/
[18]: https://lore.kernel.org/lkml/20181206222746.GB9224@mail.hallyn.com/
[19]: https://lore.kernel.org/lkml/20181208054059.19813-1-christian@brauner.io/
[20]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[21]: https://lore.kernel.org/lkml/20181228152012.dbf0508c2508138efc5f2bbe@linux-foundation.org/
[22]: https://lore.kernel.org/lkml/20181228233725.722tdfgijxcssg76@brauner.io/
[23]: https://lwn.net/Articles/773459/
[24]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[25]: https://lore.kernel.org/lkml/CAK8P3a0ej9NcJM8wXNPbcGUyOUZYX+VLoDFdbenW3s3114oQZw@mail.gmail.com/

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Aleksa Sarai <cyphar@cyphar.com>
2019-03-05 17:03:53 +01:00
David S. Miller
f7fb7c1a1c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-03-04

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add AF_XDP support to libbpf. Rationale is to facilitate writing
   AF_XDP applications by offering higher-level APIs that hide many
   of the details of the AF_XDP uapi. Sample programs are converted
   over to this new interface as well, from Magnus.

2) Introduce a new cant_sleep() macro for annotation of functions
   that cannot sleep and use it in BPF_PROG_RUN() to assert that
   BPF programs run under preemption disabled context, from Peter.

3) Introduce per BPF prog stats in order to monitor the usage
   of BPF; this is controlled by kernel.bpf_stats_enabled sysctl
   knob where monitoring tools can make use of this to efficiently
   determine the average cost of programs, from Alexei.

4) Split up BPF selftest's test_progs similarly as we already
   did with test_verifier. This allows to further reduce merge
   conflicts in future and to get more structure into our
   quickly growing BPF selftest suite, from Stanislav.

5) Fix a bug in BTF's dedup algorithm which can cause an infinite
   loop in some circumstances; also various BPF doc fixes and
   improvements, from Andrii.

6) Various BPF sample cleanups and migration to libbpf in order
   to further isolate the old sample loader code (so we can get
   rid of it at some point), from Jakub.

7) Add a new BPF helper for BPF cgroup skb progs that allows
   to set ECN CE code point and a Host Bandwidth Manager (HBM)
   sample program for limiting the bandwidth used by v2 cgroups,
   from Lawrence.

8) Enable write access to skb->queue_mapping from tc BPF egress
   programs in order to let BPF pick TX queue, from Jesper.

9) Fix a bug in BPF spinlock handling for map-in-map which did
   not propagate spin_lock_off to the meta map, from Yonghong.

10) Fix a bug in the new per-CPU BPF prog counters to properly
    initialize stats for each CPU, from Eric.

11) Add various BPF helper prototypes to selftest's bpf_helpers.h,
    from Willem.

12) Fix various BPF samples bugs in XDP and tracing progs,
    from Toke, Daniel and Yonghong.

13) Silence preemption splat in test_bpf after BPF_PROG_RUN()
    enforces it now everywhere, from Anders.

14) Fix a signedness bug in libbpf's btf_dedup_ref_type() to
    get error handling working, from Dan.

15) Fix bpftool documentation and auto-completion with regards
    to stream_{verdict,parser} attach types, from Alban.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-04 10:14:31 -08:00
Francesco Ruggeri
9036b2fe09 net: ipv6: add socket option IPV6_ROUTER_ALERT_ISOLATE
By default IPv6 socket with IPV6_ROUTER_ALERT socket option set will
receive all IPv6 RA packets from all namespaces.
IPV6_ROUTER_ALERT_ISOLATE socket option restricts packets received by
the socket to be only from the socket's namespace.

Signed-off-by: Maxim Martynov <maxim@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:05:10 -08:00
Kevin Darbyshire-Bryant
0b5c7efdfc sch_cake: Permit use of connmarks as tin classifiers
Add flag 'FWMARK' to enable use of firewall connmarks as tin selector.
The connmark (skbuff->mark) needs to be in the range 1->tin_cnt ie.
for diffserv3 the mark needs to be 1->3.

Background

Typically CAKE uses DSCP as the basis for tin selection.  DSCP values
are relatively easily changed as part of the egress path, usually with
iptables & the mangle table, ingress is more challenging.  CAKE is often
used on the WAN interface of a residential gateway where passthrough of
DSCP from the ISP is either missing or set to unhelpful values thus use
of ingress DSCP values for tin selection isn't helpful in that
environment.

An approach to solving the ingress tin selection problem is to use
CAKE's understanding of tc filters.  Naive tc filters could match on
source/destination port numbers and force tin selection that way, but
multiple filters don't scale particularly well as each filter must be
traversed whether it matches or not. e.g. a simple example to map 3
firewall marks to tins:

MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
tc filter add dev $DEV parent $MAJOR protocol all handle 0x01 fw action skbedit priority ${MAJOR}1
tc filter add dev $DEV parent $MAJOR protocol all handle 0x02 fw action skbedit priority ${MAJOR}2
tc filter add dev $DEV parent $MAJOR protocol all handle 0x03 fw action skbedit priority ${MAJOR}3

Another option is to use eBPF cls_act with tc filters e.g.

MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
tc filter add dev $DEV parent $MAJOR bpf da obj my-bpf-fwmark-to-class.o

This has the disadvantages of a) needing someone to write & maintain
the bpf program, b) a bpf toolchain to compile it and c) needing to
hardcode the major number in the bpf program so it matches the cake
instance (or forcing the cake instance to a particular major number)
since the major number cannot be passed to the bpf program via tc
command line.

As already hinted at by the previous examples, it would be helpful
to associate tins with something that survives the Internet path and
ideally allows tin selection on both egress and ingress.  Netfilter's
conntrack permits setting an identifying mark on a connection which
can also be restored to an ingress packet with tc action connmark e.g.

tc filter add dev eth0 parent ffff: protocol all prio 10 u32 \
	match u32 0 0 flowid 1:1 action connmark action mirred egress redirect dev ifb1

Since tc's connmark action has restored any connmark into skb->mark,
any of the previous solutions are based upon it and in one form or
another copy that mark to the skb->priority field where again CAKE
picks this up.

This change cuts out at least one of the (less intuitive &
non-scalable) middlemen and permit direct access to skb->mark.

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:14:28 -08:00
brakmo
f7c917ba11 bpf: add bpf helper bpf_skb_ecn_set_ce
This patch adds a new bpf helper BPF_FUNC_skb_ecn_set_ce
"int bpf_skb_ecn_set_ce(struct sk_buff *skb)". It is added to
BPF_PROG_TYPE_CGROUP_SKB typed bpf_prog which currently can
be attached to the ingress and egress path. The helper is needed
because his type of bpf_prog cannot modify the skb directly.

This helper is used to set the ECN field of ECN capable IP packets to ce
(congestion encountered) in the IPv6 or IPv4 header of the skb. It can be
used by a bpf_prog to manage egress or ingress network bandwdith limit
per cgroupv2 by inducing an ECN response in the TCP sender.
This works best when using DCTCP.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-02 10:48:27 -08:00
Joerg Roedel
d05e4c8600 Merge branches 'iommu/fixes', 'arm/msm', 'arm/tegra', 'arm/mediatek', 'x86/vt-d', 'x86/amd', 'hyper-v' and 'core' into next 2019-03-01 11:24:51 +01:00
Jens Axboe
6c271ce2f1 io_uring: add submission polling
This enables an application to do IO, without ever entering the kernel.
By using the SQ ring to fill in new sqes and watching for completions
on the CQ ring, we can submit and reap IOs without doing a single system
call. The kernel side thread will poll for new submissions, and in case
of HIPRI/polled IO, it'll also poll for completions.

By default, we allow 1 second of active spinning. This can by changed
by passing in a different grace period at io_uring_register(2) time.
If the thread exceeds this idle time without having any work to do, it
will set:

sq_ring->flags |= IORING_SQ_NEED_WAKEUP.

The application will have to call io_uring_enter() to start things back
up again. If IO is kept busy, that will never be needed. Basically an
application that has this feature enabled will guard it's
io_uring_enter(2) call with:

read_barrier();
if (*sq_ring->flags & IORING_SQ_NEED_WAKEUP)
	io_uring_enter(fd, 0, 0, IORING_ENTER_SQ_WAKEUP);

instead of calling it unconditionally.

It's mandatory to use fixed files with this feature. Failure to do so
will result in the application getting an -EBADF CQ entry when
submitting IO.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Jens Axboe
6b06314c47 io_uring: add file set registration
We normally have to fget/fput for each IO we do on a file. Even with
the batching we do, the cost of the atomic inc/dec of the file usage
count adds up.

This adds IORING_REGISTER_FILES, and IORING_UNREGISTER_FILES opcodes
for the io_uring_register(2) system call. The arguments passed in must
be an array of __s32 holding file descriptors, and nr_args should hold
the number of file descriptors the application wishes to pin for the
duration of the io_uring instance (or until IORING_UNREGISTER_FILES is
called).

When used, the application must set IOSQE_FIXED_FILE in the sqe->flags
member. Then, instead of setting sqe->fd to the real fd, it sets sqe->fd
to the index in the array passed in to IORING_REGISTER_FILES.

Files are automatically unregistered when the io_uring instance is torn
down. An application need only unregister if it wishes to register a new
set of fds.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Jens Axboe
edafccee56 io_uring: add support for pre-mapped user IO buffers
If we have fixed user buffers, we can map them into the kernel when we
setup the io_uring. That avoids the need to do get_user_pages() for
each and every IO.

To utilize this feature, the application must call io_uring_register()
after having setup an io_uring instance, passing in
IORING_REGISTER_BUFFERS as the opcode. The argument must be a pointer to
an iovec array, and the nr_args should contain how many iovecs the
application wishes to map.

If successful, these buffers are now mapped into the kernel, eligible
for IO. To use these fixed buffers, the application must use the
IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED opcodes, and then
set sqe->index to the desired buffer index. sqe->addr..sqe->addr+seq->len
must point to somewhere inside the indexed buffer.

The application may register buffers throughout the lifetime of the
io_uring instance. It can call io_uring_register() with
IORING_UNREGISTER_BUFFERS as the opcode to unregister the current set of
buffers, and then register a new set. The application need not
unregister buffers explicitly before shutting down the io_uring
instance.

It's perfectly valid to setup a larger buffer, and then sometimes only
use parts of it for an IO. As long as the range is within the originally
mapped region, it will work just fine.

For now, buffers must not be file backed. If file backed buffers are
passed in, the registration will fail with -1/EOPNOTSUPP. This
restriction may be relaxed in the future.

RLIMIT_MEMLOCK is used to check how much memory we can pin. A somewhat
arbitrary 1G per buffer size is also imposed.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Jens Axboe
def596e955 io_uring: support for IO polling
Add support for a polled io_uring instance. When a read or write is
submitted to a polled io_uring, the application must poll for
completions on the CQ ring through io_uring_enter(2). Polled IO may not
generate IRQ completions, hence they need to be actively found by the
application itself.

To use polling, io_uring_setup() must be used with the
IORING_SETUP_IOPOLL flag being set. It is illegal to mix and match
polled and non-polled IO on an io_uring.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Christoph Hellwig
c992fe2925 io_uring: add fsync support
Add a new fsync opcode, which either syncs a range if one is passed,
or the whole file if the offset and length fields are both cleared
to zero.  A flag is provided to use fdatasync semantics, that is only
force out metadata which is required to retrieve the file data, but
not others like metadata.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Jens Axboe
2b188cc1bb Add io_uring IO interface
The submission queue (SQ) and completion queue (CQ) rings are shared
between the application and the kernel. This eliminates the need to
copy data back and forth to submit and complete IO.

IO submissions use the io_uring_sqe data structure, and completions
are generated in the form of io_uring_cqe data structures. The SQ
ring is an index into the io_uring_sqe array, which makes it possible
to submit a batch of IOs without them being contiguous in the ring.
The CQ ring is always contiguous, as completion events are inherently
unordered, and hence any io_uring_cqe entry can point back to an
arbitrary submission.

Two new system calls are added for this:

io_uring_setup(entries, params)
	Sets up an io_uring instance for doing async IO. On success,
	returns a file descriptor that the application can mmap to
	gain access to the SQ ring, CQ ring, and io_uring_sqes.

io_uring_enter(fd, to_submit, min_complete, flags, sigset, sigsetsize)
	Initiates IO against the rings mapped to this fd, or waits for
	them to complete, or both. The behavior is controlled by the
	parameters passed in. If 'to_submit' is non-zero, then we'll
	try and submit new IO. If IORING_ENTER_GETEVENTS is set, the
	kernel will wait for 'min_complete' events, if they aren't
	already available. It's valid to set IORING_ENTER_GETEVENTS
	and 'min_complete' == 0 at the same time, this allows the
	kernel to return already completed events without waiting
	for them. This is useful only for polling, as for IRQ
	driven IO, the application can just check the CQ ring
	without entering the kernel.

With this setup, it's possible to do async IO with a single system
call. Future developments will enable polled IO with this interface,
and polled submission as well. The latter will enable an application
to do IO without doing ANY system calls at all.

For IRQ driven IO, an application only needs to enter the kernel for
completions if it wants to wait for them to occur.

Each io_uring is backed by a workqueue, to support buffered async IO
as well. We will only punt to an async context if the command would
need to wait for IO on the device side. Any data that can be accessed
directly in the page cache is done inline. This avoids the slowness
issue of usual threadpools, since cached data is accessed as quickly
as a sync interface.

Sample application: http://git.kernel.dk/cgit/fio/plain/t/io_uring.c

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
Oded Gabbay
541664d360 habanalabs: add comments in uapi/misc/habanalabs.h
Add comment about minimum and maximum size of command buffer.
Add some text about the expected input of CS IOCTL.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-28 13:06:09 +01:00
Ingo Molnar
9ed8f1a6e7 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-28 08:27:17 +01:00
Thomas Gleixner
cfbe271667 y2038: additional syscall ABI cleanup
This is a follow-up to the y2038 syscall patches already merged in the tip
 tree.  As the final 32-bit RISC-V syscall ABI is still being decided on,
 this is the last chance to make a few corrections to leave out interfaces
 based on 32-bit time_t along with the old off_t and rlimit types.
 
 The series achieves this in a few steps:
 
 - A couple of bug fixes for minor regressions I introduced
   in the original series
 
 - A couple of older patches from Yury Norov that I had never
   merged in the past, these fix up the openat/open_by_handle_at and
   getrlimit/setrlimit syscalls to disallow the old versions of off_t
   and rlimit.
 
 - Hiding the deprecated system calls behind an #ifdef in
   include/uapi/asm-generic/unistd.h
 
 - Change arch/riscv to drop all these ABIs.
 
 Originally, the plan was to also leave these out on C-Sky, but that now
 has a glibc port that uses the older interfaces, so we need to leave
 them in place.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcdEhGAAoJEGCrR//JCVInQuUQAN+mRFzRXAqhbpb63/vYGJei
 nmDqB+SoxzaIKAIGAVIdMGUoFxBrY1oyS4m6/a9lzQ9G4aSkr0PruZnUID+vIo2h
 rj+3FBlB/c9nvW+NG8iEtVadlRbTmoRILCWpvgIuLNd6fwvNzP3V4uu6a1QRIMx4
 aUCWQfhzv18kW1EAPIroPA1gEL2HKbhDdEuN2V0SKnsKNiWkHQeswWQFAYpLgT36
 eZ+L52lh+miEdtBxycxJ5lh3KsWO4dPImh+QHONZgeB9iS8v47K0R6ONKm4NMeQV
 5KW55pepUq1uQUdEU9KRrh2krMih2IJbOQoN2lvb2ao5UG6erHbj0N55RQym5gSC
 +TrvP3dnqfohh9hWdHDwME+5OTeOM+8SUMRnaZBJKuywzo7W1ceLpf+KZjwlk2s5
 AgEX67fKrUbtBfTgVhzlYhJLWcgSD1yt64ed5SF15c5M3JZhkK8cd50dB9pM2/YB
 o9VbijkYwb2KyCNUiV3nghgiiqcROvOIO7PK6z3XFFiRm/Gn2CgNZyZa7c4+Vgrr
 PM/DmDvCdFqYnqBOlV2ilCLigKGN0JgwzMXnbQU77d71Yg7Bco8e/yqSucSilp2d
 lEv44extu9FINWXIqvWEjRqdSq+sNgj21VSp6Zu/GaTgNCQKac2wsAZtnQgnslko
 knKwwp525fjqnJEDd1aH
 =/iFA
 -----END PGP SIGNATURE-----

Merge tag 'y2038-syscall-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground into timers/2038

Pull additional syscall ABI cleanup for y2038 from Arnd Bergmann:

This is a follow-up to the y2038 syscall patches already merged in the tip
tree.  As the final 32-bit RISC-V syscall ABI is still being decided on,
this is the last chance to make a few corrections to leave out interfaces
based on 32-bit time_t along with the old off_t and rlimit types.

The series achieves this in a few steps:

- A couple of bug fixes for minor regressions I introduced
  in the original series

- A couple of older patches from Yury Norov that I had never
  merged in the past, these fix up the openat/open_by_handle_at and
  getrlimit/setrlimit syscalls to disallow the old versions of off_t
  and rlimit.

- Hiding the deprecated system calls behind an #ifdef in
  include/uapi/asm-generic/unistd.h

- Change arch/riscv to drop all these ABIs.

Originally, the plan was to also leave these out on C-Sky, but that now
has a glibc port that uses the older interfaces, so we need to leave
them in place.
2019-02-27 21:45:27 +01:00
Alexei Starovoitov
5f8f8b93ae bpf: expose program stats via bpf_prog_info
Return bpf program run_time_ns and run_cnt via bpf_prog_info

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-27 17:22:50 +01:00
Oded Gabbay
230afe74d1 habanalabs: allow memory allocations larger than 4GB
This patch increase the size field in the uapi structure of the Memory
IOCTL from 32-bit to 64-bit. This is to allow the user to allocate and/or
map memory in chunks that are larger then 4GB.

Goya's device memory (DRAM) can be up to 16GB, and for certain
topologies, the user may want an allocation that is larger than 4GB.

This change doesn't break current user-space because there was a "pad"
field in the uapi structure right after the size field. Changing the size
field to be 64-bit and removing the pad field maintains compatibility with
current user-space.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-27 16:00:20 +01:00
Kuppuswamy Sathyanarayanan
8c938ddc6d PCI/ATS: Add pci_ats_page_aligned() interface
Return the Page Aligned Request bit in the ATS Capability Register.

As per PCIe spec r4.0, sec 10.5.1.2, if the Page Aligned Request bit is
set, it indicates the Untranslated Addresses generated by the device are
always aligned to a 4096 byte boundary.

An IOMMU that can only translate page-aligned addresses can only be used
with devices that always produce aligned Untranslated Addresses. This
interface will be used by drivers for such IOMMUs to determine whether
devices can use the ATS service.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 11:08:07 +01:00
Kuppuswamy Sathyanarayanan
e5567f5f67 PCI/ATS: Add pci_prg_resp_pasid_required() interface.
Return the PRG Response PASID Required bit in the Page Request
Status Register.

As per PCIe spec r4.0, sec 10.5.2.3, if this bit is Set, the device
expects a PASID TLP Prefix on PRG Response Messages when the
corresponding Page Requests had a PASID TLP Prefix. If Clear, the device
does not expect PASID TLP Prefixes on any PRG Response Message, and the
device behavior is undefined if the device receives a PRG Response Message
with a PASID TLP Prefix. Also the device behavior is undefined if this
bit is Set and the device receives a PRG Response Message with no PASID TLP
Prefix when the corresponding Page Requests had a PASID TLP Prefix.

This function will be used by drivers like IOMMU, if it is required to
check the status of the PRG Response PASID Required bit before enabling
the PASID support of the device.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 11:07:05 +01:00
Mohit P. Tahiliani
3f7ae5f3dc net: sched: pie: add more cases to auto-tune alpha and beta
The current implementation scales the local alpha and beta
variables in the calculate_probability function by the same
amount for all values of drop probability below 1%.

RFC 8033 suggests using additional cases for auto-tuning
alpha and beta when the drop probability is less than 1%.

In order to add more auto-tuning cases, MAX_PROB must be
scaled by u64 instead of u32 to prevent underflow when
scaling the local alpha and beta variables in the
calculate_probability function.

Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com>
Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com>
Signed-off-by: Manish Kumar B <bmanish15597@gmail.com>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 14:21:03 -08:00
Anand Jain
228a73abde btrfs: introduce new ioctl to unregister a btrfs device
Support for a new command that can be used eg. as a command

  $ btrfs device scan --forget [dev]'
(the final name may change though)

to undo the effects of 'btrfs device scan [dev]'. For this purpose
this patch proposes to use ioctl #5 as it was empty and is next to the
SCAN ioctl.

The new ioctl BTRFS_IOC_FORGET_DEV works only on the control device
(/dev/btrfs-control) to unregister one or all devices, devices that are
not mounted.

The argument is struct btrfs_ioctl_vol_args, ::name specifies the device
path. To unregister all device, the path is an empty string.

Again, the devices are removed only if they aren't part of a mounte
filesystem.

This new ioctl provides:

- release of unwanted btrfs_fs_devices and btrfs_devices structures
  from memory if the device is not going to be mounted

- ability to mount filesystem in degraded mode, when one devices is
  corrupted like in split brain raid1

- running test cases which would require reloading the kernel module
  but this is not possible eg. due to mounted filesystem or built-in

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:30 +01:00
Heiner Kallweit
e728fdf062 net: phy: improve definition of __ETHTOOL_LINK_MODE_MASK_NBITS
The way to define __ETHTOOL_LINK_MODE_MASK_NBITS seems to be overly
complicated, go with a standard approach instead.
Whilst we're at it, move the comment to the right place.

v2:
- rebased

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 17:33:59 -08:00
Richard Weinberger
663586c0a8 ubi: Expose the bitrot interface
Using UBI_IOCRPEB and UBI_IOCSPEB userspace can force
reading and scrubbing of PEBs.

In case of bitflips UBI will automatically take action
and move data to a different PEB.
This interface allows a daemon to foster your NAND.

Signed-off-by: Richard Weinberger <richard@nod.at>
2019-02-24 11:40:45 +01:00
Oded Gabbay
e126600487 uapi/habanalabs: add some comments in habanalabs.h
This patch adds two comments in uapi/habanalabs.h:
- From which queue id the internal queues begin
- Invalid values that can be returned in the seq field from the CS IOCTL

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-07 14:20:05 +02:00
Greg Kroah-Hartman
62fa78436e Merge 5.1-rc3 into char-misc-next
We want the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-01 07:34:09 +02:00
Tomas Winkler
1e55b609b9 mei: adjust the copyright notice in the files.
Use unified version of the copyright notice in the files
Update copyright years according the year the files
were touched, except this patch and SPDX conversions.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-28 02:07:54 +09:00
Tomas Winkler
9fff0425aa mei: convert to SPDX license tags
Replace boiler plate licenses texts with the SPDX license
identifiers in the mei files header.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-28 02:07:54 +09:00
Dave Airlie
fbac3c48fa Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
Fixes for 5.1:
amdgpu:
- Fix missing fw declaration after dropping old CI DPM code
- Fix debugfs access to registers beyond the MMIO bar size
- Fix context priority handling
- Add missing license on some new files
- Various cleanups and bug fixes

radeon:
- Fix missing break in CS parser for evergreen
- Various cleanups and bug fixes

sched:
- Fix entities with 0 run queues

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221214134.3308-1-alexander.deucher@amd.com
2019-02-22 15:56:42 +10:00
Callum Sinclair
ca8d4794f6 ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs
Currently the only way to clear the forwarding cache was to delete the
entries one by one using the MRT_DEL_MFC socket option or to destroy and
recreate the socket.

Create a new socket option which with the use of optional flags can
clear any combination of multicast entries (static or not static) and
multicast vifs (static or not static).

Calling the new socket option MRT_FLUSH with the flags MRT_FLUSH_MFC and
MRT_FLUSH_VIFS will clear all entries and vifs on the socket except for
static entries.

Signed-off-by: Callum Sinclair <callum.sinclair@alliedtelesis.co.nz>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21 13:05:05 -08:00
Aya Levin
54719527fd devlink: Rename devlink health attributes
Rename devlink health attributes for better reflect the attributes use.
Add COUNT prefix on error counter attribute and recovery counter
attribute.

Fixes: 7afe335a8b ("devlink: Add health get command")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21 10:38:51 -08:00
Mike Snitzer
61697a6abd dm: eliminate 'split_discard_bios' flag from DM target interface
There is no need to have DM core split discards on behalf of a DM target
now that blk_queue_split() handles splitting discards based on the
queue_limits.  A DM target just needs to set max_discard_sectors,
discard_granularity, etc, in queue_limits.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-02-20 23:24:55 -05:00
Christoph Hellwig
fadccd8fc2 nvme_ioctl.h: remove duplicate GPL boilerplate
We already have a ЅPDX header, so no need to duplicate the information.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2019-02-20 07:22:05 -07:00
Steve Wise
3856ec4b93 RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK support
Add support for new LINK messages to allow adding and deleting rdma
interfaces.  This will be used initially for soft rdma drivers which
instantiate device instances dynamically by the admin specifying a netdev
device to use.  The rdma_rxe module will be the first user of these
messages.

The design is modeled after RTNL_NEWLINK/DELLINK: rdma drivers register
with the rdma core if they provide link add/delete functions.  Each driver
registers with a unique "type" string, that is used to dispatch messages
coming from user space.  A new RDMA_NLDEV_ATTR is defined for the "type"
string.  User mode will pass 3 attributes in a NEWLINK message:
RDMA_NLDEV_ATTR_DEV_NAME for the desired rdma device name to be created,
RDMA_NLDEV_ATTR_LINK_TYPE for the "type" of link being added, and
RDMA_NLDEV_ATTR_NDEV_NAME for the net_device interface to use for this
link.  The DELLINK message will contain the RDMA_NLDEV_ATTR_DEV_INDEX of
the device to delete.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19 20:52:19 -07:00
Jérôme Glisse
f180bf12ac drm/nouveau/svm: new ioctl to migrate process memory to GPU memory
This add an ioctl to migrate a range of process address space to the
device memory. On platform without cache coherent bus (x86, ARM, ...)
this means that CPU can not access that range directly, instead CPU
will fault which will migrate the memory back to system memory.

This is behind a staging flag so that we can evolve the API.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
2019-02-20 09:00:03 +10:00
Ben Skeggs
eeaf06ac1a drm/nouveau/svm: initial support for shared virtual memory
This uses HMM to mirror a process' CPU page tables into a channel's page
tables, and keep them synchronised so that both the CPU and GPU are able
to access the same memory at the same virtual address.

While this code also supports Volta/Turing, it's only enabled for Pascal
GPUs currently due to channel recovery being unreliable right now on the
later GPUs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-02-20 09:00:02 +10:00
Aya Levin
2736d94f35 ethtool: Added support for 50Gbps per lane link modes
Added support for 50Gbps per lane link modes. Define various 50G, 100G
and 200G link modes using it.

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-19 14:15:02 -08:00
Arnd Bergmann
c8ce48f065 asm-generic: Make time32 syscall numbers optional
We don't want new architectures to even provide the old 32-bit time_t
based system calls any more, or define the syscall number macros.

Add a new __ARCH_WANT_TIME32_SYSCALLS macro that gets enabled for all
existing 32-bit architectures using the generic system call table,
so we don't change any current behavior.
Since this symbol is evaluated in user space as well, we cannot use
a Kconfig CONFIG_* macro but have to define it in uapi/asm/unistd.h.

On 64-bit architectures, the same system call numbers mostly refer to
the system calls we want to keep, as they already pass 64-bit time_t.

As new architectures no longer provide these, we need new exceptions
in checksyscalls.sh.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-19 21:27:32 +01:00
Leon Romanovsky
c3d02788b4 RDMA/nldev: Provide parent IDs for PD, MR and QP objects
PD, MR and QP objects have parents objects: contexts and PDs.  The exposed
parent IDs allow to correlate various objects and simplify debug
investigation.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19 10:13:39 -07:00
Leon Romanovsky
517b773e0f RDMA/nldev: Share with user-space object IDs
Give to the user space tools unique identifier for PD, MR, CQ and CM_ID
objects, so they can be able to query on them with .doit callbacks.

QP .doit is not supported yet, till all drivers will be updated to provide
their LQPN to be equal to their restrack ID.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19 10:13:39 -07:00
Yury Norov
80d7da1cac asm-generic: Drop getrlimit and setrlimit syscalls from default list
The newer prlimit64 syscall provides all the functionality of getrlimit
and setrlimit syscalls and adds the pid of target process, so future
architectures won't need to include getrlimit and setrlimit.

Therefore drop getrlimit and setrlimit syscalls from the generic syscall
list unless __ARCH_WANT_SET_GET_RLIMIT is defined by the architecture's
unistd.h prior to including asm-generic/unistd.h, and adjust all
architectures using the generic syscall list to define it so that no
in-tree architectures are affected.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-hexagon@vger.kernel.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: James Hogan <james.hogan@imgtec.com> [metag]
Acked-by: Ley Foon Tan <lftan@altera.com> [nios2]
Acked-by: Stafford Horne <shorne@gmail.com> [openrisc]
Acked-by: Will Deacon <will.deacon@arm.com> [arm64]
Acked-by: Vineet Gupta <vgupta@synopsys.com> #arch/arc bits
Signed-off-by: Yury Norov <ynorov@marvell.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-19 10:10:06 +01:00
Patrick Lerda
721074b034 media: rc: rcmm decoder and encoder
media: add support for RCMM infrared remote controls.

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-02-18 15:39:49 -05:00
David S. Miller
8bbed40f10 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for you net-next
tree:

1) Missing NFTA_RULE_POSITION_ID netlink attribute validation,
   from Phil Sutter.

2) Restrict matching on tunnel metadata to rx/tx path, from wenxu.

3) Avoid indirect calls for IPV6=y, from Florian Westphal.

4) Add two indirections to prepare merger of IPV4 and IPV6 nat
   modules, from Florian Westphal.

5) Broken indentation in ctnetlink, from Colin Ian King.

6) Patches to use struct_size() from netfilter and IPVS,
   from Gustavo A. R. Silva.

7) Display kernel splat only once in case of racing to confirm
   conntrack from bridge plus nfqueue setups, from Chieh-Min Wang.

8) Skip checksum validation for layer 4 protocols that don't need it,
   patch from Alin Nastac.

9) Sparse warning due to symbol that should be static in CLUSTERIP,
   from Wei Yongjun.

10) Add new toggle to disable SDP payload translation when media
    endpoint is reachable though the same interface as the signalling
    peer, from Alin Nastac.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-18 11:38:30 -08:00
Vivek Kasireddy
a7fe4ca72b media: v4l: Add 32-bit packed YUV formats
The formats added in this patch include:
 V4L2_PIX_FMT_AYUV32
 V4L2_PIX_FMT_XYUV32
 V4L2_PIX_FMT_VUYA32
 V4L2_PIX_FMT_VUYX32

These formats enable the trasmission of alpha channel data to other
drivers and userspace applications in addition to YUV data. For
example, buffers generated by drivers in one of these formats
can be used by the Weston compositor to display as a texture or
flipped directly onto the overlay planes with the help of a DRM
driver.

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-02-18 14:31:01 -05:00
Michael S. Tsirkin
746c9398f5 arch: move common mmap flags to linux/mman.h
Now that we have 3 mmap flags shared by all architectures,
let's move them into the common header.

This will help discourage future architectures from duplicating code.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-18 17:49:30 +01:00
Yury Norov
0d0216c03a compat ABI: use non-compat openat and open_by_handle_at variants
The only difference between native and compat openat and open_by_handle_at
is that non-compat version forces O_LARGEFILE, and it should be the
default behaviour for all architectures, as we are going to drop the
support of 32-bit userspace off_t.

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Yury Norov <ynorov@marvell.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-18 16:57:12 +01:00
Mauro Carvalho Chehab
5f09bc8cc4 Linux 5.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxqHJYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGWl8H/jPI4EipzD2GbnjZ
 GaFpMBBjcXBaVmoA+Y69so+7BHx1Ql+5GQtqbK0RHJRb9qEPLw3FBhHNjM/N8Sgf
 nSrK+GnBZp9s+k/NR/Yf2RacUR3jhz+Q9JEoQd3u9bFUeQyvE8Rf3vgtoBBwFOfz
 +t7N1memYVF3asLGWB4e4sP1YVMGfseTQpSPojvM30YWM86Bv+QtSx1AGgHczQIM
 kMKealR8ZPelN6JAXgLhQ5opDojBrE4YKB98pwsMDI6abz0Tz2JLFEUTTxsv5XNN
 o/Iz+XDoylskEyxN2unNWfHx7Swkvoklog8J/hDg5XlTvipL/WkT66PHBgcGMNvj
 BW9GgU8=
 =ZizU
 -----END PGP SIGNATURE-----

Merge tag 'v5.0-rc7' into patchwork

Linux 5.0-rc7

* tag 'v5.0-rc7': (1667 commits)
  Linux 5.0-rc7
  Input: elan_i2c - add ACPI ID for touchpad in Lenovo V330-15ISK
  Input: st-keyscan - fix potential zalloc NULL dereference
  Input: apanel - switch to using brightness_set_blocking()
  powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  efi/arm: Revert "Defer persistent reservations until after paging_init()"
  arm64, mm, efi: Account for GICv3 LPI tables in static memblock reserve table
  sunrpc: fix 4 more call sites that were using stack memory with a scatterlist
  include/linux/module.h: copy __init/__exit attrs to init/cleanup_module
  Compiler Attributes: add support for __copy (gcc >= 9)
  lib/crc32.c: mark crc32_le_base/__crc32c_le_base aliases as __pure
  auxdisplay: ht16k33: fix potential user-after-free on module unload
  x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls
  i2c: bcm2835: Clear current buffer pointers and counts after a transfer
  i2c: cadence: Fix the hold bit setting
  drm: Use array_size() when creating lease
  dm thin: fix bug where bio that overwrites thin block ignores FUA
  Revert "exec: load_script: don't blindly truncate shebang string"
  Revert "gfs2: read journal in large chunks to locate the head"
  net: ethernet: freescale: set FEC ethtool regs version
  ...

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-02-18 10:12:12 -05:00
Oded Gabbay
d8dd7b0a81 habanalabs: implement INFO IOCTL
This patch implements the INFO IOCTL. That IOCTL is used by the user to
query information that is relevant/needed by the user in order to submit
deep learning jobs to Goya.

The information is divided into several categories, such as H/W IP, Events
that happened, DDR usage and more.

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18 09:46:46 +01:00
Omer Shpigelman
0feaf86d4e habanalabs: add virtual memory and MMU modules
This patch adds the Virtual Memory and MMU modules.

Goya has an internal MMU which provides process isolation on the internal
DDR. The internal MMU also performs translations for transactions that go
from Goya to the Host.

The driver is responsible for allocating and freeing memory on the DDR
upon user request. It also provides an interface to map and unmap DDR and
Host memory to the device address space.

The MMU in Goya supports 3-level and 4-level page tables. With 3-level, the
size of each page is 2MB, while with 4-level the size of each page is 4KB.

In the DDR, the physical pages are always 2MB.

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18 09:46:46 +01:00
Oded Gabbay
eff6f4a0e7 habanalabs: add command submission module
This patch adds the main flow for the user to submit work to the device.

Each work is described by a command submission object (CS). The CS contains
3 arrays of command buffers: One for execution, and two for context-switch
(store and restore).

For each CB, the user specifies on which queue to put that CB. In case of
an internal queue, the entry doesn't contain a pointer to the CB but the
address in the on-chip memory that the CB resides at.

The driver parses some of the CBs to enforce security restrictions.

The user receives a sequence number that represents the CS object. The user
can then query the driver regarding the status of the CS, using that
sequence number.

In case the CS doesn't finish before the timeout expires, the driver will
perform a soft-reset of the device.

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18 09:46:45 +01:00
Oded Gabbay
9494a8dd8d habanalabs: add h/w queues module
This patch adds the H/W queues module and the code to initialize Goya's
various compute and DMA engines and their queues.

Goya has 5 DMA channels, 8 TPC engines and a single MME engine. For each
channel/engine, there is a H/W queue logic which is used to pass commands
from the user to the H/W. That logic is called QMAN.

There are two types of QMANs: external and internal. The DMA QMANs are
considered external while the TPC and MME QMANs are considered internal.
For each external queue there is a completion queue, which is located on
the Host memory.

The differences between external and internal QMANs are:

1. The location of the queue's memory. External QMANs are located on the
   Host memory while internal QMANs are located on the on-chip memory.

2. The external QMAN write an entry to a completion queue and sends an
   MSI-X interrupt upon completion of a command buffer that was given to
   it. The internal QMAN doesn't do that.

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18 09:46:45 +01:00
Oded Gabbay
be5d926b5c habanalabs: add command buffer module
This patch adds the command buffer (CB) module, which allows the user to
create and destroy CBs and to map them to the user's process
address-space.

A command buffer is a memory blocks that reside in DMA-able address-space
and is physically contiguous so it can be accessed by the device without
MMU translation. The command buffer memory is allocated using the
coherent DMA API.

When creating a new CB, the IOCTL returns a handle of it, and the
user-space process needs to use that handle to mmap the buffer to get a VA
in the user's address-space.

Before destroying (freeing) a CB, the user must unmap the CB's VA using the
CB handle.

Each CB has a reference counter, which tracks its usage in command
submissions and also its mmaps (only a single mmap is allowed).

The driver maintains a pool of pre-allocated CBs in order to reduce
latency during command submissions. In case the pool is empty, the driver
will go to the slow-path of allocating a new CB, i.e. calling
dma_alloc_coherent.

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18 09:46:44 +01:00
Oded Gabbay
99b9d7b497 habanalabs: add basic Goya support
This patch adds a basic support for the Goya device. The code initializes
the device's PCI controller and PCI bars. It also initializes various S/W
structures and adds some basic helper functions.

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18 09:46:44 +01:00
Dave Airlie
c06de56121 Linux 5.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxqHJYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGWl8H/jPI4EipzD2GbnjZ
 GaFpMBBjcXBaVmoA+Y69so+7BHx1Ql+5GQtqbK0RHJRb9qEPLw3FBhHNjM/N8Sgf
 nSrK+GnBZp9s+k/NR/Yf2RacUR3jhz+Q9JEoQd3u9bFUeQyvE8Rf3vgtoBBwFOfz
 +t7N1memYVF3asLGWB4e4sP1YVMGfseTQpSPojvM30YWM86Bv+QtSx1AGgHczQIM
 kMKealR8ZPelN6JAXgLhQ5opDojBrE4YKB98pwsMDI6abz0Tz2JLFEUTTxsv5XNN
 o/Iz+XDoylskEyxN2unNWfHx7Swkvoklog8J/hDg5XlTvipL/WkT66PHBgcGMNvj
 BW9GgU8=
 =ZizU
 -----END PGP SIGNATURE-----

Merge v5.0-rc7 into drm-next

Backmerging for nouveau and imx that needed some fixes for next pulls.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-02-18 13:27:15 +10:00
Jakub Kicinski
76726ccb7f devlink: add flash update command
Add devlink flash update command. Advanced NICs have firmware
stored in flash and often cryptographically secured. Updating
that flash is handled by management firmware. Ethtool has a
flash update command which served us well, however, it has two
shortcomings:
 - it takes rtnl_lock unnecessarily - really flash update has
   nothing to do with networking, so using a networking device
   as a handle is suboptimal, which leads us to the second one:
 - it requires a functioning netdev - in case device enters an
   error state and can't spawn a netdev (e.g. communication
   with the device fails) there is no netdev to use as a handle
   for flashing.

Devlink already has the ability to report the firmware versions,
now with the ability to update the firmware/flash we will be
able to recover devices in bad state.

To enable updates of sub-components of the FW allow passing
component name.  This name should correspond to one of the
versions reported in devlink info.

v1: - replace target id with component name (Jiri).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-17 15:27:38 -08:00
David S. Miller
885e631959 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-02-16

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong.

2) test all bpf progs in alu32 mode, from Jiong.

3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin.

4) support for IP encap in lwt bpf progs, from Peter.

5) remove XDP_QUERY_XSK_UMEM dead code, from Jan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-16 22:56:34 -08:00
David S. Miller
3313da8188 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The netfilter conflicts were rather simple overlapping
changes.

However, the cls_tcindex.c stuff was a bit more complex.

On the 'net' side, Cong is fixing several races and memory
leaks.  Whilst on the 'net-next' side we have Vlad adding
the rtnl-ness support.

What I've decided to do, in order to resolve this, is revert the
conversion over to using a workqueue that Cong did, bringing us back
to pure RCU.  I did it this way because I believe that either Cong's
races don't apply with have Vlad did things, or Cong will have to
implement the race fix slightly differently.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15 12:38:38 -08:00
Bas Nieuwenhuizen
b5bb37eddb drm/amdgpu: Add command to override the context priority.
Given a master fd we can then override the priority of the context
in another fd.

Using these overrides was recommended by Christian instead of trying
to submit from a master fd, and I am adding a way to override a
single context instead of the entire process so we can only upgrade
a single Vulkan queue and not effectively the entire process.

Reused the flags field as it was checked to be 0 anyways, so nothing
used it. This is source-incompatible (due to the name change), but
ABI compatible.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-15 11:15:43 -05:00
Linus Torvalds
6e7bd3b549 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix MAC address setting in mac80211 pmsr code, from Johannes Berg.

 2) Probe SFP modules after being attached, from Russell King.

 3) Byte ordering bug in SMC rx_curs_confirmed code, from Ursula Braun.

 4) Revert some r8169 changes that are causing regressions, from Heiner
    Kallweit.

 5) Fix spurious connection timeouts in netfilter nat code, from Florian
    Westphal.

 6) SKB leak in tipc, from Hoang Le.

 7) Short packet checkum issue in mlx4, similar to a previous mlx5
    change, from Saeed Mahameed. The issue is that whilst padding bytes
    are usually zero, it is not guarateed and the hardware doesn't take
    the padding bytes into consideration when generating the checksum.

 8) Fix various races in cls_tcindex, from Cong Wang.

 9) Need to set stream ext to NULL before freeing in SCTP code, from Xin
    Long.

10) Fix locking in phy_is_started, from Heiner Kallweit.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
  net: ethernet: freescale: set FEC ethtool regs version
  net: hns: Fix object reference leaks in hns_dsaf_roce_reset()
  mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs
  net: phy: fix potential race in the phylib state machine
  net: phy: don't use locking in phy_is_started
  selftests: fix timestamping Makefile
  net: dsa: bcm_sf2: potential array overflow in bcm_sf2_sw_suspend()
  net: fix possible overflow in __sk_mem_raise_allocated()
  dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit
  net: phy: fix interrupt handling in non-started states
  sctp: set stream ext to NULL after freeing it in sctp_stream_outq_migrate
  sctp: call gso_reset_checksum when computing checksum in sctp_gso_segment
  net/mlx5e: XDP, fix redirect resources availability check
  net/mlx5: Fix a compilation warning in events.c
  net/mlx5: No command allowed when command interface is not ready
  net/mlx5e: Fix NULL pointer derefernce in set channels error flow
  netfilter: nft_compat: use-after-free when deleting targets
  team: avoid complex list operations in team_nl_cmd_options_set()
  net_sched: fix two more memory leaks in cls_tcindex
  net_sched: fix a memory leak in cls_tcindex
  ...
2019-02-15 08:00:11 -08:00
Deepa Dinamani
460a2db027 errqueue.h: Include time_types.h
Now that we have a separate header for struct __kernel_timespec,
include it directly without relying on userspace to do it.

Reported-by: Ran Rozenstein <ranro@mellanox.com>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-14 11:51:51 -05:00
Deepa Dinamani
ca5e9aba75 time: Add time_types.h
sys/time.h is the mandated include for many time related
defines. However, linux/time.h overlaps sys/time.h
significantly and this makes including both from userspace
or one from the other impossible.

This also means that userspace can get away with including
sys/time.h whenever it needs linux/time.h and this is what's
been happening in the user world usually.

But, we have new data types that we plan to use in the uapi time
interfaces also defined in the linux/time.h. But, we are unable
to use these types when sys/time.h is included.

Hence, move the new types to a new header, time_types.h.
We intend to eventually have all the uapi defines that the kernel
uses defined in this header.
Note that the plan is to replace uapi interfaces with timeval to
use __kernel_old_timeval, timespec to use __kernel_old_timespec etc.

Reported-by: Ran Rozenstein <ranro@mellanox.com>
Fixes: 9718475e69 ("socket: Add SO_TIMESTAMPING_NEW")
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-14 11:51:51 -05:00
David S. Miller
03b9674202 This feature/cleanup patchset includes the following patches:
- fix memory leak in in batadv_dat_put_dhcp, by Martin Weinelt
 
  - fix typo, by Sven Eckelmann
 
  - netlink restructuring patch series (part 2), by Sven Eckelmann
    (19 patches)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlxj6MYWHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoYTMD/9Uyaqe4ULGQ+rLBnCMSuNc5AnQ
 d7hebUep1RVDyyzJaHRNLmPwmSDNscdQX2AaAxOMhrbmDiBceHq1vDpNvLMDCa0g
 4KbljpfICc6Wm8cq6zHMvL5DQYdxvHt3xDDOvTufhTDra0+cZ8b7tLN1GIztrSSL
 BJ4gUkVydtFUIGCPkSfqZoMI82VTo4V/vOQpaAAANJXVehajHhPLOc2+AqnsHTxC
 6EpNFbMib7sxvrDV5/GdkhtyikmNyHXlaZaiboJjH8htdRWmxHSIfelgDZ2KzqJK
 NG5EhZNljLEHHU5cnCBA1ptvjB0zWVF1WEZDyD3fHPraNdmeyCyqI1kuEMFHPSEr
 PZ+gD9wqnmgQUc3D7Xi7Q0faEQ5wcOKtcCvMrWgydbQxanhiftBH8pikkR9VJh+B
 1ADuN2m+OSKaZ8Fw5tqknNbd8paKztlndgSNRHEc9KOWnT91cKO02Zg9wCyNhn7S
 SI/0eLZx0N0v3T/UVsoKbjKGCEiFZzgXMBRRb7jthJghD2hnyx+s1tRzkgxIJj8z
 fI5hwAl//m6tSCDjbOLjCdsNHiGs4AhP0MsFq5jHmpqvsY8ZUX59LcRnpfhX3IaN
 L+KpBxH9O25Pd3JgXgjlcuza2L1L/V7piPUit/C8LVb7goNN3QCi+UtXFVMR1C4c
 J7kI7vMak8BEIs4EuQ==
 =+p14
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-for-davem-20190213' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - fix memory leak in in batadv_dat_put_dhcp, by Martin Weinelt

 - fix typo, by Sven Eckelmann

 - netlink restructuring patch series (part 2), by Sven Eckelmann
   (19 patches)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-13 22:28:11 -08:00
Peter Oskolkov
3e0bd37ce0 bpf: add plumbing for BPF_LWT_ENCAP_IP in bpf_lwt_push_encap
This patch adds all needed plumbing in preparation to allowing
bpf programs to do IP encapping via bpf_lwt_push_encap. Actual
implementation is added in the next patch in the patchset.

Of note:
- bpf_lwt_push_encap can now be called from BPF_PROG_TYPE_LWT_XMIT
  prog types in addition to BPF_PROG_TYPE_LWT_IN;
- if the skb being encapped has GSO set, encapsulation is limited
  to IPIP/IP+GRE/IP+GUE (both IPv4 and IPv6);
- as route lookups are different for ingress vs egress, the single
  external bpf_lwt_push_encap BPF helper is routed internally to
  either bpf_lwt_in_push_encap or bpf_lwt_xmit_push_encap BPF_CALLs,
  depending on prog type.

v8 changes: fixed a typo.

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-13 18:27:55 -08:00
Maxime Chevallier
7fd8afa893 net: phy: Add generic support for 2.5GBaseT and 5GBaseT
The 802.3bz specification, based on previous by the NBASET alliance,
defines the 2.5GBaseT and 5GBaseT link modes for ethernet traffic on
cat5e, cat6 and cat7 cables.

These mode integrate with the already defined C45 MDIO PMA/PMD registers
set that added 10G support, by defining some previously reserved bits,
and adding a new register (2.5G/5G Extended abilities).

This commit adds the required definitions in include/uapi/linux/mdio.h
to support these modes, and detect when a link-partner advertises them.

It also adds support for these mode in the generic C45 PHY
infrastructure.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-13 19:17:53 -05:00
Chad Austin
d9a9ea94f7 fuse: support clients that don't implement 'opendir'
Allow filesystems to return ENOSYS from opendir, preventing the kernel from
sending opendir and releasedir messages in the future. This avoids
userspace transitions when filesystems don't need to keep track of state
per directory handle.

A new capability flag, FUSE_NO_OPENDIR_SUPPORT, parallels
FUSE_NO_OPEN_SUPPORT, indicating the new semantics for returning ENOSYS
from opendir.

Signed-off-by: Chad Austin <chadaustin@fb.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-02-13 13:15:15 +01:00
Masahiro Yamada
76ce2a80a2 Rename include/{uapi => }/asm-generic/shmparam.h really
Commit 36c0f7f0f8 ("arch: unexport asm/shmparam.h for all
architectures") is different from the patch I submitted.

My patch is this:

  https://lore.kernel.org/lkml/1546904307-11124-1-git-send-email-yamada.masahiro@socionext.com/T/#u

The file renaming part:

  rename include/{uapi => }/asm-generic/shmparam.h (100%)

was lost when it was picked up.

I think it was an accident because Andrew did not say anything.

Link: http://lkml.kernel.org/r/1549158277-24558-1-git-send-email-yamada.masahiro@socionext.com
Fixes: 36c0f7f0f8 ("arch: unexport asm/shmparam.h for all architectures")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-12 16:33:18 -08:00
Konstantin Khlebnikov
1ec17dbd90 inet_diag: fix reporting cgroup classid and fallback to priority
Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested
extensions has only 8 bits. Thus extensions starting from DCTCPINFO
cannot be requested directly. Some of them included into response
unconditionally or hook into some of lower 8 bits.

Extension INET_DIAG_CLASS_ID has not way to request from the beginning.

This patch bundle it with INET_DIAG_TCLASS (ipv6 tos), fixes space
reservation, and documents behavior for other extensions.

Also this patch adds fallback to reporting socket priority. This filed
is more widely used for traffic classification because ipv4 sockets
automatically maps TOS to priority and default qdisc pfifo_fast knows
about that. But priority could be changed via setsockopt SO_PRIORITY so
INET_DIAG_TOS isn't enough for predicting class.

Also cgroup2 obsoletes net_cls classid (it always zero), but we cannot
reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).

So, after this patch INET_DIAG_CLASS_ID will report socket priority
for most common setup when net_cls isn't set and/or cgroup2 in use.

Fixes: 0888e372c3 ("net: inet: diag: expose sockets cgroup classid")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12 13:35:57 -05:00
Srinivas Kandagatla
6cffd79504 misc: fastrpc: Add support for dmabuf exporter
User process can involve dealing with big buffer sizes, and also passing
buffers from one compute context bank to other compute context bank for
complex dsp algorithms.

This patch adds support to fastrpc to make it a proper dmabuf exporter
to avoid making copies of buffers.

Co-developed-by: Thierry Escande <thierry.escande@linaro.org>
Signed-off-by: Thierry Escande <thierry.escande@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12 10:40:30 +01:00
Srinivas Kandagatla
d73f71c7c6 misc: fastrpc: Add support for create remote init process
This patch adds support to create or attach remote shell process.
The shell process called fastrpc_shell_0 is usually loaded on the DSP
when a user process is spawned.

Most of the work is derived from various downstream Qualcomm kernels.
Credits to various Qualcomm authors who have contributed to this code.
Specially Tharun Kumar Merugu <mtharu@codeaurora.org>

Co-developed-by: Thierry Escande <thierry.escande@linaro.org>
Signed-off-by: Thierry Escande <thierry.escande@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12 10:40:30 +01:00
Srinivas Kandagatla
c68cfb718c misc: fastrpc: Add support for context Invoke method
This patch adds support to compute context invoke method on the
remote processor (DSP).
This involves setting up the functions input and output arguments,
input and output handles and mapping the dmabuf fd for the
argument/handle buffers.

The below diagram depicts invocation of a single method where the
client and objects reside on different processors. An object could
expose multiple methods which can be grouped together and referred
to as an interface.

,--------,        ,------,  ,-----------,  ,------,        ,--------,
|        | method |      |  |           |  |      | method |        |
| Client |------->| Stub |->| Transport |->| Skel |------->| Object |
|        |        |      |  |           |  |      |        |        |
`--------`        `------`  `-----------`  `------`        `--------`

Client:    Linux user mode process that initiates the remote invocation
Stub:      Auto generated code linked in with the user mode process that
           takes care of marshaling parameters
Transport: Involved in carrying an invocation from a client to an
           object. This involves two portions: 1) FastRPC Linux
           kernel driver that receives the remote invocation, queues
           them up and then waits for the response after signaling the
           remote side. 2) Service running on the remote side that
           dequeues the messages from the queue and dispatches them for
           processing.
Skel:      Auto generated code that takes care of un-marshaling
           parameters
Object:    Method implementation

Most of the work is derived from various downstream Qualcomm kernels.
Credits to various Qualcomm authors who have contributed to this code.
Specially Tharun Kumar Merugu <mtharu@codeaurora.org>

Co-developed-by: Thierry Escande <thierry.escande@linaro.org>
Signed-off-by: Thierry Escande <thierry.escande@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12 10:40:30 +01:00
Maxime Ripard
d588100baa
Merge drm/drm-next into drm-misc-next
We need to backmerge drm-next to fix the komeda build failure.

Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
2019-02-11 10:35:35 +01:00
Greg Kroah-Hartman
c744ca39f2 Merge 5.0-rc6 into tty-next
We need the tty fixes in here for other patches to be based on.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-11 09:26:45 +01:00
Greg Kroah-Hartman
3b6effbc38 Merge 5.0-rc6 into staging-next
We need the staging fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-11 09:25:01 +01:00
Greg Kroah-Hartman
5c07488d99 Merge 5.0-rc6 into char-misc-next
We need the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-11 09:05:58 +01:00
Dave Airlie
f4bc54b532 Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
Updates for 5.1:
- GDS fixes
- Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES interface
- GPUVM fixes
- PCIE DPM switching fixes for vega20
- Vega10 uclk DPM regression fix
- DC Freesync fixes
- DC ABM fixes
- Various DC cleanups

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208210214.27666-1-alexander.deucher@amd.com
2019-02-11 14:04:20 +10:00
Martin KaFai Lau
655a51e536 bpf: Add struct bpf_tcp_sock and BPF_FUNC_tcp_sock
This patch adds a helper function BPF_FUNC_tcp_sock and it
is currently available for cg_skb and sched_(cls|act):

struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk);

int cg_skb_foo(struct __sk_buff *skb) {
	struct bpf_tcp_sock *tp;
	struct bpf_sock *sk;
	__u32 snd_cwnd;

	sk = skb->sk;
	if (!sk)
		return 1;

	tp = bpf_tcp_sock(sk);
	if (!tp)
		return 1;

	snd_cwnd = tp->snd_cwnd;
	/* ... */

	return 1;
}

A 'struct bpf_tcp_sock' is also added to the uapi bpf.h to provide
read-only access.  bpf_tcp_sock has all the existing tcp_sock's fields
that has already been exposed by the bpf_sock_ops.
i.e. no new tcp_sock's fields are exposed in bpf.h.

This helper returns a pointer to the tcp_sock.  If it is not a tcp_sock
or it cannot be traced back to a tcp_sock by sk_to_full_sk(), it
returns NULL.  Hence, the caller needs to check for NULL before
accessing it.

The current use case is to expose members from tcp_sock
to allow a cg_skb_bpf_prog to provide per cgroup traffic
policing/shaping.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Martin KaFai Lau
aa65d6960a bpf: Add state, dst_ip4, dst_ip6 and dst_port to bpf_sock
This patch adds "state", "dst_ip4", "dst_ip6" and "dst_port" to the
bpf_sock.  The userspace has already been using "state",
e.g. inet_diag (ss -t) and getsockopt(TCP_INFO).

This patch also allows narrow load on the following existing fields:
"family", "type", "protocol" and "src_port".  Unlike IP address,
the load offset is resticted to the first byte for them but it
can be relaxed later if there is a use case.

This patch also folds __sock_filter_check_size() into
bpf_sock_is_valid_access() since it is not called
by any where else.  All bpf_sock checking is in
one place.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Martin KaFai Lau
46f8bc9275 bpf: Add a bpf_sock pointer to __sk_buff and a bpf_sk_fullsock helper
In kernel, it is common to check "skb->sk && sk_fullsock(skb->sk)"
before accessing the fields in sock.  For example, in __netdev_pick_tx:

static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
			    struct net_device *sb_dev)
{
	/* ... */

	struct sock *sk = skb->sk;

		if (queue_index != new_index && sk &&
		    sk_fullsock(sk) &&
		    rcu_access_pointer(sk->sk_dst_cache))
			sk_tx_queue_set(sk, new_index);

	/* ... */

	return queue_index;
}

This patch adds a "struct bpf_sock *sk" pointer to the "struct __sk_buff"
where a few of the convert_ctx_access() in filter.c has already been
accessing the skb->sk sock_common's fields,
e.g. sock_ops_convert_ctx_access().

"__sk_buff->sk" is a PTR_TO_SOCK_COMMON_OR_NULL in the verifier.
Some of the fileds in "bpf_sock" will not be directly
accessible through the "__sk_buff->sk" pointer.  It is limited
by the new "bpf_sock_common_is_valid_access()".
e.g. The existing "type", "protocol", "mark" and "priority" in bpf_sock
     are not allowed.

The newly added "struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)"
can be used to get a sk with all accessible fields in "bpf_sock".
This helper is added to both cg_skb and sched_(cls|act).

int cg_skb_foo(struct __sk_buff *skb) {
	struct bpf_sock *sk;

	sk = skb->sk;
	if (!sk)
		return 1;

	sk = bpf_sk_fullsock(sk);
	if (!sk)
		return 1;

	if (sk->family != AF_INET6 || sk->protocol != IPPROTO_TCP)
		return 1;

	/* some_traffic_shaping(); */

	return 1;
}

(1) The sk is read only

(2) There is no new "struct bpf_sock_common" introduced.

(3) Future kernel sock's members could be added to bpf_sock only
    instead of repeatedly adding at multiple places like currently
    in bpf_sock_ops_md, bpf_sock_addr_md, sk_reuseport_md...etc.

(4) After "sk = skb->sk", the reg holding sk is in type
    PTR_TO_SOCK_COMMON_OR_NULL.

(5) After bpf_sk_fullsock(), the return type will be in type
    PTR_TO_SOCKET_OR_NULL which is the same as the return type of
    bpf_sk_lookup_xxx().

    However, bpf_sk_fullsock() does not take refcnt.  The
    acquire_reference_state() is only depending on the return type now.
    To avoid it, a new is_acquire_function() is checked before calling
    acquire_reference_state().

(6) The WARN_ON in "release_reference_state()" is no longer an
    internal verifier bug.

    When reg->id is not found in state->refs[], it means the
    bpf_prog does something wrong like
    "bpf_sk_release(bpf_sk_fullsock(skb->sk))" where reference has
    never been acquired by calling "bpf_sk_fullsock(skb->sk)".

    A -EINVAL and a verbose are done instead of WARN_ON.  A test is
    added to the test_verifier in a later patch.

    Since the WARN_ON in "release_reference_state()" is no longer
    needed, "__release_reference_state()" is folded into
    "release_reference_state()" also.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Dave Airlie
5ea3998d56 UAPI Changes:
- Expose RPCS (SSEU) configuration to userspace for Ice Lake
 in order to allow userspace to reconfigure the subslice config
 per context basis. (Tvrtko, Lionel)
 
 Driver Changes:
 
 - Execbuf and preemption improvements including selftests (Chris)
 - Rename HAS_GMCH_DISPLAY/HAS_GMCH (Rodrigo)
 - Debugfs error handling fix for robustness (Greg)
 - Improve reg_rw traces (Ville)
 - Push clear_intel_crtc_state onto the heap (Chris)
 - Watermark fixes for Ice Lake (Ville)
 - Fix enable count array size and bounds checking (Tvrtko)
 - MST Fixes (Lyude)
 - Prevent race and handle error on I915_GEM_MMAP (Joonas)
 - Initial rework for an full atomic gamma mode (Ville)
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcXJw7AAoJEPpiX2QO6xPKWuQH/i3o86TqLheae2vdi8X/yp92
 XeZbmNfaL7Gui6ggZpRMREjIIiDZ0nFga5INcVtrak6JRjeefb+wQsy38tdb98ne
 EPn6eif0nO/uxc7BsdW2ndIDBSWC+ZW7gt4rzY5zLG/WCmX1qfOOS9GdUpXPF3jY
 sB8Af6lsGSxrCFkBzUB9Z7AyoP/Yq+YeamFOtXLufhtQb4GL3djjdugaUBBfQmNn
 FvZTxhhsXn3HU9Jy1AjRFALhqXvTjZt1L5myav4sCJ2qlsr2dL9y3c7evbO5jrZY
 4TSlKUi5XwfW1RlHTQN43fTq7yWFwoF1FsR4LaKS0a8bKIXiBLwset7YA1m6wmY=
 =Woo8
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2019-02-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- Expose RPCS (SSEU) configuration to userspace for Ice Lake
in order to allow userspace to reconfigure the subslice config
per context basis. (Tvrtko, Lionel)

Driver Changes:

- Execbuf and preemption improvements including selftests (Chris)
- Rename HAS_GMCH_DISPLAY/HAS_GMCH (Rodrigo)
- Debugfs error handling fix for robustness (Greg)
- Improve reg_rw traces (Ville)
- Push clear_intel_crtc_state onto the heap (Chris)
- Watermark fixes for Ice Lake (Ville)
- Fix enable count array size and bounds checking (Tvrtko)
- MST Fixes (Lyude)
- Prevent race and handle error on I915_GEM_MMAP (Joonas)
- Initial rework for an full atomic gamma mode (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208165000.GA30314@intel.com
2019-02-11 13:41:59 +10:00
Thomas Gleixner
41ea39101d y2038: Add time64 system calls
This series finally gets us to the point of having system calls with
 64-bit time_t on all architectures, after a long time of incremental
 preparation patches.
 
 There was actually one conversion that I missed during the summer,
 i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes
 and review comments.
 
 The following system calls are now added on all 32-bit architectures
 using the same system call numbers:
 
 403 clock_gettime64
 404 clock_settime64
 405 clock_adjtime64
 406 clock_getres_time64
 407 clock_nanosleep_time64
 408 timer_gettime64
 409 timer_settime64
 410 timerfd_gettime64
 411 timerfd_settime64
 412 utimensat_time64
 413 pselect6_time64
 414 ppoll_time64
 416 io_pgetevents_time64
 417 recvmmsg_time64
 418 mq_timedsend_time64
 419 mq_timedreceiv_time64
 420 semtimedop_time64
 421 rt_sigtimedwait_time64
 422 futex_time64
 423 sched_rr_get_interval_time64
 
 Each one of these corresponds directly to an existing system call
 that includes a 'struct timespec' argument, or a structure containing
 a timespec or (in case of clock_adjtime) timeval. Not included here
 are new versions of getitimer/setitimer and getrusage/waitid, which
 are planned for the future but only needed to make a consistent API
 rather than for correct operation beyond y2038. These four system
 calls are based on 'timeval', and it has not been finally decided
 what the replacement kernel interface will use instead.
 
 So far, I have done a lot of build testing across most architectures,
 which has found a number of bugs. Runtime testing so far included
 testing LTP on 32-bit ARM with the existing system calls, to ensure
 we do not regress for existing binaries, and a test with a 32-bit
 x86 build of LTP against a modified version of the musl C library
 that has been adapted to the new system call interface [3].
 This library can be used for testing on all architectures supported
 by musl-1.1.21, but it is not how the support is getting integrated
 into the official musl release. Official musl support is planned
 but will require more invasive changes to the library.
 
 Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/
 Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/
 Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcXf7/AAoJEGCrR//JCVInPSUP/RhsQSCKMGtONB/vVICQhwep
 PybhzBSpHWFxszzTi6BEPN1zS9B069G9mDollRBYZCckyPqL/Bv6sI/vzQZdNk01
 Q6Nw92OnNE1QP8owZ5TjrZhpbtopWdqIXjsbGZlloUemvuJP2JwvKovQUcn5CPTQ
 jbnqU04CVyFFJYVxAnGJ+VSeWNrjW/cm/m+rhLFjUcwW7Y3aodxsPqPP6+K9hY9P
 yIWfcH42WBeEWGm1RSBOZOScQl4SGCPUAhFydl/TqyEQagyegJMIyMOv9wZ5AuTT
 xK644bDVmNsrtJDZDpx+J8hytXCk1LrnKzkHR/uK80iUIraF/8D7PlaPgTmEEjko
 XcrywEkvkXTVU3owCm2/sbV+8fyFKzSPipnNfN1JNxEX71A98kvMRtPjDueQq/GA
 Yh81rr2YLF2sUiArkc2fNpENT7EGhrh1q6gviK3FB8YDgj1kSgPK5wC/X0uolC35
 E7iC2kg4NaNEIjhKP/WKluCaTvjRbvV+0IrlJLlhLTnsqbA57ZKCCteiBrlm7wQN
 4csUtCyxchR9Ac2o/lj+Mf53z68Zv74haIROp18K2dL7ZpVcOPnA3XHeauSAdoyp
 wy2Ek6ilNvlNB+4x+mRntPoOsyuOUGv7JXzB9JvweLWUd9G7tvYeDJQp/0YpDppb
 K4UWcKnhtEom0DgK08vY
 =IZVb
 -----END PGP SIGNATURE-----

Merge tag 'y2038-new-syscalls' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038

Pull y2038 - time64 system calls from Arnd Bergmann:

This series finally gets us to the point of having system calls with 64-bit
time_t on all architectures, after a long time of incremental preparation
patches.

There was actually one conversion that I missed during the summer,
i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes
and review comments.

The following system calls are now added on all 32-bit architectures using
the same system call numbers:

403 clock_gettime64
404 clock_settime64
405 clock_adjtime64
406 clock_getres_time64
407 clock_nanosleep_time64
408 timer_gettime64
409 timer_settime64
410 timerfd_gettime64
411 timerfd_settime64
412 utimensat_time64
413 pselect6_time64
414 ppoll_time64
416 io_pgetevents_time64
417 recvmmsg_time64
418 mq_timedsend_time64
419 mq_timedreceiv_time64
420 semtimedop_time64
421 rt_sigtimedwait_time64
422 futex_time64
423 sched_rr_get_interval_time64

Each one of these corresponds directly to an existing system call that
includes a 'struct timespec' argument, or a structure containing a timespec
or (in case of clock_adjtime) timeval. Not included here are new versions
of getitimer/setitimer and getrusage/waitid, which are planned for the
future but only needed to make a consistent API rather than for correct
operation beyond y2038. These four system calls are based on 'timeval', and
it has not been finally decided what the replacement kernel interface will
use instead.

So far, I have done a lot of build testing across most architectures, which
has found a number of bugs. Runtime testing so far included testing LTP on
32-bit ARM with the existing system calls, to ensure we do not regress for
existing binaries, and a test with a 32-bit x86 build of LTP against a
modified version of the musl C library that has been adapted to the new
system call interface [3].  This library can be used for testing on all
architectures supported by musl-1.1.21, but it is not how the support is
getting integrated into the official musl release. Official musl support is
planned but will require more invasive changes to the library.

Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/
Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/
Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]
2019-02-10 21:24:43 +01:00
Thomas Gleixner
fd659cc095 arch: System call unification and cleanup
The system call tables have diverged a bit over the years, and a number
 of the recent additions never made it into all architectures, for one
 reason or another.
 
 This is an attempt to clean it up as far as we can without breaking
 compatibility, doing a number of steps:
 
 - Add system calls that have not yet been integrated into all
   architectures but that we definitely want there. This includes
   {,f}statfs64() and get{eg,eu,g,p,u,pp}id() on alpha, which have
   been missing traditionally.
 
 - The s390 compat syscall handling is cleaned up to be more like
   what we do on other architectures, while keeping the 31-bit
   pointer extension. This was merged as a shared branch by the
   s390 maintainers and is included here in order to base the other
   patches on top.
 
 - Add the separate ipc syscalls on all architectures that
   traditionally only had sys_ipc(). This version is done without
   support for IPC_OLD that is we have in sys_ipc. The
   new semtimedop_time64 syscall will only be added here, not
   in sys_ipc
 
 - Add syscall numbers for a couple of syscalls that we probably
   don't need everywhere, in particular pkey_* and rseq,
   for the purpose of symmetry: if it's in asm-generic/unistd.h,
   it makes sense to have it everywhere. I expect that any future
   system calls will get assigned on all platforms together, even
   when they appear to be specific to a single architecture.
 
 - Prepare for having the same system call numbers for any future
   calls. In combination with the generated tables, this hopefully
   makes it easier to add new calls across all architectures
   together.
 
 All of the above are technically separate from the y2038 work,
 but are done as preparation before we add the new 64-bit time_t
 system calls everywhere, providing a common baseline set of system
 calls.
 
 I expect that glibc and other libraries that want to use 64-bit
 time_t will require linux-5.1 kernel headers for building in
 the future, and at a much later point may also require linux-5.1
 or a later version as the minimum kernel at runtime. Having a
 common baseline then allows the removal of many architecture or
 kernel version specific workarounds.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcXf6XAAoJEGCrR//JCVInIm4P/AlkMmQRa/B2ziWMW6PifPoI
 v18r44017rA1BPENyZvumJUdM5mDvNofOW8F2DYQ7Uiys2YtXenwe/Cf8LHn2n6c
 TMXGQryQpvNmfDCyU+0UjF8m2+poFMrL4aRTXtjODh1YTsPNgeDC+KFMCAAtZmZd
 cVbXFudtbdYKD/pgCX4SI1CWAMBiXe2e+ukPdJVr+iqusCMTApf+GOuyvDBZY9s/
 vURb+tIS87HZ/jehWfZFSuZt+Gu7b3ijUXNC8v9qSIxNYekw62vBNl6F09HE79uB
 Bv4OujAODqKvI9gGyydBzLJNzaMo0ryQdusyqcJHT7MY/8s+FwcYAXyTlQ3DbbB4
 2u/c+58OwJ9Zk12p4LXZRA47U+vRhQt2rO4+zZWs2txNNJY89ZvCm/Z04KOiu5Xz
 1Nnj607KGzthYRs2gs68AwzGGyf0uykIQ3RcaJLIBlX1Nd8BWO0ZgAguCvkXbQMX
 XNXJTd92HmeuKKpiO0n/M4/mCeP0cafBRPCZbKlHyTl0Jeqd/HBQEO9Z8Ifwyju3
 mXz9JCR9VlPCkX605keATbjtPGZf3XQtaXlQnezitDudXk8RJ33EpPcbhx76wX7M
 Rux37ByqEOzk4wMGX9YQyNU7z7xuVg4sJAa2LlJqYeKXHtym+u3gG7SGP5AsYjmk
 6mg2+9O2yZuLhQtOtrwm
 =s4wf
 -----END PGP SIGNATURE-----

Merge tag 'y2038-syscall-cleanup' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038

Pull preparatory work for y2038 changes from Arnd Bergmann:

System call unification and cleanup

The system call tables have diverged a bit over the years, and a number of
the recent additions never made it into all architectures, for one reason
or another.

This is an attempt to clean it up as far as we can without breaking
compatibility, doing a number of steps:

 - Add system calls that have not yet been integrated into all architectures
   but that we definitely want there. This includes {,f}statfs64() and
   get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally.

 - The s390 compat syscall handling is cleaned up to be more like what we
   do on other architectures, while keeping the 31-bit pointer
   extension. This was merged as a shared branch by the s390 maintainers
   and is included here in order to base the other patches on top.

 - Add the separate ipc syscalls on all architectures that traditionally
   only had sys_ipc(). This version is done without support for IPC_OLD
   that is we have in sys_ipc. The new semtimedop_time64 syscall will only
   be added here, not in sys_ipc

 - Add syscall numbers for a couple of syscalls that we probably don't need
   everywhere, in particular pkey_* and rseq, for the purpose of symmetry:
   if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I
   expect that any future system calls will get assigned on all platforms
   together, even when they appear to be specific to a single architecture.

 - Prepare for having the same system call numbers for any future calls. In
   combination with the generated tables, this hopefully makes it easier to
   add new calls across all architectures together.

All of the above are technically separate from the y2038 work, but are done
as preparation before we add the new 64-bit time_t system calls everywhere,
providing a common baseline set of system calls.

I expect that glibc and other libraries that want to use 64-bit time_t will
require linux-5.1 kernel headers for building in the future, and at a much
later point may also require linux-5.1 or a later version as the minimum
kernel at runtime. Having a common baseline then allows the removal of many
architecture or kernel version specific workarounds.
2019-02-10 20:44:19 +01:00
Eli Cohen
eddd2cf195 net: Change TCA_ACT_* to TCA_ID_* to match that of TCA_ID_POLICE
Modify the kernel users of the TCA_ACT_* macros to use TCA_ID_*. For
example, use TCA_ID_GACT instead of TCA_ACT_GACT. This will align with
TCA_ID_POLICE and also differentiates these identifier, used in struct
tc_action_ops type field, from other macros starting with TCA_ACT_.

To make things clearer, we name the enum defining the TCA_ID_*
identifiers and also change the "type" field of struct tc_action to
id.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10 09:28:43 -08:00
Eli Cohen
257eeded20 net: Move all TC actions identifiers to one place
Move all the TC identifiers to one place, to the same enum that defines
the identifier of police action. This makes it easier choose numbers for
new actions since they are now defined in one place. We preserve the
original values for binary compatibility. New IDs should be added inside
the enum.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10 09:28:43 -08:00
Sven Eckelmann
9a182242f1 batman-adv: Add throughput_override hardif genl configuration
The B.A.T.M.A.N. V implementation tries to estimate the link throughput of
an interface to an originator using different automatic methods. It is
still possible to overwrite it the link throughput for all reachable
originators via this interface.

The BATADV_CMD_SET_HARDIF/BATADV_CMD_GET_HARDIF commands allow to set/get
the configuration of this feature using the u32
BATADV_ATTR_THROUGHPUT_OVERRIDE attribute. The used unit is in 100 Kbit/s.
If the value is set to 0 then batman-adv will try to estimate the
throughput by itself.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:15 +01:00
Sven Eckelmann
a108008290 batman-adv: Add elp_interval hardif genl configuration
The ELP packets are transmitted every elp_interval milliseconds on an
slave/hard-interface. This value can be changed using the configuration
interface.

The BATADV_CMD_SET_HARDIF/BATADV_CMD_GET_HARDIF commands allow to set/get
the configuration of this feature using the u32 BATADV_ATTR_ELP_INTERVAL
attribute.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:15 +01:00
Sven Eckelmann
7b751b39f0 batman-adv: Add orig_interval mesh genl configuration
The OGM packets are transmitted every orig_interval milliseconds. This
value can be changed using the configuration interface.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the u32 BATADV_ATTR_ORIG_INTERVAL
attribute.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:15 +01:00
Sven Eckelmann
6c57cde680 batman-adv: Add network_coding mesh genl configuration
The mesh interface can use (in an homogeneous mesh) network coding, a
mechanism that aims to increase the overall network throughput by fusing
multiple packets in one transmission.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the BATADV_ATTR_NETWORK_CODING_ENABLED
attribute. Setting the u8 to zero will disable this feature and setting it
to something else is enabling this feature.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:15 +01:00
Sven Eckelmann
f75b56bc91 batman-adv: Add multicast forceflood mesh genl configuration
The mesh interface can optimize the flooding of multicast packets based on
the content of the global translation tables. To disable this behavior and
use the broadcast-like flooding of the packets, forceflood has to be
enabled.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the
BATADV_ATTR_MULTICAST_FORCEFLOOD_ENABLED attribute. Setting the u8 to zero
will disable this feature (allowing multicast optimizations) and setting it
to something else is enabling this feature (forcing simple flooding).

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:14 +01:00
Sven Eckelmann
b85bd09109 batman-adv: Add log_level mesh genl configuration
In contrast to other modules, batman-adv allows to set the debug message
verbosity per mesh/soft-interface and not per module (via modparam).

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the u32 (bitmask) BATADV_ATTR_LOG_LEVEL
attribute.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:14 +01:00
Sven Eckelmann
bfc7f1be57 batman-adv: Add hop_penalty mesh genl configuration
The TQ (B.A.T.M.A.N. IV) and throughput values (B.A.T.M.A.N. V) are reduced
when they are forwarded. One of the reductions is the penalty for
traversing an additional hop. This hop_penalty (0-255) defines the
percentage of reduction (0-100%).

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the u8 BATADV_ATTR_HOP_PENALTY
attribute.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:14 +01:00
Sven Eckelmann
e2d0d35b5b batman-adv: Add gateway mesh genl configuration
The mesh/soft-interface can optimize the handling of DHCP packets. Instead
of flooding them through the whole mesh, it can be forwarded as unicast to
a specific gateway server. The originator which injects the packets in the
mesh has to select (based on sel_class thresholds) a responsible gateway
server. This is done by switching this originator to the gw_mode client.
The servers announce their forwarding bandwidth (download/upload) when the
gw_mode server was selected.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the attributes:

* u8 BATADV_ATTR_GW_MODE (0 == off, 1 == client, 2 == server)
* u32 BATADV_ATTR_GW_BANDWIDTH_DOWN (in 100 kbit/s steps)
* u32 BATADV_ATTR_GW_BANDWIDTH_UP (in 100 kbit/s steps)
* u32 BATADV_ATTR_GW_SEL_CLASS

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:14 +01:00
Sven Eckelmann
3e15b06eb7 batman-adv: Add fragmentation mesh genl configuration
The mesh interface can fragment unicast packets when the packet size
exceeds the outgoing slave/hard-interface MTU.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the BATADV_ATTR_FRAGMENTATION_ENABLED
attribute. Setting the u8 to zero will disable this feature and setting it
to something else is enabling this feature.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:14 +01:00
Sven Eckelmann
a1c8de8032 batman-adv: Add distributed_arp_table mesh genl configuration
The mesh interface can use a distributed hash table to answer ARP requests
without flooding the request through the whole mesh.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the
BATADV_ATTR_DISTRIBUTED_ARP_TABLE_ENABLED attribute. Setting the u8 to zero
will disable this feature and setting it to something else is enabling this
feature.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:14 +01:00
Sven Eckelmann
43ff6105a5 batman-adv: Add bridge_loop_avoidance mesh genl configuration
The mesh interface can try to detect loops in the same mesh caused by
(indirectly) bridged mesh/soft-interfaces of different nodes. Some of the
loops can also be resolved without breaking the mesh.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the
BATADV_ATTR_BRIDGE_LOOP_AVOIDANCE_ENABLED attribute. Setting the u8 to zero
will disable this feature and setting it to something else is enabling this
feature.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:14 +01:00
Sven Eckelmann
d7e52506b6 batman-adv: Add bonding mesh genl configuration
The mesh interface can use multiple slave/hard-interface ports at the same
time to transport the traffic to other nodes.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the BATADV_ATTR_BONDING_ENABLED
attribute. Setting the u8 to zero will disable this feature and setting it
to something else is enabling this feature.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:13 +01:00
Sven Eckelmann
e43d16b87d batman-adv: Add ap_isolation mesh/vlan genl configuration
The mesh interface can drop messages between clients to implement a
mesh-wide AP isolation.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH and
BATADV_CMD_SET_VLAN/BATADV_CMD_GET_VLAN commands allow to set/get the
configuration of this feature using the BATADV_ATTR_AP_ISOLATION_ENABLED
attribute. Setting the u8 to zero will disable this feature and setting it
to something else is enabling this feature.

This feature also requires that skbuff which should be handled as isolated
are marked. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to
set/get the mark/mask using the u32 attributes BATADV_ATTR_ISOLATION_MARK
and BATADV_ATTR_ISOLATION_MASK.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:13 +01:00
Sven Eckelmann
9ab4cee5ce batman-adv: Add aggregated_ogms mesh genl configuration
The mesh interface can delay OGM messages to aggregate different ogms
together in a single OGM packet.

The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the
configuration of this feature using the BATADV_ATTR_AGGREGATED_OGMS_ENABLED
attribute. Setting the u8 to zero will disable this feature and setting it
to something else is enabling this feature.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:13 +01:00
Sven Eckelmann
49e7e37cd9 batman-adv: Prepare framework for vlan genl config
The batman-adv configuration interface was implemented solely using sysfs.
This approach was condemned by non-batadv developers as "huge mistake".
Instead a netlink/genl based implementation was suggested.

Beside the mesh/soft-interface specific configuration, the VLANs on top of
the mesh/soft-interface have configuration settings. The genl interface
reflects this by allowing to get/set it using the vlan specific commands
BATADV_CMD_GET_VLAN/BATADV_CMD_SET_VLAN.

The set command BATADV_CMD_SET_MESH will also notify interested userspace
listeners of the "config" mcast group using the BATADV_CMD_SET_VLAN command
message type that settings might have been changed and what the current
values are.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:13 +01:00
Sven Eckelmann
5c55a40fa8 batman-adv: Prepare framework for hardif genl config
The batman-adv configuration interface was implemented solely using sysfs.
This approach was condemned by non-batadv developers as "huge mistake".
Instead a netlink/genl based implementation was suggested.

Beside the mesh/soft-interface specific configuration, the
slave/hard-interface have B.A.T.M.A.N. V specific configuration settings.
The genl interface reflects this by allowing to get/set it using the
hard-interface specific commands.

The BATADV_CMD_GET_HARDIFS (or short version BATADV_CMD_GET_HARDIF) is
reused as get command because it already allow sto dump the content of
other information from the slave/hard-interface which are not yet
configuration specific.

The set command BATADV_CMD_SET_HARDIF will also notify interested userspace
listeners of the "config" mcast group using the BATADV_CMD_SET_HARDIF
command message type that settings might have been changed and what the
current values are.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:13 +01:00
Sven Eckelmann
6004051353 batman-adv: Prepare framework for mesh genl config
The batman-adv configuration interface was implemented solely using sysfs.
This approach was condemned by non-batadv developers as "huge mistake".
Instead a netlink/genl based implementation was suggested.

The main objects for this configuration is the mesh/soft-interface object.
Its actual object in memory already contains most of the available
configuration settings. The genl interface reflects this by allowing to
get/set it using the mesh specific commands.

The BATADV_CMD_GET_MESH_INFO (or short version BATADV_CMD_GET_MESH) is
reused as get command because it already provides the content of other
information from the mesh/soft-interface which are not yet configuration
specific.

The set command BATADV_CMD_SET_MESH will also notify interested userspace
listeners of the "config" mcast group using the BATADV_CMD_SET_MESH command
message type that settings might have been changed and what the current
values are.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:28:13 +01:00
Sven Eckelmann
180cf62cec batman-adv: Fix typo "reseved" -> "reserved"
checkpatch.pl complains since commit 45e417022023 ("scripts/spelling.txt:
add more spellings to spelling.txt") about an additional spelling mistake
in batman-adv:`

  CHECK: 'reseved' may be misspelled - perhaps 'reserved'?
  #232: FILE: include/uapi/linux/batadv_packet.h:232:
  + * @flags: reseved for routing relevant flags - currently always 0

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09 14:27:47 +01:00
Heiner Kallweit
3b5e74e0af net: phy: disregard "Clause 22 registers present" bit in get_phy_c45_devs_in_pkg
Bit 0 in register 1.5 doesn't represent a device but is a flag that
Clause 22 registers are present. Therefore disregard this bit when
populating the device list. If code needs this information it
should read register 1.5 directly instead of accessing the device
list.
Because this bit doesn't represent a device don't define a
MDIO_MMD_XYZ constant, just define a MDIO_DEVS_XYZ constant for
the flag in the device list bitmap.

v2:
- make masking of bit 0 more explicit
- improve commit message

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:10:42 -08:00
Danit Goldberg
2c1619edef IB/cma: Define option to set ack timeout and pack tos_set
Define new option in 'rdma_set_option' to override calculated QP timeout
when requested to provide QP attributes to modify a QP.

At the same time, pack tos_set to be bitfield.

Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:14:21 -07:00
David S. Miller
a655fe9f19 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
An ipvlan bug fix in 'net' conflicted with the abstraction away
of the IPV6 specific support in 'net-next'.

Similarly, a bug fix for mlx5 in 'net' conflicted with the flow
action conversion in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:00:17 -08:00
Randy Li
05f8bc82fc drm/fourcc: Add new P010, P016 video format
P010 is a planar 4:2:0 YUV with interleaved UV plane, 10 bits per
channel video format.

P012 is a planar 4:2:0 YUV 12 bits per channel

P016 is a planar 4:2:0 YUV with interleaved UV plane, 16 bits per
channel video format.

V3: Added P012 and fixed cpp for P010.
V4: format definition refined per review.
V5: Format comment block for each new pixel format.
V6: reversed Cb/Cr order in comments.
v7: reversed Cb/Cr order in comments of header files, remove
the wrong part of commit message.
V8: reversed V7 changes except commit message and rebased.
v9: used the new properties to describe those format and
rebased.

Cc: Daniel Stone <daniel@fooishbar.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

Signed-off-by: Randy Li <ayaka@soulik.info>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Reviewed-by: Ayan Kumar Halder <ayan.halder@arm.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190109195710.28501-2-ayaka@soulik.info
2019-02-08 22:17:08 +01:00
Heiner Kallweit
998a8a8387 net: phy: let genphy_c45_read_link manage the devices to check
Let genphy_c45_read_link manage the devices to check, this removes
overhead from callers. Add C22EXT to the list of excluded devices
because it doesn't implement the status register. According to the
802.3 clause 45 spec registers 29.0 - 29.4 are reserved.

At the moment we have very few clause 45 PHY drivers, so we are
lacking experience whether other drivers will have to exclude further
devices, or may need to check PHY XS. If we should figure out that
list of devices to check needs to be configurable, I think best will
be to add a device list member to struct phy_driver.

v2:
- adjusted commit message
- exclude also device C22EXT from link checking

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 18:17:08 -08:00
Devesh Sharma
95b86d1c91 RDMA/bnxt_re: Update kernel user abi to pass chip context
User space verbs provider library would need chip context.  Changing the
ABI to add chip version details in structure.  Furthermore, changing the
kernel driver ucontext allocation code to initialize the abi structure
with appropriate values.

As suggested by community, appended the new fields at the bottom of the
ABI structure and retaining to older fields as those were in the older
versions.

Keeping the ABI version at 1 and adding a new field in the ucontext
response structure to hold the component mask.  The user space library
should check pre-defined flags to figure out if a certain feature is
supported on not.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:49 -07:00
Eran Ben Elisha
35455e23e6 devlink: Add health dump {get,clear} commands
Add devlink health dump commands, in order to run an dump operation
over a specific reporter.

The supported operations are dump_get in order to get last saved
dump (if not exist, dump now) and dump_clear to clear last saved
dump.

It is expected from driver's callback for diagnose command to fill it
via the devlink fmsg API. Devlink will parse it and convert it to
netlink nla API in order to pass it to the user.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:29 -08:00
Eran Ben Elisha
fca42a2794 devlink: Add health diagnose command
Add devlink health diagnose command, in order to run a diagnose
operation over a specific reporter.

It is expected from driver's callback for diagnose command to fill it
via the devlink fmsg API. Devlink will parse it and convert it to
netlink nla API in order to pass it to the user.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:28 -08:00
Eran Ben Elisha
20a0943a5b devlink: Add health recover command
Add devlink health recover command to the uapi, in order to allow the user
to execute a recover operation over a specific reporter.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:28 -08:00
Eran Ben Elisha
a1e55ec0a0 devlink: Add health set command
Add devlink health set command, in order to set configuration parameters
for a specific reporter.
Supported parameters are:
- graceful_period: Time interval between auto recoveries (in msec)
- auto_recover: Determines if the devlink shall execute recover upon
		receiving error for the reporter

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:28 -08:00
Eran Ben Elisha
7afe335a8b devlink: Add health get command
Add devlink health get command to provide reporter/s data for user space.
Add the ability to get data per reporter or dump data from all available
reporters.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:28 -08:00
Eran Ben Elisha
1db64e8733 devlink: Add devlink formatted message (fmsg) API
Devlink fmsg is a mechanism to pass descriptors between drivers and
devlink, in json-like format. The API allows the driver to add nested
attributes such as object, object pair and value array, in addition to
attributes such as name and value.

Driver can use this API to fill the fmsg context in a format which will be
translated by the devlink to the netlink message later.
There is no memory allocation in advance (other than the initial list
head), and it dynamically allocates messages descriptors and add them to
the list on the fly.

When it needs to send the data using SKBs to the netlink layer, it
fragments the data between different SKBs. In order to do this
fragmentation, it uses virtual nests attributes, to avoid actual
nesting use which cannot be divided between different SKBs.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:28 -08:00
Amir Goldstein
235328d1fa fanotify: add support for create/attrib/move/delete events
Add support for events with data type FSNOTIFY_EVENT_INODE
(e.g. create/attrib/move/delete) for inode and filesystem mark types.

The "inode" events do not carry enough information (i.e. path) to
report event->fd, so we do not allow setting a mask for those events
unless group supports reporting fid.

The "inode" events are not supported on a mount mark, because they do
not carry enough information (i.e. path) to be filtered by mount point.

The "dirent" events (create/move/delete) report the fid of the parent
directory where events took place without specifying the filename of the
child. In the future, fanotify may get support for reporting filename
information for those events.

Cc: <linux-api@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-02-07 16:43:23 +01:00
Amir Goldstein
5e469c830f fanotify: copy event fid info to user
If group requested FAN_REPORT_FID and event has file identifier,
copy that information to user reading the event after event metadata.

fid information is formatted as struct fanotify_event_info_fid
that includes a generic header struct fanotify_event_info_header,
so that other info types could be defined in the future using the
same header.

metadata->event_len includes the length of the fid information.

The fid information includes the filesystem's fsid (see statfs(2))
followed by an NFS file handle of the file that could be passed as
an argument to open_by_handle_at(2).

Cc: <linux-api@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-02-07 16:38:35 +01:00
Amir Goldstein
e9e0c89030 fanotify: encode file identifier for FAN_REPORT_FID
When user requests the flag FAN_REPORT_FID in fanotify_init(),
a unique file identifier of the event target object will be reported
with the event.

The file identifier includes the filesystem's fsid (i.e. from statfs(2))
and an NFS file handle of the file (i.e. from name_to_handle_at(2)).

The file identifier makes holding the path reference and passing a file
descriptor to user redundant, so those are disabled in a group with
FAN_REPORT_FID.

Encode fid and store it in event for a group with FAN_REPORT_FID.
Up to 12 bytes of file handle on 32bit arch (16 bytes on 64bit arch)
are stored inline in fanotify_event struct. Larger file handles are
stored in an external allocated buffer.

On failure to encode fid, we print a warning and queue the event
without the fid information.

[JK: Fold part of later patched into this one to use
exportfs_encode_inode_fh() right away]

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-02-07 16:38:34 +01:00
Linus Torvalds
b0314565da virtio: fixes
A small fix for a uapi header, and a fix for VDPA for non-x86 guests.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcWfKsAAoJECgfDbjSjVRpz8AIALAFWg40sy8n1LjcDoTar9oP
 YRrKUD7ag4dL8I41kbu4Vk8GQG1cSKFxk+PnZ4lhyiJGUgcd267qoUf3Kk9MxL94
 IvMxa/Y2qPiSpEIwJGDxikJNaqcSDTJXYgs6zjCLHgo35WUbmAxVh3iZDWkpaKKl
 mql+DsotNhEYE5BGAi8OhPIbGOtumkhCy6JySBD+trHyivKIotKaPN6PoNudPRQU
 ipP1YlTZQL2vYp3vwbwjaO4IKo/17IDQtzHxLQU2Q1ZORSPOctjffmF74qfDkNKl
 pqWYXwVbq8/EuKyhNexAqU3vkSQaLUEMthzi5d/Z7QYeJt/K5xcpgkKHYm1lZ/U=
 =XTXV
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "A small fix for a uapi header, and a fix for VDPA for non-x86 guests"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio: drop internal struct from UAPI
  virtio: support VIRTIO_F_ORDER_PLATFORM
2019-02-07 08:05:28 +00:00
Arnd Bergmann
48166e6ea4 y2038: add 64-bit time_t syscalls to all 32-bit architectures
This adds 21 new system calls on each ABI that has 32-bit time_t
today. All of these have the exact same semantics as their existing
counterparts, and the new ones all have macro names that end in 'time64'
for clarification.

This gets us to the point of being able to safely use a C library
that has 64-bit time_t in user space. There are still a couple of
loose ends to tie up in various areas of the code, but this is the
big one, and should be entirely uncontroversial at this point.

In particular, there are four system calls (getitimer, setitimer,
waitid, and getrusage) that don't have a 64-bit counterpart yet,
but these can all be safely implemented in the C library by wrapping
around the existing system calls because the 32-bit time_t they
pass only counts elapsed time, not time since the epoch. They
will be dealt with later.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-07 00:13:28 +01:00
Arnd Bergmann
c70a772fda y2038: remove struct definition redirects
We now use 64-bit time_t on all architectures, so the __kernel_timex,
__kernel_timeval and __kernel_timespec redirects can be removed
after having served their purpose.

This makes it all much less confusing, as the __kernel_* types
now always refer to the same layout based on 64-bit time_t across
all 32-bit and 64-bit architectures.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-07 00:13:28 +01:00
Arnd Bergmann
00bf25d693 y2038: use time32 syscall names on 32-bit
This is the big flip, where all 32-bit architectures set COMPAT_32BIT_TIME
and use the _time32 system calls from the former compat layer instead
of the system calls that take __kernel_timespec and similar arguments.

The temporary redirects for __kernel_timespec, __kernel_itimerspec
and __kernel_timex can get removed with this.

It would be easy to split this commit by architecture, but with the new
generated system call tables, it's easy enough to do it all at once,
which makes it a little easier to check that the changes are the same
in each table.

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-07 00:13:28 +01:00
Arnd Bergmann
8dabe7245b y2038: syscalls: rename y2038 compat syscalls
A lot of system calls that pass a time_t somewhere have an implementation
using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
been reworked so that this implementation can now be used on 32-bit
architectures as well.

The missing step is to redefine them using the regular SYSCALL_DEFINEx()
to get them out of the compat namespace and make it possible to build them
on 32-bit architectures.

Any system call that ends in 'time' gets a '32' suffix on its name for
that version, while the others get a '_time32' suffix, to distinguish
them from the normal version, which takes a 64-bit time argument in the
future.

In this step, only 64-bit architectures are changed, doing this rename
first lets us avoid touching the 32-bit architectures twice.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-07 00:13:27 +01:00
Deepa Dinamani
2c620ff93d time: Add struct __kernel_timex
struct timex uses struct timeval internally.
struct timeval is not y2038 safe.
Introduce a new UAPI type struct __kernel_timex
that is y2038 safe.

struct __kernel_timex uses a timeval type that is
similar to struct __kernel_timespec which preserves the
same structure size across 32 bit and 64 bit ABIs.
struct __kernel_timex also restructures other members of the
structure to make the structure the same on 64 bit and 32 bit
architectures.
Note that struct __kernel_timex is the same as struct timex
on a 64 bit architecture.

The above solution is similar to other new y2038 syscalls
that are being introduced: both 32 bit and 64 bit ABIs
have a common entry, and the compat entry supports the old 32 bit
syscall interface.

Alternatives considered were:
1. Add new time type to struct timex that makes use of padded
   bits. This time type could be based on the struct __kernel_timespec.
   modes will use a flag to notify which time structure should be
   used internally.
   This needs some application level changes on both 64 bit and 32 bit
   architectures. Although 64 bit machines could continue to use the
   older timeval structure without any changes.

2. Add a new u8 type to struct timex that makes use of padded bits. This
   can be used to save higher order tv_sec bits. modes will use a flag to
   notify presence of such a type.
   This will need some application level changes on 32 bit architectures.

3. Add a new compat_timex structure that differs in only the size of the
   time type; keep rest of struct timex the same.
   This requires extra syscalls to manage all 3 cases on 64 bit
   architectures. This will not need any application level changes but will
   add more complexity from kernel side.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-07 00:13:27 +01:00
Marek Olšák
41cca166cc drm/amdgpu: add a workaround for GDS ordered append hangs with compute queues
I'm not increasing the DRM version because GDS isn't totally without bugs yet.

v2: update emit_ib_size

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-05 21:15:46 -05:00
Andrey Grodzovsky
67dd1a3633 drm/amdgpu: Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES
New chunk for dependency on start of job's execution instead on
the end. This is used for GPU deadlock prevention when
userspace uses mid-IB fences to wait for mid-IB work on other rings.

v2: Fix typo in AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES
v3: Bump KMS version
v4: put old fence AFTER acquiring the scheduled fence.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-05 21:15:46 -05:00
Leon Romanovsky
a78e8723a5 RDMA/cma: Remove CM_ID statistics provided by rdma-cm module
Netlink statistics exported by rdma-cm never had any working user space
component published to the mailing list or to any open source
project. Canvassing various proprietary users, and the original requester,
we find that there are no real users of this interface.

This patch simply removes all occurrences of RDMA CM netlink in favour of
modern nldev implementation, which provides the same information and
accompanied by widely used user space component.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-05 15:30:33 -07:00
Michael S. Tsirkin
9c0644ee4a virtio: drop internal struct from UAPI
There's no reason to expose struct vring_packed in UAPI - if we do we
won't be able to change or drop it, and it's not part of any interface.

Let's move it to virtio_ring.c

Cc: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-05 15:29:48 -05:00
Tvrtko Ursulin
e46c2e99f6 drm/i915: Expose RPCS (SSEU) configuration to userspace (Gen11 only)
We want to allow userspace to reconfigure the subslice configuration on a
per context basis.

This is required for the functional requirement of shutting down non-VME
enabled sub-slices on Gen11 parts.

To do so, we expose a context parameter to allow adjustment of the RPCS
register stored within the context image (and currently not accessible via
LRI).

If the context is adjusted before first use or whilst idle, the adjustment
is for "free"; otherwise if the context is active we queue a request to do
so (using the kernel context), following all other activity by that
context, which is also marked as barrier for all following submission
against the same context.

Since the overhead of device re-configuration during context switching can
be significant, especially in multi-context workloads, we limit this new
uAPI to only support the Gen11 VME use case. In this use case either the
device is fully enabled, and exactly one slice and half of the subslices
are enabled.

Example usage:

	struct drm_i915_gem_context_param_sseu sseu = { };
	struct drm_i915_gem_context_param arg = {
		.param = I915_CONTEXT_PARAM_SSEU,
		.ctx_id = gem_context_create(fd),
		.size = sizeof(sseu),
		.value = to_user_pointer(&sseu)
	};

	/* Query device defaults. */
	gem_context_get_param(fd, &arg);

	/* Set VME configuration on a 1x6x8 part. */
	sseu.slice_mask = 0x1;
	sseu.subslice_mask = 0xe0;
	gem_context_set_param(fd, &arg);

v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu()
    (Lionel)

v3: Add ability to program this per engine (Chris)

v4: Move most get_sseu() into i915_gem_context.c (Lionel)

v5: Validate sseu configuration against the device's capabilities (Lionel)

v6: Change context powergating settings through MI_SDM on kernel context
    (Chris)

v7: Synchronize the requests following a powergating setting change using
    a global dependency (Chris)
    Iterate timelines through dev_priv.gt.active_rings (Tvrtko)
    Disable RPCS configuration setting for non capable users
    (Lionel/Tvrtko)

v8: s/union intel_sseu/struct intel_sseu/ (Lionel)
    s/dev_priv/i915/ (Tvrtko)
    Change uapi class/instance fields to u16 (Tvrtko)
    Bump mask fields to 64bits (Lionel)
    Don't return EPERM when dynamic sseu is disabled (Tvrtko)

v9: Import context image into kernel context's ppgtt only when
    reconfiguring powergated slice/subslices (Chris)
    Use aliasing ppgtt when needed (Michel)

Tvrtko Ursulin:

v10:
 * Update for upstream changes.
 * Request submit needs a RPM reference.
 * Reject on !FULL_PPGTT for simplicity.
 * Pull out get/set param to helpers for readability and less indent.
 * Use i915_request_await_dma_fence in add_global_barrier to skip waits
   on the same timeline and avoid GEM_BUG_ON.
 * No need to explicitly assign a NULL pointer to engine in legacy mode.
 * No need to move gen8_make_rpcs up.
 * Factored out global barrier as prep patch.
 * Allow to only CAP_SYS_ADMIN if !Gen11.

v11:
 * Remove engine vfunc in favour of local helper. (Chris Wilson)
 * Stop retiring requests before updates since it is not needed
   (Chris Wilson)
 * Implement direct CPU update path for idle contexts. (Chris Wilson)
 * Left side dependency needs only be on the same context timeline.
   (Chris Wilson)
 * It is sufficient to order the timeline. (Chris Wilson)
 * Reject !RCS configuration attempts with -ENODEV for now.

v12:
 * Rebase for make_rpcs.

v13:
 * Centralize SSEU normalization to make_rpcs.
 * Type width checking (uAPI <-> implementation).
 * Gen11 restrictions uAPI checks.
 * Gen11 subslice count differences handling.
 Chris Wilson:
 * args->size handling fixes.
 * Update context image from GGTT.
 * Postpone context image update to pinning.
 * Use i915_gem_active_raw instead of last_request_on_engine.

v14:
 * Add activity tracker on intel_context to fix the lifetime issues
   and simplify the code. (Chris Wilson)

v15:
 * Fix context pin leak if no space in ring by simplifying the
   context pinning sequence.

v16:
 * Rebase for context get/set param locking changes.
 * Just -ENODEV on !Gen11. (Joonas)

v17:
 * Fix one Gen11 subslice enablement rule.
 * Handle error from i915_sw_fence_await_sw_fence_gfp. (Chris Wilson)

v18:
 * Update commit message. (Joonas)
 * Restrict uAPI to VME use case. (Joonas)

v19:
 * Rebase.

v20:
 * Rebase for ce->active_tracker.

v21:
 * Rebase for IS_GEN changes.

v22:
 * Reserve uAPI for flags straight away. (Chris Wilson)

v23:
 * Rebase for RUNTIME_INFO.

v24:
 * Added some headline docs for the uapi usage. (Joonas/Chris)

v25:
 * Renamed class/instance to engine_class/engine_instance to avoid clash
   with C++ keyword. (Tony Ye)

v26:
 * Rebased for runtime pm api changes.

v27:
 * Rebased for intel_context_init.
 * Wrap commit msg to 75.

v28:
 (Chris Wilson)
 * Use i915_gem_ggtt.
 * Use i915_request_await_dma_fence to show a better example.

v29:
 * i915_timeline_set_barrier can now fail. (Chris Wilson)

v30:
 * Capture some acks.

v31:
 * Drop the WARN_ON from use controllable paths. (Chris Wilson)
 * Use overflows_type for all checks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100899
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107634
Issue: https://github.com/intel/media-driver/issues/267
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Zhipeng Gong <zhipeng.gong@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Stéphane Marchesin <marcheu@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-4-tvrtko.ursulin@linux.intel.com
2019-02-05 11:32:03 +00:00
Steve Wise
b0bad9ad51 RDMA/IWPM: Support no port mapping requirements
A soft iwarp driver that uses the host TCP stack via a kernel mode socket
does not need port mapping.  In fact, if the port map daemon, iwpmd, is
running, then iwpmd must not try and create/bind a socket to the actual
port for a soft iwarp connection, since the driver already has that socket
bound.

Yet if the soft iwarp driver wants to interoperate with hard iwarp devices
that -are- using port mapping, then the soft iwarp driver's mappings still
need to be maintained and advertised by the iwpm protocol.

This patch enhances the rdma driver<->iwcm interface to allow an iwarp
driver to specify that it does not want port mapping.  The iwpm
kernel<->iwpmd interface is also enhanced to pass up this information on
map requests.

Care is taken to interoperate with the current iwpmd version (ABI version
3) and only use the new NL attributes if iwpmd supports ABI version 4.

The ABI version define has also been created in rdma_netlink.h so both
kernel and user code can share it.  The iwcm and iwpmd negotiate the ABI
version to use with a new HELLO netlink message.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:26:02 -07:00
Steve Wise
f76903d574 RDMA/IWPM: refactor the IWPM message attribute names
In order to add new IWPM_NL attributes, the enums for the IWPM commands
attributes are refactored such that a new attribute can be added without
breaking ABI version 3. Instead of sharing nl attribute enums for both
request and response messages, we create separate enums for each IWPM
message request and reply.  This allows us to extend any given IWPM
message by adding new attributes for just that message.  These new enums
are created, though, in a way to avoid breaking ABI version 3.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:26:02 -07:00
Santosh Shilimkar
3eb450367d rds: add type of service(tos) infrastructure
RDS Service type (TOS) is user-defined and needs to be configured
via RDS IOCTL interface. It must be set before initiating any
traffic and once set the TOS can not be changed. All out-going
traffic from the socket will be associated with its TOS.

Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
[yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes]
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
2019-02-04 14:59:12 -08:00
Kamal Heib
668aa15b5b RDMA/rxe: Improve loopback marking
Currently a packet is marked for loopback only if the source and
destination addresses equals. This is not enough when multiple gids are
present in rxe device's gid table and the traffic is from one gid to
another. Fix it by marking the packet for loopback if the destination MAC
address is equal to the source MAC address.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 15:57:49 -07:00
Jason Gunthorpe
6a8a2aa62d Linux 5.0-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxXYaEeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGkSQH/2yrfnviNPFYpZOR
 QQdc71Bfhkd8m85SmWIsSebkxmi3hKFVj15sGbWXd6+0/VxjEEGvQCZpvVwJceke
 LwDxtkKGg/74wAqJvlSAWxFNZ+Had4jDeoSoeQChddsBVXBBCxQx2v6ECg3o2x7W
 k8Z8t4+3RijDf8fYXY9ETyO2zW8R/wgT+dnl+DPgUH7u4dxh7FzAUfc4bgZIDg+i
 FzBQfbTJuz4BU7uRZ9IJiwhWKv0Iyi2DR3BY8Z1pqEpRaUMJMrCs2WGytHbTgt9e
 0EtO1airbVneU4eumU/ZaF9cyEbah9HousEPnP7J09WG4s/Odxc4zE+uK1QqS2im
 5Xv88is=
 =dVd1
 -----END PGP SIGNATURE-----

Merge tag 'v5.0-rc5' into rdma.git for-next

Linux 5.0-rc5

Needed to merge the include/uapi changes so we have an up to date
single-tree for these files. Patches already posted are also expected to
need this for dependencies.
2019-02-04 14:53:42 -07:00
Moni Shoua
52a72e2a39 IB/uverbs: Expose XRC ODP device capabilities
Expose XRC ODP capabilities as part of the extended device capabilities.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:06 -07:00
wenxu
a46c52d9f2 netfilter: nft_tunnel: Add NFTA_TUNNEL_MODE options
nft "tunnel" expr match both the tun_info of RX and TX. This patch
provide the NFTA_TUNNEL_MODE to individually match the tun_info of
RX or TX.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-04 14:39:58 +01:00
Ingo Molnar
98cb621081 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:45:42 +01:00
Greg Kroah-Hartman
1dcc3ed4a7 First set of new device support, features and cleanup for IIO in the 5.1 cycle
A number of interesting new devices supported plus a good set of staging
 cleanup including one graduation and one drop.
 
 New device support
 * ad56886
   - Add support for AD5674R/AD5679R with some minor driver changes to support
     more channels.
 * ad7768
   - New driver and dt bindings for this 24 bit ADC.
 * max44009
   - New driver and dt bindings for this ambient light sensor.
 * mpu6050
   - Support the ICM 20602 IMU. Minor tweaks due to slightly different
     register map.
 * NPCM adc
   - New driver and dt bindings for this BMC ADC.
 * Sensiron SGP30
   - Modifiers for ethanol and H2.
   - New driver and dt bindings.
   - Follow patch added self cleaning support.
 * Sensiron SPS30
   - New channel type for mass concentration.
   - New driver and bindings.
   - Minor tidy up patch followed (drop fmt specifier as unused)
 * st_pressure
   - lps22hh support. ID plus information structures and dt bindings.
 * ti-ads124s08
   - Add binding doc and driver.
 
 Staging graduations
 * ad7606 driver and bindings.
 
 Staging drops
 * ad7152 CDC driver dropped.  This part is near EoL and no one is known
   to be using it.  If anyone surfaces obviously we can bring the driver
   back.  If not, good to drop it to avoid wasting anyone's time cleaning
   it up.
 
 New features
 * bme680
   - DT support and bindings doc.
 * isl29018
   - Add regulator for VCC.
 * mag3110
   - Add regulators for supplies.
 * meson-saradc
   - Support the temperature sensors of more SoCs.
 * mma8452
   - Add regulators for power suplies and binding docs to reflect them.
 * st-accel
   - Support the undocumented but it seems fairly common _ONT ACPI method
     to specify orientation of the sensor.
 
 Cleanup, minor fixes and fixes for staging driver that have been broken a
 long time
 * ad5933
   - Drop platform data alternative to specifying the reference voltage
     using a regulator.
   - Use the clock framework to contorl the reference clock.
   - Add a DT binding doc to cover the defacto binding.
 * ad7280a
   - Split up some big functions to improve readability.
 * ad7606
   - Allow for timeout if interrupt never occurs.
   - Use devm functions to simplify probe and remove.
   - Use the find_closest macro to avoid need for precise values from
     userspace.
   - Add missing vendor prefixes for various DT properties. Note the
     driver is in staging still and there are no known devicetrees.
   - Add explict OF device ID table.
   - Simplify the Kconfig choices
   - Change to a threaded IRQ.
   - SPDX and simple stype fixes.
 * ad7816
   - Drop unnecessary variable init.
 * ad9523
   - Check a return value that was ignored.
 * ad9833
   - Drop platform data.  It was just setting most values to the hardware
     defaults.
   - Use the clock framework to provide the input clock.
 * adt7316 (lots of staging cleanup)
   - Fix some wrong register / bit definitions
   - Invert the logic of the check for an ldac pin so it actually makes sense.
   - Read the right register to get internal vref settings
   - Allow adt751x chips to use the internal vref for all DAC channels rather
     than a subset.
   - Remove dac vref bypass control from parts that don't have one.
   - Make the store DAC update mode function consistent with the show one.
   - Fix some spellings and other minor tidy up.
   - Avoid passing irq numbers around by putting all the irq logic in
     one place.
   - Fix an issue with the resolution of DAC control.
   - Fix support of the high resolution DAC mode (for temp proportional output)
     where supported.
   - Fix DAC read and write calculations.
 * st_lsm6dsx
   - Drop an unused variable (set but not read)
 * xilinx-xadc
   - Check an unhandled return value.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAlxXWXwRHGppYzIzQGtl
 cm5lbC5vcmcACgkQVIU0mcT0FogtPhAAtGoXJl9gwBywQXjykFBiN5PxBxKypq58
 El7EQY7I0P/RkIVEBHgIc9p85lREVkIZQEcQZqsO5LI63cumCGtODyJg2CAF6+oz
 EIO0GCZFvb2zLiTyPg2NSuBhsYxiLtyHfu5MGPivbDyzVKZzf8LLjwQxkHYPaLzb
 Qa9D3wigoGHuvXpC9ULGQT36kMj6XPN9JGQsCRtbUQVgPu2XOPiuLv98vEpyaCPj
 1C7D72LAEq1Z5A3rA17KpT576Yy1JGv1LkSz48xEtHlDoB2IgCkoZfx6ThTr1ehX
 eWav1NsD1IpbWKiSY19rSmjSkkbYk1WHyV+tTeTguFb6nfIA/y22kl9td8Rqjk2P
 L/O1tRHdXO8G+gtCfBhUdVmT1SQq/vkow2UBH59CllN81bsWxFEudE4VHJCBlmjr
 sgMKdb3d0aYekRis/bNQJMvT0nY2x+g2k+ey+kEnrYOZVjCx+EeNNS249CnUnFId
 ZZNGqJ2j0xK9BULqffcupCoERLIhf/9nlx5JO3n5YA5X3pdns+yzVkS/9oXpN7EK
 9A41pU9zVWYN8B21MqfN0EPG41nkY36TOuTvL2/4/U2DfAD6VIf4vEYimrkTjsU0
 n7vKqQL8DRicWDsbxP8h6u/nG0u+ZgOgY4SVzth0aT3/plCAwhWmoQpnkjVFKMBD
 HBQB0HlEg2M=
 =M03w
 -----END PGP SIGNATURE-----

Merge tag 'iio-for-5.1a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next

Jonathan writes:

First set of new device support, features and cleanup for IIO in the 5.1 cycle

A number of interesting new devices supported plus a good set of staging
cleanup including one graduation and one drop.

New device support
* ad56886
  - Add support for AD5674R/AD5679R with some minor driver changes to support
    more channels.
* ad7768
  - New driver and dt bindings for this 24 bit ADC.
* max44009
  - New driver and dt bindings for this ambient light sensor.
* mpu6050
  - Support the ICM 20602 IMU. Minor tweaks due to slightly different
    register map.
* NPCM adc
  - New driver and dt bindings for this BMC ADC.
* Sensiron SGP30
  - Modifiers for ethanol and H2.
  - New driver and dt bindings.
  - Follow patch added self cleaning support.
* Sensiron SPS30
  - New channel type for mass concentration.
  - New driver and bindings.
  - Minor tidy up patch followed (drop fmt specifier as unused)
* st_pressure
  - lps22hh support. ID plus information structures and dt bindings.
* ti-ads124s08
  - Add binding doc and driver.

Staging graduations
* ad7606 driver and bindings.

Staging drops
* ad7152 CDC driver dropped.  This part is near EoL and no one is known
  to be using it.  If anyone surfaces obviously we can bring the driver
  back.  If not, good to drop it to avoid wasting anyone's time cleaning
  it up.

New features
* bme680
  - DT support and bindings doc.
* isl29018
  - Add regulator for VCC.
* mag3110
  - Add regulators for supplies.
* meson-saradc
  - Support the temperature sensors of more SoCs.
* mma8452
  - Add regulators for power suplies and binding docs to reflect them.
* st-accel
  - Support the undocumented but it seems fairly common _ONT ACPI method
    to specify orientation of the sensor.

Cleanup, minor fixes and fixes for staging driver that have been broken a
long time
* ad5933
  - Drop platform data alternative to specifying the reference voltage
    using a regulator.
  - Use the clock framework to contorl the reference clock.
  - Add a DT binding doc to cover the defacto binding.
* ad7280a
  - Split up some big functions to improve readability.
* ad7606
  - Allow for timeout if interrupt never occurs.
  - Use devm functions to simplify probe and remove.
  - Use the find_closest macro to avoid need for precise values from
    userspace.
  - Add missing vendor prefixes for various DT properties. Note the
    driver is in staging still and there are no known devicetrees.
  - Add explict OF device ID table.
  - Simplify the Kconfig choices
  - Change to a threaded IRQ.
  - SPDX and simple stype fixes.
* ad7816
  - Drop unnecessary variable init.
* ad9523
  - Check a return value that was ignored.
* ad9833
  - Drop platform data.  It was just setting most values to the hardware
    defaults.
  - Use the clock framework to provide the input clock.
* adt7316 (lots of staging cleanup)
  - Fix some wrong register / bit definitions
  - Invert the logic of the check for an ldac pin so it actually makes sense.
  - Read the right register to get internal vref settings
  - Allow adt751x chips to use the internal vref for all DAC channels rather
    than a subset.
  - Remove dac vref bypass control from parts that don't have one.
  - Make the store DAC update mode function consistent with the show one.
  - Fix some spellings and other minor tidy up.
  - Avoid passing irq numbers around by putting all the irq logic in
    one place.
  - Fix an issue with the resolution of DAC control.
  - Fix support of the high resolution DAC mode (for temp proportional output)
    where supported.
  - Fix DAC read and write calculations.
* st_lsm6dsx
  - Drop an unused variable (set but not read)
* xilinx-xadc
  - Check an unhandled return value.

* tag 'iio-for-5.1a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (67 commits)
  iio: chemical: sps30: remove printk format specifier
  staging: iio: frequency: ad9833: Load clock using clock framework
  staging: iio: frequency: ad9833: Get frequency value statically
  dt-bindings: iio: light: Add max44009
  iio: light: add driver for MAX44009
  dt-bindings: iio: adc: Add docs for AD7768-1
  iio: adc: Add AD7768-1 ADC basic support
  staging: iio: cdc: ad7152: remove driver completely
  iio: imu: mpu6050: Add support for the ICM 20602 IMU
  dt-bindings: iio: imu: add icm20602 bindings to mpu6050
  dt-bindings: iio: pressure: add LPS22HH bindings
  iio: st_accel: use ACPI orientation data
  iio: adc: add NPCM ADC driver
  dt-binding: iio: add NPCM ADC documentation
  iio: chemical: sps30: allow changing self cleaning period
  dt-bindings: iio: chemical: Add bindings for bme680
  iio: chemical: bme680: Add device-tree support
  iio:st_pressure:initial lps22hh sensor support
  iio: accell: mma8452: add vdd/vddio regulator operation support
  dt-bindings: iio: accel: mma8452: add power supplies property
  ...
2019-02-04 07:06:36 +01:00
Jakub Kicinski
bff5731d43 net: devlink: report cell size of shared buffers
Shared buffer allocation is usually done in cell increments.
Drivers will either round up the allocation or refuse the
configuration if it's not an exact multiple of cell size.
Drivers know exactly the cell size of shared buffer, so help
out users by providing this information in dumps.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:25:34 -08:00
Deepa Dinamani
a9beb86ae6 sock: Add SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW
Add new socket timeout options that are y2038 safe.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: ccaulfie@redhat.com
Cc: davem@davemloft.net
Cc: deller@gmx.de
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: rth@twiddle.net
Cc: cluster-devel@redhat.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:31 -08:00
Deepa Dinamani
45bdc66159 socket: Rename SO_RCVTIMEO/ SO_SNDTIMEO with _OLD suffixes
SO_RCVTIMEO and SO_SNDTIMEO socket options use struct timeval
as the time format. struct timeval is not y2038 safe.
The subsequent patches in the series add support for new socket
timeout options with _NEW suffix that will use y2038 safe
data structures. Although the existing struct timeval layout
is sufficiently wide to represent timeouts, because of the way
libc will interpret time_t based on user defined flag, these
new flags provide a way of having a structure that is the same
for all architectures consistently.
Rename the existing options with _OLD suffix forms so that the
right option is enabled for userspace applications according
to the architecture and time_t definition of libc.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: ccaulfie@redhat.com
Cc: deller@gmx.de
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: rth@twiddle.net
Cc: cluster-devel@redhat.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:31 -08:00
Deepa Dinamani
9718475e69 socket: Add SO_TIMESTAMPING_NEW
Add SO_TIMESTAMPING_NEW variant of socket timestamp options.
This is the y2038 safe versions of the SO_TIMESTAMPING_OLD
for all architectures.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: chris@zankel.net
Cc: fenghua.yu@intel.com
Cc: rth@twiddle.net
Cc: tglx@linutronix.de
Cc: ubraun@linux.ibm.com
Cc: linux-alpha@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-s390@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:31 -08:00
Deepa Dinamani
887feae36a socket: Add SO_TIMESTAMP[NS]_NEW
Add SO_TIMESTAMP_NEW and SO_TIMESTAMPNS_NEW variants of
socket timestamp options.
These are the y2038 safe versions of the SO_TIMESTAMP_OLD
and SO_TIMESTAMPNS_OLD for all architectures.

Note that the format of scm_timestamping.ts[0] is not changed
in this patch.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: jejb@parisc-linux.org
Cc: ralf@linux-mips.org
Cc: rth@twiddle.net
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:31 -08:00
Deepa Dinamani
98bb03c865 socket: Add struct __kernel_sock_timeval
The new type is meant to be used as a y2038 safe structure
to be used as part of cmsg data.
Presently the SO_TIMESTAMP socket option uses struct timeval
for timestamps. This is not y2038 safe.
Subsequent patches in the series add new y2038 safe socket
option to be used in the place of SO_TIMESTAMP_OLD.
struct __kernel_sock_timeval will be used as the timestamp
format at that time.

struct __kernel_sock_timeval also maintains the same layout
across 32 bit and 64 bit ABIs.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:31 -08:00
Deepa Dinamani
bcb3fc3247 arch: sparc: Override struct __kernel_old_timeval
struct __kernel_old_timeval is supposed to have the same
layout as struct timeval. But, it was inadvarently missed
that __kernel_suseconds has a different definition for
sparc64.
Provide an asm-specific override that fixes it.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:30 -08:00
Deepa Dinamani
7f1bc6e95d sockopt: Rename SO_TIMESTAMP* to SO_TIMESTAMP*_OLD
SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING options, the
way they are currently defined, are not y2038 safe.
Subsequent patches in the series add new y2038 safe versions
of these options which provide 64 bit timestamps on all
architectures uniformly.
Hence, rename existing options with OLD tag suffixes.

Also note that kernel will not use the untagged SO_TIMESTAMP*
and SCM_TIMESTAMP* options internally anymore.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: deller@gmx.de
Cc: dhowells@redhat.com
Cc: jejb@parisc-linux.org
Cc: ralf@linux-mips.org
Cc: rth@twiddle.net
Cc: linux-afs@lists.infradead.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:30 -08:00
David S. Miller
beb73559bf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-02-01

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) introduce bpf_spin_lock, from Alexei.

2) convert xdp samples to libbpf, from Maciej.

3) skip verifier tests for unsupported program/map types, from Stanislav.

4) powerpc64 JIT support for BTF line info, from Sandipan.

5) assorted fixed, from Valdis, Jesper, Jiong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 20:12:18 -08:00
Jakub Kicinski
fc6fae7dd9 devlink: add version reporting to devlink info API
ethtool -i has a few fixed-size fields which can be used to report
firmware version and expansion ROM version. Unfortunately, modern
hardware has more firmware components. There is usually some
datapath microcode, management controller, PXE drivers, and a
CPLD load. Running ethtool -i on modern controllers reveals the
fact that vendors cram multiple values into firmware version field.

Here are some examples from systems I could lay my hands on quickly:

tg3:  "FFV20.2.17 bc 5720-v1.39"
i40e: "6.01 0x800034a4 1.1747.0"
nfp:  "0.0.3.5 0.25 sriov-2.1.16 nic"

Add a new devlink API to allow retrieving multiple versions, and
provide user-readable name for those versions.

While at it break down the versions into three categories:
 - fixed - this is the board/fixed component version, usually vendors
           report information like the board version in the PCI VPD,
           but it will benefit from naming and common API as well;
 - running - this is the running firmware version;
 - stored - this is firmware in the flash, after firmware update
            this value will reflect the flashed version, while the
            running version may only be updated after reboot.

v3:
 - add per-type helpers instead of using the special argument (Jiri).
RFCv2:
 - remove the nesting in attr DEVLINK_ATTR_INFO_VERSIONS (now
   versions are mixed with other info attrs)l
 - have the driver report versions from the same callback as
   other info.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:30 -08:00
Jakub Kicinski
f9cf22882c devlink: add device information API
ethtool -i has served us well for a long time, but its showing
its limitations more and more. The device information should
also be reported per device not per-netdev.

Lay foundation for a simple devlink-based way of reading device
info. Add driver name and device serial number as initial pieces
of information exposed via this new API.

v3:
 - rename helpers (Jiri);
 - rename driver name attr (Jiri);
 - remove double spacing in commit message (Jiri).
RFC v2:
 - wrap the skb into an opaque structure (Jiri);
 - allow the serial number of be any length (Jiri & Andrew);
 - add driver name (Jonathan).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:30 -08:00
Dave Watson
130b392c6c net: tls: Add tls 1.3 support
TLS 1.3 has minor changes from TLS 1.2 at the record layer.

* Header now hardcodes the same version and application content type in
  the header.
* The real content type is appended after the data, before encryption (or
  after decryption).
* The IV is xored with the sequence number, instead of concatinating four
  bytes of IV with the explicit IV.
* Zero-padding:  No exlicit length is given, we search backwards from the
  end of the decrypted data for the first non-zero byte, which is the
  content type.  Currently recv supports reading zero-padding, but there
  is no way for send to add zero padding.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:00:55 -08:00
Dave Watson
fb99bce712 net: tls: Support 256 bit keys
Wire up support for 256 bit keys from the setsockopt to the crypto
framework

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:00:55 -08:00
Alexei Starovoitov
96049f3afd bpf: introduce BPF_F_LOCK flag
Introduce BPF_F_LOCK flag for map_lookup and map_update syscall commands
and for map_update() helper function.
In all these cases take a lock of existing element (which was provided
in BTF description) before copying (in or out) the rest of map value.

Implementation details that are part of uapi:

Array:
The array map takes the element lock for lookup/update.

Hash:
hash map also takes the lock for lookup/update and tries to avoid the bucket lock.
If old element exists it takes the element lock and updates the element in place.
If element doesn't exist it allocates new one and inserts into hash table
while holding the bucket lock.
In rare case the hashmap has to take both the bucket lock and the element lock
to update old value in place.

Cgroup local storage:
It is similar to array. update in place and lookup are done with lock taken.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
Alexei Starovoitov
d83525ca62 bpf: introduce bpf_spin_lock
Introduce 'struct bpf_spin_lock' and bpf_spin_lock/unlock() helpers to let
bpf program serialize access to other variables.

Example:
struct hash_elem {
    int cnt;
    struct bpf_spin_lock lock;
};
struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key);
if (val) {
    bpf_spin_lock(&val->lock);
    val->cnt++;
    bpf_spin_unlock(&val->lock);
}

Restrictions and safety checks:
- bpf_spin_lock is only allowed inside HASH and ARRAY maps.
- BTF description of the map is mandatory for safety analysis.
- bpf program can take one bpf_spin_lock at a time, since two or more can
  cause dead locks.
- only one 'struct bpf_spin_lock' is allowed per map element.
  It drastically simplifies implementation yet allows bpf program to use
  any number of bpf_spin_locks.
- when bpf_spin_lock is taken the calls (either bpf2bpf or helpers) are not allowed.
- bpf program must bpf_spin_unlock() before return.
- bpf program can access 'struct bpf_spin_lock' only via
  bpf_spin_lock()/bpf_spin_unlock() helpers.
- load/store into 'struct bpf_spin_lock lock;' field is not allowed.
- to use bpf_spin_lock() helper the BTF description of map value must be
  a struct and have 'struct bpf_spin_lock anyname;' field at the top level.
  Nested lock inside another struct is not allowed.
- syscall map_lookup doesn't copy bpf_spin_lock field to user space.
- syscall map_update and program map_update do not update bpf_spin_lock field.
- bpf_spin_lock cannot be on the stack or inside networking packet.
  bpf_spin_lock can only be inside HASH or ARRAY map value.
- bpf_spin_lock is available to root only and to all program types.
- bpf_spin_lock is not allowed in inner maps of map-in-map.
- ld_abs is not allowed inside spin_lock-ed region.
- tracing progs and socket filter progs cannot use bpf_spin_lock due to
  insufficient preemption checks

Implementation details:
- cgroup-bpf class of programs can nest with xdp/tc programs.
  Hence bpf_spin_lock is equivalent to spin_lock_irqsave.
  Other solutions to avoid nested bpf_spin_lock are possible.
  Like making sure that all networking progs run with softirq disabled.
  spin_lock_irqsave is the simplest and doesn't add overhead to the
  programs that don't use it.
- arch_spinlock_t is used when its implemented as queued_spin_lock
- archs can force their own arch_spinlock_t
- on architectures where queued_spin_lock is not available and
  sizeof(arch_spinlock_t) != sizeof(__u32) trivial lock is used.
- presence of bpf_spin_lock inside map value could have been indicated via
  extra flag during map_create, but specifying it via BTF is cleaner.
  It provides introspection for map key/value and reduces user mistakes.

Next steps:
- allow bpf_spin_lock in other map types (like cgroup local storage)
- introduce BPF_F_LOCK flag for bpf_map_update() syscall and helper
  to request kernel to grab bpf_spin_lock before rewriting the value.
  That will serialize access to map elements.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:38 +01:00
David S. Miller
d3a5fd3c98 This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
 
  - Add DHCPACKs for DAT snooping, by Linus Luessing
 
  - Update copyright years for 2019, by Sven Eckelmann
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlxUKp8WHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoc1rD/oC39wZl/ZdubJogpGWI3G9SdU8
 SxbjnaCVCkLMsLJH/wv/OM9OWwHa2OhvOzmMO5Zq47dy5SYXqBMkvv0QOMJcUQqz
 1nPMssTVo9gQLrJiPiP+yXGZ5+sc4dsLCiykFsCs7YRhqeStL1QTZkAOEKfy0K+7
 6G5OQsLT62aY+1OWLbb5oJaB9nlhUY+GyuQZ213jNuYxP7I6MyM9FfMHokASMLn8
 H58fIvDyGkC0PXvYZiJedlnFBU92TPEnBAV/afJ8egcmYQw9jkWL3cbS5ZzqDG4m
 49p9/Xmt2ARsf4UMDxQTEE3elw3tu1PZGPSecTmU+rRzHyYHIIYWnFgTZWmK7/zU
 TKQMlrPx4ky8HOyIY6/5AHNR7x5muchgxf0ft+4Jf0Bf+rGFIgdqfAIeQliUNAUc
 IW+HC0c1SEU/519a6z1V/ARrC6W4qk8aBZ0G4zyx+76KLvxlyvgjEo3XNasyNJnY
 GpHHhpyIeY5xeNOsGmoVrQraJRMqwr4jnWdcmf1LS8o9loB6X7bij8SRKUsEwNi2
 AkIK2sojUf6c4YZW/GVqIgvPuGtZL/Sy9FHx7Ve5f1NRuxpcMStSVV+daYqHYEe6
 72/WYF+oJc4fCR3zAXc67wVoyuPVPxOKpwh2uJUeVbvdvHZ0+dpOV740ktdFjIWQ
 3XAGhl/dSrn7OXYJqA==
 =G6IW
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-for-davem-20190201' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - Add DHCPACKs for DAT snooping, by Linus Luessing

 - Update copyright years for 2019, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 11:04:13 -08:00
Linus Torvalds
5eeb63359b Second 5.0 rc pull request
Still not much going on, the usual set of OOP's and driver updates this time.
 
 - Fix two uapi breakage regressions in mlx5 drivers
 
 - Various oops fixes in hfi1, mlx4, umem, uverbs, and ipoib
 
 - A protocol bug fix for hfi1 preventing it from implementing the verbs
   API properly, and a compatability fix for EXEC STACK user programs
 
 - Fix missed refcounting in the 'advise_mr' patches merged this
   cycle.
 
 - Fix wrong use of the uABI in the hns SRQ patches merged this cycle.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlxUguAACgkQOG33FX4g
 mxq93Q/+N7SyMjClN2/4Wz+vY1dAAOneshVdg+vsoe+S3sZG77xTAps1NX9nJ89o
 VXc4RyTCQXIirN1tiqXtjvbcrrPvxpDhX9OOo1qE/a1LfnUAtNM2om9IrxOQLxjN
 suYCk9WihilePS6WEtbZNeR3cP496Npm5N/uxVibPNCZghdxIZCa+PXAhosf0vCH
 aslOgQnQXG1hRc4XjyxMa9ltT3R967wYHnhIpuPe9psH3HcB+wrLzusyBYUMYu4N
 B2v9dyBNsuHlwTldF4b2ELD7mAJUMSk8X+x54a7RGQe5kgxD1BRjyOol67WwpNnl
 JZTwTwluiVSEZ5V63dj2AkXKMo1695Q1tuMftGoEipaHopUXOTb70MOH9TDR1jbX
 /jUjMUNJtafHhNBAVtTd04nX1fOsKp4GYVImoKPPRXPaQFoF5jXvqyGUuKZ9DAYb
 zbnLKMaJOCu2JcB8sHHUfjgiGtJqtf73V0a4O3fq+rTSkY6UDnEgYF7j7GY1X+5N
 E478aWw1JYVTIyUgiRFguJcBgJsY7hjhIS9fJaRD9aXuwo9Lg7bdpDUogfpcERA3
 iNZFEyx8h2fBAb8DojYIBzRgbEtZkIYxe18jILK3SMjGQ0PfB9GGAl3JAWDOQIGx
 uphV0WM/MKf31SeY+zD9dh9Z6ew4oTUNv3upRAzLDmkRe3J07j8=
 =LxVe
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Still not much going on, the usual set of oops and driver fixes this
  time:

   - Fix two uapi breakage regressions in mlx5 drivers

   - Various oops fixes in hfi1, mlx4, umem, uverbs, and ipoib

   - A protocol bug fix for hfi1 preventing it from implementing the
     verbs API properly, and a compatability fix for EXEC STACK user
     programs

   - Fix missed refcounting in the 'advise_mr' patches merged this
     cycle.

   - Fix wrong use of the uABI in the hns SRQ patches merged this cycle"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/uverbs: Fix OOPs in uverbs_user_mmap_disassociate
  IB/ipoib: Fix for use-after-free in ipoib_cm_tx_start
  IB/uverbs: Fix ioctl query port to consider device disassociation
  RDMA/mlx5: Fix flow creation on representors
  IB/uverbs: Fix OOPs upon device disassociation
  RDMA/umem: Add missing initialization of owning_mm
  RDMA/hns: Update the kernel header file of hns
  IB/mlx5: Fix how advise_mr() launches async work
  RDMA/device: Expose ib_device_try_get(()
  IB/hfi1: Add limit test for RC/UC send via loopback
  IB/hfi1: Remove overly conservative VM_EXEC flag check
  IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM
  IB/mlx4: Fix using wrong function to destroy sqp AHs under SRIOV
  RDMA/mlx5: Fix check for supported user flags when creating a QP
2019-02-01 10:39:24 -08:00
Kristina Martsenko
d0a060be57 arm64: add ptrace regsets for ptrauth key management
Add two new ptrace regsets, which can be used to request and change the
pointer authentication keys of a thread. NT_ARM_PACA_KEYS gives access
to the instruction/data address keys, and NT_ARM_PACG_KEYS to the
generic authentication key. The keys are also part of the core dump file
of the process.

The regsets are only exposed if the kernel is compiled with
CONFIG_CHECKPOINT_RESTORE=y, as the only intended use case is
checkpointing and restoring processes that are using pointer
authentication. (This can be changed later if there are other use
cases.)

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-01 13:56:58 +00:00
Johannes Berg
752cfee90d Merge remote-tracking branch 'net-next/master' into mac80211-next
Merge net-next so that we get the changes from net, which would
otherwise conflict with the NLA_POLICY_NESTED/_ARRAY changes.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:05:35 +01:00
Dave Airlie
74b7d6a913 Merge branch 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld into drm-next
This pull includes the new Arm "komeda" DRM driver. It is currently hosted
in the same repo as the other "mali-dp" driver because it is the next
iteration of the IP.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Liviu Dudau <Liviu.Dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190131173600.GN25147@e110455-lin.cambridge.arm.com
2019-02-01 10:01:50 +10:00
Xin Long
80df2704a3 sctp: introduce SCTP_FUTURE/CURRENT/ALL_ASSOC
This patch is to add 3 constants SCTP_FUTURE_ASSOC,
SCTP_CURRENT_ASSOC and SCTP_ALL_ASSOC for reserved
assoc_ids, as defined in rfc6458#section-7.2.

And add the process for them when doing lookup and
inserting in sctp_id2assoc and sctp_assoc_set_id.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 00:44:06 -08:00
Thierry Reding
2d908b38d4 serial: Add Tegra Combined UART driver
The Tegra Combined UART (TCU) is a mailbox-based mechanism that allows
multiplexing multiple "virtual UARTs" into a single hardware serial
port. The TCU is the primary serial port on Tegra194 devices.

Add a TCU driver utilizing the mailbox framework, as the used mailboxes
are part of Tegra HSP blocks that are already controlled by the Tegra
HSP mailbox driver.

Based on work by  Mikko Perttunen <mperttunen@nvidia.com>.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30 09:31:31 +01:00
Vasundhara Volam
c1e5786d67 devlink: Add devlink notifications support for port params
Add notification call for devlink port param set, register and unregister
functions.
Add devlink_port_param_value_changed() function to enable the driver notify
devlink on value change. Driver should use this function after value was
changed on any configuration mode part to driverinit.

v7->v8:
Order devlink_port_param_value_changed() definitions followed by
devlink_param_value_changed()

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29 22:13:09 -08:00
Vasundhara Volam
9c54873b4e devlink: Add port param set command
Add port param set command to set the value for a parameter.
Value can be set to any of the supported configuration modes.

v7->v8: Append "Acked-by: Jiri Pirko <jiri@mellanox.com>"

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29 22:13:09 -08:00
Vasundhara Volam
f4601dee25 devlink: Add port param get command
Add port param get command which gets data per parameter.
It also has option to dump the parameters data per port.

v7->v8: Append "Acked-by: Jiri Pirko <jiri@mellanox.com>"

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29 22:13:08 -08:00
Dexuan Cui
1194c41331 nfit: Add Hyper-V NVDIMM DSM command set to white list
Add the Hyper-V _DSM command set to the white list of NVDIMM command
sets.

This command set is documented at http://www.uefi.org/RFIC_LIST
(see "Virtual NVDIMM 0x1901").

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-01-29 22:09:31 -08:00
Waiman Long
71368af902 x86/speculation: Add PR_SPEC_DISABLE_NOEXEC
With the default SPEC_STORE_BYPASS_SECCOMP/SPEC_STORE_BYPASS_PRCTL mode,
the TIF_SSBD bit will be inherited when a new task is fork'ed or cloned.
It will also remain when a new program is execve'ed.

Only certain class of applications (like Java) that can run on behalf of
multiple users on a single thread will require disabling speculative store
bypass for security purposes. Those applications will call prctl(2) at
startup time to disable SSB. They won't rely on the fact the SSB might have
been disabled. Other applications that don't need SSBD will just move on
without checking if SSBD has been turned on or not.

The fact that the TIF_SSBD is inherited across execve(2) boundary will
cause performance of applications that don't need SSBD but their
predecessors have SSBD on to be unwittingly impacted especially if they
write to memory a lot.

To remedy this problem, a new PR_SPEC_DISABLE_NOEXEC argument for the
PR_SET_SPECULATION_CTRL option of prctl(2) is added to allow applications
to specify that the SSBD feature bit on the task structure should be
cleared whenever a new program is being execve'ed.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: https://lkml.kernel.org/r/1547676096-3281-1-git-send-email-longman@redhat.com
2019-01-29 22:11:49 +01:00
Yishai Hadas
a124edba26 IB/mlx5: Introduce async DEVX obj query API
Introduce async DEVX obj query API to get the command response back to
user space once it's ready without blocking when calling the firmware.

The event's data includes a header with some meta data then the firmware
output command data.

The header includes:
- The input 'wr_id' to let application recognizing the response.

The input FD attribute is used to have the event data ready on.
Downstream patches from this series will implement the file ops to let
application read it.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29 13:33:00 -07:00
Yishai Hadas
6bf8f22aea IB/mlx5: Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD
Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD and its initial implementation.

This object is from type class FD and will be used to read DEVX async
commands completion.

The core layer should allow the driver to set object from type FD in a
safe mode, this option was added with a matching comment in place.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29 13:32:43 -07:00
David S. Miller
ec7146db15 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Teach verifier dead code removal, this also allows for optimizing /
   removing conditional branches around dead code and to shrink the
   resulting image. Code store constrained architectures like nfp would
   have hard time doing this at JIT level, from Jakub.

2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
   code generation for 32-bit sub-registers. Evaluation shows that this
   can result in code reduction of ~5-20% compared to 64 bit-only code
   generation. Also add implementation for most JITs, from Jiong.

3) Add support for __int128 types in BTF which is also needed for
   vmlinux's BTF conversion to work, from Yonghong.

4) Add a new command to bpftool in order to dump a list of BPF-related
   parameters from the system or for a specific network device e.g. in
   terms of available prog/map types or helper functions, from Quentin.

5) Add AF_XDP sock_diag interface for querying sockets from user
   space which provides information about the RX/TX/fill/completion
   rings, umem, memory usage etc, from Björn.

6) Add skb context access for skb_shared_info->gso_segs field, from Eric.

7) Add support for testing flow dissector BPF programs by extending
   existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.

8) Split BPF kselftest's test_verifier into various subgroups of tests
   in order better deal with merge conflicts in this area, from Jakub.

9) Add support for queue/stack manipulations in bpftool, from Stanislav.

10) Document BTF, from Yonghong.

11) Dump supported ELF section names in libbpf on program load
    failure, from Taeung.

12) Silence a false positive compiler warning in verifier's BTF
    handling, from Peter.

13) Fix help string in bpftool's feature probing, from Prashant.

14) Remove duplicate includes in BPF kselftests, from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 19:38:33 -08:00
David S. Miller
343917b410 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next tree:

1) Introduce a hashtable to speed up object lookups, from Florian Westphal.

2) Make direct calls to built-in extension, also from Florian.

3) Call helper before confirming the conntrack as it used to be originally,
   from Florian.

4) Call request_module() to autoload br_netfilter when physdev is used
   to relax the dependency, also from Florian.

5) Allow to insert rules at a given position ID that is internal to the
   batch, from Phil Sutter.

6) Several patches to replace conntrack indirections by direct calls,
   and to reduce modularization, from Florian. This also includes
   several follow up patches to deal with minor fallout from this
   rework.

7) Use RCU from conntrack gre helper, from Florian.

8) GRE conntrack module becomes built-in into nf_conntrack, from Florian.

9) Replace nf_ct_invert_tuplepr() by calls to nf_ct_invert_tuple(),
   from Florian.

10) Unify sysctl handling at the core of nf_conntrack, from Florian.

11) Provide modparam to register conntrack hooks.

12) Allow to match on the interface kind string, from wenxu.

13) Remove several exported symbols, not required anymore now after
    a bit of de-modulatization work has been done, from Florian.

14) Remove built-in map support in the hash extension, this can be
    done with the existing userspace infrastructure, from laura.

15) Remove indirection to calculate checksums in IPVS, from Matteo Croce.

16) Use call wrappers for indirection in IPVS, also from Matteo.

17) Remove superfluous __percpu parameter in nft_counter, patch from
    Luc Van Oostenryck.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 17:34:38 -08:00
Greg Kroah-Hartman
fdddcfd9c9 Merge 5.0-rc4 into char-misc-next
We need the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-28 08:13:52 +01:00
David S. Miller
1d68101367 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-01-27 10:43:17 -08:00
Linus Torvalds
78e372e650 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "A fixup for the input_event fix for y2038 Sparc64, and couple other
  minor fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: input_event - fix the CONFIG_SPARC64 mixup
  Input: olpc_apsp - assign priv->dev earlier
  Input: uinput - fix undefined behavior in uinput_validate_absinfo()
  Input: raspberrypi-ts - fix link error
  Input: xpad - add support for SteelSeries Stratus Duo
  Input: input_event - provide override for sparc64
2019-01-27 09:07:03 -08:00
Jiong Wang
d405c7407a bpf: allocate 0x06 to new eBPF instruction class JMP32
The new eBPF instruction class JMP32 uses the reserved class number 0x6.
Kernel BPF ISA documentation updated accordingly.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-26 13:33:01 -08:00
Linus Torvalds
6b8f915916 for-linus-20190125
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlxLdgsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpoVGD/4sGYQqfiXogQIJYbPH2RRPrJuLIIITjiAv
 lPXX1wx/tvz/ktwKJE2OiWTES0JjdH1HlC+7/0L/fLb8DXiKBmFuUHlwhureoL9Y
 o//BIQKuaje35kyHITTy2UAJOOqnNtJTaAP2AfkL+eOcj/V/G5rJIfLGs9QtuAR7
 sRJ+uhg1EbW/+uO0bULDmG6WUjxFu8mqcw3i6g0VVLVOnXB2EKcZTl3KPrdAXrUp
 XtmouERga6OfAUSJyZmTSV136mL+opRB2WFFVeIzjQfLmyItDGbSX/YPS8oJ2pow
 v7630F+CMrd4aKpqqtnAhfWpGqd0Xw7cYfZ9MKTJmZPmGzf9a1fQFpmgZosD4Dh3
 7MrhboU4TUt9PdXESA7CmE7LkTp99ghfj5/ysKrSV5h3HsH2RbLxJk91Rx3vmAWD
 u1xWRYL+GYLH6ZwOLvM1iqBrrLN3mUyrx98SaMgoXuqNzmQmgz9LPeA0Gt09FJbo
 uj+ebg4dRwuThjni4xQhl3zL2RQy7nlTDFDdKOz/XoiYk2NUVksss+sxGjNarHj0
 b5pCD4HOp57OreGExaOARpBRah5HSNdQtBRsIOnbyEq6f/e1LsIY23Z9nNF0deGO
 sZzgsbnsn+zg8bC6T/Gk4UY6XdCcgaS3SL04SVKAE3lO6A4C/Awo8DgD9Bl1zpC1
 HQlNkl5fBg==
 =iucY
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190125' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A collection of fixes for this release. This contains:

   - Silence sparse rightfully complaining about non-static wbt
     functions (Bart)

   - Fixes for the zoned comments/ioctl documentation (Damien)

   - direct-io fix that's been lingering for a while (Ernesto)

   - cgroup writeback fix (Tejun)

   - Set of NVMe patches for nvme-rdma/tcp (Sagi, Hannes, Raju)

   - Block recursion tracking fix (Ming)

   - Fix debugfs command flag naming for a few flags (Jianchao)"

* tag 'for-linus-20190125' of git://git.kernel.dk/linux-block:
  block: Fix comment typo
  uapi: fix ioctl documentation
  blk-wbt: Declare local functions static
  blk-mq: fix the cmd_flag_name array
  nvme-multipath: drop optimization for static ANA group IDs
  nvmet-rdma: fix null dereference under heavy load
  nvme-rdma: rework queue maps handling
  nvme-tcp: fix timeout handler
  nvme-rdma: fix timeout handler
  writeback: synchronize sync(2) against cgroup writeback membership switches
  block: cover another queue enter recursion via BIO_QUEUE_ENTERED
  direct-io: allow direct writes to empty inodes
2019-01-26 12:42:41 -08:00
Linus Torvalds
d488bd21a4 Char/Misc driver fixes for 5.0-rc4
Here are some small char and misc driver fixes to resolve some reported
 issues, as well as a number of binderfs fixups that were found after
 auditing the filesystem code by Al Viro.  As binderfs hasn't been in a
 previous release yet, it's good to get these in now before the first
 users show up.
 
 All of these have been in linux-next for a bit with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXEr/Iw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymLqACgsYcCs0r/RMqXfvMqJ7beUGq02ioAoNPD0hQh
 Z76nfI+21TiuXx24JCfZ
 =Fo1U
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small char and misc driver fixes to resolve some
  reported issues, as well as a number of binderfs fixups that were
  found after auditing the filesystem code by Al Viro. As binderfs
  hasn't been in a previous release yet, it's good to get these in now
  before the first users show up.

  All of these have been in linux-next for a bit with no reported
  issues"

* tag 'char-misc-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (26 commits)
  i3c: master: Fix an error checking typo in 'cdns_i3c_master_probe()'
  binderfs: switch from d_add() to d_instantiate()
  binderfs: drop lock in binderfs_binder_ctl_create
  binderfs: kill_litter_super() before cleanup
  binderfs: rework binderfs_binder_device_create()
  binderfs: rework binderfs_fill_super()
  binderfs: prevent renaming the control dentry
  binderfs: remove outdated comment
  binderfs: use __u32 for device numbers
  binderfs: use correct include guards in header
  misc: pvpanic: fix warning implicit declaration
  char/mwave: fix potential Spectre v1 vulnerability
  misc: ibmvsm: Fix potential NULL pointer dereference
  binderfs: fix error return code in binderfs_fill_super()
  mei: me: add denverton innovation engine device IDs
  mei: me: mark LBG devices as having dma support
  mei: dma: silent the reject message
  binderfs: handle !CONFIG_IPC_NS builds
  binderfs: reserve devices for initial mount
  binderfs: rename header to binderfs.h
  ...
2019-01-25 13:03:34 -10:00
Liangwei Dong
6c900360e7 nl80211: Allow set/del pmksa operations for AP
Host drivers may offload authentication to the user space
through the commit ("cfg80211: Authentication offload to
user space in AP mode").

This interface can be used to implement SAE by having the
userspace do authentication/PMKID key derivation and driver
handle the association.

A step ahead, this interface can get further optimized if the
PMKID is passed to the host driver and also have it respond to
the association request by the STA on a valid PMKID.

This commit enables the userspace to pass the PMKID to the host
drivers through the set/del pmksa operations in AP mode.

Set/Del pmksa is now restricted to STA/P2P client mode only and
thus the drivers might not expect them in any other(AP) mode.

This commit also introduces a feature flag
NL80211_EXT_FEATURE_AP_PMKSA_CACHING (johannes: renamed) to
maintain the backward compatibility of such an expectation by
the host drivers. These operations are allowed in AP mode only
when the drivers advertize the capability through this flag.

Signed-off-by: Liangwei Dong <liangwei@codeaurora.org>
Signed-off-by: Srinivas Dasari <dasaris@codeaurora.org>
[rename flag to NL80211_EXT_FEATURE_AP_PMKSA_CACHING]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-01-25 21:09:21 +01:00
Srinivas Dasari
fe4943702c cfg80211: Authentication offload to user space in AP mode
commit 40cbfa9021 ("cfg80211/nl80211: Optional authentication
offload to userspace")' introduced authentication offload to user
space by the host drivers in station mode. This commit extends
the same for the AP mode too.

Extend NL80211_ATTR_EXTERNAL_AUTH_SUPPORT to also claim the
support of external authentication from the user space in AP mode.
A new flag parameter is introduced in cfg80211_ap_settings to
intend the same while "start ap".

Host driver to use NL80211_CMD_FRAME interface to transmit and
receive the authentication frames to / from the user space.

Host driver to indicate the flag NL80211_RXMGMT_FLAG_EXTERNAL_AUTH
while sending the authentication frame to the user space. This
intends to the user space that the driver wishes it to process
the authentication frame for certain protocols, though it had
initially advertised the support for SME functionality.

User space shall accordingly do the authentication and indicate
its final status through the command NL80211_CMD_EXTERNAL_AUTH.
Allow the command even if userspace doesn't include the attribute
NL80211_ATTR_SSID for AP interface.

Host driver shall continue with the association sequence and
indicate the STA connection status through cfg80211_new_sta.

To facilitate the host drivers in AP mode for matching the pmkid
by the stations during the association, NL80211_CMD_EXTERNAL_AUTH
is also enhanced to include the pmkid to drivers after
the authentication.
This pmkid can also be used in the STA mode to include in the
association request.

Also modify nl80211_external_auth to not mandate SSID in AP mode.

Signed-off-by: Srinivas Dasari <dasaris@codeaurora.org>
[remove useless nla_get_flag() usage]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-01-25 21:08:05 +01:00
David S. Miller
30e5c2c6bf net: Revert devlink health changes.
This reverts the devlink health changes from 9/17/2019,
Jiri wants things to be designed differently and it was
agreed that the easiest way to do this is start from the
beginning again.

Commits reverted:

cb5ccfbe73
880ee82f03
c7af343b4e
ff253fedab
6f9d56132e
fcd852c69d
8a66704a13
12bd0dcefe
aba25279c1
ce019faa70
b8c45a033a

And the follow-on build fix:

o33a0efa4baecd689da9474ce0e8b673eb6931c60

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25 10:53:23 -08:00
Lijun Ou
9d9d4ff788 RDMA/hns: Update the kernel header file of hns
The hns_roce_ib_create_srq_resp is used to interact with the user for
data, this was open coded to use a u32 directly, instead use a properly
sized structure.

Fixes: c7bcb13442 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-25 09:55:48 -07:00
Arnd Bergmann
0d6040d468 arch: add split IPC system calls where needed
The IPC system call handling is highly inconsistent across architectures,
some use sys_ipc, some use separate calls, and some use both.  We also
have some architectures that require passing IPC_64 in the flags, and
others that set it implicitly.

For the addition of a y2038 safe semtimedop() system call, I chose to only
support the separate entry points, but that requires first supporting
the regular ones with their own syscall numbers.

The IPC_64 is now implied by the new semctl/shmctl/msgctl system
calls even on the architectures that require passing it with the ipc()
multiplexer.

I'm not adding the new semtimedop() or semop() on 32-bit architectures,
those will get implemented using the new semtimedop_time64() version
that gets added along with the other time64 calls.
Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-01-25 17:22:50 +01:00
Veerendranath Jakkam
ab4dfa2053 cfg80211: Allow drivers to advertise supported AKM suites
There was no such capability advertisement from the driver and thus the
current user space has to assume the driver to support all the AKMs. While
that may be the case with some drivers (e.g., mac80211-based ones), there
are cfg80211-based drivers that implement SME and have constraints on
which AKMs can be supported (e.g., such drivers may need an update to
support SAE AKM using NL80211_CMD_EXTERNAL_AUTH). Allow such drivers to
advertise the exact set of supported AKMs so that user space tools can
determine what network profile options should be allowed to be configured.

Signed-off-by: Veerendranath Jakkam <vjakkam@codeaurora.org>
[pmsr data might be big, start a new netlink message section]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-01-25 14:05:31 +01:00