Commit Graph

549360 Commits

Author SHA1 Message Date
Linus Torvalds
a4c4c49a61 Power management and ACPI fixes for v4.3-rc6
- Fix a regression introduced by a recent ACPICA cleanup that
    uncovered a latent bug (Lv Zheng).
 
  - Fix a recent regression in the generic power domains framework
    that may cause it to violate PM QoS latency constraints in some
    cases (Ulf Hansson).
 
  - Fix an intel_pstate driver crash on the Knights Landing chips
    that do not update the MPERF counter as often as expected by the
    driver which may result in a divide by 0 (Srinivas Pandruvada).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJWIPwyAAoJEILEb/54YlRx4OYP/ivVGcwUYVLbb9c7ePX220pb
 QlkwQH/uTh1tSwSCNq2X4e6hgPdVKYx3/Buc0t89bVFwV+upqUcfNMfj0GWvXO3e
 2fLgb/27Zg3ZKBe8XqeHbibN7HQ9Wz/s9rJRLVCWLUY/5WHRCyyO2JfELjuX4pNq
 tJTH7WIFIJ3riuV97DfO55ZrpKL2msnFvItv+pKVuuVWJMJKpHg+NQKQ5UWegDed
 IYaT1FlvVwGwlrvGvUNi8qWTwrziNDKd460qrOZFLTTdMwg4UZPI991euK6ZEot7
 refZPtrDIrPBD3tPOWjxERVuleIL+Bw7sq+JKUX9IdM4m69UcAOz63oXRLYm/ilV
 nAvo6eDFc37FVQHGv1JzzbZIyvZOeFMfvRN3R06Qfm9Kdu2wjjTd/fYd63IabHe+
 HwpgZbEhGRarwZ3VOJzQrLaa/gMTltpRKEiMQeHSnUmSCVJiKK/q5b9RBFfAOpfa
 wIlaFxsx1GC8QBatL4I8X2M0w9UQpY4N48NZX+FTRx1zCEkocugHbn6vIHW7USVu
 5mBoghdnPcfZJS66cSPa508LZOdfhw4vB8jQXzlG+v1PBdw7ISf6wnEyTawYMpiB
 pnIciv18WYTk/MmVdkf97TLHMbuiMVkyOaSZ1SpdE+leN8dGOhCscBb/P088ellk
 A9N4dzCeaSo6uHY6OMhZ
 =+ku3
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These fix two recent regressions (ACPICA, the generic power domains
  framework) and one crash that may happen on specific hardware
  supported since 4.1 (intel_pstate).

  Specifics:

   - Fix a regression introduced by a recent ACPICA cleanup that
     uncovered a latent bug (Lv Zheng).

   - Fix a recent regression in the generic power domains framework that
     may cause it to violate PM QoS latency constraints in some cases
     (Ulf Hansson).

   - Fix an intel_pstate driver crash on the Knights Landing chips that
     do not update the MPERF counter as often as expected by the driver
     which may result in a divide by 0 (Srinivas Pandruvada)"

* tag 'pm+acpi-4.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Fix divide by zero on Knights Landing (KNL)
  ACPICA: Tables: Fix FADT dependency regression
  PM / Domains: Fix validation of latency constraints in genpd governor
2015-10-16 12:25:54 -07:00
Linus Torvalds
8b7b56f37b Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Nothing too crazy or exciting:

   - two MAINTAINERS entries that I didn't see the point in delaying.
   - one drm mst fix to stop sending uninitialised data to monitors
   - two amdgpu fixes
   - one radeon mst tiling fix
   - one vmwgfx regression fix
   - one virtio warning fix.

  I have found one locking problem that needs a bit of reorg to fix, but
  I'm not sure it's worth putting in -fixes as I don't think we've seen
  it hit in the real world ever, I just found it using the virtio-gpu
  driver when working on it.  I'll possibly send it next week once I've
  time to discuss with Daniel"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/virtio: use %llu format string form atomic64_t
  MAINTAINERS: Add myself as maintainer for the gma500 driver
  MAINTAINERS: add a maintainer for the atmel-hlcdc DRM driver
  drm/amdgpu: Keep the pflip interrupts always enabled v7
  drm/amdgpu: adjust default dispclk (v2)
  drm/dp/mst: make mst i2c transfer code more robust.
  drm/radeon: attach tile property to mst connector
  drm/vmwgfx: Fix kernel NULL pointer dereference on older hardware
2015-10-16 12:19:11 -07:00
Linus Torvalds
ebb65c81e1 powerpc fixes for 4.3 #3
- Re-enable CONFIG_SCSI_DH in our defconfigs
  - Remove unused os_area_db_id_video_mode
  - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew
  - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew
  - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew
  - cxl: Workaround malformed pcie packets on some cards from Philippe
  - cxl: Fix number of allocated pages in SPA from Christophe Lombard
  - Fix checkstop in native_hpte_clear() with lockdep from Cyril
  - Panic on unhandled Machine Check on powernv from Daniel
  - selftests/powerpc: Fix build failure of load_unaligned_zeropad test
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWIM0xAAoJEFHr6jzI4aWAsCIP/04uAiPCqWOwHjr8/eAlNAmJ
 GaA6b91QUUpBlyXgzYZShS/FQEnyukbGTUzaS3KwijOdRJtCHxvl2eG7pOCws+GS
 2YeA9mBm7MgYT0BJ+KLGCgrF5C/sc+LN3udO9Kf1LimLpp+fIILHgEmhrfy00wUp
 f7tJ/Rvpt23PmcCDX0PhA7NuOrRu5hQOQ9rsqJfzc7XObZAG1AfISPgALgaeAINc
 XqQfWiNFLmDJyhV9K39rUXSTvHYl6pPnfDj4GelfjQD2l/csH0M4MeGW2tHNkgVy
 CakLWOP3zdZVTYTcB8wypnoZxATPhEsHehJmQ4fu3n0WR1vHfCqh4rFZuPaaX0NG
 P3In0eOV285RIpNLcwkchN+07Ops1Fvi5XonaQpgHCcI9c4H7IAGPbQau2DhR9sU
 DyZQ+/6wNzpXbM7llM3VyTA2zvvyiuEzuIZI78XWexO/Ny6TCItRtEqJEXMA+ChX
 lKbLluRnQcnn5sizK0yj4mtkffAbu7Za1KGl1nm1Q/5pBQWsC40wFcRLNNdzqVmH
 7tSp8cIEYunCYKy5bAheWJTzpUgGD55EEcUkQFHVm5LKBXyA73qJRSMuLZqtnB3z
 g6eTiEKhZvVFedNMDNFnNWrvOnd8JpyjGLRAbqgwMhN+lgVvmwwSSB6V2SefMnuL
 HCSGqR40vPA9bH0Cz/ND
 =3ze+
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Re-enable CONFIG_SCSI_DH in our defconfigs
 - Remove unused os_area_db_id_video_mode
 - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew
 - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew
 - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew
 - cxl: Workaround malformed pcie packets on some cards from Philippe
 - cxl: Fix number of allocated pages in SPA from Christophe Lombard
 - Fix checkstop in native_hpte_clear() with lockdep from Cyril
 - Panic on unhandled Machine Check on powernv from Daniel
 - selftests/powerpc: Fix build failure of load_unaligned_zeropad test

* tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Fix build failure of load_unaligned_zeropad test
  powerpc/powernv: Panic on unhandled Machine Check
  powerpc: Fix checkstop in native_hpte_clear() with lockdep
  cxl: Fix number of allocated pages in SPA
  cxl: Workaround malformed pcie packets on some cards
  cxl: fix leak of ctx->mapping when releasing kernel API contexts
  cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API
  cxl: fix leak of IRQ names in cxl_free_afu_irqs()
  powerpc/ps3: Remove unused os_area_db_id_video_mode
  powerpc/configs: Re-enable CONFIG_SCSI_DH
2015-10-16 12:07:43 -07:00
Linus Torvalds
3d875182d7 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "6 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  sh: add copy_user_page() alias for __copy_user()
  lib/Kconfig: ZLIB_DEFLATE must select BITREVERSE
  mm, dax: fix DAX deadlocks
  memcg: convert threshold to bytes
  builddeb: remove debian/files before build
  mm, fs: obey gfp_mapping for add_to_page_cache()
2015-10-16 11:42:37 -07:00
Ross Zwisler
934ed25ea5 sh: add copy_user_page() alias for __copy_user()
copy_user_page() is needed by DAX.  Without this we get a compile error
for DAX on SH:

  fs/dax.c:280:2: error: implicit declaration of function `copy_user_page' [-Werror=implicit-function-declaration]
    copy_user_page(vto, (void __force *)vfrom, vaddr, to);
      ^

This was done with a random config that happened to include DAX support.

This patch has only been compile tested.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Andrew Morton
1fd4e5c347 lib/Kconfig: ZLIB_DEFLATE must select BITREVERSE
lib/built-in.o: In function `__bitrev32':
deftree.c:(.text+0x1e799): undefined reference to `byte_rev_table'
deftree.c:(.text+0x1e7a0): undefined reference to `byte_rev_table'
deftree.c:(.text+0x1e7b4): undefined reference to `byte_rev_table'
deftree.c:(.text+0x1e7c1): undefined reference to `byte_rev_table'

Anything which uses bitrevX() has to select BITREVERSE, to grab
lib/bitrev.o.

Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Ross Zwisler
0f90cc6609 mm, dax: fix DAX deadlocks
The following two locking commits in the DAX code:

commit 843172978b ("dax: fix race between simultaneous faults")
commit 46c043ede4 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")

introduced a number of deadlocks and other issues which need to be fixed
for the v4.3 kernel.  The list of issues in DAX after these commits
(some newly introduced by the commits, some preexisting) can be found
here:

  https://lkml.org/lkml/2015/9/25/602 (Subject: "Re: [PATCH] dax: fix deadlock in __dax_fault").

This undoes most of the changes introduced by those two commits,
essentially returning us to the DAX locking scheme that was used in
v4.2.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dave Chinner <dchinner@redhat.com>
Cc: Jan Kara <jack@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Shaohua Li
424cdc1413 memcg: convert threshold to bytes
page_counter_memparse() returns pages for the threshold, while
mem_cgroup_usage() returns bytes for memory usage.  Convert the
threshold to bytes.

Fixes: 3e32cb2e0a ("memcg: rename cgroup_event to mem_cgroup_event").
Signed-off-by: Shaohua Li <shli@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Riku Voipio
8d740a37b9 builddeb: remove debian/files before build
Commit 3716001bcb ("deb-pkg: add source package") added the ability to
create a debian changelog file.  This exposed that previously the
builddeb script hasn't cleared debian/files between builds.

As debian/files keeps accumulating entries, the changes file will end up
growing indefinelty.  With outdated entries in debian/files, builddeb
script will exit with failure.  This regression impacts those who use
"make deb-pkg" target to build kernel into a .deb package and never use
"make mrproper" or other means to clean kernel tree from generated
directories.

To fix the regression, remove debian/files before starting build and in
the generated clean rule.

Fixes: 3716001bcb ("deb-pkg: add source package")
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reported-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Michal Marek <mmarek@suse.cz>
Cc: maximilian attems <maks@stro.at>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Michal Hocko
063d99b4fa mm, fs: obey gfp_mapping for add_to_page_cache()
Commit 6afdb859b7 ("mm: do not ignore mapping_gfp_mask in page cache
allocation paths") has caught some users of hardcoded GFP_KERNEL used in
the page cache allocation paths.  This, however, wasn't complete and
there were others which went unnoticed.

Dave Chinner has reported the following deadlock for xfs on loop device:
: With the recent merge of the loop device changes, I'm now seeing
: XFS deadlock on my single CPU, 1GB RAM VM running xfs/073.
:
: The deadlocked is as follows:
:
: kloopd1: loop_queue_read_work
:       xfs_file_iter_read
:       lock XFS inode XFS_IOLOCK_SHARED (on image file)
:       page cache read (GFP_KERNEL)
:       radix tree alloc
:       memory reclaim
:       reclaim XFS inodes
:       log force to unpin inodes
:       <wait for log IO completion>
:
: xfs-cil/loop1: <does log force IO work>
:       xlog_cil_push
:       xlog_write
:       <loop issuing log writes>
:               xlog_state_get_iclog_space()
:               <blocks due to all log buffers under write io>
:               <waits for IO completion>
:
: kloopd1: loop_queue_write_work
:       xfs_file_write_iter
:       lock XFS inode XFS_IOLOCK_EXCL (on image file)
:       <wait for inode to be unlocked>
:
: i.e. the kloopd, with it's split read and write work queues, has
: introduced a dependency through memory reclaim. i.e. that writes
: need to be able to progress for reads make progress.
:
: The problem, fundamentally, is that mpage_readpages() does a
: GFP_KERNEL allocation, rather than paying attention to the inode's
: mapping gfp mask, which is set to GFP_NOFS.
:
: The didn't used to happen, because the loop device used to issue
: reads through the splice path and that does:
:
:       error = add_to_page_cache_lru(page, mapping, index,
:                       GFP_KERNEL & mapping_gfp_mask(mapping));

This has changed by commit aa4d86163e ("block: loop: switch to VFS
ITER_BVEC").

This patch changes mpage_readpage{s} to follow gfp mask set for the
mapping.  There are, however, other places which are doing basically the
same.

lustre:ll_dir_filler is doing GFP_KERNEL from the function which
apparently uses GFP_NOFS for other allocations so let's make this
consistent.

cifs:readpages_get_pages is called from cifs_readpages and
__cifs_readpages_from_fscache called from the same path obeys mapping
gfp.

ramfs_nommu_expand_for_mapping is hardcoding GFP_KERNEL as well
regardless it uses mapping_gfp_mask for the page allocation.

ext4_mpage_readpages is the called from the page cache allocation path
same as read_pages and read_cache_pages

As I've noticed in my previous post I cannot say I would be happy about
sprinkling mapping_gfp_mask all over the place and it sounds like we
should drop gfp_mask argument altogether and use it internally in
__add_to_page_cache_locked that would require all the filesystems to use
mapping gfp consistently which I am not sure is the case here.  From a
quick glance it seems that some file system use it all the time while
others are selective.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Dave Chinner <david@fromorbit.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Ian Morris
c8d71d08aa netfilter: ipv4: whitespace around operators
This patch cleanses whitespace around arithmetical operators.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 19:19:23 +02:00
Ian Morris
24cebe3f29 netfilter: ipv4: code indentation
Use tabs instead of spaces to indent code.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 19:19:15 +02:00
Ian Morris
6c28255b46 netfilter: ipv4: function definition layout
Use tabs instead of spaces to indent second line of parameters in
function definitions.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 19:19:10 +02:00
Ian Morris
27951a0168 netfilter: ipv4: ternary operator layout
Correct whitespace layout of ternary operators in the netfilter-ipv4
code.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 19:19:04 +02:00
Ian Morris
19f0a60201 netfilter: ipv4: label placement
Whitespace cleansing: Labels should not be indented.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 19:17:41 +02:00
Arnd Bergmann
008027c31d netfilter: turn NF_HOOK into an inline function
A recent change to the dst_output handling caused a new warning
when the call to NF_HOOK() is the only used of a local variable
passed as 'dev', and CONFIG_NETFILTER is disabled:

net/ipv6/ip6_output.c: In function 'ip6_output':
net/ipv6/ip6_output.c:135:21: warning: unused variable 'dev' [-Wunused-variable]

The reason for this is that the NF_HOOK macro in this case does
not reference the variable at all, and the call to dev_net(dev)
got removed from the ip6_output function. To avoid that warning now
and in the future, this changes the macro into an equivalent
inline function, which tells the compiler that the variable is
passed correctly but still unused.

The dn_forward function apparently had the same problem in
the past and added a local workaround that no longer works
with the inline function. In order to avoid a regression, we
have to also remove the #ifdef from decnet in the same patch.

Fixes: ede2059dba ("dst: Pass net into dst->output")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 18:45:36 +02:00
Florian Westphal
81b4325eba netfilter: nf_queue: remove rcu_read_lock calls
All verdict handlers make use of the nfnetlink .call_rcu callback
so rcu readlock is already held.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 18:22:41 +02:00
Florian Westphal
ed78d09d59 netfilter: make nf_queue_entry_get_refs return void
We don't care if module is being unloaded anymore since hook unregister
handling will destroy queue entries using that hook.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 18:22:23 +02:00
Florian Westphal
2ffbceb2b0 netfilter: remove hook owner refcounting
since commit 8405a8fff3 ("netfilter: nf_qeueue: Drop queue entries on
nf_unregister_hook") all pending queued entries are discarded.

So we can simply remove all of the owner handling -- when module is
removed it also needs to unregister all its hooks.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-16 18:21:39 +02:00
Ilya Dryomov
e30b7577bf rbd: use writefull op for object size writes
This covers only the simplest case - an object size sized write, but
it's still useful in tiering setups when EC is used for the base tier
as writefull op can be proxied, saving an object promotion.

Even though updating ceph_osdc_new_request() to allow writefull should
just be a matter of fixing an assert, I didn't do it because its only
user is cephfs.  All other sites were updated.

Reflects ceph.git commit 7bfb7f9025a8ee0d2305f49bf0336d2424da5b5b.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-10-16 16:49:01 +02:00
Ilya Dryomov
0d9fde4fc8 rbd: set max_sectors explicitly
Commit 30e2bc08b2 ("Revert "block: remove artifical max_hw_sectors
cap"") restored a clamp on max_sectors.  It's now 2560 sectors instead
of 1024, but it's not good enough: we set max_hw_sectors to rbd object
size because we don't want object sized I/Os to be split, and the
default object size is 4M.

So, set max_sectors to max_hw_sectors in rbd at queue init time.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-10-16 16:48:36 +02:00
David S. Miller
4be3158abe Merge branch 'mlxsw-spectrum'
Jiri Pirko says:

====================
mlxsw: Driver update, add initial support for Spectrum ASIC

Purpose of this patchset is to introduce initial support for Mellanox
Spectrum ASIC, including L2 bridge forwarding offload.

The only non-mlxsw patch in this patchset is the first one, introducing
pre-change upper notifier. That is used in last patch to ensure ports of
single ASIC are not bridged into multiple bridges, as that scenario is
currently not supported by driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:31 -07:00
Jiri Pirko
56ade8fe3f mlxsw: spectrum: Add initial support for Spectrum ASIC
Add support for new generation Mellanox Spectrum ASIC, 10/25/40/50 and
100Gb/s Ethernet Switch.

The initial driver implements bridge forwarding offload including
bridge internal VLAN support, FDB static entries, FDB learning and
HW ageing including their setup.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:23 -07:00
Ido Schimmel
a4feea74cd mlxsw: reg: Add Switch Port VLAN MAC Learning register definition
Since we currently do not support the offloading of 802.1D bridges, we
need to be able to let the device know it should not learn MAC addresses
on specific {Port, VID} pairs.

Add the SPVMLR register, which controls the learning enablement of
{Port, VID} pairs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:22 -07:00
Jiri Pirko
e534a56a31 mlxsw: reg: Add Switch Filtering Database Aging Time register definition
Add SFDAT which is used to control switch ageing time.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:22 -07:00
Ido Schimmel
1f65da742d mlxsw: reg: Add Switch Virtual-Port Enabling register definition
In order for a port to support {Port, VID} to FID mapping it needs to be
configured to a virtual port mode (as opposed to VLAN mode).

Add the SVPE register, which enables port virtualization.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:20 -07:00
Ido Schimmel
6479023976 mlxsw: reg: Add Switch VID to FID Allocation register definition
An incoming packet can be classified into a filtering identifer (FID)
based on its VID or incoming port and VID ({Port, VID}).

Add the SVFA register, which controls this mapping.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:19 -07:00
Ido Schimmel
f1fb693a08 mlxsw: reg: Add Switch FID Management register definition
Filtering identifiers (FIDs) are unique identifers of bridge instances
in the hardware.

Add the SFMR register, which is responsible for the creation and
configuration of these FIDs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:18 -07:00
Jiri Pirko
e059436999 mlxsw: reg: Add shared buffer configuration registers definitions
Add definitions of SBPR, SBCM, SBPM, SBMM and PBMC registers that are
used to configure shared buffers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:18 -07:00
Elad Raz
b2e345f9a4 mlxsw: reg: Add Switch Port VID and Switch Port VLAN Membership registers definitions
Add SPVID and SPVM registers responsible for default port VID
configuration and VLAN membership of a port.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:17 -07:00
Jiri Pirko
f5d88f5892 mlxsw: reg: Add Switch FDB Notification register definition
Add SFN register which is used to poll for newly added and aged-out FDB
entries.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:17 -07:00
Jiri Pirko
236033b33c mlxsw: reg: Add Switch Filtering Database register definition
Add the SFD register which is responsible for filtering database
manipulation, including static and dynamic FDB entries.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:15 -07:00
Jiri Pirko
d64b159253 mlxsw: item: Add MLXSW_ITEM_BUF_INDEXED helper
Add missing item helper which allows to access char bufs on multiple
offsets. This is needed by SFD and SFN register definitions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:14 -07:00
Jiri Pirko
7b0989b5bc mlxsw: item: Make src arg of memcpy_to helper const
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:12 -07:00
Ido Schimmel
12fd35ab8a mlxsw: cmd: Introduce FID-offset flooding tables
Packets destined to offloaded netdevs will be classified to FIDs in the
device and flooded in case of BUM.

The flooding table used is of type FID-offset, which allows one to
create different flooding domains for different FIDs and specify the
offset in the flooding table for each FID (not necessarily equal to FID
or VID).

Add support for this flooding table type, by exposing the configuration
of the number of tables from this type and their size.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:10 -07:00
Ido Schimmel
453b6a8dd8 mlxsw: cmd: Introduce per-FID flooding tables
In the newly introduced Spectrum switch ASIC, packets destined to not
offloaded netdevs will be classified to special FIDs (vFIDs) in the
device and flooded to the CPU port.

The flooding table used is of type per-FID, which allows one to create
different flooding domains for different vFIDs.

While using a simple single-entry flood table is certainly sufficient at
this point, we do plan to offload 802.1D bridges involving VLAN
interfaces, thus making this change necessary.

Add support for this flooding table type, by exposing the configuration
of the number of tables from this type and their size.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:08 -07:00
Ido Schimmel
bc2055f878 mlxsw: Enable configuration of flooding domains
As part of the introduction of L2 offloads, allow different ports to
join/leave the flooding domain, according to user configuration.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:08 -07:00
Jiri Pirko
573c7ba006 net: introduce pre-change upper device notifier
This newly introduced netdevice notifier is called before actual change
upper happens. That provides a possibility for notifier handlers to
know upper change will happen and react to it, including possibility to
forbid the change. That is valuable for drivers which can check if the
upper device linkage is supported and forbid that in case it is not.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 07:15:05 -07:00
Thomas Gleixner
56fd16caba timekeeping: Increment clock_was_set_seq in timekeeping_init()
timekeeping_init() can set the wall time offset, so we need to
increment the clock_was_set_seq counter. That way hrtimers will pick
up the early offset immediately. Otherwise on a machine which does not
set wall time later in the boot process the hrtimer offset is stale at
0 and wall time timers are going to expire with a delay of 45 years.

Fixes: 868a3e915f "hrtimer: Make offset update smarter"
Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Stefan Liebler <stli@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
2015-10-16 15:50:22 +02:00
David S. Miller
125ecf4b59 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-10-16

This series contains updates to e1000, e1000e, igb, igbvf, ixgbe, ixgbevf,
i40e, i40evf and fm10k.

Alex Duyck fixes the polling routine for i40e/i40evf were the NAPI budget
for receive cleanup was being rounded up to 1 but the netpoll call was
expecting no Rx to be processed as the budget passed was 0.  Also cleaned
up IN_NETPOLL flag that was not adding any value due to the receive
cleanup was handled in NAPI.  Added support for netpoll for i40evf as
well.

Jesse updates all of our drivers to use napi_complete_done() instead of
napi_complete(), which allows us to use
/sys/class/net/ethX/gro_flush_timeout.  Added ethtool support to control
and report the new Interrupt Limit register, since the XL710 hardware
has a different interrupt moderation design that can support a limit of
total interrupts per second per vector.

Shannon cleans up startup log entries to cut down the number by putting
a couple behind debug flags and combining others into single line.  Added
support to enable/disable printing VEB statistics via ethtool.

Jingjing fixes a compile issue by adding const to functions that return
strings that are not going to be modified.

Greg Rose cleans up defines that were not used and were causing customer
confusion.

Greg Bowers adds support for setting a new bit in the Set Local LLDP MIB
admin queue command Type field.

Mitch fixes an issue where vlan_features field was set to the same value
as netdev features field, but before the features were actually being
set up, leaving the vlan_features empty.  Resolve the issue by setting
up the netdev features first, then mask out the VLAN feature bits when
assigning vlan_features.  Fixed VF init timing, where in some instances
the VFs would fail to initialize the first time you loaded the driver.
To correct this, increased the delay time for the init task and wait
longer before giving up.

v2: fix missing space in function header comment in patch 3, based on
    feedback from Sergei Shtylyov.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 06:41:10 -07:00
Rafael J. Wysocki
fa54823732 Merge branches 'acpica', 'pm-domains' and 'pm-cpufreq'
* acpica:
  ACPICA: Tables: Fix FADT dependency regression

* pm-domains:
  PM / Domains: Fix validation of latency constraints in genpd governor

* pm-cpufreq:
  cpufreq: intel_pstate: Fix divide by zero on Knights Landing (KNL)
2015-10-16 14:32:27 +02:00
Catherine Sullivan
d1d39516e4 i40e/i40evf: Bump i40e to 1.3.34 and i40evf to 1.3.21
Bump.

Change-ID: I7ec818a507554648675b9b245ced9e6b6bd9ed4e
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 05:05:08 -07:00
Mitch Williams
628f096d8d i40e: increase AQ work limit
With 64 VFs, we can easily overwhelm the AQ on the PF if we have too low
a limit on the number of AQ requests. This leads to ARQ overflow errors,
and occasionally VFs that fail to initialize.

Since we really only hit this condition on initial VF driver load, the
requests that we process are lightweight, so this extra work doesn't
cause problems for the PF driver.

Change-ID: I620221520d8af987df6ace9ba938ffaf22107681
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 05:02:40 -07:00
Mitch Williams
3f7e5c330e i40evf: relax and stagger init timing a bit
On some devices, in some systems, in some configurations, the VFs would
fail to initialize the first time you loaded the driver.

To correct this, increase the delay time for the init task slightly, and
wait longer before giving up.

If we enable VFs and load the VF driver in the same kernel as the PF
driver, we can totally overwhelm the PF driver with AQ requests because
all of the instances try to initialize at the same time.

To help alleviate this, stagger the initial scheduling of the init task
using the PCIe function as a multiplier. We mask off the function to
only three bits so no instance has to wait too long.

With these two changes, initializing 128 VFs on a single device goes
from four minutes to just a few seconds.

Change-ID: If3d8720c1c4e838ab36d8781d9ec295a62380936
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 05:00:17 -07:00
Catherine Sullivan
48becae60f i40e: Recognize 1000Base_T_Optical phy type when link is up
1000Base_T_Optical got added to the function that figures out what
is supported when link is down but not when link is up. Add it in there
too so that we display the correct information.

Change-ID: I85ebcdfa7c02d898c44c673b1500552a53c8042e
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 04:57:54 -07:00
Mitch Williams
cc7e406cb9 i40evf: correctly populate vlan_features
The vlan_features field was correctly being set to the same value as the
netdev features field. However, this was being done before the features
were actually being set up, leaving the vlan_features empty.

Also, after a reset, vlan_features will be incorrectly assigned the
previous netdev feature flags, which can contain VLAN feature bits. This
makes the VLAN code angry and will cause a stack dump.

To fix these issues, set up the netdev features first, then mask out the
VLAN feature bits when assigning vlan_features.

Change-ID: Ib0548869dc83cf6a841cb8697dd94c12359ba4d2
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 04:55:29 -07:00
Jingjing Wu
5d38c93e71 i40e: reset the invalid msg counter in vf when a valid msg is received
When the number of invalid messages from a VF is exceeded, the VF
will be disabled, due to the invalid messages.  This happens if
other VF drivers (like DPDK) send a message through the driver's
mailbox (aka virtchannel) interface, but the message is not
supported by the i40e pf driver, such as CONFIG_PROMISCUOUS_MODE.

This patch changes the num_invalid_msgs in struct i40e_vf to record
the continuous invalid msgs, and it will be reset when a valid msg
is received.

Change-ID: Iaec42fd3dcdd281476b3518be23261dd46fc3718
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 04:53:04 -07:00
Jesse Brandeburg
ac26fc136c i40e/i40evf: moderate interrupts differently
The XL710 hardware has a different interrupt moderation design
that can support a limit of total interrupts per second per
vector, in addition to the "number of interrupts per second"
controls already established in the driver.  This combination
of hardware features allows us to set very low default latency
settings but minimize the total CPU utilization by not
making too many interrupts, should the user desire.

The current driver implementation is still enabling the dynamic
moderation in the driver, and only using the rx/tx-usecs
limit in ethtool to limit the interrupt rate per second, by default.

The new code implemented in this patch
2) adds init/use of the new "Interrupt Limit" register
3) adds ethtool knob to control/report the limits above

Usage is ethtool -C ethx rx-usecs-high <value> Where <value> is number
of microseconds to create a rate of 1/N interrupts per second,
regardless of rx-usecs or tx-usecs values. Since there is a credit based
scheme in the hardware, the rx-usecs and tx-usecs can be configured for
very low latency for short bursts, but once the credit runs out the
refill rate on the credits is limited by rx-usecs-high.

Change-ID: I3a1075d3296123b0f4f50623c779b027af5b188d
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 04:50:38 -07:00
Greg Bowers
947570e800 i40e: Add support for non-willing Apps
Adds support for setting a new bit in the Set Local LLDP MIB AQ command
Type field.  When set to 1, the bit indicates to FW that Apps should be
treated as non-willing.  When 0, FW behaves as before.

Change-ID: I0d2101c1606c59c7188d3e6a0c7810e0f205233a
Signed-off-by: Greg Bowers <gregory.j.bowers@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 04:48:11 -07:00
Shannon Nelson
1cdfd88f2d i40e: priv flag for controlling VEB stats
Add an ethtool priv flag to enable and disable printing
the VEB statistics.

Change-ID: I7654054a3a73b08aa8310d94ee8fce6219107dd8
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 04:45:47 -07:00