Commit Graph

375503 Commits

Author SHA1 Message Date
Jaegeuk Kim
531ad7d58c f2fs: avoid deadlock during evict after f2fs_gc
o Deadlock case #1

Thread 1:
- writeback_sb_inodes
 - do_writepages
  - f2fs_write_data_pages
   - write_cache_pages
    - f2fs_write_data_page
     - f2fs_balance_fs
      - wait mutex_lock(gc_mutex)

Thread 2:
- f2fs_balance_fs
 - mutex_lock(gc_mutex)
 - f2fs_gc
  - f2fs_iget
   - wait iget_locked(inode->i_lock)

Thread 3:
- do_unlinkat
 - iput
  - lock(inode->i_lock)
   - evict
    - inode_wait_for_writeback

o Deadlock case #2

Thread 1:
- __writeback_single_inode
 : set I_SYNC
  - do_writepages
   - f2fs_write_data_page
    - f2fs_balance_fs
     - f2fs_gc
      - iput
       - evict
        - inode_wait_for_writeback(I_SYNC)

In order to avoid this, even though iput is called with the zero-reference
count, we need to stop the eviction procedure if the inode is on writeback.
So this patch links f2fs_drop_inode which checks the I_SYNC flag.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-08 19:54:08 +09:00
Steven J. Hill
f6b06d9361 MIPS: microMIPS: Support dynamic ASID sizing.
Changes for pure microMIPS cores to dynamically determine the ASID
size at boot time.

Includes bug fix https://patchwork.linux-mips.org/patch/5230/

Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
2013-05-08 12:30:10 +02:00
Steven J. Hill
d532f3d267 MIPS: Allow ASID size to be determined at boot time.
Original patch by Ralf Baechle and removed by Harold Koerfgen
with commit f67e4ffc79905482c3b9b8c8dd65197bac7eb508. This
allows for more generic kernels since the size of the ASID
and corresponding masks can be determined at run-time. This
patch is also required for the new Aptiv cores and has been
tested on Malta and Malta Aptiv platforms.

[ralf@linux-mips.org: Added relevant part of fix
https://patchwork.linux-mips.org/patch/5213/]

Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 12:30:10 +02:00
Steven J. Hill
49bffbdc88 MIPS: FW: malta: Code formatting clean-ups.
Clean-up code according to the 'checkpatch.pl' script.

Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
2013-05-08 12:30:10 +02:00
Steven J. Hill
270690e00c MIPS: FW: Remove obsolete header file for MTI platforms.
Remove 'arch/mips/include/asm/mips-boards/prom.h' and get rid of
all inclusions of it by Malta and SEAD-3 platforms.

[ralf@linux-mips.org: Fold in John Crispin <blogic@openwrt.org>'s "MIPS:
ar7 powertv build"].

[ralf@linux-mips.org: Fold in John Crispin <blogic@openwrt.org>'s "MIPS:
unbreak powertv build"].

[ralf@linux-mips.org: Test. Build. Your. Fscking. Code. Or...]

Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
2013-05-08 12:30:10 +02:00
Steven J. Hill
b431f09d55 MIPS: FW: malta: Use new common FW library variable processing.
Remove old YAMON prom code and use common firmware library code.

Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
2013-05-08 12:30:09 +02:00
Steven J. Hill
0be2abbcee MIPS: FW: sead3: Use new common FW library variable processing.
Remove old YAMON prom code and use common firmware library code.

Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
2013-05-08 12:30:09 +02:00
Steven J. Hill
14aecdd419 MIPS: FW: Add environment variable processing.
Add parsing of the environment and command line variables passed to
the kernel to the firmware library.

Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
2013-05-08 12:30:09 +02:00
Steven J. Hill
98ffcf602b MIPS: Add declarations to MIPS Technologies Inc. generic header.
Add declaration of 'mips_scroll_message' and 'mips_display_message'
to the common generic header file for the MIPS Technologies Inc.
development boards.

Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
2013-05-08 12:30:09 +02:00
Lars-Peter Clausen
f4d1f2bed6 MIPS: sead3: Use generic suspend/resume for LEDs.
Setting the LED_CORE_SUSPENDRESUME flag causes the LED driver core to call
led_classdev_suspend/led_classdev_resume during suspend/resume. Since this is
exactly what the driver's custom suspend/resume callbacks do we can replace them
by setting the LED_CORE_SUSPENDRESUME flag.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
2013-05-08 12:30:09 +02:00
Asias He
7dac16c379 KVM: Fix kvm_irqfd_init initialization
In commit a0f155e96 'KVM: Initialize irqfd from kvm_init()', when
kvm_init() is called the second time (e.g kvm-amd.ko and kvm-intel.ko),
kvm_arch_init() will fail with -EEXIST, then kvm_irqfd_exit() will be
called on the error handling path. This way, the kvm_irqfd system will
not be ready.

This patch fix the following:

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff81c0721e>] _raw_spin_lock+0xe/0x30
PGD 0
Oops: 0002 [#1] SMP
Modules linked in: vhost_net
CPU 6
Pid: 4257, comm: qemu-system-x86 Not tainted 3.9.0-rc3+ #757 Dell Inc. OptiPlex 790/0V5HMK
RIP: 0010:[<ffffffff81c0721e>]  [<ffffffff81c0721e>] _raw_spin_lock+0xe/0x30
RSP: 0018:ffff880221721cc8  EFLAGS: 00010046
RAX: 0000000000000100 RBX: ffff88022dcc003f RCX: ffff880221734950
RDX: ffff8802208f6ca8 RSI: 000000007fffffff RDI: 0000000000000000
RBP: ffff880221721cc8 R08: 0000000000000002 R09: 0000000000000002
R10: 00007f7fd01087e0 R11: 0000000000000246 R12: ffff8802208f6ca8
R13: 0000000000000080 R14: ffff880223e2a900 R15: 0000000000000000
FS:  00007f7fd38488e0(0000) GS:ffff88022dcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000022309f000 CR4: 00000000000427e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-system-x86 (pid: 4257, threadinfo ffff880221720000, task ffff880222bd5640)
Stack:
 ffff880221721d08 ffffffff810ac5c5 ffff88022431dc00 0000000000000086
 0000000000000080 ffff880223e2a900 ffff8802208f6ca8 0000000000000000
 ffff880221721d48 ffffffff810ac8fe 0000000000000000 ffff880221734000
Call Trace:
 [<ffffffff810ac5c5>] __queue_work+0x45/0x2d0
 [<ffffffff810ac8fe>] queue_work_on+0x8e/0xa0
 [<ffffffff810ac949>] queue_work+0x19/0x20
 [<ffffffff81009b6b>] irqfd_deactivate+0x4b/0x60
 [<ffffffff8100a69d>] kvm_irqfd+0x39d/0x580
 [<ffffffff81007a27>] kvm_vm_ioctl+0x207/0x5b0
 [<ffffffff810c9545>] ? update_curr+0xf5/0x180
 [<ffffffff811b66e8>] do_vfs_ioctl+0x98/0x550
 [<ffffffff810c1f5e>] ? finish_task_switch+0x4e/0xe0
 [<ffffffff81c054aa>] ? __schedule+0x2ea/0x710
 [<ffffffff811b6bf7>] sys_ioctl+0x57/0x90
 [<ffffffff8140ae9e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
 [<ffffffff81c0f602>] system_call_fastpath+0x16/0x1b
Code: c1 ea 08 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f b6 03 38 c2 75 f7 48 83 c4 08 5b c9 c3 55 48 89 e5 66 66 66 66 90 b8 00 01 00 00 <f0> 66 0f c1 07 89 c2 66 c1 ea 08 38 c2 74 0c 0f 1f 00 f3 90 0f
RIP  [<ffffffff81c0721e>] _raw_spin_lock+0xe/0x30
RSP <ffff880221721cc8>
CR2: 0000000000000000
---[ end trace 13fb1e4b6e5ab21f ]---

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-08 13:15:35 +03:00
Marcelo Tosatti
42bdf991f4 KVM: x86: fix maintenance of guest/host xcr0 state
Emulation of xcr0 writes zero guest_xcr0_loaded variable so that
subsequent VM-entry reloads CPU's xcr0 with guests xcr0 value.

However, this is incorrect because guest_xcr0_loaded variable is
read to decide whether to reload hosts xcr0.

In case the vcpu thread is scheduled out after the guest_xcr0_loaded = 0
assignment, and scheduler decides to preload FPU:

switch_to
{
  __switch_to
    __math_state_restore
      restore_fpu_checking
        fpu_restore_checking
          if (use_xsave())
              fpu_xrstor_checking
		xrstor64 with CPU's xcr0 == guests xcr0

Fix by properly restoring hosts xcr0 during emulation of xcr0 writes.

Analyzed-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-08 12:47:43 +03:00
Catalin Marinas
420c158dcf arm64: Treat the bitops index argument as an 'int'
The bitops prototype use an 'int' as the bit index type but the asm
implementation assume it to be a 'long'. Since the compiler does not
guarantee zeroing the upper 32-bits in a register when used as 'int',
change the bitops implementation accordingly.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-05-08 10:33:17 +01:00
Catalin Marinas
0e7f7bcc3f arm64: Ignore the 'write' ESR flag on cache maintenance faults
ESR.WnR bit is always set on data cache maintenance faults even though
the page is not required to have write permission. If a translation
fault (page not yet mapped) happens for read-only user address range,
Linux incorrectly assumes a permission fault. This patch adds the check
of the ESR.CM bit during the page fault handling to ignore the 'write'
flag.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Tim Northover <Tim.Northover@arm.com>
Cc: stable@vger.kernel.org
2013-05-08 10:33:16 +01:00
Mark Rutland
ed1f23637a arm64: dts: fix #address-cells for foundation-v8
Commit 90556ca1 ("arm64: vexpress: Add dts files for the ARMv8 RTSM
models") added foundation-v8.dts, but erroneously set
/cpus/#address-cells = <1> while providing two cells in each cpus/cpu@N
node's reg property.

As of commit ea393a2e ("arm64: smp: honour #address-size when parsing
CPU reg property") we read in as many address cells as specified rather
than always reading two. This means that for foundation-v8.dts, we only
read the first reg cell (zero) for each cpu node, and receive a lot of
warnings at boot of the form "/cpus/cpu@1: duplicate cpu reg properties
in the DT".

This patch corrects foundation-v8.dts to have the correct value for
/cpus/#address-cells.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-05-08 10:23:01 +01:00
Catalin Marinas
aa1e8ec1d2 arm64: vexpress: Add support for poweroff/restart
This patch adds the arm_pm_poweroff definition expected by the
vexpress-poweroff.c driver and enables the latter for arm64.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
2013-05-08 10:23:00 +01:00
Catalin Marinas
c4188edc9e arm64: Enable support for the ARM GIC interrupt controller
This patch enables ARM_GIC on the arm64 kernel.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-05-08 10:22:59 +01:00
Catalin Marinas
e6b6dc7f35 Merge branch 'gic' into HEAD
* arm64-prep-gic:
  irqchip: gic: Perform the gic_secondary_init() call via CPU notifier
  irqchip: gic: Call handle_bad_irq() directly
  arm: Move chained_irq_(enter|exit) to a generic file
  arm: Move the set_handle_irq and handle_arch_irq declarations to asm/irq.h
2013-05-08 10:22:07 +01:00
Takashi Iwai
17df3f5565 ALSA: hda - Apply pin-enablement workaround to all Haswell HDMI codecs
This is a revised patch based on Mengdong Lin's fix patch, which is a
supplement to a previous patch [1611a9c9: ALSA: hda - Add fixup for
Haswell to enable all pin and convertor widgets].

Some Haswell BIOS will disable the 2nd and 3rd pin/covertor widgets
when the HD-A controller changes state from D3 to D0.  So when the
controller resumes after a system or runtime suspend, these widgets
are disabled and programming these widgets to D0 will cause H/W error
and codec will not respond.

In addition, we found out that some BIOS disables the pins at S3
although it shows up at boot.  This confuses the driver utterly, and
the hardware falls into the fatal communication error like the above.

So in this patch, we apply intel_haswell_enable_all_pins() not only as
a fixup to a certain device (with 8086:2010) but to all Haswell
machines.  The codec driver basically assumes that all pins are
exposed, so it's anyway better to see them from the beginning.  Even
if all pins and converters are shown by this call, there should be no
regression in practice: the pin default configurations are still kept,
thus the disabled pins are handled as disabled by the driver
properly.

Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-05-08 08:24:57 +02:00
Eric Paris
2a0b4be6dd audit: fix message spacing printing auid
The helper function didn't include a leading space, so it was jammed
against the previous text in the audit record.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-05-08 00:02:19 -04:00
Linus Torvalds
5af43c24ca Merge branch 'akpm' (incoming from Andrew)
Merge more incoming from Andrew Morton:

 - Various fixes which were stalled or which I picked up recently

 - A large rotorooting of the AIO code.  Allegedly to improve
   performance but I don't really have good performance numbers (I might
   have lost the email) and I can't raise Kent today.  I held this out
   of 3.9 and we could give it another cycle if it's all too late/scary.

I ended up taking only the first two thirds of the AIO rotorooting.  I
left the percpu parts and the batch completion for later.  - Linus

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (33 commits)
  aio: don't include aio.h in sched.h
  aio: kill ki_retry
  aio: kill ki_key
  aio: give shared kioctx fields their own cachelines
  aio: kill struct aio_ring_info
  aio: kill batch allocation
  aio: change reqs_active to include unreaped completions
  aio: use cancellation list lazily
  aio: use flush_dcache_page()
  aio: make aio_read_evt() more efficient, convert to hrtimers
  wait: add wait_event_hrtimeout()
  aio: refcounting cleanup
  aio: make aio_put_req() lockless
  aio: do fget() after aio_get_req()
  aio: dprintk() -> pr_debug()
  aio: move private stuff out of aio.h
  aio: add kiocb_cancel()
  aio: kill return value of aio_complete()
  char: add aio_{read,write} to /dev/{null,zero}
  aio: remove retry-based AIO
  ...
2013-05-07 20:49:51 -07:00
Kent Overstreet
a27bb332c0 aio: don't include aio.h in sched.h
Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 20:16:25 -07:00
Kent Overstreet
41ef4eb8ee aio: kill ki_retry
Thanks to Zach Brown's work to rip out the retry infrastructure, we don't
need this anymore - ki_retry was only called right after the kiocb was
initialized.

This also refactors and trims some duplicated code, as well as cleaning up
the refcounting/error handling a bit.

[akpm@linux-foundation.org: use fmode_t in aio_run_iocb()]
[akpm@linux-foundation.org: fix file_start_write/file_end_write tests]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 19:46:02 -07:00
Kent Overstreet
8a6608907c aio: kill ki_key
ki_key wasn't actually used for anything previously - it was always 0.
Drop it to trim struct kiocb a bit.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 19:41:50 -07:00
Eric Paris
82d8da0d46 Revert "audit: move kaudit thread start from auditd registration to kaudit init"
This reverts commit 6ff5e45985.

Conflicts:
	kernel/audit.c

This patch was starting a kthread for all the time.  Since the follow on
patches that required it didn't get finished in 3.10 time, we shouldn't
ship this change in 3.10.

Signed-off-by: Eric Paris <eparis@redhat.com>
2013-05-07 22:27:21 -04:00
Jeff Layton
33e2208acf audit: vfs: fix audit_inode call in O_CREAT case of do_last
Jiri reported a regression in auditing of open(..., O_CREAT) syscalls.
In older kernels, creating a file with open(..., O_CREAT) created
audit_name records that looked like this:

type=PATH msg=audit(1360255720.628:64): item=1 name="/abc/foo" inode=138810 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
type=PATH msg=audit(1360255720.628:64): item=0 name="/abc/" inode=138635 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0

...in recent kernels though, they look like this:

type=PATH msg=audit(1360255402.886:12574): item=2 name=(null) inode=264599 dev=fd:00 mode=0100640 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
type=PATH msg=audit(1360255402.886:12574): item=1 name=(null) inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0
type=PATH msg=audit(1360255402.886:12574): item=0 name="/abc/foo" inode=264598 dev=fd:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:default_t:s0

Richard bisected to determine that the problems started with commit
bfcec708, but the log messages have changed with some later
audit-related patches.

The problem is that this audit_inode call is passing in the parent of
the dentry being opened, but audit_inode is being called with the parent
flag false. This causes later audit_inode and audit_inode_child calls to
match the wrong entry in the audit_names list.

This patch simply sets the flag to properly indicate that this inode
represents the parent. With this, the audit_names entries are back to
looking like they did before.

Cc: <stable@vger.kernel.org> # v3.7+
Reported-by: Jiri Jaburek <jjaburek@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Test By: Richard Guy Briggs <rbriggs@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-05-07 22:27:19 -04:00
Eric W. Biederman
780a7654ce audit: Make testing for a valid loginuid explicit.
audit rule additions containing "-F auid!=4294967295" were failing
with EINVAL because of a regression caused by e1760bd.

Apparently some userland audit rule sets want to know if loginuid uid
has been set and are using a test for auid != 4294967295 to determine
that.

In practice that is a horrible way to ask if a value has been set,
because it relies on subtle implementation details and will break
every time the uid implementation in the kernel changes.

So add a clean way to test if the audit loginuid has been set, and
silently convert the old idiom to the cleaner and more comprehensible
new idiom.

Cc: <stable@vger.kernel.org> # 3.7
Reported-By: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2013-05-07 22:27:15 -04:00
Sanjay Lal
82d45de655 MIPS: Export symbols used by KVM/MIPS module
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:37 +02:00
Sanjay Lal
12e25f8e19 MIPS: ASM offsets for VCPU arch specific fields.
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:37 +02:00
Sanjay Lal
f9afbd45b0 MIPS: If KVM is enabled then use the KVM specific routine to flush the TLBs on a ASID wrap.
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:36 +02:00
Sanjay Lal
f2e3656d23 MIPS: Export routines needed by the KVM module.
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:36 +02:00
Sanjay Lal
f5c236dd0a KVM/MIPS32: Routines to handle specific traps/exceptions while executing the guest.
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:36 +02:00
Sanjay Lal
06d1838db0 KVM/MIPS32: Guest interrupt delivery.
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:36 +02:00
Sanjay Lal
3c20ef5262 KVM/MIPS32: COP0 accesses profiling.
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:36 +02:00
Sanjay Lal
03a0331c8c KVM/MIPS32: Release notes and KVM module Makefile
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:35 +02:00
Sanjay Lal
858dd5d457 KVM/MIPS32: MMU/TLB operations for the Guest.
- Note that this file is statically linked with the rest of the host kernel (KSEG0). This is because kernel modules are
loaded into mapped space on MIPS and we want to make sure that we don't get any host kernel TLB faults while
manipulating TLBs.
- Virtual Guest TLBs are implemented as 64 entry array regardless of the number of host TLB entries.
- Shadow TLBs map Guest virtual addresses to Host physical addresses.

    - TLB miss handling details:
        Guest KSEG0 TLBMISS (0x40000000 – 0x60000000): Transparent to the Guest.
        Guest KSEG2/3 (0x60000000 – 0x80000000) & Guest UM TLBMISS (0x00000000 – 0x40000000)
            Lookup in Guest/Virtual TLB
            If an entry doesn’t match
                deliver appropriate TLBMISS LD/ST exception to the guest
            If entry does exist in the Guest TLB and is NOT Valid
                Deliver TLB invalid exception to the guest
            If entry does exist in the Guest TLB and is VALID
                Inject the TLB entry into the Shadow TLB

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:35 +02:00
Sanjay Lal
e685c689f3 KVM/MIPS32: Privileged instruction/target branch emulation.
- The Guest kernel is run in UM and privileged instructions cause a trap.
- If the instruction causing the trap is in a branch delay slot, the branch
  needs to be emulated to figure out the PC @ which the guest will resume
  execution.

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:35 +02:00
Sanjay Lal
9843b030cc KVM/MIPS32: KVM Guest kernel support.
Both Guest kernel and Guest Userspace execute in UM. The memory map is as follows:
Guest User address space:   0x00000000 -> 0x40000000
Guest Kernel Unmapped:      0x40000000 -> 0x60000000
Guest Kernel Mapped:        0x60000000 -> 0x80000000
- Guest Usermode virtual memory is limited to 1GB.

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:35 +02:00
Sanjay Lal
669e846e6c KVM/MIPS32: MIPS arch specific APIs for KVM
- Implements the arch specific APIs for KVM, some are stubs for MIPS
- kvm_mips_handle_exit(): Main 'C' distpatch routine for handling exceptions while in "Guest" mode.
- Also implements in-kernel timer interrupt support for the guest.

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:35 +02:00
Sanjay Lal
b680f70fc1 KVM/MIPS32: Entry point for trampolining to the guest and trap handlers.
- __kvm_mips_vcpu_run: main entry point to enter guest, we save kernel context, load
  up guest context from and ERET to guest context.
- mips32_exception: L1 exception handler(s), save k0/k1 and jump to main handlers.
- mips32_GuestException: Generic exception handlers for exceptions/interrupts while in
  guest context.  Save guest context, restore some kernel context and jump to
  main 'C' handler: kvm_mips_handle_exit()

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:34 +02:00
Sanjay Lal
740765ce45 KVM/MIPS32: Arch specific KVM data structures.
Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:34 +02:00
Sanjay Lal
2235a54dea KVM/MIPS32: Infrastructure/build files.
- Add the KVM option to MIPS build files.
- Add default config files for KVM host/guest kernels.
- Change the link address for the Malta KVM Guest kernel to UM (0x40100000).
- Add KVM Kconfig file with KVM/MIPS specific options

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:55:34 +02:00
Ralf Baechle
fc0460d0df MIPS: IP27: Remove pfn_t.
In the Linux kernel traditionally pfns are represented by an unsigned long.
However a few bits of the SGI IP27 platform code that were ported from
IRIX are using pfn_t for historic reasons.  This is conflicting with
KVM's use of pfn_t.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-08 03:51:58 +02:00
Kent Overstreet
4e23bcaeb9 aio: give shared kioctx fields their own cachelines
[akpm@linux-foundation.org: make reqs_active __cacheline_aligned_in_smp]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 18:38:29 -07:00
Kent Overstreet
58c85dc20a aio: kill struct aio_ring_info
struct aio_ring_info was kind of odd, the only place it's used is where
it's embedded in struct kioctx - there's no real need for it.

The next patch rearranges struct kioctx and puts various things on their
own cachelines - getting rid of struct aio_ring_info now makes that
reordering a bit clearer.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 18:38:29 -07:00
Kent Overstreet
a1c8eae75e aio: kill batch allocation
Previously, allocating a kiocb required touching quite a few global
(well, per kioctx) cachelines...  so batching up allocation to amortize
those was worthwhile.  But we've gotten rid of some of those, and in
another couple of patches kiocb allocation won't require writing to any
shared cachelines, so that means we can just rip this code out.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 18:38:29 -07:00
Kent Overstreet
3e845ce01a aio: change reqs_active to include unreaped completions
The aio code tries really hard to avoid having to deal with the
completion ringbuffer overflowing.  To do that, it has to keep track of
the number of outstanding kiocbs, and the number of completions
currently in the ringbuffer - and it's got to check that every time we
allocate a kiocb.  Ouch.

But - we can improve this quite a bit if we just change reqs_active to
mean "number of outstanding requests and unreaped completions" - that
means kiocb allocation doesn't have to look at the ringbuffer, which is
a fairly significant win.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 18:38:29 -07:00
Kent Overstreet
0460fef2a9 aio: use cancellation list lazily
Cancelling kiocbs requires adding them to a per kioctx linked list,
which is one of the few things we need to take the kioctx lock for in
the fast path.  But most kiocbs can't be cancelled - so if we just do
this lazily, we can avoid quite a bit of locking overhead.

While we're at it, instead of using a flag bit switch to using ki_cancel
itself to indicate that a kiocb has been cancelled/completed.  This lets
us get rid of ki_flags entirely.

[akpm@linux-foundation.org: remove buggy BUG()]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 18:38:29 -07:00
Kent Overstreet
21b40200cf aio: use flush_dcache_page()
This wasn't causing problems before because it's not needed on x86, but
it is needed on other architectures.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 18:38:29 -07:00
Kent Overstreet
a31ad380be aio: make aio_read_evt() more efficient, convert to hrtimers
Previously, aio_read_event() pulled a single completion off the
ringbuffer at a time, locking and unlocking each time.  Change it to
pull off as many events as it can at a time, and copy them directly to
userspace.

This also fixes a bug where if copying the event to userspace failed,
we'd lose the event.

Also convert it to wait_event_interruptible_hrtimeout(), which
simplifies it quite a bit.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 18:38:28 -07:00