2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-28 15:13:55 +08:00
Commit Graph

1866 Commits

Author SHA1 Message Date
Linus Torvalds
b5684b83b1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
  ide: use proper printk() KERN_* levels in ide-probe.c
  ide: fix for EATA SCSI HBA in ATA emulating mode
  ide: remove stale comments from drivers/ide/Makefile
  ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase
  ide-scsi: remove kmalloced struct request
  ht6560b: remove old history
  ht6560b: update email address
  ide-cd: fix oops when using growisofs
  gayle: release resources on ide_host_add() failure
  palm_bk3710: add UltraDMA/100 support
  ide: trivial sparse annotations
  ide: ide-tape.c sparse annotations and unaligned access removal
  ide: drop 'name' parameter from ->init_chipset method
  ide: prefix messages from IDE PCI host drivers by driver name
  it821x: remove DECLARE_ITE_DEV() macro
  it8213: remove DECLARE_ITE_DEV() macro
  ide: include PCI device name in messages from IDE PCI host drivers
  ide: remove <asm/ide.h> for some archs
  ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)
  ide-generic: is no longer needed on ppc32
  ...
2008-07-24 14:55:09 -07:00
Bartlomiej Zolnierkiewicz
2a8f7450f8 ide: remove <asm/ide.h> for some archs
* Remove <linux/irq.h> include from <asm-ia64.h> (<linux/ide.h> includes
  <linux/interrupt.h> which is enough).

* Remove <asm/ide.h> for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa
  (this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]).

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:31 +02:00
Bartlomiej Zolnierkiewicz
f01d35d87f ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)
Replace ide_default_{io_base,irq}() inlines by legacy_{bases,irqs}[].

v2:
Add missing zero-ing of hws[] (caught during testing by Borislav Petkov).

v3:
Fix zero-oing of hws[] for _real_ this time.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:31 +02:00
Bartlomiej Zolnierkiewicz
d83b8b85cd ide: define MAX_HWIFS in <linux/ide.h>
* Now that ide_hwif_t instances are allocated dynamically
  the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10
  is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs
  except these ones that use MAX_HWIFS == 1.

* Define MAX_HWIFS in <linux/ide.h> instead of <asm/ide.h>.

[ Please note that avr32/cris/v850 have no <asm/ide.h>
  and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ]

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:30 +02:00
Bartlomiej Zolnierkiewicz
b6cd7da5be ide-generic: remove "no_pci_devices()" quirk from ide_default_io_base()
Since the decision to probe for ISA ide2-6 is now left to the user
"no_pci_devices()" quirk is no longer needed and may be removed.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:28 +02:00
Linus Torvalds
6044110742 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix header export, asm-x86/processor-flags.h, CONFIG_* leaks
  x86: BUILD_IRQ say .text to avoid .data.percpu
  xen: don't use sysret for sysexit32
  x86: call early_cpu_init at the same point
2008-07-24 12:33:51 -07:00
Linus Torvalds
7540081c6b Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
  Remove __DECLARE_SEMAPHORE_GENERIC
  Remove asm/semaphore.h
  Remove use of asm/semaphore.h
  Add missing semaphore.h includes
  Remove mention of semaphores from kernel-locking
2008-07-24 12:24:40 -07:00
Ulrich Drepper
9fe5ad9c8c flag parameters add-on: remove epoll_create size param
Remove the size parameter from the new epoll_create syscall and renames the
syscall itself.  The updated test program follows.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_epoll_create2
# ifdef __x86_64__
#  define __NR_epoll_create2 291
# elif defined __i386__
#  define __NR_epoll_create2 329
# else
#  error "need __NR_epoll_create2"
# endif
#endif

#define EPOLL_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd = syscall (__NR_epoll_create2, 0);
  if (fd == -1)
    {
      puts ("epoll_create2(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("epoll_create2(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_epoll_create2, EPOLL_CLOEXEC);
  if (fd == -1)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:29 -07:00
Ulrich Drepper
4006553b06 flag parameters: inotify_init
This patch introduces the new syscall inotify_init1 (note: the 1 stands for
the one parameter the syscall takes, as opposed to no parameter before).  The
values accepted for this parameter are function-specific and defined in the
inotify.h header.  Here the values must match the O_* flags, though.  In this
patch CLOEXEC support is introduced.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_inotify_init1
# ifdef __x86_64__
#  define __NR_inotify_init1 294
# elif defined __i386__
#  define __NR_inotify_init1 332
# else
#  error "need __NR_inotify_init1"
# endif
#endif

#define IN_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd;
  fd = syscall (__NR_inotify_init1, 0);
  if (fd == -1)
    {
      puts ("inotify_init1(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("inotify_init1(0) set close-on-exit");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_inotify_init1, IN_CLOEXEC);
  if (fd == -1)
    {
      puts ("inotify_init1(IN_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("inotify_init1(O_CLOEXEC) does not set close-on-exit");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Ulrich Drepper
ed8cae8ba0 flag parameters: pipe
This patch introduces the new syscall pipe2 which is like pipe but it also
takes an additional parameter which takes a flag value.  This patch implements
the handling of O_CLOEXEC for the flag.  I did not add support for the new
syscall for the architectures which have a special sys_pipe implementation.  I
think the maintainers of those archs have the chance to go with the unified
implementation but that's up to them.

The implementation introduces do_pipe_flags.  I did that instead of changing
all callers of do_pipe because some of the callers are written in assembler.
I would probably screw up changing the assembly code.  To avoid breaking code
do_pipe is now a small wrapper around do_pipe_flags.  Once all callers are
changed over to do_pipe_flags the old do_pipe function can be removed.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_pipe2
# ifdef __x86_64__
#  define __NR_pipe2 293
# elif defined __i386__
#  define __NR_pipe2 331
# else
#  error "need __NR_pipe2"
# endif
#endif

int
main (void)
{
  int fd[2];
  if (syscall (__NR_pipe2, fd, 0) != 0)
    {
      puts ("pipe2(0) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      int coe = fcntl (fd[i], F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (coe & FD_CLOEXEC)
        {
          printf ("pipe2(0) set close-on-exit for fd[%d]\n", i);
          return 1;
        }
    }
  close (fd[0]);
  close (fd[1]);

  if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0)
    {
      puts ("pipe2(O_CLOEXEC) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      int coe = fcntl (fd[i], F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((coe & FD_CLOEXEC) == 0)
        {
          printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i);
          return 1;
        }
    }
  close (fd[0]);
  close (fd[1]);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Ulrich Drepper
336dd1f70f flag parameters: dup2
This patch adds the new dup3 syscall.  It extends the old dup2 syscall by one
parameter which is meant to hold a flag value.  Support for the O_CLOEXEC flag
is added in this patch.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_dup3
# ifdef __x86_64__
#  define __NR_dup3 292
# elif defined __i386__
#  define __NR_dup3 330
# else
#  error "need __NR_dup3"
# endif
#endif

int
main (void)
{
  int fd = syscall (__NR_dup3, 1, 4, 0);
  if (fd == -1)
    {
      puts ("dup3(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("dup3(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC);
  if (fd == -1)
    {
      puts ("dup3(O_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("dup3(O_CLOEXEC) set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Ulrich Drepper
a0998b50c3 flag parameters: epoll_create
This patch adds the new epoll_create2 syscall.  It extends the old epoll_create
syscall by one parameter which is meant to hold a flag value.  In this
patch the only flag support is EPOLL_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.

A new name EPOLL_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_epoll_create2
# ifdef __x86_64__
#  define __NR_epoll_create2 291
# elif defined __i386__
#  define __NR_epoll_create2 329
# else
#  error "need __NR_epoll_create2"
# endif
#endif

#define EPOLL_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd = syscall (__NR_epoll_create2, 1, 0);
  if (fd == -1)
    {
      puts ("epoll_create2(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("epoll_create2(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC);
  if (fd == -1)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Ulrich Drepper
b087498eb5 flag parameters: eventfd
This patch adds the new eventfd2 syscall.  It extends the old eventfd
syscall by one parameter which is meant to hold a flag value.  In this
patch the only flag support is EFD_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.

A new name EFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_eventfd2
# ifdef __x86_64__
#  define __NR_eventfd2 290
# elif defined __i386__
#  define __NR_eventfd2 328
# else
#  error "need __NR_eventfd2"
# endif
#endif

#define EFD_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd = syscall (__NR_eventfd2, 1, 0);
  if (fd == -1)
    {
      puts ("eventfd2(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("eventfd2(0) sets close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC);
  if (fd == -1)
    {
      puts ("eventfd2(EFD_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Ulrich Drepper
9deb27baed flag parameters: signalfd
This patch adds the new signalfd4 syscall.  It extends the old signalfd
syscall by one parameter which is meant to hold a flag value.  In this
patch the only flag support is SFD_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.

A new name SFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_signalfd4
# ifdef __x86_64__
#  define __NR_signalfd4 289
# elif defined __i386__
#  define __NR_signalfd4 327
# else
#  error "need __NR_signalfd4"
# endif
#endif

#define SFD_CLOEXEC O_CLOEXEC

int
main (void)
{
  sigset_t ss;
  sigemptyset (&ss);
  sigaddset (&ss, SIGUSR1);
  int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0);
  if (fd == -1)
    {
      puts ("signalfd4(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("signalfd4(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC);
  if (fd == -1)
    {
      puts ("signalfd4(SFD_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Ulrich Drepper
aaca0bdca5 flag parameters: paccept
This patch is by far the most complex in the series.  It adds a new syscall
paccept.  This syscall differs from accept in that it adds (at the userlevel)
two additional parameters:

- a signal mask
- a flags value

The flags parameter can be used to set flag like SOCK_CLOEXEC.  This is
imlpemented here as well.  Some people argued that this is a property which
should be inherited from the file desriptor for the server but this is against
POSIX.  Additionally, we really want the signal mask parameter as well
(similar to pselect, ppoll, etc).  So an interface change in inevitable.

The flag value is the same as for socket and socketpair.  I think diverging
here will only create confusion.  Similar to the filesystem interfaces where
the use of the O_* constants differs, it is acceptable here.

The signal mask is handled as for pselect etc.  The mask is temporarily
installed for the thread and removed before the call returns.  I modeled the
code after pselect.  If there is a problem it's likely also in pselect.

For architectures which use socketcall I maintained this interface instead of
adding a system call.  The symmetry shouldn't be broken.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/syscall.h>

#ifndef __NR_paccept
# ifdef __x86_64__
#  define __NR_paccept 288
# elif defined __i386__
#  define SYS_PACCEPT 18
#  define USE_SOCKETCALL 1
# else
#  error "need __NR_paccept"
# endif
#endif

#ifdef USE_SOCKETCALL
# define paccept(fd, addr, addrlen, mask, flags) \
  ({ long args[6] = { \
       (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \
     syscall (__NR_socketcall, SYS_PACCEPT, args); })
#else
# define paccept(fd, addr, addrlen, mask, flags) \
  syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags)
#endif

#define PORT 57392

#define SOCK_CLOEXEC O_CLOEXEC

static pthread_barrier_t b;

static void *
tf (void *arg)
{
  pthread_barrier_wait (&b);
  int s = socket (AF_INET, SOCK_STREAM, 0);
  struct sockaddr_in sin;
  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  sin.sin_port = htons (PORT);
  connect (s, (const struct sockaddr *) &sin, sizeof (sin));
  close (s);

  pthread_barrier_wait (&b);
  s = socket (AF_INET, SOCK_STREAM, 0);
  sin.sin_port = htons (PORT);
  connect (s, (const struct sockaddr *) &sin, sizeof (sin));
  close (s);
  pthread_barrier_wait (&b);

  pthread_barrier_wait (&b);
  sleep (2);
  pthread_kill ((pthread_t) arg, SIGUSR1);

  return NULL;
}

static void
handler (int s)
{
}

int
main (void)
{
  pthread_barrier_init (&b, NULL, 2);

  struct sockaddr_in sin;
  pthread_t th;
  if (pthread_create (&th, NULL, tf, (void *) pthread_self ()) != 0)
    {
      puts ("pthread_create failed");
      return 1;
    }

  int s = socket (AF_INET, SOCK_STREAM, 0);
  int reuse = 1;
  setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  sin.sin_port = htons (PORT);
  bind (s, (struct sockaddr *) &sin, sizeof (sin));
  listen (s, SOMAXCONN);

  pthread_barrier_wait (&b);

  int s2 = paccept (s, NULL, 0, NULL, 0);
  if (s2 < 0)
    {
      puts ("paccept(0) failed");
      return 1;
    }

  int coe = fcntl (s2, F_GETFD);
  if (coe & FD_CLOEXEC)
    {
      puts ("paccept(0) set close-on-exec-flag");
      return 1;
    }
  close (s2);

  pthread_barrier_wait (&b);

  s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC);
  if (s2 < 0)
    {
      puts ("paccept(SOCK_CLOEXEC) failed");
      return 1;
    }

  coe = fcntl (s2, F_GETFD);
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (s2);

  pthread_barrier_wait (&b);

  struct sigaction sa;
  sa.sa_handler = handler;
  sa.sa_flags = 0;
  sigemptyset (&sa.sa_mask);
  sigaction (SIGUSR1, &sa, NULL);

  sigset_t ss;
  pthread_sigmask (SIG_SETMASK, NULL, &ss);
  sigaddset (&ss, SIGUSR1);
  pthread_sigmask (SIG_SETMASK, &ss, NULL);

  sigdelset (&ss, SIGUSR1);
  alarm (4);
  pthread_barrier_wait (&b);

  errno = 0 ;
  s2 = paccept (s, NULL, 0, &ss, 0);
  if (s2 != -1 || errno != EINTR)
    {
      puts ("paccept did not fail with EINTR");
      return 1;
    }

  close (s);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: make it compile]
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Roland McGrath <roland@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Andrea Righi
27ac792ca0 PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures
On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
boundary. For example:

	u64 val = PAGE_ALIGN(size);

always returns a value < 4GB even if size is greater than 4GB.

The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
example):

#define PAGE_SHIFT      12
#define PAGE_SIZE       (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK       (~(PAGE_SIZE-1))
...
#define PAGE_ALIGN(addr)       (((addr)+PAGE_SIZE-1)&PAGE_MASK)

The "~" is performed on a 32-bit value, so everything in "and" with
PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
Using the ALIGN() macro seems to be the right way, because it uses
typeof(addr) for the mask.

Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in
include/linux/mm.h.

See also lkml discussion: http://lkml.org/lkml/2008/6/11/237

[akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
[akpm@linux-foundation.org: fix v850]
[akpm@linux-foundation.org: fix powerpc]
[akpm@linux-foundation.org: fix arm]
[akpm@linux-foundation.org: fix mips]
[akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
[akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
[akpm@linux-foundation.org: fix powerpc]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:21 -07:00
Andi Kleen
b4718e628d x86: add hugepagesz option on 64-bit
Add an hugepagesz=...  option similar to IA64, PPC etc.  to x86-64.

This finally allows to select GB pages for hugetlbfs in x86 now that all
the infrastructure is in place.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:19 -07:00
Andi Kleen
a551643895 hugetlb: modular state for hugetlb page size
The goal of this patchset is to support multiple hugetlb page sizes.  This
is achieved by introducing a new struct hstate structure, which
encapsulates the important hugetlb state and constants (eg.  huge page
size, number of huge pages currently allocated, etc).

The hstate structure is then passed around the code which requires these
fields, they will do the right thing regardless of the exact hstate they
are operating on.

This patch adds the hstate structure, with a single global instance of it
(default_hstate), and does the basic work of converting hugetlb to use the
hstate.

Future patches will add more hstate structures to allow for different
hugetlbfs mounts to have different page sizes.

[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:17 -07:00
Jan Beulich
42b7772812 mm: remove double indirection on tlb parameter to free_pgd_range() & Co
The double indirection here is not needed anywhere and hence (at least)
confusing.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Rik van Riel
28b2ee20c7 access_process_vm device memory infrastructure
In order to be able to debug things like the X server and programs using
the PPC Cell SPUs, the debugger needs to be able to access device memory
through ptrace and /proc/pid/mem.

This patch:

Add the generic_access_phys access function and put the hooks in place
to allow access_process_vm to access device or PPC Cell SPU memory.

[riel@redhat.com: Add documentation for the vm_ops->access function]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Benjamin Herrensmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Matthew Wilcox
2351ec533e Remove asm/semaphore.h
All users have now been converted to linux/semaphore.h and we don't need
to keep these files around any longer.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-07-24 08:31:12 -04:00
Vegard Nossum
04bbe430f7 x86: fix header export, asm-x86/processor-flags.h, CONFIG_* leaks
Apparently,

commit 6330a30a76
Author: Vegard Nossum <vegard.nossum@gmail.com>
Date:   Wed May 28 09:46:19 2008 +0200

    x86: break mutual header inclusion

introduced some CONFIG names to processor-flags.h, which was exported in

commit 6093015db2
Author: Ingo Molnar <mingo@elte.hu>
Date:   Sun Mar 30 11:45:23 2008 +0200

    x86: cleanup replace most vm86 flags with flags from processor-flags.h, fix

Fix it by wrapping the CONFIG parts in __KERNEL__.

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-24 12:49:53 +02:00
Maciej W. Rozycki
0791e13fbb x86: fix up a comment in ack_APIC_irq()
Adjust a comment in ack_APIC_irq() according to the recent removal of
CONFIG_X86_GOOD_APIC.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-24 12:39:23 +02:00
Maciej W. Rozycki
32f71aff77 x86: PIC, L-APIC and I/O APIC debug information
Dump all the PIC, local APIC and I/O APIC information at the
fs_initcall() level, which is after ACPI (if used) has initialised PCI
information, making the point of invocation consistent across MP-table and
ACPI platforms.  Remove explicit calls to print_IO_APIC() from elsewhere.
Make the interface of all the functions involved consistent between 32-bit
and 64-bit versions and make them all static by default by the means of a
New-and-Improved(TM) __apicdebuginit() macro.

 Note that like print_IO_APIC() all these only output anything if
"apic=debug" has been passed to the kernel through the command line.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-24 12:37:10 +02:00
Linus Torvalds
7f9dce3837 Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: hrtick_enabled() should use cpu_active()
  sched, x86: clean up hrtick implementation
  sched: fix build error, provide partition_sched_domains() unconditionally
  sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed
  cpu hotplug: Make cpu_active_map synchronization dependency clear
  cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
  sched: rework of "prioritize non-migratable tasks over migratable ones"
  sched: reduce stack size in isolated_cpu_setup()
  Revert parts of "ftrace: do not trace scheduler functions"

Fixed up conflicts in include/asm-x86/thread_info.h (due to the
TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and
kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr()
introduction).
2008-07-23 19:36:53 -07:00
Linus Torvalds
26dcce0fab Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  NR_CPUS: Replace NR_CPUS in speedstep-centrino.c
  cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP
  NR_CPUS: Replace NR_CPUS in cpufreq userspace routines
  NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var
  NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c
  NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c
  NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c
  NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c
  cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix
  cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target
  cpumask: Provide a generic set of CPUMASK_ALLOC macros
  cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
  cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c
  cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c
  cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c
  cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c
  cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  Revert "cpumask: introduce new APIs"
  cpumask: make for_each_cpu_mask a bit smaller
  net: Pass reference to cpumask variable in net/sunrpc/svc.c
  ...

Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually
2008-07-23 18:37:44 -07:00
Jaswinder Singh
01eb7858c0 x86: mm/pgtable_32.c declare set_pmd_pfn before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:41:59 +05:30
Jaswinder Singh
e0b7c8192d x86: mm/pageattr.c declare arch_report_meminfo before they get used
declared arch_report_meminfo() in asm-x86/pgtable.h as it will be also accessible by fs/proc/proc_misc.c

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:40:34 +05:30
Jaswinder Singh
70ef56414e x86: mm/fault.c declare do_page_fault before they get used
declared do_page_fault() in asm-x86/trap.h for both X86_32 and X86_64

removed do_invalid_op declaration from mm/fault.c as it is already declared in asm-x86/trap.h

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:36:37 +05:30
Jaswinder Singh
a80495ec92 x86: mm/init_XX.c declare functions before they get used
included <asm/smp.h> in mm/init_32.c for zap_low_mappings()

declared free_initmem() in asm-x86/page_XX.h

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:33:57 +05:30
Jaswinder Singh
8f7db5186c x86: vm86_32.c declare functions before they get used
declared following syscalls in asm-x86/syscalls.h:
sys_vm86old, sys_vm86

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:31:02 +05:30
Jaswinder Singh
2b97df06ce x86: apic_XX.c declare functions before they get used
declared following smp interrupts in asm-x86/hw_irq.h:
smp_apic_timer_interrupt, smp_spurious_interrupt, smp_error_interrupt

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:13:14 +05:30
Vegard Nossum
a318631686 x86: consolidate header guards
This patch consolidates the header guard names which are also used
externally, i.e. in .c files.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2008-07-22 21:53:53 +02:00
Vegard Nossum
77ef50a522 x86: consolidate header guards
This patch is the result of an automatic script that consolidates the
format of all the headers in include/asm-x86/.

The format:

1. No leading underscore. Names with leading underscores are reserved.
2. Pathname components are separated by two underscores. So we can
   distinguish between mm_types.h and mm/types.h.
3. Everything except letters and numbers are turned into single
   underscores.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2008-07-22 21:31:34 +02:00
Vegard Nossum
a656c8efb4 x86: fix spurious '#' in kvm header
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2008-07-22 21:27:11 +02:00
Suresh Siddha
df1be4372e x64, apic: use generic apic_write() for ack_APIC_irq()
I tested tip/master and found an issue (patch attached)
for x2apic support. This is not because of the recent merges we had, but
because of something(where we still access memory based interface after
enabling x2apic mode) that slipped through my earlier tests.

Probably it is a good idea to unmap the memory mapped interface, once we switch
to x2apic mode. That will catch the issues much earlier. I will
post another patch for this.

ack_APIC_irq() is used at too many generic places (and not just during
irq_chip handling!) to use the native_apic_mem_write(). For ex, this will
break x2apic based systems.

Fix ack_APIC_irq() to use the generic apic_write() even for 64-bit.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: suresh.b.siddha@intel.com
Cc: yong.y.wang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 17:29:22 +02:00
Jaswinder Singh
8fd329a1ac x86: common.c declare idle_regs before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:09 +02:00
Jaswinder Singh
1c6c727d9c x86: proc.c declare cpuinfo_op before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:08 +02:00
Jaswinder Singh
c1686aeaf0 x86: ptrace.c declare functions before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:07 +02:00
Jaswinder Singh
36454936c0 x86: i387.c declare dump_fpu() before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:06 +02:00
Jaswinder Singh
791b897ccd x86: pci-nommu.c declare nommu_dma_ops before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:05 +02:00
Jaswinder Singh
9321b8cbbb x86: pci-dma.c declare iommu_bio_merge before they get used
moved iommu_bio_merge from io_64.h to io.h because it is required for both.

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:04 +02:00
Jaswinder Singh
a7b7511ac1 x86: e820.c declare pci_mem_start before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:03 +02:00
Jaswinder Singh
5314d48ed5 x86: setup.c declare saved_video_mode before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:02 +02:00
Jaswinder Singh
cc0384917b x86: time_XX.c declare functions before they get used
Declare time_init() in asm-x86/time.h

Also did cleanup in asm-x86/timer.h :
timer_ack is only required for X86_32
int recalibrate_cpu_khz(void) is for X86_32

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:01 +02:00
Jaswinder Singh
b994b6c033 x86: signal_XX.c declare do_notify_resume before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:00 +02:00
Jaswinder Singh
fb26132b44 x86: process_32.c declare cpu_number before they get used
Moved DECLARE_PER_CPU(int, cpu_number) from CONFIG_X86_32_SMP to CONFIG_X86_32
because cpu_number is required for both.
And include asm/smp.h in process_32.c

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:35:59 +02:00
Jaswinder Singh
bbc1f698a5 x86: Introducing asm/syscalls.h
Declaring arch-dependent syscalls for x86 architecture

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:35:57 +02:00
Ingo Molnar
026e2c05ef x86, cyrix: debug
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 12:24:00 +02:00
Jeremy Fitzhardinge
77be1fabd0 x86: add PTE_FLAGS_MASK
PTE_PFN_MASK was getting lonely, so I made it a friend.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 10:43:45 +02:00
Jeremy Fitzhardinge
59438c9fc4 x86: rename PTE_MASK to PTE_PFN_MASK
Rusty, in his peevish way, complained that macros defining constants
should have a name which somewhat accurately reflects the actual
purpose of the constant.

Aside from the fact that PTE_MASK gives no clue as to what's actually
being masked, and is misleadingly similar to the functionally entirely
different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the
problem is.

But if this patch silences the incessent noise, then it will have
achieved its goal (TODO: write test-case).

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 10:43:44 +02:00
Rusty Russell
c2e3277f87 x86: fix pte_flags() to only return flags, fix lguest (updated)
(Jeremy said:
	rusty: use PTE_MASK
	rusty: use PTE_MASK
	rusty: use PTE_MASK
 When I asked:
	jsgf: does that include the NX flag?
 He responded eloquently:
	rusty: use PTE_MASK
	rusty: use PTE_MASK
	yes, it's the official constant of masking flags out of ptes
)

Change a15af1c9ea 'x86/paravirt: add
pte_flags to just get pte flags' removed lguest's private pte_flags()
in favor of a generic one.

Unfortunately, the generic one doesn't filter out the non-flags bits:
this results in lguest creating corrupt shadow page tables and blowing
up host memory.

Since noone is supposed to use the pfn part of pte_flags(), it seems
safest to always do the filtering.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-and-morning-tea-spilled-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 10:41:18 +02:00
Yinghai Lu
1b9b89e7f1 x86: add apic probe for genapic 64bit, v2
introducing an APIC handling probing abstraction:

 static struct genapic *apic_probe[] __initdata = {
	&apic_x2apic_uv_x,
	&apic_x2apic_phys,
	&apic_x2apic_cluster,
	&apic_physflat,
	NULL,
 };

This way we can remove UV, x2apic specific code from genapic_64.c and
move them to their specific genapic files.

[ v2: fix compiling when CONFIG_ACPI is not set ]

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 09:09:15 +02:00
Ingo Molnar
76c3bb15d6 Merge branch 'linus' into x86/x2apic 2008-07-22 09:06:21 +02:00
Thomas Gleixner
cfc1b9a6a6 x86: convert Dprintk to pr_debug
There are a couple of places where (P)Dprintk is used which is an old
compile time enabled printk wrapper. Convert it to the generic
pr_debug().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-21 21:35:38 +02:00
Ingo Molnar
eb6a12c242 Merge branch 'linus' into cpus4096-for-linus
Conflicts:

	net/sunrpc/svc.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-21 17:19:50 +02:00
Ingo Molnar
2e2dcc7631 Merge branch 'x86/paravirt-spinlocks' into x86/for-linus 2008-07-21 16:45:56 +02:00
Ingo Molnar
acee709cab Merge branches 'x86/urgent', 'x86/amd-iommu', 'x86/apic', 'x86/cleanups', 'x86/core', 'x86/cpu', 'x86/fixmap', 'x86/gart', 'x86/kprobes', 'x86/memtest', 'x86/modules', 'x86/nmi', 'x86/pat', 'x86/reboot', 'x86/setup', 'x86/step', 'x86/unify-pci', 'x86/uv', 'x86/xen' and 'xen-64bit' into x86/for-linus 2008-07-21 16:37:17 +02:00
Ingo Molnar
e66d90fb4a Merge branch 'linus' into xen-64bit 2008-07-21 15:06:09 +02:00
Ingo Molnar
1c29dd9a9e Merge branch 'linus' into x86/paravirt-spinlocks 2008-07-21 15:05:58 +02:00
Ingo Molnar
33a37eb411 KVM: fix exception entry / build bug, on 64-bit
-tip testing found this build bug:

 arch/x86/kvm/built-in.o:(.text.fixup+0x1): relocation truncated to fit: R_X86_64_32 against `.text'
 arch/x86/kvm/built-in.o:(.text.fixup+0xb): relocation truncated to fit: R_X86_64_32 against `.text'
 arch/x86/kvm/built-in.o:(.text.fixup+0x15): relocation truncated to fit: R_X86_64_32 against `.text'
 arch/x86/kvm/built-in.o:(.text.fixup+0x1f): relocation truncated to fit: R_X86_64_32 against `.text'
 arch/x86/kvm/built-in.o:(.text.fixup+0x29): relocation truncated to fit: R_X86_64_32 against `.text'

Introduced by commit 4ecac3fd. The problem is that 'push' will default
to 32-bit, which is not wide enough as a fixup address. (and which would
crash on any real fixup event even if it was wide enough)

Introduce KVM_EX_PUSH to get the proper address push width on 64-bit too.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-21 11:03:32 +02:00
Ingo Molnar
e27772b48d Merge branch 'linus' into x86/urgent 2008-07-21 11:02:45 +02:00
Avi Kivity
d6e88aec07 KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues
Fixes compilation with CONFIG_VMI enabled.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:39 +03:00
Avi Kivity
7a5b56dfd3 KVM: x86 emulator: lazily evaluate segment registers
Instead of prefetching all segment bases before emulation, read them at the
last moment.  Since most of them are unneeded, we save some cycles on
Intel machines where this is a bit expensive.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:35 +03:00
Avi Kivity
f5b4edcd52 KVM: x86 emulator: simplify rip relative decoding
rip relative decoding is relative to the instruction pointer of the next
instruction; by moving address adjustment until after decoding is complete,
we remove the need to determine the instruction size.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:34 +03:00
Laurent Vivier
542472b53e KVM: Add coalesced MMIO support (x86 part)
This patch enables coalesced MMIO for x86 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Guillaume Thouvenin
3e6e0aab1b KVM: Prefixes segment functions that will be exported with "kvm_"
Prefixes functions that will be exported with kvm_.
We also prefixed set_segment() even if it still static
to be coherent.

signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:27 +03:00
Avi Kivity
9ba075a664 KVM: MTRR support
Add emulation for the memory type range registers, needed by VMware esx 3.5,
and by pci device assignment.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Avi Kivity
81609e3e26 KVM: Order segment register constants in the same way as cpu operand encoding
This can be used to simplify the x86 instruction decoder.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Sheng Yang
f08864b42a KVM: VMX: Enable NMI with in-kernel irqchip
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Sheng Yang
3419ffc8e4 KVM: IOAPIC/LAPIC: Enable NMI support
[avi: fix ia64 build breakage]

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:25 +03:00
Avi Kivity
7cc8883074 KVM: Remove decache_vcpus_on_cpu() and related callbacks
Obsoleted by the vmx-specific per-cpu list.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:25 +03:00
Avi Kivity
4ecac3fd6d KVM: Handle virtualization instruction #UD faults during reboot
KVM turns off hardware virtualization extensions during reboot, in order
to disassociate the memory used by the virtualization extensions from the
processor, and in order to have the system in a consistent state.
Unfortunately virtual machines may still be running while this goes on,
and once virtualization extensions are turned off, any virtulization
instruction will #UD on execution.

Fix by adding an exception handler to virtualization instructions; if we get
an exception during reboot, we simply spin waiting for the reset to complete.
If it's a true exception, BUG() so we can have our stack trace.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:41:43 +03:00
Avi Kivity
1b7fcd3263 KVM: MMU: Fix false flooding when a pte points to page table
The KVM MMU tries to detect when a speculative pte update is not actually
used by demand fault, by checking the accessed bit of the shadow pte.  If
the shadow pte has not been accessed, we deem that page table flooded and
remove the shadow page table, allowing further pte updates to proceed
without emulation.

However, if the pte itself points at a page table and only used for write
operations, the accessed bit will never be set since all access will happen
through the emulator.

This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels.
The kernel points a kmap_atomic() pte at a page table, and then
proceeds with read-modify-write operations to look at the dirty and accessed
bits.  We get a false flood trigger on the kmap ptes, which results in the
mmu spending all its time setting up and tearing down shadows.

Fix by setting the shadow accessed bit on emulated accesses.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:40:50 +03:00
Joerg Roedel
d2ebb4103f KVM: SVM: add tracing support for TDP page faults
To distinguish between real page faults and nested page faults they should be
traced as different events. This is implemented by this patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:40:48 +03:00
Peter Zijlstra
31656519e1 sched, x86: clean up hrtick implementation
random uvesafb failures were reported against Gentoo:

  http://bugs.gentoo.org/show_bug.cgi?id=222799

and Mihai Moldovan bisected it back to:

> 8f4d37ec07 is first bad commit
> commit 8f4d37ec07
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date:   Fri Jan 25 21:08:29 2008 +0100
>
>    sched: high-res preemption tick

Linus suspected it to be hrtick + vm86 interaction and observed:

> Btw, Peter, Ingo: I think that commit is doing bad things. They aren't
> _incorrect_ per se, but they are definitely bad.
>
> Why?
>
> Using random _TIF_WORK_MASK flags is really impolite for doing
> "scheduling" work. There's a reason that arch/x86/kernel/entry_32.S
> special-cases the _TIF_NEED_RESCHED flag: we don't want to exit out of
> vm86 mode unnecessarily.
>
> See the "work_notifysig_v86" label, and how it does that
> "save_v86_state()" thing etc etc.

Right, I never liked having to fiddle with those TIF flags. Initially I
needed it because the hrtimer base lock could not nest in the rq lock.
That however is fixed these days.

Currently the only reason left to fiddle with the TIF flags is remote
wakeups. We cannot program a remote cpu's hrtimer. I've been thinking
about using the new and improved IPI function call stuff to implement
hrtimer_start_on().

However that does require that smp_call_function_single(.wait=0) works
from interrupt context - /me looks at the latest series from Jens - Yes
that does seem to be supported, good.

Here's a stab at cleaning this stuff up ...

Mihai reported test success as well.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Mihai Moldovan <ionic@ionic.de>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Antonino Daplas <adaplas@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 10:37:28 +02:00
Mike Travis
94a1e869c7 NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var
* Slight optimization when getting one's own cpu_info percpu data.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 10:21:11 +02:00
Ingo Molnar
c4dc59ae7a x86, VisWS: turn into generic arch, eliminate leftover files
remove unused leftovers.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:31:24 +02:00
Yinghai Lu
63b5d7af25 x86: add ->pre_time_init to x86_quirks
so NUMAQ can use that to call numaq_pre_time_init()

This allows us to remove a NUMAQ special from arch/x86/kernel/setup.c.

(and paves the way to remove the NUMAQ subarch)

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:25:52 +02:00
Yinghai Lu
64898a8bad x86: extend and use x86_quirks to clean up NUMAQ code
add these new x86_quirks methods:

	int *mpc_record;
	int (*mpc_apic_id)(struct mpc_config_processor *m);
	void (*mpc_oem_bus_info)(struct mpc_config_bus *m, char *name);
	void (*mpc_oem_pci_bus)(struct mpc_config_bus *m);
	void (*smp_read_mpc_oem)(struct mp_config_oemtable *oemtable,
                                    unsigned short oemsize);

... and move NUMAQ related mps table handling to numaq_32.c.

also move the call to smp_read_mpc_oem() to smp_read_mpc() directly.

Should not change functionality, albeit it would be nice to get it
tested on real NUMAQ as well ...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:25:52 +02:00
Yinghai Lu
3c9cb6de1e x86: introduce x86_quirks
introduce x86_quirks array of boot-time quirk methods.

No change in functionality intended.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:18:17 +02:00
Jan Beulich
48fe4a76e2 x86: i386: reduce boot fixmap space
As 256 entries are needed, aligning to a 256-entry boundary is
sufficient and still guarantees the single pte table requirement.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 16:17:52 -07:00
Jan Beulich
08ad8afaa0 x86: reduce force_mwait visibility
It's not used anywhere outside its single referencing file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 15:55:09 -07:00
Jan Beulich
08e1a13e7d x86: reduce forbid_dac's visibility
It's not used anywhere outside its declaring file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 14:39:37 -07:00
Ingo Molnar
9553e11325 Merge branch 'x86/uv' into x86/x2apic
Conflicts:

	arch/x86/kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 23:01:04 +02:00
Ingo Molnar
453c1404c5 Merge branch 'x86/apic' into x86/x2apic
Conflicts:

	arch/x86/kernel/paravirt.c
	arch/x86/kernel/smpboot.c
	arch/x86/kernel/vmi_32.c
	arch/x86/lguest/boot.c
	arch/x86/xen/enlighten.c
	include/asm-x86/apic.h
	include/asm-x86/paravirt.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 23:00:05 +02:00
Ingo Molnar
a208f37a46 Merge branch 'linus' into x86/x2apic 2008-07-18 22:50:34 +02:00
Ingo Molnar
bb2c018b09 Merge branch 'linus' into cpus4096
Conflicts:

	drivers/acpi/processor_throttling.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 22:00:54 +02:00
Ingo Molnar
f6dc8ccaab Merge branch 'linus' into core/generic-dma-coherent
Conflicts:

	kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 21:13:20 +02:00
Jaswinder Singh
6ac8d51f01 x86: introducing asm-x86/traps.h
Declaring x86 traps under one hood.
Declaring x86 do_traps before defining them.

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 18:51:57 +02:00
Ingo Molnar
f1b0c8d3d3 Merge branch 'linus' into x86/amd-iommu 2008-07-18 18:43:08 +02:00
Thomas Petazzoni
9781f39fd2 x86: consolidate the definition of the force_mwait variable
The force_mwait variable iss defined either in
arch/x86/kernel/cpu/amd.c or in arch/x86/kernel/setup_64.c, but it is
only initialized and used in arch/x86/kernel/process.c. This patch
moves the declaration to arch/x86/kernel/process.c.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 18:39:19 +02:00
Herton Ronaldo Krzesinski
723edb5060 Fix typos from signal_32/64.h merge
Fallout from commit 33185c504f ("x86:
merge signal_32/64.h")

Thanks to Dick Streefland who provided an useful testcase on
http://lkml.org/lkml/2008/3/17/205 (only applicable to 2.6.24.x), that
helped a lot as a deterministic way to bisect an issue that leaded to
this fix.

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 17:59:13 +02:00
Russ Anderson
7019cc2dd6 x86 BIOS interface for RTC on SGI UV
Real-time code needs to know the number of cycles per second
on SGI UV.  The information is provided via a run time BIOS
call.  This patch provides the linux side of that interface.
This is the first of several run time BIOS calls to be defined
in uv/bios.h and bios_uv.c.

Note that BIOS_CALL() is just a stub for now.  The bios
side is being worked on.

Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:35:14 +02:00
Alexander van Heukelum
8450e85399 x86, cleanup: fix description of __fls(): __fls(0) is undefined
Ricardo M. Correia spotted that the use of __fls() in fls64() did
not seem to make sense. In fact fls64()'s implementation is fine,
but the description of __fls() was wrong. Fix that.

Reported-by: "Ricardo M. Correia" <Ricardo.M.Correia@Sun.COM>
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:32:38 +02:00
Maciej W. Rozycki
35b680557f x86: more apic debugging
[ mingo@elte.hu: picked up this patch from Maciej, lets make apic=debug
                 print out more info - we had a lot of APIC changes ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:27:51 +02:00
Maciej W. Rozycki
baa1318841 x86: APIC: Make apic_verbosity unsigned
As a microoptimisation, make apic_verbosity unsigned.  This will make
apic_printk(APIC_QUIET, ...) expand into just printk(...) with the
surrounding condition and a reference to apic_verbosity removed.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:27:43 +02:00
Yinghai Lu
1f067167a8 x86: seperate memtest from init_64.c
it's separate functionality that deserves its own file.

This also prepares 32-bit memtest support.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:10:27 +02:00
Harvey Harrison
3217256188 x86: suppress sparse returning void warnings
include/asm/paravirt.h:1404:2: warning: returning void-valued expression
include/asm/paravirt.h:1414:2: warning: returning void-valued expression

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:42:01 +02:00
Ingo Molnar
2fb5e1e101 Merge branch 'linus' into x86/paravirt-spinlocks
Conflicts:

	arch/x86/kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:41:27 +02:00
Sebastian Siewior
2b7207a6b5 ftrace: copy + paste typo in asm/ftrace.h
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <srostedt@redhat.co>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:14:08 +02:00
Maciej W. Rozycki
593f4a788e x86: APIC: remove apic_write_around(); use alternatives
Use alternatives to select the workaround for the 11AP Pentium erratum
for the affected steppings on the fly rather than build time.  Remove the
X86_GOOD_APIC configuration option and replace all the calls to
apic_write_around() with plain apic_write(), protecting accesses to the
ESR as appropriate due to the 3AP Pentium erratum.  Remove
apic_read_around() and all its invocations altogether as not needed.
Remove apic_write_atomic() and all its implementing backends.  The use of
ASM_OUTPUT2() is not strictly needed for input constraints, but I have
used it for readability's sake.

I had the feeling no one else was brave enough to do it, so I went ahead
and here it is.  Verified by checking the generated assembly and tested
with both a 32-bit and a 64-bit configuration, also with the 11AP
"feature" forced on and verified with gdb on /proc/kcore to work as
expected (as an 11AP machines are quite hard to get hands on these days).
Some script complained about the use of "volatile", but apic_write() needs
it for the same reason and is effectively a replacement for writel(), so I
have disregarded it.

I am not sure what the policy wrt defconfig files is, they are generated
and there is risk of a conflict resulting from an unrelated change, so I
have left changes to them out.  The option will get removed from them at
the next run.

Some testing with machines other than mine will be needed to avoid some
stupid mistake, but despite its volume, the change is not really that
intrusive, so I am fairly confident that because it works for me, it will
everywhere.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 12:51:21 +02:00
Ingo Molnar
48ae744434 Merge branch 'linus' into x86/step 2008-07-18 10:14:56 +02:00
Ingo Molnar
393d81aa02 Merge branch 'linus' into xen-64bit 2008-07-17 23:57:20 +02:00
H. Peter Anvin
4fdf08b5bf x86: unify and correct the GDT_ENTRY() macro
Merge the GDT_ENTRY() macro between arch/x86/boot/pm.c and
arch/x86/kernel/acpi/sleep.c and put the new one in
<asm-x86/segment.h>.

While we're at it, correct the bitmasks for the limit and flags.  The
new version relies on using ULL constants in order to cause type
promotion rather than explicit casts; this avoids having to include
<linux/types.h> in <asm-x86/segments.h>.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-17 11:29:24 -07:00
Linus Torvalds
2b04be7e8a Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix asm/e820.h for userspace inclusion
  x86: fix numaq_tsc_disable
  x86: fix kernel_physical_mapping_init() for large x86 systems
2008-07-17 10:38:59 -07:00
Rusty Russell
2567d71cc7 x86: fix asm/e820.h for userspace inclusion
asm-x86/e820.h is included from userspace.  'x86: make e820.c to have
common functions' (b79cd8f126) broke it:

	make -C Documentation/lguest
	cc -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include
lguest.c  -lz -o lguest
	In file included from ../../include/asm-x86/bootparam.h:8,
	                 from lguest.c:45:
	../../include/asm/e820.h:66: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:67: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:68: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:72: error: expected ‘=’, ‘,’, ‘;’, ‘asm’
or ‘__attribute__’ before ‘e820_update_range’
	...

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-17 19:28:48 +02:00
Linus Torvalds
dc7c65db28 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
  Revert "x86/PCI: ACPI based PCI gap calculation"
  PCI: remove unnecessary volatile in PCIe hotplug struct controller
  x86/PCI: ACPI based PCI gap calculation
  PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
  PCI PM: Fix pci_prepare_to_sleep
  x86/PCI: Fix PCI config space for domains > 0
  Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
  PCI: Simplify PCI device PM code
  PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
  PCI ACPI: Rework PCI handling of wake-up
  ACPI: Introduce new device wakeup flag 'prepared'
  ACPI: Introduce acpi_device_sleep_wake function
  PCI: rework pci_set_power_state function to call platform first
  PCI: Introduce platform_pci_power_manageable function
  ACPI: Introduce acpi_bus_power_manageable function
  PCI: make pci_name use dev_name
  PCI: handle pci_name() being const
  PCI: add stub for pci_set_consistent_dma_mask()
  PCI: remove unused arch pcibios_update_resource() functions
  PCI: fix pci_setup_device()'s sprinting into a const buffer
  ...

Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually.
2008-07-16 17:25:46 -07:00
Zhao Yakui
da5e09a1b3 ACPI : Create "idle=nomwait" bootparam
"idle=nomwait" disables the use of the MWAIT
instruction from both C1 (C1_FFH) and deeper (C2C3_FFH)
C-states.

When MWAIT is unavailable, the BIOS and OS generally
negotiate to use the HALT instruction for C1,
and use IO accesses for deeper C-states.

This option is useful for power and performance
comparisons, and also to work around BIOS bugs
where broken MWAIT support is advertised.

http://bugzilla.kernel.org/show_bug.cgi?id=10807
http://bugzilla.kernel.org/show_bug.cgi?id=10914

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Zhao Yakui
c1e3b377ad ACPI: Create "idle=halt" bootparam
"idle=halt" limits the idle loop to using
the halt instruction.  No MWAIT, no IO accesses,
no C-states deeper than C1.

If something is broken in the idle code,
"idle=halt" is a less severe workaround
than "idle=poll" which disables all power savings.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Roland McGrath
380fdd7585 x86 ptrace: user-sets-TF nits
This closes some arcane holes in single-step handling that can arise
only when user programs set TF directly (via popf or sigreturn) and
then use vDSO (syscall/sysenter) system call entry.  In those entry
paths, the clear_TF_reenable case hits and we must check TIF_SINGLESTEP
to be sure our bookkeeping stays correct wrt the user's view of TF.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:17 -07:00
Roland McGrath
d4d6715016 x86 ptrace: unify syscall tracing
This unifies and cleans up the syscall tracing code on i386 and x86_64.

Using a single function for entry and exit tracing on 32-bit made the
do_syscall_trace() into some terrible spaghetti.  The logic is clear and
simple using separate syscall_trace_enter() and syscall_trace_leave()
functions as on 64-bit.

The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
tracing either 32-bit or 64-bit tasks.  It behaves just like 32-bit.

Changing syscall_trace_enter() to return the syscall number shortens
all the assembly paths, while adding the SYSEMU feature in a simple way.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:17 -07:00
Roland McGrath
64f0973319 x86 ptrace: unify TIF_SINGLESTEP
This unifies the treatment of TIF_SINGLESTEP on i386 and x86_64.
The bit is now excluded from _TIF_WORK_MASK on i386 as it has been
on x86_64.  This means the do_notify_resume() path using it is never
used, so TIF_SINGLESTEP is not cleared on returning to user mode.

Both now leave TIF_SINGLESTEP set when returning to user, so that
it's already set on an int $0x80 system call entry.  This removes
the need for testing TF on the system_call path.  Doing it this way
fixes the regression for PTRACE_SINGLESTEP into a sigreturn syscall,
introduced by commit 1e2e99f0e4.

The clear_TF_reenable case that sets TIF_SINGLESTEP can only happen
on a non-exception kernel entry, i.e. sysenter/syscall instruction.
That will always get to the syscall exit tracing path.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:16 -07:00
Ingo Molnar
4bb689eee1 x86: paravirt spinlocks, !CONFIG_SMP build fixes
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge
2d9e1e2f58 xen: implement Xen-specific spinlocks
The standard ticket spinlocks are very expensive in a virtual
environment, because their performance depends on Xen's scheduler
giving vcpus time in the order that they're supposed to take the
spinlock.

This implements a Xen-specific spinlock, which should be much more
efficient.

The fast-path is essentially the old Linux-x86 locks, using a single
lock byte.  The locker decrements the byte; if the result is 0, then
they have the lock.  If the lock is negative, then locker must spin
until the lock is positive again.

When there's contention, the locker spin for 2^16[*] iterations waiting
to get the lock.  If it fails to get the lock in that time, it adds
itself to the contention count in the lock and blocks on a per-cpu
event channel.

When unlocking the spinlock, the locker looks to see if there's anyone
blocked waiting for the lock by checking for a non-zero waiter count.
If there's a waiter, it traverses the per-cpu "lock_spinners"
variable, which contains which lock each CPU is waiting on.  It picks
one CPU waiting on the lock and sends it an event to wake it up.

This allows efficient fast-path spinlock operation, while allowing
spinning vcpus to give up their processor time while waiting for a
contended lock.

[*] 2^16 iterations is threshold at which 98% locks have been taken
according to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around".  Therefore, we'd expect the lock and unlock slow
paths will only be entered 2% of the time.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge
8efcbab674 paravirt: introduce a "lock-byte" spinlock implementation
Implement a version of the old spinlock algorithm, in which everyone
spins waiting for a lock byte.  In order to be compatible with the
ticket-lock's use of a zero initializer, this uses the convention of
'0' for unlocked and '1' for locked.

This algorithm is much better than ticket locks in a virtual
envionment, because it doesn't interact badly with the vcpu scheduler.
If there are multiple vcpus spinning on a lock and the lock is
released, the next vcpu to be scheduled will take the lock, rather
than cycling around until the next ticketed vcpu gets it.

To use this, you must call paravirt_use_bytelocks() very early, before
any spinlocks have been taken.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge
74d4affde8 x86/paravirt: add hooks for spinlock operations
Ticket spinlocks have absolutely ghastly worst-case performance
characteristics in a virtual environment.  If there is any contention
for physical CPUs (ie, there are more runnable vcpus than cpus), then
ticket locks can cause the system to end up spending 90+% of its time
spinning.

The problem is that (v)cpus waiting on a ticket spinlock will be
granted access to the lock in strict order they got their tickets.  If
the hypervisor scheduler doesn't give the vcpus time in that order,
they will burn timeslices waiting for the scheduler to give the right
vcpu some time.  In the worst case it could take O(n^2) vcpu scheduler
timeslices for everyone waiting on the lock to get it, not counting
new cpus trying to take the lock while the log-jam is sorted out.

These hooks allow a paravirt backend to replace the spinlock
implementation.

At the very least, this could revert the implementation back to the
old lock algorithm, which allows the next scheduled vcpu to take the
lock, and has basically fairly good performance.

It also allows the spinlocks to take advantages of the hypervisor
features to make locks more efficient (spin and block, for example).

The cost to native execution is an extra direct call when using a
spinlock function.  There's no overhead if CONFIG_PARAVIRT is turned
off.

The lock structure is fixed at a single "unsigned int", initialized to
zero, but the spinlock implementation can use it as it wishes.

Thanks to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around" for pointing out this problem.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:52 +02:00
Jeremy Fitzhardinge
6a52e4b1cd x86_64: further cleanup of 32-bit compat syscall mechanisms
AMD only supports "syscall" from 32-bit compat usermode.
Intel and Centaur(?) only support "sysenter" from 32-bit compat usermode.

Set the X86 feature bits accordingly, and set up the vdso in
accordance with those bits.  On the offchance we run on in a 64-bit
environment which supports neither syscall nor sysenter from 32-bit
mode, then fall back to the int $0x80 vdso.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-16 11:08:27 +02:00
Jeremy Fitzhardinge
c24481e9da xen64: save lots of registers
The Xen hypercall interface is allowed to trash any or all of the
argument registers, so we need to be careful that the kernel state
isn't damaged.  On 32-bit kernels, the hypercall parameter registers
same as a regparm function call, so we've got away without explicit
clobbering so far.  The 64-bit ABI defines lots of caller-save
registers, so save them all for safety.  We can trim this set later by
re-distributing the responsibility for saving all these registers.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:05:23 +02:00
Jeremy Fitzhardinge
c05f1cfaba xen64: implement 64-bit update_descriptor
64-bit hypercall interface can pass a maddr in one argument rather
than splitting it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:05:09 +02:00
Eduardo Habkost
45eb0d8898 Xen64: HYPERVISOR_set_segment_base() implementation
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:03:31 +02:00
Jeremy Fitzhardinge
88459d4c7e xen64: register callbacks in arch-independent way
Use callback_op hypercall to register callbacks in a 32/64-bit
independent way (64-bit doesn't need a code segment, but that detail
is hidden in XEN_CALLBACK).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:03:01 +02:00
Jeremy Fitzhardinge
ce803e705f xen64: use arbitrary_virt_to_machine for xen_set_pmd
When building initial pagetables in 64-bit kernel the pud/pmd pointer may
be in ioremap/fixmap space, so we need to walk the pagetable to look up the
physical address.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:01:17 +02:00
Jeremy Fitzhardinge
084a2a4e76 xen64: early mapping setup
Set up the initial pagetables to map the kernel mapping into the
physical mapping space.  This makes __va() usable, since it requires
physical mappings.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:00:07 +02:00
Jeremy Fitzhardinge
5b09b2876e x86_64: add workaround for no %gs-based percpu
As a stopgap until Mike Travis's x86-64 gs-based percpu patches are
ready, provide workaround functions for x86_read/write_percpu for
Xen's use.

Specifically, this means that we can't really make use of vcpu
placement, because we can't use a single gs-based memory access to get
to vcpu fields.  So disable all that for now.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:58:13 +02:00
Jeremy Fitzhardinge
f6e587325b xen64: add extra pv_mmu_ops
We need extra pv_mmu_ops for 64-bit, to deal with the extra level of
pagetable.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:57:16 +02:00
Jeremy Fitzhardinge
e74359028d xen64: fix calls into hypercall page
The 64-bit calling convention for hypercalls uses different registers
from 32-bit.  Annoyingly, gcc's asm syntax doesn't have a way to
specify one of the extra numeric reigisters in a constraint, so we
must use explicitly placed register variables.  Given that we have to
do it for some args, may as well do it for all.

Also fix syntax gcc generates for the call instruction itself.  We
need a plain direct call, but the asm expansion which works on 32-bit
generates a rip-relative addressing mode in 64-bit, which is treated
as an indirect call.  The alternative is to pass the hypercall page
offset into the asm, and have it add it to the hypercall page start
address to generate the call.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:57:00 +02:00
Jeremy Fitzhardinge
ca15f20f11 xen: fix 64-bit hypercall variants
64-bit guests can pass 64-bit quantities in a single argument,
so fix up the hypercalls.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:56:46 +02:00
Jeremy Fitzhardinge
48b5db2062 xen64: define asm/xen/interface for 64-bit
Copy 64-bit definitions of various interface structures into place.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:56:18 +02:00
Jeremy Fitzhardinge
7c33b1e6ee x86_64: unstatic get_local_pda
This allows Xen's xen_cpu_up() to allocate a pda for the new CPU.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:55:07 +02:00
Eduardo Habkost
a312b37b2a x86/paravirt: call paravirt_pagetable_setup_{start, done}
Call paravirt_pagetable_setup_{start,done}

These paravirt_ops functions were not being called on x86_64.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:53:43 +02:00
Ingo Molnar
82638844d9 Merge branch 'linus' into cpus4096
Conflicts:

	arch/x86/xen/smp.c
	kernel/sched_rt.c
	net/iucv/iucv.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 00:29:07 +02:00
Ingo Molnar
1a781a777b Merge branch 'generic-ipi' into generic-ipi-for-linus
Conflicts:

	arch/powerpc/Kconfig
	arch/s390/kernel/time.c
	arch/x86/kernel/apic_32.c
	arch/x86/kernel/cpu/perfctr-watchdog.c
	arch/x86/kernel/i8259_64.c
	arch/x86/kernel/ldt.c
	arch/x86/kernel/nmi_64.c
	arch/x86/kernel/smpboot.c
	arch/x86/xen/smp.c
	include/asm-x86/hw_irq_32.h
	include/asm-x86/hw_irq_64.h
	include/asm-x86/mach-default/irq_vectors.h
	include/asm-x86/mach-voyager/irq_vectors.h
	include/asm-x86/smp.h
	kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-15 21:55:59 +02:00
Kevin Winchester
3f1c38723e x86: Fix compile error with CONFIG_AS_CFI=n
AS      arch/x86/lib/csum-copy_64.o
arch/x86/lib/csum-copy_64.S: Assembler messages:
arch/x86/lib/csum-copy_64.S:48: Error: Macro `ignore' was already defined
make[1]: *** [arch/x86/lib/csum-copy_64.o] Error 1
make: *** [arch/x86/lib] Error 2

It appears that csum-copy_64.S and dwarf2.h both define an ignore macro. 
I would expect one of them can be renamed quite easily, unless they 
are references elsewhere. 

Caused-by-commit: 392a0fc96b
    x86: merge dwarf2 headers

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-15 15:30:29 +02:00
Ingo Molnar
5806b81ac1 Merge branch 'auto-ftrace-next' into tracing/for-linus
Conflicts:

	arch/x86/kernel/entry_32.S
	arch/x86/kernel/process_32.c
	arch/x86/kernel/process_64.c
	arch/x86/lib/Makefile
	include/asm-x86/irqflags.h
	kernel/Makefile
	kernel/sched.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-14 16:11:52 +02:00
Ingo Molnar
d59fdcf2ac Merge commit 'v2.6.26' into x86/core 2008-07-14 11:37:46 +02:00
Yinghai Lu
2387ce57a8 x86: make 64bit hpet_set_mapping to use ioremap too, v2
keep the one for VSYSCALL_HPET

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-14 09:24:17 +02:00
Yinghai Lu
94a8c3c243 x86: let 32bit use apic_ops too - fix
fix for pv - clean up the namespace there too.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-14 09:02:54 +02:00
Mike Travis
11369f356b x86: change _node_to_cpumask_ptr to return const ptr
* Strengthen the return type for the _node_to_cpumask_ptr to be
    a const pointer.  This adds compiler checking to insure that
    node_to_cpumask_map[] is not changed inadvertently.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Acked-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-13 19:11:58 +02:00
Maciej W. Rozycki
ce8b06b985 x86: I/O APIC: remove an IRQ2-mask hack
Now that IRQ2 is never made available to the I/O APIC, there is no need
to special-case it and mask as a workaround for broken systems.  Actually,
because of the former, mask_IO_APIC_irq(2) is a no-op already.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-13 11:43:48 +02:00
Yinghai Lu
f910a9dc7c x86: make 64bit have get_apic_id
generalize the x2apic code some more.

let read_apic_id become a macro (later on a function/inline)
GET_APIC_ID(apic_read(APIC_ID))

  +#define read_apic_id()  (GET_APIC_ID(apic_read(APIC_ID)))

instead of this weird construct:

  -#define read_apic_id  (genapic->read_apic_id)

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-13 08:27:46 +02:00
Yinghai Lu
3d88cca708 x86: fix numaq_tsc_disable calling
got this on a test-system:

 calling  numaq_tsc_disable+0x0/0x39
 NUMAQ: disabling TSC
 initcall numaq_tsc_disable+0x0/0x39 returned 0 after 0 msecs

that's because we should not be using arch_initcall to call numaq_tsc_disable.

need to call it in setup_arch before time_init()/tsc_init()
and call it in init_intel() to make the cpu feature bits right.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-13 08:19:45 +02:00
Linus Torvalds
9df2fe9867 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix ldt limit for 64 bit
2008-07-12 14:34:31 -07:00
Yinghai Lu
4c9961d56e x86: make read_apic_id return final apicid
also remove GET_APIC_ID when read_apic_id is used.

need to apply after
	[PATCH] x86: mach_apicdef.h need to include before smp.h

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:14 +02:00
Yinghai Lu
c535b6a1a6 x86: let 32bit use apic_ops too
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:12 +02:00
Suresh Siddha
ad66dd340f x2apic: xen64 paravirt basic apic ops
Define the Xen specific basic apic ops, in additon to paravirt apic ops,
with some misc warning fixes.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:10 +02:00
Suresh Siddha
2d9579a124 x64, x2apic/intr-remap: support for x2apic physical mode support
x2apic Physical mode  support. By default we will use x2apic cluster mode.
x2apic physical mode can be selected using "x2apic_phys" boot parameter.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:07 +02:00
Suresh Siddha
6e1cb38a2a x64, x2apic/intr-remap: add x2apic support, including enabling interrupt-remapping
x2apic support.  Interrupt-remapping must be enabled before enabling x2apic,
this is needed to ensure that IO interrupts continue to work properly after the
cpu mode is changed to x2apic(which uses 32bit extended physical/cluster
apic id).

On systems where apicid's are > 255, BIOS can handover the control to OS in
x2apic mode. Or if the OS handover was in legacy xapic mode, check
if it is capable of x2apic mode. And if we succeed in enabling
Interrupt-remapping, then we can enable x2apic mode in the CPU.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:06 +02:00
Suresh Siddha
75c46fa61b x64, x2apic/intr-remap: MSI and MSI-X support for interrupt remapping infrastructure
MSI and MSI-X support for interrupt remapping infrastructure.

MSI address register will be programmed with interrupt-remapping table
entry(IRTE) index and the IRTE will contain information about the vector,
cpu destination, etc.

For MSI-X, all the IRTE's will be consecutively allocated in the table,
and the address registers will contain the starting index to the block
and the data register will contain the subindex with in that block.

This also introduces a new irq_chip for cleaner irq migration (in the process
context as opposed to the current irq migration in the context of an interrupt.
interrupt-remapping infrastructure will help us achieve this).

As MSI is edge triggered, irq migration is a simple atomic update(of vector
and cpu destination) of IRTE and flushing the hardware cache.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:05 +02:00
Suresh Siddha
89027d35aa x64, x2apic/intr-remap: IO-APIC support for interrupt-remapping
IO-APIC support in the presence of interrupt-remapping infrastructure.

IO-APIC RTE will be programmed with interrupt-remapping table entry(IRTE)
index and the IRTE will contain information about the vector, cpu destination,
trigger mode etc, which traditionally was present in the IO-APIC RTE.

Introduce a new irq_chip for cleaner irq migration (in the process
context as opposed to the current irq migration in the context of an interrupt.
interrupt-remapping infrastructure will help us achieve this cleanly).

For edge triggered, irq migration is a simple atomic update(of vector
and cpu destination) of IRTE and flush the hardware cache.

For level triggered, we need to modify the io-apic RTE aswell with the update
vector information, along with modifying IRTE with vector and cpu destination.
So irq migration for level triggered is little  bit more complex compared to
edge triggered migration. But the good news is, we use the same algorithm
for level triggered migration as we have today, only difference being,
we now initiate the irq migration from process context instead of the
interrupt context.

In future, when we do a directed EOI (combined with cpu EOI broadcast
suppression) to the IO-APIC, level triggered irq migration will also be
as simple as edge triggered migration and we can do the irq migration
with a simple atomic update to IO-APIC RTE.

TBD: some tests/changes needed in the presence of fixup_irqs() for
level triggered irq migration.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:05 +02:00
Suresh Siddha
12a67cf685 x64, x2apic/intr-remap: x2apic cluster mode support
x2apic cluster mode support.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:03 +02:00
Suresh Siddha
cff73a6ffa x64, x2apic/intr-remap: introcude self IPI to genapic routines
Introduce self IPI op for genapic.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:02 +02:00
Suresh Siddha
13c88fb58d x64, x2apic/intr-remap: x2apic ops for x2apic mode support
x2apic ops for x2apic mode support. This uses MSR interface and differs
slightly from the xapic register layout.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:01 +02:00
Suresh Siddha
32e1d0a065 x64, x2apic/intr-remap: cpuid bits for x2apic feature
cpuid feature for x2apic.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:45:00 +02:00
Suresh Siddha
1b374e4d6f x64, x2apic/intr-remap: basic apic ops support
Introduce basic apic operations which handle the apic programming. This
will be used later to introduce another specific operations for x2apic.

For the perfomance critial accesses like IPI's, EOI etc, we use the
native operations as they are already referenced by different
indirections like genapic, irq_chip etc.

64bit Paravirt ops can also define their apic operations accordingly.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:59 +02:00
Suresh Siddha
0c81c746f9 x64, x2apic/intr-remap: introduce read_apic_id() to genapic routines
Move the read_apic_id()  to genapic routines.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:57 +02:00
Suresh Siddha
4dc2f96cac x64, x2apic/intr-remap: ioapic routines which deal with initial io-apic RTE setup
Generic ioapic specific routines which be used later during enabling
interrupt-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:56 +02:00
Suresh Siddha
d94d93ca5c x64, x2apic/intr-remap: 8259 specific mask/unmask routines
8259 specific mask/unmask routines which be used later while enabling
interrupt-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: andi@firstfloor.org
Cc: ebiederm@xmission.com
Cc: jbarnes@virtuousgeek.org
Cc: steiner@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 08:44:55 +02:00
Michael Karcher
5ac37f87ff x86: fix ldt limit for 64 bit
Fix size of LDT entries. On x86-64, ldt_desc is a double-sized descriptor.

Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-12 07:11:31 +02:00
Ingo Molnar
6c82a000a2 Merge branch 'x86/generalize-visws' into x86/core 2008-07-11 21:22:18 +02:00
Glauber Costa
392a0fc96b x86: merge dwarf2 headers
Merge dwarf2_32.h and dwarf2_64.h into dwarf2.h.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 20:49:39 +02:00
Glauber Costa
d73a731abe x86: use AS_CFI instead of UNWIND_INFO
In dwarf2_32.h, test for CONFIG_AS_CFI instead of
CONFIG_UNWIND_INFO. Turns out that searching for UNWIND_INFO
returns no match in any Kconfig or Makefile, so we're really
just throwing everything away regarding dwarf frames for i386.

The test that generates CONFIG_AS_CFI does not have anything
x86_64-specific, and right now, checking V=1 builds shows me
that the flags is there anyway, although unused.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 20:49:35 +02:00
Glauber Costa
70f1bba4c8 x86: use ignore macro instead of hash comment
In dwarf_64.h header, use the "ignore" macro the way
i386 does.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 20:49:32 +02:00
Ingo Molnar
d9fc3fd3fa x86: fix savesegment() bug causing crashes on 64-bit
i spent a fair amount of time chasing a 64-bit bootup crash that manifested
itself as bootup segfaults:

  S10network[1825]: segfault at 7f3e2b5d16b8 ip 00000031108748c9 sp 00007fffb9c14c70 error 4 in libc-2.7.so[3110800000+14d000]

eventually causing init to die and panic the system:

  Kernel panic - not syncing: Attempted to kill init!
  Pid: 1, comm: init Not tainted 2.6.26-rc9-tip #13878

after a maratonic bisection session, the bad commit turned out to be:

| b7675791859075418199c7af86a116ea34eaf5bd is first bad commit
| commit b7675791859075418199c7af86a116ea34eaf5bd
| Author: Jeremy Fitzhardinge <jeremy@goop.org>
| Date:   Wed Jun 25 00:19:00 2008 -0400
|
|     x86: remove open-coded save/load segment operations
|
|     This removes a pile of buggy open-coded implementations of savesegment
|     and loadsegment.

after some more bisection of this patch itself, it turns out that what
makes the difference are the savesegment() changes to __switch_to().

Taking a look at this portion of arch/x86/kernel/process_64.o revealed
this crutial difference:

| good:    99c:       8c e0                   mov    %fs,%eax
|          99e:       89 45 cc                mov    %eax,-0x34(%rbp)
|
| bad:     99c:       8c 65 cc                mov    %fs,-0x34(%rbp)

which is due to:

|                 unsigned fsindex;
| -               asm volatile("movl %%fs,%0" : "=r" (fsindex));
| +               savesegment(fs, fsindex);

savesegment() is implemented as:

 #define savesegment(seg, value)                                \
          asm("mov %%" #seg ",%0":"=rm" (value) : : "memory")

note the "m" modifier - it allows GCC to generate the segment move
into a memory operand as well.

But regarding segment operands there's a subtle detail in the x86
instruction set: the above 16-bit moves are zero-extend, but only
if it goes to a register.

If it goes to a memory operand, -0x34(%rbp) in the above case, there's
no zero-extend to 32-bit and the instruction will only save 16 bits
instead of the intended 32-bit.

The other 16 bits is random data - which can cause problems when that
value is used later on.

The solution is to only allow segment operands to go to registers.
This fix allows my test-system to boot up without crashing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 19:51:47 +02:00
Joerg Roedel
d591b0a3ae x86, AMD IOMMU: replace DEVID macro with a function
This patch replaces the DEVID macro with a function and uses them where
apropriate (also in the core code).

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 18:01:18 +02:00
Joerg Roedel
83f5aac18c x86, AMD IOMMU: fix device table entry size
A device table entry is actually only 256 *bits* large. Not 256 bytes.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 18:01:17 +02:00
Joerg Roedel
8ea80d783e x86, AMD IOMMU: replace HIGH_U32 macro with upper_32_bits function
Removes a driver specific macro and replaces it with a generic function already
available in Linux.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 18:01:11 +02:00
Joerg Roedel
5694703f14 x86, AMD IOMMU: add comments to amd_iommu_types.h
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 18:01:09 +02:00
Jeremy Fitzhardinge
8d28aab59f x86_64: add pseudo-features for 32-bit compat syscall
Add pseudo-feature bits to describe whether the CPU supports sysenter
and/or syscall from ia32-compat userspace.  This removes a hardcoded
test in vdso32-setup.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 15:44:57 +02:00
FUJITA Tomonori
be54f9d1c8 x86: remove ifdef CONFIG_SWIOTLB in pci-dma.c
As other IOMMUs do, this puts dummy pci_swiotlb_init() in swiotlb.h
and remove ifdef CONFIG_SWIOTLB in pci-dma.c.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 11:00:55 +02:00
FUJITA Tomonori
ac7ded2adb x86: remove ifdef CONFIG_GART_IOMMU in pci-dma.c
Our way to handle gart_* functions for CONFIG_GART_IOMMU and
!CONFIG_GART_IOMMU cases is inconsistent.

We have some dummy gart_* functions in !CONFIG_GART_IOMMU case and
also use ifdef CONFIG_GART_IOMMU tricks in pci-dma.c to call some
gart_* functions in only CONFIG_GART_IOMMU case.

This patch removes ifdef CONFIG_GART_IOMMU in pci-dma.c and always use
dummy gart_* functions in iommu.h.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 11:00:54 +02:00
FUJITA Tomonori
46a7fa270a x86: make only GART code include gart.h
gart.h has only GART-specific stuff. Only GART code needs it. Other
IOMMU stuff should include iommu.h instead of gart.h.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 11:00:54 +02:00
Yinghai Lu
f361a450bf x86: introduce max_low_pfn_mapped for 64-bit
when more than 4g memory is installed, don't map the big hole below 4g.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-11 10:24:04 +02:00
Ingo Molnar
8bfaba873f x86, VisWS: turn into generic arch, add NR_IRQS quirk
NR_IRQS: let VISWS be just a sub-case of the generic code.

This can create a somewhat larger irq_desc[] array if NR_CPUS is high
but that should not worry VisWS which has 4 CPUs at most.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:41 +02:00
Ingo Molnar
4191894b68 x86, VisWS: turn into generic arch, eliminate include/asm-x86/mach-visws/setup_arch.h
use the generic version of setup_arch.h - it's the same.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:40 +02:00
Ingo Molnar
efd746b889 x86, VisWS: turn into generic arch, move definitions
move the SGIVW definitions from setup_arch.h into its own header file.

preparation for turning VISWS into a generic PC architecture.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:40 +02:00
Ingo Molnar
b4b8641671 x86, VisWS: turn into generic arch, create include/asm-x86/visws/
move the include/asm-x86/mach-visws/ VISWS specific hardware
details include files into include/asm-x86/visws, to be used from
generic code.

No code changed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:39 +02:00
Ingo Molnar
18c413e27e x86, VisWS: turn into generic arch, eliminate asm-x86/mach-visws/mach_apicdef.h
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:38 +02:00
Ingo Molnar
b70d9c2473 x86, VisWS: turn into generic arch, update asm-x86/mach-visws/mach_apicdef.h
update asm-x86/mach-visws/mach_apicdef.h to the generic version.

This should work fine as VISWS has a standard local APIC and thus
its mach_apicdef.h copy is just an ancient version of the generic code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:37 +02:00
Ingo Molnar
6649ababd5 x86, VisWS: turn into generic arch, eliminate include/asm-x86/mach-visws/smpboot_hooks.h
now that include/asm-x86/mach-visws/smpboot_hooks.h equals
to the default file in ../mach-default/smpboot_hooks.h, simply
include it instead of maintaining a copy.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:36 +02:00
Ingo Molnar
244a5e2ee0 x86, VisWS: turn into generic arch, update include/asm-x86/mach-visws/smpboot_hooks.h
update include/asm-x86/mach-visws/smpboot_hooks.h to
include/asm-x86/mach-default/smpboot_hooks.h (the generic version).

this _should_ work, because VISWS sets skip_ioapic_setup, but it
should be tested on a real VISWS to make sure.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:36 +02:00
Ingo Molnar
6a64b5da9b x86, VisWS: turn into generic arch, enhance include/asm-x86/mach-default/smpboot_hooks.h
Allow the generic smpboot quirks code to be built with
ONFIG_X86_IO_APIC disabled. This way VISWS will be able
to use it as-is.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:35 +02:00
Ingo Molnar
d8d1bc73f0 x86, VisWS: turn into generic arch, eliminate include/asm-x86/mach-visws/mach_apic.h
now that include/asm-x86/mach-visws/mach_apic.h equals
to include/asm-x86/mach-default/mach_apic.h, simply start
using the generic one.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:34 +02:00
Ingo Molnar
808fc878d3 x86, VisWS: turn into generic arch, update asm-x86/mach-visws/mach_apic.h
update asm-x86/mach-visws/mach_apic.h to the generic version.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:34 +02:00
Ingo Molnar
5548ed1135 x86, VisWS: turn into generic arch, install proper PCI quirk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:33 +02:00
Ingo Molnar
3b33553bad x86: add early quirk support
Add early quirks support.

In preparation of enabling the generic architecture to boot on a VISWS.

This will allow us to remove the VISWS subarch and all its complications.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 18:55:31 +02:00
Ingo Molnar
42a2f217a5 x86, visws: use mach-default/entry_arch.h
mach-default/entry_arch.h is exactly the same file as
mach-visws/entry_arch.h, so include the first from the second,
so that updates to the generic one get picked up by VISWS as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 12:35:50 +02:00
Ingo Molnar
ca201c8230 x86, visws: fix generic-ipi build
fix:

 arch/x86/kernel/built-in.o: In function `smp_intr_init':
 (.init.text+0x49e2): undefined reference to `call_function_single_interrupt'

Caused by include/asm-x86/mach-visws/entry_arch.h getting out of sync
with the include/asm-x86/mach-default/entry_arch.h file it derives from.

Copy the default file over - next step will be to simply include the default
file.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 12:35:46 +02:00
Ingo Molnar
5373fdbdc1 Merge branch 'tracing/mmiotrace' into auto-ftrace-next 2008-07-10 11:43:06 +02:00
Ingo Molnar
bac0c9103b Merge branch 'tracing/ftrace' into auto-ftrace-next 2008-07-10 11:43:00 +02:00
FUJITA Tomonori
e93be88d7e x86: clean up iommu.h and gart.h
This remove lots of duplications in iommu.h and gart.h.

The end result of this patch is:

- iommu.h is a header file for everyone related with IOMMUs.

- gart.h is the private header file. Only pci-gart_64.c and its friends
  include it.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: fujita.tomonori@lab.ntt.co.jp
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 10:43:26 +02:00
Ingo Molnar
849567b936 Merge branch 'x86/gart' into x86/core 2008-07-10 10:43:11 +02:00
Yinghai Lu
f34fa82b19 x86, acpi: merge __acpi_map_table
and let 64-bit to fall back to use fixmap too.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 09:01:25 +02:00
Ingo Molnar
4fb3dc2729 x86: build fix for "x86: fix C1E && nx6325 stability problem"
fix:

 arch/x86/kernel/built-in.o: In function `dmi_ignore_irq0_timer_override':
 boot.c:(.init.text+0x3ea4): undefined reference to `force_mask_ioapic_irq_2'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-10 08:09:08 +02:00
Vitaly Mayatskikh
1129585a08 x86: introduce copy_user_handle_tail() routine
Introduce generic C routine for handling necessary tail operations after
protection fault in copy_*_user on x86.

Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 15:51:03 +02:00
Ingo Molnar
de989ef093 Merge branch 'x86/unify-lib' into x86/core 2008-07-09 15:00:48 +02:00
Rafael J. Wysocki
e2079c4386 x86: fix C1E && nx6325 stability problem
The problems are that, with the ACPI vs timer overring issue _fixed_,
after using the box for some time (between several seconds and 1 hour, at
random) processes get very high CPU loads (once I've got X using 107% of
the CPU, for example) and the system becomes unresponsive, as though there
were interrupts lost or something similar.

Andreas Herrman reproduced similar problems:

> Ok, now I've reproduced the stability problem.
> - Using tip/master,
> - reverting e38502eb8aa82314d5ab0eba45f50e6790dadd88 and
> - applying your patch from this posting
>   http://marc.info/?l=linux-kernel&m=121539354224562&w=4
>
> Starting X, firefox, gimp, tuxpaint and doing some drawing in tuxpaint
> results in a slow system. Drawing is almost not possible anymore --
> Selections of new colors, cursors etc. is performed with huge delay
> if it's performed at all.
>
> BTW, the code sets up timer IRQ as Virtual Wire IRQ:
>
> Jul  8 14:57:58 kodscha IO-APIC (apicid-pin) 2-22, 2-23 not connected.
> Jul  8 14:57:58 kodscha ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> Jul  8 14:57:58 kodscha ...trying to set up timer as Virtual Wire IRQ... works.
>
> and both INT0 and INT2 of IOAPIC are masked:
>
> Jul  8 14:57:58 kodscha NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
> Jul  8 14:57:58 kodscha 00 000 1    0    0   0   0    0    0    00
> Jul  8 14:57:58 kodscha 01 003 0    0    0   0   0    1    1    31
> Jul  8 14:57:58 kodscha 02 003 1    0    0   0   0    0    0    30
>
> I've also seen strange CPU utilization -- with syslog-ng:
>
> top - 15:33:06 up 35 min,  4 users,  load average: 1.70, 0.68, 0.37
> Tasks:  64 total,   4 running,  60 sleeping,   0 stopped,   0 zombie
> Cpu0  :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu1  :  6.4%us, 87.2%sy,  0.0%ni,  5.8%id,  0.0%wa,  0.6%hi,  0.0%si,  0.0%st
> Mem:    895384k total,   283568k used,   611816k free,    35492k buffers
> Swap:  1959920k total,        0k used,  1959920k free,   163044k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4632 root      20   0 17216  800  580 S  104  0.1   0:34.22 syslog-ng
> 28505 root      20   0  205m  11m 4024 S    6  1.3   0:21.16 X
> 28518 root      20   0 56292 5652 4492 S    1  0.6   0:01.80 fluxbox
>     1 root      20   0  3724  608  508 S    0  0.1   0:00.36 init
>
> So far I have no clue why C1E-idle in conjunction with virtual wire
> mode causes this strange behaviour.
>
> ... and I start to think about the root cause of all this.
>
> I've performed similar tests under X with the IRQ0/INT0 configuration and
> I did not see above symptoms.

So lets fall back to the IRQ0/INT0 configuration on this box.

This basically restores the dont-use-the-lapic-timer exception mechanism
that was unconditional on this box prior commit 8750bf5 ("x86: add C1E
aware idle function").

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 11:03:21 +02:00
FUJITA Tomonori
8cbfd4f44d x86, iommu: replace CONFIG_IOMMU with CONFIG_GART_IOMMU in iommu.h
Hmm, looks like it would be nice to have more cleanups of iommu.h and
gart.h.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 10:53:13 +02:00
Yinghai Lu
2dc807b37b x86: make max_pfn cover acpi table below 4g
When system have 4g less ram installed, and acpi table sit
near end of ram, make max_pfn cover them too,
so 64bit kernel don't need to mess up fixmap.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "Suresh Siddha" <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 10:43:25 +02:00
Glauber Costa
22cac16707 x86: define architectural characteristics in uaccess.h.
Remove them from the arch-specific file.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:29 +02:00
Glauber Costa
8bc7de0c5d x86: put movsl_mask into uaccess.h.
x86_64 does not need it, but it won't have X86_INTEL_USERCOPY
defined either.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:28 +02:00
Glauber Costa
8cb834e99f x86: move __get_user and __put_user into uaccess.h.
We also carry the unaligned version with us. Only x86_64 uses
it, but there's no problem in defining it.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:27 +02:00
Glauber Costa
e30a44fdbf x86: merge put_user.
Move both versions, which are highly similar, to uaccess.h.
Note that, for x86_64, X86_WP_WORKS_OK is always defined.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:26 +02:00
Glauber Costa
ec840956d2 x86: turn __put_user_check directly into put_user.
We also check user pointer in x86_64 put_user, the way i386 does.

In a separate patch for bisecting purposes.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:26 +02:00
Glauber Costa
64a978c1e3 x86: be more explicit in __put_user_x.
For both __put_user_x and __put_user_8 macros, pass the error
variable explicitly.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:25 +02:00
Glauber Costa
3f16822116 x86: merge __get_user_asm and its users.
Move __get_user_asm and __get_user_size and __get_user_nocheck
to uaccess.h. This requires us to define a macro at __get_user_size
for the 64-bit access case.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:24 +02:00
Glauber Costa
6fd4725d75 x86: don't always use EFAULT on __get_user_size.
Let the user of the macro specify the desired return.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:23 +02:00
Glauber Costa
dc70ddf409 x86: merge __put_user_asm and its user.
Move both __put_user_asm and __put_user_size to
uaccess.h. i386 already had a special function for 64-bit access,
so for x86_64, we just define a macro with the same name.
Note that for X86_64, CONFIG_X86_WP_WORKS_OK will always
be defined, so the #else part will never be even compiled in.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:22 +02:00
Glauber Costa
d42e6af613 x86: don't always use EFAULT on __put_user_size.
Let the user of the macro specify the desired return.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:22 +02:00
Glauber Costa
5e322163b1 x86: use k modifier for 4-byte access.
Do it in a separate patch for bisectability.
Goal is to have put_user_size integrated.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:20 +02:00
Glauber Costa
002ca1690c x86: move __addr_ok to uaccess.h.
Take it out of uaccess_32.h. Since it seems that no users
of the x86_64 exists, we simply pick the i386 version.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:19 +02:00
Glauber Costa
865e5b7650 x86: merge getuser.
Merge versions of getuser from uaccess_32.h and uaccess_64.h into
uaccess.h. There is a part which is 64-bit only (for now), and for
that, we use a __get_user_8 macro.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:18 +02:00
Glauber Costa
ca23386216 x86: merge common parts of uaccess.
Common parts of uaccess_32.h and uaccess_64.h
are put in uaccess.h. Bits in uaccess_32.h and
uaccess_64.h that come to this file are equal
except for comments and whitespaces differences.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:18 +02:00
Glauber Costa
be9d06bfd4 x86: use something common for both architectures.
Using explicit hexa (0xFFFFFFUL) introduces an unnecessary difference
between i386 and x86_64 because of the size of their long. Use -1UL instead.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:17 +02:00
Glauber Costa
1dc186e82c x86: use long instead of int.
Do not refer to the processor word-size with int, as it won't
work with x86_64. Use long instead.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:16 +02:00
Glauber Costa
8b0a8aaf05 x86: introduce likely in macro.
Put the likely hint in access_ok. Just for
bisectability.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:15 +02:00
Glauber Costa
c28b95d9bb x86: change asm constraint.
Our integration efforts broke a build with this function being used
with i386. Reason is "g" can put the operand in an imm32, which according
to The Book (tm), is invalid as the second operand.

This is actually a bug
in x86_64 too, since the x86_64 instruction set reference does not list
it as valid.

We probably didn't trigger this before due to the ammount of
registers available for 64-bit platforms. But that's just my guess.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:14 +02:00
Glauber Costa
53938a68a2 x86: commonize __range_not_ok.
For i386, __range_not_ok is a better name than __range_ok, since
it returns 0 when it is in fact okay. Other than that,
both versions does not need the word size specifiers, and we remove them.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:13 +02:00
Glauber Costa
2528de431d x86: use macros from asm.h.
In putuser_32.S and putuser_64.S, replace things like .quad, .long,
and explicit references to [r|e]ax for the apropriate macros
in asm/asm.h.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:12 +02:00
Glauber Costa
0ada316403 x86: pass argument to putuser_64 functions in ax register.
This is consistent with i386 usage.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:08 +02:00
Glauber Costa
770546b99f x86: clobber rbx in putuser_64.S.
Instead of clobbering r8, clobber rbx, which is the i386 way.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:08 +02:00
Glauber Costa
70706e432e x86: user put_user_x instead of all variants.
Follow the pattern, and define a single put_user_x, instead
of defining macros for all available sizes. Exception is
put_user_8, since the "A" constraint does not give us enough
power to specify which register (a or d) to use in the 32-bit
common case.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:07 +02:00
Glauber Costa
268cf048c8 x86: don't save ebx in putuser_32.S.
Clobber it in the inline asm macros, and let the compiler do this for us.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:06 +02:00
Glauber Costa
6c2d458680 x86: merge getuser asm functions.
getuser_32.S and getuser_64.S are merged into getuser.S.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:05 +02:00
Glauber Costa
40faf463e6 x86: introduce __ASM_REG macro.
There are situations in which the architecture wants to use the
register that represents its word-size, whatever it is. For those,
introduce __ASM_REG in asm.h, along with the first users _ASM_AX
and _ASM_DX. They have users waiting for it, namely the getuser
functions.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:14:04 +02:00
Glauber Costa
edf10162b2 x86: don't clobber r8 nor use rcx.
There's really no reason to clobber r8 or pass the address in rcx.
We can safely use only two registers (which we already have to touch anyway)
to do the job.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:13:59 +02:00
Ingo Molnar
f8dd0d3c62 x86: delay lib unification build fix
fix:

arch/x86/lib/delay.c:93:24: error: macro "use_tsc_delay" passed 1 arguments, but takes just 0
arch/x86/lib/delay.c:94: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘{’ token

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:13:59 +02:00
Ingo Molnar
26e9e57b10 x86, uv: build fix #2 for "x86, uv: update x86 mmr list for SGI uv"
fix:

 In file included from arch/x86/kernel/tlb_uv.c:14:
 include/asm/uv/uv_mmrs.h:986: error: redefinition of ‘union uvh_rh_gam_cfg_overlay_config_mmr_u’
 include/asm/uv/uv_mmrs.h:988: error: redefinition of ‘struct uvh_rh_gam_cfg_overlay_config_mmr_s’
 include/asm/uv/uv_mmrs.h:1064: error: redefinition of ‘union uvh_rh_gam_mmioh_overlay_config_mmr_u’
 include/asm/uv/uv_mmrs.h:1066: error: redefinition of ‘struct uvh_rh_gam_mmioh_overlay_config_mmr_s’

caused by another duplicate section (cut & paste error) in commit
5d061e397d "x86, uv: update x86 mmr list for SGI uv".

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 08:18:41 +02:00
Ingo Molnar
e407dffd17 x86, uv: build fix for "x86, uv: update x86 mmr list for SGI uv"
fix:

In file included from arch/x86/kernel/genx2apic_uv_x.c:25:
include/asm/uv/uv_mmrs.h:986: error: redefinition of ‘union uvh_rh_gam_cfg_overlay_config_mmr_u’
include/asm/uv/uv_mmrs.h:988: error: redefinition of ‘struct uvh_rh_gam_cfg_overlay_config_mmr_s’
include/asm/uv/uv_mmrs.h:1064: error: redefinition of ‘union uvh_rh_gam_mmioh_overlay_config_mmr_u’
include/asm/uv/uv_mmrs.h:1066: error: redefinition of ‘struct uvh_rh_gam_mmioh_overlay_config_mmr_s’

caused by duplicate section (cut & paste error) in commit
5d061e397d "x86, uv: update x86 mmr list for SGI uv".

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 08:18:41 +02:00
Alok Kataria
e93ef949fd x86: rename paravirtualized TSC functions
Rename the paravirtualized calculate_cpu_khz to calibrate_tsc.
In all cases, we actually calibrate_tsc and use that as the cpu_khz value.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Cc: Dan Hecht <dhecht@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 07:43:28 +02:00
Alok Kataria
8fbbc4b45c x86: merge tsc_init and clocksource code
Unify the clocksource code.
Unify the tsc_init code.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Cc: Dan Hecht <dhecht@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 07:43:27 +02:00
Alok Kataria
bfc0f5947a x86: merge tsc calibration
Merge the tsc calibration code for the 32bit and 64bit kernel.
The paravirtualized calculate_cpu_khz for 64bit now points to the correct
tsc_calibrate code as in 32bit.
Original native_calculate_cpu_khz for 64 bit is now called as calibrate_cpu.

Also moved the recalibrate_cpu_khz function in the common file.
Note that this function is called only from powernow K7 cpu freq driver.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Cc: Dan Hecht <dhecht@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 07:43:25 +02:00
Dimitri Sivanich
5d061e397d x86, uv: update x86 mmr list for SGI uv
This patch updates the X86 mmr list for SGI uv.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 07:43:24 +02:00
Jack Steiner
83f5d894ca x86: map UV chipset space - UV support
Create page table entries to map the SGI UV chipset GRU. local MMR &
global MMR ranges.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 07:43:23 +02:00
Jack Steiner
3a9e189d69 x86: map UV chipset space - pagetable
Add boot-time function for creating additional 2MB page table entries for
mapping chipset specific cached/uncached ranges.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 07:43:23 +02:00
Bernhard Walle
5dfcf14d5b x86: use FIRMWARE_MEMMAP on x86/E820
This patch uses the /sys/firmware/memmap interface provided in the last patch
on the x86 architecture when E820 is used. The patch copies the E820
memory map very early, and registers the E820 map afterwards via
firmware_map_add_early().

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Acked-by: Greg KH <gregkh@suse.de>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: kexec@lists.infradead.org
Cc: yhlu.kernel@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 17:55:42 +02:00
Yinghai Lu
6247943d8a x86: remove acpi_srat config v2
use ACPI_NUMA directly

and move srat_32.c to mm/

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 15:49:08 +02:00
Jeremy Fitzhardinge
ef5e94af16 x86_32: remove __PAGE_KERNEL(_EXEC)
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Older x86-32 processors do not support global mappings (PGD), so must
only use it if the processor supports it.

The _PAGE_KERNEL* flags always have _PAGE_KERNEL set, since logically
we always want it set.

This is OK even on processors which do not support PGD, since all
_PAGE flags are masked with __supported_pte_mask before being turned
into a real in-pagetable pte.  On 32-bit systems, __supported_pte_mask
is initialized to not contain _PAGE_GLOBAL, and it is then added if
the CPU is found to support it.

The x86-32 code used to use __PAGE_KERNEL/__PAGE_KERNEL_EXEC for this
purpose, but they're now redundant and can be removed.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:29 +02:00
Jeremy Fitzhardinge
8490638cf0 x86: always set _PAGE_GLOBAL in _PAGE_KERNEL* flags
Consistently set _PAGE_GLOBAL in _PAGE_KERNEL flags.  This makes 32-
and 64-bit code consistent, and removes some special cases where
__PAGE_KERNEL* did not have _PAGE_GLOBAL set, causing confusion as a
result of the inconsistencies.

This patch only affects x86-64, which generally always supports PGD.
The x86-32 patch is next.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:28 +02:00
Ingo Molnar
4a7017370a x86: move prefill_possible_map calling early, fix
fix:

arch/x86/kernel/built-in.o: In function `setup_arch':
: undefined reference to `prefill_possible_map'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:25 +02:00
Yinghai Lu
329513a35d x86: move prefill_possible_map calling early
call it right after we are done with MADT/mptable handling, instead of
doing that in setup_per_cpu_areas() later on...

this way for_possible_cpu() can be used early.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:24 +02:00
Yinghai Lu
cb95a13a8a x86: merge zones_sizes_init for numa and non numa on 32-bit
move out e820_register_active_regions from non numa zones_sizes_init()
and remove numa version zones_sizes_init().

and let 32 bit call remove_all_active_ranges() in setup_arch() directly
like 64-bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:22 +02:00
Yinghai Lu
dc8e8120ad x86: change copy_e820_map to append_e820_map
so it has a more meaningful name.
also change it to static.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:19 +02:00
Alok Kataria
fd6493e166 x86: cleanup e820_setup_gap(), v2
e820_search_gap also take a end_addr parameter to limit search from
start_addr to end_addr.

Signed-off-by: AloK N Kataria <akataria@vmware.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "lenb@kernel.org" <lenb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:17 +02:00
Mike Travis
6a2f47ca27 x86: add check for node passed to node_to_cpumask, v3
* When CONFIG_DEBUG_PER_CPU_MAPS is set, the node passed to
    node_to_cpumask and node_to_cpumask_ptr should be validated.
    If invalid, then a dump_stack is performed and a zero cpumask
    is returned.

v2: Slightly different version to remove a compiler warning.
v3: Redone to reflect moving setup.c -> setup_percpu.c

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:16 +02:00
Yinghai Lu
28bb223795 x86: move reserve_setup_data to setup.c
Ying Huang would like setup_data to be reserved, but not included in the
no save range.

Here we try to modify the e820 table to reserve that range early.
also add that in early_res in case bootloader messes up with the ramdisk.

other solution would be
1. add early_res_to_highmem...
2. early_res_to_e820...
but they could reserve another type memory wrongly, if early_res has some
resource reserved early, and not needed later, but it is not removed from
early_res in time. Like the RAMDISK (already handled).

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: andi@firstfloor.org
Tested-by: Huang, Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:14 +02:00
Ingo Molnar
3ae960a598 - x86: move early_ioremap prototypes to io.h
now that the early-ioremap code is unified, move the prototypes too from
io_32.h to io.h.

this fixes:

arch/x86/kernel/setup.c:531: error: implicit declaration of function ‘early_ioremap_init'

on 64-bit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:12 +02:00
Yinghai Lu
914bebfad4 x86: use disable_apic in 32bit
change the enable_local_apic to static force_enable_local_apic for 32bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:08 +02:00
Jeremy Fitzhardinge
df366e9822 x86_64: fix non-paravirt compilation
Make sure SWAPGS and PARAVIRT_ADJUST_EXCEPTION_FRAME are properly
defined when CONFIG_PARAVIRT is off.

Fixes Ingo's build failure:
arch/x86/kernel/entry_64.S: Assembler messages:
arch/x86/kernel/entry_64.S:1201: Error: invalid character '_' in mnemonic
arch/x86/kernel/entry_64.S:1205: Error: invalid character '_' in mnemonic
arch/x86/kernel/entry_64.S:1209: Error: invalid character '_' in mnemonic
arch/x86/kernel/entry_64.S:1213: Error: invalid character '_' in mnemonic

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Mark McLoughlin <markmc@redhat.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Stephen Tweedie <sct@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:05 +02:00
Yinghai Lu
f3294a33e7 x86: let setup_arch call init_apic_mappings for 32bit
instead of calling it from trap_init()

also move init ioapic mapping out of apic_32.c

so 32 bit do same as 64 bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:04 +02:00