This fixes up a variety of minor problems in compiling with ARCH=ppc
arising from using the merged versions of various header files.
A lot of the changes are just adding #include <asm/machdep.h> to
files that use ppc_md or smp_ops_t.
This also arranges for us to use semaphore.c, vecemu.c, vector.S and
fpu.S from arch/powerpc/kernel when compiling with ARCH=ppc.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is required to avoid unloading a module that has active timewait
sockets, such as DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the ability of changing the state a TCP connection. I know
that this must be used with care but it's required to provide a complete
conntrack creation via conntrack_netlink. So I'll document this aspect on
the upcoming docs.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initially we used 64bit counters for conntrack-based accounting, since we
had no event mechanism to tell userspace that our counters are about to
overflow. With nfnetlink_conntrack, we now have such a event mechanism and
thus can save 16bytes per connection.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To keep consistency, the TCP private protocol information is nested
attributes under CTA_PROTOINFO_TCP. This way the sequence of attributes to
access the TCP state information looks like here below:
CTA_PROTOINFO
CTA_PROTOINFO_TCP
CTA_PROTOINFO_TCP_STATE
instead of:
CTA_PROTOINFO
CTA_PROTOINFO_TCP_STATE
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this #include, __be16 is not defined and userspace programs
will break.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When 'rustynat' was merged in 2.6.12, the use of the "helper" pointer of
struct ipt_nat_info was obsoleted, but the pointer not removed from the
struct.
This patch removes the pointer, thereby yet again shrinking struct
ip_conntrack.
Discovered-by: Rusty Russell <rusty@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As Henrik Nordstrom pointed out, all our efforts with "split endian" (i.e.
host byte order tags, net byte order values) are useless, unless a parser
can determine whether an attribute is nested or not.
This patch steals the highest bit of nfattr.nfa_type to indicate whether
the data payload contains a nested nfattr (1) or not (0).
This will break userspace compatibility, but luckily no kernel with
nfnetlink was released so far.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
CPU hotplug fills up the possible map to NR_CPUs, but it did that after
setting up per CPU data. This lead to CPU data not getting allocated
for all possible CPUs, which lead to various side effects.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If a process issues an URB from userspace and (starts to) terminate
before the URB comes back, we run into the issue described above. This
is because the urb saves a pointer to "current" when it is posted to the
device, but there's no guarantee that this pointer is still valid
afterwards.
In fact, there are three separate issues:
1) the pointer to "current" can become invalid, since the task could be
completely gone when the URB completion comes back from the device.
2) Even if the saved task pointer is still pointing to a valid task_struct,
task_struct->sighand could have gone meanwhile.
3) Even if the process is perfectly fine, permissions may have changed,
and we can no longer send it a signal.
So what we do instead, is to save the PID and uid's of the process, and
introduce a new kill_proc_info_as_uid() function.
Signed-off-by: Harald Welte <laforge@gnumonks.org>
[ Fixed up types and added symbol exports ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following patch makes swsusp avoid the possible temporary corruption
of page translation tables during resume on x86-64. This is achieved by
creating a copy of the relevant page tables that will not be modified by
swsusp and can be safely used by it on resume.
The problem is that during resume on x86-64 swsusp may temporarily
corrupt the page tables used for the direct mapping of RAM. If that
happens, a page fault occurs and cannot be handled properly, which leads
to the solid hang of the affected system. This leads to the loss of the
system's state from before suspend and may result in the loss of data or
the corruption of filesystems, so it is a serious issue. Also, it
appears to happen quite often (for me, as often as 50% of the time).
The problem is related to the fact that (at least) one of the PMD
entries used in the direct memory mapping (starting at PAGE_OFFSET)
points to a page table the physical address of which is much greater
than the physical address of the PMD entry itself. Moreover,
unfortunately, the physical address of the page table before suspend
(i.e. the one stored in the suspend image) happens to be different to
the physical address of the corresponding page table used during resume
(i.e. the one that is valid right before swsusp_arch_resume() in
arch/x86_64/kernel/suspend_asm.S is executed). Thus while the image is
restored, the "offending" PMD entry gets overwritten, so it does not
point to the right physical address any more (i.e. there's no page
table at the address pointed to by it, because it points to the address
the page table has been at during suspend). Consequently, if the PMD
entry is used later on, and it _is_ used in the process of copying the
image pages, a page fault occurs, but it cannot be handled in the normal
way and the system hangs.
In principle we can call create_resume_mapping() from
swsusp_arch_resume() (ie. from suspend_asm.S), but then the memory
allocations in create_resume_mapping(), resume_pud_mapping(), and
resume_pmd_mapping() must be made carefully so that we use _only_
NosaveFree pages in them (the other pages are overwritten by the loop in
swsusp_arch_resume()). Additionally, we are in atomic context at that
time, so we cannot use GFP_KERNEL. Moreover, if one of the allocations
fails, we should free all of the allocated pages, so we need to trace
them somehow.
All of this is done in the appended patch, except that the functions
populating the page tables are located in arch/x86_64/kernel/suspend.c
rather than in init.c. It may be done in a more elegan way in the
future, with the help of some swsusp patches that are in the works now.
[AK: move some externs into headers, renamed a function]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This moves the Device_List member from struct device_node to
struct pci_dn, which cleans up the device_node and makes the code
a little simpler.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is a bunch of mostly small fixes that are needed to get
ARCH=powerpc to compile for 64-bit. This adds setup_64.c from
arch/ppc64/kernel/setup.c and locks.c from arch/ppc64/lib/locks.c.
Signed-off-by: Paul Mackerras <paulus@samba.org>
The only real change here is that lmb_enforce_memory_limit now takes
the memory_limit as a parameter instead of as a global variable.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Patch from Richard Purdie
Allow the GPIO pin suspend states to be specified for SCOOP devices.
This is needed for correct operation on the spitz platform.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This adds register definitions from the ppc64 processor.h to reg.h,
and makes a single merged processor.h. I moved __is_processor from
the ppc64 system.h to the merged reg.h along with the PVR register
constants.
Signed-off-by: Paul Mackerras <paulus@samba.org>
- added typedef unsigned int __nocast gfp_t;
- replaced __nocast uses for gfp flags with gfp_t - it gives exactly
the same warnings as far as sparse is concerned, doesn't change
generated code (from gcc point of view we replaced unsigned int with
typedef) and documents what's going on far better.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The attached patch splits key permissions checking out of key-ui.h and
moves it into a .c file. It's quite large and called quite a lot, and
it's about to get bigger with the addition of LSM support for keys...
key_any_permission() is also discarded as it's no longer used.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
EMU10K1/EMU10K2 driver
Fixed the error at loading SBLive Game board (and possible other models).
The PCI SSIDs of this board conflicts with SB Live 5.1 Platinum, which has
no AC97 chip.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
AC97 Codec
Don't use dev.platform_data to store a reference to the containing
ac97_t structure. Such assignment is redundent since we can deduce the
ac97_t structure location from the contained device structure. This
sets platform_data free for other purposes.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Adds alignment attribute to a few structures used with SCTP socket
options so that the sizes and offsets remain the same when built using
either 32 or 64 bit tools.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The old socket options are marked with a _OLD suffix so that the
existing 32-bit apps on 32-bit kernels do not break.
Signed-off-by: Ivan Skytte Jrgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This also creates merged versions of do_init_bootmem, paging_init
and mem_init and moves them to arch/powerpc/mm/mem.c. It gets rid
of the mem_pieces stuff.
I made memory_limit a parameter to lmb_enforce_memory_limit rather
than a global referenced by that function. This will require some
small changes to ppc64 if we want to continue building ARCH=ppc64
using the merged lmb.c.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This brings in the ppc64 version of prom_init.c, prom.c and btext.c
and makes them work for ppc32. This also brings in the new calling
convention, where the first entry to the kernel (with r5 != 0) goes
to the prom_init code, which then restarts from the beginning (with
r5 == 0) after it has done its stuff.
For now this also brings in the ppc32 version of setup.c. It also
merges lmb.h.
Signed-off-by: Paul Mackerras <paulus@samba.org>
These macros help in writing assembly code that works for both ppc32
and ppc64. With this we now have a common fpu.S. This takes out
load_up_fpu from head_64.S.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Here is a patch that adds a helper called xfrm_policy_id2dir to
document the fact that the policy direction can be and is derived
from the index.
This is based on a patch by YOSHIFUJI Hideaki and 210313105@suda.edu.cn.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we switch extern inline to static inline, we'd better switch the
pre-declarations we use to say that these puppies have
__attribute_const__ on them.
Otherwise we get extern declaration followed by static inline one.
Which makes gcc unhappy, and for a good reason...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix implicit nocast warnings in xfrm code:
net/xfrm/xfrm_policy.c:232:47: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix nocast sparse warnings:
include/linux/textsearch.h:165:57: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix nocast sparse warnings:
net/rxrpc/call.c:2013:25: warning: implicit cast to nocast type
net/rxrpc/connection.c:538:46: warning: implicit cast to nocast type
net/sunrpc/sched.c:730:36: warning: implicit cast to nocast type
net/sunrpc/sched.c:734:56: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Randy Dunlap <rdunlap@xenotime.net>
Fix implicit nocast warnings in ip_vs code:
net/ipv4/ipvs/ip_vs_app.c:631:54: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix implicit nocast warnings in decnet code:
net/decnet/af_decnet.c:458:40: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:125:35: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:219:29: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix implicit nocast warnings in connector code:
drivers/connector/connector.c:102:24: warning: implicit cast to nocast type
drivers/connector/connector.c:114:45: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix implicit nocast warnings in atm code:
net/atm/atm_misc.c:35:44: warning: implicit cast to nocast type
drivers/atm/fore200e.c:183:33: warning: implicit cast to nocast type
Also use kzalloc() instead of kmalloc().
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Macro ended up backwards during one of cleanups. Found by Alessandro Zummo.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
No need to align struct inet_ehash_bucket on a 8 bytes boundary.
On 32 bits Uniprocessor, that's a waste of 4 bytes per struct (50 %)
On other platforms, the attribute is useless, natual alignement is already 8.
platform | Size before | Size after patch
-------------+-------------+------------------
32 bits, UP | 8 | 4
32 bits, SMP | 8 | 8
64 bits, UP | 8 | 8
64 bits, SMP | 16 | 16
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Sascha Hauer
Current implementation of imx_gpio_mode does not allow to
configure all alternate routing possibilities of the i.MX. With
this patch every bit in the gpio setup registers has a
corresponding bit in the gpio_mode parameter, so every routing
should be possible now.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Sascha Hauer
After coming out of idle mode the h720x goes into slow mode. Switch
it back to run mode.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The old code had the IP and SP coming from the registers in the thread
struct, which are completely wrong since those are the userspace
registers. This fixes that by pulling the correct values from the
jmp_buf in which the kernel state of each thread is stored.
Signed-off-by: Allan Graves <allan.graves@oracle.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following patch renames __in_dev_get() to __in_dev_get_rtnl() and
introduces __in_dev_get_rcu() to cover the second case.
1) RCU with refcnt should use in_dev_get().
2) RCU without refcnt should use __in_dev_get_rcu().
3) All others must hold RTNL and use __in_dev_get_rtnl().
There is one exception in net/ipv4/route.c which is in fact a pre-existing
race condition. I've marked it as such so that we remember to fix it.
This patch is based on suggestions and prior work by Suzanne Wood and
Paul McKenney.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnaldo and I agreed it could be applied now, because I have other
pending patches depending on this one (Thank you Arnaldo)
(The other important patch moves skc_refcnt in a separate cache line,
so that the SMP/NUMA performance doesnt suffer from cache line ping pongs)
1) First some performance data :
--------------------------------
tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()
The most time critical code is :
sk_for_each(sk, node, &head->chain) {
if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
goto hit; /* You sunk my battleship! */
}
The sk_for_each() does use prefetch() hints but only the begining of
"struct sock" is prefetched.
As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far
away from the begining of "struct sock", it has to bring into CPU
cache cold cache line. Each iteration has to use at least 2 cache
lines.
This can be problematic if some chains are very long.
2) The goal
-----------
The idea I had is to change things so that INET_MATCH() may return
FALSE in 99% of cases only using the data already in the CPU cache,
using one cache line per iteration.
3) Description of the patch
---------------------------
Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
filling a 32 bits hole on 64 bits platform.
struct sock_common {
unsigned short skc_family;
volatile unsigned char skc_state;
unsigned char skc_reuse;
int skc_bound_dev_if;
struct hlist_node skc_node;
struct hlist_node skc_bind_node;
atomic_t skc_refcnt;
+ unsigned int skc_hash;
struct proto *skc_prot;
};
Store in this 32 bits field the full hash, not masked by (ehash_size -
1) Using this full hash as the first comparison done in INET_MATCH
permits us immediatly skip the element without touching a second cache
line in case of a miss.
Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
sk_hash and tw_hash) already contains the slot number if we mask with
(ehash_size - 1)
File include/net/inet_hashtables.h
64 bits platforms :
#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
(((__sk)->sk_hash == (__hash))
((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie)) && \
((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports)) && \
(!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
32bits platforms:
#define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
(((__sk)->sk_hash == (__hash)) && \
(inet_sk(__sk)->daddr == (__saddr)) && \
(inet_sk(__sk)->rcv_saddr == (__daddr)) && \
(!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
- Adds a prefetch(head->chain.first) in
__inet_lookup_established()/__tcp_v4_check_established() and
__inet6_lookup_established()/__tcp_v6_check_established() and
__dccp_v4_check_established() to bring into cache the first element of the
list, before the {read|write}_lock(&head->lock);
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
I've found the problem in general. It affects any 64-bit
architecture. The problem occurs when you change the system time.
Suppose that when you boot your system clock is forward by a day.
This gets recorded down in skb_tv_base. You then wind the clock back
by a day. From that point onwards the offset will be negative which
essentially overflows the 32-bit variables they're stored in.
In fact, why don't we just store the real time stamp in those 32-bit
variables? After all, we're not going to overflow for quite a while
yet.
When we do overflow, we'll need a better solution of course.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use '#ifdef' consistently on __KERNEL__. This was reported as bug #5340
(isn't easier to send a fix than report the bug?!)
Signed-off-by: Diego Calleja <diegocg@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use idle_power4.S from ppc64 as we are not going to support
32 bit power4 in the merged tree.
Merge ppc64 traps.c into powerpc traps.c:
use ppc64 versions of exception routine names
(as they don't have StudlyCaps)
make all the versions if die() have the same
prototype
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
include/asm/hw_irq.h:70: warning: `struct hw_interrupt_type' declared inside parameter list
include/asm/hw_irq.h:70: warning: its scope is only this definition or declaration, which is probably not what you want
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Only one of the run or kick path is supposed to put an iocb on the run
list. If both of them do it than one of them can end up referencing a
freed iocb. The kick path could delete the task_list item from the wait
queue before getting the ctx_lock and putting the iocb on the run list.
The run path was testing the task_list item outside the lock so that it
could catch ki_retry methods that return -EIOCBRETRY *without* putting the
iocb on a wait queue and promising to call kick_iocb. This unlocked check
could then race with the kick path to cause both to try and put the iocb on
the run list.
The patch stops the run path from testing task_list by requring that any
ki_retry that returns -EIOCBRETRY *must* guarantee that kick_iocb() will be
called in the future. aio_p{read,write}, the only in-tree -EIOCBRETRY
users, are updated.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Delete all of the code working with sp_banks[] and replace
with clean acquisition and sorting of physical memory
parameters from the firmware.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Daniel Jacobowitz
Thread flags are inherited on fork(). In order for a binary which has
the iWMMXt coprocessor enabled to run a binary which needs the FPA
emulation, we need to explicitly clear TIF_USING_IWMMXT if we are not
going to set it.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
I checked with AMD and they requested to only disable it for family 15.
Also disable it for i386 too. And some style fixes.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Roland points out that the flags end up having non-obvious dependencies
elsewhere, so revert aa55a08687 and add
some comments about why things are as they are.
We'll just have to fix up the broken comparisons. Roland has a patch.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
do_signal_stop:
for_each_thread(t) {
if (t->state < TASK_STOPPED)
++sig->group_stop_count;
}
However, TASK_NONINTERACTIVE > TASK_STOPPED, so this loop will not
count TASK_INTERRUPTIBLE | TASK_NONINTERACTIVE threads.
See also wait_task_stopped(), which checks ->state > TASK_STOPPED.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
[ We really probably should always use the appropriate bitmasks to test
task states, not do it like this. Using something like
#define TASK_RUNNABLE (TASK_RUNNING | TASK_INTERRUPTIBLE | \
TASK_UNINTERRUPTIBLE | TASK_NONINTERACTIVE)
and then doing "if (task->state & TASK_RUNNABLE)" or similar. But the
ordering of the task states is historical, and keeping the ordering
does make sense regardless. ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- document places where we pass kernel address to low-level primitive
that deals with kernel/user addresses
- uintptr_t is unsigned long, not long
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Instead of doing byte-at-a-time user accesses to figure
out where the fault occurred, read the saved fault_address
from the current thread structure.
For the sake of defensive programming, if the fault_address
does not fall into the user buffer range, simply assume the
whole area faulted. This will cause the fixup for
copy_from_user() to clear the entire kernel side buffer.
Signed-off-by: David S. Miller <davem@davemloft.net>
We were not calling kernel_mna_trap_fault() correctly.
Instead of being fancy, just return 0 vs. -EFAULT from
the assembler stubs, and handle that return value as
appropriate.
Create an "__retl_efault" stub for assembler exception
table entries and use it where possible.
Signed-off-by: David S. Miller <davem@davemloft.net>
The funny "range" exception table entries we had were only
used by the compat layer socketcall assembly, and it wasn't
even needed there.
For free we now get proper exception table sorting and fast
binary searching.
Signed-off-by: David S. Miller <davem@davemloft.net>
The attached patch adds extra permission grants to keys for the possessor of a
key in addition to the owner, group and other permissions bits. This makes
SUID binaries easier to support without going as far as labelling keys and key
targets using the LSM facilities.
This patch adds a second "pointer type" to key structures (struct key_ref *)
that can have the bottom bit of the address set to indicate the possession of
a key. This is propagated through searches from the keyring to the discovered
key. It has been made a separate type so that the compiler can spot attempts
to dereference a potentially incorrect pointer.
The "possession" attribute can't be attached to a key structure directly as
it's not an intrinsic property of a key.
Pointers to keys have been replaced with struct key_ref *'s wherever
possession information needs to be passed through.
This does assume that the bottom bit of the pointer will always be zero on
return from kmem_cache_alloc().
The key reference type has been made into a typedef so that at least it can be
located in the sources, even though it's basically a pointer to an undefined
type. I've also renamed the accessor functions to be more useful, and all
reference variables should now end in "_ref".
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
My previous patch fixing invalidation of huge PTEs wasn't good enough, we
still had an issue if a PTE invalidation batch contained both small and
large pages. This patch fixes this by making sure the batch is flushed if
the page size fed to it changes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move the ZERO_PAGE remapping complexity to the move_pte macro in
asm-generic, have it conditionally depend on
__HAVE_ARCH_MULTIPLE_ZERO_PAGE, which gets defined for MIPS.
For architectures without __HAVE_ARCH_MULTIPLE_ZERO_PAGE, move_pte becomes
a noop.
From: Hugh Dickins <hugh@veritas.com>
Fix nasty little bug we've missed in Nick's mremap move ZERO_PAGE patch.
The "pte" at that point may be a swap entry or a pte_file entry: we must
check pte_present before perhaps corrupting such an entry.
Patch below against 2.6.14-rc2-mm1, but the same bug is in 2.6.14-rc2's
mm/mremap.c, and more dangerous there since it's affecting all arches: I
think the safest course is to send Nick's patch and Yoichi's build fix and
this fix (build tested) on to Linus - so only MIPS can be affected.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Also, the us3_cpufreq driver can work on Ultra-IV and IV+.
They use the SAFARI bus register to control the clock divider
just like Ultra-III and III+ do.
Signed-off-by: David S. Miller <davem@davemloft.net>
Changed ppc32 so that cur_cpu_spec is just a single pointer for all CPUs.
Additionally, made call_setup_cpu check to see if the cpu_setup pointer
is NULL or not before calling the function. This lets remove the dummy
cpu_setup calls that just return.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Merged cputable.h between ppc32 and ppc64. In doing this removed support
for the BEGIN_FTR_SECTION/END_FTR_SECTION macros in C code since they
dont compile correctly. C code should use cpu_has_feature(). This is
based on Arnd Bergmann's initial patch.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
powerpc: Merge byteorder.h
Essentially adopts the 64-bit version of this file. The 32-bit version had
been using unsigned ints for arguments/return values that were actually
only 16 bits - the new file uses __u16 for these items as in the 64-bit
version of the header. The order of some of the asm constraints
in the 64-bit version was slightly different than the 32-bit version,
but they produce identical code.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The following is generated when compiling a
recent (2.6.14-rc2-git5) kernel configured for
ARM, with GCC4.
CC init/main.o
In file included from include/linux/netdevice.h:29,
from include/net/sock.h:48,
from init/main.c:50:
include/linux/if_ether.h:114: error: array type has incomplete element type
It seems that if CONFIG_SYSCTL is not set, then
the compiler will throw an error due to the definition
of the ether_table[] array
Attached is a solution to the problem
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Written by Adrian Sun (asun@darksunrising.com).
Ported to 2.6.x by Tom 'spot' Callaway <tcallawa@redhat.com>.
Further cleaned up and integrated by David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
Place them on separate cache lines in SMP to lower memory bouncing
between multiple CPU accessing the device.
- One part is mostly used on receive path (including
eth_type_trans()) (poll_list, poll, quota, weight, last_rx,
dev_addr, broadcast)
- One part is mostly used on queue transmit path (qdisc)
(queue_lock, qdisc, qdisc_sleeping, qdisc_list, tx_queue_len)
- One part is mostly used on xmit path (device)
(xmit_lock, xmit_lock_owner, priv, hard_start_xmit, trans_start)
'features' is placed outside of these hot points, in a location that
may be shared by all cpus (because mostly read)
name_hlist is moved close to name[IFNAMSIZ] to speedup __dev_get_by_name()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
and rename it to pci.c. This also required moving
arch/ppc64/kernel/pci.h into include/asm-powerpc (called
ppc-pci.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Complete moving arch/ppc64/kernel/mpic.h,
include/asm-ppc/reg.h, include/asm-ppc64/kdebug.h
and include/asm-ppc64/kprobes.h
Add arch/powerpc/platforms/Makefile and use it from
arch/powerpc/Makefile
Introduce OLDARCH temporarily so we can point back to
the originating architecture
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
When you've enabled conntrack and NAT as a module (standard case in all
distributions), and you've also enabled the new conntrack netlink
interface, loading ip_conntrack_netlink.ko will auto-load iptable_nat.ko.
This causes a huge performance penalty, since for every packet you iterate
the nat code, even if you don't want it.
This patch splits iptable_nat.ko into the NAT core (ip_nat.ko) and the
iptables frontend (iptable_nat.ko). Threfore, ip_conntrack_netlink.ko will
only pull ip_nat.ko, but not the frontend. ip_nat.ko will "only" allocate
some resources, but not affect runtime performance.
This separation is also a nice step in anticipation of new packet filters
(nf-hipac, ipset, pkttables) being able to use the NAT core.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If input message rate from userspace is too high, do not drop them,
but try to deliver using work queue allocation.
Failing there is some kind of congestion control.
It also removes warn_on on this condition, which scares people.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro pointed out that the current IB userspace verbs interface
allows userspace to cause mischief by closing file descriptors before
we're ready, or issuing the same command twice at the same time. This
patch closes those races, and fixes other obvious problems such as a
module reference leak.
Some other interface bogosities will require an ABI change to fix
properly, so I'm deferring those fixes until 2.6.15.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Added a missing TO_NATIVE call to scripts/mod/file2alias.c:do_pcmcia_entry()
- Add an alignment attribute to struct pcmcia_device_no to solve an alignment
issue seen when cross-compiling on x86 for m68k.
Signed-off-by: Kars de Jong <jongk@linux-m68k.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Support some more TI cardbus bridges. most of them are multifunction
devices which adds 1394 controllers, smartcard readers etc. this could
also help with the various problems with the XX21 controllers seen on the
linux-pcmcia list.
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Echo Audio cardbus products are known to be incompatible with EnE bridges.
in order to maybe solve the problem a EnE specific test bit has to be set,
another cleared...but other setups have a good chance to break when just
forcing the bits. so do the whole thingy automatically.
The patch adds a hook in cb_alloc() that allows special tuning for the
different chipsets. for ene just match the Echo products and set/clear the
test bits, defaults to do the same thing as w/o the patch to not break
working setups.
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Cc: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
At boot time, determine the D-cache, I-cache and E-cache size and
line-size. Use them in cache flushes when appropriate.
This change was motivated by discovering that the D-cache on
UltraSparc-IIIi and later are 64K not 32K, and the flushes done by the
Cheetah error handlers were assuming a 32K size.
There are still some pieces of code that are hard coding things and
will need to be fixed up at some point.
While we're here, fix the D-cache and I-cache parity error handlers
to run with interrupts disabled, and when the trap occurs at trap
level > 1 log the event via a counter displayed in /proc/cpuinfo.
Signed-off-by: David S. Miller <davem@davemloft.net>
This creates the directory structure under arch/powerpc and a bunch
of Kconfig files. It does a first-cut merge of arch/powerpc/mm,
arch/powerpc/lib and arch/powerpc/platforms/powermac. This is enough
to build a 32-bit powermac kernel with ARCH=powerpc.
For now we are getting some unmerged files from arch/ppc/kernel and
arch/ppc/syslib, or arch/ppc64/kernel. This makes some minor changes
to files in those directories and files outside arch/powerpc.
The boot directory is still not merged. That's going to be interesting.
Signed-off-by: Paul Mackerras <paulus@samba.org>
The trick is that we do the kernel linear mapping TLB miss starting
with an instruction sequence like this:
ba,pt %xcc, kvmap_load
xor %g2, %g4, %g5
succeeded by an instruction sequence which performs a full page table
walk starting at swapper_pg_dir.
We first take over the trap table from the firmware. Then, using this
constant PTE generation for the linear mapping area above, we build
the kernel page tables for the linear mapping.
After this is setup, we patch that branch above into a "nop", which
will cause TLB misses to fall through to the full page table walk.
With this, the page unmapping for CONFIG_DEBUG_PAGEALLOC is trivial.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Ben Dooks
The VA addresses of the Anubis CPLD registers
confoict with the addresses for the ISA space
maps used by the rest of the s3c2410 architecture
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Currently we just ignore the device, which means there are a few
arrays out there that we don't find.
This patch updates the scsi_report_lun_scan() to take a target instead
of a device so it can be called on a return of
SCSI_SCAN_TARGET_PRESENT, which is what a PQ 3 device returns.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
powerpc: Merge semaphore.h
Adopted the ppc64 version of semaphore.h. The 32-bit version used
smp_wmb(), but recent updates to atomic.h mean this is no longer required.
The 64-bit version made use of unlikely(), which has been retained in the
combined version.
This patch requires the recent atomic.h patch.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Merge asm-ppc*/rwsem.h into include/asm-powerpc.
Removed smp_*mb() memory barriers from the ppc32 code
as they are now burried in the atomic_*() functions as
suggested by Paul, implemented by Arnd, and pushed out
by Becky. I am not the droid you are looking for.
This patch depends on Becky's atomic.h merge patch.
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
powerpc: Merge atomic.h and memory.h into powerpc
Merged atomic.h into include/powerpc. Moved asm-style HMT_ defines from
memory.h into ppc_asm.h, where there were already HMT_defines; moved c-style
HMT_ defines to processor.h. Renamed memory.h to synch.h to better reflect
its contents.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Jon Loeliger <linuxppc@jdl.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Merge asm-ppc*/seccomp.h. Drop TIF_32BIT check.
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The spinlock_types.h merge renamed the structure for raw_spinlock_t to
match ppc64. In doing so some of the spinlock macros/functions needed to
be updated to match. Apparently, this seems to only be caught when
building power3.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch fixes a number of bugs. It cannot be reasonably split up in
multiple fixes, since all bugs interact with each other and affect the same
function:
Bug #1:
The event cache code cannot be called while a lock is held. Therefore, the
call to ip_conntrack_event_cache() within ip_ct_refresh_acct() needs to be
moved outside of the locked section. This fixes a number of 2.6.14-rcX
oops and deadlock reports.
Bug #2:
We used to call ct_add_counters() for unconfirmed connections without
holding a lock. Since the add operations are not atomic, we could race
with another CPU.
Bug #3:
ip_ct_refresh_acct() lost REFRESH events in some cases where refresh
(and the corresponding event) are desired, but no accounting shall be
performed. Both, evenst and accounting implicitly depended on the skb
parameter bein non-null. We now re-introduce a non-accounting
"ip_ct_refresh()" variant to explicitly state the desired behaviour.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove io_remap_page_range() from all of Linux 2.6.x (as requested and
suggested by Randy Dunlap) and minor clean-ups.
Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
pte_modify marks a page as needing flush, which is redundant because the
resulting PTE is still set with set_pte, which already handles that.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The SMU is the "system controller" chip used by Apple recent G5 machines
including the iMac G5. It drives things like fans, i2c busses, real time
clock, etc...
The current kernel contains a very crude driver that doesn't do much more
than reading the real time clock synchronously. This is a completely
rewritten driver that provides interrupt based command queuing, a userland
interface, and an i2c/smbus driver for accessing the devices hanging off
the SMU i2c busses like temperature sensors. This driver is a basic block
for upcoming work on thermal control for those machines, among others.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It is essential that index_of() be inlined. But alpha undoes the gcc
inlining hackery and index_of() ends up out-of-line. So fiddle with things
to make that function inline again.
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the lead up to 2.6.13 I fixed a large number of reboot problems by
making the calling conventions consistent. Despite checking and double
checking my work it appears I missed an obvious one.
This first patch simply refactors the reboot routines so all of the
preparation for various kinds of reboots are in their own functions.
Making it very hard to get the various kinds of reboot out of sync.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
early_setup() calls htab_initialize() which is similar, but not identical
to iSeries_bolt_kernel().
On iSeries the Hypervisor has already inserted some ptes for us, and we
simply have to detect that and bolt them. iSeries_hpte_bolt_or_insert()
implements that logic.
For the case of a non-existing pte we just call iSeries_hpte_insert(). This
appears to work, although it's not entirely equivalent to the old code in
iSeries_make_pte() which panicked if we got a secondary slot. Not sure if
that's important.
Finally we call iSeries_hpte_bolt_or_insert() from create_pte_mapping(),
which is called from htab_initialize() for each lmb region.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
In order to call finish_device_tree() on iSeries we need to define
virt_irq_create_mapping(). We also need to set ppc64_interrupt_controller to
something other than zero. If we want to do interrupt setup via the device
tree on iSeries this code will need some serious work, but it's harmless to
have it there as long as the nodes in the iSeries device tree don't cause
it to be invoked.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Instead of all of this cpu-specific code to remap the kernel
to the correct location, use portable firmware calls to do
this instead.
What we do now is the following in position independant
assembler:
chosen_node = prom_finddevice("/chosen");
prom_mmu_ihandle_cache = prom_getint(chosen_node, "mmu");
vaddr = 4MB_ALIGN(current_text_addr());
prom_translate(vaddr, &paddr_high, &paddr_low, &mode);
prom_boot_mapping_mode = mode;
prom_boot_mapping_phys_high = paddr_high;
prom_boot_mapping_phys_low = paddr_low;
prom_map(-1, 8 * 1024 * 1024, KERNBASE, paddr_low);
and that replaces the massive amount of by-hand TLB probing and
programming we used to do here.
The new code should also handle properly the case where the kernel
is mapped at the correct address already (think: future kexec
support).
Consequently, the bulk of remap_kernel() dies as does the entirety
of arch/sparc64/prom/map.S
We try to share some strings in the PROM library with the ones used
at bootup, and while we're here mark input strings to oplib.h routines
with "const" when appropriate.
There are many more simplifications now possible. For one thing, we
can consolidate the two copies we now have of a lot of cpu setup code
sitting in head.S and trampoline.S.
This is a significant step towards CONFIG_DEBUG_PAGEALLOC support.
Signed-off-by: David S. Miller <davem@davemloft.net>
Wire the MCA/INIT handler stacks into DTR[2] and track them in
IA64_KR(CURRENT_STACK). This gives the MCA/INIT handler stacks the
same TLB status as normal kernel stacks. Reload the old CURRENT_STACK
data on return from OS to SAL.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>