Commit Graph

189 Commits

Author SHA1 Message Date
Eric Paris
43ed7e16a8 fanotify: use merge argument to determine actual event added to queue
fanotify needs to know the actual event added to queues so it can be
correctly checked for return values from userspace.  To do this we need to
pass that information from the merger code back to the main even handling
routine.  Currently that information is unused, but it will be.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:01 -04:00
Eric Paris
6e5f77b32e fsnotify: intoduce a notification merge argument
Each group can define their own notification (and secondary_q) merge
function.  Inotify does tail drop, fanotify does matching and drop which
can actually allocate a completely new event.  But for fanotify to properly
deal with permissions events it needs to know the new event which was
ultimately added to the notification queue.  This patch just implements a
void ** argument which is passed to the merge function.  fanotify can use
this field to pass the new event back to higher layers.

Signed-off-by: Eric Paris <eparis@redhat.com>
for fanotify to properly deal with permissions events
2010-07-28 09:59:01 -04:00
Eric Paris
cb2d429faf fsnotify: add group priorities
This introduces an ordering to fsnotify groups.  With purely asynchronous
notification based "things" implementing fsnotify (inotify, dnotify) ordering
isn't particularly important.  But if people want to use fsnotify for the
basis of sycronous notification or blocking notification ordering becomes
important.

eg. A Hierarchical Storage Management listener would need to get its event
before an AV scanner could get its event (since the HSM would need to
bring the data in for the AV scanner to scan.)  Typically asynchronous notification
would want to run after the AV scanner made any relevant access decisions
so as to not send notification about an event that was denied.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:01 -04:00
Eric Paris
4d92604cc9 fanotify: clear all fanotify marks
fanotify listeners may want to clear all marks.  They may want to do this
to destroy all of their inode marks which have nothing but ignores.
Realistically this is useful for av vendors who update policy and want to
clear all of their cached allows.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
c9778a98e7 fanotify: allow ignored_masks to survive modify
Some users may want to truely ignore an inode even if it has been modified.
Say you are wanting a mount which contains a log file and you really don't
want any notification about that file.  This patch allows the listener to
do that.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
c908370fc1 fsnotify: allow ignored_mask to survive modification
Some inodes a group may want to never hear about a set of events even if
the inode is modified.  We add a new mark flag which indicates that these
marks should not have their ignored_mask cleared on modification.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
e898386146 fsnotify: clear ignored mask on modify
On inode modification we clear the ignored mask for all of the marks on the
inode.  This allows userspace to ignore accesses to inodes until there is
something different.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
b9e4e3bd04 fanotify: allow users to set an ignored_mask
Change the sys_fanotify_mark() system call so users can set ignored_masks
on inodes.  Remember, if a user new sets a real mask, and only sets ignored
masks, the ignore will never be pinned in memory.  Thus ignored_masks can
be lost under memory pressure and the user may again get events they
previously thought were ignored.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
32a4df13b8 fanotify: ignored_mask to ignore events
When fanotify receives an event it will check event->mask & ~ignored_mask.
If no bits are left the event will not be sent.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
33af5e32e0 fsnotify: ignored_mask - excluding notification
The ignored_mask is a new mask which is part of fsnotify marks.  A group's
should_send_event() function can use the ignored mask to determine that
certain events are not of interest.  In particular if a group registers a
mask including FS_OPEN on a vfsmount they could add FS_OPEN to the
ignored_mask for individual inodes and not send open events for those
inodes.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
90b1e7a578 fsnotify: allow marks to not pin inodes in core
inotify marks must pin inodes in core.  dnotify doesn't technically need to
since they are closed when the directory is closed.  fanotify also need to
pin inodes in core as it works today.  But the next step is to introduce
the concept of 'ignored masks' which is actually a mask of events for an
inode of no interest.  I claim that these should be liberally sent to the
kernel and should not pin the inode in core.  If the inode is brought back
in the listener will get an event it may have thought excluded, but this is
not a serious situation and one any listener should deal with.

This patch lays the ground work for non-pinning inode marks by using lazy
inode pinning.  We do not pin a mark until it has a non-zero mask entry.  If a
listener new sets a mask we never pin the inode.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Andreas Gruenbacher
33d3dfff45 fanotify: remove outgoing function checks in fanotify.h
A number of validity checks on outgoing data are done in static inlines but
are only used in one place.  Instead just do them where they are used for
readability.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Andreas Gruenbacher
88380fe66e fanotify: remove fanotify.h declarations
fanotify_mark_validate functions are all needlessly declared in headers as
static inlines.  Instead just do the checks where they are needed for code
readability.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Andreas Gruenbacher
f3640192c0 fanotify: split fanotify_remove_mark
split fanotify_remove_mark into fanotify_remove_inode_mark and
fanotify_remove_vfsmount_mark.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Andreas Gruenbacher
eac8e9e80c fanotify: rename FAN_MARK_ON_VFSMOUNT to FAN_MARK_MOUNT
the term 'vfsmount' isn't sensicle to userspace.  instead call is 'mount.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Eric Paris
0ff21db9fc fanotify: hooks the fanotify_mark syscall to the vfsmount code
Create a new fanotify_mark flag which indicates we should attach the mark
to the vfsmount holding the object referenced by dfd and pathname rather
than the inode itself.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Andreas Gruenbacher
90dd201d1a fanotify: remove fanotify_add_mark
fanotify_add_mark now does nothing useful anymore, drop it.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:58 -04:00
Andreas Gruenbacher
52202dfbd9 fanotify: do not return pointer from fanotify_add_*_mark
No need to return the mark from fanotify_add_*_mark to fanotify_add_mark

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:58 -04:00
Andreas Gruenbacher
912ee3946c fanotify: do not call fanotify_update_object_mask in fanotify_add_mark
Recalculate masks in fanotify_add_mark, don't use
fanotify_update_object_mask.  This gets us one step closers to readable
code.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:58 -04:00
Andreas Gruenbacher
088b09b0ac fanotify: do not call fanotify_update_object_mask in fanotify_remove_mark
Recalculate masks in fanotify_remove_mark, don't use
fanotify_update_object_mask.  This gets us one step closers to readable
code.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:58 -04:00
Andreas Gruenbacher
c6223f4649 fanotify: remove fanotify_update_mark
fanotify_update_mark() doesn't do much useful;  remove it.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:58 -04:00
Eric Paris
88826276dc fanotify: infrastructure to add an remove marks on vfsmounts
infrastructure work to add and remove marks on vfsmounts.  This should get
every set up except wiring the functions to the syscalls.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Eric Paris
1c529063a3 fanotify: should_send_event needs to handle vfsmounts
currently should_send_event in fanotify only cares about marks on inodes.
This patch extends that interface to indicate that it cares about events
that happened on vfsmounts.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Andreas Gruenbacher
ca9c726eea fsnotify: Infrastructure for per-mount watches
Per-mount watches allow groups to listen to fsnotify events on an entire
mount.  This patch simply adds and initializes the fields needed in the
vfsmount struct to make this happen.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Eric Paris
0d48b7f01f fsnotify: vfsmount marks generic functions
Much like inode-mark.c has all of the code dealing with marks on inodes
this patch adds a vfsmount-mark.c which has similar code but is intended
for marks on vfsmounts.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Andreas Gruenbacher
2504c5d63b fsnotify/vfsmount: add fsnotify fields to struct vfsmount
This patch adds the list and mask fields needed to support vfsmount marks.
These are the same fields fsnotify needs on an inode.  They are not used,
just declared and we note where the cleanup hook should be (the function is
not yet defined)

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Eric Paris
ba643f04cd fsnotify: clear marks to 0 in fsnotify_init_mark
Currently fsnotify_init_mark sets some fields to 0/NULL.  Some users
already used some sorts of zalloc, some didn't.  This patch uses memset to
explicitly zero everything in the fsnotify_mark when it is initialized so we
don't have to be careful if fields are later added to marks.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Eric Paris
5444e2981c fsnotify: split generic and inode specific mark code
currently all marking is done by functions in inode-mark.c.  Some of this
is pretty generic and should be instead done in a generic function and we
should only put the inode specific code in inode-mark.c

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Andreas Gruenbacher
32c3263221 fanotify: Add pids to events
Pass the process identifiers of the triggering processes to fanotify
listeners: this information is useful for event filtering and logging.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Andreas Gruenbacher
22aa425dec fanotify: create_fd cleanup
Code cleanup which does the fd creation work seperately from the userspace
metadata creation.  It fits better with the other code.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Heiko Carstens
9bbfc964b8 fanotify: CONFIG_HAVE_SYSCALL_WRAPPERS for sys_fanotify_mark
Please note that you need the patch below in addition, otherwise the
syscall wrapper stuff won't work on those 32 bit architectures which enable
the wrappers.

When enabled the syscall wrapper defines always take long parameters and then
cast them to whatever is needed. This approach doesn't work for the 32 bit
case where the original syscall takes a long long parameter, since we would
lose the upper 32 bits.
So syscalls with 64 bit arguments are special cases wrt to syscall wrappers
and enp up in the ugliness below (see also sys_fallocate). In addition these
special cased syscall wrappers have the drawback that ftrace syscall tracing
doesn't work on them, since they don't get defined by using the usual macros.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Paul Mundt
ef601a9cfd fanotify: select ANON_INODES.
fanotify references anon_inode_getfd(), which is only available with
ANON_INODES enabled. Presently this bails out with the following:

  LD      vmlinux
fs/built-in.o: In function `sys_fanotify_init':
(.text+0x26d1c): undefined reference to `anon_inode_getfd'
make: *** [vmlinux] Error 1

which is trivially corrected by adding an ANON_INODES select.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Eric Paris
a1014f1023 fanotify: send events using read
Send events to userspace by reading the file descriptor from fanotify_init().
One will get blocks of data which look like:

struct fanotify_event_metadata {
	__u32 event_len;
	__u32 vers;
	__s32 fd;
	__u64 mask;
	__s64 pid;
	__u64 cookie;
} __attribute__ ((packed));

Simple code to retrieve and deal with events is below

	while ((len = read(fan_fd, buf, sizeof(buf))) > 0) {
		struct fanotify_event_metadata *metadata;

		metadata = (void *)buf;
		while(FAN_EVENT_OK(metadata, len)) {
			[PROCESS HERE!!]
			if (metadata->fd >= 0 && close(metadata->fd) != 0)
				goto fail;
			metadata = FAN_EVENT_NEXT(metadata, len);
		}
	}

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Eric Paris
2a3edf8604 fanotify: fanotify_mark syscall implementation
NAME
	fanotify_mark - add, remove, or modify an fanotify mark on a
filesystem object

SYNOPSIS
	int fanotify_mark(int fanotify_fd, unsigned int flags, u64 mask,
			  int dfd, const char *pathname)

DESCRIPTION
	fanotify_mark() is used to add remove or modify a mark on a filesystem
	object.  Marks are used to indicate that the fanotify group is
	interested in events which occur on that object.  At this point in
	time marks may only be added to files and directories.

	fanotify_fd must be a file descriptor returned by fanotify_init()

	The flags field must contain exactly one of the following:

	FAN_MARK_ADD - or the bits in mask and ignored mask into the mark
	FAN_MARK_REMOVE - bitwise remove the bits in mask and ignored mark
		from the mark

	The following values can be OR'd into the flags field:

	FAN_MARK_DONT_FOLLOW - same meaning as O_NOFOLLOW as described in open(2)
	FAN_MARK_ONLYDIR - same meaning as O_DIRECTORY as described in open(2)

	dfd may be any of the following:
	AT_FDCWD: the object will be lookup up based on pathname similar
		to open(2)

	file descriptor of a directory: if pathname is not NULL the
		object to modify will be lookup up similar to openat(2)

	file descriptor of the final object: if pathname is NULL the
		object to modify will be the object referenced by dfd

	The mask is the bitwise OR of the set of events of interest such as:
	FAN_ACCESS		- object was accessed (read)
	FAN_MODIFY		- object was modified (write)
	FAN_CLOSE_WRITE		- object was writable and was closed
	FAN_CLOSE_NOWRITE	- object was read only and was closed
	FAN_OPEN		- object was opened
	FAN_EVENT_ON_CHILD	- interested in objected that happen to
				  children.  Only relavent when the object
				  is a directory
	FAN_Q_OVERFLOW		- event queue overflowed (not implemented)

RETURN VALUE
	On success, this system call returns 0. On error, -1 is
	returned, and errno is set to indicate the error.

ERRORS
	EINVAL An invalid value was specified in flags.

	EINVAL An invalid value was specified in mask.

	EINVAL An invalid value was specified in ignored_mask.

	EINVAL fanotify_fd is not a file descriptor as returned by
	fanotify_init()

	EBADF fanotify_fd is not a valid file descriptor

	EBADF dfd is not a valid file descriptor and path is NULL.

	ENOTDIR dfd is not a directory and path is not NULL

	EACCESS no search permissions on some part of the path

	ENENT file not found

	ENOMEM Insufficient kernel memory is available.

CONFORMING TO
	These system calls are Linux-specific.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Eric Paris
bbaa4168b2 fanotify: sys_fanotify_mark declartion
This patch simply declares the new sys_fanotify_mark syscall

int fanotify_mark(int fanotify_fd, unsigned int flags, u64_mask,
		  int dfd const char *pathname)

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:55 -04:00
Eric Paris
52c923dd07 fanotify: fanotify_init syscall implementation
NAME
	fanotify_init - initialize an fanotify group

SYNOPSIS
	int fanotify_init(unsigned int flags, unsigned int event_f_flags, int priority);

DESCRIPTION
	fanotify_init() initializes a new fanotify instance and returns a file
	descriptor associated with the new fanotify event queue.

	The following values can be OR'd into the flags field:

	FAN_NONBLOCK Set the O_NONBLOCK file status flag on the new open file description.
		Using this flag saves extra calls to fcntl(2) to achieve the same
		result.

	FAN_CLOEXEC Set the close-on-exec (FD_CLOEXEC) flag on the new file descriptor.
		See the description of the O_CLOEXEC flag in open(2) for reasons why
		this may be useful.

	The event_f_flags argument is unused and must be set to 0

	The priority argument is unused and must be set to 0

RETURN VALUE
	On success, this system call return a new file descriptor. On error, -1 is
	returned, and errno is set to indicate the error.

ERRORS
	EINVAL An invalid value was specified in flags.

	EINVAL A non-zero valid was passed in event_f_flags or in priority

	ENFILE The system limit on the total number of file descriptors has been reached.

	ENOMEM Insufficient kernel memory is available.

CONFORMING TO
	These system calls are Linux-specific.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:55 -04:00
Eric Paris
11637e4b7d fanotify: fanotify_init syscall declaration
This patch defines a new syscall fanotify_init() of the form:

int sys_fanotify_init(unsigned int flags, unsigned int event_f_flags,
		      unsigned int priority)

This syscall is used to create and fanotify group.  This is very similar to
the inotify_init() syscall.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:55 -04:00
Eric Paris
9dced01a09 fanotify: do not clone on merge unless needed
Currently if 2 events are going to be merged on the notication queue with
different masks the second event will be cloned and will replace the first
event.  However if this notification queue is the only place referencing
the event in question there is no reason not to just update the event in
place.  We can tell this if the event->refcnt == 1.  Since we hold a
reference for each queue this event is on we know that when refcnt == 1
this is the only queue.  The other concern is that it might be about to be
added to a new queue, but this can't be the case since fsnotify holds a
reference on the event until it is finished adding it to queues.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:55 -04:00
Eric Paris
a12a7dd328 fanotify: merge notification events with different masks
Instead of just merging fanotify events if they are exactly the same, merge
notification events with different masks.  To do this we have to clone the
old event, update the mask in the new event with the new merged mask, and
put the new event in place of the old event.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:55 -04:00
Eric Paris
767cd46c33 fanotify:drop notification if they exist in the outgoing queue
fanotify listeners get an open file descriptor to the object in question so
the ordering of operations is not as important as in other notification
systems.  inotify will drop events if the last event in the event FIFO is
the same as the current event.  This patch will drop fanotify events if
they are the same as another event anywhere in the event FIFO.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:54 -04:00
Eric Paris
ff0b16a985 fanotify: fscking all notification system
fanotify is a novel file notification system which bases notification on
giving userspace both an event type (open, close, read, write) and an open
file descriptor to the object in question.  This should address a number of
races and problems with other notification systems like inotify and dnotify
and should allow the future implementation of blocking or access controlled
notification.  These are useful for on access scanners or hierachical storage
management schemes.

This patch just implements the basics of the fsnotify functions.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:54 -04:00
Andreas Gruenbacher
3556608709 fsnotify: take inode->i_lock inside fsnotify_find_mark_entry()
All callers to fsnotify_find_mark_entry() except one take and
release inode->i_lock around the call.  Take the lock inside
fsnotify_find_mark_entry() instead.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:54 -04:00
Eric Paris
ef5e2b785f dnotify: rename mark_entry to mark
nomenclature change.  Used to call things 'entries' but now we just call
them 'marks.'  Do those changes for dnotify.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:54 -04:00
Eric Paris
000285deb9 inotify: rename mark_entry to just mark
rename anything in inotify that deals with mark_entry to just be mark.  It
makes a lot more sense.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:54 -04:00
Eric Paris
841bdc10f5 fsnotify: rename mark_entry to just mark
previously I used mark_entry when talking about marks on inodes.  The
_entry is pretty useless.  Just use "mark" instead.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:53 -04:00
Eric Paris
d07754412f fsnotify: rename fsnotify_find_mark_entry to fsnotify_find_mark
the _entry portion of fsnotify functions is useless.  Drop it.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:53 -04:00
Eric Paris
e61ce86737 fsnotify: rename fsnotify_mark_entry to just fsnotify_mark
The name is long and it serves no real purpose.  So rename
fsnotify_mark_entry to just fsnotify_mark.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:53 -04:00
Andreas Gruenbacher
72acc85442 fsnotify: kill FSNOTIFY_EVENT_FILE
Some fsnotify operations send a struct file.  This is more information than
we technically need.  We instead send a struct path in all cases instead of
sometimes a path and sometimes a file.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:53 -04:00
Eric Paris
098cf2fc77 fsnotify: add flags to fsnotify_mark_entries
To differentiate between inode and vfsmount (or other future) types of
marks we add a flags field and set the inode bit on inode marks (the only
currently supported type of mark)

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:52 -04:00
Eric Paris
2823e04de4 fsnotify: put inode specific fields in an fsnotify_mark in a union
The addition of marks on vfs mounts will be simplified if the inode
specific parts of a mark and the vfsmnt specific parts of a mark are
actually in a union so naming can be easy.  This patch just implements the
inode struct and the union.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:52 -04:00
Eric Paris
3a9fb89f4c fsnotify: include vfsmount in should_send_event when appropriate
To ensure that a group will not duplicate events when it receives it based
on the vfsmount and the inode should_send_event test we should distinguish
those two cases.  We pass a vfsmount to this function so groups can make
their own determinations.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:52 -04:00
Eric Paris
7131485a93 fsnotify: mount point listeners list and global mask
currently all of the notification systems implemented select which inodes
they care about and receive messages only about those inodes (or the
children of those inodes.)  This patch begins to flesh out fsnotify support
for the concept of listeners that want to hear notification for an inode
accessed below a given monut point.  This patch implements a second list
of fsnotify groups to hold these types of groups and a second global mask
to hold the events of interest for this type of group.

The reason we want a second group list and mask is because the inode based
notification should_send_event support which makes each group look for a mark
on the given inode.  With one nfsmount listener that means that every group would
have to take the inode->i_lock, look for their mark, not find one, and return
for every operation.   By seperating vfsmount from inode listeners only when
there is a inode listener will the inode groups have to look for their
mark and take the inode lock.  vfsmount listeners will have to grab the lock and
look for a mark but there should be fewer of them, and one vfsmount listener
won't cause the i_lock to be grabbed and released for every fsnotify group
on every io operation.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:52 -04:00
Eric Paris
4ca763523e fsnotify: add groups to fsnotify_inode_groups when registering inode watch
Currently all fsnotify groups are added immediately to the
fsnotify_inode_groups list upon creation.  This means, even groups with no
watches (common for audit) will be on the global tracking list and will
get checked for every event.  This patch adds groups to the global list on
when the first inode mark is added to the group.

Signed-of-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:51 -04:00
Eric Paris
36fddebaa8 fsnotify: initialize the group->num_marks in a better place
Currently the comments say that group->num_marks is held because the group
is on the fsnotify_group list.  This isn't strictly the case, we really
just hold the num_marks for the life of the group (any time group->refcnt
is != 0)  This patch moves the initialization stuff and makes it clear when
it is really being held.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:51 -04:00
Eric Paris
19c2a0e1a2 fsnotify: rename fsnotify_groups to fsnotify_inode_groups
Simple renaming patch.  fsnotify is about to support mount point listeners
so I am renaming fsnotify_groups and fsnotify_mask to indicate these are lists
used only for groups which have watches on inodes.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:51 -04:00
Eric Paris
0d2e2a1d00 fsnotify: drop mask argument from fsnotify_alloc_group
Nothing uses the mask argument to fsnotify_alloc_group.  This patch drops
that argument.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:51 -04:00
Eric Paris
ffab83402f fsnotify: fsnotify_obtain_group should be fsnotify_alloc_group
fsnotify_obtain_group was intended to be able to find an already existing
group.  Nothing uses that functionality.  This just renames it to
fsnotify_alloc_group so it is clear what it is doing.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:50 -04:00
Eric Paris
cd7752ce7c fsnotify: fsnotify_obtain_group kzalloc cleanup
fsnotify_obtain_group uses kzalloc but then proceedes to set things to 0.
This patch just deletes those useless lines.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:50 -04:00
Eric Paris
74be0cc828 fsnotify: remove group_num altogether
The original fsnotify interface has a group-num which was intended to be
able to find a group after it was added.  I no longer think this is a
necessary thing to do and so we remove the group_num.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:50 -04:00
Eric Paris
cac69dad32 fsnotify: lock annotation for event replacement
fsnotify_replace_event need to lock both the old and the new event.  This
causes lockdep to get all pissed off since it dosn't know this is safe.
It's safe in this case since the new event is impossible to be reached from
other places in the kernel.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:50 -04:00
Eric Paris
1201a5361b fsnotify: replace an event on a list
fanotify would like to clone events already on its notification list, make
changes to the new event, and then replace the old event on the list with
the new event.  This patch implements the replace functionality of that
process.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:49 -04:00
Eric Paris
b4e4e14073 fsnotify: clone existing events
fsnotify_clone_event will take an event, clone it, and return the cloned
event to the caller.  Since events may be in use by multiple fsnotify
groups simultaneously certain event entries (such as the mask) cannot be
changed after the event was created.  Since fanotify would like to merge
events happening on the same file it needs a new clean event to work with
so it can change any fields it wishes.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:49 -04:00
Eric Paris
74766bbfa9 fsnotify: per group notification queue merge types
inotify only wishes to merge a new event with the last event on the
notification fifo.  fanotify is willing to merge any events including by
means of bitwise OR masks of multiple events together.  This patch moves
the inotify event merging logic out of the generic fsnotify notification.c
and into the inotify code.  This allows each use of fsnotify to provide
their own merge functionality.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:49 -04:00
Eric Paris
28c60e37f8 fsnotify: send struct file when sending events to parents when possible
fanotify needs a path in order to open an fd to the object which changed.
Currently notifications to inode's parents are done using only the inode.
For some parental notification we have the entire file, send that so
fanotify can use it.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:48 -04:00
Eric Paris
8112e2d6a7 fsnotify: include data in should_send calls
fanotify is going to need to look at file->private_data to know if an event
should be sent or not.  This passes the data (which might be a file,
dentry, inode, or none) to the should_send function calls so fanotify can
get that information when available

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:31 -04:00
Eric Paris
7b0a04fbfb fsnotify: provide the data type to should_send_event
fanotify is only interested in event types which contain enough information
to open the original file in the context of the fanotify listener.  Since
fanotify may not want to send events if that data isn't present we pass
the data type to the should_send_event function call so fanotify can express
its lack of interest.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:31 -04:00
Eric Paris
d7f0ce4e43 inotify: do not spam console without limit
inotify was supposed to have a dmesg printk ratelimitor which would cause
inotify to only emit one message per boot.  The static bool was never set
so it kept firing messages.  This patch correctly limits warnings in multiple
places.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:31 -04:00
Eric Paris
2dfc1cae4c inotify: remove inotify in kernel interface
nothing uses inotify in the kernel, drop it!

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:31 -04:00
Eric Paris
7050c48826 inotify: do not reuse watch descriptors
Prior to 2.6.31 inotify would not reuse watch descriptors until all of
them had been used at least once.  After the rewrite inotify would reuse
watch descriptors.  The selinux utility 'restorecond' was found to have
problems when watch descriptors were reused.  This patch reverts to the
pre inotify rewrite behavior to not reuse watch descriptors.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:20 -04:00
Eric Paris
6f3a539e3b fsnotify: use kmem_cache_zalloc to simplify event initialization
fsnotify event initialization is done entry by entry with almost everything
set to either 0 or NULL.  Use kmem_cache_zalloc and only initialize things
that need non-zero initialization.  Also means we don't have to change
initialization entries based on the config options.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:20 -04:00
Eric Paris
f0553af054 fsnotify: kzalloc fsnotify groups
Use kzalloc for fsnotify_groups so that none of the fields can leak any
information accidentally.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:20 -04:00
Eric Paris
31ddd3268d inotify: use container_of instead of casting
inotify_free_mark casts directly from an fsnotify_mark_entry to an
inotify_inode_mark_entry.  This works, but should use container_of instead
for future proofing.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:19 -04:00
Eric Paris
b4277d3dd5 fsnotify: use fsnotify_create_event to allocate the q_overflow event
Currently fsnotify defines a static fsnotify event which is sent when a
group overflows its allotted queue length.  This patch just allocates that
event from the event cache rather than defining it statically.  There is no
known reason that the current implementation is wrong, but this makes sure the
event is initialized and created like any other.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:19 -04:00
Eric Paris
40554c3dae fsnotify: allow addition of duplicate fsnotify marks
This patch allows a task to add a second fsnotify mark to an inode for the
same group.  This mark will be added to the end of the inode's list and
this will never be found by the stand fsnotify_find_mark() function.   This
is useful if a user wants to add a new mark before removing the old one.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:17 -04:00
Eric Paris
9e1c74321d fsnotify: duplicate fsnotify_mark_entry data between 2 marks
Simple copy fsnotify information from one mark to another in preparation
for the second mark to replace the first.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:17 -04:00
Eric Paris
b7ba837153 inotify: simplify the inotify idr handling
This patch moves all of the idr editing operations into their own idr
functions.  It makes it easier to prove locking correctness and to to
understand the code flow.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:16 -04:00
Al Viro
1712ac8fda Saner locking around deactivate_super()
Make sure that s_umount is acquired *before* we drop the final
active reference; we still have the fast path (atomic_dec_unless)
and we have gotten rid of the window between the moment when
s_active hits zero and s_umount is acquired.  Which simplifies
the living hell out of grab_super() and inotify pin_to_kill()
stuff.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
b20bd1a5e7 get rid of S_BIAS
use atomic_inc_not_zero(&sb->s_active) instead of playing games with
checking ->s_count > S_BIAS

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Linus Torvalds
4fc4c3ce0d Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
  inotify: don't leak user struct on inotify release
  inotify: race use after free/double free in inotify inode marks
  inotify: clean up the inotify_add_watch out path
  Inotify: undefined reference to `anon_inode_getfd'

Manual merge to remove duplicate "select ANON_INODES" from Kconfig file
2010-05-14 11:49:42 -07:00
Pavel Emelyanov
b3b38d842f inotify: don't leak user struct on inotify release
inotify_new_group() receives a get_uid-ed user_struct and saves the
reference on group->inotify_data.user.  The problem is that free_uid() is
never called on it.

Issue seem to be introduced by 63c882a0 (inotify: reimplement inotify
using fsnotify) after 2.6.30.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Eric Paris <eparis@parisplace.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-14 11:53:36 -04:00
Eric Paris
e08733446e inotify: race use after free/double free in inotify inode marks
There is a race in the inotify add/rm watch code.  A task can find and
remove a mark which doesn't have all of it's references.  This can
result in a use after free/double free situation.

Task A					Task B
------------				-----------
inotify_new_watch()
 allocate a mark (refcnt == 1)
 add it to the idr
					inotify_rm_watch()
					 inotify_remove_from_idr()
					  fsnotify_put_mark()
					      refcnt hits 0, free
 take reference because we are on idr
 [at this point it is a use after free]
 [time goes on]
 refcnt may hit 0 again, double free

The fix is to take the reference BEFORE the object can be found in the
idr.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: <stable@kernel.org>
2010-05-14 11:52:57 -04:00
Eric Paris
3dbc6fb6a3 inotify: clean up the inotify_add_watch out path
inotify_add_watch explictly frees the unused inode mark, but it can just
use the generic code.  Just do that.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-14 11:51:07 -04:00
Russell King
e7b702b1a8 Inotify: undefined reference to `anon_inode_getfd'
Fix:

fs/built-in.o: In function `sys_inotify_init1':
summary.c:(.text+0x347a4): undefined reference to `anon_inode_getfd'

found by kautobuild with arms bcmring_defconfig, which ends up with
INOTIFY_USER enabled (through the 'default y') but leaves ANON_INODES
unset.  However, inotify_user.c uses anon_inode_getfd().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-12 11:03:40 -04:00
Ralf Baechle
12b1b32168 Inotify: Fix build failure in inotify user support
CONFIG_INOTIFY_USER defined but CONFIG_ANON_INODES undefined will result
in the following build failure:

    LD      vmlinux
  fs/built-in.o: In function 'sys_inotify_init1':
  (.text.sys_inotify_init1+0x22c): undefined reference to 'anon_inode_getfd'
  fs/built-in.o: In function `sys_inotify_init1':
  (.text.sys_inotify_init1+0x22c): relocation truncated to fit: R_MIPS_26 against 'anon_inode_getfd'
  make[2]: *** [vmlinux] Error 1
  make[1]: *** [sub-make] Error 2
  make: *** [all] Error 2

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-30 10:14:56 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Al Viro
c44dcc56d2 switch inotify_user to anon_inode
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-02-19 03:35:12 -05:00
Eric Paris
976ae32be4 inotify: only warn once for inotify problems
inotify will WARN() if it finds that the idr and the fsnotify internals
somehow got out of sync.  It was only supposed to do this once but due
to this stupid bug it would warn every single time a problem was
detected.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-15 14:49:23 -08:00
Eric Paris
9e572cc987 inotify: do not reuse watch descriptors
Since commit 7e790dd5fc ("inotify: fix
error paths in inotify_update_watch") inotify changed the manor in which
it gave watch descriptors back to userspace.  Previous to this commit
inotify acted like the following:

  inotify_add_watch(X, Y, Z) = 1
  inotify_rm_watch(X, 1);
  inotify_add_watch(X, Y, Z) = 2

but after this patch inotify would return watch descriptors like so:

  inotify_add_watch(X, Y, Z) = 1
  inotify_rm_watch(X, 1);
  inotify_add_watch(X, Y, Z) = 1

which I saw as equivalent to opening an fd where

  open(file) = 1;
  close(1);
  open(file) = 1;

seemed perfectly reasonable.  The issue is that quite a bit of userspace
apparently relies on the behavior in which watch descriptors will not be
quickly reused.  KDE relies on it, I know some selinux packages rely on
it, and I have heard complaints from other random sources such as debian
bug 558981.

Although the man page implies what we do is ok, we broke userspace so
this patch almost reverts us to the old behavior.  It is still slightly
racey and I have patches that would fix that, but they are rather large
and this will fix it for all real world cases.  The race is as follows:

 - task1 creates a watch and blocks in idr_new_watch() before it updates
   the hint.
 - task2 creates a watch and updates the hint.
 - task1 updates the hint with it's older wd
 - task removes the watch created by task2
 - task adds a new watch and will reuse the wd originally given to task2

it requires moving some locking around the hint (last_wd) but this should
solve it for the real world and be -stable safe.

As a side effect this patch papers over a bug in the lib/idr code which
is causing a large number WARN's to pop on people's system and many
reports in kerneloops.org.  I'm working on the root cause of that idr
bug seperately but this should make inotify immune to that issue.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-15 14:49:23 -08:00
Al Viro
2c48b9c455 switch alloc_file() to passing struct path
... and have the caller grab both mnt and dentry; kill
leak in infiniband, while we are at it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:42 -05:00
Al Viro
825f9692fb switched inotify_init1() to alloc_file()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:40 -05:00
Linus Torvalds
4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Giuseppe Scrivano
336e8683b9 inotify: remove superfluous return code check
Signed-off-by: Giuseppe Scrivano <gscrivano@gnu.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:58 +01:00
Eric W. Biederman
6d4561110a sysctl: Drop & in front of every proc_handler.
For consistency drop & in front of every proc_handler.  Explicity
taking the address is unnecessary and it prevents optimizations
like stubbing the proc_handlers to NULL.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-18 08:37:40 -08:00
Eric W. Biederman
ab09203e30 sysctl fs: Remove dead binary sysctl support
Now that sys_sysctl is a generic wrapper around /proc/sys  .ctl_name
and .strategy members of sysctl tables are dead code.  Remove them.

Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-12 02:04:55 -08:00
Andreas Gruenbacher
945526846a dnotify: ignore FS_EVENT_ON_CHILD
Mask off FS_EVENT_ON_CHILD in dnotify_handle_event().  Otherwise, when there
is more than one watch on a directory and dnotify_should_send_event()
succeeds, events with FS_EVENT_ON_CHILD set will trigger all watches and cause
spurious events.

This case was overlooked in commit e42e2773.

	#define _GNU_SOURCE

	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <signal.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <fcntl.h>
	#include <string.h>

	static void create_event(int s, siginfo_t* si, void* p)
	{
		printf("create\n");
	}

	static void delete_event(int s, siginfo_t* si, void* p)
	{
		printf("delete\n");
	}

	int main (void) {
		struct sigaction action;
		char *tmpdir, *file;
		int fd1, fd2;

		sigemptyset (&action.sa_mask);
		action.sa_flags = SA_SIGINFO;

		action.sa_sigaction = create_event;
		sigaction (SIGRTMIN + 0, &action, NULL);

		action.sa_sigaction = delete_event;
		sigaction (SIGRTMIN + 1, &action, NULL);

	#	define TMPDIR "/tmp/test.XXXXXX"
		tmpdir = malloc(strlen(TMPDIR) + 1);
		strcpy(tmpdir, TMPDIR);
		mkdtemp(tmpdir);

	#	define TMPFILE "/file"
		file = malloc(strlen(tmpdir) + strlen(TMPFILE) + 1);
		sprintf(file, "%s/%s", tmpdir, TMPFILE);

		fd1 = open (tmpdir, O_RDONLY);
		fcntl(fd1, F_SETSIG, SIGRTMIN);
		fcntl(fd1, F_NOTIFY, DN_MULTISHOT | DN_CREATE);

		fd2 = open (tmpdir, O_RDONLY);
		fcntl(fd2, F_SETSIG, SIGRTMIN + 1);
		fcntl(fd2, F_NOTIFY, DN_MULTISHOT | DN_DELETE);

		if (fork()) {
			/* This triggers a create event */
			creat(file, 0600);
			/* This triggers a create and delete event (!) */
			unlink(file);
		} else {
			sleep(1);
			rmdir(tmpdir);
		}

		return 0;
	}

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2009-10-20 18:02:33 -04:00
Wei Yongjun
3de0ef4f20 inotify: fix coalesce duplicate events into a single event in special case
If we do rename a dir entry, like this:

  rename("/tmp/ino7UrgoJ.rename1", "/tmp/ino7UrgoJ.rename2")
  rename("/tmp/ino7UrgoJ.rename2", "/tmp/ino7UrgoJ")

The duplicate events should be coalesced into a single event. But those two
events do not be coalesced into a single event, due to some bad check in
event_compare(). It can not match the two NULL inodes as the same event.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2009-10-18 15:49:38 -04:00
Eric Paris
9f0d793b52 fsnotify: do not set group for a mark before it is on the i_list
fsnotify_add_mark is supposed to add a mark to the g_list and i_list and to
set the group and inode for the mark.  fsnotify_destroy_mark_by_entry uses
the fact that ->group != NULL to know if this group should be destroyed or
if it's already been done.

But fsnotify_add_mark sets the group and inode before it actually adds the
mark to the i_list and g_list.  This can result in a race in inotify, it
requires 3 threads.

sys_inotify_add_watch("file")	sys_inotify_add_watch("file")	sys_inotify_rm_watch([a])
inotify_update_watch()
inotify_new_watch()
inotify_add_to_idr()
   ^--- returns wd = [a]
				inotfiy_update_watch()
				inotify_new_watch()
				inotify_add_to_idr()
				fsnotify_add_mark()
				   ^--- returns wd = [b]
				returns to userspace;
								inotify_idr_find([a])
								   ^--- gives us the pointer from task 1
fsnotify_add_mark()
   ^--- this is going to set the mark->group and mark->inode fields, but will
return -EEXIST because of the race with [b].
								fsnotify_destroy_mark()
								   ^--- since ->group != NULL we call back
									into inotify_freeing_mark() which calls
								inotify_remove_from_idr([a])

since fsnotify_add_mark() failed we call:
inotify_remove_from_idr([a])     <------WHOOPS it's not in the idr, this could
					have been any entry added later!

The fix is to make sure we don't set mark->group until we are sure the mark is
on the inode and fsnotify_add_mark will return success.

Signed-off-by: Eric Paris <eparis@redhat.com>
2009-10-18 15:49:38 -04:00
Eric Paris
750a8870fe inotify: update the group mask on mark addition
Seperating the addition and update of marks in inotify resulted in a
regression in that inotify never gets events.  The inotify group mask is
always 0.  This mask should be updated any time a new mark is added.

Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-28 12:51:14 -04:00
Eric Paris
83cb10f0ef inotify: fix length reporting and size checking
0db501bd06 introduced a regresion in that it now sends a nul
terminator but the length accounting when checking for space or
reporting to userspace did not take this into account.  This corrects
all of the rounding logic.

Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-28 11:57:55 -04:00
Brian Rogers
b962e7312a inotify: do not send a block of zeros when no pathname is available
When an event has no pathname, there's no need to pad it with a null byte and
therefore generate an inotify_event sized block of zeros. This fixes a
regression introduced by commit 0db501bd06 where
my system wouldn't finish booting because some process was being confused by
this.

Signed-off-by: Brian Rogers <brian@xyzw.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-28 10:03:06 -04:00