License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". The results of that were:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                         270
GPL-2.0+ WITH Linux-syscall-note                        169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
LGPL-2.1+ WITH Linux-syscall-note                        15
GPL-1.0+ WITH Linux-syscall-note                         14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
LGPL-2.0+ WITH Linux-syscall-note                         4
LGPL-2.1 WITH Linux-syscall-note                          3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
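As a rough illustration of the "different comment types" mentioned above, here is a minimal sketch in C. It is not the script Thomas and Greg used (that script is not shown here); the function names `has_suffix` and `spdx_tag_line` are hypothetical. It assumes the convention described in the kernel's license rules documentation: `.c` files carry the tag in a `//` comment, while headers carry it in a `/* */` comment.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if path ends with suffix, else 0. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Hypothetical helper: write the SPDX tag line for 'path' into 'out'.
 * Assumed convention: "//" comments for .c files, C-style comments
 * for .h files.  Returns 0 on success, -1 if the file type is not
 * handled by this sketch or the buffer is too small.
 */
int spdx_tag_line(const char *path, const char *license,
		  char *out, size_t outlen)
{
	int n;

	if (has_suffix(path, ".c"))
		n = snprintf(out, outlen,
			     "// SPDX-License-Identifier: %s", license);
	else if (has_suffix(path, ".h"))
		n = snprintf(out, outlen,
			     "/* SPDX-License-Identifier: %s */", license);
	else
		return -1;

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For a path such as fs/ceph/dir.c and the license "GPL-2.0", this sketch produces the same first line that the file below carries. Other file types (Makefiles, assembly, scripts) would need their own comment syntax and are deliberately left out.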
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
// SPDX-License-Identifier: GPL-2.0
2010-04-07 06:14:15 +08:00
#include <linux/ceph/ceph_debug.h>
ceph: directory operations
Directory operations, including lookup, are defined here. We take
advantage of lookup intents when possible. For the most part, we just
need to build the proper requests for the metadata server(s) and
pass things off to the mds_client.
The results of most operations are normally incorporated into the
client's cache when the reply is parsed by ceph_fill_trace().
However, if the MDS replies without a trace (e.g., when retrying an
update after an MDS failure recovery), some operation-specific cleanup
may be needed.
We can validate cached dentries in two ways. A per-dentry lease may
be issued by the MDS, or a per-directory cap may be issued that acts
as a lease on the entire directory. In the latter case, a 'gen' value
is used to determine which dentries belong to the currently leased
directory contents.
We normally prepopulate the dcache and icache with readdir results.
This makes subsequent lookups and getattrs avoid any server
interaction. It also lets us satisfy readdir operation by peeking at
the dcache IFF we hold the per-directory cap/lease, previously
performed a readdir, and haven't dropped any of the resulting
dentries.
Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-07 02:31:08 +08:00

#include <linux/spinlock.h>
#include <linux/namei.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/slab.h>
2009-10-07 02:31:08 +08:00
#include <linux/sched.h>
2016-04-14 06:30:17 +08:00
#include <linux/xattr.h>
2009-10-07 02:31:08 +08:00

#include "super.h"
2010-04-07 06:14:15 +08:00
#include "mds_client.h"
2022-03-14 10:28:35 +08:00
#include "crypto.h"
2009-10-07 02:31:08 +08:00

/*
 * Directory operations: readdir, lookup, create, link, unlink,
 * rename, etc.
 */

/*
 * Ceph MDS operations are specified in terms of a base ino and
 * relative path. Thus, the client can specify an operation on a
 * specific inode (e.g., a getattr due to fstat(2)), or as a path
 * relative to, say, the root directory.
 *
 * Normally, we limit ourselves to strict inode ops (no path component)
 * or dentry operations (a single path component relative to an ino). The
 * exception to this is open_root_dentry(), which will open the mount
 * point by name.
 */

2010-08-04 01:25:30 +08:00
const struct dentry_operations ceph_dentry_ops;
2009-10-07 02:31:08 +08:00

2019-01-31 16:55:51 +08:00
static bool __dentry_lease_is_valid(struct ceph_dentry_info *di);
static int __dir_lease_try_check(const struct dentry *dentry);

2009-10-07 02:31:08 +08:00
/*
 * Initialize ceph dentry state.
 */
2016-10-29 10:05:13 +08:00
static int ceph_d_init(struct dentry *dentry)
2009-10-07 02:31:08 +08:00
{
	struct ceph_dentry_info *di;
2020-09-03 21:01:39 +08:00
	struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dentry->d_sb);
2009-10-07 02:31:08 +08:00

2016-03-13 15:26:29 +08:00
	di = kmem_cache_zalloc(ceph_dentry_cachep, GFP_KERNEL);
2009-10-07 02:31:08 +08:00
	if (!di)
		return -ENOMEM; /* oh well */

	di->dentry = dentry;
	di->lease_session = NULL;
2016-06-22 22:35:04 +08:00
	di->time = jiffies;
2011-07-27 02:30:15 +08:00
	dentry->d_fsdata = di;
2019-01-31 16:55:51 +08:00
	INIT_LIST_HEAD(&di->lease_list);
2020-03-20 11:44:59 +08:00

	atomic64_inc(&mdsc->metric.total_dentries);

2009-10-07 02:31:08 +08:00
	return 0;
}

/*
2016-04-29 11:27:30 +08:00
 * for f_pos for readdir:
 * - hash order:
 *	(0xff << 52) | ((24 bits hash) << 28) |
 *	(the nth entry has hash collision);
 * - frag+name order;
 *	((frag value) << 28) | (the nth entry in frag);
2009-10-07 02:31:08 +08:00
 */
2016-04-29 11:27:30 +08:00
#define OFFSET_BITS	28
#define OFFSET_MASK	((1 << OFFSET_BITS) - 1)
#define HASH_ORDER	(0xffull << (OFFSET_BITS + 24))
loff_t ceph_make_fpos(unsigned high, unsigned off, bool hash_order)
{
	loff_t fpos = ((loff_t)high << 28) | (loff_t)off;
	if (hash_order)
		fpos |= HASH_ORDER;
	return fpos;
}

static bool is_hash_order(loff_t p)
{
	return (p & HASH_ORDER) == HASH_ORDER;
}

2009-10-07 02:31:08 +08:00
static unsigned fpos_frag(loff_t p)
{
2016-04-29 11:27:30 +08:00
	return p >> OFFSET_BITS;
2009-10-07 02:31:08 +08:00
}
2016-04-29 11:27:30 +08:00

static unsigned fpos_hash(loff_t p)
{
	return ceph_frag_value(fpos_frag(p));
}

2009-10-07 02:31:08 +08:00
static unsigned fpos_off(loff_t p)
{
2016-04-29 11:27:30 +08:00
	return p & OFFSET_MASK;
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
2014-02-13 19:40:26 +08:00
|
|
|
static int fpos_cmp(loff_t l, loff_t r)
|
|
|
|
{
|
|
|
|
int v = ceph_frag_compare(fpos_frag(l), fpos_frag(r));
|
|
|
|
if (v)
|
|
|
|
return v;
|
|
|
|
return (int)(fpos_off(l) - fpos_off(r));
|
|
|
|
}
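The position helpers above can be tried out in user space. The following is a minimal sketch, not the kernel code: it assumes the low 28 bits of a position hold the offset (the real `OFFSET_MASK` is defined outside this excerpt), and it compares the upper field as a plain integer instead of calling `ceph_frag_compare()`. All `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: the low 28 bits of a position are the offset. */
#define DEMO_OFFSET_BITS 28
#define DEMO_OFFSET_MASK ((1u << DEMO_OFFSET_BITS) - 1)

/* Pack an upper field (frag or hash) and an offset into one position. */
static int64_t demo_make_fpos(uint32_t high, uint32_t off)
{
	assert(off <= DEMO_OFFSET_MASK);
	return ((int64_t)high << DEMO_OFFSET_BITS) | (int64_t)off;
}

static uint32_t demo_fpos_high(int64_t p)
{
	return (uint32_t)(p >> DEMO_OFFSET_BITS);
}

static uint32_t demo_fpos_off(int64_t p)
{
	return (uint32_t)(p & DEMO_OFFSET_MASK);
}

/*
 * Order two positions: upper field first, then offset.
 * Simplification: the upper fields are compared as plain integers here,
 * whereas the original delegates that step to ceph_frag_compare().
 */
static int demo_fpos_cmp(int64_t l, int64_t r)
{
	uint32_t lh = demo_fpos_high(l), rh = demo_fpos_high(r);

	if (lh != rh)
		return lh < rh ? -1 : 1;
	return (int)demo_fpos_off(l) - (int)demo_fpos_off(r);
}
```

As in `fpos_cmp()`, the offset only breaks ties when both positions share the same upper field.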
|
|
|
|
|
2015-06-16 20:48:56 +08:00
|
|
|
/*
|
|
|
|
* make note of the last dentry we read, so we can
|
|
|
|
* continue at the same lexicographical point,
|
|
|
|
* regardless of what dir changes take place on the
|
|
|
|
* server.
|
|
|
|
*/
|
2023-06-12 09:04:07 +08:00
|
|
|
static int note_last_dentry(struct ceph_fs_client *fsc,
|
|
|
|
struct ceph_dir_file_info *dfi,
|
|
|
|
const char *name,
|
2015-06-16 20:48:56 +08:00
|
|
|
int len, unsigned next_offset)
|
|
|
|
{
|
|
|
|
char *buf = kmalloc(len+1, GFP_KERNEL);
|
|
|
|
if (!buf)
|
|
|
|
return -ENOMEM;
|
2018-03-13 10:42:44 +08:00
|
|
|
kfree(dfi->last_name);
|
|
|
|
dfi->last_name = buf;
|
|
|
|
memcpy(dfi->last_name, name, len);
|
|
|
|
dfi->last_name[len] = 0;
|
|
|
|
dfi->next_offset = next_offset;
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(fsc->client, "'%s'\n", dfi->last_name);
|
2015-06-16 20:48:56 +08:00
|
|
|
return 0;
|
|
|
|
}
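The allocate-then-swap pattern used by `note_last_dentry()` can be sketched in user space as follows. This is an illustration under stated assumptions: `malloc`/`free` stand in for `kmalloc`/`kfree`, and `struct demo_dir_state` is a hypothetical stand-in for the two `ceph_dir_file_info` fields touched above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fields of ceph_dir_file_info used above. */
struct demo_dir_state {
	char *last_name;
	unsigned next_offset;
};

/*
 * Remember the last name that was read. The new buffer is allocated
 * before the old one is freed, so an allocation failure leaves the
 * previously saved name untouched.
 */
static int demo_note_last_name(struct demo_dir_state *st, const char *name,
			       size_t len, unsigned next_offset)
{
	char *buf;

	assert(st && name);
	buf = malloc(len + 1);
	if (!buf)
		return -ENOMEM;

	free(st->last_name);
	memcpy(buf, name, len);	/* name need not be NUL-terminated */
	buf[len] = '\0';
	st->last_name = buf;
	st->next_offset = next_offset;
	return 0;
}
```

The explicit length plus manual terminator mirrors the original, which copies `len` bytes from a dentry name rather than relying on a terminating NUL.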
|
|
|
|
|
2016-04-28 17:43:35 +08:00
|
|
|
|
|
|
|
static struct dentry *
|
|
|
|
__dcache_find_get_entry(struct dentry *parent, u64 idx,
|
|
|
|
struct ceph_readdir_cache_control *cache_ctl)
|
|
|
|
{
|
|
|
|
struct inode *dir = d_inode(parent);
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = ceph_inode_to_client(dir);
|
2016-04-28 17:43:35 +08:00
|
|
|
struct dentry *dentry;
|
|
|
|
unsigned idx_mask = (PAGE_SIZE / sizeof(struct dentry *)) - 1;
|
|
|
|
loff_t ptr_pos = idx * sizeof(struct dentry *);
|
|
|
|
pgoff_t ptr_pgoff = ptr_pos >> PAGE_SHIFT;
|
|
|
|
|
|
|
|
if (ptr_pos >= i_size_read(dir))
|
|
|
|
return NULL;
|
|
|
|
|
2024-05-22 01:58:45 +08:00
|
|
|
if (!cache_ctl->page || ptr_pgoff != cache_ctl->page->index) {
|
2016-04-28 17:43:35 +08:00
|
|
|
ceph_readdir_cache_release(cache_ctl);
|
|
|
|
cache_ctl->page = find_lock_page(&dir->i_data, ptr_pgoff);
|
|
|
|
if (!cache_ctl->page) {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, " page %lu not found\n", ptr_pgoff);
|
2016-04-28 17:43:35 +08:00
|
|
|
return ERR_PTR(-EAGAIN);
|
|
|
|
}
|
|
|
|
/* reading/filling the cache are serialized by
|
2022-02-11 11:33:25 +08:00
|
|
|
i_rwsem, no need to use page lock */
|
2016-04-28 17:43:35 +08:00
|
|
|
unlock_page(cache_ctl->page);
|
|
|
|
cache_ctl->dentries = kmap(cache_ctl->page);
|
|
|
|
}
|
|
|
|
|
|
|
|
cache_ctl->index = idx & idx_mask;
|
|
|
|
|
|
|
|
rcu_read_lock();
|
|
|
|
spin_lock(&parent->d_lock);
|
|
|
|
/* check i_size again here, because empty directory can be
|
2022-02-11 11:33:25 +08:00
|
|
|
* marked as complete while not holding the i_rwsem. */
|
2016-04-28 17:43:35 +08:00
|
|
|
if (ceph_dir_is_complete_ordered(dir) && ptr_pos < i_size_read(dir))
|
|
|
|
dentry = cache_ctl->dentries[cache_ctl->index];
|
|
|
|
else
|
|
|
|
dentry = NULL;
|
|
|
|
spin_unlock(&parent->d_lock);
|
|
|
|
if (dentry && !lockref_get_not_dead(&dentry->d_lockref))
|
|
|
|
dentry = NULL;
|
|
|
|
rcu_read_unlock();
|
|
|
|
return dentry ? : ERR_PTR(-EAGAIN);
|
|
|
|
}
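The index arithmetic at the top of `__dcache_find_get_entry()` maps a flat cache index to a page number and a slot within that page. Here is a minimal user-space sketch of that mapping, assuming 4 KiB pages and a power-of-two pointer size; the `DEMO_*` constants and `demo_locate()` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

struct demo_slot {
	uint64_t page;	/* which page of the pointer array */
	unsigned slot;	/* index of the pointer within that page */
};

/* Locate entry idx in an array of ptr_size-byte pointers spread over pages. */
static struct demo_slot demo_locate(uint64_t idx, size_t ptr_size)
{
	struct demo_slot s;
	unsigned long per_page, mask;
	uint64_t byte_pos;

	/* The mask trick below only works for power-of-two sizes. */
	assert(ptr_size && (ptr_size & (ptr_size - 1)) == 0);

	per_page = DEMO_PAGE_SIZE / ptr_size;
	mask = per_page - 1;
	byte_pos = idx * ptr_size;

	s.page = byte_pos >> DEMO_PAGE_SHIFT;
	s.slot = (unsigned)(idx & mask);
	return s;
}
```

With 8-byte pointers each page holds 512 entries, so index 512 is the first slot of the second page.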
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
/*
|
|
|
|
* When possible, we try to satisfy a readdir by peeking at the
|
|
|
|
* dcache. We make this work by carefully ordering dentries on
|
2023-11-07 15:00:39 +08:00
|
|
|
* d_children when we initially get results back from the MDS, and
|
2009-10-07 02:31:08 +08:00
|
|
|
* falling back to a "normal" sync readdir if any dentries in the dir
|
|
|
|
* are dropped.
|
|
|
|
*
|
2013-03-13 19:44:32 +08:00
|
|
|
* Complete dir indicates that we have all dentries in the dir. It is
|
2009-10-07 02:31:08 +08:00
|
|
|
* defined IFF we hold CEPH_CAP_FILE_SHARED (which will be revoked by
|
|
|
|
* the MDS if/when the directory is modified).
|
|
|
|
*/
|
2014-04-06 14:10:04 +08:00
|
|
|
static int __dcache_readdir(struct file *file, struct dir_context *ctx,
|
2017-11-27 10:47:46 +08:00
|
|
|
int shared_gen)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2018-03-13 10:42:44 +08:00
|
|
|
struct ceph_dir_file_info *dfi = file->private_data;
|
2014-10-31 13:22:04 +08:00
|
|
|
struct dentry *parent = file->f_path.dentry;
|
2015-03-18 06:25:59 +08:00
|
|
|
struct inode *dir = d_inode(parent);
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_fs_client *fsc = ceph_inode_to_fs_client(dir);
|
|
|
|
struct ceph_client *cl = ceph_inode_to_client(dir);
|
2015-06-16 20:48:56 +08:00
|
|
|
struct dentry *dentry, *last = NULL;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_dentry_info *di;
|
2015-06-16 20:48:56 +08:00
|
|
|
struct ceph_readdir_cache_control cache_ctl = {};
|
2016-04-28 17:43:35 +08:00
|
|
|
u64 idx = 0;
|
|
|
|
int err = 0;
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx v%u at %llx\n", dir, ceph_vinop(dir),
|
|
|
|
(unsigned)shared_gen, ctx->pos);
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2016-04-28 17:43:35 +08:00
|
|
|
/* search start position */
|
|
|
|
if (ctx->pos > 2) {
|
|
|
|
u64 count = div_u64(i_size_read(dir), sizeof(struct dentry *));
|
|
|
|
while (count > 0) {
|
|
|
|
u64 step = count >> 1;
|
|
|
|
dentry = __dcache_find_get_entry(parent, idx + step,
|
|
|
|
&cache_ctl);
|
|
|
|
if (!dentry) {
|
|
|
|
/* use linear search */
|
|
|
|
idx = 0;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
if (IS_ERR(dentry)) {
|
|
|
|
err = PTR_ERR(dentry);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
di = ceph_dentry(dentry);
|
|
|
|
spin_lock(&dentry->d_lock);
|
|
|
|
if (fpos_cmp(di->offset, ctx->pos) < 0) {
|
|
|
|
idx += step + 1;
|
|
|
|
count -= step + 1;
|
|
|
|
} else {
|
|
|
|
count = step;
|
|
|
|
}
|
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
dput(dentry);
|
|
|
|
}
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx cache idx %llu\n", dir,
|
|
|
|
ceph_vinop(dir), idx);
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
2015-06-16 20:48:56 +08:00
|
|
|
|
2016-04-28 17:43:35 +08:00
|
|
|
for (;;) {
|
|
|
|
bool emit_dentry = false;
|
|
|
|
dentry = __dcache_find_get_entry(parent, idx++, &cache_ctl);
|
|
|
|
if (!dentry) {
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->file_info.flags |= CEPH_F_ATEND;
|
2015-06-16 20:48:56 +08:00
|
|
|
err = 0;
|
|
|
|
break;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
2016-04-28 17:43:35 +08:00
|
|
|
if (IS_ERR(dentry)) {
|
|
|
|
err = PTR_ERR(dentry);
|
|
|
|
goto out;
|
2015-06-16 20:48:56 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
spin_lock(&dentry->d_lock);
|
2017-11-27 11:23:48 +08:00
|
|
|
di = ceph_dentry(dentry);
|
|
|
|
if (d_unhashed(dentry) ||
|
|
|
|
d_really_is_negative(dentry) ||
|
2022-03-14 10:28:35 +08:00
|
|
|
di->lease_shared_gen != shared_gen ||
|
|
|
|
((dentry->d_flags & DCACHE_NOKEY_NAME) &&
|
|
|
|
fscrypt_has_encryption_key(dir))) {
|
2017-11-27 11:23:48 +08:00
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
dput(dentry);
|
|
|
|
err = -EAGAIN;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
if (fpos_cmp(ctx->pos, di->offset) <= 0) {
|
2019-01-31 16:55:51 +08:00
|
|
|
__ceph_dentry_dir_lease_touch(di);
|
2015-06-16 20:48:56 +08:00
|
|
|
emit_dentry = true;
|
|
|
|
}
|
2011-01-07 14:49:33 +08:00
|
|
|
spin_unlock(&dentry->d_lock);
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2015-06-16 20:48:56 +08:00
|
|
|
if (emit_dentry) {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, " %llx dentry %p %pd %p\n", di->offset,
|
|
|
|
dentry, dentry, d_inode(dentry));
|
2015-06-16 20:48:56 +08:00
|
|
|
ctx->pos = di->offset;
|
|
|
|
if (!dir_emit(ctx, dentry->d_name.name,
|
ceph: fix inode number handling on arches with 32-bit ino_t
Tuan and Ulrich mentioned that they were hitting a problem on s390x,
which has a 32-bit ino_t value, even though it's a 64-bit arch (for
historical reasons).
I think the current handling of inode numbers in the ceph driver is
wrong. It tries to use 32-bit inode numbers on 32-bit arches, but that's
actually not a problem. 32-bit arches can deal with 64-bit inode numbers
just fine when userland code is compiled with LFS support (the common
case these days).
What we really want to do is just use 64-bit numbers everywhere, unless
someone has mounted with the ino32 mount option. In that case, we want
to ensure that we hash the inode number down to something that will fit
in 32 bits before presenting the value to userland.
Add new helper functions that do this, and only do the conversion before
presenting these values to userland in getattr and readdir.
The inode table hashvalue is changed to just cast the inode number to
unsigned long, as low-order bits are the most likely to vary anyway.
While it's not strictly required, we do want to put something in
inode->i_ino. Instead of basing it on BITS_PER_LONG, however, base it on
the size of the ino_t type.
NOTE: This is a user-visible change on 32-bit arches:
1/ inode numbers will be seen to have changed between kernel versions.
32-bit arches will see large inode numbers now instead of the hashed
ones they saw before.
2/ any really old software not built with LFS support may start failing
stat() calls with -EOVERFLOW on inode numbers >2^32. Nothing much we
can do about these, but hopefully the intersection of people running
such code on ceph will be very small.
The workaround for both problems is to mount with "-o ino32".
[ idryomov: changelog tweak ]
URL: https://tracker.ceph.com/issues/46828
Reported-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Reported-and-Tested-by: Tuan Hoang1 <Tuan.Hoang1@ibm.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-08-18 20:03:48 +08:00
|
|
|
dentry->d_name.len, ceph_present_inode(d_inode(dentry)),
|
2015-06-16 20:48:56 +08:00
|
|
|
d_inode(dentry)->i_mode >> 12)) {
|
|
|
|
dput(dentry);
|
|
|
|
err = 0;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
ctx->pos++;
|
2014-04-08 21:42:59 +08:00
|
|
|
|
2015-06-16 20:48:56 +08:00
|
|
|
if (last)
|
|
|
|
dput(last);
|
|
|
|
last = dentry;
|
|
|
|
} else {
|
|
|
|
dput(dentry);
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
2015-06-16 20:48:56 +08:00
|
|
|
}
|
2016-04-28 17:43:35 +08:00
|
|
|
out:
|
2015-06-16 20:48:56 +08:00
|
|
|
ceph_readdir_cache_release(&cache_ctl);
|
|
|
|
if (last) {
|
|
|
|
int ret;
|
|
|
|
di = ceph_dentry(last);
|
2023-06-12 09:04:07 +08:00
|
|
|
ret = note_last_dentry(fsc, dfi, last->d_name.name,
|
|
|
|
last->d_name.len,
|
2015-06-16 20:48:56 +08:00
|
|
|
fpos_off(di->offset) + 1);
|
|
|
|
if (ret < 0)
|
|
|
|
err = ret;
|
2009-10-07 02:31:08 +08:00
|
|
|
dput(last);
|
2017-07-06 11:12:21 +08:00
|
|
|
/* last_name no longer matches cache index */
|
2018-03-13 10:42:44 +08:00
|
|
|
if (dfi->readdir_cache_idx >= 0) {
|
|
|
|
dfi->readdir_cache_idx = -1;
|
|
|
|
dfi->dir_release_count = 0;
|
2017-07-06 11:12:21 +08:00
|
|
|
}
|
2015-06-16 20:48:56 +08:00
|
|
|
}
|
2009-10-07 02:31:08 +08:00
|
|
|
return err;
|
|
|
|
}
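The "search start position" loop in `__dcache_readdir()` is a lower-bound binary search: it halves `count` until `idx` points at the first cached entry whose offset is not below the requested position. The sketch below shows the same loop shape over a plain sorted array. It is an illustration only: `demo_lower_bound()` is a hypothetical name, and it omits the locking, reference counting, and `-EAGAIN` fallbacks of the original.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first element in the ascending array
 * offsets[0..n) that is >= pos, or n if every element is smaller.
 */
static size_t demo_lower_bound(const long *offsets, size_t n, long pos)
{
	size_t idx = 0, count = n;

	assert(offsets || n == 0);
	while (count > 0) {
		size_t step = count >> 1;

		if (offsets[idx + step] < pos) {
			/* Probe is too small: discard it and everything before it. */
			idx += step + 1;
			count -= step + 1;
		} else {
			/* Probe may be the answer: keep searching to its left. */
			count = step;
		}
	}
	return idx;
}
```

After the search, the original walks forward linearly from `idx`, emitting every entry at or beyond `ctx->pos`.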
|
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
static bool need_send_readdir(struct ceph_dir_file_info *dfi, loff_t pos)
|
2016-04-29 11:27:30 +08:00
|
|
|
{
|
2018-03-13 10:42:44 +08:00
|
|
|
if (!dfi->last_readdir)
|
2016-04-29 11:27:30 +08:00
|
|
|
return true;
|
|
|
|
if (is_hash_order(pos))
|
2018-03-13 10:42:44 +08:00
|
|
|
return !ceph_frag_contains_value(dfi->frag, fpos_hash(pos));
|
2016-04-29 11:27:30 +08:00
|
|
|
else
|
2018-03-13 10:42:44 +08:00
|
|
|
return dfi->frag != fpos_frag(pos);
|
2016-04-29 11:27:30 +08:00
|
|
|
}
|
|
|
|
|
2013-05-18 04:52:26 +08:00
|
|
|
static int ceph_readdir(struct file *file, struct dir_context *ctx)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2018-03-13 10:42:44 +08:00
|
|
|
struct ceph_dir_file_info *dfi = file->private_data;
|
2013-05-18 04:52:26 +08:00
|
|
|
struct inode *inode = file_inode(file);
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_inode_info *ci = ceph_inode(inode);
|
2023-06-12 10:50:38 +08:00
|
|
|
struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode);
|
2010-04-07 06:14:15 +08:00
|
|
|
struct ceph_mds_client *mdsc = fsc->mdsc;
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = fsc->client;
|
2016-04-28 15:17:40 +08:00
|
|
|
int i;
|
2009-10-07 02:31:08 +08:00
|
|
|
int err;
|
2017-04-24 11:56:50 +08:00
|
|
|
unsigned frag = -1;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_mds_reply_info_parsed *rinfo;
|
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx file %p pos %llx\n", inode,
|
|
|
|
ceph_vinop(inode), file, ctx->pos);
|
2018-03-13 10:42:44 +08:00
|
|
|
if (dfi->file_info.flags & CEPH_F_ATEND)
|
2009-10-07 02:31:08 +08:00
|
|
|
return 0;
|
|
|
|
|
|
|
|
/* always start with . and .. */
|
2013-05-18 04:52:26 +08:00
|
|
|
if (ctx->pos == 0) {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx off 0 -> '.'\n", inode,
|
|
|
|
ceph_vinop(inode));
|
ceph: fix inode number handling on arches with 32-bit ino_t
Tuan and Ulrich mentioned that they were hitting a problem on s390x,
which has a 32-bit ino_t value, even though it's a 64-bit arch (for
historical reasons).
I think the current handling of inode numbers in the ceph driver is
wrong. It tries to use 32-bit inode numbers on 32-bit arches, but that's
actually not a problem. 32-bit arches can deal with 64-bit inode numbers
just fine when userland code is compiled with LFS support (the common
case these days).
What we really want to do is just use 64-bit numbers everywhere, unless
someone has mounted with the ino32 mount option. In that case, we want
to ensure that we hash the inode number down to something that will fit
in 32 bits before presenting the value to userland.
Add new helper functions that do this, and only do the conversion before
presenting these values to userland in getattr and readdir.
The inode table hashvalue is changed to just cast the inode number to
unsigned long, as low-order bits are the most likely to vary anyway.
While it's not strictly required, we do want to put something in
inode->i_ino. Instead of basing it on BITS_PER_LONG, however, base it on
the size of the ino_t type.
NOTE: This is a user-visible change on 32-bit arches:
1/ inode numbers will be seen to have changed between kernel versions.
32-bit arches will see large inode numbers now instead of the hashed
ones they saw before.
2/ any really old software not built with LFS support may start failing
stat() calls with -EOVERFLOW on inode numbers >2^32. Nothing much we
can do about these, but hopefully the intersection of people running
such code on ceph will be very small.
The workaround for both problems is to mount with "-o ino32".
[ idryomov: changelog tweak ]
URL: https://tracker.ceph.com/issues/46828
Reported-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Reported-and-Tested-by: Tuan Hoang1 <Tuan.Hoang1@ibm.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-08-18 20:03:48 +08:00
|
|
|
if (!dir_emit(ctx, ".", 1, ceph_present_inode(inode),
|
2013-05-18 04:52:26 +08:00
|
|
|
inode->i_mode >> 12))
|
2009-10-07 02:31:08 +08:00
|
|
|
return 0;
|
2013-05-18 04:52:26 +08:00
|
|
|
ctx->pos = 1;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
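The ino32 handling described in the commit message above can be sketched in userspace C. This is a minimal illustration, not the driver's code: the names `sketch_ino_to_ino32` and `sketch_present_ino` are hypothetical, and the XOR fold with a non-zero fallback is an assumption about what "hash the inode number down to something that will fit in 32 bits" might look like.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Fold a 64-bit inode number into 32 bits by XORing its two halves.
 * Assumption of this sketch: a result of 0 is replaced by 2, because
 * userland commonly treats inode number 0 as "no entry".
 */
static uint32_t sketch_ino_to_ino32(uint64_t ino64)
{
	uint32_t ino = (uint32_t)(ino64 & 0xffffffffu);

	ino ^= (uint32_t)(ino64 >> 32);
	if (ino == 0)
		ino = 2;
	return ino;
}

/*
 * Value presented to userland (getattr, readdir): the full 64-bit
 * number, unless the hypothetical ino32 flag asks for the folded one.
 */
static uint64_t sketch_present_ino(uint64_t ino64, bool ino32_mount_opt)
{
	return ino32_mount_opt ? sketch_ino_to_ino32(ino64) : ino64;
}
```

The conversion is applied only at the point where a value is handed to userland, which matches the commit message's intent of using 64-bit numbers everywhere else.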
2013-05-18 04:52:26 +08:00
|
|
|
if (ctx->pos == 1) {
|
2020-08-18 20:03:48 +08:00
|
|
|
u64 ino;
|
|
|
|
struct dentry *dentry = file->f_path.dentry;
|
|
|
|
|
|
|
|
spin_lock(&dentry->d_lock);
|
|
|
|
ino = ceph_present_inode(dentry->d_parent->d_inode);
|
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx off 1 -> '..'\n", inode,
|
|
|
|
ceph_vinop(inode));
|
2020-08-18 20:03:48 +08:00
|
|
|
if (!dir_emit(ctx, "..", 2, ino, inode->i_mode >> 12))
|
2009-10-07 02:31:08 +08:00
|
|
|
return 0;
|
2013-05-18 04:52:26 +08:00
|
|
|
ctx->pos = 2;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
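The two blocks above emit "." at position 0 and ".." at position 1, advancing the position only after a successful emit so that a full buffer can be resumed on the next call. A minimal userspace sketch of that resumable pattern follows; `struct sketch_ctx`, `sketch_emit`, and `sketch_emit_dots` are hypothetical stand-ins, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a directory iteration context. */
struct sketch_ctx {
	int64_t pos;            /* resume position, like ctx->pos */
	int emitted;            /* entries written during this call */
	int capacity;           /* room in the caller's buffer (at most 4) */
	const char *names[4];   /* emitted names, for inspection */
};

/* Returns false when the caller's buffer is full. */
static bool sketch_emit(struct sketch_ctx *ctx, const char *name)
{
	if (ctx->emitted >= ctx->capacity)
		return false;
	ctx->names[ctx->emitted++] = name;
	return true;
}

/*
 * Emit "." and ".." if they have not been emitted yet. Returns true
 * once both are out; false means the buffer filled up and the caller
 * should call again later with the same ctx->pos.
 */
static bool sketch_emit_dots(struct sketch_ctx *ctx)
{
	if (ctx->pos == 0) {
		if (!sketch_emit(ctx, "."))
			return false;
		ctx->pos = 1;
	}
	if (ctx->pos == 1) {
		if (!sketch_emit(ctx, ".."))
			return false;
		ctx->pos = 2;
	}
	return true;
}
```

Because the position is updated only after each successful emit, a caller with room for a single entry sees "." on the first call and ".." on the second, with nothing repeated or skipped.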
|
|
|
|
2022-11-29 18:39:49 +08:00
|
|
|
err = ceph_fscrypt_prepare_readdir(inode);
|
|
|
|
if (err < 0)
|
2022-03-14 10:28:35 +08:00
|
|
|
return err;
|
|
|
|
|
2011-12-01 01:47:09 +08:00
|
|
|
spin_lock(&ci->i_ceph_lock);
|
2020-03-05 20:21:00 +08:00
|
|
|
/* request Fx cap. if have Fx, we don't need to release Fs cap
|
|
|
|
* for later create/unlink. */
|
|
|
|
__ceph_touch_fmode(ci, mdsc, CEPH_FILE_MODE_WR);
|
|
|
|
/* can we use the dcache? */
|
2015-06-16 20:48:56 +08:00
|
|
|
if (ceph_test_mount_opt(fsc, DCACHE) &&
|
2010-04-07 06:14:15 +08:00
|
|
|
!ceph_test_mount_opt(fsc, NOASYNCREADDIR) &&
|
2010-07-23 04:47:21 +08:00
|
|
|
ceph_snap(inode) != CEPH_SNAPDIR &&
|
2014-10-22 09:09:56 +08:00
|
|
|
__ceph_dir_is_complete_ordered(ci) &&
|
2020-03-20 11:45:00 +08:00
|
|
|
__ceph_caps_issued_mask_metric(ci, CEPH_CAP_FILE_SHARED, 1)) {
|
2017-11-27 10:47:46 +08:00
|
|
|
int shared_gen = atomic_read(&ci->i_shared_gen);
|
2020-03-20 11:45:00 +08:00
|
|
|
|
2011-12-01 01:47:09 +08:00
|
|
|
spin_unlock(&ci->i_ceph_lock);
|
2014-04-06 14:10:04 +08:00
|
|
|
err = __dcache_readdir(file, ctx, shared_gen);
|
2010-10-19 05:04:31 +08:00
|
|
|
if (err != -EAGAIN)
|
2009-10-07 02:31:08 +08:00
|
|
|
return err;
|
2010-10-19 05:04:31 +08:00
|
|
|
} else {
|
2011-12-01 01:47:09 +08:00
|
|
|
spin_unlock(&ci->i_ceph_lock);
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
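The "can we use the dcache?" decision above combines several conditions, and the cached path may still bail out with -EAGAIN, in which case the code falls through to a normal readdir. The sketch below restates that control flow with plain booleans. `struct sketch_dir_state` and both function names are hypothetical; each field only summarizes one of the checks visible in the condition above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of the checks made before using the dcache. */
struct sketch_dir_state {
	bool opt_dcache;           /* DCACHE mount option set */
	bool opt_noasyncreaddir;   /* NOASYNCREADDIR mount option set */
	bool is_snapdir;           /* directory is the snapshot directory */
	bool complete_ordered;     /* cached contents complete and ordered */
	bool has_shared_cap;       /* per-directory shared cap is held */
};

/* True when readdir may be answered from cached dentries. */
static bool sketch_can_use_dcache(const struct sketch_dir_state *s)
{
	return s->opt_dcache &&
	       !s->opt_noasyncreaddir &&
	       !s->is_snapdir &&
	       s->complete_ordered &&
	       s->has_shared_cap;
}

/*
 * Decide whether the server must be asked. cache_err is the result the
 * cached path would return; it is only consulted when the cache is
 * eligible. -EAGAIN means "cache became unusable, fall back".
 */
static bool sketch_need_mds(const struct sketch_dir_state *s, int cache_err)
{
	if (!sketch_can_use_dcache(s))
		return true;
	return cache_err == -EAGAIN;
}
```

Any other result from the cached path, success or a different error, is returned to the caller directly without contacting the server.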
|
|
|
|
|
|
|
/* proceed with a normal readdir */
|
|
|
|
more:
|
|
|
|
/* do we have the correct frag content buffered? */
|
2018-03-13 10:42:44 +08:00
|
|
|
if (need_send_readdir(dfi, ctx->pos)) {
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_mds_request *req;
|
|
|
|
int op = ceph_snap(inode) == CEPH_SNAPDIR ?
|
|
|
|
CEPH_MDS_OP_LSSNAP : CEPH_MDS_OP_READDIR;
|
|
|
|
|
|
|
|
/* discard old result, if any */
|
2018-03-13 10:42:44 +08:00
|
|
|
if (dfi->last_readdir) {
|
|
|
|
ceph_mdsc_put_request(dfi->last_readdir);
|
|
|
|
dfi->last_readdir = NULL;
|
2010-03-11 04:03:32 +08:00
|
|
|
}
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2016-04-29 11:27:30 +08:00
|
|
|
if (is_hash_order(ctx->pos)) {
|
2017-04-24 11:56:50 +08:00
|
|
|
/* fragtree isn't always accurate. choose frag
|
|
|
|
* based on previous reply when possible. */
|
|
|
|
if (frag == (unsigned)-1)
|
|
|
|
frag = ceph_choose_frag(ci, fpos_hash(ctx->pos),
|
|
|
|
NULL, NULL);
|
2016-04-29 11:27:30 +08:00
|
|
|
} else {
|
|
|
|
frag = fpos_frag(ctx->pos);
|
|
|
|
}
|
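The frag selection above relies on helpers that pack a fragment (or a name hash) and an offset into a single file position. Their definitions are not shown in this section, so the following is only a sketch of one plausible layout: the 28-bit offset field, the marker value used for hash order, and the 24-bit frag value mask are all assumptions, and every `sketch_` name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: low 28 bits hold the offset, the rest the frag/hash. */
#define SKETCH_OFFSET_BITS 28
#define SKETCH_OFFSET_MASK ((1u << SKETCH_OFFSET_BITS) - 1)
/* Assumed marker: 0xff in the top byte of the frag field. */
#define SKETCH_HASH_ORDER  (0xffull << (SKETCH_OFFSET_BITS + 24))

static int64_t sketch_make_fpos(uint32_t high, uint32_t off, bool hash_order)
{
	uint64_t fpos = ((uint64_t)high << SKETCH_OFFSET_BITS) | off;

	if (hash_order)
		fpos |= SKETCH_HASH_ORDER;
	return (int64_t)fpos;
}

static bool sketch_is_hash_order(int64_t pos)
{
	return ((uint64_t)pos & SKETCH_HASH_ORDER) == SKETCH_HASH_ORDER;
}

/* The 32-bit field above the offset: a frag id, or marker plus hash. */
static uint32_t sketch_fpos_frag(int64_t pos)
{
	return (uint32_t)((uint64_t)pos >> SKETCH_OFFSET_BITS);
}

/* Assumed: the low 24 bits of that field carry the value or hash. */
static uint32_t sketch_fpos_hash(int64_t pos)
{
	return sketch_fpos_frag(pos) & 0xffffffu;
}

static uint32_t sketch_fpos_off(int64_t pos)
{
	return (uint32_t)((uint64_t)pos & SKETCH_OFFSET_MASK);
}
```

Under these assumptions, a hash-ordered position carries only a hash, so the frag has to be looked up from it, while a frag-ordered position carries the frag itself and can be decoded directly. That mirrors the two branches of the if/else above.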
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "fetching %p %llx.%llx frag %x offset '%s'\n",
|
|
|
|
inode, ceph_vinop(inode), frag, dfi->last_name);
|
2009-10-07 02:31:08 +08:00
|
|
|
req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
|
|
|
|
if (IS_ERR(req))
|
|
|
|
return PTR_ERR(req);
|
2022-03-14 10:28:35 +08:00
|
|
|
|
2014-03-29 13:41:15 +08:00
|
|
|
err = ceph_alloc_readdir_reply_buffer(req, inode);
|
|
|
|
if (err) {
|
|
|
|
ceph_mdsc_put_request(req);
|
|
|
|
return err;
|
|
|
|
}
|
2009-10-07 02:31:08 +08:00
|
|
|
/* hints to request -> mds selection code */
|
|
|
|
req->r_direct_mode = USE_AUTH_MDS;
|
2017-07-26 12:48:08 +08:00
|
|
|
if (op == CEPH_MDS_OP_READDIR) {
|
|
|
|
req->r_direct_hash = ceph_frag_value(frag);
|
|
|
|
__set_bit(CEPH_MDS_R_DIRECT_IS_HASH, &req->r_req_flags);
|
2017-11-23 18:28:16 +08:00
|
|
|
req->r_inode_drop = CEPH_CAP_FILE_EXCL;
|
2017-07-26 12:48:08 +08:00
|
|
|
}
|
2018-03-13 10:42:44 +08:00
|
|
|
if (dfi->last_name) {
|
2022-03-14 10:28:35 +08:00
|
|
|
struct qstr d_name = { .name = dfi->last_name,
|
|
|
|
.len = strlen(dfi->last_name) };
|
|
|
|
|
|
|
|
req->r_path2 = kzalloc(NAME_MAX + 1, GFP_KERNEL);
|
2015-03-22 00:54:58 +08:00
|
|
|
if (!req->r_path2) {
|
|
|
|
ceph_mdsc_put_request(req);
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
2022-03-14 10:28:35 +08:00
|
|
|
|
|
|
|
err = ceph_encode_encrypted_dname(inode, &d_name,
|
|
|
|
req->r_path2);
|
|
|
|
if (err < 0) {
|
|
|
|
ceph_mdsc_put_request(req);
|
|
|
|
return err;
|
|
|
|
}
|
2017-04-06 00:54:05 +08:00
|
|
|
} else if (is_hash_order(ctx->pos)) {
|
|
|
|
req->r_args.readdir.offset_hash =
|
|
|
|
cpu_to_le32(fpos_hash(ctx->pos));
|
2015-03-22 00:54:58 +08:00
|
|
|
}
|
2017-04-06 00:54:05 +08:00
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
req->r_dir_release_cnt = dfi->dir_release_count;
|
|
|
|
req->r_dir_ordered_cnt = dfi->dir_ordered_count;
|
|
|
|
req->r_readdir_cache_idx = dfi->readdir_cache_idx;
|
|
|
|
req->r_readdir_offset = dfi->next_offset;
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_args.readdir.frag = cpu_to_le32(frag);
|
2016-04-27 17:48:30 +08:00
|
|
|
req->r_args.readdir.flags =
|
|
|
|
cpu_to_le16(CEPH_READDIR_REPLY_BITFLAGS);
|
2015-03-22 00:54:58 +08:00
|
|
|
|
|
|
|
req->r_inode = inode;
|
|
|
|
ihold(inode);
|
|
|
|
req->r_dentry = dget(file->f_path.dentry);
|
2009-10-07 02:31:08 +08:00
|
|
|
err = ceph_mdsc_do_request(mdsc, NULL, req);
|
|
|
|
if (err < 0) {
|
|
|
|
ceph_mdsc_put_request(req);
|
|
|
|
return err;
|
|
|
|
}
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx got and parsed readdir result=%d "
|
|
|
|
"on frag %x, end=%d, complete=%d, hash_order=%d\n",
|
|
|
|
inode, ceph_vinop(inode), err, frag,
|
|
|
|
(int)req->r_reply_info.dir_end,
|
|
|
|
(int)req->r_reply_info.dir_complete,
|
|
|
|
(int)req->r_reply_info.hash_order);
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2013-09-18 09:44:13 +08:00
|
|
|
rinfo = &req->r_reply_info;
|
|
|
|
if (le32_to_cpu(rinfo->dir_dir->frag) != frag) {
|
|
|
|
frag = le32_to_cpu(rinfo->dir_dir->frag);
|
2016-04-29 11:27:30 +08:00
|
|
|
if (!rinfo->hash_order) {
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->next_offset = req->r_readdir_offset;
|
2016-04-29 11:27:30 +08:00
|
|
|
/* adjust ctx->pos to beginning of frag */
|
|
|
|
ctx->pos = ceph_make_fpos(frag,
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->next_offset,
|
2016-04-29 11:27:30 +08:00
|
|
|
false);
|
|
|
|
}
|
2013-09-18 09:44:13 +08:00
|
|
|
}
|
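The code above uses ceph_make_fpos() to pack a fragment and an in-fragment offset into one file position. Its definition is not part of this listing, so the following is a sketch under assumptions: the 28-bit offset width, the marker bits for hash order, and every `sketch_` name are illustrative and may not match the real helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only: the low 28 bits hold the offset. */
#define SKETCH_OFFSET_BITS 28
#define SKETCH_OFFSET_MASK ((1u << SKETCH_OFFSET_BITS) - 1)
/* Assumed marker bits that flag a hash-ordered position. */
#define SKETCH_HASH_ORDER (0xffull << (SKETCH_OFFSET_BITS + 24))

/* Pack (high, off) into one position; optionally tag it as hash ordered. */
static int64_t sketch_make_fpos(uint32_t high, uint32_t off, bool hash_order)
{
	int64_t fpos = ((int64_t)high << SKETCH_OFFSET_BITS) | (int64_t)off;

	if (hash_order)
		fpos |= (int64_t)SKETCH_HASH_ORDER;
	return fpos;
}

/* True if the position carries the hash-order marker. */
static bool sketch_is_hash_order(int64_t pos)
{
	return ((uint64_t)pos & SKETCH_HASH_ORDER) == SKETCH_HASH_ORDER;
}

/* Upper part of the position (fragment, or marker plus hash). */
static uint32_t sketch_fpos_frag(int64_t pos)
{
	return (uint32_t)((uint64_t)pos >> SKETCH_OFFSET_BITS);
}

/* Lower part of the position (offset within the fragment). */
static uint32_t sketch_fpos_off(int64_t pos)
{
	return (uint32_t)pos & SKETCH_OFFSET_MASK;
}
```

With this layout, positions within one fragment compare in offset order, which is what lets readdir treat ctx->pos as a monotonically increasing cursor.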
2015-06-16 20:48:56 +08:00
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->frag = frag;
|
|
|
|
dfi->last_readdir = req;
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2017-02-02 02:49:09 +08:00
|
|
|
if (test_bit(CEPH_MDS_R_DID_PREPOPULATE, &req->r_req_flags)) {
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->readdir_cache_idx = req->r_readdir_cache_idx;
|
|
|
|
if (dfi->readdir_cache_idx < 0) {
|
2015-06-16 20:48:56 +08:00
|
|
|
/* preclude from marking dir ordered */
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->dir_ordered_count = 0;
|
2016-04-28 15:17:40 +08:00
|
|
|
} else if (ceph_frag_is_leftmost(frag) &&
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->next_offset == 2) {
|
2015-06-16 20:48:56 +08:00
|
|
|
/* note dir version at start of readdir so
|
|
|
|
* we can tell if any dentries get dropped */
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->dir_release_count = req->r_dir_release_cnt;
|
|
|
|
dfi->dir_ordered_count = req->r_dir_ordered_cnt;
|
2015-06-16 20:48:56 +08:00
|
|
|
}
|
|
|
|
} else {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx !did_prepopulate\n", inode,
|
|
|
|
ceph_vinop(inode));
|
2015-06-16 20:48:56 +08:00
|
|
|
/* disable readdir cache */
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->readdir_cache_idx = -1;
|
2015-06-16 20:48:56 +08:00
|
|
|
/* preclude from marking dir complete */
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->dir_release_count = 0;
|
2015-06-16 20:48:56 +08:00
|
|
|
}
|
|
|
|
|
2016-04-29 11:27:30 +08:00
|
|
|
/* note next offset and last dentry name */
|
|
|
|
if (rinfo->dir_nr > 0) {
|
2016-04-28 09:37:39 +08:00
|
|
|
struct ceph_mds_reply_dir_entry *rde =
|
|
|
|
rinfo->dir_entries + (rinfo->dir_nr-1);
|
2016-04-29 11:27:30 +08:00
|
|
|
unsigned next_offset = req->r_reply_info.dir_end ?
|
|
|
|
2 : (fpos_off(rde->offset) + 1);
|
2023-06-12 09:04:07 +08:00
|
|
|
err = note_last_dentry(fsc, dfi, rde->name,
|
|
|
|
rde->name_len, next_offset);
|
2022-03-05 19:52:59 +08:00
|
|
|
if (err) {
|
|
|
|
ceph_mdsc_put_request(dfi->last_readdir);
|
|
|
|
dfi->last_readdir = NULL;
|
2009-10-07 02:31:08 +08:00
|
|
|
return err;
|
2022-03-05 19:52:59 +08:00
|
|
|
}
|
2016-04-29 11:27:30 +08:00
|
|
|
} else if (req->r_reply_info.dir_end) {
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->next_offset = 2;
|
2016-04-29 11:27:30 +08:00
|
|
|
/* keep last name */
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
rinfo = &dfi->last_readdir->r_reply_info;
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx frag %x num %d pos %llx chunk first %llx\n",
|
|
|
|
inode, ceph_vinop(inode), dfi->frag, rinfo->dir_nr, ctx->pos,
|
|
|
|
rinfo->dir_nr ? rinfo->dir_entries[0].offset : 0LL);
|
2013-05-18 04:52:26 +08:00
|
|
|
|
2016-04-28 15:17:40 +08:00
|
|
|
i = 0;
|
|
|
|
/* search start position */
|
|
|
|
if (rinfo->dir_nr > 0) {
|
|
|
|
int step, nr = rinfo->dir_nr;
|
|
|
|
while (nr > 0) {
|
|
|
|
step = nr >> 1;
|
|
|
|
if (rinfo->dir_entries[i + step].offset < ctx->pos) {
|
|
|
|
i += step + 1;
|
|
|
|
nr -= step + 1;
|
|
|
|
} else {
|
|
|
|
nr = step;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
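The start-position search above is a lower-bound binary search over entries sorted by ascending offset. A minimal standalone sketch of the same loop, with a hypothetical `struct entry` standing in for `struct ceph_mds_reply_dir_entry`:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a readdir reply entry; only the offset matters. */
struct entry {
	int64_t offset;
};

/*
 * Return the index of the first entry whose offset is >= pos, or nr if
 * every entry lies before pos. Entries must be sorted by ascending offset.
 */
static size_t first_at_or_after(const struct entry *e, size_t nr, int64_t pos)
{
	size_t i = 0;

	while (nr > 0) {
		size_t step = nr >> 1;

		if (e[i + step].offset < pos) {
			/* everything up to and including i + step is too small */
			i += step + 1;
			nr -= step + 1;
		} else {
			/* the answer is at or before i + step */
			nr = step;
		}
	}
	return i;
}
```

Resuming at the returned index means a readdir that continues mid-chunk skips entries already emitted without scanning them one by one.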
|
|
|
for (; i < rinfo->dir_nr; i++) {
|
|
|
|
struct ceph_mds_reply_dir_entry *rde = rinfo->dir_entries + i;
|
2010-11-19 01:15:07 +08:00
|
|
|
|
2022-03-14 10:28:35 +08:00
|
|
|
if (rde->offset < ctx->pos) {
|
2023-06-12 09:04:07 +08:00
|
|
|
pr_warn_client(cl,
|
|
|
|
"%p %llx.%llx rde->offset 0x%llx ctx->pos 0x%llx\n",
|
|
|
|
inode, ceph_vinop(inode), rde->offset, ctx->pos);
|
2022-03-14 10:28:35 +08:00
|
|
|
return -EIO;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (WARN_ON_ONCE(!rde->inode.in))
|
|
|
|
return -EIO;
|
2016-04-28 15:17:40 +08:00
|
|
|
|
|
|
|
ctx->pos = rde->offset;
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx (%d/%d) -> %llx '%.*s' %p\n", inode,
|
|
|
|
ceph_vinop(inode), i, rinfo->dir_nr, ctx->pos,
|
|
|
|
rde->name_len, rde->name, &rde->inode.in);
|
2016-04-28 15:17:40 +08:00
|
|
|
|
2016-04-28 09:37:39 +08:00
|
|
|
if (!dir_emit(ctx, rde->name, rde->name_len,
|
ceph: fix inode number handling on arches with 32-bit ino_t
Tuan and Ulrich mentioned that they were hitting a problem on s390x,
which has a 32-bit ino_t value, even though it's a 64-bit arch (for
historical reasons).
I think the current handling of inode numbers in the ceph driver is
wrong. It tries to use 32-bit inode numbers on 32-bit arches, but that's
actually not a problem. 32-bit arches can deal with 64-bit inode numbers
just fine when userland code is compiled with LFS support (the common
case these days).
What we really want to do is just use 64-bit numbers everywhere, unless
someone has mounted with the ino32 mount option. In that case, we want
to ensure that we hash the inode number down to something that will fit
in 32 bits before presenting the value to userland.
Add new helper functions that do this, and only do the conversion before
presenting these values to userland in getattr and readdir.
The inode table hashvalue is changed to just cast the inode number to
unsigned long, as low-order bits are the most likely to vary anyway.
While it's not strictly required, we do want to put something in
inode->i_ino. Instead of basing it on BITS_PER_LONG, however, base it on
the size of the ino_t type.
NOTE: This is a user-visible change on 32-bit arches:
1/ inode numbers will be seen to have changed between kernel versions.
32-bit arches will see large inode numbers now instead of the hashed
ones they saw before.
2/ any really old software not built with LFS support may start failing
stat() calls with -EOVERFLOW on inode numbers >2^32. Nothing much we
can do about these, but hopefully the intersection of people running
such code on ceph will be very small.
The workaround for both problems is to mount with "-o ino32".
[ idryomov: changelog tweak ]
URL: https://tracker.ceph.com/issues/46828
Reported-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Reported-and-Tested-by: Tuan Hoang1 <Tuan.Hoang1@ibm.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-08-18 20:03:48 +08:00
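The commit message describes using 64-bit inode numbers everywhere and hashing them down to 32 bits only when presenting them to userland under the ino32 mount option. One way such a fold could look is sketched below. `fold_ino32()`, `present_ino()`, and the `ino32_mount` flag are hypothetical names, and the XOR fold is an assumption made for illustration rather than a statement of what ceph_present_ino() does.

```c
#include <stdint.h>

/*
 * Illustrative fold of a 64-bit inode number into 32 bits: XOR the high
 * half into the low half, and never return 0.
 */
static uint32_t fold_ino32(uint64_t ino)
{
	uint32_t ino32 = (uint32_t)(ino & 0xffffffffu);

	ino32 ^= (uint32_t)(ino >> 32);
	if (ino32 == 0)
		ino32 = 2;
	return ino32;
}

/*
 * Value to show userland: the folded number when the (hypothetical)
 * ino32 flag is set, the full 64-bit number otherwise.
 */
static uint64_t present_ino(uint64_t ino, int ino32_mount)
{
	return ino32_mount ? fold_ino32(ino) : ino;
}
```

Any such fold can map two distinct 64-bit numbers to the same 32-bit value, so it only makes sense at the presentation boundary and not as an internal key.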
|
|
|
ceph_present_ino(inode->i_sb, le64_to_cpu(rde->inode.in->ino)),
|
|
|
|
le32_to_cpu(rde->inode.in->mode) >> 12)) {
|
2022-03-05 19:52:59 +08:00
|
|
|
/*
|
|
|
|
* NOTE: There is no need to put 'dfi->last_readdir' here,
|
|
|
|
* because when dir_emit stops us, it most likely
|
|
|
|
* doesn't have enough memory, etc. The next readdir
|
|
|
|
* will continue from here.
|
|
|
|
*/
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "filldir stopping us...\n");
|
2009-10-07 02:31:08 +08:00
|
|
|
return 0;
|
|
|
|
}
|
2022-03-14 10:28:35 +08:00
|
|
|
|
|
|
|
/* Reset the lengths to their original allocated vals */
|
2013-05-18 04:52:26 +08:00
|
|
|
ctx->pos++;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
ceph_mdsc_put_request(dfi->last_readdir);
|
|
|
|
dfi->last_readdir = NULL;
|
2017-04-24 11:56:50 +08:00
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
if (dfi->next_offset > 2) {
|
|
|
|
frag = dfi->frag;
|
2009-10-07 02:31:08 +08:00
|
|
|
goto more;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* more frags? */
|
2018-03-13 10:42:44 +08:00
|
|
|
if (!ceph_frag_is_rightmost(dfi->frag)) {
|
|
|
|
frag = ceph_frag_next(dfi->frag);
|
2016-04-29 11:27:30 +08:00
|
|
|
if (is_hash_order(ctx->pos)) {
|
|
|
|
loff_t new_pos = ceph_make_fpos(ceph_frag_value(frag),
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->next_offset, true);
|
2016-04-29 11:27:30 +08:00
|
|
|
if (new_pos > ctx->pos)
|
|
|
|
ctx->pos = new_pos;
|
|
|
|
/* keep last_name */
|
|
|
|
} else {
|
2018-03-13 10:42:44 +08:00
|
|
|
ctx->pos = ceph_make_fpos(frag, dfi->next_offset,
|
|
|
|
false);
|
|
|
|
kfree(dfi->last_name);
|
|
|
|
dfi->last_name = NULL;
|
2016-04-29 11:27:30 +08:00
|
|
|
}
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx next frag is %x\n", inode,
|
|
|
|
ceph_vinop(inode), frag);
|
2009-10-07 02:31:08 +08:00
|
|
|
goto more;
|
|
|
|
}
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->file_info.flags |= CEPH_F_ATEND;
|
2009-10-07 02:31:08 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* if dir_release_count still matches the dir, no dentries
|
|
|
|
* were released during the whole readdir, and we should have
|
|
|
|
* the complete dir contents in our cache.
|
|
|
|
*/
|
2018-03-13 10:42:44 +08:00
|
|
|
if (atomic64_read(&ci->i_release_count) ==
|
|
|
|
dfi->dir_release_count) {
|
2015-06-16 20:48:56 +08:00
|
|
|
spin_lock(&ci->i_ceph_lock);
|
2018-03-13 10:42:44 +08:00
|
|
|
if (dfi->dir_ordered_count ==
|
|
|
|
atomic64_read(&ci->i_ordered_count)) {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, " marking %p %llx.%llx complete and ordered\n",
|
|
|
|
inode, ceph_vinop(inode));
|
2015-06-16 20:48:56 +08:00
|
|
|
/* use i_size to track number of entries in
|
|
|
|
* readdir cache */
|
2018-03-13 10:42:44 +08:00
|
|
|
BUG_ON(dfi->readdir_cache_idx < 0);
|
|
|
|
i_size_write(inode, dfi->readdir_cache_idx *
|
2015-06-16 20:48:56 +08:00
|
|
|
sizeof(struct dentry*));
|
|
|
|
} else {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, " marking %llx.%llx complete\n",
|
|
|
|
ceph_vinop(inode));
|
2015-06-16 20:48:56 +08:00
|
|
|
}
|
2018-03-13 10:42:44 +08:00
|
|
|
__ceph_dir_set_complete(ci, dfi->dir_release_count,
|
|
|
|
dfi->dir_ordered_count);
|
2015-06-16 20:48:56 +08:00
|
|
|
spin_unlock(&ci->i_ceph_lock);
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
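The completeness check above follows a snapshot-and-compare pattern: counters are noted when the readdir starts and compared once the last fragment has been read. A simplified, single-threaded sketch of that pattern follows. The struct and function names are hypothetical, and the atomics and the i_ceph_lock spinlock used by the real code are left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the directory's live counters. */
struct dir_state {
	uint64_t release_count; /* bumped whenever a dentry is dropped */
	uint64_t ordered_count; /* bumped whenever ordering is disturbed */
};

/* Snapshot taken when a readdir pass begins. */
struct scan_state {
	uint64_t release_count;
	uint64_t ordered_count;
};

enum scan_result {
	SCAN_INCOMPLETE,       /* something was dropped during the pass */
	SCAN_COMPLETE,         /* nothing dropped, but ordering changed */
	SCAN_COMPLETE_ORDERED, /* nothing dropped and ordering intact */
};

/* Record the counters at the start of the pass. */
static void scan_begin(struct scan_state *s, const struct dir_state *d)
{
	s->release_count = d->release_count;
	s->ordered_count = d->ordered_count;
}

/* Compare the snapshot against the live counters at the end of the pass. */
static enum scan_result scan_end(const struct scan_state *s,
				 const struct dir_state *d)
{
	if (s->release_count != d->release_count)
		return SCAN_INCOMPLETE;
	if (s->ordered_count != d->ordered_count)
		return SCAN_COMPLETE;
	return SCAN_COMPLETE_ORDERED;
}
```

Setting a snapshot field to a value the live counter cannot hold, as the listing does by zeroing dir_release_count or dir_ordered_count, forces the comparison to fail and so prevents the directory from being marked complete or ordered.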
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx file %p done.\n", inode, ceph_vinop(inode),
|
|
|
|
file);
|
2009-10-07 02:31:08 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
static void reset_readdir(struct ceph_dir_file_info *dfi)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2018-03-13 10:42:44 +08:00
|
|
|
if (dfi->last_readdir) {
|
|
|
|
ceph_mdsc_put_request(dfi->last_readdir);
|
|
|
|
dfi->last_readdir = NULL;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
2018-03-13 10:42:44 +08:00
|
|
|
kfree(dfi->last_name);
|
|
|
|
dfi->last_name = NULL;
|
|
|
|
dfi->dir_release_count = 0;
|
|
|
|
dfi->readdir_cache_idx = -1;
|
|
|
|
dfi->next_offset = 2; /* compensate for . and .. */
|
|
|
|
dfi->file_info.flags &= ~CEPH_F_ATEND;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
2016-04-28 15:17:40 +08:00
|
|
|
/*
|
|
|
|
* discard buffered readdir content on seekdir(0), or seek to new frag,
|
|
|
|
* or seek prior to current chunk
|
|
|
|
*/
|
2018-03-13 10:42:44 +08:00
|
|
|
static bool need_reset_readdir(struct ceph_dir_file_info *dfi, loff_t new_pos)
|
2016-04-28 15:17:40 +08:00
|
|
|
{
|
|
|
|
struct ceph_mds_reply_info_parsed *rinfo;
|
2016-04-29 11:27:30 +08:00
|
|
|
loff_t chunk_offset;
|
2016-04-28 15:17:40 +08:00
|
|
|
if (new_pos == 0)
|
|
|
|
return true;
|
2016-04-29 11:27:30 +08:00
|
|
|
if (is_hash_order(new_pos)) {
|
|
|
|
/* no need to reset last_name for a forward seek when
|
|
|
|
* dentries are sorted in hash order */
|
2018-03-13 10:42:44 +08:00
|
|
|
} else if (dfi->frag != fpos_frag(new_pos)) {
|
2016-04-28 15:17:40 +08:00
|
|
|
return true;
|
2016-04-29 11:27:30 +08:00
|
|
|
}
|
2018-03-13 10:42:44 +08:00
|
|
|
rinfo = dfi->last_readdir ? &dfi->last_readdir->r_reply_info : NULL;
|
2016-04-28 15:17:40 +08:00
|
|
|
if (!rinfo || !rinfo->dir_nr)
|
|
|
|
return true;
|
2016-04-29 11:27:30 +08:00
|
|
|
chunk_offset = rinfo->dir_entries[0].offset;
|
|
|
|
return new_pos < chunk_offset ||
|
|
|
|
is_hash_order(new_pos) != is_hash_order(chunk_offset);
|
2016-04-28 15:17:40 +08:00
|
|
|
}
|
|
|
|
|
2012-12-18 07:59:39 +08:00
|
|
|
static loff_t ceph_dir_llseek(struct file *file, loff_t offset, int whence)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2018-03-13 10:42:44 +08:00
|
|
|
struct ceph_dir_file_info *dfi = file->private_data;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct inode *inode = file->f_mapping->host;
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = ceph_inode_to_client(inode);
|
2009-10-07 02:31:08 +08:00
|
|
|
loff_t retval;
|
|
|
|
|
2016-01-23 04:40:57 +08:00
|
|
|
inode_lock(inode);
|
2011-07-19 01:21:38 +08:00
|
|
|
retval = -EINVAL;
|
2012-12-18 07:59:39 +08:00
|
|
|
switch (whence) {
|
2009-10-07 02:31:08 +08:00
|
|
|
case SEEK_CUR:
|
|
|
|
offset += file->f_pos;
|
2021-03-05 17:59:23 +08:00
|
|
|
break;
|
2011-07-19 01:21:38 +08:00
|
|
|
case SEEK_SET:
|
|
|
|
break;
|
2015-06-16 20:48:56 +08:00
|
|
|
case SEEK_END:
|
|
|
|
retval = -EOPNOTSUPP;
|
2021-03-05 17:59:23 +08:00
|
|
|
goto out;
|
2011-07-19 01:21:38 +08:00
|
|
|
default:
|
|
|
|
goto out;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
2011-07-19 01:21:38 +08:00
|
|
|
|
2014-02-27 16:26:24 +08:00
|
|
|
if (offset >= 0) {
|
2018-03-13 10:42:44 +08:00
|
|
|
if (need_reset_readdir(dfi, offset)) {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx dropping %p content\n",
|
|
|
|
inode, ceph_vinop(inode), file);
|
2018-03-13 10:42:44 +08:00
|
|
|
reset_readdir(dfi);
|
2016-04-29 11:27:30 +08:00
|
|
|
} else if (is_hash_order(offset) && offset > file->f_pos) {
|
|
|
|
/* for hash offset, we don't know if a forward seek
|
|
|
|
* is within same frag */
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->dir_release_count = 0;
|
|
|
|
dfi->readdir_cache_idx = -1;
|
2016-04-29 11:27:30 +08:00
|
|
|
}
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
if (offset != file->f_pos) {
|
|
|
|
file->f_pos = offset;
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->file_info.flags &= ~CEPH_F_ATEND;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
retval = offset;
|
|
|
|
}
|
2011-07-19 01:21:38 +08:00
|
|
|
out:
|
2016-01-23 04:40:57 +08:00
|
|
|
inode_unlock(inode);
|
2009-10-07 02:31:08 +08:00
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2011-07-27 02:28:11 +08:00
|
|
|
* Handle lookups for the hidden .snap directory.
|
2009-10-07 02:31:08 +08:00
|
|
|
*/
|
2021-03-01 21:01:54 +08:00
|
|
|
struct dentry *ceph_handle_snapdir(struct ceph_mds_request *req,
|
2021-06-03 00:46:07 +08:00
|
|
|
struct dentry *dentry)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2023-06-12 10:50:38 +08:00
|
|
|
struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dentry->d_sb);
|
2022-02-11 11:33:25 +08:00
|
|
|
struct inode *parent = d_inode(dentry->d_parent); /* we hold i_rwsem */
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = ceph_inode_to_client(parent);
|
2009-10-07 02:31:08 +08:00
|
|
|
|
|
|
|
/* .snap dir? */
|
2021-06-03 00:46:07 +08:00
|
|
|
if (ceph_snap(parent) == CEPH_NOSNAP &&
|
2021-03-01 21:01:54 +08:00
|
|
|
strcmp(dentry->d_name.name, fsc->mount_options->snapdir_name) == 0) {
|
|
|
|
struct dentry *res;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct inode *inode = ceph_get_snapdir(parent);
|
2021-03-01 21:01:54 +08:00
|
|
|
|
|
|
|
res = d_splice_alias(inode, dentry);
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "ENOENT on snapdir %p '%pd', linking to "
|
|
|
|
"snapdir %p %llx.%llx. Spliced dentry %p\n",
|
|
|
|
dentry, dentry, inode, ceph_vinop(inode), res);
|
2021-03-01 21:01:54 +08:00
|
|
|
if (res)
|
|
|
|
dentry = res;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
2021-03-01 21:01:54 +08:00
|
|
|
return dentry;
|
2011-07-27 02:28:11 +08:00
|
|
|
}
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2011-07-27 02:28:11 +08:00
|
|
|
/*
|
|
|
|
* Figure out final result of a lookup/open request.
|
|
|
|
*
|
|
|
|
* Mainly, make sure we return the final req->r_dentry (if it already
|
|
|
|
* existed) in place of the original VFS-provided dentry when they
|
|
|
|
* differ.
|
|
|
|
*
|
|
|
|
* Gracefully handle the case where the MDS replies with -ENOENT and
|
|
|
|
* no trace (which it may do, at its discretion, e.g., if it doesn't
|
|
|
|
* care to issue a lease on the negative dentry).
|
|
|
|
*/
|
|
|
|
struct dentry *ceph_finish_lookup(struct ceph_mds_request *req,
|
|
|
|
struct dentry *dentry, int err)
|
|
|
|
{
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = req->r_mdsc->fsc->client;
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
if (err == -ENOENT) {
|
|
|
|
/* no trace? */
|
|
|
|
err = 0;
|
|
|
|
if (!req->r_reply_info.head->is_dentry) {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl,
|
|
|
|
"ENOENT and no trace, dentry %p inode %llx.%llx\n",
|
|
|
|
dentry, ceph_vinop(d_inode(dentry)));
|
2015-03-18 06:25:59 +08:00
|
|
|
if (d_really_is_positive(dentry)) {
|
2009-10-07 02:31:08 +08:00
|
|
|
d_drop(dentry);
|
|
|
|
err = -ENOENT;
|
|
|
|
} else {
|
|
|
|
d_add(dentry, NULL);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (err)
|
|
|
|
dentry = ERR_PTR(err);
|
|
|
|
else if (dentry != req->r_dentry)
|
|
|
|
dentry = dget(req->r_dentry); /* we got spliced */
|
|
|
|
else
|
|
|
|
dentry = NULL;
|
|
|
|
return dentry;
|
|
|
|
}
|
|
|
|
|
2016-03-25 17:18:39 +08:00
|
|
|
static bool is_root_ceph_dentry(struct inode *inode, struct dentry *dentry)
|
2009-12-03 03:54:25 +08:00
|
|
|
{
|
|
|
|
return ceph_ino(inode) == CEPH_INO_ROOT &&
|
|
|
|
strncmp(dentry->d_name.name, ".ceph", 5) == 0;
|
|
|
|
}
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
/*
|
|
|
|
* Look up a single dir entry. If there is a lookup intent, inform
|
|
|
|
* the MDS so that it gets our 'caps wanted' value in a single op.
|
|
|
|
*/
|
|
|
|
static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
|
2012-06-11 05:13:09 +08:00
|
|
|
unsigned int flags)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2023-06-12 10:50:38 +08:00
|
|
|
struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dir->i_sb);
|
2020-09-03 21:01:39 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb);
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = fsc->client;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_mds_request *req;
|
|
|
|
int op;
|
2016-03-07 10:34:50 +08:00
|
|
|
int mask;
|
2009-10-07 02:31:08 +08:00
|
|
|
int err;
|
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx/'%pd' dentry %p\n", dir, ceph_vinop(dir),
|
|
|
|
dentry, dentry);
|
2009-10-07 02:31:08 +08:00
|
|
|
|
|
|
|
if (dentry->d_name.len > NAME_MAX)
|
|
|
|
return ERR_PTR(-ENAMETOOLONG);
|
|
|
|
|
2021-01-27 03:12:24 +08:00
|
|
|
if (IS_ENCRYPTED(dir)) {
|
2023-03-17 02:14:12 +08:00
|
|
|
bool had_key = fscrypt_has_encryption_key(dir);
|
|
|
|
|
|
|
|
err = fscrypt_prepare_lookup_partial(dir, dentry);
|
2022-11-29 18:39:49 +08:00
|
|
|
if (err < 0)
|
2021-01-27 03:12:24 +08:00
|
|
|
return ERR_PTR(err);
|
2023-03-17 02:14:12 +08:00
|
|
|
|
|
|
|
/* mark directory as incomplete if it has been unlocked */
|
|
|
|
if (!had_key && fscrypt_has_encryption_key(dir))
|
|
|
|
ceph_dir_clear_complete(dir);
|
2021-01-27 03:12:24 +08:00
|
|
|
}
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
/* can we conclude ENOENT locally? */
|
2015-03-18 06:25:59 +08:00
|
|
|
if (d_really_is_negative(dentry)) {
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_inode_info *ci = ceph_inode(dir);
|
|
|
|
struct ceph_dentry_info *di = ceph_dentry(dentry);
|
|
|
|
|
2011-12-01 01:47:09 +08:00
|
|
|
spin_lock(&ci->i_ceph_lock);
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, " dir %llx.%llx flags are 0x%lx\n",
|
|
|
|
ceph_vinop(dir), ci->i_ceph_flags);
|
2009-10-07 02:31:08 +08:00
|
|
|
if (strncmp(dentry->d_name.name,
|
2010-04-07 06:14:15 +08:00
|
|
|
fsc->mount_options->snapdir_name,
|
2009-10-07 02:31:08 +08:00
|
|
|
dentry->d_name.len) &&
|
2009-12-03 03:54:25 +08:00
|
|
|
!is_root_ceph_dentry(dir, dentry) &&
|
2015-03-04 16:05:04 +08:00
|
|
|
ceph_test_mount_opt(fsc, DCACHE) &&
|
2013-03-13 19:44:32 +08:00
|
|
|
__ceph_dir_is_complete(ci) &&
|
2020-03-20 11:45:00 +08:00
|
|
|
__ceph_caps_issued_mask_metric(ci, CEPH_CAP_FILE_SHARED, 1)) {
|
2020-03-05 20:21:00 +08:00
|
|
|
__ceph_touch_fmode(ci, mdsc, CEPH_FILE_MODE_RD);
|
2011-12-01 01:47:09 +08:00
|
|
|
spin_unlock(&ci->i_ceph_lock);
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, " dir %llx.%llx complete, -ENOENT\n",
|
|
|
|
ceph_vinop(dir));
|
2009-10-07 02:31:08 +08:00
|
|
|
d_add(dentry, NULL);
|
2017-11-27 10:47:46 +08:00
|
|
|
di->lease_shared_gen = atomic_read(&ci->i_shared_gen);
|
2009-10-07 02:31:08 +08:00
|
|
|
return NULL;
|
|
|
|
}
|
2011-12-01 01:47:09 +08:00
|
|
|
spin_unlock(&ci->i_ceph_lock);
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
op = ceph_snap(dir) == CEPH_SNAPDIR ?
|
|
|
|
CEPH_MDS_OP_LOOKUPSNAP : CEPH_MDS_OP_LOOKUP;
|
|
|
|
req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
|
|
|
|
if (IS_ERR(req))
|
2010-05-22 18:01:14 +08:00
|
|
|
return ERR_CAST(req);
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_dentry = dget(dentry);
|
|
|
|
req->r_num_caps = 2;
|
2016-03-07 10:34:50 +08:00
|
|
|
|
|
|
|
mask = CEPH_STAT_CAP_INODE | CEPH_CAP_AUTH_SHARED;
|
|
|
|
if (ceph_security_xattr_wanted(dir))
|
|
|
|
mask |= CEPH_CAP_XATTR_SHARED;
|
|
|
|
req->r_args.getattr.mask = cpu_to_le32(mask);
|
|
|
|
|
2021-06-19 01:05:06 +08:00
|
|
|
ihold(dir);
|
2017-01-31 23:28:26 +08:00
|
|
|
req->r_parent = dir;
|
|
|
|
set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
|
2009-10-07 02:31:08 +08:00
|
|
|
err = ceph_mdsc_do_request(mdsc, NULL, req);
|
2021-06-03 00:46:07 +08:00
|
|
|
if (err == -ENOENT) {
|
|
|
|
struct dentry *res;
|
|
|
|
|
|
|
|
res = ceph_handle_snapdir(req, dentry);
|
|
|
|
if (IS_ERR(res)) {
|
|
|
|
err = PTR_ERR(res);
|
|
|
|
} else {
|
|
|
|
dentry = res;
|
|
|
|
err = 0;
|
|
|
|
}
|
2021-03-01 21:01:54 +08:00
|
|
|
}
|
2009-10-07 02:31:08 +08:00
|
|
|
dentry = ceph_finish_lookup(req, dentry, err);
|
|
|
|
ceph_mdsc_put_request(req); /* will dput(dentry) */
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "result=%p\n", dentry);
|
2009-10-07 02:31:08 +08:00
|
|
|
return dentry;
|
|
|
|
}
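The local -ENOENT shortcut in the lookup path above, together with the 'gen' check described in the commit message, can be sketched in plain userspace C. This is a simplified model and not the kernel code: `struct dir_state`, `struct dentry_state`, and all three helper functions are hypothetical stand-ins, the boolean fields replace the real cap and flag tests, and `strcmp` replaces the length-bounded `strncmp` used in the source. How `shared_gen` gets bumped is also an assumption here.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct dir_state {
	bool dcache_opt;          /* models the DCACHE mount option */
	bool complete;            /* models __ceph_dir_is_complete() */
	bool file_shared_cap;     /* models CEPH_CAP_FILE_SHARED being issued */
	unsigned int shared_gen;  /* assumed to change when the dir lease is lost */
	const char *snapdir_name; /* models mount_options->snapdir_name */
};

struct dentry_state {
	const char *name;
	bool is_root;
	unsigned int lease_shared_gen; /* gen recorded when the dentry was cached */
};

/* True when a missing name can be answered with -ENOENT without asking the MDS. */
static bool can_conclude_enoent(const struct dir_state *dir,
				const struct dentry_state *de)
{
	return strcmp(de->name, dir->snapdir_name) != 0 &&
	       !de->is_root &&
	       dir->dcache_opt &&
	       dir->complete &&
	       dir->file_shared_cap;
}

/* Mirrors "di->lease_shared_gen = atomic_read(&ci->i_shared_gen)". */
static void record_negative(const struct dir_state *dir,
			    struct dentry_state *de)
{
	de->lease_shared_gen = dir->shared_gen;
}

/* A cached dentry is covered by the directory lease only if the gens match. */
static bool dir_lease_covers(const struct dir_state *dir,
			     const struct dentry_state *de)
{
	return dir->file_shared_cap && de->lease_shared_gen == dir->shared_gen;
}
```

In this model a negative dentry recorded under generation N stops being trusted as soon as the directory's generation moves on, which is the role the commit message assigns to the 'gen' value.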
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we do a create but get no trace back from the MDS, follow up with
|
|
|
|
* a lookup (the VFS expects us to link up the provided dentry).
|
|
|
|
*/
|
|
|
|
int ceph_handle_notrace_create(struct inode *dir, struct dentry *dentry)
|
|
|
|
{
|
2012-06-11 05:13:09 +08:00
|
|
|
struct dentry *result = ceph_lookup(dir, dentry, 0);
|
2009-10-07 02:31:08 +08:00
|
|
|
|
|
|
|
if (result && !IS_ERR(result)) {
|
|
|
|
/*
|
|
|
|
* We created the item, then did a lookup, and found
|
|
|
|
* it was already linked to another inode we already
|
2015-02-04 15:10:48 +08:00
|
|
|
* had in our cache (and thus got spliced). To not
|
|
|
|
* confuse VFS (especially when inode is a directory),
|
|
|
|
* we don't link our dentry to that inode, return an
|
|
|
|
* error instead.
|
|
|
|
*
|
|
|
|
* This event should be rare and it happens only when
|
|
|
|
* we talk to old MDS. Recent MDS does not send traceless
|
|
|
|
* reply for request that creates new inode.
|
2009-10-07 02:31:08 +08:00
|
|
|
*/
|
2015-02-02 11:27:56 +08:00
|
|
|
d_drop(result);
|
2015-02-04 15:10:48 +08:00
|
|
|
return -ESTALE;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
return PTR_ERR(result);
|
|
|
|
}
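The three possible outcomes of ceph_handle_notrace_create() above follow from the kernel's error-pointer convention. The sketch below re-creates that convention in userspace to make the mapping explicit. The `sketch_*` helpers are minimal, hypothetical re-implementations of `ERR_PTR`, `IS_ERR`, and `PTR_ERR`, and `notrace_create_status` is an illustrative name; the real function also calls `d_drop()` on the spliced dentry, which is omitted here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers (assumed layout:
 * the top 4095 addresses encode negative errno values). */
#define SKETCH_MAX_ERRNO 4095

static void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/*
 * Map the follow-up lookup result to a status code:
 *   NULL          -> 0        (the provided dentry was linked up)
 *   error pointer -> -errno   (the lookup itself failed)
 *   other dentry  -> -ESTALE  (spliced to an inode already in the cache)
 */
static int notrace_create_status(void *lookup_result)
{
	if (lookup_result && !sketch_is_err(lookup_result))
		return -ESTALE;
	return (int)sketch_ptr_err(lookup_result);
}
```

The NULL case works because converting a null pointer to an error code yields 0, so a single return statement covers both success and lookup failure.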
|
|
|
|
|
2023-01-13 19:49:16 +08:00
|
|
|
static int ceph_mknod(struct mnt_idmap *idmap, struct inode *dir,
|
2021-01-21 21:19:43 +08:00
|
|
|
struct dentry *dentry, umode_t mode, dev_t rdev)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2020-09-03 21:01:39 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb);
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_mds_request *req;
|
2019-05-26 15:35:39 +08:00
|
|
|
struct ceph_acl_sec_ctx as_ctx = {};
|
2009-10-07 02:31:08 +08:00
|
|
|
int err;
|
|
|
|
|
|
|
|
if (ceph_snap(dir) != CEPH_NOSNAP)
|
|
|
|
return -EROFS;
|
|
|
|
|
2022-05-10 09:47:01 +08:00
|
|
|
err = ceph_wait_on_conflict_unlink(dentry);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
2018-07-09 22:17:30 +08:00
|
|
|
if (ceph_quota_is_max_files_exceeded(dir)) {
|
|
|
|
err = -EDQUOT;
|
|
|
|
goto out;
|
|
|
|
}
|
2018-01-05 18:47:19 +08:00
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx/'%pd' dentry %p mode 0%ho rdev %d\n",
|
|
|
|
dir, ceph_vinop(dir), dentry, dentry, mode, rdev);
|
2009-10-07 02:31:08 +08:00
|
|
|
req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_MKNOD, USE_AUTH_MDS);
|
|
|
|
if (IS_ERR(req)) {
|
2014-09-16 20:35:17 +08:00
|
|
|
err = PTR_ERR(req);
|
|
|
|
goto out;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
ceph: preallocate inode for ops that may create one
When creating a new inode, we need to determine the crypto context
before we can transmit the RPC. The fscrypt API has a routine for getting
a crypto context before a create occurs, but it requires an inode.
Change the ceph code to preallocate an inode in advance of a create of
any sort (open(), mknod(), symlink(), etc). Move the existing code that
generates the ACL and SELinux blobs into this routine since that's
mostly common across all the different codepaths.
In most cases, we just want to allow ceph_fill_trace to use that inode
after the reply comes in, so add a new field to the MDS request for it
(r_new_inode).
The async create codepath is a bit different though. In that case, we
want to hash the inode in advance of the RPC so that it can be used
before the reply comes in. If the call subsequently fails with
-EJUKEBOX, then just put the references and clean up the as_ctx. Note
that with this change, we now need to regenerate the as_ctx when this
occurs, but it's quite rare for it to happen.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-and-tested-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-08-27 01:11:00 +08:00
|
|
|
|
|
|
|
req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx);
|
|
|
|
if (IS_ERR(req->r_new_inode)) {
|
|
|
|
err = PTR_ERR(req->r_new_inode);
|
|
|
|
req->r_new_inode = NULL;
|
|
|
|
goto out_req;
|
|
|
|
}
|
|
|
|
|
2022-08-25 21:31:06 +08:00
|
|
|
if (S_ISREG(mode) && IS_ENCRYPTED(dir))
|
|
|
|
set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags);
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_dentry = dget(dentry);
|
|
|
|
req->r_num_caps = 2;
|
2017-01-31 23:28:26 +08:00
|
|
|
req->r_parent = dir;
|
2021-06-19 01:05:06 +08:00
|
|
|
ihold(dir);
|
2017-01-31 23:28:26 +08:00
|
|
|
set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
|
2023-08-07 21:26:19 +08:00
|
|
|
req->r_mnt_idmap = mnt_idmap_get(idmap);
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_args.mknod.mode = cpu_to_le32(mode);
|
|
|
|
req->r_args.mknod.rdev = cpu_to_le32(rdev);
|
2023-06-05 14:58:18 +08:00
|
|
|
req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL |
|
|
|
|
CEPH_CAP_XATTR_EXCL;
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
|
2020-08-27 01:11:00 +08:00
|
|
|
|
|
|
|
ceph_as_ctx_to_req(req, &as_ctx);
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
	err = ceph_mdsc_do_request(mdsc, dir, req);
	if (!err && !req->r_reply_info.head->is_dentry)
		err = ceph_handle_notrace_create(dir, dentry);
|
2020-08-27 01:11:00 +08:00
|
|
|
out_req:
|
2009-10-07 02:31:08 +08:00
|
|
|
ceph_mdsc_put_request(req);
|
2014-09-16 20:35:17 +08:00
|
|
|
out:
|
2013-11-11 15:18:03 +08:00
|
|
|
if (!err)
|
2019-05-26 15:35:39 +08:00
|
|
|
ceph_init_inode_acls(d_inode(dentry), &as_ctx);
|
2014-02-11 12:55:05 +08:00
|
|
|
else
|
2009-10-07 02:31:08 +08:00
|
|
|
d_drop(dentry);
|
2019-05-26 15:35:39 +08:00
|
|
|
ceph_release_acl_sec_ctx(&as_ctx);
|
2009-10-07 02:31:08 +08:00
|
|
|
return err;
|
|
|
|
}
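The "ceph: directory operations" commit message describes two ways to validate a cached dentry: a per-dentry lease, or a per-directory cap combined with a 'gen' value. The following is a minimal user-space sketch of that decision, not the kernel's implementation. All struct, field, and function names (`sketch_dir`, `sketch_dentry`, `sketch_dentry_valid`, and so on) are hypothetical, and the exact conditions the real client checks may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are NOT the kernel structures. */
struct sketch_dir {
	bool     holds_dir_cap;   /* is the per-directory cap currently held? */
	uint32_t shared_gen;      /* assumed to change when the cap is lost and reissued */
};

struct sketch_dentry {
	uint64_t lease_expiry;     /* end of the per-dentry lease; 0 means no lease */
	uint32_t lease_shared_gen; /* directory gen recorded when the dentry was cached */
};

/* Way 1: the MDS issued a lease on this dentry and it has not expired. */
bool sketch_dentry_lease_valid(const struct sketch_dentry *de, uint64_t now)
{
	return de->lease_expiry != 0 && now < de->lease_expiry;
}

/*
 * Way 2: the directory cap acts as a lease on the whole directory, and the
 * dentry's recorded gen matches the currently leased directory contents.
 */
bool sketch_dir_lease_valid(const struct sketch_dir *dir,
			    const struct sketch_dentry *de)
{
	return dir->holds_dir_cap && de->lease_shared_gen == dir->shared_gen;
}

/* A cached dentry is trusted if either mechanism vouches for it. */
bool sketch_dentry_valid(const struct sketch_dir *dir,
			 const struct sketch_dentry *de, uint64_t now)
{
	assert(dir != NULL && de != NULL);
	return sketch_dentry_lease_valid(de, now) ||
	       sketch_dir_lease_valid(dir, de);
}
```

In this sketch, a dentry cached under gen 3 stops being trusted once the directory's gen moves to 4, unless it also carries its own unexpired lease.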
|
|
|
|
|
2023-01-13 19:49:13 +08:00
|
|
|
static int ceph_create(struct mnt_idmap *idmap, struct inode *dir,
|
2021-01-21 21:19:43 +08:00
|
|
|
struct dentry *dentry, umode_t mode, bool excl)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2023-01-13 19:49:16 +08:00
|
|
|
return ceph_mknod(idmap, dir, dentry, mode, 0);
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
2020-09-04 01:31:10 +08:00
|
|
|
#if IS_ENABLED(CONFIG_FS_ENCRYPTION)
static int prep_encrypted_symlink_target(struct ceph_mds_request *req,
					 const char *dest)
{
	int err;
	int len = strlen(dest);
	struct fscrypt_str osd_link = FSTR_INIT(NULL, 0);

	err = fscrypt_prepare_symlink(req->r_parent, dest, len, PATH_MAX,
				      &osd_link);
	if (err)
		goto out;

	err = fscrypt_encrypt_symlink(req->r_new_inode, dest, len, &osd_link);
	if (err)
		goto out;

	req->r_path2 = kmalloc(CEPH_BASE64_CHARS(osd_link.len) + 1, GFP_KERNEL);
	if (!req->r_path2) {
		err = -ENOMEM;
		goto out;
	}

	len = ceph_base64_encode(osd_link.name, osd_link.len, req->r_path2);
	req->r_path2[len] = '\0';
out:
	fscrypt_fname_free_buffer(&osd_link);
	return err;
}
#else
static int prep_encrypted_symlink_target(struct ceph_mds_request *req,
					 const char *dest)
{
	return -EOPNOTSUPP;
}
#endif
|
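The encrypted-symlink helper above allocates `CEPH_BASE64_CHARS(osd_link.len) + 1` bytes for `r_path2` and then writes a terminating NUL at the encoded length. A small user-space sketch of that sizing arithmetic follows. It assumes the macro rounds up 4/3 of the byte count, which is what an unpadded base64 encoding needs; the function names `sketch_base64_chars` and `sketch_path2_bufsize` are hypothetical and not part of the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * ASSUMPTION: stands in for CEPH_BASE64_CHARS(), taken here to be
 * ceil(nbytes * 4 / 3), the character count of unpadded base64.
 */
size_t sketch_base64_chars(size_t nbytes)
{
	assert(nbytes <= SIZE_MAX / 4);  /* keep nbytes * 4 from overflowing */
	return (nbytes * 4 + 2) / 3;     /* integer round-up of nbytes * 4 / 3 */
}

/* Buffer size for the encoded target plus the trailing '\0'. */
size_t sketch_path2_bufsize(size_t nbytes)
{
	return sketch_base64_chars(nbytes) + 1;
}
```

Under this assumption, the `+ 1` is what makes the later `req->r_path2[len] = '\0'` store stay inside the allocation even when the encoder fills every computed character.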
|
|
|
|
2023-01-13 19:49:14 +08:00
|
|
|
static int ceph_symlink(struct mnt_idmap *idmap, struct inode *dir,
|
2021-01-21 21:19:43 +08:00
|
|
|
struct dentry *dentry, const char *dest)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2020-09-03 21:01:39 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb);
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_mds_request *req;
|
2019-05-26 16:27:56 +08:00
|
|
|
struct ceph_acl_sec_ctx as_ctx = {};
|
2020-08-27 01:11:00 +08:00
|
|
|
umode_t mode = S_IFLNK | 0777;
|
2009-10-07 02:31:08 +08:00
|
|
|
	int err;

	if (ceph_snap(dir) != CEPH_NOSNAP)
		return -EROFS;
|
|
|
|
|
2022-05-10 09:47:01 +08:00
|
|
|
	err = ceph_wait_on_conflict_unlink(dentry);
	if (err)
		return err;
|
|
|
|
|
2018-07-09 22:17:31 +08:00
|
|
|
	if (ceph_quota_is_max_files_exceeded(dir)) {
		err = -EDQUOT;
		goto out;
	}
|
2018-01-05 18:47:19 +08:00
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
	doutc(cl, "%p %llx.%llx/'%pd' to '%s'\n", dir, ceph_vinop(dir), dentry,
	      dest);
|
2009-10-07 02:31:08 +08:00
|
|
|
	req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_SYMLINK, USE_AUTH_MDS);
	if (IS_ERR(req)) {
|
2014-09-16 20:35:17 +08:00
|
|
|
		err = PTR_ERR(req);
		goto out;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
2020-08-27 01:11:00 +08:00
|
|
|
|
|
|
|
	req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx);
	if (IS_ERR(req->r_new_inode)) {
		err = PTR_ERR(req->r_new_inode);
		req->r_new_inode = NULL;
		goto out_req;
	}
|
|
|
|
|
2017-01-31 23:28:26 +08:00
|
|
|
req->r_parent = dir;
|
2021-06-19 01:05:06 +08:00
|
|
|
ihold(dir);
|
|
|
|
|
2020-09-04 01:31:10 +08:00
|
|
|
	if (IS_ENCRYPTED(req->r_new_inode)) {
		err = prep_encrypted_symlink_target(req, dest);
		if (err)
			goto out_req;
	} else {
		req->r_path2 = kstrdup(dest, GFP_KERNEL);
		if (!req->r_path2) {
			err = -ENOMEM;
			goto out_req;
		}
	}
|
|
|
|
|
2017-01-31 23:28:26 +08:00
|
|
|
set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
|
2023-08-07 21:26:19 +08:00
|
|
|
req->r_mnt_idmap = mnt_idmap_get(idmap);
|
2015-03-22 00:54:58 +08:00
|
|
|
	req->r_dentry = dget(dentry);
	req->r_num_caps = 2;
|
2023-06-05 14:58:18 +08:00
|
|
|
	req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL |
			     CEPH_CAP_XATTR_EXCL;
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
|
2020-08-27 01:11:00 +08:00
|
|
|
|
|
|
|
ceph_as_ctx_to_req(req, &as_ctx);
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
	err = ceph_mdsc_do_request(mdsc, dir, req);
	if (!err && !req->r_reply_info.head->is_dentry)
		err = ceph_handle_notrace_create(dir, dentry);
|
2020-08-27 01:11:00 +08:00
|
|
|
out_req:
|
2009-10-07 02:31:08 +08:00
|
|
|
ceph_mdsc_put_request(req);
|
2014-09-16 20:35:17 +08:00
|
|
|
out:
|
|
|
|
if (err)
|
2009-10-07 02:31:08 +08:00
|
|
|
d_drop(dentry);
|
2019-05-26 16:27:56 +08:00
|
|
|
ceph_release_acl_sec_ctx(&as_ctx);
|
2009-10-07 02:31:08 +08:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2023-01-13 19:49:15 +08:00
|
|
|
static int ceph_mkdir(struct mnt_idmap *idmap, struct inode *dir,
|
2021-01-21 21:19:43 +08:00
|
|
|
struct dentry *dentry, umode_t mode)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2020-09-03 21:01:39 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb);
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_mds_request *req;
|
2019-05-26 15:35:39 +08:00
|
|
|
struct ceph_acl_sec_ctx as_ctx = {};
|
2022-05-10 09:47:01 +08:00
|
|
|
int err;
|
2009-10-07 02:31:08 +08:00
|
|
|
int op;
|
|
|
|
|
2022-05-10 09:47:01 +08:00
|
|
|
err = ceph_wait_on_conflict_unlink(dentry);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
if (ceph_snap(dir) == CEPH_SNAPDIR) {
|
|
|
|
/* mkdir .snap/foo is a MKSNAP */
|
|
|
|
op = CEPH_MDS_OP_MKSNAP;
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "mksnap %llx.%llx/'%pd' dentry %p\n",
|
|
|
|
ceph_vinop(dir), dentry, dentry);
|
2009-10-07 02:31:08 +08:00
|
|
|
} else if (ceph_snap(dir) == CEPH_NOSNAP) {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "mkdir %llx.%llx/'%pd' dentry %p mode 0%ho\n",
|
|
|
|
ceph_vinop(dir), dentry, dentry, mode);
|
2009-10-07 02:31:08 +08:00
|
|
|
op = CEPH_MDS_OP_MKDIR;
|
|
|
|
} else {
|
2022-05-10 09:47:01 +08:00
|
|
|
err = -EROFS;
|
2009-10-07 02:31:08 +08:00
|
|
|
goto out;
|
|
|
|
}
|
2014-09-16 20:35:17 +08:00
|
|
|
|
2018-01-12 16:26:17 +08:00
|
|
|
if (op == CEPH_MDS_OP_MKDIR &&
|
|
|
|
ceph_quota_is_max_files_exceeded(dir)) {
|
2018-01-05 18:47:19 +08:00
|
|
|
err = -EDQUOT;
|
|
|
|
goto out;
|
|
|
|
}
|
2022-08-25 21:31:31 +08:00
|
|
|
if ((op == CEPH_MDS_OP_MKSNAP) && IS_ENCRYPTED(dir) &&
|
|
|
|
!fscrypt_has_encryption_key(dir)) {
|
|
|
|
err = -ENOKEY;
|
|
|
|
goto out;
|
|
|
|
}
|
2018-01-05 18:47:19 +08:00
|
|
|
|
2014-09-16 20:35:17 +08:00
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
|
|
|
|
if (IS_ERR(req)) {
|
|
|
|
err = PTR_ERR(req);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
ceph: preallocate inode for ops that may create one
When creating a new inode, we need to determine the crypto context
before we can transmit the RPC. The fscrypt API has a routine for getting
a crypto context before a create occurs, but it requires an inode.
Change the ceph code to preallocate an inode in advance of a create of
any sort (open(), mknod(), symlink(), etc). Move the existing code that
generates the ACL and SELinux blobs into this routine since that's
mostly common across all the different codepaths.
In most cases, we just want to allow ceph_fill_trace to use that inode
after the reply comes in, so add a new field to the MDS request for it
(r_new_inode).
The async create codepath is a bit different though. In that case, we
want to hash the inode in advance of the RPC so that it can be used
before the reply comes in. If the call subsequently fails with
-EJUKEBOX, then just put the references and clean up the as_ctx. Note
that with this change, we now need to regenerate the as_ctx when this
occurs, but it's quite rare for it to happen.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-and-tested-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-08-27 01:11:00 +08:00
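When the preallocated inode is attached to the request, a failed allocation comes back as an error pointer that has to be decoded and cleared before the request is released. The sketch below models that handling in userspace. The three helpers are stand-ins that imitate the kernel's error-pointer convention, and `model_inode`, `model_request`, `model_new_inode`, and `model_prepare` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins imitating the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical model types; the real request carries many more fields. */
struct model_inode { int ino; };
struct model_request { struct model_inode *new_inode; };

static struct model_inode the_inode = { 1 };

/* Hypothetical allocator: returns an inode or an encoded errno. */
static struct model_inode *model_new_inode(int fail)
{
	return fail ? ERR_PTR(-ENOMEM) : &the_inode;
}

/* Attach a preallocated inode to the request before it is sent. */
static int model_prepare(struct model_request *req, int fail)
{
	req->new_inode = model_new_inode(fail);
	if (IS_ERR(req->new_inode)) {
		int err = (int)PTR_ERR(req->new_inode);

		/* Clear the field so later cleanup never sees an error pointer. */
		req->new_inode = NULL;
		return err;
	}
	return 0;
}
```

Resetting the field to NULL before bailing out keeps the request's release path simple: it only ever has to deal with a valid inode or nothing at all.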
|
|
|
mode |= S_IFDIR;
|
|
|
|
req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx);
|
|
|
|
if (IS_ERR(req->r_new_inode)) {
|
|
|
|
err = PTR_ERR(req->r_new_inode);
|
|
|
|
req->r_new_inode = NULL;
|
|
|
|
goto out_req;
|
|
|
|
}
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_dentry = dget(dentry);
|
|
|
|
req->r_num_caps = 2;
|
2017-01-31 23:28:26 +08:00
|
|
|
req->r_parent = dir;
|
2021-06-19 01:05:06 +08:00
|
|
|
ihold(dir);
|
2017-01-31 23:28:26 +08:00
|
|
|
set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
|
2023-08-07 21:26:19 +08:00
|
|
|
if (op == CEPH_MDS_OP_MKDIR)
|
|
|
|
req->r_mnt_idmap = mnt_idmap_get(idmap);
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_args.mkdir.mode = cpu_to_le32(mode);
|
2023-06-05 14:58:18 +08:00
|
|
|
req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL |
|
|
|
|
CEPH_CAP_XATTR_EXCL;
|
2009-10-07 02:31:08 +08:00
|
|
|
req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
|
2020-08-27 01:11:00 +08:00
|
|
|
|
|
|
|
ceph_as_ctx_to_req(req, &as_ctx);
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
err = ceph_mdsc_do_request(mdsc, dir, req);
|
2014-12-10 16:17:31 +08:00
|
|
|
if (!err &&
|
|
|
|
!req->r_reply_info.head->is_target &&
|
|
|
|
!req->r_reply_info.head->is_dentry)
|
2009-10-07 02:31:08 +08:00
|
|
|
err = ceph_handle_notrace_create(dir, dentry);
|
2020-08-27 01:11:00 +08:00
|
|
|
out_req:
|
2009-10-07 02:31:08 +08:00
|
|
|
ceph_mdsc_put_request(req);
|
|
|
|
out:
|
2014-02-11 12:55:05 +08:00
|
|
|
if (!err)
|
2019-05-26 15:35:39 +08:00
|
|
|
ceph_init_inode_acls(d_inode(dentry), &as_ctx);
|
2014-02-11 12:55:05 +08:00
|
|
|
else
|
2009-10-07 02:31:08 +08:00
|
|
|
d_drop(dentry);
|
2019-05-26 15:35:39 +08:00
|
|
|
ceph_release_acl_sec_ctx(&as_ctx);
|
2009-10-07 02:31:08 +08:00
|
|
|
return err;
|
|
|
|
}
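The opcode selection at the top of ceph_mkdir() can be summarised with a small userspace model. It is a sketch under stated assumptions: the enums and `choose_mkdir_op` are hypothetical names, the quota state is passed in as a plain flag, and the encryption-key check on the snapshot path is left out to keep the example portable.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the snapshot kinds and MDS opcodes. */
enum snap_kind { SNAP_NOSNAP, SNAP_SNAPDIR, SNAP_OTHER };
enum mds_op { OP_MKDIR, OP_MKSNAP };

/*
 * Pick the operation for a mkdir call:
 *   - inside the snapshot directory, mkdir means "create a snapshot";
 *   - in the live tree it is an ordinary mkdir, subject to the file quota;
 *   - anywhere else (inside an existing snapshot) the tree is read-only.
 * Returns 0 and sets *op on success, or a negative errno.
 */
static int choose_mkdir_op(enum snap_kind kind, int quota_exceeded,
			   enum mds_op *op)
{
	if (kind == SNAP_SNAPDIR)
		*op = OP_MKSNAP;
	else if (kind == SNAP_NOSNAP)
		*op = OP_MKDIR;
	else
		return -EROFS;

	/* Only the plain mkdir path is checked against the quota. */
	if (*op == OP_MKDIR && quota_exceeded)
		return -EDQUOT;

	return 0;
}
```

In this model a snapshot request succeeds even when the file quota is exhausted, matching the order of the checks in the listing above.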
|
|
|
|
|
|
|
|
static int ceph_link(struct dentry *old_dentry, struct inode *dir,
|
|
|
|
struct dentry *dentry)
|
|
|
|
{
|
2020-09-03 21:01:39 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb);
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_mds_request *req;
|
|
|
|
int err;
|
|
|
|
|
2023-04-26 10:38:57 +08:00
|
|
|
if (dentry->d_flags & DCACHE_DISCONNECTED)
|
|
|
|
return -EINVAL;
|
|
|
|
|
2022-05-10 09:47:01 +08:00
|
|
|
err = ceph_wait_on_conflict_unlink(dentry);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
if (ceph_snap(dir) != CEPH_NOSNAP)
|
|
|
|
return -EROFS;
|
|
|
|
|
2021-07-02 02:40:51 +08:00
|
|
|
err = fscrypt_prepare_link(old_dentry, dir, dentry);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %llx.%llx/'%pd' to '%pd'\n", dir, ceph_vinop(dir),
|
|
|
|
old_dentry, dentry);
|
ceph: directory operations
Directory operations, including lookup, are defined here. We take
advantage of lookup intents when possible. For the most part, we just
need to build the proper requests for the metadata server(s) and
pass things off to the mds_client.
The results of most operations are normally incorporated into the
client's cache when the reply is parsed by ceph_fill_trace().
However, if the MDS replies without a trace (e.g., when retrying an
update after an MDS failure recovery), some operation-specific cleanup
may be needed.
We can validate cached dentries in two ways. A per-dentry lease may
be issued by the MDS, or a per-directory cap may be issued that acts
as a lease on the entire directory. In the latter case, a 'gen' value
is used to determine which dentries belong to the currently leased
directory contents.
We normally prepopulate the dcache and icache with readdir results.
This makes subsequent lookups and getattrs avoid any server
interaction. It also lets us satisfy readdir operations by peeking at
the dcache IFF we hold the per-directory cap/lease, previously
performed a readdir, and haven't dropped any of the resulting
dentries.
Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-07 02:31:08 +08:00
	req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_LINK, USE_AUTH_MDS);
	if (IS_ERR(req)) {
		d_drop(dentry);
		return PTR_ERR(req);
	}
	req->r_dentry = dget(dentry);
	req->r_num_caps = 2;
2013-02-06 05:41:23 +08:00
	req->r_old_dentry = dget(old_dentry);
2023-04-26 10:38:57 +08:00
	/*
	 * The old_dentry may be a DCACHE_DISCONNECTED dentry, in which
	 * case we just pass the ino# to the MDSs.
	 */
	if (old_dentry->d_flags & DCACHE_DISCONNECTED)
		req->r_ino2 = ceph_vino(d_inode(old_dentry));
2017-01-31 23:28:26 +08:00
	req->r_parent = dir;
2021-06-19 01:05:06 +08:00
	ihold(dir);
2017-01-31 23:28:26 +08:00
	set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
2023-06-05 14:58:18 +08:00
	req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_XATTR_EXCL;
2009-10-07 02:31:08 +08:00
	req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
2013-07-21 20:25:25 +08:00
	/* release LINK_SHARED on source inode (mds will lock it) */
2017-11-23 17:59:13 +08:00
	req->r_old_inode_drop = CEPH_CAP_LINK_SHARED | CEPH_CAP_LINK_EXCL;
2009-10-07 02:31:08 +08:00
	err = ceph_mdsc_do_request(mdsc, dir, req);
2011-05-28 00:24:26 +08:00
	if (err) {
2009-10-07 02:31:08 +08:00
		d_drop(dentry);
2011-05-28 00:24:26 +08:00
	} else if (!req->r_reply_info.head->is_dentry) {
2015-03-18 06:25:59 +08:00
		ihold(d_inode(old_dentry));
		d_instantiate(dentry, d_inode(old_dentry));
2011-05-28 00:24:26 +08:00
	}
2009-10-07 02:31:08 +08:00
	ceph_mdsc_put_request(req);
	return err;
}
ceph: perform asynchronous unlink if we have sufficient caps
The MDS is getting a new lock-caching facility that will allow it
to cache the necessary locks to allow asynchronous directory operations.
Since the CEPH_CAP_FILE_* caps are currently unused on directories,
we can repurpose those bits for this purpose.
When performing an unlink, if we have Fx on the parent directory,
and CEPH_CAP_DIR_UNLINK (aka Fr), and we know that the dentry being
removed is the primary link, then we can fire off an unlink
request immediately and don't need to wait on reply before returning.
In that situation, just fix up the dcache and link count and return
immediately after issuing the call to the MDS. This does mean that we
need to hold an extra reference to the inode being unlinked, and extra
references to the caps to avoid races. Those references are put and
error handling is done in the r_callback routine.
If the operation ends up failing, then set a writeback error on the
directory inode and on the inode itself; that error can be fetched
later by an fsync on the dir.
The behavior of dir caps is slightly different from caps on normal
files. Because these are just considered an optimization, if the
session is reconnected, we will not automatically reclaim them. They
are instead considered lost until we do another synchronous op in the
parent directory.
Async dirops are enabled via the "nowsync" mount option, which is
patterned after the xfs "wsync" mount option. For now, the default
is "wsync", but eventually we may flip that.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-04-03 03:35:56 +08:00
static void ceph_async_unlink_cb(struct ceph_mds_client *mdsc,
				 struct ceph_mds_request *req)
{
2022-05-10 09:47:01 +08:00
	struct dentry *dentry = req->r_dentry;
2023-06-12 10:50:38 +08:00
	struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dentry->d_sb);
2023-06-12 09:04:07 +08:00
	struct ceph_client *cl = fsc->client;
2022-05-10 09:47:01 +08:00
	struct ceph_dentry_info *di = ceph_dentry(dentry);
2019-04-03 03:35:56 +08:00
	int result = req->r_err ? req->r_err :
			le32_to_cpu(req->r_reply_info.head->result);
2022-05-10 09:47:01 +08:00
	if (!test_bit(CEPH_DENTRY_ASYNC_UNLINK_BIT, &di->flags))
2023-06-12 09:04:07 +08:00
		pr_warn_client(cl,
			"dentry %p:%pd async unlink bit is not set\n",
			dentry, dentry);
2022-05-10 09:47:01 +08:00
	spin_lock(&fsc->async_unlink_conflict_lock);
	hash_del_rcu(&di->hnode);
	spin_unlock(&fsc->async_unlink_conflict_lock);

	spin_lock(&dentry->d_lock);
	di->flags &= ~CEPH_DENTRY_ASYNC_UNLINK;
	wake_up_bit(&di->flags, CEPH_DENTRY_ASYNC_UNLINK_BIT);
	spin_unlock(&dentry->d_lock);

	synchronize_rcu();
2019-04-03 03:35:56 +08:00
	if (result == -EJUKEBOX)
		goto out;

	/* If op failed, mark everyone involved for errors */
	if (result) {
2020-04-08 20:41:38 +08:00
		int pathlen = 0;
		u64 base = 0;
2023-06-09 15:15:47 +08:00
		char *path = ceph_mdsc_build_path(mdsc, dentry, &pathlen,
2019-04-03 03:35:56 +08:00
						  &base, 0);

		/* mark error on parent + clear complete */
		mapping_set_error(req->r_parent->i_mapping, result);
		ceph_dir_clear_complete(req->r_parent);

		/* drop the dentry -- we don't know its status */
2022-05-10 09:47:01 +08:00
		if (!d_unhashed(dentry))
			d_drop(dentry);
2019-04-03 03:35:56 +08:00
		/* mark inode itself for an error (since metadata is bogus) */
		mapping_set_error(req->r_old_inode->i_mapping, result);
2023-06-12 09:04:07 +08:00
		pr_warn_client(cl, "failure path=(%llx)%s result=%d!\n",
			       base, IS_ERR(path) ? "<<bad>>" : path, result);
2019-04-03 03:35:56 +08:00
		ceph_mdsc_free_path(path, pathlen);
	}
out:
	iput(req->r_old_inode);
	ceph_mdsc_release_dir_caps(req);
}
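
The decision the callback above makes can be sketched in isolation. This is a simplified model for illustration only, not the kernel code: the struct, the function names and SKETCH_EJUKEBOX are hypothetical stand-ins, and the numeric value used for EJUKEBOX is an assumption. It shows that a locally recorded error takes precedence over the result carried in the MDS reply, and that errors are marked for every failure except -EJUKEBOX.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel-internal EJUKEBOX value (assumed). */
#define SKETCH_EJUKEBOX 528

/* Simplified model of the two request fields the callback reads. */
struct sketch_request {
	int local_err;    /* models req->r_err */
	int reply_result; /* models the reply result after byte-swapping */
};

/* A locally recorded error takes precedence over the MDS reply result. */
static int sketch_unlink_result(const struct sketch_request *req)
{
	return req->local_err ? req->local_err : req->reply_result;
}

/* Errors are marked for any failure other than -EJUKEBOX. */
static bool sketch_should_mark_errors(int result)
{
	return result != 0 && result != -SKETCH_EJUKEBOX;
}
```

In the real callback the references on the inode and the directory caps are dropped on every path, including the -EJUKEBOX one; the sketch leaves that out.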
static int get_caps_for_async_unlink(struct inode *dir, struct dentry *dentry)
{
	struct ceph_inode_info *ci = ceph_inode(dir);
	struct ceph_dentry_info *di;
	int got = 0, want = CEPH_CAP_FILE_EXCL | CEPH_CAP_DIR_UNLINK;

	spin_lock(&ci->i_ceph_lock);
	if ((__ceph_caps_issued(ci, NULL) & want) == want) {
		ceph_take_cap_refs(ci, want, false);
		got = want;
	}
	spin_unlock(&ci->i_ceph_lock);

	/* If we didn't get anything, return 0 */
	if (!got)
		return 0;

	spin_lock(&dentry->d_lock);
	di = ceph_dentry(dentry);
	/*
	 * - We are holding Fx, which implies Fs caps.
	 * - Only support async unlink for primary linkage
	 */
	if (atomic_read(&ci->i_shared_gen) != di->lease_shared_gen ||
	    !(di->flags & CEPH_DENTRY_PRIMARY_LINK))
		want = 0;
	spin_unlock(&dentry->d_lock);

	/* Do we still want what we've got? */
	if (want == got)
		return got;

	ceph_put_cap_refs(ci, got);
	return 0;
}
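
The eligibility test in get_caps_for_async_unlink() can be modelled without the kernel types. The following is a minimal sketch under stated assumptions: the SKETCH_* bit values, the structs and the function name are hypothetical, and locking and cap reference counting are deliberately left out. It only shows the mask test (all wanted bits must be issued) followed by the generation and primary-link checks.

```c
#include <stdbool.h>

/* Hypothetical bit values; the real CEPH_CAP_* constants differ. */
#define SKETCH_CAP_FILE_EXCL  (1u << 0)
#define SKETCH_CAP_DIR_UNLINK (1u << 1)

/* Simplified model of the directory inode state that is consulted. */
struct sketch_dir {
	unsigned int issued; /* caps currently issued on the directory */
	int shared_gen;      /* models ci->i_shared_gen */
};

/* Simplified model of the per-dentry state that is consulted. */
struct sketch_dentry {
	int lease_shared_gen; /* models di->lease_shared_gen */
	bool primary_link;    /* models the CEPH_DENTRY_PRIMARY_LINK flag */
};

/* Returns the caps to hold for an async unlink, or 0 if it is not allowed. */
static unsigned int sketch_caps_for_async_unlink(const struct sketch_dir *dir,
						 const struct sketch_dentry *dn)
{
	const unsigned int want = SKETCH_CAP_FILE_EXCL | SKETCH_CAP_DIR_UNLINK;

	/* Every wanted bit must be issued, not just some of them. */
	if ((dir->issued & want) != want)
		return 0;

	/* The dentry must belong to the current generation and be primary. */
	if (dir->shared_gen != dn->lease_shared_gen || !dn->primary_link)
		return 0;

	return want;
}
```

The kernel function additionally takes cap references before the dentry checks and puts them again if those checks fail, which is why it compares want against got at the end.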
2009-10-07 02:31:08 +08:00
/*
 * rmdir and unlink differ only by the metadata op code
 */
static int ceph_unlink(struct inode *dir, struct dentry *dentry)
{
2023-06-12 10:50:38 +08:00
	struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dir->i_sb);
2023-06-12 09:04:07 +08:00
	struct ceph_client *cl = fsc->client;
2010-04-07 06:14:15 +08:00
	struct ceph_mds_client *mdsc = fsc->mdsc;
2015-03-18 06:25:59 +08:00
	struct inode *inode = d_inode(dentry);
2009-10-07 02:31:08 +08:00
	struct ceph_mds_request *req;
2019-04-03 03:35:56 +08:00
	bool try_async = ceph_test_mount_opt(fsc, ASYNC_DIROPS);
2023-11-08 11:06:05 +08:00
	struct dentry *dn;
2009-10-07 02:31:08 +08:00
	int err = -EROFS;
	int op;
2023-11-08 11:06:05 +08:00
	char *path;
	int pathlen;
	u64 pathbase;
2009-10-07 02:31:08 +08:00
	if (ceph_snap(dir) == CEPH_SNAPDIR) {
		/* rmdir .snap/foo is RMSNAP */
2023-06-12 09:04:07 +08:00
		doutc(cl, "rmsnap %llx.%llx/'%pd' dn\n", ceph_vinop(dir),
		      dentry);
2009-10-07 02:31:08 +08:00
		op = CEPH_MDS_OP_RMSNAP;
	} else if (ceph_snap(dir) == CEPH_NOSNAP) {
2023-06-12 09:04:07 +08:00
		doutc(cl, "unlink/rmdir %llx.%llx/'%pd' inode %llx.%llx\n",
		      ceph_vinop(dir), dentry, ceph_vinop(inode));
VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
Convert the following where appropriate:
(1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
(2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
(3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
complicated than it appears as some calls should be converted to
d_can_lookup() instead. The difference is whether the directory in
question is a real dir with a ->lookup op or whether it's a fake dir with
a ->d_automount op.
In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).
Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer. In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.
However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.
There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
intended for special directory entry types that don't have attached inodes.
The following perl+coccinelle script was used:
use strict;

my @callers;
# Declare $fd lexically so the script compiles under "use strict".
open(my $fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
	die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
	print "No matches\n";
	exit(0);
}

my @cocci = (
	'@@',
	'expression E;',
	'@@',
	'',
	'- S_ISLNK(E->d_inode->i_mode)',
	'+ d_is_symlink(E)',
	'',
	'@@',
	'expression E;',
	'@@',
	'',
	'- S_ISDIR(E->d_inode->i_mode)',
	'+ d_is_dir(E)',
	'',
	'@@',
	'expression E;',
	'@@',
	'',
	'- S_ISREG(E->d_inode->i_mode)',
	'+ d_is_reg(E)' );

# Write the semantic patch to a temporary file for spatch.
my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);

foreach my $file (@callers) {
	chomp $file;
	print "Processing ", $file, "\n";
	system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
		die "spatch failed";
}
[AV: overlayfs parts skipped]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-01-29 20:02:35 +08:00
		op = d_is_dir(dentry) ?
ceph: directory operations
Directory operations, including lookup, are defined here. We take
advantage of lookup intents when possible. For the most part, we just
need to build the proper requests for the metadata server(s) and
pass things off to the mds_client.
The results of most operations are normally incorporated into the
client's cache when the reply is parsed by ceph_fill_trace().
However, if the MDS replies without a trace (e.g., when retrying an
update after an MDS failure recovery), some operation-specific cleanup
may be needed.
We can validate cached dentries in two ways. A per-dentry lease may
be issued by the MDS, or a per-directory cap may be issued that acts
as a lease on the entire directory. In the latter case, a 'gen' value
is used to determine which dentries belong to the currently leased
directory contents.
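The two validation paths described above could be modelled roughly as follows. This is a minimal sketch under stated assumptions: `struct sk_dir`, `struct sk_dentry`, their fields and `sk_dentry_valid()` are hypothetical names made up for illustration, and the idea that the 'gen' value changes whenever the directory cap is newly issued is an assumption, not something the text above spells out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-directory state. */
struct sk_dir {
	bool     has_dir_cap;   /* per-directory cap currently held */
	uint32_t gen;           /* assumed to change when the cap is reissued */
};

/* Hypothetical, simplified per-dentry state. */
struct sk_dentry {
	uint64_t lease_expires; /* per-dentry lease deadline; 0 means none */
	uint32_t dir_gen;       /* sk_dir.gen recorded when this entry was cached */
};

static inline bool sk_dentry_valid(const struct sk_dentry *dn,
				   const struct sk_dir *dir, uint64_t now)
{
	/* First way: an unexpired per-dentry lease. */
	if (dn->lease_expires != 0 && now < dn->lease_expires)
		return true;

	/*
	 * Second way: the per-directory cap, but only for dentries that
	 * belong to the currently leased directory contents.
	 */
	return dir->has_dir_cap && dn->dir_gen == dir->gen;
}
```

A dentry cached under an older 'gen' is treated as stale in this model even while the directory cap is held, which is the role the text assigns to the 'gen' value.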
We normally prepopulate the dcache and icache with readdir results.
This makes subsequent lookups and getattrs avoid any server
interaction. It also lets us satisfy a readdir operation by peeking at
the dcache IFF we hold the per-directory cap/lease, previously
performed a readdir, and haven't dropped any of the resulting
dentries.
Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-07 02:31:08 +08:00
			CEPH_MDS_OP_RMDIR : CEPH_MDS_OP_UNLINK;
	} else
		goto out;
2023-11-08 11:06:05 +08:00

	dn = d_find_alias(dir);
	if (!dn) {
		try_async = false;
	} else {
		path = ceph_mdsc_build_path(mdsc, dn, &pathlen, &pathbase, 0);
		if (IS_ERR(path)) {
			try_async = false;
			err = 0;
		} else {
			err = ceph_mds_check_access(mdsc, path, MAY_WRITE);
		}
		ceph_mdsc_free_path(path, pathlen);
		dput(dn);

		/* For non-EACCES cases, let the MDS do the MDS auth check */
		if (err == -EACCES) {
			return err;
		} else if (err < 0) {
			try_async = false;
			err = 0;
		}
	}
ceph: perform asynchronous unlink if we have sufficient caps
The MDS is getting a new lock-caching facility that will allow it
to cache the necessary locks to allow asynchronous directory operations.
Since the CEPH_CAP_FILE_* caps are currently unused on directories,
we can repurpose those bits for this purpose.
When performing an unlink, if we have Fx on the parent directory,
and CEPH_CAP_DIR_UNLINK (aka Fr), and we know that the dentry being
removed is the primary link, then we can fire off an unlink
request immediately and don't need to wait on reply before returning.
In that situation, just fix up the dcache and link count and return
immediately after issuing the call to the MDS. This does mean that we
need to hold an extra reference to the inode being unlinked, and extra
references to the caps to avoid races. Those references are put and
error handling is done in the r_callback routine.
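The conditions listed above amount to a small decision function, sketched here in user-space C. The bit values, the `SK_*` names and `sk_pick_unlink_mode()` are hypothetical and exist only for this illustration; the real cap bits and the real check live in the Ceph client.

```c
#include <stdbool.h>

/* Hypothetical bit values; the real cap bits are defined by Ceph. */
#define SK_CAP_FX          (1u << 0)  /* "Fx" on the parent directory */
#define SK_CAP_DIR_UNLINK  (1u << 1)  /* CEPH_CAP_DIR_UNLINK, aka "Fr" */

enum sk_unlink_mode { SK_UNLINK_SYNC, SK_UNLINK_ASYNC };

static inline enum sk_unlink_mode
sk_pick_unlink_mode(unsigned int dir_caps, bool is_primary_link)
{
	const unsigned int need = SK_CAP_FX | SK_CAP_DIR_UNLINK;

	/* Both caps must be held on the parent directory. */
	if ((dir_caps & need) != need)
		return SK_UNLINK_SYNC;

	/* The dentry being removed must be the primary link. */
	if (!is_primary_link)
		return SK_UNLINK_SYNC;

	return SK_UNLINK_ASYNC;
}
```

Whenever any condition fails, the sketch falls back to the synchronous path, so the asynchronous case stays a pure optimization.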
If the operation ends up failing, then set a writeback error on the
directory inode and on the inode itself; that error can be fetched later
by an fsync on the dir.
The behavior of dir caps is slightly different from caps on normal
files. Because these are just considered an optimization, if the
session is reconnected, we will not automatically reclaim them. They
are instead considered lost until we do another synchronous op in the
parent directory.
Async dirops are enabled via the "nowsync" mount option, which is
patterned after the xfs "wsync" mount option. For now, the default
is "wsync", but eventually we may flip that.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-04-03 03:35:56 +08:00
retry:
2009-10-07 02:31:08 +08:00
	req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
	if (IS_ERR(req)) {
		err = PTR_ERR(req);
		goto out;
	}
	req->r_dentry = dget(dentry);
	req->r_num_caps = 2;
2017-01-31 23:28:26 +08:00
	req->r_parent = dir;
2021-06-19 01:05:06 +08:00
	ihold(dir);
2023-06-05 14:58:18 +08:00
	req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_XATTR_EXCL;
2009-10-07 02:31:08 +08:00
	req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
2018-01-24 21:24:33 +08:00
	req->r_inode_drop = ceph_drop_caps_for_unlink(inode);
2019-04-03 03:35:56 +08:00
	if (try_async && op == CEPH_MDS_OP_UNLINK &&
	    (req->r_dir_caps = get_caps_for_async_unlink(dir, dentry))) {
2022-05-10 09:47:01 +08:00
		struct ceph_dentry_info *di = ceph_dentry(dentry);

2023-06-12 09:04:07 +08:00
		doutc(cl, "async unlink on %llx.%llx/'%pd' caps=%s",
		      ceph_vinop(dir), dentry,
		      ceph_cap_string(req->r_dir_caps));
2019-04-03 03:35:56 +08:00
		set_bit(CEPH_MDS_R_ASYNC, &req->r_req_flags);
		req->r_callback = ceph_async_unlink_cb;
		req->r_old_inode = d_inode(dentry);
		ihold(req->r_old_inode);
2022-05-10 09:47:01 +08:00

		spin_lock(&dentry->d_lock);
		di->flags |= CEPH_DENTRY_ASYNC_UNLINK;
		spin_unlock(&dentry->d_lock);

		spin_lock(&fsc->async_unlink_conflict_lock);
		hash_add_rcu(fsc->async_unlink_conflict, &di->hnode,
			     dentry->d_name.hash);
		spin_unlock(&fsc->async_unlink_conflict_lock);
2019-04-03 03:35:56 +08:00
		err = ceph_mdsc_submit_request(mdsc, dir, req);
		if (!err) {
			/*
			 * We have enough caps, so we assume that the unlink
			 * will succeed. Fix up the target inode and dcache.
			 */
			drop_nlink(inode);
			d_delete(dentry);
2022-05-10 09:47:01 +08:00
		} else {
			spin_lock(&fsc->async_unlink_conflict_lock);
			hash_del_rcu(&di->hnode);
			spin_unlock(&fsc->async_unlink_conflict_lock);

			spin_lock(&dentry->d_lock);
			di->flags &= ~CEPH_DENTRY_ASYNC_UNLINK;
			spin_unlock(&dentry->d_lock);

			if (err == -EJUKEBOX) {
				try_async = false;
				ceph_mdsc_put_request(req);
				goto retry;
			}
2019-04-03 03:35:56 +08:00
		}
	} else {
		set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
		err = ceph_mdsc_do_request(mdsc, dir, req);
		if (!err && !req->r_reply_info.head->is_dentry)
			d_delete(dentry);
	}
2009-10-07 02:31:08 +08:00
	ceph_mdsc_put_request(req);
out:
	return err;
}

2023-01-13 19:49:17 +08:00
static int ceph_rename(struct mnt_idmap *idmap, struct inode *old_dir,
2021-01-21 21:19:43 +08:00
		       struct dentry *old_dentry, struct inode *new_dir,
		       struct dentry *new_dentry, unsigned int flags)
2009-10-07 02:31:08 +08:00
{
2020-09-03 21:01:39 +08:00
	struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(old_dir->i_sb);
2023-06-12 09:04:07 +08:00
	struct ceph_client *cl = mdsc->fsc->client;
2009-10-07 02:31:08 +08:00
	struct ceph_mds_request *req;
2015-04-07 15:36:32 +08:00
	int op = CEPH_MDS_OP_RENAME;
2009-10-07 02:31:08 +08:00
	int err;
fs: make remaining filesystems use .rename2
This is trivial to do:
- add flags argument to foo_rename()
- check if flags is zero
- assign foo_rename() to .rename2 instead of .rename
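The first two steps above can be sketched as follows. This is a user-space illustration only: `struct foo_inode`, `struct foo_dentry` and `SK_RENAME_NOREPLACE` are hypothetical stand-ins rather than the VFS types and flag definitions, and the body of the actual rename is deliberately left out.

```c
#include <errno.h>

/* Hypothetical opaque types standing in for the VFS inode and dentry. */
struct foo_inode;
struct foo_dentry;

/* Local stand-in for a rename flag such as RENAME_NOREPLACE. */
#define SK_RENAME_NOREPLACE (1u << 0)

static inline int foo_rename(struct foo_inode *old_dir,
			     struct foo_dentry *old_dentry,
			     struct foo_inode *new_dir,
			     struct foo_dentry *new_dentry,
			     unsigned int flags)
{
	/* No rename flags are supported, so reject any that are set. */
	if (flags)
		return -EINVAL;

	/* The filesystem's existing rename logic would run here. */
	(void)old_dir;
	(void)old_dentry;
	(void)new_dir;
	(void)new_dentry;
	return 0;
}
```

Rejecting every non-zero flag keeps the behavior identical to the old .rename entry point while letting the filesystem share the newer interface.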
This doesn't mean it's impossible to support RENAME_NOREPLACE for these
filesystems, but it is not trivial, like for local filesystems.
RENAME_NOREPLACE must guarantee atomicity (i.e. it shouldn't be possible
for a file to be created on one host while it is overwritten by rename on
another host).
Filesystems converted:
9p, afs, ceph, coda, ecryptfs, kernfs, lustre, ncpfs, nfs, ocfs2, orangefs.
After this, we can get rid of the duplicate interfaces for rename.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Howells <dhowells@redhat.com> [AFS]
Acked-by: Mike Marshall <hubcap@omnibond.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Mark Fasheh <mfasheh@suse.com>
2016-09-27 17:03:58 +08:00
	if (flags)
		return -EINVAL;
2009-10-07 02:31:08 +08:00
	if (ceph_snap(old_dir) != ceph_snap(new_dir))
		return -EXDEV;
2015-04-07 15:36:32 +08:00
	if (ceph_snap(old_dir) != CEPH_NOSNAP) {
		if (old_dir == new_dir && ceph_snap(old_dir) == CEPH_SNAPDIR)
			op = CEPH_MDS_OP_RENAMESNAP;
		else
			return -EROFS;
	}
2020-11-12 23:23:21 +08:00
	/* don't allow cross-quota renames */
	if ((old_dir != new_dir) &&
	    (!ceph_quota_is_same_realm(old_dir, new_dir)))
		return -EXDEV;
2018-01-05 18:47:20 +08:00
2022-05-10 09:47:01 +08:00
	err = ceph_wait_on_conflict_unlink(new_dentry);
	if (err)
		return err;

2021-07-02 02:40:51 +08:00
	err = fscrypt_prepare_rename(old_dir, old_dentry, new_dir, new_dentry,
				     flags);
	if (err)
		return err;

2023-06-12 09:04:07 +08:00
	doutc(cl, "%llx.%llx/'%pd' to %llx.%llx/'%pd'\n",
	      ceph_vinop(old_dir), old_dentry, ceph_vinop(new_dir),
	      new_dentry);
2015-04-07 15:36:32 +08:00
	req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
2009-10-07 02:31:08 +08:00
	if (IS_ERR(req))
		return PTR_ERR(req);
2013-02-06 05:36:05 +08:00
	ihold(old_dir);
2009-10-07 02:31:08 +08:00
	req->r_dentry = dget(new_dentry);
	req->r_num_caps = 2;
	req->r_old_dentry = dget(old_dentry);
2013-02-06 05:36:05 +08:00
	req->r_old_dentry_dir = old_dir;
2017-01-31 23:28:26 +08:00
	req->r_parent = new_dir;
2021-06-19 01:05:06 +08:00
	ihold(new_dir);
2017-01-31 23:28:26 +08:00
	set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags);
2023-06-05 14:58:18 +08:00
	req->r_old_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_XATTR_EXCL;
2009-10-07 02:31:08 +08:00
	req->r_old_dentry_unless = CEPH_CAP_FILE_EXCL;
2023-06-05 14:58:18 +08:00
	req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_XATTR_EXCL;
2009-10-07 02:31:08 +08:00
	req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
	/* release LINK_RDCACHE on source inode (mds will lock it) */
2017-11-23 17:59:13 +08:00
	req->r_old_inode_drop = CEPH_CAP_LINK_SHARED | CEPH_CAP_LINK_EXCL;
2018-01-24 21:24:33 +08:00
	if (d_really_is_positive(new_dentry)) {
		req->r_inode_drop =
			ceph_drop_caps_for_unlink(d_inode(new_dentry));
	}
2009-10-07 02:31:08 +08:00
|
|
|
err = ceph_mdsc_do_request(mdsc, old_dir, req);
|
|
|
|
if (!err && !req->r_reply_info.head->is_dentry) {
|
|
|
|
/*
|
|
|
|
* Normally d_move() is done by fill_trace (called by
|
|
|
|
* do_request, above). If there is no trace, we need
|
|
|
|
* to do it here.
|
|
|
|
*/
|
|
|
|
d_move(old_dentry, new_dentry);
|
|
|
|
}
|
|
|
|
ceph_mdsc_put_request(req);
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2019-01-31 16:55:51 +08:00
|
|
|
/*
|
|
|
|
* Move dentry to tail of mdsc->dentry_leases list when lease is updated.
|
|
|
|
* Leases at front of the list will expire first. (Assume all leases have
|
|
|
|
* similar duration)
|
|
|
|
*
|
|
|
|
* Called under dentry->d_lock.
|
|
|
|
*/
|
|
|
|
void __ceph_dentry_lease_touch(struct ceph_dentry_info *di)
|
|
|
|
{
|
|
|
|
struct dentry *dn = di->dentry;
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(dn->d_sb)->mdsc;
|
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2019-01-31 16:55:51 +08:00
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %p '%pd'\n", di, dn, dn);
|
2019-01-31 16:55:51 +08:00
|
|
|
|
|
|
|
di->flags |= CEPH_DENTRY_LEASE_LIST;
|
|
|
|
if (di->flags & CEPH_DENTRY_SHRINK_LIST) {
|
|
|
|
di->flags |= CEPH_DENTRY_REFERENCED;
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
spin_lock(&mdsc->dentry_list_lock);
|
|
|
|
list_move_tail(&di->lease_list, &mdsc->dentry_leases);
|
|
|
|
spin_unlock(&mdsc->dentry_list_lock);
|
|
|
|
}
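The touch routine above keeps the lease list ordered by recency: a renewed lease moves to the tail, so the head always holds the entry that will expire first. A minimal userspace sketch of that ordering follows. It assumes a hand-rolled circular list in place of the kernel's `list_head`, and the names `struct node`, `lease_touch` and `oldest_id` are hypothetical, not part of the ceph code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head plus payload. */
struct node {
    struct node *prev, *next;
    int id;
};

/* An initialised node (or list head) points at itself. */
static void list_init(struct node *n)
{
    n->prev = n->next = n;
}

/* Unlink n from whatever list it is on and leave it self-linked. */
static void list_unlink(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Append n just before head, i.e. at the tail of the list. */
static void list_append(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* "Touch": same effect as list_move_tail() in the code above. */
static void lease_touch(struct node *n, struct node *head)
{
    list_unlink(n);
    list_append(n, head);
}

/* The entry at the front is the one that was touched least recently. */
static int oldest_id(const struct node *head)
{
    return head->next == head ? -1 : head->next->id;
}
```

Unlike the kernel routine, this sketch takes no locks; the original runs under `dentry->d_lock` and takes `mdsc->dentry_list_lock` around the move.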
|
|
|
|
|
|
|
|
static void __dentry_dir_lease_touch(struct ceph_mds_client *mdsc,
|
|
|
|
struct ceph_dentry_info *di)
|
|
|
|
{
|
|
|
|
di->flags &= ~(CEPH_DENTRY_LEASE_LIST | CEPH_DENTRY_REFERENCED);
|
|
|
|
di->lease_gen = 0;
|
|
|
|
di->time = jiffies;
|
|
|
|
list_move_tail(&di->lease_list, &mdsc->dentry_dir_leases);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* When dir lease is used, add dentry to tail of mdsc->dentry_dir_leases
|
|
|
|
* list if it's not in the list, otherwise set 'referenced' flag.
|
|
|
|
*
|
|
|
|
* Called under dentry->d_lock.
|
|
|
|
*/
|
|
|
|
void __ceph_dentry_dir_lease_touch(struct ceph_dentry_info *di)
|
|
|
|
{
|
|
|
|
struct dentry *dn = di->dentry;
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(dn->d_sb)->mdsc;
|
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2019-01-31 16:55:51 +08:00
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p %p '%pd' (offset 0x%llx)\n", di, dn, dn, di->offset);
|
2019-01-31 16:55:51 +08:00
|
|
|
|
|
|
|
if (!list_empty(&di->lease_list)) {
|
|
|
|
if (di->flags & CEPH_DENTRY_LEASE_LIST) {
|
|
|
|
/* don't remove dentry from dentry lease list
|
|
|
|
* if its lease is valid */
|
|
|
|
if (__dentry_lease_is_valid(di))
|
|
|
|
return;
|
|
|
|
} else {
|
|
|
|
di->flags |= CEPH_DENTRY_REFERENCED;
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (di->flags & CEPH_DENTRY_SHRINK_LIST) {
|
|
|
|
di->flags |= CEPH_DENTRY_REFERENCED;
|
|
|
|
di->flags &= ~CEPH_DENTRY_LEASE_LIST;
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
spin_lock(&mdsc->dentry_list_lock);
|
2024-07-09 14:44:00 +08:00
|
|
|
__dentry_dir_lease_touch(mdsc, di);
|
2019-01-31 16:55:51 +08:00
|
|
|
spin_unlock(&mdsc->dentry_list_lock);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void __dentry_lease_unlist(struct ceph_dentry_info *di)
|
|
|
|
{
|
|
|
|
struct ceph_mds_client *mdsc;
|
|
|
|
if (di->flags & CEPH_DENTRY_SHRINK_LIST)
|
|
|
|
return;
|
|
|
|
if (list_empty(&di->lease_list))
|
|
|
|
return;
|
|
|
|
|
2023-06-12 10:50:38 +08:00
|
|
|
mdsc = ceph_sb_to_fs_client(di->dentry->d_sb)->mdsc;
|
2019-01-31 16:55:51 +08:00
|
|
|
spin_lock(&mdsc->dentry_list_lock);
|
|
|
|
list_del_init(&di->lease_list);
|
|
|
|
spin_unlock(&mdsc->dentry_list_lock);
|
|
|
|
}
|
|
|
|
|
|
|
|
enum {
|
|
|
|
KEEP = 0,
|
|
|
|
DELETE = 1,
|
|
|
|
TOUCH = 2,
|
|
|
|
STOP = 4,
|
|
|
|
};
|
|
|
|
|
|
|
|
struct ceph_lease_walk_control {
|
|
|
|
bool dir_lease;
|
2019-02-01 14:57:15 +08:00
|
|
|
bool expire_dir_lease;
|
2019-01-31 16:55:51 +08:00
|
|
|
unsigned long nr_to_scan;
|
|
|
|
unsigned long dir_lease_ttl;
|
|
|
|
};
|
|
|
|
|
2023-12-20 13:29:25 +08:00
|
|
|
static int __dir_lease_check(const struct dentry *, struct ceph_lease_walk_control *);
|
|
|
|
static int __dentry_lease_check(const struct dentry *);
|
|
|
|
|
2019-01-31 16:55:51 +08:00
|
|
|
static unsigned long
|
|
|
|
__dentry_leases_walk(struct ceph_mds_client *mdsc,
|
2023-12-20 13:29:25 +08:00
|
|
|
struct ceph_lease_walk_control *lwc)
|
2019-01-31 16:55:51 +08:00
|
|
|
{
|
|
|
|
struct ceph_dentry_info *di, *tmp;
|
|
|
|
struct dentry *dentry, *last = NULL;
|
|
|
|
struct list_head *list;
|
|
|
|
LIST_HEAD(dispose);
|
|
|
|
unsigned long freed = 0;
|
|
|
|
int ret = 0;
|
|
|
|
|
|
|
|
list = lwc->dir_lease ? &mdsc->dentry_dir_leases : &mdsc->dentry_leases;
|
|
|
|
spin_lock(&mdsc->dentry_list_lock);
|
|
|
|
list_for_each_entry_safe(di, tmp, list, lease_list) {
|
|
|
|
if (!lwc->nr_to_scan)
|
|
|
|
break;
|
|
|
|
--lwc->nr_to_scan;
|
|
|
|
|
|
|
|
dentry = di->dentry;
|
|
|
|
if (last == dentry)
|
|
|
|
break;
|
|
|
|
|
|
|
|
if (!spin_trylock(&dentry->d_lock))
|
|
|
|
continue;
|
|
|
|
|
2019-06-28 10:25:23 +08:00
|
|
|
if (__lockref_is_dead(&dentry->d_lockref)) {
|
2019-01-31 16:55:51 +08:00
|
|
|
list_del_init(&di->lease_list);
|
|
|
|
goto next;
|
|
|
|
}
|
|
|
|
|
2023-12-20 13:29:25 +08:00
|
|
|
if (lwc->dir_lease)
|
|
|
|
ret = __dir_lease_check(dentry, lwc);
|
|
|
|
else
|
|
|
|
ret = __dentry_lease_check(dentry);
|
2019-01-31 16:55:51 +08:00
|
|
|
if (ret & TOUCH) {
|
|
|
|
/* move it into tail of dir lease list */
|
|
|
|
__dentry_dir_lease_touch(mdsc, di);
|
|
|
|
if (!last)
|
|
|
|
last = dentry;
|
|
|
|
}
|
|
|
|
if (ret & DELETE) {
|
|
|
|
/* stale lease */
|
|
|
|
di->flags &= ~CEPH_DENTRY_REFERENCED;
|
|
|
|
if (dentry->d_lockref.count > 0) {
|
|
|
|
/* update_dentry_lease() will re-add
|
|
|
|
* it to lease list, or
|
|
|
|
* ceph_d_delete() will return 1 when
|
|
|
|
* last reference is dropped */
|
|
|
|
list_del_init(&di->lease_list);
|
|
|
|
} else {
|
|
|
|
di->flags |= CEPH_DENTRY_SHRINK_LIST;
|
|
|
|
list_move_tail(&di->lease_list, &dispose);
|
|
|
|
dget_dlock(dentry);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
next:
|
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
if (ret & STOP)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
spin_unlock(&mdsc->dentry_list_lock);
|
|
|
|
|
|
|
|
while (!list_empty(&dispose)) {
|
|
|
|
di = list_first_entry(&dispose, struct ceph_dentry_info,
|
|
|
|
lease_list);
|
|
|
|
dentry = di->dentry;
|
|
|
|
spin_lock(&dentry->d_lock);
|
|
|
|
|
|
|
|
list_del_init(&di->lease_list);
|
|
|
|
di->flags &= ~CEPH_DENTRY_SHRINK_LIST;
|
|
|
|
if (di->flags & CEPH_DENTRY_REFERENCED) {
|
|
|
|
spin_lock(&mdsc->dentry_list_lock);
|
|
|
|
if (di->flags & CEPH_DENTRY_LEASE_LIST) {
|
|
|
|
list_add_tail(&di->lease_list,
|
|
|
|
&mdsc->dentry_leases);
|
|
|
|
} else {
|
|
|
|
__dentry_dir_lease_touch(mdsc, di);
|
|
|
|
}
|
|
|
|
spin_unlock(&mdsc->dentry_list_lock);
|
|
|
|
} else {
|
|
|
|
freed++;
|
|
|
|
}
|
|
|
|
|
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
/* ceph_d_delete() does the trick */
|
|
|
|
dput(dentry);
|
|
|
|
}
|
|
|
|
return freed;
|
|
|
|
}
|
|
|
|
|
2023-12-20 13:29:25 +08:00
|
|
|
static int __dentry_lease_check(const struct dentry *dentry)
|
2019-01-31 16:55:51 +08:00
|
|
|
{
|
|
|
|
struct ceph_dentry_info *di = ceph_dentry(dentry);
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (__dentry_lease_is_valid(di))
|
|
|
|
return STOP;
|
|
|
|
ret = __dir_lease_try_check(dentry);
|
|
|
|
if (ret == -EBUSY)
|
|
|
|
return KEEP;
|
|
|
|
if (ret > 0)
|
|
|
|
return TOUCH;
|
|
|
|
return DELETE;
|
|
|
|
}
|
|
|
|
|
2023-12-20 13:29:25 +08:00
|
|
|
static int __dir_lease_check(const struct dentry *dentry,
|
|
|
|
struct ceph_lease_walk_control *lwc)
|
2019-01-31 16:55:51 +08:00
|
|
|
{
|
|
|
|
struct ceph_dentry_info *di = ceph_dentry(dentry);
|
|
|
|
|
|
|
|
int ret = __dir_lease_try_check(dentry);
|
|
|
|
if (ret == -EBUSY)
|
|
|
|
return KEEP;
|
|
|
|
if (ret > 0) {
|
|
|
|
if (time_before(jiffies, di->time + lwc->dir_lease_ttl))
|
|
|
|
return STOP;
|
|
|
|
/* Move dentry to tail of dir lease list if we don't want
|
|
|
|
* to delete it. So dentries in the list are checked in a
|
|
|
|
* round robin manner */
|
2019-02-01 14:57:15 +08:00
|
|
|
if (!lwc->expire_dir_lease)
|
|
|
|
return TOUCH;
|
|
|
|
if (dentry->d_lockref.count > 0 ||
|
|
|
|
(di->flags & CEPH_DENTRY_REFERENCED))
|
|
|
|
return TOUCH;
|
|
|
|
/* invalidate dir lease */
|
|
|
|
di->lease_shared_gen = 0;
|
2019-01-31 16:55:51 +08:00
|
|
|
}
|
|
|
|
return DELETE;
|
|
|
|
}
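The decision made by `__dir_lease_check()` can be read as a small pure function returning one of the walk-control codes. The sketch below restates it in userspace C under some assumptions: `struct entry` and `dir_lease_decide` are hypothetical names, the `dir_lease_state` field stands in for the result of `__dir_lease_try_check()` (any negative value is treated as busy, where the original tests for `-EBUSY` specifically), and the side effect of clearing `lease_shared_gen` before returning `DELETE` is omitted.

```c
#include <stdbool.h>

/* Same values as the enum in the code above. */
enum { KEEP = 0, DELETE = 1, TOUCH = 2, STOP = 4 };

/* Hypothetical flattened view of the state the check consults. */
struct entry {
    int dir_lease_state;   /* <0: parent lock busy, 0: invalid, >0: valid */
    unsigned long stamp;   /* tick at which the entry was last touched */
    unsigned refcount;     /* stand-in for dentry->d_lockref.count */
    bool referenced;       /* stand-in for CEPH_DENTRY_REFERENCED */
};

static int dir_lease_decide(const struct entry *e, unsigned long now,
                            unsigned long ttl, bool expire)
{
    if (e->dir_lease_state < 0)
        return KEEP;                       /* could not check, leave in place */
    if (e->dir_lease_state > 0) {
        /* wraparound-safe "now is before stamp + ttl" */
        if ((long)(now - (e->stamp + ttl)) < 0)
            return STOP;                   /* everything behind it is newer */
        if (!expire)
            return TOUCH;                  /* rotate to the tail */
        if (e->refcount > 0 || e->referenced)
            return TOUCH;                  /* still in use, rotate */
    }
    return DELETE;                         /* invalid, or chosen for expiry */
}
```

Because the list is ordered by touch time, `STOP` lets the walker end the scan as soon as it meets an entry younger than the TTL.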
|
|
|
|
|
|
|
|
int ceph_trim_dentries(struct ceph_mds_client *mdsc)
|
|
|
|
{
|
|
|
|
struct ceph_lease_walk_control lwc;
|
2019-02-01 14:57:15 +08:00
|
|
|
unsigned long count;
|
2019-01-31 16:55:51 +08:00
|
|
|
unsigned long freed;
|
|
|
|
|
2019-02-01 14:57:15 +08:00
|
|
|
spin_lock(&mdsc->caps_list_lock);
|
|
|
|
if (mdsc->caps_use_max > 0 &&
|
|
|
|
mdsc->caps_use_count > mdsc->caps_use_max)
|
|
|
|
count = mdsc->caps_use_count - mdsc->caps_use_max;
|
|
|
|
else
|
|
|
|
count = 0;
|
|
|
|
spin_unlock(&mdsc->caps_list_lock);
|
|
|
|
|
2019-01-31 16:55:51 +08:00
|
|
|
lwc.dir_lease = false;
|
|
|
|
lwc.nr_to_scan = CEPH_CAPS_PER_RELEASE * 2;
|
2023-12-20 13:29:25 +08:00
|
|
|
freed = __dentry_leases_walk(mdsc, &lwc);
|
2019-01-31 16:55:51 +08:00
|
|
|
if (!lwc.nr_to_scan) /* more invalid leases */
|
|
|
|
return -EAGAIN;
|
|
|
|
|
|
|
|
if (lwc.nr_to_scan < CEPH_CAPS_PER_RELEASE)
|
|
|
|
lwc.nr_to_scan = CEPH_CAPS_PER_RELEASE;
|
|
|
|
|
|
|
|
lwc.dir_lease = true;
|
2019-02-01 14:57:15 +08:00
|
|
|
lwc.expire_dir_lease = freed < count;
|
|
|
|
lwc.dir_lease_ttl = mdsc->fsc->mount_options->caps_wanted_delay_max * HZ;
|
2023-12-20 13:29:25 +08:00
|
|
|
freed += __dentry_leases_walk(mdsc, &lwc);
|
2019-01-31 16:55:51 +08:00
|
|
|
if (!lwc.nr_to_scan) /* more to check */
|
|
|
|
return -EAGAIN;
|
|
|
|
|
|
|
|
return freed > 0 ? 1 : 0;
|
|
|
|
}
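The arithmetic that `ceph_trim_dentries()` uses to set up its second pass (over directory leases) can be isolated as below. This is only a sketch: `caps_excess`, `plan_second_pass` and `struct second_pass` are hypothetical names, and `per_release` stands in for `CEPH_CAPS_PER_RELEASE`. It does not model the early `-EAGAIN` return taken when the first pass exhausts its budget.

```c
#include <stdbool.h>

/* Hypothetical bundle of the two values the second walk depends on. */
struct second_pass {
    unsigned long budget;     /* entries the second walk may examine */
    bool expire_dir_lease;    /* whether valid dir leases may be dropped */
};

/* How far cap usage is over the configured maximum (0 if no maximum is set). */
static unsigned long caps_excess(unsigned long use_count, unsigned long use_max)
{
    return (use_max > 0 && use_count > use_max) ? use_count - use_max : 0;
}

static struct second_pass plan_second_pass(unsigned long leftover,
                                           unsigned long per_release,
                                           unsigned long freed,
                                           unsigned long excess)
{
    struct second_pass p;

    /* Guarantee the second walk at least one release batch of budget. */
    p.budget = leftover < per_release ? per_release : leftover;
    /* Only expire directory leases if the first pass freed too little. */
    p.expire_dir_lease = freed < excess;
    return p;
}
```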
|
|
|
|
|
2010-05-15 00:35:38 +08:00
|
|
|
/*
|
|
|
|
* Ensure a dentry lease will no longer revalidate.
|
|
|
|
*/
|
|
|
|
void ceph_invalidate_dentry_lease(struct dentry *dentry)
|
|
|
|
{
|
2019-01-31 16:55:51 +08:00
|
|
|
struct ceph_dentry_info *di = ceph_dentry(dentry);
|
2010-05-15 00:35:38 +08:00
|
|
|
spin_lock(&dentry->d_lock);
|
2019-01-31 16:55:51 +08:00
|
|
|
di->time = jiffies;
|
|
|
|
di->lease_shared_gen = 0;
|
2020-02-19 03:12:32 +08:00
|
|
|
di->flags &= ~CEPH_DENTRY_PRIMARY_LINK;
|
2019-01-31 16:55:51 +08:00
|
|
|
__dentry_lease_unlist(di);
|
2010-05-15 00:35:38 +08:00
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
}
|
2009-10-07 02:31:08 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Check if dentry lease is valid. If not, delete the lease. Try to
|
|
|
|
* renew if the lease is more than half up.
|
|
|
|
*/
|
2019-01-28 20:43:55 +08:00
|
|
|
static bool __dentry_lease_is_valid(struct ceph_dentry_info *di)
|
|
|
|
{
|
|
|
|
struct ceph_mds_session *session;
|
|
|
|
|
|
|
|
if (!di->lease_gen)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
session = di->lease_session;
|
|
|
|
if (session) {
|
|
|
|
u32 gen;
|
|
|
|
unsigned long ttl;
|
|
|
|
|
2021-06-05 00:03:09 +08:00
|
|
|
gen = atomic_read(&session->s_cap_gen);
|
2019-01-28 20:43:55 +08:00
|
|
|
ttl = session->s_cap_ttl;
|
|
|
|
|
|
|
|
if (di->lease_gen == gen &&
|
|
|
|
time_before(jiffies, ttl) &&
|
|
|
|
time_before(jiffies, di->time))
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
di->lease_gen = 0;
|
|
|
|
return false;
|
|
|
|
}
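The validity test above combines a generation match with two deadline checks, and the deadline checks rely on `time_before()` remaining correct when the tick counter wraps. A userspace sketch of the same shape follows, assuming a 32-bit tick counter and two's-complement conversion; `tick_before`, `struct lease`, `struct session` and `lease_is_valid` are hypothetical names rather than kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if tick a is earlier than tick b, even across counter wraparound. */
static bool tick_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

struct lease {
    uint32_t gen;       /* 0 means "no lease" */
    uint32_t expires;   /* tick at which this lease runs out */
};

struct session {
    uint32_t gen;       /* current generation of the session */
    uint32_t ttl;       /* tick at which the session's caps run out */
};

/*
 * Valid only if the lease exists, belongs to the session's current
 * generation, and neither deadline has passed. A lease that fails the
 * test is cleared so that later checks return early.
 */
static bool lease_is_valid(struct lease *l, const struct session *s,
                           uint32_t now)
{
    if (!l->gen)
        return false;
    if (s && l->gen == s->gen &&
        tick_before(now, s->ttl) &&
        tick_before(now, l->expires))
        return true;
    l->gen = 0;
    return false;
}
```

Subtracting first and then testing the sign is what makes the comparison wrap-safe: a deadline that has wrapped past zero still compares as later than a `now` just below the maximum.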
|
|
|
|
|
2019-05-23 10:45:24 +08:00
|
|
|
static int dentry_lease_is_valid(struct dentry *dentry, unsigned int flags)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
|
|
|
struct ceph_dentry_info *di;
|
|
|
|
struct ceph_mds_session *session = NULL;
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(dentry->d_sb)->mdsc;
|
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2009-10-07 02:31:08 +08:00
|
|
|
u32 seq = 0;
|
2019-01-28 20:43:55 +08:00
|
|
|
int valid = 0;
|
2009-10-07 02:31:08 +08:00
|
|
|
|
|
|
|
spin_lock(&dentry->d_lock);
|
|
|
|
di = ceph_dentry(dentry);
|
2019-01-28 20:43:55 +08:00
|
|
|
if (di && __dentry_lease_is_valid(di)) {
|
|
|
|
valid = 1;
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2019-01-28 20:43:55 +08:00
|
|
|
if (di->lease_renew_after &&
|
|
|
|
time_after(jiffies, di->lease_renew_after)) {
|
|
|
|
/*
|
|
|
|
* We should renew. If we're in RCU walk mode
|
|
|
|
* though, we can't do that so just return
|
|
|
|
* -ECHILD.
|
|
|
|
*/
|
|
|
|
if (flags & LOOKUP_RCU) {
|
|
|
|
valid = -ECHILD;
|
|
|
|
} else {
|
|
|
|
session = ceph_get_mds_session(di->lease_session);
|
|
|
|
seq = di->lease_seq;
|
|
|
|
di->lease_renew_after = 0;
|
|
|
|
di->lease_renew_from = jiffies;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
|
|
|
|
if (session) {
|
2019-05-23 10:45:24 +08:00
|
|
|
ceph_mdsc_lease_send_msg(session, dentry,
|
2009-10-07 02:31:08 +08:00
|
|
|
CEPH_MDS_LEASE_RENEW, seq);
|
|
|
|
ceph_put_mds_session(session);
|
|
|
|
}
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "dentry %p = %d\n", dentry, valid);
|
2009-10-07 02:31:08 +08:00
|
|
|
return valid;
|
|
|
|
}
|
|
|
|
|
2019-01-28 20:43:55 +08:00
|
|
|
/*
|
|
|
|
* Called under dentry->d_lock.
|
|
|
|
*/
|
|
|
|
static int __dir_lease_try_check(const struct dentry *dentry)
|
|
|
|
{
|
|
|
|
struct ceph_dentry_info *di = ceph_dentry(dentry);
|
|
|
|
struct inode *dir;
|
|
|
|
struct ceph_inode_info *ci;
|
|
|
|
int valid = 0;
|
|
|
|
|
|
|
|
if (!di->lease_shared_gen)
|
|
|
|
return 0;
|
|
|
|
if (IS_ROOT(dentry))
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
dir = d_inode(dentry->d_parent);
|
|
|
|
ci = ceph_inode(dir);
|
|
|
|
|
|
|
|
if (spin_trylock(&ci->i_ceph_lock)) {
|
|
|
|
if (atomic_read(&ci->i_shared_gen) == di->lease_shared_gen &&
|
|
|
|
__ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 0))
|
|
|
|
valid = 1;
|
|
|
|
spin_unlock(&ci->i_ceph_lock);
|
|
|
|
} else {
|
|
|
|
valid = -EBUSY;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!valid)
|
|
|
|
di->lease_shared_gen = 0;
|
|
|
|
return valid;
|
|
|
|
}
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
/*
|
|
|
|
* Check if directory-wide content lease/cap is valid.
|
|
|
|
*/
|
2020-03-05 20:21:00 +08:00
|
|
|
static int dir_lease_is_valid(struct inode *dir, struct dentry *dentry,
|
|
|
|
struct ceph_mds_client *mdsc)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
|
|
|
struct ceph_inode_info *ci = ceph_inode(dir);
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2019-05-22 17:26:27 +08:00
|
|
|
int valid;
|
|
|
|
int shared_gen;
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2011-12-01 01:47:09 +08:00
|
|
|
spin_lock(&ci->i_ceph_lock);
|
2019-05-22 17:26:27 +08:00
|
|
|
valid = __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1);
|
2020-03-05 20:21:00 +08:00
|
|
|
if (valid) {
|
|
|
|
__ceph_touch_fmode(ci, mdsc, CEPH_FILE_MODE_RD);
|
|
|
|
shared_gen = atomic_read(&ci->i_shared_gen);
|
|
|
|
}
|
2011-12-01 01:47:09 +08:00
|
|
|
spin_unlock(&ci->i_ceph_lock);
|
2019-05-22 17:26:27 +08:00
|
|
|
if (valid) {
|
|
|
|
struct ceph_dentry_info *di;
|
|
|
|
spin_lock(&dentry->d_lock);
|
|
|
|
di = ceph_dentry(dentry);
|
|
|
|
if (dir == d_inode(dentry->d_parent) &&
|
|
|
|
di && di->lease_shared_gen == shared_gen)
|
|
|
|
__ceph_dentry_dir_lease_touch(di);
|
|
|
|
else
|
|
|
|
valid = 0;
|
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
}
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "dir %p %llx.%llx v%u dentry %p '%pd' = %d\n", dir,
|
|
|
|
ceph_vinop(dir), (unsigned)atomic_read(&ci->i_shared_gen),
|
|
|
|
dentry, dentry, valid);
|
2009-10-07 02:31:08 +08:00
|
|
|
return valid;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Check if cached dentry can be trusted.
|
|
|
|
*/
|
2012-06-11 04:03:43 +08:00
|
|
|
static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_fs_client(dentry->d_sb)->mdsc;
|
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2011-07-27 02:30:43 +08:00
|
|
|
int valid = 0;
|
2016-03-16 16:40:23 +08:00
|
|
|
struct dentry *parent;
|
2019-10-29 21:50:19 +08:00
|
|
|
struct inode *dir, *inode;
|
2011-01-07 14:49:57 +08:00
|
|
|
|
2020-08-08 03:47:17 +08:00
|
|
|
valid = fscrypt_d_revalidate(dentry, flags);
|
|
|
|
if (valid <= 0)
|
|
|
|
return valid;
|
|
|
|
|
2016-07-01 21:39:21 +08:00
|
|
|
if (flags & LOOKUP_RCU) {
|
2016-12-26 17:26:34 +08:00
|
|
|
parent = READ_ONCE(dentry->d_parent);
|
2016-07-01 21:39:21 +08:00
|
|
|
dir = d_inode_rcu(parent);
|
|
|
|
if (!dir)
|
|
|
|
return -ECHILD;
|
2019-10-29 21:50:19 +08:00
|
|
|
inode = d_inode_rcu(dentry);
|
2016-07-01 21:39:21 +08:00
|
|
|
} else {
|
|
|
|
parent = dget_parent(dentry);
|
|
|
|
dir = d_inode(parent);
|
2019-10-29 21:50:19 +08:00
|
|
|
inode = d_inode(dentry);
|
2016-07-01 21:39:21 +08:00
|
|
|
}
|
2011-01-07 14:49:57 +08:00
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p '%pd' inode %p offset 0x%llx nokey %d\n",
|
|
|
|
dentry, dentry, inode, ceph_dentry(dentry)->offset,
|
|
|
|
!!(dentry->d_flags & DCACHE_NOKEY_NAME));
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2023-06-12 10:50:38 +08:00
|
|
|
mdsc = ceph_sb_to_fs_client(dir->i_sb)->mdsc;
|
2020-03-05 20:21:00 +08:00
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
/* always trust cached snapped dentries, snapdir dentry */
|
|
|
|
if (ceph_snap(dir) != CEPH_NOSNAP) {
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p '%pd' inode %p is SNAPPED\n", dentry,
|
|
|
|
dentry, inode);
|
2011-07-27 02:30:43 +08:00
|
|
|
valid = 1;
|
2019-10-29 21:50:19 +08:00
|
|
|
} else if (inode && ceph_snap(inode) == CEPH_SNAPDIR) {
|
2011-07-27 02:30:43 +08:00
|
|
|
valid = 1;
|
2016-07-01 21:39:20 +08:00
|
|
|
} else {
|
2019-05-23 10:45:24 +08:00
|
|
|
valid = dentry_lease_is_valid(dentry, flags);
|
2016-07-01 21:39:20 +08:00
|
|
|
if (valid == -ECHILD)
|
|
|
|
return valid;
|
2020-03-05 20:21:00 +08:00
|
|
|
if (valid || dir_lease_is_valid(dir, dentry, mdsc)) {
|
2019-10-29 21:50:19 +08:00
|
|
|
if (inode)
|
|
|
|
valid = ceph_is_any_caps(inode);
|
2016-07-01 21:39:20 +08:00
|
|
|
else
|
|
|
|
valid = 1;
|
|
|
|
}
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
2016-03-17 14:41:59 +08:00
|
|
|
if (!valid) {
|
|
|
|
struct ceph_mds_request *req;
|
2017-01-13 03:42:38 +08:00
|
|
|
int op, err;
|
|
|
|
u32 mask;
|
2016-03-17 14:41:59 +08:00
|
|
|
|
2016-07-01 21:39:21 +08:00
|
|
|
if (flags & LOOKUP_RCU)
|
|
|
|
return -ECHILD;
|
|
|
|
|
2020-03-20 11:44:59 +08:00
|
|
|
percpu_counter_inc(&mdsc->metric.d_lease_mis);
|
|
|
|
|
2016-03-17 14:41:59 +08:00
|
|
|
op = ceph_snap(dir) == CEPH_SNAPDIR ?
|
2017-01-30 22:47:25 +08:00
|
|
|
CEPH_MDS_OP_LOOKUPSNAP : CEPH_MDS_OP_LOOKUP;
|
2016-03-17 14:41:59 +08:00
|
|
|
req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
|
|
|
|
if (!IS_ERR(req)) {
|
|
|
|
req->r_dentry = dget(dentry);
|
2017-01-30 22:47:25 +08:00
|
|
|
req->r_num_caps = 2;
|
|
|
|
req->r_parent = dir;
|
2021-06-19 01:05:06 +08:00
|
|
|
ihold(dir);
|
2016-03-17 14:41:59 +08:00
|
|
|
|
|
|
|
mask = CEPH_STAT_CAP_INODE | CEPH_CAP_AUTH_SHARED;
|
|
|
|
if (ceph_security_xattr_wanted(dir))
|
|
|
|
mask |= CEPH_CAP_XATTR_SHARED;
|
2017-01-13 03:42:38 +08:00
|
|
|
req->r_args.getattr.mask = cpu_to_le32(mask);
|
2016-03-17 14:41:59 +08:00
|
|
|
|
|
|
|
err = ceph_mdsc_do_request(mdsc, NULL, req);
|
2016-12-01 04:56:46 +08:00
|
|
|
switch (err) {
|
|
|
|
case 0:
|
|
|
|
if (d_really_is_positive(dentry) &&
|
|
|
|
d_inode(dentry) == req->r_target_inode)
|
|
|
|
valid = 1;
|
|
|
|
break;
|
|
|
|
case -ENOENT:
|
|
|
|
if (d_really_is_negative(dentry))
|
|
|
|
valid = 1;
|
2020-08-24 06:36:59 +08:00
|
|
|
fallthrough;
|
2016-12-01 04:56:46 +08:00
|
|
|
default:
|
|
|
|
break;
|
2016-03-17 14:41:59 +08:00
|
|
|
}
|
|
|
|
ceph_mdsc_put_request(req);
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p '%pd', lookup result=%d\n", dentry,
|
|
|
|
dentry, err);
|
2016-03-17 14:41:59 +08:00
|
|
|
}
|
2020-03-20 11:44:59 +08:00
|
|
|
} else {
|
|
|
|
percpu_counter_inc(&mdsc->metric.d_lease_hit);
|
2016-03-17 14:41:59 +08:00
|
|
|
}
|
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "%p '%pd' %s\n", dentry, dentry, valid ? "valid" : "invalid");
|
2019-01-31 16:55:51 +08:00
|
|
|
if (!valid)
|
2013-11-30 12:47:41 +08:00
|
|
|
ceph_dir_clear_complete(dir);
|
2016-03-16 16:40:23 +08:00
|
|
|
|
2016-07-01 21:39:21 +08:00
|
|
|
if (!(flags & LOOKUP_RCU))
|
|
|
|
dput(parent);
|
2011-07-27 02:30:43 +08:00
|
|
|
return valid;
|
2009-10-07 02:31:08 +08:00
|
|
|
}
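The decision order in ceph_d_revalidate() above can be modeled in user space roughly as follows. This is a minimal sketch, not the kernel code: struct reval_input, reval_decide() and REVAL_ASK_MDS are hypothetical names, each boolean stands in for the kernel helper named in its comment, and the case where dentry_lease_is_valid() itself returns -ECHILD is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the state the kernel code consults. */
struct reval_input {
	bool rcu_walk;         /* models flags & LOOKUP_RCU */
	bool dir_is_snapped;   /* models ceph_snap(dir) != CEPH_NOSNAP */
	bool inode_is_snapdir; /* models inode && ceph_snap(inode) == CEPH_SNAPDIR */
	bool has_inode;        /* positive dentry */
	bool dentry_lease_ok;  /* models dentry_lease_is_valid() > 0 */
	bool dir_lease_ok;     /* models dir_lease_is_valid() */
	bool inode_has_caps;   /* models ceph_is_any_caps(inode) */
};

/* Hypothetical result code: the caller would have to issue a LOOKUP to the MDS. */
enum { REVAL_ASK_MDS = 2 };

/*
 * Returns 1 if the cached dentry can be trusted, -ECHILD if the caller
 * must retry outside RCU-walk, or REVAL_ASK_MDS if a lookup is needed.
 */
static int reval_decide(const struct reval_input *in)
{
	int valid = 0;

	assert(in != NULL);

	if (in->dir_is_snapped || in->inode_is_snapdir) {
		/* snapped dentries and the snapdir are always trusted */
		valid = 1;
	} else if (in->dentry_lease_ok || in->dir_lease_ok) {
		/* a lease on a positive dentry only helps if the inode still has caps */
		valid = in->has_inode ? in->inode_has_caps : 1;
	}

	if (valid)
		return 1;
	if (in->rcu_walk)
		return -ECHILD; /* cannot block on the MDS in RCU-walk mode */
	return REVAL_ASK_MDS;
}
```

A caller would fill in a struct reval_input and branch on the result; in the kernel function the REVAL_ASK_MDS case corresponds to building a CEPH_MDS_OP_LOOKUP (or LOOKUPSNAP) request and comparing the reply with the cached dentry.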
|
|
|
|
|
2019-01-28 20:43:55 +08:00
|
|
|
/*
|
|
|
|
* Delete unused dentry that doesn't have valid lease
|
|
|
|
*
|
|
|
|
* Called under dentry->d_lock.
|
|
|
|
*/
|
|
|
|
static int ceph_d_delete(const struct dentry *dentry)
|
|
|
|
{
|
|
|
|
struct ceph_dentry_info *di;
|
|
|
|
|
|
|
|
/* won't release caps */
|
|
|
|
if (d_really_is_negative(dentry))
|
|
|
|
return 0;
|
|
|
|
if (ceph_snap(d_inode(dentry)) != CEPH_NOSNAP)
|
|
|
|
return 0;
|
|
|
|
/* valid lease? */
|
|
|
|
di = ceph_dentry(dentry);
|
|
|
|
if (di) {
|
|
|
|
if (__dentry_lease_is_valid(di))
|
|
|
|
return 0;
|
|
|
|
if (__dir_lease_try_check(dentry))
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
return 1;
|
|
|
|
}
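The checks in ceph_d_delete() above reduce to a small predicate. The following is a hedged user-space sketch under stated assumptions: struct dentry_model and should_delete() are hypothetical, and the two lease flags stand in for __dentry_lease_is_valid() and __dir_lease_try_check().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the dentry state examined by the delete hook. */
struct dentry_model {
	bool negative;        /* models d_really_is_negative(dentry) */
	bool snapped;         /* models ceph_snap(d_inode(dentry)) != CEPH_NOSNAP */
	bool has_info;        /* models ceph_dentry(dentry) != NULL */
	bool dentry_lease_ok; /* models __dentry_lease_is_valid(di) */
	bool dir_lease_ok;    /* models __dir_lease_try_check(dentry) */
};

/* Returns 1 if the unused dentry should be dropped, 0 if it should stay cached. */
static int should_delete(const struct dentry_model *d)
{
	assert(d != NULL);

	/* negative and snapshot dentries are kept: dropping them releases no caps */
	if (d->negative || d->snapped)
		return 0;
	/* a still-valid dentry or directory lease keeps the dentry cached */
	if (d->has_info && (d->dentry_lease_ok || d->dir_lease_ok))
		return 0;
	return 1;
}
```

The lease flags are only consulted when per-dentry info exists, mirroring the `if (di)` guard in the function above.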
|
|
|
|
|
2009-10-07 02:31:08 +08:00
|
|
|
/*
|
2011-03-16 05:57:41 +08:00
|
|
|
* Release our ceph_dentry_info.
|
2009-10-07 02:31:08 +08:00
|
|
|
*/
|
2011-03-16 05:57:41 +08:00
|
|
|
static void ceph_d_release(struct dentry *dentry)
|
2009-10-07 02:31:08 +08:00
|
|
|
{
|
|
|
|
struct ceph_dentry_info *di = ceph_dentry(dentry);
|
2023-06-12 10:50:38 +08:00
|
|
|
struct ceph_fs_client *fsc = ceph_sb_to_fs_client(dentry->d_sb);
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(fsc->client, "dentry %p '%pd'\n", dentry, dentry);
|
2016-07-01 21:39:20 +08:00
|
|
|
|
2020-03-20 11:44:59 +08:00
|
|
|
atomic64_dec(&fsc->mdsc->metric.total_dentries);
|
|
|
|
|
2016-07-01 21:39:20 +08:00
|
|
|
spin_lock(&dentry->d_lock);
|
2019-01-31 16:55:51 +08:00
|
|
|
__dentry_lease_unlist(di);
|
2016-07-01 21:39:20 +08:00
|
|
|
dentry->d_fsdata = NULL;
|
|
|
|
spin_unlock(&dentry->d_lock);
|
|
|
|
|
2021-06-10 02:09:52 +08:00
|
|
|
ceph_put_mds_session(di->lease_session);
|
2011-11-12 01:48:53 +08:00
|
|
|
kmem_cache_free(ceph_dentry_cachep, di);
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
2011-03-16 06:53:40 +08:00
|
|
|
/*
|
|
|
|
* When the VFS prunes a dentry from the cache, we need to clear the
|
|
|
|
* complete flag on the parent directory.
|
|
|
|
*
|
|
|
|
* Called under dentry->d_lock.
|
|
|
|
*/
|
|
|
|
static void ceph_d_prune(struct dentry *dentry)
|
|
|
|
{
|
2023-06-12 09:04:07 +08:00
|
|
|
struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dentry->d_sb);
|
|
|
|
struct ceph_client *cl = mdsc->fsc->client;
|
2017-11-27 11:23:48 +08:00
|
|
|
struct ceph_inode_info *dir_ci;
|
|
|
|
struct ceph_dentry_info *di;
|
|
|
|
|
2023-06-12 09:04:07 +08:00
|
|
|
doutc(cl, "dentry %p '%pd'\n", dentry, dentry);
|
2011-03-16 06:53:40 +08:00
|
|
|
|
|
|
|
/* do we have a valid parent? */
|
2012-06-08 04:43:35 +08:00
|
|
|
if (IS_ROOT(dentry))
|
2011-03-16 06:53:40 +08:00
|
|
|
return;
|
|
|
|
|
2017-11-27 11:23:48 +08:00
|
|
|
/* we hold d_lock, so d_parent is stable */
|
|
|
|
dir_ci = ceph_inode(d_inode(dentry->d_parent));
|
|
|
|
if (dir_ci->i_vino.snap == CEPH_SNAPDIR)
|
2011-03-16 06:53:40 +08:00
|
|
|
return;
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2017-11-27 11:23:48 +08:00
|
|
|
/* whoever calls d_delete() should also disable dcache readdir */
|
|
|
|
if (d_really_is_negative(dentry))
|
2016-10-29 09:52:50 +08:00
|
|
|
return;
|
|
|
|
|
2017-11-27 11:23:48 +08:00
|
|
|
/* d_fsdata does not get cleared until d_release */
|
|
|
|
if (!d_unhashed(dentry)) {
|
|
|
|
__ceph_dir_clear_complete(dir_ci);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Disable dcache readdir in case someone called d_drop()
|
|
|
|
* or d_invalidate(), but MDS didn't revoke CEPH_CAP_FILE_SHARED
|
|
|
|
* properly (dcache readdir is still enabled) */
|
|
|
|
di = ceph_dentry(dentry);
|
|
|
|
if (di->offset > 0 &&
|
|
|
|
di->lease_shared_gen == atomic_read(&dir_ci->i_shared_gen))
|
|
|
|
__ceph_dir_clear_ordered(dir_ci);
|
2011-03-16 06:53:40 +08:00
|
|
|
}
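The effect of ceph_d_prune() above on the parent directory can be illustrated with a simplified model. This is only a sketch: it assumes the "complete" and "ordered" states are two flag bits (DIR_COMPLETE and DIR_ORDERED are hypothetical, and the real __ceph_dir_clear_complete() and __ceph_dir_clear_ordered() helpers may track this state differently), and struct dir_model, struct pruned_dentry and prune_model() are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for the directory's cached-readdir state. */
#define DIR_COMPLETE 0x1u
#define DIR_ORDERED  0x2u

struct dir_model {
	unsigned int flags; /* DIR_COMPLETE | DIR_ORDERED */
	long shared_gen;    /* models dir_ci->i_shared_gen */
};

struct pruned_dentry {
	bool is_root;           /* models IS_ROOT(dentry) */
	bool parent_is_snapdir; /* models dir_ci->i_vino.snap == CEPH_SNAPDIR */
	bool negative;          /* models d_really_is_negative(dentry) */
	bool unhashed;          /* models d_unhashed(dentry) */
	long long offset;       /* models di->offset */
	long lease_shared_gen;  /* models di->lease_shared_gen */
};

static void prune_model(struct dir_model *dir, const struct pruned_dentry *d)
{
	assert(dir != NULL && d != NULL);

	/* nothing to do without a regular parent or for negative dentries */
	if (d->is_root || d->parent_is_snapdir || d->negative)
		return;

	/* a hashed dentry leaving the cache means the dir is no longer complete */
	if (!d->unhashed) {
		dir->flags &= ~DIR_COMPLETE;
		return;
	}

	/* an already-dropped dentry from the current 'gen' breaks the ordering */
	if (d->offset > 0 && d->lease_shared_gen == dir->shared_gen)
		dir->flags &= ~DIR_ORDERED;
}
```

The generation comparison is what ties a dentry to the currently leased directory contents: a dentry whose lease_shared_gen is stale leaves the directory state untouched.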
|
2009-10-07 02:31:08 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* read() on a dir. This weird interface hack only works if mounted
|
|
|
|
* with '-o dirstat'.
|
|
|
|
*/
|
|
|
|
static ssize_t ceph_read_dir(struct file *file, char __user *buf, size_t size,
|
|
|
|
loff_t *ppos)
|
|
|
|
{
|
2018-03-13 10:42:44 +08:00
|
|
|
struct ceph_dir_file_info *dfi = file->private_data;
|
2013-01-24 06:07:38 +08:00
|
|
|
struct inode *inode = file_inode(file);
|
2009-10-07 02:31:08 +08:00
|
|
|
struct ceph_inode_info *ci = ceph_inode(inode);
|
|
|
|
int left;
|
2011-05-13 05:28:05 +08:00
|
|
|
const int bufsize = 1024;
|
2009-10-07 02:31:08 +08:00
|
|
|
|
2023-06-12 10:50:38 +08:00
|
|
|
if (!ceph_test_mount_opt(ceph_sb_to_fs_client(inode->i_sb), DIRSTAT))
|
2009-10-07 02:31:08 +08:00
|
|
|
return -EISDIR;
|
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
if (!dfi->dir_info) {
|
|
|
|
dfi->dir_info = kmalloc(bufsize, GFP_KERNEL);
|
|
|
|
if (!dfi->dir_info)
|
2009-10-07 02:31:08 +08:00
|
|
|
return -ENOMEM;
|
2018-03-13 10:42:44 +08:00
|
|
|
dfi->dir_info_len =
|
|
|
|
snprintf(dfi->dir_info, bufsize,
|
2009-10-07 02:31:08 +08:00
|
|
|
"entries: %20lld\n"
|
|
|
|
" files: %20lld\n"
|
|
|
|
" subdirs: %20lld\n"
|
|
|
|
"rentries: %20lld\n"
|
|
|
|
" rfiles: %20lld\n"
|
|
|
|
" rsubdirs: %20lld\n"
|
|
|
|
"rbytes: %20lld\n"
|
2018-07-14 04:18:36 +08:00
|
|
|
"rctime: %10lld.%09ld\n",
|
2009-10-07 02:31:08 +08:00
|
|
|
ci->i_files + ci->i_subdirs,
|
|
|
|
ci->i_files,
|
|
|
|
ci->i_subdirs,
|
|
|
|
ci->i_rfiles + ci->i_rsubdirs,
|
|
|
|
ci->i_rfiles,
|
|
|
|
ci->i_rsubdirs,
|
|
|
|
ci->i_rbytes,
|
2018-07-14 04:18:36 +08:00
|
|
|
ci->i_rctime.tv_sec,
|
|
|
|
ci->i_rctime.tv_nsec);
|
2009-10-07 02:31:08 +08:00
|
|
|
}
|
|
|
|
|
2018-03-13 10:42:44 +08:00
|
|
|
if (*ppos >= dfi->dir_info_len)
|
2009-10-07 02:31:08 +08:00
		return 0;
2018-03-13 10:42:44 +08:00
	size = min_t(unsigned, size, dfi->dir_info_len-*ppos);
	left = copy_to_user(buf, dfi->dir_info + *ppos, size);
2009-10-07 02:31:08 +08:00
	if (left == size)
		return -EFAULT;
	*ppos += (size - left);
	return size - left;
}
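The read path above clamps the request to the bytes that remain, fails only when nothing at all could be copied, and otherwise advances the position by however much was copied. A minimal userspace sketch of that bookkeeping follows. It is not kernel code: `model_copy_out` and `model_read` are hypothetical names, and `model_copy_out` only imitates the convention that `copy_to_user()` returns the number of bytes it could not copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_to_user(): copies at most `writable`
 * bytes and returns the number of bytes that were NOT copied.
 */
static size_t model_copy_out(char *dst, const char *src, size_t n,
			     size_t writable)
{
	size_t done = n < writable ? n : writable;

	memcpy(dst, src, done);
	return n - done;
}

/*
 * Same position bookkeeping as the listing: clamp to the remaining
 * bytes, report -EFAULT only if no byte was copied, otherwise advance
 * *ppos by the number of bytes actually copied and return that count.
 */
static long model_read(char *buf, size_t size, size_t writable,
		       const char *info, size_t info_len, size_t *ppos)
{
	size_t left;

	if (*ppos >= info_len)
		return 0;			/* nothing left to read */
	if (size > info_len - *ppos)
		size = info_len - *ppos;	/* clamp to remaining bytes */
	left = model_copy_out(buf, info + *ppos, size, writable);
	assert(left <= size);
	if (left == size)
		return -EFAULT;			/* no progress at all */
	*ppos += size - left;
	return (long)(size - left);
}
```

A caller that receives a short count simply calls again with the updated position, which is why a partial copy is reported as success rather than as an error.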
2010-11-17 03:14:34 +08:00
/*
 * Return name hash for a given dentry.  This is dependent on
 * the parent directory's hash function.
 */
2011-07-27 02:30:55 +08:00
unsigned ceph_dentry_hash(struct inode *dir, struct dentry *dn)
2010-11-17 03:14:34 +08:00
{
	struct ceph_inode_info *dci = ceph_inode(dir);
2019-04-18 00:58:28 +08:00
	unsigned hash;
2010-11-17 03:14:34 +08:00
	switch (dci->i_dir_layout.dl_dir_hash) {
	case 0:	/* for backward compat */
	case CEPH_STR_HASH_LINUX:
		return dn->d_name.hash;

	default:
2019-04-18 00:58:28 +08:00
		spin_lock(&dn->d_lock);
		hash = ceph_str_hash(dci->i_dir_layout.dl_dir_hash,
2010-11-17 03:14:34 +08:00
				     dn->d_name.name, dn->d_name.len);
2019-04-18 00:58:28 +08:00
		spin_unlock(&dn->d_lock);
		return hash;
2010-11-17 03:14:34 +08:00
	}
}
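The dispatch in ceph_dentry_hash() can be modelled outside the kernel: for the legacy value 0 and for the Linux hash type the hash already cached with the name is reused, and any other type recomputes a hash from the name bytes. The sketch below is only an illustration under stated assumptions. Every `model_*` and `MODEL_*` name is hypothetical, the constants are placeholders, and FNV-1a merely stands in for whatever function ceph_str_hash() would select. It also leaves out the d_lock that the listing takes around the name access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash-type tags; 0 is treated as the legacy layout. */
enum { MODEL_HASH_LINUX = 1, MODEL_HASH_OTHER = 2 };

/* Hypothetical stand-in for a dentry name with its cached hash. */
struct model_name {
	const char *name;
	size_t len;
	uint32_t cached_hash;
};

/* FNV-1a, used here only as a placeholder for a non-default hash. */
static uint32_t model_fnv1a(const char *s, size_t len)
{
	uint32_t h = 2166136261u;

	while (len--) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Pick the hash according to the parent directory's hash type. */
static uint32_t model_dentry_hash(unsigned dir_hash_type,
				  const struct model_name *n)
{
	assert(n != NULL);
	switch (dir_hash_type) {
	case 0:			/* legacy layout: same as the default */
	case MODEL_HASH_LINUX:
		return n->cached_hash;	/* reuse the cached value */
	default:
		return model_fnv1a(n->name, n->len);
	}
}
```

Reusing the cached value in the default case avoids touching the name bytes at all, which is presumably why only the recomputing branch in the listing needs the lock.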
vfs: get rid of old '->iterate' directory operation
All users now just use '->iterate_shared()', which only takes the
directory inode lock for reading.
Filesystems that never got converted to shared mode now instead use a
wrapper that drops the lock, re-takes it in write mode, calls the old
function, and then downgrades the lock back to read mode.
This way the VFS layer and other callers no longer need to care about
filesystems that never got converted to the modern era.
The filesystems that use the new wrapper are ceph, coda, exfat, jfs,
ntfs, ocfs2, overlayfs, and vboxsf.
Honestly, several of them look like they really could just iterate their
directories in shared mode and skip the wrapper entirely, but the point
of this change is to not change semantics or fix filesystems that
haven't been fixed in the last 7+ years, but to finally get rid of the
dual iterators.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06 03:25:01 +08:00
WRAP_DIR_ITER(ceph_readdir) // FIXME!
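The commit message above describes the wrapper as dropping the read lock, retaking the lock in write mode, calling the old iterator, and downgrading back to read mode. The lock transitions can be traced with a small userspace model. This is a sketch only: it tracks a lock state in a plain enum instead of using a real rw-semaphore, and every `model_*` and `MODEL_*` name is hypothetical. The only detail carried over from the listing is that the generated function name is the iterator's name with a `shared_` prefix.

```c
#include <assert.h>

/* Hypothetical lock states for one directory inode. */
enum model_lock { MODEL_UNLOCKED, MODEL_READ, MODEL_WRITE };

struct model_dir {
	enum model_lock lock;		/* current lock state */
	enum model_lock seen_in_iter;	/* state observed by the iterator */
};

typedef int (*model_iter_fn)(struct model_dir *dir);

/*
 * Entered with the lock held for reading, as a shared iterator would
 * be. Runs the legacy iterator under the write lock and returns with
 * the lock held for reading again.
 */
static int model_wrap_iter(struct model_dir *dir, model_iter_fn iter)
{
	int ret;

	assert(dir->lock == MODEL_READ);
	dir->lock = MODEL_UNLOCKED;	/* drop the read lock */
	dir->lock = MODEL_WRITE;	/* re-take it in write mode */
	ret = iter(dir);		/* call the old function */
	dir->lock = MODEL_READ;		/* downgrade back to read mode */
	return ret;
}

/* Generate shared_<name>() that forwards to the wrapper. */
#define MODEL_WRAP_DIR_ITER(x) \
	static int shared_##x(struct model_dir *dir) \
	{ return model_wrap_iter(dir, x); }

/* A legacy iterator that records which lock mode it ran under. */
static int model_readdir(struct model_dir *dir)
{
	dir->seen_in_iter = dir->lock;
	return 0;
}

MODEL_WRAP_DIR_ITER(model_readdir)
```

Between dropping the read lock and taking the write lock the directory is briefly unlocked, so a real wrapper has to tolerate changes made in that window; the model does not attempt to show that.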
2009-10-07 02:31:08 +08:00
const struct file_operations ceph_dir_fops = {
	.read = ceph_read_dir,
2023-08-06 03:25:01 +08:00
	.iterate_shared = shared_ceph_readdir,
2009-10-07 02:31:08 +08:00
	.llseek = ceph_dir_llseek,
	.open = ceph_open,
	.release = ceph_release,
	.unlocked_ioctl = ceph_ioctl,
2018-09-12 02:47:23 +08:00
	.compat_ioctl = compat_ptr_ioctl,
2015-05-27 11:19:34 +08:00
	.fsync = ceph_fsync,
2018-05-15 11:30:43 +08:00
	.lock = ceph_lock,
	.flock = ceph_flock,
2009-10-07 02:31:08 +08:00
};
2015-01-14 13:46:04 +08:00
const struct file_operations ceph_snapdir_fops = {
2023-08-06 03:25:01 +08:00
	.iterate_shared = shared_ceph_readdir,
2015-01-14 13:46:04 +08:00
	.llseek = ceph_dir_llseek,
	.open = ceph_open,
	.release = ceph_release,
};
2009-10-07 02:31:08 +08:00
const struct inode_operations ceph_dir_iops = {
	.lookup = ceph_lookup,
	.permission = ceph_permission,
	.getattr = ceph_getattr,
	.setattr = ceph_setattr,
	.listxattr = ceph_listxattr,
2022-09-22 23:17:00 +08:00
	.get_inode_acl = ceph_get_acl,
2014-01-29 22:22:25 +08:00
	.set_acl = ceph_set_acl,
2009-10-07 02:31:08 +08:00
	.mknod = ceph_mknod,
	.symlink = ceph_symlink,
	.mkdir = ceph_mkdir,
	.link = ceph_link,
	.unlink = ceph_unlink,
	.rmdir = ceph_unlink,
	.rename = ceph_rename,
	.create = ceph_create,
2012-06-05 21:10:25 +08:00
	.atomic_open = ceph_atomic_open,
2009-10-07 02:31:08 +08:00
};
2015-01-14 13:46:04 +08:00
const struct inode_operations ceph_snapdir_iops = {
	.lookup = ceph_lookup,
	.permission = ceph_permission,
	.getattr = ceph_getattr,
	.mkdir = ceph_mkdir,
	.rmdir = ceph_unlink,
2015-04-07 15:36:32 +08:00
	.rename = ceph_rename,
2015-01-14 13:46:04 +08:00
};
2010-08-04 01:25:30 +08:00
const struct dentry_operations ceph_dentry_ops = {
2009-10-07 02:31:08 +08:00
	.d_revalidate = ceph_d_revalidate,
2019-01-28 20:43:55 +08:00
	.d_delete = ceph_d_delete,
2011-03-16 05:57:41 +08:00
	.d_release = ceph_d_release,
2011-03-16 06:53:40 +08:00
	.d_prune = ceph_d_prune,
2016-10-29 10:05:13 +08:00
	.d_init = ceph_d_init,
2009-10-07 02:31:08 +08:00
};