linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-16 16:54:20 +08:00

Author	SHA1	Message	Date
Alex Elder	9f5dffdc8f	rbd: make rbd_dev_destroy() match rbd_dev_create() Currently, rbd_dev_destroy() does more than just the inverse of what rbd_dev_create() does. Stop doing that, and move the two extra things it does into the three call sites. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:42 -07:00
Alex Elder	468521c1b1	rbd: define rbd snap context routines Encapsulate the creation of a snapshot context for rbd in a new function rbd_snap_context_create(). Define rbd wrappers for getting and dropping references to them once they're created. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:41 -07:00
Alex Elder	c0cd10db46	rbd: use rbd_warn(), not WARN_ON() Change some calls to WARN_ON() so they use rbd_warn() instead, so we get consistent messaging. A few remain but they can probably just go away eventually. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:40 -07:00
Alex Elder	500d0c0fbb	rbd: move stripe_unit and stripe_count into header This commit added fetching if fancy striping parameters: 09186ddb rbd: get and check striping parameters They are almost unused, but the two fields storing the information really belonged in the rbd_image_header structure. This patch moves them there. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:39 -07:00
Alex Elder	ecb4dc2256	rbd: make rbd spec names pointer to const Make the names and image id in an rbd_spec be pointers to constant data. This required the use of a local variable to hold the snapshot name in rbd_add_parse_args() to avoid a warning. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:37 -07:00
Alex Elder	e1d4213f09	rbd: set snapshot id in rbd_dev_probe_update_spec() Set the rbd spec's snapshot id for an image getting mapped in rbd_dev_probe_update_spec() rather than rbd_dev_set_mapping(). This is the more logical place for that to happen (even though it means we might look up the snapshot by name twice). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:36 -07:00
Alex Elder	8b0241f85a	rbd: have snap_by_name() return a snapshot A function called snap_by_name() ought to just look up a snapshot by name. It does that, but then it assigns some stuff to the rbd device structure as well. Change the function to do just the lookup, and have the caller do the assignments that follow. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:35 -07:00
Alex Elder	5655c4d940	rbd: fix image id leak in initial probe If a format 2 image id is found for an image being mapped, but the subsequent probe of the image fails, rbd_dev_probe() quits without freeing the image id. Fix that. Also drop a redundant hunk of code in rbd_dev_image_id(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:34 -07:00
Alex Elder	c0fba36880	rbd: have rbd_dev_image_id() set format 1 image id Currently, rbd_dev_probe() assumes that any error returned by rbd_dev_image_id() is most likely -ENOENT, and responds by calling the format 1 probe routine, rbd_dev_v1_probe(). Then, at the top of rbd_dev_v1_probe(), an empty string is allocated for the image id. This is sort of unbalanced. Fix this by having rbd_dev_image_id() look for -ENOENT from its "get_id" method call. If that is seen, have it allocate the empty string there rather than depending on rbd_dev_v1_probe() to do it. Given that this is effectively defining the format of the image, set rbd_dev->image_format inside rbd_dev_image_id() rather than in the format-specific probe routines. Also drop a redundant hunk of code in rbd_dev_image_id(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:33 -07:00
Alex Elder	a0cab92432	rbd: avoid dropping extra reference in rbd_free_disk() I found during some failure injection testing that the call to rbd_free_disk() in the error path of rbd_dev_probe_finish() was dropping an extra reference to the disk queue. The problem occurred when put_disk tried to drop a reference to the disk's queue. A call to blk_cleanup_queue() just prior to that will have also dropped a reference to the queue. The problem is that the reference dropped by put_disk() is assumed to have been taken by add_disk(). Our code has error paths that can occur after the disk and its queue are initialized, but before the call to add_disk(), and in those paths we won't have that extra reference. The fix is easy though. In rbd_free_disk() we're already checking the disk's GENHD_FL_UP flag. That flag is an indication that add_disk() has been called, so just call blk_cleanup_queue() conditional on that flag being set. This resolves: http://tracker.ceph.com/issues/4800 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:32 -07:00
Alex Elder	f40eb349e0	rbd: use rbd_obj_method_sync() return value Now that rbd_obj_method_sync() returns the number of bytes returned by the method call, that value should be used by callers to ensure we don't overrun the valid portion of the buffer. Fix the two spots that remained that weren't doing that, rbd_dev_image_name() and rbd_dev_v2_snap_name(). Rearrange the error path slightly in rbd_dev_v2_snap_name(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:31 -07:00
Alex Elder	6e584f5244	rbd: fix leak of format 2 snapshot names When the snapshot context for an rbd device gets updated (or the initial one is recorded) a a list of snapshot structures is created to represent them, one entry per snapshot. Each entry includes a dynamically-allocated copy of the snapshot name. Currently the name is allocated in rbd_snap_create(), as a duplicate of the passed-in name. For format 1 images, the snapshot name provided is just a pointer to an existing name. But for format 2 images, the passed-in name is already dynamically allocated, and in the the process of duplicating it here we are leaking the passed-in name. Fix this by dynamically allocating the name for format 1 snapshots also, and then stop allocating a duplicate in rbd_snap_create(). Change rbd_dev_v1_snap_info() so none of its parameters is side-effected unless it's going to return success. This is part of: http://tracker.ceph.com/issues/4803 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:30 -07:00
Alex Elder	6087b51b9e	rbd: rename __rbd_add_snap_dev() Rename __rbd_add_snap_dev() to be rbd_snap_create(). We no longer have devices for non-mapped snapshots, and we're not actually "adding" it to the list in this function, just creating it. Rename rbd_remove_snap_dev() to be rbd_snap_destroy() for reasons similar to the above. Stop having this function delete the snapshot from its list (to be symmetrical with its create counterpart) and do that in the caller instead. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:29 -07:00
Alex Elder	acb1b6caf1	rbd: only update values on snap_info success Change rbd_dev_v2_snap_info() so it only ever sets values of the size and features parameters if looking up the snapshot name was successful. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:28 -07:00
Alex Elder	c86f86e9e7	rbd: make snap_size order parameter optional Only one of the two callers of _rbd_dev_v2_snap_size() needs the order value returned. So make that an optional argument--a null pointer if the caller doesn't need it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:27 -07:00
Alex Elder	522a0cc0f0	rbd: fix leak of snapshots during initial probe When an rbd image is initially mapped, its snapshot context is collected, and then a list of snapshot entries representing the snapshots in that context is created. The list is created using rbd_dev_snaps_update(). (This function also supports updating an existing snapshot list based on a new snapshot context.) If an error occurs, updating the list is aborted, and the list is currently left as-is, in an inconsistent state. At that point, there may be a partially-constructed list, but the calling functions (rbd_dev_probe_finish() from rbd_dev_probe() from rbd_add()) never clean them up. So this constitutes a leak. A snapshot list that is inconsistent with the current snapshot context is of no use, and might even be actively bad. So rather than just having the caller clean it up, have rbd_dev_snaps_update() just clear out the entire snapshot list in the event an error occurs. The other place rbd_dev_snaps_update() is used is when a refresh is triggered, either because of a watch callback or via a write to the /sys/bus/rbd/devices/<id>/refresh interface. An error while updating the snapshots has no substantive effect in either of those cases, but one of them issues a warning. Move that warning to the common rbd_dev_refresh() function so it gets issued regardless of how it got initiated. This is part of: http://tracker.ceph.com/issues/4803 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:26 -07:00
Alex Elder	3e83b65bb9	rbd: don't create sysfs entries for non-mapped snapshots When an rbd image gets mapped a device entry gets created for it under /sys/bus/rbd/devices/<id>/. Inside that directory there are sysfs files that contain information about the image: its size, feature bits, major device number, and so on. Additionally, if that image has any snapshots, a device entry gets created for each of those as a "child" of the mapped device. Each of these is a subdirectory of the mapped device, and each directory contains a few files with information about the snapshot (its snapshot id, size, and feature mask). There is no clear benefit to having those device entries for the snapshots. The information provided via sysfs of of little real value--and all of it is available via rbd CLI commands. If we still wanted to see the kernel's view of this information it could be done much more simply by including it in a single sysfs file for the mapped image. But there is a clear cost to supporting them. Every time a snapshot context changes, these entries need to be updated (deleted snapshots removed, new snapshots created). The rbd driver is notified of changes to the snapshot context via callbacks from an osd, and care must be taken to coordinate removal of snapshot data structures with the possibility of one these notifications occurring. Things would be considerably simpler if we just didn't have to maintain device entries for the snapshots. So get rid of them. The ability to map a snapshot of an rbd image will remain; the only thing lost will be the ability to query these sysfs directories for information about snapshots of mapped images. This resolves: http://tracker.ceph.com/issues/4796 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:25 -07:00
Alex Elder	770eba6e29	rbd: activate support for layered images Now that we have most everything in place to support layered rbd images, enable support for them in the kernel client. Issue a warning to the log that the support is considered experimental whenever a format 2 layered image is mapped. Note that we also have to claim to support the STRIPINGV2 feature, due to a mistake in the way the rbd CLI set up those flags. This feature can work if it has the right parameters, and safeguards have been put in place to reject those images that do not have compatible parameters. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:23 -07:00
Alex Elder	cc070d59bc	rbd: get and check striping parameters If an rbd format 2 image indicates it supports the STRIPINGV2 feature we need to find out its stripe unit and stripe count in order to know whether we can use it. We don't yet support fancy striping fully, but if the default parameters are used the behavior is indistinguishible from non-fancy striping. This is necessary because some images require the STRIPINGV2 feature even if they use the default parameters. (Which is to say the feature bit was erroneously set even if the feature was not used.) This resolves: http://tracker.ceph.com/issues/4709 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:21 -07:00
Alex Elder	57385b51c3	rbd: have rbd_obj_method_sync() return transfer count Callers of rbd_obj_method_sync() don't know how many bytes of data got returned by the class method call. As a result, they have been assuming enough got returned to decode whatever was expected. This isn't safe. We know how many bytes got transferred, so have rbd_obj_method_sync() return that amount (rather than just 0) if the call is successful. Change all callers to use this return value to ensure decoding of the results is done safely. On the other hand, most callers of rbd_obj_method_sync() only indicate success or failure, so all of their callers can simply test for non-zero result. This resolves: http://tracker.ceph.com/issues/4773 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:20 -07:00
Alex Elder	4157976b27	rbd: void data pointers for rbd_obj_method_sync() Make the inbound and outbound data parameters have void rather than character type for rbd_obj_method_sync(). This makes it more clear they don't expect typed data, and eliminates the need for some silly type casts. One more unrelated change: define the features buffer used in _rbd_dev_v2_snap_features() to be a packed data structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:19 -07:00
Alex Elder	80ef15bf71	rbd: give rbd_obj_read_sync() buffer void type Make the buf parameter into which the data is to be read have type void pointer. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:18 -07:00
Alex Elder	a9e8ba2cb3	rbd: enforce parent overlap A clone image has a defined overlap point with its parent image. That is the byte offset beyond which the parent image has no defined data to back the clone, and anything thereafter can be viewed as being zero-filled by the clone image. This is needed because a clone image can be resized. If it gets resized larger than the snapshot it is based on, the overlap defines the original size. If the clone gets resized downward below the original size the new clone size defines the overlap. If the clone is subsequently resized to be larger, the overlap won't be increased because the previous resize invalidated any parent data beyond that point. This resolves: http://tracker.ceph.com/issues/4724 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:15 -07:00
Alex Elder	0eefd470f0	rbd: issue a copyup for layered writes This implements the main copyup functionality for layered writes. Here we add a copyup_pages field to the object request, which is used only for copyup requests to keep track of the page array containing data read from the parent image. A copyup request is currently the only request rbd has that requires two osd operations. Because of this we handle copyup specially. All image object requests get an osd request allocated when they are created. For a write request, if a copyup is required, the osd request originally allocated is released, and a new one (with room for two osd ops) is allocated to replace it. A new function rbd_osd_req_create_copyup() allocates an osd request suitable for a copyup request. The first op is then filled with a copyup object class method call, supplying the array of pages containing data read from the parent. The second op is filled in with the original write request. The original request otherwise remains intact, and it describes the original write request (found in the second osd op). The presence of the copyup op is sort of implicit; a non-null copyup_pages field could be used to distinguish between a "normal" write request and a request containing both a copyup call and a write. This resolves: http://tracker.ceph.com/issues/3419 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:14 -07:00
Alex Elder	3d7efd18d9	rbd: implement full object parent reads As a step toward implementing layered writes, implement reading the data for a target object from the parent image for a write request whose target object is known to not exist. Add a copyup_pages field to an image request to track the page array used (only) for such a request. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:13 -07:00
Laurent Barbe	d98df63ea7	rbd: revalidate_disk upon rbd resize If rbd disk is open and rbd resize is done, new size is not visible by filesystem. Like is done in virtio-blk and dm driver, revalidate_disk() permits to update the bd_inode size. Signed-off-by: Laurent Barbe <laurent@ksperis.com> Reviewed-by: Alex Elder <elder@inktank.com>	2013-05-01 21:19:12 -07:00
Alex Elder	f1a4739f33	rbd: support page array image requests This patch adds the ability to build an image request whose data will be written from or read into memory described by a page array. (Previously only bio lists were supported.) Originally this was going to define a new function for this purpose but it was largely identical to the rbd_img_request_fill_bio(). So instead, rbd_img_request_fill_bio() has been generalized to handle both types of image request. For the moment we still only fill image requests with bio data. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:11 -07:00
Alex Elder	b9434c5b43	rbd: define zero_pages() Define a new function zero_pages() that zeroes a range of memory defined by a page array, along the lines of zero_bio_chain(). It saves and the irq flags like bvec_kmap_irq() does, though I'm not sure at this point that it's necessary. Update rbd_img_obj_request_read_callback() to use the new function if the object request contains page rather than bio data. For the moment, only bio data is used for osd READ ops. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:10 -07:00
Alex Elder	b454e36d26	rbd: encapsulate submission of image object requests Object requests that are part of an image request are subject to some additional handling. Define rbd_img_obj_request_submit() to encapsulate that, and use it when initially submitting an image object request, and when re-submitting it during callback of an object existence check. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:08 -07:00
Alex Elder	9d4df01f08	rbd: define separate read and write format funcs Separate rbd_osd_req_format() into two functions, one for read requests and the other for write requests. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:07 -07:00
Alex Elder	c5b5ef6c51	rbd: issue stat request before layered write This is a step toward fully implementing layered writes. Add checks before request submission for the object(s) associated with an image request. For write requests, if we don't know that the target object exists, issue a STAT request to find out. When that request completes, mark the known and exists flags for the original object request accordingly and re-submit the object request. (Note that this still does the existence check only; the copyup operation is not yet done.) A new object request is created to perform the existence check. A pointer to the original request is added to that object request to allow the stat request to re-issue the original request after updating its flags. If there is a failure with the stat request the error code is stored with the original request, which is then completed. This resolves: http://tracker.ceph.com/issues/3418 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:04 -07:00
Alex Elder	5679c59f60	rbd: add target object existence flags This creates two new flags for object requests to indicate what is known about the existence of the object to which a request is to be sent. The KNOWN flag will be true if the the EXISTS flag is meaningful. That is: KNOWN EXISTS ----- ------ 0 0 don't know whether the object exists 0 1 (not used/invalid) 1 0 object is known to not exist 1 0 object is known to exist This will be used in determining how to handle write requests for data objects for layered rbd images. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:03 -07:00
Alex Elder	57acbaa7fb	rbd: always check IMG_DATA flag In a few spots, whether the an object request's img_request pointer is null is used to determine whether an object request is being done as part of an image data request. Stop doing that, and instead always use the object request IMG_DATA flag for that purpose. Swap the order of the definition of the IMG_DATA and DONE flag helpers, because obj_request_done_set() now refers to obj_request_img_data_set() to get its rbd_dev value. This will become important because the img_request pointer is about to become part of a union. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:02 -07:00
Alex Elder	b155e86cf6	rbd: adjust image object request ref counting An extra reference is taken when an object request is added as one of the requests making up an image object. A reference is dropped again when the image's object requests get submitted. The original reference for the object request will remain throughout this period, so we don't need to add and then take away an extra one. This can be interpreted as the image request inheriting the original object request's reference. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:19:01 -07:00
Alex Elder	406e2c9f92	libceph: kill off osd data write_request parameters In the incremental move toward supporting distinct data items in an osd request some of the functions had "write_request" parameters to indicate, basically, whether the data belonged to in_data or the out_data. Now that we maintain the data fields in the op structure there is no need to indicate the direction, so get rid of the "write_request" parameters. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:58 -07:00
Alex Elder	8b3e1a5698	rbd: implement layered reads Implement layered read requests for format 2 rbd images. If an rbd image is a clone of a snapshot, the snapshot will be the clone's "parent" image. When an object read request on a clone comes back with ENOENT it indicates that the clone is not yet populated with that portion of the image's data, and the parent image should be consulted to satisfy the read. When this occurs, a new image request is created, directed to the parent image. The offset and length of the image are the same as the image-relative offset and length of the object request that produced ENOENT. Data from the parent image therefore satisfies the object read request for the original image request. While this code works, it will not be active until we enable the layering feature (by adding RBD_FEATURE_LAYERING to the value of RBD_FEATURES_SUPPORTED). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:48 -07:00
Alex Elder	2f82ee54d9	rbd: probe the parent of an image if present Call the probe function for the parent device if one is present. Since we don't formally support the layering feature we won't be using this functionality just yet. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:47 -07:00
Alex Elder	6365d33a27	rbd: add an object request flag for image data objects Add a flag to distinguish between object requests being done on standalone objects and requests being sent for objects representing rbd image data (i.e., object requests that are the result of image request). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:46 -07:00
Alex Elder	926f9b3f08	rbd: define an rbd object request flags field We're going to need some more Boolean values for object requests, so create a flags bit field and use it to record whether the request is done. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:44 -07:00
Alex Elder	1217857fbf	rbd: encapsulate image object end request handling Encapsulate the code that completes processing of an object request that's part of an image request. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:43 -07:00
Alex Elder	d0b2e94455	rbd: define image request layered flag Define a flag indicating whether an image request is for a layered image (one with a parent image to which requests will be redirected if the target object of a request does not exist). The code that checks this flag will be added shortly. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:42 -07:00
Alex Elder	9849e98636	rbd: define image request originator flag Define a flag indicating whether an image request originated from the Linux block layer (from blk_fetch_request()) or whether it was initiated in order to satisfy an object request for a child image of a layered rbd device. For image requests initiated by objects of child images we'll save a pointer to the object request rather than the Linux block request. For now, only block requests are used. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:41 -07:00
Alex Elder	0c425248e0	rbd: define image request flags There are several Boolean values we'll be maintaining for image requests. Switch from the single write_request field to a general-purpose flags field, and use one if its bits to represent the direction of I/O for the image request. Define helper functions for setting and testing that flag. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:40 -07:00
Alex Elder	7da22d296d	rbd: record image-relative offset in object requests For an image object request we will need to know what offset within the rbd image the request covers. Record that when the object request gets created. Update the I/O error warnings so they use this so what's reported is more informative. Rename a local variable to fit the convention used everywhere else. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:39 -07:00
Alex Elder	55f27e0931	rbd: record aggregate image transfer count Compute the total number of bytes transferred for an image request--the sum across each of the request's object requests. To avoid contention do it only when all object requests are complete, in rbd_img_request_complete(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:38 -07:00
Alex Elder	a5a337d438	rbd: record overall image request result If any image object request produces a non-zero result, preserve that as the result of the overall image request. If multiple objects have non-zero results, save only the first one. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:37 -07:00
Alex Elder	5cbf6f12c4	rbd: update feature bits There is a new rbd feature bit defined for "fancy striping." Add it to the ones defined in the kernel client. Change RBD_FEATURES_ALL so it represents the set of all feature bits (rather than just the ones we support). Define a new symbol RBD_FEATURES_SUPPORTED to indicate the supported ones. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:36 -07:00
Alex Elder	04017e29bb	libceph: make method call data be a separate data item Right now the data for a method call is specified via a pointer and length, and it's copied--along with the class and method name--into a pagelist data item to be sent to the osd. Instead, encode the data in a data item separate from the class and method names. This will allow large amounts of data to be supplied to methods without copying. Only rbd uses the class functionality right now, and when it really needs this it will probably need to use a page array rather than a page list. But this simple implementation demonstrates the functionality on the osd client, and that's enough for now. This resolves: http://tracker.ceph.com/issues/4104 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:35 -07:00
Alex Elder	a4ce40a9a7	libceph: combine initializing and setting osd data This ends up being a rather large patch but what it's doing is somewhat straightforward. Basically, this is replacing two calls with one. The first of the two calls is initializing a struct ceph_osd_data with data (either a page array, a page list, or a bio list); the second is setting an osd request op so it associates that data with one of the op's parameters. In place of those two will be a single function that initializes the op directly. That means we sort of fan out a set of the needed functions: - extent ops with pages data - extent ops with pagelist data - extent ops with bio list data and - class ops with page data for receiving a response We also have define another one, but it's only used internally: - class ops with pagelist data for request parameters Note that we still haven't gotten rid of the osd request's r_data_in and r_data_out fields. All the osd ops refer to them for their data. For now, these data fields are pointers assigned to the appropriate r_data_* field when these new functions are called. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:23 -07:00
Alex Elder	2169238dd3	rbd: rearrange some code for consistency This patch just trivially moves around some code for consistency. In preparation for initializing osd request data fields in ceph_osdc_build_request(), I wanted to verify that rbd did in fact call that immediately before it called ceph_osdc_start_request(). It was true (although image requests are built in a group and then started as a group). But I made the changes here just to make it more obvious, by making all of the calls follow a common sequence: osd_req_op_<optype>_init(); ceph_osd_data_<type>_init() osd_req_op_<optype>_<datafield>() rbd_osd_req_format() ... ret = rbd_obj_request_submit() I moved the initialization of the callback for image object requests into rbd_img_request_fill_bio(), again, for consistency. To avoid a forward reference, I moved the definition of rbd_img_obj_callback() up in the file. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:18 -07:00

1 2 3 4 5 ...

360 Commits