Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add raidz expansion to ztest #12

Merged
merged 9 commits into from
Sep 30, 2020
1 change: 1 addition & 0 deletions include/sys/spa_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -425,6 +425,7 @@ struct spa {
kcondvar_t spa_waiters_cv;
int spa_waiters; /* number of waiting threads */
boolean_t spa_waiters_cancel; /* waiters should return */
boolean_t spa_raidz_expanding; /* expansion in progress */
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we use spa_raidz_expand != NULL, rather than adding a new field? Are they set/cleared at different times? It looks to me like spa_raidz_expand and spa_raidz_expanding are both set while the namespace lock is held, so don't think there's any race condition when the expansion starts.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because since the spa_raidz_expand pointer is set in Syncing context which is run in a different task AIUI. we would have a race where we could exit here and another expand request could get in here and also start an expand before the pointer gets set. I wanted to use the pointer setting but this seemed better. (Unless I am misunderstanding how things work)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah so it happens via:

		dmu_tx_t *tx = dmu_tx_create_assigned(spa->spa_dsl_pool, txg);
		dsl_sync_task_nowait(spa->spa_dsl_pool, vdev_raidz_attach_sync,
		    newvd, 0, ZFS_SPACE_CHECK_EXTRA_RESERVED, tx);

which means that it will happen when that txg is synced. Which we wait for with the call to spa_vdev_exit(spa, newrootvd, dtl_max_txg, 0);. It looks like this all happens while holding the spa_namespace_lock, so another removal couldn't be initiated before we set spa_raidz_expand.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed that spa_vdev_exit() waited till the TXG is finished. That makes it simpler, will change to use the expand pointer.


/*
* spa_refcount & spa_config_lock must be the last elements
Expand Down
17 changes: 12 additions & 5 deletions module/zfs/spa.c
Original file line number Diff line number Diff line change
Expand Up @@ -6606,6 +6606,11 @@ spa_vdev_attach(spa_t *spa, uint64_t guid, nvlist_t *nvroot, int replacing,

if (oldvd->vdev_ops == &vdev_raidz_ops) {
raidz = B_TRUE;
/*
* Can't expand a raidz while prior expand is in progress.
*/
if (spa->spa_raidz_expanding)
return (spa_vdev_exit(spa, NULL, txg, EBUSY));
} else if (!oldvd->vdev_ops->vdev_op_leaf) {
return (spa_vdev_exit(spa, NULL, txg, ENOTSUP));
}
Expand Down Expand Up @@ -6773,6 +6778,7 @@ spa_vdev_attach(spa_t *spa, uint64_t guid, nvlist_t *nvroot, int replacing,
dtl_max_txg = txg + TXG_CONCURRENT_STATES;

if (raidz) {
spa->spa_raidz_expanding = B_TRUE;
dmu_tx_t *tx = dmu_tx_create_assigned(spa->spa_dsl_pool, txg);
dsl_sync_task_nowait(spa->spa_dsl_pool, vdev_raidz_attach_sync,
newvd, 0, ZFS_SPACE_CHECK_NONE, tx);
Expand All @@ -6794,10 +6800,10 @@ spa_vdev_attach(spa_t *spa, uint64_t guid, nvlist_t *nvroot, int replacing,
vdev_dirty(tvd, VDD_DTL, newvd, txg);

/*
* Schedule the resilver or rebuild to restart in the future. We do
* this to ensure that dmu_sync-ed blocks have been stitched into the
* respective datasets.
*/
* Schedule the resilver or rebuild to restart in the future.
* We do this to ensure that dmu_sync-ed blocks have been
* stitched into the respective datasets.
*/
if (rebuild) {
newvd->vdev_rebuild_txg = txg;

Expand All @@ -6806,7 +6812,8 @@ spa_vdev_attach(spa_t *spa, uint64_t guid, nvlist_t *nvroot, int replacing,
newvd->vdev_resilver_txg = txg;

if (dsl_scan_resilvering(spa_get_dsl(spa)) &&
spa_feature_is_enabled(spa, SPA_FEATURE_RESILVER_DEFER)) {
spa_feature_is_enabled(spa,
SPA_FEATURE_RESILVER_DEFER)) {
vdev_defer_resilver(newvd);
} else {
dsl_scan_restart_resilver(spa->spa_dsl_pool,
Expand Down
3 changes: 3 additions & 0 deletions module/zfs/vdev_raidz.c
Original file line number Diff line number Diff line change
Expand Up @@ -3262,6 +3262,7 @@ spa_raidz_expand_cb(void *arg, zthr_t *zthr)
spa_t *spa = arg;
vdev_raidz_expand_t *vre = spa->spa_raidz_expand;

ASSERT3B(spa->spa_raidz_expanding, ==, B_TRUE);
spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
vdev_t *raidvd = vdev_lookup_top(spa, vre->vre_vdev_id);

Expand Down Expand Up @@ -3423,6 +3424,7 @@ spa_raidz_expand_cb(void *arg, zthr_t *zthr)
}

spa->spa_raidz_expand = NULL;
spa->spa_raidz_expanding = B_FALSE;
}

void
Expand Down Expand Up @@ -3613,6 +3615,7 @@ vdev_raidz_load(vdev_t *vd)
ASSERT3P(vd->vdev_spa->spa_raidz_expand, ==, NULL);

vd->vdev_spa->spa_raidz_expand = &vdrz->vn_vre;
vd->vdev_spa->spa_raidz_expanding = B_TRUE;
}

uint64_t state = DSS_NONE;
Expand Down