Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce the copy by chunk for replication #202

Merged
merged 1 commit into from
Nov 3, 2022

Conversation

chlins
Copy link
Member

@chlins chlins commented Sep 16, 2022

Signed-off-by: chlins chenyuzh@vmware.com

@chlins chlins requested review from a team as code owners September 16, 2022 06:48
@chlins
Copy link
Member Author

chlins commented Sep 19, 2022

@goharbor/all-maintainers Hi, please review this proposal, thanks!


#### Phase 2 Implementation

The key point of phase 2 is **`Breakpoint & Resume`**.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we decided to do phase 1 at this stage, and the details in the phase 2 have not been fully discussed, it may not be very appropriate.

wy65701436
wy65701436 previously approved these changes Sep 19, 2022
Copy link
Contributor

@wy65701436 wy65701436 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

```go
type Client interface {
// PullBlobChunk pulls the specified blob, but by chunked
PullBlobChunk(repository, digest string, start, end int64) (size int64, blob io.ReadCloser, err error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is size needed in the return value since it can be calculated via start and end?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but just return this as unify with original PullBlob method.

*update policy*

```rest
PUT /replication/policies
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it has the policy ID in the URI?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, hide the existed column.


**Scope:**

Chunk resuming crossing the job/execution/policy and multiple times execution(e.g. job retry), cache chunk location and last end range in redis.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is cool but a little unsure from the cost-effective perspective


The retry for chunk adopts the same strategy with blob, default 5 times and can be configured by environment.

#### Phase 2 Implementation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, If we want to implement phase 1 in next minor release and spark more conversations around phase 2, would it be better we split them into two proposals?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can keep the phase2 idea in this proposal for understand, and when we plan to do phase2 in the future, then we can re-design the entire phase2 and push a new proposal for review.


The key point of phase 2 is **`Breakpoint & Resume`**.

From the process of chunk API, we need to store the location for next chunk push and the last pushed chunk end range, so we need to define common interface for easily integration and adapter in the future.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need more explanation for the concepts like breakpoint and location to understand the idea....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just initialize idea, we can discuss this for more details when we do phase2.

wy65701436
wy65701436 previously approved these changes Nov 2, 2022
Copy link
Contributor

@wy65701436 wy65701436 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@OrlinVasilev
Copy link
Member

@chlins @wy65701436 the Phase1 and Phase2 are not defined at least I didn't find anything.
Can you please provide update!

@chlins
Copy link
Member Author

chlins commented Nov 2, 2022

@OrlinVasilev
Copy link
Member

@chlins my confusion comes from "Phase 1" and "Phase1"

Vad1mo
Vad1mo previously approved these changes Nov 2, 2022
@chlins chlins dismissed stale reviews from Vad1mo and wy65701436 via fc9140a November 3, 2022 03:26
@chlins chlins force-pushed the feat/replication-chunk-transfer branch from 9cca0f1 to fc9140a Compare November 3, 2022 03:26
Copy link
Contributor

@wy65701436 wy65701436 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@AllForNothing AllForNothing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: chlins <chenyuzh@vmware.com>
@chlins chlins force-pushed the feat/replication-chunk-transfer branch from fc9140a to fe62c84 Compare November 3, 2022 06:34
@chlins chlins merged commit dd91b0a into goharbor:main Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants