Introduce cache layer to improve performance #15937

wy65701436 · 2021-11-03T07:32:00Z

In the current design, we found that in large-scale deployments Harbor can, in some cases, reach the DB connection limit and show high CPU usage.

However, not all get/list requests need to go directly to the database. Introducing a cache layer in front of the DB should significantly improve concurrency capability.

Vad1mo · 2021-11-03T07:53:36Z

We should also evaluate the option of using a third-party DB-layer cache provided by a connection pooler (PgBouncer vs. Pgpool-II).

Also, if you plan to use a cache, we need an in-code distributed cache (ICDC) such as https://github.com/buraksezer/olric in order to scale horizontally.

I have also proposed switching to pgx, the Go PostgreSQL driver, which has some caching options as well.
Related: #15209

buraksezer · 2021-11-08T20:38:39Z

Hi all,

Thank you for mentioning Olric. I'm the author of that library. If you have any questions, it will be my pleasure to help you.

xaleeks · 2021-11-09T03:34:35Z

Let's aim for a concise problem statement and a design doc completed by v2.5, including an estimate of the performance improvement, etc. Assigning to @wy65701436.

chlins · 2022-01-20T03:23:48Z

Move to 2.6 after discussions. cc @xaleeks

chlins · 2022-03-03T03:56:36Z

Action Items

schrej · 2022-03-11T09:34:10Z

Have you considered implementing connection pooling and re-using existing connections instead of opening one per request?
In our deployment the main issue is the number of connections opened by Harbor; it seems to open one for each individual request. With authentication enabled, that leads to pretty significant CPU overhead for authentication.

Imo it would be a better approach to try optimising the usage of the database before adding an additional layer.

Edit: After digging through the code, it seems like connection pooling is already enabled. Why does it need that many connections then? Do they get locked up by transactions?
How does the database interaction work when uploading images, for example? Is it creating a transaction that lasts as long as the upload?

github-actions · 2022-07-05T09:09:06Z

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

chlins · 2022-07-06T09:17:21Z

@schrej The cache layer is an abstract concept built into the Harbor codebase; no additional component will be introduced. We just cache the most frequently used resources in Redis for quick lookup and to reduce database connections. The questions you raised need to be analyzed case by case, so feel free to file an issue when you hit the DB connection problem and describe your scenario. Thanks.

chlins · 2022-07-06T09:17:44Z

Closing this epic, as the engineering story has been completed.

wy65701436 added the target/2.5.0 label Nov 3, 2021

wy65701436 assigned wy65701436, heww and chlins Nov 3, 2021

wy65701436 added the area/performance label Nov 3, 2021

chlins mentioned this issue Nov 17, 2021

Performance & Replication improvement engineering story #16014

Closed

wy65701436 assigned ninjadq Dec 3, 2021

wy65701436 unassigned ninjadq Jan 14, 2022

chlins added target/2.6.0 and removed target/2.5.0 labels Jan 20, 2022

chlins added the Epic label Mar 3, 2022

wy65701436 unassigned heww Mar 3, 2022

github-actions bot added the Stale label Jul 5, 2022

Vad1mo removed the Stale label Jul 5, 2022

chlins closed this as completed Jul 6, 2022
