Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added central caching #257

Closed
wants to merge 35 commits into from
Closed

feat: added central caching #257

wants to merge 35 commits into from

Conversation

fritzduchardt
Copy link
Collaborator

@fritzduchardt fritzduchardt commented Mar 24, 2024

  • Each directory in a vendir config is processed individually and placed into a central cache.
  • All application-specific temporary files are places in a central root folder .myks/envs
  • Small bug fix on myks render ALL

@kbudde
Copy link
Member

kbudde commented Mar 29, 2024

Hey 🖖🏽,
I used this preview version to work on my lab. As I user, I can say this is an incredible change ❤️
It's way more obvious now for which application you have overwrites, as the envs/ENV/_apps/APP/ does not contain temporary files anymore. Cleaning up/ rebuilding is quite fast ich you keep the .vendor folder.

  • You might want to move one more file into the top level .myks/envs hierarchy: During rendering I get an envs/ENV/.myks/tmp/env-data.yaml file for each environment.
  • I added .myks and .vendor to my .gitignore. In a build system .vendor should be cached between builds 🏎️

I haven't taken a closer look at the code yet, but I have to say the way it works looks really promising.

@kbudde
Copy link
Member

kbudde commented Mar 29, 2024

Oh, I've found a disadvantage, but maybe you have an idea:
If you have several version of a chart, which is cached in .vendor folder, you cannot say which cache entry belongs to which chart.

@fritzduchardt
Copy link
Collaborator Author

fritzduchardt commented Mar 31, 2024

Oh, I've found a disadvantage, but maybe you have an idea: If you have several version of a chart, which is cached in .vendor folder, you cannot say which cache entry belongs to which chart.

Hey 🖖🏽, I used this preview version to work on my lab. As I user, I can say this is an incredible change ❤️ It's way more obvious now for which application you have overwrites, as the envs/ENV/_apps/APP/ does not contain temporary files anymore. Cleaning up/ rebuilding is quite fast ich you keep the .vendor folder.

  • You might want to move one more file into the top level .myks/envs hierarchy: During rendering I get an envs/ENV/.myks/tmp/env-data.yaml file for each environment.
  • I added .myks and .vendor to my .gitignore. In a build system .vendor should be cached between builds 🏎️

I haven't taken a closer look at the code yet, but I have to say the way it works looks really promising.

Thanks for the feedback @kbudde! Good point about the .myks folder. I will move it as well.

@fritzduchardt
Copy link
Collaborator Author

Oh, I've found a disadvantage, but maybe you have an idea: If you have several version of a chart, which is cached in .vendor folder, you cannot say which cache entry belongs to which chart.

I was thinking about that. We could add a slug to the cached folder names that contains versions or other characteristics, e.g. paths for local directories, branch names for git repos.

Copy link
Member

@Zebradil Zebradil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really glad you went for this :-)

Leaving a few comments for now. I still have to test it on my setup.

internal/myks/util.go Outdated Show resolved Hide resolved
internal/myks/globe.go Outdated Show resolved Hide resolved
@kbudde
Copy link
Member

kbudde commented Apr 6, 2024

Tested again, today with bad internet connection ;)
myks all ALL
old version: 39s
new version: 0.6s

There is still one unexpected difference.
Changing the base.ytt.yaml of http bingo to include a second helm chart

#@ load("@ytt:data", "data")

#@ app = data.values.application
---
apiVersion: vendir.k14s.io/v1alpha1
kind: Config
directories:
  - path: #@ "charts/" + app.name
    contents:
      - path: .
        helmChart:
          name: #@ app.name
          version: #@ app.version
          repository:
            url: #@ app.url
# just copy paste from above and rename as a very simple example of a second chart:
  - path: #@ "charts/" + app.name + "-backup"
    contents:
      - path: .
        helmChart:
          name: #@ app.name
          version: #@ app.version
          repository:
            url: #@ app.url

Only the last chart is in the rendered folder with the new myks

Comment on lines +115 to +119
#! Configuration options for caching during vendir sync step.
cache:
#! If true, the cache is enabled for all apps per default and can be selectively disabled with the vendir lazy flag.
#! If false, the cache is disabled for all apps per default and can be selectively enabled with the vendir lazy flag.
enabled: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about moving this under the sync "namespace"?

Suggested change
#! Configuration options for caching during vendir sync step.
cache:
#! If true, the cache is enabled for all apps per default and can be selectively disabled with the vendir lazy flag.
#! If false, the cache is disabled for all apps per default and can be selectively enabled with the vendir lazy flag.
enabled: true
#! Configuration options for the sync step.
sync:
cache:
#! If true, the cache is enabled for all apps per default and can be selectively disabled with the vendir lazy flag.
#! If false, the cache is disabled for all apps per default and can be selectively enabled with the vendir lazy flag.
enabled: true

@Zebradil
Copy link
Member

Zebradil commented Apr 7, 2024

@kbudde The issue with two charts could've been because of this bug (fixed): 6845a1e

UPD: this actually broke the thing. Here is the correct version of the hashing function: 8677cf2.

UPD 2: should be fixed in 0bded65

@Zebradil
Copy link
Member

Zebradil commented Apr 8, 2024

@fritzduchardt I made some changes to the caching logic in 0bded65. Now, instead of deciding whether to run vendir sync or not implicitly for the user, the lazy flag is explicitly injected into the patched vendir config file. This way, we delegate sync decisions to vendir and make this part of the logic more visible.

I also restricted vendir configs to a single content item (0bded65#diff-057be4d85a28f33bb6feddad323f77f05bd9484f8b01ea17fabe6491358ddb60R179-R186). This might be not needed, but it's too late and I can't think about this thoroughly anymore 🙃

A non-logic change is refactoring: the cache- and patch-related code is moved out of the doSync function.

@kbudde
Copy link
Member

kbudde commented Apr 13, 2024

I fear it got worse... 😿

I still had my example repo with two prototypes, where I copied the httpbingo prototype to have two helm charts in the double prototype.
It looks like if you have the same helm chart in two prototypes something can go wrong.

Prototypes:

  • double: has httpbingo + httpbingo-backup in it (same chart version, differenet) path
    base.ytt.yaml:
apiVersion: vendir.k14s.io/v1alpha1
kind: Config
directories:
  - path: #@ "charts/" + app.name
    contents:
      - path: .
        helmChart:
          name: #@ app.name
          version: #@ app.version
          repository:
            url: #@ app.url
  - path: #@ "charts/" + app.name + "-backup"
    contents:
      - path: .
        helmChart:
          name: #@ app.name
          version: #@ app.version
          repository:
            url: #@ app.url
  • and the httpbingo prototype

The vendir cache looks like this:
image

The "normal" httpbingo prototype now suddenly includes the -backup files
image

This might be due to my strange example repo. I think I have to start using multi source prototypes in other projects :)

@Zebradil
Copy link
Member

Yes, it doesn't seem right 😁
I'll have a look. Thank you for the test case!

@kbudde kbudde linked an issue Apr 14, 2024 that may be closed by this pull request
@kbudde
Copy link
Member

kbudde commented Apr 14, 2024

I've found an easier example: The described behavior always happens, if you have to prototypes with a helm chart with the same version but different path.

directories:
  - path: #@ "charts/foo"
    ...
---
directories:
  - path: #@ "charts/httpbingo"
    ...

will lead to
image

@kbudde
Copy link
Member

kbudde commented Apr 14, 2024

Just thinking after looking at the code.
At the moment we are only hashing one entry in contents.
The path one level higher is ignored but leads to this duplication. Adding the path to the hashing should solve the issue while downloading the chart twice. But this should be ok, or not? I mean it's only to prevent unexpected results in use cases where two prototypes are sharing the same source in all other cases it will work the same way.

---
apiVersion: vendir.k14s.io/v1alpha1
kind: Config
directories:
  - path:  #@ "charts/" + app.name #! or "charts/foo"
    contents:
    #! <<-- hashed
      - path: .
        helmChart:
          name: #@ app.name
          version: #@ app.version
          repository:
            url: #@ app.url
    #! hashed -->>

Another option would be to move the path inside contents and enforce "." path on higher level. This would as well ensure a clean hash.

---
apiVersion: vendir.k14s.io/v1alpha1
kind: Config
directories:
  - path: . #! should this be enforced?
    contents:
    #! <<-- hashed
      - path: #@ "charts/" + app.name #! or "charts/foo"
        helmChart:
          name: #@ app.name
          version: #@ app.version
          repository:
            url: #@ app.url
    #! hashed -->>

@Zebradil
Copy link
Member

Superseded by #274

@Zebradil Zebradil closed this May 10, 2024
@Zebradil Zebradil deleted the central-cache branch May 20, 2024 21:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Central Caching
3 participants