Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance and bundled size #6

Merged
merged 10 commits into from
Feb 29, 2024

Conversation

kevlened
Copy link
Contributor

By using base64 instead of an integer array, I reduced my remix build size by 75% and decreased the build time from 17 sec to 3 sec.

Copy link

codesandbox bot commented Feb 27, 2024

Review or Edit in CodeSandbox

Open the branch in Web EditorVS CodeInsiders

Open Preview

Copy link

stackblitz bot commented Feb 27, 2024

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

@kevlened
Copy link
Contributor Author

Ah, it should be noted that this currently works for bundles targeting node.js (SSR), but not browsers.

@kevlened
Copy link
Contributor Author

It now works in both browser and node (>=16) environments

@tachibana-shin
Copy link
Owner

I've thought about doing this before but it comes at the cost of performance.

  • This means that if I accept base64, the compiled package will only increase by 20%, but the browser will have to decode the base64 string again, which is not ideal for files > 10MB.
  • If I use int8 arrays I have to accept that the compilation size increases up to 100% however it allows the browser to load the data directly into RAM without recompiling the string

Maybe we should add an option like ?arraybuffer&base64 instead of fixing the old code

@kevlened
Copy link
Contributor Author

Allowing devs to decide their own performance tradeoffs makes sense. The base64 addition is a clean interface; I'll add it.

@kevlened
Copy link
Contributor Author

kevlened commented Feb 28, 2024

The ampersand is causing the tests to fail with:

Error: ENOENT: no such file or directory, open '/vite-plugin-arraybuffer/src/typescript.svg?arrayb'

I'll try to find examples of other plugins with the ampersand syntax.

edit: The worker plugin is an example: https://vitejs.dev/guide/assets.html#importing-script-as-a-worker

@kevlened
Copy link
Contributor Author

The error seems unrelated to the ampersand. I get the same error by:

  1. git clone git@github.com:tachibana-shin/vite-plugin-arraybuffer.git
  2. cd vite-plugin-arraybuffer
  3. LANG=C.UTF-8 LC_ALL=C.UTF-8 find . -type f -exec sed -i '' 's/uint8array/uint8arraytest/g' {} +
  4. pnpm install
  5. pnpm test

This is quite unexpected. I'm not sure why simply changing the param name would affect the outcome.

src/main.ts Outdated Show resolved Hide resolved
@tachibana-shin tachibana-shin merged commit 32ba3bd into tachibana-shin:main Feb 29, 2024
5 checks passed
@tachibana-shin
Copy link
Owner

Thanks for PR!

src/main.ts Show resolved Hide resolved
@kevlened
Copy link
Contributor Author

kevlened commented Mar 9, 2024

I've thought about doing this before but it comes at the cost of performance.

  • This means that if I accept base64, the compiled package will only increase by 20%, but the browser will have to decode the base64 string again, which is not ideal for files > 10MB.
  • If I use int8 arrays I have to accept that the compilation size increases up to 100% however it allows the browser to load the data directly into RAM without recompiling the string

Maybe we should add an option like ?arraybuffer&base64 instead of fixing the old code

I wanted to circle back here, because it’s been bugging me. I’ve benchmarked this, and parsing base64 is faster than parsing a string representation of a Uint8Array for files over 5kB. The difference is greater the larger the file. When factoring in network transit time, I imagine base64 is faster in all scenarios, because the Uint8Array requires 400% more bytes while base64 requires 33% more bytes.

That’s all. It’s worth benchmarking.

@tachibana-shin
Copy link
Owner

I've thought about doing this before but it comes at the cost of performance.

  • This means that if I accept base64, the compiled package will only increase by 20%, but the browser will have to decode the base64 string again, which is not ideal for files > 10MB.
  • If I use int8 arrays I have to accept that the compilation size increases up to 100% however it allows the browser to load the data directly into RAM without recompiling the string

Maybe we should add an option like ?arraybuffer&base64 instead of fixing the old code

I wanted to circle back here, because it’s been bugging me. I’ve benchmarked this, and parsing base64 is faster than parsing a string representation of a Uint8Array for files over 5kB. The difference is greater the larger the file. When factoring in network transit time, I imagine base64 is faster in all scenarios, because the Uint8Array requires 400% more bytes while base64 requires 33% more bytes.

That’s all. It’s worth benchmarking.

Base64 is only fast in the encoding process. My post-compiled code tests show that when I try to execute base64 decoder 300 threads at the same time are a lot slower than uint8.

I have another idea that instead of storing it as uin8 array, I will store it as latin code which allows better performance and smaller package size.

@tachibana-shin
Copy link
Owner

If I use latin method I will get packet size = standard packet size and copy speed to RAM is fastest but I am doubting the reliability after encoding

@kevlened
Copy link
Contributor Author

kevlened commented Mar 9, 2024

Here's a benchmark. This shows for any file above ~6.5kB, it is more efficient to parse using base64.

It also shows the file size differences if you uncomment the logs and open the console. On average, uint8array will take 160% longer to download than base64, even if it's faster to parse at small file sizes.

Granted, this doesn't account for compression. Uint8arrays will compress better (~1/3 their final size), while base64 compresses to ~2/3 of their final size. The means that when compressed, uint8arrays will still take 30% longer to download before parsing begins.

It's possible your tests didn't include the actual parsing time of the Uint8array string. It's not possible to "load the file directly into memory" when you encode binary in js, because the browser always has to parse the js.

@tachibana-shin
Copy link
Owner

Here's a benchmark. This shows for any file above ~6.5kB, it is more efficient to parse using base64.

It also shows the file size differences if you uncomment the logs and open the console. On average, uint8array will take 160% longer to download than base64, even if it's faster to parse at small file sizes.

Granted, this doesn't account for compression. Uint8arrays will compress better (~1/3 their final size), while base64 compresses to ~2/3 of their final size. The means that when compressed, uint8arrays will still take 30% longer to download before parsing begins.

It's possible your tests didn't include the actual parsing time of the Uint8array string. It's not possible to "load the file directly into memory" when you encode binary in js, because the browser always has to parse the js.

js translation time is negligible, blink takes just over 10ms to translate 3mb js directly to machine code

@kevlened
Copy link
Contributor Author

The 3mb Uint8Array you’re parsing is <1mb in base64, so it would be parsed in <3ms, giving 7ms to parse the base64 to a Uint8Array. 7ms in this context is more than enough time, so the base64 nets a faster time, as the benchmark shows.

I’m not sure I understand how you’d like to change the benchmark.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants