Improve performance and bundled size #6

kevlened · 2024-02-27T16:57:15Z

By using base64 instead of an integer array, I reduced my remix build size by 75% and decreased the build time from 17 sec to 3 sec.

codesandbox · 2024-02-27T16:57:18Z

Review or Edit in CodeSandbox

Open the branch in Web Editor • VS Code • Insiders

Open Preview

stackblitz · 2024-02-27T16:57:18Z

Run & review this pull request in StackBlitz Codeflow.

kevlened · 2024-02-27T18:05:54Z

Ah, it should be noted that this currently works for bundles targeting node.js (SSR), but not browsers.

kevlened · 2024-02-27T18:21:13Z

It now works in both browser and node (>=16) environments

tachibana-shin · 2024-02-28T09:04:25Z

I've thought about doing this before but it comes at the cost of performance.

This means that if I accept base64, the compiled package will only increase by 20%, but the browser will have to decode the base64 string again, which is not ideal for files > 10MB.
If I use int8 arrays I have to accept that the compilation size increases up to 100% however it allows the browser to load the data directly into RAM without recompiling the string

Maybe we should add an option like ?arraybuffer&base64 instead of fixing the old code

kevlened · 2024-02-28T12:39:42Z

Allowing devs to decide their own performance tradeoffs makes sense. The base64 addition is a clean interface; I'll add it.

kevlened · 2024-02-28T13:05:16Z

The ampersand is causing the tests to fail with:

Error: ENOENT: no such file or directory, open '/vite-plugin-arraybuffer/src/typescript.svg?arrayb'

I'll try to find examples of other plugins with the ampersand syntax.

edit: The worker plugin is an example: https://vitejs.dev/guide/assets.html#importing-script-as-a-worker

kevlened · 2024-02-28T15:28:18Z

The error seems unrelated to the ampersand. I get the same error by:

git clone git@github.com:tachibana-shin/vite-plugin-arraybuffer.git
cd vite-plugin-arraybuffer
LANG=C.UTF-8 LC_ALL=C.UTF-8 find . -type f -exec sed -i '' 's/uint8array/uint8arraytest/g' {} +
pnpm install
pnpm test

This is quite unexpected. I'm not sure why simply changing the param name would affect the outcome.

src/main.ts

tachibana-shin · 2024-02-29T02:28:05Z

Thanks for PR!

src/main.ts

kevlened · 2024-03-09T04:58:29Z

I've thought about doing this before but it comes at the cost of performance.

This means that if I accept base64, the compiled package will only increase by 20%, but the browser will have to decode the base64 string again, which is not ideal for files > 10MB.

If I use int8 arrays I have to accept that the compilation size increases up to 100% however it allows the browser to load the data directly into RAM without recompiling the string

Maybe we should add an option like ?arraybuffer&base64 instead of fixing the old code

I wanted to circle back here, because it’s been bugging me. I’ve benchmarked this, and parsing base64 is faster than parsing a string representation of a Uint8Array for files over 5kB. The difference is greater the larger the file. When factoring in network transit time, I imagine base64 is faster in all scenarios, because the Uint8Array requires 400% more bytes while base64 requires 33% more bytes.

That’s all. It’s worth benchmarking.

tachibana-shin · 2024-03-09T10:34:44Z

I've thought about doing this before but it comes at the cost of performance.

This means that if I accept base64, the compiled package will only increase by 20%, but the browser will have to decode the base64 string again, which is not ideal for files > 10MB.

If I use int8 arrays I have to accept that the compilation size increases up to 100% however it allows the browser to load the data directly into RAM without recompiling the string

Maybe we should add an option like ?arraybuffer&base64 instead of fixing the old code

I wanted to circle back here, because it’s been bugging me. I’ve benchmarked this, and parsing base64 is faster than parsing a string representation of a Uint8Array for files over 5kB. The difference is greater the larger the file. When factoring in network transit time, I imagine base64 is faster in all scenarios, because the Uint8Array requires 400% more bytes while base64 requires 33% more bytes.

That’s all. It’s worth benchmarking.

Base64 is only fast in the encoding process. My post-compiled code tests show that when I try to execute base64 decoder 300 threads at the same time are a lot slower than uint8.

I have another idea that instead of storing it as uin8 array, I will store it as latin code which allows better performance and smaller package size.

tachibana-shin · 2024-03-09T10:36:41Z

If I use latin method I will get packet size = standard packet size and copy speed to RAM is fastest but I am doubting the reliability after encoding

kevlened · 2024-03-09T16:37:39Z

Here's a benchmark. This shows for any file above ~6.5kB, it is more efficient to parse using base64.

It also shows the file size differences if you uncomment the logs and open the console. On average, uint8array will take 160% longer to download than base64, even if it's faster to parse at small file sizes.

Granted, this doesn't account for compression. Uint8arrays will compress better (~1/3 their final size), while base64 compresses to ~2/3 of their final size. The means that when compressed, uint8arrays will still take 30% longer to download before parsing begins.

It's possible your tests didn't include the actual parsing time of the Uint8array string. It's not possible to "load the file directly into memory" when you encode binary in js, because the browser always has to parse the js.

tachibana-shin · 2024-03-10T04:57:28Z

Here's a benchmark. This shows for any file above ~6.5kB, it is more efficient to parse using base64.

It also shows the file size differences if you uncomment the logs and open the console. On average, uint8array will take 160% longer to download than base64, even if it's faster to parse at small file sizes.

Granted, this doesn't account for compression. Uint8arrays will compress better (~1/3 their final size), while base64 compresses to ~2/3 of their final size. The means that when compressed, uint8arrays will still take 30% longer to download before parsing begins.

It's possible your tests didn't include the actual parsing time of the Uint8array string. It's not possible to "load the file directly into memory" when you encode binary in js, because the browser always has to parse the js.

js translation time is negligible, blink takes just over 10ms to translate 3mb js directly to machine code

kevlened · 2024-03-10T05:24:52Z

The 3mb Uint8Array you’re parsing is <1mb in base64, so it would be parsed in <3ms, giving 7ms to parse the base64 to a Uint8Array. 7ms in this context is more than enough time, so the base64 nets a faster time, as the benchmark shows.

I’m not sure I understand how you’d like to change the benchmark.

kevlened added 2 commits February 27, 2024 08:48

test the file contents

a1c96b2

use base64 instead of an integer array

99ae1f3

kevlened added 2 commits February 27, 2024 09:27

single to double-quoted strings

8c05a31

consistent semicolons

dc080d0

make isomorphic

f32a74a

kevlened force-pushed the main branch from f349fe6 to f32a74a Compare February 27, 2024 18:20

simple version, but vitest doesn't start

118e0f3

tachibana-shin self-requested a review February 28, 2024 15:56

tachibana-shin reviewed Feb 28, 2024

View reviewed changes

src/main.ts Outdated Show resolved Hide resolved

tachibana-shin added 4 commits February 29, 2024 09:13

resolve base64 decode with native module

a6d0c6a

decode with atob if exists on self

20acbb7

update doc

3f7635e

fix test

55e547e

tachibana-shin merged commit 32ba3bd into tachibana-shin:main Feb 29, 2024
5 checks passed

kevlened commented Feb 29, 2024

View reviewed changes

src/main.ts Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance and bundled size #6

Improve performance and bundled size #6

kevlened commented Feb 27, 2024

codesandbox bot commented Feb 27, 2024

stackblitz bot commented Feb 27, 2024

kevlened commented Feb 27, 2024

kevlened commented Feb 27, 2024

tachibana-shin commented Feb 28, 2024

kevlened commented Feb 28, 2024

kevlened commented Feb 28, 2024 •

edited

Loading

kevlened commented Feb 28, 2024

tachibana-shin commented Feb 29, 2024

kevlened commented Mar 9, 2024

tachibana-shin commented Mar 9, 2024

tachibana-shin commented Mar 9, 2024

kevlened commented Mar 9, 2024 •

edited

Loading

tachibana-shin commented Mar 10, 2024

kevlened commented Mar 10, 2024

Improve performance and bundled size #6

Improve performance and bundled size #6

Conversation

kevlened commented Feb 27, 2024

codesandbox bot commented Feb 27, 2024

Review or Edit in CodeSandbox

stackblitz bot commented Feb 27, 2024

kevlened commented Feb 27, 2024

kevlened commented Feb 27, 2024

tachibana-shin commented Feb 28, 2024

kevlened commented Feb 28, 2024

kevlened commented Feb 28, 2024 • edited Loading

kevlened commented Feb 28, 2024

tachibana-shin commented Feb 29, 2024

kevlened commented Mar 9, 2024

tachibana-shin commented Mar 9, 2024

tachibana-shin commented Mar 9, 2024

kevlened commented Mar 9, 2024 • edited Loading

tachibana-shin commented Mar 10, 2024

kevlened commented Mar 10, 2024

kevlened commented Feb 28, 2024 •

edited

Loading

kevlened commented Mar 9, 2024 •

edited

Loading