Allow disabling bzip2 compression for index files #1081
Go's compress/gzip isn't a parallel implementation, so it's quite a bit slower than pgzip on most modern servers. The benchmark run below shows that publishing a Debian bullseye mirror snapshot (amd64, arm64, armhf, source) is about 1.34 times faster with pgzip (when skipping bz2 using MR aptly-dev#1081).

```
hyperfine -w 1 -m 3 -L aptly aptly-nobz2,aptly-nobz2-pgzip -p "{aptly} -config aptly.conf publish drop bullseye || true" "{aptly} -config aptly.conf publish snapshot --skip-bz2=true --skip-contents --skip-signing bullseye"
Benchmark 1: aptly-nobz2 -config aptly.conf publish snapshot --skip-bz2=true --skip-contents --skip-signing bullseye
  Time (mean ± σ):     35.548 s ±  0.378 s    [User: 39.465 s, System: 10.046 s]
  Range (min … max):   35.149 s … 35.902 s    3 runs

Benchmark 2: aptly-nobz2-pgzip -config aptly.conf publish snapshot --skip-bz2=true --skip-contents --skip-signing bullseye
  Time (mean ± σ):     26.592 s ±  0.069 s    [User: 42.207 s, System: 9.676 s]
  Range (min … max):   26.521 s … 26.660 s    3 runs

Summary
  'aptly-nobz2-pgzip -config aptly.conf publish snapshot --skip-bz2=true --skip-contents --skip-signing bullseye' ran
    1.34 ± 0.01 times faster than 'aptly-nobz2 -config aptly.conf publish snapshot --skip-bz2=true --skip-contents --skip-signing bullseye'
```

Signed-off-by: Sjoerd Simons <sjoerd@collabora.com>
Thanks for this change, it looks good in itself!
Apt first downloads the Release or InRelease file, which lists all the other index files. From there it picks the most suitable one (e.g. Packages, Packages.gz etc.) for a given component/architecture.

For reference, Debian itself doesn't use .bz2-compressed Packages files anymore, just uncompressed, gzip and xz (see e.g. http://ftp.debian.org/debian/dists/bullseye/Release).

The background of my various patches is that for Apertis.org we're moving from reprepro to aptly. Reprepro, or at least the version we used, also doesn't support bzip2-compressed Packages files and has been working just fine :). Hence not wanting to spend lots of CPU cycles on bzip2 now.

For the future, adding xz compression may be more interesting, as that provides a better compression ratio. But really, the size of Packages files is not a high priority.
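The selection step described in this comment can be sketched roughly as follows: given the set of index files a Release file lists, a client walks a preference order and takes the first variant that is present. This is a minimal illustration, not apt's implementation; the function name `pickIndex`, the sample listing and the preference order are all assumptions made for the example.

```go
package main

import "fmt"

// pickIndex returns the first variant of base, in the caller's preference
// order, that appears in the listing. The order is supplied by the caller;
// it is an assumption here and not apt's actual default.
func pickIndex(listed map[string]bool, base string, preferredExts []string) (string, bool) {
	for _, ext := range preferredExts {
		name := base + ext
		if listed[name] {
			return name, true
		}
	}
	return "", false
}

func main() {
	// Hypothetical Release listing for a repository published without bzip2.
	listed := map[string]bool{
		"main/binary-amd64/Packages":    true,
		"main/binary-amd64/Packages.gz": true,
	}

	// Illustrative preference order only ("" means the uncompressed file).
	order := []string{".xz", ".bz2", ".gz", ""}

	name, ok := pickIndex(listed, "main/binary-amd64/Packages", order)
	fmt.Println(name, ok)
}
```

Because the .bz2 variant is simply absent from the listing, the lookup falls through to the gzip file, which is why dropping bzip2 does not break clients that can read one of the remaining variants.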
Using bzip2 generates smaller index files (roughly 20% smaller Packages files) but it comes with a big performance penalty. When publishing a Debian mirror snapshot (amd64, arm64, armhf, source) without contents, skipping bzip2 speeds up publishing by around 1.8 times.

```
$ hyperfine -w 1 -L skip-bz2 true,false -m 3 -p "aptly -config aptly.conf publish drop bullseye || true" "aptly -config aptly.conf publish snapshot --skip-bz2={skip-bz2} --skip-contents --skip-signing bullseye"
Benchmark 1: aptly -config aptly.conf publish snapshot --skip-bz2=true --skip-contents --skip-signing bullseye
  Time (mean ± σ):     35.567 s ±  0.307 s    [User: 39.366 s, System: 10.075 s]
  Range (min … max):   35.311 s … 35.907 s    3 runs

Benchmark 2: aptly -config aptly.conf publish snapshot --skip-bz2=false --skip-contents --skip-signing bullseye
  Time (mean ± σ):     64.740 s ±  0.135 s    [User: 68.565 s, System: 10.129 s]
  Range (min … max):   64.596 s … 64.862 s    3 runs

Summary
  'aptly -config aptly.conf publish snapshot --skip-bz2=true --skip-contents --skip-signing bullseye' ran
    1.82 ± 0.02 times faster than 'aptly -config aptly.conf publish snapshot --skip-bz2=false --skip-contents --skip-signing bullseye'
```

Allow skipping bz2 creation for setups where faster publishing is more important than Packages file size.

Signed-off-by: Sjoerd Simons <sjoerd@collabora.com>
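The effect of the option on the published tree can be pictured with a small sketch. It assumes that each index is written as an uncompressed file plus a gzip copy, with the bzip2 copy being the optional part. The function `indexVariants` is hypothetical and is not taken from aptly's source.

```go
package main

import "fmt"

// indexVariants lists the files that would be written for one index,
// assuming the uncompressed and gzip copies are always produced and the
// bzip2 copy is optional. Illustration only, not aptly's actual code.
func indexVariants(base string, skipBz2 bool) []string {
	variants := []string{base, base + ".gz"}
	if !skipBz2 {
		variants = append(variants, base+".bz2")
	}
	return variants
}

func main() {
	fmt.Println(indexVariants("Packages", false)) // bzip2 copy included
	fmt.Println(indexVariants("Packages", true))  // bzip2 copy skipped
}
```

Under these assumptions, skipping bzip2 removes exactly one file per index and leaves the others untouched, which matches the trade-off described in the commit message: less CPU time spent on compression at the cost of not offering the smallest variant.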
28c5c78 to 36f29db
Codecov Report
@@            Coverage Diff             @@
##           master    #1081      +/-   ##
==========================================
- Coverage   52.15%   52.12%   -0.03%
==========================================
  Files          73       73
  Lines       11257    11270      +13
==========================================
+ Hits         5871     5875       +4
- Misses       4822     4830       +8
- Partials      564      565       +1
Cool, thanks for the changes!
This was merged and is available in aptly 1.5.0, but it isn't documented on the website.