Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some improvements to OME-TIFF write performance #4242

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

melissalinkert
Copy link
Member

88b846d is primarily meant to address #4204, while 0b42e1b is expected to provide some help for #4204, #3983, and #3480.

If the use of ByteArrayOutputStream in 0b42e1b is too concerning, I can rework that change to just reduce the seeks in that were removed at line 1181 - that alone would give some measurable benefit.

Using the same test as #4204 (comment), I now see:

$ for i in `cat timepoint-counts.txt` ; do java SmallPyramidWriter $i.ome.tiff $i; done
Populating metadata...
Writing image to '1.ome.tiff'...
init: 32 ms
image writing: 347 ms
close: 214 ms
Done.
Populating metadata...
Writing image to '8.ome.tiff'...
init: 36 ms
image writing: 699 ms
close: 250 ms
Done.
Populating metadata...
Writing image to '64.ome.tiff'...
init: 31 ms
image writing: 3409 ms
close: 955 ms
Done.
Populating metadata...
Writing image to '128.ome.tiff'...
init: 35 ms
image writing: 7984 ms
close: 1009 ms
Done.
Populating metadata...
Writing image to '512.ome.tiff'...
init: 41 ms
image writing: 68755 ms
close: 2649 ms
Done.
Populating metadata...
Writing image to '1024.ome.tiff'...
init: 26 ms
image writing: 257061 ms
close: 5343 ms
Done.
Populating metadata...
Writing image to '2048.ome.tiff'...
Switching to BigTIFF (by file size)
init: 41 ms
image writing: 1002326 ms
close: 10170 ms
Done.

Comparing bfconvert performance on gh-3983&sizeX=512&sizeY=1024&pixelType=uint16&series=27&sizeT=55.fake (which matches the dimensions in #3983), I see total conversion times:

Local disk Network disk
develop 58s 351s
this PR 22s 91s

To check that this did not impact working with slides, I also tried bfconvert -noflat CMU-1.svs CMU-1.ome.tiff with and without this change. As expected, I did not see any change in conversion performance; both output OME-TIFF files behaved identically when opened in QuPath.

All together, that looks like it might be enough of an improvement to justify closing #4204 / #3983 / #3480, but is only one set of tests - confirmation with other test data and on other systems would definitely be appreciated. In particular, if any of the original issue reporters (@NicoKiaru / @anntzer / @ebremer / @tischi) have time and interest to try this, comments would be welcome.

@melissalinkert melissalinkert added this to the 8.0.0 milestone Sep 22, 2024
@NicoKiaru
Copy link
Contributor

Thanks @melissalinkert ! I'll give it a shot tomorrow.

@anntzer
Copy link

anntzer commented Sep 23, 2024

From a very quick test (not exactly the same dataset as #3983, but similar: 5 series x 141 planes, written to the same network drive or locally), comparing with 6.13.0, this PR is actually 1) worse when writing to a local fast storage (tmpfs): 27s vs. 16s and 2) still as bad when writing to the network drive: ~300s.

@NicoKiaru
Copy link
Contributor

This fixes #4204!

@ebremer
Copy link

ebremer commented Sep 30, 2024

@melissalinkert looking forward to testing this! Now, can you coax some more speed out of the readers? :-)

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performed some initial testing on the pyramidal OME-TIFF writing. Using a real-world example from IDR and the AstroP65.ome.tiff file which has 1402 Z-sections and 3 channels i.e. over 1.2K planes.

Using Bio-Formats 7.3.1, the default bfconvert -noflat command completed in 544m11.512s. With this PR included and three consecutive executions, the same command completed in ~48min. So a massive performance improvement due when writing the IFDs which matches the initial investigation and the testing in the original issue.

From a code perspective, the change consumes the new API introduced in #3410 allowing to use directly the offset of the IFD rather than recomputing it from its index. One comment about dropping the unused currentFullResolution variable.

I'll look into the second change next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants