Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document that dropping a async_fs::File with buffered writes may not immediately close file on drop #14

Open
erickt opened this issue Sep 16, 2022 · 3 comments

Comments

@erickt
Copy link

erickt commented Sep 16, 2022

I just found an unexpected use case that would be worthwhile documenting here, and in blocking. As best as I can tell, if a user does:

let mut file = async_fs::File::open("some-file").await?;
file.write_all(b"some bytes").await?;
drop(file)

The underlying std::fs::File is not necessarily dropped when we drop the async_fs::File. This can happen because of how blocking implements AsyncWrite. Whenever it starts out in the Idle state, it creates an async pipe, and submits the (read_pipe, real_file) to blocking's background thread pool. Any subsequent writes are written into the write_pipe that's eventually consumed by the background task. If there's a large number of outstanding writes, then it might take a decent amount of time before the underlying std::fs::File is actually dropped.

The fix for this is to explicitly a function that ultimately calls Unblock::poll_stop, like async_fs::File::close, or async_fs::File::poll_flush.

We ended up running into this in Fuchsia when downloading a large number of files. While we thought we had sufficient controls around the number of concurrently written files by controlling when we dropped async_fs::File, but some users were hitting the "too many opened files" error because by default OSX's ulimit -n is only 256.

Would it be possible to update the docs for async_fs::File to mention that users should run flush/sync_data/sync_all/close/etc if they write data, otherwise the underlying file might not get closed until some indeterminate point in the future?

@taiki-e
Copy link
Collaborator

taiki-e commented Sep 16, 2022

It makes sense to mention that flush is needed. BufWriter also has similar docs.

@erickt
Copy link
Author

erickt commented Sep 16, 2022

Yeah, I think if async_fs::File had language similar to that BufWriter that's a little more explicit about how async_fs::File is implemented. tokio::fs::File's docs also directly speak to this issue.

I suppose another way to approach this is we could try to get rid of the decoupling between file.write(buf).await completion and when the underlying write syscall is called. We could associate a futures::channel::oneshot with each write, which async_fs::File::write could await on, but that'd probably have some impact on our throughput.

Or maybe something clever could be done using the recently stabilized std::thread::scope to get rid of the pipe? That'd probably be a pretty big redesign though.

@taiki-e
Copy link
Collaborator

taiki-e commented Sep 20, 2022

I have not investigated the details regarding alternatives, but at this point I think it is reasonable to improve the docs.

As for std scope threads, it is usually incompatible with asynchronous code because it blocks the current thread until all spawned threads have been completed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants