Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix/fix uio partial copies #10148

Merged
merged 6 commits into from
Apr 1, 2020

Conversation

fsvm88
Copy link
Contributor

@fsvm88 fsvm88 commented Mar 22, 2020

In zfs_write(), the loop continues to the next iteration without accounting for partial copies occurring in uiomove_iov when copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the kernel log, and the write failing.

Motivation and Context

On some workloads, writing to disk results in "zfs: accessing past end of object".
See issue #8673.

Description

In zfs_write(), the loop continues to the next iteration without accounting for partial copies occurring in uiomove_iov (which does not update the uio struct) when copy_from_user/__copy_from_user_inatomic return a non-zero status, which indicates the bytes left to copy.

How Has This Been Tested?

I had a reproducer workload: compiling GCC in a musl chroot, from a glibc host system. The bug would happen at random, at different parts and times of the build.
Before the provided PR, I was never able to compile GCC successfully in the chroot.
I had a couple revisions of the patch that worked partially, before reaching the ones submitted in this PR.
After applying the fixes in this PR, I was able to compile GCC 11 times in a row without any breakage.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Mar 26, 2020
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for running this down. Your analysis looks right to be, I'm surprised we didn't handle this, but that's clearly the case.

module/zcommon/zfs_uio.c Outdated Show resolved Hide resolved
module/zcommon/zfs_uio.c Outdated Show resolved Hide resolved
/ data itself, generate an EFAULT, and make
/ the loop break early without completing the
/ last chunk.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the detailed comment is welcome, but perhaps it could be a little more concise. Also please use the existing comment style found elsewhere in this function. You can run make checkstyle locally to verify the formatting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed all "make checkstyle" issues, and made the comment shorter.
Let me know if it's ok.

@fsvm88 fsvm88 force-pushed the bugfix/fix-uio-partial-copies branch from f2b6553 to 10ad043 Compare March 27, 2020 23:59
@codecov-io
Copy link

codecov-io commented Mar 28, 2020

Codecov Report

Merging #10148 into master will increase coverage by 0.10%.
The diff coverage is 27.27%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #10148      +/-   ##
==========================================
+ Coverage   79.28%   79.38%   +0.10%     
==========================================
  Files         385      385              
  Lines      122440   122445       +5     
==========================================
+ Hits        97079    97206     +127     
+ Misses      25361    25239     -122     
Flag Coverage Δ
#kernel 79.66% <27.27%> (-0.04%) ⬇️
#user 66.15% <ø> (+1.03%) ⬆️
Impacted Files Coverage Δ
module/os/linux/zfs/zfs_vnops.c 69.39% <0.00%> (-0.60%) ⬇️
module/zcommon/zfs_uio.c 88.77% <33.33%> (-3.86%) ⬇️
module/os/linux/spl/spl-kmem-cache.c 75.94% <0.00%> (-9.05%) ⬇️
module/zcommon/zfs_fletcher.c 75.65% <0.00%> (-2.64%) ⬇️
module/zfs/dsl_synctask.c 92.40% <0.00%> (-2.54%) ⬇️
lib/libzfs/libzfs_changelist.c 84.37% <0.00%> (-1.96%) ⬇️
module/os/linux/zfs/zfs_dir.c 82.29% <0.00%> (-1.15%) ⬇️
module/zfs/vdev_mirror.c 94.77% <0.00%> (-1.12%) ⬇️
module/zfs/vdev_trim.c 94.57% <0.00%> (-0.91%) ⬇️
cmd/zed/agents/zfs_mod.c 77.55% <0.00%> (-0.67%) ⬇️
... and 53 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0929c4d...7a7c85f. Read the comment docs.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Would you just mind rebasing this on the latest master branch. This will get us a clean CI run, and we'll get backported for the next 0.8 point release.

@behlendorf
Copy link
Contributor

@wgqimut would you mind reviewing this PR.

EFAULT.

Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
the loop, and leave a lengthy comment explaining the reason why this is done.

Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
… this is not reachable. Fix double-counting bug.

Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
…request on PR.

Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
@fsvm88 fsvm88 force-pushed the bugfix/fix-uio-partial-copies branch from ab66235 to 7a7c85f Compare March 31, 2020 20:15
@fsvm88 fsvm88 changed the base branch from zfs-0.8-release to master March 31, 2020 20:15
@fsvm88
Copy link
Contributor Author

fsvm88 commented Mar 31, 2020

... Would you just mind rebasing this on the latest master branch. ...

Done, let me know if something else is missing.

Copy link
Contributor

@wgqimut wgqimut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Apr 1, 2020
@behlendorf behlendorf marked this pull request as ready for review April 1, 2020 16:39
@behlendorf
Copy link
Contributor

@fsvm88 thanks for the rebase. This is ready to go, I'll get it merged. Thank you for resolving this!

@behlendorf behlendorf merged commit c9e3efd into openzfs:master Apr 1, 2020
@fsvm88 fsvm88 deleted the bugfix/fix-uio-partial-copies branch April 3, 2020 21:40
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 15, 2020
In zfs_write(), the loop continues to the next iteration without
accounting for partial copies occurring in uiomove_iov when 
copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the 
kernel log, and the write failing.

Account for partial copies and update uio struct before returning
EFAULT, leave a comment explaining the reason why this is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: ilbsmart <wgqimut@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Closes openzfs#8673 
Closes openzfs#10148
@gmelikov
Copy link
Member

@fsvm88 can you point me to reproducer? I think it would be great to have a test for this case.

@fsvm88
Copy link
Contributor Author

fsvm88 commented Apr 16, 2020

@gmelikov As explained in the PR's detail, my workload was compiling big projects (kernel, GCC) in a musl chroot (from a glibc host).
Others before me have reported issues with highly concurrent write workloads and (possibly) memory reclaim occurring (see the issue #8673).
I believe this tends to occur more often in musl due to how it implements reading from/writing to files (this is just guessed).

tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 22, 2020
In zfs_write(), the loop continues to the next iteration without
accounting for partial copies occurring in uiomove_iov when 
copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the 
kernel log, and the write failing.

Account for partial copies and update uio struct before returning
EFAULT, leave a comment explaining the reason why this is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: ilbsmart <wgqimut@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Closes openzfs#8673 
Closes openzfs#10148
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 22, 2020
In zfs_write(), the loop continues to the next iteration without
accounting for partial copies occurring in uiomove_iov when 
copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the 
kernel log, and the write failing.

Account for partial copies and update uio struct before returning
EFAULT, leave a comment explaining the reason why this is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: ilbsmart <wgqimut@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Closes openzfs#8673 
Closes openzfs#10148
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 28, 2020
In zfs_write(), the loop continues to the next iteration without
accounting for partial copies occurring in uiomove_iov when 
copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the 
kernel log, and the write failing.

Account for partial copies and update uio struct before returning
EFAULT, leave a comment explaining the reason why this is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: ilbsmart <wgqimut@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Closes openzfs#8673 
Closes openzfs#10148
tonyhutter pushed a commit that referenced this pull request May 12, 2020
In zfs_write(), the loop continues to the next iteration without
accounting for partial copies occurring in uiomove_iov when 
copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the 
kernel log, and the write failing.

Account for partial copies and update uio struct before returning
EFAULT, leave a comment explaining the reason why this is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: ilbsmart <wgqimut@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Closes #8673 
Closes #10148
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
In zfs_write(), the loop continues to the next iteration without
accounting for partial copies occurring in uiomove_iov when 
copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the 
kernel log, and the write failing.

Account for partial copies and update uio struct before returning
EFAULT, leave a comment explaining the reason why this is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: ilbsmart <wgqimut@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Closes openzfs#8673 
Closes openzfs#10148
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants