cargo build breaks with "resource temporarily unavailable (error 35)" (I'm assuming EAGAIN) on v2.2.0 and v2.2.2 #809

Open
siriobalmelli opened this issue Jan 30, 2024 · 3 comments

Comments

@siriobalmelli

siriobalmelli commented Jan 30, 2024

Running cargo build on a large Rust project fails with "resource temporarily unavailable (error 35)" when cargo attempts to link files between subdirectories of build.

Bug visible on:

  • Apple M1 Max
  • both Ventura and Sonoma
  • both v2.2.0 and v2.2.2
  • "vanilla" (created using defaults) pool+datasets created by both v2.2.2 (M1) and v2.1.0 (x86)
  • both the internal SSD and an external Samsung T7

The bug does not occur on M1 Max, Ventura, v2.1.6, either with a "vanilla" (defaults) pool+dataset or with the following options:

NAME      PROPERTY                       VALUE                          SOURCE
testpool  size                           928G                           -
testpool  capacity                       0%                             -
testpool  altroot                        -                              default
testpool  health                         ONLINE                         -
testpool  guid                           2780145945418842444            -
testpool  version                        -                              default
testpool  bootfs                         -                              default
testpool  delegation                     on                             default
testpool  autoreplace                    on                             local
testpool  cachefile                      -                              default
testpool  failmode                       wait                           default
testpool  listsnapshots                  off                            default
testpool  autoexpand                     on                             local
testpool  dedupratio                     1.00x                          -
testpool  free                           927G                           -
testpool  allocated                      1022M                          -
testpool  readonly                       off                            -
testpool  ashift                         13                             local
testpool  comment                        -                              default
testpool  expandsize                     -                              -
testpool  freeing                        0                              -
testpool  fragmentation                  0%                             -
testpool  leaked                         0                              -
testpool  multihost                      off                            default
testpool  checkpoint                     -                              -
testpool  load_guid                      4414947270116414331            -
testpool  autotrim                       off                            default
testpool  compatibility                  openzfs-2.1-freebsd            local
testpool  feature@async_destroy          enabled                        local
testpool  feature@empty_bpobj            active                         local
testpool  feature@lz4_compress           active                         local
testpool  feature@multi_vdev_crash_dump  enabled                        local
testpool  feature@spacemap_histogram     active                         local
testpool  feature@enabled_txg            active                         local
testpool  feature@hole_birth             active                         local
testpool  feature@extensible_dataset     active                         local
testpool  feature@embedded_data          active                         local
testpool  feature@bookmarks              enabled                        local
testpool  feature@filesystem_limits      enabled                        local
testpool  feature@large_blocks           active                         local
testpool  feature@large_dnode            active                         local
testpool  feature@sha512                 enabled                        local
testpool  feature@skein                  enabled                        local
testpool  feature@edonr                  disabled                       local
testpool  feature@userobj_accounting     active                         local
testpool  feature@encryption             active                         local
testpool  feature@project_quota          active                         local
testpool  feature@device_removal         enabled                        local
testpool  feature@obsolete_counts        enabled                        local
testpool  feature@zpool_checkpoint       enabled                        local
testpool  feature@spacemap_v2            active                         local
testpool  feature@allocation_classes     enabled                        local
testpool  feature@resilver_defer         enabled                        local
testpool  feature@bookmark_v2            enabled                        local
testpool  feature@redaction_bookmarks    enabled                        local
testpool  feature@redacted_datasets      enabled                        local
testpool  feature@bookmark_written       enabled                        local
testpool  feature@log_spacemap           active                         local
testpool  feature@livelist               enabled                        local
testpool  feature@device_rebuild         enabled                        local
testpool  feature@zstd_compress          enabled                        local
testpool  feature@draid                  enabled                        local
testpool  feature@zilsaxattr             disabled                       local
testpool  feature@head_errlog            disabled                       local
testpool  feature@blake3                 disabled                       local

NAME                    PROPERTY               VALUE                  SOURCE
testpool/testvol/repos  type                   filesystem             -
testpool/testvol/repos  creation               Tue Jan 30  9:22 2024  -
testpool/testvol/repos  used                   872M                   -
testpool/testvol/repos  available              898G                   -
testpool/testvol/repos  referenced             872M                   -
testpool/testvol/repos  compressratio          2.49x                  -
testpool/testvol/repos  mounted                yes                    -
testpool/testvol/repos  quota                  none                   default
testpool/testvol/repos  reservation            none                   default
testpool/testvol/repos  recordsize             256K                   local
testpool/testvol/repos  mountpoint             /Users/testvol/repos   local
testpool/testvol/repos  sharenfs               off                    default
testpool/testvol/repos  checksum               on                     default
testpool/testvol/repos  compression            lz4                    local
testpool/testvol/repos  atime                  off                    local
testpool/testvol/repos  devices                on                     default
testpool/testvol/repos  exec                   on                     default
testpool/testvol/repos  setuid                 on                     default
testpool/testvol/repos  readonly               off                    default
testpool/testvol/repos  zoned                  off                    default
testpool/testvol/repos  snapdir                hidden                 default
testpool/testvol/repos  aclmode                discard                default
testpool/testvol/repos  aclinherit             restricted             default
testpool/testvol/repos  createtxg              30                     -
testpool/testvol/repos  canmount               on                     local
testpool/testvol/repos  xattr                  sa                     local
testpool/testvol/repos  copies                 1                      default
testpool/testvol/repos  version                5                      -
testpool/testvol/repos  utf8only               on                     -
testpool/testvol/repos  normalization          none                   -
testpool/testvol/repos  casesensitivity        sensitive              -
testpool/testvol/repos  vscan                  off                    default
testpool/testvol/repos  nbmand                 off                    default
testpool/testvol/repos  sharesmb               off                    default
testpool/testvol/repos  refquota               none                   default
testpool/testvol/repos  refreservation         none                   default
testpool/testvol/repos  guid                   14751730203951669605   -
testpool/testvol/repos  primarycache           all                    default
testpool/testvol/repos  secondarycache         all                    default
testpool/testvol/repos  usedbysnapshots        0B                     -
testpool/testvol/repos  usedbydataset          872M                   -
testpool/testvol/repos  usedbychildren         0B                     -
testpool/testvol/repos  usedbyrefreservation   0B                     -
testpool/testvol/repos  logbias                latency                default
testpool/testvol/repos  objsetid               520                    -
testpool/testvol/repos  dedup                  off                    default
testpool/testvol/repos  mlslabel               none                   default
testpool/testvol/repos  sync                   disabled               local
testpool/testvol/repos  dnodesize              auto                   local
testpool/testvol/repos  refcompressratio       2.49x                  -
testpool/testvol/repos  written                872M                   -
testpool/testvol/repos  logicalused            2.03G                  -
testpool/testvol/repos  logicalreferenced      2.03G                  -
testpool/testvol/repos  volmode                default                default
testpool/testvol/repos  filesystem_limit       none                   default
testpool/testvol/repos  snapshot_limit         none                   default
testpool/testvol/repos  filesystem_count       none                   default
testpool/testvol/repos  snapshot_count         none                   default
testpool/testvol/repos  snapdev                hidden                 default
testpool/testvol/repos  acltype                nfsv4                  default
testpool/testvol/repos  context                none                   default
testpool/testvol/repos  fscontext              none                   default
testpool/testvol/repos  defcontext             none                   default
testpool/testvol/repos  rootcontext            none                   default
testpool/testvol/repos  relatime               on                     default
testpool/testvol/repos  redundant_metadata     all                    local
testpool/testvol/repos  overlay                on                     default
testpool/testvol/repos  encryption             aes-256-gcm            -
testpool/testvol/repos  keylocation            none                   default
testpool/testvol/repos  keyformat              passphrase             -
testpool/testvol/repos  pbkdf2iters            1048576                -
testpool/testvol/repos  encryptionroot         testpool/testvol       -
testpool/testvol/repos  keystatus              available              -
testpool/testvol/repos  special_small_blocks   0                      default
testpool/testvol/repos  com.apple.browse       on                     default
testpool/testvol/repos  com.apple.ignoreowner  off                    default
testpool/testvol/repos  com.apple.mimic        off                    default
testpool/testvol/repos  com.apple.devdisk      poolonly               default

I could not find a similar bug upstream, and I'm unsure if this is the right place to report this.

Unfortunately, this came up while commissioning an M1 Max machine (my first), and the above is from my notes, since the machine now works. If this is not a known bug, however, and can't easily be reproduced with the above, please let me know and I will set up some VMs (or find a spare M1 machine) to reproduce it.

@lundman
Contributor

lundman commented Jan 30, 2024

clonefile() can return resource temporarily unavailable (error 35) if it is called "too quickly" after the file is created; a zpool sync between the two avoids this. Newer macOS versions use clonefile() when it is available, whereas older versions did not.

You could try with clonefile disabled and see if the issue still occurs.
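
For anyone who wants to try reproducing this outside of cargo, here is a minimal Rust sketch of the "create, then copy straight away" pattern described above. It rests on an assumption rather than anything confirmed in this thread: that std::fs::copy on macOS attempts a clone of the source file before falling back to a regular copy. The function name create_then_copy is made up for illustration, and both paths would need to point at a ZFS dataset for the test to mean anything.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write `src`, close it, and copy it to `dst`
/// immediately, with no `zpool sync` in between.
/// Returns the number of bytes copied on success.
fn create_then_copy(src: &Path, dst: &Path) -> io::Result<u64> {
    let mut f = File::create(src)?;
    f.write_all(b"freshly written data")?;
    drop(f); // closes the file; this does not sync the pool

    // Assumption: on macOS this tries a clone first. If the clone is
    // rejected, the error (raw OS error 35 in this issue) is returned here.
    fs::copy(src, dst)
}
```

Calling create_then_copy in a loop with fresh file names, and printing e.raw_os_error() for any Err(e), should show whether error 35 appears on a given pool.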

@siriobalmelli
Author

That makes sense, thank you.

After some searching I cannot seem to find how I would disable clonefile(); this also doesn't fully make sense to me since AFAIK this is a syscall and I have no way of influencing what syscalls cargo makes.

Is there a dataset or zpool option to disable support for clonefile()? In the context of ZFS, cloning refers to zfs clone et al. so not much joy finding something there either.

Is this issue with clonefile() being done "too soon" after file creation being tracked somewhere? Perhaps the simplest approach is to link it here and track the evolution of the underlying issue?

Thank you

@lundman
Contributor

lundman commented Feb 14, 2024

Looks like we should pull in
openzfs/zfs#15842

thiblahute added a commit to thiblahute/cargo that referenced this issue May 2, 2024
Falling back to hard_link when that happens, retrying can lead to very
long wait before copying works (up to 4secs in my tests) while
hard_linking works straight away.

Looks related to openzfsonosx/zfs#809

Closes rust-lang#13838
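
The fallback described in that commit message can be sketched roughly as follows. This is an illustrative sketch using only the Rust standard library, not the actual cargo patch: the function name copy_or_hard_link and the constant EAGAIN_MACOS are made up here, and the value 35 is hard-coded from the error reported in this issue (it means EAGAIN on macOS but maps to a different error on other platforms).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Error number quoted in this issue; EAGAIN on macOS.
/// Hard-coded for illustration only, it is not portable.
const EAGAIN_MACOS: i32 = 35;

/// Hypothetical helper: copy `src` to `dst`, and if the copy fails with
/// error 35, create a hard link instead of retrying the copy.
fn copy_or_hard_link(src: &Path, dst: &Path) -> io::Result<()> {
    match fs::copy(src, dst) {
        Ok(_) => Ok(()),
        // Retrying the copy could take seconds; a hard link works at once.
        Err(e) if e.raw_os_error() == Some(EAGAIN_MACOS) => fs::hard_link(src, dst),
        // Any other failure is passed through unchanged.
        Err(e) => Err(e),
    }
}
```

The trade-off is that a hard link shares its data with the source file instead of being an independent copy, which is presumably acceptable for build artifacts but worth keeping in mind.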