Only partially able to read file through ipns mount #5328

Open
kvm2116 opened this issue Aug 1, 2018 · 35 comments

Comments

@kvm2116
Contributor

kvm2116 commented Aug 1, 2018

Version information:

go-ipfs version: 0.4.18-dev-4bca53e
Repo version: 7
System version: amd64/linux
Golang version: go1.10

Type:

Bug

Description:

I created a 300 MB text file of random ASCII characters using the following command:
base64 /dev/urandom | head -c 300000000 > file.txt
I then added this text file to the /ipns mount on that machine.

On another machine, I tried reading the first 1,000,000 bytes of the file using head:
head -c 1000000 /ipns/QmezBPmm4RRBRTBYPhVHsSSAPS2Q8hRTDd3PrfTsG8yvnf/file.txt

This command prints some lines and then hangs indefinitely.
With debug logging enabled, the following line repeats: "10:42:23.157 DEBUG bitswap: 9 keys in bitswap wantlist workers.go:185"
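
For a rough sense of scale, the arithmetic below estimates how many leaf chunks the file and the head read would span. It is only a sketch: it assumes the file was split with go-ipfs's default fixed-size chunker of 262144 bytes (256 KiB), which this report does not state, and `chunks_for` is a hypothetical helper name.

```shell
# Assumption: the file was chunked with the default 262144-byte chunker.
CHUNK_SIZE=262144

# chunks_for BYTES: number of CHUNK_SIZE chunks needed to cover BYTES
# (ceiling division using integer arithmetic).
chunks_for() {
  echo $(( ($1 + CHUNK_SIZE - 1) / CHUNK_SIZE ))
}

echo "whole file: $(chunks_for 300000000) chunks"   # the 300000000-byte file.txt
echo "head read:  $(chunks_for 1000000) chunks"     # head -c 1000000
```

Under that assumption the file spans 1145 chunks, and the first 1,000,000 bytes fall within the first 4 of them.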

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

@Stebalien Any thoughts?
I am trying to debug, but without much luck so far.

@Stebalien
Member

When that happens, could you check ipfs swarm peers --streams to see if the machines are still connected and if they have open bitswap streams?

This could also be #5183.

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

This is what the output of "ipfs swarm peers --streams" looks like when it is stuck:

mkunal@node-1:~$ ipfs swarm peers --streams
/ip4/128.110.153.130/tcp/4001/ipfs/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q
/ipfs/bitswap/1.1.0
/ipfs/kad/1.0.0
/ipfs/kad/1.0.0

@Stebalien
Member

Damn. Assuming peer A is downloading and peer B is serving, can you run ipfs bitswap wantlist -p $PEER_A on peer B and ipfs bitswap wantlist on peer A? This is looking more like #5183.

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

On peer B (serving):
$ipfs bitswap wantlist -p QmRHCnAmHGjZACNND1v5AJw6ndCPyrN5nT7k5umSsZ69D4
QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv

On peer A (downloading):
$ ipfs bitswap wantlist
QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv

On peer B (serving), I verified the blocks exist by printing the raw data using "ipfs object get QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv"
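
Comparing the two lists by eye works for three keys; for longer wantlists a small script can do it. The sketch below assumes each list has been saved to a text file, one key per line. The function name `common_keys` and the temporary files are illustrative and not part of ipfs.

```shell
# common_keys FILE_A FILE_B: print the keys that appear in both files.
# comm requires sorted input, so each list is sorted first.
common_keys() {
  a_sorted=$(mktemp) && b_sorted=$(mktemp) || return 1
  LC_ALL=C sort -u "$1" > "$a_sorted"
  LC_ALL=C sort -u "$2" > "$b_sorted"
  LC_ALL=C comm -12 "$a_sorted" "$b_sorted"
  rm -f "$a_sorted" "$b_sorted"
}

# Sample data: the wantlists reported above for peer A and peer B.
peer_a=$(mktemp)
peer_b=$(mktemp)
cat > "$peer_a" <<'EOF'
QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv
EOF
cp "$peer_a" "$peer_b"

# Count the keys that both peers list.
common_keys "$peer_a" "$peer_b" | wc -l
```

In a live session the two files would come from redirecting `ipfs bitswap wantlist` on peer A and `ipfs bitswap wantlist -p $PEER_A` on peer B. For the sample lists, all three keys are common to both peers.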

@Stebalien
Member

Interesting...

Does peer B have those blocks? Does it hang when running ipfs block stat QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg?

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

No hanging on peer B.

$ipfs block stat QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
Key: QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
Size: 262158

$ ipfs block stat QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv
Key: QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv
Size: 262158

$ ipfs block stat QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
Key: QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
Size: 262158

On peer A (downloading), it hangs for the above 3 commands.
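
To check several keys without the terminal blocking on each hang, the probe can be wrapped in a time limit. This is a minimal sketch, assuming GNU coreutils `timeout` is installed. `probe_keys` and the `PROBE_CMD` override are hypothetical names introduced here for illustration; by default the function runs the same `ipfs block stat` command used above.

```shell
# probe_keys LIMIT KEY...: run `ipfs block stat` on each key with a time limit.
# PROBE_CMD is a hypothetical override for dry runs (not an ipfs feature);
# when it is unset, the real `ipfs block stat` command is used.
probe_keys() {
  limit=$1
  shift
  for key in "$@"; do
    # GNU timeout exits with status 124 when the time limit is hit.
    # The command is left unquoted on purpose so it splits into words.
    timeout "$limit" ${PROBE_CMD:-ipfs block stat} "$key" >/dev/null 2>&1 \
      && status=0 || status=$?
    case $status in
      0)   echo "$key ok" ;;
      124) echo "$key timed out (limit: $limit)" ;;
      *)   echo "$key failed (exit $status)" ;;
    esac
  done
}

# Example (on peer A), with a 10-second limit per key:
#   probe_keys 10 QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
```

A key that reports a timeout on peer A but "ok" on peer B matches the behaviour described in this comment.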

@Stebalien
Member

Awesome! Well, not for you, but this definitely looks like #5183. Could you follow the instructions here and upload the result to GitHub (or IPFS)? That'll help me figure out where IPFS is stuck.

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

Here are the results:

EDIT: added the binary
issue_dump.tar.gz

@Stebalien
Member

Can I get the exact binary you used as well?

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

My apologies. Added the binary in the tar.

issue_dump.tar.gz

@Stebalien
Member

Which peer is this and would you mind doing the other peer as well?

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

Edit: the two files were swapped

Sender (peer B):
sender_issue_dump.tar.gz

Downloader (peer A):

downloader_issue_dump.tar.gz

@Stebalien
Member

Stebalien commented Aug 23, 2018

Awesome! Thanks!

One more question: is this the first time you've started these nodes? (I'm trying to reproduce.)

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

I am renting these machines from Cloudlab (https://www.cloudlab.us/).

Since the time I obtained them (about 8 days ago), they have been running. I didn't reboot them.

Does that answer the question?

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

Please let me know how I can help to identify or fix the issue.

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

Not sure if this helps, but I would like to point out that "ipfs get file.txt" works as it should. It downloads the file in its entirety without any issues.

@Stebalien
Member

Oh. Hm. That's really odd. Can you run ipfs refs -r $file on peer A? It looks like it actually has the file in question, it just doesn't realize it...

Just to make sure: file.txt is the file you're trying to download through fuse, right?

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

Yes, file.txt is the file I am trying to download.

In the output, it prints one line and then hangs.
$ ipfs refs -r /ipns/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q/file.txt
QmaQ7e4fEWRizAUgGe5Dbbf5oAb8RfNtJCUpYxpYmQNbUn

Ah, I think I see what you mean.

In the earlier experiment, which worked:
On peer B, I did "ipfs add file.txt" -> output_hash
On peer A, "ipfs get output_hash"

In the /ipns mount scenario, which is hanging:
On peer B, "cp file.txt /ipns/local/file.txt"
On peer A, "head -c 1000000 /ipns/peer_a/file.txt"

@Stebalien
Member

Wait, but ipfs get /ipns/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q/file.txt succeeds? Is this on peer A?

@kvm2116
Contributor Author

kvm2116 commented Aug 23, 2018

I just tried that and it succeeds.

on peer A (the client trying to download):
ipfs get /ipns/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q/file.txt
Saving file(s) to file.txt
286.17 MiB / 286.17 MiB [===========================================================] 100.00% 12s

@Stebalien
Member

Does it actually download the complete file? (I'm trying to rule out a bug in ipfs get.)

@Stebalien
Member

And thank you for all your help on this!

@kvm2116
Contributor Author

kvm2116 commented Aug 24, 2018

Sure thing. Happy to help! Please feel free to let me know any way I can help.

I did "cat file.txt" and it printed the entire file.
I also did a diff on the downloaded file and the original file and they are identical.
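
For a file this size, comparing checksums is a compact alternative to diff, and the digests can also be computed separately on each machine. A minimal sketch, assuming GNU coreutils `sha256sum` is available; `same_digest` is a hypothetical helper, and the temporary files merely stand in for the original and downloaded copies of file.txt.

```shell
# same_digest FILE_A FILE_B: succeed only if both files have the same SHA-256.
same_digest() {
  a=$(sha256sum < "$1" | cut -d' ' -f1)
  b=$(sha256sum < "$2" | cut -d' ' -f1)
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Stand-ins for the original and the downloaded file.txt.
original=$(mktemp)
downloaded=$(mktemp)
printf 'sample contents\n' > "$original"
cp "$original" "$downloaded"

if same_digest "$original" "$downloaded"; then
  echo "identical"
else
  echo "different"
fi
```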

@Stebalien
Member

So, to summarize, on peer A:

  1. ipfs get /ipns/Qm... works.
  2. ipfs refs -r /ipns/Qm... hangs.
  3. cat /ipns/Qm... hangs.

@kvm2116
Contributor Author

kvm2116 commented Aug 24, 2018

Yes.
Any ideas on what might be causing this issue? I have the entire weekend available to run more tests/possibly implement the solution.

@Stebalien
Member

No, but the get versus refs difference is probably going to narrow this down a lot.

@Stebalien
Member

If you re-run these commands (#5328 (comment)), do you get the same results?

@kvm2116
Contributor Author

kvm2116 commented Aug 24, 2018

No, it varies. Sometimes more entries, sometimes fewer.

@Stebalien
Member

What's the output of ipfs diag cmds?

@Stebalien
Member

(on both machines)

@kvm2116
Contributor Author

kvm2116 commented Aug 24, 2018

I ran "cat /ipns/Qm.../file.txt". After it hung, I captured the output of "ipfs diag cmds".

Peer A (downloader):
Command    Active  StartTime        RunTime
repo/gc    false   Aug 24 15:39:03  14.671006ms
refs       false   Aug 24 15:39:09  5.458619543s
repo/gc    false   Aug 24 15:57:06  135.123628ms
pin/ls     false   Aug 24 15:57:11  254.572µs
refs       false   Aug 24 15:57:23  5.416848016s
pin/ls     false   Aug 24 16:21:03  231.248µs
repo/gc    false   Aug 24 16:21:06  144.058141ms
diag/cmds  true    Aug 24 16:22:42  1.515304ms

Peer B (sender):
Command           Active  StartTime        RunTime
bitswap/wantlist  false   Aug 24 15:36:00  888.235µs
bitswap/wantlist  false   Aug 24 15:36:07  227.549µs
bitswap/wantlist  false   Aug 24 15:36:48  352.875µs
bitswap/wantlist  false   Aug 24 15:38:27  222.73µs
diag/cmds         true    Aug 24 16:23:21  1.188582ms
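
When the command table gets long, a one-line filter can pick out the entries that are still running. The sketch below assumes the whitespace-separated column layout shown in this comment (Command, Active, StartTime, RunTime); `active_cmds` is a hypothetical helper name.

```shell
# active_cmds: read `ipfs diag cmds` output on stdin and print the names of
# commands whose Active column is "true" (the header row is skipped).
active_cmds() {
  awk 'NR > 1 && $2 == "true" { print $1 }'
}

# Sample input: rows taken from the peer A table above (abbreviated).
active_cmds <<'EOF'
Command Active StartTime RunTime
refs false Aug 24 15:57:23 5.416848016s
pin/ls false Aug 24 16:21:03 231.248µs
repo/gc false Aug 24 16:21:06 144.058141ms
diag/cmds true Aug 24 16:22:42 1.515304ms
EOF
```

On a live node the usage would be `ipfs diag cmds | active_cmds`. For the peer A sample the only active entry is diag/cmds itself.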

@Stebalien
Member

And does ipfs get still work? Does ipfs refs still hang?

Internally, it looks like anything that might cause ipfs refs to hang should cause ipfs get to hang.

@kvm2116
Contributor Author

kvm2116 commented Aug 24, 2018 via email

@kvm2116
Contributor Author

kvm2116 commented Aug 24, 2018 via email
