Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transport: fix racey send to writes channel in WriteStatus #1546

Merged
merged 1 commit into from
Oct 4, 2017
Merged

transport: fix racey send to writes channel in WriteStatus #1546

merged 1 commit into from
Oct 4, 2017

Conversation

gyuho
Copy link
Contributor

@gyuho gyuho commented Sep 29, 2017

Concurrent 'SendMsg' calls to stream lead to
multiple 'WriteStatus' calls, while closing
'writes' channel is not synchronized.

This patch marks 'streamDone' first before 'ht.do',
so that following 'WriteStatus' does not trigger panic
on 'writes' channel.

Address #1111.

c.f. etcd-io/etcd#8627.

@gyuho
Copy link
Contributor Author

gyuho commented Sep 29, 2017

Reproducible case:

func TestHandlerTransport_HandleStreams_MultiWriteStatus(t *testing.T) {
	st := newHandleStreamTest(t)
	handleStream := func(s *Stream) {
		if want := "/service/foo.bar"; s.method != want {
			t.Errorf("stream method = %q; want %q", s.method, want)
		}
		st.bodyw.Close() // no body

		var wg sync.WaitGroup
		wg.Add(5)
		for i := 0; i < 5; i++ {
			go func() {
				defer wg.Done()
				st.ht.WriteStatus(s, status.New(codes.OK, ""))
			}()
		}
		wg.Wait()
	}
	st.ht.HandleStreams(
		func(s *Stream) { go handleStream(s) },
		func(ctx context.Context, method string) context.Context { return ctx },
	)
}

We are using grpc-go v1.6.0.

Thanks!

@gyuho gyuho changed the title transport: block writes channel send on WriteStatus transport: block send to 'writes' channel on WriteStatus Sep 29, 2017
@gyuho gyuho changed the title transport: block send to 'writes' channel on WriteStatus transport: fix racey send to writes channel in WriteStatus Sep 29, 2017
@gyuho
Copy link
Contributor Author

gyuho commented Oct 4, 2017

Could anybody review this, please? /cc @menghanl @MakMukhi @dfawley

We are seeing the panics on this code path pretty often when closing down grpc-server with inflight streams.

type ServerTransport interface {
	...

	// WriteStatus sends the status of a stream to the client.  WriteStatus is
	// the final call made on a stream and always occurs.
	WriteStatus(s *Stream, st *status.Status) error

	...

I believe serverHandlerTransport.streamDone check is there to prevent redundant error writes from SendMsg. But it is possible that following SendMsg calls on the same stream error and WriteStatus before serverHandlerTransport.do completes, thus panic-ing on closing of closed ht.writes channel.

This patch just marks the stream as done before calling serverHandlerTransport.do, so subsequent error-ing SendMsg calls do not trigger panic on WriteStatus.

@menghanl
Copy link
Contributor

menghanl commented Oct 4, 2017

The change LGTM. Thanks for the fix.
Can you also copy your test to handler_server_test.go?

@gyuho
Copy link
Contributor Author

gyuho commented Oct 4, 2017

@menghanl PTAL.

And could we have patch release v1.6.1 with this?

Thanks a lot!

@menghanl menghanl added this to the 1.7 Release milestone Oct 4, 2017
@menghanl
Copy link
Contributor

menghanl commented Oct 4, 2017

Thanks for the quick fix.
The release date for 1.7.0 is next Wednesday (Oct 11).
Is it OK for you to wait for 1.7.0?

@gyuho
Copy link
Contributor Author

gyuho commented Oct 4, 2017

@menghanl Sounds good. Thanks!

@menghanl
Copy link
Contributor

menghanl commented Oct 4, 2017

Can please also you do a rebase?
The test failed because go get staticcheck failed (disabled in #1561). Thanks!

Concurrent 'SendMsg' calls to stream lead to
multiple 'WriteStatus' calls, while closing
'writes' channel is not synchronized.

This patch marks 'streamDone' first before 'ht.do',
so that following 'WriteStatus' does not trigger panic
on 'writes' channel.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
@gyuho
Copy link
Contributor Author

gyuho commented Oct 4, 2017

@menghanl Just rebased with current master. Thanks!

Copy link
Contributor

@menghanl menghanl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
LGTM.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants