WIP: https & http/2 support (both h2c and full h2) #27
Conversation
Codecov Report
@@            Coverage Diff            @@
##             master      #27   +/-  ##
=========================================
  Coverage          ?    57.3%
=========================================
  Files             ?       11
  Lines             ?      623
  Branches          ?        0
=========================================
  Hits              ?      357
  Misses            ?      237
  Partials          ?       29
=========================================
Continue to review full report at Codecov.
Is there a Slack channel where we can discuss scale-to-zero support for this commit?
Looks like the hello-osiris example in the main branch only works for services of type LoadBalancer. I see some changes in this commit that appear to address that. Is that true? We tried using ClusterIP in the service YAML and it did not seem to scale out.
To add to the above comment: if the hello-osiris example is type LoadBalancer, then scale to zero works and scale out can be triggered by accessing the service's cluster IP. If the hello-osiris example is type ClusterIP, then scale to zero works, but accessing the service's cluster IP results in 'Connection refused' with no scale out.
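For reference, the only difference between the two failing and working cases described above is the Service's type field. A minimal manifest along these lines reproduces it (the service name and the osiris.deislabs.io/enabled annotation are assumptions based on the Osiris README, not taken from this PR):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hello-osiris
  annotations:
    osiris.deislabs.io/enabled: "true"  # assumed Osiris opt-in annotation
spec:
  type: ClusterIP        # scale to zero works; scale out fails ("Connection refused")
  # type: LoadBalancer   # both scale to zero and scale out work
  selector:
    app: hello-osiris
  ports:
    - port: 80
```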
We don't currently have a Slack channel open to the community.
Any seemingly related changes are a coincidence; until now, I was unaware of any issue with services that aren't of that type. Can you clarify whether you see this problem in both the master branch and this branch? Might I suggest opening a separate issue for this, please?
Can you please provide detailed instructions for reproducing the behavior you are observing? After attempting to replicate this, I have a pretty good guess at what's going on.
@krancour
@tbeerbower, I'm glad to know things are working for you right now.
I can give you a little insight into why you might have observed this issue and then seen it go away. I'll describe the steps I used to try to replicate it, and how I was able to do so under one particular condition but not under others...
And then I remembered why this happens... The activator is a multi-tenant component, with "multi-tenancy" in this case meaning that it serves many Osiris-enabled services/deployments across many namespaces. Internally, the activator maintains a map of public IPs, private IPs, and known internal and external DNS names to the specific deployments it may need to reactivate.

But here's the thing... there could be a

So... the bottom line is that if you had tested with

Hope this helps explain why you might have seen an issue and then also seen it go away.
@krancour, thanks for the explanation. I don't think it applies in my case, as I was always testing with the cluster IP. I did see another issue today that may be related. I have another deployment with a
If I immediately retry the curl command, then it scales out and gives the correct response. However, something is not right with the pod: it doesn't contain the Osiris container, so at this point it will never scale back to zero. Another strange thing: if I delete the deployment and recreate it, the new initial pod will not contain the Osiris container either. I think I can only get it working again by changing the name of the deployment and recreating it. Sorry I don't have clear steps to reproduce; I'll try to get it to happen again with a simple test case. Any suggestions on how to debug this? Thanks!
Can we please merge this PR into master? Why has it been pending for so long?
tbh, it's pending feedback from an internal user at MS who specifically requested this feature and is supposed to confirm that it adequately enables their use case. I'll follow up with them tomorrow.
Thanks, Kent. It looks like when I tried to trigger activation via the service endpoint, i.e. service-name.namespace.svc.cluster.local, through a Kubernetes ingress, the activator did not instantiate pods. We use an ingress-controller load balancer to route to a local ingress in a namespace, which talks to the service endpoint.
I'm not quite following you. Can you open a separate issue please and describe steps to reproduce?
I finally figured out that the load balancer annotation was missing in my service YAML: osiris.deislabs.io/loadBalancerHostname: After specifying that, it worked fine. Is there a way to specify a global default for this value in the values.yaml of the Osiris Helm charts, instead of having to update the service YAML every time we deploy a new service?

Another observation: the time taken to terminate idle pods is not absolute, i.e. with a 150-second metrics polling interval specified in values.yaml, I sometimes see pods terminate after 5 minutes. Not sure why that is the case.

Also, if the pods are terminating and a request comes in during termination, then even though pod instantiation is happening, we frequently run into 502 errors. Is there any recommendation for this?
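For anyone else hitting this, the fix described above amounts to adding the hostname annotation to the Service manifest. A hedged sketch follows; the service name and hostname are placeholders, and only the loadBalancerHostname annotation key comes from this thread:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service  # placeholder name
  annotations:
    osiris.deislabs.io/enabled: "true"  # assumed Osiris opt-in annotation
    osiris.deislabs.io/loadBalancerHostname: my-service.example.com  # placeholder hostname
spec:
  type: LoadBalancer
  selector:
    app: my-service
  ports:
    - port: 80
```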
@gogiraghu1 glad to hear it's working for you now.
This is what confused me about your issue. When you
I thought your issue might be something like that, but wasn't sure enough to suggest it because of the confusion.
Nope. The purpose of that annotation is to say "this service is the one that needs to be activated by requests for this domain name." If a given domain name were mapped to multiple services, it would be ambiguous which of those should be activated by a corresponding request. If you don't mind, please open a separate issue if you want to discuss that further. I don't want the HTTP/2 thread to be hijacked by something unrelated.
Sure, Kent. Will do.