(aws-eks): Custom::AWSCDK-EKS-HelmChart StateNotFoundError: State functionActiveV2 not found #23862
Comments
Update: The error in the description of this bug seems to happen on any aws-cdk EKS cluster I try to create (I've tried a few different tutorial examples). I'm also trying on my Linux machine now to see if I get different results. I still got an error, but a different one now:
Hi. As this issue is related to the sample repo, the best place to report it is https://github.com/aws-samples/cdk-eks-fargate/issues. As it is not directly relevant to aws-cdk, I am closing this issue for now.
@jaredtbates No, I did not find a solution to this issue. I went back to using Terraform for now lol. I'd love to figure this out, but I am not sure how to debug it. @pahud mentioned that it is not directly related to this repository and suggested opening a new issue here: I just forgot, to be honest.
I'm seeing this exact same error when deploying a custom resource backed by a NodeJS lambda (wrapped in a Provider). Oddly this only seems to reproduce in the ap-south-2 region. In the provider's OnEvent log group I see
But there's no corresponding RequestId in the Lambda function's log group (the function that the provider says it's invoking). I'll try to create a repro, but given the recent comments from others also seeing this error, I suggest reopening this issue.
Reopening this issue as it is still relevant.
Hi @jaredtbates @beamsies @mmayors. Instead of deploying the sample from https://github.com/aws-samples/cdk-eks-fargate, can you share a small sample that I can use to reproduce this in my account? I'd be happy to help investigate.
The reason I need a small sample for reproduction is that I suspect this error comes from the Lambda function being unable to call back to the CloudFormation service during custom resource creation. That usually happens when the Lambda function is attached to VPC subnets that have no egress. I am not 100% sure, though, so I need a small sample to dig into it.
You can try redeploying it with
I had the same error when I used a cdk8s-plus-24 kplus deployment. When I reverted my deployment to a k8s.io/v1 KubeDeployment, the error disappeared.
We aren't using cdk8s or cdk8s+ at this point, just the built-in CDK constructs. @pahud I can try to get you a reproduction sometime next week if I have time. To follow up: if I let the update run its course, CloudFormation seems to time out after an hour or so and then just keeps going and succeeds. I guess I hadn't waited long enough? But I still think this error is causing the trouble.
@jaredtbates It depends on your deployment size. If you just deploy an EKS 1.24 cluster with a default managed nodegroup in an existing VPC, the deployment should complete in about 20 minutes. AFAIK there are some recent known EKS issues:
Anyway, feel free to provide a minimal working sample here that I can reproduce in my account. I will need to know how you configure your EKS cluster so I can rule out known issues like those. Copying @mmayors.
So I attached a minimal repro, but... it only reproduces in a specific account and only in ap-south-2 ¯_(ツ)_/¯
If I deploy to ap-south-2 using a particular account, CloudFormation fails when creating the custom resources. It successfully creates maybe 3 out of 10 custom resources, then the rest change to
If I deploy to us-east-1, it succeeds. And if I deploy to ap-south-2 using any other account opted in to the region, it also succeeds. I know that's not a lot to go on, but it's what I have. Open to any suggestions to help narrow this down.
(Not sure if this is relevant to the stack trace.) It doesn't matter whether I use the Node.js 16 or 18 Lambda runtime. But AFAIK JavaScript SDK V3 should come preinstalled with Node 18, not V2.
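For anyone writing their own invoker and hitting the same `StateNotFoundError`, here is a rough sketch of one defensive pattern: try the `functionActiveV2` waiter and fall back to an older waiter name if the SDK copy in the runtime does not know it. This is not what the CDK provider framework does. `WaiterClient`, `waitForActive`, and `oldSdk` are hypothetical names for illustration, and the sketch assumes the older `functionActive` waiter exists in the SDK build in question.

```typescript
// Hypothetical minimal interface, modelled on the waitFor(state, params)
// call that appears in the stack trace. Not the real SDK typings.
interface WaiterClient {
  waitFor(state: string, params: { FunctionName: string }): Promise<void>;
}

// True when the error looks like the SDK's "unknown waiter state" failure.
function isStateNotFound(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { name?: string; code?: string };
  return e.name === "StateNotFoundError" || e.code === "StateNotFoundError";
}

// Try the newer waiter first; fall back only when the waiter name itself
// is unknown. Any other failure is rethrown unchanged.
async function waitForActive(client: WaiterClient, functionName: string): Promise<string> {
  const params = { FunctionName: functionName };
  try {
    await client.waitFor("functionActiveV2", params);
    return "functionActiveV2";
  } catch (err) {
    if (!isStateNotFound(err)) throw err;
    // Assumption: the older waiter name is available in this SDK build.
    await client.waitFor("functionActive", params);
    return "functionActive";
  }
}

// Test double standing in for an SDK build without the V2 waiter.
const oldSdk: WaiterClient = {
  waitFor(state) {
    if (state === "functionActiveV2") {
      const e = new Error("State functionActiveV2 not found");
      e.name = "StateNotFoundError";
      return Promise.reject(e);
    }
    return Promise.resolve();
  },
};

waitForActive(oldSdk, "my-handler").then((used) => console.log("waited with", used));
```

The fallback is deliberately narrow: it only triggers on the missing-waiter error, so timeouts or permission failures still surface as they do today.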
I'm not going to be able to get a reproduction, since we already updated our clusters and don't have time. Sorry about that. It's likely an edge case specific to our versions or environment, then.
@jaredtbates No problem. If anyone is able to get a reproduction, please share it in the comments. This issue will auto-close in a few days if there are no further comments. Feel free to reopen if necessary.
This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.
@pahud I have the same issue on a new project, for now just trying to deploy a simple cluster with an instance of Kafka and its associated Zookeeper helper to EKS using cdk, kubectl 1.24 and no customized networking, in the eu-west-1 region. This cluster has deployed correctly recently, but I had to pull it down because the instance type I specified was too small for the number of pods, and now each deployment attempt fails with the same error:
The exact resource that it fails at (here “pubsubclustermanifestzookeepersvc4A739AD8”) differs between attempts, and can be any of the services, the AWS auth object, or the pods. Here is a reproduction repo with just the stack I am trying to deploy: https://github.com/strongmindsnan/CdkTest
@strongmindsnan This issue may not be related to EKS, but rather to CFN and custom resources. Please watch #24358 for updates.
Describe the bug
I'm trying to deploy an eks cluster from a tutorial I'm following here:
After I run `cdk bootstrap`, I then run `cdk deploy`. Then I get this error a little more than halfway through the process:
11:27:27 PM | CREATE_FAILED | Custom::AWSCDK-EKS-HelmChart | my-cluster/chart-a...r/Resource/Default
Received response status [FAILED] from custom resource. Message returned: StateNotFoundError: State functionActiveV2 not found.
    at constructor.loadWaiterConfig (/var/runtime/node_modules/aws-sdk/lib/resource_waiter.js:196:32)
    at new constructor (/var/runtime/node_modules/aws-sdk/lib/resource_waiter.js:64:10)
    at features.constructor.waitFor (/var/runtime/node_modules/aws-sdk/lib/service.js:271:18)
    at Object.defaultInvokeFunction [as invokeFunction] (/var/task/outbound.js:1:826)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
    at async invokeUserFunction (/var/task/framework.js:1:2149)
    at async onEvent (/var/task/framework.js:1:365)
    at async Runtime.handler (/var/task/cfn-response.js:1:1543)
(RequestId: 0d8ef9af-72af-4130-82bb-c480d217e863)
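When reading a trace like this, it can help to separate frames by where the file lives. The sketch below assumes the usual Lambda layout, where `/var/runtime` holds modules shipped with the managed runtime and `/var/task` holds the deployed function code. Under that assumption, the failing `waitFor` call is made from the deployed code but resolved by the SDK copy bundled with the runtime. `classifyFile` and `parseFrames` are hypothetical helper names, not part of any library.

```typescript
type FrameOrigin = "runtime-bundled" | "deployed-code" | "node-internal" | "other";

interface Frame {
  fn: string;
  file: string;
  origin: FrameOrigin;
}

// Assumption: /var/runtime = modules shipped with the Lambda runtime,
// /var/task = code uploaded with the function.
function classifyFile(file: string): FrameOrigin {
  if (/^\/var\/runtime\//.test(file)) return "runtime-bundled";
  if (/^\/var\/task\//.test(file)) return "deployed-code";
  if (/^(internal\/|node:)/.test(file)) return "node-internal";
  return "other";
}

// Pull "at <fn> (<file>:<line>:<col>)" frames out of a trace, including one
// that has been flattened onto a single line.
function parseFrames(trace: string): Frame[] {
  const frameRe = /\bat (?:async )?(.+?) \(([^()]+?):\d+:\d+\)/g;
  const frames: Frame[] = [];
  let m: RegExpExecArray | null;
  while ((m = frameRe.exec(trace)) !== null) {
    frames.push({ fn: m[1], file: m[2], origin: classifyFile(m[2]) });
  }
  return frames;
}

// A shortened excerpt of the trace reported in this issue.
const sample =
  "StateNotFoundError: State functionActiveV2 not found. " +
  "at constructor.loadWaiterConfig (/var/runtime/node_modules/aws-sdk/lib/resource_waiter.js:196:32) " +
  "at Object.defaultInvokeFunction [as invokeFunction] (/var/task/outbound.js:1:826) " +
  "at processTicksAndRejections (internal/process/task_queues.js:95:5) " +
  "at async invokeUserFunction (/var/task/framework.js:1:2149)";

const frames = parseFrames(sample);
for (const f of frames) {
  console.log(f.origin + "\t" + f.fn + "\t" + f.file);
}
```

If the topmost frames are runtime-bundled, as they are here, the SDK version inside the runtime is a reasonable first thing to check.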
Expected Behavior
I'm expecting the `cdk deploy` command to successfully deploy the EKS CDK stack, since it comes from a tutorial on the AWS blog.
Current Behavior
The `cdk deploy` command failed with the following error:
11:27:27 PM | CREATE_FAILED | Custom::AWSCDK-EKS-HelmChart | my-cluster/chart-a...r/Resource/Default
Received response status [FAILED] from custom resource. Message returned: StateNotFoundError: State functionActiveV2 not found.
    at constructor.loadWaiterConfig (/var/runtime/node_modules/aws-sdk/lib/resource_waiter.js:196:32)
    at new constructor (/var/runtime/node_modules/aws-sdk/lib/resource_waiter.js:64:10)
    at features.constructor.waitFor (/var/runtime/node_modules/aws-sdk/lib/service.js:271:18)
    at Object.defaultInvokeFunction [as invokeFunction] (/var/task/outbound.js:1:826)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
    at async invokeUserFunction (/var/task/framework.js:1:2149)
    at async onEvent (/var/task/framework.js:1:365)
    at async Runtime.handler (/var/task/cfn-response.js:1:1543)
(RequestId: 0d8ef9af-72af-4130-82bb-c480d217e863)
Reproduction Steps
https://github.com/aws-samples/cdk-eks-fargate
npm install
npm i -g aws-cdk
cdk bootstrap
cdk deploy
Possible Solution
I have searched all over the internet for similar issues and I am not sure. I am learning from a tutorial and ran into this strange error. I've tried different versions of aws-cdk (going down) and that didn't help either.
Additional Information/Context
One thing to note that may be causing this: I'm using aws-nuke to purge all resources when I'm done for the day, since I'm just trying to get the cluster up, running, and configured the way I like it. I do this for cost reasons, so that I don't incur charges for something that isn't serving any apps or websites.
CDK CLI Version
2.62.1 (build 8641449)
Framework Version
^2.31.1
Node.js Version
v18.13.0
OS
WSL2
Language
TypeScript
Language Version
^4.0.2
Other information
I'm attempting to deploy on `us-east-2`.