Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade AWS SDK for Go to support AWS IAM Roles for K8s ServiceAccounts #7450

Closed
tomaspinho opened this issue Sep 9, 2019 · 23 comments
Closed

Comments

@tomaspinho
Copy link

Is your feature request related to a problem? Please describe.
Yes, the current version of the AWS SDK for Go that ships with Vault (v1.19.39) does not support using a K8s ServiceAccount to authenticate against a a AWS IAM Role when Vault is deployed inside a Kubernetes cluster. See: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html

Describe the solution you'd like
Bump the version of AWS SDK for Go.

Describe alternatives you've considered
We could hijack the entrypoint of our Vault containers to prepopulate IAM credentials, but that would be ugly.

Explain any additional use-cases
NA

Additional context
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

@micahhausler
Copy link

The way the AWS code in Vault is currently structured, using IAM for Service Accounts on Kubernetes would not fully work with Vault.

There are several places Vault calls out to AWS, (I may be missing others)

All these use the /helper/awsutil package for AWS credential discovery, which will probably work fine for Auth, Storage, and Seal, but the Secret Engine is where this gets problematic. According to the IAM AssumeRoleWithWebIdentity docs:

The temporary security credentials created by AssumeRoleWithWebIdentity can be used to make API calls to any AWS service with the following exception: you cannot call the STS GetFederationToken or GetSessionToken API operations.

The AWS Secret Engine supports 3 credential methods

  • iam_user- This would work fine if the service account IAM role had user creation permissions
  • assumed_role- This would probably work.
  • federation_token - This will not work.

It may be worth allowing for different source credentials for the Auth, Storage, and Seal functionality than the Secret Engine, or just using as-is and documenting the Federation Token limitation when used with IAM for Service Accounts on Kubernetes

@joelthompson
Copy link
Contributor

Hey @micahhausler -- thanks for the heads up here! I don't see this as a major drawback as the same limitations are true when using Vault on an EC2 instance in an IAM instance profile or an ECS cluster with a task role, and this is also called out in the documentation:

Notice: Due to limitations in AWS, in order to use the federation_token credential type, Vault must be configured with IAM user credentials. AWS does not allow temporary credentials (such as those from an IAM instance profile) to be used.

@micahhausler
Copy link

micahhausler commented Sep 10, 2019

@joelthompson Awesome.

The other UX issue here is that Web Identity and Assume Role providers use the role_arn field, and you can use the SDK to do an Assume Role after a Web Identity cred fetch, so you may want to consider if that is allowed and how that can be specified

joelthompson added a commit to joelthompson/vault that referenced this issue Sep 10, 2019
@joelthompson
Copy link
Contributor

Hey @micahhausler -- I submitted a PR to bump the golang SDK in #7458. However, I don't think that will achieve the goals of @tomaspinho.

The only time that the automated k8s credentials (stscreds.NewWebIdentityCredentials) will be used, as far as I can see in the SDK, is in session.assumeWebIdentity. The only calls to that are from session.resolveCredentials and session.resolveCredsFromProfile. The only calls to these come from session.mergeConfigSrcs and that, in turn, only gets called by session.newSession and that only gets called by session.NewSessionWithOptions. And the only time that gets called by session.New (which Vault uses) is when the AWS_SDK_LOAD_CONFIG environment variable is set. Maybe this is a bug in the SDK that session.New and session.NewSession have such different behaviors?

If the upstream SDK won't be changing, it seems like the options would be to switch to session.NewSession (rather than session.New) or manually create a stscreds.WebIdentityRoleProvider (which involves duplicating and reading the associated environment variables for the token file path, role ARN, and role session name) and insert that provider explicitly into the credential chain so we're not depending on the SDK to do this for us.

Am I missing something? Does this seem right to you?

@tomaspinho
Copy link
Author

@joelthompson setting that environment variable is not a problem for us, as long as it works (and is documented for others). Let me know if I can help with anything!

@joelthompson
Copy link
Contributor

Hey @tomaspinho -- I don't have familiarity with running things on k8s. Would you be able to build the code in my PR and test to see if it works, both with and without the environment variable set? Alternatively, if you have a quick script you can share to get Vault up and running on EKS to test it myself, that would also be great.

Agree that this needs documenting, but I want to separate out the SDK version bump from other changes to Vault for ease of review.

@tomaspinho
Copy link
Author

@joelthompson Yeah, I can definitely help with that. Is there a way to enable debug logs for Vault, especifically for AWS SDK interactions?

@joelthompson
Copy link
Contributor

@tomaspinho I'm not sure there's a good way to enable that without more code changes. Probably the easiest way is to just enable the AWS KMS seal and only give access to the KMS key to the Vault pod's service account IAM role.

@micahhausler
Copy link

The only time that the automated k8s credentials (stscreds.NewWebIdentityCredentials) will be used, as far as I can see in the SDK, is in session.assumeWebIdentity. The only calls to that are from session.resolveCredentials and session.resolveCredsFromProfile. The only calls to these come from session.mergeConfigSrcs and that, in turn, only gets called by session.newSession and that only gets called by session.NewSessionWithOptions. And the only time that gets called by session.New (which Vault uses) is when the AWS_SDK_LOAD_CONFIG environment variable is set. Maybe this is a bug in the SDK that session.New and session.NewSession have such different behaviors?

Yes, session.New() is missing the behavior (unless AWS_SDK_LOAD_CONFIG is set), it would be worth raising this as a Go SDK bug.

If the upstream SDK won't be changing, it seems like the options would be to switch to session.NewSession (rather than session.New) or manually create a stscreds.WebIdentityRoleProvider (which involves duplicating and reading the associated environment variables for the token file path, role ARN, and role session name) and insert that provider explicitly into the credential chain so we're not depending on the SDK to do this for us.

Am I missing something? Does this seem right to you?

I would lean against manually creating a stscreds.WebIdentityRoleProvider and adding that to the chain.

The shortest path to adding support might be just to switch from session.New to session.NewSession, as the input signatures are the same, and you can just bubble up a possible error.

@tomaspinho
Copy link
Author

Honestly think the whole AWS_SDK_LOAD_CONFIG in session.New() shenanigan is by design, but having hit that sort of thing in the past multiple times, it would be best if they documented this behaviour a little bit better.

@joelthompson
Copy link
Contributor

Maybe? Can't hurt to wait a couple days to see how the SDK team reacts. Maybe they make the behavior more consistent, maybe they just document the diff, maybe they do nothing? I think it would be good to see how they react first.

@tomaspinho
Copy link
Author

Issue created on AWS's side says it's expected behaviour and that session.New is actually deprecated.

@fernandesnikhil
Copy link

fernandesnikhil commented Sep 27, 2019

Any update on what this project intends to do now? We could really use the IAM-Service account auth even if it means setting an extra variable: AWS_SDK_LOAD_CONFIG on our Vault pods.

@tomaspinho
Copy link
Author

tomaspinho commented Oct 3, 2019

Tested building Vault with the changes in #7458 and running inside our testing EKS cluster both with and without AWS_SDK_LOAD_CONFIG set. Vault doesn't assume the IAM role. An updated awscli running inside the same pod assumes it correctly.

@dgozalo
Copy link
Contributor

dgozalo commented Oct 24, 2019

From AWS side they already confirmed that this is the expected behavior. Is a decision being made to move this forward? I guess this would be solved by changing New to NewSession in the code base.

It would be incredibly beneficial for us to use IRSA, so I would gladly help with it.

@dgozalo
Copy link
Contributor

dgozalo commented Oct 25, 2019

So I've been testing a solution for this and I managed to make it assume the role locally ( but it should work in a Pod, too ).

The problem that I've seen is that Vault is creating its own CredentialsChain, this overrides the default behavior of the sdk and it ends up not adding the WebIdentityProvider so it never attempts to assume the role.

I also tried creating a NewWebIdentityProvider and when calling session.NewSession without passing any configuration at all, it assumed the role automatically, this allowed me to pass that provider to the chain and it also worked.

The part that I don't understand is why is Vault creating its own credential providers chain and not letting the SDK work it out on its own, this seems to block the use of any further enhancements added by it. For example now that IRSA is supported, you'd need to add the WebIdentityProvider manually and it could happen with any further change.

I'm willing to open a PR and fix this, but I'll need to know if Vault wants to keep its own Providers chain or delegate that to the SDK or just create a WebIdentityProvider and add it to the chain manually.

@dgozalo
Copy link
Contributor

dgozalo commented Oct 25, 2019

I managed to get this to work running in a Pod, it's assuming a role created by IRSA.

I tried it by creating a WebIdentityProvider and adding it to the existing CredentialsChain and reading the env variables, otherwise it would never even attempt to use this Provider as Vault is overriding them.

Next I found this issue: kubernetes/kubernetes#82573 which basically doesn't allow non-root users to access the injected token.

As vault is using a custom user and group, the usual workaround (setting securityContext.fsGroup to 65534) didn't work.

I had to set securityContext.fsGroup to 1000 (the id of the vault group) and securityContext.runAsGroup to 1000 in the Vault pod.

The application correctly assumed the role like this.

I will open a Pull Request later so we can discuss it further.

@mukgupta
Copy link

Any update on this? We too need IRSA support.

@rubroboletus
Copy link
Contributor

Any progress with this issue?

@paranoidsp
Copy link

As of now, we're working around this by making an STS call to retrieve temporary secret/key from AWS, and using that to authenticate the Vault cli. This issue looks to be mostly solved, so could anyone let us know if there's anything we could contribute to push this forward?

@thardie
Copy link

thardie commented Mar 4, 2020

Getting the temporary secret/key is not a workable solution. They expire too quickly, and vault doesn't exit when they do. It just goes into a complain loop about expired token. Vault needs to read the token file (which changes every 24 hours) and use the proper API to keep the session alive indefinately.

@tyrannosaurus-becks
Copy link
Contributor

Closed by #7738

@pbernal pbernal modified the milestones: triaged, 1.5 Mar 6, 2020
@hferentschik
Copy link

hferentschik commented Apr 15, 2020

hi @tyrannosaurus-becks, @pbernal - any ETA when and in which release this will be available? Having this fix would help us a ton.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests