Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve AWS IAM unique IDs #2814

Merged
merged 5 commits into from
Jun 7, 2017
Merged

Conversation

joelthompson
Copy link
Contributor

This adds a (now-default) option on roles to resolve the
bound_iam_principal_arn (when using AWS IAM auth) to AWS's internal
unique ID. The primary reason for this is to prevent a particular role
or user from being deleted and recreated with the same ARN and thus
taking over Vault permissions that were intended to be bound to the
previous ARN, which more closely mimics AWS's behavior.

By preferentially resolving via the internal unqiue ID rather than the
ARN, this also fixes the issue in #2729 when resolution is turned on.

This also includes a commit that supersedes the changes in #2802. The fix
in #2802 is correct, but by using the new ARN parser, we're able to better
test the resolution to ensure that we're at least parsing the ARN properly.

This adds a (now-default) option on roles to resolve the
bound_iam_principal_arn (when using AWS IAM auth) to AWS's internal
unique ID. The primary reason for this is to prevent a particular role
or user from being deleted and recreated with the same ARN and thus
taking over Vault permissions that were intended to be bound to the
previous ARN, which more closely mimics AWS's behavior.

By preferentially resolving via the internal unqiue ID rather than the
ARN, this also fixes the issue in hashicorp#2729
Similar to hashicorp#2729, the existing implementation for the aws ec2 auth
method can't handle an instance profile with a path in the name, when
using a bound IAM role. This improves the parsing of the instance
profile ARN to eliminate that error.
@jefferai jefferai added this to the 0.7.3 milestone Jun 6, 2017
@jefferai jefferai requested a review from vishalnayak June 6, 2017 18:17
Copy link
Member

@vishalnayak vishalnayak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joelthompson This is looking great to me. Nice work! I've left some minor comments.

// accounts using their IAM instance profile to get their credentials.
defaultAWSAccountID string

resolveArnToUniqueId func(logical.Storage, string) (string, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this to resolveArnToUniqueIDFunc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if err != nil {
return nil, fmt.Errorf("unable to fetch current caller: %v", err)
}
b.defaultAWSAccountID = *identity.Account
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a nil check for identity here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

stsRole := ""
if sts != nil {
stsRole = sts.StsRole
return nil, fmt.Errorf("failed to extract out IAM instance profile name from IAM instance profile ARN; error: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update the error message to indicate failure to extract out entity instead of profile name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

role creation (or upgrading to use this) succeed, then Vault has
already been able to resolve internal IDs, and it doesn't need any
further IAM permissions to authenticate users. If a role has been
deleted and recreated, and Vault has cached the old unique ID, the you
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/the you/you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if err != nil {
return "", err
}
return *userInfo.User.UserId, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When err == nil, can we trust the responses from GetUser, GetRole and GetInstanceProfile to be non-nil. If not, can we add nil checks appropriately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the current source, it's not possible. But, I don't see that as a guarantee documented anywhere, so better safe than sorry. I've added the additional checks to be safe.

"canonical_arn": canonicalArn,
"client_arn": callerID.Arn,
"canonical_arn": entity.canonicalArn(),
"client_user_id": callerID.UserId,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be the split version of the UserId in callerID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, great catch, thanks! Done (and added a test for it).

Copy link
Contributor Author

@joelthompson joelthompson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think I've addressed everything.

// accounts using their IAM instance profile to get their credentials.
defaultAWSAccountID string

resolveArnToUniqueId func(logical.Storage, string) (string, error)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if err != nil {
return "", err
}
return *userInfo.User.UserId, nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the current source, it's not possible. But, I don't see that as a guarantee documented anywhere, so better safe than sorry. I've added the additional checks to be safe.

if err != nil {
return nil, fmt.Errorf("unable to fetch current caller: %v", err)
}
b.defaultAWSAccountID = *identity.Account
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

stsRole := ""
if sts != nil {
stsRole = sts.StsRole
return nil, fmt.Errorf("failed to extract out IAM instance profile name from IAM instance profile ARN; error: %v", err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

role creation (or upgrading to use this) succeed, then Vault has
already been able to resolve internal IDs, and it doesn't need any
further IAM permissions to authenticate users. If a role has been
deleted and recreated, and Vault has cached the old unique ID, the you
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"canonical_arn": canonicalArn,
"client_arn": callerID.Arn,
"canonical_arn": entity.canonicalArn(),
"client_user_id": callerID.UserId,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, great catch, thanks! Done (and added a test for it).

@@ -80,20 +81,39 @@ func (b *backend) getClientConfig(s logical.Storage, region, stsRole, clientType
return nil, fmt.Errorf("could not compile valid credentials through the default provider chain")
}

stsConfig, err := b.getRawClientConfig(s, region, "sts")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this change? Will it always be a valid assumption to assume that you're getting STS credentials?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question (and I probably should have commented it).

Previously, Vault just assumed that if there wasn't an stsRole configured for the given account, to just use the default credentials provided. That worked because you were looking up two things:

  1. Instance IDs, which were basically guaranteed to be globally unique, so if you found the given instance ID, you knew it was the instance ID you were looking for.
  2. IAM instance profiles by name. These weren't globally unique across accounts, but it was OK because the only thing you looked for was the associated IAM role ARN, which does include the account number, so you were OK if you looked up the profile in the wrong account because it wouldn't have the associated IAM ARN you need.

In this PR, I need to resolve unique IDs, which requires looking up a particular IAM entity (user or role) in a given account with a given name. If I have an stsRole configured for the given account, then great, I'll just try to assume it. But, what if I don't have an stsRole defined for that account ID? I'm not a huge fan of the assumption that is being made that it should just look up using the default credentials -- I'm looking up the entity by name, but entities with the same name can exist in multiple accounts. I could check the returned ARN to ensure it has the expected account ID embedded in it, but this feels more correct to me. getClientConfig is being asked for an AWS API client in account 123, we should check to ensure that the client we're returning is actually in account 123. And that's the goal here.

Thinking through this a little more, there's a small issue here in that an operator could call auth/aws/config/sts with an account_id of 123456789012, but pass in an stsRole belonging to a different account, as the config/sts endpoint doesn't actually validate that the passed-in stsRole ARN belongs to account_id but I think that's probably a separate discussion (and I'll open a separate issue for it).

Will it always be a valid assumption to assume that you're getting STS credentials?

I'd put it a little differently. I believe it is a valid assumption that, if you can make any AWS API call, you are able to call sts:GetCallerIdentity. I've even tried an IAM policy to explicitly deny sts:GetCallerIdentity, and it still succeeds.

Does this all make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question was less about sts:GetCallerIdentity and more about whether it's safe to assume that you always want to use STS creds to make the client calls but from your explanation it sounds correct -- I assume that in any given account we should always be able to get STS creds? Or will this not be the case if they're not using IAM roles, in which case there needs to be a fallback?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to answer your question because the phrase "STS creds" isn't well defined, so I'm not sure what you're asking and thus not sure how to answer. Let me try to explain a little more, and if I'm still not answering your question, it'd probably be best to discuss offline. Apologies if this response is a bit basic, but I'm just trying to understand where we're not connecting.

STS is both a service that requires credentials to authenticate to it and a service that can act as a source of credentials. The code in question does both. I'm pretty sure that any type of IAM entity can authenticate to STS.

it's safe to assume that you always want to use STS creds to make the client calls

Which client calls?

If an stsRole has been configured for a given target account ID, then we use the existing creds Vault has been configured with to authenticate to STS to ask for a new set of credentials corresponding to the stsRole, and it uses those credentials for future requests. I'm not really changing this code path at all -- I'm just pulling the existing code up a level so the results can be used, but no material changes are being made. And we're depending on customers to configure the IAM roles such that Vault is able to retrieve credentials via sts:AssumeRole, so the customer has said he/she wants to get credentials from STS.

If no stsRole has been configured for an account (the big change I'm making here), then we just use the existing creds Vault has been configured with to authenticate to sts:GetCallerIdentity to learn the identity of the creds being given; we're not asking for any new credentials from STS. And we continue to use the creds Vault has been given (i.e., not any temporary credentials retrieved via STS) to make future client calls.

it's safe to assume that you always want to use STS creds to make the client calls

Which client calls? The only times we ask STS for credentials is when clients have explicitly told Vault to go to STS for credentials for those calls, and I'm not changing that.

I assume that in any given account we should always be able to get STS creds?

In any given account, we should always be able to call sts:GetCallerIdentity using any valid creds, whether those are IAM user creds or assumed role creds. Beyond that, it depends. If users have set up the appropriate STS configs in Vault and have set up the appropriate IAM permissions in AWS, then yes. But if's really a user configuration issue.

Or will this not be the case if they're not using IAM roles

The stsRole must be an IAM role, yes. Otherwise, even if Vault is using, say, IAM User credentials, all this will still work. IAM users can call sts:GetCallerIdentity and IAM users can call sts:AssumeRole (as long as the appropriate IAM permissions have been set up).

Does this answer what you're asking? If not, what am I missing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to go with "I'm out of my AWS depth here" and just trust you on this. Don't worry, it's not that you haven't answered my question, it's more that I'm not sure what my question is anymore, but your answer has convinced me that you know exactly what you're doing and I should leave it be :-D

@jefferai
Copy link
Member

jefferai commented Jun 7, 2017

Thanks a bunch @joelthompson !

@jefferai jefferai merged commit d858511 into hashicorp:master Jun 7, 2017
@joelthompson joelthompson deleted the aws_iam_unique_id branch June 9, 2017 03:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants