Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

s3: possible race condition, adding new object to bucket between deletion of bucket objects and bucket deletion prevents deletion #21776

Open
max-allan-surevine opened this issue Aug 26, 2022 · 17 comments
Assignees
Labels
@aws-cdk/aws-s3 Related to Amazon S3 @aws-cdk/custom-resources Related to AWS CDK Custom Resources bug This issue is a bug. documentation This is a problem with documentation. p2

Comments

@max-allan-surevine
Copy link
Contributor

Describe the bug

If I create a log bucket. Then a content bucket, using log bucket for server access logs, when I do a "cdk destroy" the access logs are not removed and the stack fails to destroy.

Expected Behavior

The destroy to complete successfully! (The log bucket and content to be removed.)

Current Behavior

Cloudwatch logs for the delete objects Lambda suggests it was successful :

START RequestId: 72426a95-6f49-4d26-9642-2e15450b205a Version: $LATEST
2022-08-26T10:14:21.449Z	72426a95-6f49-4d26-9642-2e15450b205a	INFO	
{
    "RequestType": "Delete",
    "ServiceToken": "arn:aws:lambda:eu-west-2:123492961234:function:CdkBucketStack-CustomS3AutoDeleteObjectsCustomReso-kArk0g4I9Il2",
    "ResponseURL": "...",
    "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/0531d0e0-2522-11ed-abbb-0aa24eb467d4",
    "RequestId": "b81d7b6f-c28c-455f-83cf-8cf4be357191",
    "LogicalResourceId": "LogBucketAutoDeleteObjectsCustomResource7762F42C",
    "PhysicalResourceId": "f9dbbbc0-641f-4026-a15e-df345405a307",
    "ResourceType": "Custom::S3AutoDeleteObjects",
    "ResourceProperties": {
        "ServiceToken": "arn:aws:lambda:eu-west-2:123492961234:function:CdkBucketStack-CustomS3AutoDeleteObjectsCustomReso-kArk0g4I9Il2",
        "BucketName": "cdkbucketstack-logbucketcc3b17e8-79kbay52r42q"
    }
}

2022-08-26T10:14:27.491Z	72426a95-6f49-4d26-9642-2e15450b205a	INFO	submit response to cloudformation {
  Status: 'SUCCESS',
  Reason: 'SUCCESS',
  StackId: 'arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/0531d0e0-2522-11ed-abbb-0aa24eb467d4',
  RequestId: 'b81d7b6f-c28c-455f-83cf-8cf4be357191',
  PhysicalResourceId: 'f9dbbbc0-641f-4026-a15e-df345405a307',
  LogicalResourceId: 'LogBucketAutoDeleteObjectsCustomResource7762F42C',
  NoEcho: undefined,
  Data: undefined
}
END RequestId: 72426a95-6f49-4d26-9642-2e15450b205a
REPORT RequestId: 72426a95-6f49-4d26-9642-2e15450b205a	Duration: 6312.29 ms	Billed Duration: 6313 ms	Memory Size: 128 MB	Max Memory Used: 81 MB	Init Duration: 174.79 ms	

But the result of the cdk destroy is a fail because there is content :

CdkBucketStack: destroying...
11:15:21 | DELETE_FAILED        | AWS::S3::Bucket             | LogBucketCC3B17E8
The bucket you tried to delete is not empty (Service: Amazon S3; Status Code: 409; Error Code: BucketNotEmpty; Request ID: HJVZC3J1DP4N2QSM; S3 Extended Request ID: /OcCq7vdx6Yxt6bShdNQhgr8fUI3/pG1gmCuC4cu7dBm3juB+Sa9U+FM+O41vk+FVCV8Xwk49YA=; Proxy: null)


 ❌  CdkBucketStack: destroy failed Error: The stack named CdkBucketStack is in a failed state. You may need to delete it from the AWS console : DELETE_FAILED (The following resource(s) failed to delete: [LogBucketCC3B17E8]. ): The bucket you tried to delete is not empty (Service: Amazon S3; Status Code: 409; Error Code: BucketNotEmpty; Request ID: HJVZC3J1DP4N2QSM; S3 Extended Request ID: /OcCq7vdx6Yxt6bShdNQhgr8fUI3/pG1gmCuC4cu7dBm3juB+Sa9U+FM+O41vk+FVCV8Xwk49YA=; Proxy: null)

Reproduction Steps

Deploy the template below. Upload your current directory to "bucket" and wait up to an hour until there are some logs in logBucket. Then run the destroy.

(Yes the code is a bit janky, I did a lazy copy/paste and removed some logic that isn't relevant etc...)

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as iam from 'aws-cdk-lib/aws-iam';

export class CdkBucketStack extends cdk.Stack {
  public readonly Bucket: cdk.CfnOutput;
  public readonly logBucket: cdk.CfnOutput;
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    // Create a bucket for all the different logs.
    let logBucketProps: any = {
        removalPolicy: cdk.RemovalPolicy.DESTROY,
        autoDeleteObjects: true,
        accessControl: s3.BucketAccessControl.LOG_DELIVERY_WRITE,
    };
    logBucketProps["enforceSSL"] = true;
    logBucketProps["encryption"] = s3.BucketEncryption.S3_MANAGED;
    logBucketProps["blockPublicAccess"] = s3.BlockPublicAccess.BLOCK_ALL;
    const logBucket = new s3.Bucket(this, 'LogBucket', logBucketProps);
    logBucket.addToResourcePolicy(
      new iam.PolicyStatement({
        effect: iam.Effect.ALLOW,
        principals: [new iam.ServicePrincipal('logging.s3.amazonaws.com')],
        actions: ['s3:PutObject'],
        resources: [`${logBucket.bucketArn}/*`],
      }),
    );

    let BucketProps: any = {
        removalPolicy: cdk.RemovalPolicy.DESTROY,
        autoDeleteObjects: true,
    };
    BucketProps["enforceSSL"] = true;
    BucketProps["encryption"] = s3.BucketEncryption.S3_MANAGED;
    BucketProps["serverAccessLogsPrefix"] = "output-access-logs/";
    BucketProps["serverAccessLogsBucket"] = logBucket;
    BucketProps["blockPublicAccess"] = s3.BlockPublicAccess.BLOCK_ALL;
    const bucket = new s3.Bucket(this, 'OutputBucket', BucketProps);

    this.Bucket = new cdk.CfnOutput(this, 'S3Bucket', {
      value: bucket.bucketName,
      description: "The bucket",
     });
    this.logBucket = new cdk.CfnOutput(this, 'logS3Bucket', {
      value: logBucket.bucketName,
      description: "The log bucket",
    });
  }
}

Possible Solution

I imagine the "delete content" lambda is not checking some error condition it isn't expecting and reporting success when it hasn't succeeded. The Lambda does get deleted and the code in the Lambda editor is minified so I couldn't easily retry or debug it.

Additional Information/Context

The logic I removed in the code example avoids deleting the content in our "production" deployments. But in dev, we have no need to keep the access logs. So if someone was thinking "but it's log data, we can't delete that", please : do delete any data. Or at least provide a "actuallyAutoDeleteAllObjects: true" option.

CDK CLI Version

2.38.1 (build a5ced21)

Framework Version

No response

Node.js Version

Node.js v18.8.0

OS

OSX

Language

Typescript

Language Version

No response

Other information

No response

@max-allan-surevine max-allan-surevine added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 26, 2022
@github-actions github-actions bot added the @aws-cdk/aws-s3 Related to Amazon S3 label Aug 26, 2022
@pradoz
Copy link

pradoz commented Aug 26, 2022

Based on the error you received, it looks like the custom resource didn't actually finish emptying the bucket (although it did try to delete it):

(Service: Amazon S3; Status Code: 409; Error Code: BucketNotEmpty ...

This can happen when S3 buckets have a large amount of objects in them. I was able to run cdk deploy and cdk destroy with the following [refactored] stack (you can safely remove the IAM policy):

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';

export class DummyStack extends cdk.Stack {
  public readonly Bucket: cdk.CfnOutput;
  public readonly logBucket: cdk.CfnOutput;
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Create a bucket for all the different logs.
    const logBucket = new s3.Bucket(this, 'LogBucket', {
        removalPolicy: cdk.RemovalPolicy.DESTROY,
        autoDeleteObjects: true,
        accessControl: s3.BucketAccessControl.LOG_DELIVERY_WRITE,
        enforceSSL: true,
        encryption: s3.BucketEncryption.S3_MANAGED,
        blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
    });

    const bucket = new s3.Bucket(this, 'OutputBucket', {
        removalPolicy: cdk.RemovalPolicy.DESTROY,
        autoDeleteObjects: true,
        enforceSSL: true,
        encryption: s3.BucketEncryption.S3_MANAGED,
        serverAccessLogsPrefix: "output-access-logs/",
        serverAccessLogsBucket: logBucket,
        blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
    });

    this.Bucket = new cdk.CfnOutput(this, 'S3Bucket', {
      value: bucket.bucketName,
      description: "The bucket",
     });
    this.logBucket = new cdk.CfnOutput(this, 'logS3Bucket', {
      value: logBucket.bucketName,
      description: "The log bucket",
    });
  }
}

@peterwoodworth
Copy link
Contributor

I wasn't able to reproduce this with the template and steps provided @max-allan-surevine, maybe @pradoz is right and this is related to having a large number of objects to delete?

@peterwoodworth peterwoodworth added p2 response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. needs-reproduction This issue needs reproduction. and removed needs-triage This issue or PR still needs to be triaged. labels Aug 27, 2022
@github-actions
Copy link

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Aug 29, 2022
@max-allan-surevine
Copy link
Contributor Author

If the "empty bucket" lambda has not emptied the bucket. Surely it should not be returning with a "Success"? Regardless of the reason why it didn't empty the bucket.
I don't really think that many objects is the problem. The bucket only existed for ~30 minutes and the "content" bucket has a LOT more objects in than the log bucket. Yet IT was deleted OK. (I copy up the local directory, which has all the npm dependencies in)

Did you definitely wait for some content to be published to the log bucket before deleting the stack???

Using the stack @pradoz provided, I still get exactly the same problem. There is a single folder output-access-logs with a single object in it.

The stack was created about 09:55 BST, the object is dated 10:20:50 (BST) , the delete failed at 09:21:07.499Z. So the Lambda appears to be failing to delete a single object.

Is it possible the log delivery mechanism is holding an object "open" and that can't be deleted??

MaAl00350:cdk-bucket max [main] $ cdk deploy

✨  Synthesis time: 8.54s

....
[100%] success: Published e57c1acaa363d7d2b81736776007a7091bc73dff4aeb8135627c4511a51e7dca:current_account-current_region
CdkBucketStack: creating CloudFormation changeset...

 ✅  CdkBucketStack

✨  Deployment time: 67.81s

Outputs:
CdkBucketStack.S3Bucket = cdkbucketstack-outputbucket7114eb27-th0y46k00uu8
CdkBucketStack.logS3Bucket = cdkbucketstack-logbucketcc3b17e8-g0crsksv9497
Stack ARN:
arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86

✨  Total time: 76.35s


MaAl00350:cdk-bucket max [main] $ aws s3 cp --recursive . s3://cdkbucketstack-outputbucket7114eb27-th0y46k00uu8

.....

MaAl00350:cdk-bucket max [main] $ aws s3 ls s3://cdkbucketstack-logbucketcc3b17e8-g0crsksv9497/
MaAl00350:cdk-bucket max [main] $ date
Tue 30 Aug 2022 09:58:35 BST
... time passes...

MaAl00350:cdk-bucket max [main] $ aws s3 ls s3://cdkbucketstack-logbucketcc3b17e8-g0crsksv9497/
                           PRE output-access-logs/
MaAl00350:cdk-bucket max [main] $ date
Tue 30 Aug 2022 10:19:50 BST
MaAl00350:cdk-bucket max [main] $ cdk destroy
Are you sure you want to delete: CdkBucketStack (y/n)? y
CdkBucketStack: destroying...
10:21:07 | DELETE_FAILED        | AWS::S3::Bucket             | LogBucketCC3B17E8
The bucket you tried to delete is not empty (Service: Amazon S3; Status Code: 409; Error Code: BucketNotEmpty; Request ID: 13B5WKE8TCT9VS8K; S3 Extended Requ
est ID: /7/TL8ibQg7kBnNvHowx/D81U0VISP2NwXFUmne62ZBh4ahsFgTHqnUYHyZRNGnuShor+HdGMQ8=; Proxy: null)


 ❌  CdkBucketStack: destroy failed Error: The stack named CdkBucketStack is in a failed state. You may need to delete it from the AWS console : DELETE_FAILED (The following resource(s) failed to delete: [LogBucketCC3B17E8]. ): The bucket you tried to delete is not empty (Service: Amazon S3; Status Code: 409; Error Code: BucketNotEmpty; Request ID: 13B5WKE8TCT9VS8K; S3 Extended Request ID: /7/TL8ibQg7kBnNvHowx/D81U0VISP2NwXFUmne62ZBh4ahsFgTHqnUYHyZRNGnuShor+HdGMQ8=; Proxy: null)
    at destroyStack (/usr/local/lib/node_modules/aws-cdk/lib/api/deploy-stack.ts:523:11)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at CdkToolkit.destroy (/usr/local/lib/node_modules/aws-cdk/lib/cdk-toolkit.ts:469:9)
    at initCommandLine (/usr/local/lib/node_modules/aws-cdk/lib/cli.ts:347:12)

The stack named CdkBucketStack is in a failed state. You may need to delete it from the AWS console : DELETE_FAILED (The following resource(s) failed to delete: [LogBucketCC3B17E8]. ): The bucket you tried to delete is not empty (Service: Amazon S3; Status Code: 409; Error Code: BucketNotEmpty; Request ID: 13B5WKE8TCT9VS8K; S3 Extended Request ID: /7/TL8ibQg7kBnNvHowx/D81U0VISP2NwXFUmne62ZBh4ahsFgTHqnUYHyZRNGnuShor+HdGMQ8=; Proxy: null)
MaAl00350:cdk-bucket max [main] $ aws s3 ls s3://cdkbucketstack-logbucketcc3b17e8-g0crsksv9497/output-access-logs
                           PRE output-access-logs/
MaAl00350:cdk-bucket max [main] $ aws s3 ls s3://cdkbucketstack-logbucketcc3b17e8-g0crsksv9497/output-access-logs/
2022-08-30 10:20:50      82616 2022-08-30-09-20-49-A7FCC762ED235F70


MaAl00350:cdk-bucket max [main] $ aws cloudformation describe-stack-events --stack-name CdkBucketStack
{
    "StackEvents": [
        {
            "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
            "EventId": "152341c0-2845-11ed-8b22-02fc99796c02",
            "StackName": "CdkBucketStack",
            "LogicalResourceId": "CdkBucketStack",
            "PhysicalResourceId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
            "ResourceType": "AWS::CloudFormation::Stack",
            "Timestamp": "2022-08-30T09:21:09.077Z",
            "ResourceStatus": "DELETE_FAILED",
            "ResourceStatusReason": "The following resource(s) failed to delete: [LogBucketCC3B17E8]. "
        },
        {
            "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
            "EventId": "CustomS3AutoDeleteObjectsCustomResourceProviderRole3B1BD092-DELETE_COMPLETE-2022-08-30T09:21:08.501Z",
            "StackName": "CdkBucketStack",
            "LogicalResourceId": "CustomS3AutoDeleteObjectsCustomResourceProviderRole3B1BD092",
            "PhysicalResourceId": "CdkBucketStack-CustomS3AutoDeleteObjectsCustomReso-G9052CNB9HP",
            "ResourceType": "AWS::IAM::Role",
            "Timestamp": "2022-08-30T09:21:08.501Z",
            "ResourceStatus": "DELETE_COMPLETE",
            "ResourceProperties": "{\"ManagedPolicyArns\":[\"arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole\"],\"AssumeRolePolicyDocument\":{\"Version\":\"2012-10-17\",\"Statement\":[{\"Action\":\"sts:AssumeRole\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"lambda.amazonaws.com\"}}]}}"
        },
        {
            "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
            "EventId": "LogBucketCC3B17E8-DELETE_FAILED-2022-08-30T09:21:07.499Z",
            "StackName": "CdkBucketStack",
            "LogicalResourceId": "LogBucketCC3B17E8",
            "PhysicalResourceId": "cdkbucketstack-logbucketcc3b17e8-g0crsksv9497",
            "ResourceType": "AWS::S3::Bucket",
            "Timestamp": "2022-08-30T09:21:07.499Z",
            "ResourceStatus": "DELETE_FAILED",
            "ResourceStatusReason": "The bucket you tried to delete is not empty (Service: Amazon S3; Status Code: 409; Error Code: BucketNotEmpty; Request ID: 13B5WKE8TCT9VS8K; S3 Extended Request ID: /7/TL8ibQg7kBnNvHowx/D81U0VISP2NwXFUmne62ZBh4ahsFgTHqnUYHyZRNGnuShor+HdGMQ8=; Proxy: null)",
            "ResourceProperties": "{\"PublicAccessBlockConfiguration\":{\"RestrictPublicBuckets\":\"true\",\"BlockPublicPolicy\":\"true\",\"BlockPublicAcls\":\"true\",\"IgnorePublicAcls\":\"true\"},\"BucketEncryption\":{\"ServerSideEncryptionConfiguration\":[{\"ServerSideEncryptionByDefault\":{\"SSEAlgorithm\":\"AES256\"}}]},\"AccessControl\":\"LogDeliveryWrite\",\"Tags\":[{\"Value\":\"true\",\"Key\":\"aws-cdk:auto-delete-objects\"}]}"
        },
        {
            "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
            "EventId": "CustomS3AutoDeleteObjectsCustomResourceProviderRole3B1BD092-DELETE_IN_PROGRESS-2022-08-30T09:21:06.925Z",
            "StackName": "CdkBucketStack",
            "LogicalResourceId": "CustomS3AutoDeleteObjectsCustomResourceProviderRole3B1BD092",
            "PhysicalResourceId": "CdkBucketStack-CustomS3AutoDeleteObjectsCustomReso-G9052CNB9HP",
            "ResourceType": "AWS::IAM::Role",
            "Timestamp": "2022-08-30T09:21:06.925Z",
            "ResourceStatus": "DELETE_IN_PROGRESS",
            "ResourceProperties": "{\"ManagedPolicyArns\":[\"arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole\"],\"AssumeRolePolicyDocument\":{\"Version\":\"2012-10-17\",\"Statement\":[{\"Action\":\"sts:AssumeRole\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"lambda.amazonaws.com\"}}]}}"
        },
        {
            "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
            "EventId": "CustomS3AutoDeleteObjectsCustomResourceProviderHandler9D90184F-DELETE_COMPLETE-2022-08-30T09:21:06.385Z",
            "StackName": "CdkBucketStack",
            "LogicalResourceId": "CustomS3AutoDeleteObjectsCustomResourceProviderHandler9D90184F",
            "PhysicalResourceId": "CdkBucketStack-CustomS3AutoDeleteObjectsCustomReso-u3gSlToflZdO",
            "ResourceType": "AWS::Lambda::Function",
            "Timestamp": "2022-08-30T09:21:06.385Z",
            "ResourceStatus": "DELETE_COMPLETE",
            "ResourceProperties": "{\"Role\":\"arn:aws:iam::123492961234:role/CdkBucketStack-CustomS3AutoDeleteObjectsCustomReso-G9052CNB9HP\",\"MemorySize\":\"128\",\"Runtime\":\"nodejs14.x\",\"Description\":\"Lambda function for auto-deleting objects in cdkbucketstack-logbucketcc3b17e8-g0crsksv9497 S3 bucket.\",\"Timeout\":\"900\",\"Handler\":\"__entrypoint__.handler\",\"Code\":{\"S3Bucket\":\"cdk-hnb659fds-assets-123492961234-eu-west-2\",\"S3Key\":\"e57c1acaa363d7d2b81736776007a7091bc73dff4aeb8135627c4511a51e7dca.zip\"}}"
        },
        {
            "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
            "EventId": "OutputBucket7114EB27-DELETE_COMPLETE-2022-08-30T09:21:02.187Z",
            "StackName": "CdkBucketStack",
            "LogicalResourceId": "OutputBucket7114EB27",
            "PhysicalResourceId": "cdkbucketstack-outputbucket7114eb27-th0y46k00uu8",
            "ResourceType": "AWS::S3::Bucket",
            "Timestamp": "2022-08-30T09:21:02.187Z",
            "ResourceStatus": "DELETE_COMPLETE",
            "ResourceProperties": "{\"PublicAccessBlockConfiguration\":{\"RestrictPublicBuckets\":\"true\",\"BlockPublicPolicy\":\"true\",\"BlockPublicAcls\":\"true\",\"IgnorePublicAcls\":\"true\"},\"BucketEncryption\":{\"ServerSideEncryptionConfiguration\":[{\"ServerSideEncryptionByDefault\":{\"SSEAlgorithm\":\"AES256\"}}]},\"LoggingConfiguration\":{\"LogFilePrefix\":\"output-access-logs/\",\"DestinationBucketName\":\"cdkbucketstack-logbucketcc3b17e8-g0crsksv9497\"},\"Tags\":[{\"Value\":\"true\",\"Key\":\"aws-cdk:auto-delete-objects\"}]}"
        },
        {
            "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
            "EventId": "OutputBucket7114EB27-DELETE_IN_PROGRESS-2022-08-30T09:21:01.477Z",
            "StackName": "CdkBucketStack",
            "LogicalResourceId": "OutputBucket7114EB27",
            "PhysicalResourceId": "cdkbucketstack-outputbucket7114eb27-th0y46k00uu8",
            "ResourceType": "AWS::S3::Bucket",
            "Timestamp": "2022-08-30T09:21:01.477Z",
            "ResourceStatus": "DELETE_IN_PROGRESS",
            "ResourceProperties": "{\"PublicAccessBlockConfiguration\":{\"RestrictPublicBuckets\":\"true\",\"BlockPublicPolicy\":\"true\",\"BlockPublicAcls\":\"true\",\"IgnorePublicAcls\":\"true\"},\"BucketEncryption\":{\"ServerSideEncryptionConfiguration\":[{\"ServerSideEncryptionByDefault\":{\"SSEAlgorithm\":\"AES256\"}}]},\"LoggingConfiguration\":{\"LogFilePrefix\":\"output-access-logs/\",\"DestinationBucketName\":\"cdkbucketstack-logbucketcc3b17e8-g0crsksv9497\"},\"Tags\":[{\"Value\":\"true\",\"Key\":\"aws-cdk:auto-delete-objects\"}]}"
        },

@max-allan-surevine
Copy link
Contributor Author

Oh, and the logs from the auto delete bucket lambda, ran at 9:20:19Z. Which is about 30seconds before the timestamp on the object in the bucket. (10:20:50 BST) :

2022-08-30T09:20:12.560Z	c91433c7-40e7-439c-8a5e-d035fd54c866	INFO	
{
    "RequestType": "Delete",
    "ServiceToken": "arn:aws:lambda:eu-west-2:123492961234:function:CdkBucketStack-CustomS3AutoDeleteObjectsCustomReso-u3gSlToflZdO",
    "ResponseURL": "...",
    "StackId": "arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86",
    "RequestId": "7c29bf80-973f-4911-b95a-4060d5a6469a",
    "LogicalResourceId": "LogBucketAutoDeleteObjectsCustomResource7762F42C",
    "PhysicalResourceId": "9942411f-a823-4ea6-bcc1-d5453163cd7a",
    "ResourceType": "Custom::S3AutoDeleteObjects",
    "ResourceProperties": {
        "ServiceToken": "arn:aws:lambda:eu-west-2:123492961234:function:CdkBucketStack-CustomS3AutoDeleteObjectsCustomReso-u3gSlToflZdO",
        "BucketName": "cdkbucketstack-logbucketcc3b17e8-g0crsksv9497"
    }
}

2022-08-30T09:20:19.661Z	c91433c7-40e7-439c-8a5e-d035fd54c866	INFO	submit response to cloudformation {
  Status: 'SUCCESS',
  Reason: 'SUCCESS',
  StackId: 'arn:aws:cloudformation:eu-west-2:123492961234:stack/CdkBucketStack/21276a90-2841-11ed-8b89-0687d8847c86',
  RequestId: '7c29bf80-973f-4911-b95a-4060d5a6469a',
  PhysicalResourceId: '9942411f-a823-4ea6-bcc1-d5453163cd7a',
  LogicalResourceId: 'LogBucketAutoDeleteObjectsCustomResource7762F42C',
  NoEcho: undefined,
  Data: undefined
}
END RequestId: c91433c7-40e7-439c-8a5e-d035fd54c866

@github-actions github-actions bot removed closing-soon This issue will automatically close in 4 days unless further comments are made. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Aug 30, 2022
@peterwoodworth
Copy link
Contributor

Did you definitely wait for some content to be published to the log bucket before deleting the stack???

Yes

For reference, here's our logic for emptying a bucket

async function emptyBucket(bucketName: string) {
const listedObjects = await s3.listObjectVersions({ Bucket: bucketName }).promise();
const contents = [...listedObjects.Versions ?? [], ...listedObjects.DeleteMarkers ?? []];
if (contents.length === 0) {
return;
}
const records = contents.map((record: any) => ({ Key: record.Key, VersionId: record.VersionId }));
await s3.deleteObjects({ Bucket: bucketName, Delete: { Objects: records } }).promise();
if (listedObjects?.IsTruncated) {
await emptyBucket(bucketName);
}
}

We don't check to ensure the bucket is actually emptied, and assume that this function takes care of it. I'll try to reproduce this again later today

According to your description, I wonder if the remaining object was added to the bucket during the attempt to delete all objects. Or, according to your timeline, maybe it was added after all objects were deleted, but before the bucket itself was deleted

@max-allan-surevine
Copy link
Contributor Author

I just tried again and waited a long time since adding files to the content bucket. And no logs had been added to the log bucket for ~20 minutes. (2 hours since content was added) Then bucket&stack deleted successfully.... The delayed log delivery thing seems to be causing this problem somehow, but it is going to be tricky to figure out whether it was during empty or between empty and delete.

During empty sounds reasonable, there could be a lot of log files slowly being created in the background. The initial "show objects" wouldn't spot the new file, then it takes a while to delete all the objects, during which a new object is created by the log delivery. I think simply repeating the delete objects would catch that most of the time, because the second delete would only have a small number to delete and so complete quickly. But then if there is a delay before deleting the bucket, more files could still arrive.

Maybe the best plan is to remove the log delivery permission before deleting the objects!

Could I make the accessControl on the log bucket be it's own cdk object, with a dependency from the objectsBucket so the accessControl is created after and deleted before the buckets??

@peterwoodworth peterwoodworth changed the title s3: Cannot delete server access log bucket s3: possible race condition, adding new object to bucket between deletion of bucket objects and bucket deletion prevents deletion Sep 1, 2022
@peterwoodworth
Copy link
Contributor

Maybe the best plan is to remove the log delivery permission before deleting the objects!
Could I make the accessControl on the log bucket be it's own cdk object, with a dependency from the objectsBucket so the accessControl is created after and deleted before the buckets??

I'm not sure how possible any of this is. I don't think you could make accessControl on the log bucket be its own object - the only things you can add dependencies on are actual CloudFormation resources, or abstractions of them. There are only the two buckets, with their two bucketpolicies, I'm not confident that rearranging the order of deletion here is either a good idea, or even an idea that will work. Do you know what the current order of deletion is between the lambda deleting objects and all of these resources getting deleted?

@ahmetsevki
Copy link

ahmetsevki commented Feb 21, 2023

We are seeing this as well.
Some background:

  • This started to happen after we made an upgrade of aws-cdk from 2.31 to 2.59, and @aws-solutions-constructs/core from 2.10.0 to 2.30.0 in the same PR.
  • It is only happening in buckets that we use buildS3Bucket helper from @aws-solutions-constructs/core to create a bucket + its logging bucket (because these constructs by default turn on s3 access logs)
  • The only difference in our code was that, we had both the bucket + the logging bucket companion use KMS. After the move we had the change our logging buckets to use S3_MANAGED encryption, because aws-cdk 2.59 enforces that server access logs bucket cannot use KMS, and it should use S3_MANAGED encryption.

Our observation is that bucket is emptied however, there are some queues that are still processing, and after the bucket deletion server access logs gets delivered into the bucket between custom resource emptying the bucket and destroying it.

Our solution could be

  • turning off server access logs in development versions of our stacks (we don't destroy production stack).
  • manually putting cloudformation dependencies on the buckets to the queue lambdas that access those buckets so that the lambdas are deleted first 👎

But I think the custom resource should disable server access logging on the bucket before deleting the items in it as the proper solution, or it should destroy the bucket first, and then delete its logging bucket.

@peterwoodworth peterwoodworth added p1 needs-review documentation This is a problem with documentation. and removed p2 needs-review needs-reproduction This issue needs reproduction. labels May 19, 2023
@peterwoodworth
Copy link
Contributor

Hey, we've decided that, at least for now, we're not sure how to address this. We'll document that this is an issue that will need to be worked around if ran into.

@rv2673
Copy link
Contributor

rv2673 commented Sep 25, 2023

@peterwoodworth One solution might be to update bucket policy to block adding objects to the bucket and after that deleting the objects. That should prevent writes to the bucket between clearing the bucket and deleting the bucket. (Unless the used write method ignores the bucket policy)

@lancedolan
Copy link

lancedolan commented Mar 29, 2024

Seeing this problem also.

Interestingly, we observe the issue when our system couldn't cause any access events on the content bucket itself during the Stack destroy events. We don't have any compute accessing the buckets during that time. I have no problem manually emptying and then deleting the access logs bucket because there's nothing continuously accessing the content which be populating the access logs.

However, CDK fails to empty and then delete the access logs every time, reliably. Content bucket gets emptied and deleted, never the access logs.

Therefore, I wonder if it's a timing in CDK itself. Perhaps marking its "empty the access logs bucket" process finished before its "empty the content bucket" has finished.

One theory:
1.In parallel:

  1. Begin clearing content bucket
  2. Begin clear access logs bucket
  3. Access logs completes emptying first
  4. Content bucket continues emptying, thus writing new entries to access logs
  5. Attempt to delete access logs fails because it's not empty

@jduchon-sonarsource
Copy link

jduchon-sonarsource commented Jul 15, 2024

I'm seeing the exact same problem and after a little bit of debugging, I thought this could be the problem too

One theory: 1.In parallel:

  1. Begin clearing content bucket
  2. Begin clear access logs bucket
  3. Access logs completes emptying first
  4. Content bucket continues emptying, thus writing new entries to access logs
  5. Attempt to delete access logs fails because it's not empty

In addition, I noticed that after running cdk destroy multiple times, the CustomS3AutoDeleteObject lambda just gets deleted.

@PieterVanEde
Copy link

PieterVanEde commented Jul 25, 2024

I'm seeing the same problem, and was looking to see if I could implement a fix. My thought was to modify the auto-delete-objects-handler, so it would disable access logging prior to start deleting objects. This would ensure that the downstream logging bucket would not be bothered by access log entries resulting from objects deleted in the source bucket.

However, then I was a bit confused, as the existing code already puts a policy statement onto the bucket before it starts deleting objects. This policy statement explicitly denies PutObject for all principals.

So I wonder: is the more specific bucket policy to allow PutObject for logging getting more priority then the Deny for PutObject?

I would be happy to submit a patch, but I would need some feedback if my interpretation is correct.

Background info:

This is the existing policy on my logging bucket, before I try to destroy the stack:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<roleid>:role/<replaced>-f-CustomS3AutoDeleteObjects-FhzxbAEGWCwc"
            },
            "Action": [
                "s3:DeleteObject*",
                "s3:GetBucket*",
                "s3:List*",
                "s3:PutBucketPolicy"
            ],
            "Resource": [
                "arn:aws:s3:::<replaced>-logging-bucket",
                "arn:aws:s3:::<replaced>-logging-bucket/*"
            ]
        },
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::<replaced>-logging-bucket",
                "arn:aws:s3:::<replaced>-logging-bucket/*"
            ],
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "logging.s3.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<replaced>-logging-bucket/fe*",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "<id>"
                },
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:s3:::<replaced>"
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::054676820928:root"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<replaced>-logging-bucket/AWSLogs/255377600067/*"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "delivery.logs.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<replaced>-logging-bucket/AWSLogs/255377600067/*",
            "Condition": {
                "StringEquals": {
                    "s3:x-amz-acl": "bucket-owner-full-control"
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "delivery.logs.amazonaws.com"
            },
            "Action": "s3:GetBucketAcl",
            "Resource": "arn:aws:s3:::<replaced>-logging-bucket"
        }
    ]
}

And this is the code fragment in packages/@aws-cdk/custom-resource-handlers/lib/aws-s3/auto-delete-objects-handler/index.ts:

policy.Statement.push(
      // Prevent any more objects from being created in the bucket
      {
        Principal: '*',
        Effect: 'Deny',
        Action: ['s3:PutObject'],
        Resource: [`arn:aws:s3:::${bucketName}/*`],
      },
    );

@GavinZZ GavinZZ self-assigned this Oct 1, 2024
@GavinZZ
Copy link
Contributor

GavinZZ commented Oct 1, 2024

There seems to be a duplicate issue around this problem, #26874. @PieterVanEde @jduchon-sonarsource I see you both commented this year and I want to check on the version of CDK you used.

This PR added a Deny policy during custom resource onDelete event and should fix the race condition mentioned here. I couldn't reproduce this reported problem myself and thinking this PR might have addressed the problem.

One other note, to answer @PieterVanEde's question on the policy

{
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::<replaced>-logging-bucket",
                "arn:aws:s3:::<replaced>-logging-bucket/*"
            ],
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        },

This policy was not the one added by the custom resource (in the PR). The custom resource only adds the policy when onDelete event happens. This above policy is from the result of setting enforceSSLStatement to true when initializing bucket.

Would like to get the community help on this issue and see if anyone is still able to reproduce this issue using the latest versions of CDK (CDK v2.130+).

If anyone is still able to reproduce this issue using a newer CDK version, please tag me immediately and provide how you reproduced it. I'll take a look and get back to you ASAP.

@GavinZZ GavinZZ added p2 @aws-cdk/custom-resources Related to AWS CDK Custom Resources and removed p1 labels Oct 2, 2024
@PieterVanEde
Copy link

At the project where I observed this behaviour, we are using CDK 2.130 exactly. I couldn't find a more clear reproduction path, because this seems to be a kind of race condition not having a clear reproduction path. But if we could somehow prevent new logging to be written to the access bucket before we start cleaning it up, then that should solve this problem.

@GavinZZ
Copy link
Contributor

GavinZZ commented Oct 7, 2024

@PieterVanEde thanks for getting back to me. That's interesting, because AFAIK we've added a deny request to s3:putObject actions when the deletion happens. https://github.com/aws/aws-cdk/blob/main/packages/@aws-cdk/custom-resource-handlers/lib/aws-s3/auto-delete-objects-handler/index.ts#L44.

This policy is attached when the CloudFormation sends DELETE request to the custom resource which should correctly deny all put objects request to the bucket. May I ask which region are you hosting the bucket? I was wondering is it because we hardcoded the partition to be aws and you're in other partitions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-s3 Related to Amazon S3 @aws-cdk/custom-resources Related to AWS CDK Custom Resources bug This issue is a bug. documentation This is a problem with documentation. p2
Projects
None yet
Development

No branches or pull requests

10 participants