codepipeline_actions.EcsDeployAction: CannotPullContainerError: 403 Forbidden #29876

Closed
harrison-traintobecome opened this issue Apr 17, 2024 · 11 comments
Labels
@aws-cdk/aws-codepipeline-actions bug This issue is a bug. closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. p3 response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.

Comments

@harrison-traintobecome

Describe the bug

I have an ECS service that I want to deploy through a CI/CD pipeline. The pipeline itself deploys and builds everything fine, but the EcsDeployAction times out. On inspection in the console, the latest stopped task's container shows this error:

Task stopped at: 2024-04-17T20:43:16.177Z
CannotPullContainerError: pull image manifest has been retried 1 time(s): failed to resolve ref 123456789123.dkr.ecr.us-east-1.amazonaws.com/a2-ecr-app-repo:latest: pulling from host 123456789123.dkr.ecr.us-east-1.amazonaws.com failed with status code [manifests latest]: 403 Forbidden
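
When reading an error like this, it can help to pull the account ID, region, repository, and tag out of the message and compare them with the repository the pipeline pushes to. The following is a minimal sketch, not part of the original report; the helper name `find_ecr_ref` and the regular expression are illustrative assumptions that cover only the common private-ECR host format.

```python
import re
from typing import NamedTuple, Optional


class EcrImageRef(NamedTuple):
    account_id: str
    region: str
    repository: str
    tag: Optional[str]


# Assumed pattern for a private ECR image reference:
# <12-digit account>.dkr.ecr.<region>.amazonaws.com/<repository>[:<tag>]
_ECR_REF = re.compile(
    r"(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repo>[a-z0-9._/-]+)(?::(?P<tag>[A-Za-z0-9_.-]+))?"
)


def find_ecr_ref(message: str) -> Optional[EcrImageRef]:
    """Return the first ECR image reference found in a log message, if any."""
    match = _ECR_REF.search(message)
    if match is None:
        return None
    return EcrImageRef(
        account_id=match.group("account"),
        region=match.group("region"),
        repository=match.group("repo"),
        tag=match.group("tag"),
    )


if __name__ == "__main__":
    error = (
        "CannotPullContainerError: pull image manifest has been retried 1 time(s): "
        "failed to resolve ref 123456789123.dkr.ecr.us-east-1.amazonaws.com/"
        "a2-ecr-app-repo:latest: pulling from host "
        "123456789123.dkr.ecr.us-east-1.amazonaws.com failed with status code "
        "[manifests latest]: 403 Forbidden"
    )
    print(find_ecr_ref(error))
```

If the account or region in the message differs from the stack's own, that would point to a cross-account or cross-region pull, which needs additional repository permissions.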

The specified image is present in the specified ECR repository. I have tried adding admin permissions to the EcsDeployAction's role and the CodePipeline service role, as well as making the ECR repository's images public to all resources in my account.

Relevant CDK code:

from aws_cdk import (
    Stack,
    Duration,
    RemovalPolicy,
    aws_ecr as ecr,
    aws_iam as iam,
    aws_secretsmanager as sms,
    aws_codebuild as codebuild,
    aws_codecommit as codecommit,
    aws_codepipeline as codepipeline,
    aws_codepipeline_actions as codepipeline_actions,
    
)
from constructs import Construct

# from .domainStack import _DOMAIN_NAME_

class pipelineStack(Stack):
    def __init__(self, scope: Construct, id: str,  app_stack, **kwargs):
        super().__init__(scope, id, **kwargs)
        
        branch = 'main'
            
        CONNECTION_ARN = f'arn:aws:codestar-connections:{self.region}:{self.account}:connection/44d88d9b-2d8e-470c-afa9-e449499e6155'
        
        app_source_artifact = codepipeline.Artifact()
        app_build_artifact = codepipeline.Artifact()
        app_image = codepipeline.Artifact()
        
        # Source stage
        source_stage = codepipeline.StageProps(
            stage_name="Source",
            actions=[
                codepipeline_actions.CodeStarConnectionsSourceAction(
                    action_name="****",
                    owner="****",
                    repo="****",
                    branch=branch,
                    output=app_source_artifact,
                    connection_arn=CONNECTION_ARN,
                )
            ]
        )
        
        # Docker Build for ECS
        # https://github.com/aws-samples/amazon-ecs-anywhere-cicd-pipeline-cdk-sample/blob/main/EcsAnywhereCdk/lib/ecs_anywhere_pipeline-stack.ts
        app_ui_repo = ecr.Repository(self, "A2EcrAppRepo",
            repository_name="a2-ecr-app-repo",
            image_scan_on_push=True,
            removal_policy=RemovalPolicy.DESTROY,
            
        )

        app_docker_build = codebuild.PipelineProject(self, 'AppDockerBuild',
            project_name="AppDockerBuild",
            environment=codebuild.BuildEnvironment(
                build_image=codebuild.LinuxBuildImage.AMAZON_LINUX_2_3,
                privileged=True
            ),
            build_spec=codebuild.BuildSpec.from_source_filename('buildspec.yml'),
            environment_variables={
                'IMAGE_REPO_NAME': codebuild.BuildEnvironmentVariable(
                    type=codebuild.BuildEnvironmentVariableType.PLAINTEXT,
                    value=app_ui_repo.repository_name
                ),
                'IMAGE_REPO_URI': codebuild.BuildEnvironmentVariable(
                    type=codebuild.BuildEnvironmentVariableType.PLAINTEXT,
                    value=app_ui_repo.repository_uri
                )
                
            }
        )
        
        docker_build_stage = codepipeline.StageProps(
            stage_name="EcsDockerBuild",
            actions=[
                codepipeline_actions.CodeBuildAction(
                    action_name="AppBuild",
                    project=app_docker_build,
                    input=app_source_artifact,
                    outputs=[app_image],
                )
            ]
        )
        
        app_ui_repo.grant_pull_push(app_docker_build)
        
        #IAM
        code_pipeline_service_role = iam.Role(self, 'CodePipelineServiceRole',
            assumed_by=iam.ServicePrincipal('codepipeline.amazonaws.com'),
            role_name='CodePipelineServiceRole'
        )
        code_pipeline_service_role.add_to_policy(iam.PolicyStatement(
            effect=iam.Effect.ALLOW,
            resources=['*'],
            actions=[
                's3:*',
                'ecr:*',
                'ec2:*',
                'ecs:*'
            ]
        ))
        app_ui_repo.grant_pull_push(code_pipeline_service_role)
        
        # Deploy stage
        deploy_stage = codepipeline.StageProps(
            stage_name="Deploy",
            actions=[
                
                codepipeline_actions.EcsDeployAction(
                    action_name="DeployApp",
                    image_file=app_image.at_path('images.json'),
                    service=app_stack.service.service,
                    deployment_timeout=Duration.minutes(10),
                    run_order=3,
                    # role=ecs_deploy_role
                )
            ]
        )
        
        # Pipeline
        pipeline = codepipeline.Pipeline(self, "Pipeline",
            role=code_pipeline_service_role,
            pipeline_name=f"WebPipeline",
            stages=[
                source_stage,
                docker_build_stage,
                deploy_stage,
            ]
        )

Buildspec.yml in the source repository:

version: 0.2

phases:
  install:
    runtime-versions:
      docker: latest
  pre_build:
    commands:
      - echo Logging in to Docker hub...
      # - docker login -u=${DOCKER_USERNAME} -p=${DOCKER_PASSWORD}
      - echo Logging in to Amazon ECR...
      - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
      - CODEBUILD_RESOLVED_SOURCE_VERSION="${CODEBUILD_RESOLVED_SOURCE_VERSION:-$IMAGE_TAG}"
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG="latest"
      - IMAGE_URI="$IMAGE_REPO_NAME:$IMAGE_TAG"
      - DOCKERFILE_PATH="$CODEBUILD_SRC_DIR/Dockerfile"
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - pwd
      - ls -ltr
      # - cd app/app
      - docker build -f $DOCKERFILE_PATH -t $IMAGE_REPO_NAME:$IMAGE_TAG .
      - docker images
      - docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $IMAGE_REPO_URI:$IMAGE_TAG    
  post_build:
    commands:
      - bash -c "if [ /"$CODEBUILD_BUILD_SUCCEEDING/" == /"0/" ]; then exit 1; fi"
      - echo Build stage successfully completed on `date`
      - echo Pushing the Docker image...
      - docker push $IMAGE_REPO_URI:$IMAGE_TAG
      - printf '[{"name":"appContainer","imageUri":"%s"}]' "$IMAGE_REPO_URI:$IMAGE_TAG" > images.json
      - pwd
      - ls -ltr
      - cat images.json
artifacts:
  files: images.json
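
The deploy action reads `images.json` and matches each `name` against a container name in the service's task definition, so a quick local check of the file can rule out a formatting or naming mismatch. Below is a minimal sketch using only the standard library; the function name `check_image_definitions` is hypothetical, and the expected container names have to be supplied by the caller.

```python
import json
from typing import List, Set


def check_image_definitions(raw: str, container_names: Set[str]) -> List[str]:
    """Return a list of problems found in an images.json-style document."""
    try:
        entries = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]

    if not isinstance(entries, list) or not entries:
        return ["expected a non-empty JSON array"]

    problems: List[str] = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected an object")
            continue
        name = entry.get("name")
        image_uri = entry.get("imageUri")
        if not isinstance(name, str) or not name:
            problems.append(f"entry {index}: missing 'name'")
        elif name not in container_names:
            # The name must match a container in the task definition.
            problems.append(f"entry {index}: unknown container '{name}'")
        if not isinstance(image_uri, str) or not image_uri:
            problems.append(f"entry {index}: missing 'imageUri'")
    return problems


if __name__ == "__main__":
    sample = (
        '[{"name":"appContainer","imageUri":'
        '"123456789123.dkr.ecr.us-east-1.amazonaws.com/a2-ecr-app-repo:latest"}]'
    )
    print(check_image_definitions(sample, {"appContainer"}))
```

A clean result here only shows that the artifact is well formed. It says nothing about whether the role that pulls the image is allowed to read the repository.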

Expected Behavior

Codepipeline to successfully deploy/update the Ecs service.

Current Behavior

The Codepipeline fails on the EcsDeployAction step.

Reproduction Steps

Due to the nature of CI/CD pipelines, it is hard to provide a copy-and-paste snippet, but here is everything you need to do to set up this repo:

  • Create a repository that contains the aforementioned buildspec.yml and a barebones Dockerfile
  • Create a connection string to that repository, AWS Console -> Codepipeline -> Settings -> Connections
  • Replace the values for CodeStarConnectionsSourceAction and CONNECTION_ARN
  • cdk deploy

Possible Solution

This may be a Python issue; the TypeScript ECS/CodePipeline repos seem to be working fine.

Additional Information/Context

Common issues I have ruled out:

  • Object not being present (misleading 403)
  • CodePipeline/EcsDeployAction service roles having insufficient permissions

CDK CLI Version

2.108.0 (build 5665a95)

Framework Version

No response

Node.js Version

v18.17.1

OS

Amazon Linux 2

Language

Python

Language Version

Python 3.8.16

Other information

No response

@harrison-traintobecome harrison-traintobecome added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Apr 17, 2024
@ashishdhingra ashishdhingra self-assigned this Apr 17, 2024
@ashishdhingra ashishdhingra added needs-reproduction This issue needs reproduction. and removed needs-triage This issue or PR still needs to be triaged. labels Apr 17, 2024
@ashishdhingra
Contributor

@harrison-traintobecome Good morning. Apologies for the delayed reply. Could you please confirm whether your CDK deployment is in a private VPC? If so, do you have a VPC endpoint configured for the EKS service?

Thanks,
Ashish

@ashishdhingra ashishdhingra added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Apr 22, 2024
@harrison-traintobecome
Author

The stack that contains the ECS definition is as follows:

from aws_cdk import (
    Stack,
    Duration,
    aws_ecs as ecs,
    aws_ec2 as ec2,
    aws_ecs_patterns as ecs_patterns,
)
from constructs import Construct

class appStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs):
        super().__init__(scope, id, **kwargs)
        
        memory = 512 * 1
        cpu = 256 * 1
            
        task_definition = ecs.FargateTaskDefinition(self, "AppTaskDef",
            memory_limit_mib=memory,
            cpu=cpu,
        )
        
        frontend_container = task_definition.add_container('AppContainer',
            image=ecs.ContainerImage.from_asset('src/app/'),
            container_name='appContainer',
            memory_limit_mib=memory,
            cpu=cpu,
            port_mappings=[
                ecs.PortMapping(container_port=3000, host_port=3000)
            ],
            logging=ecs.AwsLogDriver(stream_prefix="appEvents", mode=ecs.AwsLogDriverMode.NON_BLOCKING),
        )
        
        vpc = ec2.Vpc(self, 'vpc')
        cluster = ecs.Cluster(self, 'cluster', vpc=vpc)
        
        self.service = ecs_patterns.ApplicationLoadBalancedFargateService(self, "AppEcsService",
            cluster=cluster,
            task_definition=task_definition,
            public_load_balancer=True,
            memory_limit_mib=memory,
            cpu=cpu,
            desired_count=1,
        )

If this is not what you meant, could you please clarify? I have not manually set up any EKS or ECS VPC endpoints.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Apr 22, 2024
@ashishdhingra
Contributor

@harrison-traintobecome Good afternoon. Apologies for the delayed reply. Could you please confirm whether you were able to resolve the issue? From the error message, it appears to be an IAM permission issue, most likely thrown by a Docker command in buildspec.yml. Also, are you trying to set this up using the guidance provided at https://github.com/aws-samples/amazon-ecs-anywhere-cicd-pipeline-cdk-sample?

Thanks,
Ashish

@harrison-traintobecome
Author

Thanks for the response, this remains unresolved.

I can confirm that the issue does not lie in the buildspec.yml, as the "images.json" output artifact exists and is properly formatted.

The example repository mentioned (https://github.com/aws-samples/amazon-ecs-anywhere-cicd-pipeline-cdk-sample) was referenced heavily when writing my CDK code.

Thanks,
Harrison

@ashishdhingra
Contributor

@harrison-traintobecome Thanks for the reply. Could you also check your Amazon ECR repository policy for restrictions on accessing the repository (reference: https://repost.aws/knowledge-center/ecs-pull-container-api-error-ecr)?

In the meantime, I'm trying an end-to-end example on my end.

Thanks,
Ashish

@harrison-traintobecome
Author

Yes, I grant the necessary ECR permissions here: "app_ui_repo.grant_pull_push(code_pipeline_service_role)". I have also tried granting the EcsDeployRole and CodePipelineServiceRole admin access.

@ashishdhingra
Contributor

> Yes, I grant the necessary ECR permissions here "app_ui_repo.grant_pull_push(code_pipeline_service_role)" - and have tried granting the EcsDeployRole and CodepipelineServiceRole admin access.

@harrison-traintobecome I'm referring to any policy at the ECR repository level that restricts access. The IAM role might have access, but the ECR repository might have a repository policy that restricts it.

@harrison-traintobecome
Author

No, all permissions pertaining to the ECR repository are defined in the CDK and there are no other policies attached.

@ashishdhingra ashishdhingra removed their assignment Jun 11, 2024
@pahud
Contributor

pahud commented Jun 11, 2024

The error

CannotPullContainerError: pull image manifest has been retried 1 time(s): failed to resolve ref 123456789123.dkr.ecr.us-east-1.amazonaws.com/a2-ecr-app-repo:latest: pulling from host 123456789123.dkr.ecr.us-east-1.amazonaws.com failed with status code [manifests latest]: 403 Forbidden

indicates a 403 Forbidden error when attempting to pull an image from Amazon Elastic Container Registry (ECR) during an Amazon Elastic Container Service (ECS) CI/CD pipeline deployment.

An ECS CI/CD pipeline typically consists of the following stages:

  1. Source Stage - Triggers the pipeline and retrieves the source code.
  2. Build Stage - Builds the Docker image and pushes it to ECR.
  3. Deploy Stage - Updates the container image URI and bumps the ECS Task Definition revision by calling the ECS API.

The ECS control plane then triggers a rolling update of tasks, and the ECS Task Execution Role attempts to pull the new container image specified in the updated Task Definition. This is the only stage where ECR image pulling occurs.

Things to check:

  1. Use CfnOutput to print out the ECS Task Execution Role name or ARN for debugging purposes.
  2. Verify that the Task Execution Role has the necessary permissions to pull images from the ECR repository.
  3. Check the policies of your ECR repository to ensure that the Task Execution Role is allowed to pull images.
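
To make the second and third checks concrete, the sketch below builds the identity policy a task execution role would typically need to pull from one private ECR repository, and roughly checks an existing policy document for the required actions. It uses only the standard library. The function names are hypothetical, and the checker is deliberately simplistic: it looks only at `Allow` statements and their `Action` patterns and ignores `Resource`, `Condition`, and explicit `Deny`.

```python
import json
from fnmatch import fnmatchcase
from typing import Any, Dict, List

# Registry-level action; it is granted on "*" rather than on a repository ARN.
AUTH_ACTION = "ecr:GetAuthorizationToken"
# Repository-level actions needed to pull an image.
PULL_ACTIONS = [
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage",
]


def execution_role_pull_policy(repository_arn: str) -> Dict[str, Any]:
    """Build an identity policy that allows pulling from one repository."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": [AUTH_ACTION], "Resource": "*"},
            {"Effect": "Allow", "Action": PULL_ACTIONS, "Resource": repository_arn},
        ],
    }


def missing_pull_actions(policy: Dict[str, Any]) -> List[str]:
    """List required pull actions that no Allow statement appears to grant."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    allowed_patterns: List[str] = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lower case.
        allowed_patterns.extend(action.lower() for action in actions)

    required = [AUTH_ACTION] + PULL_ACTIONS
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), p) for p in allowed_patterns)
    ]


if __name__ == "__main__":
    arn = "arn:aws:ecr:us-east-1:123456789123:repository/a2-ecr-app-repo"
    print(json.dumps(execution_role_pull_policy(arn), indent=2))
```

In the CDK code posted above, `grant_pull_push` is called only for the CodeBuild project and the pipeline role, while the task definition's image comes from `ContainerImage.from_asset`. Assuming the execution role was therefore never granted access to `a2-ecr-app-repo`, a call along the lines of `app_ui_repo.grant_pull(task_definition.obtain_execution_role())` would likely be the CDK equivalent of the policy above. This has not been verified against the reporter's stacks.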

If unsure, feel free to share your policies for further assistance.

Relevant AWS documentation:

  • Amazon ECR Repositories
  • Amazon ECS Task Execution IAM Role
  • Amazon ECS Task Definition Parameters

@pahud pahud added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p3 and removed needs-reproduction This issue needs reproduction. labels Jun 11, 2024

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added closing-soon This issue will automatically close in 4 days unless further comments are made. closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Jun 14, 2024
@aws-cdk-automation
Collaborator

Comments on closed issues and PRs are hard for our team to see. If you need help, please open a new issue that references this one.

@aws aws locked as resolved and limited conversation to collaborators Jul 25, 2024