Proposal: SBOM Generation and Scan

Author: stonezdj

Discussion:

Abstract

A Software Bill of Materials (SBOM) is a formal representation of the software components and dependencies used in a software project. It provides a comprehensive inventory of software components, their versions, and their relationships, which helps organizations manage their software supply chain risks more effectively. Harbor, as a cloud native registry, naturally manages OCI artifacts. Integrating SBOM into Harbor can significantly enhance its functionality, giving users a more comprehensive and transparent view of their software while also improving security and compliance.

Background

A Software Bill of Materials (SBOM) is a complete, formally structured list of components, libraries, and modules that are required to build (i.e. compile and link) a given piece of software and the supply chain relationships between them. These components can be open source or proprietary, free or paid, and widely available or restricted access. From NTIA’s SBOM FAQ: https://www.ntia.doc.gov/files/ntia/publications/ntia_sbom_faq_-_april_15_draft.pdf

Industry Standards

Currently, there are two widely recognized vendor-neutral standards: SPDX from the Linux Foundation and CycloneDX from the OWASP Foundation. In addition, some SBOM generators, such as syft, define their own formats.

Tools

There are several well-known open source tools that can generate SBOMs from container images or scan SBOMs for vulnerabilities.

Generate SBOM

Scan SBOM

Goals

  • Generate SBOMs for artifacts with third-party tools and store each SBOM as an accessory in Harbor.
  • Do not restrict the choice of SBOM generation tool; provide flexible ways to attach the SBOM.
    • Scanner Mode: Extend the pluggable scanner spec so that the scanner provides additional SBOM generation and scanning capabilities. Harbor is responsible for sending SBOM generation or scanning requests to the scanner. When a request is received, the scanner retrieves the subject artifact and generates the SBOM. Additionally, when Harbor sends a scan request for the SBOM, the scanner scans it for vulnerabilities.
  • List the SBOM as an accessory of the artifact in Harbor.
  • Provide other interactive functions for different scenarios once SBOM is integrated into Harbor.

Personas

  1. The system administrator.
  2. The project administrator.
  3. The project maintainer.
  4. The authorized users.

User stories

  1. As the system administrator, I want to add a scanner to Harbor that can generate and scan SBOMs.
  2. As the project administrator, I want to add a custom scanner that can generate and scan SBOMs, instead of using the system default one.
  3. As the system/project administrator, I want to enable automatic SBOM generation for a project.
  4. As the project maintainer, I want to trigger SBOM generation or scanning manually.
  5. As an authorized user with read permission, I want to view the SBOM accessories and the SBOM information, and to download the SBOM locally.
  6. As an authorized user with delete permission, I want to delete the SBOM accessories if I can delete the subject artifact.

Non Goals

The current implementation does not handle SBOMs generated in client mode, where an external scanner generates the SBOM and uploads it to Harbor. Harbor only displays such an SBOM as an artifact accessory and does not show its package and version information in detail. Harbor only supports displaying the details of SBOMs generated by Harbor (Scanner Mode).

Compatibility

The existing API should remain compatible with the previous version. This means that the previous API and pluggable scanner spec 1.0 should still work if the scanner implements spec 1.0, but the SBOM feature will not be available for such a scanner. The system_info API provides a field named sbom_enabled, which indicates whether the SBOM feature is enabled in the system. If it is enabled, all SBOM-related APIs are available. If it is disabled, the SBOM-related APIs are disabled.

Implementation

Database Schema Changes

The scanner table should store the SBOM generation capability information. The scanner_registration table should be updated to add the following columns:

ALTER TABLE scanner_registration ADD COLUMN IF NOT EXISTS capabilities JSONB;
ALTER TABLE scanner_registration ADD COLUMN IF NOT EXISTS spec_version varchar(50);

Add a table sbom_package with the following columns:

create table sbom_package (
    id serial,
    uuid varchar(64) not null, -- the UUID of the scan report
    artifact_digest varchar(256) not null,
    registration_uuid varchar(64) not null,
    sbom_type varchar(255), -- value can be spdx or cyclonedx
    sbom_digest varchar(255),
    package_name varchar(255),
    package_version varchar(255),
    license varchar(100),
    created_at timestamp,
    updated_at timestamp,
    unique (artifact_digest, registration_uuid, sbom_type, package_name, package_version),
    primary key(id)
);
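As an illustration of how rows for sbom_package might be derived, the sketch below flattens the packages array of an SPDX JSON document into dictionaries whose keys mirror the table columns. The function name and the choice of licenseConcluded as the license source are assumptions for illustration, not part of the proposal. Duplicate (name, version) pairs are skipped so that the rows respect the unique constraint above.

```python
from typing import Any, Dict, List


def spdx_to_package_rows(
    spdx: Dict[str, Any],
    report_uuid: str,
    artifact_digest: str,
    registration_uuid: str,
    sbom_digest: str,
) -> List[Dict[str, Any]]:
    """Flatten SPDX packages into sbom_package-like rows (hypothetical helper)."""
    rows: List[Dict[str, Any]] = []
    seen = set()
    for pkg in spdx.get("packages", []):
        name = pkg.get("name", "")
        version = pkg.get("versionInfo", "")
        # The table has a unique constraint that includes name and version,
        # so repeated pairs within one report are dropped here.
        if (name, version) in seen:
            continue
        seen.add((name, version))
        rows.append({
            "uuid": report_uuid,
            "artifact_digest": artifact_digest,
            "registration_uuid": registration_uuid,
            "sbom_type": "spdx",
            "sbom_digest": sbom_digest,
            "package_name": name,
            "package_version": version,
            # Assumption: the concluded license is what gets stored.
            "license": pkg.get("licenseConcluded", ""),
        })
    return rows
```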

Adapter API change

The adapter needs to implement the following APIs to support SBOM generation and scanning.

  1. Update the metadata API to add the capability to generate SBOM

The /api/v1/metadata API should return the following information if the adapter supports SBOM generation.

Request Method:
GET /api/v1/metadata

Response:

{
    "scanner": {
        "name": "Trivy",
        "vendor": "Aqua Security",
        "version": "x.x.x"
    },
    "capabilities": [
        {
            "type": "vulnerability",
            "consumes_mime_types": [
                "application/vnd.oci.image.manifest.v1+json",
                "application/vnd.docker.distribution.manifest.v2+json"
            ],
            "produces_mime_types": [
                "application/vnd.security.vulnerability.report; version=1.1"
            ]
        },
        {
            "type": "sbom",
            "consumes_mime_types": [
                "application/vnd.oci.image.manifest.v1+json",
                "application/vnd.docker.distribution.manifest.v2+json"
            ],
            "produces_mime_types": [
                "application/vnd.security.sbom.report+json; version=1.0"
            ],
            "additional_attributes": {
                "sbom_media_types": ["application/spdx+json","application/vnd.cyclonedx+json"]
            }
        }
    ]
}

If the adapter does not support SBOM generation, produces_mime_types should not include the SBOM MIME type. The response header should include the content type:

Content-Type: application/vnd.scanner.adapter.metadata+json; version=1.0

Content-Type: application/vnd.scanner.adapter.metadata+json; version=1.1

In the Harbor core, the scanner registration API should be updated to retrieve the capability information from the adapter and store it in the database. The scanner_registration table should be updated to add the capabilities column, which is of type jsonb. The capability information should be stored in the following format:

"capabilities": [
        {
            "type": "vulnerability",
            "consumes_mime_types": [
                "application/vnd.oci.image.manifest.v1+json",
                "application/vnd.docker.distribution.manifest.v2+json"
            ],
            "produces_mime_types": [
                "application/vnd.security.vulnerability.report; version=1.1"
            ]
        },
        {
            "type": "sbom",
            "consumes_mime_types": [
                "application/vnd.oci.image.manifest.v1+json",
                "application/vnd.docker.distribution.manifest.v2+json"
            ],
            "produces_mime_types": [
                "application/vnd.security.sbom.report+json; version=1.0"
            ],
            "additional_attributes": {
                "sbom_media_types": ["application/spdx+json","application/vnd.cyclonedx+json"]
            }
        }
]

If the version is 1.0, the adapter does not support SBOM generation. If the version is 1.2, the adapter is a version 1.2 pluggable scanner, and spec_version in the scanner_registration table should be updated to 1.2. If the scanner is the default internal scanner, spec_version should be 1.2 by default.

If the type is sbom, additional_attributes should contain sbom_media_types, which indicates the supported media types of the SBOM report. For example, if the scanner supports generating SPDX, "application/spdx+json" should be included in sbom_media_types.
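A minimal sketch of how a consumer of the metadata response might read the supported SBOM media types out of the capabilities array described above. The function name is hypothetical; only the JSON field names come from the examples in this section.

```python
from typing import Any, Dict, List, Optional


def sbom_media_types(capabilities: Optional[List[Dict[str, Any]]]) -> List[str]:
    """Return the media types advertised by the first 'sbom' capability, or []."""
    for cap in capabilities or []:
        if cap.get("type") == "sbom":
            attrs = cap.get("additional_attributes") or {}
            return list(attrs.get("sbom_media_types", []))
    # No sbom capability: the scanner cannot generate SBOMs.
    return []
```

A scanner that only reports the vulnerability capability yields an empty list, which a caller can treat as "SBOM generation not supported".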

The upgrade script will migrate the existing scanner information to the new table format. By default, the Trivy scanner should have the following data in the capabilities column:

    "capabilities": [
        {
            "type": "vulnerability",
            "consumes_mime_types": [
                "application/vnd.oci.image.manifest.v1+json",
                "application/vnd.docker.distribution.manifest.v2+json"
            ],
            "produces_mime_types": [
                "application/vnd.security.vulnerability.report; version=1.1"
            ],
            "additional_attributes": {
                "sbom_media_types": ["application/spdx+json","application/vnd.cyclonedx+json"]
            }
        }
    ]

If the adapter implements pluggable scanner spec version 1.1, the capabilities should be the following fixed value:

"capabilities": [
    {
        "type": "vulnerability",
        "consumes_mime_types": [
            "application/vnd.oci.image.manifest.v1+json",
            "application/vnd.docker.distribution.manifest.v2+json"
        ],
        "produces_mime_types": [
            "application/vnd.security.vulnerability.report; version=1.1"
        ]
    }
]
  2. Update the scan API to add the capability to generate SBOM (adapter)

The ScanRequest sent to /api/v1/scan should include the following information when SBOM generation is required.

Request Method:

POST /api/v1/scan

Request Body:

{
    "registry": {
        "url":"https://harbor.example.com",
        "authorization": "Basic xxxxxxx"
    },
    "artifact": {
        "repository": "library/nginx",
        "reference": "latest"
    },
    "enabled_capabilities": [{
        "type": "sbom",
        "produces_mime_types": [
            "application/vnd.security.sbom.report+json; version=1.0"
        ],
        "parameters": {
            "sbom_media_types": ["application/spdx+json"]
        }
    }]
}

enabled_capabilities is an array of objects. For each element, the type field is mandatory and indicates the type of the scan request. The produces_mime_types and parameters fields are optional.

parameters.sbom_media_types should be a string array that indicates the media types of the SBOM report. Each value should appear in the sbom_media_types list stored for the scanner in the scanner_registration table. If a requested media type is not in that list, the scanner adapter should return 400 to the client.
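The validation rules above can be sketched as follows. This is an illustrative check rather than the adapter's actual code: the function name is hypothetical, and returning 202 for an accepted request is taken from the generic process described later in this proposal.

```python
from typing import Any, List


def validate_enabled_capabilities(
    enabled_capabilities: Any,
    supported_sbom_media_types: List[str],
) -> int:
    """Return 400 for an invalid scan request, 202 if it can be accepted."""
    if not isinstance(enabled_capabilities, list):
        return 400
    for cap in enabled_capabilities:
        # The type field is mandatory for every element.
        if not isinstance(cap, dict) or not cap.get("type"):
            return 400
        if cap["type"] == "sbom":
            # parameters is optional; when present, every requested media
            # type must be one the scanner advertises.
            requested = (cap.get("parameters") or {}).get("sbom_media_types", [])
            if any(mt not in supported_sbom_media_types for mt in requested):
                return 400
    return 202
```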

The scanner accepts the request and returns the scan request ID in the response body. The scan request ID is the UUID of the request and is only meaningful within the scanner's scope. The response body should look like this:

{
    "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6"
}

In the Harbor job service, the scan job should be updated to send a different request (SBOM or vulnerability) to the scanner adapter depending on the scan type.
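A rough sketch of how the request body could vary by scan type, following the ScanRequest example above. The function and constant names are hypothetical, and the shape of the vulnerability capability is an assumption modeled on the metadata example.

```python
from typing import Any, Dict

SBOM_REPORT_MIME = "application/vnd.security.sbom.report+json; version=1.0"
VULN_REPORT_MIME = "application/vnd.security.vulnerability.report; version=1.1"


def build_scan_request(
    registry_url: str,
    authorization: str,
    repository: str,
    reference: str,
    scan_type: str = "vulnerability",
    sbom_media_type: str = "application/spdx+json",
) -> Dict[str, Any]:
    """Build the body for POST /api/v1/scan for the given scan type."""
    if scan_type == "sbom":
        capability: Dict[str, Any] = {
            "type": "sbom",
            "produces_mime_types": [SBOM_REPORT_MIME],
            "parameters": {"sbom_media_types": [sbom_media_type]},
        }
    elif scan_type == "vulnerability":
        capability = {
            "type": "vulnerability",
            "produces_mime_types": [VULN_REPORT_MIME],
        }
    else:
        raise ValueError("unsupported scan type: " + scan_type)
    return {
        "registry": {"url": registry_url, "authorization": authorization},
        "artifact": {"repository": repository, "reference": reference},
        "enabled_capabilities": [capability],
    }
```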

  3. Update the report API to retrieve the SBOM or vulnerability report.

The /api/v1/scan/{scan_request_id}/report API should return the following information if the adapter supports SBOM generation.

Request Method:

GET /api/v1/scan/{scan_request_id}/report?sbom_media_type=application/spdx+json  (url should be encoded)

The request must include the Accept header. If the Accept header is not provided, the adapter returns 400 to the client. The Accept header should match the produces_mime_types in the scan request.

If the scan request type is vulnerability, the request URL should be

GET /api/v1/scan/{scan_request_id}/report

the Accept header should be one of the following

Accept: application/vnd.scanner.adapter.vuln.report.harbor+json; version=1.0
or
Accept: application/vnd.security.vulnerability.report; version=1.1
or
Accept: application/vnd.scanner.adapter.vuln.report.raw

If the request is SBOM, the request URL should be

GET /api/v1/scan/{scan_request_id}/report?sbom_media_type=application/spdx+json  (url should be encoded)

and the Accept header should be:

Accept: application/vnd.security.sbom.report+json; version=1.0

The Accept header differs for each type of scan request.

Response:

{
    "generated_at": "2021-10-11T07:20:50.52Z",
    "artifact": {
        "repository": "library/nginx",
        "reference": "latest"
    },
    "scanner": {
        "name": "Trivy",
        "vendor": "Aqua Security",
        "version": "x.x.x"
    },
    "media_type": "application/spdx+json | application/vnd.cyclonedx+json",
    "sbom": {
        < the sbom content >
    }
}

For an SBOM request, if sbom_media_type is not provided, the scanner adapter should return 400 to the client. If the requested report is not available, or is not ready yet, it should return 404 to the client. The client keeps waiting until it times out.
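The report retrieval rules above can be sketched as two small helpers: one builds the request path and Accept header for each scan type, and one polls until the report is ready, treating 404 as "not ready yet". Both function names, the fetch callback, and the default timeout values are assumptions for illustration. The actual HTTP call is left to the caller.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple
from urllib.parse import quote

SBOM_ACCEPT = "application/vnd.security.sbom.report+json; version=1.0"
VULN_ACCEPT = "application/vnd.security.vulnerability.report; version=1.1"


def report_request(
    scan_request_id: str,
    scan_type: str = "vulnerability",
    sbom_media_type: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (path, headers) for fetching a report from the adapter."""
    path = "/api/v1/scan/" + scan_request_id + "/report"
    if scan_type == "sbom":
        if not sbom_media_type:
            raise ValueError("sbom_media_type is required for SBOM reports")
        # Percent-encode '/' and '+' so the media type survives as a query value.
        path += "?sbom_media_type=" + quote(sbom_media_type, safe="")
        return path, {"Accept": SBOM_ACCEPT}
    return path, {"Accept": VULN_ACCEPT}


def poll_report(
    fetch: Callable[[], Tuple[int, Any]],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> Any:
    """Call fetch() until it returns status 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status, body = fetch()
        if status == 200:
            return body
        if status != 404:
            raise RuntimeError("unexpected status " + str(status))
        if time.monotonic() >= deadline:
            raise TimeoutError("report was not ready before the timeout")
        time.sleep(interval_s)
```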

In the Harbor job service, the scan job should be updated to retrieve the SBOM report from the scanner adapter and store it in the database. If an SBOM report already exists for the same scanner, it should be replaced.

Harbor API change

  1. Add a configuration item to enable automatic SBOM generation on image push. The default value is false.

  2. Add a system configuration item to set the default SBOM media_type for the scanner. Its value can be application/spdx+json or application/vnd.cyclonedx+json.

  3. Update the scanners API to add two attributes for scanners: support_scan and support_sbom. They are values calculated from the scanner's capabilities.

GET /api/v2.0/scanners

Response body:

[
    {
        "access_credential": "",
        "auth": "",
        "create_time": "2023-11-29T02:56:07.469Z",
        "description": "The Trivy scanner adapter",
        "disabled": false,
        "is_default": true,
        "name": "Trivy",
        "skip_certVerify": false,
        "update_time": "2024-01-30T04:43:52.551Z",
        "url": "https://trivy-adapter:8443",
        "use_internal_addr": true,
        "uuid": "d7de8c76-8e62-11ee-a288-0242ac130009",
        "support_scan": true,
        "support_sbom": true
    }
]
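One way the two calculated attributes might be derived from the stored capabilities is sketched below. The function name is hypothetical, and mapping the "vulnerability" capability type to support_scan is an assumption based on the capability examples earlier in this proposal.

```python
from typing import Any, Dict, List, Optional


def scanner_support_flags(
    capabilities: Optional[List[Dict[str, Any]]],
) -> Dict[str, bool]:
    """Derive support_scan and support_sbom from a capabilities array."""
    types = {cap.get("type") for cap in capabilities or []}
    return {
        # Assumption: a "vulnerability" capability means scanning is supported.
        "support_scan": "vulnerability" in types,
        "support_sbom": "sbom" in types,
    }
```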
  4. Update the existing scan API to allow SBOM generation

Request Method:
POST /api/v2.0/projects/<projectname>/repositories/<repository>/artifacts/<digest>/scan

Request Body:

{
    "scan_type":"sbom"
}

When scan_type is empty or vulnerability, a vulnerability scan is performed. If scan_type is sbom, an SBOM is generated. The SBOM generation job is created and executed asynchronously, using the default scanner to scan the artifact. If the scanner is not available, the job fails.

If the scan request is accepted, the API returns 202 to the client with an empty response body.
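The defaulting rule for scan_type is small enough to show directly. This is a sketch with a hypothetical function name; rejecting unknown values with an error is an assumption, since the proposal does not say how they are handled.

```python
from typing import Any, Dict, Optional


def resolve_scan_type(body: Optional[Dict[str, Any]]) -> str:
    """Return the effective scan type for a scan request body."""
    # An absent body, a missing field, or an empty string all mean "vulnerability".
    scan_type = (body or {}).get("scan_type") or "vulnerability"
    if scan_type not in ("vulnerability", "sbom"):
        # Assumption: unknown scan types are rejected.
        raise ValueError("unsupported scan_type: " + str(scan_type))
    return scan_type
```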

  5. Update the existing stop scan API to support stopping SBOM generation

Request Method:
POST /api/v2.0/projects/<projectname>/repositories/<repository>/artifacts/<digest>/scan/stop

Request Body:

{
    "scan_type":"sbom"
}

If scan_type is empty, it defaults to "vulnerability". The API queries the execution table to find the scan job with the same artifact digest and scan type. If the job is running, it is stopped. The execution's extra_attrs should be updated to include the scan_type information.

When scan_type is empty or vulnerability, the vulnerability scan job is stopped. If scan_type is sbom, the SBOM generation job is stopped. If the scanner is not available, the job fails.
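A sketch of the lookup that the stop API performs, using plain dictionaries in place of rows from the execution table. The record shape (status, extra_attrs.artifact_digest, extra_attrs.scan_type) and the "Running" status value are assumptions for illustration, not Harbor's actual schema.

```python
from typing import Any, Dict, List


def find_running_executions(
    executions: List[Dict[str, Any]],
    artifact_digest: str,
    scan_type: str = "",
) -> List[Dict[str, Any]]:
    """Select running executions for an artifact that match the scan type."""
    wanted = scan_type or "vulnerability"
    matches = []
    for execution in executions:
        attrs = execution.get("extra_attrs") or {}
        # Executions without a scan_type are treated as vulnerability scans.
        recorded = attrs.get("scan_type") or "vulnerability"
        if (
            execution.get("status") == "Running"
            and attrs.get("artifact_digest") == artifact_digest
            and recorded == wanted
        ):
            matches.append(execution)
    return matches
```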

  6. Update the existing scan job service to support SBOM generation for an artifact. The existing job in pkg/scan/job.go should accept SBOM generation in the request parameters. The previous vulnerability scan job is still supported. After receiving the SBOM report, the job service should:

    1. Delete the previous SBOM report generated by the same scanner, if one exists. This includes the information in the scan_report and sbom_package tables and the artifact accessory in the OCI registry.
    2. Parse the SBOM report and store the SBOM information in the scan_report table. The report column of the scan_report table stores the original response from the scanner adapter. Insert the package information into the sbom_package table.

    The generic process of SBOM generation and scanning is as follows:

    1. Parse the scan request.
    2. Select the scanner to scan the artifact.
    3. Generate the secret required to access the registry and repository, and send the scan request to the scanner adapter.
    4. If the scan request is accepted, the scanner adapter returns 202, and the request ID is included in the response body.
    5. The job service queries the scanner adapter for the scan report by request ID until all required reports are ready. A timeout prevents the job from being blocked indefinitely.
    6. Parse the report and store the report information in the database.
  7. Update the existing list artifact API to support the with_sbom_overview parameter. If with_sbom_overview is true, the sbom_overview data should be provided.

[
    {
        "digest": "sha256:1d417d2b74017139bb2bd2a9ff7f6be0c6d9ee25452d70190e3508df8a6a1586",
        "icon": "sha256:0048162a053eef4d4ce3fe7518615bef084403614f8bca43b40ae2e762e11e06",
        "id": 53,
        "labels": null,
        "manifest_media_type": "application/vnd.docker.distribution.manifest.v2+json",
        "media_type": "application/vnd.docker.container.image.v1+json",
        "project_id": 1,
        "pull_time": "2024-01-30T08:08:00.508Z",
        "push_time": "2024-01-30T06:10:44.848Z",
        "repository_id": 22,
        ...
        "sbom_overview": {
            "application/vnd.security.sbom.report+json; version=1.0": {
                "duration": 4,
                "end_time": "2024-01-30T08:07:56.000Z",
                "report_id": "1a6f49a5-17ea-49b4-94ff-f38fc80cc0c8",
                "scan_status": "Stopped",
                "start_time": "2024-01-30T08:07:52.000Z"
            }
        },
        ...
        "size": 7028041,
        "tags": null,
        "type": "IMAGE"
    }
]
  8. Add an API to retrieve the SBOM report for the artifact

Request Method:
GET /api/v2.0/projects/<projectname>/repositories/<repository>/artifacts/<digest>/additions/sbom

Response:

{
    "spdxVersion": "SPDX-2.3",
    "dataLicense": "CC0-1.0",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "alpine:latest",
    "documentNamespace": "<http://aquasecurity.github.io/trivy/container_image/alpine:latest-24fab3cb-05fa-479b-b3b7-f76151354cc3>",
    "creationInfo":{
      "licenseListVersion": "",
      "creators": [
        "Organization: aquasecurity",
        "Tool: trivy-0.44.1"
      ],
      "created": "2023-09-22T07:41:04Z"
    },
    "packages": [
        {
          "name": "alpine",
          "SPDXID": "SPDXRef-OperatingSystem-68bf9b9d283c287a",
          "versionInfo": "3.18.3",
          "downloadLocation": "NONE",
          "copyrightText": "",
          "primaryPackagePurpose": "OPERATING-SYSTEM"
        },
        ...
    ]
    ...
    }
   ...
}

Because the end user can upload an SBOM to an artifact, and Harbor itself can also generate an SBOM for an existing artifact, the SBOM report information should be stored in the artifact_accessory table. The current API only displays SBOM reports generated in Harbor. Harbor cannot guarantee that an SBOM report uploaded by the end user is valid and consumable by the API, so it does not display such reports.

If the SBOM report is unavailable, it returns http code 404 to the client.

  9. Add a UI tab to display the SBOM accessory of the artifact.

  10. Update the list artifact API to indicate whether an SBOM is available for the artifact. The attribute should be sbom_available, a boolean value. If the SBOM information is available, the SBOM summary status is displayed in the artifact table.

  11. Update the system info API to add an option sbom_enabled. SBOM is enabled when the following conditions are met:

    1. The scan_spec version is 1.1.
    2. The default scanner_registration's support_sbom is true.

    When sbom_enabled is true, the Generate SBOM button is enabled; otherwise it is disabled.

  12. Update the scanner registration API to support pluggable scanner spec 1.1 adapters, and update their information in the scanner_registration table in the migration script.

SBOM Generation

SBOM generation is triggered in the following scenarios:

  1. The user triggers SBOM generation by calling the SBOM API.
  2. The user pushes an artifact to the repository, and the scanner policy is configured to trigger SBOM generation when an artifact is pushed.

Lifecycle Management

The SBOM is stored in the Harbor database. Its lifecycle is managed as follows:

  1. It cannot be replicated by Harbor replication.
  2. If the artifact is deleted, the SBOM information should be removed as well.
  3. If it was generated by the current scanner in Harbor, it should be removed when a new SBOM generation starts. For each artifact, only the latest SBOM report generated by the current scanner is kept.
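The lifecycle rules for replacement and deletion can be modeled with a small in-memory store, shown below as a sketch. The class and method names are hypothetical; in Harbor the data lives in the scan_report and sbom_package tables and in the artifact accessory. Keying reports by (artifact digest, scanner registration UUID) makes "keep only the latest report per scanner" fall out of a plain overwrite.

```python
from typing import Any, Dict, Optional, Tuple


class SbomStore:
    """In-memory stand-in for SBOM report storage (illustrative only)."""

    def __init__(self) -> None:
        # (artifact_digest, registration_uuid) -> latest SBOM report
        self._reports: Dict[Tuple[str, str], Any] = {}

    def save(self, artifact_digest: str, registration_uuid: str, report: Any) -> None:
        # Saving replaces any earlier report from the same scanner.
        self._reports[(artifact_digest, registration_uuid)] = report

    def get(self, artifact_digest: str, registration_uuid: str) -> Optional[Any]:
        return self._reports.get((artifact_digest, registration_uuid))

    def delete_artifact(self, artifact_digest: str) -> int:
        """Remove every SBOM report of a deleted artifact; return the count."""
        keys = [key for key in self._reports if key[0] == artifact_digest]
        for key in keys:
            del self._reports[key]
        return len(keys)
```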

UI

  1. Add a configuration item to enable automatic SBOM generation on push. If the current scanner does not support SBOM generation, the configuration item should be disabled.

  2. Add a system configuration item to select the default SBOM media_type, which can be application/spdx+json or application/vnd.cyclonedx+json.

  3. Add a button in the artifact table to manually generate an SBOM for the artifact. The button should be disabled if the current scanner does not support SBOM generation. The generate SBOM button and the stop generate SBOM button should be mutually exclusive. Stop generate SBOM and stop scan should use the same backend API with different parameters.

  4. In the artifact table, if sbom_enabled is true, display the SBOM status. The SBOM status can link to the SBOM detail information in #4.

  5. Display the full content of the current SBOM (including SPDX and CycloneDX). The content of the SBOM can be downloaded as a file.

Open issues (if applicable)

Terminology