Kaniko + Cosign + Vault on Self-Hosted Kubernetes

Executive Summary

This paper presents a supply-chain security pipeline for self-hosted Kubernetes environments that are isolated from public cloud infrastructure. The architecture combines four components — Kaniko for daemon-free image builds, Cosign v2 for local-key image signing, Kyverno for admission-time signature enforcement, and HashiCorp Vault for ephemeral secrets delivery — to achieve verified, policy-enforced container deployments without dependency on the Sigstore public transparency infrastructure. The principal finding is that the majority of published guidance for container signing assumes network reachability to rekor.sigstore.dev and other public Sigstore services; each of those assumptions breaks in air-gapped or self-hosted deployments in ways that are not immediately apparent from tooling error messages. The pattern described here resolves each failure mode and establishes a per-environment key hierarchy that prevents cross-tier image promotion.

Key Findings

The public Sigstore transparency log assumption embedded in default Cosign tooling renders standard signing guides non-functional in air-gapped environments. The --tlog-upload=false flag is required to bypass Rekor upload; when it is omitted, the signature is computed locally and the command then fails during log submission, with error messages that obscure the root cause.
Kyverno’s verifyImages policy will reject every image admission in an air-gapped cluster unless ignoreTlog: true and ignoreSCT: true are explicitly set. The failure mode presents as a signature verification error rather than a network connectivity error, misleading operators into re-examining signing configuration rather than policy configuration.
Docker-in-Docker (--privileged mode) is not required for Kubernetes-native image builds. Kaniko executes builds as a standard container, making it compatible with restrictive pod security standards that prohibit privileged escalation.
The Vault agent sidecar pattern is poorly suited to short-lived CI jobs. Inline Kubernetes auth eliminates sidecar lifecycle complexity: the job authenticates directly, retrieves the signing key, uses it once, and removes it — the key’s disk residency is bounded to the duration of the signing call.
Per-tier workflow files provide stronger deployment isolation than parameterized tier selection. Separate workflow files hard-code tier-specific key references, registry paths, and cluster credentials, eliminating a class of misconfiguration errors that are possible when tier is an input parameter.
Key rotation in this architecture requires coordinated updates across three systems — the Vault secret, the Kyverno cluster policy, and any images that must remain deployable under the new key. The absence of a formalized runbook for this sequence is the primary operational risk in the current implementation.

1. Introduction

Container supply-chain security has received sustained attention following a series of high-profile software supply-chain incidents. The Sigstore project and its primary tool, Cosign, provide cryptographic image signing and verification infrastructure that has become the de facto standard for Kubernetes environments. However, the published documentation and tooling defaults are designed for environments with unrestricted outbound network access to public Sigstore infrastructure.

Self-hosted Kubernetes deployments (those operating in corporate data centers, regulated industries with network segmentation requirements, or environments with restrictive egress policies) encounter a consistent pattern of failures when following standard guidance. Each failure is recoverable, but the path from failure to resolution is poorly documented.

This paper presents a complete pipeline architecture that addresses these failures systematically. The architecture covers the full lifecycle from source commit to running pod: daemon-free image construction, local-key cryptographic signing, admission-time enforcement, and ephemeral secrets delivery. All components operate without dependency on public cloud services or the Sigstore public transparency infrastructure. The pipeline described here builds on a self-hosted CI foundation running inside the cluster (Gitea Actions with act_runner and a private container registry) and layers supply-chain security controls on top of that foundation.

2. Architecture Overview

The pipeline consists of four components plus the private registry they share. Each was selected to satisfy a constraint that the default tooling alternative could not meet.

The architecture decisions documented in this section are driven by specific self-hosted constraints. Teams operating in cloud environments with unrestricted egress may find the standard tooling defaults sufficient for several of these components.

Component	Default Assumption	Self-Hosted Constraint	Selected Alternative
Image builds	Docker daemon available; `--privileged` mode acceptable	Privileged containers restricted by cluster security policy	Kaniko (daemon-free, standard container)
Image signing	`rekor.sigstore.dev` reachable; public transparency log available	Egress to Sigstore infrastructure blocked	Cosign v2 with `--tlog-upload=false` and local key pair
Admission enforcement	Rekor entry verifiable at admission time	Kyverno cannot reach Rekor; Certificate Transparency unavailable	Kyverno with `ignoreTlog: true`, `ignoreSCT: true`
Secrets delivery	Cloud-managed secrets service (AWS Secrets Manager, GCP Secret Manager) or persistent Vault agent sidecar	No managed secrets service; CI jobs are short-lived	Vault inline Kubernetes auth; ephemeral key retrieval
Registry	Public registry (Docker Hub, GHCR) or cloud-managed	Private registry with self-signed TLS certificate	Private registry; `--skip-tls-verify` or custom CA mount

The components interact in a linear sequence: Kaniko builds the image and emits a digest; the CI job authenticates to Vault and retrieves the signing key; Cosign signs the image by digest; Kyverno enforces the signature at admission time. Each handoff is explicit and auditable.

3. Container Build: Kaniko

3.1 Problem Statement

The conventional approach to building container images inside a Kubernetes cluster is Docker-in-Docker: a privileged container runs a Docker daemon that executes the build. This approach requires the --privileged security context, which grants the container nearly unrestricted access to the host kernel. Cluster security policies in hardened environments routinely prohibit this.

3.2 Kaniko Operation

Kaniko resolves this constraint by executing each Dockerfile instruction directly as a process within a standard, unprivileged container. It does not require or use a Docker socket. The resulting image is pushed directly to a registry from the build container.

3.3 Configuration

The following workflow step demonstrates Kaniko’s daemon-free build invocation and the digest handoff mechanism required by subsequent signing steps:

# CI workflow: Kaniko build step
- name: build-and-push
  image: gcr.io/kaniko-project/executor:latest
  args:
    - "--dockerfile=Dockerfile"
    - "--context=git://gitea.example.com/org/repo"
    - "--destination=registry.example.com/org/app:${IMAGE_TAG}"
    - "--cache=true"
    - "--cache-repo=registry.example.com/org/app/cache"
    - "--skip-tls-verify"                              # private registry with self-signed cert
    - "--compressed-cache"
    - "--digest-file=/workspace/image-digest"          # write pushed digest for Cosign handoff
  volumeMounts:
    - name: docker-config
      mountPath: /kaniko/.docker
  volumes:
    - name: docker-config
      secret:
        secretName: registry-credentials
        items:
          - key: .dockerconfigjson
            path: config.json                          # Kaniko reads credentials here, not from Docker socket

--skip-tls-verify should be treated as a transitional configuration. The preferred approach is to mount the internal CA certificate and reference it via --registry-certificate. Persistent use of --skip-tls-verify in production workflows eliminates TLS validation and weakens the integrity of the build-to-registry channel.

Two parameters warrant specific attention. First, --digest-file writes the pushed image digest to a workspace file. Cosign v2 signs by digest rather than by tag; a tag is mutable and provides no cryptographic binding to a specific image layer set. The digest is the only input that guarantees the signed artifact and the deployed artifact are identical. If --digest-file fails silently — for example, due to a workspace path issue — the signing step will attempt to sign an empty string and produce a misleading error.

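Add an explicit existence check on the digest file before the signing step executes: test -s /workspace/image-digest || { echo "ERROR: digest file missing or empty" >&2; exit 1; }. The braces matter: a parenthesized group runs in a subshell, so an exit inside it ends only the subshell and any later commands in the same script would still run. This check surfaces the failure at the correct step rather than propagating an empty digest into the Cosign invocation.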

Second, registry credentials are provided via a Kubernetes secret mounted at /kaniko/.docker/config.json. Kaniko does not interact with a Docker socket and therefore does not read credentials from the Docker credential store; the mounted path is the correct location.
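The digest handoff check described above can be tightened into a small helper. The following is a minimal sketch, not part of the original pipeline: the function name read_digest is hypothetical, and the sha256:<64 hex characters> format check is an added assumption (the text above only requires that the file exists and is non-empty).

```shell
#!/bin/sh
# Sketch: validate the digest Kaniko wrote before handing it to Cosign.
# read_digest is a hypothetical helper; the format check is an assumption.

read_digest() {   # $1 = path to the file written by --digest-file
  digest_file="$1"
  if [ ! -s "$digest_file" ]; then
    echo "ERROR: digest file missing or empty: $digest_file" >&2
    return 1
  fi
  # Strip whitespace so the value can be appended directly after '@'.
  digest=$(tr -d '[:space:]' < "$digest_file")
  if ! printf '%s' "$digest" | grep -Eq '^sha256:[0-9a-f]{64}$'; then
    echo "ERROR: unexpected digest format: $digest" >&2
    return 1
  fi
  printf '%s\n' "$digest"
}
```

A signing step could then begin with IMAGE_DIGEST=$(read_digest /workspace/image-digest) || exit 1, so that a malformed value never reaches the Cosign invocation.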

4. Image Signing: Cosign v2 Air-Gapped

4.1 Problem Statement

The Cosign v2 documentation presents local key-pair signing as a straightforward alternative to keyless signing. In practice, the default configuration for key-pair signing still attempts to record the signature in the Rekor public transparency log. In an air-gapped environment, the signing operation completes locally and then fails when Cosign attempts to contact rekor.sigstore.dev. The error message does not clearly identify the transparency log upload as the failure source, leading operators to investigate key configuration rather than network configuration.

4.2 Air-Gapped Signing Configuration

The following commands demonstrate the complete set of flags required for signing and verification in an environment without Sigstore infrastructure access:

# Sign with a local key, bypassing the transparency log entirely
cosign sign \
  --key cosign.key \
  --registry-referrers-mode=oci-1-1 \
  --tlog-upload=false \
  --allow-insecure-registry \
  "registry.example.com/org/app@${IMAGE_DIGEST}"

# Verify: the transparency-log and SCT checks must be disabled on this side as well
cosign verify \
  --key cosign.pub \
  --allow-insecure-registry \
  --insecure-ignore-tlog \
  --insecure-ignore-sct \
  "registry.example.com/org/app@${IMAGE_DIGEST}"

The function of each flag is as follows:

--tlog-upload=false: suppresses the attempt to upload the signature entry to the Rekor transparency log. This flag is the primary fix for the air-gapped signing failure. The default value is true; omitting this flag is the failure mode.
--insecure-ignore-tlog: instructs verification to proceed without requiring a Rekor log entry. Images signed with --tlog-upload=false possess no Rekor entry by design; verification will fail unless this flag is present.
--insecure-ignore-sct: suppresses the requirement for a Signed Certificate Timestamp, which is part of the Certificate Transparency infrastructure. Both SCT validation and Rekor validation depend on public internet infrastructure; both must be disabled in air-gapped environments.
--registry-referrers-mode=oci-1-1: stores the signature using the OCI 1.1 referrers API, attaching it to the image manifest in the registry rather than using the legacy tag-based signature storage. Cosign v2 treats this mode as experimental, and the registry must support the referrers API, so confirm both before relying on it.
--allow-insecure-registry: permits connections to a registry whose TLS certificate cannot be validated, such as the self-signed certificate assumed here. Like Kaniko's --skip-tls-verify, it should be treated as a transitional setting.

The following step demonstrates integration with the digest artifact produced by the Kaniko build step:

IMAGE_DIGEST=$(cat /workspace/image-digest)
cosign sign \
  --key cosign.key \
  --tlog-upload=false \
  --allow-insecure-registry \
  "registry.example.com/org/app@${IMAGE_DIGEST}"

4.3 Per-Tier Key Hierarchy

Each deployment environment maintains a distinct Cosign key pair. The CI workflow selects the key corresponding to the target tier. Kyverno policy on each cluster validates signatures against the public key registered for that environment exclusively. This structure enforces an architectural boundary: a dev-signed image cannot be admitted to the production cluster because its signature does not verify against the production public key, regardless of how the workflow is invoked.
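The boundary described above rests on ordinary signature verification: a signature produced with one tier's private key does not verify under another tier's public key. The sketch below illustrates that property with openssl standing in for Cosign, purely so it can run without a registry. The function names, file names, and payload string are hypothetical, and the ECDSA P-256 key type is chosen to resemble the key pairs Cosign generates.

```shell
#!/bin/sh
# Illustration only: openssl stands in for Cosign to show the per-tier boundary.
# All names here are hypothetical and not part of the original pipeline.

make_tier_key() {    # $1 = tier name, $2 = output directory
  openssl ecparam -name prime256v1 -genkey -noout -out "$2/$1.key" &&
  openssl ec -in "$2/$1.key" -pubout -out "$2/$1.pub" 2>/dev/null
}

sign_payload() {     # $1 = private key, $2 = payload file, $3 = signature output
  openssl dgst -sha256 -sign "$1" -out "$3" "$2"
}

verify_payload() {   # $1 = public key, $2 = payload file, $3 = signature
  openssl dgst -sha256 -verify "$1" -signature "$3" "$2" >/dev/null 2>&1
}

tier_boundary_demo() {
  dir=$(mktemp -d) || return 1
  if ! { make_tier_key dev "$dir" && make_tier_key prod "$dir"; }; then
    rm -rf "$dir"
    return 1
  fi
  # The payload plays the role of the image digest reference being signed.
  printf 'registry.example.com/org/app@sha256:example' > "$dir/payload"
  sign_payload "$dir/dev.key" "$dir/payload" "$dir/dev.sig"

  # Each "cluster" trusts only its own tier's public key.
  if verify_payload "$dir/dev.pub" "$dir/payload" "$dir/dev.sig"; then
    echo "dev cluster: admitted"
  else
    echo "dev cluster: rejected"
  fi
  if verify_payload "$dir/prod.pub" "$dir/payload" "$dir/dev.sig"; then
    echo "prod cluster: admitted"
  else
    echo "prod cluster: rejected"
  fi
  rm -rf "$dir"
}
```

Calling tier_boundary_demo should report the dev-signed payload as admitted by the dev key and rejected by the prod key, which is the same outcome the per-tier Kyverno policies enforce at admission.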

5. Policy Enforcement: Kyverno

5.1 Problem Statement

Kyverno’s verifyImages admission policy is the standard mechanism for enforcing image signing at pod admission time. The default policy configuration — as presented in the Kyverno documentation — expects a reachable Rekor instance and valid Certificate Transparency records. In an air-gapped cluster, every pod admission that triggers the policy fails. Kyverno reports a signature verification failure rather than a network connectivity error, which misdirects investigation toward the signing pipeline rather than the policy configuration.

5.2 Air-Gapped Policy Configuration

The following ClusterPolicy manifest demonstrates the complete configuration required for air-gapped signature enforcement:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  background: false            # disable background scanning — avoids false failures if Kyverno can't reach external services
  rules:
    - name: check-image-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "registry.example.com/org/*"
          attestors:
            - count: 1
              entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <cosign-public-key>
                      -----END PUBLIC KEY-----
                    ctlog:
                      ignoreSCT: true      # don't require Certificate Transparency entry
                    rekor:
                      ignoreTlog: true     # don't require Rekor transparency log entry

ignoreTlog: true and ignoreSCT: true are not workarounds or unsupported configurations. They are the documented mechanism for air-gapped Cosign verification in Kyverno. The documentation coverage of these fields is sparse relative to their operational significance in self-hosted environments.

The background: false setting disables Kyverno’s periodic background scanning cycle. In an air-gapped environment, background scanning generates a high volume of spurious failures as Kyverno attempts to contact external services on each scan interval. Admission-time enforcement is the security-relevant behavior; background scanning provides no additional security value here and introduces operational noise.

When the Cosign key pair is rotated, the Kyverno ClusterPolicy must be updated to reference the new public key before or simultaneously with the deployment of images signed with the new key. Kyverno enforces policy only at admission time; already-running pods are not re-evaluated. A policy update that lags behind key rotation will cause new deployments to fail until the policy is reconciled. A formal rotation runbook that sequences these three steps — Vault secret update, Kyverno policy update, image re-signing — is required to manage this safely.
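One possible shape for such a runbook is sketched below in dry-run form. It is not the author's procedure: the file names are hypothetical, and the ordering adds an assumption, namely a transitional policy that lists both the old and the new public key as separate attestor entries with count: 1, so that admissions keep working while the key is swapped. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of a key rotation sequence. File names and the dual-trust ordering
# are assumptions; DRY_RUN=1 (the default) prints commands instead of running them.

DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

rotate_tier_key() {   # $1 = tier; remaining args = image references (by digest) to re-sign
  tier="$1"
  shift
  # 1. Trust both keys so neither old nor new signatures are rejected mid-rotation.
  run kubectl apply -f "policy-$tier-old-and-new.yaml"
  # 2. Replace the private key that CI jobs retrieve from Vault.
  run vault kv put "secret/ci/cosign/$tier" "private_key=@cosign-$tier-new.key"
  # 3. Re-sign every image that must remain deployable under the new key.
  for ref in "$@"; do
    run cosign sign --key "cosign-$tier-new.key" --tlog-upload=false "$ref"
  done
  # 4. Drop the old public key from the policy.
  run kubectl apply -f "policy-$tier-new-only.yaml"
}
```

Running rotate_tier_key dev followed by one or more digest references prints the four stages in order, which can be reviewed and rehearsed before setting DRY_RUN=0.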

6. Secrets Management: Vault Inline Auth

6.1 Problem Statement

The standard pattern for Vault integration in Kubernetes is the Vault Agent sidecar: a secondary container runs alongside the application, authenticates to Vault, and writes secrets to a shared volume for the primary container to consume. This pattern is appropriate for long-running application workloads where the sidecar lifecycle aligns with the application lifecycle. For a CI job that completes in under sixty seconds, the sidecar pattern introduces startup overhead, requires clean sidecar termination logic, and adds a long-lived process to a short-lived job context.

6.2 Inline Authentication Pattern

The following step demonstrates inline Vault authentication using the Kubernetes auth method, without a sidecar:

# Inline Vault auth: the CI job authenticates directly, no sidecar needed
- name: vault-auth-and-sign
  # The stock hashicorp/vault image ships no cosign binary; this step assumes
  # a CI image that bundles both the vault and cosign CLIs.
  image: hashicorp/vault:latest
  env:
    - name: VAULT_ADDR
      value: "https://vault.internal.example.com"
    - name: VAULT_NAMESPACE
      value: "ci"
  command:
    - /bin/sh
    - -c
    - |
      set -eu                              # stop at the first failed command
      umask 077                            # key file readable by this user only
      trap 'rm -f /tmp/cosign.key' EXIT    # remove the key even if a later command fails

      # Authenticate using the pod's Kubernetes service account token
      SA_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
      VAULT_TOKEN=$(vault write -field=token auth/kubernetes/login \
        role=ci-signer \
        jwt="${SA_TOKEN}")
      export VAULT_TOKEN

      # Pull the tier-specific signing key from Vault KV
      vault kv get -field=private_key \
        secret/ci/cosign/dev > /tmp/cosign.key

      # Sign the image by the digest Kaniko wrote
      IMAGE_DIGEST=$(cat /workspace/image-digest)
      cosign sign \
        --key /tmp/cosign.key \
        --tlog-upload=false \
        --allow-insecure-registry \
        "registry.example.com/org/app@${IMAGE_DIGEST}"

      # Delete key immediately; it has no purpose after signing
      rm -f /tmp/cosign.key

The Kubernetes auth method uses the pod’s projected service account JWT to authenticate against the Vault server. The Vault role (ci-signer) maps specific service account and namespace combinations to a policy granting read-only access to the signing key path for the corresponding tier. The CI job’s service account is granted exactly that policy and no broader access.

The signing key is written to /tmp/cosign.key and removed immediately after the signing call. The key's disk residency is bounded to the interval between retrieval from Vault and cleanup. If the job is killed before cleanup runs, the key will persist in the container filesystem until the pod is garbage-collected. For environments with higher key confidentiality requirements, mounting /tmp as a tmpfs volume prevents the key from touching persistent storage entirely and is the recommended approach.

Scope the Vault policy to the minimum required path. A role that grants read access only to secret/ci/cosign/dev (rather than secret/ci/cosign/* or broader) ensures that a compromised CI job cannot retrieve signing keys for other tiers.
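A least-privilege policy of this kind can be rendered with a few lines of shell. This is a sketch under stated assumptions: the function name and the policy name in the usage note are hypothetical, the tier names follow the dev, stage, and prod tiers used elsewhere in this paper, and the secret/data/ prefix assumes a KV version 2 engine mounted at secret/ (a KV version 1 mount would use the path without data/).

```shell
#!/bin/sh
# Sketch: render a read-only Vault policy for exactly one tier's signing key.
# Assumes a KV v2 engine mounted at "secret/"; names are hypothetical.

render_signer_policy() {   # $1 = tier (dev, stage, or prod)
  case "$1" in
    dev|stage|prod) ;;
    *) echo "ERROR: unknown tier: $1" >&2; return 1 ;;
  esac
  cat <<EOF
# Read-only access to the $1 signing key and nothing else.
path "secret/data/ci/cosign/$1" {
  capabilities = ["read"]
}
EOF
}
```

The output can be piped to Vault, for example render_signer_policy dev | vault policy write ci-signer-dev -, and the resulting policy attached to the Kubernetes auth role for that tier's CI service account.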

7. Per-Tier Deployment Model

7.1 Design Rationale

A single parameterized workflow that accepts a target tier as input is a common CI design pattern. In the context of supply-chain security, this pattern introduces a failure mode: a misconfigured input parameter, a CI bug, or an intentional circumvention attempt could cause a dev-signed image to be deployed to the production cluster. Because the signing key selection is driven by the parameter, the pipeline would sign the image with the dev key and the production Kyverno policy would reject it — but the attempted deployment would still occur. Separate workflow files per tier eliminate this class of error. The tier-specific key reference, registry path, and cluster credentials are embedded in the workflow definition itself. Deployment to the wrong tier requires deliberate modification of a workflow file, not a parameter change.

7.2 Workflow Structure

The following directory structure illustrates the per-tier workflow organization:

.gitea/workflows/
  deploy-dev.yml       # signs with COSIGN_KEY_DEV, deploys to dev cluster
  deploy-stage.yml     # signs with COSIGN_KEY_STAGE, deploys to stage cluster
  deploy-prod.yml      # signs with COSIGN_KEY_PROD, deploys to prod cluster

Each workflow file is structurally identical in its step sequence but hard-codes the tier-specific values. Each cluster runs a Kyverno ClusterPolicy referencing only the public key for that tier. The enforcement holds in every direction: each cluster rejects any image not signed with its own tier's key, so a production cluster rejects a dev-signed or stage-signed image and a dev cluster likewise rejects a production-signed one, regardless of which workflow file triggered the deployment or how it was invoked.
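Because the isolation depends on each file referencing only its own tier's values, a simple lint can guard against copy-and-paste drift between the files. The following is a minimal sketch, assuming the file names and the COSIGN_KEY_DEV, COSIGN_KEY_STAGE, and COSIGN_KEY_PROD identifiers shown in the listing above; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: fail if a tier's workflow file references another tier's signing key.
# check_workflow_isolation is a hypothetical helper.

check_workflow_isolation() {   # $1 = workflows directory, e.g. .gitea/workflows
  dir="$1"
  rc=0
  for tier in dev stage prod; do
    file="$dir/deploy-$tier.yml"
    if [ ! -f "$file" ]; then
      echo "MISSING: $file" >&2
      rc=1
      continue
    fi
    own=$(printf '%s' "$tier" | tr '[:lower:]' '[:upper:]')
    for other in DEV STAGE PROD; do
      if [ "$other" = "$own" ]; then
        continue
      fi
      if grep -q "COSIGN_KEY_$other" "$file"; then
        echo "VIOLATION: $file references COSIGN_KEY_$other" >&2
        rc=1
      fi
    done
  done
  return "$rc"
}
```

Running check_workflow_isolation .gitea/workflows as an early CI step returns a non-zero status when any file is missing or mentions a key belonging to a different tier.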

8. End-to-End Pipeline

The complete pipeline for a commit to the main branch proceeds as follows:

1. Trigger: A Gitea webhook event triggers the tier-specific deployment workflow on the act_runner pod.
2. Build: Kaniko builds the container image from the Dockerfile, pushes it to the private registry, and writes the image digest to /workspace/image-digest.
3. Authenticate: The CI job authenticates to Vault using the pod's Kubernetes service account JWT and retrieves the tier-specific signing key via the ci-signer role.
4. Sign: Cosign signs the image by digest using --tlog-upload=false, storing the signature as an OCI artifact alongside the image in the private registry. The signing key is deleted from the job's filesystem immediately after the signing call completes.
5. Update Config: Kustomize overlays are updated with the new image digest and committed to the GitOps configuration repository.
6. Sync: ArgoCD detects the configuration change and synchronizes the updated manifest to the target cluster.
7. Admit: Kyverno intercepts the pod admission request, retrieves the image signature from the registry, and verifies it against the public key embedded in the ClusterPolicy. Admission succeeds only if verification passes.
8. Run: The pod starts. Images that fail signature verification at step 7 are rejected at admission; they do not run.

Kyverno enforces the signature policy on every pod admission, not only on initial deployment. Any event that creates a new pod (a rollout restart, a horizontal scaling event, or the replacement of a deleted or evicted pod) triggers re-verification; an in-place container restart inside an existing pod does not pass through admission. An image that passes verification at initial deployment will continue to be verified on every subsequent admission.

The failure mode at the admission step is deterministic: a missing signature, a signature created with the wrong key, or a corrupted signature artifact causes Kyverno to reject the admission with a specific policy violation event. The rejection is logged and observable in cluster audit logs.

9. Recommendations

The following recommendations are based on the operational findings documented in this paper. Security teams and platform engineers implementing this pattern should treat these as baseline requirements rather than optional enhancements.

Establish an explicit digest handoff validation step between the build and signing stages. Verify that the digest file exists and is non-empty before passing it to Cosign. A silent failure at this handoff produces misleading errors in the signing step and is difficult to diagnose without the validation check in place.
Document and rehearse the key rotation runbook before placing the pipeline in production. Key rotation requires coordinated updates to three systems — the Vault secret, the Kyverno ClusterPolicy, and any images that must remain deployable — in a defined sequence. The absence of a runbook is an operational risk that increases in severity with cluster count and key pair count.
Replace --skip-tls-verify with a mounted CA certificate for production workloads. TLS verification bypasses reduce the integrity guarantees of the build-to-registry channel. The private registry’s CA certificate should be distributed to CI job pods and referenced via the appropriate flag in both Kaniko and Cosign.
Mount the signing key path as tmpfs in environments with high key confidentiality requirements. The current implementation writes the signing key to the container filesystem for the duration of the signing call. A tmpfs mount ensures the key never reaches persistent storage and is not recoverable from the container filesystem if the pod is interrupted.
Scope Vault policies to the minimum required path per tier. A CI role that can read only the signing key for its target tier cannot retrieve keys for other tiers, even if the job's service account token is compromised. Least-privilege Vault policy scoping is a defense-in-depth measure against lateral movement within the CI environment.
Generate Kyverno policy YAML from the public key stored in Vault. The current process requires a manual policy update when keys are rotated. Automating policy generation from the Vault-stored public key reduces the window between key rotation and policy enforcement, and eliminates a manual step from the rotation runbook.
Maintain separate workflow files per deployment tier rather than using parameterized tier selection. This is an architectural control, not a stylistic preference. Parameterized tier selection creates a class of misconfiguration errors that separate workflow files structurally prevent.
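The recommendation above to generate the Kyverno policy from the stored public key can be sketched as a small template function. This is an illustration under stated assumptions: the function name is hypothetical, the manifest mirrors the ClusterPolicy in section 5.2, and the Vault field name public_key in the usage note is an assumption, since this paper only shows the private_key field.

```shell
#!/bin/sh
# Sketch: render the section 5.2 ClusterPolicy around a PEM public key file.
# render_cluster_policy is a hypothetical helper.

render_cluster_policy() {   # $1 = path to a PEM-encoded public key
  pubkey_file="$1"
  if ! grep -q 'BEGIN PUBLIC KEY' "$pubkey_file" 2>/dev/null; then
    echo "ERROR: not a PEM public key: $pubkey_file" >&2
    return 1
  fi
  cat <<'EOF'
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  background: false
  rules:
    - name: check-image-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "registry.example.com/org/*"
          attestors:
            - count: 1
              entries:
                - keys:
                    publicKeys: |-
EOF
  # Indent each non-empty PEM line to sit inside the block scalar.
  awk 'NF { print "                      " $0 }' "$pubkey_file"
  cat <<'EOF'
                    ctlog:
                      ignoreSCT: true
                    rekor:
                      ignoreTlog: true
EOF
}
```

A rotation job might then run vault kv get -field=public_key secret/ci/cosign/dev > /tmp/cosign.pub followed by render_cluster_policy /tmp/cosign.pub | kubectl apply -f -, removing the manual edit from the runbook.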

10. Conclusion

The architecture described in this paper demonstrates that a fully verified, policy-enforced container deployment pipeline is achievable on self-hosted Kubernetes without dependency on public cloud services, managed secrets infrastructure, or the Sigstore public transparency log. The implementation required identifying and resolving a set of default-configuration assumptions embedded in Cosign, Kyverno, and the Vault agent pattern: assumptions that are appropriate for cloud-native environments but fail consistently in air-gapped deployments.

The most significant finding is not a technical one. The flags required to make Cosign and Kyverno operate correctly in an air-gapped environment are documented and stable. The friction is in locating those flags: the documentation surface area for air-gapped operation is sparse relative to the operational prevalence of self-hosted Kubernetes. This paper attempts to consolidate that information into a single reference.

As supply-chain security requirements tighten under emerging regulatory frameworks, the pattern described here (local key signing with cluster-level admission enforcement) will become the baseline expectation for air-gapped Kubernetes environments. Organizations operating self-hosted infrastructure should treat the implementation of this pattern as a near-term priority rather than a deferred hardening exercise.

All content represents personal learning from personal projects. Code examples are sanitized and generalized. No proprietary information is shared. Opinions are my own and do not reflect my employer’s views.
