CI/CD Pipeline

Building a Complete CI/CD Pipeline for Microservices on Kubernetes (2025)

Complete guide to building CI/CD pipelines for microservices on Kubernetes: learn pipeline stages, compare GitHub Actions vs GitLab CI vs Tekton, implement automated testing and security scanning, deploy with GitOps, use blue-green and canary strategies, and coordinate multi-service deployments safely.

Introduction to CI/CD for Kubernetes Microservices

Continuous Integration and Continuous Deployment (CI/CD) pipelines for microservices on Kubernetes present unique challenges that traditional monolith CI/CD doesn't face:

  • Coordinating deployments across dozens of independently versioned services
  • Managing service dependencies, where the frontend requires backend v2.1+ but the backend requires database v3.0+
  • Handling polyglot environments where different services use different languages, frameworks, and build tools
  • Implementing safe deployment strategies like blue-green or canary for each service independently
  • Ensuring zero-downtime deployments when updating services that others depend on under live production traffic

A well-designed microservices CI/CD pipeline automates the entire journey from code commit through production deployment while maintaining safety, observability, and rollback capabilities. That means building container images for each service, running comprehensive test suites, scanning images for security vulnerabilities, and pushing images to container registries. It also means updating Kubernetes manifests with new image tags, deploying to development and staging environments for validation, promoting to production with an appropriate deployment strategy, monitoring deployment health with automatic rollback on failures, and providing visibility into which versions of which services are running in which environments.

This comprehensive guide teaches you how to build production-grade CI/CD pipelines for microservices on Kubernetes from scratch, covering:

  • Pipeline architecture and stages (build, test, scan, deploy, monitor)
  • Choosing CI/CD tools (GitHub Actions, GitLab CI, Jenkins, Tekton) with a feature comparison
  • Building container images efficiently with multi-stage Dockerfiles and caching
  • Implementing comprehensive testing (unit tests, integration tests, contract tests for service boundaries)
  • Security scanning for vulnerabilities in code and container images
  • Managing container registries (Docker Hub, ECR, GCR, ACR, Harbor)
  • Implementing GitOps with ArgoCD or Flux for deployment automation
  • Deployment strategies for zero-downtime releases (rolling updates, blue-green, canary)
  • Monitoring deployments with health checks and automatic rollback
  • Handling environment promotion from dev through staging to production
  • Managing secrets securely throughout the pipeline
  • How Atmosly's Pipeline Builder provides visual pipeline creation, deployment validation, automated rollback on health degradation, and coordination across multiple microservices, ensuring safe progressive rollout of changes across your entire service mesh

By implementing the patterns and practices in this guide, you'll build reliable, fast, secure CI/CD pipelines that enable your team to deploy microservices to Kubernetes confidently with minimal manual intervention and maximum safety.

CI/CD Pipeline Stages for Microservices

Stage 1: Source and Trigger

Trigger Events:

  • Push to main branch: Automatic deployment to development
  • Pull request created: Deploy to preview environment for testing
  • Tag created (v1.2.3): Deploy to staging, then production
  • Manual trigger: Operator-initiated deployment for hotfixes

Example GitHub Actions trigger:

name: CI/CD Pipeline

on:
  push:
    branches: [main]
    tags:
      - 'v*'
  pull_request:
    branches: [main]
  workflow_dispatch: {}   # manual, operator-initiated runs (e.g., hotfixes)

Stage 2: Build Container Images

Multi-Stage Dockerfile Best Practices:

# Multi-stage build for smaller images
# Stage 1: Build
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci                  # install all deps (dev deps are needed for the build)
COPY . .
RUN npm run build

# Stage 2: Runtime (smaller final image)
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev       # production dependencies only
COPY --from=builder /app/dist ./dist
# Don't run as root
USER node
EXPOSE 8080
CMD ["node", "dist/index.js"]

Image Tagging Strategy:

  • Git commit SHA: user-service:a3f5b2c (immutable, traceable)
  • Semantic version: user-service:v1.2.3 (production releases)
  • Branch name: user-service:main-a3f5b2c (development)
  • Avoid "latest": Non-deterministic, breaks reproducibility

Stage 3: Automated Testing

Test Pyramid for Microservices:

  • Unit Tests (70%): Test individual functions, fast (milliseconds), run on every commit
  • Integration Tests (20%): Test service with database, message queue. Run in Docker Compose or Kubernetes test environment
  • Contract Tests (5%): Test API contracts between services (PACT framework). Ensures consumer expectations match provider API
  • End-to-End Tests (5%): Test complete user workflows across all services. Slow (minutes), brittle, run on staging before production

Example test stage:

- name: Run Unit Tests
  run: npm test

- name: Run Integration Tests
  run: |
    docker-compose up -d postgres redis
    npm run test:integration
    docker-compose down

- name: Contract Tests
  run: npm run test:contract

Stage 4: Security Scanning

What to Scan:

  • Code vulnerabilities: Snyk, SonarQube, CodeQL
  • Dependencies: npm audit, pip-audit, govulncheck for Go modules
  • Container images: Trivy, Snyk Container, Aqua
  • Kubernetes YAML: Kubeval, kubectl dry-run, policy enforcement
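
Manifest validation can be a simple pipeline step. A minimal sketch (assumes manifests live under a k8s/ directory and, for the dry run, that the job has cluster credentials):

- name: Validate Kubernetes Manifests
  run: |
    # Offline schema validation
    kubeval k8s/*.yaml
    # With cluster access, a server-side dry run also catches admission errors
    kubectl apply --dry-run=server -f k8s/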

Example Trivy scan:

- name: Scan Image for Vulnerabilities
  run: |
    trivy image --severity HIGH,CRITICAL \
      --exit-code 1 \
      myregistry.com/user-service:${{ github.sha }}
    # Fails pipeline if HIGH or CRITICAL vulnerabilities found

Stage 5: Push to Container Registry

- name: Push to Registry
  run: |
    docker tag user-service:${{ github.sha }} \
      myregistry.com/user-service:${{ github.sha }}

    docker tag user-service:${{ github.sha }} \
      myregistry.com/user-service:v1.2.3
    
    docker push myregistry.com/user-service:${{ github.sha }}
    docker push myregistry.com/user-service:v1.2.3

Stage 6: Update Kubernetes Manifests

Option A: Update manifest in Git (GitOps):

- name: Update Image Tag in Git
  run: |
    git clone https://github.com/myorg/k8s-manifests
    cd k8s-manifests
    
    # Update image tag using kustomize
    cd apps/user-service/overlays/production
    kustomize edit set image user-service=myregistry.com/user-service:v1.2.3
    
    git commit -am "Update user-service to v1.2.3"
    git push
    
    # ArgoCD detects change and deploys automatically

Option B: Direct kubectl apply:

- name: Deploy to Kubernetes
  run: |
    kubectl set image deployment/user-service \
      user-service=myregistry.com/user-service:v1.2.3 \
      -n production
    
    kubectl rollout status deployment/user-service -n production

CI/CD Tool Comparison for Kubernetes

GitHub Actions (Cloud-Based, GitHub-Native)

Pros:

  • ✅ Free for public repos, 2,000 minutes/month for private
  • ✅ Native GitHub integration (pull requests, issues, releases)
  • ✅ Huge marketplace of pre-built actions
  • ✅ Matrix builds for testing multiple versions
  • ✅ Secrets management built-in

Cons:

  • ❌ Limited free minutes for heavy builds
  • ❌ Vendor lock-in to GitHub
  • ❌ Orchestration is cloud-only (self-hosted runners are possible, but the control plane stays on GitHub)

Best for: GitHub users, small-medium teams, cloud-native workflows

GitLab CI (Integrated CI/CD)

Pros:

  • ✅ Integrated with GitLab (SCM + CI/CD + registry in one)
  • ✅ Can self-host (on-prem or cloud)
  • ✅ Auto DevOps (automatic pipeline generation)
  • ✅ Built-in container registry
  • ✅ Generous free tier

Cons:

  • ❌ Smaller marketplace than GitHub Actions
  • ❌ Self-hosted requires maintenance

Best for: GitLab users, teams wanting all-in-one platform

Jenkins (Self-Hosted, Flexible)

Pros:

  • ✅ Completely free and open source
  • ✅ Massive plugin ecosystem (1,800+ plugins)
  • ✅ Highly customizable
  • ✅ Self-hosted (full control)

Cons:

  • ❌ Requires infrastructure and maintenance
  • ❌ Complex to configure initially
  • ❌ Security vulnerabilities if not kept updated

Best for: Enterprises with existing Jenkins, need for extreme customization

Tekton (Kubernetes-Native)

Pros:

  • ✅ Runs natively on Kubernetes (no external CI server)
  • ✅ Cloud-native (CNCF project)
  • ✅ Reusable pipeline components (Tasks, Pipelines)
  • ✅ No external dependencies

Cons:

  • ❌ Steeper learning curve
  • ❌ Less mature ecosystem vs GitHub Actions/GitLab
  • ❌ Requires Kubernetes cluster for CI

Best for: Teams fully committed to Kubernetes-native tooling

Complete CI/CD Pipeline Example: GitHub Actions

name: Microservice CI/CD

on:
  push:
    branches: [main]
    tags: ['v*']

env:
  REGISTRY: myregistry.com
  IMAGE_NAME: user-service

jobs:
  build-test-deploy:
    runs-on: ubuntu-latest
    
    steps:
    - name: Checkout Code
      uses: actions/checkout@v4
    
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3
    
    - name: Login to Registry
      uses: docker/login-action@v3
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ secrets.REGISTRY_USERNAME }}
        password: ${{ secrets.REGISTRY_PASSWORD }}
    
    - name: Extract Metadata
      id: meta
      uses: docker/metadata-action@v5
      with:
        images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
        tags: |
          type=ref,event=branch
          type=ref,event=pr
          type=semver,pattern={{version}}
          type=sha,format=long,prefix=
    
    - name: Build and Push Image
      uses: docker/build-push-action@v5
      with:
        context: .
        push: true
        tags: ${{ steps.meta.outputs.tags }}
        cache-from: type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:buildcache
        cache-to: type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:buildcache,mode=max
    
    - name: Run Unit Tests
      run: |
        # Run tests from source; the production image omits dev dependencies
        npm ci
        npm test
    
    - name: Scan Image for Vulnerabilities
      uses: aquasecurity/trivy-action@master
      with:
        image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
        format: 'sarif'
        output: 'trivy-results.sarif'
        severity: 'CRITICAL,HIGH'
        exit-code: '1'  # Fail on vulnerabilities
    
    - name: Deploy to Development
      if: github.ref == 'refs/heads/main'
      run: |
        kubectl set image deployment/user-service \
          user-service=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
          -n development
    
    - name: Deploy to Production (Tags Only)
      if: startsWith(github.ref, 'refs/tags/v')
      run: |
        # Update image tag in GitOps repo
        git clone https://github.com/myorg/k8s-manifests
        cd k8s-manifests/apps/user-service/overlays/production
        kustomize edit set image user-service=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.ref_name }}
        git commit -am "Deploy user-service ${{ github.ref_name }}"
        git push
        # ArgoCD will detect and deploy

Deployment Strategies for Microservices

Rolling Update (Default Kubernetes)

Gradually replaces old pods with new ones:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Create 1 extra pod during update
      maxUnavailable: 0  # Keep all replicas available (zero downtime)
  replicas: 5

Process with maxUnavailable: 0:

  1. Create 1 new pod (6 total running)
  2. Wait for new pod to pass readiness probe
  3. Terminate 1 old pod (5 running, 1 new + 4 old)
  4. Create another new pod (6 running, 2 new + 4 old)
  5. Repeat until all 5 pods are new version

Pros: Built-in, zero downtime, simple

Cons: Both versions serve traffic during the rollout (old and new run side by side), and a mid-rollout rollback is itself another rolling update, so it is slower than an instant switch
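
If a rollout starts misbehaving, standard kubectl commands can pause or revert it (a minimal sketch; deployment and namespace names are illustrative):

# Pause the rollout while you investigate
kubectl rollout pause deployment/user-service -n production

# Revert to the previous ReplicaSet (performed as another rolling update)
kubectl rollout undo deployment/user-service -n production

# Watch progress
kubectl rollout status deployment/user-service -n production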

Blue-Green Deployment

Run both versions, switch traffic atomically:

# Deploy green (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service-green
spec:
  replicas: 5
  selector:
    matchLabels:
      app: user-service
      version: green
  template:
    metadata:
      labels:
        app: user-service
        version: green
    spec:
      containers:
      - name: app
        image: user-service:v2.0.0

# Blue (current) still running
# Service currently points to blue

# Test green thoroughly
curl http://user-service-green:8080/health

# Switch Service to green
kubectl patch service user-service -p \
  '{"spec":{"selector":{"version":"green"}}}'

# Instant traffic switch
# Monitor for issues

# Rollback if needed (switch back to blue)
kubectl patch service user-service -p \
  '{"spec":{"selector":{"version":"blue"}}}'

# Delete blue after green proven stable
kubectl delete deployment user-service-blue

Pros: Instant rollback, full testing before switch, zero downtime

Cons: 2x resources during deployment (both versions running)
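
For completeness, a minimal sketch of the Service that the patch commands above assume, with its selector initially pointing at blue:

apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service
    version: blue   # patch to "green" to cut traffic over
  ports:
  - port: 8080
    targetPort: 8080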

Canary Deployment with Istio

Gradually shift traffic percentage from old to new:

# Deploy canary version
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service-canary
spec:
  replicas: 1  # Small canary
  selector:
    matchLabels:
      app: user-service
      version: canary
  template:
    metadata:
      labels:
        app: user-service
        version: canary
    spec:
      containers:
      - name: app
        image: user-service:v2.0.0

# Stable version (5 replicas)
# Configure Istio VirtualService for traffic splitting

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: user-service
spec:
  hosts:
  - user-service
  http:
  - match:
    - headers:
        canary:
          exact: "true"
    route:
    - destination:
        host: user-service
        subset: canary
  - route:
    - destination:
        host: user-service
        subset: stable
      weight: 90  # 90% to stable
    - destination:
        host: user-service
        subset: canary  
      weight: 10  # 10% canary traffic

# Monitor canary metrics (error rate, latency)
# If healthy, increase to 50/50, then 100% canary
# If unhealthy, rollback (set weight to 0)

Pros: Low risk (only 10% users affected), gradual validation, automatic rollback capability

Cons: Requires service mesh (Istio, Linkerd) or ingress controller with traffic splitting
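
The VirtualService above routes to named subsets, which Istio requires to be defined in a DestinationRule. A minimal sketch, assuming the stable and canary pods carry version: stable and version: canary labels:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: user-service
spec:
  host: user-service
  subsets:
  - name: stable
    labels:
      version: stable
  - name: canary
    labels:
      version: canary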

Managing Secrets in CI/CD Pipelines

CI/CD Platform Secrets

GitHub Actions Secrets:

# Set secrets in GitHub repo settings
# Access in workflow:

- name: Login to Docker Registry
  env:
    REGISTRY_PASSWORD: ${{ secrets.REGISTRY_PASSWORD }}
  run: echo "$REGISTRY_PASSWORD" | docker login -u myuser --password-stdin

GitLab CI Variables:

# Set in GitLab project settings (masked, protected)
# Access in .gitlab-ci.yml:

deploy:
  script:
    - echo $REGISTRY_PASSWORD | docker login -u $REGISTRY_USER --password-stdin

Kubernetes Secrets in Manifests

Never commit plaintext Secrets to Git!

Use Sealed Secrets or the External Secrets Operator (covered in the GitOps guide above).
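
As an illustration of the Sealed Secrets flow (a minimal sketch; assumes the Sealed Secrets controller is installed in the cluster and kubeseal is available locally):

# Create the Secret locally (never committed)
kubectl create secret generic db-credentials \
  --from-literal=password='S3cret!' \
  --dry-run=client -o yaml > secret.yaml

# Encrypt with the cluster's public key; the SealedSecret is safe to commit
kubeseal --format yaml < secret.yaml > sealed-secret.yaml

# Commit sealed-secret.yaml to Git; the controller decrypts it in-cluster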

Multi-Service Deployment Coordination

Challenge: Service Dependencies

Microservices have dependencies:

  • Frontend depends on Backend v2.1+
  • Backend depends on Database v3.0+
  • All services depend on Message Queue

Deploying in wrong order breaks application.

Solution: Deployment Ordering

Using ArgoCD Sync Waves:

# Deploy database first (wave 0)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  annotations:
    argocd.argoproj.io/sync-wave: "0"

# Then backend (wave 1)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  annotations:
    argocd.argoproj.io/sync-wave: "1"

# Finally frontend (wave 2)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  annotations:
    argocd.argoproj.io/sync-wave: "2"

# ArgoCD deploys in order: database → backend → frontend

Using Helm Dependencies

# Chart.yaml
dependencies:
- name: postgresql
  version: 12.0.0
  repository: https://charts.bitnami.com/bitnami

# Helm installs the postgresql subchart as part of the same release; add readiness checks or hooks if the application must wait for the database
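
Declared dependencies are fetched before install with standard Helm commands (chart path and release name are illustrative):

helm dependency update ./charts/my-service
helm upgrade --install my-service ./charts/my-service -n my-service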

Environment Promotion Strategy

Development → Staging → Production Pipeline

Git Branch Strategy:

  • main branch → Auto-deploy to development
  • release/* branches → Deploy to staging
  • Git tags (v1.2.3) → Deploy to production

Promotion workflow:

  1. Developer commits to main → Auto-deployed to dev
  2. Test in dev, create release/v1.2.3 branch → Deployed to staging
  3. QA validates staging, create tag v1.2.3 → Deployed to production
  4. Monitor production, merge release branch back to main

Image Promotion (Safer)

Build image once, promote same image through environments:

# Build once with commit SHA
docker build -t myregistry.com/app:abc123 .
docker push myregistry.com/app:abc123

# Development: Tag as dev
docker tag myregistry.com/app:abc123 myregistry.com/app:dev-abc123
docker push myregistry.com/app:dev-abc123

# After testing, promote to staging (retag same image)
docker tag myregistry.com/app:abc123 myregistry.com/app:staging-abc123
docker push myregistry.com/app:staging-abc123

# After QA, promote to production
docker tag myregistry.com/app:abc123 myregistry.com/app:v1.2.3
docker push myregistry.com/app:v1.2.3

# Same artifact through all environments (eliminates "works in dev not prod" issues)

How Atmosly Enhances CI/CD for Microservices (Pipeline Builder)

While the previous sections covered standard CI/CD practices, Atmosly's Pipeline Builder adds intelligence and automation on top of your chosen CI/CD tool (GitHub Actions, GitLab CI, etc.).

Visual Pipeline Creation

Instead of writing YAML by hand, Atmosly provides:

  • Drag-and-drop pipeline builder for creating multi-stage workflows visually
  • Automatic generation of GitHub Actions YAML, GitLab CI config, or ArgoCD Application manifests
  • Templates for common patterns (build → test → scan → deploy)
  • No YAML syntax errors—UI ensures valid configuration

Deployment Validation and Automated Rollback

Atmosly monitors deployments and automatically rolls back on failures:

  • Tracks deployment progress (how many replicas updated, how many ready)
  • Monitors error rates and latency during rollout
  • If error rate spikes >5% or latency p95 >2x baseline during deployment, triggers automatic rollback
  • Reverts to the previous image tag automatically
  • Alerts the team with deployment failure RCA

Multi-Service Deployment Coordination

For microservices architectures with 10-50 services:

  • Orchestrates deployments respecting dependencies (database before application)
  • Parallel deployment of independent services (frontend and backend can deploy simultaneously)
  • Wait conditions (backend deployment waits for database to be healthy)
  • Rollback coordination (if backend fails, rollback frontend too)

Monitoring and Observability in CI/CD

Deployment Metrics to Track

  • Deployment Frequency: How often you deploy (daily, weekly)
  • Lead Time: Commit to production time
  • Change Failure Rate: % of deployments causing incidents
  • MTTR: Time to recover from failed deployment

Post-Deployment Validation

# After deployment, run smoke tests
- name: Smoke Tests
  run: |
    # Wait for deployment ready
    kubectl wait --for=condition=available \
      deployment/user-service -n production --timeout=300s
    
    # Test health endpoint
    SERVICE_URL=$(kubectl get service user-service -n production \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
    
    curl -f http://$SERVICE_URL/health || exit 1
    
    # Test critical API endpoint
    curl -f http://$SERVICE_URL/api/users/1 || exit 1

Best Practices for Microservices CI/CD

1. Build Images in Parallel

If monorepo with multiple services, build them concurrently:

jobs:
  changes:
    # Detect which services changed
    outputs:
      frontend: ${{ steps.changes.outputs.frontend }}
      backend: ${{ steps.changes.outputs.backend }}
  
  build-frontend:
    needs: changes
    if: needs.changes.outputs.frontend == 'true'
    # Build frontend only if changed
  
  build-backend:
    needs: changes
    if: needs.changes.outputs.backend == 'true'
    # Build backend only if changed (in parallel with frontend)
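
The changes job above is abbreviated; one common way to populate its outputs is a path-filter step. A sketch using the dorny/paths-filter action (the directory layout is an assumption):

  changes:
    runs-on: ubuntu-latest
    outputs:
      frontend: ${{ steps.changes.outputs.frontend }}
      backend: ${{ steps.changes.outputs.backend }}
    steps:
    - uses: actions/checkout@v4
    - id: changes
      uses: dorny/paths-filter@v3
      with:
        filters: |
          frontend:
            - 'frontend/**'
          backend:
            - 'backend/**'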

2. Use Build Caching

# Docker layer caching
docker build --cache-from myregistry.com/app:buildcache .

# Or BuildKit cache
docker buildx build --cache-from=type=registry,ref=myregistry.com/app:cache .

3. Fail Fast

Order pipeline stages to fail quickly:

  1. Lint (seconds) → Fail fast on syntax errors
  2. Unit tests (seconds-minutes) → Fail before slow integration tests
  3. Build image (minutes)
  4. Integration tests (minutes)
  5. Security scan (minutes)
  6. Deploy (minutes)
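
Expressed as GitHub Actions job dependencies, that ordering looks roughly like this (job names and commands are illustrative):

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - run: npm run lint        # seconds: fail fast on syntax/style errors

  unit-tests:
    needs: lint                # runs only if lint passed
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - run: npm test

  build-image:
    needs: unit-tests          # slow stages run only after fast ones pass
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - run: docker build -t app:${{ github.sha }} .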

4. Implement Contract Testing

For microservices, test API contracts:

# Consumer defines expectations (contract)
# Provider validates it meets contract
# Prevents breaking changes

# Using PACT framework:
npm install @pact-foundation/pact

# Consumer test (frontend)
await provider
  .addInteraction({
    state: "user exists",
    uponReceiving: "get user request",
    withRequest: {
      method: "GET",
      path: "/api/users/1"
    },
    willRespondWith: {
      status: 200,
      body: {id: 1, name: "John"}
    }
  })

# Provider test (backend)  
# Verifies it implements contract

5. Tag Images with Git SHA

Always tag with commit SHA for traceability:

docker build -t myregistry.com/app:${GIT_SHA} .
docker build -t myregistry.com/app:v1.2.3 .  # Also semantic version

# In deployment, know exactly which commit is running
kubectl get deployment user-service -o jsonpath='{.spec.template.spec.containers[0].image}'
# Output: myregistry.com/user-service:a3f5b2c

# Can trace back to Git commit
git show a3f5b2c

Conclusion: Production-Ready CI/CD for Kubernetes Microservices

Building effective CI/CD pipelines for microservices on Kubernetes requires careful orchestration across build, test, security, deployment, and monitoring stages. Key success factors include choosing appropriate CI/CD tools (GitHub Actions, GitLab CI, Tekton), implementing comprehensive testing including contract tests for service boundaries, security scanning at multiple stages, using GitOps for deployment automation and audit trails, selecting appropriate deployment strategies per service criticality (rolling for low-risk, blue-green for critical), coordinating deployments across service dependencies, and monitoring deployment health with automatic rollback.

Essential Implementation Checklist:

  • Automated builds are triggered on every commit
  • Comprehensive test suite (unit, integration, contract)
  • Security scanning and blocking deployments with critical vulnerabilities
  • Image tagging strategy using Git SHA for traceability
  • GitOps with ArgoCD or Flux for declarative deployments
  • Environment promotion (dev → staging → production)
  • Zero-downtime deployment strategies
  • Post-deployment validation and smoke tests
  • Monitoring and automated rollback on failures
  • Secrets management (never commit to Git)

For teams managing complex microservices architectures, Atmosly's Pipeline Builder adds visual pipeline creation, deployment coordination across multiple services, automated health-based rollback, and integration with GitOps tools simplifying pipeline setup while maintaining best practices and safety.

Ready to build production-grade CI/CD pipelines for your microservices? Start with Atmosly's Pipeline Builder for visual pipeline creation, then enhance with automated deployment validation, rollback, and multi-service coordination.

Frequently Asked Questions

What are the essential stages in a Kubernetes CI/CD pipeline for microservices?
  1. Source

    Pipeline triggers on Git events: pushes to main or release branches, pull requests, or version tags (e.g., refs/heads/main, v1.2.3).

  2. Build

    Build container images using multi-stage Dockerfiles to produce small final images. Tag images with the Git commit SHA for traceability and semantic version tags for releases. Use layer caching to speed repeated builds.

    # example image tags
    my-registry/my-service:$(git rev-parse --short HEAD)
    my-registry/my-service:1.4.2
    
  3. Test

    Run a layered test strategy:

    • Unit tests (fast, most tests — run in parallel).
    • Integration tests that spin up dependent services (DB, queues) or use testcontainers.
    • Contract tests (e.g., PACT) to validate API compatibility between services.
    • Optionally run end-to-end (E2E) tests against a staging environment.
  4. Security Scan

    Shift-left security checks into CI:

    • Static code analysis and SAST (Snyk, SonarQube).
    • Dependency scans (npm audit, pip-audit).
    • Container image CVE scanning (Trivy, Clair) — fail the build on CRITICAL/HIGH CVEs per policy.
    • Validate Kubernetes manifests (kubeval, kube-score, conftest).
    # example: fail on critical CVEs with trivy
    trivy image --severity CRITICAL --exit-code 1 my-registry/my-service:sha
    
  5. Push

    Push signed images to a secure container registry (ECR, GCR, ACR, Harbor). Use immutable tags and optionally image signing (cosign).

    docker push my-registry/my-service:$(git rev-parse --short HEAD)
  6. Deploy

    Deploy images to Kubernetes by updating manifests/Helm charts with the new image tag. Prefer GitOps (Argo CD / Flux) to detect the change and reconcile, or use controlled kubectl/helm apply pipelines for direct deploys.

    # example (Helm)
    helm upgrade --install my-service ./charts/my-service --set image.tag=sha1234
    
  7. Validate

    Run post-deployment checks and automated validation:

    • Smoke tests and API health checks.
    • Check readiness and liveness probes.
    • Monitor key metrics (error rate, latency, traffic) and wait for stable conditions before promoting.
  8. Promote

    If validation passes, promote the release to the next environment (dev → staging → production) using automated promotion strategies or manual approvals for production. Use blue/green or canary deployments for safe rollouts.

    Typical total pipeline time for a microservice: ~5–15 minutes (depends on tests & security scans).

What is the difference between blue-green and canary deployments?
  1. Blue–Green

    Run two complete versions simultaneously: blue = current, green = new. Test green thoroughly while blue serves production, then perform an atomic traffic switch to green. If issues appear, rollback instantly to blue. After green is stable, delete blue.

    Pros: Instant rollback, full testing before switch, zero downtime.

    Cons: Requires ~2× resources during deployment, and switching all traffic at once can be risky if tests missed an issue.

  2. Canary

    Gradually shift traffic to the new version (e.g., 10% → 25% → 50% → 100%), monitoring metrics (error rate, latency) at each step. Roll back if the canary shows problems; continue rollout if metrics remain healthy.

    Pros: Lower blast radius (only a small percentage of users initially affected), validation with real traffic, automatic rollback on metric degradation.

    Cons: Requires traffic-splitting capability (service mesh or advanced ingress) and both versions run longer.

  3. When to choose

    Use Blue–Green for critical services that need instant rollback capability. Use Canary for risk-averse teams that prefer gradual rollout with metric validation.

  4. Note on Rolling Updates

    Kubernetes rolling updates (the default) replace pods gradually but offer less traffic control and rollback safety compared with Canary or Blue–Green strategies. They are simpler to use but may be less safe for high-risk changes.

How do I implement GitOps for Kubernetes CI/CD?
  1. Setup Git repo:

    Use a clear repo structure, e.g. apps/SERVICE/overlays/ENVIRONMENT. Commit all Kubernetes manifests (Deployment, Service, ConfigMap, etc.). Use Kustomize overlays for environment differences (example: dev = 1 replica, prod = 10 replicas).

    apps/
      my-service/
        base/
          deployment.yaml
          service.yaml
        overlays/
          dev/
            kustomization.yaml  # replicas: 1
          prod/
            kustomization.yaml  # replicas: 10
    
  2. Choose a GitOps tool:

    Install either Argo CD (rich UI, easy onboarding) or Flux (automation-first, git-native CLI). Both reconcile cluster state to Git.

  3. Configure applications:

    Create an ArgoCD Application or Flux Kustomization that points to the repository path for each environment. Enable automatic sync and self-heal so cluster drift is reverted to Git state.

    # ArgoCD example (conceptual)
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    spec:
      source:
        repoURL: 'git@github.com:org/repo.git'
        path: 'apps/my-service/overlays/prod'
      destination:
        server: 'https://kubernetes.default.svc'
        namespace: 'my-service'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
    
  4. CI pipeline:

    CI builds images, runs tests & security scans, pushes images to registry, then updates image tags in Git manifests and commits the change. This keeps Git as the single source-of-truth for deployments.

    # update image tag in kustomize (example)
    kustomize edit set image my-service=gcr.io/myproj/my-service:sha1234
    git add .
    git commit -m "release: my-service sha1234"
    git push origin main
    
  5. GitOps deploys:

    ArgoCD/Flux detects the Git commit and automatically applies changes to the cluster. They continuously monitor sync status and reconcile differences.

  6. Secrets management:

    Never store plaintext secrets in Git. Use encrypted/sealed secrets or external secret controllers:

    • Sealed Secrets — encrypt secrets for Git and decrypt in-cluster.
    • ExternalSecrets — reference Vault, AWS Secrets Manager, or other secret stores at runtime.
  7. Benefits:
    • Audit trail via Git history (who changed what and when).
    • Easy rollback with git revert.
    • Declarative cluster state and drift detection (self-heal).
    • Separation of concerns: CI handles build/tests/push; GitOps handles deployment and reconciliation.
  8. Operational tips:
    • Protect production branches and require PR reviews for manifest changes.
    • Use immutable image tags (commit SHA) rather than :latest.
    • Enable health checks and automated promotion gates (smoke tests, metric checks) before promoting.
    • Monitor GitOps tool health and reconciliation status to detect failures early.
What tools should I use for CI/CD on Kubernetes?
  1. GitHub Actions

    Best for GitHub users. Native GitHub integration (PRs, issues), large marketplace of pre-built actions, matrix builds. Free tier: ~2,000 free minutes/month for private repos. The orchestration itself is cloud-hosted; you can add self-hosted runners for build capacity, but the control plane stays on GitHub.

  2. GitLab CI

    Best for GitLab users or teams wanting an all-in-one platform (SCM + CI/CD + registry). Can be self-hosted (on-prem or cloud), includes Auto DevOps (auto-generated pipelines) and integrated container registry. Generous free tier and built-in features for full lifecycle.

  3. Jenkins

    Best for enterprises needing extreme customization or with existing Jenkins investments. Open-source, highly extensible (~1,800+ plugins), fully self-hosted (requires infra). Strong flexibility but higher operational overhead (setup, security, plugin maintenance).

  4. Tekton

    Best for Kubernetes-native CI/CD. Runs entirely on Kubernetes (no external CI server), CNCF project with reusable pipeline components (Tasks and Pipelines). Ideal when you want CI to run inside the cluster, but requires a cluster for CI and more K8s expertise to operate.

  5. Recommendation & Deployment Pattern
    • Choose GitHub Actions if you use GitHub—simplest and most integrated experience.
    • Choose GitLab CI if you use GitLab or want a single product for SCM, CI, and registry.
    • Choose Jenkins if you require maximum control, customization, or have an existing Jenkins ecosystem.
    • Choose Tekton if you want a fully Kubernetes-native CI system and are comfortable running CI inside K8s.

    Deployment tip: Regardless of CI tool selected, use GitOps (Argo CD or Flux) for deployments: CI builds/tests/pushes artifacts; GitOps reconciles declarative manifests to clusters (clear separation of concerns, better auditability and reliability).