Overview

v1.0

Homelab Infrastructure

A production-ready homelab and DevOps platform running on DigitalOcean Kubernetes with GitOps workflows and automated CI/CD pipelines.

🎯 5 Kubernetes Nodes

3x compute + 2x storage optimized nodes in Singapore

🔐 SSL Everywhere

Automatic Let's Encrypt certificates for all services

🔄 GitOps Managed

ArgoCD syncs all changes from Gitea automatically

🚀 CI/CD Pipeline

GitHub Actions → GHCR → Auto-deploy via ArgoCD

📊 9 Services

DevOps tools + homelab applications

☁️ Cloudflare DNS

Proxied DNS with DDoS protection

Platform Summary

| Component | Technology | Status |
| --- | --- | --- |
| Cloud Provider | DigitalOcean (SGP1) | Active |
| Orchestrator | Kubernetes (DOKS) | Running |
| Ingress | NGINX Ingress Controller | Running |
| SSL | Cert-Manager + Let's Encrypt | Active |
| GitOps | ArgoCD | Synced |
| Source Control | Gitea (self-hosted) | Running |
| CI/CD | GitHub Actions + GHCR | Active |
| DNS | Cloudflare | Active |

Architecture

The platform follows a cloud-native architecture with GitOps at its core.

System Overview

graph TB
    CF[Cloudflare DNS] --> LB[Load Balancer]
    LB --> NGINX[NGINX Ingress]
    NGINX --> ARGO[ArgoCD]
    NGINX --> GITEA[Gitea]
    NGINX --> GRAFANA[Grafana]
    NGINX --> KUMA[Uptime Kuma]
    NGINX --> JELLYFIN[Jellyfin]
    NGINX --> NC[Nextcloud]
    NGINX --> VW[Vaultwarden]
    NGINX --> PH[Pi-hole]
    NGINX --> HA[Home Assistant]
    ARGO -. watches .-> GITEA
    GITEA -. stores .-> DOCS[Git Manifests]
    ARGO -. applies .-> K8s
    GITEA -. persists .-> STOR[DO Storage]
    JELLYFIN -. persists .-> STOR
    NC -. persists .-> STOR
    VW -. persists .-> STOR
    PH -. persists .-> STOR
    HA -. persists .-> STOR

CI/CD Pipeline

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub
    participant GA as GitHub Actions
    participant GHCR as GHCR
    participant Git as Gitea
    participant AC as ArgoCD
    participant K8s as Kubernetes
    Dev->>GH: git push
    GH->>GA: Trigger workflow
    GA->>GA: Run tests
    GA->>GA: Build Docker image
    GA->>GHCR: Push image
    GHCR-->>GA: Image stored
    GA->>Git: Update manifest
    Git-->>GA: Push confirmed
    AC->>Git: Detect change
    AC->>K8s: Apply deployment
    K8s->>GHCR: Pull new image
    GHCR-->>K8s: Image delivered
    K8s->>K8s: Rolling update
    K8s-->>Dev: New version live

GitOps Flow

graph LR
    A[Developer] --> B[Push to Gitea]
    B --> C{ArgoCD detects}
    C -->|Sync| D[Kubernetes]
    D --> E{Health check}
    E -->|Healthy| F[Live]
    E -->|Drift| G[Auto-heal]
    G --> D
    F -. Kuma .-> H[Uptime Kuma]
    F -. metrics .-> I[Grafana]

Cost Breakdown

Monthly infrastructure costs for the homelab platform.

pie title Monthly Cost Distribution
    "Compute (3x s-2vcpu-4gb)" : 36
    "Storage Nodes (2x s-4vcpu-8gb)" : 48
    "Load Balancer" : 12
    "Block Storage (100GB)" : 10

| Resource | Specification | Monthly Cost |
| --- | --- | --- |
| Basic Node Pool | 3x s-2vcpu-4gb ($12 each) | $36 |
| Storage Node Pool | 2x s-4vcpu-8gb ($24 each) | $48 |
| Load Balancer | NGINX Ingress LB | $12 |
| Block Storage | 100GB SSD (shared) | $10 |
| Total | | ~$106/mo |

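
As a quick check on the table above, the total can be recomputed from the line items. This is only a sketch using the figures quoted in this section; the variable names are illustrative.

```shell
#!/bin/sh
# Monthly cost figures copied from the cost table (USD).
basic_pool=$((3 * 12))      # 3x s-2vcpu-4gb at $12 each
storage_pool=$((2 * 24))    # 2x s-4vcpu-8gb at $24 each
load_balancer=12
block_storage=10

total=$((basic_pool + storage_pool + load_balancer + block_storage))

# Share of the bill that goes to nodes (integer percent, rounded down).
node_share=$(( (basic_pool + storage_pool) * 100 / total ))

echo "Total: \$${total}/mo (nodes: ${node_share}%)"
# prints: Total: $106/mo (nodes: 79%)
```

Since the two node pools account for roughly four fifths of the bill, node count and size are the first place to look when trimming costs.
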
💡 Free Services

GitHub Container Registry (GHCR), GitHub Actions (2000 min/mo free), Let's Encrypt SSL, and Cloudflare DNS all run on free tiers.

Cost Optimization Tips

  • Use fewer/smaller nodes during low-usage periods
  • Enable DO's auto-scaling for dynamic workloads
  • Monitor actual storage usage — reduce PVC sizes if over-provisioned
  • Consider spot or preemptible instances for non-critical workloads if you move them to a provider that offers them (DigitalOcean currently does not)
  • Consolidate services that don't need separate deployments
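
For the storage tip, a rough check is to add up the per-service volume sizes listed under DigitalOcean Block Storage and compare them with the 100GB budgeted in the cost table. The sketch below treats Gi and GB as interchangeable, which is only an approximation.

```shell
#!/bin/sh
# Requested volume sizes in Gi, copied from the block storage table:
# Gitea, Grafana, Nextcloud, Jellyfin, Vaultwarden, Pi-hole,
# Home Assistant, Uptime Kuma.
requested=0
for size in 10 5 50 30 5 2 5 5; do
  requested=$((requested + size))
done

budgeted=100   # "Block Storage (100GB)" line of the cost table

echo "Requested: ${requested}Gi, budgeted: ${budgeted}GB"
if [ "$requested" -gt "$budgeted" ]; then
  echo "Over budget by $((requested - budgeted))Gi"
fi
```

With the sizes as listed, the requests come to 112Gi, so either the budget line or a few volume sizes (Nextcloud and Jellyfin are the largest) are worth revisiting.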

Cluster Setup

How the DOKS cluster is provisioned and configured.

Cluster Creation

# Create cluster with 2 node pools
doctl kubernetes cluster create homelab \
  --region sgp1 \
  --version latest \
  --node-pool "name=basic-pool;size=s-2vcpu-4gb;count=3" \
  --node-pool "name=storage-pool;size=s-4vcpu-8gb;count=2" \
  --auto-upgrade --wait

# Save kubeconfig
doctl kubernetes cluster kubeconfig save homelab

# Verify
kubectl get nodes

Node Topology

graph TB
    subgraph "basic-pool (3 nodes, s-2vcpu-4gb)"
        N1[Node 1 2 vCPU, 4GB RAM]
        N2[Node 2 2 vCPU, 4GB RAM]
        N3[Node 3 2 vCPU, 4GB RAM]
    end
    subgraph "storage-pool (2 nodes, s-4vcpu-8gb)"
        N4[Node 4 4 vCPU, 8GB RAM]
        N5[Node 5 4 vCPU, 8GB RAM]
    end
    LB[Load Balancer] --> N1
    LB --> N2
    LB --> N3
    LB --> N4
    LB --> N5

Node Assignments

| Node Pool | Workloads | Reason |
| --- | --- | --- |
| basic-pool (3x) | Core services, lightweight apps | Sufficient for most services |
| storage-pool (2x) | Jellyfin, Nextcloud, databases | More RAM + CPU for storage-heavy workloads |

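
One way to pin a workload to a pool is a nodeSelector on the pool label that DOKS attaches to its nodes. The sketch below writes such a patch to a scratch file; it assumes the label key `doks.digitalocean.com/node-pool` and the pool name from the cluster creation command, and is not taken from the actual manifests.

```shell
#!/bin/sh
# Write a strategic-merge patch that schedules a Deployment's pods
# onto the storage pool. The label key is an assumption about how
# DOKS labels its nodes; confirm with `kubectl get nodes --show-labels`.
workdir=$(mktemp -d)
patch_file="$workdir/storage-pool-patch.yaml"

cat > "$patch_file" <<'EOF'
spec:
  template:
    spec:
      nodeSelector:
        doks.digitalocean.com/node-pool: storage-pool
EOF

cat "$patch_file"
```

It could then be applied with something like `kubectl patch deployment nextcloud -n nextcloud --patch-file "$patch_file"`, where the deployment and namespace names are placeholders.
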
📍 Region: Singapore (SGP1)

Chosen for low latency for users in Southeast Asia. Alternative regions: NYC1 (US East), SFO3 (US West), FRA1 (Europe).

Infrastructure Components

Core infrastructure that powers the entire platform.

NGINX Ingress Controller

Routes external traffic to internal services based on hostname. Installed via Helm.

  • Creates a DigitalOcean Load Balancer automatically
  • Handles TLS termination with Let's Encrypt certificates
  • Routes traffic based on host headers (e.g., gitea.akze.net)
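
A host-based route of this kind might look like the Ingress below. It is a sketch only: the namespace, service name, port 3000, TLS secret name and the issuer name `letsencrypt-prod` are illustrative and not taken from the actual repository.

```shell
#!/bin/sh
# Write a sample Ingress for gitea.akze.net to a scratch directory.
# All names except the hostname and ingress class are hypothetical.
workdir=$(mktemp -d)
ingress_file="$workdir/gitea-ingress.yaml"

cat > "$ingress_file" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gitea
  namespace: gitea
  annotations:
    # Asks cert-manager to issue the certificate for the tls block below.
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - gitea.akze.net
      secretName: gitea-tls
  rules:
    - host: gitea.akze.net
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: gitea
                port:
                  number: 3000
EOF

echo "Wrote $ingress_file"
```

The file can be reviewed and then applied with `kubectl apply -f`, or committed to Gitea so ArgoCD picks it up.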

Cert-Manager + Let's Encrypt

Automatically provisions and renews SSL certificates for all services.

sequenceDiagram
    participant User as User Browser
    participant NGINX as NGINX Ingress
    participant CM as Cert-Manager
    participant LE as Let's Encrypt
    User->>NGINX: HTTPS request (no cert)
    NGINX->>CM: Certificate missing
    CM->>LE: Request certificate (HTTP-01)
    LE->>CM: Challenge token
    CM->>NGINX: Create challenge response
    LE->>NGINX: Verify challenge
    LE-->>CM: Issue certificate
    CM->>NGINX: Store as Kubernetes Secret
    NGINX-->>User: HTTPS with valid cert
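
The HTTP-01 flow above is typically driven by a ClusterIssuer. The sketch below writes one to a scratch file; the issuer name, account key secret name and contact email are placeholders, and the real configuration may differ.

```shell
#!/bin/sh
# Write a sample cert-manager ClusterIssuer that solves HTTP-01
# challenges through the NGINX ingress class.
# The name, email and secret name are hypothetical.
workdir=$(mktemp -d)
issuer_file="$workdir/cluster-issuer.yaml"

cat > "$issuer_file" <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # Let's Encrypt production ACME endpoint.
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF

echo "Wrote $issuer_file"
```

An Ingress opts in by carrying the `cert-manager.io/cluster-issuer` annotation with the issuer's name.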

DigitalOcean Block Storage

SSD-backed persistent volumes for stateful services. Automatically provisioned by the DO CSI driver.

| Service | Storage | Purpose |
| --- | --- | --- |
| Gitea | 10Gi | Git repositories |
| Grafana | 5Gi | Dashboards & data |
| Nextcloud | 50Gi | File storage |
| Jellyfin | 30Gi | Media + cache |
| Vaultwarden | 5Gi | Password database |
| Pi-hole | 2Gi | Block lists + logs |
| Home Assistant | 5Gi | Configuration |
| Uptime Kuma | 5Gi | Monitoring data |
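
Each row corresponds to a PersistentVolumeClaim that the CSI driver turns into a block storage volume. A minimal sketch for the Gitea row follows; the claim name and namespace are hypothetical, and `do-block-storage` is assumed to be the cluster's storage class (check with `kubectl get storageclass`).

```shell
#!/bin/sh
# Write a sample 10Gi PVC for Gitea to a scratch directory.
# Claim name and namespace are illustrative.
workdir=$(mktemp -d)
pvc_file="$workdir/gitea-pvc.yaml"

cat > "$pvc_file" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitea-data
  namespace: gitea
spec:
  accessModes:
    - ReadWriteOnce          # block volumes attach to one node at a time
  storageClassName: do-block-storage
  resources:
    requests:
      storage: 10Gi
EOF

echo "Wrote $pvc_file"
```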

DNS Configuration

All services are accessed via subdomains of akze.net, managed through Cloudflare.

DNS Records

| Subdomain | Type | Target | Proxy |
| --- | --- | --- | --- |
| *.akze.net (all services) | A | Load Balancer IP | ✅ Proxied |

🌐 All Subdomains

argocd, gitea, grafana, kuma, jellyfin, nextcloud, vault, pihole, home, app, docs — all point to the same Load Balancer IP

graph LR
    User[User Browser] --> CF[Cloudflare DNS + Proxy]
    CF --> LB[DO Load Balancer]
    LB --> NGINX[NGINX Ingress]
    NGINX --> Routes{Route by Host}
    Routes --> |argocd.akze.net| S1[ArgoCD]
    Routes --> |gitea.akze.net| S2[Gitea]
    Routes --> |grafana.akze.net| S3[Grafana]
    Routes --> |*.akze.net| S4[Other Services]

⚠️ Internal DNS Challenge

Cloudflare-proxied domains are not resolvable from within the Kubernetes cluster in this setup. ArgoCD's repo-server therefore uses hostAliases to map the service hostnames directly to the Load Balancer IP.
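
The hostAliases workaround can be expressed as a small patch against the repo-server Deployment. In this sketch the IP is a documentation placeholder (203.0.113.10) standing in for the real Load Balancer IP, and only the Gitea hostname is mapped.

```shell
#!/bin/sh
# Write a strategic-merge patch that adds a hosts-file entry to the
# ArgoCD repo-server pods. Replace the placeholder IP with the actual
# Load Balancer IP before use.
workdir=$(mktemp -d)
alias_patch="$workdir/repo-server-hostaliases.yaml"

cat > "$alias_patch" <<'EOF'
spec:
  template:
    spec:
      hostAliases:
        - ip: "203.0.113.10"
          hostnames:
            - gitea.akze.net
EOF

cat "$alias_patch"
```

Applying it would look like `kubectl patch deployment argocd-repo-server -n argocd --patch-file "$alias_patch"`. If ArgoCD is installed through Helm or manages itself, the same fields belong in the chart values or manifests instead, otherwise the patch may be reverted on the next sync.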

All Services

Every service running on the platform, organized by category.

🔧 ArgoCD

GitOps continuous deployment — syncs Gitea manifests to cluster

Healthy

📦 Gitea

Self-hosted Git server — stores all Kubernetes manifests

Healthy

📊 Grafana

Metrics dashboards — monitors cluster and service health

Healthy

💓 Uptime Kuma

Service monitoring — tracks availability and response times

Healthy

🎬 Jellyfin

Media streaming — self-hosted Netflix alternative

Healthy

☁️ Nextcloud

File sync & calendar — Google Drive replacement

Healthy

🔐 Vaultwarden

Password manager — Bitwarden-compatible server

Healthy

🛡️ Pi-hole

Network ad blocker — DNS-level ad filtering

Healthy

🏠 Home Assistant

Smart home hub — IoT device management and automation

Healthy

🚀 Sample App

CI/CD demo app — Go web app auto-deployed via pipeline

Healthy

DevOps Tools

Professional DevOps tools for development and operations workflows.

ArgoCD — GitOps Engine

Continuously monitors the Gitea repository and automatically applies any manifest changes to the Kubernetes cluster. Supports auto-sync, self-healing, and rollback.

graph LR
    A[Gitea Repository] -->|git push| B[ArgoCD Detects]
    B --> C{Diff Analysis}
    C -->|Changes found| D[Sync to Cluster]
    C -->|No changes| E[Wait 3 min]
    E --> B
    D --> F{Health Check}
    F -->|OK| G[OK Synced]
    F -->|Drift| H[🔄 Self-Heal]
    H --> D
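
Auto-sync and self-healing are switched on per Application. The sketch below writes an example Application for the Gitea deployment; the repository URL and path follow the examples elsewhere in this document, while the application name, target branch `main` and destination namespace are assumptions.

```shell
#!/bin/sh
# Write a sample ArgoCD Application with automated sync, pruning and
# self-healing enabled. Name, branch and namespace are hypothetical.
workdir=$(mktemp -d)
app_file="$workdir/gitea-application.yaml"

cat > "$app_file" <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: gitea
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitea.akze.net/user/homelab.git
    targetRevision: main
    path: apps/devops/gitea
  destination:
    server: https://kubernetes.default.svc
    namespace: gitea
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert manual changes made in the cluster
EOF

echo "Wrote $app_file"
```

With `selfHeal` enabled, a manual `kubectl edit` on a managed resource is reverted to the state in Git, which is the drift loop shown in the diagram.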

Gitea — Source Control

Lightweight, self-hosted Git server. Stores all Kubernetes manifests, Helm values, and deployment configurations. Acts as the single source of truth for the entire infrastructure.

Grafana — Observability

Dashboard platform for visualizing metrics. Connects to Prometheus data sources to display cluster resource usage, service health, and custom business metrics.

Uptime Kuma — Monitoring

Status page and uptime monitor. Checks every service periodically and sends an alert when one goes down.

Homelab Applications

Personal productivity and entertainment applications.

Nextcloud

Complete self-hosted productivity suite with file sync, calendar, contacts, and collaborative editing. Uses Apache backend with 50Gi persistent storage.

Jellyfin

Free software media system. Streams movies, music, and TV shows to any device. Optimized with separate config and cache volumes (30Gi total).

Vaultwarden

Lightweight Bitwarden-compatible password manager. Stores all passwords encrypted, supports 2FA, and syncs across all devices.

Pi-hole

Network-level ad and tracker blocking. Works as a DNS sinkhole, blocking ads for all devices on the network without installing browser extensions.

Home Assistant

Open-source home automation platform. Connects to smart home devices, creates automations, and provides a unified dashboard for all IoT devices.

GitOps Workflow

Infrastructure as Code with automatic synchronization.

flowchart TD
    A[Developer edits YAML manifest locally] --> B[git add && git commit]
    B --> C[git push to Gitea]
    C --> D{ArgoCD polls every 3 minutes}
    D -->|Changes detected| E[ArgoCD pulls from Gitea]
    D -->|No changes| D
    E --> F[ArgoCD compares desired vs actual]
    F -->|Drift detected| G[Apply manifests to cluster]
    F -->|In sync| D
    G --> H[Kubernetes rolls out changes]
    H --> I[Health checks pass]
    I --> J[OK Deployed]
    J --> K[Uptime Kuma verifies]
    K --> L[Grafana shows metrics]

Repository Structure

homelab/
├── infrastructure/          # Core infrastructure
│   ├── nginx-ingress/       #   NGINX Ingress Controller
│   ├── cert-manager/        #   SSL certificates
│   └── storage/             #   Block storage config
├── apps/                    # Applications
│   ├── devops/              #   DevOps services
│   │   ├── argocd/          #     ArgoCD configuration
│   │   ├── gitea/           #     Gitea deployment
│   │   ├── grafana/         #     Grafana deployment
│   │   └── uptime-kuma/     #     Uptime monitoring
│   └── homelab/             #   Homelab services
│       ├── jellyfin/        #     Media streaming
│       ├── nextcloud/       #     File sync
│       ├── vaultwarden/     #     Password manager
│       ├── pihole/          #     Ad blocker
│       └── home-assistant/  #     Smart home
└── scripts/                 # Deployment scripts
    ├── 01-infrastructure.ps1
    ├── 02-devops.ps1
    └── 03-homelab.ps1

Principles

  • Declarative: Everything defined in YAML manifests
  • Versioned: All changes tracked in Git history
  • Automatic: No manual kubectl apply needed
  • Self-Healing: ArgoCD auto-corrects drift
  • Auditable: Every change has a commit message and author

CI/CD Pipeline

Automated build, test, and deployment pipeline for application code.

flowchart LR
    subgraph "Source"
        A[Developer pushes code]
    end
    subgraph "CI - GitHub Actions"
        B[Run tests]
        C[Build Docker image]
        D[Push to GHCR]
    end
    subgraph "CD - Manifest Update"
        E[Update image tag]
        F[Push to Gitea]
    end
    subgraph "Deploy - ArgoCD"
        G[Detect change]
        H[Rolling update]
        I[Health check]
    end
    subgraph "Production"
        J[(Live Service)]
    end
    A --> B --> C --> D --> E --> F --> G --> H --> I --> J

GitHub Actions Workflow

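name: CI/CD Pipeline
on:
  push:
    branches: [main]

jobs:
  ci:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod      # Assumes go.mod sits at the repo root
      - run: go test -v ./...          # Run tests
      - uses: docker/login-action@v3   # Login to GHCR
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5  # Build & push
        with:
          context: .
          push: true
          tags: ghcr.io/user/app:${{ github.sha }}

  cd:
    needs: ci
    runs-on: ubuntu-latest
    env:
      SHA: ${{ github.sha }}           # Same tag the ci job pushed
    steps:
      - run: |
          # Private repo: clone and push need a Gitea access token
          # supplied from a GitHub Actions secret.
          git clone https://gitea.akze.net/user/homelab.git
          cd homelab
          # Update image tag in manifest
          sed -i "s|image: .*|image: ghcr.io/user/app:${SHA}|" ...
          # Commit identity below is a placeholder
          git -c user.name="ci-bot" -c user.email="ci-bot@example.com" \
            commit -am "Update image tag to ${SHA}"
          git push  # Push to Gitea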
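
The tag-update step can be tried locally against a throwaway manifest before it is trusted in the pipeline. The sketch below uses a made-up manifest and a stand-in SHA, and writes through a temporary file instead of `sed -i` so that it behaves the same with GNU and BSD sed.

```shell
#!/bin/sh
# Reproduce the image-tag substitution on a scratch manifest.
workdir=$(mktemp -d)
manifest="$workdir/deployment.yaml"

cat > "$manifest" <<'EOF'
spec:
  template:
    spec:
      containers:
        - name: app
          image: ghcr.io/user/app:oldsha
EOF

SHA=abc1234   # stand-in for the commit SHA provided by the workflow

# Same substitution as the workflow, without in-place editing.
sed "s|image: .*|image: ghcr.io/user/app:${SHA}|" "$manifest" > "$manifest.new"
mv "$manifest.new" "$manifest"

# Extract the resulting image reference for inspection.
new_image=$(grep 'image:' "$manifest" | sed 's/^ *image: //')
echo "$new_image"
# prints: ghcr.io/user/app:abc1234
```

Note that the pattern rewrites every `image:` line in the file, so a manifest with sidecars or init containers would need a narrower match, for example one anchored on the existing `ghcr.io/user/app:` prefix.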

Why GHCR Over DO Registry?

| Feature | GHCR | DO Registry |
| --- | --- | --- |
| Free Tier | Unlimited (public repos) | 1 repository |
| GitHub Integration | Native | Requires token |
| CI Authentication | Auto (GITHUB_TOKEN) | Manual setup |
| Image Pull in Kubernetes | PAT with read:packages | DO token |
| Cost | Free | Free (limited) |

✅ Zero Docker Required Locally

All Docker builds happen on GitHub's runners. Developers only push code — no Docker installation needed on their machine.

Security Architecture

How the platform handles authentication, secrets, and access control.

Secret Management

| Secret Type | Storage | Scope |
| --- | --- | --- |
| Kubernetes Secrets | etcd (encrypted at rest in DOKS) | Per-namespace |
| GitHub Actions Secrets | GitHub encrypted secrets | Per-repository |
| Gitea Access Tokens | GitHub Actions secrets + Kubernetes secrets | CI/CD pipeline |
| SSL Certificates | Kubernetes TLS Secrets (cert-manager managed) | Per-ingress |

Security Principles

🔒 No Secrets in Git

All tokens and passwords stored in Kubernetes secrets or CI/CD secrets — never in manifests

🛡️ Cloudflare Protection

All traffic proxied through Cloudflare — DDoS protection and WAF

🔐 TLS Everywhere

Every service has automatic Let's Encrypt SSL — no HTTP-only endpoints

👤 RBAC

Kubernetes RBAC controls access to cluster resources

🔑 Private Repos

Both Gitea and GitHub repos are private — no public code exposure

📋 Audit Trail

Every infrastructure change has a Git commit with author and timestamp

Network Security

graph LR
    Internet --> CF[Cloudflare Proxy DDoS + WAF]
    CF --> LB[DO Load Balancer TLS Termination]
    LB --> NGINX[NGINX Ingress SSL Redirect]
    NGINX --> Kubernetes[Kubernetes Services ClusterIP only]
    Kubernetes -->|No direct external access| Internal[Internal Network Only accessible via Ingress]

Troubleshooting

Common issues and their solutions.

Service Returns 404

Cause: Missing ingressClassName

Ingress resources must set ingressClassName: nginx, or the NGINX controller won't route them.

spec:
  ingressClassName: nginx  # Must be present
  tls: ...

ArgoCD Redirect Loop

Cause: Double HTTPS redirect

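argocd-server redirects plain HTTP to HTTPS on its own, while the ingress terminates TLS and forwards plain HTTP to it. Every forwarded request is redirected again, which produces an infinite loop.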

# Fix: Set insecure mode
kubectl patch configmap argocd-cmd-params-cm -n argocd \
  --type merge -p '{"data":{"server.insecure":"true"}}'
kubectl rollout restart deployment/argocd-server -n argocd

ImagePullBackOff

Cause: Can't pull from GHCR

Kubernetes needs an imagePullSecret with a GitHub PAT that has read:packages scope.

# Create pull secret
kubectl create secret docker-registry ghcr-pull-secret \
  -n <namespace> \
  --docker-server=ghcr.io \
  --docker-username=<github-user> \
  --docker-password=ghp_YOUR_TOKEN

# Reference in deployment
# spec.template.spec.imagePullSecrets: [{name: ghcr-pull-secret}]
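
In a full manifest, the pull secret reference sits next to the container list. The sketch below writes an example Deployment for the sample app; the deployment name, labels, replica count and image tag are illustrative.

```shell
#!/bin/sh
# Write a sample Deployment that pulls its image from GHCR using the
# ghcr-pull-secret created above. Names and tag are hypothetical.
workdir=$(mktemp -d)
deploy_file="$workdir/sample-app-deployment.yaml"

cat > "$deploy_file" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      imagePullSecrets:
        - name: ghcr-pull-secret   # must exist in the same namespace
      containers:
        - name: app
          image: ghcr.io/user/app:latest
EOF

echo "Wrote $deploy_file"
```

The secret is namespaced, so it has to be created in every namespace that runs images from GHCR.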

SSL Certificates Pending

Check: DNS propagation and HTTP-01 challenges

Let's Encrypt needs to verify domain ownership via HTTP-01 challenge. Ensure port 80 is accessible and DNS points to the correct IP.

kubectl get certificates -A
kubectl describe certificate <name> -n <namespace>
kubectl logs deploy/cert-manager -n cert-manager

Common kubectl Commands

# Check all pods
kubectl get pods --all-namespaces

# Describe a failing pod
kubectl describe pod <name> -n <namespace>

# View logs
kubectl logs <pod> -n <namespace> --tail=50

# Restart a deployment
kubectl rollout restart deployment/<name> -n <namespace>

# Check ArgoCD applications
kubectl get applications -n argocd

# Force ArgoCD refresh
kubectl patch application <name> -n argocd \
  --type merge -p '{"metadata":{"annotations":{"argocd.argoproj.io/refresh":"hard"}}}'