Skip to content

chaijunkin/home-ops

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

My Homelab Repository ❄️

... automated via Flux, Renovate and GitHub Actions πŸ€–

DiscordΒ Β  KubernetesΒ Β 

Status-PageΒ Β 

Age-DaysΒ Β  Uptime-DaysΒ Β  Node-CountΒ Β  Pod-CountΒ Β  CPU-UsageΒ Β  Memory-UsageΒ Β  Power-UsageΒ Β  Alerts


Overview

This is a monorepository is for my home k3s clusters. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Ansible, Terraform, Kubernetes, Flux, Renovate, and GitHub Actions.

The purpose here is to learn k8s, while practicing Gitops.


β›΅ Kubernetes

There is a template over at onedr0p/flux-cluster-template if you want to try and follow along with some of the practices I use here.

Installation

My cluster is k3s provisioned overtop bare-metal Debian using the Ansible galaxy role ansible-role-k3s. This is a semi-hyper-converged cluster, workloads and block storage are sharing the same available resources on my nodes while I have a separate NAS server with ZFS for NFS/SMB shares, bulk file storage and backups.

Configuration

Taskfile Usage for Fly Apps

This repository uses Taskfile to automate Fly app management. Each Fly app task loads its environment variables from a per-app .config.env file, ensuring isolation and security for each app.

How it works

  • Each Fly app (e.g., gatus) has its own directory under infrastructure/flyio/<APP> containing a .config.env file with the required environment variables for that app.
  • The Taskfile tasks (e.g., fly:app:create, fly:app:deploy, fly:volume:create, etc.) use the APP variable to select which app to operate on.
  • When you run a task, the Taskfile loads the environment from infrastructure/flyio/<APP>/.config.env using the dotenv: key at the task level. This ensures all commands for that app use the correct environment.

Example: Using the Taskfile with the gatus app

Suppose you want to deploy or manage the gatus Fly app. First, ensure you have a .config.env file at infrastructure/flyio/gatus/.config.env with all required variables.

To create a volume for gatus:

task fly:volume:create APP=gatus

To deploy the app:

task fly:app:deploy APP=gatus

To view logs:

task fly:app:logs APP=gatus

All these commands will automatically load environment variables from infrastructure/flyio/gatus/.config.env.

Note: You do not need to manually export environment variables or use a global .config.env. Each app is fully isolated.


Configuration

The .config.env file contains environment variables needed to deploy the apps in this template. Each app has its own .config.env file as described above.

  1. Copy the .config.sample.env to .config.env in the appropriate app directory and fill out all the environment variables. All uncommented variables are required.

Fly.io setup

For some commands below, we use a task instead of flyctl because we the task writes (on app creation) and reads (subsequent commands) your app name from the config file. This is the only way to keep your app name hidden.

  1. Signup to Fly

    If you already have a Fly account, use flyctl auth login instead.

    flyctl auth signup
  2. Create a new fly app

    If this is your first app, you'll be asked to add credit card information, but, don't worry, you'll not be charged by this app.

    task fly:app:create APP=gatus
  3. Create a new volume

    This will show you a warning about invididual volumes. It's ok to have a single volume because we're not concerned about downtime for our gatus instance.

    task fly:volume:create APP=gatus
  4. Deploy your app

    task fly:app:deploy APP=gatus
  5. Setup your custom domain

    After your app is deployed, follow the steps here to setup your custom domain.

  6. Open your new gatus website

    That's all! Now you can open your custom domain and gatus should work.

Keeping dependencies up to date

This template uses Renovatebot to scan and open new PRs when dependencies are out of date.

To enable this, open their Github app page, click the "Configure" button, then choose your repo. The template already provides Renovate configs and there's no need for further action.

Troubleshooting

If your deployment failed or you can't open gatus web, you can see the logs with:

task fly:app:logs APP=gatus

If that command fails (eg, if the machine is stopped), try opening your logs in the browser:

task fly:app:logs:web APP=gatus

You can also ssh in the machine with:

task fly:app:ssh

and check individual logs using overmind:

# Run this command inside your fly machine
overmind connect gatus

This will open a tmux window with gatus logs. You can scroll your tmux window with Ctrl-B-] and use Ctrl-B-D to exit the tmux window.

Substitute gatus with caddy, or backup to see logs for other apps.

Continuous deployment

After your first manual deploy to Fly.io, per instructions above, you can setup continuous deployment via Github Actions.

  1. Install Github CLI

    brew install gh
  2. Login to Github

    gh auth login
  3. Set Fly secrets to your Github repo

    task github:secrets:set
  4. Test your workflow deployment

    task github:workflow:deploy

That's all! Now, any changes to your Dockerfile, fly.toml or scripts/config will trigger a fly deploy.

FAQ

  1. Why every fly command I run errors with: Error: the config for your app is missing an app name?

    For security reasons the app name is not sdaved in the fly.toml file. In that case, you have to add -a your-app-name to all fly commands.

    Your app name is found in your .config.env file.

    Example:

    fly secrets list -a your-app-name

    Or you can add:

    app = "your-app-name"

    to the beginning of your fly.toml file.

  2. How do I update the environment variables?

    After updating the .config.env file, you can update your environment variables in two different ways:

    task fly:secrets:set APP=gatus

    will read your .config.env file and import every defined variable to your fly app, Or you can just do a new deployment:

    task fly:app:deploy APP=gatus

    which will run the command above and do a new deployment afterwards.

Core Components

  • actions-runner-controller: self-hosted Github runners
  • cilium: internal Kubernetes networking plugin
  • cert-manager: creates SSL certificates for services in my cluster
  • external-dns: automatically syncs DNS records from my cluster ingresses to a DNS provider
  • external-secrets: managed Kubernetes secrets using Bitwarden.
  • ingress-nginx: ingress controller for Kubernetes using NGINX as a reverse proxy and load balancer
  • sops: managed secrets for Kubernetes, Ansible, and Terraform which are committed to Git
  • tf-controller: additional Flux component used to run Terraform from within a Kubernetes cluster.
  • volsync: backup and recovery of persistent volume claims

GitOps

Flux watches the clusters in my kubernetes folder (see Directories below) and makes the changes to my clusters based on the state of my Git repository.

The way Flux works for me here is it will recursively search the kubernetes/apps folder until it finds the most top level kustomization.yaml per directory and then apply all the resources listed in it. That aforementioned kustomization.yaml will generally only have a namespace resource and one or many Flux kustomizations. Those Flux kustomizations will generally have a HelmRelease or other resources related to the application underneath it which will be applied.

Renovate watches my entire repository looking for dependency updates, when they are found a PR is automatically created. When some PRs are merged Flux applies the changes to my cluster.

Directories

This Git repository contains the following directories under Kubernetes.

πŸ“ kubernetes
│── πŸ“ apps           # applications
│── πŸ“ bootstrap      # bootstrap procedures
│── πŸ“ flux           # core flux configuration
│── πŸ“ templates      # re-useable components

Flux Workflow

This is a high-level look how Flux deploys my applications with dependencies. Below there are 3 apps postgres, authentik and weave-gitops. postgres is the first app that needs to be running and healthy before authentik and weave-gitops. Once postgres is healthy authentik will be deployed and after that is healthy weave-gitops will be deployed.

graph TD;
  id1>Kustomization: cluster] -->|Creates| id2>Kustomization: cluster-apps];
  id2>Kustomization: cluster-apps] -->|Creates| id3>Kustomization: postgres];
  id2>Kustomization: cluster-apps] -->|Creates| id6>Kustomization: authentik]
  id2>Kustomization: cluster-apps] -->|Creates| id8>Kustomization: weave-gitops]
  id2>Kustomization: cluster-apps] -->|Creates| id5>Kustomization: postgres-cluster]
  id3>Kustomization: postgres] -->|Creates| id4[HelmRelease: postgres];
  id5>Kustomization: postgres-cluster] -->|Depends on| id3>Kustomization: postgres];
  id5>Kustomization: postgres-cluster] -->|Creates| id10[Postgres Cluster];
  id6>Kustomization: authentik] -->|Creates| id7(HelmRelease: authentik);
  id6>Kustomization: authentik] -->|Depends on| id5>Kustomization: postgres-cluster];
  id8>Kustomization: weave-gitops] -->|Creates| id9(HelmRelease: weave-gitops);
  id8>Kustomization: weave-gitops] -->|Depends on| id5>Kustomization: postgres-cluster];
  id9(HelmRelease: weave-gitops) -->|Depends on| id7(HelmRelease: authentik);
Loading

☁️ Cloud Dependencies

While most of my infrastructure and workloads are self-hosted I do rely upon the cloud for certain key parts of my setup. This saves me from having to worry about two things. (1) Dealing with chicken/egg scenarios and (2) services I critically need whether my cluster is online or not.

The alternative solution to these two problems would be to host a Kubernetes cluster in the cloud and deploy applications like HCVault and ntfy. However, maintaining another cluster and monitoring another group of workloads is a lot more time and effort than I am willing to put in.

Service Use Cost
Bitwarden Secrets with External Secrets ~$TBC/yr
Cloudflare Domain and S3 ~$TBC/yr
GitHub Hosting this repository and continuous integration/deployments Free
Gatus Monitoring internet connectivity and external facing applications Free
Total: ~$TBC/mo

πŸ”§ Hardware

Kubernetes Cluster

Name Device CPU OS Disk Data Disk RAM OS Purpose

First server (running NAS + compute home server)

Component Specification
CPU Intel E-2246G 3.6GHz (base) Xeon β€” 6 cores / 12 threads
RAM 64GB ECC Kingston
Motherboard Gigabyte C246-WU4 Workstation/Server
OS Disk 500GB Samsung Evo 850 SSD
NVMe 1TB Samsung 970 Evo Plus NVMe
NAS/Data Disks Seagate IronWolf / EXOS β€” various (1TB / 2TB / 4TB / 6TB / 8TB / 10TB) x2 (7200RPM, 256MB, CMR)
Networking Intel WiFi-6 AX201 PCIe
Cooling Be Quiet! Dark Rock Slim
PSU Corsair AX760
Case Lian-Li PC-7H Full Aluminum ATX

Note: PROPOSING TO DECOUPLE NAS AND HOME SERVER β€” budget constrained (3rd world country costs). This is a proposed architecture change; details and migration plan are tracked elsewhere in the repo.

Supporting Hardware

The rest of the homelab and network devices include:

Device Purpose / Notes
HORACO 2.5GbE Managed Switch (8-port 2.5GBASE-T, 10G SFP+ uplink) Core LAN switching / aggregation
PiKVM-A3 (for Raspberry Pi) Remote KVM management for hardware access
Intel N100 Celeron N5105 fanless mini PC (Aliexpress model) Soft-router / pfSense / ESXi-capable device; 4x Intel i226/i225 2.5G LAN ports, HD/DP β€” used as edge firewall/router
TP-LINK EAP670 AX5400 Ceiling Mount WiFi 6 Access Point Wireless coverage (ceiling mount)
PoE Cameras reolink_rlc1212a_frontdoor, reolink_rlc520a_upstair, reolink_rlc811a_backdoor, reolink_rlc520a_dining
D-Link DES-F1010P-E (10-port PoE switch, 8 PoE + 2 uplink; 93W budget) Powers PoE cameras / APs
TP-Link TL-SG108E (8-Port Easy Smart Switch) Small L2 management switch for edge/office

Networking/UPS Hardware

Device Purpose
Mini PC (Aliexpress Intel N100 / N5105) Edge router / firewall appliance
TP-LINK EAP670 Primary wireless AP (ceiling mount)
HORACO 2.5GbE switch Primary LAN switch (2.5G uplinks)
TP-Link TL-SG108E Small managed L2 switch for office/VLANs
D-Link DES-F1010P-E PoE switch for cameras / APs (93W power budget)
UPS (TBD β€” planned)

Estimated power consumption ⚑

Assumptions: these are rough, conservative estimates for typical (idle/average) and peak loads. Real consumption depends on workload, drive spin-up, PoE loads, and UPS efficiency. I assume a UPS/inverter efficiency of ~90% for mains draw calculations.

Item Typical (W) Peak (W) Notes
First server (E-2246G, 64GB, NVMe + SATA SSD + 2x HDD) 120 230 Typical under light/medium home-lab load; peak under full CPU+disk IO
HORACO 2.5GbE managed switch (8-port + SFP+) 25 30 Fanless small managed switch
PiKVM-A3 (Raspberry Pi) 5 6 Low-power remote KVM board
Intel N100 / N5105 mini PC (edge router) 20 35 Typical for fanless mini-PC, spikes under load/VPN/ESXi usage
TP-LINK EAP670 AP 12 20 PoE/AC/AP radio active under clients
D-Link DES-F1010P-E (PoE switch) - switch chassis 15 20 Switch chassis overhead (PoE separate)
Cameras (4x Reolink, listed models) - total 32 40 ~8W each typical, up to ~10W peak each
TP-Link TL-SG108E 6 8 Small L2 switch

Totals:

  • Total typical IT load: 235 W
  • Total peak IT load: 388 W

Accounting for UPS/inverter inefficiency (~90%):

  • Mains draw (typical) β‰ˆ 235 W / 0.9 β‰ˆ 261 W (0.261 kW)
  • Mains draw (peak) β‰ˆ 388 W / 0.9 β‰ˆ 431 W (0.431 kW)

Estimated energy consumption:

  • Daily (typical): 0.261 kW Γ— 24 h β‰ˆ 6.26 kWh/day
  • Monthly (30d, typical): β‰ˆ 187.9 kWh/month

Notes & next steps:

  • These are estimates to help size UPS, breakers, and monthly energy costs. If you want, I can: compute estimated monthly cost for your local electricity rate, produce a per-device YAML inventory with the wattages, or create a small PowerPoint/CSV for UPS sizing and redundancy.

πŸš€ Deployment Tasks

This repository includes several Taskfile tasks to automate infrastructure provisioning and cluster configuration:

Proxmox Infrastructure

  • init Initializes the Terraform working directory for Proxmox infrastructure.

    task talos:init
    • Runs terraform init in infrastructure/terraform/proxmox.
  • apply Applies the Terraform configuration to provision/update Proxmox infrastructure.

    task talos:apply
    • Runs terraform apply -auto-approve in infrastructure/terraform/proxmox.

See .taskfiles/talos/Taskfile.yaml for more details and additional tasks.


⭐ Stargazers

Star History Chart


🀝 Thanks

Big shout out to original flux-cluster-template, and the Home Operations Discord community.

Be sure to check out kubesearch.dev for ideas on how to deploy applications or get ideas on what you may deploy.


πŸ“œ Changelog

See my awful commit history

ARCHIVES FOLDER IS REMOVED ON Aug 10 14:20:50


πŸ” License

See LICENSE

About

K3s Cluster powered by proxmox

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors 4

  •  
  •  
  •  
  •