docs: publish 2026-04-26

lerko96
2026-04-26 22:41:44 -04:00
commit 5738c6424d
8 changed files with 480 additions and 0 deletions
+64
@@ -0,0 +1,64 @@
# Network
How I think about segmentation and why the policy looks the way it does. Specific subnets, VLAN IDs, IP plans, and firewall rule listings live in the private repo.
## Why segmentation matters here
A homelab pulls together an unusually wide trust spread on one piece of hardware: cloud-managed IoT devices that phone home constantly, a work laptop that touches an employer network, a guest WiFi that strangers join, internal services holding sensitive data, and admin surfaces that should never be exposed. Running all of that on one flat network treats every device as equally trusted. They aren't.
The model here is **trust-tier VLANs** with explicit policy between them. Every tier has a documented purpose and a defined inbound/outbound posture.
## Trust tiers
Eight tiers (seven VLANs plus the WireGuard VPN), organized roughly by how much I trust what's on them:
| Tier | What's on it | Posture |
|---|---|---|
| **Management** | Hypervisor, firewall, backup server, network controllers | Most trusted. Reachable only via VPN. Doesn't initiate outbound unless it has to. |
| **Internal services** | LXCs and VMs running the internal app stack | Trusted. Serves clients in adjacent tiers per policy. |
| **LAN** | Personal devices on home WiFi/Ethernet | Trusted. Consumes internal services. |
| **Work-from-home** | Employer-owned laptop | Untrusted lateral. Internet only — blocked from everything else, including internal DNS. |
| **IoT** | Smart devices, cloud-managed appliances | Untrusted. Internet only. Isolated from everything internal. |
| **Guest** | Visitor WiFi | Untrusted. Internet only. |
| **DMZ** | Internet-facing services | Treated as compromised by default. Locked down on outbound; inbound to internal is a tight allowlist. |
| **VPN (WireGuard)** | Authenticated remote clients | Same posture as LAN, plus admin-tier visibility. |
## Policy posture
- **Default deny inter-VLAN.** Every cross-tier flow is an explicit allow rule with a reason written next to it.
- **WFH and IoT are jailed.** They reach the internet and nothing else internal — not even DNS for the local hostnames. This is the most important rule in the firewall.
- **Management is the smallest possible tier.** Only what *runs* the lab lives there. No user-facing services. No outbound internet from anything that doesn't strictly need it.
- **DMZ is one-way.** Public services live there. They can't initiate connections inward except through a tight, firewall-enforced allowlist by source IP and destination port. The reverse proxy in the DMZ is *configured* to respect that, and the firewall is *also* configured to enforce it. Two layers, on purpose — misconfiguring the proxy is way easier than misconfiguring the firewall.
- **Admin surfaces are VPN-only.** Hypervisor, firewall, backup server, switches, APs — none of them are reachable from the internet. WireGuard first or it doesn't happen.
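The default-deny posture above can be modeled in a few lines. This is a minimal sketch, not the actual firewall configuration: the tier names, ports, and reasons are made up for illustration, and the real rule set stays in the private repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allow:
    src: str     # source tier
    dst: str     # destination tier
    port: int    # destination port
    reason: str  # every allow rule carries a written justification


# Hypothetical rules for illustration only.
RULES = [
    Allow("lan", "internal", 443, "clients reach the internal reverse proxy"),
    Allow("vpn", "management", 8006, "admin UI, reachable over WireGuard only"),
    Allow("dmz", "internal", 9000, "public proxy to the SSO backend"),
]


def is_allowed(src, dst, port, rules=RULES):
    """Default deny: a cross-tier flow passes only if an explicit rule matches."""
    if src == dst:
        # Intra-tier traffic never crosses the firewall, so it is not filtered here.
        return True
    return any(r.src == src and r.dst == dst and r.port == port for r in rules)
```

With these example rules, `is_allowed("iot", "internal", 443)` is `False` because no rule names that flow; nothing has to be written to block it.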
## DNS
Three layers, each doing one job:
1. **Pi-hole** — first hop for clients on most VLANs. Filters ad/tracker domains and holds the local A records that map internal hostnames to internal IPs. Not used by management hosts (see below) or by the WFH VLAN.
2. **Unbound on the firewall** — Pi-hole's upstream. Recursive resolver, validates DNSSEC.
3. **Cloudflare** — Unbound's eventual upstream when needed.
**Bootstrap exception:** the hypervisor itself (which is the box Pi-hole runs on) is statically pointed at the firewall's resolver, not Pi-hole. Otherwise there's a circular dependency at boot — the hypervisor needs DNS to come up, and Pi-hole is one of the things the hypervisor brings up.
**Known SPOF:** Pi-hole is the only thing resolving internal hostnames. If it dies, internal hostnames stop resolving until it's back. I thought about mirroring the records into Unbound on pfSense and decided not to — I'd rather know if Pi-hole is unhealthy than paper over it. Documented as a known limitation in the private repo.
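A toy model of the resolver chain makes the single point of failure concrete. It is a sketch under stated assumptions: the hostnames and addresses are placeholders, and a dictionary stands in for real recursion and DNSSEC validation.

```python
# Hypothetical data: hostnames and addresses are placeholders, not the lab's.
PIHOLE_LOCAL = {"wiki.lab.example": "10.0.20.10"}
PUBLIC_DNS = {"example.org": "203.0.113.7"}  # stands in for real recursion


def unbound_resolve(name):
    """Firewall resolver: recursion only, no mirrored local records."""
    return PUBLIC_DNS.get(name)


def pihole_resolve(name, healthy=True):
    """First hop for most clients: local records first, then the upstream."""
    if not healthy:
        raise ConnectionError("Pi-hole is down")
    return PIHOLE_LOCAL.get(name) or unbound_resolve(name)
```

In this model the firewall resolver answers for public names but returns nothing for `wiki.lab.example`, which matches the bootstrap exception: the hypervisor can come up without Pi-hole, while internal hostnames depend on it.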
## Internet exposure
Three ports are open on the WAN side (HTTP and HTTPS count as two):
- **HTTP / HTTPS** — to the DMZ reverse proxy. Serves the small public service set.
- **WireGuard** — to the firewall. The only remote admin path.
Everything else is closed. I verify this from outside the network on a regular basis — the only way to actually know what's exposed is to scan from somewhere that isn't the LAN.
## IPv6
Disabled at the carrier-provided gateway. The lab is IPv4-only by design — fewer surfaces, simpler firewall reasoning, no AAAA leakage. I'll revisit this if I have a reason to; today I don't.
## Things that are easy to overlook
A couple of things worth being explicit about, because they bit me at some point:
- **Intra-VLAN traffic between LXCs on the same Proxmox bridge doesn't traverse the firewall.** Isolation is enforced *per-VLAN*, not *per-LXC*. Two LXCs sharing a tier can talk to each other directly. Useful to remember when you're reasoning about blast radius — the firewall doesn't see anything that doesn't cross a VLAN boundary.
- **Certificate Transparency.** Caddy uses Cloudflare DNS-01 for cert issuance, which is great because services don't have to be exposed to the internet to get a cert. But every cert that gets issued lands in CT logs forever, and per-hostname certs basically publish the internal hostname inventory to anyone who runs a CT search on the domain. A wildcard cert would limit CT exposure to `*.lerkolabs.com` and the apex; it's on my list as a future change, with the tradeoff being that wildcard compromise is worse than per-host.
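For the first point, blast radius inside a tier can be reasoned about with a simple placement map. A minimal sketch, assuming made-up container names and tier assignments:

```python
# Hypothetical placement of containers onto tiers, for illustration only.
PLACEMENT = {
    "wiki": "internal",
    "budget": "internal",
    "photos": "internal",
    "public-proxy": "dmz",
    "git": "dmz",
}


def unfiltered_peers(host, placement=PLACEMENT):
    """Hosts that share a tier with `host` and so are reachable without
    the traffic ever crossing the firewall."""
    tier = placement[host]
    return sorted(h for h, t in placement.items() if t == tier and h != host)
```

Here `unfiltered_peers("wiki")` lists every other container on the same tier: those are the hosts a compromised `wiki` could probe with no firewall rule in the way.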
+55
@@ -0,0 +1,55 @@
# Security
Posture and practices. No enumerated weaknesses or specific control parameters here — those live in the private repo where they belong.
## Threat model
Operating assumption: this is a one-person homelab on a residential connection, running a mix of services I rely on (password manager, calendar, photos) and a few that are reachable from the internet (portfolio, self-hosted Git). The realistic threat is opportunistic — bots scanning my public IP, automated exploitation of known CVEs in exposed services, credential stuffing against any public login.
A targeted, well-resourced adversary is out of scope. The defenses are designed to make the cost of opportunistic attacks high enough that automation gives up and moves on.
## Layered controls
The defenses are intentionally redundant. Any single layer failing should leave the next one intact.
1. **Network segmentation.** Trust-tier VLANs with default-deny inter-tier policy. A compromise on one VLAN doesn't get lateral movement to another without crossing an explicit firewall rule.
2. **Public surface stays small.** Only a handful of services are reachable from the internet, all proxied through a DMZ-isolated host with a tight firewall-enforced allowlist into internal.
3. **TLS at the edge, properly.** ACME-automated certs via Cloudflare DNS-01. Public hostnames carry HSTS preload-eligible headers.
4. **Identity in front of everything internal.** Authentik handles SSO. OIDC where the app supports it; reverse-proxy forward auth where it doesn't. There's no "log in with the app's local account" bypass.
5. **Admin is VPN-only.** Hypervisor, firewall, backup server, switches, APs — none of them reachable from the internet. WireGuard is the only remote admin path, and it has its own keypair-based auth.
6. **Secrets aren't in git.** Configs reference secrets by name; values live in a password manager.
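Item 6 can be illustrated with a small template renderer. The `${secret:NAME}` placeholder syntax is an assumption made up for this sketch, not the format the configs here actually use; the lookup defaults to environment variables but could be any function backed by a password manager.

```python
import os
import re

# Hypothetical placeholder syntax: ${secret:NAME}
_REF = re.compile(r"\$\{secret:([A-Za-z0-9_]+)\}")


def render(config_text, lookup=os.environ.get):
    """Replace secret placeholders at deploy time; fail loudly if one is missing."""
    def substitute(match):
        name = match.group(1)
        value = lookup(name)
        if value is None:
            raise KeyError(f"secret {name!r} is not available")
        return value

    return _REF.sub(substitute, config_text)
```

The committed file only ever contains the name (`password=${secret:DB_PASSWORD}`); the value exists only in the rendered copy on the host.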
## Update cadence
- **Edge components** (firewall, reverse proxies, identity provider) — patched promptly when CVEs land. Highest blast radius if compromised.
- **Hypervisor and backup server** — quarterly review, with security patches applied out-of-cycle when needed.
- **Application LXCs** — rolling updates on a regular schedule, with the sensitive ones (password manager, photos, identity) bumped ahead of less-sensitive ones.
- **Container images** — re-pulled on the same rolling schedule.
A version tracker in the private repo records what's running where, so drift is visible at a glance.
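Drift detection of this kind can be as simple as comparing two maps. The sketch below assumes plain dotted numeric versions and invented service names; the real tracker's format is not shown here.

```python
def parse(version):
    """Turn '1.10.2' into (1, 10, 2) so comparisons are numeric, not textual."""
    return tuple(int(part) for part in version.split("."))


def drift(running, latest):
    """Return {service: (running, latest)} for everything that is behind."""
    return {
        name: (running[name], latest[name])
        for name in running
        if name in latest and parse(running[name]) < parse(latest[name])
    }
```

Parsing into tuples matters: compared as strings, `"1.10.0"` sorts before `"1.9.0"`, which would hide real drift.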
## Backups
- Hypervisor-level backups to a dedicated backup server on a separate VLAN.
- Conservative retention, with the most recent backups always preserved no matter what gets pruned.
- Backups verified periodically; restores exercised on a non-production host so I find out if something is broken *before* I need it.
- A documented rebuild order means the lab can come back from cold in a few hours, assuming I have physical access to the firewall.
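The "most recent backups always preserved" rule can be sketched as a pruning function. The 30-day window and the keep-last count of 3 are placeholder values for illustration, not the retention settings actually in use.

```python
from datetime import datetime, timedelta


def prune(backups, now, max_age=timedelta(days=30), keep_last=3):
    """Split backup timestamps into (keep, delete).

    The newest `keep_last` backups survive regardless of age, so a long
    outage can never prune the lab down to zero restore points.
    """
    ordered = sorted(backups, reverse=True)  # newest first
    protected = set(ordered[:keep_last])
    keep = [b for b in ordered if b in protected or now - b <= max_age]
    delete = [b for b in ordered if b not in keep]
    return keep, delete
```

If backups stopped 40 days ago, an age-only policy would delete everything; with the protected set, the last three still exist when the problem is noticed.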
## Credentials & rotation
- Identity provider passwords: stored in a password manager and only ever entered through it.
- Service-to-service secrets (API tokens, DB passwords): rotated when there's a reason to (component change, suspected exposure), not on a fixed calendar.
- VPN credentials: per-device. Each remote client has its own keypair.
## Honest limitations
This is a learning environment, not a hardened production estate. A couple of things I'd rather be upfront about:
- **No HA.** One hypervisor, one firewall. The mitigation isn't redundancy, it's a tested rebuild path and conservative backups.
- **One-person ops.** Everything runs as it would in a prod environment with on-call staff, except there's only me. The choice of tooling reflects that — anything that needs constant attention has been swapped out for something simpler.
Neither of those is unmanaged risk. They're scoped accepted risks, reviewed quarterly in the private repo.
## Security contact
Found something concerning about a public-facing endpoint? `admin@lerkolabs.com`. I'll respond.
+108
@@ -0,0 +1,108 @@
# Services
Everything I'm running, grouped by what it does. URLs, ports, and which host runs what are operational details — those live in the private repo.
## Identity & access
| Service | What it does |
|---|---|
| Authentik | SSO for everything internal. OIDC where the app supports it, Caddy forward auth where it doesn't. |
| Pi-hole | First-hop DNS for most VLANs, ad blocking, and the source of truth for internal hostnames. |
| WireGuard | The only way in from outside. All admin work happens through the tunnel. |
## Reverse proxy & TLS
Two Caddy instances, by design:
- **Internal Caddy** — fronts everything internal. Reachable from inside the LAN or via VPN. Does most of the routing.
- **DMZ Caddy** — fronts the small set of things I want public. Lives on its own VLAN with no inbound access to internal services beyond a tight, firewall-enforced allowlist.
Both use Cloudflare DNS-01 for ACME, which is how internal-only services get valid public certs without ever being exposed to the internet for issuance.
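The reason no inbound exposure is needed is visible in what the DNS-01 challenge asks for: a TXT record, nothing more. The sketch below follows the ACME specification (RFC 8555) for computing that record; it is not Caddy's code, and the token and thumbprint values passed in are whatever the CA and account key provide.

```python
import base64
import hashlib


def b64url(data: bytes) -> str:
    """Base64url without padding, as ACME requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dns01_record(domain, token, account_thumbprint):
    """Return (record_name, txt_value) for an ACME DNS-01 challenge."""
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # Wildcard names are validated at the base domain.
    name = "_acme-challenge." + domain.removeprefix("*.")
    return name, b64url(digest)
```

The ACME client publishes that record through the DNS provider's API, the CA queries public DNS, and the service itself never has to accept a connection from the internet.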
## Productivity & knowledge
| Service | What it replaces |
|---|---|
| Outline | Notion / Confluence |
| Vikunja | Todoist / Asana |
| Hoarder | Pocket / Raindrop |
| Memos | Apple Notes (the quick-capture kind) |
| FreshRSS | Feedly |
| Bytestash | gist / pastebin |
| Filebrowser | Dropbox-style file access |
| Baikal | iCloud calendar/contacts (CalDAV / CardDAV) |
## Money
| Service | What it replaces |
|---|---|
| Actual Budget | YNAB / Mint |
| Ghostfolio | Personal Capital |
## Operations & day-to-day
| Service | What it does |
|---|---|
| Grist | Lightweight relational tracking — anything that wants to be in a spreadsheet but shouldn't be |
| Glance | Personal landing page / dashboard |
| Traggo | Time tracking |
## Media
| Service | What it does |
|---|---|
| Plex | Media library (legacy clients) |
| Jellyfin | Media library (primary, open source) |
| *arr stack | Library automation |
| qBittorrent | Downloads |
| Immich | Self-hosted Google Photos replacement |
## Home / IoT
| Service | What it does |
|---|---|
| Home Assistant OS | Home automation hub |
## Secrets
| Service | What it does |
|---|---|
| Vaultwarden | Bitwarden-compatible password manager. **Planned, not deployed yet.** |
## Bots & automation
| Service | What it does |
|---|---|
| Vocard | Discord music bot |
| MonitorRSS | RSS-to-Discord notifications |
| ntfy | Push notifications for ops alerts |
## Monitoring
| Service | What it does |
|---|---|
| VictoriaMetrics | Time-series store |
| Grafana | Dashboards |
| Beszel | Lightweight host metrics |
| Uptime Kuma | Synthetic uptime checks |
## Public services
A small, intentional set of things that are reachable from the open internet. They all sit behind the DMZ reverse proxy, on a VLAN whose only path into internal subnets is a tight, firewall-enforced allowlist.
| Service | Why it's public |
|---|---|
| Portfolio | It's a portfolio. |
| Self-hosted Git | Where you're reading this. |
| SSO endpoint | Has to be reachable for an OIDC flow on one specific public-facing service (the Discord bot dashboard). It's the only internal-VLAN backend the public proxy is allowed to talk to, and the firewall enforces that — not just the proxy config. |
| One Authentik-gated app | The Discord bot dashboard. Public so I can hit it from outside the LAN; gated by Authentik forward auth before anything responds. |
## Who can access what
Three audiences, three levels:
- **Internet, anonymous** — sees only the small public set above.
- **Internet, signed into Authentik** — same as above, plus access to the Authentik-gated public services.
- **Connected via WireGuard** — gets everything: internal apps and admin surfaces (hypervisor, firewall, backup server, network controller, monitoring). This is the only way to reach any admin surface.
The WFH and IoT VLANs are deliberately *outside* this access model. Those are for me-as-a-user (work laptop, smart devices), not me-as-an-operator. They never see the internal service plane.
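The three audience levels nest, so they can be modeled as an ordered enum. This is a sketch of network reachability only: the surface names are invented, and it says nothing about the SSO login that still sits in front of internal apps.

```python
from enum import IntEnum


class Audience(IntEnum):
    ANONYMOUS = 0  # internet, not signed in
    AUTHENTIK = 1  # internet, signed into Authentik
    VPN = 2        # connected via WireGuard


# Hypothetical surface names mapped to the minimum audience that can reach them.
REQUIRED = {
    "portfolio": Audience.ANONYMOUS,
    "git": Audience.ANONYMOUS,
    "bot-dashboard": Audience.AUTHENTIK,
    "internal-apps": Audience.VPN,
    "hypervisor-ui": Audience.VPN,
}


def can_reach(audience, surface):
    """Each level includes everything the level below it can reach."""
    return audience >= REQUIRED[surface]
```

WFH and IoT devices have no entry in `Audience` at all, which mirrors the text: they are outside the access model, not at the bottom of it.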