[{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/ansible/","section":"Tags","summary":"","title":"Ansible","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/categories/architecture/","section":"Categories","summary":"","title":"Architecture","type":"categories"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/architecture/","section":"Tags","summary":"","title":"Architecture","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/backup/","section":"Tags","summary":"","title":"Backup","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/crisp/","section":"Tags","summary":"","title":"Crisp","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/devops/","section":"Tags","summary":"","title":"Devops","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/categories/devops/","section":"Categories","summary":"","title":"DevOps","type":"categories"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/disaster-recovery/","section":"Tags","summary":"","title":"Disaster Recovery","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/hetzner/","section":"Tags","summary":"","title":"Hetzner","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/categories/infrastructure/","section":"Categories","summary":"","title":"Infrastructure","type":"categories"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/infrastructure/","section":"Tags","summary":"","title":"Infrastructure","type":"tags"},{"content":"","date":"12 April 
2026","externalUrl":null,"permalink":"/tags/podman/","section":"Tags","summary":"","title":"Podman","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/posts/","section":"Posts","summary":"","title":"Posts","type":"posts"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/","section":"Roberto Tazzoli","summary":"","title":"Roberto Tazzoli","type":"page"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/s3/","section":"Tags","summary":"","title":"S3","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/tailscale/","section":"Tags","summary":"","title":"Tailscale","type":"tags"},{"content":" Terraforming the Cloud: Provisioning and Configuring Vault on Hetzner via Terraform and Ansible # There are infrastructure sessions where the primary value is not the number of modified files, but the confirmation that the working method is actually working. This is one of those.\nIn recent days, I had very cleanly separated two phases of the Vault runtime on Hetzner. The first, hetzner-vault-local-lifecycle (C1), aimed to prove that the node could exist as a coherent local entity: TLS, Raft storage, strict bootstrap, automatic unseal, and clear identity contracts. The second, hetzner-vault-s3-backup-recovery (C2), was to add remote durability and recovery: periodic snapshots, coherent pointers on S3, comparison logic, remote durability repair, and, above all, a restore path when the local node no longer exists but the cryptographic truth in the controller and the remote backups are still healthy.\nChronologically, they seem like two separate jobs. In reality, however, they were a single path. 
For about three days I worked almost exclusively on design: review, refinement, clarification of contracts, definition of nomenclature, decision matrix, responsibility between Ansible and shell helpers, behavior in case of ambiguous state, bootstrap flow, the role of TazPod, receipt structure, restore limits. Then, when the time came to implement, the hard work was already done. The execution part compressed into a few hours and, above all, took place with a fluidity very different from the typical work of this kind.\nThis does not mean there were no problems. There were, and some were even instructive. But the type of problem changed. I didn\u0026rsquo;t find myself questioning the architecture in the middle of the session. Instead, I faced integration problems, operational details, and the actual behavior of tools like Podman, systemd, Tailscale, and Vault. It\u0026rsquo;s a huge difference. When the design is solid, even the unexpected stops being chaos and simply becomes an anomaly to isolate and correct.\nIn this article, I recount the entire transition: from the local lifecycle to the remote backup on S3, with the live verification of snapshot rotation, the proof that the remote state can be initialized and repaired starting from the local truth, the real test of the \u0026ldquo;unchanged snapshot\u0026rdquo; case, the destructive restore cycles, and, above all, the complete closure of the C2 matrix. At the end of the work, no \u0026ldquo;almost ready\u0026rdquo; branch remained: the VM is coherent, Vault is active, TazPod is coherent, S3 is coherent, the backup timer is alive, and the entire set of scenarios designed for C2 was executed and brought to green.\nThe starting point: an already credible C1, not a fragile prototype # The first important element to understand is that C2 was not built on an improvised foundation. 
The work on C1 had already eliminated much of the initial entropy.\nThe Vault node on Hetzner was no longer a simple container \u0026ldquo;that starts\u0026rdquo;. It was already a runtime with its own well-defined identity. The host node was lushycorp-vm.ts.tazlab.net, the Vault TLS service was lushycorp-api.ts.tazlab.net, the persistent paths were stable, the TLS configuration was clear, the bootstrap produced a vault_lineage_id, the local receipt recorded the identity of the cluster, and the automatic unseal mechanism already had a precise shape.\nThis distinction is also fundamental from a methodological point of view. Many backup and restore problems arise because you attempt to design remote recovery when the local system is not yet rigorously defined. In that case, the remote layer inherits ambiguities already present in the local layer and amplifies them. Here the opposite happened: the work done on C1 reduced the degrees of freedom. When I started C2, I no longer had to decide \u0026ldquo;what\u0026rdquo; the Vault node was. I only had to decide how to rigorously extend an already defined identity.\nIt is worth clarifying right away a term that I use often in the rest of the article: lineage. By vault_lineage_id I mean the stable identity of a specific life history of the Vault, that is, the \u0026ldquo;genealogical line\u0026rdquo; of that instance: it is born at the first bootstrap, it is kept in local receipts and canonical artifacts in TazPod, and it serves to distinguish a true restore of the same instance from a new initialization that instead produces a new identity. In practice, if I recreate the node but I am really bringing the same Vault back to life, the lineage remains the same. If instead I do a fresh init and a new Vault is born, a new lineage is also born.\nThis is why the division into phases proved so useful. The names foundation, local lifecycle, and remote durability were not just labels of convenience. They were true diagnostic boundaries. 
If something broke in the remote durability, I already knew I wasn\u0026rsquo;t debating TLS, baseline Tailscale, elementary Podman runtime, or local bootstrap. This drastically reduces the noise when reading logs and having to make quick decisions.\nThe real value of the three days of design # The most interesting part of this session, at least for me, was not so much writing the Ansible code or the helper scripts. It was seeing very concretely that the three days of design had really transformed the implementation session.\nThe point is not simply that \u0026ldquo;it went faster\u0026rdquo;. Speed, in infrastructure, says little by itself. You can go fast in the wrong direction too. The point is that the session took place without the typical breaks in continuity that happen when you program the architecture while writing the code. I didn\u0026rsquo;t have to stop halfway to ask myself if the bucket should contain the latest global snapshot or the latest snapshot per lineage. I didn\u0026rsquo;t have to redefine what a \u0026ldquo;legitimate\u0026rdquo; restore was. I didn\u0026rsquo;t have to decide on the fly whether the admin token should be recreated always or only in certain cases. All these choices had already been made explicit.\nThis had a very practical effect: when a problem emerged, the problem was confined. If the backup service unit was not passing the right variables, the correction was local. If the snapshot path was not mounted in the container, the correction was local. If two logically identical snapshots had different binary hashes, the problem didn\u0026rsquo;t suddenly become a crisis of the backup strategy; it became a refinement of the comparison contract. This difference between \u0026ldquo;confined problem\u0026rdquo; and \u0026ldquo;systemic problem\u0026rdquo; is the reason why I consider this session a success.\nIn more didactic terms: preventive design does not eliminate bugs, but it transforms the type of bug you encounter. 
It reduces the risk of architectural bugs, that is, those that force you to change your mental model halfway through the work. What remains are integration bugs, bugs in the real behavior of tools, and bugs at the interfaces between components. They are still annoying, but much more manageable.\nC2 in practice: what it really had to do # The second phase of the project could not be limited to \u0026ldquo;saving files to S3\u0026rdquo;. Such a simple formulation would have been dangerously incomplete. The real goal was to introduce coherent remote durability without confusing the concept of backup with that of identity.\nA coherent local Vault produces a certain cryptographic truth: unseal keys, administrative tokens, lineage, Raft state. A useful remote backup is not simply a blob of data; it is an artifact that can be reliably tied back to that same identity. This is where the contract of pointers and metadata comes from. It is not enough to upload a snapshot. You need to know which lineage it represents, which slot is active, which hash it corresponds to, and which candidate is the correct one to use in the restore phase.\nFor this reason, I implemented three distinct levels in the bucket:\na global pointer (vault/raft-snapshots/latest.json) indicating the active lineage;\na lineage-local pointer (vault/raft-snapshots/\u0026lt;vault_lineage_id\u0026gt;/latest.json) indicating the current restore candidate for that lineage;\ntwo remote slots (slot-a and slot-b) that allow a simple and readable rotation.\nThis structure was also an important choice for operational readability. In the incident response phase, an elegant but opaque system is often worse than a slightly more verbose but transparent system. 
Here I wanted an operator, reading objects in S3 or local logs, to be able to understand what the active state was without having to \u0026ldquo;guess\u0026rdquo; based on implicit conventions.\nThe implementation of the C2 runtime # The physical implementation was distributed over a fairly wide area, but very tidy. I added a dedicated playbook (vault-s3-backup-recovery.yml), new task files in the shared Ansible role, two dedicated shell helpers for backup and restore, and new systemd units for the hourly timer and for the explicit restore.\nAn important aspect of the design was the division of responsibilities between Ansible and shell helpers. The helpers shouldn\u0026rsquo;t \u0026ldquo;decide\u0026rdquo; the behavior of the system. They had to mechanically execute restricted operations: save a snapshot, calculate hashes, read or write S3 objects, execute the restore primitive when already authorized. The visibility of choices — restore yes/no, failure yes/no, selected lineage, need for recreation of the admin token — had to remain in the Ansible tasks. This is not just a stylistic quirk. It is a choice that improves auditability and debugging. A state machine that lives in separate tasks is much more readable than a shell script that engulfs everything and returns a generic exit code.\nNew operational fixed points also appeared on the host node:\n/etc/lushycorp-vault/s3.env for root-only S3 credentials; /etc/lushycorp-vault/remote-restore.env for the restore request contract; /etc/lushycorp-vault/snapshot-backup-token.txt for the token limited to backup; /var/log/lushycorp-vault/vault-snapshot-backup.log; /var/log/lushycorp-vault/vault-remote-restore.log. This detail of the paths is less trivial than it seems. In long sessions or those distributed over several days, the difference between an \u0026ldquo;observable\u0026rdquo; system and one that forces you to guess the state from secondary symptoms is enormous. 
Here every important phase has its own log known before startup. When something went wrong, I didn\u0026rsquo;t have to reconstruct ex post where it could have failed. I could read it directly.\nThe first problems: good problems, not architectural problems # The first real stumble didn\u0026rsquo;t concern Vault, but the operator node. The first C2 execution failed during the Tailscale validation phase because the system expected the standard tailscaled socket, while the local operator was using a userspace instance with a dedicated socket in /tmp/tailscaled-operator.sock.\nThe interesting point is not so much the fix — relaunching create.sh with TAILSCALE_SOCKET=/tmp/tailscaled-operator.sock — but the fact that the problem was immediately readable and confined. The phase log dedicated to Tailscale validation clearly showed the failure of the local path. There was no ambiguous domino effect on Terraform, Ansible, or Vault. This is exactly the kind of behavior you expect from orchestration well-separated into phases.\nImmediately after, two other typical integration problems emerged:\nthe backup service unit was not yet passing all the necessary operational variables (S3_BUCKET, S3_PREFIX, etc.); the snapshot path existed on the host but was not mounted in the Vault container. Both were solved without having to change the model. I corrected the systemd templates to explicitly pass the required environment and I added the mount of the snapshot directory in the container service unit. This is the kind of work that in a poorly prepared session risks triggering broader doubts (\u0026ldquo;maybe the backup design is wrong\u0026rdquo;). 
Here instead it was clear from the start that it was a local wiring defect.\nThe initial backup to S3: first real test of phase C2 # Once the wiring part was corrected, the first real backup did what I expected from C2: it treated the remote layer as authoritatively reconstructible starting from a healthy local truth.\nThis point deserves an explanation. In the model I had defined, an \u0026ldquo;empty\u0026rdquo; or \u0026ldquo;incoherent\u0026rdquo; S3 must not block an already coherent local Vault. If the local node is healthy and TazPod is healthy, the remote layer is not the primary source of truth: it is the secondary durability. Consequently, the next backup must be able to initialize or repair the remote content without transforming a backup problem into a total blockage of the runtime.\nAnd that is exactly what happened. The first successful backup:\nclassified the remote state; wrote snapshot and metadata to S3; created the global pointer; created the lineage-local pointer; set the first active slot. This is not a purely \u0026ldquo;mechanical\u0026rdquo; victory. It is the proof that the distinction between local truth and remote durability had been modeled correctly. If the design had been more confused, the system could have attempted absurd restores, blocked the node out of excessive prudence, or written remote objects lacking sufficient context for a future rebuild.\nGenerating a distinguishable state: the marker-A marker # To avoid overly abstract tests, I wanted to introduce a clearly recognizable application state inside Vault. I therefore wrote a marker in the KV store, with the identifier marker-A and scenario baseline-before-matrix.\nWhy is it important? Because backup and restore tests must not stop at the infrastructure level. Knowing that Vault is \u0026ldquo;up\u0026rdquo; or \u0026ldquo;unsealed\u0026rdquo; is not enough. In a secrets system, the real question is: what data exactly does this instance contain? 
If after a rebuild the system comes back up but has lost or changed the data, the test has failed even if systemd is happy.\nThis marker had two very concrete uses:\nit made the difference between snapshots of different states visible;\nit provided a reference to reread after the destructive cycles.\nIt\u0026rsquo;s a small detail, but it represents well the kind of approach I prefer in validations: avoid purely syntactic tests and introduce at least one readable functional signal that lets me say \u0026ldquo;this is really the same logical Vault that I expected to recover\u0026rdquo;.\nThe most interesting discovery: two logically identical snapshots, but different as files # The most technically instructive moment of the session came when I verified the behavior of the \u0026ldquo;unchanged snapshot\u0026rdquo; case. The initial contract provided for intuitive logic: if the hash of the current snapshot file is equal to that of the latest remote snapshot, the upload can be skipped.\nOn paper it is reasonable. In practice, the assumption behind it turned out to be false.\nI ran a determinism check by saving two consecutive snapshots without modifying the logical state of Vault. I expected identical files. Instead I got:\nsame logical content detected by vault operator raft snapshot inspect;\nsame Raft index;\ndifferent file hashes.\nThis is a very important difference. It means that the binary snapshot file incorporates enough variability that its hash cannot be used as a reliable criterion to say \u0026ldquo;the logical state has remained the same\u0026rdquo;. 
If I had left the system like this, every run would have continued to upload new snapshots even in the absence of real modifications.\nThe correction I introduced was simple in concept but very important in result: I separated file integrity and logical equivalence.\nsnapshot_sha256 continues to describe the precise file uploaded to S3; snapshot_compare_fingerprint is calculated from the output of vault operator raft snapshot inspect and is used to figure out if the logical state has actually changed. After this change, the test that would previously have produced a false positive of \u0026ldquo;changed snapshot\u0026rdquo; finally returned the correct behavior: upload-skipped. For me this is one of the most successful points of the entire session, because it is a perfect example of how a real test can improve the design without destroying it. The overall model was not wrong. It just needed a comparison more suited to the real semantics of Vault.\nThe final turning point: really closing the T1 + H0 + S1 restore # After validating the backup path, I moved to the most delicate part: the T1 + H0 + S1 case, that is, TazPod coherent, local host empty, S3 coherent. In practical terms: the node is destroyed, but the controller still possesses the canonical bootstrap set and S3 possesses a coherent restore candidate. This is the heart of disaster recovery for phase C2.\nWhen I wrote the first version of this article, that branch was not yet fully closed. The destructive tests had already shown that the restore was selected correctly, that the lineage was resolved the right way, and that the system got very far in the reconstruction. However, there remained two real defects that prevented declaring the matrix green.\nThe first was a problem of remote state classification. In some missing-object conditions on S3, the code did not correctly preserve the distinction between empty and incoherent. 
The result was subtle but important: a missing lineage-local pointer could be treated in the wrong branch. The fix was small as a shell modification, but large as an operational consequence: I corrected the capture of the exit code and made the reading of S3 404s reliable in both the restore path and the backup path.\nThe second was the truly decisive problem: after the restore, the node still did not rebuild its host-side local-unseal path completely autonomously. In practice, Vault could be brought back up to the correct state, but the two local unseal shares were not always rehydrated and the oneshot unseal service could conclude too early during the window in which the container was not yet at the right point of the post-restore bootstrap.\nHere the useful work was not \u0026ldquo;adding random retries\u0026rdquo;, but respecting the already defined C1/C2 contract:\nthe C2 restore now explicitly rehydrates on the host node unseal-share-1 and unseal-share-2 starting from the canonical set kept in TazPod; the logic of vault-local-unseal.sh now better distinguishes the case in which Vault is not yet initialized but the unseal material already exists locally, avoiding declaring success too early; the convergence playbook no longer just relies on the systemd oneshot: it explicitly relaunches the local unseal helper after the restore, so the final state check actually happens after the reconstruction of the unseal path. After these fixes, the T1 + H0 + S1 branch passed all the way through. The node is destroyed, recreated, Vault is restored from the correct candidate on S3, the local receipt is updated, the host-side unseal shares return, the local-unseal resumes correctly, and the final Vault returns initialized=true and sealed=false without manual reconciliation.\nThe most important signal, however, remained the same: after the destructive part and the complete restore, the expected logical content was still there. 
I was not getting an \u0026ldquo;alive but new\u0026rdquo; Vault; I was really recovering the logical instance I wanted to bring back online.\nFrom half-victory to a complete green matrix # The difference between a promising session and a closed session lies entirely here: at a certain point you stop saying \u0026ldquo;the model looks right\u0026rdquo; and you start being able to say \u0026ldquo;the designed matrix actually passed\u0026rdquo;. That is exactly what happened in the final transition of C2.\nAfter the first implementation block and the first live tests, I already had strong evidence on backup, pointers, repair, and semantic comparison of snapshots. The final work transformed those partial proofs into a complete set of scenarios executed one by one.\nIn practice, all the cases designed for T7 were closed.\nTo read the matrix quickly: T indicates the state of the canonical set in TazPod, H the state of the local host/Vault node, S the state of the remote layer on S3. The suffix 0 means empty, 1 means coherent, 2 means incoherent. 
So T0 = empty TazPod, T1 = coherent TazPod, T2 = incoherent TazPod; H0 = empty local host/Vault, H1 = coherent local host/Vault, H2 = incoherent local host/Vault; S0 = empty S3, S1 = coherent S3, S2 = incoherent S3.\nT0 + H0 + S0 -\u0026gt; fresh init allowed;\nT0 + H0 + S1 -\u0026gt; hard fail because the canonical anchor in TazPod is missing;\nT1 + H0 + S1 -\u0026gt; restore succeeded during create.sh;\nT1 + H0 + S0 -\u0026gt; hard fail, no fake restore;\nT1 + H0 + S2 -\u0026gt; hard fail;\nT1 + H1 + S0 -\u0026gt; backup correctly initializes the remote layer;\nT1 + H1 + S2 -\u0026gt; backup correctly repairs the remote layer from coherent local truth;\nunchanged run -\u0026gt; real upload-skipped;\nfirst valid backup into remote-empty lineage -\u0026gt; write to slot-a + lineage-local pointer;\nchanged run on coherent lineage -\u0026gt; switch to the inactive slot;\nmissing pointer with slots still present -\u0026gt; restore hard-fail and subsequent repair via backup;\nmetadata mismatch -\u0026gt; explicit hard fail;\nincoherent TazPod -\u0026gt; hard fail;\nincoherent local host -\u0026gt; hard fail.\nThis step is also important conceptually. As long as a matrix remains partially open, the system is still \u0026ldquo;promising\u0026rdquo;. But when you have also covered the ugly cases (missing pointer, corrupted metadata, lineage mismatch, incoherent local state), the system stops being just convincing in a demo and begins to become credible in operation.\nThe most beautiful result: the final unexpected events confirmed the design, they didn\u0026rsquo;t demolish it # Paradoxically, the problems that emerged in the last part are the best proof that the days of design really served a purpose.\nIf the model had been fragile, these last tests would have forced a reworking of the general strategy: perhaps changing the structure of pointers, changing the relationship between TazPod and S3, or rewriting the semantics of empty/incoherent cases. 
That didn\u0026rsquo;t happen. The problems turned out to be exactly the kind I hoped to encounter in a well-prepared session: local, readable, confined problems.\na bug in the shell exit code capture;\na precise problem in the reconstruction of the host-side material after restore;\noverly optimistic timing in the post-restore local-unseal.\nThese are real problems, but they are not architectural problems. And this, for me, is the difference between a chaotic session and an engineerable session.\nThe final state left deliberately healthy and coherent # At the end of the work I did not leave behind a machine \u0026ldquo;good enough to stop testing\u0026rdquo;. I explicitly reconciled the environment to a clean, coherent, and reusable final state.\nThe final state is this:\nHetzner VM active and reachable;\nlushycorp-vault.service active;\nvault-local-unseal.service active;\nVault initialized and unsealed;\nTazPod coherent with the canonical artifact set;\nS3 coherent with valid global pointer and lineage-local pointer;\nbackup timer active;\nmain logs present and available;\nfinal canonical lineage realigned to d91c4d14-30a6-4518-b162-d1c1a1b9c069.\nThere is a detail that I consider important to state openly: during the fresh init tests, a new temporary lineage was also generated. It would have been easy to consider it \u0026ldquo;lab noise\u0026rdquo; and ignore it. But serious work means precisely not leaving noise around. At the end of the session that temporary lineage was not left as operational state: the final runtime was brought back into coherence with the original canonical lineage, on the host as well as in TazPod and S3.\nThis choice is intentional. In a context that aims to behave in an increasingly enterprise-grade manner, even the end of the session is part of the work. It is not enough to show that the test passes. 
The system must be left in a condition that is understandable and operable by the next session.\nPost-lab reflections, now that C2 is truly closed # If I had to summarize this stage in a single sentence, today I would formulate it like this: design has shifted the difficulty from \u0026ldquo;understanding what to build\u0026rdquo; to \u0026ldquo;precisely closing the final real details until the whole matrix passes\u0026rdquo;.\nIt is exactly the kind of result I wanted to obtain with CRISP. Not because coding must become trivial, but because coding should be the last phase of a chain of already mature decisions. In this session the result was seen very concretely: few truly unexpected problems, all readable from the logs, almost all confined, no collapse of the architectural system, and no really destructive surprise emerging from nowhere.\nThe difference compared to the first draft of this article is that now I no longer have to stop and say \u0026ldquo;the recovery is not yet fully closed\u0026rdquo;. I can say something stronger and more useful: the remote backup is real, the comparison of snapshots was corrected based on the actual behavior of Vault, the destructive restore was closed, the hard-fail cases were verified, the runtime remains coherent across TazPod, host, and S3, and phase C2 can be considered concluded.\nThis, for a secrets platform on a single Hetzner node built with Podman, systemd, Tailscale, Ansible, and S3, is a very significant result. Not because it is \u0026ldquo;perfect\u0026rdquo; in an absolute sense, but because it has reached that rare point where design, implementation, destructive tests, and final operational state finally tell the same story.\nDesigning without rushing did not eliminate work. It made it proportionate. 
And, above all, it made the implementation linear enough to make something seem natural that, without those days of design, would probably have degenerated into many more hours of chaotic debugging.\nFor this phase, it is a great place to stop: with a live Vault, credible remote durability, a truly closed recovery, a complete matrix brought to green, and the confirmation that the time spent thinking before coding continues to be the most profitable investment of the entire laboratory.\n","date":"12 April 2026","externalUrl":null,"permalink":"/posts/terraforming-the-cloud-provisioning-configuring-vault-hetzner-terraform-ansible/","section":"Posts","summary":"","title":"Terraforming the Cloud: Provisioning and Configuring Vault on Hetzner via Terraform and Ansible","type":"posts"},{"content":"","date":"12 April 2026","externalUrl":null,"permalink":"/tags/vault/","section":"Tags","summary":"","title":"Vault","type":"tags"},{"content":"","date":"10 April 2026","externalUrl":null,"permalink":"/tags/agents/","section":"Tags","summary":"","title":"Agents","type":"tags"},{"content":"","date":"10 April 2026","externalUrl":null,"permalink":"/tags/ai/","section":"Tags","summary":"","title":"Ai","type":"tags"},{"content":"","date":"10 April 2026","externalUrl":null,"permalink":"/categories/ai/","section":"Categories","summary":"","title":"AI","type":"categories"},{"content":"","date":"10 April 2026","externalUrl":null,"permalink":"/tags/context-management/","section":"Tags","summary":"","title":"Context-Management","type":"tags"},{"content":"","date":"10 April 2026","externalUrl":null,"permalink":"/tags/memory/","section":"Tags","summary":"","title":"Memory","type":"tags"},{"content":"","date":"10 April 2026","externalUrl":null,"permalink":"/categories/productivity/","section":"Categories","summary":"","title":"Productivity","type":"categories"},{"content":"","date":"10 April 
2026","externalUrl":null,"permalink":"/tags/productivity/","section":"Tags","summary":"","title":"Productivity","type":"tags"},{"content":" Recursive memory, compact context: the missing piece for working well with AI agents # The problem after solving the problem # A few weeks ago I had written an article about AGENTS.ctx, the context system I use to work with AI agents without having to explain everything again from scratch every time I open a new session. The basic idea was simple: instead of opening an empty chat and manually reinjecting rules, project structure, operating conventions, and the general state of the work, I organized everything into contexts that can be loaded on demand. The agent does not change: the context I make it read does.\nThat solution worked really well. Not in theory, but in day-to-day work. I open the tazpod context, and the agent immediately knows how the CLI is structured, which paths are critical, how to push to GitHub, and where the operational risks are. I open blog-writer, and the agent already knows how to plan an article, how to move to writing, and when to stop for review. I open crisp, and the whole rhythm changes: no implementation, only research and design.\nThe point, however, is that once the operational bootstrap was solved, a second problem emerged. More subtle than the first one, but just as important. Contexts explain how to work in a certain domain. They do not always explain where we are today, what happened in the latest sessions, which debts are still open, what the current truth of the system is. For work that lasts weeks or months, that distinction matters a lot.\nIn other words: contexts had given me the frame. 
I was still missing an active memory, continuously updated, compact enough to always stay within reach but rich enough to let an agent resume immediately from the right point.\nWhy “having more memory” was not enough # When working with LLMs and coding agents, the first reaction to continuity problems is often instinctive: keep as much as possible. Longer transcripts, more notes, more files, more logs, more summaries. On paper it looks like a good idea. In practice, it almost never is.\nThe problem is not only quantitative. It is structural. If active memory grows without control, sooner or later it stops being a tool and becomes noise. An agent that has to read too much material before becoming operational is not really aligned: it is simply overloaded. The bootstrap cost starts rising again, only in a different form. I am no longer manually re-explaining things, but I am forcing the system to digest an increasingly large block of heterogeneous context.\nThis is the same problem I had already tried to avoid when designing AGENTS.ctx: do not load everything, load only what is necessary. That same philosophy had to be extended to memory as well.\nSo the question was not: how do I give the agent more memory? The correct question was: how do I build a memory that stays operational, readable, dense, and does not degrade as the project moves forward?\nThe next step: an active, recursive memory that always stays small # The solution I built is simply called memory, and it has become the active continuity layer inside AGENTS.ctx. It is not an external database. It is not an opaque system. It is not some magic feature of the model. Once again it is a structure of simple files, versioned in Git, readable by any agent that can read Markdown.\nBut the difference from static contexts is fundamental: memory is not just a set of rules. 
It is a living memory that changes over time and keeps track of the present in a disciplined way.\nThe current model is organized like this:\nsystem-state.md debts.md past-summary.md chronicle.md past/ scripts/archive-memory.sh At first glance this may look like just another documentation folder. In reality it is a system of very precise roles, and the precision of those roles is exactly what makes it useful.\nThe role of the files: not an accumulation, but a hierarchy # The first important decision was to avoid the single omnivorous file. If everything ends up in the same document, the distinction between current state, chronology, technical debt, and historical summary breaks within a few sessions. At the beginning it may seem convenient. After a few days it becomes unmanageable.\nThat is why each file has a strict responsibility.\nsystem-state.md: the truth of today # This file exists to answer one question only: what is true now? Not what was true three days ago, not what was decided in a design discussion, not the full detail of the troubleshooting. Only the current operating doctrine.\nThis is where I put the things that, if I open a new agent, I want it to know immediately without digging:\nwhat the current TazPod model is, which important caveats exist, how the CI pipeline works right now, what role the various system layers play, which components are considered sources of truth. The purpose of system-state.md is not to narrate the past. It is to condense the present.\ndebts.md: what is not solved # At a certain point I also split technical debt into its own file. This was an important correction that emerged from real use. At the beginning it is very easy to mix open problems into chronology or into state. But those are different things.\nA technical debt is not just an event from the past. It is an active tension still present in the system. It needs to stay visible in structured form. 
That is why debts.md became the canonical register of everything that is open, in progress, deferred, or to be closed later.\nIn practice, when I reopen a session, I do not only want to know what has been done. I want to see immediately what is still missing, where the risks are, which problems I cannot afford to forget.\nchronicle.md: recent causal continuity # This is the diary of the current cycle. It is not a compressed historical summary, and it is not doctrine. It is the narrative layer that keeps together cause and effect from the latest sessions.\nThis is where the answer lives to the question: how did we get to where we are?\nThat distinction matters. State alone is not enough. If I open an agent and tell it that today the system works in a certain way, the “why” is often missing. And without the why, it becomes much easier to break something in the next session.\nChronology exists for exactly that reason: preserving the recent chain of events, problems, fixes, and consequences.\npast-summary.md: compression of the useful past # This is probably the most interesting piece in the entire system. Because the problem is not only keeping a recent memory. It is doing that without losing the past and without loading all of it every time.\npast-summary.md is the compressed layer of older history. It does not go into every detail. It should not. Its job is to provide high-density historical orientation: major decisions, architectural pivots, motivations that still matter today.\nIt is long-term memory, but still immediately usable.\npast/: the recursive archive # And here is the part that, for me, makes the system genuinely interesting. The past is not deleted. It is archived recursively.\nWhen the active cycle crosses a threshold, the system does not just “clean up.” It moves the root files into past/, preserves the structure, and then regenerates a clean active root. 
If in the future I need to reconstruct what happened more deeply, I can dig back one layer at a time.\nThis is a very important feature of the model. What I have immediately in hand is only what I need now. But if I need to understand what happened in more depth, I can do so by moving backward gradually, without turning the active root into an infinite archive.\nWhy recursion here is not a theoretical flourish # Saying “recursive” can sound like the kind of technical word used to make something look elegant. In reality, here it is a very concrete choice.\nIn a traditional linear memory there are usually two common outcomes:\neither the file grows without control, or the past gets cut brutally and continuity is lost. I wanted to avoid both. Recursion lets me preserve everything without having to keep everything loaded on the same operational plane. It is a subtle but decisive difference.\nThe active layer stays compact. The past does not disappear. It is simply moved into a less immediate historical layer. If I need it, I reopen it. If I do not, it does not pollute the current read.\nThis approach also has a very practical effect on the way I work: when I reopen an agent, it does not have to walk through weeks of history to understand what to do. It reads the active root, and only if the problem requires it does it go deeper.\nIt is stratified memory, not bloated memory.\nThe archive model: keeping memory alive without letting it explode # To make this model sustainable it was not enough to split files. I also needed a disciplined mechanism to prevent the active root from growing forever. That is why I introduced an explicit archive flow managed by scripts/archive-memory.sh.\nThe logic is this:\nwhen active chronology crosses a threshold, or when an important milestone is completed, or when I decide to force an archive, the current cycle gets archived, the root memory gets regenerated, and work starts again from a compact base. 
This part mattered not only as an idea, but as a real implementation. I was not interested in having an elegant model on paper. I wanted to verify that, when memory actually had to be archived, the behavior was clean: files moved correctly, summary regenerated, new chronology minimal, no useless logs polluting the workspace.\nThat is why I also executed a real forced archive, not just a theoretical review. It was an important step, because it turned the design into verified behavior.\nThe system was not born perfect, and that is a good thing # One of the things I find most interesting about this work is that the system did not emerge as a single “brilliant” solution. It was corrected as I used it.\nThe first version of the model was already useful, but its boundaries were still too soft. Some responsibilities were mixed together. Some historical information ended up in the wrong place. The risk was the classic one of documentation systems: they start organized and gradually fall back into chaos.\nThe most important corrections came precisely from real use:\ntechnical debt was split into debts.md, chronology was cleaned up until it truly became chronology, state was narrowed down to “truth of today,” past-summary.md became a first-class artifact, the archive contract was made more precise, validation was done with a full cycle, not just with common sense. For me, this is a sign of system health. A useful workflow is not the one that looks perfect in its first sketch. It is the one that survives contact with real use and survives revisions without losing coherence.\nMemory and Mnemosyne: two memories, two different uses # There is a distinction here that I think is worth making very explicit, because from the outside it could look redundant.\nI already have Mnemosyne, which I use as historical memory and semantic retrieval. 
So why build memory as well?\nBecause they serve two different things at two different moments.\nMnemosyne: search by meaning # Mnemosyne is useful when I do not remember exactly when something happened, but I remember what it was about. For example:\n“when did we already see a similar CI problem?” “had we already discussed that Gemini quota issue?” “in which session did that Hetzner trade-off emerge?” That is semantic search. It is extremely powerful, but it does not replace active context.\nMemory: know where we are now # memory is useful instead when I open the terminal and want to resume immediately. I do not want to search semantically through the past. I want to know:\nwhat is true today, what is still open, what happened recently, what the active baseline is from which to continue. It is chronological and operational memory. It is not retrieval. It is continuity of work.\nThat is why I consider memory and Mnemosyne complementary, not competing. One is for resuming. The other is for finding again.\nWhat this actually changes in day-to-day work # The most interesting part is not the architecture itself, but the change in rhythm it creates.\nWhen a workflow like this works, something very simple and very hard to achieve happens: I reopen an agent, and it already knows where we were. Not in a vague sense. Not in the sense that “maybe it remembers something.” It really knows what stage of the journey we are in.\nThis drastically reduces the cost of reopening sessions. And it also reduces another form of friction that, at first, I had underestimated: the mental energy spent reconstructing context. Every time I have to stop and manually summarize the state of a project, I am wasting part of my attention on an administrative task instead of on the actual technical problem.\nWith memory, that reconstruction is already there. 
And above all, it is there in a form small enough to remain useful.\nThe future direction: simplify even further # There is also an architectural consequence that I am seeing more clearly now that the system has started working well: the final model should probably become even simpler.\nMy long-term goal is to have essentially two layers only:\nmemory as chronological, current, recursive memory, Mnemosyne as semantic memory and search. Everything else should progressively converge there. Not because the intermediate history was not useful, but because a model that can be explained in two sentences is almost always more robust than one with too many transitional layers.\nThat does not mean throwing away the past. It means absorbing it into a clearer structure.\nPost-lab reflections # If I had to summarize the meaning of this work in one sentence, I would put it this way: after building contexts, I needed a way to never truly start from zero again. Recursive memory was that missing piece.\nI did not want infinite memory. I wanted memory that is always ready. I did not want an accumulation of notes. I wanted a system that clearly separates state, debts, chronology, and compressed past. I did not want to lose history. I wanted to be able to dig into it only when necessary.\nThe result, at least so far, is very convincing. Contexts continue to do their job: they provide rules, structure, addressing, and operational specialization. Memory adds the temporal continuity that was missing. Mnemosyne remains the semantic retrieval layer, useful when I need to search the past by meaning.\nThey are different problems, but two of them are now converging toward a cleaner form: an active, recursive memory with controlled size, and a semantic memory for historical search.\nFor the way I work, alone, with different agents and long-running projects, this is not an organizational detail. It is a change in the quality of the workflow. 
I open the terminal, reopen the agent, and the system already knows where to resume from. Not everything. Only what is needed. And that is exactly the point.\n","date":"10 April 2026","externalUrl":null,"permalink":"/posts/recursive-memory-compact-context-ai-agents/","section":"Posts","summary":"","title":"Recursive memory, compact context: the missing piece for working well with AI agents","type":"posts"},{"content":"","date":"10 April 2026","externalUrl":null,"permalink":"/tags/workflow/","section":"Tags","summary":"","title":"Workflow","type":"tags"},{"content":" A quieter infrastructure session than usual: when design reduces chaos # Objective of the session # This stage of the project had a very precise goal: to close the foundation step of the Hetzner pipeline, meaning to arrive at a runtime machine capable of being born from a golden image, being bootstrapped over public SSH, joining Tailscale correctly, moving the operational plane onto the private channel, and proving that it can converge repeatably even on the second run.\nPut more concretely, the target was the hetzner-tailscale-foundation project: not Vault yet, not the full lifecycle of the application service yet, but the first real operational layer on which everything else can rest. If this base is not clean, every subsequent phase becomes noisy: when something breaks, I can no longer tell whether the problem is in provisioning, networking, secrets, the runtime, or the final service. Closing the foundation properly means removing ambiguity from the rest of the journey.\nWhat is interesting is that this session was relatively calm. Not perfect, but orderly. There were a couple of real issues, also instructive ones, but it was not a marathon of chaos. And this is exactly the point I want to fix in place: it did not go well because the problem was trivial. 
It went well because the hardest work had already been done before implementation.\nThe invisible part that made the smoothness visible # If I looked only at the final execution, I could describe it like this: I launched the build, corrected a few real integration details, validated the transition onto Tailscale, verified idempotent rerun behavior, and closed with destroy.sh. That would be a correct account, but an incomplete one. The decisive point is that this build did not come from a “build me this infrastructure” prompt thrown at an LLM in the hope that everything would assemble itself elegantly.\nBefore getting here, there had already been hours of discussion, redefinition of the problem, clarification of constraints, review of TazPod’s real behavior, correction of wrong assumptions, and above all one fundamental choice: splitting the project into smaller parts. First the golden image, then the foundation, only after that the Vault convergence. This decomposition drastically reduced the number of active variables in each phase.\nThis is where the CRISP methodology and the following step into crisp-build had real value. Not so much as a methodological label, but as discipline. I used one context to design, discuss, correct the plan, and lock the contracts. Only after that did I open the implementation worksite. The practical benefit was enormous: when something did not match, the deviation was readable. I did not have to investigate one giant blob of provisioning+runtime+network+Vault all at once, but a single step of the system.\nWhy splitting into subprojects really changes the outcome # This is probably the strongest lesson of the session. If I had tried to do golden image, foundation, Tailscale bootstrap, secrets, and the first Vault lifecycle all at the same time, I would have produced a classic domino effect. Every error would have dirtied all the higher layers, making troubleshooting ambiguous. 
A VM that did not respond could have meant a broken image, a wrong ACL, a non-idempotent playbook, a bad bootstrap token, an incoherent Tailscale policy, or simple local network instability.\nBy splitting the journey into multiple steps instead, I got the opposite effect. The golden image had already been closed and validated as an independent gate. That meant that during the foundation phase I could treat the base runtime as reliable, and focus only on provisioning, network bootstrap, and operational convergence. In engineering terms, this is a huge reduction in diagnostic surface. It is not just “project management”: it is a concrete reduction of technical entropy.\nIt is the difference between launching an operation with ten open hypotheses and launching one where seven hypotheses have already been closed beforehand. Then, when real problems appear, as they did here, their nature is much more readable. And that is exactly what happened.\nThe actual implementation: building a clean and verifiable foundation # The implementation work was concentrated in the new workspace under ephemeral-castle/runtimes/lushycorp-vault/hetzner/. I built there all the pieces needed for the foundation:\nTerraform layer for the VM, bootstrap firewall, and local outputs, Ansible baseline for runtime verification, Ansible role for Tailscale bootstrap, create.sh and destroy.sh with separate logs per phase, helper scripts for inventory generation and tag validation, an explicit source of truth for the approved golden image. This choice has a precise meaning: the project could not depend on implicit memory or on IDs remembered out loud. If the approved image is lushycorp-vault-base-20260404-v4 with ID 373384231, that information has to live in a file actually consumed by the scripts, not in a mental note or in a sentence lost inside a design document.\nThe core of the foundation provisioning was intentionally simple. 
Terraform creates a VM from the approved golden image, opens only the minimum required in the cloud firewall for phase A, and generates the public inventory used for the first bootstrap. Ansible enters over public SSH, verifies the user model, installs or checks the required components, and brings the node into Tailscale. At that point, the system must be able to move onto the private plane and continue operating there.\nA very representative example of the Terraform layer is this one:\nresource \"hcloud_server\" \"foundation\" { name = var.server_name server_type = var.server_type image = var.image_id location = var.location ssh_keys = [var.ssh_key_name] firewall_ids = [ hcloud_firewall.foundation_bootstrap.id, ] public_net { ipv4_enabled = true ipv6_enabled = true } labels = merge(local.foundation_labels, { image_name = var.image_name image_id = var.image_id }) } Here the meaning of the step is clear: no infrastructure fantasy, no opaque layers, just the minimum necessary to generate a machine that is coherent and traceable, with outputs useful to the following steps.\nThe problems that emerged were “good” problems # The first interesting point is that the problems that emerged never called the overall architecture into question. That does not mean they were trivial. It means they were real integration problems, not signs of a flawed design.\nThe first serious error appeared at the moment of tailscale up on the runtime VM. The machine had been created correctly, initial access over SSH worked, the Tailscale daemon was installed correctly, but the join failed with a very precise message: the requested tags were invalid or not permitted. This is exactly the kind of problem a live build is supposed to surface. The design correctly said that the node had to join with tag:tazlab-vault and tag:vault-api. 
The reality of the Tailscale control plane said instead that the bootstrap OAuth client still did not have the right ownership model to assign them.\nThis diagnosis was important because it showed the quality of the plan: I did not have to rethink the whole foundation, I had to correct a real contract between the bootstrap client and the tailnet policy. I updated the ACL source of truth and the OAuth client definition in ephemeral-castle/tailscale/, applying the fix directly to the real tailnet. It is an apparently small detail, but it says a lot: the project needed a control-plane alignment, not a pipeline rewrite.\nThe second problem: the node was online, but SSH over Tailscale still failed # After correcting the tag problem, the runtime did in fact join Tailscale and showed the expected tags. It looked like everything was resolved. But the next step, Ansible over Tailscale, kept failing. This was the most instructive point of the entire session, because on paper the node was healthy:\ntailscale ping replied, the node appeared in the tailnet, the tags were correct, sshd was active, the tailscale0 interface had its IP. And yet SSH to the 100.x address timed out.\nHere the difference between chaos and readable investigation showed up again. The fact that the peer was alive while the application transport was not told me that I was not looking at a global Tailscale failure. There were two possibilities: an incomplete ACL on the control-plane side, or a peculiarity on the operator side. In reality, both were true.\nOn one side, port 22 was missing from the ACL path tag:tazpod -> tag:tazlab-vault. That was a real policy error and had to be fixed on the tailnet. On the other side, there was an even more interesting aspect: in my local operator environment Tailscale runs in userspace-networking mode, so tailscale ping can work perfectly even if the host system has no direct kernel routing toward 100.x addresses.\nThis distinction is very important. 
A superficial use of the tools could easily have led to the wrong conclusion: “Tailscale is up, so SSH to the 100.x address should work.” But no. In userspace mode the mesh is healthy, but the TCP path from the host system may still require an explicit bridge.\nThe final correction was small, but highly instructive # The solution was not to force the local system to behave like a node with full kernel routing, but to adapt the transport switch to the real context. I therefore changed the Tailscale inventory generation so that it used tailscale nc as the SSH ProxyCommand. This way Ansible no longer depends on whether my local host can directly reach the 100.x address at the traditional network stack level: it uses the userspace channel provided by the local Tailscale daemon.\nIt is a small fix, but from a design point of view it is excellent, because it makes the system more robust with respect to the real operator environment. I am not writing a foundation that works only in the ideal lab; I am closing a foundation that works in the concrete context in which I am using it today.\nThe key part of the generated inventory became this:\n[foundation_tailscale] foundation-node ansible_host=100.83.183.124 ansible_user=admin ansible_ssh_private_key_file=/home/tazpod/secrets/ssh/lushycorp-vault/id_ed25519 ansible_ssh_common_args='-o ProxyCommand=\"tailscale nc %h %p\" -o StrictHostKeyChecking=accept-new' This line carries a broader lesson: real systems do not always fail on the big concepts. They often fail at the contact points between a well-designed project and an operating environment with specific characteristics. The difference lies in the ability to read the problem without generalizing too quickly.\nThe final result: create, rerun, destroy # Once those details were corrected, the project closed its objective exactly as expected. create.sh reached the end successfully. 
The VM was born from the golden image, passed the public bootstrap, joined the tailnet as lushycorp-vault-foundation, showed the correct tags, answered on the Tailscale path, and executed the baseline check via Ansible on the private channel. Even the podman --version verification passed without surprises.\nEven more importantly, the rerun confirmed the sanity of the plan. Terraform went to no-op, Tailscale did not require unnecessary mutations, and the system showed the behavior I expected from a well-designed foundation: not only “it works once,” but it converges when I launch it again.\nFinally, I also executed destroy.sh and verified local cleanup. This step is essential for me. An infrastructure-as-code project is not really closed when it creates a machine: it is closed when it also knows how to remove it cleanly, leaving the workspace readable and ready for the next cycle. That is where you see whether the pipeline is just a demo or a process you can reopen.\nThe lesson about LLMs is the most important part of this post # All of this confirms in a very concrete way something I had already sensed and written before: LLMs are powerful, but the result does not depend only on their generative capacity. It depends enormously on how they are guided.\nIf the approach is “build me this infrastructure” and then I wait for a well-made system to emerge from a generic request, the risk is extremely high. I may get plausible output, but fragile, poorly aligned with the real context, or built on unchecked assumptions. A language model can produce an impressive amount of useful material, but it does not automatically replace the work of clarifying the problem.\nWhat this session shows is almost the opposite: when the operator knows what is being built, knows how to split the problem, knows how to identify the real constraints, and uses the LLM inside a disciplined structure, the multiplier changes scale. 
It is no longer a text generator trying to improvise an infrastructure. It becomes an accelerator for the engineering capacity of whoever is guiding it.\nThe most honest formula I take away is this: the more the person using the LLM understands the domain, the more the multiplier rises. If understanding is weak, the model amplifies ambiguity. If understanding is strong, the model amplifies speed, breadth of exploration, and implementation quality.\nPost-lab reflections # This stage is not memorable because “there were no problems.” There were problems, and that is exactly how it should be. It is memorable because the problems were the right kind: small, real, readable, and correctable without demolishing the project. That is the best possible signal for a foundation.\nThe most satisfying thing, at this stage, is not that I brought up a VM on Hetzner with Tailscale. It is that I verified that the combination of upfront design, work decomposition, and guided use of an LLM produces a much smoother execution than the one I would have obtained with a more impulsive or monolithic approach.\nIn the end, that is exactly the point of this session: less improvisation during the build, more intelligence before the build. 
And when that happens, even a complex infrastructure step can finally become a quieter session than usual.\n","date":"9 April 2026","externalUrl":null,"permalink":"/posts/infrastructure-session-when-design-reduces-chaos/","section":"Posts","summary":"","title":"A quieter infrastructure session than usual: when design reduces chaos","type":"posts"},{"content":"","date":"9 April 2026","externalUrl":null,"permalink":"/tags/automation/","section":"Tags","summary":"","title":"Automation","type":"tags"},{"content":"","date":"9 April 2026","externalUrl":null,"permalink":"/tags/llm/","section":"Tags","summary":"","title":"Llm","type":"tags"},{"content":"","date":"9 April 2026","externalUrl":null,"permalink":"/tags/terraform/","section":"Tags","summary":"","title":"Terraform","type":"tags"},{"content":" Objective of the session # The goal was simple to describe but not trivial to close properly: to arrive at a stable, reusable runtime golden image on Hetzner, ready to be consumed in the next foundation phase.\nIn practical terms, I wanted to eliminate heavy runtime bootstrap and move the work into build-time: prepare a builder VM, configure it, validate it, freeze it into a snapshot, then verify that machines born from that snapshot are coherent and predictable.\nAt the method level, I imposed a clear operational rule: not to stop at the “first time it seems to work,” but to close the full cycle all the way to a final version verified on fresh instances. This led to multiple iterations (v1 → v4), but that was the necessary step to turn a local result into a reliable artifact.\nWhy a golden image before the foundation # When building an infrastructure foundation, mixing provisioning, package installation, hardening, and application bootstrap at the same time creates a domino effect that is difficult to diagnose. 
If something fails, it is never immediately clear whether the problem is:\nin the network layer, in the access layer, in the runtime layer, or in a race condition during bootstrap. The golden image separates responsibilities:\nBuild-time: I prepare the base runtime once, in a repeatable way. Deploy-time: I instantiate and converge the network/foundation with fewer variables in play. This approach reduces the error surface and makes troubleshooting more readable. It is not just an “elegant” choice: it is a practical choice when I want to deliver a pipeline that holds up in future sessions, not just in the current demo.\nThe builder profile: economical but sufficient # An explicit constraint of the session was to use the cheapest profile possible, as long as it was adequate:\ncx23 4 GB RAM 40 GB SSD shared CPU This choice was kept across all final iterations. This is important because it avoids building a pipeline that works only on more expensive sizes and then degrades when brought back to realistic profiles.\nIn other words, I wanted to verify behavior within the real economic perimeter of the project, not in a “comfortable” environment.\nFirst real use of Ansible in my flow # This was the first time I used Ansible in a central way in my process, not as a secondary tool. The operational difference was clear: moving from manual actions to repeatable declarative configuration.\nThe playbook covered the runtime baseline with:\npackages required by the runtime, a coherent user model (admin for operations, vault non-interactive), SSH hardening (password authentication disabled), minimal but deterministic system configuration. 
An example of the core of the playbook used during build:\n- name: Configure Hetzner runtime golden image baseline hosts: builder become: true vars: runtime_packages: - podman - python3 - curl - jq - ca-certificates - gnupg - apt-transport-https tasks: - name: Update apt cache ansible.builtin.apt: update_cache: true cache_valid_time: 3600 - name: Install runtime baseline packages ansible.builtin.apt: name: \"{{ runtime_packages }}\" state: present - name: Ensure admin user exists ansible.builtin.user: name: admin shell: /bin/bash groups: sudo append: true create_home: true state: present - name: Ensure vault service user exists (no login) ansible.builtin.user: name: vault shell: /usr/sbin/nologin create_home: false system: true state: present The practical value was not “Ansible itself,” but the fact that every correction entered the playbook instead of remaining a manual workaround forgotten in the next session.\nThe build and validation cycle # The full operational flow was:\ncreate builder VM, apply baseline with Ansible, technical validations, power off builder, snapshot, test on a new VM from the snapshot, cleanup of temporary resources. This cycle was repeated multiple times until all inconsistencies between “the builder works” and “a fresh instance really works” were removed.\nWhy multiple snapshots (v1, v2, v3, v4) # This was the most important point of the session: true stability is not measured on the node I have just configured, but on a new machine born from the artifact.\nEach iteration removed a practical defect that surfaced only when retesting on a fresh instance. In the end, instead of keeping a chain of “almost good” snapshots, I chose a cleaner policy:\npromote only the final valid version, delete intermediate versions, lock a single handoff ID. 
The most concrete defect: different behavior across users # Part of the stabilization was making sure commands were available not only to root but also to the operational user.\nThis kind of problem is typical in image pipelines: installations that look correct but are tied to user-specific paths or shell contexts. The final solution was to make publishing the binary explicit in a system path (/usr/local/bin), so visibility would be uniform for root/admin on snapshot-born instances.\nThe lesson here is straightforward: when validating a golden image, it is not enough to verify “command present.” I also need to verify presence + execution + target user.\nThe part that looked like an image bug but was not # At an advanced stage I saw tests fail with VMs reported as running. The initial suspicion can easily drift toward a corrupted snapshot or incomplete bootstrap. In reality, the problem was different: unstable local connectivity in the working network (mobile hotspot), especially on the IPv6 path during certain time windows.\nThis has a huge practical impact on debugging: I can lose hours changing the image when the actual problem is in the client→server path.\nTo avoid false positives, I separated the two planes:\nimage artifact quality, test channel reliability. From there came the final operational decision for the test pipeline: robust IPv4 by default in mobile contexts, IPv6-only as an explicit mode when the local network is confirmed.\nHardening the test harness # To close the session properly, I did not stop at “the test passed once.” I also consolidated the operational tools.\nI structured three main scripts:\ndynamic inventory generation, end-to-end golden image build, image test with validations and cleanup. A simplified example of the test script intent:\n./scripts/test-image.sh \\ --image-id 373384231 \\ --server-name lv-img-script-test-ipv4-final The key point is that the test does not stop at SSH ping. 
It explicitly verifies the expected binaries for both operational users, and closes the cycle by deleting the test VM at the end.\nFinal result # Final promoted artifact:\nSnapshot name: lushycorp-vault-base-20260404-v4 Image ID: 373384231 Satisfied criteria:\nruntime baseline applied repeatably, validation on a fresh instance, coherent behavior for root/admin, stable test harness with explicit cleanup, no VM left active at the end of the run. What I take away from this stage # This session was not just about “building an image.” It was a stage in the maturation of the process.\nThe things that made the real difference were:\nSeparating build-time and deploy-time to reduce diagnostic noise. Using Ansible as the source of truth for configuration, not as occasional support. Always validating on new instances, not stopping at the builder machine. Distinguishing infrastructure bugs from local network bugs before changing the artifact. Closing with rigorous cleanup to avoid polluting the next cycle. 
From an operational perspective, the golden image pipeline is now in a usable state: not perfect in the abstract, but sufficiently deterministic to become a reliable input for the foundation phase.\nAnd that is exactly the kind of result I was looking for: less “a script that works today,” more “a process I can reopen tomorrow without starting from zero.”\n","date":"7 April 2026","externalUrl":null,"permalink":"/posts/hetzner-runtime-golden-image-final-path/","section":"Posts","summary":"","title":"Golden image runtime on Hetzner: the path to the final version","type":"posts"},{"content":"","date":"7 April 2026","externalUrl":null,"permalink":"/tags/golden-image/","section":"Tags","summary":"","title":"Golden-Image","type":"tags"},{"content":"","date":"7 April 2026","externalUrl":null,"permalink":"/tags/linux/","section":"Tags","summary":"","title":"Linux","type":"tags"},{"content":"","date":"7 April 2026","externalUrl":null,"permalink":"/tags/testing/","section":"Tags","summary":"","title":"Testing","type":"tags"},{"content":" LushyCorp Vault on Hetzner: security-driven architectural choices # This article is not about implementation. It is about design: how I defined the core of the LushyCorp Vault project on Hetzner before splitting it into execution subprojects.\nThe goal was only one: build a Vault runtime that could be born, die, and be reborn without losing security, without depending on fragile manual steps, and without introducing “convenience” secrets in the wrong places.\n1) The Current State: the real problem to solve # The starting point was not “I need a VM with Vault.” That is simple. The real problem was this:\nhow to boot a new machine in the cloud, how to configure it without passwords, how to inject private-network prerequisites, how to initialize Vault the first time, how to reopen it deterministically on subsequent runs, and how to do all of this without leaving secrets in image, user-data, or repositories. 
In other words: I was not designing a server, I was designing a secure lifecycle.\nIf this part is designed poorly, everything else (rotation, governance, private connectivity with the cluster, etc.) is built on weak foundations from day one.\n2) The “Why”: why these choices (and not others) # No secrets in the image # The base image had to contain only software and neutral configuration. No token, no bootstrap key, no cloud credential.\nReason: an image is meant to be cloned. If you place a secret in it, that secret automatically becomes replicable and hard to revoke in an orderly way.\nNo cloud-init/user-data to pass keys # I rejected the “pass everything through user-data” pattern because it is not consistent with the security model I wanted. Cloud metadata is not where I want sensitive credentials to transit.\nIf tomorrow I need to run an audit or incident response, I must be able to state with certainty: secrets never passed through provider metadata.\nInitial access only via key-based SSH, never password # The VM is born with one open port only, SSH, and only with key-based authentication already registered on Hetzner. No password access, no fragile interactive bootstrap.\nThis reduces two surfaces at the same time:\nopportunistic password attacks, dependence on non-repeatable manual steps. The turning point: from a massive SH script to Ansible (the real clarity moment) # A fundamental part of the design was exactly this: at first I was designing everything with a single, very complex SH script. The idea was to make the script do every step: SSH in, check states, apply configuration, inject keys, manage first-run/re-run branches, validate outputs, and perform cleanup.\nOn paper it looked feasible. In practice I was hand-building an idempotent orchestrator with conditional logic, retries, error handling, dependency ordering, and action traceability. 
At some point the question became inevitable: \u0026ldquo;isn’t this exactly the ideal case for Ansible?\u0026rdquo;\nThe answer was yes, with no ambiguity: we were effectively writing a mini-Ansible in Bash. That was the moment I truly understood what Ansible is for in the real world: not to \u0026ldquo;run remote commands,\u0026rdquo; but to give machine convergence a declarative, repeatable, and verifiable shape.\nFor me this was also an important professional step: I had known Ansible in theory for a long time, but I had never had a case where it was this clearly the right tool. In this project its value was obvious because:\nthe flow requires idempotency (first-run and re-run must converge, not diverge), security requires a deterministic configuration (no opaque manual steps), I needed to track \u0026ldquo;what is applied, when, and in which order.\u0026rdquo; In addition, Ansible’s declarative model is consistent with the rest of my stack: the same mindset as Kubernetes and the same traceability discipline typical of GitOps flows. It is not Kubernetes, but it speaks the same operational language: desired state, convergence, verifiability.\nAnsible’s role in the project is therefore precise:\nconfigure the host environment consistently, inject required materials (e.g., Tailscale keys/config) at the correct point in the cycle, keep initial bootstrap separate from subsequent convergence, drastically reduce the drift risk caused by shell scripts that have grown past a manageable size. Without this declarative convergence, security would remain tied to what I remember from the previous session. With Ansible, instead, it becomes part of the system.\nWhy not rely on an external Key Manager (e.g., AWS KMS) at this stage # The most important discussion was this: “let’s use an external key manager and solve it.”\nOn paper it is elegant. 
In practice, in my scenario, authenticating a machine that lives outside the key manager\u0026rsquo;s perimeter would still require local authentication material (secrets/credentials) stored on the machine itself.\nSo the risk point does not disappear: it moves.\nIn this context, storing local credentials to authenticate to KMS and storing the local bootstrap material in an encrypted container have a very similar risk profile, unless the first option is backed by a full enterprise ecosystem that is not available here.\nHence the pragmatic and controllable choice: no forced dependency on an external key manager at this stage, but a deterministic cycle with encrypted artifacts and an explicit recovery path.\n3) The Target Architecture: the complete project, before the execution split # Before splitting it into multiple subprojects, the project was conceived as one end-to-end logical pipeline.\nStep A — Golden image runtime (technical base only) # I create a base image with required software preinstalled. The image is tested. No secrets inside the image. This is the trust base: a machine born ready to converge, but still “neutral” from a secrets perspective.\nStep B — Instance with only SSH port open # I instantiate the VM from the golden image. Open port: only 22. Access: only SSH key registered on Hetzner. No password, no shell bootstrap via cloud metadata.\nStep C — Convergence via Ansible, controlled injection # Once inside via SSH, Ansible prepares the runtime:\nconfigures system and environment, injects required materials for Tailscale, prepares the transition to the private channel. Step D — Switch management plane to Tailscale # After initial convergence:\nthe VM joins the Tailscale network, management moves to the private channel, public SSH is eventually closed (both internet-side and at the cloud firewall), from that point, operations are private. 
This is the key transition: public SSH is only an initial bridge, not a permanent channel.\nStep E — Vault phase: first boot vs reboot # Here the design is explicitly split.\nFirst boot (bootstrap) # Vault is initialized, required keys are generated (unseal/root metadata), artifacts are saved in the encrypted secrets path on S3. Subsequent boots (re-instantiation) # artifacts already exist, Vault is not re-initialized, state is recovered and runtime is reopened deterministically. This prevents the most dangerous risk: accidental “re-init” with loss of operational continuity.\nStep F — Private integration with the cluster # Once the runtime is stabilized on a private network, the cluster also joins the same private communication domain.\nThis is where the project delivers its final value:\nsecrets management, synchronization, rotation, happen on a private network, not over public exposure.\n4) Operational blueprint (scripts defined by the design) # This is the final blueprint. The first idea was a monolithic SH script; after the Ansible turning point, the project was redesigned into a pipeline where scripts orchestrate phases and Ansible handles configuration convergence.\n# 1) build secure base image (no secrets) create-runtime-golden-image.sh # 2) instantiate from golden image with initial SSH create-runtime-instance.sh # 3) host convergence + private-network material injection converge-runtime-with-ansible.sh # 4) switch management to Tailscale and progressively close public SSH switch-to-tailscale-management.sh # 5) Vault first-run bootstrap (init + encrypted artifact save) vault-first-init.sh # 6) re-instantiation reopen path (no re-init) vault-recover-from-secrets.sh # 7) controlled runtime resource cleanup destroy-runtime.sh The critical distinction is not script names, but responsibility boundaries. 
Each script must do one critical thing, with clear logs and verifiable outputs.\n5) Why this comes before implementation into subprojects # Only after defining this complete flow did I choose to split implementation into separate phases. The split was not created to “complicate governance”; it was created to keep the security design intact during execution.\nSo the point is not the number of subprojects. The point is that, at its core, the project remains this:\nclean image base, controlled bootstrap, declarative convergence via Ansible, transition to private management via Tailscale, deterministic Vault first-run/re-run lifecycle, no secrets in the wrong places. Future Outlook: what this architecture actually unlocks # When this design is respected, I gain three strategic properties:\nOperational repeatability\nI can recreate runtime without reinventing the procedure. Structural risk reduction\nsecrets do not transit through improper channels, public exposure is not the permanent operating mode. Vault lifecycle continuity\nfirst boot and subsequent reopens are distinct and controlled paths. 
This is the truly important part of the project: not “standing up Vault,” but building a system that remains secure even when rebuilt from zero.\n","date":"4 April 2026","externalUrl":null,"permalink":"/posts/lushycorp-vault-hetzner-security-architecture/","section":"Posts","summary":"","title":"LushyCorp Vault on Hetzner: security-driven architectural choices","type":"posts"},{"content":"","date":"4 April 2026","externalUrl":null,"permalink":"/categories/security/","section":"Categories","summary":"","title":"Security","type":"categories"},{"content":"","date":"4 April 2026","externalUrl":null,"permalink":"/tags/security/","section":"Tags","summary":"","title":"Security","type":"tags"},{"content":" The Illusion of \u0026ldquo;Free\u0026rdquo; and the Search for Stability # The initial goal was simple and ambitious: a private HashiCorp Vault cluster on Oracle Cloud\u0026rsquo;s \u0026ldquo;Always Free\u0026rdquo; resources in Turin. 4 ARM vCPUs, 24 GB of RAM, and 200 GB of storage, all for free. A paradise for a professional home lab.\nAfter a 24-hour battle, I had to face the harsh reality: when it comes to critical services, \u0026ldquo;free\u0026rdquo; can be very expensive in terms of time and reliability.\nThe name Lushy Corp was born from a typo — I was typing \u0026ldquo;HashiCorp Vault Container\u0026rdquo; and the AI agent read it as \u0026ldquo;LushyCorp\u0026rdquo;. Since that moment, it became the codename for our vault.\nThe OCI Saga: Tilting at Windmills # The Capacity Wall # The first hurdle arrived before I could even do anything. The Ampere (ARM64) instances in OCI\u0026rsquo;s eu-turin-1 datacenter are extremely popular: they offer excellent performance in a free tier. The problem is that demand far exceeds supply.\nI had to implement an aggressive loop in my provisioning script, a create.sh that continuously tried to create instances, often for hours. The Out of host capacity message became my constant companion. 
Oracle simply didn\u0026rsquo;t have resources available exactly when I requested them.\nThis highlighted a conceptual problem: if I have to \u0026ldquo;fight\u0026rdquo; to get a free resource, that time has a cost. And if that time is spent on a 24/7 critical service, the operational risk becomes unacceptable.\nThe Architecture Bug: Instances Spinning but Not Booting # After finally \u0026ldquo;snatching\u0026rdquo; an Ampere instance, I encountered a more subtle issue. The instance reached the RUNNING state without errors, but Talos Linux wouldn\u0026rsquo;t boot. No output on the console, just an instance spinning in a void.\nThe investigation took hours. Eventually, the root cause emerged: the imported ARM64 image was registered with architecture metadata set to None instead of ARM64. OCI accepted the instance, but at boot time the UEFI firmware didn\u0026rsquo;t recognize the architecture and silently froze.\nThe solution was to re-import the image via the OCI CLI, explicitly specifying ARM64:\noci compute image import \\ --compartment-id $COMPARTMENT_ID \\ --image-id $IMAGE_ID \\ --source-image-type QCOW2 \\ --launch-mode PARAVIRTUALIZED \\ --architecture ARM64 An important lesson: on OCI, the image metadata must be correct at import time. It cannot be modified later.\nTerragrunt and Orchestration Issues # When Terragrunt started hanging on caching and credential issues, I had to bypass it completely, moving to direct OCI CLI commands. Furthermore, OCI took an unusually long time to assign private IPs to the VNICs, requiring multiple hardware resets to force synchronization.\nThe prevailing feeling was not satisfaction: it was the realization that I was building on unstable foundations.\nThe Fatal Blow: The Shutdown Policy # The decisive moment came when I realized that OCI\u0026rsquo;s \u0026ldquo;Always Free\u0026rdquo; policies allow for the shutdown of instances deemed \u0026ldquo;idle\u0026rdquo;. 
For a service like Vault, which must always be available, this is an unacceptable risk.\nPicture the scene: it\u0026rsquo;s night, a Kubernetes application needs to access a secret to rotate a certificate, but the Vault has been shut down by Oracle because it\u0026rsquo;s \u0026ldquo;idle\u0026rdquo;. The certificate expires, the app errors out, and you are sleeping. It is exactly the kind of silent failure that a secrets management system must prevent at all costs.\nI decided that Lushy Corp\u0026rsquo;s operational stability was worth more than a few euros saved.\nThe AWS Pivot: The Problem with Spot Instances # Before arriving at Hetzner, I took a detour into the AWS ecosystem. I designed an architecture based on Fargate + EFS + Tailscale + KMS Auto-Unseal, aiming for an estimated cost of around 4€/month.\nIt wasn\u0026rsquo;t the complexity of the infrastructure that stopped me. On the contrary, configuring that environment was an interesting technical stimulus, a great opportunity to learn and dive deep into advanced AWS components. The real issue, once the math was done, was the trade-off between costs and reliability.\nTo stay within that low budget, I would have had to use Fargate Spot instances. However, Spot instances introduce the exact same problem I was running away from on OCI: if AWS needs the compute capacity back, it shuts down your machine. We were back to square one, an unacceptable risk for a Vault.\nTo have a truly solid architecture working properly (using classic On-Demand instances), the expense would have risen to more than double what was initially budgeted. For a project born to learn, test myself, and test technologies in my home lab (where a Vault cluster is already an \u0026ldquo;exaggerated\u0026rdquo; superstructure for the data it actually holds), it simply felt like an unjustified expense.\nThe Final Choice: Hetzner and the Beauty of Simplicity # The choice fell on a dedicated VPS on Hetzner. 
This decision offers the perfect balance for a professional home lab.\n1. Versatility: A Linux VM is not just for Vault. It can host other microservices, a reverse proxy, monitoring tools. The fixed cost of 4-5€/month gets distributed across multiple services over time.\n2. Operational Simplicity: With a pure VM, I have complete and direct control over every component. No opaque managed services, just pure Linux system administration.\n3. Predictable Cost: 4-5€/month guaranteed 24/7. It\u0026rsquo;s the price of peace of mind, without the risks of Spot instances and without billing surprises.\nThe Setup: Podman and Native Tailscale # Instead of hyper-packaged solutions, I\u0026rsquo;m thinking of a more \u0026ldquo;raw\u0026rdquo; and educational approach. I will install Tailscale directly on the operating system of the machine, ensuring secure connectivity at the host level in a clean way.\nAs for the containers, I decided to take the opportunity to use Podman instead of the classic Docker. Not because Docker isn\u0026rsquo;t good (in fact, Podman won\u0026rsquo;t necessarily give me extra features for this basic use case), but purely for the sake of trying it out and learning to use it in a real context. I will run the Vault container on this Podman layer. Since it\u0026rsquo;s a public VPS, having this infrastructure ready will be handy in the future to easily spin up other services.\nConclusion: Failing to Build Better # I \u0026ldquo;nuked\u0026rdquo; the OCI compartment, deleted the AWS Fargate project, but these were not failures. They were necessary steps in a journey that led to a more solid, pragmatic, and mindful architecture.\nEvery pivot taught me something:\nOCI: \u0026ldquo;Free\u0026rdquo; has huge hidden costs in terms of reliability and wasted time. AWS Fargate: \u0026ldquo;Cheap\u0026rdquo; serverless architectures via Spot are not suited for critical always-on services for lab infrastructures. Hetzner: The simplicity of a classic VM is a virtue. 
The era of Lushy Corp begins now, on a solid Linux foundation, ready to manage secrets.\nNext stop: Provisioning.\n","date":"30 March 2026","externalUrl":null,"permalink":"/posts/cloud-free-reality-lushy-corp-hetzner-pivot/","section":"Posts","summary":"","title":"Cloud Free and the Harsh Reality: Lushy Corp's Pivot to Hetzner","type":"posts"},{"content":"","date":"30 March 2026","externalUrl":null,"permalink":"/tags/homelab/","section":"Tags","summary":"","title":"Homelab","type":"tags"},{"content":"","date":"30 March 2026","externalUrl":null,"permalink":"/tags/oci/","section":"Tags","summary":"","title":"OCI","type":"tags"},{"content":"","date":"30 March 2026","externalUrl":null,"permalink":"/tags/vps/","section":"Tags","summary":"","title":"VPS","type":"tags"},{"content":"","date":"24 March 2026","externalUrl":null,"permalink":"/tags/infrastructure-as-code/","section":"Tags","summary":"","title":"Infrastructure-as-Code","type":"tags"},{"content":"","date":"24 March 2026","externalUrl":null,"permalink":"/tags/networking/","section":"Tags","summary":"","title":"Networking","type":"tags"},{"content":"","date":"24 March 2026","externalUrl":null,"permalink":"/tags/oauth/","section":"Tags","summary":"","title":"OAuth","type":"tags"},{"content":" Tailscale: The Secure Backbone of TazLab\u0026rsquo;s Rebirth # Introduction: The Connective Tissue Between Two Worlds # In the journey of rebuilding TazLab that I described in previous articles, we have reached a critical point. We have a plan to resurrect the infrastructure from a single S3 bucket and we have locked down the bootstrap credentials by eliminating them from the disk thanks to TazPod and AWS SSO. 
But there was one element still missing: the \u0026ldquo;invisible thread\u0026rdquo; that allows these components to talk to each other in a secure, private, and provider-agnostic way.\nToday\u0026rsquo;s goal was not just to \u0026ldquo;activate a VPN.\u0026rdquo; The goal was to design and implement the networking foundation of TazLab as a pure Infrastructure-as-Code (IaC) resource. No manual configurations in the Tailscale console, no temporary authentication keys that expire after 90 days forcing manual intervention. I looked for a solution that was eternal, declarative, and integrated into the ephemeral lifecycle of my clusters.\nThe Problem with Pre-auth Keys: A Predicted Technical Debt # The standard way to add nodes to a Tailnet is using Pre-auth Keys. They are convenient for a quick setup, but they present three fundamental problems for an infrastructure aiming for total automation:\nExpiry: Even if set to the maximum duration, they expire. This means if my cluster needs to scale or be reborn after six months, the bootstrap will fail because the key injected into the code or secrets is no longer valid. Manual Management: Generating a new key requires human action in the Tailscale UI. It is the opposite of the \u0026ldquo;Bootstrap from Zero\u0026rdquo; principle I am pursuing. Lack of IaC Traceability: You cannot define a Pre-auth Key in Terraform in a way that it is automatically recreated without external intervention except through convoluted workarounds. The correct architectural solution is the use of an OAuth Client. A Tailscale OAuth Client is not a key, but an identity that can generate authentication keys on the fly. It never expires (unless explicitly revoked) and can be managed programmatically. This is the component I decided to place at the heart of the TazLab network.\nThe IaC Phase: Ephemeral-Castle Expands # I started by creating a new directory in the infrastructure configurations repository: ephemeral-castle/tailscale/. 
Here I deposited the Terraform code that governs the entire network.\nThe Declarative Heart: acl.json # Instead of writing access policies directly in the Terraform HCL, I chose to maintain a separate acl.json file. This choice is not aesthetic: Tailscale ACLs are a complex JSON and having a dedicated file allows for independent validation and extreme clarity in reading.\nThe applied philosophy is Tag-based Zero Trust. No node has access to the network just because it is \u0026ldquo;on the LAN.\u0026rdquo; Access is granted only if the node possesses a specific tag. I defined five fundamental tags:\ntag:tazlab-vault: The Vault cluster nodes on Oracle Cloud. tag:tazlab-k8s: The main K8s cluster nodes on Proxmox/AWS. tag:vault-api: The specific identity of the Vault proxy. tag:tazlab-db: The specific identity of the database proxy. tag:tazpod: My administration workstation. The Least Privilege principle is rigorously applied: the K8s cluster can talk to Vault only on port 8200, and only through the proxy tag. 
Nodes do not see each other at the OS level; they only see the necessary services.\n{ \u0026#34;tagOwners\u0026#34;: { \u0026#34;tag:tazlab-vault\u0026#34;: [\u0026#34;roberto.tazzoli@gmail.com\u0026#34;], \u0026#34;tag:tazlab-k8s\u0026#34;: [\u0026#34;roberto.tazzoli@gmail.com\u0026#34;], \u0026#34;tag:vault-api\u0026#34;: [\u0026#34;roberto.tazzoli@gmail.com\u0026#34;], \u0026#34;tag:tazlab-db\u0026#34;: [\u0026#34;roberto.tazzoli@gmail.com\u0026#34;], \u0026#34;tag:tazpod\u0026#34;: [\u0026#34;roberto.tazzoli@gmail.com\u0026#34;] }, \u0026#34;acls\u0026#34;: [ { \u0026#34;action\u0026#34;: \u0026#34;accept\u0026#34;, \u0026#34;src\u0026#34;: [\u0026#34;tag:tazlab-vault\u0026#34;], \u0026#34;dst\u0026#34;: [\u0026#34;tag:tazlab-vault:8201\u0026#34;] }, { \u0026#34;action\u0026#34;: \u0026#34;accept\u0026#34;, \u0026#34;src\u0026#34;: [\u0026#34;tag:tazlab-k8s\u0026#34;], \u0026#34;dst\u0026#34;: [\u0026#34;tag:vault-api:8200\u0026#34;] }, { \u0026#34;action\u0026#34;: \u0026#34;accept\u0026#34;, \u0026#34;src\u0026#34;: [\u0026#34;tag:tazpod\u0026#34;], \u0026#34;dst\u0026#34;: [\u0026#34;tag:tazlab-vault:6443,50000\u0026#34;, \u0026#34;tag:tazlab-k8s:6443,50000\u0026#34;] } ] } During implementation, I encountered an interesting validation error: Terraform returned Error: ACL validation failed: json: unknown field \u0026quot;comment\u0026quot;. This is a classic example of a discrepancy between the UI (which allows inline comments in ACLs) and the pure JSON API, which does not accept them. I had to clean the acl.json file of every comment to allow Terraform to apply it successfully.\nThe Discovery (The \u0026ldquo;Aha!\u0026rdquo; Moment): Terraform and the OAuth Client # Initially, my plan included using curl within a bootstrap script to create the OAuth Client, as many dated guides suggested that the Tailscale Terraform provider did not yet support this resource.\nI started writing the setup.sh script using curl, but kept receiving 404 page not found errors. 
I tried debugging the URL, changing the format (using - for the tailnet name, or the full Tailnet ID), but without success. Troubleshooting was becoming frustrating.\nInstead of insisting on the error, I decided to take a step back and analyze the source code of the tailscale/tailscale ~\u0026gt; 0.17 Terraform provider. It was the breakthrough: I discovered that the tailscale_oauth_client resource exists and is perfectly functional.\nI deleted the curl script and rewrote everything in Terraform:\n# OAuth client for bootstrap (generates pre-auth keys) resource \u0026#34;tailscale_oauth_client\u0026#34; \u0026#34;bootstrap\u0026#34; { description = \u0026#34;tazlab-bootstrap\u0026#34; scopes = [\u0026#34;auth_keys\u0026#34;, \u0026#34;devices\u0026#34;] tags = [\u0026#34;tag:tazpod\u0026#34;] } This discovery radically changed the quality of the work. Now the identity that generates the network keys is a managed resource, tracked in terraform.tfstate, and recreatable with a single command. Idempotency is no longer a wish, but a technical reality.\nThe TagOwners Problem # Another obstacle presented itself immediately after: requested tags [tag:tazpod] are invalid or not permitted (400). To create an OAuth Client that can assign a tag, the user (or the API key) performing the operation must be explicitly declared as the \u0026ldquo;owner\u0026rdquo; of that tag in the tagOwners section of the ACLs. I had to update acl.json to include my email for every tag before Terraform could successfully create the OAuth client. It is a fundamental security detail: Tailscale prevents a compromised identity from creating new clients with arbitrary tags to which it has no access.\nIntegration with TazPod: Closing the Security Circle # Once the OAuth Client was generated via Terraform, the problem became: where do we save the client_id and the client_secret? 
They cannot be in the git repository (obviously), and I didn\u0026rsquo;t want to save them in an insecure local file.\nI used the TazPod RAM Vault. I updated the setup.sh orchestration script so that, after Terraform execution, it automatically extracts the secrets from the outputs:\n# Extract credentials from Terraform OAUTH_CLIENT_ID=$(terraform output -raw oauth_client_id) OAUTH_CLIENT_SECRET=$(terraform output -raw oauth_client_secret) # Save them into the TazPod RAM vault echo \u0026#34;$OAUTH_CLIENT_ID\u0026#34; \u0026gt; ~/secrets/tailscale-oauth-client-id echo \u0026#34;$OAUTH_CLIENT_SECRET\u0026#34; \u0026gt; ~/secrets/tailscale-oauth-client-secret # Sync with S3 (cd /workspace \u0026amp;\u0026amp; tazpod save \u0026amp;\u0026amp; tazpod push vault) Now, the rebirth cycle is complete for the network as well. When I run tazpod unlock, the secrets needed to connect to the Tailnet are mounted in memory. Any new cluster or TazPod instance can use these credentials to join the network in less than a second.\nEmpirical Verification: The Live Test # Theory is nice, but systems must work. I performed a live test by installing Tailscale directly into the tazpod-lab container (which didn\u0026rsquo;t include it yet). 
This gap triggered an immediate update of TazPod\u0026rsquo;s layer hierarchy: Tailscale must be part of the base image\u0026rsquo;s DNA.\nAfter starting the tailscaled daemon in userspace mode (necessary because the container does not have permissions to create tun interfaces on the host kernel), I attempted to connect using the credentials just saved in the vault:\nID=$(cat ~/secrets/tailscale-oauth-client-id) SECRET=$(cat ~/secrets/tailscale-oauth-client-secret) sudo tailscale up \\ --client-id=\u0026#34;$ID\u0026#34; \\ --client-secret=\u0026#34;$SECRET\u0026#34; \\ --hostname=tazpod-lab \\ --advertise-tags=tag:tazpod \\ --reset The result was instantaneous: active login: tazpod-lab.magellanic-gondola.ts.net IP: 100.73.57.110\nThe node appeared in the network, correctly tagged as tag:tazpod, with key expiry automatically disabled by the system (standard Tailscale behavior for tagged nodes).\nPost-Lab Reflections: What We Learned # This session consolidated TazLab\u0026rsquo;s networking foundation in three ways:\nProvider Independence: It doesn\u0026rsquo;t matter if a cluster runs on OCI, AWS, or in my living room. If it has the Tailscale extension and the OAuth Client, it is part of the TazLab private network immediately. Zero Maintenance: By switching to OAuth Clients managed via IaC, I eliminated the risk of failures due to key expirations. The network is now a \u0026ldquo;living\u0026rdquo; entity that manages itself. Integrated Security: The chain of trust that starts with AWS SSO and passes through the TazPod RAM Vault now also protects network access. The next step in the roadmap is the provisioning of the tazlab-vault cluster on Oracle Cloud. Thanks to today\u0026rsquo;s work, that cluster will be born already talking privately with the rest of my world, without me ever having to expose its port 8200 to public internet traffic.\nThe network is there. 
The ephemeral castle now has its invisible walls.\n","date":"24 March 2026","externalUrl":null,"permalink":"/posts/tailscale-secure-backbone-tazlab-rebirth/","section":"Posts","summary":"","title":"Tailscale: The Secure Backbone of TazLab's Rebirth","type":"posts"},{"content":"","date":"24 March 2026","externalUrl":null,"permalink":"/tags/tazpod/","section":"Tags","summary":"","title":"Tazpod","type":"tags"},{"content":"","date":"24 March 2026","externalUrl":null,"permalink":"/tags/zero-trust/","section":"Tags","summary":"","title":"Zero Trust","type":"tags"},{"content":"","date":"22 March 2026","externalUrl":null,"permalink":"/tags/aws/","section":"Tags","summary":"","title":"Aws","type":"tags"},{"content":"","date":"22 March 2026","externalUrl":null,"permalink":"/tags/ci-cd/","section":"Tags","summary":"","title":"Ci-Cd","type":"tags"},{"content":"","date":"22 March 2026","externalUrl":null,"permalink":"/tags/docker/","section":"Tags","summary":"","title":"Docker","type":"tags"},{"content":"","date":"22 March 2026","externalUrl":null,"permalink":"/tags/github-actions/","section":"Tags","summary":"","title":"Github-Actions","type":"tags"},{"content":"","date":"22 March 2026","externalUrl":null,"permalink":"/tags/golang/","section":"Tags","summary":"","title":"Golang","type":"tags"},{"content":"","date":"22 March 2026","externalUrl":null,"permalink":"/tags/iam-identity-center/","section":"Tags","summary":"","title":"Iam-Identity-Center","type":"tags"},{"content":"","date":"22 March 2026","externalUrl":null,"permalink":"/tags/secrets-management/","section":"Tags","summary":"","title":"Secrets Management","type":"tags"},{"content":"","date":"22 March 2026","externalUrl":null,"permalink":"/tags/sso/","section":"Tags","summary":"","title":"Sso","type":"tags"},{"content":" Zero Credentials on Disk: Rewriting TazPod with AWS IAM Identity Center # Introduction: The Problem I Couldn\u0026rsquo;t Solve # In the previous article on this project I described the architectural 
vision: replace Infisical with AWS IAM Identity Center as the bootstrap anchor, eliminate every static credential from the TazPod Docker image, and make the entire rebirth cycle reproducible from a blank machine with only an S3 bucket, a passphrase, and an MFA device.\nThat was the design. This article tells the story of the implementation — four hours of work that produced TazPod 0.3.12, eleven build versions, six distinct bugs discovered exclusively during live testing on the real system, and an iteratively rebuilt CI/CD pipeline.\nPhase 1: The Surgical Removal of Infisical # The starting point was cmd/tazpod/main.go — 613 lines, roughly a third of which were dedicated exclusively to Infisical integration. The temptation in these cases is to do a gradual removal, leaving compatibility branches or deprecated wrappers. I deliberately resisted that temptation.\nThe principle I applied is called Design Integrity: the code must tell the truth about what the system does. Every line of Infisical code left compilable — even commented out, even with a deprecation warning — is a lie told to the next reader. The removal must be total or it is not a removal.\nI eliminated: the SecretMapping and SecretsConfig structs, the global variable secCfg, the constants SecretsYAML and EnvFile, the functions pullSecrets(), login() (Infisical version), runInfisical(), runCmd(), checkInfisicalLogin(), loadEnclaveEnv(), resolveSecret(), and the local isMounted() method (a duplicate of utils.IsMounted). The orphaned bytes and strings imports disappeared as well.\nThe result was a 250-line file instead of 613. The compiler confirmed the cleanliness on the first attempt.\nThe same operation in internal/vault/vault.go was more delicate. The Infisical constants (InfisicalLocalHome, InfisicalKeyringLocal, InfisicalVaultDir, InfisicalKeyringVault) were used by setupBindAuth() and Lock(). 
I replaced them with their AWS equivalents:\nconst ( AwsLocalHome = \u0026#34;/home/tazpod/.aws\u0026#34; AwsVaultDir = MountPath + \u0026#34;/.aws\u0026#34; PassCache = MountPath + \u0026#34;/.vault_pass\u0026#34; ) The setupBindAuth() function now creates a bind mount from the AWS directory in the RAM tmpfs to ~/.aws in the container. The mechanism is identical to what it used for Infisical — a bind mount that makes the RAM directory indistinguishable from a normal directory for any process, including the AWS CLI and the Go SDK.\nPhase 2: The ~/.aws Symlink — Two Implementations Before the Right One # The first implementation of the symlink for AWS configuration was an error of granularity. I wrote in SetupIdentity() (vault.go) the code to symlink the file ~/.aws/config to /workspace/.tazpod/aws/config. It was wrong for three reasons: I was symlinking a file instead of a directory, using the name aws without the leading dot (inconsistent with the pattern of other tools), and I had placed it in Go instead of .bashrc.\nThe correct pattern already existed in .bashrc for four other tools: .pi, .omp, .gemini, .claude. Each tool directory is symlinked from the workspace to home: ~/.pi → /workspace/.tazpod/.pi, and so on. The logic lives in .bashrc because it runs at every shell startup, guaranteeing symlink recreation even after a lock that unmounts the tmpfs.\nFor ~/.aws there was an additional complexity that the other tools didn\u0026rsquo;t have: when the vault is unlocked, setupBindAuth() executes rm -rf ~/.aws and replaces it with a bind mount from RAM. If the generic .bashrc loop ran in a new shell with the vault already open, it would destroy the active bind mount.\nThe solution was an explicit guard using mountpoint -q:\n# AWS config: symlink ~/.aws -\u0026gt; /workspace/.tazpod/.aws # Skip if already bind-mounted from the vault enclave (vault unlocked) if ! 
mountpoint -q \u0026#34;$HOME/.aws\u0026#34; 2\u0026gt;/dev/null; then mkdir -p /workspace/.tazpod/.aws if [ ! -L \u0026#34;$HOME/.aws\u0026#34; ] || [ \u0026#34;$(readlink \u0026#34;$HOME/.aws\u0026#34;)\u0026#34; != \u0026#34;/workspace/.tazpod/.aws\u0026#34; ]; then rm -rf \u0026#34;$HOME/.aws\u0026#34; \u0026amp;\u0026amp; ln -sf /workspace/.tazpod/.aws \u0026#34;$HOME/.aws\u0026#34; fi fi If ~/.aws is a mountpoint (vault unlocked), the block is skipped. If it isn\u0026rsquo;t (vault locked, or first launch), the symlink is created or recreated. The vault bind mount and the workspace symlink coexist without conflict, serving two distinct operational states.\nPhase 3: The Go AWS SDK Bug with SSO Profiles # The NewS3Client function in the utils package accepted only the bucket name. I added a second parameter for the SSO profile:\nfunc NewS3Client(bucket, profile string) (*S3Client, error) { opts := []func(*config.LoadOptions) error{ config.WithRegion(DefaultRegion), } if profile != \u0026#34;\u0026#34; \u0026amp;\u0026amp; os.Getenv(\u0026#34;AWS_ACCESS_KEY_ID\u0026#34;) == \u0026#34;\u0026#34; { opts = append(opts, config.WithSharedConfigProfile(profile)) } cfg, err := config.LoadDefaultConfig(context.TODO(), opts...) ... } The condition os.Getenv(\u0026quot;AWS_ACCESS_KEY_ID\u0026quot;) == \u0026quot;\u0026quot; is not obvious and deserves an explanation. During testing I discovered that passing WithSharedConfigProfile to the Go AWS SDK causes a 30+ second hang when AWS_ACCESS_KEY_ID is already in the environment. The SDK still tries to load the configuration of the SSO profile — including an attempt to contact the SSO endpoint to validate or refresh tokens — regardless of whether static credentials are already available.\nThe Go SDK v2 credential chain gives priority to environment variables over profile credentials. But profile configuration loading (region, endpoint, SSO parameters) happens anyway if WithSharedConfigProfile is passed. 
Skipping the profile when env vars are present is the correct solution: the static credentials already have everything needed.\nThis bug never manifests in production — where there are no static credentials and the SSO profile is the only source — but it is critical for testing and fallback situations.\nPhase 4: AWS IAM Identity Center — Guided Setup # The IAM Identity Center setup was interactive: I did it collaboratively, step by step from the AWS Console. The non-obvious points worth documenting:\nThe region is us-east-1, not eu-central-1. Even though I configured IAM Identity Center from the eu-central-1 console, the SSO portal is created in us-east-1. The portal URL — https://ssoins-7223c4f9117b4c94.portal.us-east-1.app.aws — explicitly contains the region. Configuring sso_region = eu-central-1 in the AWS profile produced InvalidRequestException: Couldn't find Identity Center Instance. The fix was immediate once the cause was identified.\nThe TazLabBootstrap permission set follows the Principle of Least Privilege. The inline policy permits only the three strictly necessary operations, on the single bucket and single prefix:\n{ \u0026#34;Version\u0026#34;: \u0026#34;2012-10-17\u0026#34;, \u0026#34;Statement\u0026#34;: [{ \u0026#34;Effect\u0026#34;: \u0026#34;Allow\u0026#34;, \u0026#34;Action\u0026#34;: [\u0026#34;s3:GetObject\u0026#34;, \u0026#34;s3:PutObject\u0026#34;, \u0026#34;s3:ListBucket\u0026#34;], \u0026#34;Resource\u0026#34;: [ \u0026#34;arn:aws:s3:::tazlab-storage\u0026#34;, \u0026#34;arn:aws:s3:::tazlab-storage/tazpod/vault/*\u0026#34; ] }] } No access to other buckets. No management operations. 
If this profile were compromised, an attacker could only download or overwrite the vault.tar.aes file — which is encrypted with AES-256-GCM and useless without the passphrase.\nThe persistent configuration file lives in /workspace/.tazpod/.aws/config, tracked in the workspace but not in the encrypted vault — because it contains no secrets:\n[profile tazlab-bootstrap] sso_start_url = https://ssoins-7223c4f9117b4c94.portal.us-east-1.app.aws sso_account_id = 468971461088 sso_role_name = TazLabBootstrap sso_region = us-east-1 region = eu-central-1 Phase 5: The CI/CD Pipeline — Seven Iterations # The existing GitHub Actions workflow was simple: it built the Go CLI (without injecting the version) and always built all four Docker images on every push to master. I rebuilt everything in seven iterative commits, each one fixing a specific problem.\nIteration 1: version in the binary. The build command didn\u0026rsquo;t use -ldflags, always producing a binary with Version = \u0026quot;dev\u0026quot;. Fixed to:\nGOOS=linux GOARCH=amd64 go build -ldflags \u0026#34;-X main.Version=${VERSION}\u0026#34; -o tazpod cmd/tazpod/main.go Iteration 2: automatic release publishing. Added a step with gh release create that publishes the compiled binary as a GitHub asset. This makes scripts/install.sh functional without manual intervention.\nIteration 3: selective build. Docker images shouldn\u0026rsquo;t be rebuilt on every commit. I added a check that analyzes git diff --name-only HEAD~1 HEAD:\nIf cmd/, internal/, or VERSION change → build CLI + release If .tazpod/Dockerfile* or dotfiles/ change → build Docker Iteration 4: GitHub Token permissions. The gh release create step was failing with HTTP 403. The cause: GITHUB_TOKEN has limited permissions by default in workflows. Solution:\npermissions: contents: write Iteration 5: the binary is not in git. With bin/tazpod (15MB) tracked by git, every push required 30-35 seconds of HTTPS upload. 
Removed with git rm --cached bin/tazpod, added bin/ to .gitignore. Subsequent pushes: less than 1 second.\nIteration 6: the CLI build must always run. With conditional builds, when only Dockerfiles changed the binary wasn\u0026rsquo;t compiled. But Dockerfile.base contains COPY tazpod /home/tazpod/.local/bin/tazpod — without the file in the build context, the Docker build fails. The Setup Go and Build CLI steps have no conditions: they always run. Only Publish GitHub Release is conditional.\nIteration 7: GHA Docker cache. Added cache-from and cache-to with type=gha and a scope per layer (tazpod-base, tazpod-aws, tazpod-k8s, tazpod-ai). The first build populates the cache; subsequent ones reuse unchanged layers. On a change to Dockerfile.ai (the final layer), the three previous layers are retrieved from cache in seconds.\nPhase 6: The Git Authentication Method — 30 Seconds vs 1 Second # During the CI/CD work I identified that every git push was systematically taking 30-35 seconds, causing tool timeouts. The cause was the authentication method used up to that point:\n# WRONG git -c http.extraheader=\u0026#34;Authorization: Basic $(echo -n x-access-token:${TOKEN} | base64)\u0026#34; push The http.extraheader method with Base64 adds overhead to git\u0026rsquo;s HTTP negotiation protocol — a handshake phase that with GitHub results in significantly slower performance compared to the native method.\nThe correct method uses an inline credential helper that implements git\u0026rsquo;s standard credential protocol:\n# CORRECT git -c credential.helper=\u0026#34;!f() { echo \u0026#39;username=x-access-token\u0026#39;; echo \\\u0026#34;password=${TOKEN}\\\u0026#34;; }; f\u0026#34; push origin master The measured difference: 30-35 seconds versus 0.8-1.2 seconds. The benchmark was performed on identical commits to the same repository. 
The correct method uses the protocol GitHub expects natively, without additional encoding layers.\nPhase 7: The Six Bugs of Live Testing # This is the part that differentiates an implementation designed on paper from one verified on a real system. All six bugs were invisible during development — none were detectable without running the complete flow on a real host machine.\nBug 1: loadConfigs() not called in the no-arguments path. In main(), loadConfigs() was invoked only after the argument check. When tazpod was executed without arguments, smartEntry() read cfg still at its zero value. Result: ❌ container_name missing in config.yaml. Fix: loadConfigs() as the first instruction of smartEntry().\nBug 2: hardcoded vault path. vault.VaultFile is constant at /workspace/.tazpod/vault/vault.tar.aes — the correct absolute path inside the container, where the project is always mounted at /workspace. On the host, the project can be anywhere. Fix: filepath.Join(cwd, \u0026quot;.tazpod/vault/vault.tar.aes\u0026quot;) relative to the current working directory of the host.\nBug 3: unlock asks the host user for the sudo password. vault.Unlock() executes sudo mount -t tmpfs to create the tmpfs in RAM. Inside the container, the tazpod user has NOPASSWD sudo. On the host, the user doesn\u0026rsquo;t have that privilege. The correct architectural separation: login and vault pull on the host (where there\u0026rsquo;s a browser for SSO), unlock inside the container (where there are sudo permissions). Implemented with execInContainer(), a helper that runs interactive commands via docker exec -it.\nBug 4: aws CLI not found during bootstrap. docker exec bash -c \u0026quot;...\u0026quot; opens a non-interactive shell that doesn\u0026rsquo;t source .bashrc. The ~/.aws symlink isn\u0026rsquo;t created, the AWS configuration isn\u0026rsquo;t found. 
Fix: pass -e AWS_CONFIG_FILE=/workspace/.tazpod/.aws/config explicitly to docker exec, bypassing the symlink entirely.\nBug 5: the sequence doesn\u0026rsquo;t stop on error. tazpod login exited with code 0 even on failure — main() didn\u0026rsquo;t propagate the exit codes of failed subcommands. The \u0026amp;\u0026amp; in the shell chain didn\u0026rsquo;t stop execution. Fix: os.Exit(1) in the error paths of login() and pullVault().\nBug 6: passphrase corrupted by the TTY buffer. With bash -c \u0026quot;tazpod login \u0026amp;\u0026amp; tazpod pull vault \u0026amp;\u0026amp; tazpod unlock\u0026quot;, the three commands share the same TTY. During the SSO flow — while the browser is open, the user navigates and enters the MFA code — keystrokes are buffered in the TTY. When the time comes to read the vault passphrase with term.ReadPassword, the TTY buffer already contains characters that get read as part of the passphrase. The result is ❌ WRONG PASSWORD with the correct passphrase. Fix: each step (login, pull, unlock) runs in a separate execInContainer call, with its own clean TTY. 
execInContainer returns bool to stop the sequence in case of failure.\nThese six bugs, resolved in sequence across versions 0.3.5 to 0.3.12, describe precisely the difference between a development environment (container, predictable cwd, controlled TTY) and a production environment (real host, different user, terminal sessions with non-deterministic I/O).\nReflections: What Changes with Zero Credentials on Disk # The final result is a binary that, run on a host with only Docker installed, autonomously manages the entire bootstrap flow: verifies the presence of an initialized project, brings up the container if necessary, and — if there\u0026rsquo;s no local vault — guides the user through aws sso login, the S3 download, and the RAM decryption.\nAll without any static AWS credentials ever touching the host\u0026rsquo;s disk.\nThe Docker image running in the container (tazzo/tazpod-aws:latest) contains the AWS CLI — but no credentials. The SSO configuration in /workspace/.tazpod/.aws/config contains the portal URL and the role name — but no tokens, no keys, no secrets. The encrypted vault on S3 contains everything else — but it\u0026rsquo;s useless without the passphrase that lives only in one\u0026rsquo;s head.\nThe architecture now has three characteristics it didn\u0026rsquo;t have before: it is verifiable (you can inspect every file and find no credentials), reproducible (the sequence tazpod → SSO → pull → unlock works from any host with Docker), and resilient to theft (stealing the laptop gives access to the Docker image and the public SSO configuration file, not the secrets).\nThe next step — which closes the cycle described in the previous article — is the provisioning of tazlab-vault on Oracle Cloud and the migration of application secrets from Infisical to HashiCorp Vault CE. 
But that is another session.\n","date":"22 March 2026","externalUrl":null,"permalink":"/posts/tazpod-zero-credentials-aws-sso/","section":"Posts","summary":"","title":"Zero Credentials on Disk: Rewriting TazPod with AWS IAM Identity Center","type":"posts"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/bootstrap/","section":"Tags","summary":"","title":"Bootstrap","type":"tags"},{"content":" The Current State: The Plan Had a Gap # In the previous article on this roadmap I described why TazLab is migrating from Infisical to HashiCorp Vault CE, and the direction it\u0026rsquo;s heading: dynamic secrets, automatic rotation, a second cluster on Oracle Cloud. The \u0026ldquo;what\u0026rdquo; and the \u0026ldquo;why\u0026rdquo; were clear.\nWhat was missing was the \u0026ldquo;how does the system survive when everything disappears\u0026rdquo;.\nThe question I couldn\u0026rsquo;t get out of my head was this: if tomorrow morning Proxmox, Oracle Cloud, and my computer all burned down at the same time, what would I have left? An S3 bucket, a passphrase in my head, and a physical MFA device. That\u0026rsquo;s it. From these three elements, everything must restart: not heroically and manually, but systematically, as automatically as possible.\nThis is the design session where we solved exactly that problem.\nThe \u0026ldquo;Why\u0026rdquo;: Migrating Is Not Enough, You Have to Be Reborn # The migration from Infisical to Vault is not just a vendor question. It\u0026rsquo;s an opportunity to redesign the bootstrap from scratch, the moment when the entire ephemeral castle philosophy is put to the test.\nAn infrastructure is truly ephemeral only if you can destroy and rebuild it without fear. And you can do that without fear only if you\u0026rsquo;ve answered this question honestly: what must exist outside the clusters to make them rebuildable?\nThe answer has a precise shape. 
Not an endless series of scattered secrets, not a dependency on an always-on external service. Three anchors, all on the same S3 bucket:\nS3: tazlab-storage/ ├── tazpod/vault.tar.aes ← bootstrap secrets (AES-256-GCM, passphrase) ├── vault/vault-latest.snap ← Vault Raft snapshot (all app secrets) └── pgbackrest/ ← PostgreSQL backup (Mnemosyne, tazlab-k8s data) The first contains the bare minimum to start everything before any cluster exists. The second is Vault\u0026rsquo;s memory — all application secrets, automatically updated every day. The third is the database: Mnemosyne data, configurations, history. None of the three makes sense without the other two. Together, they are everything needed to start over.\nThe Target Architecture: Four Hard Decisions # Designing this cycle required untying four knots that, on the surface, seemed simple.\nThe Bootstrap Problem: Which Comes First, the Chicken or the Egg? # The tazpod Docker image is public. It cannot contain credentials. But to download vault.tar.aes from S3 I need AWS credentials. And the AWS credentials are in the vault. And the vault is on S3.\nThe solution is not technical — it\u0026rsquo;s architectural. I used AWS IAM Identity Center (AWS\u0026rsquo;s SSO service): an interactive authentication flow where you enter your email, password, and MFA code, and receive temporary credentials valid for 8 hours. The AWS configuration file that goes into the image contains only the SSO portal URL and the role name — no secrets, safely publishable.\ndocker run tazzo/tazpod-ai │ ▼ aws sso login --profile tazlab-bootstrap │ → email + password + physical MFA ▼ aws s3 cp s3://tazlab-storage/tazpod/vault.tar.aes ... │ ▼ tazpod unlock ← passphrase (in my head only) │ ▼ secrets/ open — everything else starts from here The passphrase lives only in my head. The MFA device is physical. 
Without both, the S3 bucket is a useless encrypted archive.\nVault Unseal: Professional Doesn\u0026rsquo;t Mean Expensive # Vault always starts in a \u0026ldquo;sealed\u0026rdquo; state: it doesn\u0026rsquo;t respond until it\u0026rsquo;s given the key to decrypt its master key. In my head the problem seemed to require an external KMS: AWS KMS ($1/month), OCI KMS (free for software keys), something always available.\nBut there was a cleaner solution that required no external dependency. Vault\u0026rsquo;s unseal keys (Shamir algorithm: 5 key shares, any 3 of which are required to open) are generated once at initialization. I save them in secrets/. At bootstrap, create.sh uses them directly:\nvault operator unseal \u0026#34;$(cat /home/tazpod/secrets/vault-unseal-key-1)\u0026#34; vault operator unseal \u0026#34;$(cat /home/tazpod/secrets/vault-unseal-key-2)\u0026#34; vault operator unseal \u0026#34;$(cat /home/tazpod/secrets/vault-unseal-key-3)\u0026#34; It\u0026rsquo;s completely automatic from the script\u0026rsquo;s perspective, because the human interaction has already happened: passphrase + MFA at the start of the bootstrap have already opened secrets/. From that point on, no intervention is required.\nOCI KMS remains as an option for simulation environments where the manual cycle is inconvenient.\nThe Network: Tailscale as an Operating System Extension # ESO on tazlab-k8s must be able to reach Vault on tazlab-vault (OCI) from the very first moment it\u0026rsquo;s deployed. Vault cannot sit on a public endpoint without reason.\nThe solution is Tailscale, but not as a Kubernetes pod. As a Talos operating system extension. 
siderolabs/tailscale exists as an official extension: it gets baked into the image at the Talos Image Factory and starts as a system service, before Kubernetes even exists.\nOCI node boots │ ▼ Talos OS → Tailscale extension → node in tailnet ← before K8s │ ▼ Kubernetes bootstrap → cluster healthy │ ▼ Terragrunt deploys Vault → ESO connects to Vault via tailnet ✓ The Tailscale auth key (reusable, tagged tag:tazlab-node) lives in secrets/ and is injected into the machine config during provisioning. The node automatically rejoins the same network on every rebuild, with the same DNS name.\nThe same extension goes on tazlab-k8s. The two clusters communicate privately, exposing nothing to the internet.\ntazlab-vault: Minimal by Design # The last decision was perhaps the simplest once framed correctly: tazlab-vault doesn\u0026rsquo;t need Flux.\nFlux makes sense when you manage many applications that change continuously and want the cluster to self-reconcile. tazlab-vault has one single responsibility: running Vault. To deploy a single application, Flux is a layer of complexity that gains nothing. Vault upgrades must be deliberate, tested, and never automatic.\nThe choice is Terragrunt with the Helm provider — exactly the pattern already used in ephemeral-castle for ESO, MetalLB, and Longhorn. The layer structure:\nsecrets → platform → vault No engine (ESO), no gitops (Flux). Vault uses Raft integrated storage on hostPath — it doesn\u0026rsquo;t need Longhorn because persistent data is restored from the S3 snapshot on every rebuild anyway.\nPhased Approach: Seven Phases Toward Complete Rebirth # The work is organized in sequential phases, each stable before moving to the next.\nPhase A — Prerequisites: configure AWS IAM Identity Center, create the SSO user with MFA, generate the Tailscale reusable auth key. 
Zero impact on existing clusters.\nPhase B — tazlab-vault minimal: new Talos schematic with the Tailscale extension, resolve the OCI reserved IP blocker (from tazlab-vault-init), complete the Talos bootstrap, deploy Vault CE via Terragrunt, first initialization and save of unseal keys in vault.tar.aes.\nPhase C — Vault configuration: enable KV v2, configure the Kubernetes auth method for ESO, migrate all secrets from Infisical to Vault KV.\nPhase D — tazlab-k8s migration: update the Talos image with the Tailscale extension (rolling upgrade, not rebuild), replace the ClusterSecretStore from Infisical to Vault, update all ExternalSecret resources with the new KV paths.\nPhase E — tazpod Vault integration: remove the Infisical logic from main.go, implement tazpod pull via Vault CLI, update tazpod vpn to use Tailscale instead of the never-tested custom WireGuard.\nPhase F — Decommission Infisical: verify that zero components still use Infisical, remove provider and references from all repos, delete secrets from the Infisical account.\nPhase G — Make repos public: audit git history with trufflehog, verify .gitignore coverage, make tazpod, tazlab-k8s, and ephemeral-castle public.\nFuture Outlook: The Final Proof # There\u0026rsquo;s a test that doesn\u0026rsquo;t lie: can you make your repos public without fear?\nIf the answer is yes, you\u0026rsquo;ve truly achieved zero-secrets-in-git. Not as a declared principle, but as a verifiable reality. Anyone can open the code, see how everything works, and find no credentials, no tokens, no secrets. Security does not depend on obscurity.\nThe complete rebirth cycle then becomes this sequence, executable by anyone who has access to the three right elements:\nBlank machine + S3 bucket (always available) + passphrase (in your head) + MFA device (in your pocket) ────────────────────────────── = complete, operational infrastructure, \u0026lt; 30 minutes TazLab doesn\u0026rsquo;t have a fixed address. 
It only has an S3 Bucket from which it is reborn.\n","date":"20 March 2026","externalUrl":null,"permalink":"/posts/bootstrap-from-zero-vault-s3-rebirth/","section":"Posts","summary":"","title":"Bootstrap from Zero: Rebuilding Everything from a Single S3 Bucket","type":"posts"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/hashicorp-vault/","section":"Tags","summary":"","title":"HashiCorp Vault","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/infisical/","section":"Tags","summary":"","title":"Infisical","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/kubernetes/","section":"Tags","summary":"","title":"Kubernetes","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/oracle-cloud/","section":"Tags","summary":"","title":"Oracle Cloud","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/talos-os/","section":"Tags","summary":"","title":"Talos OS","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/terragrunt/","section":"Tags","summary":"","title":"Terragrunt","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/arm64/","section":"Tags","summary":"","title":"Arm64","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/iac/","section":"Tags","summary":"","title":"Iac","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/talos-linux/","section":"Tags","summary":"","title":"Talos-Linux","type":"tags"},{"content":" Terraforming the Cloud: My First IaC on OCI # Introduction: When Infrastructure Becomes Real # For years I have read about \u0026ldquo;Infrastructure as Code\u0026rdquo; (IaC). I studied the principles, watched tutorials, and even implemented local solutions that approached the concept. 
But there is a fundamental difference between defining a virtual machine on your own Proxmox server in the basement and defining a complete infrastructure on a public cloud provider like Oracle Cloud Infrastructure (OCI). The former is a controlled exercise; the latter is reality.\nToday I bridged that gap. The goal was not trivial: I didn\u0026rsquo;t want a \u0026ldquo;Hello World\u0026rdquo; with a single Linux instance. I wanted to replicate the robust and ephemeral architecture of my local cluster (tazlab-k8s) on OCI, leveraging the Always Free tier and the ARM64 (Ampere A1) architecture to build the foundation of the new tazlab-vault cluster. This cluster will eventually host an enterprise installation of HashiCorp Vault, so the \u0026ldquo;seriousness\u0026rdquo; of the project demanded absolute technical rigor from day one.\nThis is not a story of immediate success. It is the chronicle of an afternoon spent battling untested scaffolding, peculiarities of cloud images, and the paradox of TLS certificates in NAT environments. It is the story of how two virtual machines powering on can represent a significant technical victory.\n1. Context and Technology Choice # Why OCI? And why Terraform?\nThe choice of Oracle Cloud is purely pragmatic: their Always Free plan offers incredibly generous ARM64 resources (4 OCPUs and 24 GB of RAM), perfect for a two-node Kubernetes cluster (Control Plane + Worker) with no recurring costs.\nThe technology stack choice follows the philosophy of Ephemeral Castle, the framework I developed internally:\nTerraform: For provisioning base resources (VCN, Subnet, Instances). Terragrunt: To keep code DRY and manage dependencies between layers (network vs compute). Talos Linux: An immutable, minimal, and secure operating system for Kubernetes. Talos has no SSH, no shell, and is managed entirely via API. This forces a pure IaC approach: you cannot \u0026ldquo;ssh in and fix\u0026rdquo; a misconfiguration; you must destroy and recreate. 
I had prepared an initial scaffolding of Terraform files a week ago, but it had never been executed (\u0026ldquo;roughed out\u0026rdquo; but untested). Today was the day of reckoning.\n2. Phase 1: SDD and Account Preparation # Before writing a single line of code or running a command, I activated my Spec-Driven Development (SDD) process. Instead of diving headfirst into execution, I defined four artifacts:\nConstitution: Immutable rules (no secrets in code, mandatory logging, defined stack). Spec: What do we need to build today? (OCI Account, CLI, VMs, Lifecycle scripts). Plan: How do we do it? (Custom image import, module fixes, end-to-end tests). Tasks: 28 micro-tasks to track progress. This approach, which might seem bureaucratic for a personal project, proved to be a lifesaver when technical complexity exploded in later phases.\nFirst Contact with OCI # The OCI account was empty. Zero. I had to navigate the console to create the first Compartment (tazlab-vault), generate API Keys for programmatic access, and configure the OCI CLI on my workstation.\nA critical detail was determining the correct Availability Domain (AD). Unlike AWS or GCP which use zones like eu-central-1a, OCI uses tenancy-specific identifiers, such as GRGU:EU-TURIN-1-AD-1. Hardcoding these values is a mistake; I had to extract them dynamically or save them as secrets in my local vault (managed by Infisical).\n3. The Talos Image Dilemma # Here I encountered the first real architectural obstacle. My original scaffolding planned to use a standard Oracle Linux 8 image and then install Talos on top of it using a cloud-init script.\nOn paper, it works. In practice, it is fragile. It turns an atomic operation (OS boot) into a two-stage process prone to network and dependency errors. Furthermore, the cloud-init template I had written was just a non-functional placeholder.\nThe Decision: I decided to abandon the hybrid approach and use a native Talos image. 
Talos provides an \u0026ldquo;Image Factory\u0026rdquo; that allows generating custom disk images. I used the same schematic ID as my local cluster (e187c9b9...), which includes specific kernel modules (iscsi_tcp, nbd) for Longhorn distributed storage support.\nThe Import Odyssey # Importing a custom image into OCI is not as trivial as pasting a URL.\nAttempt 1: Pasting the Factory URL into the OCI console. Result: Error. OCI only accepts URLs from its own Object Storage. Attempt 2: Downloading the image, uploading it to an OCI Bucket, importing. Result: Error Shape VM.Standard.A1.Flex is not valid for image. OCI detected the image as x86 because I hadn\u0026rsquo;t specified the architecture. The web console did not allow selecting \u0026ldquo;ARM64\u0026rdquo; for custom images imported this way. The Solution (The Hard Way): I had to follow the official Talos \u0026ldquo;Bring Your Own Image\u0026rdquo; procedure for Oracle Cloud, which is surprisingly manual:\nDownload the raw compressed image (.raw.xz). Decompress it and convert it to QCOW2 format (qemu-img convert). Create a specific image_metadata.json file to tell OCI \u0026ldquo;Hey, this is a UEFI ARM64 image compatible with VM.Standard.A1.Flex\u0026rdquo;. Package everything into an .oci archive (tarball of qcow2 + json). Upload this 90MB package to the Bucket and import from there. Only then did OCI recognize the image as valid for Ampere A1 instances. It was a brutal reminder that the cloud is not magic; it\u0026rsquo;s just someone else\u0026rsquo;s computer with very strict rules.\n4. Terraforming: Debugging the Scaffolding # With the image ready, I ran terragrunt plan. The result was a wall of red errors. The code written a week ago and never tested was showing all its limitations.\n1. Non-Existent Functions # I had used get_terragrunt_config() in child files, a function that does not exist. Terragrunt requires including the root configuration and then reading values via read_terragrunt_config(). 
I had to rewrite the variable passing logic between the engine (network) and platform (compute) layers.\n2. Provider Conflicts # Each module declared its own required_providers, but the root file also generated a versions.tf. Result: Terraform panicked due to duplicate definitions. I had to clean up the modules, letting Terragrunt inject the correct dependencies.\n3. The \u0026ldquo;Tag Tax\u0026rdquo; # OCI is picky about tags. My code used tags = { ... }, but the OCI provider distinguishes between freeform_tags (free key-value) and defined_tags (enterprise taxonomies). I had to refactor every single resource to use freeform_tags. Additionally, I discovered that tags are case-insensitive in keys, causing merge conflicts when I tried to overwrite Layer with layer.\n4. DNS Label Limits # A trivial but annoying error: dns_label for subnets has a 15-character limit. My string tazlab-vault-public-subnet generated tazlabvaultpublicsubnet (23 characters), blocking VCN provisioning. A simple substr() solved it, but it reminded me to always check provider limits.\nAfter two hours of fix-plan-repeat cycles, I finally saw the most beautiful message in the world: Plan: 12 to add, 0 to change, 0 to destroy.\n5. \u0026ldquo;They\u0026rsquo;re Alive!\u0026rdquo; (and the hidden network problem) # I launched the create.sh script. Terraform created the VCN, subnets, Security Lists, and finally the two Compute instances. In less than 3 minutes, I had two public IPs.\nBut the cluster was not responding. The talosctl version command timed out.\nThe Investigation: I used nc (netcat) to test port 50000 (Talos API). Connection refused. It was strange. My Network Security Groups (NSG) explicitly allowed traffic on port 50000. I dug into the VCN configuration and found the culprit: the Default Security List. In OCI, every subnet has a default Security List that is applied in addition to NSGs. This list only allowed SSH (port 22). 
Even though my NSG said \u0026ldquo;allow everything\u0026rdquo;, the Security List said \u0026ldquo;block everything except SSH\u0026rdquo;. It\u0026rsquo;s a \u0026ldquo;defense in depth\u0026rdquo; security model that caught me by surprise.\nI opened the Security List and the situation changed instantly: Connection refused became tls: certificate required. The server was responding!\n6. The TLS Paradox and Machine Configuration # At this point, the machines were on, Talos had started, but I couldn\u0026rsquo;t bootstrap the cluster. Why?\nBecause Talos, being secure by default, uses mTLS (Mutual TLS) for all communication. The server certificate is generated at first boot based on the machine configuration. The configuration, generated by Terraform, set the cluster_endpoint to the VM\u0026rsquo;s private IP address (10.0.1.100), the only one known at plan time.\nI, however, was trying to connect from the outside via the public IP (92.x.x.x). Result: The talosctl client connected to the public IP, the server presented a certificate valid only for 10.0.1.100, and the client rejected the connection due to a name mismatch.\nThe Dead End:\nI couldn\u0026rsquo;t regenerate the certificate without accessing the machine. I couldn\u0026rsquo;t access the machine without a valid certificate. I couldn\u0026rsquo;t use the private IP because I don\u0026rsquo;t have a site-to-site VPN with OCI (yet). I attempted to use Reserved Public IPs, injecting them into the configuration before instance creation. I modified Terraform to add these IPs to the certificate\u0026rsquo;s certSANs (Subject Alternative Names). Unfortunately, Terraform on OCI does not easily allow assigning a reserved public IP during instance creation in a single atomic step; it requires a separate resource. 
The instances were born with ephemeral IPs different from the ones I had put in the certificate anyway.\nConclusions: A Partial Success is Still a Success # At the end of the session, I had to accept a partial victory. The machines are up. Talos is installed and configured. The infrastructure is defined as code. The destroy.sh script, which I had to rewrite to correctly handle the cleanup of orphaned resources (terminated instances keeping boot disks occupied), works perfectly, allowing me to zero out costs with one command.\nI achieved the goal of \u0026ldquo;Terraforming\u0026rdquo;: I transformed an intention (a cluster) into reality (cloud resources) using only code. The TLS bootstrap problem is a classic \u0026ldquo;Day 2\u0026rdquo; problem that surfaced on \u0026ldquo;Day 0\u0026rdquo;. The solution for Phase 2 is clear: correctly associate Reserved IPs with the network interfaces (VNICs) or establish a secure tunnel to operate on the private IP.\nBut for today, seeing those two RUNNING lines in the Oracle console, knowing I didn\u0026rsquo;t click any button to create them, is an immense satisfaction. It is the confirmation that theoretical study has transformed into practical competence. 
The infrastructure, finally, has become real.\n","date":"20 March 2026","externalUrl":null,"permalink":"/posts/terraforming-the-cloud-iac-oci/","section":"Posts","summary":"","title":"Terraforming the Cloud: My First IaC on OCI","type":"posts"},{"content":"","date":"18 March 2026","externalUrl":null,"permalink":"/it/tags/agenti-ai/","section":"Tags","summary":"","title":"Agenti AI","type":"tags"},{"content":"","date":"18 March 2026","externalUrl":null,"permalink":"/tags/ai-agents/","section":"Tags","summary":"","title":"AI Agents","type":"tags"},{"content":"","date":"18 March 2026","externalUrl":null,"permalink":"/tags/cloud/","section":"Tags","summary":"","title":"Cloud","type":"tags"},{"content":" The Thesis: Powerful, But Only If You Know Where to Point Them # AI agents are the most transformative tool I have added to my workflow in recent years. In two months of evening work I produced what would have taken me two years to build alone. Yet every time I see someone on YouTube saying \u0026ldquo;look, I asked it to build me an app and it did\u0026rdquo;, I get a wry smile.\nBecause there is an enormous difference between doing something and doing it well. And that difference, for now, still runs entirely through the engineer.\nThis is not an enthusiastic chronicle about AI changing the world. It is the account of someone who works every day with these tools on Kubernetes, cloud clusters, GitOps pipelines, and secret management — and has learned the hard way where you can trust these agents and where, instead, you need to keep them on a leash.\nContextual Analysis: A Rapidly Moving Landscape # When I started experimenting with AI agents on infrastructure, my starting point was simple: I wanted to understand if they could help me do things I already knew how to do, but faster. I was not looking for magic. I was looking for efficiency.\nThe first problem I ran into was platform lock-in. 
Gemini CLI, Claude Code, the native tools of the major providers: each has its own world, its own rules, its own updates that change interfaces without warning. One day you work a certain way, the next day someone has decided it works differently. And you have to follow along.\nThe turning point was discovering pi.dev, the platform on which this very working environment is built. It is a minimal agent, comparable to vi among editors: there is little by default, but it is configurable to an extreme degree and with disarming simplicity. You can tell it directly \u0026ldquo;create this extension, add this behavior\u0026rdquo; and build your own custom tool. Above all, it does not tie you to any specific provider.\nThis opened the door to OpenRouter, which is essentially a one-stop shop for every language model in existence. From there I started seriously exploring what the various providers offer, with a constant eye on costs, because I am a private individual and a €200/month subscription is not a sustainable budget line.\nThe Field Comparison # I tested many models in the specific context of working on clusters, containers, and cloud. The verdict was not what I expected.\nClaude (Claude Code) is excellent for complex design work. It reasons well, makes the right choices, understands architectures. The problem is cost: with Opus I exhaust my quota in an hour. With Sonnet you get a bit further, but not much. Excellent for surgical and critical work, unsustainable as a daily driver.\nGemini Flash 3.0 surprised me more than once. On at least two occasions, on real Kubernetes configuration problems that Sonnet could not unblock, Flash solved them on the first attempt. It is not a rule, but it is frequent enough to be significant. It has reasonable pricing and performs well in my domain. There is one important asterisk, however: Gemini CLI used via pi.dev becomes nearly unusable due to rate limiting issues, with 50-second waits between calls before it errors out. 
The solution is to use it through its native terminal, where it works correctly.\nMinimax M2.5 was a disappointment. It gets good press, but in my specific domain — cluster configuration, Kubernetes, cloud infrastructure — it made too many mistakes and forgot too many things.\nGrok 4.1 fast is not bad for the price. It loses track on long jobs, but on bounded tasks it is usable.\nStepfun (free model): very fast, but produces streams of intermediate logs because it is an extended reasoning model. In practice, it is nearly unusable on configuration work.\nGLM-5 (Zhipu AI) is the positive surprise of the bunch. Pricing comparable to Gemini Flash, it performs well on Kubernetes and cloud configurations, advances work with discipline, and self-corrects when it makes mistakes. It is one of those models I always keep as a backup when I run out of quota.\nHunter Alpha (openrouter/hunter-alpha) deserves its own mention. It is a free model of unknown authorship — probably a new version of DeepSeek or something similar, but it is not clear. I am using it with growing satisfaction: it is good at self-correcting, handles complex work well, and for now it is free. A bit slow — probably the consequence of the free tier — but the results are excellent.\nThe pattern that emerges is clear: the best model depends on the task, not the brand. And having a provider-agnostic platform like pi.dev, which lets me switch from one to another in seconds without changing my workflow, is worth more than any single model.\nDeep Dive: Where Agents Excel and Where They Break Down # 1. The Context and Memory Problem # The real value of an AI agent in infrastructure does not lie in the ability to write YAML. It lies in the ability to hold a complex context in mind and apply it consistently. Kubernetes is a universe: there is Linux at the base, then Docker, then Kubernetes itself, then networking, then security, then the specifics of each cloud provider. 
Each layer has its own syntax, its own tools, its own flags.\nFor years my problem was not not knowing what I wanted. I always knew what I wanted to achieve. The problem was having to re-read documentation every time I switched library or provider, because every ecosystem has its own idiosyncrasies. Terraform on Oracle is not Terraform on AWS — same logic, different commands and configurations.\nWith agents, this problem nearly disappears. I describe the result, they write the code. My time shifts from the rote memorization of commands to the precise definition of what I want to achieve.\nFor this I built a custom context management solution — which I have written about elsewhere — that lets me open a terminal and find myself in 20 seconds with the project context already loaded: what has been done, where we are, what problems we have encountered. I can switch agent, switch project, and the new agent knows exactly where to pick up. Contexts are kept small and dense — only the information relevant to the moment — which reduces the risk of coherence loss.\n2. The Risk of Autonomy: What Happens When You Let Them Run # This is the point I want to be most direct about, because it is where the most confusion exists.\nI have tried multiple times to let agents work autonomously on complex tasks. The result is almost invariably the same: they complete the task, but the choices they make along the way are often wrong. Not wrong in the sense that they do not work — sometimes they work perfectly well. Wrong in the sense that they do not respect the architectural constraints I had in mind, they take shortcuts that create technical debt, they build fragile solutions that cannot be extended.\nThe most emblematic case I have experienced: I work on a cluster managed with a GitOps philosophy, so everything ends up on Git. On more than one occasion, an agent left to run autonomously committed credentials inside a ConfigMap or directly in a YAML file. 
It saw a problem, it looked for the fastest solution, and that solution was wrong from a security standpoint. Not because it did not know that secrets should not be committed — if you ask it, it can explain this perfectly. But in the flow of autonomous work, the pressure to \u0026ldquo;close the task\u0026rdquo; overrode compliance with the rules.\nThis taught me one thing: AI agents know the best practices, but they do not feel them as an inviolable constraint when operating autonomously. They apply them when explicitly instructed to do so, or when someone is watching.\n3. Spec-Driven Development: The Structural Response # I am not the only one to have noticed this problem. An entire movement has emerged around what is called Spec-Driven Development: you design the system in detail first, document the architectural choices and constraints, and only then let the agent execute against that specification.\nI use it, and it works. But there is a necessary condition that is often omitted from the enthusiastic presentations: to write a good specification, you need to know what you are specifying. You cannot precisely describe a security architecture for Kubernetes if you do not have a solid understanding of RBAC, secrets engines, and network policies. The agent will follow your specifications to the letter — but if the specifications are vague or wrong, the result will be vague or wrong.\nThe method works because it moves the intellectual work to where it belongs: inside the engineer\u0026rsquo;s head, during the design phase. And the agent becomes the executor of a precise plan, not the designer.\n4. Debugging and Cognitive Load # One of the most concrete transformations I have experienced concerns debugging. I used to spend hours — sometimes nights — changing flags, recompiling, testing, reading stack traces, searching forums. It was the most frustrating part of the work. 
And it was also, it must be said, the most formative part.\nAgents do it in parallel and never tire. When there is a problem they cannot solve, they continue to iterate until they find the solution. And then they explain it to me — not just what they did, but why they chose that approach. This is the formative value I did not expect: not only do I stop doing manual debugging, but I learn from the reasoning the agent makes explicit.\nI have developed the habit of not using them as silent oracles. I always ask for explanations, discuss choices, sometimes challenge them. At the end of every significant session I ask for a report: what was done, what problems arose, how they were resolved, what architectural choices were made. At that point, reading it back, I notice things I had not caught while we were working — choices I would not have made, constraints that were worked around instead of respected.\nThe Human Element: The Engineer\u0026rsquo;s New Role # There is a metaphor I often use to myself: the AI agent is like someone who knows how to write code but needs to be herded in the right direction. It knows what to do, it has the technical skills, but if you leave it the steering wheel it goes where it finds the nearest grass, not where it needs to go. Your job is to indicate the direction, verify the path, and correct when it strays.\nThis profoundly changes what it means to be an engineer. I write less code. I think at a much higher level. I focus on architecture, constraints, trade-offs. I have become better at describing systems precisely — because if the description is imprecise, the agent produces something imprecise.\nAnd I am learning more, not less. Because discussing with an informed agent on a topic you do not know well is one of the most effective ways to deepen your understanding I have ever found. It is not a search engine, it is not documentation: it is an interlocutor that can answer the specific questions of your specific context. 
I went from a superficial knowledge of Kubernetes to a deep understanding of how secret management works, RBAC, credential rotation — not because I read a book, but because I spent hours discussing it in the context of my cluster, my use case, my mistakes.\nFinal Synthesis: Recommendations for Those Working Seriously # The dominant narrative about AI has one flaw: it tends to present these tools as equalizers. As if anyone, with the right prompt, could build the same things. That is not the case — at least not in serious work on complex infrastructure.\nMy observation, after months of daily use, is that AI is a multiplier, not a replacement. And the size of the multiplier depends on the baseline competence. For those with a solid foundation, these tools are transformative — 100x, perhaps 1000x on certain types of work. For those without one, the benefit is real but far more limited.\nThis does not mean they are useless for those who are learning. It means that to truly learn, you need to use them actively: discuss, ask for explanations, challenge the choices, build understanding instead of accepting the result. The mountain of expertise is still high. These tools make it more climbable, not shorter.\nFor those who want to use them seriously on infrastructure, some lessons I have learned the hard way:\nNever give up the steering wheel. Especially on a cluster hosting real services. Monitoring, analysis, proposed solutions — all fine. But the approval of every change must pass through a person who understands the consequences.\nDesign first, execute later. Robust Spec-Driven Development is the difference between an agent that advances your project and one that builds something you will have to throw away. But this requires that you already know how to do the thing, at least in its fundamental aspects.\nAlways verify critical output. Especially in a GitOps context: every commit you did not write yourself must be read. 
Agents do not have the risk perception of someone who has spent nights restoring a broken cluster.\nUse the right platform, not the right model. A provider-agnostic environment that lets you switch models without switching workflow is worth more than any single model. Models change constantly — last week the best one was one thing, now it is another. The workflow must remain stable.\nThe programmer is not dead. They have evolved into something closer to a systems designer who has at their disposal a team of tireless, fast, well-informed developers — but who need to know exactly where they are going.\n","date":"18 March 2026","externalUrl":null,"permalink":"/posts/man-in-the-loop-ai-agents-infrastructure/","section":"Posts","summary":"","title":"Man in the Loop: Reflections on Using AI Agents to Build Infrastructure","type":"posts"},{"content":"","date":"18 March 2026","externalUrl":null,"permalink":"/tags/openrouter/","section":"Tags","summary":"","title":"OpenRouter","type":"tags"},{"content":"","date":"18 March 2026","externalUrl":null,"permalink":"/tags/pi.dev/","section":"Tags","summary":"","title":"Pi.dev","type":"tags"},{"content":"","date":"17 March 2026","externalUrl":null,"permalink":"/tags/gitops/","section":"Tags","summary":"","title":"GitOps","type":"tags"},{"content":" The Current State: A Solid Cluster, But with an Achilles\u0026rsquo; Heel # TazLab today is an infrastructure that works. I have a Kubernetes cluster on Proxmox with Talos OS, a GitOps pipeline managed by Flux, metrics collected by Prometheus and visualized in Grafana, and etcd encrypted at rest. From the outside, it\u0026rsquo;s a setup that inspires confidence.\nBut looking from the inside, there is a problem that keeps me up at night.\nSecret management is handled by Infisical on its free plan. It works: it syncs secrets to Kubernetes via the External Secrets Operator, pods use them, life goes on. 
However, Infisical\u0026rsquo;s free plan imposes a limit I can no longer accept: it does not support automatic secret rotation.\nSecrets don\u0026rsquo;t rotate. Database credentials are static. If a key is compromised, the response is manual.\nThe \u0026ldquo;Why\u0026rdquo;: When AI Exposes the Problem # The turning point didn\u0026rsquo;t come from an incident, but from a reflection. I started using AI tools regularly in my workflow: Gemini CLI, Claude Code, and other agents with access to the shell and filesystem. These tools are powerful, but they have an annoying habit: logging everything. Prompts, output, session context. Potentially, even fragments of secrets that appear in command responses.\nAt that point I realized that my secret security model was fragile by design. Not because Infisical does a bad job, but because static and long-lived secrets are inherently vulnerable. A secret that never rotates is a ticking time bomb.\nThe professional answer to this problem has a precise name: dynamic secrets and automatic key rotation.\nThe Target Architecture: Vault as the Center of Gravity # The choice fell on HashiCorp Vault Community Edition, installed as a pod inside the cluster itself. It\u0026rsquo;s a deliberately ambitious choice, probably overkill for a home lab, but it\u0026rsquo;s exactly the kind of overkill I want. Vault is the de facto industry standard for secret management in enterprise environments. Learning it here, in my lab, means bringing real skills to the real world.\nThe model I want to implement works like this:\nVault generates secrets dynamically and manages their expiration and rotation. External Secrets Operator intercepts changes and syncs the new secrets to Kubernetes as native Secret objects. Reloader detects changes in Secrets and ConfigMaps and automatically triggers a reload of the affected pods. 
The result: no static credentials, no manual intervention, no indefinite exposure window.\nThe New Node: Oracle Cloud Always Free # To host Vault robustly and separately from the main infrastructure, I am adding a second cluster to TazLab. The chosen platform is Oracle Cloud Infrastructure, which offers a generous and stable Always Free tier:\nControl Plane: VM with 8 GB of RAM Worker: VM with 16 GB of RAM OS: Talos OS, same as the local cluster — operational consistency first This Oracle cluster will become TazLab\u0026rsquo;s security node: it will host Vault, be reachable via VPN, and will not depend on the physical hardware at home.\nTailscale: The Glue That Holds Everything Together # The most critical piece of this architecture is not Vault — it\u0026rsquo;s the mesh VPN.\nTo understand why, you need to understand how Vault\u0026rsquo;s dynamic secrets for PostgreSQL work. When an application requests database credentials, Vault doesn\u0026rsquo;t return a password stored somewhere: it creates a PostgreSQL user on the spot, with a defined expiration, and deletes it when the lease expires. To do this, Vault needs direct access to the database with administrator privileges.\nIf Vault is on Oracle Cloud and PostgreSQL is on the local Proxmox cluster, a secure and permanent channel between the two is required. This is where Tailscale comes in: a modern, zero-config mesh VPN solution built on WireGuard. Every node in the network — local cluster, Oracle cluster, workstation — becomes part of the same private network, regardless of its physical location.\nThe VPN is not an implementation detail. It is the precondition that makes the entire architecture possible.\nPhased Approach: The Steps Along the Way # The work is structured in sequential phases, each of which must be stable before proceeding to the next.\nPhase 1 — Mesh VPN Configure Tailscale between the local Proxmox cluster and the new Oracle Cloud cluster. Verify bidirectional connectivity. 
No Vault, no dynamic secrets until this foundation is solid.\nPhase 2 — New Oracle Cluster Provisioning the Talos cluster on Oracle Cloud via Terragrunt. Integration with the existing GitOps repo. The cluster must be managed by Flux exactly like the local cluster.\nPhase 3 — HashiCorp Vault Deploy Vault on the Oracle cluster. Configure the PKI engine, the PostgreSQL secrets engine, and access policies. Progressive migration of secrets from Infisical to Vault.\nPhase 4 — ESO + Reloader Integration Configure External Secrets Operator on both clusters to read from Vault. Integrate Reloader for automatic pod reloading. Test the full cycle: rotation → sync → reload.\nFuture Outlook: The Ephemeral Cluster Becomes Reality # This roadmap is not just a list of tools to install. It is the step that transforms TazLab from a solid infrastructure to a truly ephemeral one.\nThe ultimate goal is a cluster you can destroy and recreate at will, on any cloud provider, at any time. The bootstrap process will be fully automated: the new cluster connects to the Tailscale mesh, obtains certificates automatically, reaches Vault and retrieves its secrets, restores data from the S3 bucket. No manual intervention.\nAWS today, Google Cloud tomorrow, Oracle the day after. The platform becomes irrelevant.\nThis is \u0026ldquo;Terraforming the Cloud\u0026rdquo; in its most complete form: not terraforming a single cloud, but making your own ecosystem independent of all of them.\nTazLab has no fixed address. 
It only has a starting point.\n","date":"17 March 2026","externalUrl":null,"permalink":"/posts/tazlab-roadmap-hashicorp-vault-oracle-cloud/","section":"Posts","summary":"","title":"TazLab Roadmap: HashiCorp Vault and Oracle Cloud","type":"posts"},{"content":"","date":"15 March 2026","externalUrl":null,"permalink":"/tags/flux/","section":"Tags","summary":"","title":"Flux","type":"tags"},{"content":"","date":"15 March 2026","externalUrl":null,"permalink":"/tags/sdd/","section":"Tags","summary":"","title":"Sdd","type":"tags"},{"content":" The latest TazLab developments # This is one of those articles you write when things are going well. No incidents to report, no cluster rebuilds at two in the morning, no Deployment refusing to start for incomprehensible reasons. The lab is running. The pipeline works. I\u0026rsquo;ve had time to think about processes instead of fighting problems.\nIn the last few sessions, two concrete things happened that are worth documenting: I implemented a Spec-Driven Development system as a pure Markdown context, and I used that system to fix a Flux DAG problem that had been resisting for weeks. The result was cleaner than I expected.\nSpec-Driven Development as a context: zero code, just rules # The previous article on AGENTS.ctx described the basic idea of the context management system: every operational domain has its own CONTEXT.md, loaded on demand, with the rules already written. The agent doesn\u0026rsquo;t change — the context does.\nThe natural question I asked myself immediately after was: can I apply the same principle to the development process itself? Not as an external tool, not as a standalone system — as another context to open when needed.\nThe answer is yes, and it took me about half a day.\nSDD (Spec-Driven Development) is today a file AGENTS.ctx/sdd/CONTEXT.md with a four-phase workflow and a set of rules the agent follows when it opens the context. No code. No dependencies. 
Just Markdown versioned on Git.\nThe four phases # The workflow I defined is deliberately linear, with explicit gates between each phase. The agent cannot move to the next phase without approval.\nPhase 1 — Constitution. The foundational document of the project. Defines the immutable foundations: language, framework, naming conventions, constraints, prohibited libraries. Once approved, the constitution doesn\u0026rsquo;t change without explicit approval. It\u0026rsquo;s the document you return to when, during implementation, a doubt arises about \u0026ldquo;but didn\u0026rsquo;t we say to do it this way?\u0026rdquo;.\nPhase 2 — Specification. Defines what to build in full logical detail. Not the implementation — the logic. Expected inputs, desired outputs, behavior for each case. Edge cases and error handling. Acceptance criteria: when is the work considered done? This document is the source of truth. If during implementation something doesn\u0026rsquo;t add up, you come back here.\nPhase 3 — Plan. Defines how to build it technically. Which files to modify, which to create. Architectural choices and their rationale. Dependencies and execution order. The plan is proposed by the agent based on constitution + spec, and requires approval before proceeding.\nPhase 4 — Tasks. The spec is decomposed into a checklist of atomic micro-tasks, each marked as [ ] or [x]. Each task is a discrete, completable action. During implementation, the tasks file is the GPS: open it, see the next pending step, execute it, mark it complete.\nThe project inventory # Every SDD project lives in AGENTS.ctx/sdd/assets/\u0026lt;project-name\u0026gt;/ with its four files. The context maintains an inventory table in CONTEXT.md that updates when a project is created or completed. 
When I open the context I immediately see the state of everything: what\u0026rsquo;s in progress, what\u0026rsquo;s blocked, what\u0026rsquo;s completed with the relevant notes.\nThis has an important practical effect: every subsequent session doesn\u0026rsquo;t start from zero. The agent reads the inventory, identifies the in-progress project, loads tasks.md, and continues from the next pending step. The warm-up is almost nonexistent.\nThe structure is the same one I can hand to Gemini or any other agent: just have it read the main AGENTS.ctx/CONTEXT.md, which explains where to find the available contexts, and the agent is immediately oriented without further explanation.\nThe first real test: flux-dag-fix-v2 # Theory is cheap. The first real test of SDD came immediately, with a problem I had been carrying for weeks: the Flux kustomization DAG on the TazLab cluster was not behaving as expected.\nThe problem context # The TazLab cluster is managed entirely via GitOps with Flux. Every Kubernetes resource is defined in Git, Flux continuously reconciles the state of the repository with the state of the cluster. Kustomizations — the logical groupings of resources — have dependencies declared explicitly via dependsOn, and can have wait: true which forces Flux to wait until all resources in the kustomization are ready before proceeding with the dependents.\nThe problem already had a precise analysis behind it: a structured document with a table of problems identified in the DAG, a target graph diagram, detailed sections for each fix, a risk matrix, and a summary of the 15 changes to apply. It wasn\u0026rsquo;t approximate documentation — it was a complete technical plan.\nThe difficulty was different: the plan was designed as a single solution to be applied all at once. All the changes, one commit, then final verification. 
Without a sequence of isolated steps, without a verification gate between one change and the next, without the ability to isolate exactly which change had introduced a problem if something went wrong.\nThe SDD import # I used the existing plan as a starting point to create the SDD project. The technical analysis was already done — what was missing was the execution structure. The phases did their work:\nThe constitution fixed a fundamental constraint I had never formally stated in previous sessions: one change at a time, each verified with a complete destroy+create cycle before proceeding to the next. It seems obvious, but without it written down somewhere, it\u0026rsquo;s easy to give in to the temptation to bundle multiple fixes into a single commit \u0026ldquo;to save time\u0026rdquo; — which is exactly what had led to the confused situation in the first place.\nThe specification forced me to state the actual root causes, not the symptoms. The symptoms were kustomizations stuck in NotReady, dependencies that wouldn\u0026rsquo;t unblock, pods that weren\u0026rsquo;t starting in the right order. The causes were distinct and separate, and required separate fixes.\nThe plan decomposed the work into 14 isolated steps, each with its own verification. Not 14 commits in blind sequence — 14 steps each with a destroy+create cycle and a precise set of conditions to verify before considering it complete.\nThe tasks file became the operational checklist. Each session: open tasks.md, see the next pending step, execute it, mark it complete, close. Next session: reopen, continue.\nThe root cause # Once the problem was properly structured, the main cause emerged clearly.\nThe infrastructure-operators-core kustomization was grouping together two fundamentally different categories of resources:\nLightweight controllers: cert-manager, traefik, Reloader, Dex, OAuth2-proxy, cloudflare-ddns. Relatively fast Helm charts, install in 1-2 minutes. 
Heavy charts: kube-prometheus-stack and postgres-operator. The former in particular has an installation that can take 10-15 minutes on slow hardware. The problem with wait: true on this kustomization was structural: Flux waits for all resources in the kustomization to be in Ready state before unblocking the dependents. With kube-prometheus-stack inside operators-core, adding wait: true meant blocking the entire graph for 15 minutes every time. All dependent kustomizations — infrastructure-bridge, infrastructure-instances, apps-static, apps-data — remained stuck waiting for Prometheus to finish installing.\nThis is a DAG design error, not a configuration error. I had mixed resources with radically different convergence times in the same graph node, and then tried to put a gate on that node. The gate was correct in principle — wait: true on operators-core ensures cert-manager is ready before certificates are requested — but impossible in practice as long as the node contained heavy charts.\nThe fix # The fix was to separate the concerns. I removed ../monitoring and ../postgres-operator from the infrastructure/operators/core/kustomization.yaml kustomization, leaving them in their dedicated kustomizations (infrastructure-monitoring and infrastructure-operators-data) which already existed and already managed their own lifecycle autonomously.\n# infrastructure/operators/core/kustomization.yaml — after the fix resources: - ../cert-manager - ../traefik - ../reloader - ../dex - ../auth - ../cloudflare-ddns # kube-prometheus-stack removed → managed by infrastructure-monitoring # postgres-operator removed → managed by infrastructure-operators-data With this change, operators-core contained only lightweight charts. The complete installation took 2-3 minutes. 
wait: true became safe to enable: the gate ensures that cert-manager, traefik, and the other fundamental controllers are operational before dependent kustomizations start creating resources that require them.\nThe final destroy+create cycle declared the blog online in 8 minutes and 20 seconds — the lightweight critical path was working exactly as designed. The PostgreSQL database, with the S3 restore running in the background, and the dependent services (Mnemosyne MCP, pgAdmin) completed at around 12-13 minutes. These times are within expectations: the restore is not on the blog\u0026rsquo;s critical path; it runs in parallel while the upstream pods are already serving traffic.\nWhat SDD changed in this session # I would be dishonest if I said that without SDD the problem would have been unsolvable. I probably would have solved it anyway, but with more attempts and more disorganized commits, and I would almost certainly have introduced regressions along the way.\nWhat SDD changed is the working mode: instead of proceeding by local attempts — \u0026ldquo;let\u0026rsquo;s try removing this dependency and see what happens\u0026rdquo; — I had to first formally state what was going wrong and why, then design a sequence of verifiable fixes, then execute them one at a time with explicit confirmation between each.\nThis discipline has a cost in terms of initial time. It has an enormous benefit in terms of clarity: when you\u0026rsquo;re on the tenth step out of fourteen and something isn\u0026rsquo;t behaving as expected, you know exactly what you\u0026rsquo;ve already verified, what you\u0026rsquo;ve ruled out, and where to look.\nThe working rhythm with contexts # There\u0026rsquo;s a side effect of the context system that I hadn\u0026rsquo;t anticipated when I designed it, and which has turned out to be more valuable than I expected: the working rhythm has changed.\nBefore the context system, every session had a non-negligible bootstrap cost. 
Reopening a session meant re-explaining where we were, what the state of the project was, what rules to follow. With complex projects, this could require several exchanges before being operational.\nToday the pattern has become: open the terminal, load the context, operational in a few seconds. The context brings with it the rules, the project state, the next step to execute. Close the terminal, reopen, I\u0026rsquo;m back exactly where I was.\nThis has also changed how I think about new features. When I want to add a new capability to my workflow, I no longer think \u0026ldquo;I need a new specialized agent\u0026rdquo;. I think \u0026ldquo;I need a new context with the right rules\u0026rdquo;. Write the CONTEXT.md, define the expected behavior, and every agent that reads it will behave consistently.\nThe portability advantage is real. Switching to Gemini from Claude requires resetting nothing: just have it read the main AGENTS.ctx/CONTEXT.md, which explains the structure of the system, where to find the available contexts and the general rules. The agent is immediately oriented. There\u0026rsquo;s no lock-in on any specific tool.\nReflections # This leg of the journey confirmed something I had intuited but not yet experienced directly: the structure of the process has an impact on output quality just as much as technical capabilities.\nThe Flux DAG problem wasn\u0026rsquo;t difficult once correctly stated. The difficulty was in stating it correctly after weeks of disorganized attempts that had accumulated noise. SDD didn\u0026rsquo;t add technical capabilities — it added the framework for using those capabilities in an orderly way.\nThere\u0026rsquo;s another thing worth noting: the system is deliberately simple. There\u0026rsquo;s no tool to install, no database to configure, no server to maintain. They are Markdown files in a Git folder. This simplicity is not a limitation — it\u0026rsquo;s a deliberate design choice. 
A system that depends on a few ubiquitous tools is a system that survives changes in the ecosystem and works on any machine, with any agent.\nThe natural next step is to use this same system for future projects, accumulating over time an inventory of completed specs, plans, and tasks that documents not only what was built, but why it was built that way.\n","date":"15 March 2026","externalUrl":null,"permalink":"/posts/sdd-context-dag-fix-first-shot/","section":"Posts","summary":"","title":"SDD in half a day: a context with rules, and the cluster DAG fixed on the first attempt","type":"posts"},{"content":" The end state # Today I migrated Mnemosyne from the deprecated SSE protocol to Streamable HTTP. But this is not an article about a technical migration. It is an article about what it means when your Kubernetes cluster becomes boring — in the good sense of the word.\nI made the commit, waited two minutes, and the new pod was running with the new configuration. No manual intervention, no kubectl apply, no panic. Flux detected the change in the Git repository, the ImagePolicy pointed to the new image built by the GitHub Action, and the Deployment was updated.\nThis is not a configuration I put together this morning. It is the result of months of iterations, full cluster rebuilds, CI/CD pipelines that failed and were repaired, ImagePolicies that did not recognize the correct tags. But today, finally, it works.\nThe migration as a case study # Mnemosyne is the MCP server that manages my semantic memory. It exposes tools for ingestion, search, and management of technical memories, using PostgreSQL with pgvector for semantic similarity. Until yesterday it used the SSE (Server-Sent Events) protocol to communicate with MCP clients.\nThe problem: the oh-my-pi client did not handle the SSE protocol correctly. It required the client to maintain a persistent GET connection on /sse while sending POST requests on /message. 
But oh-my-pi treated SSE as a simple HTTP POST, without a background listener.\nThe solution was not to fix the client, but to migrate to the new standard: Streamable HTTP. This protocol uses a single POST endpoint (/mcp) that returns an SSE response when necessary. No complex session management, no separate listeners.\nThe migration was straightforward:\nUpdated mcp-go from v0.44.0 to v0.45.0 Replaced NewSSEServer() with NewStreamableHTTPServer() Changed the endpoint from /sse + /message to /mcp Updated MCP_TRANSPORT from \u0026quot;sse\u0026quot; to \u0026quot;http\u0026quot; in the Deployment Four minimal changes. The code compiled on the first attempt. I committed, pushed, and the cluster did the rest.\nThe GitOps pipeline that works # The CI/CD pipeline is deliberately simple:\nCommit → GitHub Action → Build image → Push to registry → Flux reconciles → Deploy There are no multiple stages, no approval gates, no deployments to separate environments. It is a home lab, not an enterprise. But this simplicity is a feature, not a limitation.\nWhen I commit to mnemosyne-mcp-server, the GitHub Action:\nChecks out the code Builds the Docker image with a tag based on the run number and full commit SHA Pushes to Docker Hub as tazzo/mnemosyne-mcp:mcp-\u0026lt;run_number\u0026gt;-\u0026lt;full_sha\u0026gt; Meanwhile, in the cluster:\nFlux has an ImageRepository that monitors Docker Hub An ImagePolicy selects the most recent image The Deployment has a {\u0026quot;$imagepolicy\u0026quot;: \u0026quot;flux-system:mnemosyne-mcp\u0026quot;} annotation that Flux uses for auto-update When it detects a new image, it updates the Deployment Kubernetes rolls out the new pod Total time: 2-3 minutes from push to running pod.\nThe AGENTS.ctx context system # But the most interesting part is not the pipeline. It is how I structured the operational procedures.\nI created a context system in AGENTS.ctx/ that defines rules, workflows, and memory for each type of activity. 
Each context has:\nA CONTEXT.md file describing the project and its rules Asset files with specific prompts, templates, or resources An inventory of projects, statuses, and technical debt When I open a context, the agent I am using immediately becomes specialized. For example:\nblog-writer: Defines a 5-phase workflow (Planning → Writing → Review → Translation → Publish) with rules for style, formatting, and GitOps publication mnemosyne-mcp-server: Documents the MCP server, code structure, environment variables, and build/deploy procedures tazlab-k8s: Describes the Kubernetes cluster, Flux resources, and how to interact with it This article is the second I have written using the blog-writer context. The process has become almost automatic: I open the context, decide the key points, the agent writes, I review. No more endless iterations with generic prompts. The rules are already there, ready.\nThe vision: automatic procedures for Mnemosyne # The next step is to create a context for memory ingestion in Mnemosyne.\nCurrently, when I want to save a technical memory, I have to:\nFormat the content Manually call the ingest_memory tool Verify it was saved correctly With a dedicated context, this will become automatic. The agent will know:\nWhich format to use for memories How to structure content for semantic search When to save (e.g., at the end of a work session) How to verify the save was successful Just open the context and say \u0026ldquo;save what we did today.\u0026rdquo; Everything else is handled by the rules.\nThe multi-agent paradigm redesigned # For a long time, the prevailing paradigm for LLM automation has been \u0026ldquo;use different specialized agents for different tasks.\u0026rdquo; One agent for code, one for writing, one for data.\nWith the context system, this reasoning is overturned — but in a more subtle way than it might seem.\nFor how I work today, alone, the optimal configuration is a generic agent + N contexts loaded on-demand. 
When I open the blog-writer context, the agent already knows how to structure an article, which rules to follow, how to publish it. When I open mnemosyne-mcp-server, it knows the code structure, environment variables, the CI/CD pipeline. The agent does not change — the context does.\nBut the same system scales horizontally. In the future, I could deploy multiple separate agents directly on the Kubernetes cluster — each with its own context already loaded as a ConfigMap or mounted as a volume. One agent responsible for cluster maintenance, one dedicated to ingesting memories into Mnemosyne, one that monitors Flux deploys. Each autonomous, each specialized, each with a folder of contexts covering the operational situations it might encounter.\nThe point is that contexts are portable and composable. They are not tied to a single agent. They are units of operational knowledge that can be distributed, mounted, combined. Today I use them interactively. Tomorrow they could be the foundation of an autonomous automation system.\nThis reduces management complexity:\nA single format for operational knowledge (structured Markdown) Contexts versionable on Git, centrally updatable Same structure for interactive use and autonomous deployment It is like having a library of operational procedures that works both when I browse them myself and when an agent running on a pod reads them.\nThe cluster in good health # Back to the beginning: the cluster is stable. This does not mean there are no problems — there always are. But it means the problems are manageable, and the procedures are repeatable.\nWhen I had to migrate Mnemosyne to Streamable HTTP, I did not have to:\nRebuild the development environment Manually configure environment variables Debug the CI/CD pipeline Relearn how Flux works I simply:\nOpened the mnemosyne-mcp-server context Made the code changes Committed and pushed The rest happened by itself. 
This is the result of having documented, iterated, and built solid procedures over time.\nThe future pipeline # The pipeline today is simple. In the future it will become richer:\nAutomated tests: Every PR triggers tests before merge Staging environments: Deploy to a separate environment before production Automatic rollbacks: If health checks fail, roll back to the previous version Notifications: Slack or email when a deploy completes or fails But the foundation is there, and it is solid. Every new feature will be an extension, not a refoundation. This is the advantage of having built the foundations correctly.\nWhat we learned # This \u0026ldquo;leg\u0026rdquo; of the journey confirmed to me that:\nGitOps is not just theory: When it works, you forget it exists. You commit, and the code reaches production. Contexts change the way you work: In my case, working alone, a generic agent + well-defined contexts has turned out to be more convenient and manageable than many separate agents. It is not a universal law, but for this workflow it works well. Documentation is code: The CONTEXT.md files are alive. They are updated, versioned, and used every day. Simplicity wins: A pipeline with 3 steps that works is better than one with 10 that you do not know how to configure. The cluster is mature. Not \u0026ldquo;complete\u0026rdquo; — it never will be. 
But mature enough to let me work on interesting things instead of putting out fires.\n","date":"14 March 2026","externalUrl":null,"permalink":"/posts/mature-cluster-gitops-agent-contexts-mnemosyne/","section":"Posts","summary":"","title":"A mature cluster: automated deploys, agent contexts, and the Mnemosyne MCP migration","type":"posts"},{"content":"","date":"14 March 2026","externalUrl":null,"permalink":"/tags/mcp/","section":"Tags","summary":"","title":"Mcp","type":"tags"},{"content":"","date":"14 March 2026","externalUrl":null,"permalink":"/tags/mnemosyne/","section":"Tags","summary":"","title":"Mnemosyne","type":"tags"},{"content":" The Problem: Session Amnesia # Every time I restart a terminal and open a new session with an AI agent, I face the same problem: I have to re-explain everything from scratch. Where we are, what we\u0026rsquo;re doing, what the project rules are, what problems we\u0026rsquo;ve already solved.\nIt\u0026rsquo;s a frustrating cycle. The agent doesn\u0026rsquo;t remember anything from the previous session. I have to manually re-inject the context, or hope the system has some persistence mechanism — but often these mechanisms are opaque, inefficient, or simply don\u0026rsquo;t exist.\nThe problem becomes even more evident when working on multiple parallel projects. Each project has its own conventions, its structure, its unwritten rules. Loading everything into the same session is not just inefficient: it\u0026rsquo;s counterproductive.\nThe Context Window Limit # Language models have a physical limit: the context window. When the session starts filling up, performance degrades. Around 50% of capacity, the quality of responses visibly worsens. The model \u0026ldquo;forgets\u0026rdquo; initial instructions, loses coherence, repeats information.\nThis is context bloat: too much irrelevant information loaded into the same session. 
The solution isn\u0026rsquo;t more memory, but selective memory.\nTwo Tools, Two Purposes # Before arriving at the solution, it\u0026rsquo;s important to distinguish two different problems:\nMnemosyne is an MCP server I built for long-term memory. It records what I did, what problems I encountered, how I solved them. It\u0026rsquo;s a searchable archive: when a problem resurfaces, I search the memories and find the solution applied in the past. It\u0026rsquo;s useful for troubleshooting, automatic documentation, building a personal knowledge base.\nAGENTS.ctx answers a different problem: active context. I don\u0026rsquo;t want to remember what I did three months ago — I want the agent to know now where we are, what we\u0026rsquo;re doing, what rules to follow. And I want it to know without me having to repeat everything every time.\nMnemosyne is the historical diary. AGENTS.ctx is the operational brief.\nThe Architecture: Indirection and Selective Loading # The central idea of AGENTS.ctx is simple: don\u0026rsquo;t load everything, load only what\u0026rsquo;s necessary.\nThe structure is based on three levels:\nLevel 0: AGENTS.md (Entry Point) # In the working directory (/workspace), an AGENTS.md file contains basic instructions for the agent. It says what to do at startup, where to find contexts, how to manage them.\nThis file is lightweight, a few paragraphs. Its job is to point the way, not carry the load.\nLevel 1: AGENTS.ctx/CONTEXT.md (Base Context) # In the AGENTS.ctx/ folder, a CONTEXT.md file contains the base context: the list of available contexts, general rules that apply to all projects, folder structure.\nThis file is loaded automatically at startup. It\u0026rsquo;s the \u0026ldquo;operating system\u0026rdquo; of contexts: it provides the directory and fundamental rules.\nLevel 2: Specific Contexts # Each context has its own subfolder. 
They can be:\nProjects: tazpod/, ephemeral-castle/, tazlab-k8s/ Generic workflows: blog-writer/, plans/ — for repeatable activities Utilities: contexts that only load rules, get used, and are then closed When I say \u0026ldquo;work in context X\u0026rdquo;, the agent loads only that file. Nothing more, nothing less. Once the work is done, I close the session and start clean, ready for another context.\nComposite Contexts # Some work requires multiple contexts simultaneously. For example, \u0026ldquo;cluster\u0026rdquo; is a composite context that loads both ephemeral-castle (the Proxmox/Talos infrastructure) and tazlab-k8s (the Kubernetes configurations). The agent reads both files and merges the rules.\nThis allows working on complex systems without duplicating information.\nAgent-Agnostic by Design # A deliberate choice: everything is based on text files in simple folders. No databases, no proprietary formats, no lock-in.\nThis means I can use any agent (Gemini CLI, Claude Code, pi.dev), as long as it can read a text file and follow instructions.\nPortability is fundamental. I don\u0026rsquo;t want my workflow to depend on a specific tool. If tomorrow I discover a better agent, I want to be able to adopt it without rebuilding the whole system.\nInspiration and Attribution # The idea isn\u0026rsquo;t mine. I saw it in this video, which shows a similar approach for managing contexts with AI agents. 
I adapted the concept to my workflow, adding the layered structure, composite contexts, and integration with my existing system.\nHow It Works in Practice # The startup sequence is:\nThe agent reads /workspace/AGENTS.md Follows the instruction: \u0026ldquo;read AGENTS.ctx/CONTEXT.md\u0026rdquo; The base context lists available contexts When I say \u0026ldquo;context X\u0026rdquo;, the agent reads AGENTS.ctx/X/CONTEXT.md Structure of a Context # Each context can contain:\nCONTEXT.md: main instructions scripts/: interaction scripts (deploy, test, utility) docs/: additional documentation assets/: configuration files, templates, resources The structure is flexible. The important thing is that CONTEXT.md explains what\u0026rsquo;s there and how to use it.\nExample: tazpod Context # TazPod is a Go CLI for managing a nomadic, secrets-aware development environment. It provides:\nAn AES-256-GCM vault in RAM for secrets (mounted with tazpod unlock, zeroed with lock) Docker container with full toolchain (kubectl, terraform, helm, neovim, etc.) Automatic identity sync to S3 for portability Integration with Infisical for secrets management The tazpod/CONTEXT.md context explains to the agent the three-layer architecture (host CLI, tmpfs enclave, container), main commands, hardcoded paths, and custom procedures (like GitHub push with token).\nWhen I work on tazpod, the agent immediately has the complete picture: I don\u0026rsquo;t need to explain what the vault is, how the enclave works, or where the files are. 
The context is compact and focused.\nTrade-offs and Lessons Learned # What Works Well # Explicit loading: I know exactly what gets loaded Clean separation: each project has its own space Zero magic: no auto-discovery that loads unexpected things Portability: works with any agent What Could Improve # Manual management: I have to update tables when adding contexts No inference: the agent doesn\u0026rsquo;t guess the context, it must be explicit Initial overhead: requires some setup The main trade-off is between convenience and control. I chose control.\nConclusion: Compact Context, Better Performance # AGENTS.ctx solves a practical problem: avoiding repeating the same things every time I open a session. The solution isn\u0026rsquo;t more memory, but organized memory.\nIndirection, selective loading, separate contexts. The agent has only what\u0026rsquo;s necessary for the current work. No bloat, no degradation.\nAnd when I switch agents, the system comes with me.\n","date":"13 March 2026","externalUrl":null,"permalink":"/posts/ai-context-management-agents-ctx/","section":"Posts","summary":"","title":"AGENTS.ctx: Context Management for AI Agents Without Re-Explaining Everything","type":"posts"},{"content":"","date":"9 March 2026","externalUrl":null,"permalink":"/tags/cloud-native/","section":"Tags","summary":"","title":"Cloud-Native","type":"tags"},{"content":"","date":"9 March 2026","externalUrl":null,"permalink":"/tags/developer-tools/","section":"Tags","summary":"","title":"Developer Tools","type":"tags"},{"content":" Introduction: The Frustration of Walled Gardens # When working on a complex infrastructure like TazLab — a nomadic ecosystem built on Talos Kubernetes, GitOps with Flux CD, and a reduced attack surface through Zero Trust — the need for intelligent automation becomes critical quickly. 
I am not talking about procedural automation (that is solved with Terragrunt and Ansible), but about cognitive assistance: an AI agent capable of reading project context, reasoning about dependencies, suggesting refactors, and debugging complex orchestration issues.\nInitially, I tried to solve this problem by relying on mainstream tools: Google\u0026rsquo;s Gemini CLI and Cloud Code. Both promised native integration with Gemini APIs and a smooth workflow. However, after weeks of intensive use, I ran into structural limitations that made it impossible to adapt them to TazLab\u0026rsquo;s requirements.\nThis analysis documents my path toward Pi.Dev (pi-coding-agent), a minimal but radically configurable tool I adopted as the foundation for building specialized agents. The comparison is not academic: it reflects concrete needs that emerged from managing a home lab in production.\nPhase 1: The Limits of \u0026ldquo;Convenience-First\u0026rdquo; Solutions # Gemini CLI: Power Limited by Architectural Choices # Gemini CLI is Google\u0026rsquo;s official tool for interacting with Gemini models via the command line. My first impression was positive: it supports multi-modality (text, images, video), manages persistent sessions, and integrates the Model Context Protocol (MCP) to extend capabilities through external servers.\nConceptual Deep-Dive: Model Context Protocol (MCP) The Model Context Protocol is a JSON-RPC protocol that allows AI agents to invoke external \u0026ldquo;tools\u0026rdquo; (functions exposed by remote or local servers). For example, an MCP server can provide tools for querying a Postgres database, searching a vector knowledge base, or reading metrics from Prometheus. The protocol supports two transport modes: Stdio (inter-process communication on the same machine via stdin/stdout) and SSE (Server-Sent Events over HTTP, for distributed integrations).\nThe problem with Gemini CLI surfaces when you want to do more than Google anticipated. 
Here are the limitations I encountered:\nChronic slowness: Even on the Pro plan, Gemini CLI is noticeably slow. Responses arrive with significant latency — sometimes tens of seconds for queries that require context reading from the cluster. In an iterative debugging workflow, where you query the agent multiple times to refine a diagnosis, this slowness becomes a tangible drag on productivity.\nRigid extensibility via MCP: Although Gemini CLI supports MCP, configuration is limited to a JSON file (settings.json) that specifies which external servers to invoke. It is not possible to inject custom logic directly into the agent loop without going through a separate MCP server. This means that if I wanted an agent that, for example, automatically read Flux CD logs from the Kubernetes cluster before answering a question, I had to build a dedicated MCP server to expose that tool — for every single feature.\nNo control over the system prompt: Gemini CLI uses a hard-coded system prompt. It is not possible to modify it to instruct the agent about project-specific conventions (for example, \u0026ldquo;When writing Kubernetes manifests, always use Kustomize instead of Helm\u0026rdquo; or \u0026ldquo;For every commit, add a git note with the timestamp\u0026rdquo;). This drastically limits specialization.\nCloud Code: Speed at the Cost of Quota # Cloud Code is the next tool I tried — from the terminal, not as a VS Code extension which does not fit my workflow. The difference compared to Gemini CLI is immediate: responses are noticeably faster. For someone working on infrastructures like TazLab, where every query involves reading controller logs, Flux state, and kubectl output, response speed is not a cosmetic detail.\nThe problem is plan sustainability. On the Pro plan, a Kubernetes debugging session — reading kustomize-controller logs, validating manifests, iterating on a Flux issue — is enough to exhaust the quota. 
I struggle to get past two hours of intensive work before hitting the limit.\nWhy Cloud Code was not sustainable for TazLab:\nUnsustainable quota for Kubernetes workloads: The Pro plan depletes rapidly on tasks requiring intensive context. A single Flux debugging session consumes enough to block you for the rest of the day. This is not an edge case: it is the norm for anyone working on complex infrastructures.\nNo scripting capability: I cannot invoke Cloud Code from a Bash script to automate repetitive tasks. It is a closed conversational interface, not composable in pipelines.\nVendor Lock-In: The entire ecosystem pushes toward Google Cloud services. This philosophy is the opposite of TazLab\u0026rsquo;s, where digital sovereignty is a foundational principle. I do not want my ability to work to depend on the availability — or the generosity of the plan — of an external cloud service.\nPhase 2: Discovering Pi.Dev — Unix Philosophy for AI Agents # After weeks of frustration, I started looking for alternatives that would satisfy these requirements:\nRadical extensibility: The ability to modify every aspect of agent behavior. Modularity: Support for multiple specialized agents, each with its own system prompt and tool set. Multi-model: The ability to use different models (Anthropic Claude, Google Gemini, OpenAI, Ollama) depending on the task. Key is native support for OpenRouter, which provides access to practically any model available on the market with a single API key. One of the planned experiments is a systematic benchmark of frontier models on Kubernetes contexts — to determine which offer the best quality/cost ratio for tasks like Flux debugging, manifest analysis, and configuration generation. Scripting-friendly: Usable both interactively and in automated pipelines. Minimal: No dependencies on IDEs or heavy frameworks. Digging through open-source projects and community experiments, I arrived at Pi.Dev (pi-coding-agent). 
The analogy I often use to describe it: Pi.Dev is to Gemini CLI/Cloud Code what Neovim is to Visual Studio Code. It is minimal, configurable down to the finest detail, and requires an initial investment to master, but pays off with total flexibility.\nIt is worth adding a piece of context: OpenClaw, the coding agent that has gained considerable attention in the developer community in recent months, is built on Pi.Dev. This is not a marginal detail — it means the framework I use as a foundation has already proven it can hold up under real production loads and ambitions.\nAnatomy of Pi.Dev: Component-Based Architecture # Pi.Dev is written in TypeScript and distributed as an npm package. The architecture is based on three fundamental concepts:\nAgent: An AI instance with a specific system prompt, an associated model, and a set of available tools. Skill: Reusable modules that add contextual capabilities (e.g., \u0026ldquo;when the user asks to work on Kubernetes, load instructions from the KUBERNETES.md file\u0026rdquo;). Extension: Custom functions (tools) the agent can invoke, written in TypeScript and integrated through a simple interface. Conceptual Deep-Dive: Agent vs Assistant vs Tool It is important to distinguish the levels of abstraction. An Assistant (like Gemini or Claude) is the underlying model, provided by a vendor (Google, Anthropic). An Agent is a specific configuration of that assistant, with a system prompt and a set of tools. For example, I can have an agent called \u0026ldquo;k8s-debugger\u0026rdquo; that uses the claude-sonnet-4 model, with a system prompt that instructs it to always read Flux logs before responding, and with access to custom tools for querying Prometheus. A Tool is a function the agent can invoke. Pi.Dev allows defining tools both as extensions (local code) and as skills (predefined bundles of prompt + tools).\nThe key difference is the philosophical approach. 
Gemini CLI and Cloud Code are finished products — tools designed for a mainstream use case and then sealed. Pi.Dev is a toolkit — it provides the building blocks (conversation management, model invocation, MCP protocol) and lets the user construct their own agent architecture.\nPhase 3: Use Cases — Specialized Agents for the Kubernetes Ecosystem # Once I understood Pi.Dev\u0026rsquo;s potential, I started mapping the concrete use cases for TazLab. Here are two I am exploring:\nCase 1: The \u0026ldquo;Blog Writer\u0026rdquo; Agent (This Article) # The first agent I configured is the one writing this article. Its system prompt (CLAUDE.md in the repository) instructs it to:\nRead the existing blog documentation (~/kubernetes/blog-src/content/posts/) to understand the style. Follow a structured template (Introduction → Phases → Reflections). Expand each technical concept with \u0026ldquo;Deep-Dive\u0026rdquo; paragraphs. Use a professional first-person singular tone. This agent uses Anthropic\u0026rsquo;s claude-sonnet-4 model because it excels at long, structured technical writing. When I ask it to write an article, it autonomously reads existing examples, identifies the appropriate tags, and generates a complete Markdown file with TOML frontmatter.\nWhy this would not have been possible with Gemini CLI: With Gemini CLI, I would have had to:\nBuild an MCP server exposing a \u0026ldquo;read_blog_posts\u0026rdquo; tool. Launch the server in the background. Configure Gemini CLI to connect to the server. Manually write the system prompt every time, because I cannot save it in the configuration. Parse the text output and save it manually. 
With Pi.Dev, all of this is configured once in the agent file, and every invocation is automatic.\nCase 2: The \u0026ldquo;K8s Watchdog\u0026rdquo; Agent — Proactive Cluster Surveillance # The second use case is the most ambitious: a pod with a minimal version of Pi.Dev deployed inside the Kubernetes cluster, acting as a general-purpose watchdog over all critical infrastructure components.\nThe architecture is a Kubernetes CronJob with a configurable interval — probably between ten and thirty minutes. On each run, the agent queries the cluster on multiple fronts using the in-cluster client with a strict RBAC ServiceAccount: read-only access to resources, logs, and events.\nMonitoring scope:\nGitOps (Flux): HelmRelease, Kustomization, GitRepository state. Detects failed reconciliations, stalled resources, or revisions lagging behind the latest in the repository. Storage (Longhorn): Volume health, replica state, recent backups. Identifies volumes in degraded state or without snapshots within the expected interval. Database: Critical database pods (Postgres/CrunchyPostgres and other StatefulSets). Verifies they are Running, with no abnormal restarts and a responding liveness probe. General pods: Any pod in CrashLoopBackOff, OOMKilled, ImagePullBackOff, or with a restart count above a configurable threshold. Operational flow:\nNominal: If everything is healthy, produces a concise report and terminates. Anomaly detected: Switches to investigative mode — reads correlated Kubernetes events, logs from the failing component, and the state of dependent resources. Elevated cause: If a pod restarts too frequently, correlates with OOMKilled events, memory limits, and application logs. If a Longhorn volume is degraded, verifies node and replica state. Structured report: Probable diagnosis, ordered list of options to verify manually, and concrete solutions to evaluate — without applying anything autonomously. 
The distinction is deliberate: the agent has full visibility but zero executive power. The goal is not to create an autonomous system that could worsen an already critical situation, but to reduce triaging from \u0026ldquo;I read everything myself\u0026rdquo; to \u0026ldquo;I read the report and decide.\u0026rdquo;\nExpected model: an affordable model via OpenRouter — the analysis scope is broad but structured, and execution frequency makes cost per token a non-negotiable constraint.\nArchitectural Reflections: Toward an \u0026ldquo;Agent-Aware\u0026rdquo; Infrastructure # Adopting Pi.Dev is changing how I think about TazLab\u0026rsquo;s architecture. Traditionally, automation was divided into two categories:\nProcedural automation (Bash scripts, Terragrunt, Ansible) — repeatable, deterministic tasks. Human intervention (debugging, architectural decisions, refactoring) — tasks requiring reasoning. With configurable AI agents, a third category emerges: cognitive automation. Tasks that require reasoning but can be delegated to an agent given the right context.\nThe Trust Boundary Problem # However, this introduces a critical security challenge. An in-cluster AI agent with access to kubectl and cluster APIs potentially has the power to destroy the entire infrastructure. How do I manage this trust boundary?\nApproaches I am exploring:\nStrict RBAC: The in-cluster agent runs with a Kubernetes ServiceAccount with limited permissions. For example, it can read metrics and logs, but cannot delete resources or modify critical ConfigMaps.\nComplete Audit Trail: Every agent action is logged immutably (Loki + S3 backup). 
If the agent performs a destructive action, I can reconstruct the chain of events.\nHuman-in-the-Loop for Critical Actions: The agent can propose changes (e.g., \u0026ldquo;Here is a PR to scale Longhorn storage\u0026rdquo;), but application requires human approval via GitOps.\nSandbox Environments: Before deploying an agent to production, I test it on a staging cluster (TazLab\u0026rsquo;s \u0026ldquo;Green\u0026rdquo; cluster, not yet documented).\nThe \u0026ldquo;Agent-as-Operator\u0026rdquo; Pattern # A traditional Kubernetes Operator (written in Go with controller-runtime) reconciles a desired state declared in CRDs. The idea of an \u0026ldquo;Agent-as-Operator\u0026rdquo; is different: the agent does not reconcile a declared state, but responds to events and makes contextual decisions.\nConcrete example:\nTraditional Operator: \u0026ldquo;If the PVC exceeds 80% utilization, increase its size to X GB (hard-coded value).\u0026rdquo; Agent-as-Operator: \u0026ldquo;If the PVC exceeds 80%, analyze the growth patterns over the last 7 days, verify available storage budget, consult backup logs to ensure recoverability, and propose an optimal scaling plan.\u0026rdquo; This pattern does not replace traditional Operators (which are more efficient for deterministic tasks), but complements them for scenarios requiring flexibility.\nPhase 4: What\u0026rsquo;s Missing — Gaps and Future Directions # Despite the (measured) enthusiasm for Pi.Dev, there are clear gaps I am addressing:\nGap 1: Cost per Token and Multi-Model Orchestration # Using different models for different tasks is powerful, but it introduces budgeting complexity. Claude Sonnet is expensive (approximately $3 per million input tokens), while Gemini Flash is nearly free. I need to build logic for:\nIntelligent routing: simple tasks → affordable model, complex tasks → capable model. Cost monitoring: a dashboard tracking how many tokens I consume per agent/task. Pi.Dev does not provide this out-of-the-box. 
I am exploring integration with tools like LangSmith or building a custom dashboard with Prometheus + Grafana.\nGap 2: Agent Testing and Validation # How do I test that an agent works correctly? With traditional code, I write unit tests. With an AI agent, behavior is probabilistic. I am experimenting with:\nGolden Tests: I run the agent on known problems (e.g., \u0026ldquo;Debug this Flux error I know is caused by malformed YAML\u0026rdquo;) and verify the output contains the expected keywords. Regression Tests: Every time the agent resolves a problem, I save the input/output as a test case. If I change the system prompt, I re-run the tests to verify that desired behaviors have not regressed. Gap 3: Persistent State for In-Cluster Agents # An agent in a Kubernetes pod is by definition ephemeral. If the pod crashes, it loses the conversation memory. For long-running agents, I need to implement persistent state. Options:\nExternal database (Postgres): I save the conversation history and context. Kubernetes ConfigMap: For lightweight state (configurations, task queue). Conclusion: A Choice of Technical Sovereignty # Adopting Pi.Dev over Gemini CLI or Cloud Code was not driven by open-source fanaticism or an aversion to Google. It was a pragmatic choice based on TazLab\u0026rsquo;s architectural requirements:\nExtensibility: I need agents that behave exactly as I want, not as a BigTech product manager decided. Multi-Model: I want to choose the optimal model for each task, not be locked into an ecosystem. Deep Integration: Agents must live inside my ecosystem (TazPod, Kubernetes, Mnemosyne), not in a cloud walled garden. Sovereignty: I want to understand and control every aspect of the system, from the system prompt to the communication protocol. Pi.Dev, with its minimal and configurable philosophy, meets these requirements. 
It is the Neovim of AI coding agents: it has a steep learning curve, requires an initial investment, but pays off with total control.\nWhile writing this article (via the \u0026ldquo;blog-writer\u0026rdquo; agent built on Pi.Dev), I am still learning. The documentation is fragmented, some features are experimental, and there are edge cases to resolve. But that is precisely the point: I have the ability to resolve them. With Gemini CLI, if a feature does not exist, I can only open a GitHub issue and hope. With Pi.Dev, I can open the code, understand how it works, and contribute the patch.\nThis is the kind of technical empowerment I was looking for when I started the TazLab project. Adding Pi.Dev to the arsenal represents another step toward a truly sovereign ecosystem, where every component — from the OS (Talos) to the vault (TazPod) to memory (Mnemosyne) to agents (Pi.Dev) — is controllable, inspectable, and modifiable.\nIn the next articles, I will document the concrete implementation of the \u0026ldquo;K8s Watchdog\u0026rdquo; and the first results of the comparative benchmark of models on Kubernetes tasks. If this comparative analysis has intrigued you, I invite you to explore Pi.Dev and consider whether the philosophy of \u0026ldquo;minimal but radically configurable tool\u0026rdquo; fits your workflow.\nNote for readers: This article was written by a Pi.Dev agent configured as \u0026ldquo;Home Lab Blogger.\u0026rdquo; The irony is intentional — it is a practical demonstration of the article\u0026rsquo;s thesis. The agent autonomously read previous blog articles, identified the correct format, generated the TOML frontmatter, and produced this text following the style rules defined in its system prompt. The process was: pi --agent blog-writer --task \u0026quot;Scrivi analisi comparativa su Pi.Dev vs Gemini CLI/Cloud Code\u0026quot;. Generation time: ~3 minutes. Cost: ~$0.15 (Claude Sonnet 4, ~50k output tokens). 
\n","date":"9 March 2026","externalUrl":null,"permalink":"/posts/pi-dev-agent-architecture-comparative/","section":"Posts","summary":"","title":"Pi.Dev: Minimal Agent Architecture for the Cloud-Native Ecosystem","type":"posts"},{"content":" Enterprise Monitoring in a Home Lab: The (Uphill) Road to Stateless Grafana and Prometheus # Introduction: Beyond \u0026ldquo;Out of the Box\u0026rdquo; Monitoring # In a Home Lab that aims to be more than just a collection of containers, monitoring cannot be an afterthought. After stabilizing my Talos Linux cluster on Proxmox and consolidating distributed storage with Longhorn, I felt the need for granular visibility. I didn\u0026rsquo;t just need graphs; I needed an observability infrastructure that followed the same principles of resilience and immutability as the rest of the cluster.\nMany tutorials suggest installing kube-prometheus-stack with default values: Grafana saving data to a local SQLite database and Prometheus writing to a temporary volume. This solution, while quick, is antithetical to my vision of an \u0026ldquo;Enterprise Home Lab.\u0026rdquo; If a node fails and the Grafana Pod is rescheduled elsewhere without a persistent volume, I would lose every manually created dashboard, every user, and every configuration. I decided, therefore, to take the more complex path: a Stateless architecture for Grafana and Long-term Persistence for Prometheus, orchestrated entirely via GitOps with FluxCD.\nThe Architectural Strategy: Why \u0026ldquo;Stateless\u0026rdquo;? # The concept of a \u0026ldquo;stateless application\u0026rdquo; is fundamental in modern cloud-native architectures. For Grafana, this means that the Grafana Pod itself must not hold any vital state. I decided to use the existing PostgreSQL cluster (tazlab-db), managed by the CrunchyData Postgres Operator, as the backend for Grafana\u0026rsquo;s metadata.\nThe Reasoning: SQLite vs PostgreSQL # Why bother configuring an external database? 
In a standard installation, Grafana uses SQLite, a single-file database. While excellent for simplicity, SQLite in Kubernetes requires a dedicated PersistentVolumeClaim (PVC). If the PVC becomes corrupted or if there are file lock issues during a node migration (common with RWO volumes), Grafana won\u0026rsquo;t start. By using PostgreSQL, I shift the responsibility of persistence to a system I have already made resilient (with S3 backups via pgBackRest and high availability). This allows me to treat Grafana Pods as expendable: I can destroy and recreate them at any time, knowing the data is safe in the central database.\nThe Choice of Prometheus on Longhorn # For Prometheus, the situation is different. Prometheus is inherently \u0026ldquo;stateful\u0026rdquo; due to its time-series database (TSDB). While solutions like Thanos or Cortex exist to make it stateless, for my current data volume, it would be unnecessary overkill. I opted for a pragmatic approach: a 10GB volume on Longhorn with a 15-day retention policy. This ensures that historical data survives Pod restarts, while Longhorn\u0026rsquo;s distributed replication protects me from hardware failures of the physical Proxmox nodes.\nImplementation: Configuration and GitOps # The entire stack is defined via a FluxCD HelmRelease. 
This allows me to manage the configuration declaratively in the tazlab-k8s repository.\nThe Heart of the Configuration (Technical Snippet) # Here is how I declared the PostgreSQL integration and networking management:\nspec: values: grafana: enabled: true grafana.ini: database: type: postgres host: tazlab-db-primary.tazlab-db.svc.cluster.local:5432 name: grafana user: grafana env: GF_DATABASE_TYPE: postgres GF_DATABASE_HOST: tazlab-db-primary.tazlab-db.svc.cluster.local:5432 GF_DATABASE_NAME: grafana GF_DATABASE_USER: grafana envValueFrom: GF_DATABASE_PASSWORD: secretKeyRef: name: tazlab-db-pguser-grafana key: password service: type: LoadBalancer annotations: metallb.universe.tf/loadBalancerIPs: \u0026#34;192.168.1.240\u0026#34; metallb.universe.tf/allow-shared-ip: \u0026#34;tazlab-internal-dashboard\u0026#34; port: 8005 This configuration uses External Secrets (ESO) to inject the database password, syncing it directly from Infisical. It is a critical security step: no password is written in plain text in the Git code.\nThe Chronicle of Failures: An Obstacle Course # Despite the planning, the installation was a \u0026ldquo;Trail of Failures\u0026rdquo; that required hours of deep debugging. Documenting these errors is fundamental, as they represent the reality of a DevOps engineer\u0026rsquo;s work.\n1. The Ghost of SQLite (The Silent Failure) # After the first deploy, I noticed from the logs that Grafana was still trying to initialize an SQLite database in /var/lib/grafana/grafana.db. Despite having configured the database section in grafana.ini, the settings were being ignored.\nThe Investigation: I executed a kubectl exec into the Pod to inspect the generated configuration file. I discovered that, due to the way the Grafana Helm Chart processes values, some variables entered in grafana.ini were not being correctly propagated if they were not also present as environment variables. 
The Solution: I had to duplicate the configuration in both the grafana.ini section and the env section. Only then did Grafana \u0026ldquo;understand\u0026rdquo; it needed to point to PostgreSQL. It\u0026rsquo;s a frustrating behavior of complex charts: redundancy is sometimes the only way.\n2. The Postgres 16 Permissions Wall # Once the configuration issue was resolved, the Grafana Pod started crashing with a cryptic error: pq: permission denied for schema public.\nThe Investigation: I knew the database was active and the grafana user existed. However, since PostgreSQL 15 (and therefore also on the PostgreSQL 16 I run) the default permissions on the public schema are more restrictive: ordinary users no longer have the right to create objects in that schema. The Solution: I had to manually intervene on the database with an SQL session:\nGRANT ALL ON SCHEMA public TO grafana; ALTER SCHEMA public OWNER TO grafana; This step reminded me that, even in an automated world, deep knowledge of underlying systems (like database RBAC) is irreplaceable.\n3. Network Conflict: Port 8004 # The cluster uses MetalLB to expose services on a dedicated IP (192.168.1.240). During deployment, the Grafana service remained in a \u0026lt;pending\u0026gt; state.\nThe Investigation: I checked the service events with kubectl describe svc. MetalLB reported a conflict: \u0026ldquo;port 8004 is already occupied\u0026rdquo;. A quick analysis of my documentation revealed that mnemosyne-mcp was already using that port on the same shared IP. The Solution: I moved Grafana to port 8005. This highlights the importance of rigorous IP Address Management (IPAM) even in a lab environment, especially when using annotations like allow-shared-ip.\n4. The Silence of Node Exporter (Pod Security Standards) # After installation, the dashboards were visible but\u0026hellip; empty. No data from the nodes.\nThe Investigation: I checked the node-exporter DaemonSet. No Pods had been created. 
The controller returned a Pod Security Admission violation error: violates PodSecurity baseline:latest. node-exporter requires access to host namespaces (hostNetwork, hostPID) and hostPath to read hardware metrics, all of which the baseline Pod Security Standard forbids. The Solution: I had to \u0026ldquo;soften\u0026rdquo; the monitoring namespace by labeling it as privileged:\napiVersion: v1 kind: Namespace metadata: name: monitoring labels: pod-security.kubernetes.io/enforce: privileged It is a necessary compromise: to monitor hardware, the software must be able to \u0026ldquo;see\u0026rdquo; it.\nGitOps for Dashboards: The Magic Sidecar # Another pillar of this installation is dashboard automation. I don\u0026rsquo;t want to create graphs by hand by clicking in the interface; I want dashboards to be part of the code.\nI configured the Grafana Sidecar, a lightweight process that runs alongside Grafana and scans the cluster for ConfigMap objects with the label grafana_dashboard: \u0026quot;1\u0026quot;. When it finds one, it reads the dashboard JSON and loads it into Grafana. This transforms monitoring into a purely declarative system. If I had to reinstall everything from scratch tomorrow, my professional dashboards (\u0026ldquo;Nodes Pro\u0026rdquo;, \u0026ldquo;Cluster Health\u0026rdquo;) would automatically appear at the first boot.\nPost-Lab Reflections: What have we learned? # This \u0026ldquo;stage\u0026rdquo; of the TazLab journey was one of the most challenging in terms of troubleshooting. What does this setup mean for long-term stability?\nFailure Resilience: Now I can lose an entire node or corrupt the monitoring namespace without losing the history of my work. The PostgreSQL database is my \u0026ldquo;anchor.\u0026rdquo; Standardization: The use of privileged namespaces and specific ports on MetalLB is now documented and codified, reducing cluster entropy. 
Mental Scalability: Facing these problems forced me to dig into the specifications of Postgres 16 and the internal mechanisms of Kubernetes (PSA, MetalLB). This is true professional growth. In conclusion, observability is not just about \u0026ldquo;seeing graphs.\u0026rdquo; It is about building a system that is as reliable as the system it is meant to monitor.\n","date":"4 March 2026","externalUrl":null,"permalink":"/posts/enterprise-monitoring-grafana-prometheus-stateless/","section":"Posts","summary":"","title":"Enterprise Monitoring in a Home Lab: The (Uphill) Road to Stateless Grafana and Prometheus","type":"posts"},{"content":"","date":"4 March 2026","externalUrl":null,"permalink":"/tags/fluxcd/","section":"Tags","summary":"","title":"Fluxcd","type":"tags"},{"content":"","date":"4 March 2026","externalUrl":null,"permalink":"/tags/grafana/","section":"Tags","summary":"","title":"Grafana","type":"tags"},{"content":"","date":"4 March 2026","externalUrl":null,"permalink":"/tags/monitoring/","section":"Tags","summary":"","title":"Monitoring","type":"tags"},{"content":"","date":"4 March 2026","externalUrl":null,"permalink":"/tags/postgresql/","section":"Tags","summary":"","title":"Postgresql","type":"tags"},{"content":"","date":"4 March 2026","externalUrl":null,"permalink":"/tags/prometheus/","section":"Tags","summary":"","title":"Prometheus","type":"tags"},{"content":"","date":"28 February 2026","externalUrl":null,"permalink":"/tags/dex/","section":"Tags","summary":"","title":"Dex","type":"tags"},{"content":"","date":"28 February 2026","externalUrl":null,"permalink":"/tags/external-secrets/","section":"Tags","summary":"","title":"External-Secrets","type":"tags"},{"content":" Introduction: The Dashboard Protection Problem # When building a modern Kubernetes infrastructure, one of the most critical challenges that emerges quickly is managing access to operational dashboards. 
In my TazLab laboratory (a Talos Linux cluster on Proxmox with a complete GitOps stack), I had already deployed Grafana for monitoring, pgAdmin for database management, and an informational dashboard (Homepage) for navigation. All these components were accessible via Traefik Ingress, but none of them were protected by authentication. Anyone who could reach https://grafana.tazlab.net from my lab could access sensitive monitoring data without entering credentials.\nThis situation violated the fundamental Zero Trust principle that guides the entire Ephemeral Castle architecture. The objective of the day was therefore ambitious: implement a Single Sign-On (SSO) system using Google OAuth, where all dashboards would be protected behind a single authentication gateway. Users would need to log in once with their Google account, and then all subsequent accesses to various services would be automatically authorized, without further password prompts.\nThis \u0026ldquo;stage of the journey\u0026rdquo; of TazLab represented a significant turning point: the infrastructure was evolving from simply \u0026ldquo;functional\u0026rdquo; to \u0026ldquo;enterprise-ready\u0026rdquo;, where security was not an afterthought but a founding principle.\nPhase 1: OIDC Architecture and Strategic Choices # Before writing the first YAML manifest, I had to make a series of architectural decisions that would define the entire approach. There was no single correct path; each choice involved trade-offs that would affect the long-term stability of the system.\nWhy DEX and Not Keycloak? A Conscious Comparison # The most critical choice was the OIDC provider. Two options dominate the Kubernetes landscape: Keycloak and DEX. Keycloak is a complete ecosystem, extremely flexible, supported by a gigantic community, with a rich administration interface and dozens of connectors. 
DEX, on the other hand, is a minimalist tool: a Kubernetes-native OIDC provider that reads its configuration from YAML files, persists data via Kubernetes CRDs (Custom Resource Definitions), and has no web administration interface (everything is declarative).\nI chose DEX for one fundamental reason: philosophical alignment with my infrastructure. TazLab is built entirely around Kubernetes as the single source of truth. Flux CD manages declarative state through version control (Git). All secrets reside in Infisical and are synchronized via External Secrets Operator. Adding Keycloak meant introducing a new \u0026ldquo;fiefdom\u0026rdquo; of data (a separate database with its own lifecycle, backups, and dependencies) that would live outside the declarative paradigm. DEX, by contrast, leverages Kubernetes CRDs for persistence: every token, every authentication session, is a native Kubernetes object stored in etcd. This means that automatic etcd backups also protect the authentication system. It also means that disaster recovery is consistent with the rest of the infrastructure.\nThe downside of DEX is the lack of a rich web interface. If I need to modify the provider\u0026rsquo;s behavior (add a new connector, change configuration), I must edit YAML files and commit them to Git, not click in a UI. Initially, this limitation seemed restrictive. But after implementing the system, I realized it was a strength: traceability. Every change to DEX is a Git commit with an author, timestamp, and documented reason in a PR. There is no \u0026ldquo;administrator who clicked the wrong button\u0026rdquo;.\noauth2-proxy as a Traefik Middleware: The ForwardAuth Pattern # Once I chose DEX as the OIDC provider, I needed a proxy to intercept HTTP requests to my dashboards, verify whether the user was already authenticated with Google, and if not, redirect them to the authentication flow. 
The standard solution in the Kubernetes world is oauth2-proxy.\noauth2-proxy is a reverse proxy specialized in OAuth2 integration. It is typically deployed as a pod in Kubernetes and configured as a Traefik Middleware in the ForwardAuth pattern. In this architectural pattern, when a request arrives at a protected Traefik Ingress, Traefik does not pass the request directly to the backend application. Instead, it sends a verification request to the oauth2-proxy service, asking: \u0026ldquo;Is this client authenticated?\u0026rdquo; If oauth2-proxy responds with HTTP 200, it means \u0026ldquo;yes, it\u0026rsquo;s valid\u0026rdquo;, and Traefik proceeds. If it responds with 401, Traefik blocks the request and redirects the client to the login service.\nDeep-Dive Conceptual: Traefik\u0026rsquo;s ForwardAuth Pattern\nThe ForwardAuth pattern is an implementation of the \u0026ldquo;external authorization service\u0026rdquo; paradigm commonly used in nginx (via auth_request). The idea is elegant from an architectural standpoint: the authentication decision is delegated to a specialized service, which remains completely decoupled from the actual application. This means I can protect any application—Grafana, pgAdmin, a simple HTML page—without modifying its code. The application doesn\u0026rsquo;t even need to \u0026ldquo;know\u0026rdquo; there\u0026rsquo;s a proxy in front. From its perspective, HTTP requests arrive as always. The difference is that Traefik has already verified authentication via the ForwardAuth Middleware, and passes the app some additional headers (like X-Auth-Request-User) that the app can use to automatically recognize the logged-in user.\nThis pattern is particularly powerful when combined with Traefik\u0026rsquo;s ability to pass HTTP headers to the verification service and collect response headers. 
In the case of oauth2-proxy, the flow becomes:\nClient requests /dashboard on Grafana Traefik intercepts the request and sends it to oauth2-proxy for verification oauth2-proxy checks whether the client has a valid session cookie If yes, it responds 200 and includes in the response headers the username (e.g., X-Auth-Request-User: roberto.tazzoli@gmail.com) Traefik passes the request to Grafana, adding those headers Grafana reads the header and automatically creates a session for that user Phase 2: Initial Implementation (Confidence in Plans) # With the architectural decisions made, I proceeded with implementation. I decided to structure the project following conventions already present in TazLab:\ninfrastructure/configs/dex/: ExternalSecrets that pull Google secrets from Infisical, and DEX configuration files infrastructure/instances/dex/: Deployment, Service, Ingress, RBAC for DEX infrastructure/auth/: A new layer dedicated to oauth2-proxy, Traefik middleware, and Flux configuration infrastructure/operators/monitoring/: Updates to Grafana ingresses to apply the ForwardAuth middleware I created 19 YAML files in total, approximately 1500 lines of Kubernetes manifests. Each component was declarative, versioned in Git, and synchronizable by Flux. The theory was solid. 
Practice was about to teach me humbling lessons.\nDEX Structure: CRD Storage and Google Connectors # The DEX configuration is a pure YAML file that specifies:\nThe issuer (the URL where DEX is accessible, e.g., https://dex.tazlab.net) The storage backend (in my case, Kubernetes CRD) The \u0026ldquo;connectors\u0026rdquo; (identity providers, in my case Google OAuth) The \u0026ldquo;static clients\u0026rdquo; (applications authorized to request tokens, in my case oauth2-proxy) Here\u0026rsquo;s a simplified snippet of how I structured the DEX ConfigMap:\napiVersion: v1 kind: ConfigMap metadata: name: dex-config namespace: dex data: config.yaml: | issuer: https://dex.tazlab.net storage: type: kubernetes config: inCluster: true web: http: 0.0.0.0:5556 allowedOrigins: - https://dex.tazlab.net connectors: - type: google id: google name: Google config: clientID: $GOOGLE_CLIENT_ID clientSecret: $GOOGLE_CLIENT_SECRET redirectURI: https://dex.tazlab.net/callback staticClients: - id: oauth2-proxy secret: $OAUTH2_PROXY_CLIENT_SECRET redirectURIs: - https://auth.tazlab.net/oauth2/callback name: oauth2-proxy Why External Secrets Operator and Not Direct ConfigMap? # Google secrets (clientID, clientSecret) cannot reside in the ConfigMap in plaintext—it would be a basic violation of security principles. I decided to use External Secrets Operator (ESO) to synchronize secrets from Infisical (my centralized vault) and make them available as Kubernetes Secrets. 
This pattern is now well-established in TazLab, so the choice was natural.\nI created an ExternalSecret that pulled DEX_GOOGLE_CLIENT_ID and DEX_GOOGLE_CLIENT_SECRET from Infisical:\napiVersion: external-secrets.io/v1beta1 kind: ExternalSecret metadata: name: dex-google-secrets namespace: dex spec: refreshInterval: 1h secretStoreRef: kind: ClusterSecretStore name: tazlab-secrets target: name: dex-google-secrets creationPolicy: Owner data: - secretKey: DEX_GOOGLE_CLIENT_ID remoteRef: key: DEX_GOOGLE_CLIENT_ID - secretKey: DEX_GOOGLE_CLIENT_SECRET remoteRef: key: DEX_GOOGLE_CLIENT_SECRET - secretKey: OAUTH2_PROXY_CLIENT_SECRET remoteRef: key: OAUTH2_PROXY_CLIENT_SECRET The DEX Deployment mounted the Secret and injected it as environment variables:\napiVersion: apps/v1 kind: Deployment metadata: name: dex namespace: dex spec: replicas: 1 template: spec: containers: - name: dex image: ghcr.io/dexidp/dex:v2.41.1 args: - dex - serve - /etc/dex/cfg/config.yaml env: - name: GOOGLE_CLIENT_ID valueFrom: secretKeyRef: name: dex-google-secrets key: DEX_GOOGLE_CLIENT_ID - name: GOOGLE_CLIENT_SECRET valueFrom: secretKeyRef: name: dex-google-secrets key: DEX_GOOGLE_CLIENT_SECRET Phase 3: The First Error - The Missing ADMIN_EMAIL # After the first git push, I ran a flux reconcile source git flux-system and waited for Flux to synchronize all the state described in my manifests.\nReconciliation encountered an unexpected error in the ClusterRoleBinding that should have assigned the tazlab-admin role to the user with email ${ADMIN_EMAIL}:\nClusterRoleBinding/tazlab-admin-binding dry-run failed (Invalid): ClusterRoleBinding [...] subjects[0].name: Required value The subjects[0].name field was empty. 
I checked the manifest:\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: tazlab-admin-binding roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: tazlab-admin subjects: - kind: User name: ${ADMIN_EMAIL} The ${ADMIN_EMAIL} variable had not been substituted. I checked the cluster-vars ConfigMap in the flux-system namespace—where Flux stores global variables used by postBuild.substituteFrom:\n$ kubectl get cm cluster-vars -n flux-system -o jsonpath=\u0026#39;{.data}\u0026#39; {\u0026#34;domain\u0026#34;: \u0026#34;tazlab.net\u0026#34;, \u0026#34;cluster_name\u0026#34;: \u0026#34;tazlab-k8s\u0026#34;, \u0026#34;traefik_lb_ip\u0026#34;: \u0026#34;192.168.1.240\u0026#34;} ADMIN_EMAIL was missing. Here emerged a crucial architectural insight: the cluster-vars ConfigMap is not managed by GitOps, but by Terraform. It is created during cluster bootstrap by the k8s-flux module in ephemeral-castle. I couldn\u0026rsquo;t add it directly to a GitOps YAML file, because Flux did not control it. I had to modify Terraform.\nI opened /workspace/ephemeral-castle/clusters/tazlab-k8s/modules/k8s-flux/main.tf and added the admin_email parameter:\nvariable \u0026#34;admin_email\u0026#34; { type = string description = \u0026#34;Email of TazLab admin — used by Flux for RBAC and oauth2-proxy allowlist\u0026#34; } # In the block that creates the ConfigMap: data = { domain = var.base_domain cluster_name = var.cluster_name traefik_lb_ip = var.traefik_lb_ip ADMIN_EMAIL = var.admin_email } Then I updated clusters/tazlab-k8s/live/gitops/terragrunt.hcl to read the email from Infisical and pass it to Terraform:\ninputs = { admin_email = data.infisical_secrets.github.secrets[\u0026#34;ADMIN_EMAIL\u0026#34;].value # ... 
other parameters } I pushed these Terraform changes, and then ran a kubectl patch configmap cluster-vars -n flux-system --type merge -p '{\u0026quot;data\u0026quot;: {\u0026quot;ADMIN_EMAIL\u0026quot;: \u0026quot;roberto.tazzoli@gmail.com\u0026quot;}}' as an emergency patch to accelerate testing.\nLesson learned: When designing infrastructure with Terraform and GitOps, you must be aware of which layer \u0026ldquo;owns\u0026rdquo; which data. Terraform creates the initial blank slate of the cluster; GitOps maintains declarative state from manifests. If a configuration is generated once during bootstrap and won\u0026rsquo;t change often, it belongs to Terraform. If it changes frequently and has a versioning history, it belongs to GitOps. Mixing the two levels is the fastest way to create operational confusion.\nPhase 4: The DEX Problem - The Variable That Doesn\u0026rsquo;t Expand # After resolving the ADMIN_EMAIL, everything else started reconciling correctly. The DEX and oauth2-proxy pods started. I tested the login flow by navigating to https://grafana.tazlab.net—Traefik redirected me to DEX, which showed me the \u0026ldquo;Log in with Google\u0026rdquo; button. I clicked, Google asked me to authenticate\u0026hellip;\nAnd then I received an error from the Google server:\nError 400: invalid_request flowName=GeneralOAuthFlow - Missing required parameter: client_id Google was not receiving the client_id. I checked the DEX logs to understand what was happening:\n[2026/02/28 08:14:23] [connector.go:123] provider.go: authenticating, error: invalid_request: Missing required parameter: client_id The problem was silent in DEX\u0026rsquo;s log. I decided to conduct a deeper investigation. 
I examined the config file that DEX was reading inside the pod:\n$ kubectl exec -it deployment/dex -n dex -- cat /etc/dex/cfg/config.yaml | grep -A 5 \u0026#34;connectors:\u0026#34; connectors: - type: google id: google name: Google config: clientID: \u0026#34;$GOOGLE_CLIENT_ID\u0026#34; Aha! The $GOOGLE_CLIENT_ID variable was literal in the YAML file. DEX was not expanding environment variables inside its configuration file. I tried reading the DEX documentation to see if it supported variable substitution\u0026hellip; and discovered that DEX does not perform any variable expansion in the configuration file. DEX is a Go application that reads the YAML file once at startup, unmarshals it into a Go data structure, and uses it as-is. There is no post-processing.\nThis was a serious architectural problem. I couldn\u0026rsquo;t put secrets directly in the ConfigMap in plaintext. But I also couldn\u0026rsquo;t use environment variables as placeholders in YAML files and expect DEX to expand them.\nI considered several solutions:\nSed wrapper: An entrypoint that uses sed to substitute variables in the YAML file before launching DEX The secretEnv field in DEX: Static clients have a dedicated field that reads the client secret from an environment variable ESO template engine: Use the External Secrets Operator template engine (v2) to render the complete configuration file with real values I initially attempted solution #1 (sed wrapper). I created a shell entrypoint:\n#!/bin/sh sed -e \u0026#34;s|\\$GOOGLE_CLIENT_ID|${GOOGLE_CLIENT_ID}|g\u0026#34; \\ -e \u0026#34;s|\\$GOOGLE_CLIENT_SECRET|${GOOGLE_CLIENT_SECRET}|g\u0026#34; \\ /etc/dex/cfg/config.yaml.template \u0026gt; /tmp/config.yaml exec dex serve /tmp/config.yaml This didn\u0026rsquo;t work. When sed produced the file with empty values (if the environment variables were not defined at execution time), DEX would silently crash with a YAML parsing error.\nI then tried solution #2: using the secretEnv field in DEX for the oauth2-proxy client secret. 
In the configuration file, I can tell DEX: \u0026ldquo;For this client, the secret is not in the YAML file, but in an environment variable\u0026rdquo;. But this only worked for the secret of the static client, not for the clientSecret of the Google connector.\nI decided to implement solution #3: ESO template engine v2. This is a feature of External Secrets Operator that transforms the generated Secret using a Go template engine. I create an ExternalSecret that tells ESO:\n\u0026ldquo;Go to Infisical, fetch DEX_GOOGLE_CLIENT_ID and DEX_GOOGLE_CLIENT_SECRET, then render DEX\u0026rsquo;s complete configuration file using these values inside the templates {{ .DEX_GOOGLE_CLIENT_ID }}\u0026rdquo;\napiVersion: external-secrets.io/v1beta1 kind: ExternalSecret metadata: name: dex-config-rendered namespace: dex spec: refreshInterval: 1h secretStoreRef: kind: ClusterSecretStore name: tazlab-secrets target: name: dex-rendered-config creationPolicy: Owner template: engineVersion: v2 data: config.yaml: | issuer: https://dex.tazlab.net storage: type: kubernetes config: inCluster: true connectors: - type: google id: google name: Google config: clientID: \u0026#34;{{ .DEX_GOOGLE_CLIENT_ID }}\u0026#34; clientSecret: \u0026#34;{{ .DEX_GOOGLE_CLIENT_SECRET }}\u0026#34; redirectURI: https://dex.tazlab.net/callback staticClients: - id: oauth2-proxy secretEnv: OAUTH2_PROXY_CLIENT_SECRET redirectURIs: - https://auth.tazlab.net/oauth2/callback name: oauth2-proxy data: - secretKey: DEX_GOOGLE_CLIENT_ID remoteRef: key: DEX_GOOGLE_CLIENT_ID - secretKey: DEX_GOOGLE_CLIENT_SECRET remoteRef: key: DEX_GOOGLE_CLIENT_SECRET When ESO recreates this ExternalSecret, it passes the secrets from the data block to the template engine, which substitutes {{ .DEX_GOOGLE_CLIENT_ID }} with the real value, and generates a Secret with the completely rendered DEX configuration file, with real values already inside.\nI updated the DEX Deployment to mount the dex-rendered-config Secret instead of the 
ConfigMap:\nspec: volumes: - name: config secret: secretName: dex-rendered-config items: - key: config.yaml path: config.yaml After deploy, I verified that the Secret contained the real values:\n$ kubectl get secret dex-rendered-config -n dex -o jsonpath=\u0026#39;{.data.config\\.yaml}\u0026#39; | base64 -d | grep clientID clientID: \u0026#34;502646366772-9165kme6a67a10m1s8imiv540ltoisp7.apps.googleusercontent.com\u0026#34; Perfect. DEX was now reading the configuration file with real values.\nPhase 5: The Redirect That Didn\u0026rsquo;t Work # After DEX started working correctly with Google, the authentication flow continued. The user (myself) was redirected to Google, authenticated, and then\u0026hellip;\nEnded up on https://auth.tazlab.net/authenticated with a simple message: \u0026ldquo;Authenticated\u0026rdquo;. It didn\u0026rsquo;t redirect back to Grafana. I had to manually re-enter https://grafana.tazlab.net in the address bar.\nThe problem was in oauth2-proxy. When it received the callback from Google, it knew the user was authenticated, but it didn\u0026rsquo;t know which URL to return to. oauth2-proxy is a complex tool with many configurations, and the bug resided in how it handles tracking the original URL after redirect.\nWhen Traefik calls oauth2-proxy as ForwardAuth middleware, it might not pass the original URL to the authentication service. So oauth2-proxy doesn\u0026rsquo;t know where the client came from. 
I added the --reverse-proxy=true parameter:\nargs: - --provider=oidc - --oidc-issuer-url=https://dex.tazlab.net - --client-id=oauth2-proxy - --client-secret=$(OAUTH2_PROXY_CLIENT_SECRET) - --cookie-secret=$(OAUTH2_PROXY_COOKIE_SECRET) - --cookie-secure=true - --cookie-domain=.tazlab.net - --redirect-url=https://auth.tazlab.net/oauth2/callback - --upstream=static://200 - --http-address=:4180 - --reverse-proxy=true # \u0026lt;-- New - --set-xauthrequest=true - --authenticated-emails-file=/etc/oauth2-proxy/allowed-emails.txt Deep-Dive Conceptual: The --reverse-proxy Flag in oauth2-proxy\nWhen oauth2-proxy is exposed directly to the client (as in a traditional reverse proxy configuration), it receives standard HTTP headers: Host, User-Agent, etc. But when behind a reverse proxy like Traefik, the intermediate proxy adds \u0026ldquo;forwarded\u0026rdquo; headers: X-Forwarded-Proto, X-Forwarded-Host, X-Forwarded-Uri. These headers tell the downstream proxy what the original request was. The --reverse-proxy=true flag tells oauth2-proxy: \u0026ldquo;Read these headers to reconstruct the client\u0026rsquo;s original URL\u0026rdquo;. That way, after Google\u0026rsquo;s callback, oauth2-proxy knows to return not to itself (auth.tazlab.net), but to the original URL (grafana.tazlab.net).\nUnfortunately, this didn\u0026rsquo;t completely solve the problem. I realized there was further complexity: the integration between DEX, oauth2-proxy, and Grafana itself.\nPhase 6: Configure Grafana to Recognize the Authenticated User # Even after oauth2-proxy correctly redirected the client back to Grafana, Grafana still asked for credentials. 
The reason is that Grafana was not reading the X-Auth-Request-User header that oauth2-proxy was passing via Traefik Middleware.\nGrafana has a dedicated configuration section for \u0026ldquo;proxy auth\u0026rdquo;: when enabled, Grafana trusts an HTTP header (by default X-WEBAUTH-USER) and assumes that the user provided in the header is already authenticated. This is a common security feature in enterprise environments where there\u0026rsquo;s centralized SSO.\nIn my case, I had to tell Grafana to enable this module and read from X-Auth-Request-User (the header that oauth2-proxy generates). I modified the HelmRelease of kube-prometheus-stack:\ngrafana: enabled: true grafana.ini: auth.proxy: enabled: true header_name: X-Auth-Request-User header_property: username auto_sign_up: true sync_ttl: 60 With this configuration:\nenabled: true: Activates the module header_name: X-Auth-Request-User: Read from this header header_property: username: The value in the header is the username field (email, in this case) auto_sign_up: true: If the user doesn\u0026rsquo;t exist in Grafana, create them automatically on first login sync_ttl: 60: Cache the user information derived from the header for 60 minutes before Grafana refreshes it (the value is expressed in minutes, not seconds) After this change, Grafana automatically recognized the user roberto.tazzoli@gmail.com and logged them in without asking for a password.\nPhase 7: The oauth2-proxy Crash - The Silent Error # Just when I thought everything was stable, I added two parameters to oauth2-proxy that had the potential to improve behavior:\nargs: # ... previous parameters ... - --url=https://auth.tazlab.net - --auth-logging=true After the push, the oauth2-proxy pods entered CrashLoopBackOff. The container logs showed:\nunknown flag: --url I had used a flag that didn\u0026rsquo;t exist in the v7.8.1 version of oauth2-proxy I was using. I checked the documentation and the list of supported flags\u0026hellip; and the flag wasn\u0026rsquo;t there. 
It was possible it had been added in a newer version, but my image was older.\nWhat followed was a cascade of problems: Kubernetes kept trying to start the pod with the old cached configuration. Flux remained stuck in a \u0026ldquo;Reconciliation in progress\u0026rdquo; state for five minutes (the health check timeout). The CrashLoopBackOff pods restarted every 10 seconds, creating noise in the logs.\nI reverted the commits that had added those flags and manually patched the deployment in the cluster to remove the problematic parameters:\nkubectl patch deployment oauth2-proxy -n auth --type json -p \u0026#39;[ { \u0026#34;op\u0026#34;: \u0026#34;replace\u0026#34;, \u0026#34;path\u0026#34;: \u0026#34;/spec/template/spec/containers/0/args\u0026#34;, \u0026#34;value\u0026#34;: [ \u0026#34;--provider=oidc\u0026#34;, \u0026#34;--oidc-issuer-url=https://dex.tazlab.net\u0026#34;, \u0026#34;--client-id=oauth2-proxy\u0026#34;, \u0026#34;--client-secret=$(OAUTH2_PROXY_CLIENT_SECRET)\u0026#34;, \u0026#34;--cookie-secret=$(OAUTH2_PROXY_COOKIE_SECRET)\u0026#34;, \u0026#34;--cookie-secure=true\u0026#34;, \u0026#34;--cookie-domain=.tazlab.net\u0026#34;, \u0026#34;--whitelist-domain=.tazlab.net\u0026#34;, \u0026#34;--redirect-url=https://auth.tazlab.net/oauth2/callback\u0026#34;, \u0026#34;--upstream=static://200\u0026#34;, \u0026#34;--http-address=:4180\u0026#34;, \u0026#34;--skip-provider-button=true\u0026#34;, \u0026#34;--set-xauthrequest=true\u0026#34;, \u0026#34;--reverse-proxy=true\u0026#34;, \u0026#34;--authenticated-emails-file=/etc/oauth2-proxy/allowed-emails.txt\u0026#34;, \u0026#34;--silence-ping-logging=true\u0026#34; ] } ]\u0026#39; After a few minutes, a new pod started with the correct configuration and the system stabilized.\nCritical lesson: When writing configuration parameters for applications obtained from public images, always verify the documentation of the specific version you\u0026rsquo;re using. 
A flag might not exist in the version you pulled, in which case the container exits at startup on the unknown flag and the pod enters CrashLoopBackOff. The solution is to use strict version pinning and document which version supports which features.\nPhase 8: Flux Gets Stuck - The Health Check Timeout # When the oauth2-proxy pod continuously crashed, Flux became stuck in a pathological state. The infrastructure-auth kustomization couldn\u0026rsquo;t complete reconciliation because the health check was waiting for pods to become ready. But the pods never became ready due to the crash.\nFlux has a health check timeout of 5 minutes. After 5 minutes, it marks reconciliation as failed, but remains in a \u0026ldquo;Reconciliation in progress\u0026rdquo; state waiting for the next automatic attempt (which is scheduled an hour later, unless I force it manually).\nI had to break the deadlock manually:\nI reverted the commit that contained the problematic flags I forced Flux to recognize the new commit: flux reconcile source git flux-system I forcefully deleted all old pods: kubectl delete pods -n auth --all --grace-period=0 --force I manually patched the deployment to start the pod with the correct configuration I waited for the pod to stabilize Flux finally recognized that everything was in order and completed reconciliation Final Reflections: What We Built # After this \u0026ldquo;stage of the journey\u0026rdquo;, TazLab now has an enterprise-ready authentication system that combines:\nDEX as a Kubernetes-native OIDC provider, with CRD storage and Google OAuth integration oauth2-proxy as a Traefik middleware, with ForwardAuth pattern for transparent interception External Secrets Operator with template engine to render DEX configuration with real secrets from Infisical Kubernetes RBAC with ClusterRole and ClusterRoleBinding that reads the admin email from Flux Grafana configured for auth.proxy, automatically recognizing users via X-Auth-Request-User header The complete flow works like this:\nUser navigates to https://grafana.tazlab.net Traefik 
ForwardAuth calls oauth2-proxy oauth2-proxy sees there\u0026rsquo;s no valid session cookie oauth2-proxy redirects the client to https://dex.tazlab.net/auth DEX shows the \u0026ldquo;Log in with Google\u0026rdquo; button User authenticates with Google Google redirects back to DEX at https://dex.tazlab.net/callback (the redirectURI of the Google connector) DEX redirects the client to https://auth.tazlab.net/oauth2/callback oauth2-proxy processes the callback, generates a session cookie oauth2-proxy redirects the client to https://grafana.tazlab.net (the original URL reconstructed from X-Forwarded-* headers) Traefik ForwardAuth calls oauth2-proxy again, which responds with 200 and header X-Auth-Request-User: roberto.tazzoli@gmail.com Traefik passes the request to Grafana, adding the header Grafana reads the header, automatically creates a session for that user Grafana responds with the dashboard The entire system is declarative, versioned in Git, recoverable from etcd backups, and integrated with Flux for disaster recovery. Apart from the secrets held in Infisical, there is no \u0026ldquo;external state\u0026rdquo; living outside Kubernetes. It is the concrete realization of the Zero Trust principle that guides Ephemeral Castle.\nThe problems encountered (the unexpanded variable, the nonexistent flag, the Flux timeout) were all resolved through a systematic debugging approach: identify the symptom, construct hypotheses, test, iterate. And most importantly, document the process so that anyone reading this chronicle can learn from my experiences without repeating the same mistakes.\nThis laboratory is now ready for the next chapter of its evolution: integration of new identity providers, implementation of granular RBAC, synchronization of user attributes from enterprise directories. 
But for now, the authentication system is stable, secure, and production-ready.\n","date":"28 February 2026","externalUrl":null,"permalink":"/posts/dex-oauth2-kubernetes-oidc-journey/","section":"Posts","summary":"","title":"From Zero to OIDC: A Journey Through Zero Trust Authentication in Our Kubernetes Cluster","type":"posts"},{"content":"","date":"28 February 2026","externalUrl":null,"permalink":"/tags/oauth2/","section":"Tags","summary":"","title":"Oauth2","type":"tags"},{"content":"","date":"28 February 2026","externalUrl":null,"permalink":"/tags/oidc/","section":"Tags","summary":"","title":"Oidc","type":"tags"},{"content":"","date":"28 February 2026","externalUrl":null,"permalink":"/tags/traefik/","section":"Tags","summary":"","title":"Traefik","type":"tags"},{"content":"","date":"25 February 2026","externalUrl":null,"permalink":"/categories/devsecops/","section":"Categories","summary":"","title":"DevSecOps","type":"categories"},{"content":" Phoenix Protocol V2: Enterprise Security, Parallelism, and the 8-Minute Milestone # While the first chapter of the Phoenix Protocol focused on data validation and its immortality through S3 restoration, this second stage of the journey into the Ephemeral Castle tackles an even more ambitious challenge: process perfection. It is not enough for the cluster to be reborn; it must do so deterministically, without human hesitation, and with a security profile that admits no compromises—even during the few minutes when the infrastructure is \u0026ldquo;naked\u0026rdquo; under the fire of the bootstrap.\nToday I decided to push the limit beyond the psychological threshold of ten minutes. To achieve this, I had to radically rethink how the cluster \u0026ldquo;claims\u0026rdquo; its own identity and how the different layers fit together. 
This is not just a speed exercise, but a pursuit of engineering efficiency where every second saved is an uncertainty removed.\nThe Mindset: Security as Cement, Not Paint # Often, in HomeLab projects or developing infrastructures, there is a tendency to \u0026ldquo;make things work\u0026rdquo; first and then, only later, to harden them. I have decided that this approach is inherently flawed. In a Zero-Knowledge architecture, security must be the cement of the foundations. If a secret touches the disk during bootstrap, that disk is compromised forever in my vision.\nThe goal of the session was twofold: eliminate unstable external dependencies and ensure that no secret \u0026ldquo;travels\u0026rdquo; in the clear or resides persistently on the host orchestrating the rebirth.\nPhase 1: Shifting the Root of Trust (Goodbye GITHUB_TOKEN) # One of the latent risks in previous versions was the presence of the GITHUB_TOKEN in the host\u0026rsquo;s environment variables during the execution of Terragrunt. Although the token was injected into RAM, its existence in the bash shell represented an attack vector.\nThe Reasoning: Why Internalize Secrets? # I decided to shift the responsibility for identity retrieval inside the cluster itself. Instead of \u0026ldquo;handing over\u0026rdquo; the token to Flux CD during installation, I configured the system so that the cluster, as soon as it is born, \u0026ldquo;claims\u0026rdquo; its own access to the code.\nThe alternative would have been to continue passing the token via environment variables, but this would have kept the secret exposed to host system logs and potential memory dumps of child processes. By using the External Secrets Operator (ESO) and an Infisical Machine Identity, the cluster becomes autonomous.\nDeep-Dive: Machine Identity # A Machine Identity is a security entity designed for automated systems. 
Unlike a token generated by a human user, it is linked to a specific role with granular permissions (Least Privilege) and can be revoked or rotated without impacting real users. It is the heart of the \u0026ldquo;Trust no one, verify internal identity\u0026rdquo; model.\nTechnical Implementation # I modified the engine layer to prepare the ground for Flux even before Flux is installed. The trick lies in an intelligent wait loop:\n# modules/k8s-engine/main.tf # 1. Early creation of the flux-system namespace resource \u0026#34;kubernetes_namespace_v1\u0026#34; \u0026#34;flux_system\u0026#34; { metadata { name = \u0026#34;flux-system\u0026#34; } } # 2. Injection of the Infisical Machine Identity resource \u0026#34;kubernetes_secret_v1\u0026#34; \u0026#34;infisical_machine_identity\u0026#34; { metadata { name = \u0026#34;infisical-machine-identity\u0026#34; namespace = kubernetes_namespace_v1.external_secrets.metadata[0].name } data = { clientId = var.infisical_client_id clientSecret = var.infisical_client_secret } } # 3. ExternalSecret fetching the GitHub token resource \u0026#34;kubectl_manifest\u0026#34; \u0026#34;github_token_external_secret\u0026#34; { yaml_body = \u0026lt;\u0026lt;YAML apiVersion: external-secrets.io/v1beta1 kind: ExternalSecret metadata: name: github-api-token namespace: flux-system spec: refreshInterval: 1h secretStoreRef: kind: ClusterSecretStore name: tazlab-secrets target: name: flux-system # The name Flux expects for its boot secret data: - secretKey: password remoteRef: key: GITHUB_TOKEN YAML depends_on = [helm_release.external_secrets] } # 4. 
The synchronization \u0026#34;Hook\u0026#34; resource \u0026#34;null_resource\u0026#34; \u0026#34;wait_for_github_token\u0026#34; { provisioner \u0026#34;local-exec\u0026#34; { command = \u0026#34;kubectl wait --for=condition=Ready externalsecret/github-api-token -n flux-system --timeout=60s\u0026#34; } depends_on = [kubectl_manifest.github_token_external_secret] } Phase 2: Ephemeral Secrets and the War on Zombie Processes # A recurring technical problem during testing was the freezing of the create.sh script. By invoking every command through infisical run, Terragrunt processes frequently became \u0026lt;defunct\u0026gt; (zombies).\nThe Investigation: The Illusion of External Automation # I observed that in non-interactive sessions, the Infisical CLI wrapper struggled to correctly handle exit signals from child processes. The result was a bootstrap that \u0026ldquo;froze\u0026rdquo; without producing logs, forcing me to intervene manually.\nI decided to eliminate the wrapper. The new strategy, named Vault-Native, involves extracting secrets from the TazPod RAM vault (/home/tazpod/secrets) once at the beginning of the script.\nThe Reasoning: Why Files in RAM? # Files in a directory mounted as tmpfs (RAM) never touch the disk platters. They are protected by the TazPod\u0026rsquo;s encryption and disappear instantly upon shutdown or unmounting of the vault. 
This allows me to have the speed of a local file with the security of a cloud secret.\n# create.sh - New resolution logic resolve() { local var_name=$1 local vault_file=\u0026#34;/home/tazpod/secrets/${2:-$1}\u0026#34; if [[ -f \u0026#34;$vault_file\u0026#34; ]]; then export \u0026#34;$var_name\u0026#34;=$(cat \u0026#34;$vault_file\u0026#34; | tr -d \u0026#34;\u0026#39;\u0026#34; \u0026#34;) else # Fallback if the secret is already in env but points to a file local val=\u0026#34;${!var_name}\u0026#34; [[ -f \u0026#34;$val\u0026#34; ]] \u0026amp;\u0026amp; export \u0026#34;$var_name\u0026#34;=$(cat \u0026#34;$val\u0026#34; | tr -d \u0026#34;\u0026#39;\u0026#34; \u0026#34;) fi } resolve \u0026#34;PROXMOX_TOKEN_ID\u0026#34; \u0026#34;proxmox-token-id\u0026#34; resolve \u0026#34;GITHUB_TOKEN\u0026#34; \u0026#34;github-token\u0026#34; Phase 3: Parallelism Engineering (The \u0026ldquo;Turbo Flow\u0026rdquo;) # Sequential bootstrap is the enemy of speed. In version V1, layers were born one after another: secrets -\u0026gt; platform -\u0026gt; engine -\u0026gt; networking -\u0026gt; storage -\u0026gt; gitops.\nThe Bottleneck Analysis # I noticed that while MetalLB (Networking) was negotiating IPs, Flux (GitOps) and Longhorn (Storage) were simply \u0026ldquo;watching.\u0026rdquo; There is no technical reason why storage must wait for the LoadBalancer to be ready; both only need the cluster\u0026rsquo;s API Server to be alive.\nThe Solution: Aggressive Parallelism # I decoupled the dependencies in Terragrunt and modified the orchestrator to launch the three heavy layers simultaneously.\n# create.sh - Turbo Acceleration echo \u0026#34;🚀 [TURBO] Launching Networking, GitOps, and Storage in PARALLEL...\u0026#34; ( cd \u0026#34;$LIVE_DIR/networking\u0026#34; \u0026amp;\u0026amp; $TG apply --auto-approve ) \u0026amp; PID_NET=$! ( cd \u0026#34;$LIVE_DIR/gitops\u0026#34; \u0026amp;\u0026amp; $TG apply --auto-approve ) \u0026amp; PID_GITOPS=$! 
( cd \u0026#34;$LIVE_DIR/storage\u0026#34; \u0026amp;\u0026amp; $TG apply --auto-approve ) \u0026amp; PID_STORAGE=$! wait $PID_NET $PID_GITOPS $PID_STORAGE This change reduced the \u0026ldquo;iron\u0026rdquo; time by over 30%. But the real challenge was managing the chaos this parallelism introduced into Kubernetes.\nPhase 4: The Flux Path Trap and Granular Decomposition # In an attempt to make everything faster, I decided to break the Flux operator monolith. Instead of a single infrastructure-operators block, I created three units: core (Traefik/Cert-Manager), data (Postgres), and namespaces.\nThe Struggle: Not a Directory # After the push, Flux went into error: kustomization.yaml: not a directory. The failure analysis was immediate: Kustomize requires each resource to be a directory containing an index. By moving the files, I had broken the relative references. I had to rebuild the tree structure:\ninfrastructure/operators/ ├── core/ │ └── kustomization.yaml (with ../cert-manager) ├── data/ │ └── kustomization.yaml (with ../postgres-operator) └── namespaces/ └── kustomization.yaml This taught me that speed requires order. Granularity must never sacrifice the logical structure of the repository.\nPhase 5: Asynchronous Resilience and the Blog \u0026ldquo;Fast-Track\u0026rdquo; # The last obstacle was application wait time. Why should the Hugo Blog, a simple Nginx image with static files, wait for a 10GB database restoration?\nThe Solution: InitContainers and RBAC # I implemented a \u0026ldquo;Fast-Track.\u0026rdquo; I decoupled the Blog (apps-static) from any heavy dependency. For apps that do need the database (Mnemosyne, PGAdmin), I introduced an InitContainer.\nDeep-Dive: InitContainers # An InitContainer is a specialized container that runs before the application containers in a Pod. It must complete successfully before the main container can start. 
It is the perfect tool for managing asynchronous dependencies.\nInstead of crashing the Pod with a CreateContainerConfigError (because the password secret does not exist yet), the InitContainer queries the Kubernetes API:\n# apps/base/mnemosyne-mcp/deployment.yaml initContainers: - name: wait-for-db-secret image: bitnami/kubectl:latest command: - /bin/sh - -c - | until kubectl get secret tazlab-db-pguser-mnemosyne; do echo \u0026#34;waiting for database user secret...\u0026#34; sleep 5 done This requires a ServiceAccount with minimum reading permissions (get, list) on secrets, configured through a dedicated rbac.yaml file. The result is a cluster that \u0026ldquo;converges\u0026rdquo; organically: light parts come up immediately, while heavy parts auto-configure as soon as data is ready.\nFinal Result: 8 Minutes and 43 Seconds # The final validation produced impressive telemetry. We went from 11:38 to 8:43 to have the Blog online and secure.\nLayer Time Status Secrets (RAM) 10s Optimized Platform (Iron) 1m 53s Stable Parallel Layers 1m 56s TURBO GitOps Fast-Track 1m 31s RECORD Total: 8 minutes and 43 seconds.\nAfter another 4 minutes, the database and MCP server were also ready, completing the entire stack in less than 13 minutes total, including data restoration from S3.\nPost-Lab Reflections: The Beauty of Determinism # This setup is not just \u0026ldquo;fast.\u0026rdquo; It is deterministic. The removal of unstable wrappers, intelligent wait management, and component decomposition have transformed the bootstrap from a sequence of hopes into an engineering protocol.\nWhat I learned today: # Less is More: Removing intermediate tools (like the constantly running Infisical CLI) reduces the attack surface and points of failure. Asynchrony is Strength: Do not force the cluster to be a monolith. Let each component manage its own patience. 
Security Accelerates: Implementing enterprise practices (Machine Identity, RBAC, RAM Vault) made the script cleaner and, consequently, faster to execute and easier to debug. TazLab\u0026rsquo;s infrastructure has reached a new threshold of technical maturity. The rebirth protocol is no longer just a recovery mechanism, but an engineering system optimized to guarantee resilience, security, and absolute precision at every stage of the cluster\u0026rsquo;s lifecycle.\nTechnical Chronicle by Taz - HomeLab DevOps \u0026amp; Architect\n","date":"25 February 2026","externalUrl":null,"permalink":"/posts/phoenix-protocol-v2-turbo-rebirth/","section":"Posts","summary":"","title":"Phoenix Protocol V2: Enterprise Security, Parallelism, and the 8-Minute Milestone","type":"posts"},{"content":"","date":"25 February 2026","externalUrl":null,"permalink":"/tags/reliability/","section":"Tags","summary":"","title":"Reliability","type":"tags"},{"content":"","date":"22 February 2026","externalUrl":null,"permalink":"/tags/go/","section":"Tags","summary":"","title":"Go","type":"tags"},{"content":" Introduction: The Paradox of the Ephemeral # In a nomadic and \u0026ldquo;Zero Trust\u0026rdquo; ecosystem like TazLab, the development environment (TazPod) is ephemeral by nature. Upon closing the container, every trace of activity vanishes, except for data saved in the encrypted vault. This volatility, while excellent for security and system cleanliness, introduces a fundamental problem: AI agent amnesia. Every new session is a blank slate, a tabula rasa where the artificial intelligence has no memory of the architectural decisions made yesterday, the bugs resolved with effort, or the project\u0026rsquo;s strategic directions.\nI decided that TazLab needed a long-term semantic memory, a \u0026ldquo;technical conscience\u0026rdquo; residing within the infrastructure itself. This project was named Mnemosyne. 
The goal of the day was ambitious: abandon unstable Python bridges and implement a native server based on the Model Context Protocol (MCP), integrated directly into the Gemini CLI, to allow the AI to consult its own technical past in a fluid and sovereign manner.\nPhase 1: The Cloud Mirage and the Return to Sovereignty # Initially, my strategy for Mnemosyne relied on Google Cloud AlloyDB. The idea of delegating vector persistence to an \u0026ldquo;Enterprise\u0026rdquo; managed service seemed like the safest and highest-performing move. AlloyDB, with its pgvector extension, offered enormous computing power for semantic searches.\nConceptual Deep-Dive: AlloyDB and pgvector AlloyDB is a Google Cloud PostgreSQL-compatible database optimized for intensive workloads. It is a VPC-native service, meaning that for security reasons, it does not normally expose a public IP but requires a private connection within the Google cloud. pgvector is the extension that allows storing \u0026ldquo;embeddings\u0026rdquo; (numerical vectors representing text meaning) and performing similarity searches using the cosine distance operator (\u0026lt;=\u0026gt;).\nHowever, I quickly collided with operational reality. To access AlloyDB from the TazPod on the move, I had to configure the AlloyDB Auth Proxy, a binary that creates a secure tunnel to GCP. Within a Docker container, this proxy created zombie processes and suffered from unpredictable latencies. Furthermore, the GCP firewall required dynamic IP unlocking via scripts (memory-gate), creating constant friction that betrayed the agile nature of the lab. Every time I changed connections (moving from home Wi-Fi to a mobile network), my semantic memory became unreachable until I manually updated the network rules.\nI therefore decided to change course: true digital sovereignty requires that data resides on my own hardware. 
I migrated Mnemosyne to a local PostgreSQL instance hosted in my Kubernetes cluster (Proxmox/Talos), using the Postgres Operator for lifecycle management. This choice not only zeroed out cloud costs but made the memory an integral part of TazLab\u0026rsquo;s \u0026ldquo;iron,\u0026rdquo; making it transparently accessible via the Wireguard VPN integrated into the TazPod.\nPhase 2: Genesis of a Native Go Server # To connect the Gemini CLI to the Postgres database, I needed a bridge that spoke the MCP language. Initially, I used a Python script acting as a bridge, but the interpreter\u0026rsquo;s startup latency and dependency fragility pushed me toward a more professional solution: a server written in Go.\nI chose Go for its ability to generate tiny static binaries, perfect for Google\u0026rsquo;s Distroless images. A Distroless image contains no shell or package manager, drastically reducing the pod\u0026rsquo;s attack surface in Kubernetes. The server had to be hybrid to support two use cases:\nStdio Transport: For rapid local development, where the CLI launches the binary and communicates via standard input/output. SSE Transport (Server-Sent Events): For production, where the server exposes an HTTP endpoint in the cluster and the CLI connects as a remote client through a MetalLB LoadBalancer. Conceptual Deep-Dive: Stdio vs SSE Stdio transport is the simplest way to let two processes communicate on the same host: JSON-RPC messages pass through system file descriptors. It is extremely fast but limited to the local machine. SSE transport, on the other hand, is a unidirectional protocol over HTTP that allows the server to send \u0026ldquo;events\u0026rdquo; to the client. In the MCP protocol, SSE is used to keep an asynchronous response channel open from the server to the AI, allowing for multi-user and distributed integrations.\nPhase 3: The Trail of Failures # The transition to a native server was not without obstacles. 
In fact, I encountered a series of bugs that required almost forensic investigation.\nThe Deadly Quote Bug (Error 400) # After the first deployment, every semantic search failed with a laconic embedding API returned status 400. I checked the server logs, but the Google error body was not displayed. I suspected everything: from the embedding model (gemini-embedding-001) to the JSON format.\nAfter implementing more aggressive logging that captured the HTTP response body, I discovered the absurd truth: the secrets file in the TazPod (/home/tazpod/secrets/gemini-api-key) contained the key enclosed in single quotes ('AIzaSy...'). These quotes had been included by mistake during a copy-paste operation. Google\u0026rsquo;s APIs received the quotes as part of the key, invalidating it. I resolved this by physically cleaning the file with sed and adding a sanitization function in the Go code to make the server resilient to similar human errors:\n// Aggressive key cleaning (removes quotes and spaces) apiKey = strings.Trim(strings.TrimSpace(apiKey), \u0026#34;\\\u0026#34;\u0026#39;\u0026#34;) Silence is Golden (Stdio Discovery Failure) # Another unexpected behavior occurred at Gemini CLI startup. Although the server was correctly configured in the settings.json file, the CLI reported No tools found on the server.\nInvestigating the debug logs, I realized that the Stdio protocol is extremely fragile: any character printed to stdout that is not part of the JSON-RPC breaks communication. My server was printing welcome logs via fmt.Printf. These logs polluted the stream, causing the Gemini CLI client\u0026rsquo;s JSON parser to fail. 
I had to make the server totally silent in Stdio mode, redirecting every diagnostic log to stderr.\n// Before (WRONG): fmt.Printf(\u0026#34;🚀 Server starting...\u0026#34;) // After (CORRECT): fmt.Fprintf(os.Stderr, \u0026#34;🚀 Server starting...\u0026#34;) Phase 4: Surrendering to Standards (SDK Refactoring) # After hours spent manually writing JSON-RPC message handling and SSE channels, I had to admit an error of pride: reinventing the MCP protocol from scratch was complex and prone to concurrency bugs. For example, my server lost messages if the client opened multiple simultaneous sessions with the same ID.\nI decided to refactor everything using the community SDK: github.com/mark3labs/mcp-go. This meant rewriting the entire tool manager, but it brought immediate benefits in terms of stability. The SDK natively handles SSE data \u0026ldquo;flushing,\u0026rdquo; ensuring that messages do not remain stuck in the server\u0026rsquo;s buffers.\nHowever, the challenge did not end there. During the automatic build on GitHub Actions, the produced image continued to show logs from the old code. After checking every line, I identified a Module Naming problem. The Go module was named tazlab/mnemosyne-mcp-server, but the real repository on GitHub was github.com/tazzo/.... Go, during the cloud build, failing to resolve internal packages as local files, downloaded old versions of the code from remote branches instead of using the ones just committed. I corrected the module structure to align with the real GitHub path, forcing a clean build.\nPhase 5: The GitOps Deadlock (When Flux lies) # The final hurdle was cluster deployment. Despite correct commits and the GHA build passing, the pod continued to run with the old v14 image. 
Flux CD reported Applied revision, but the cluster\u0026rsquo;s live state was frozen.\nConceptual Deep-Dive: GitOps and Flux CD The GitOps philosophy mandates that the Git repository is the sole \u0026ldquo;source of truth.\u0026rdquo; Flux CD monitors Git and applies changes to the cluster. However, if a resource fails Kustomize validation, Flux stalls to avoid corrupting the cluster state.\nI investigated with flux get kustomizations and discovered a Dependency Deadlock. The apps kustomization (which manages Mnemosyne) was blocked because it depended on infrastructure-configs, which in turn was in error due to a malformed YAML in the Mnemosyne manifest. Inadvertently, I had introduced an indentation error in the env block of the Mnemosyne manifest during a hectic Git rebase. This error prevented the Flux controller from generating the new manifests, leaving the old v14 version running.\nI resolved the deadlock by cleanly rewriting the manifest and forcing a cascading reconciliation of the entire chain:\nexport KUBECONFIG=\u0026#34;/path/to/kubeconfig\u0026#34; # Unblocking the dependency chain flux reconcile kustomization flux-system --with-source flux reconcile kustomization apps --with-source Phase 6: Final State: \u0026ldquo;1 MCP Loaded\u0026rdquo; # After resolving the indentation error and forcing Kubernetes to download the fresh image with the imagePullPolicy: Always policy, the moment of truth arrived.\nLaunching the gemini command, the CLI finally displayed the message: \u0026ldquo;1 MCP loaded\u0026rdquo;. Mnemosyne was alive. 
I tested the list_memories tool and saw my technical memories from the last few months appear, retrieved from the local Postgres database via the SSE protocol.\nFinal MCP server snippet (Go SDK):\nfunc (s *Server) registerTools() { // Tool for semantic search retrieve := mcp.NewTool(\u0026#34;retrieve_memories\u0026#34;, mcp.WithDescription(\u0026#34;Search semantic memory\u0026#34;)) retrieve.InputSchema = mcp.ToolInputSchema{ Type: \u0026#34;object\u0026#34;, Properties: map[string]any{\u0026#34;query\u0026#34;: map[string]any{\u0026#34;type\u0026#34;: \u0026#34;string\u0026#34;}}, Required: []string{\u0026#34;query\u0026#34;}, } s.mcp.AddTool(retrieve, s.handleRetrieve) } Post-Lab Reflections: Toward Resilient Knowledge # This work session was a true technical marathon of over 4 hours. I learned that architectural simplicity (returning to local Postgres) almost always wins over the complexity of managed cloud services, especially in a laboratory context. The transition to the standard SDK transformed Mnemosyne from a fragile experiment into a solid infrastructural component.\nWhat does this mean for TazLab? Now my development environment is no longer amnesiac. The AI agent can finally say: \u0026ldquo;I remember how we configured Longhorn three weeks ago\u0026rdquo; or \u0026ldquo;This is why we chose that specific MetalLB policy.\u0026rdquo; Memory is sovereign, resides on my hardware, and speaks a universal protocol.\nWhat I learned in this stage: # The importance of standards: Using an established SDK (like mark3labs\u0026rsquo;) saves hours of debugging on protocol details such as SSE flushing and session ID management. GitOps Vigilance: Never trust a global \u0026ldquo;Reconciliation Succeeded\u0026rdquo; if a downstream component does not respond. A silent YAML error can freeze the entire cluster. Secret Sanitization: A single quote in a text file can be more destructive than a complex logic bug. The Mnemosyne mission continues. 
The next objective will be automated knowledge distillation, ensuring that every session is archived without human intervention, transforming every log line into an atomic fact for the future.\n","date":"22 February 2026","externalUrl":null,"permalink":"/posts/mnemosyne-mcp-integration/","section":"Posts","summary":"","title":"Mnemosyne Rebirth: Chronicle of a Sovereign Memory (and how I collided with the MCP protocol)","type":"posts"},{"content":"","date":"10 February 2026","externalUrl":null,"permalink":"/tags/longhorn/","section":"Tags","summary":"","title":"Longhorn","type":"tags"},{"content":" Phoenix Protocol: Validating Zero-Touch Rebirth and the S3 PITR Hell # In the architecture of the Ephemeral Castle, resilience is not an option, but the very condition of existence. An infrastructure that can be destroyed and recreated in less than twelve minutes is useless if, at the end of the rebirth, its memory has vanished. Over the last 48 hours, I subjected the TazLab cluster to what I dubbed the Phoenix Protocol: an obsessive cycle of nuclear-wipe and create, aimed at validating data immortality through automated restoration (Point-In-Time Recovery) from AWS S3.\nThis is not a story of immediate success, but the honest chronicle of a war of attrition against the CrunchyData PGO v5 operator\u0026rsquo;s automations, the idiosyncrasies of S3 object paths, and the physical latency of distributed storage on limited hardware.\nThe Mindset: Infrastructure is Ash, Data is Diamond # I decided to adopt a radical philosophy: the entire state of the cluster (VMs, OS configurations, local volumes) must be considered sacrificial. The only element that must survive the \u0026ldquo;nuclear fire\u0026rdquo; is the encrypted backup on S3. To test this vision, I had to face three main technical hurdles:\nDeterministic Orchestration: Ensuring that the Terragrunt layers rise in the correct order, managing dependencies between network storage and database instances. 
S3 Credential Injection: Resolving the paradox of an operator that requires access keys to download the restoration manifest that contains instructions on how to use those very keys. Longhorn Latency: Managing volume re-attachment on nodes that, after a total wipe, present state residues that confuse the Kubernetes scheduler. Phase 1: The Storage Struggle and the Longhorn Paradox # The first rebirth attempt clashed with the physical reality of my HomeLab (3 Proxmox nodes with about 32GB of total RAM). Longhorn, the distributed storage engine I chose for its simplicity and native Kubernetes integration, proved to be an unexpected bottleneck during rapid destruction and creation cycles.\nThe Investigation: \u0026ldquo;Volume not ready for workloads\u0026rdquo; # After launching the creation command, I observed the restore Pods remaining stuck in Init:0/1. Analyzing the events with kubectl describe pod, I encountered the error: AttachVolume.Attach failed for volume \u0026quot;pvc-xxx\u0026quot; : rpc error: code = Aborted desc = volume is not ready for workloads\nThe mental process that led me to the solution was this: I initially suspected a Talos OS error in mounting iSCSI targets. However, the Longhorn Manager logs indicated that the volume was \u0026ldquo;stuck\u0026rdquo; in a detachment phase from the previous node, which physically no longer existed due to the wipe.\nThe Reasoning: Why I reduced replicas and forced overprovisioning # To resolve this deadlock, I had to make two crucial decisions:\nReplica Count to 1: In a cluster with only two worker nodes, demanding three replicas for each database volume led to a scheduler deadlock. I decided that storage redundancy would be managed at the application level (via Postgres) and at the backup level (via S3), allowing local volumes to be lean and fast. 200% Overprovisioning: I configured Longhorn to allow the virtual allocation of double the physical space. 
This is necessary because during bootstrap, the system attempts to create new volumes before the old ones have been completely removed from the nodes\u0026rsquo; state database. Phase 2: The S3 Path Hell and the War on Leading Slashes # Once storage was stabilized, I faced the heart of the problem: pgBackRest. The integration between CrunchyData PGO v5 and S3 is extremely powerful, but equally picky.\nAnalysis of Failure: \u0026ldquo;No backup set found\u0026rdquo; # Despite the files being present in the S3 bucket, the restore Job failed systematically with a laconic FileMissingError: unable to open missing file '/pgbackrest/repo1/backup/db/backup.info'.\nDeep-Dive: Object Storage Pathing Unlike a POSIX filesystem, an S3 bucket does not have real folders, but only keys composed of strings (prefixes). When a tool like pgbackrest searches for a file, the presence or absence of a leading slash (/) in the configured prefix can radically change the API request.\nAfter using a temporary Pod with AWS CLI to inspect the bucket, I discovered that the data resided in pgbackrest/repo1/... (without a leading slash). In my cluster.yaml manifest, I had configured repo1-path: /pgbackrest/repo1. The operator was thus looking for a ghost \u0026ldquo;subfolder\u0026rdquo; in the root. I removed the leading slash, aligning the configuration with the reality of S3 objects.\nPhase 3: The Authentication Paradox in Bootstrap # Once the path problem was solved, the most difficult error emerged: ERROR: [037]: restore command requires option: repo1-s3-key.\nThe Reasoning: Why the operator does not \u0026ldquo;inherit\u0026rdquo; keys # I discovered that the CrunchyData v5 operator manages backups and restores asymmetrically. 
Although the S3 credentials were defined in the backups block, the bootstrap Job (the one that brings the cluster to life from nothing) did not automatically inherit them.\nI had to implement a refactoring of the ExternalSecret and the cluster manifest to force the injection. The solution was to create an s3.conf file dynamically injected via a Secret, and explicitly reference it in the dataSource block.\nTechnical Implementation: The \u0026ldquo;Sacred\u0026rdquo; Configuration # Here is the secret that unlocked the situation, mapping the Infisical keys into the format required by the pgBackRest configuration file:\n# infrastructure/configs/tazlab-db/s3-external-secret.yaml apiVersion: external-secrets.io/v1beta1 kind: ExternalSecret metadata: name: s3-backrest-creds namespace: tazlab-db spec: refreshInterval: 1h secretStoreRef: kind: ClusterSecretStore name: infisical-tazlab target: name: s3-backrest-creds template: engineVersion: v2 data: # Configuration file that CrunchyData mounts in the restore pod s3.conf: | [global] repo1-s3-key={{ .AWS_ACCESS_KEY_ID }} repo1-s3-key-secret={{ .AWS_SECRET_ACCESS_KEY }} data: - secretKey: AWS_ACCESS_KEY_ID remoteRef: key: AWS_ACCESS_KEY_ID - secretKey: AWS_SECRET_ACCESS_KEY remoteRef: key: AWS_SECRET_ACCESS_KEY And the cluster manifest that explicitly calls this configuration for the bootstrap:\n# infrastructure/instances/tazlab-db/cluster.yaml spec: dataSource: pgbackrest: stanza: db configuration: - secret: name: s3-backrest-creds # Essential for authentication during restore repo: name: repo1 s3: bucket: \u0026#34;tazlab-longhorn\u0026#34; endpoint: \u0026#34;s3.amazonaws.com\u0026#34; region: \u0026#34;eu-central-1\u0026#34; options: - --delta # Allows restoration onto existing volumes if necessary Phase 4: Validating the Phoenix Protocol (PITR) # For the final test, I wanted to raise the bar. 
It wasn\u0026rsquo;t enough to recover an old backup; I wanted to recover data inserted seconds before the total destruction of the cluster.\nThe Test Protocol: # Insert DATO_A: Recorded in the S3 Full Backup. Manual backup trigger. Insert DATO_B: Recorded only in the transaction logs (WAL). Force pg_switch_wal() to ensure the last segment was pushed to S3. Nuclear Wipe: Physical destruction of all VMs on Proxmox. Deep-Dive: Point-In-Time Recovery (PITR) PITR is the ability of a database to return to any past instant in time by combining a full backup (\u0026ldquo;the base\u0026rdquo;) with transaction logs (WAL - \u0026ldquo;the bricks\u0026rdquo;). If the system can replay the WALs on S3 after a wipe, it means we haven\u0026rsquo;t lost even a single row of data, even if inserted just a moment before the disaster.\nThe Final Obstacle: The --type=immediate flag # Initially, the restoration showed only DATO_A. Analyzing the logs, I realized that the operator used the --type=immediate option by default. This option instructs Postgres to stop as soon as the database reaches a consistent state after the full backup, ignoring all subsequent transaction logs. 
I removed the flag from the manifest, allowing the process to \u0026ldquo;chew\u0026rdquo; through all available WALs until the last transaction received from S3.\nFinal Result: 11 Minutes and 38 Seconds # Using the system clock to measure each phase of the rebirth, here is the final telemetry of the complete bootstrap:\nLayer Secrets: 33s Layer Platform (Proxmox + Talos): 3m 48s Layer Engine \u0026amp; Networking: 2m 51s Layer GitOps \u0026amp; Storage: 2m 25s Database Restore (S3 PITR): ~2m 00s Total: 11 minutes and 38 seconds.\nAt the end of this interval, I queried the memories table:\nid | content | created_at ----+----------------------------------+------------------------------- 2 | DATO_B_VOLATILE_MA_IMMORTALE_WAL | 2026-02-10 14:55:10 1 | DATO_A_NEL_BACKUP_S3 | 2026-02-10 14:54:02 Both pieces of data were there. The Phoenix Protocol succeeded.\nPost-Lab Reflections: The Future is Nomadic # Achieving this milestone radically transforms my approach to the cluster. Knowing that I can destroy everything and have every single database transaction back in less than 12 minutes frees me from the \u0026ldquo;fear of the hardware.\u0026rdquo;\nWhat we learned: # Automation is not magic: It is a sequence of rigorous validations. Every slash, every username (which must respect the RFC 1123 standard, otherwise reconciliation fails), every restore flag counts. Data is the only anchor: Infrastructure must be considered ephemeral by definition. Investing time in making the data \u0026ldquo;immortal\u0026rdquo; via S3 is worth a thousand times the time spent trying to make a VM \u0026ldquo;stable.\u0026rdquo; The cloud is close: This 3-node setup (1 CP + 2 Workers) with 24GB of total RAM is already ready to be moved to AWS EC2 or Google Cloud. The configuration is agnostic; only the VM provisioning layer will change, but the heart of the rebirth will remain the same. The TazLab Castle is now officially indestructible. 
Its strength lies not in its walls, but in its ability to rise from its own ashes, exactly where and when I decide.\nTechnical Chronicle by Taz - HomeLab DevOps Engineer.\n","date":"10 February 2026","externalUrl":null,"permalink":"/posts/phoenix-protocol-s3-pitr-validation/","section":"Posts","summary":"","title":"Phoenix Protocol: Validating Zero-Touch Rebirth and the S3 PITR Hell","type":"posts"},{"content":"","date":"10 February 2026","externalUrl":null,"permalink":"/categories/reliability-engineering/","section":"Categories","summary":"","title":"Reliability Engineering","type":"categories"},{"content":"","date":"10 February 2026","externalUrl":null,"permalink":"/tags/s3-backup/","section":"Tags","summary":"","title":"S3-Backup","type":"tags"},{"content":"","date":"6 February 2026","externalUrl":null,"permalink":"/tags/cryptography/","section":"Tags","summary":"","title":"Cryptography","type":"tags"},{"content":"","date":"6 February 2026","externalUrl":null,"permalink":"/categories/engineering/","section":"Categories","summary":"","title":"Engineering","type":"categories"},{"content":"","date":"6 February 2026","externalUrl":null,"permalink":"/tags/post-mortem/","section":"Tags","summary":"","title":"Post-Mortem","type":"tags"},{"content":" TazPod v2.0: Surrendering to Root and the RAM Revolution # In the world of DevOps and Security Engineering, there is a fine line between a secure architecture and an unusable one. With TazPod v1.0, I had built what looked on paper like a masterpiece of isolation: a \u0026ldquo;Ghost Mode\u0026rdquo; that leveraged Linux Namespaces and LUKS devices to make secrets invisible even to concurrent processes in the same container.\nToday, with the release of v2.0, I am officially documenting the failure of that approach and the complete rewrite of the system\u0026rsquo;s core. This is the chronicle of how operational stability won over theoretical paranoia, and how I learned to stop fighting against the root user.\n1. 
The Collapse of \u0026ldquo;Ghost Mode\u0026rdquo;: A Post-Mortem # The ambition of v1.0 was high: use unshare --mount to create a private mount space within the container, where a LUKS volume (vault.img) would be decrypted. The idea was that upon exiting the shell, the namespace would collapse and the secrets would vanish.\nLoop Device Instability # The first sign of structural failure appeared during intensive development sessions. The Linux kernel manages loop devices (files mounted as disks) as global resources. Inside a Docker container—which is already an isolated environment and often \u0026ldquo;privileged\u0026rdquo; in a precarious way to allow these operations—managing locks on device mappers proved disastrous.\nThe error Failed to create loop device or Device or resource busy became a constant. Often, a container that didn\u0026rsquo;t terminate cleanly left the vault.img file \u0026ldquo;hanging\u0026rdquo; on a ghost loop device on the host. This required machine reboots or surgical interventions with losetup -d that broke the workflow.\nThe Data Loss Event # The breaking point was a filesystem corruption event. LUKS and ext4 do not like being terminated abruptly. On two separate occasions, a container crash left the encrypted volume in an inconsistent state (\u0026ldquo;dirty bit\u0026rdquo;), making a remount impossible.\nI lost data. And among those data, I lost precious sessions of Mnemosyne (my AI\u0026rsquo;s long-term memory), which I had imprudently decided to save inside the vault for \u0026ldquo;maximum security.\u0026rdquo; This event forced me to reconsider the entire strategy: a security system that makes data inaccessible to its legitimate owner is a failed system.\n2. Surrendering to Root: Threat Analysis # While struggling to stabilize mount points, I had to face an uncomfortable truth regarding the threat model.\n\u0026ldquo;Ghost Mode\u0026rdquo; protected secrets from other unprivileged processes. 
But TazPod runs as a --privileged container to perform mounts. Anyone with root access to the container (or the host) can simply use nsenter to enter the \u0026ldquo;secret\u0026rdquo; namespace or perform a RAM dump.\nThe Isolation Paradox # I spent weeks building a house of cards with unshare and mount --make-private, only to realize I was protecting secrets from\u0026hellip; myself. An attacker capable of compromising the host would have had access to everything anyway.\nI therefore decided to change my approach: accept that Root sees everything. Instead of trying to hide data from an omnipotent user via kernel isolation, I decided to reduce the time window and physical surface area where data exists in the clear.\n3. v2.0 Architecture: The RAM Vault (tmpfs + AES-GCM) # The new architecture completely eliminates the dependency on cryptsetup, dm-crypt, and loop devices. We shifted security from the block level (kernel) to the application level (Go) and volatile level (RAM).\nStorage: The vault.tar.aes Format # Instead of an encrypted ext4 filesystem, data at rest is now a simple compressed and encrypted TAR archive.\nFor encryption, I chose AES-256-GCM (Galois/Counter Mode).\nWhy GCM? Unlike CBC (Cipher Block Chaining) mode, GCM offers authenticated encryption. This means the file is not only unreadable but also protected from tampering. If a bit of the encrypted file on disk is corrupted or altered, the decryption phase fails immediately with an authentication error, protecting the integrity of the secrets. Key Derivation: I use PBKDF2 with a random salt generated at each save to derive the AES key from the user passphrase. Runtime: The Volatility of tmpfs # When the user launches tazpod unlock, the CLI does not touch the disk.\nMount: A 64MB tmpfs volume (RAM Disk) is mounted at /home/tazpod/secrets. 
// Internal code for volatile mount func mountRAM() { cmd := exec.Command(\u0026#34;sudo\u0026#34;, \u0026#34;mount\u0026#34;, \u0026#34;-t\u0026#34;, \u0026#34;tmpfs\u0026#34;, \u0026#34;-o\u0026#34;, \u0026#34;size=64M,mode=0700,uid=1000,gid=1000\u0026#34;, \u0026#34;tmpfs\u0026#34;, MountPath) cmd.Run() } Decrypt \u0026amp; Extract: The vault.tar.aes file is read into memory, decrypted on-the-fly, and the resulting TAR stream is unpacked directly into the RAM mount point. Zero Trace: No temporary files are ever written to the host\u0026rsquo;s physical disk. Lifecycle: Pull, Save, Lock # Persistence management has been completely overhauled to adapt to the ephemeral nature of RAM.\ntazpod pull: Downloads secrets from Infisical, writes them to RAM, and immediately triggers an Auto-Save. Auto-Save: The CLI recursively reads the RAM content, creates a new TAR in memory, encrypts it, and atomically overwrites the vault.tar.aes file on disk. tazpod lock (or exit): The final command is brutal and effective: umount /home/tazpod/secrets. Data vanishes instantly. No need for secure overwrites (shred), because the bits never touched the magnetic platters or NAND cells. 4. Developer Experience: Resolving Friction # Beyond security, v1.0 suffered from usability issues that slowed down my daily workflow.\nThe Name Collision Problem # Initially, the container name was hardcoded (tazpod-lab). 
This prevented working on two projects simultaneously (e.g., tazlab-k8s and blog-src).\nI introduced dynamic initialization logic in tazpod init.\n// Generating a unique identifier for the project cwd, _ := os.Getwd() folderName := filepath.Base(cwd) r := rand.New(rand.NewSource(time.Now().UnixNano())) randomSuffix := fmt.Sprintf(\u0026#34;%04d\u0026#34;, r.Intn(10000)) containerName := fmt.Sprintf(\u0026#34;tazpod-%s-%s\u0026#34;, folderName, randomSuffix) Now, each project folder has its dedicated container (e.g., tazpod-backend-8492), isolated from others, with its own vault and configuration.\nHot Reloading: Developing the CLI within the CLI # Developing TazPod using TazPod presented an \u0026ldquo;Inception\u0026rdquo; challenge. How to test the new version of the CLI without having to rebuild the entire Docker image (which takes minutes) for every change?\nI implemented a Hot Reload workflow:\nCompile the Go binary on the host (task build). Copy the binary to ~/.local/bin (for host use). Inject it directly into the active container: docker cp bin/tazpod tazpod-lab:/home/tazpod/.local/bin/tazpod This reduced the feedback cycle from 4 minutes to 3 seconds, allowing me to iterate quickly on encryption and mount logic.\n5. Mnemosyne: Memory Outside the Vault # One of the hardest lessons from v1.0 was the loss of AI sessions. For Mnemosyne, persistence is more important than absolute secrecy. Chats with Gemini contain architectural context, not passwords.\nIn v2.0, I decided to decouple the AI memory from the secrets vault. During the setupBindAuth phase, the CLI creates a strategic symlink:\nHost: Logs reside in /workspace/.tazpod/.gemini (on the host disk, persistent). Container: Linked to ~/.gemini. This ensures that even if I destroy the vault or reset the container, the project\u0026rsquo;s \u0026ldquo;consciousness\u0026rdquo; survives. 
Secrets (API tokens to talk to Gemini) remain in the RAM Vault, but memories are saved on standard disk.\nConclusions: Simplicity is a Security Feature # TazPod v2.0 is, paradoxically, technologically less advanced than v1.0. It doesn\u0026rsquo;t use esoteric kernel features, nor does it manipulate network or mount namespaces in creative ways. It\u0026rsquo;s just an encrypted file and a RAM disk.\nHowever, it is infinitely more robust.\nIt doesn\u0026rsquo;t break if Proxmox has a high load. It doesn\u0026rsquo;t corrupt data if the container crashes. It is portable to any Linux system without requiring specific kernel modules for encryption. I\u0026rsquo;ve learned that in DevOps, complexity is often technical debt disguised as \u0026ldquo;best practice.\u0026rdquo; Reducing the attack surface meant, in this case, reducing the complexity of the architecture. Now my secrets live in a digital soap bubble (RAM): ephemeral, fragile if touched, but perfectly isolated as long as it exists.\nThe next step? 
Bringing this philosophy of \u0026ldquo;resilient simplicity\u0026rdquo; to the heart of the Kubernetes cluster, where Mnemosyne will find its definitive home.\nTechnical Chronicle by Taz - Systems Engineering and Zero-Trust Infrastructure.\n","date":"6 February 2026","externalUrl":null,"permalink":"/posts/tazpod-v2-ram-vault-evolution/","section":"Posts","summary":"","title":"TazPod v2.0: Surrendering to Root and the RAM Revolution","type":"posts"},{"content":"","date":"5 February 2026","externalUrl":null,"permalink":"/posts/tazlab-nomadic-rebirth-cloud-horizon/","section":"Posts","summary":"","title":"Nomadic Rebirth: Towards the Cloud Horizon and the Castle's Evolution","type":"posts"},{"content":"","date":"5 February 2026","externalUrl":null,"permalink":"/categories/strategy/","section":"Categories","summary":"","title":"Strategy","type":"categories"},{"content":"","date":"5 February 2026","externalUrl":null,"permalink":"/tags/vectordb/","section":"Tags","summary":"","title":"Vectordb","type":"tags"},{"content":"","date":"2 February 2026","externalUrl":null,"permalink":"/categories/data-engineering/","section":"Categories","summary":"","title":"Data Engineering","type":"categories"},{"content":"","date":"2 February 2026","externalUrl":null,"permalink":"/tags/knowledge-management/","section":"Tags","summary":"","title":"Knowledge-Management","type":"tags"},{"content":"","date":"2 February 2026","externalUrl":null,"permalink":"/posts/mnemosyne-local-rebirth-snr/","section":"Posts","summary":"","title":"Mnemosyne: Local Rebirth, the Recursive Loop, and the SNR Challenge","type":"posts"},{"content":"","date":"2 February 2026","externalUrl":null,"permalink":"/tags/pgvector/","section":"Tags","summary":"","title":"Pgvector","type":"tags"},{"content":"","date":"2 February 2026","externalUrl":null,"permalink":"/categories/design-patterns/","section":"Categories","summary":"","title":"Design Patterns","type":"categories"},{"content":"","date":"2 February 
2026","externalUrl":null,"permalink":"/tags/proxmox/","section":"Tags","summary":"","title":"Proxmox","type":"tags"},{"content":" The Castle\u0026rsquo;s Orchestra: The Pivot to Terragrunt and the War on Race Conditions # The dream of every DevOps engineer working with ephemeral infrastructure is Total Determinism. The idea that, by pressing a single key, an entire digital cathedral can rise from nothing, configure itself, and serve traffic in a few minutes, only to vanish without a trace, is what drives the Ephemeral Castle project. However, as often happens when transitioning from the lab to production, reality presented a steep bill in the form of instability, timing conflicts, and infinite stalls.\nIn this new stage of my technical diary, I am documenting the most significant architectural pivot since the project\u0026rsquo;s inception: the abandonment of the Terraform monolith in favor of layered orchestration managed by Terragrunt. This was not merely a tool change, but a necessary philosophical shift to defeat the Race Conditions that were turning the cluster bootstrap into a gamble rather than a certainty.\nThe Breaking Point: The Tyranny of Webhooks # Until a few days ago, the Castle was born from a single, giant main.tf. Terraform handled everything: it created the VMs on Proxmox, configured Talos OS, installed MetalLB, Longhorn, Cert-Manager, and finally Flux. On paper, Terraform\u0026rsquo;s dependency graph should have managed the execution order. In practice, I collided with the asynchronous nature of Kubernetes.\nThe Struggle Analysis: Webhooks in Timeout # The problem manifested systematically during the installation of MetalLB or Cert-Manager. Kubernetes uses Admission Webhooks to validate resources. 
When Terraform sent the manifest for an IPAddressPool (for MetalLB) or a ClusterIssuer (for Cert-Manager), the relative controller was still in the initialization phase.\nThe result was a frustrating error: failed calling webhook \u0026quot;l2advertisementvalidationwebhook.metallb.io\u0026quot;: connect: connection refused\nEven though the controller Pod appeared Running, the webhook service was not yet ready to respond. Terraform, seeing the failure, errored out and interrupted the entire provisioning chain. I tried inserting artificial \u0026ldquo;waits,\u0026rdquo; but they were fragile: too short and the system failed, too long and I lost the speed advantage. The monolith was becoming unmanageable because it tried to manage too many different states (infrastructure, network, storage, application logic) in a single lifecycle.\nThe Philosophical Pivot: Base Infrastructure vs. GitOps # Another tactical error I had to acknowledge was over-delegation to Flux. In the previous post, I celebrated the idea of moving Longhorn and MetalLB under Flux management to make Terraform \u0026ldquo;lighter.\u0026rdquo;\nThe Reasoning: Why I moved back # I realized that MetalLB and Longhorn are not \u0026ldquo;applications,\u0026rdquo; but extensions of the cluster Kernel. Without MetalLB, the Ingress doesn\u0026rsquo;t receive an IP. Without Longhorn, apps requiring persistence (like the blog or databases) cannot start.\nIf I delegate these components to Flux, I create a dangerous dependency loop: Flux needs secrets to authenticate, but ESO (External Secrets Operator) needs a healthy cluster to run. If Flux fails for any reason, I lose visibility into the cluster\u0026rsquo;s vital components. 
I decided, therefore, that everything necessary for the cluster to be considered \u0026ldquo;functional and capable\u0026rdquo; must be born via IaC (Infrastructure as Code), while Flux must handle only what the cluster \u0026ldquo;hosts.\u0026rdquo;\nThe Arrival of Terragrunt: The Conductor # To solve these problems, I introduced Terragrunt. Terragrunt acts as a wrapper for Terraform, allowing the infrastructure to be divided into independent modules linked by an explicit dependency graph.\nDeep-Dive: State Isolation and Dependency Graph # Using Terragrunt introduced two key concepts that changed everything:\nState Isolation: Each layer (networking, storage, engine) has its own .tfstate file. If I break the Flux configuration, the state of my VMs on Proxmox remains intact. I no longer risk destroying the entire cluster due to a syntax error in a Kubernetes manifest. Dependency Graph: I can tell Terragrunt: \u0026ldquo;Don\u0026rsquo;t even try to install MetalLB until the Platform layer (the VMs) is completely online and the Kubernetes API is responding.\u0026rdquo; The Anatomy of the 6-Layer Castle # I reorganized the entire ephemeral-castle repository into a layered structure, where each layer builds upon the foundations of the previous one.\nLayer 1: Secrets (G1) # This layer interacts only with Infisical EU. It retrieves the necessary tokens for Proxmox, SSH keys, and S3 credentials. It is the \u0026ldquo;point zero\u0026rdquo; of trust.\nLayer 2: Platform (G2) # This is where the heavy provisioning happens. Virtual machines are created on Proxmox, and the Talos OS configuration is injected.\nDeep-Dive: Quorum and VIP: In this phase, Terraform waits for the 3 Control Plane nodes to form the etcd quorum. The Virtual IP (VIP) must be stable before moving to the next layer. If the VIP does not respond, the bootstrap stops here. Layer 3: Engine (G3) # Once the \u0026ldquo;metal\u0026rdquo; is ready, we install the identity engine: External Secrets Operator (ESO). 
Without ESO, the cluster cannot talk to Infisical to retrieve application secrets. It is the bridge between the external world and the Kubernetes world.\nLayer 4: Networking (G4) # Installation of MetalLB. Here we implemented the definitive solution to the webhook race condition. The orchestration script queries Kubernetes until the webhook\u0026rsquo;s EndpointSlice is Ready. Only then is the IP pool configuration injected.\nLayer 5 \u0026amp; 6: Storage and GitOps (G5 - In Parallel) # This is where the optimization I called the \u0026ldquo;Parallel Blitz\u0026rdquo; took place. I realized that Longhorn (Storage) and Flux (GitOps) can be born simultaneously. Flux can start downloading images and preparing deployments while Longhorn is still initializing disks on the nodes.\nThe War on State: \u0026ldquo;VM Already Exists\u0026rdquo; and the Persistent Backend # A recurring problem during testing was local state corruption. If I accidentally deleted the .terraform folder or if the state was not saved after a crash, the next attempt would yield the error: 400 Parameter verification failed: vmid: VM 421 already exists on node proxmox\nThe Investigation: The ghost in the system # Terraform is a \u0026ldquo;state-aware\u0026rdquo; system. If it loses the state file, it thinks the world is empty. But Proxmox has a physical memory. To resolve this stall, I implemented two strategies:\nOut-of-Tree Persistent Backend: I moved all state files to a dedicated directory /home/taz/kubernetes/ephemeral-castle/states/, external to the Git repository. This ensures the state survives even an aggressive git clean or a branch change. Nuclear Wipe: I created a nuclear-wipe.sh script that, in case of emergency, uses the Proxmox API to forcibly delete VMs between IDs 421 and 432, allowing Terraform to restart from a real tabula rasa. Technical Implementation: The Heart of Terragrunt # Here is how the root configuration file that orchestrates the entire dance looks. 
Notice how providers are generated for all underlying layers, ensuring total consistency.\n# live/terragrunt.hcl remote_state { backend = \u0026#34;local\u0026#34; config = { path = \u0026#34;${get_parent_terragrunt_dir()}/../../states/${path_relative_to_include()}/terraform.tfstate\u0026#34; } } generate \u0026#34;provider\u0026#34; { path = \u0026#34;provider.tf\u0026#34; if_exists = \u0026#34;overwrite_terragrunt\u0026#34; contents = \u0026lt;\u0026lt;EOF provider \u0026#34;proxmox\u0026#34; { endpoint = var.pm_api_url api_token = var.pm_api_token insecure = true } provider \u0026#34;kubernetes\u0026#34; { config_path = \u0026#34;${get_parent_terragrunt_dir()}/../../clusters/tazlab-k8s-proxmox/proxmox/configs/kubeconfig\u0026#34; } provider \u0026#34;helm\u0026#34; { kubernetes { config_path = \u0026#34;${get_parent_terragrunt_dir()}/../../clusters/tazlab-k8s-proxmox/proxmox/configs/kubeconfig\u0026#34; } } EOF } And an example of how a layer (e.g., networking) declares its dependency on the previous layer:\n# live/tazlab-k8s-proxmox/stage4-networking/terragrunt.hcl include \u0026#34;root\u0026#34; { path = find_in_parent_folders() } dependency \u0026#34;engine\u0026#34; { config_path = \u0026#34;../stage3-engine\u0026#34; } inputs = { # Inputs passed from the previous layer if necessary } Optimization: The \u0026ldquo;Parallel Blitz\u0026rdquo; and the 8-Minute Record # After stabilizing the order, the challenge became speed. Initially, the bootstrap took about 14 minutes. Analyzing the logs, I saw that Flux remained waiting for Longhorn even though it wasn\u0026rsquo;t strictly necessary for its basic installation.\nThe Solution: Intelligent Orchestration # In the create.sh script, I separated the layer application. 
While layers 1, 2, 3, and 4 must be sequential (Secrets -\u0026gt; VMs -\u0026gt; Engine -\u0026gt; Network), layers 5 and 6 are launched almost simultaneously.\n# create.sh snippet - Enterprise V4 echo \u0026#34;🚀 STAGE 5 \u0026amp; 6: Launching Storage and GitOps in Parallel...\u0026#34; terragrunt run-all apply --terragrunt-non-interactive --terragrunt-parallelism 2 This change reduced the total bootstrap time to 8 minutes and 20 seconds. In this timeframe, the system goes from cosmic nothingness to an HA cluster with 5 nodes, distributed storage, Layer 2 networking, and Flux having already reconciled the latest version of this blog.\nPost-Lab Reflections: Toward Cloud Agnosticism # The transition to Terragrunt has transformed the Ephemeral Castle into a real Infrastructure Factory.\nWhat does this setup mean for the future? # Platform Agnosticism: I can now create a live/tazlab-k8s-aws/ folder, change only the stage2-platform layer (using AWS modules instead of Proxmox), and keep all other layers identical. Networking will provide an AWS LoadBalancer instead of MetalLB, but Flux and the apps won\u0026rsquo;t even notice. Industrial Reliability: We have eliminated the \u0026ldquo;maybe it works.\u0026rdquo; If a layer fails, Terragrunt stops exactly there, allowing us to inspect the specific state without chasing ghosts in a 5000-line state file. Speed as Security: An infrastructure born in 8 minutes allows one to not fear destroying everything. If we suspect a compromise or a configuration error, the answer is always: destroy \u0026amp;\u0026amp; create. The Castle is now solid, modular, and ready to scale beyond the borders of my home lab. 
The orchestra is ready, and the music of code has never been so harmonious.\nEnd of Technical Chronicle - The Terragrunt Revolution\n","date":"2 February 2026","externalUrl":null,"permalink":"/posts/orchestrating-ephemeral-castle-terragrunt-pivot/","section":"Posts","summary":"","title":"The Castle's Orchestra: The Pivot to Terragrunt and the War on Race Conditions","type":"posts"},{"content":" The Immutable Handover: Terraform, Flux, and the Birth of the Castle Factory # Systems engineering is not a linear process, but an evolution made of continuous simplifications. After achieving High Availability with a 5-node cluster, I realized the architecture still suffered from an original sin: overlapping responsibilities. Terraform was doing too much, and Flux was doing too little. In this technical chronicle, I document the final evolutionary leap of the Ephemeral Castle: its transformation into a true \u0026quot;Infrastructure Factory\u0026quot; where the IaC code acts only as a spark, delegating the entire construction of the pillars to the GitOps engine.\nThe session\u0026rsquo;s objective was radical: reduce Terraform to the bare minimum, reorganize the repository to ensure total isolation between projects, and create a rebirth system capable of rising from the ashes with a single automated command.\nThe Reasoning: The Aesthetic of IaC Minimalism # Initially, I had configured Terraform to install not only the Kubernetes cluster but also all its fundamental components: MetalLB for networking, Longhorn for storage, Traefik for ingress, and Cert-Manager for certificates. On paper, it seemed like a logical choice: a single command to have everything ready.\nHowever, this choice created an identity conflict. Flux, my GitOps \u0026quot;butler,\u0026quot; was also trying to manage those same components by reading the manifests repository. 
The result was a constant duel between Terraform and Flux for control of the cluster, with the risk of drift and collisions at every update.\nThe Choice: Only the Essential # I decided to implement a drastic refactoring. Terraform now manages only the \u0026quot;Kernel\u0026quot; of the Castle:\nPhysical Provisioning: Creation of VMs on Proxmox and configuration of Talos OS. External Secrets Operator (ESO): This is the only Kubernetes component I kept in Terraform. The reason is purely technical: for Flux to download apps, it often needs secrets (Git tokens, S3 keys). ESO must be there from the very first second to act as a bridge with Infisical EU. Flux CD: The final trigger. Terraform installs Flux and hands it the keys to the tazlab-k8s repository. This separation transforms Terraform into a midwife: it helps the cluster be born and then steps aside. Flux becomes the sole sovereign of the infrastructure pillars. The advantage? Traefik or MetalLB updates now happen with a simple git push, without ever having to invoke Terraform for application changes.\nPhase 1: Project-Centric Reconstruction and Isolation # Until yesterday, the folder structure was divided by platform (providers/proxmox/...). It was a limited approach that didn\u0026rsquo;t scale well in a multi-project or multi-cloud scenario.\nThe Reasoning: Total Isolation # I decided to reorganize the entire ephemeral-castle repository following a project-oriented hierarchy. A project (like \u0026quot;Blue\u0026quot;) must be able to exist on both Proxmox and AWS in a totally isolated manner, with its independent Terraform states and protected keys.\nI implemented the following structure:\nclusters/blue/proxmox/: The specific logic for the local cluster. clusters/blue/configs/: A dedicated folder to host generated sensitive files (kubeconfig, talosconfig). Security and .gitignore # A common error in IaC is letting state files or configs slip into version control. 
I updated the .gitignore with a recursive and aggressive rule:\n**/configs/ *.tfstate* This ensures that regardless of how many new clusters I create, their keys will remain confined to my protected workstation or the vault, never on GitHub.\nPhase 2: The Castle Remote - destroy.sh and create.sh # The true challenge of ephemeral infrastructure is the speed of rebirth. If recreating the cluster requires 10 manual commands, the infrastructure is not ephemeral; it\u0026rsquo;s just exhausting. I decided to condense the entire operational intelligence into two orchestration scripts.\nThe Investigation: The Terraform Block # The main problem was that terraform destroy failed systematically. The Kubernetes and Helm providers were trying to connect to the cluster to verify resource status before deleting them. But if the machines had already been reset or turned off, Terraform remained hung waiting for a response that would never come.\nThe Solution: The State \u0026quot;Purge\u0026quot; # I resolved this stalemate by inserting a forced cleanup phase in the destroy.sh script. Before launching the destroyer, the script manually removes problematic resources from the local state:\n# destroy.sh snippet echo \u0026#34;🔥 Phase 1: Cleaning Terraform State...\u0026#34; terraform state list | grep -E \u0026#34;flux_|kubernetes_|kubectl_|helm_\u0026#34; | xargs -n 1 terraform state rm || true This command tells Terraform: \u0026quot;Forget you ever knew Flux or Helm, just think about deleting the VMs\u0026quot;. 
It is a surgical maneuver that unlocks the entire destruction process.\nPhase 3: The Struggle of Race Conditions # During the first tests of the create.sh script, the cluster was born, but services (like the blog) remained offline.\nError Analysis: The MetalLB Webhook # I saw the MetalLB Pods in Running state, but Flux reported a cryptic error on the IP pool configurations: failed calling webhook \\\u0026quot;l2advertisementvalidationwebhook.metallb.io\\\u0026quot;: connect: connection refused\nThe thought process: I initially suspected a network problem between nodes. I checked the metallb-controller logs and discovered the truth: the webhook process (which validates YAML files) takes a few seconds longer than the main controller to activate. Flux tried to inject the configuration at the wrong millisecond, received a rejection, and stalled.\nThe Solution: The Patience of EndpointSlices # I updated the creation script to not just wait for the Pods, but to query Kubernetes until the webhook endpoint was actually ready to serve. I migrated the control logic from the old Endpoints resource (now deprecated) to the modern EndpointSlice.\nHowever, even this logic required refinement: initially, a Bash syntax error in the wait loop blocked the rebirth right at the finish line. 
Fixing that bug was the last lesson of the day: in an orchestration script, the robustness of controls (using grep -q instead of fragile string comparisons) is what separates \u0026quot;toy\u0026quot; automation from professional-grade.\n# create.sh logic update echo \u0026#34;⏳ Waiting for MetalLB Webhook to be serving...\u0026#34; until kubectl get endpointslice -n metallb-system -l kubernetes.io/service-name=metallb-webhook-service -o jsonpath=\u0026#39;{range .items[*].endpoints[?(@.conditions.ready==true)]}{.addresses[*]}{\u0026#34;\\n\u0026#34;}{end}\u0026#39; 2\u0026gt;/dev/null | grep -q \u0026#34;\\.\u0026#34;; do printf \u0026#34;.\u0026#34; sleep 5 done echo \u0026#34; Webhook ready!\u0026#34; This granular check eliminated the last \u0026quot;race condition\u0026quot; preventing total automation.\nPhase 4: Idempotency and the Infisical Conflict # Another obstacle was the automatic backup of config files to Infisical EU. Terraform tried to create the KUBECONFIG_CONTENT secret, but if it already existed from the previous attempt, the API returned a 400 Bad Request: Secret already exists error.\nThe Reasoning: Preventive Import # Instead of trying to delete the secret (which requires elevated permissions and time), I decided to implement an automatic import logic. Before executing the final apply, the script tries to \u0026quot;import\u0026quot; the secret into the Terraform state. If it exists, Terraform takes control and updates it; if it doesn\u0026rsquo;t, the error is ignored, and Terraform will create it normally.\n# create.sh snippet echo \u0026#34;🔗 Checking for existing configs on Infisical...\u0026#34; terraform import -var-file=secrets.tfvars infisical_secret.kubeconfig_upload \u0026#34;$WORKSPACE_ID:$ENV_SLUG:$FOLDER_PATH:KUBECONFIG_CONTENT\u0026#34; || true Deep-Dive: The Concept of Handover # In this architecture, the concept of Handover is fundamental. 
It represents the exact moment when the responsibility for the cluster passes from provisioning (IaC) to continuous delivery (GitOps).\nWhy is this a significant technical term? In a traditional system, Terraform is \u0026quot;the state.\u0026quot; If you want to change a Traefik port, you change the Terraform code. In the Castle, Terraform doesn\u0026rsquo;t even know what Traefik is. Terraform only knows it must give birth to a cluster and install Flux.\nThis drastically reduces the Blast Radius of a Terraform error: if you get a line wrong in the IaC code, you risk breaking the VMs, but you will never break the blog\u0026rsquo;s application logic, because that resides in another world (GitOps). It is the final separation between the \u0026quot;machine\u0026quot; and the \u0026quot;purpose.\u0026quot;\nThe Factory in Action: How a New Project is Born # Thanks to this restructuring, creating a new cluster is no longer a work of craftsmanship but a production line process. If I wanted to create the \u0026quot;Green\u0026quot; cluster today, the procedure would be as follows:\nProvisioning (IaC):\nCopy the templates/proxmox-talos folder to clusters/green/proxmox. Modify the terraform.tfvars file by setting the new IPs, cluster name, and the new Infisical path (e.g., /ephemeral-castle/green/proxmox). Prepare the secrets on Infisical in the new folder. Delivery (GitOps):\nCreate a new GitHub repository starting from the contents of gitops-template. Enter the URL of this new repository in the terraform.tfvars file of the project folder. Spark:\nRun ./create.sh from the project folder. In less than 10 minutes, Terraform would create the machines, and Flux would start populating the new repository with the base components (MetalLB, Traefik, Cert-Manager) already pre-configured. 
This is the true power of the Castle: the ability to scale horizontally not just nodes, but entire digital ecosystems.\nFinal Hardening: Kernel Cleanup and API v1 # To conclude the day, I addressed two \u0026quot;cleanup\u0026quot; bugs that were cluttering the logs.\nKernel Modules: Talos reported errors loading iscsi_generic. Investigating the documentation, I found that in recent versions, iSCSI modules have been merged. I removed the non-existent module from talos.tf, finally achieving a clean boot (\u0026quot;Green Boot\u0026quot;). Deprecations: I migrated every Kubernetes resource managed by Terraform to v1 versions (e.g., kubernetes_secret_v1). This doesn\u0026rsquo;t change functionality but ensures the infrastructure is ready for upcoming major Kubernetes releases and silences annoying terminal warnings. Post-lab Reflections: The Triumph of Automation # Seeing the Castle rise with a single command was one of the most satisfying experiences of this journey.\nWhat we learned: # IaC as Bootstrapper: Terraform is at its best when limited to creating foundations. The more Kubernetes code you put in Terraform, the more problems you\u0026rsquo;ll have in the future. The Importance of Retries: In a distributed world, you cannot assume a command will work on the first try. Orchestration scripts must have the \u0026quot;patience\u0026quot; to wait for network services to warm up. Isolation = Replicability: Dividing by project and platform makes the Castle a true factory. Today I have a \u0026quot;Blue\u0026quot; cluster on Proxmox, but the structure is ready to give birth to a \u0026quot;Green\u0026quot; cluster on AWS in less than 10 minutes. The Castle is now not just solid; it is autonomous. 
The walls are high, the butler (Flux) is at work, and the blog you are reading is living proof that code, when well-orchestrated, can create immutable and indestructible realities.\nEnd of Technical Chronicle - Phase 5: Automation and Handover\n","date":"1 February 2026","externalUrl":null,"permalink":"/posts/the-immutable-handover-factory-automation/","section":"Posts","summary":"","title":"The Immutable Handover: Terraform, Flux, and the Birth of the Castle Factory","type":"posts"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/tags/alloydb/","section":"Tags","summary":"","title":"Alloydb","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/posts/mnemosyne-long-term-memory/","section":"Posts","summary":"","title":"Mnemosyne: Agent's Long-Term Memory and AlloyDB Integration","type":"posts"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/tags/ha/","section":"Tags","summary":"","title":"Ha","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/tags/nginx/","section":"Tags","summary":"","title":"Nginx","type":"tags"},{"content":" Rise of the Fortress: High Availability, Immutability, and the Birth of a Serious Cluster # The journey of building the Ephemeral Castle has reached a critical threshold. Until now, the infrastructure had been an experimental laboratory: a single Control Plane, a single Worker, a functional but fragile shell. In systems engineering, a cluster with a single point of failure is not a cluster; it is just a scheduled delay towards disaster.\nIn this technical chronicle, I document the transformation of the Castle into a true High Availability (HA) fortress. I decided to scale the architecture to 3 Control Plane nodes and 2 Workers, establishing the minimum requirement to guarantee control plane resilience and workload continuity. 
Simultaneously, I faced the migration of the first \u0026quot;real\u0026quot; application: this blog, which moved from a dynamic and unstable setup to a stateless and immutable architecture, laying the foundations for a professional-grade CI/CD pipeline.\nPhase 1: Engineering High Availability (HA) # The first decision of the day was radical: wipe the existing setup to give birth to an infrastructure capable of withstanding the loss of an entire node without service interruption.\nThe Reasoning: Why 3 Control Planes? # In a Kubernetes cluster, the brain is represented by etcd, the distributed database that stores the state of every resource. etcd uses the Raft consensus algorithm to ensure that all nodes agree on the data.\nI chose the 3-node configuration for a purely mathematical reason related to the concept of Quorum. The quorum is the minimum number of nodes that must be online for the cluster to make decisions. The formula is floor(n/2) + 1, that is, a strict majority of the n members.\nWith 1 node, the quorum is 1 (no fault tolerance). With 2 nodes, the quorum is 2 (if one dies, the cluster freezes). With 3 nodes, the quorum is 2. This means I can lose an entire node and the Castle will continue to function perfectly. Moving to 3 nodes transforms the cluster from a toy into a production platform.\nProxmox Infrastructure Details # I configured Terraform to manage 5 virtual machines on Proxmox:\nVIP (Virtual IP): 192.168.1.210 - The single entry point for the Kubernetes API. CP-01, 02, 03: IP .211, .212, .213 - The distributed brain. Worker-01, 02: IP .214, .215 - The operational arms where Pods run. Phase 2: The Quorum Struggle and the Fight Against Ghosts # The implementation of High Availability proved more complex than expected due to a phenomenon I dubbed \u0026quot;ghost identity conflict.\u0026quot;\nError Analysis: etcd at a Standstill # After launching the provisioning, the nodes appeared on Proxmox, but the cluster failed to form. 
Monitoring the status with talosctl service etcd, I saw the services stuck in the Preparing state.\nInvestigation via talosctl get members revealed a chaotic situation: new nodes were trying to communicate but saw duplicate identities associated with the same IPs in the database. This happened because, during previous tests, I had reused the same IP addresses without performing a full wipe of the disks. etcd, finding residues of an old configuration, refused to form a new quorum to protect data integrity.\nThe Solution: Clean Slate and Network Shift # I decided to apply the supreme philosophy of the Ephemeral Castle: if it\u0026rsquo;s not clean, it\u0026rsquo;s not reliable.\nI executed a talosctl reset on all nodes simultaneously to wipe every magnetic residue on the virtual disks. I moved the entire cluster IP range (from .22x to .21x) to force every network component, including the router\u0026rsquo;s ARP cache, to forget the \u0026quot;ghosts\u0026quot; of the past. After this total reset, provisioning went smoothly. The three brains recognized each other, elected a leader, and the VIP .210 went online in less than 2 minutes. This result was particularly satisfying after hours of troubleshooting invisible certificate conflicts.\nPhase 3: The Stateless Revolution - Blog Migration # With a solid HA base, it was time to deploy the first non-infrastructure workload: the Hugo blog.\nThe Reasoning: From State to Immutability # The previous blog setup was based on a git-sync container that downloaded source code from GitHub and a Hugo instance that compiled the site within the cluster.\nI decided to abandon this approach for three fundamental reasons:\nSecurity (Zero Trust): The old method required keeping a GitHub token or an SSH key inside the cluster. By removing git-sync, the cluster no longer needs to know that a source Git repository exists. Reliability: If GitHub had gone down, the blog would not have started. 
Now, the blog depends only on the Docker image saved on Docker Hub. Speed: An immutable image containing only pre-compiled files and a lightweight web server starts in milliseconds, whereas Hugo took precious seconds to generate the site at every startup. Deep-Dive: Docker Multi-Stage Build # To implement this vision, I wrote a multi-stage Dockerfile. This approach allows for the separation of the build environment from the runtime environment, ensuring tiny and secure images.\n# Stage 1: Builder FROM hugomods/hugo:std AS builder WORKDIR /src COPY . . # Generate static site with optimizations RUN hugo --minify # Stage 2: Runner FROM nginx:stable-alpine # Copy build artifacts, leaving behind compiler and source code COPY --from=builder /src/public /usr/share/nginx/html EXPOSE 80 CMD [\u0026#34;nginx\u0026#34;, \u0026#34;-g\u0026#34;, \u0026#34;daemon off;\u0026#34;] Phase 4: Level 2 GitOps - Total Traceability # A serious infrastructure requires a serious release workflow. I decided to implement an Image Tagging system based on Git SHA.\nThe Problem with Static Tags # Using a tag like :latest or :blog is a cardinal sin in Kubernetes. It prevents deterministic rollbacks and misleads Kubernetes, which might not download the new version if the tag doesn\u0026rsquo;t change.\nThe Solution: The Smart Publish Script # I developed a publish.sh script that coordinates the release between two different repositories (blog-src and tazlab-k8s).\nThe script\u0026rsquo;s thought process:\nVerify that there are no uncommitted changes (determinism). Extract the current commit SHA (e.g., 8c945ac). Build and push the image tazzo/tazlab.net:blog-8c945ac. GitOps Automation: The script enters the local tazlab-k8s repository folder, searches for the blog manifest file, and replaces the old tag with the new one using sed. Executes an automatic commit and push to tazlab-k8s. In this way, the blog update is not a manual operation on the cluster, but a declared state change on Git. 
Flux CD detects the new commit and aligns the cluster within 60 seconds. This is the true essence of GitOps: code is the only source of truth.\nPhase 5: The Port Mapping Bug Hunt # Despite the correct architecture, the blog initially responded with a frustrating Connection Refused.\nInvestigation: Ingress vs Service # I began the investigation by checking the Pod status: they were Running. I checked the Traefik logs and noticed unexpected behavior: Traefik was receiving traffic on port 80 but failing to contact the backend.\nExecuting kubectl describe svc hugo-blog, I discovered the snag. Traefik, by default in its Helm chart, attempts to map traffic to ports 8000 (HTTP) and 8443 (HTTPS) of the containers. However, in my manifest, I had configured Nginx to listen on port 80.\nFurthermore, the official Traefik image runs as a non-root user and does not have permissions to listen on ports below 1024 inside the Pod.\nThe Solution: Port Alignment # I modified the Traefik configuration in main.tf to explicitly handle the mapping:\nExternal: Port 80 (exposed by the MetalLB LoadBalancer). Mapping: Port 80 of the Service -\u0026gt; Port 8000 of the Traefik Pod. Ingress: Traefik then routes to port 80 of the blog Pods (Nginx). # Traefik Port Configuration Fix ports: web: exposedPort: 80 port: 8000 # Internal port where Traefik is authorized to listen websecure: exposedPort: 443 port: 8443 After applying this change, the Let\u0026rsquo;s Encrypt SSL certificates (managed via HTTP-01 challenge) instantly moved from pending to valid. Seeing the green padlock appear on https://blog.tazlab.net was the culmination of a long debugging session.\nPost-lab Reflections: Towards the Complete Castle # With a 5-node cluster and a real application operational in HA, the Ephemeral Castle has emerged from its embryonic phase.\nWhat we achieved: # Resilience Born from Consensus: Thanks to the 3 CPs, we can afford hardware failures without losing control of the platform. 
Application Immutability: The blog is no longer a mass of synchronized files, but an entity frozen in time, easy to scale and impossible to corrupt. Backup Automation: I no longer need to worry about kubeconfig files. Terraform uploads them to Infisical as soon as the cluster is born, allowing me to be operational on any machine in moments. The Castle is now ready to welcome the next pillars: observability with Prometheus and Grafana, and filesystem hardening through Disk Encryption of the /var partition. Every step moves us further from the fragility of \u0026quot;hardware\u0026quot; and closer to the freedom of pure code.\nEnd of Technical Chronicle - Phase 4: HA and Immutability\n","date":"31 January 2026","externalUrl":null,"permalink":"/posts/scaling-ephemeral-castle-ha-stateless-blog/","section":"Posts","summary":"","title":"Rise of the Fortress: High Availability, Immutability, and the Birth of a Serious Cluster","type":"posts"},{"content":"","date":"30 January 2026","externalUrl":null,"permalink":"/tags/cert-manager/","section":"Tags","summary":"","title":"Cert-Manager","type":"tags"},{"content":"","date":"30 January 2026","externalUrl":null,"permalink":"/tags/letsencrypt/","section":"Tags","summary":"","title":"Letsencrypt","type":"tags"},{"content":" The Foundations of Accessibility: Traefik, Cert-Manager, and the Castle\u0026rsquo;s Philosophical Pivot # After securing the heart of the Ephemeral Castle with etcd encryption and establishing the secure bridge with Infisical, the infrastructure was in a state of \u0026ldquo;secure solitude.\u0026rdquo; The cluster was protected, but isolated. 
In this new stage of my technical diary, I document the implementation process of the two pillars that allow the Castle to communicate with the outside world in a secure and automated way: Traefik and Cert-Manager.\nThe goal of the day was ambitious: to transform a \u0026ldquo;naked\u0026rdquo; cluster into a production-ready platform, capable of managing HTTPS traffic and the SSL certificate lifecycle without any manual intervention. Along the way, I collided with architectural choices that tested the very philosophy of the project, leading to a radical change of course.\nThe Prelude of Trust: TazPod as Identity Anchor # No automation can begin without a verified identity. In the context of the Ephemeral Castle, where portability is the supreme dogma, I cannot afford to leave access keys scattered on my laptop\u0026rsquo;s hard drive. This is where TazPod comes in.\nThe bootstrap process always begins in the terminal. Through the tazpod pull command, I activate the \u0026ldquo;Ghost Mount\u0026rdquo;: an encrypted memory area, isolated via Linux Namespaces, where Infisical session tokens reside. It is this step that allows Terraform to authenticate towards the Infisical EU instance and retrieve cluster secrets (like the Proxmox token or S3 keys).\nI populated the secrets.tfvars file by drawing from this secure enclave. This approach ensures that the \u0026ldquo;master\u0026rdquo; credentials are never written in cleartext on the persistent filesystem, keeping my work environment ready to disappear at any time without leaving a trace. Once Terraform has its tokens, the provisioning dance begins.\nPhase 1: Traefik - The Traffic Director # To manage incoming traffic, the choice fell on Traefik. In Kubernetes, an Ingress Controller is the component that listens for requests coming from the outside and routes them to the correct services within the cluster.\nThe Reasoning: Why Traefik and not Nginx? 
# I decided to use Traefik primarily for its \u0026ldquo;Cloud Native\u0026rdquo; nature and its ability to self-configure by reading Kubernetes resource annotations. Compared to the Nginx Ingress, Traefik offers a smoother management of Custom Resource Definitions (CRDs), such as the IngressRoute, which allows for superior configuration granularity for traffic routing.\nI could have chosen an approach based on a DaemonSet, running Traefik on every node, but for the \u0026ldquo;Blue\u0026rdquo; cluster (composed of only one operational worker) I opted for a classic Deployment with a single replica. This reduces resource consumption and simplifies persistence management, should it be necessary. In a larger architecture, scaling would be managed by a Horizontal Pod Autoscaler based on traffic metrics.\nIaC Integration # Traefik was not installed via Flux, but integrated directly into the main.tf of ephemeral-castle. This is a fundamental design choice: the Ingress is a component of the core infrastructure, not an application. 
It must be born along with the cluster.\n# Traefik Ingress Controller Configuration resource \u0026#34;helm_release\u0026#34; \u0026#34;traefik\u0026#34; { name = \u0026#34;traefik\u0026#34; repository = \u0026#34;https://traefik.github.io/charts\u0026#34; chart = \u0026#34;traefik\u0026#34; namespace = kubernetes_namespace.traefik.metadata[0].name version = \u0026#34;34.0.0\u0026#34; values = [ \u0026lt;\u0026lt;-EOT deployment: kind: Deployment replicas: 1 podSecurityContext: fsGroup: 65532 additionalArguments: - \u0026#34;--entrypoints.web.http.redirections.entryPoint.to=websecure\u0026#34; - \u0026#34;--entrypoints.web.http.redirections.entryPoint.scheme=https\u0026#34; ports: web: exposedPort: 80 websecure: exposedPort: 443 service: enabled: true type: LoadBalancer annotations: # Static IP from MetalLB Pool metallb.universe.tf/loadBalancerIPs: 192.168.1.240 persistence: enabled: false # Switched to stateless EOT ] depends_on = [helm_release.longhorn, kubectl_manifest.metallb_config] } Phase 2: Cert-Manager and the Philosophical Pivot # An Ingress without HTTPS is a blunt weapon. To automate the issuance of TLS certificates via Let\u0026rsquo;s Encrypt, I introduced Cert-Manager. This is where the real ideological clash of the day took place.\nThe Initial Error: The temptation of DNS-01 # Initially, I configured Cert-Manager to use the DNS-01 challenge via Cloudflare. The technical advantage is undeniable: it allows for the generation of Wildcard certificates (*.tazlab.net), enormously simplifying subdomain management. I created the integration with Infisical to retrieve the Cloudflare API Token and watched with satisfaction as the first wildcard certificate appeared in the cluster.\nThe Investigation: The betrayal of Agnosticism # As I observed the ready certificate, I realized I was violating the first commandment of the Ephemeral Castle: provider independence. 
By tying the core infrastructure to Cloudflare, I was creating a \u0026ldquo;lock-in.\u0026rdquo; If tomorrow I wanted to donate this project to the community or use it for a client using different DNS, I would have to rewrite the ClusterIssuer logic.\nI decided to take a step back. I destroyed the Cloudflare configuration and switched to the HTTP-01 challenge.\nDeep-Dive: DNS-01 vs HTTP-01 # DNS-01: Cert-manager writes a TXT record in your DNS to prove ownership. It allows wildcards but requires a specific integration for each provider (Cloudflare, Route53, etc.). HTTP-01: Cert-manager exposes a temporary file on port 80. Let\u0026rsquo;s Encrypt reads it and validates the domain. It is universal and agnostic to the DNS, but it does not allow wildcards. For the Castle, agnosticism is more important than the convenience of a single certificate. Every app (Blog, Grafana, etc.) will now request its own specific certificate. It is a cleaner choice and consistent with a modular architecture.\nPhase 3: Error Analysis and \u0026quot;The Ephemeral Way\u0026quot; # The transition from one configuration to another was not painless. During the Traefik update via Terraform, the command timed out.\nThe Struggle: Helm in limbo # I saw the helm_release.traefik resource stuck in a pending-install state. When Terraform times out during a Helm installation, the cluster remains in an inconsistent state: the release exists in the Helm database but Terraform has lost the tracking (state).\nOn the next attempt, I received the error: Error: cannot re-use a name that is still in use\nThe mental resolution process:\nI checked the actual state with helm list -n traefik. I tried to import the resource into the Terraform state (terraform import), but the release was marked as \u0026quot;failed\u0026quot; and not importable. 
I adopted the \u0026quot;Ephemeral\u0026quot; solution: I manually uninstalled Traefik with helm uninstall, removed the resource from the Terraform state (terraform state rm), and deleted the namespace to clean up any remaining PVCs. I relaunched terraform apply. This \u0026quot;clean slate\u0026quot; approach is the heart of the project. Instead of debugging a corrupted Helm database for hours, I bring the system back to state zero and let the declarative code rebuild it correctly.\nPhase 4: The Final Agnostic Configuration # Here is how the universal ClusterIssuer now looks in the Castle. It does not need external API tokens; it only needs an email for Let\u0026rsquo;s Encrypt.\n# letsencrypt-issuer.tf (integrated in main.tf) resource \u0026#34;kubectl_manifest\u0026#34; \u0026#34;letsencrypt_issuer\u0026#34; { yaml_body = \u0026lt;\u0026lt;-EOT apiVersion: cert-manager.io/v1 kind: ClusterIssuer metadata: name: letsencrypt-issuer spec: acme: email: ${var.acme_email} server: https://acme-v02.api.letsencrypt.org/directory privateKeySecretRef: name: letsencrypt-issuer-account-key solvers: - http01: ingress: class: traefik EOT depends_on = [helm_release.cert_manager, helm_release.traefik] } We have also implemented total Zero-Hardcoding. Every IP, every domain (tazlab.net), and every parameter is managed via variables.tf and terraform.tfvars. The code is now an \u0026quot;empty box\u0026quot; ready to be filled with any configuration.\nPost-lab Reflections: What does this setup mean? # With the implementation of Traefik and Cert-Manager (HTTP-01), the Castle has completed its \u0026quot;Core Infrastructure\u0026quot; phase.\nWhat we learned: # Stateless is better: By removing ACME from Traefik and delegating it to Cert-Manager, we made the Ingress Controller totally stateless. We can destroy and recreate it without worrying about losing the certificate .json files. 
Independence has a price: Giving up wildcard certs is a small operational nuisance, but it ensures that the Castle can \u0026quot;land\u0026quot; on any DNS provider without changes to the core code. The Castle is a Factory: The current structure allows cloning the entire Proxmox provider folder, changing three lines in the .tfvars file, and having a new working cluster in less than 10 minutes. Some elements are still missing to define this base as \u0026quot;complete\u0026quot; (Prometheus and Grafana are next on the list), but the path is set. The rest of the work is now in the hands of Flux, which will begin populating the Castle with real applications, starting with the Blog you are reading.\nEnd of Technical Chronicle - Phase 3: Ingress and Certificate Automation\n","date":"30 January 2026","externalUrl":null,"permalink":"/posts/extending-ephemeral-castle-ingress-automation/","section":"Posts","summary":"","title":"The Foundations of Accessibility: Traefik, Cert-Manager, and the Castle's Philosophical Pivot","type":"posts"},{"content":"","date":"29 January 2026","externalUrl":null,"permalink":"/tags/talos/","section":"Tags","summary":"","title":"Talos","type":"tags"},{"content":" The Fortress Walls: Engineering Zero-Trust Security into the Ephemeral Infrastructure # Building an immutable infrastructure is an exercise in discipline, but making it secure without sacrificing portability is an architectural challenge. After laying the foundations of the Ephemeral Castle on Proxmox and establishing the reconciliation loop with Flux, I realized that the foundations were solid but the walls were still vulnerable. 
Secrets resided in SOPS-encrypted YAML files within the Git repository: a functional solution, but one that introduced significant operational friction and too tight a coupling with local encryption keys.\nIn this technical chronicle, I document the transition to a production-grade security model, where trust is never presumed (Zero-Trust) and secrets flow as dynamic entities, never persisted on disk in cleartext.\nThe Spark: The Zero Point of Trust with TazPod # Every fortress needs a key, but where does this key reside when the knight is nomadic? My answer is TazPod. Before I can launch a single Terraform command, I must establish a secure channel to my source of truth: Infisical.\nI decided to use TazPod not just as a development environment, but as a true \u0026ldquo;identity anchor.\u0026rdquo; Through the tazpod pull command, I activate the \u0026ldquo;Ghost Mount.\u0026rdquo; In this state, TazPod creates an isolated Linux namespace and mounts an encrypted memory area where it downloads Infisical session tokens. This step is crucial: the tokens that allow Terraform to read the cluster keys never touch the guest computer\u0026rsquo;s disk in cleartext.\nWhy Infisical? The choice fell on Infisical (EU instance for compliance and latency) to overcome the limits of SOPS. SOPS requires every collaborator (or every CI/CD instance) to possess the Age private key or access to a KMS. With Infisical, I centralized secret management into a platform that offers audit logs, rotation, and, most importantly, native integration with Kubernetes via Machine Identities.\nOnce TazPod was unlocked, I populated the secrets.tfvars file with the Machine Identity\u0026rsquo;s client_id and client_secret. 
This file is the \u0026ldquo;beachhead\u0026rdquo;: it is the only sensitive information needed to start the automation dance, and it is strictly excluded from version control via .gitignore.\nPhase 1: Hardening the Heart - Talos Secretbox and etcd Encryption # Kubernetes, by its nature, stores all resources, including Secret objects, within etcd. If an attacker were to gain access to etcd data files on the Control Plane disk, they could extract every key, certificate, or password in the cluster. In a standard configuration, this data is stored in cleartext.\nThe Technical Reasoning # I decided to implement Talos Secretbox Encryption. Talos allows patching the node configuration to include a 32-byte encryption key (used by the XSalsa20-Poly1305 secretbox cipher) that encrypts data before it is written to etcd.\nWhy not use native Kubernetes encryption (EncryptionConfiguration)? The answer lies in the operational simplicity of Talos. Managing EncryptionConfiguration manually requires creating files on the node and managing rotation via the API server. Talos abstracts this process into its declarative configuration, allowing me to manage the key like any other IaC parameter.\nThe Investigation: The disaster of hot migration # The initial plan involved applying the patch to an already existing cluster. I generated a secure key with:\nopenssl rand -base64 32 I uploaded it to Infisical and updated the Terraform manifest to inject it into the Control Plane. However, at the moment of terraform apply, disaster struck: core cluster Pods began to fail. 
Flux went into CrashLoopBackOff, the helm-controller could no longer read its tokens.\nChecking kube-apiserver logs with talosctl logs, I found the fatal error: \u0026quot;failed to decrypt data\u0026quot; err=\u0026quot;output array was not large enough for encryption\u0026quot;\nThe API server had entered a state of confusion: it was trying to decrypt existing secrets (written in cleartext) using the new Secretbox key, or worse, it had partially encrypted some data, rendering it unreadable. The cluster was corrupted.\nThe Ephemeral Way: Destruction and Rebirth # Faced with a compromised Kubernetes cluster, a traditional administrator would spend hours attempting to repair etcd. But this is the Ephemeral Castle. I decided to honor the project\u0026rsquo;s philosophy: do not repair, recreate.\nI performed an aggressive reset:\nI manually removed \u0026ldquo;ghost\u0026rdquo; resources from the Terraform state (terraform state rm). I destroyed the VMs on Proxmox. I relaunched the entire provisioning. The cluster was reborn in 5 minutes, but this time with the Secretbox active from the very first second of life. Every piece of data written to etcd during the bootstrap process was born already encrypted. This is the true power of immutability: the ability to solve complex problems by returning to a known, clean state.\n# Patch snippet applied in main.tf resource \u0026#34;talos_machine_configuration_apply\u0026#34; \u0026#34;cp_config\u0026#34; { client_configuration = talos_machine_secrets.this.client_configuration machine_configuration_input = data.talos_machine_configuration.controlplane.machine_configuration node = var.control_plane_ip config_patches = [ yamlencode({ machine = { # ... networking and installation ... 
} cluster = { secretboxEncryptionSecret = data.infisical_secrets.talos_secrets.secrets[\u0026#34;TALOS_SECRETBOX_KEY\u0026#34;].value } }) ] } Phase 2: The Dynamic Ambassador - External Secrets Operator (ESO) # With the cluster database secured, the next step was to eliminate the need to store application secrets in the Git repository. SOPS is a great tool, but it introduces a problem: secret rotation requires a new commit and a new push.\nWhy External Secrets Operator? # I chose to install External Secrets Operator (ESO) as a fundamental pillar of the Castle. ESO does not store secrets; it acts as a bridge between Kubernetes and an external provider (Infisical).\nThe advantage is radical: in Git, I write an ExternalSecret object that describes which secret I want and where it should end up in Kubernetes. ESO takes care of contacting Infisical via API, retrieving the value, and creating a native Kubernetes Secret inside the cluster, where it lives in the encrypted etcd rather than in Git. If I change a value on Infisical, ESO picks it up at the next refresh interval and updates the Secret in the cluster, without any Git intervention.\nThe Authentication Challenge: Universal Auth # To have ESO talk to Infisical securely, I avoided using simple static tokens. I implemented the Universal Auth method (Machine Identity).\nThe thought process was this: Terraform creates an initial Kubernetes secret containing the Machine Identity\u0026rsquo;s clientId and clientSecret. Then, it configures a ClusterSecretStore, a resource that instructs ESO on how to authenticate cluster-wide.\nDuring installation, I ran into the rigid schema of ESO version 0.10.3. A configuration error in the ClusterSecretStore blocked synchronization with a laconic InvalidProviderConfig. Analyzing the CRD with:\nkubectl get crd clustersecretstores.external-secrets.io -o yaml I discovered that the fields had changed compared to previous versions. 
The universalAuth section had become universalAuthCredentials and required explicit references to Kubernetes secret keys.\nHere is the final and correct configuration that I integrated directly into the Terraform provisioning:\nresource \u0026#34;kubectl_manifest\u0026#34; \u0026#34;infisical_store\u0026#34; { yaml_body = \u0026lt;\u0026lt;-EOT apiVersion: external-secrets.io/v1beta1 kind: ClusterSecretStore metadata: name: infisical-tazlab spec: provider: infisical: hostAPI: https://eu.infisical.com secretsScope: environmentSlug: ${var.infisical_env_slug} projectSlug: ${var.infisical_project_slug} auth: universalAuthCredentials: clientId: name: ${kubernetes_secret.infisical_machine_identity.metadata[0].name} namespace: ${kubernetes_secret.infisical_machine_identity.metadata[0].namespace} key: clientId clientSecret: name: ${kubernetes_secret.infisical_machine_identity.metadata[0].name} namespace: ${kubernetes_secret.infisical_machine_identity.metadata[0].namespace} key: clientSecret EOT depends_on = [helm_release.external_secrets, kubernetes_secret.infisical_machine_identity] } Phase 3: Modularization and Cleanup - The Castle Factory # The final act of this consolidation day was code refactoring. An ephemeral infrastructure must be replicable. If tomorrow I wanted to create a \u0026ldquo;Green\u0026rdquo; cluster identical to the \u0026ldquo;Blue\u0026rdquo; one but isolated, I shouldn\u0026rsquo;t have to rewrite the code, just change the parameters.\nThe Concept of Zero-Hardcoding # I decided to rigorously apply the principle of Zero-Hardcoding. I removed every static IP, every Infisical folder name, and every repository URL from the main.tf and providers.tf files. Everything was moved to a three-level system:\nvariables.tf: Defines the schema. What data is needed? What type is it? What are the secure defaults? terraform.tfvars: Defines the topology. This is where node IPs, the GitOps repo URL, and Infisical project slugs reside. 
This file is committed: it describes what the castle is, not how to open it. secrets.tfvars: The only forbidden file. It contains the Machine Identity credentials. Thanks to the .gitignore modification, this file stays only on my protected workstation (or in the TazPod vault). # Modularization example in providers.tf provider \u0026#34;infisical\u0026#34; { host = \u0026#34;https://eu.infisical.com\u0026#34; client_id = var.infisical_client_id client_secret = var.infisical_client_secret } provider \u0026#34;proxmox\u0026#34; { endpoint = var.proxmox_endpoint # Proxmox secrets are now dynamically retrieved from Infisical via data source api_token = \u0026#34;${data.infisical_secrets.talos_secrets.secrets[\u0026#34;PROXMOX_TOKEN_ID\u0026#34;].value}=${data.infisical_secrets.talos_secrets.secrets[\u0026#34;PROXMOX_TOKEN_SECRET\u0026#34;].value}\u0026#34; } The Final Farewell to SOPS # With this move, I was finally able to delete proxmox-secrets.enc.yaml. There are no more encrypted files weighing down the repository. The dependency on the SOPS provider in Terraform has been removed. The \u0026ldquo;Castle\u0026rdquo; is now lighter, faster to initialize, and infinitely more secure.\nPost-Lab Reflections: What have we learned? # This implementation phase taught me that security in a modern environment is not a perimeter, but a flow.\nWe have traced a path that starts from the developer\u0026rsquo;s mind (the TazPod passphrase), crosses an encrypted channel in RAM, temporarily materializes in Terraform variables to build the infrastructure, and finally stabilizes in a Kubernetes operator (ESO) that keeps the secret fluid and updatable.\nResults Achieved: # Armored etcd: Even with physical access to Proxmox disks, cluster data is unreadable without the Secretbox key. Clean Git: The repository contains only logic, no keys, not even encrypted ones. 
Total Replicability: I can duplicate the provider folder, change three lines in .tfvars, and have a production-ready new cluster in less than 10 minutes. The Castle now has its walls. It is ready to host the services that will make it alive, knowing that every \u0026ldquo;treasure\u0026rdquo; deposited within it will be protected by modern encryption and an architecture that never forgets its ephemeral nature.\nEnd of Technical Chronicle - Phase 2: Security and Secrets\n","date":"29 January 2026","externalUrl":null,"permalink":"/posts/fortifying-the-ephemeral-castle-security/","section":"Posts","summary":"","title":"The Fortress Walls: Implementing Zero-Trust Security and Secret Management","type":"posts"},{"content":"Architecture is not just a drawing on paper or a manifesto of intent. After outlining the vision of the Ephemeral Castle, it is time to get hands-on with silicon, hypervisors, and declarative code. This is the chronicle of the first implementation phase: the transition from an abstract concept to a functional Kubernetes cluster, born and managed entirely through Infrastructure as Code (IaC).\nI decided to start the journey in my local lab based on Proxmox VE. The choice is not accidental: total control over the hardware allows me to iterate quickly, test the limits of distributed storage, and understand networking dynamics before facing the complexity (and costs) of the public cloud.\nThe Foundation: Talos OS and the Death of SSH # The first critical decision concerned the operating system of the nodes. I chose Talos OS. In a world accustomed to Ubuntu Server or Debian, Talos represents a radical paradigm shift: it is a Linux operating system designed exclusively for Kubernetes. It is immutable, minimal, and, most importantly, it has no SSH shell.\nWhy this extreme choice? In an infrastructure that aims to be \u0026ldquo;ephemeral,\u0026rdquo; the persistence of manual configurations within a node is the enemy. 
By eliminating SSH, I eliminated the temptation to apply \u0026ldquo;temporary fixes\u0026rdquo; that would become permanent. Every modification must pass through the Talos API via YAML configuration files. If a node behaves abnormally, I do not repair it: I destroy it and recreate it.\nDeep-Dive: Immutability and Security # Immutability means that the root filesystem is read-only. There are no package managers like apt or yum. This drastically reduces the attack surface: even if a malicious actor managed to gain access to a process on the node, they could not install rootkits or modify system binaries. The security posture of the whole cluster benefits directly.\nThe DHCP Nightmare and the Transition to Terraform # The initial implementation was far from fluid. During the first tests, I let the nodes acquire IP addresses via DHCP. This was a fundamental error that led to a significant technical incident. After a scheduled restart of the Proxmox server, the DHCP server assigned new addresses to the cluster nodes.\nThe result? The Control Plane became unreachable. kubectl could no longer connect because the certificates were tied to the old IPs, and the etcd quorum was lost. I spent hours attempting to manually patch the nodes with talosctl patch, trying to chase the new network topology.\nIt was here that I realized manual or semi-automated management was not enough. I decided to migrate the entire provisioning to Terraform.\nThe Solution: Declarative Static Networking # I rewrote the Terraform manifests to statically define every network interface. 
This ensures that, regardless of restarts or network fluctuations, the \u0026ldquo;Castle\u0026rdquo; maintains its shape.\n# A glimpse of the providers.tf file with the node configuration resource \u0026#34;proxmox_vm_qemu\u0026#34; \u0026#34;talos_worker\u0026#34; { count = 3 name = \u0026#34;worker-${count.index + 1}\u0026#34; target_node = \u0026#34;pve\u0026#34; clone = \u0026#34;talos-template\u0026#34; # Static network configuration to avoid IP drift ipconfig0 = \u0026#34;ip=192.168.1.15${5 + count.index}/24,gw=192.168.1.1\u0026#34; cores = 4 memory = 8192 # Integration with Talos happens via machine_config # generated through the dedicated Talos provider. } Using Terraform allowed me to map the desired state of the infrastructure. If I want to add a worker, I simply change the count from 3 to 4. Terraform will calculate the difference and interact with the Proxmox APIs to clone the VM, assign the correct IP, and inject the Talos configuration.\nDistributed Storage: The Longhorn Challenge # A cluster without persistent storage is just an academic exercise. For the Ephemeral Castle, I needed a storage system that was as resilient as the cluster itself. The choice fell on Longhorn.\nLonghorn transforms the local disk space of the worker nodes into a distributed and replicated storage pool. However, running Longhorn on an immutable operating system like Talos requires specific precautions. Talos does not include the binaries for iSCSI (needed for mounting volumes) or NBD (Network Block Device) by default.\nError Analysis: The Mount Problem # Initially, pods failed to transition from the ContainerCreating state to Running. Checking system logs with kubectl describe pod, I noticed a recurring error: executable file not found in $PATH referring to iscsid.\nOn a traditional system, I would have installed open-iscsi with a command. 
On Talos, I had to instruct the system to load the necessary kernel modules via the Talos machineConfig, using system extensions.\n# Extract of the Talos configuration to enable iSCSI machine: install: extensions: - image: ghcr.io/siderolabs/iscsi-tools:v0.1.4 - image: ghcr.io/siderolabs/util-linux-tools:v2.39.3 This step is fundamental: it transforms the node from a generic entity into a specialized component of the storage cluster. Once configured, Longhorn began replicating data between nodes, ensuring that even in the event of a total loss of a worker, the blog or database volumes remain accessible.\nGitOps: The Beating Heart with Flux CD # The Ephemeral Castle is not configured manually. Once Terraform has created the VMs and Talos has initialized Kubernetes, Flux CD comes into play.\nFlux is a GitOps operator that keeps the cluster synchronized with a GitHub repository. I created two distinct repositories:\nephemeral-castle: Contains the Terraform code and \u0026ldquo;hardware\u0026rdquo; configurations (IPs, VM resources). tazlab-k8s: Contains the Kubernetes manifests (Deployment, Service, HelmRelease). Why not a single repository? # I decided to separate the infrastructure from the workload. Terraform manages the \u0026ldquo;iron\u0026rdquo; (even if virtual), while Flux manages the application ecosystem. This separation allows for the destruction of the entire cluster while keeping the application logic intact. When the new cluster emerges, Flux detects its presence and starts pulling the manifests, recreating the environment exactly as it was before.\nDeep-Dive: The Reconciliation Loop # The key concept of Flux is the Reconciliation Loop. Flux constantly monitors the Git repository. If I modify the number of replicas of a microservice in the YAML file on GitHub, Flux detects the \u0026ldquo;drift\u0026rdquo; between the current state of the cluster and the desired state in the repository and applies the change in seconds. 
This eliminates the need for manual commands like kubectl apply -f.\nSecurity and Secrets: SOPS and Git Integration # Versioning infrastructure on GitHub carries a risk: secret leakage. Proxmox passwords, SSH keys, API tokens\u0026hellip; none of this should end up in plain text in the repository.\nI adopted SOPS (Secrets Operations) by encrypting sensitive files with Age keys. The resulting files (e.g., proxmox-secrets.enc.yaml) are perfectly safe to push to a private repository. Terraform and Flux are configured to decrypt these files \u0026ldquo;on the fly\u0026rdquo; during execution, ensuring that credentials never touch the disk in unencrypted format.\n# Example of encrypting a secrets file sops --encrypt --age $(cat key.txt) secrets.yaml \u0026gt; secrets.enc.yaml Post-Lab Reflections: What have we learned? # This first stage of the journey has confirmed a fundamental truth of modern DevOps: automation is painful at the beginning, but liberating later.\nConfiguring static IPs in Terraform was slower than assigning them manually on Proxmox. Configuring SOPS was more complex than using environment variables. However, I now have an infrastructure that I can replicate with the press of a button. The Castle is \u0026ldquo;Ephemeral\u0026rdquo; because its physical existence is irrelevant; what matters is the code that defines it.\nNext Steps # The Castle now breathes, but it is naked. In the next chronicles, we will address:\nThe Ingress Controller: Configuring Traefik to manage external traffic and automatic SSL certificate generation with Let\u0026rsquo;s Encrypt. The Hugo Blog: Deploying the site you are currently reading, fully automated via CI/CD. To the Clouds: Replicating this entire architecture on AWS, demonstrating the true portability of the Ephemeral Castle. 
The road is still long, but the foundations have been laid in the concrete of code.\nEnd of Technical Chronicle - Stage 1\n","date":"28 January 2026","externalUrl":null,"permalink":"/posts/implementing-the-ephemeral-castle-proxmox/","section":"Posts","summary":"","title":"From Vision to Silicon: Implementing the Ephemeral Castle on Proxmox","type":"posts"},{"content":"","date":"28 January 2026","externalUrl":null,"permalink":"/categories/tutorials/","section":"Categories","summary":"","title":"Tutorials","type":"categories"},{"content":" Introduction: The Weight of Theory against the Reality of the Metal # In recent weeks, I have dedicated a significant amount of time to building an immutable and secure workstation. However, a perfectly organized workshop is useless if the \u0026ldquo;construction site\u0026rdquo; — my Kubernetes cluster based on Talos Linux and Proxmox — is unable to withstand the impact of a real failure. My mindset today was not geared towards construction, but towards controlled destruction. I wanted to understand where the thread of resilience breaks.\nThe objective of the session was clear: now that the infrastructure is managed via Terraform and boasts 4 worker nodes, it is time to test the promises of High Availability (HA). But, as often happens in distributed systems, what looks like a painless transition on paper can turn into a catastrophic domino effect in reality. In this chronicle, I will document how a simple IP change and a forced shutdown brought the cluster to the brink of collapse, and how I decided to rebuild the foundations to prevent it from happening again.\nPhase 1: Expansion and IaC Consolidation # The first step was aligning the cluster with the new desired configuration. I decided to use Terraform to manage the entire lifecycle of the nodes on Proxmox. 
The use of an Infrastructure as Code (IaC) approach is not just a matter of convenience; it is a necessity to guarantee the replicability of the \u0026ldquo;Ephemeral Castle\u0026rdquo; I wrote about previously.\nI configured 4 worker nodes, distributing workloads so that no single node was a Single Point of Failure (SPOF).\nDeep-Dive: Why 4 Workers and 3 Control Plane nodes? # In Kubernetes, the concept of Quorum is vital. The control plane uses etcd, a distributed database based on the Raft consensus algorithm. etcd stays available only while a majority of its members is reachable, so surviving the loss of one node requires an odd number of members, with 3 as the bare minimum. For the workers, the number 4 allows for the implementation of robust anti-affinity strategies: I can afford to lose a node for maintenance and still have 3 nodes on which to distribute replicas, maintaining high resource density without overloading the hardware.\nPhase 2: The Unexpected Disaster - The IP Change Domino Effect # The test began with an apparently trivial event: changing the IP of the Control Plane node. What was supposed to be a routine update turned into an operational nightmare.\nThe Symptom # Suddenly, internal cluster services stopped communicating. The logs for CoreDNS and Longhorn began showing No route to host or Connection refused errors towards the 10.96.0.1:443 endpoint.\nThe Investigation # I began the investigation by checking the status of the pods with kubectl get pods -A. Many were in CrashLoopBackOff. Analyzing the longhorn-manager logs:\ntime=\u0026#34;2026-01-25T20:45:28Z\u0026#34; level=error msg=\u0026#34;Failed to list nodes\u0026#34; error=\u0026#34;Get \\\u0026#34;https://10.96.0.1:443/api/v1/nodes\\\u0026#34;: dial tcp 10.96.0.1:443: connect: no route to host\u0026#34; The problem was deep: the internal Kubernetes service (kubernetes.default) was still pointing to the old physical IP of the Control Plane (.71) instead of the new one (.253). 
Although I had updated the external kubeconfig, the internal routing tables (managed by kube-proxy and iptables) remained stuck.\nThe Solution: Manual Patching of Endpoints # I decided to intervene surgically on the Endpoints object in the default namespace. This is a risky operation because it is usually managed by the controller manager, but in a state of network partition, manual intervention was the only way.\n# I extracted the configuration, corrected the IP and reapplied it kubectl patch endpoints kubernetes -p \u0026#39;{\u0026#34;subsets\u0026#34;:[{\u0026#34;addresses\u0026#34;:[{\u0026#34;ip\u0026#34;:\u0026#34;192.168.1.253\u0026#34;}],\u0026#34;ports\u0026#34;:[{\u0026#34;name\u0026#34;:\u0026#34;https\u0026#34;,\u0026#34;port\u0026#34;:6443,\u0026#34;protocol\u0026#34;:\u0026#34;TCP\u0026#34;}]}]}\u0026#39; --kubeconfig=kubeconfig Immediately after, I forced a restart of coredns and kube-proxy. The network began to breathe again, but the wounds were still open at the storage level.\nPhase 3: The Longhorn Deadlock and RWO Storage # Once the network was resolved, I faced the harsh reality of distributed storage. I had forcibly shut down some nodes during the instability phase.\nThe Problem: Ghost Volumes # Longhorn uses RWO (ReadWriteOnce) volumes. This means a volume can be mounted by only one node at a time. When the worker-new-03 node was abruptly shut down, the Kubernetes cluster marked it as NotReady, but Longhorn maintained the \u0026ldquo;lock\u0026rdquo; on the Traefik volume, thinking the node might return at any moment.\nI saw the new Traefik pod stuck in ContainerCreating for minutes, with this error in the events: Multi-Attach error for volume \u0026quot;pvc-...\u0026quot; Volume is already exclusively attached to one node and can't be attached to another.\nError Analysis: Why doesn\u0026rsquo;t it unlock itself? # I analyzed the behavior: Kubernetes waits about 5 minutes before evicting pods from a dead node. 
However, even after eviction, the CSI (Container Storage Interface) does not detach the volume unless it receives confirmation that the original node is powered off. It is a protection measure against data corruption (Split-Brain).\nThe Solution: Forcing the Cluster\u0026rsquo;s Hand # I decided to proceed with an aggressive cleanup of VolumeAttachments and zombie pods.\n# Forced deletion of the zombie pod kubectl delete pod traefik-79fcb6d7fd-pwp9v -n traefik --force --grace-period=0 # Removal of the stale VolumeAttachment kubectl delete volumeattachment csi-5f3b43f479e048a26187... --kubeconfig=kubeconfig Only after these actions did Longhorn allow the new node to \u0026ldquo;take possession\u0026rdquo; of the disk. This taught me that a forced shutdown in an environment with RWO storage almost always requires human intervention to restore service availability.\nPhase 4: The Traefik Limit and the Necessity of Statelessness # During replica testing, I tried to increase the number of Traefik instances to 2. The result was an immediate failure.\nThe Reasoning: Why did I want 2 replicas? # From a High Availability perspective, having only one Ingress Controller instance is an unacceptable risk. If the node hosting Traefik dies, the blog goes down (as we saw in the test). A normal Deployment should allow me to scale horizontally.\nThe Clash with Reality # Traefik is configured to generate SSL certificates via Let\u0026rsquo;s Encrypt and save them in an acme.json file. To persist these certificates across restarts, I used a Longhorn volume. Here lies the architectural error: since the volume is RWO, the second Traefik replica could not start because the disk was already occupied by the first one.\nI decided, therefore, to temporarily maintain a single replica, but I have mapped out a plan to migrate to cert-manager. 
By using Kubernetes Secrets for certificates, Traefik will become completely stateless, allowing us to scale to 3 or more replicas without disk conflicts.\nPhase 5: The 5-Minute Test - Automation vs. Prudence # I wanted to conduct one final scientific experiment: shut down a node and time how long it takes for the cluster to react on its own.\nT+0: Forcibly shut down worker-new-01. T+1: The node is NotReady. The pod is still considered Running. T+5: Kubernetes marks the pod as Terminating and creates a new one on another node. T+8: The new pod is still in Init:0/1, blocked by the Longhorn volume. Test Conclusion # Kubernetes automation works for compute, but fails for RWO persistent storage in the event of sudden hardware failures. Without a Fencing system (which physically shuts down the node via Proxmox API), automatic recovery is not guaranteed in a short timeframe.\nPost-Lab Reflections: The Roadmap towards Zero Trust # This session of \u0026ldquo;stress and suffering\u0026rdquo; was more instructive than a thousand clean installations. I learned that resilience is not a button you press, but a balance built piece by piece.\nWhat does this mean for long-term stability? # The cluster is now much more solid because:\nStatic IPs and VIP: I moved all management to the .250 VIP. If a control node dies, the kubeconfig does not need to change. Network Configuration: I corrected internal routes, ensuring that system components talk to the correct API. Storage Management: I now know Longhorn\u0026rsquo;s limits and how to intervene in case of a deadlock. Next Steps # I have already budgeted for two major tasks:\nTraefik Restructuring: Migration to cert-manager to eliminate RWO volumes and allow multi-replica. Etcd Security: Implementation of secretbox and Disk Encryption on Talos to protect secrets at rest. In conclusion, the TazLab cluster has passed its baptism by fire. 
It is not yet perfect, but it has become a system capable of failing with dignity and being repaired with surgical precision. The road to the \u0026ldquo;Ephemeral Castle\u0026rdquo; continues, one deadlock at a time.\n","date":"26 January 2026","externalUrl":null,"permalink":"/posts/tazlab-resilience-stress-test/","section":"Posts","summary":"","title":"Baptism by Fire: Resilience, Deadlock, and Disaster Recovery in the TazLab Cluster","type":"posts"},{"content":"","date":"25 January 2026","externalUrl":null,"permalink":"/tags/digital-nomad/","section":"Tags","summary":"","title":"Digital Nomad","type":"tags"},{"content":" Introduction: The Paradox of Persistence # In my journey of technological evolution, I have always fought against the \u0026ldquo;physical constraint.\u0026rdquo; We began by making the workstation immutable with the TazPod project, transforming my development environment into a secure, encrypted, and portable enclave. But a workstation without its cluster is like a craftsman without his workshop.\nToday, I want to talk to you about the next phase: the transformation of my entire Kubernetes cluster into an Ephemeral Castle.\nThe objective is radical: to go beyond the traditional concept of Infrastructure as Code (IaC) to arrive at an infrastructure that is, by definition, placeless. It does not matter if my local Proxmox server explodes or if the power is cut while I am on the other side of the world. If I have a laptop with Linux and an internet connection, my entire digital world must be able to be reborn in 10 minutes.\nThe Disaster Scenario (and the Nomadic Response) # Imagine this scenario: I am traveling, and my home cluster is unreachable. Perhaps a fatal hardware failure or a prolonged blackout. In the past, this would have meant the end of productivity.\nToday, the procedure is almost ritualistic:\nI take any Linux computer. I download the TazPod static binary. 
I execute the \u0026ldquo;Ghost Mount\u0026rdquo;: I enter my passphrase, TazPod contacts Infisical and downloads my identities into an encrypted memory area. I am operational again. I have the keys, I have the tools, I have the knowledge. From this moment, the reconstruction of the castle begins.\nThe TazPod: The Zero Trust Swiss Army Knife # TazPod is not just a container; it is my digital toolbox. Thanks to its architecture in Go and the use of Linux Namespaces, it guarantees that my credentials never touch the \u0026ldquo;guest\u0026rdquo; computer\u0026rsquo;s disk in plain text.\nWith instant access (less than 2 minutes), TazPod provides me with the bridge to the cloud. The decoupling between physical hardware and my security is total. I do not trust the PC I am using; I only trust the encryption that TazPod manages for me.\nTerraform and Flux: Recreating the Castle in 10 Minutes # The strength of the rebirth lies in the union between Terraform and the GitOps philosophy of Flux.\n1. The Ground (Terraform) # I launch a Terraform command. In a few minutes, the nodes are allocated on a cloud provider (e.g., AWS). It is not a massive cluster, but the \u0026ldquo;minimum requirement\u0026rdquo; for High Availability (HA): 3 Control Plane nodes and 2 Workers. Terraform dynamically configures what is needed: whether it is S3 for storage or DNS pointing on Cloudflare.\n2. The Foundations and Walls (Flux) # Once the nodes are ready, Terraform installs only one component: FluxCD. Flux is the castle\u0026rsquo;s butler. It connects to my private Git repositories and begins reading the manifests. In a cascade of automation, Flux recreates:\nNetworking and Ingress (Traefik). Security policies and certificates (Cert-Manager). All my application services, from the blog to monitoring tools. 3. The Treasures (The Return of Data) # An empty castle is useless. The data, the true value, is retrieved from encrypted backups on S3. 
Thanks to Longhorn or native restore mechanisms, volumes are repopulated.\nIn about 7-10 minutes, I point the Cloudflare DNS towards the new public IP of the LoadBalancer. The world noticed nothing, but my cluster was reborn on another continent, on different hardware, with the exact same configuration as before.\nConclusion: Freedom is an Algorithm # This vision transforms infrastructure into something gaseous, capable of expanding or condensing wherever necessary. I am no longer bound to a place or a physical device.\nMy cluster is ephemeral because it can die at any time without pain. It is portable because it lives in my Git repositories. It is secure because the keys to unlock it live only in my mind and in my TazPod.\nThis is the quintessence of resilience in the digital age: owning nothing physical that cannot be recreated by a line of code in less than 10 minutes. The castle is in the air, and I have the keys to make it land wherever I am.\n","date":"25 January 2026","externalUrl":null,"permalink":"/posts/the-ephemeral-castle-vision/","section":"Posts","summary":"","title":"The Ephemeral Castle: Towards a Nomadic and Zero Trust Infrastructure","type":"posts"},{"content":" Introduction: The Species Jump of the Homelab # Managing a Kubernetes cluster in a home lab is often an act of love, a mixture of hand-written YAMLs and small manual adjustments via GUI. However, there comes a time when complexity exceeds the memory capacity of its administrator. In Tazlab, that moment arrived today. The goal was clear: to stop treating cluster nodes as \u0026ldquo;pets\u0026rdquo;, each with its own name and history, and start treating them as \u0026ldquo;cattle\u0026rdquo;: fungible, identical, and reproducible resources.\nI decided to introduce Terraform to manage the lifecycle of the Talos Linux cluster hosted on Proxmox. This was not a triumphal march, but an honest chronicle of permission errors, virtual hardware conflicts, and cryptographic decoding issues. 
Here is how I transformed Tazlab into a true infrastructure defined by code.\nPhase 1: Tool Selection and the Silent Architecture # Before writing a single line of HCL (HashiCorp Configuration Language) code, I had to face the choice of Providers. In the Proxmox world, two main currents exist: the legacy provider by Telmate and the modern bpg provider.\nI decided to opt for bpg/proxmox. The reason lies in its ability to manage Proxmox objects with superior granularity, especially regarding snippets and SDN configuration. Telmate, although historical, suffers from chronic instability in detecting configuration drift on network interfaces in Proxmox 8.x versions. In a professional IaC (Infrastructure as Code) architecture, drift detection must be precise: Terraform must not propose changes if nothing has changed in reality.\nThe Importance of etcd Quorum # Another critical decision concerned the Control Plane. I initially considered creating additional control plane nodes, but I had to reflect on the concept of Quorum. In a distributed system based on etcd like Kubernetes, quorum requires an absolute majority ($\\lfloor n/2 \\rfloor + 1$). Moving from one to two control plane nodes would paradoxically reduce reliability: with two members the quorum is also two, so if one of the two fell, the cluster would remain blocked. I therefore decided to maintain a single control plane node for now, concentrating automation on the horizontal scalability of worker nodes.\nPhase 2: Permissions Setup - The First Barrier # Automation requires an identity. One cannot (and must not) use the root@pam user for Terraform. I had to create a dedicated user and a role with the minimum necessary privileges. This step revealed one of the first pitfalls: official documentation often omits granular permissions that become critical during execution.\nI had to modify the TerraformAdmin role on Proxmox several times. The most subtle error was related to the QEMU Guest Agent. 
Without the VM.GuestAgent.Audit permission, Terraform could not query Proxmox to learn the IP address assigned by DHCP, and it entered an infinite waiting loop.\nProxmox Setup Code (Shell): # # Create the dedicated role with granular permissions pveum role add TerraformAdmin -privs \u0026#34;Datastore.AllocateSpace Datastore.AllocateTemplate Datastore.Audit Pool.Allocate Pool.Audit Sys.Audit Sys.Console Sys.Modify VM.Allocate VM.Audit VM.Clone VM.Config.CDROM VM.Config.Cloudinit VM.Config.CPU VM.Config.Disk VM.Config.HWType VM.Config.Memory VM.Config.Network VM.Config.Options VM.PowerMgmt SDN.Use VM.GuestAgent.Audit VM.GuestAgent.Unrestricted\u0026#34; # Create the user and generate the API token pveum user add terraform-user@pve pveum aclmod / -user terraform-user@pve -role TerraformAdmin pveum user token add terraform-user@pve terraform-token --privsep=0 Phase 3: Scaffolding and the \u0026ldquo;Secrets Debt\u0026rdquo; # I structured the Terraform project modularly to separate responsibilities: versions.tf for plugins, variables.tf for the data schema, data.tf for reading secrets, and main.tf for business logic.\nSOPS Integration # Tazlab uses SOPS with Age encryption. This was the most interesting challenge. Terraform must decrypt Talos YAML files to extract the Certificate Authority (CA) and join tokens. I encountered a frustrating problem: certificates saved in SOPS were Base64 encoded and often contained invisible newline characters (\\n) that crashed Talos validation.\nI decided to solve the problem \u0026ldquo;at the source\u0026rdquo; in the data.tf file, implementing an aggressive string cleaning logic. 
Without this transformation, the worker node received a corrupted certificate and refused to join the cluster, remaining in a perennial \u0026ldquo;Maintenance Mode\u0026rdquo; state.\nterraform/data.tf: # # Decrypt the Proxmox and Talos secrets via SOPS data \u0026#34;sops_file\u0026#34; \u0026#34;proxmox_secrets\u0026#34; { source_file = \u0026#34;proxmox-secrets.enc.yaml\u0026#34; } data \u0026#34;sops_file\u0026#34; \u0026#34;controlplane_secrets\u0026#34; { source_file = \u0026#34;../talos/controlplane-reference.yaml\u0026#34; } data \u0026#34;sops_file\u0026#34; \u0026#34;worker_secrets\u0026#34; { source_file = \u0026#34;../talos/worker-reference.yaml\u0026#34; } locals { # Multi-document handling and Base64 cleanup parts = split(\u0026#34;---\u0026#34;, data.sops_file.controlplane_secrets.raw) cp_raw = yamldecode(local.parts[0] == \u0026#34;\u0026#34; ? local.parts[1] : local.parts[0]) cluster_secrets = { token = trimspace(local.cp_raw.machine.token) # Strip newlines and spaces, then decode to PEM ca_crt_b64 = replace(replace(local.cp_raw.machine.ca.crt, \u0026#34;\\n\u0026#34;, \u0026#34;\u0026#34;), \u0026#34; \u0026#34;, \u0026#34;\u0026#34;) ca_key_b64 = replace(replace(local.cp_raw.machine.ca.key, \u0026#34;\\n\u0026#34;, \u0026#34;\u0026#34;), \u0026#34; \u0026#34;, \u0026#34;\u0026#34;) ca_crt = base64decode(replace(replace(local.cp_raw.machine.ca.crt, \u0026#34;\\n\u0026#34;, \u0026#34;\u0026#34;), \u0026#34; \u0026#34;, \u0026#34;\u0026#34;)) # Centralized decode logic } } Phase 4: The Fight against Virtual Hardware # Provisioning a Talos VM on Proxmox does not follow standard Cloud-Init rules. Talos expects the configuration to be \u0026ldquo;pushed\u0026rdquo; via its APIs on port 50000.\nI encountered a critical hardware conflict: Proxmox, by default, assigns the Cloud-Init drive to the ide2 interface. However, I was also using the ide2 interface to mount the Talos ISO. 
This silent conflict prevented Talos from reading the static network configuration, forcing the VM to request an IP via DHCP (often outside the desired range) or, worse, to have no connectivity at all.\nI decided to move the ISO to the ide0 interface, freeing port ide2 for the initialization bus. This move, apparently trivial, was the key to obtaining deterministic static IPs on an immutable system.\nterraform/main.tf (Excerpt): # resource \u0026#34;proxmox_virtual_environment_vm\u0026#34; \u0026#34;worker_nodes\u0026#34; { for_each = var.worker_nodes name = each.key node_name = var.proxmox_node # Hardware alignment with the existing physical nodes scsi_hardware = \u0026#34;virtio-scsi-single\u0026#34; agent { enabled = true # Crucial for IP visibility in the GUI } disk { datastore_id = \u0026#34;local-lvm\u0026#34; interface = \u0026#34;scsi0\u0026#34; size = each.value.disk_size iothread = true } # Disk dedicated to Longhorn: distributed storage requires raw disks disk { datastore_id = \u0026#34;local-lvm\u0026#34; interface = \u0026#34;scsi1\u0026#34; size = each.value.data_disk iothread = true } initialization { ip_config { ipv4 { address = \u0026#34;${each.value.ip_address}/24\u0026#34; gateway = var.gateway } } } cdrom { enabled = true file_id = \u0026#34;local:iso/nocloud-amd64.iso\u0026#34; # Custom Image Factory ISO interface = \u0026#34;ide0\u0026#34; # Resolves the IDE conflict } } Phase 5: The \u0026ldquo;Technical Debt\u0026rdquo; and the Image Factory # During the creation of the first worker (worker-new-01), I noticed that Longhorn pods remained in CrashLoopBackOff. Analysis of the logs with kubectl logs revealed the absence of the iscsiadm binary inside the operating system.\nI realized that Talos Linux, in its standard version, is too minimal for Longhorn. 
The existing cluster nodes were using an image generated through the Talos Image Factory that included the iscsi-tools extension and the qemu-guest-agent.\nInstead of destroying the node, I decided to perform an In-Place Upgrade via API:\ntalosctl upgrade --image factory.talos.dev/installer/e187c9b90f773cd8c84e5a3265c5554ee787b2fe67b508d9f955e90e7ae8c96c:v1.12.0 This \u0026ldquo;settled the technical debt\u0026rdquo;. I then immediately updated the Terraform code to point to this factory image for all future nodes, ensuring cluster homogeneity.\nPhase 6: Hugo and Cloud-Native Scalability # Once the node fleet was stabilized, I tested scalability with the Hugo blog application. The blog used a PersistentVolumeClaim (PVC) in ReadWriteOnce (RWO) mode. Scaling to 3 replicas, I saw the dreaded Multi-Attach error appear.\nRWO allows mounting a disk on only one node at a time. Kubernetes, trying to distribute pods across my 3 new workers to ensure high reliability, clashed with the physical limit of the volume.\nI decided to implement a Shared-Nothing approach using an emptyDir.\nWhat is an emptyDir? It is a temporary volume that lives as long as the pod is active, created on the node\u0026rsquo;s local disk. Why for Hugo? Hugo is a static site generator. Its source data is downloaded from Git via a sidecar container (git-sync). A centralized persistent disk is not needed if each pod can download its local copy in a few seconds. This change allowed the blog to scale to 3 replicas instantly, each residing on a different worker, without any storage conflict.\nPhase 7: Final Security with Terraform Cloud # The last act was solving the problem of the terraform.tfstate file. As I explained during the process, the Terraform state contains all decrypted secrets in clear text. Keeping this file on the hard drive is an unacceptable risk.\nI decided to migrate the state to HCP Terraform (Terraform Cloud), but with a specific configuration: Local Execution Mode. 
In this mode, Terraform executes commands on my PC (thus being able to reach the local Proxmox IP and use my Age key), but sends the encrypted state to HashiCorp\u0026rsquo;s secure servers. I removed every local trace of .tfstate, eliminating the possibility of credential theft from the file system.\nterraform/versions.tf (Cloud Configuration): # terraform { required_version = \u0026#34;\u0026gt;= 1.5.0\u0026#34; cloud { organization = \u0026#34;tazlab\u0026#34; workspaces { name = \u0026#34;tazlab-k8s\u0026#34; } } # ... provider ... } Post-Lab Reflections: What have we learned? # The introduction of Terraform in Tazlab was not just the addition of a tool, but a change of mentality. I learned that:\nAbstraction has a cost: Terraform simplifies creation but requires deep knowledge of the underlying APIs (Proxmox in this case). Secrets are alive: Managing secrets does not just mean hiding them, but knowing how to transform them (Base64 vs PEM) to make them digestible for machines. Architecture beats persistence: We often try to solve storage problems with complex volumes when a simple emptyDir and a good synchronization process are more effective. Today Tazlab has 3 new workers. Tomorrow it could have 30. I just need to add a line of text. 
This is the true freedom of Infrastructure as Code.\n","date":"24 January 2026","externalUrl":null,"permalink":"/posts/tazlab-iac-chronicle/","section":"Posts","summary":"","title":"From Craftsmanship to Infrastructure: Chronicle of the Introduction of Terraform in Tazlab","type":"posts"},{"content":"","date":"20 January 2026","externalUrl":null,"permalink":"/tags/linux-namespaces/","section":"Tags","summary":"","title":"Linux Namespaces","type":"tags"},{"content":"","date":"20 January 2026","externalUrl":null,"permalink":"/tags/open-source/","section":"Tags","summary":"","title":"Open Source","type":"tags"},{"content":" Introduction: The Phoenix Moment # In the previous episode of this technical diary, I documented the dramatic failure of the attempt to transform DevPod into a Zero Trust enclave. The fundamental conflict between DevPod\u0026rsquo;s \u0026ldquo;Convenience-First\u0026rdquo; architecture and my security requirements led to an inevitable conclusion: I had to abandon the tool completely.\nHowever, as every engineer knows, failure is often the mother of innovation. The ashes of DevPod became the fertile ground for something new: TazPod, a custom CLI in Go designed from scratch to address the specific security challenges that DevPod could not handle.\nThis is the story of how I built TazPod from v1.0 to v9.9, transforming it from the fragility of Bash scripts to the robustness of Go, from global mounts to namespace isolation, and from convenience compromises to true Zero Trust security.\nPhase 1: The Foundation in Go - TazPod v1.0 # The first technical decision was radical: abandon the idea of an environment that \u0026ldquo;magically\u0026rdquo; self-configures via SSH. I needed determinism.\nThe Reasoning: Why Go? # After the nightmare of Bash scripts in DevPod, I needed a language with:\nStrong typing to prevent runtime errors. Excellent integration with Docker through the SDK. Cross-platform compilation for future portability. 
Robust error handling without the fragility of Bash \u0026ldquo;traps\u0026rdquo;. Go offers a critical advantage for a tool of this type: direct access to operating system syscalls and the ability to compile into a single static binary.\nThe Architecture: Command-First Design # I structured TazPod around a central set of commands, managed by a main switch in main.go. This approach transforms the container into a \u0026ldquo;Development Daemon\u0026rdquo;. It is there, waiting (sleep infinity), but inert. The magic happens when we enter it.\n// cmd/tazpod/main.go (Snippet of the up function) func up() { // ... loading configuration ... runCmd(\u0026#34;docker\u0026#34;, \u0026#34;run\u0026#34;, \u0026#34;-d\u0026#34;, \u0026#34;--name\u0026#34;, cfg.ContainerName, \u0026#34;--privileged\u0026#34;, // Necessary to mount loop devices \u0026#34;--network\u0026#34;, \u0026#34;host\u0026#34;, \u0026#34;-e\u0026#34;, \u0026#34;DISPLAY=\u0026#34;+display, \u0026#34;-v\u0026#34;, cwd+\u0026#34;:/workspace\u0026#34;, // Mount current project \u0026#34;-w\u0026#34;, \u0026#34;/workspace\u0026#34;, cfg.Image, \u0026#34;sleep\u0026#34;, \u0026#34;infinity\u0026#34;) // The container stays alive waiting } The first implementation was essentially a direct translation of the Bash scripts. It worked, but it still suffered from the same global mount problem that plagued DevPod. Anyone with docker exec access could see the secrets.\nPhase 2: The Security Breakthrough - TazPod v2.0 (Ghost Edition) # During a security review on January 17th, I identified a critical flaw: if I unlocked the vault and another user accessed the container, they could read all the secrets. 
The solution came from an unexpected source: Linux Mount Namespaces.\nThe Concept: \u0026ldquo;Ghost Mode\u0026rdquo; # The idea was revolutionary: instead of mounting the vault globally, create an isolated namespace where only the current session could see the mounted secrets.\nIn Linux, mount points are global per namespace. If I create a new mount namespace and mount a disk inside it, that disk exists only for the processes living in that namespace. For the parent process (and for the host), that mount point is simply an empty directory.\nThe Implementation: unshare Magic # The key was using unshare -m to create a new mount namespace. Here is what happens \u0026ldquo;under the hood\u0026rdquo; when a user types the vault password:\nTrigger: The user launches tazpod pull. Fork \u0026amp; Unshare: The Go binary executes itself with elevated privileges using unshare: sudo unshare --mount --propagation private /usr/local/bin/tazpod internal-ghost Enclave Creation: The new internal-ghost process is born in a parallel mount universe. Decryption: Inside this universe, we use cryptsetup to open the vault.img file (mounted via loop device) and mount it to /home/tazpod/secrets. Drop Privileges: Once the disk is mounted, the process \u0026ldquo;downgrades\u0026rdquo; its privileges from root to user tazpod and launches a Bash shell. The Result:\nYou (in the ghost shell): See the secrets, use kubectl, work normally. Intruders (in other shells): See an empty ~/secrets directory. Exit: When you exit, the namespace disappears, taking the mount with it. Phase 3: The IDE Revolution - TazPod v3.0 # With DevPod gone, I lost the integrated VS Code experience. I decided to embrace a pure terminal workflow with Neovim (LazyVim configuration).\nThe LazyVim Integration # I invested significant time perfecting the Neovim setup directly in the base Docker image. 
I wanted the IDE to be ready immediately, without having to wait for plugin downloads on the first startup.\n# LazyVim installation and headless plugin sync RUN git clone https://github.com/LazyVim/LazyVim ~/.config/nvim \u0026amp;\u0026amp; \\ nvim --headless \u0026#34;+Lazy! sync\u0026#34; +qa \u0026amp;\u0026amp; \\ nvim --headless \u0026#34;+MasonInstall all\u0026#34; +qa The Result: A complete development environment ready in seconds, with Tree-sitter, LSP, and all plugins pre-compiled.\nPhase 4: The Battle for Infisical Persistence # Having solved the filesystem isolation, I had to tackle identity management. I use Infisical to manage centralized secrets. However, Infisical needs to save a local session token (usually in ~/.infisical).\nIf the container is ephemeral, I would have to log in at every restart. Unacceptable. If I save the token on a Docker volume, it is exposed in cleartext on the host. Unacceptable.\nThe Investigation: The \u0026ldquo;Cannibal\u0026rdquo; Bug # The idea was simple: move the .infisical folder inside the encrypted vault and use a bind-mount to make it appear in the user home only when the vault is open.\nDuring the Go implementation, I encountered a critical bug that I nicknamed \u0026ldquo;The Cannibal\u0026rdquo;. The migration function, designed to move old tokens into the vault, had a logic flaw that led to the deletion of content if the paths coincided.\nThe Solution: The Armored Bridge # I rewrote the logic implementing rigorous checks:\nPreliminary Check: I verify if the mount is already active by reading /proc/mounts. Double Bridge: I mount both the configuration (.infisical) and the system keyring (infisical-keyring) inside the vault (.infisical-vault and .infisical-keyring). Recursive Ownership: A recurring problem was that files created during the mount (by root) were not readable by the user. I added a forced chown -R tazpod:tazpod on the entire .tazpod structure at every init or mount operation. 
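The preliminary check from the list above can be sketched as follows. This is a minimal Python illustration and not TazPod's actual Go code: the function names, the /dev/mapper/vault device and the sample mount table are assumptions made for the example, while /home/tazpod/secrets is the mount point described earlier.

```python
def is_mounted(target, mounts_text):
    # Each line of /proc/mounts reads: device mountpoint fstype options dump pass
    # Paths containing spaces appear octal-escaped there; this sketch ignores that case.
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == target:
            return True
    return False


def read_mounts(path='/proc/mounts'):
    # Linux-only: returns the mount table of the calling process's mount namespace
    with open(path, encoding='utf-8') as handle:
        return handle.read()


# Hypothetical mount table used instead of the real file, so the sketch runs anywhere
SAMPLE = '''/dev/mapper/vault /home/tazpod/secrets ext4 rw,relatime 0 0
overlay / overlay rw,relatime 0 0
'''

if __name__ == '__main__':
    print(is_mounted('/home/tazpod/secrets', SAMPLE))
    print(is_mounted('/home/tazpod/.infisical', SAMPLE))
```

On a real system, passing read_mounts() instead of SAMPLE would inspect the live table; because the file reflects the caller's mount namespace, the same check gives different answers inside and outside the ghost shell.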
Now, the session survives restarts, but physically exists only inside the encrypted vault.img file.\nPhase 5: From Hack to Product (TazPod v9.9) # At this point, I had a working but rough system. To make it a true \u0026ldquo;Zero Trust\u0026rdquo; tool usable by others, deep cleaning and standardization were needed.\nStandardization and \u0026ldquo;Smart Init\u0026rdquo; # I introduced the tazpod init command. Instead of having to manually copy configuration files, the CLI now analyzes the current directory and generates:\nA hidden .tazpod/ folder. A pre-filled config.yaml, which lets you choose the \u0026ldquo;vertical\u0026rdquo; (base, k8s, gemini) via an argument (e.g., tazpod init gemini). A secrets.yml template to map Infisical environment variables. A .gitignore that automatically excludes the vault and local AI memory (mounted in ./.gemini to persist project memories). The Name Collision Problem # When launching multiple TazPod projects simultaneously, I noticed that Docker reported conflicts on container names (tazpod-lab). I implemented dynamic naming logic in Go in version v9.9:\ncwd, _ := os.Getwd() dirName := filepath.Base(cwd) rng := rand.New(rand.NewSource(time.Now().UnixNano())) containerName := fmt.Sprintf(\u0026#34;tazpod-%s-%d\u0026#34;, dirName, rng.Intn(9000000)+1000000) Now every project has its own identity, which makes it possible to work on multiple clusters or clients in parallel without overlaps.\nPost-Development Reflections # The transition from DevPod to TazPod was an exercise in subtraction. I removed the graphical interface, removed the synchronization agent, removed the managed SSH abstraction.\nIn return, I gained:\nVerifiable Security: I know exactly where every byte of sensitive data resides (in the RAM of the Ghost process). Total Portability: The project is self-contained. All you need is Docker and the TazPod binary. Speed: Without agent overhead, shell startup is instantaneous once the image is downloaded. 
The Project on GitHub # I decided to release TazPod as an Open Source project under the MIT license. It is not just a personal script, but a complete framework for those who, like me, live in the terminal and do not want compromises on security.\nInstallation is now reduced to a single line:\ncurl -sSL https://raw.githubusercontent.com/tazzo/tazpod/master/scripts/install.sh | bash For more technical details and to consult the complete project documentation, I invite you to visit the official repository on GitHub: https://github.com/tazzo/tazpod.\nThis journey confirms that in modern DevOps, building your own tools is not reinventing the wheel, but often the only way to ensure that the wheel turns exactly as required by the security constraints of critical infrastructure.\nThe next step? Using TazPod to complete the Terraform refactoring of the TazLab cluster, knowing that the access keys are finally safe.\n","date":"20 January 2026","externalUrl":null,"permalink":"/posts/tazpod-rising-go-cli-zero-trust/","section":"Posts","summary":"","title":"TazPod Rising: From DevPod Ashes to a Go-Powered Zero Trust CLI","type":"posts"},{"content":"","date":"14 January 2026","externalUrl":null,"permalink":"/tags/devpod/","section":"Tags","summary":"","title":"Devpod","type":"tags"},{"content":" Introduction: The Illusion of Total Control # In the first part of this technical diary, I outlined the architecture of an immutable workstation based on DevPod. The goal was ambitious: a \u0026ldquo;Golden Image\u0026rdquo; containing every tool necessary for orchestrating my Kubernetes cluster (Proxmox, Talos, Longhorn), eliminating the entropy of local configuration. However, as every engineer knows, the transition from theory to practice exposes flaws that no planning can completely foresee.\nIn this session, I set an even more extreme goal: transforming the DevPod into a Zero Trust environment. 
I didn\u0026rsquo;t just want a container with my tools; I wanted a secure enclave where critical secrets (Kubeconfig, SSH keys, API tokens) would never reside on disk in plain text, even within the isolated container.\nThe mindset of the day was driven by constructive paranoia. I asked myself: \u0026ldquo;If someone physically compromised my laptop or managed to execute an unauthorized command in the container, what would they find?\u0026rdquo;. The answer had to be: \u0026ldquo;Absolutely nothing.\u0026rdquo;\nThis is the technical chronicle of how I tried to bend DevPod to this radical security vision, clashing with its own architecture oriented towards convenience, until reaching the inevitable decision to abandon the tool and start over on different foundations.\nPhase 1: Image Refactoring and the Cache Nightmare # Before addressing security, I had to solve a problem of architectural efficiency. My original Dockerfile was becoming an unmanageable monolith. Every small change to the dotfiles required a complete rebuild of the entire image, a process that consumed bandwidth and precious time.\nThe Reasoning: Layered Architecture # I decided to decompose the image into three distinct logical layers:\nBase Layer (Dockerfile.base): The foundation of the operating system, security tools (Infisical, SOPS), and stable binaries (Eza, Neovim, Starship). Kubernetes Layer (Dockerfile.k8s): The specific stack for orchestration (Kubectl, Helm, Talosctl). AI Layer (Dockerfile.gemini): The heavy Gemini CLI, which requires a dedicated Node.js runtime. Conceptual Deep-Dive: Docker Layer Caching Layer caching in Docker works according to a deterministic logic: if the content of an instruction (such as a RUN or COPY command) does not change, Docker reuses the previously built layer. This is fundamental for continuous integration (CI/CD). However, if a layer at the base of the chain changes, all subsequent layers are invalidated and must be rebuilt. 
By separating stable tools from heavy or frequently updated ones, I sought to maximize iteration speed.\nThe Symptom: The \u0026ldquo;Invisible\u0026rdquo; Cache # During testing, I stumbled upon a frustrating behavior. I had updated the Starship theme in the dotfiles (switching from Gruvbox to a more restful Pastel Powerline), but despite running the build, the container continued to present itself with the old theme.\nChecking the build logs, I noticed the infamous =\u0026gt; CACHED label right on the COPY dotfiles/ command. Docker did not detect that the files inside the host folder had changed.\nThe Solution: Dynamic Cache Busting # To force Docker to invalidate the cache at the exact desired point, I introduced a dynamic build argument.\n# Dockerfile.base snippet # ... stable tools ... # Argument to force dotfiles update ARG CACHEBUST=1 RUN echo \u0026#34;Cache bust: ${CACHEBUST}\u0026#34; # Now Docker is forced to re-execute the copy if CACHEBUST changes COPY --chown=vscode:vscode dotfiles/ /home/vscode/ By launching the build with --build-arg CACHEBUST=$(date +%s), I injected the current timestamp into the process. Since the RUN echo command changed every second, Docker was mathematically obliged to rebuild that layer and all subsequent ones, guaranteeing the injection of the new configuration files.\nPhase 2: The RAM Enclave and the Kernel Conflict # Having solved the cache problem, I moved to the heart of the project: the Encrypted Vault. The idea was to create a LUKS (Linux Unified Key Setup) volume inside the container.\nThe Reasoning: Why LUKS in a Container? # Normally, containers rely on kernel namespace isolation. But files inside a container are accessible to anyone with root privileges on the host or who can execute a docker exec. 
By encrypting a portion of the filesystem with LUKS and unlocking it only via a manually entered passphrase, secrets are protected by a cryptographic key that resides only in RAM (and in the user\u0026rsquo;s mind).\nConceptual Deep-Dive: Linux Unified Key Setup (LUKS) LUKS is the standard for disk encryption in Linux. It works by creating a layer between the physical device (or an image file) and the filesystem. This layer handles the on-the-fly decryption of data blocks. In the context of a container, using LUKS requires access to the host kernel\u0026rsquo;s Device Mapper, an operation that is inherently complex to isolate.\nThe Investigation: Loop Device Failure # The first attempt to create the vault in RAM via tmpfs hit a kernel error: Attaching loopback device failed (loop device with autoclear flag is required).\nIn a Docker environment, even if the container is launched with the --privileged flag, the cryptsetup command often fails to automatically allocate loop devices (those virtual devices that allow a file to be treated as a hard disk). 
This happens because the nodes in /dev/loop* are not dynamically created inside the container.\nThe Solution: Mknod and Manual Losetup # I had to implement a robust unlocking procedure that prepared the ground for the kernel:\n# Snippet from the unlock script (devpod-zt.sh) echo \u0026#34;🛠️ Preparing loop devices (0-63)...\u0026#34; sudo mknod /dev/loop-control c 10 237 2\u0026gt;/dev/null || true for i in $(seq 0 63); do sudo mknod /dev/loop$i b 7 $i 2\u0026gt;/dev/null || true done echo \u0026#34;💾 Engaging Secure Enclave (RAM)...\u0026#34; # Dedicated tmpfs mount to avoid /dev/shm limits sudo mount -t tmpfs -o size=256M tmpfs \u0026#34;$VAULT_BASE\u0026#34; # Manual loop device association LOOP_DEV=$(sudo losetup -f --show \u0026#34;$VAULT_IMG\u0026#34;) echo -n \u0026#34;$PLAIN_PASS\u0026#34; | sudo cryptsetup luksFormat --batch-mode \u0026#34;$LOOP_DEV\u0026#34; - echo -n \u0026#34;$PLAIN_PASS\u0026#34; | sudo cryptsetup open \u0026#34;$LOOP_DEV\u0026#34; \u0026#34;$MAPPER_NAME\u0026#34; - This move was crucial. By manually creating device nodes and managing the losetup association outside of cryptsetup\u0026rsquo;s automation, I succeeded in overcoming Docker runtime restrictions and finally mounting a working encrypted filesystem in ~/secrets.\nPhase 3: The Clash Between Automation and Hardening # With the vault working, I tried to automate the process. I wanted the container to ask for the password immediately upon entry. I implemented a Trap-Shell in the .bashrc: a script that intercepted the session start and launched the unlocking procedure.\nThe Symptom: \u0026ldquo;Ghosts\u0026rdquo; in the Logs # As soon as the Trap-Shell was activated, I started seeing incessant output every 30 seconds in the devpod up logs: 00:32:47 debug Start refresh ... Device secrets_vault already exists.\nThe Analysis: The DevPod Agent Lifecycle # Here I discovered the true nature of the DevPod Agent. 
To provide features like port forwarding and file sync, the DevPod agent maintains an open SSH channel or socket to the container. Every 30 seconds, the agent executes \u0026ldquo;refresh\u0026rdquo; commands (such as update-config) by launching new shells in the container.\nSince my Trap-Shell was in the .bashrc, every time the agent entered for a routine check, the security script started, tried to ask for a password (which the agent couldn\u0026rsquo;t provide), or tried to remount an already active volume, generating cascading errors.\nConceptual Deep-Dive: Interactive vs Non-interactive Shells In Bash, shells can be interactive (connected to a terminal/TTY) or non-interactive (executed by a script or a daemon). The DevPod agent launches non-interactive shells. I tried to solve the problem by filtering the security script execution:\n# Modification in .bashrc if [[ $- == *i* ]]; then # Run unlock only if user is at the screen tazpod-unlock fi Although this reduced the noise, it did not solve the underlying problem: DevPod Agent continued to \u0026ldquo;quarrel\u0026rdquo; with my hardened environment.\nPhase 4: The Fall of SSH and the \u0026ldquo;Fail-Open\u0026rdquo; Discovery # The final nail in the coffin of the DevPod-based approach was the attempt to harden SSH access. I wanted the vault to unmount automatically after exiting the shell and for reentry to require the password again.\nI tried removing the SSH keys injected by DevPod (rm ~/.ssh/authorized_keys). The result? The DevPod agent panicked, losing the ability to manage the workspace. I tried implementing a background Watchdog that would count active bash processes and unmount the vault at the end of the last session. 
But the complexity was scaling exponentially compared to the benefits.\nThe \u0026ldquo;Ctrl+C\u0026rdquo; Vulnerability # During a manual penetration test, I discovered an embarrassing flaw: if I pressed Ctrl+C during the Infisical password prompt, the script was interrupted but the shell gave me the command prompt anyway. It was a security system that could be bypassed with a single keystroke.\nI responded by implementing a brutal SIGINT Trap:\n# In .bashrc trap \u0026#34;echo \u0026#39;❌ Interrupted. Exiting.\u0026#39;; exit 1; kill -9 $$\u0026#34; INT It worked. But at that point, my development environment had become a web of hacks, fragile Bash scripts trying to manage kernel signals, and perennial conflicts with the DevPod orchestration agent.\nPhase 5: Resignation and Paradigm Shift # After hours spent fighting against the Device already exists error from the Device Mapper and the infinite refreshes of the agent, I reached a painful but necessary conclusion: DevPod is not the right tool for a Zero Trust enclave.\nDevPod is built on the philosophy of Convenience-First. It wants you to be operational in one click, your SSH keys synced everywhere, your environment \u0026ldquo;always ready.\u0026rdquo; My security vision, however, requires an environment that is \u0026ldquo;never ready\u0026rdquo; until the user explicitly decides so.\nThe Decision: I decided to throw away all the work done with DevPod. I decided to eliminate the agent, the automatic SSH keys, and the integrated VS Code server.\nThe new approach will be based on:\nPure Docker: A Debian Slim container launched manually with 100% controlled startup scripts. Go CLI: A dedicated CLI written in Go (which we will call tazpod) to manage the entire security lifecycle in a robust and atomic way, eliminating the fragility of Bash scripts. Terminal-Only Workflow: Abandoning VS Code in favor of Neovim (LazyVim), eliminating the need for persistent SSH channels for the IDE. 
Conclusion: What We Learned in This Stage # This session, seemingly a failure, was actually a masterclass in systems engineering. I learned that:\nAutomation is not always an ally of extreme security. The host kernel and the container have a very tight dependency relationship when it comes to encryption, and intermediaries make debugging impossible. Knowing when to give up on a tool when it no longer meets requirements is a senior skill as fundamental as knowing how to configure it. The Immutable Workshop is not dead; it is just shedding its skin. In the next post, I will document the birth of the TazPod CLI in Go and the transition to a Pure Docker environment, where control is no longer an option, but the very foundation of the architecture.\n","date":"14 January 2026","externalUrl":null,"permalink":"/posts/devpod-zero-trust-struggle/","section":"Posts","summary":"","title":"DevPod's Swan Song: The Clash Between Automation and Zero Trust Security","type":"posts"},{"content":"","date":"14 January 2026","externalUrl":null,"permalink":"/tags/luks/","section":"Tags","summary":"","title":"Luks","type":"tags"},{"content":"","date":"14 January 2026","externalUrl":null,"permalink":"/tags/troubleshooting/","section":"Tags","summary":"","title":"Troubleshooting","type":"tags"},{"content":"","date":"12 January 2026","externalUrl":null,"permalink":"/it/tags/automazione/","section":"Tags","summary":"","title":"Automazione","type":"tags"},{"content":"","date":"12 January 2026","externalUrl":null,"permalink":"/it/tags/produttivit%C3%A0/","section":"Tags","summary":"","title":"Produttività","type":"tags"},{"content":" Introduction: The Local Configuration Paradox # In today\u0026rsquo;s Infrastructure as Code (IaC) landscape, a fundamental paradox exists: we spend hours making our servers immutable (via systems like Talos Linux) and our workloads ephemeral (via Kubernetes), yet we continue to manage infrastructure from \u0026ldquo;artisanal\u0026rdquo; laptops, configured 
manually and subject to slow but inexorable entropy.\nWhile working on my Proxmox/Talos cluster, I realized my workstation (Zorin OS) was becoming a bottleneck. Misaligned versions of talosctl, conflicts between Python versions, and precarious management of kubeconfig files were introducing unacceptable operational risk. Furthermore, the need to operate on the move required an environment not bound to my main laptop\u0026rsquo;s physical hardware.\nThe goal of this session was to build a DevPod (Development Pod): a containerized, portable workspace strictly defined by code. We are not talking about a simple throwaway Docker container, but a complete engineering workstation, persistent in configuration yet ephemeral in execution.\nThe Mindset: Security vs. Usability # Before writing the first line of code, I evaluated a radical approach to security. The initial idea was to implement an encrypted filesystem residing exclusively in RAM. I imagined a script that, upon startup, would allocate a block of RAM, format it with LUKS (Linux Unified Key Setup), and mount it into the container.\nThe Reasoning: In a \u0026ldquo;Cold Boot Attack\u0026rdquo; scenario or physical compromise of the powered-off machine, secrets (SSH keys, kubeconfig) would be mathematically unrecoverable, having vanished along with the electrical current.\nThe Decision: After a cost-benefit analysis, I decided to discard this complexity for the moment. While technically fascinating, it would have introduced excessive friction into the daily workflow (the need to enter decryption passphrases at every reboot, complex management of privileged mount points). I opted for a more pragmatic approach: secrets reside in a host directory that is not versioned in Git and is dynamically mounted into the container. 
Security is delegated to host disk encryption (standard LUKS), an acceptable compromise for a lab environment, allowing me to focus on development environment stability.\nPhase 1: Networking and the MTU Nightmare # The first technical barrier encountered during the debian:slim container bootstrap was, predictably, the network. My host uses a VPN connection (WireGuard/Tailscale) to reach the Proxmox cluster management network.\nThe Symptom # Upon starting the container, the apt-get update command would hang indefinitely at 0% or fail with timeouts on specific repositories.\nThe Investigation # This behavior is a \u0026ldquo;classic\u0026rdquo; symptom of MTU (Maximum Transmission Unit) issues. Docker, by default, creates a bridge network (docker0) and routes container traffic through it. The bridge inherits the Ethernet standard\u0026rsquo;s 1500-byte MTU. However, VPN tunnels must add their own headers to packets, reducing the available useful space (payload), often bringing the effective MTU to 1420 bytes or less.\nWhen the container attempts to send a 1500-byte packet, it reaches the host\u0026rsquo;s VPN interface. If the \u0026ldquo;Don\u0026rsquo;t Fragment\u0026rdquo; (DF) bit is set (as often happens in HTTPS/TLS traffic), the packet is silently discarded because it is too large for the tunnel. In theory, the router should send an ICMP \u0026ldquo;Fragmentation Needed\u0026rdquo; message, but many modern firewalls block ICMP, creating a \u0026ldquo;Path MTU Discovery Blackhole.\u0026rdquo;\nThe Solution: --network=host # Instead of attempting a fragile tuning of the MTU values in the Docker daemon (which would have made the configuration specific to my machine and not portable), I decided to bypass Docker\u0026rsquo;s network stack completely.\nIn the devcontainer.json file, I introduced:\n\u0026#34;runArgs\u0026#34;: [ \u0026#34;--network=host\u0026#34; ] Conceptual Deep-Dive: Host Networking By using the host network driver, the container does not receive its own isolated network namespace. 
It directly shares the host\u0026rsquo;s network stack. If the host has a tun0 interface (the VPN), the container sees and uses it directly. This eliminates double NAT and packet fragmentation issues, ensuring the DevPod\u0026rsquo;s connectivity is exactly identical to the physical machine\u0026rsquo;s.\nPhase 2: State Management and Secrets Injection # An ephemeral environment must be destroyable without data loss, but it must not contain sensitive data in its base image either. This required a very precise volume management strategy.\nThe Bind Mounts Strategy # I decided to keep critical configuration files (kubeconfig, talosconfig) in a local host directory (~/kubernetes/tazlab-configs), strictly excluded from Git versioning via .gitignore.\nThis directory is \u0026ldquo;grafted\u0026rdquo; into the container at runtime:\n\u0026#34;mounts\u0026#34;: [ \u0026#34;source=/home/taz/kubernetes/tazlab-configs,target=/home/vscode/.cluster-configs,type=bind,consistency=cached\u0026#34; ] The Environment Variables Conflict # Mounting files is not enough. Tools like kubectl expect configuration files in standard paths (~/.kube/config). Having moved the files to a custom path for cleanliness, I had to instruct the tools via environment variables (KUBECONFIG, TALOSCONFIG).\nInitially, I attempted to export these variables via a startup script (postCreateCommand) that appended them to the .bashrc file. However, I found that upon opening a shell in the container, the variables were not present.\nFailure Analysis: The problem lay in shell management. The base image included a configuration that launched Zsh instead of Bash, or (in the case of tmux) launched a login shell that reset the environment. 
Relying on init scripts to set environment variables is inherently fragile because it is prone to race conditions: if the user enters the terminal before the script finishes, the environment is incomplete.\nThe Robust Solution: I moved the variable definitions directly into the container configuration, using the containerEnv property of devcontainer.json.\n\u0026#34;containerEnv\u0026#34;: { \u0026#34;KUBECONFIG\u0026#34;: \u0026#34;/home/vscode/.cluster-configs/kubeconfig\u0026#34;, \u0026#34;TALOSCONFIG\u0026#34;: \u0026#34;/home/vscode/.cluster-configs/talosconfig\u0026#34; } In this way, the Docker daemon itself injects these variables into the container\u0026rsquo;s parent process at creation time (docker run -e ...). The variables are therefore available instantly and universally, regardless of the shell used (Bash, Zsh, Fish) or the loading order of user profiles.\nPhase 3: The \u0026lsquo;Golden Image\u0026rsquo; Strategy and Layered Architecture # In early iterations, my devcontainer.json defined a generic base image and delegated the installation of all tools (kubectl, talosctl, neovim, yazi) to an install-extras.sh script. The result was an unacceptable startup time (5-8 minutes) at each container rebuild, with a high risk of failure if an external repository (e.g., GitHub or apt) was momentarily unreachable.\nI decided to pivot toward a Golden Image approach: building the environment \u0026ldquo;offline\u0026rdquo; and distributing it as a single pre-built Docker image.\nOptimized Layering # To balance build speed and flexibility, I structured the Dockerfiles into three distinct hierarchical levels.\n1. The Base Level (Dockerfile.base) # This is the foundation. 
It contains the operating system (Debian Bookworm), the Locales configuration (essential to avoid crashes of TUI tools like btop that require UTF-8), and heavy, stable binaries.\nConceptual Deep-Dive: Locales in Docker Minimal Docker images often do not have locales generated to save space (POSIX or C). However, modern tools like starship or terminal graphical interfaces require Unicode characters. I had to force the generation of en_US.UTF-8 in the Dockerfile to ensure interface stability.\n# Dockerfile.base snippet RUN echo \u0026#34;en_US.UTF-8 UTF-8\u0026#34; \u0026gt; /etc/locale.gen \u0026amp;\u0026amp; \\ locale-gen \u0026amp;\u0026amp; \\ update-locale LANG=en_US.UTF-8 ENV LANG=en_US.UTF-8 2. The Intermediate Level (Dockerfile.gemini) # This layer extends the base by adding specific and potentially optional tools—in my case, the Gemini CLI. Separating it allows me to have, in the future, \u0026ldquo;light\u0026rdquo; versions of the environment without having to recompile the entire base layer.\n3. The Final Level (Dockerfile) # This is the entry point consumed by DevPod. It inherits from the intermediate level and is tagged as latest. This \u0026ldquo;matryoshka\u0026rdquo; approach allows me to update a tool in the base layer and propagate the change to all child images with a simple chain rebuild.\nOperational Result # Startup time (devpod up) plummeted from minutes to a few seconds. The image is immutable: I have mathematical certainty that the versions of the tools I use today will be identical a month from now, eliminating the root cause of \u0026ldquo;Configuration Drift.\u0026rdquo;\nPhase 4: Customization and GNU Stow # A sterile development environment is unproductive. I needed my specific Neovim configuration (based on LazyVim), my Tmux bindings, and my custom scripts.\nI chose GNU Stow to manage my dotfiles. 
Stow is a symbolic link manager that allows keeping configuration files in a centralized directory (a Git repo) and creating symlinks in target positions (~/.config/nvim, ~/.bashrc).\nThe Dirty Link Challenge # Stow operates by default by \u0026ldquo;mirroring\u0026rdquo; the source directory structure. This created a problem with my scripts/ folder. Stow attempted to create a ~/scripts link in the container home, while Linux convention requires user executables to reside in ~/.local/bin to be automatically included in the $PATH.\nI had to write an intelligent runtime script (setup-runtime.sh) that executes Stow conditionally:\n# Differentiated stowing logic for package in *; do if [ \u0026#34;$package\u0026#34; == \u0026#34;scripts\u0026#34; ]; then # Force destination for scripts in .local/bin stow --target=\u0026#34;$HOME/.local/bin\u0026#34; --adopt \u0026#34;$package\u0026#34; else # Standard behavior for nvim, tmux, git stow --target=\u0026#34;$HOME\u0026#34; --adopt \u0026#34;$package\u0026#34; fi done Furthermore, I had to handle a critical conflict with Neovim. My Dockerfile pre-installs a \u0026ldquo;starter\u0026rdquo; Neovim configuration. When Stow attempted to link my personal configuration, it failed because the target directory already existed. I added preventive cleanup logic that detects the presence of personal dotfiles and removes the default configuration (\u0026ldquo;nuke and pave\u0026rdquo;) before applying symlinks.\nPhase 5: Architectural Decoupling # During the restructuring, I noticed a \u0026ldquo;Code Smell\u0026rdquo;: the image definition files (Dockerfile, build scripts) resided in the same repository as the Kubernetes infrastructure (tazlab-k8s).\nThe Reasoning: Mixing the definition of tools with the definition of infrastructure violates the principle of Separation of Concerns. 
If in the future I wanted to use the same DevPod environment for a Terraform project on AWS, or to develop a Go application, I would be forced to duplicate code or improperly depend on the Kubernetes repository.\nThe Action: I decided to extract all image building logic into a new dedicated repository: tazzo/devpod. The tazlab-k8s repository was cleaned up and now contains only a lightweight reference in the devcontainer.json:\n\u0026#34;image\u0026#34;: \u0026#34;tazzo/tazlab.net:devpod\u0026#34; This transforms the DevPod image into a standalone, versionable Platform Product reusable across all organization projects, significantly cleaning up the cluster codebase.\nPost-Lab Reflections # The result of this engineering marathon is an environment I would define as \u0026ldquo;Anti-Fragile.\u0026rdquo; I no longer depend on the host laptop\u0026rsquo;s configuration. I can format the physical machine, install Docker and DevPod, and be 100% operational again in the time it takes to download the Docker image (about 2 minutes on a fiber connection).\nThis setup has profound implications for the cluster\u0026rsquo;s long-term stability:\nUniformity: Every operation on the cluster is executed with the exact same binary versions, removing a whole class of bugs caused by client-server version skew. Security: Secrets never enter the image: they live in an unversioned host directory and reach the container only through a bind mount, reducing the attack surface. Onboarding: Should I collaborate with another engineer, their environment setup would shrink to pulling the image. The most important lesson learned today concerns the importance of investing time in one\u0026rsquo;s \u0026ldquo;meta-work.\u0026rdquo; The hours spent building this environment will be repaid in minutes saved every single day of future operations. 
The next logical step will be to move this DevPod from the local Docker engine directly into the Kubernetes cluster, transforming it into a management bastion that is persistent and accessible from anywhere—but that is a story for the next log.\n","date":"12 January 2026","externalUrl":null,"permalink":"/posts/devpod-architecture-deep-dive/","section":"Posts","summary":"","title":"The Immutable Workshop: Architecture of a 'Golden Image' DevPod Environment for Kubernetes Orchestration","type":"posts"},{"content":" Cloud-Native Security Paradigms and the Inadequacy of Native Mechanisms # The evolution of infrastructure toward cloud-native models and the massive adoption of Kubernetes as a container orchestrator have introduced unprecedented security challenges. In this context, secret management—the handling of sensitive information such as API keys, database passwords, TLS certificates, and access tokens—has become the fundamental pillar of any modern security strategy. Traditionally, managing sensitive data was plagued by \u0026ldquo;sprawl,\u0026rdquo; where credentials were often hardcoded directly into source code, stored in cleartext in configuration files, or insecurely exposed via environment variables. With the shift to microservices, the number of these credentials has grown exponentially, making manual methods not only insecure but also operationally unsustainable.\nKubernetes offers a native system for secret management, but in-depth technical analysis reveals structural limitations critical for production environments. By default, Kubernetes secrets are stored in etcd, the cluster\u0026rsquo;s key-value database, using Base64 encoding. It is essential to emphasize that Base64 encoding does not constitute any form of encryption; its sole purpose is to allow the storage of arbitrary binary data. 
Without explicit configuration of Encryption at Rest for etcd, anyone who gains access to the storage backend or the API server with sufficient privileges can retrieve secrets in cleartext. Furthermore, native secrets lack advanced features such as automatic credential rotation, granular identity-based access control, and a robust audit logging system that can track who accessed a secret and when.\nTo address these needs, the DevOps landscape has integrated specialized tools like HashiCorp Vault and Mozilla SOPS. Vault acts as a central authority for secrets, providing a unified control plane that transcends the individual Kubernetes cluster. SOPS, on the other hand, solves the challenge of integrating secrecy with version control systems (Git), allowing sensitive data to be encrypted before being stored in repositories. The combination of these tools, supported by automation via Terraform, allows for building secure and resilient CI/CD pipelines suitable for both a small homelab and large-scale professional infrastructures.\nInternal Architecture of HashiCorp Vault: The Heart of Secret Management # HashiCorp Vault is not a simple encrypted database but a comprehensive framework for identity-based security. Its architecture is designed around the concept of a cryptographic barrier that protects all data stored in the backend. When Vault is started, it is in a \u0026ldquo;sealed\u0026rdquo; state. In this state, Vault can access its physical storage but cannot decrypt the data contained within it, as the Master Key is not available in memory.\nThe Unseal Process and the Shamir Algorithm # The unlocking process, known as \u0026ldquo;unseal,\u0026rdquo; traditionally requires reconstructing the Master Key. Vault uses Shamir\u0026rsquo;s Secret Sharing algorithm to split the Master Key into multiple fragments (key shares). 
A specified minimum number of these fragments (threshold) must be provided to reconstruct the master key and allow Vault to decrypt the data encryption key (Barrier Key). In Kubernetes environments, where pods are ephemeral and can be frequently rescheduled, manual unsealing is impractical. For this reason, the Auto-unseal feature is almost universally adopted, delegating the protection of the Master Key to an external KMS service (such as AWS KMS, Azure Key Vault, or Google Cloud KMS) or to another Vault cluster via the Transit engine.\nSecret Engines and Authentication Methods # Vault\u0026rsquo;s flexibility stems from its Secret Engines and Auth Methods. While KV (Key-Value) engines store static secrets, dynamic engines can generate credentials \u0026ldquo;on-the-fly\u0026rdquo; for databases, cloud providers, or messaging systems. These credentials have a limited time-to-live (TTL) and are automatically revoked upon expiration, drastically reducing the \u0026ldquo;blast radius\u0026rdquo; in case of compromise.\nVault Component | Main Function | Application in Kubernetes\nBarrier | Cryptographic barrier between storage and API | Protection of sensitive data in etcd or Raft\nStorage Backend | Data persistence (e.g., Raft, Consul) | Storage of secrets on Persistent Volumes\nSecret Engines | Generation/Storage of secrets | Management of PKI certificates, dynamic DB credentials\nAuth Methods | Verification of client identity | Integration with Kubernetes ServiceAccounts\nAudit Broker | Logging of every request/response | Monitoring access for compliance and security\nImplementing Vault on Kubernetes: Raft and High Availability # Deploying Vault on Kubernetes requires careful planning to ensure data availability and persistence. The modern approach recommended by HashiCorp involves using Integrated Storage based on the Raft consensus protocol. 
Unlike external backends like Consul, Raft allows Vault to autonomously manage data replication within the cluster, simplifying the topology and reducing the number of components to monitor.\nCluster Topology and Quorum # A resilient Vault implementation requires an odd number of nodes to avoid \u0026ldquo;split-brain\u0026rdquo; scenarios. In production, a minimum of three nodes is recommended to tolerate the failure of a single node, while a five-node configuration is preferable to handle the loss of two nodes or an entire availability zone without service interruption. Each node participates in replicating the Raft log, ensuring that every write operation is confirmed by the majority before being considered definitive.\nHelm Chart Configurations and Hardening # Installation typically occurs via the official HashiCorp Helm chart. Critical configurations include setting server.ha.enabled (together with server.ha.raft.enabled when Integrated Storage is used) and defining storage via volumeClaimTemplates to ensure that each Vault replica has its own dedicated persistent volume. To maximize security, workload isolation must be implemented. Vault should not share nodes with other applications to mitigate side-channel attack risks. This is achieved using nodeSelector, tolerations, and affinity rules to confine Vault pods to dedicated hardware.\nAn often overlooked aspect is the configuration of liveness and readiness probes. Since a Vault instance can be active but sealed, the readiness probe must be intelligently configured to distinguish between a running process and a service ready to respond to decryption requests. 
The Helm chart handles much of this logic, using CLI commands like vault status to verify the internal state of the node.\nTerraform: The Connective Tissue of DevOps Automation # Terraform integrates into the ecosystem as the Infrastructure as Code (IaC) tool of choice, allowing the configuration of not only the underlying infrastructure (Kubernetes clusters, networks, storage) but also access policies and secrets within Vault. Terraform\u0026rsquo;s value lies in its ability to manage dependencies between different providers.\nLifecycle Management and Dependencies # Using the hashicorp/vault provider allows operators to define secrets, policies, and authentication configurations declaratively. At the same time, the hashicorp/kubernetes provider allows mapping this information within the cluster. A common pattern involves extracting a secret from Vault via a data source and subsequently creating it as a Kubernetes secret for legacy applications that do not support native integration with Vault.\nState File Security and Sensitive Variables # A critical challenge in using Terraform is protecting the state file (terraform.tfstate). This file often contains sensitive information in cleartext, including secrets retrieved from Vault during the plan or apply phase. It is imperative to store the state in a secure remote backend, such as AWS S3 with server-side encryption and state locking (DynamoDB), or use HashiCorp Terraform Cloud which natively manages state encryption at rest. 
Additionally, variables marked as sensitive = true prevent Terraform from printing their values in the console output, reducing the risk of exposure in CI/CD pipeline logs, although the values are still written to the state file.\nTerraform Strategy | Security Benefit | Mitigated Risk\nEncrypted Remote Backend | State encryption at rest | Unauthorized access to secrets in tfstate\nSensitive Variables | Obfuscation of values in logs | Accidental exposure in CI/CD stdout\nVault Provider | Centralized secret management | Hardcoding credentials in .tf files\nRBAC for the Control Plane | Limitation of who can execute apply | Unauthorized changes to critical infrastructure\nMozilla SOPS: Security for Version Control and GitOps Flows # Mozilla SOPS (Secrets OPerationS) was born from the need to integrate secrets into the Git-based workflow (GitOps) without compromising security. Unlike Kubernetes secrets, which should never be stored in Git even if Base64-encoded, files encrypted with SOPS are safe for versioning.\nEnvelope Encryption and Multi-Recipient # SOPS implements envelope encryption, where data is encrypted with a symmetric Data Encryption Key (DEK), which is in turn encrypted by one or more Master Keys (KEK) managed externally. This approach allows for multiple recipients for the same secret: for example, a file can be decrypted by a team of administrators via their personal PGP keys and, simultaneously, by the Kubernetes cluster via a cloud KMS service.\nIntegration with age for Operational Simplicity # While PGP has historically been the standard for SOPS, the tool age (Actually Good Encryption) has become the preferred choice for modern DevOps environments due to its simplicity, lack of complex configurations, and cryptographic speed. 
In an age-based workflow, each operator generates a key pair; the public key is inserted into the repository\u0026rsquo;s .sops.yaml file, while the private key remains protected on the operator\u0026rsquo;s machine or is uploaded as a secret in the Kubernetes cluster.\nExample .sops.yaml file:\ncreation_rules:\n  - path_regex: .*.enc.yaml$\n    encrypted_regex: ^(data|stringData)$\n    age: age1vwd8j93mx9l99k\u0026hellip; # Cluster public key\n    pgp: 0123456789ABCDEF\u0026hellip; # Admin backup key\nThe use of encrypted_regex is a fundamental best practice: it allows encrypting only sensitive values (such as the data and stringData fields of a Kubernetes secret) while leaving metadata like apiVersion, kind, and metadata.name in cleartext. This enables GitOps tools and operators to identify the resource type without having to decrypt it.\nSecret Consumption Mechanisms in Kubernetes # Once secrets have been stored in Vault or encrypted with SOPS, the workload running on Kubernetes must be able to access them. Three main patterns exist, each responding to different security and complexity requirements.\n1. Vault Agent Injector # This method uses a Sidecar container automatically injected into pods via a Mutating Admission Webhook. The Vault Agent handles authentication with Vault using the pod\u0026rsquo;s ServiceAccount and writes secrets to a shared memory volume (emptyDir). It is the ideal solution for applications that are not \u0026ldquo;cloud-native\u0026rdquo; and expect to read secrets from local files, as it allows formatting data via Consul Template markup (Go templates).\n2. Vault Secrets Operator (VSO) # VSO represents the native approach for GitOps. The operator monitors custom resources (CRDs) in the cluster, retrieves data from Vault, and creates/updates standard Kubernetes secrets. 
This method is extremely powerful because it allows applications to use native Kubernetes secrets (mounted as volumes or environment variables) without any code changes, while maintaining Vault as the single source of truth.\n3. Secrets Store CSI Driver # This driver allows mounting external secrets directly as volumes in the pod\u0026rsquo;s file system, without ever creating a Kubernetes Secret object. This approach is considered the most secure since the secret exists only within the pod\u0026rsquo;s ephemeral memory and disappears when the pod is terminated, reducing the persistence of sensitive data in the cluster.\nIntegration Method | Storage in the Cluster | Dynamicity | Complexity\nVault Agent Injector | Memory volume (Sidecar) | Very High (Automatic renewal) | Medium\nVault Secrets Operator | Kubernetes Secret object | High (Periodic synchronization) | Low\nSecrets Store CSI | Pod file system | High (On-the-fly update) | High\nNative K8s Secrets | etcd (Base64) | None (Manual) | Minimal\nThe Secret Zero Problem and the Identity-Based Solution # The \u0026ldquo;Secret Zero\u0026rdquo; dilemma is a fundamental logical challenge in information security: to retrieve its secrets securely, an application needs an initial credential to prove its identity to the secret manager. If this initial credential is hardcoded into the container image or passed as an insecure environment variable, the entire system becomes vulnerable.\nCryptographic Identity Attestation # The modern solution to Secret Zero consists of shifting the focus from \u0026ldquo;what you possess\u0026rdquo; (a password) to \u0026ldquo;who you are\u0026rdquo; (a verifiable identity). In Kubernetes, this is achieved via Vault\u0026rsquo;s Kubernetes authentication method. When a pod attempts to access Vault, it sends its ServiceAccount JWT token, which is automatically injected by Kubernetes into the pod\u0026rsquo;s file system. 
Vault receives this token and contacts the Kubernetes API server via a TokenReview request to verify that the token is valid and belongs to the declared ServiceAccount. Once the identity is confirmed, Vault issues a session token with limited privileges, eliminating the need to distribute bootstrap secrets.\nOIDC Federation in CI/CD # The same principle applies to CI/CD pipelines. Using OIDC (OpenID Connect) identity federation, a GitHub Actions or GitLab CI pipeline can obtain a temporary JWT token signed by the pipeline provider. Vault can be configured to trust this OIDC provider, verifying \u0026ldquo;claims\u0026rdquo; (such as repository name, branch, or environment) to decide whether to grant access to the secrets needed for deployment. This completely removes the need to store long-term Vault tokens within GitHub or GitLab secrets, effectively solving the Secret Zero problem for automation.\nHomelab Use Case: Implementation on Raspberry Pi with k3s, Flux, and SOPS # In a home context, resources are limited and operational simplicity is key. A Raspberry Pi 4 (with 4GB or 8GB of RAM) represents the ideal platform for running k3s, a lightweight Kubernetes distribution optimized for edge computing.\nHardware and OS Preparation # Installation starts by using the Raspberry Pi Imager to write Raspberry Pi OS Lite (64-bit) to an SD card. A critical configuration for k3s is enabling cgroups in the /boot/firmware/cmdline.txt file, adding the parameters cgroup_memory=1 cgroup_enable=memory, without which the k3s service would fail to start correctly. To ensure stability, it is recommended to assign a static IP to the device via a DHCP reservation on the home router.\nGitOps Flow Configuration # In a homelab, managing secrets via SOPS and age keys is often preferred over installing a full Vault instance, due to the lower memory overhead. The workflow is structured as follows:\nFluxCD Bootstrap: Flux is installed on the cluster and connected to a private Git repository. 
\nKey Management: An age key pair is generated on the management machine. The private key is uploaded to the k3s cluster as a Kubernetes secret in the flux-system namespace.\nManifest Encryption: Developers (i.e., homelab users) write their YAML manifests for applications like Pi-hole or Home Assistant, including the necessary credentials. These files are encrypted locally with SOPS before being committed.\nAutomatic Decryption: When Flux detects a new commit, its Kustomize controller uses the age key present in the cluster to decrypt the manifests and apply them, ensuring that secrets are never exposed in cleartext in the repository.\nThis setup provides a professional-level experience at minimal cost and with strong security, allowing the entire home infrastructure to be managed as code.\nProfessional Use Case: Enterprise Multi-Cluster Infrastructure # In a corporate environment, requirements for availability, audit, and Separation of Duties dictate a more complex architecture. Here, HashiCorp Vault becomes the nerve center of security.\nMulti-Cluster Reference Architecture # Enterprise best practice involves physical separation between the cluster hosting Vault (Tooling Cluster) and the clusters running application workloads (Production Clusters). This separation ensures that a potential \u0026ldquo;cluster failure\u0026rdquo; due to excessive application load does not cut off access to secrets, which would otherwise block any recovery or autoscaling operations.\nThe Vault cluster must be deployed across three availability zones (AZ) to ensure high availability. Auto-unseal is implemented via the cloud provider\u0026rsquo;s KMS service (e.g., AWS KMS) to eliminate the operational risk of manual unlocking.\nAdvanced Integration with Terraform and CI/CD # In large organizations, Vault configuration is not done manually. 
Terraform pipelines are used to define:\nGranular Policies: Each application has a dedicated policy that allows read-only access exclusively to the secret paths assigned to it.\nCentralized Audit Logging: Vault is configured to send audit logs to a SIEM system (such as Splunk or Elasticsearch) for real-time anomaly detection.\nPKI as a Service: Vault is used as an intermediate certificate authority (CA) to issue short-lived TLS certificates for pod-to-pod communication, often integrating with Service Meshes like Istio via cert-manager.\nCompliance and Governance # A fundamental pillar of production is secret rotation. While in a homelab rotation might be semi-annual and manual, in production it must be automated. Vault can rotate its encryption keys and database credentials on a schedule, for example every 30 days or less, reducing the window in which a stolen secret remains valid. This process is transparent to applications if integrated via the Vault Agent, which automatically updates the secret file on disk when it is rotated.\nIntegration between Vault and SOPS: The Best of Both Worlds # A sophisticated evolution of the DevOps workflow consists of using Vault as the encryption backend for SOPS. Instead of relying on distributed age keys, SOPS uses Vault\u0026rsquo;s Transit engine to encrypt the Data Encryption Key (DEK).\nThe Hybrid Workflow # In this scenario, a developer who needs to modify an encrypted secret in Git does not need to possess a private key on their laptop. They simply need to authenticate to Vault (via corporate SSO). SOPS sends the encrypted DEK to Vault; Vault verifies the user\u0026rsquo;s policies and, if authorized, decrypts the DEK and returns it to SOPS to unlock the file.\nThis approach offers unique advantages:\nNo key distribution: Cryptographic keys never leave Vault\u0026rsquo;s security barrier. 
\nInstant Revocation: If an employee leaves the company, simply disabling their Vault account prevents them from decrypting any secret in the Git repository, even if they have a local copy.\nCentralized Audit: Every attempt to decrypt a secret in Git leaves a trace in the Vault logs, allowing for identification of who is accessing what sensitive information during development.\nFeature | SOPS only (age) | Vault only (Dynamic) | Hybrid (SOPS + Vault Transit)\nSource of Truth | Git (Repository) | Vault (API) | Git (Encrypted) + Vault (Keys)\nOffline Access | Yes (with private key) | No (requires connection) | No (requires authentication)\nOperation Audit | Limited (Git logs) | Full (Vault logs) | Full for every decryption\nKey Management | Manual (File distribution) | Automatic (HSM/KMS) | Centralized in Vault\nMonitoring, Audit, and Operational Maintenance (Day 2) # Secret management does not end with initial implementation. Long-term success depends on \u0026ldquo;Day 2\u0026rdquo; operations, which include cluster health monitoring and rigorous auditing.\nBackup and Disaster Recovery Strategies # For Vault, backup is not just about data, but also the Master Keys. Using Raft, it is possible to take periodic snapshots of the cluster state via the vault operator raft snapshot save command. These snapshots must be stored in an S3 bucket with encryption and versioning enabled. In the event of total Kubernetes cluster failure, it is essential to have a documented procedure for restoring Vault from a snapshot on a new cluster, including reconnecting to the KMS service for Auto-unseal.\nDrift Detection and Auto-Healing # In GitOps ecosystems, drift occurs when the cluster\u0026rsquo;s actual state diverges from the one defined in Git. Flux and Argo CD constantly monitor this drift. If an administrator manually modifies a decrypted secret via kubectl edit, the GitOps controller will detect the discrepancy and overwrite the changes with the state declared in Git. 
This ensures configuration immutability and prevents silent and potentially harmful changes.\nLog Analysis and Intrusion Detection # Vault audit logs are a gold mine for security. Sophisticated analysis should look for anomalous patterns, such as a sudden spike in secret read requests from a ServiceAccount that usually only reads a few, or attempts to access unauthorized paths. Integration with Machine Learning-based anomaly detection tools can help identify these behaviors before they lead to a large-scale data breach.\nPerformance and Scalability Considerations # Introducing Vault and SOPS adds layers of abstraction that can affect performance. Network latency between the application and Vault is a critical factor, especially for applications making hundreds of secret requests per second.\nOptimization via Caching and Renewable Tokens # To reduce the load on Vault, the Vault Agent implements caching and token renewal mechanisms. Instead of requesting a new secret for every transaction, the agent can keep the secret in memory and periodically renew its \u0026ldquo;lease,\u0026rdquo; reducing traffic to the Vault cluster. In multi-region environments, Vault Enterprise performance replication can be used to distribute data geographically, allowing applications to read secrets from the nearest Vault cluster, minimizing intercontinental latency.\nLoad Management in Kubernetes # CPU and memory resources for Vault must be correctly sized. A Vault cluster with Raft storage requires high-performance disks with low seek times (high IOPS) to avoid delays in committing consensus logs.\n$$T_{commit} = L_{network} + T_{disk\\_write} + T_{consensus\\_logic}$$\nThe simplified formula above highlights that the commit time of a secret ($T_{commit}$) is the sum of the network latency between nodes ($L_{network}$), the physical disk write time ($T_{disk\\_write}$), and the computational overhead of the Raft protocol ($T_{consensus\\_logic}$). 
In enterprise environments, the use of NVMe SSD storage is highly recommended to keep performance within safe levels.\nOperational Conclusions and Adoption Roadmap # Secret management is an incremental journey. For organizations starting today, the recommended roadmap is:\nPhase 1 (Basic Hygiene): Implement SOPS with age keys for all secrets stored in Git, immediately eliminating cleartext files. Phase 2 (Centralization): Install HashiCorp Vault in high availability on Kubernetes and migrate critical database secrets, implementing automatic rotation. Phase 3 (Identity): Enable the Kubernetes and OIDC authentication methods to eliminate the Secret Zero problem and move toward authentication based on infrastructure trust. Phase 4 (Optimization): Integrate SOPS with Vault Transit to centralize key management and implement advanced audit logging for every access to sensitive data. By adopting these tools and methodologies, DevOps teams can ensure that security is not an obstacle to speed, but an accelerator that allows for deploying code in a secure, auditable, and resilient manner, from the modest resources of a Raspberry Pi to the vast infrastructures of the global cloud.\n","date":"10 January 2026","externalUrl":null,"permalink":"/guides/hashicorp-vault-sops-kubernetes-guide/","section":"Guides","summary":"","title":"Advanced Secret Management Strategies: HashiCorp Vault, SOPS, and the Kubernetes Ecosystem","type":"guides"},{"content":"In this section, you\u0026rsquo;ll find comprehensive guides for your Homelab and Kubernetes setup.\n","date":"10 January 2026","externalUrl":null,"permalink":"/guides/","section":"Guides","summary":"","title":"Guides","type":"guides"},{"content":"","date":"10 January 2026","externalUrl":null,"permalink":"/tags/sops/","section":"Tags","summary":"","title":"Sops","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/tags/csi/","section":"Tags","summary":"","title":"Csi","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/it/tags/immutabilit%C3%A0/","section":"Tags","summary":"","title":"Immutabilità","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/tags/immutability/","section":"Tags","summary":"","title":"Immutability","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/tags/persistence/","section":"Tags","summary":"","title":"Persistence","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/it/tags/persistenza/","section":"Tags","summary":"","title":"Persistenza","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/tags/pki/","section":"Tags","summary":"","title":"Pki","type":"tags"},{"content":"The advent of Talos Linux represents a fundamental paradigm shift in how security professionals and platform engineers conceive the operating system underlying Kubernetes clusters. 
Unlike traditional Linux distributions, designed for general-purpose use and based on mutable management via shell and SSH, Talos Linux was born as a purely API-oriented, immutable, and minimal solution.1 This architecture is not merely a technical optimization, but a structural response to the inherent vulnerabilities of legacy operating systems. By eliminating SSH access, package managers, and superfluous GNU utilities, Talos drastically reduces the attack surface, limiting it to about 12 essential binaries compared to the over 1,500 of a standard distribution.1 Security in this context is not a subsequent addition (bolt-on), but is integrated into the system\u0026rsquo;s DNA, where every interaction occurs via authenticated and encrypted gRPC calls.2\nImmutable Security Architecture and Threat Model # The heart of Talos Linux\u0026rsquo;s security proposition lies in its immutable nature and declarative management. The operating system runs from a read-only SquashFS image, which ensures that, even in the event of temporary runtime compromise, the system can be restored to a known and secure state simply via a reboot.2 This model eliminates \u0026ldquo;configuration drift\u0026rdquo;, a critical phenomenon where small manual changes over time make servers unique and difficult to protect.5 In Talos, the entire machine configuration is defined in a single YAML manifest, which includes not only operating system parameters but also the configuration of the Kubernetes components it orchestrates.2\nThe elimination of SSH is perhaps the most distinctive and discussed feature. 
Traditionally, SSH represents a primary attack vector due to weak keys, misconfigurations, and the possibility for an attacker to move laterally once a shell is obtained.1 By replacing SSH with a gRPC API interface, Talos mandates that every administrative action be structured, traceable, and certificate-based.2 This shifts the security focus from node access to the protection of client certificates and API keys.8\nTraditional Component Talos Linux Approach Security Implication Remote Access SSH (Port 22) gRPC API (Port 50000) 8 Package Management apt, yum, pacman Immutable Image (SquashFS) 2 Configuration Bash scripts, Cloud-init Declarative YAML Manifest 2 Userland GNU Utilities, Shell Minimal (only 12-50 binaries) 1 Privileges sudo, Root API-based RBAC 8 Public Key Infrastructure (PKI) and Certificate Management # The security of communications within a Talos and Kubernetes cluster is entirely based on a complex hierarchy of X.509 certificates. Talos automates the creation and management of these Certificate Authorities (CAs) during the cluster secrets generation phase.7 There are three primary PKI domains operating in parallel: the Talos API domain, the Kubernetes API domain, and the etcd database domain.9\nRoot Certificate Authorities and Lifetimes # By default, Talos generates root CAs with a duration of 10 years.13 This choice reflects the project\u0026rsquo;s philosophy of providing a stable infrastructure where root CA rotation is considered an exceptional operation, necessary only in case of private key compromise or mass access revocation needs.13 However, the certificates issued by these CAs for server components and clients have significantly shorter durations.9\nServer-side certificates for etcd, Kubernetes components (like the apiserver), and the Talos API are automatically managed and rotated by the system.9 A critical detail is represented by the kubelet: although rotation is automatic, the kubelet must be restarted (or the node updated/rebooted) 
at least once a year to ensure that new certificates are loaded correctly.9 Verifying the status of Kubernetes dynamic certificates can be done via the command talosctl get KubernetesDynamicCerts -o yaml directly from the control plane.9\nClient Certificates: talosconfig and kubeconfig # Unlike server certificates, client certificates are the sole responsibility of the operator.9 Every time a user downloads a kubeconfig file via talosctl, a new client certificate with a one-year validity is generated.9 Similarly, the talosconfig file, essential for interacting with the Talos API, must be renewed annually.9 The loss of validity of these certificates can lead to a total lockout of administrative access, making it fundamental to integrate periodic renewal processes into operational pipelines.9\nScheduled Change and Certificate Rotation # Root CA rotation, though rare, is a well-defined process in Talos Linux. It is not an instantaneous replacement, which would cause a total service interruption, but a multi-phase transition process.13\nAutomated CA Rotation Process # Talos provides the talosctl rotate-ca command to orchestrate rotation for both the Talos API and the Kubernetes API.13 The workflow follows an \u0026ldquo;Accepted -\u0026gt; Issuing -\u0026gt; Remove\u0026rdquo; model that guarantees operational continuity.13\nAcceptance Phase: A new CA is generated. This new CA is added to the acceptedCAs list in the machine configuration of all nodes.13 In this phase, the system accepts certificates signed by both the old and the new CA, but continues to issue certificates with the old one.13 Issuing Phase (Swap): The new CA is set as the primary issuing authority. Services begin generating new certificates using the new private key.13 The old CA remains among the acceptedCAs to allow components not yet updated to communicate.13 Refresh Phase: All certificates in the cluster are updated. 
For Kubernetes, this involves restarting the control plane components and the kubelet on each node.13 Removal Phase: Once it is confirmed that all components are using the new certificates, the old CA is removed from the acceptedCAs. From this moment, any old talosconfig or kubeconfig becomes unusable, effectively completing the revocation of previous accesses.13 Client Certificate Renewal Automation # Since client certificates expire annually, the use of cronjobs or automation scripts is an established practice. An administrator can generate a new talosconfig starting from an existing one that is still valid using the command talosctl config new --roles os:admin --crt-ttl 24h against a control plane node.9 For more robust management, it is possible to extract the root CA and private key directly from saved secrets (e.g., secrets.yaml) to generate new certificates offline, a vital technique for disaster recovery if all client certificates have expired simultaneously.9\nSecrets Management: The Role of Mozilla SOPS # In a GitOps architecture, where every configuration must reside in a Git repository, protecting the secrets present in Talos manifests (such as CA keys, bootstrap tokens, and etcd encryption secrets) becomes the primary challenge. Mozilla SOPS (Secrets OPerationS) has established itself as the reference tool in this domain.17\nWhy SOPS is the Standard for Talos # Unlike tools that encrypt the entire file (like Ansible Vault), SOPS is \u0026ldquo;structure-aware\u0026rdquo;. 
It can encrypt only the values within a YAML file, leaving the keys in clear text.19 This is fundamental for Talos for several reasons:\nDiffing: Developers can see which fields have changed in a commit without having to decrypt the entire file, facilitating code reviews.19 Integration with age: SOPS integrates perfectly with age, a modern and minimal encryption tool that avoids PGP complexities.19 Native Support in Talos Tools: Tools like talhelper and talm include native support for SOPS, allowing the entire configuration lifecycle (generation, encryption, application) to be managed fluidly.23 Practical Implementation: talhelper and SOPS # The recommended workflow for production involves using talhelper to generate node-specific configuration files starting from a central template (talconfig.yaml) and an encrypted secrets file (talsecret.sops.yaml).24\nInitialization: An age key pair is generated with age-keygen.19 SOPS Configuration: A .sops.yaml file is created in the repository root to define encryption rules, specifying which fields to protect via regular expressions (e.g., crt, key, secret, token).19 Secrets Management: Base secrets are generated with talhelper gensecret \u0026gt; talsecret.sops.yaml and immediate encryption is performed with sops -e -i talsecret.sops.yaml.24 Configuration Generation: During the CI/CD pipeline, talhelper genconfig automatically decrypts the necessary secrets to produce the final machine manifests, which are then applied to the nodes.22 CI/CD Integration and Security Pipelines # Integrating Talos Linux into a CI/CD pipeline (GitHub Actions, GitLab CI) transforms infrastructure management into a rigorous software process. 
The core principle is that no sensitive configuration should be decrypted on the developer\u0026rsquo;s machine, but only within the protected environment of the pipeline.18\nProduction Pipeline Flow # A typical pipeline for Talos deployment follows these security-critical steps:\nage Key Injection: The age private key is stored as a pipeline secret (e.g., SOPS_AGE_KEY). This ensures that only the authorized pipeline can decrypt the manifests.19\nValidation and Linting: Before applying any change, the pipeline performs static checks on the YAML configuration to ensure no syntax errors or security policy violations have been introduced.17\nStaged Update: Talos supports the --mode=staged option when applying configuration. The new configuration is loaded onto the node but takes effect only at the next reboot, enabling controlled maintenance windows.29\nNotifications and Auditing: Tools like ntfy.sh or Slack integrations are used to report the outcome of certificate renewals or patch applications, giving visibility into infrastructure operations.31\nComparison: SOPS vs Vault vs External Secrets Operator # Many teams wonder if SOPS is sufficient for production or if more complex solutions like HashiCorp Vault are necessary. The answer lies in the distinction between \u0026ldquo;Infrastructure Secrets\u0026rdquo; (necessary to start the cluster) and \u0026ldquo;Application Secrets\u0026rdquo; (necessary for workloads).33\nCriterion | Mozilla SOPS | HashiCorp Vault | External Secrets Operator (ESO)\nStrength | Simplicity and pure GitOps. 18 | Dynamic security and advanced auditing. 33 | Bridge between K8s and cloud KMS. 37\nComplexity | Low (CLI and files). 19 | High (requires Vault cluster management). 36 | Medium (operator in cluster). 38\nDynamic Secrets | No (Static in Git). 35 | Yes (temporary DB credentials). 33 | Depends on backend. 38\nIdeal Use for Talos | Machine Configuration and Bootstrap. 24 | Regulated Enterprise workloads. 33 | Cloud secrets sync to Pod. 
38\nLicense | Open Source (MPL). 41 | BSL (not Open Source). 34 | Open Source (Apache 2.0). 38\nCritical analysis: For managing Talos operating system security and the initial PKI, SOPS is often superior to Vault because decrypting with it does not require any pre-existing infrastructure.25 However, once the cluster is operational, integrating Vault via ESO or the Vault sidecar injector is the best practice for managing application credentials, reducing the proliferation of static secrets in Kubernetes.33\nAdvanced Hardening: Disk Encryption and TPM # A production Kubernetes cluster cannot ignore the protection of data at rest. Talos Linux offers one of the most advanced disk encryption implementations via LUKS2, integrated directly into the operating system lifecycle.29\nEncryption via TPM 2.0 and SecureBoot # The most secure approach on bare metal involves using the TPM (Trusted Platform Module) chip. When encryption is configured to use the TPM, Talos \u0026ldquo;seals\u0026rdquo; the disk encryption key to the firmware and bootloader state.29\nBoot Measurement: During the boot process, the Unified Kernel Image (UKI) components are measured into the TPM\u0026rsquo;s Platform Configuration Registers (PCRs).29\nConditional Unlock: The STATE or EPHEMERAL partition is unlocked only if SecureBoot is active and if PCR-7 measurements (indicating UEFI certificate integrity) match the expected ones.29 This prevents an attacker who physically steals the disk from accessing the data, as the key would not be released if the disk were inserted into different hardware or booted with a tampered bootloader.29\nIntegration with Network KMS # For cloud environments or data centers where TPM is not available or desired, Talos supports encryption via external KMS (Key Management Service).29 In this configuration, the Talos node generates a random encryption key, sends it to a KMS endpoint (like Omni or a custom proxy) to be encrypted (sealed), and stores the result in the LUKS2 metadata.43 Upon 
reboot, the node must be able to reach the KMS via network to decrypt the key.43\nNetwork Implication: Using KMS for the STATE partition introduces a challenge: network configuration must be defined in kernel parameters or via DHCP, as the partition that normally contains the configuration is still encrypted and inaccessible until the connection to the KMS is established.29\nNetwork and Runtime Security: Cilium and KubeArmor # Talos security does not stop at the operating system. Being a \u0026ldquo;purpose-built\u0026rdquo; system for Kubernetes, Talos facilitates the adoption of networking and security stacks based on eBPF, which offer superior performance and visibility compared to iptables.11\nCilium as Production Standard # While Flannel is the default CNI, Cilium is the established choice for the enterprise.11 Using Cilium on Talos allows:\nNetwork Policy Enforcement: Implement L3/L4 and L7 policies that are not possible with Flannel.11\nTransparent Encryption: Cilium can encrypt all pod-to-pod traffic transparently using IPsec or WireGuard.45\nKube-proxy Replacement: Eliminate kube-proxy in favor of a much more efficient eBPF-based implementation.44\nApplication Hardening with KubeArmor # While Talos isolates the node, KubeArmor is used to protect the pod runtime. KubeArmor leverages Linux Security Modules (LSMs) such as AppArmor or BPF-LSM to prevent \u0026ldquo;breakout\u0026rdquo; attacks or the execution of unauthorized files within containers.46 Combining a minimal operating system like Talos with an enforcement engine like KubeArmor extends a \u0026ldquo;Zero Trust\u0026rdquo; approach to every level of the stack.46\nOperational Management Strategies and Conclusions # Security management in Talos Linux requires a mental transition from server administration to API orchestration. 
Common and established practices reflect this need for automation and formal rigor.\nTotal Immutability: Every change must pass through Git and the CI/CD pipeline. The use of talosctl patch must be reserved exclusively for debugging or temporary emergencies, with the obligation to immediately reflect changes in the main YAML manifest.1 Active Certificate Monitoring: Since client certificates are the weak point of the annual lifecycle, it is essential to implement expiration-based alerts (e.g., via Prometheus) to avoid administrative access interruption.9 Secrets Governance: SOPS must be used to encrypt sensitive cluster files, but the private decryption key (age) must be managed with the utmost severity, preferably via an HSM or the cloud provider\u0026rsquo;s secrets management service.18 Hardware Integration: Where possible, enable SecureBoot and TPM to guarantee boot integrity and physical data protection. This transforms the node into a secure, tamper-proof \u0026ldquo;black box\u0026rdquo;.29 Talos Linux, if configured following these practices, offers probably the highest level of security available today for Kubernetes. Its restrictive nature forces DevOps teams to adopt modern and secure workflows by necessity, rather than choice, raising the security standard of the entire organization.1 The choice between SOPS and heavier solutions like Vault should not be seen as mutually exclusive; on the contrary, a mature architecture uses SOPS for infrastructure bootstrap and Vault for dynamic application secrets, getting the best of both worlds.33\nBibliography # Using Talos Linux and Kubernetes bootstrap on OpenStack - Safespring, accessed on January 8, 2026, https://www.safespring.com/blogg/2025/2025-03-talos-linux-on-openstack/ Philosophy - Sidero Documentation - What is Talos Linux?, accessed on January 8, 2026, https://docs.siderolabs.com/talos/v1.9/learn-more/philosophy What is Talos Linux? 
- Sidero Documentation, accessed on January 8, 2026, https://docs.siderolabs.com/talos/v1.12/overview/what-is-talos Talos Linux: Bringing Immutability and Security to Kubernetes Operations - InfoQ, accessed on January 8, 2026, https://www.infoq.com/news/2025/10/talos-linux-kubernetes/ Talos Linux: Kubernetes Important API Management Improvement - Linux Security, accessed on January 8, 2026, https://linuxsecurity.com/features/talos-linux-redefining-kubernetes-security Talos Linux - The Kubernetes Operating System, accessed on January 8, 2026, https://www.talos.dev/ Getting Started - Sidero Documentation - What is Talos Linux?, accessed on January 8, 2026, https://docs.siderolabs.com/talos/v1.9/getting-started/getting-started Role-based access control (RBAC) | TALOS LINUX, accessed on January 8, 2026, https://www.talos.dev/v1.6/talos-guides/configuration/rbac/ How to manage PKI and certificate lifetimes with Talos Linux - Sidero Documentation, accessed on January 8, 2026, https://docs.siderolabs.com/talos/v1.7/security/cert-management Troubleshooting - Sidero Documentation - What is Talos Linux?, accessed on January 8, 2026, https://docs.siderolabs.com/talos/v1.9/troubleshooting/troubleshooting Kubernetes Cluster Reference Architecture with Talos Linux for 2025-05 - Sidero Labs, accessed on January 8, 2026, https://www.siderolabs.com/wp-content/uploads/2025/08/Kubernetes-Cluster-Reference-Architecture-with-Talos-Linux-for-2025-05.pdf Role-based access control (RBAC) - Sidero Documentation - What is Talos Linux?, accessed on January 8, 2026, https://docs.siderolabs.com/talos/v1.9/security/rbac CA Rotation - Sidero Documentation - What is Talos Linux?, accessed on January 8, 2026, https://docs.siderolabs.com/talos/v1.8/security/ca-rotation How to Rotate Certificate Authority - Cozystack, accessed on January 8, 2026, https://cozystack.io/docs/operations/cluster/rotate-ca/ First anniversary and predictably the client certs were all broken : r/TalosLinux - Reddit, 
accessed on January 8, 2026, https://www.reddit.com/r/TalosLinux/comments/1mtss8g/first_anniversary_and_predictably_the_client/ talos package - github.com/siderolabs/talos/pkg/rotate/pki/talos - Go Packages, accessed on January 8, 2026, https://pkg.go.dev/github.com/siderolabs/talos/pkg/rotate/pki/talos A template for deploying a Talos Kubernetes cluster including Flux for GitOps - GitHub, accessed on January 8, 2026, https://github.com/onedr0p/cluster-template Building a Secure and Efficient GitOps Pipeline with SOPS | by Platform Engineers - Medium, accessed on January 8, 2026, https://medium.com/@platform.engineers/building-a-secure-and-efficient-gitops-pipeline-with-sops-44ca1a4e505f Doing Secrets The GitOps Way | Mircea Anton, accessed on January 8, 2026, https://mirceanton.com/posts/doing-secrets-the-gitops-way/ Mozilla SOPS - K8s Security, accessed on January 8, 2026, https://k8s-security.geek-kb.com/docs/best_practices/cluster_setup_and_hardening/secrets_management/mozilla_sops/ Best Secrets Management Tools for 2026 - Cycode, accessed on January 8, 2026, https://cycode.com/blog/best-secrets-management-tools/ Guides - Talhelper, accessed on January 8, 2026, https://budimanjojo.github.io/talhelper/latest/guides/ cozystack/talm: Manage Talos Linux the GitOps Way! 
- GitHub, accessed on January 8, 2026, https://github.com/cozystack/talm joeypiccola/k8s_home - GitHub, accessed on January 8, 2026, https://github.com/joeypiccola/k8s_home Talhelper, accessed on January 8, 2026, https://budimanjojo.github.io/talhelper/ Kubernetes CI/CD Pipelines – 8 Best Practices and Tools - Spacelift, accessed on January 8, 2026, https://spacelift.io/blog/kubernetes-ci-cd Manage your secrets in Git with SOPS \u0026amp; GitLab CI - DEV Community, accessed on January 8, 2026, https://dev.to/stack-labs/manage-your-secrets-in-git-with-sops-gitlab-ci-2jnd Best practices for continuous integration and delivery to Google Kubernetes Engine, accessed on January 8, 2026, https://docs.cloud.google.com/kubernetes-engine/docs/concepts/best-practices-continuous-integration-delivery-kubernetes Disk Encryption - Sidero Documentation - What is Talos Linux?, accessed on January 8, 2026, https://docs.siderolabs.com/talos/v1.8/configure-your-talos-cluster/storage-and-disk-management/disk-encryption talos_machine_configuration_ap, accessed on January 8, 2026, https://registry.terraform.io/providers/siderolabs/talos/0.1.0-alpha.11/docs/resources/machine_configuration_apply Automatically regenerate Tailscale TLS certs using systemd timers - STFN, accessed on January 8, 2026, https://stfn.pl/blog/78-tailscale-certs-renew/ CI/CD Pipeline Security Best Practices: The Ultimate Guide - Wiz, accessed on January 8, 2026, https://www.wiz.io/academy/application-security/ci-cd-security-best-practices Secrets Management in Kubernetes: Native Tools vs HashiCorp Vault - PufferSoft, accessed on January 8, 2026, https://puffersoft.com/secrets-management-in-kubernetes-native-tools-vs-hashicorp-vault/ Open Source Secrets Management for DevOps in 2025 - Infisical, accessed on January 8, 2026, https://infisical.com/blog/open-source-secrets-management-devops Secrets Management: Vault, AWS Secrets Manager, or SOPS? 
- DEV Community, accessed on January 8, 2026, https://dev.to/instadevops/secrets-management-vault-aws-secrets-manager-or-sops-2ce1 Top-10 Secrets Management Tools in 2025 - Infisical, accessed on January 8, 2026, https://infisical.com/blog/best-secret-management-tools Comparison between Hashicorp Vault Agent Injector and External Secrets Operator, accessed on January 8, 2026, https://unparagonedwisdom.medium.com/comparison-between-hashicorp-vault-agent-injector-and-external-secrets-operator-c3cabd89afca Unlocking Secrets with External Secrets Operator - DEV Community, accessed on January 8, 2026, https://dev.to/hkhelil/unlocking-secrets-with-external-secrets-operator-2f89 List Of Secrets Management Tools For Kubernetes In 2025 - Techiescamp, accessed on January 8, 2026, https://blog.techiescamp.com/secrets-management-tools/ Kubernetes integrations comparison | Vault - HashiCorp Developer, accessed on January 8, 2026, https://developer.hashicorp.com/vault/docs/deploy/kubernetes/comparisons getsops/sops: Simple and flexible tool for managing secrets - GitHub, accessed on January 8, 2026, https://github.com/getsops/sops Building an IPv6-Only Kubernetes Cluster with Talos and talhelper - DevOps Diaries, accessed on January 8, 2026, https://blog.spanagiot.gr/posts/talos-ipv6-only-cluster/ Omni KMS Disk Encryption - Sidero Documentation - What is Talos Linux?, accessed on January 8, 2026, https://docs.siderolabs.com/omni/security-and-authentication/omni-kms-disk-encryption Installing Cilium and Multus on Talos OS for Advanced Kubernetes Networking, accessed on January 8, 2026, https://www.itguyjournals.com/installing-cilium-and-multus-on-talos-os-for-advanced-kubernetes-networking/ Kubernetes \u0026amp; Talos - Reddit, accessed on January 8, 2026, https://www.reddit.com/r/kubernetes/comments/1hs6bui/kubernetes_talos/ Talos Linux And KubeArmor Integration [2025 Edition] - AccuKnox, accessed on January 8, 2026, https://accuknox.com/technical-papers/talos-os-protection 
Kubernetes Best Practices in 2025: Scaling, Security, and Cost Optimization - KodeKloud, accessed on January 8, 2026, https://kodekloud.com/blog/kubernetes-best-practices-2025/ Talos Linux is powerful. But do you need more? - Sidero Labs, accessed on January 8, 2026, https://www.siderolabs.com/blog/do-you-need-omni/ ","date":"8 January 2026","externalUrl":null,"permalink":"/guides/talos-linux-security-secrets/","section":"Guides","summary":"","title":"Security and Lifecycle Management in Kubernetes on Talos Linux: Architectures, PKI, and Secrecy Strategies","type":"guides"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/it/tags/sicurezza/","section":"Tags","summary":"","title":"Sicurezza","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/tags/statefulset/","section":"Tags","summary":"","title":"Statefulset","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/tags/storage/","section":"Tags","summary":"","title":"Storage","type":"tags"},{"content":"The evolution of container orchestration has radically transformed the paradigm of state management in distributed applications. 
Within the Kubernetes ecosystem, storage management no longer represents a simple infrastructure accessory, but constitutes the critical foundation upon which the reliability of enterprise applications rests.1 Although containers were originally conceived as ephemeral and stateless entities, the operational reality of modern workloads requires that data survive not only the crashes of individual processes, but also the rescheduling of Pods across different nodes of the cluster.3 This technical analysis explores in depth the taxonomy of Kubernetes volumes, abstraction mechanisms, advanced YAML configurations, and optimization strategies for complex production scenarios.\nAnalysis of the YAML format and declarative orchestration # Before delving into storage specifics, it is essential to understand the primary communication tool of Kubernetes: the YAML format (YAML Ain\u0026rsquo;t Markup Language). The choice of this serialization format is not accidental; it responds to the need for a human-readable syntax that allows defining the desired state of the infrastructure in a declarative way.6 YAML excels in representing complex hierarchical data structures, fundamental for describing relationships between storage components and workloads.6\nYAML syntax is based on key-value pairs and lists, where indentation (strictly performed with spaces and never with tabs) determines the hierarchy of elements.6 This structure is vital for defining volume specifications within Pod manifests. 
For example, the use of anchors (\u0026amp;) and aliases (*) in YAML allows reducing duplication in similar storage configurations, improving the maintainability of complex configuration files.6 Kubernetes validates submitted manifests against its own API schemas, so malformed storage definitions are rejected before they are applied to the cluster.6\nTaxonomy and lifecycle of volumes # A volume in Kubernetes is fundamentally a directory accessible to containers within a Pod, whose nature, content, and lifecycle are determined by the specific volume type used.5 Kubernetes solves two fundamental challenges: data persistence beyond a container crash (since upon restart the container starts from a clean state) and file sharing among multiple containers residing in the same Pod.5\nClassification by persistence: ephemeral and persistent volumes # The primary distinction in the Kubernetes storage system concerns the link between the life of the volume and that of the Pod.3\nFeature | Ephemeral Volumes | Persistent Volumes\nLifespan | Coincides with the life of the Pod.3 | Independent of the life of the Pod.3\nPersistence post-container restart | Data is maintained across restarts.5 | Data is maintained across restarts.8\nPersistence post-Pod deletion | Data is destroyed.3 | Data persists in external storage.3\nCommon examples | emptyDir, ConfigMap, Secret, downwardAPI.3 | PersistentVolume, NFS, Azure Disk, AWS EBS.13\nEphemeral volumes are ideal for scenarios requiring scratch space, temporary caches, or configuration injection.5 Conversely, persistent volumes are essential for stateful applications like databases, where the loss of the Pod must not entail the loss of information.4\nDeep dive on ephemeral volumes: emptyDir and hostPath # The emptyDir volume type is created when a Pod is assigned to a node and exists for as long as the Pod is running on that node.3 Initially empty, it allows all containers in the Pod to read and write in the same space.5 An advanced 
configuration involves using memory (RAM) as a backend for emptyDir by setting the medium field to Memory, which is useful for very high-performance caches but consumes the node\u0026rsquo;s RAM quota.2\nThe hostPath volume, on the other hand, mounts a file or directory from the host\u0026rsquo;s filesystem directly into the Pod.3 This type is particularly useful for system workloads that need to monitor the node, such as log agents reading /var/log.3 However, it presents significant security risks by exposing the host\u0026rsquo;s filesystem and compromises portability, as the Pod becomes dependent on files present on a specific node.3\nProjection mechanisms: ConfigMap and Secret # Kubernetes uses special volumes to inject configuration data and secrets.15 Unlike using environment variables, mounting ConfigMap and Secret as volumes allows for dynamic updating of files within the container without having to restart the process, thanks to the atomic link update mechanism managed by the Kubelet.16 This approach is fundamental for modern microservices architectures that require hot reloads of configuration.16\nAn important technical detail concerns the use of subPath. 
While subPath allows mounting a single file from a volume into a specific folder of the container without overwriting the entire destination directory, files mounted via this technique do not benefit from automatic updates when the source resource changes in the cluster.5\nThe abstraction model: PersistentVolume and PersistentVolumeClaim # To manage persistent storage in a scalable and infrastructure-agnostic way, Kubernetes introduces three key concepts: PersistentVolume (PV), PersistentVolumeClaim (PVC), and StorageClass (SC).13\nDefinition and responsibility # A PersistentVolume is a piece of storage in the cluster, a cluster-level resource in the same way that a node is.14 It captures the details of the storage implementation (whether NFS, iSCSI, or specific cloud provider storage).19 Conversely, a PersistentVolumeClaim represents the request for storage by the user, specifying size and access modes without needing to know the backend details.12\nThe lifecycle of these resources follows four distinct phases:\nProvisioning: Storage can be created statically by an administrator or dynamically via a StorageClass.13\nBinding: Kubernetes monitors new PVCs and looks for a matching PV. Once found, the PV and PVC are bound in an exclusive 1-to-1 relationship.12\nUsing: The Pod uses the PVC as if it were a local volume. The cluster inspects the claim to find the bound volume and mounts it into the container\u0026rsquo;s filesystem.12\nReclaiming: When the user has finished using the volume and deletes the PVC, the reclaim policy defines what happens to the PV.13\nAnalysis of Reclaim Policies # Data management post-usage is critical for security and compliance. Three main policies exist:10\nRetain: The PV remains intact after PVC deletion. The administrator must manually handle cleaning or reusing the volume.10\nDelete: The physical volume and the associated PV are automatically deleted. 
This is the standard behavior for dynamic storage in cloud environments.13\nRecycle: Performs a basic scrub of the volume (deleting its files) and makes it available for new claims. This policy is deprecated in favor of dynamic provisioning.13\nStorageClass and Dynamic Provisioning # Dynamic provisioning represents a milestone in Kubernetes automation, eliminating the need for administrators to manually pre-create volumes.14 Through the StorageClass object, it is possible to define different storage tiers (e.g., \u0026ldquo;fast\u0026rdquo; for SSD, \u0026ldquo;slow\u0026rdquo; for HDD) and delegate to Kubernetes the on-demand creation of the physical volume via the relevant provisioner.25\nCloud Provider | Provisioner (CSI) | Example Parameters | Operational Notes\nAWS | ebs.csi.aws.com | type: gp3, iops: 3000 | Supports online expansion.27\nAzure | disk.csi.azure.com | storageaccounttype: Premium_LRS | Requires RWO type PVC.29\nGCP | pd.csi.storage.gke.io | type: pd-balanced | Supports snapshots via CSI.26\nSetting the volumeBindingMode: WaitForFirstConsumer field on a StorageClass is a fundamental best practice in multi-zone environments.24 It instructs the cluster to wait for Pod scheduling before creating the volume, ensuring storage is allocated in the same availability zone where the Pod is actually running and avoiding cross-zone mount errors.2\nAccess Modes and Application Scenarios # Correct selection of the access mode (AccessMode) is decisive for the stability of stateful applications.1\nReadWriteOnce (RWO): The volume can be mounted as read-write by a single node. It is the ideal mode for databases like MySQL or PostgreSQL that require exclusivity to guarantee data integrity.1\nReadOnlyMany (ROX): Many nodes can mount the volume simultaneously but only in read-only mode. This scenario is typical for distributing static content (e.g., an /html folder for an Nginx cluster).1\nReadWriteMany (RWX): Many nodes can read and write simultaneously. 
This mode is supported by systems like NFS or Azure Files and is useful for applications sharing a common state, although it requires attention to avoid corruption due to overlapping writes.1\nReadWriteOncePod (RWOP): Stable since Kubernetes v1.29, it guarantees that only a single Pod in the entire cluster can access the volume, offering a higher security level than RWO (which limits access at the node level).1\nArchitecture of Stateful Workloads: StatefulSet # Data management in Kubernetes culminates in the use of the StatefulSet, the API object designed to manage applications requiring persistent identities and stable storage.18 Unlike Deployments, where Pods are interchangeable, in a StatefulSet each Pod receives an ordinal index (0, 1, 2\u0026hellip;) that it maintains throughout its existence.18\nThe role of volumeClaimTemplates # The strength of the StatefulSet is the volumeClaimTemplates field.18 Instead of sharing a single PVC among all Pods, the StatefulSet automatically generates a unique PVC for each instance.18 If Pod db-1 is deleted and rescheduled, Kubernetes will reattach exactly the data-db-1 PVC to that new instance, so the database keeps its data across rescheduling.18\nPractical Example: Resilient PostgreSQL Architecture # When implementing a PostgreSQL database, it is fundamental to use a Headless Service (with clusterIP: None) to provide stable DNS names (e.g., postgresql-0.postgresql.namespace.svc.cluster.local) allowing communication between primary and replicas.18\nYAML\napiVersion: apps/v1\nkind: StatefulSet\nmetadata:\n  name: postgresql\nspec:\n  serviceName: \u0026#34;postgresql\u0026#34;\n  replicas: 3\n  selector:\n    matchLabels:\n      app: postgres\n  template:\n    metadata:\n      labels:\n        app: postgres\n    spec:\n      containers:\n        - name: postgres\n          image: postgres:15\n          volumeMounts:\n            - name: pgdata\n              mountPath: /var/lib/postgresql/data\n  volumeClaimTemplates:\n    - metadata:\n        name: pgdata\n      spec:\n        accessModes: [\u0026#34;ReadWriteOnce\u0026#34;]\n        storageClassName: \u0026#34;managed-csi\u0026#34;\n        resources:\n          requests:\n            storage: 100Gi\nIn this scenario, Kubernetes manages the 
order of creation and termination of Pods: each Pod is started only after the previous ordinal is Running and Ready, so replicas come up only after the first instance (typically the primary), minimizing the risk of inconsistencies during cluster bootstrap.33\nContainer Storage Interface (CSI) and Storage Evolution # The Container Storage Interface (CSI) represents the modern standard for storage integration in Kubernetes, having replaced the old \u0026ldquo;in-tree\u0026rdquo; drivers (compiled directly into the Kubernetes code).37 CSI allows storage vendors to develop drivers independent of the Kubernetes release cycle, fostering innovation and core stability.37\nCSI Driver Architecture # A CSI driver operates through two main components:37\nController Plugin: Manages high-level operations such as creation, deletion, and attachment of volumes to physical nodes.37 It is typically supported by sidecar containers like external-provisioner and external-attacher.38\nNode Plugin: Running on every node (usually as a DaemonSet), it is responsible for the actual mounting and unmounting of the volume in the container\u0026rsquo;s filesystem, in response to gRPC calls issued by the kubelet.37\nThis architecture allows advanced functionalities like volume resizing without interruptions and monitoring storage health directly via the Kubernetes API.5\nPerformance Tuning and Optimization # Performance optimization requires a balance between IOPS, throughput, and latency.2\nStorage Parameters and Tiers # Organizations should define different storage classes based on workload requirements.1 For high-performance databases, using NVMe over TCP volumes or premium SSDs with configurable throughput is essential.1\nTo calculate necessary performance, one can refer to throughput density. 
For example, on Google Cloud Hyperdisk, throughput must be sized in proportion to capacity:\n$$\\text{Minimum throughput} = 10 \\text{ MiB/s per TiB of capacity}$$\nThe upper limit is set at 600 MiB/s per volume.30\nVolumeAttributesClass (VAC) # One of the most recent innovations (beta in v1.31) is the VolumeAttributesClass (VAC).22 It allows dynamically modifying volume performance parameters (such as IOPS or throughput) without having to recreate the PVC or PV, eliminating the downtime that was previously necessary to migrate between different storage classes.28 This is particularly useful for managing seasonal traffic peaks where temporarily increasing database speed is required.28\nSecurity and Access Management # Protection of data at rest and in transit is a non-negotiable requirement in enterprise environments.1\nEncryption and RBAC # It is fundamental to enable encryption at rest provided by the backend storage.1 Furthermore, access to PVCs must be regulated via Role-Based Access Control (RBAC), ensuring that only authorized users and ServiceAccounts can manipulate storage resources.15\nFilesystem Permissions and fsGroup # Many \u0026ldquo;Permission Denied\u0026rdquo; issues in Pods stem from misalignments between the user running the container and the mounted volume\u0026rsquo;s permissions.39 Kubernetes resolves this problem through the securityContext. 
Using the fsGroup parameter, Kubernetes automatically applies ownership of the specified group to all files within the volume at mount time, ensuring that processes in the container can write data without manual chmod or chown interventions.5\nYAML\nspec: securityContext: fsGroup: 2000 fsGroupChangePolicy: \u0026#34;OnRootMismatch\u0026#34; The OnRootMismatch setting optimizes startup times for Pods mounting very large volumes, avoiding recursively scanning all files if the root directory already has correct permissions.5\nBackup, Snapshot, and Disaster Recovery # Persistence alone does not guarantee protection against accidental deletion or data corruption.40 It is essential to implement a solid backup strategy.40\nCSI Snapshotting Mechanisms # Kubernetes natively supports volume snapshots via the VolumeSnapshot object.22 This mechanism allows creating \u0026ldquo;point-in-time\u0026rdquo; copies of data that can be used to clone volumes or restore a previous state in case of application error.5\nVelero: Enterprise Data Protection # Velero is the open-source standard for Kubernetes backup and restore.40 It offers two main modes:\nCSI Snapshots: Leverages backend storage native capabilities to create fast volume snapshots.41 File System Backup (FSB): Uses tools like Restic or Kopia to perform file-level backups, ideal when the CSI driver does not support snapshots or when moving data to a different object storage (off-site backup).41 An advanced best practice involves adopting the \u0026ldquo;CSI Snapshot Data Movement Mode\u0026rdquo;, which combines the speed of hardware snapshots with the security of data transfer to an external repository, ensuring backup accessibility even in case of total primary cluster destruction.41\nConclusions: Towards a Flexible Data Infrastructure # Storage management in Kubernetes has matured from an accessory necessity to a highly sophisticated abstraction ecosystem.1 Understanding the distinction between ephemeral and persistent 
volumes, coupled with mastery of the PV/PVC/StorageClass model, allows engineers to design systems that not only survive failures but can scale dynamically to respond to business needs.2\nThe future of cloud-native storage is oriented towards greater intelligence of CSI drivers, with auto-tuning performance capabilities and increasingly deep integration with security policies.28 For organizations operating critical workloads, the key to success lies in adopting open standards, automating provisioning via SC, and rigorously validating backup processes, transforming storage from a potential bottleneck to a catalyst for technological innovation.27\nBibliography # Kubernetes Persistent Volumes - Best Practices \u0026amp; Guide | simplyblock, accessed on January 8, 2026, https://www.simplyblock.io/blog/kubernetes-persistent-volumes-how-to-best-practices/ Kubernetes Performance Tuning Guide: Optimize Your K8s Cluster - Kubegrade, accessed on January 8, 2026, https://kubegrade.com/kubernetes-performance-tuning-guide/ Kubernetes Volumes Explained: Use Cases \u0026amp; Best Practices - Groundcover, accessed on January 8, 2026, https://www.groundcover.com/learn/storage/kubernetes-volumes Kubernetes persistent vs ephemeral storage volumes and their uses - StarWind, accessed on January 8, 2026, https://www.starwindsoftware.com/blog/kubernetes-persistent-vs-ephemeral-storage-volumes-and-their-uses/ Volumes | Kubernetes, accessed on January 8, 2026, https://kubernetes.io/docs/concepts/storage/volumes/ YAML in detail: complete guide to the serialization format - Codegrind, accessed on January 8, 2026, https://codegrind.it/blog/yaml-spiegato YAML: The Ultimate Guide with Examples and Best Practices | by Mahalingam SRE, accessed on January 8, 2026, https://medium.com/@lingeshcbz/yaml-the-ultimate-guide-with-examples-and-best-practices-7040f9e389ed Kubernetes Volumes and How To Use Them – ReviewNPrep, accessed on January 8, 2026, 
https://reviewnprep.com/blog/kubernetes-volumes-and-how-to-use-them/ Ephemeral Volumes - Kubernetes, accessed on January 8, 2026, https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/ What Is a Kubernetes Persistent Volume? - Pure Storage, accessed on January 8, 2026, https://www.purestorage.com/knowledge/what-is-kubernetes-persistent-volume.html Ephemeral Storage in Kubernetes: Overview \u0026amp; Guide - Portworx, accessed on January 8, 2026, https://portworx.com/knowledge-hub/ephemeral-storage-in-kubernetes-overview-guide/ Persistent Volume Claim (PVC) in Kubernetes: Guide - Portworx, accessed on January 8, 2026, https://portworx.com/tutorial-kubernetes-persistent-volumes/ What is a Kubernetes persistent volume? - Pure Storage, accessed on January 8, 2026, https://www.purestorage.com/it/knowledge/what-is-kubernetes-persistent-volume.html Kubernetes Persistent Volume: Examples \u0026amp; Best Practices - vCluster, accessed on January 8, 2026, https://www.vcluster.com/blog/kubernetes-persistent-volume In-Depth Guide to Kubernetes ConfigMap \u0026amp; Secret Management Strategies - Gravitee, accessed on January 8, 2026, https://www.gravitee.io/blog/kubernetes-configurations-secrets-configmaps Kubernetes ConfigMaps and Secrets Part 2 | by Sandeep Dinesh | Google Cloud - Medium, accessed on January 8, 2026, https://medium.com/google-cloud/kubernetes-configmaps-and-secrets-part-2-3dc37111f0dc Mounting ConfigMaps and Secrets as files - DuploCloud Documentation, accessed on January 8, 2026, https://docs.duplocloud.com/docs/automation-platform/kubernetes-overview/configs-and-secrets/mounting-config-as-files Run a Replicated Stateful Application | Kubernetes, accessed on January 8, 2026, https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/ Kubernetes Persistent Volumes and the PV Lifecycle - NetApp, accessed on January 8, 2026, https://www.netapp.com/learn/kubernetes-persistent-storage-why-where-and-how/ How to manage Kubernetes 
storage access modes - LabEx, accessed on January 8, 2026, https://labex.io/tutorials/kubernetes-how-to-manage-kubernetes-storage-access-modes-419137 Persistent Volumes - Kubernetes, accessed on January 8, 2026, https://kubernetes.io/docs/concepts/storage/persistent-volumes/ Kubernetes PVC Guide: Best Practices \u0026amp; Troubleshooting - Plural, accessed on January 8, 2026, https://www.plural.sh/blog/kubernetes-pvc-guide/ Kubernetes Persistent Volumes - Tutorial and Examples - Spacelift, accessed on January 8, 2026, https://spacelift.io/blog/kubernetes-persistent-volumes Kubernetes Persistent Volume Claims: Tutorial \u0026amp; Top Tips - Groundcover, accessed on January 8, 2026, https://www.groundcover.com/blog/kubernetes-pvc Dynamic Provisioning and Storage Classes in Kubernetes, accessed on January 8, 2026, https://kubernetes.io/blog/2017/03/dynamic-provisioning-and-storage-classes-kubernetes/ Dynamic Volume Provisioning | Kubernetes, accessed on January 8, 2026, https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/ Kubernetes StorageClass: A technical Guide | by Fortismanuel - Medium, accessed on January 8, 2026, https://medium.com/@fortismanuel/kubernetes-storageclass-a-technical-guide-58cfb28619ee Modify Amazon EBS volumes on Kubernetes with Volume Attributes Classes | Containers, accessed on January 8, 2026, https://aws.amazon.com/blogs/containers/modify-amazon-ebs-volumes-on-kubernetes-with-volume-attributes-classes/ Create a persistent volume with Azure Disks in the service \u0026hellip;, accessed on January 8, 2026, https://learn.microsoft.com/it-it/azure/aks/azure-csi-disk-storage-provision Scale your storage performance with Hyperdisk | Google Kubernetes Engine (GKE), accessed on January 8, 2026, https://docs.cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/hyperdisk Optimizing Persistent Storage in Kubernetes - Astuto AI, accessed on January 8, 2026, https://www.astuto.ai/blogs/optimizing-persistent-storage-in-kubernetes 
Using NFS as External Storage in Kubernetes with PersistentVolume and PersistentVolumeClaim to Deploy Nginx | by Bshreyasharma | Medium, accessed on January 8, 2026, https://medium.com/@bshreyasharma1/using-nfs-as-external-storage-in-kubernetes-with-persistentvolume-and-persistentvolumeclaim-to-112994f3ad59 StatefulSets - Kubernetes, accessed on January 8, 2026, https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/ Guide to Kubernetes StatefulSet – When to Use It and Examples - Spacelift, accessed on January 8, 2026, https://spacelift.io/blog/kubernetes-statefulset Kubernetes StatefulSet - Examples \u0026amp; Best Practices - vCluster, accessed on January 8, 2026, https://www.vcluster.com/blog/kubernetes-statefulset-examples-and-best-practices Deploying the PostgreSQL Pod on Kubernetes with StatefulSets - Nutanix Support Portal, accessed on January 8, 2026, https://portal.nutanix.com/page/documents/solutions/details?targetId=TN-2192-Deploying-PostgreSQL-Nutanix-Data-Services-Kubernetes:deploying-the-postgresql-pod-on-kubernetes-with-statefulsets.html How the CSI (Container Storage Interface) Works - simplyblock, accessed on January 8, 2026, https://www.simplyblock.io/blog/how-the-csi-container-storage-interface-works/ Container Storage Interface (CSI) for Kubernetes GA | Kubernetes, accessed on January 8, 2026, https://kubernetes.io/blog/2019/01/15/container-storage-interface-ga/ Configure a Pod to Use a PersistentVolume for Storage - Kubernetes, accessed on January 8, 2026, https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/ Chapter 6: Backups - Kubernetes Guides - Apptio, accessed on January 8, 2026, https://www.apptio.com/topics/kubernetes/best-practices/backups/ Kubernetes Backup using Velero - Afi.ai, accessed on January 8, 2026, https://afi.ai/blog/kubernetes-velero-backup Snapshot Backups with Velero - MSR Documentation, accessed on January 8, 2026, 
https://docs.mirantis.com/msr/4.13/backup/ha-backup/snapshot-backups-with-velero/ Velero Backup and Restore using Replicated PV Mayastor Snapshots - Raw Block Volumes, accessed on January 8, 2026, https://openebs.io/docs/Solutioning/backup-and-restore/velerobrrbv File System Backup - Velero Docs, accessed on January 8, 2026, https://velero.io/docs/v1.17/file-system-backup/ ","date":"8 January 2026","externalUrl":null,"permalink":"/guides/kubernetes-volumes-guide/","section":"Guides","summary":"","title":"Strategies and architectures for storage management in Kubernetes: technical analysis of volumes, persistence, and cloud-native operations","type":"guides"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/tags/volumes/","section":"Tags","summary":"","title":"Volumes","type":"tags"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/it/tags/volumi/","section":"Tags","summary":"","title":"Volumi","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/aws-s3/","section":"Tags","summary":"","title":"Aws-S3","type":"tags"},{"content":" From Persistence to Resilience: Orchestrating Longhorn Backups on AWS S3 in a Talos Linux Environment # Introduction: The Local Availability Paradox # In recent weeks, my Homelab based on Talos Linux and virtualized on Proxmox has reached a remarkable level of operational stability. Core services like Traefik and the Hugo blog run without interruption, and networking has been hardened through static IP assignment to nodes. However, analyzing the architecture with a critical eye, a fundamental vulnerability emerged: the confusion between High Availability (HA) and Disaster Recovery (DR).\nLonghorn, the distributed storage engine I chose for this cluster, excels at synchronous data replication. By configuring a replicaCount: 2, every block written to disk is instantly duplicated on a second node. 
This protects me if a single node fails or a disk becomes corrupted. But what would happen if a configuration error deleted the traefik namespace? Or if a catastrophic failure of the physical Proxmox hardware rendered both virtual nodes inaccessible? The answer is unacceptable for an environment aiming to be \u0026ldquo;Production Grade\u0026rdquo;: total data loss.\nThe goal of today\u0026rsquo;s session was to bridge this gap by implementing an automated offsite backup strategy, using AWS S3 as a remote target and managing the entire configuration according to Infrastructure as Code (IaC) principles. What was supposed to be a simple parameter configuration turned into a complex operation involving software upgrades and declarative definition refactoring.\nPhase 1: Security Foundations and Secrets Management # Before touching Kubernetes, I had to prepare the ground on AWS. The guiding principle in this context is the Principle of Least Privilege (PoLP). It is not acceptable to use root account credentials or an administrator user for an automated backup process. If those keys were compromised, the entire AWS account would be at risk.\nIAM Identity and Bucket Creation # I created a dedicated S3 bucket in the eu-central-1 (Frankfurt) region, chosen to minimize latency with my laboratory in Europe. Subsequently, I configured a technical IAM user, longhorn-backup-user, associating it with a restrictive JSON policy. 
This policy exclusively grants the permissions necessary to read and write objects in that specific bucket, denying access to any other cloud resource.\n{ \u0026#34;Version\u0026#34;: \u0026#34;2012-10-17\u0026#34;, \u0026#34;Statement\u0026#34;: [ { \u0026#34;Effect\u0026#34;: \u0026#34;Allow\u0026#34;, \u0026#34;Action\u0026#34;: [ \u0026#34;s3:PutObject\u0026#34;, \u0026#34;s3:GetObject\u0026#34;, \u0026#34;s3:ListBucket\u0026#34;, \u0026#34;s3:DeleteObject\u0026#34;, \u0026#34;s3:GetBucketLocation\u0026#34; ], \u0026#34;Resource\u0026#34;: [ \u0026#34;arn:aws:s3:::tazlab-longhorn\u0026#34;, \u0026#34;arn:aws:s3:::tazlab-longhorn/*\u0026#34; ] } ] } Secrets Encryption with SOPS # The next step involved how to bring these credentials (Access Key and Secret Key) inside the Kubernetes cluster. The naive approach would have been to create the Secret manually with kubectl create secret or, worse, commit a YAML file with the keys in clear text to the Git repository.\nI opted for SOPS (Secrets OPerationS) combined with Age for asymmetric encryption. This workflow allows for versioning secret files in the Git repository in encrypted format. Only those who possess the Age private key (in my case, present on my management workstation) can decrypt the file at the time of application.\nThe generated aws-secrets.enc.yaml file contains only the metadata in clear text, while the stringData payload is an incomprehensible encrypted block. The application to the cluster occurred through a just-in-time decryption pipeline:\nsops --decrypt aws-secrets.enc.yaml | kubectl apply -f - This method ensures that a clear text file never exists on the hard drive that could be inadvertently committed or exposed.\nPhase 2: The Upgrade Odyssey (Longhorn 1.8 -\u0026gt; 1.10) # To take advantage of the latest backup management and StorageClass features, I decided to upgrade Longhorn from version 1.8.0 to the current version 1.10.1. 
Here I encountered the (justified) rigidity of stateful systems.\nThe Pre-Upgrade Hook Block # Launching a direct helm upgrade to version 1.10.1, the process failed instantly. The pre-upgrade job logs reported an unequivocal message:\nfailed to upgrade since upgrading from v1.8.0 to v1.10.1 for minor version is not supported\nThis error highlights a critical difference between stateless applications (like an Nginx web server) and stateful applications (like a storage engine). A stateless application can skip versions at will. A storage engine manages data structures on disk and metadata formats that evolve over time. Longhorn requires that each \u0026ldquo;minor\u0026rdquo; version update (the second number in semantic versioning) be performed sequentially to allow database migration jobs to convert data safely.\nIncremental Mitigation Strategy # I had to adopt a stepped approach, manually simulating the software lifecycle I should have followed if I had maintained the cluster updated regularly.\nStep 1: Upgrade to v1.9.2. I forced Helm to install the latest patch of the 1.9 series. This allowed Longhorn to migrate its CRDs (Custom Resource Definitions) and internal formats. I waited for all longhorn-manager pods to return to Running and complete (2/2) status. Step 2: Upgrade to v1.10.1. Only after validating the cluster health on 1.9 did I launch the final update. This procedure required time and patience, monitoring logs to ensure volumes were not disconnected or corrupted during daemon restarts. It is a reminder that maintenance in the Kubernetes sphere is never a simple \u0026ldquo;set and forget\u0026rdquo; operation.\nPhase 3: The Battle for Declarative Configuration (IaC) # Once the software was updated, the real problem emerged in the attempt to configure the BackupTarget (the S3 URL) declaratively. 
My intention was to define everything in the longhorn-values.yaml file passed to Helm, to avoid manual configurations via the web UI.\nThe Limit of defaultSettings # I inserted the configurations into the defaultSettings block of the Helm chart:\ndefaultSettings: backupTarget: \u0026#34;s3://tazlab-longhorn@eu-central-1/\u0026#34; backupTargetCredentialSecret: \u0026#34;aws-backup-secret\u0026#34; However, after application, the configuration in Longhorn remained empty. Analyzing the documentation and chart behavior, I rediscovered a technical detail often overlooked: Longhorn applies defaultSettings only during the first installation. If the Longhorn cluster is already initialized, these values are ignored to prevent overwriting configurations that the administrator might have changed at runtime.\nThe Failure of the Declared Imperative Approach # I attempted to bypass the problem by creating YAML manifests for Setting type objects (e.g., settings.longhorn.io), hoping Kubernetes would force the configuration. The result was a rejection by the Longhorn Validating Webhook:\nadmission webhook \u0026quot;validator.longhorn.io\u0026quot; denied the request: setting backup-target is not supported\nThis cryptic error hid an architectural change introduced in recent versions. The backup-target setting is no longer a simple global key-value managed via the Setting object, but has been promoted to a dedicated CRD called BackupTarget. Attempting to configure it as an old setting generated a validation error because the key no longer existed in the simple settings schema.\nThe \u0026ldquo;Tabula Rasa\u0026rdquo; Solution # Faced with a cluster state misaligned with the code (Configuration Drift) and the impossibility of reconciling it cleanly due to residues from previous versions, I made a drastic but necessary decision: the complete uninstallation of the Longhorn control plane.\nIt is essential to distinguish between deleting the control software and deleting the data. 
By uninstalling Longhorn (helm uninstall), I removed the Pods, Services, and DaemonSets. However, the physical data on the disks (/var/lib/longhorn on the nodes) and the Persistent Volume definitions in Kubernetes remained intact.\nReinstalling Longhorn v1.10.1 from scratch with the correct values.yaml file, the system read the defaultSettings as if it were a new installation, correctly applying the S3 configuration from the very first boot. Upon restart, the managers scanned the disks, found the existing data, and reconnected the volumes without any data loss. This operation validated not only the configuration but also the intrinsic resilience of Kubernetes\u0026rsquo; decoupled architecture.\nPhase 4: Automation and Backup Strategies # Having a configured backup target does not mean having backups. Without automation, backup depends on human memory, which guarantees failure.\nRecurringJob Implementation # I defined a RecurringJob resource to automate the process. Unlike system cronjobs, these are managed internally by Longhorn and are aware of the volume status.\napiVersion: longhorn.io/v1beta2 kind: RecurringJob metadata: name: nightly-s3-backup spec: cron: \u0026#34;0 3 * * *\u0026#34; task: backup retain: 7 groups: - traefik-only The choice to keep only 7 backups (retain: 7) is a compromise between security and S3 storage costs.\nGranularity via Labels and Groups # Initially, all volumes were in the default group. However, not all data has the same value. The Hugo blog volume contains data that is already versioned on GitHub; the Traefik volume contains private SSL certificates, which are irreplaceable and critical.\nI decided to implement a granular backup strategy:\nI created a custom group traefik-only in the RecurringJob. I applied a specific label to the Traefik volume: recurring-job-group.longhorn.io/traefik-only: enabled. I removed generic labels from other volumes. 
This approach reduces network traffic and storage costs, saving only what is strictly necessary.\nAdvanced StorageClass: Automation at Birth # To close the IaC circle, I created a new dedicated StorageClass: longhorn-traefik-backup.\napiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: longhorn-traefik-backup provisioner: driver.longhorn.io parameters: recurringJobSelector: \u0026#39;[{\u0026#34;name\u0026#34;:\u0026#34;nightly-s3-backup\u0026#34;, \u0026#34;isGroup\u0026#34;:false}]\u0026#39; reclaimPolicy: Retain Using the recurringJobSelector parameter directly in the StorageClass is powerful: any future volume created with this class will automatically inherit the backup policy, without needing manual intervention or subsequent patches. Note that the selector references the job by name, so isGroup is false; to select by group instead, the name would have to be the group (traefik-only) with isGroup set to true. Furthermore, the Retain policy ensures that even if the Traefik PersistentVolumeClaim were accidentally deleted, the underlying volume would remain in the cluster waiting to be reclaimed, preventing accidental deletion of certificates.\nConclusions and Reflections # This work session transformed the cluster\u0026rsquo;s storage layer from simple local persistence to a disaster-resilient enterprise-level solution.\nKey lessons learned:\nNever underestimate stateful upgrades: Version jumps in databases and storage engines require planning and incremental steps. IaC requires discipline: It is easy to solve a problem with kubectl patch, but rebuilding the infrastructure from scratch (as I did by uninstalling Longhorn) is the only way to ensure the code faithfully describes reality. Default vs. Runtime: Understanding when a configuration is applied (init vs. runtime) is crucial for debugging complex Helm charts. The infrastructure is now ready to face the worst. 
The next logical step will be to validate this setup by performing a real Disaster Recovery Test: intentionally destroying a volume and attempting restoration from S3, to transform the \u0026ldquo;hope\u0026rdquo; of backup into the \u0026ldquo;certainty\u0026rdquo; of recovery.\n","date":"7 January 2026","externalUrl":null,"permalink":"/posts/longhorn-s3-backup-talos/","section":"Posts","summary":"","title":"From Persistence to Resilience: Orchestrating Longhorn Backups on AWS S3 in a Talos Linux Environment","type":"posts"},{"content":"The adoption of Talos OS as an operating system for Kubernetes nodes represents a paradigm shift towards immutability, security, and declarative management via API. However, the minimalist nature and the lack of a traditional shell in Talos pose specific challenges when it comes to configuring the high availability (HA) endpoint for the API server and exposing services to the outside. The choice between the native Talos Virtual IP (VIP), kube-vip, and MetalLB is not purely technical, but depends on the cluster scale, latency requirements, and the complexity of the underlying network infrastructure.1 A deep understanding of how these components interact with the Linux kernel and the Kubernetes control plane is essential to implement a load balancing strategy that is resilient and scalable.\nFundamentals of Control Plane High Availability in Talos OS # The heart of a Kubernetes cluster is its control plane, which includes critical components such as etcd, kube-apiserver, kube-scheduler, and kube-controller-manager. 
In Talos OS, these components are executed as static pods managed directly by the kubelet.5 The main challenge in the architecture of an HA cluster consists of providing clients, such as kubectl or worker nodes, with a single stable endpoint (an IP address or a URL) that can reach any available control plane node, ensuring operational continuity even in the event of failure of one or more nodes.1\nTalos OS addresses this challenge through different methodologies, each with different implications in terms of failover speed and load capacity. The most immediate approach is the use of the native VIP integrated into the operating system, but as the external load on the API server increases, the need emerges for more sophisticated solutions such as external load balancers or BGP-based implementations.7\nThe Mechanism of the Native Talos Virtual IP # The native Talos VIP is a built-in feature designed to simplify the creation of HA clusters without requiring external resources like reverse proxies or hardware load balancers.1 This mechanism relies on the contention of the shared IP address among control plane nodes through an election process managed by etcd.1\nFrom an operational perspective, the configuration requires that all control plane nodes share a Layer 2 network. 
The VIP address must be a reserved address and not used within the same subnet as the nodes.1 A crucial aspect of this implementation is that the VIP does not become active until the Kubernetes cluster has been bootstrapped, since its management depends directly on the health state of etcd.1\nNative VIP Characteristic Technical Detail Network Requirement Layer 2 Connectivity (same subnet/switch) Election Mechanism Based on etcd quorum Failover Behavior Almost instant for graceful shutdowns; up to 1 minute for sudden crashes Load Limitation Only one node receives traffic at a time (Active-Passive) Bootstrap Dependency Active only after etcd cluster formation 1\nThe analysis of failover times reveals an important design decision by the creators of Talos. While an orderly disconnection allows for an immediate handover, a sudden failure requires Talos to wait for the etcd election timeout. This delay is intentional and serves to ensure that \u0026ldquo;split-brain\u0026rdquo; scenarios do not occur, where multiple nodes announce the same IP simultaneously, a situation that could corrupt network sessions and destabilize access to the API.1\nKubePrism: The Silent Hero of Internal High Availability # Often confused with external VIP solutions, KubePrism is actually a complementary and distinct feature.8 While the native VIP or kube-vip serve primarily for external access (such as kubectl commands), KubePrism is designed exclusively for internal access to the cluster.7 It creates a local load balancing endpoint on every node of the cluster (usually on localhost:7445), which internal processes like the kubelet use to communicate with the API server.8\nThe importance of KubePrism lies in its ability to abstract the complexity of the control plane from the worker nodes. 
If the external load balancer or the VIP were to fail, KubePrism has an automatic fallback mechanism that allows nodes to continue operating by communicating directly with the control plane nodes.7 In production architectures, it is recommended to keep KubePrism always enabled to ensure that the internal health of the cluster never depends solely on a single external network endpoint.7\nAnalysis of Strategies for Service Load Balancing # Besides access to the API server, managing traffic towards workloads requires the implementation of services of type LoadBalancer. In bare-metal or virtualized environments where Talos is commonly distributed, this functionality is not automatically provided by the cloud provider, making it necessary to install specific controllers like MetalLB or kube-vip.3\nMetalLB: The Standard for Bare-Metal Services # MetalLB is likely the most mature and widespread solution for providing load balancing in on-premise environments.3 It operates by monitoring resources of type Service with spec.type: LoadBalancer and assigning them an IP address from a preconfigured pool.3\nMetalLB supports two main operating modes: Layer 2 and BGP. 
In Layer 2 mode, one of the cluster nodes is elected \u0026ldquo;leader\u0026rdquo; for a given service IP address and responds to ARP requests (for IPv4) or NDP (for IPv6).3 Although extremely simple to configure, this mode presents the limitation of funneling all traffic of a service through a single node, creating a potential bottleneck.4 Conversely, BGP mode allows each node to announce the service IP address to network routers, enabling true load balancing via ECMP (Equal-Cost Multi-Path).4\nKube-vip: Versatility and Unification # Kube-vip stands out for its ability to manage both control plane HA and service load balancing in a single component.2 Unlike the native Talos VIP, kube-vip can be configured to use IPVS (IP Virtual Server) to distribute API server traffic across all control plane nodes in active-active mode, significantly improving performance under high load.14\nKube-vip can run as a static pod, making it ideal for scenarios where the HA endpoint must be available from the very first moments of the cluster bootstrap, even before the etcd database is fully formed.14 However, its configuration as a service load balancer is often considered less feature-rich compared to MetalLB, which offers more granular management of address pools and advertisement policies.16\nComparison of the Three Strategies # Choosing the correct combination of tools depends on the need to balance operational simplicity and scalability. Below is a comparison of the three main strategies.\nStrategy 1: Native Talos VIP with MetalLB # This is the most common and recommended configuration for small to medium-sized clusters (up to 10-20 nodes) in Layer 2 environments.7\nAdvantages: Leverages operating system stability for critical API access and uses MetalLB, which is the industry standard, for application service management. 
The separation of duties makes the system easy to diagnose: API issues are linked to Talos configuration, while application issues are linked to MetalLB.17 Disadvantages: Access to the API server is limited to the capacity of a single node (active-passive), which may not be sufficient for clusters with a very high frequency of API operations (e.g., massive CI/CD environments).7 Strategy 2: Kube-vip without MetalLB # This strategy aims at unifying network functions under a single controller.2\nAdvantages: Reduces the number of components to manage in the cluster. Kube-vip can manage both the API server IP and LoadBalancer service IPs. Supports IPVS for real API balancing.14 Disadvantages: Although versatile, kube-vip can result in being more complex to configure correctly to cover all MetalLB use cases, especially in complex BGP networks. The loss of the kube-vip pod could, in theory, interrupt both access to the control plane and all cluster services simultaneously.16 Strategy 3: Kube-vip with MetalLB # In this configuration, kube-vip is used exclusively for control plane high availability, while MetalLB manages application services.16\nAdvantages: Offers the best performance for the API server (thanks to IPVS or BGP ECMP provided by kube-vip) while maintaining the flexibility of MetalLB for applications.17 It is an excellent choice for enterprise environments where the control plane is under heavy stress. 
Disadvantages: It is the most complex configuration to maintain, requiring the management of two different network controllers that could conflict if not carefully configured (for example, both attempting to listen on BGP port 179).3 Characteristic Native VIP + MetalLB Kube-vip (Only) Kube-vip + MetalLB Complexity Low Medium High API Performance Active-Passive Active-Active (IPVS) Active-Active (IPVS) Service Performance High (L2/BGP) Medium High (L2/BGP) Standardization Very Common Common Professional/Enterprise Recommended Use Homelab / SMB Minimalist Systems High Load Clusters 3\nDifferentiated Strategies by Cluster Size # Cluster sizing is a determining factor for choosing the balancing strategy. What works for a small home server might not be adequate for a distributed data center.\nSmall Clusters and \u0026ldquo;Minecraft\u0026rdquo; Environments # By \u0026ldquo;Minecraft configuration\u0026rdquo; we usually mean a small-sized cluster, often consisting of a single node or a small set of nodes (3 or less), typical of homelab or test environments.21\nIn a single-node cluster, it is fundamental to pay attention to a technical detail of Talos: by default, control plane nodes are labeled to be excluded from external load balancers (node.kubernetes.io/exclude-from-external-load-balancers: \u0026ldquo;\u0026rdquo;).24 In a multi-node cluster, this protects master nodes from application traffic, but in a single-node cluster, it prevents MetalLB or kube-vip from correctly exposing services.24 The solution consists of removing or commenting out this label in the machine configuration.24\nFor these small clusters, the recommendation is absolute simplicity:\nControl Plane: Use the native Talos VIP.7 Services: Use MetalLB in Layer 2 mode.10 Storage: Often coupled with Longhorn for simplicity of management on few nodes.7 Large Clusters (\u0026gt;100 Nodes) # In enterprise-scale clusters, Layer 2 network limitations become evident. 
ARP broadcast traffic for VIP management can degrade network performance, and failover speed based on etcd election might not meet availability requirements.4\nGuidelines from Sidero Labs (the developers of Talos) for high-load clusters suggest moving the responsibility of API server balancing outside the cluster.6 The use of an external load balancer (F5, Netscaler, or a dedicated HAProxy instance) that distributes requests to all healthy control plane nodes is the most resilient option.6 This approach offloads the master nodes\u0026rsquo; CPU from network traffic management and ensures that API access is independent of the internal state of the Kubernetes control plane.7\nFor services, at this scale, the use of BGP mode is imperative.4 MetalLB or Cilium (which offers a native eBPF-based BGP control plane) become the tools of choice.18 Integration with TOR (Top of Rack) routers allows for a truly horizontal traffic distribution, leveraging the physical network infrastructure to ensure scalability.27\nTechnical Analysis of Protocols: ARP vs BGP # The decision between Layer 2 (ARP) and Layer 3 (BGP) is dictated by infrastructure. 
It is fundamental to understand the \u0026ldquo;cost\u0026rdquo; of each choice.\nImplications of Layer 2 and ARP # ARP-based balancing is fundamentally a failover mechanism, not load distribution.12 When MetalLB or kube-vip operate in this mode, they choose one node that responds to all requests for a given IP.3 The advantage is that it works everywhere, even on cheap switches.29 However, in case of leader node failure, a \u0026ldquo;gratuitous\u0026rdquo; ARP packet must be sent to inform other hosts that the MAC address associated with that IP has changed.12 If clients or network routers have persistent ARP caches and ignore gratuitous advertisements, connectivity interruptions of up to 30-60 seconds can occur.12\nImplications of Layer 3 and BGP # BGP transforms Kubernetes nodes into actual routers.13 Each node announces service IP prefixes to a BGP peer (usually the default gateway). This allows ECMP balancing, where the router distributes packets among nodes.4\nHowever, BGP on Kubernetes presents a challenge known as connection \u0026ldquo;churn\u0026rdquo;. 
Since traditional routers are often \u0026ldquo;stateless\u0026rdquo; in their ECMP hashing, when a node is added or removed (e.g., during a Talos upgrade), the router\u0026rsquo;s hashing algorithm might recalculate paths, moving active TCP sessions to different nodes.13 If the new node does not know that session (because traffic was not proxied correctly), the connection will be interrupted with a \u0026ldquo;Connection Reset\u0026rdquo; error.13 To overcome this, it is necessary to use routers that support \u0026ldquo;Resilient ECMP\u0026rdquo; or place services behind an Ingress controller that can manage session persistence at the application level.13\nConfiguration Guide: Details and Warnings # Configuring these strategies in Talos OS requires the use of YAML patches applied to the machine configuration files (machineconfig).\nConfiguring the Native VIP # A common mistake is using the VIP as an endpoint in the talosconfig file.1 Since the VIP depends on the health of etcd and the kube-apiserver, if these components fail, it will not be possible to use talosctl via the VIP to repair the node. 
The correct practice involves inserting the individual physical IP addresses of the master nodes in the endpoint list of talosconfig.6\nConflicts between Kube-vip and MetalLB # If choosing to use kube-vip for the control plane and MetalLB for services, it is vital to set a load balancer class on each Service (the loadBalancerClass field, stable since Kubernetes 1.24) so that each controller handles only its own services.17 Without this distinction, both controllers might attempt to \u0026ldquo;take charge\u0026rdquo; of the same service, leading to a situation of instability where the IP address is continuously assigned and removed.17\nFurthermore, if both components are configured to use BGP, they are likely to conflict over the use of TCP port 179.3 In Talos, a modern solution consists of using Cilium as CNI and entrusting it with the entire BGP control plane, eliminating the need for MetalLB and reducing system complexity.18\nSpecial Use Cases and Troubleshooting # In real installations, undocumented scenarios often emerge requiring specific interventions.\nAsymmetric Routing Issues # When using software load balancers on Talos, the phenomenon of asymmetric routing can occur: the packet enters the cluster via node A (which holds the VIP) but must be delivered to a pod on node B.32 If node B responds directly to the client via its own default gateway, many firewalls will block the traffic considering it an attack or a protocol error.32\nTo mitigate this issue, Talos and MetalLB recommend enabling \u0026ldquo;strict ARP\u0026rdquo; mode in kube-proxy.31 This ensures that traffic follows predictable paths. 
Another option is the use of externalTrafficPolicy: Local in the Kubernetes service, which instructs the load balancer to send traffic only to nodes that actually host the service pod, eliminating the internal hop between nodes and preserving the client\u0026rsquo;s source IP address.13\nFailover and Impact on Workloads # It is fundamental to understand that VIP failover (whether native or managed by kube-vip) affects only external access to the cluster (e.g., executing kubectl or external API calls).1 Inside the cluster, thanks to KubePrism and service discovery, workloads continue to communicate normally and are unaffected by the state of the external VIP.1 However, long-lived connections passing through the VIP (such as gRPC tunnels or HTTP/2 sessions) will be interrupted and require client-side reconnection logic.1\nConclusions and Strategic Recommendations # Based on the analysis of collected data and industry best practices, the correct strategy for a Kubernetes cluster on Talos OS can be summarized in three main paths, depending on scalability needs and network complexity.\nFor most users, the Recommended Strategy is the pairing of the native Talos VIP with MetalLB in Layer 2 mode. This configuration perfectly balances the management simplicity typical of Talos with the flexibility of MetalLB. It is the ideal choice for clusters operating in a single server room or standard virtualized environment, ensuring high availability of the API server without adding critical components that must be manually managed during bootstrap.\nFor Enterprise or High Load installations, the optimal strategy shifts towards external load balancing for the control plane and MetalLB or Cilium in BGP mode for services. 
This architecture eliminates the typical bottlenecks of Layer 2 networks and leverages the power of physical routers to distribute traffic, ensuring that the cluster can scale up to hundreds of nodes without network performance degradation.\nFinally, for Small Clusters and Homelabs (Minecraft Style), the watchword is minimalism. The use of native VIP and MetalLB (L2), taking care to correctly configure node labels to allow service exposure, provides a robust and easy-to-maintain environment, minimizing the \u0026ldquo;consumption\u0026rdquo; of precious resources by infrastructure components.\nIn summary, the systems architect operating with Talos OS must always prioritize enabling KubePrism as the foundation of internal resilience and select the IP address advertisement method (ARP vs BGP) not based on software preference, but based on the actual capabilities of the network hardware hosting the cluster.\nBibliography # Virtual (shared) IP - Sidero Documentation - What is Talos Linux?, accessed on January 1, 2026, https://docs.siderolabs.com/talos/v1.8/networking/vip kube-vip: Documentation, accessed on January 1, 2026, https://kube-vip.io/ Setting Up MetalLB: Kubernetes LoadBalancer for Bare Metal Clusters | Talha Juikar, accessed on January 1, 2026, https://talhajuikar.com/posts/metallb/ MetalLB: A Load Balancer for Bare Metal Kubernetes Clusters | by 8grams - Medium, accessed on January 1, 2026, https://8grams.medium.com/metallb-a-load-balancer-for-bare-metal-kubernetes-clusters-ef8a9e00c2bd Control Plane - Sidero Documentation - What is Talos Linux?, accessed on January 1, 2026, https://docs.siderolabs.com/talos/v1.9/learn-more/control-plane Production Clusters - Sidero Documentation - What is Talos Linux?, accessed on January 1, 2026, https://docs.siderolabs.com/talos/v1.7/getting-started/prodnotes Kubernetes Cluster Reference Architecture with Talos Linux for 2025-05 - Sidero Labs, accessed on January 1, 2026, 
https://www.siderolabs.com/wp-content/uploads/2025/08/Kubernetes-Cluster-Reference-Architecture-with-Talos-Linux-for-2025-05.pdf difference VIP vs KubePrism (or other) · siderolabs talos · Discussion #9906 - GitHub, accessed on January 1, 2026, https://github.com/siderolabs/talos/discussions/9906 Installation - kube-vip, accessed on January 1, 2026, https://kube-vip.io/docs/installation/ Kubernetes Homelab Series Part 3 - LoadBalancer With MetalLB | Eric Daly\u0026rsquo;s Blog, accessed on January 1, 2026, https://blog.dalydays.com/post/kubernetes-homelab-series-part-3-loadbalancer-with-metallb/ Configuration :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 1, 2026, https://metallb.universe.tf/configuration/ MetalLB in layer 2 mode :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 1, 2026, https://metallb.universe.tf/concepts/layer2/ MetalLB in BGP mode :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 1, 2026, https://metallb.universe.tf/concepts/bgp/ Architecture | kube-vip, accessed on January 1, 2026, https://kube-vip.io/docs/about/architecture/ Static Pods | kube-vip, accessed on January 1, 2026, https://kube-vip.io/docs/installation/static/ What do you use for baremetal VIP ControlPane and Services : r/kubernetes - Reddit, accessed on January 1, 2026, https://www.reddit.com/r/kubernetes/comments/1nlnb1o/what_do_you_use_for_baremetal_vip_controlpane_and/ HA Kubernetes API server with MetalLB\u0026hellip;? - Reddit, accessed on January 1, 2026, https://www.reddit.com/r/kubernetes/comments/1o9t1j2/ha_kubernetes_api_server_with_metallb/ For those who work with HA onprem clusters : r/kubernetes - Reddit, accessed on January 1, 2026, https://www.reddit.com/r/kubernetes/comments/1j05ozt/for_those_who_work_with_ha_onprem_clusters/; Kubernetes Load-Balancer service - kube-vip, accessed on January 1, 2026, https://kube-vip.io/docs/usage/kubernetes-services/ metallb + BGP = conflict with kube-router? 
| TrueNAS Community, accessed on January 1, 2026, https://www.truenas.com/community/threads/metallb-bgp-conflict-with-kube-router.115690/ Talos Kubernetes in Five Minutes - DEV Community, accessed on January 1, 2026, https://dev.to/nabsul/talos-kubernetes-in-five-minutes-1p1h [Lab Setup] 3-node Talos cluster (Mac minis) + MinIO backend — does this topology make sense? : r/kubernetes - Reddit, accessed on January 1, 2026, https://www.reddit.com/r/kubernetes/comments/1myb8xc/lab_setup_3node_talos_cluster_mac_minis_minio/ Getting back into the HomeLab game for 2024 - vZilla, accessed on January 1, 2026, https://vzilla.co.uk/vzilla-blog/getting-back-into-the-homelab-game-for-2024 Fix LoadBalancer Services Not Working on Single Node Talos Kubernetes Cluster, accessed on January 1, 2026, https://www.robert-jensen.dk/posts/2025/fix-loadbalancer-services-not-working-on-single-node-talos-kubernetes-cluster/ Deploy Talos Linux with Local VIP, Tailscale, Longhorn, MetalLB and Traefik - Josh\u0026rsquo;s Notes, accessed on January 1, 2026, https://notes.joshrnoll.com/notes/deploy-talos-linux-with-local-vip-tailscale-longhorn-metallb-and-traefik/ Kubernetes \u0026amp; Talos - Reddit, accessed on January 1, 2026, https://www.reddit.com/r/kubernetes/comments/1hs6bui/kubernetes_talos/ Advanced BGP configuration :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 1, 2026, https://metallb.universe.tf/configuration/_advanced_bgp_configuration/ Talos with redundant routed networks via bgp : r/kubernetes - Reddit, accessed on January 1, 2026, https://www.reddit.com/r/kubernetes/comments/1iy411r/talos_with_redundant_routed_networks_via_bgp/ MetalLB on K3s (using Layer 2 Mode) | SUSE Edge Documentation, accessed on January 1, 2026, https://documentation.suse.com/suse-edge/3.3/html/edge/guides-metallb-k3s.html Troubleshooting - Sidero Documentation - What is Talos Linux?, accessed on January 1, 2026, 
https://docs.siderolabs.com/talos/v1.9/troubleshooting/troubleshooting Installation :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 1, 2026, https://metallb.universe.tf/installation/ Analyzing Load Balancer VIP Routing with Calico BGP and MetalLB - AHdark Blog, accessed on January 1, 2026, https://www.ahdark.blog/analyzing-load-balancer-vip-routing/ Kubernetes Services : Achieving optimal performance is elusive | by CloudyBytes | Medium, accessed on January 1, 2026, https://cloudybytes.medium.com/kubernetes-services-achieving-optimal-performance-is-elusive-5def5183c281 Usage :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 1, 2026, https://metallb.universe.tf/usage/ ","date":"7 January 2026","externalUrl":null,"permalink":"/guides/talos-vip-load-balancing/","section":"Guides","summary":"","title":"Architectural Strategies for Load Balancing and Control Plane High Availability in Talos OS-based Kubernetes Clusters","type":"guides"},{"content":"The landscape of Static Site Generators (SSG) has undergone a paradigmatic evolution in recent years, with Hugo establishing itself as one of the most high-performance solutions thanks to its almost instant build speed and its robust Go architecture. 
In this context, the Blowfish theme emerges not as a simple visual template, but as a modular and sophisticated framework, built upon Tailwind CSS 3.0, designed to meet the needs of developers, researchers, and content creators who require an impeccable balance between minimalist aesthetics and functional power.1 Blowfish stands out for its ability to manage complex workflows, serverless integrations, and granular customization that goes far beyond the visual surface, positioning itself as one of the most advanced themes in the Hugo ecosystem.1\nEvolution and Design Philosophy of Blowfish # The genesis of Blowfish lies in the need to overcome the limitations of monolithic themes, offering a structure that prioritizes automated asset optimization and out-of-the-box accessibility. The adoption of Tailwind CSS is not just an aesthetic choice, but an architectural decision that allows for the generation of extremely small CSS bundles, containing only the classes actually used, ensuring high-level performance documented by excellent scores in Lighthouse tests.1 The theme is intrinsically content-oriented, structured to fully leverage Hugo\u0026rsquo;s \u0026ldquo;Page Bundles\u0026rdquo;, a system that organizes multimedia resources directly alongside text files, improving project portability and maintainability in the long run.5\nBlowfish\u0026rsquo;s architecture is designed to be \u0026ldquo;future-proof\u0026rdquo;, natively supporting dynamic integrations in a static environment, such as view counting and interaction systems via Firebase, advanced client-side search with Fuse.js, and complex data visualization via Chart.js and Mermaid.1 This versatility makes it suitable for a wide range of applications, from personal blogs to enterprise-level technical documentation.\nInstallation and Project Initialization Procedures # Implementing Blowfish requires a development environment correctly configured with Hugo (version 0.87.0 or higher, preferably the 
\u0026ldquo;extended\u0026rdquo; version) and Git.3 There are three main paths for installation, each with specific implications for workflow management.\nCLI Methodology: Blowfish Tools # The most modern and recommended approach for new users is the use of blowfish-tools, a command-line tool that automates site creation and initial configuration.3\nCommand Function Context of Use npm i -g blowfish-tools Global installation Preparation of the Node.js development environment. blowfish-tools new Complete site creation Ideal for new projects starting from scratch. blowfish-tools Interactive menu Configuration of specific features in existing projects. This tool significantly reduces the barrier to entry, handling the creation of the complex folder structure required for a modular configuration.5\nProfessional Methodology: Hugo Modules and Git Submodules # For professionals operating in Continuous Integration (CI) environments, the use of Hugo Modules represents the most elegant solution. This method treats the theme as a dependency managed by Go, allowing rapid updates via the command hugo mod get -u.1 Alternatively, installation as a Git submodule (git submodule add https://github.com/nunocoracao/blowfish.git themes/blowfish) is preferable for those who wish to keep the theme code within their own repository without mixing it with content, facilitating the tracking of specific versions.1\nThe Modular Configuration System # One of Blowfish\u0026rsquo;s distinctive features is the abandonment of the single config.toml file in favor of a config/_default/ directory containing specialized TOML files. 
This logical separation is fundamental for managing the complexity of the options offered by the theme.2\nHugo.toml: The Backbone of the Site # The hugo.toml file (or config.toml if not using the modular structure) defines global Hugo engine parameters and basic site settings.8\nParameter Description Technical Relevance baseURL Site root URL Essential for correct absolute link generation and SEO.4 theme \u0026ldquo;blowfish\u0026rdquo; Tells Hugo which theme to load (can be omitted with Modules).8 defaultContentLanguage Default language Determines the i18n translations to use initially.8 outputs.home Homepage output formats Crucial: the JSON format is necessary for internal search.8 summaryLength Summary length A value of 0 tells Hugo to use the first sentence as summary.8\nEnabling the JSON format on the homepage is a critical technical step often overlooked; without it, the Fuse.js search module will not have an index to query, rendering the search bar non-functional.8\nParams.toml: The Feature Control Panel # The params.toml file hosts theme-specific configurations, allowing complex modules to be enabled or disabled without modifying the source code.4\nVisual aspect management is controlled by the parameters defaultAppearance and autoSwitchAppearance. 
The first defines whether the site should load in \u0026ldquo;light\u0026rdquo; or \u0026ldquo;dark\u0026rdquo; mode, while the second, if set to true, allows the site to respect the user\u0026rsquo;s operating system preferences, ensuring a visual experience consistent with the visitor\u0026rsquo;s ecosystem.8 Furthermore, the colorScheme parameter allows selecting one of the predefined palettes, each of which radically transforms the site\u0026rsquo;s chromatic identity without requiring manual CSS changes.5\nMultilingual Architecture and Author Configuration # Blowfish excels in multilingual support, requiring a dedicated configuration file for each language (e.g., languages.it.toml).5 Defined in this file are not only the site title for that specific language, but also the author metadata that will appear in biographical boxes under articles.2\nAuthor Field Function UI Impact name Author name Displayed in the header and footer of articles.2 image Author avatar Circular profile image in biographical widgets.2 headline Short slogan Impact text displayed in the \u0026ldquo;profile\u0026rdquo; layout homepage.2 bio Full biography Descriptive text displayed in the post footer if showAuthor is active.7 links Social media Array of clickable icons linking to external profiles.2 This approach allows for extreme customization: a site can have different authors for different language versions, or simply translate the main author\u0026rsquo;s biography to adapt to the local audience.5\nNavigation and Menus: Hierarchies and Iconography # Menu configuration takes place via dedicated files like menus.en.toml or menus.it.toml. 
Blowfish supports three main navigation areas: the main menu (header), the footer menu, and subnavigation.5\nThe theme introduces a simplified icon system via the pre parameter, which allows inserting SVG icons (like those from FontAwesome or social icons) directly next to the menu text.5 An advanced aspect is support for nested menus: by defining an element with a unique identifier and setting other elements with a parent parameter corresponding to that identifier, Blowfish will automatically generate elegant and functional dropdown menus.5\nContent Management: Page Bundles and Taxonomies # Hugo\u0026rsquo;s strength, and Blowfish\u0026rsquo;s in particular, lies in structured content management. The theme is designed to operate in harmony with the concept of \u0026ldquo;Page Bundles\u0026rdquo;, distinguishing between Branch Pages and Leaf Pages.5\nBranch Pages and Section Organization # Branch Pages are nodes in the hierarchy that contain other files, such as section homepages or category lists. They are identified by the file _index.md. Blowfish honors parameters in the front matter of these files, allowing global settings to be overridden for a specific section of the site.6 For example, one can decide that the \u0026ldquo;Portfolio\u0026rdquo; section uses a card view, while the \u0026ldquo;Blog\u0026rdquo; section uses a classic list.6\nLeaf Pages and Asset Management # Leaf Pages represent atomic content, such as a single post or an \u0026ldquo;About\u0026rdquo; page. 
If an article includes images or other media, it must be created as a \u0026ldquo;bundle\u0026rdquo;: a directory named after the article containing an index.md file (without underscore) and all related assets.6 This system not only maintains order in the filesystem but allows Blowfish to process images via Hugo Pipes to automatically optimize their weight and dimensions.1\nIntegration of External Content # Blowfish offers a sophisticated feature to include links to external platforms (such as Medium, LinkedIn, or GitHub repositories) directly in the site\u0026rsquo;s article flow.1 By using the externalUrl parameter in the front matter and instructing Hugo not to generate a local page (build: render: \u0026ldquo;false\u0026rdquo;), the post will appear in the article list but will redirect the user directly to the external resource, while maintaining the site\u0026rsquo;s visual consistency and internal categorization.6\nVisual Support and Media Optimization # Blowfish\u0026rsquo;s visual impact is strongly tied to its image management, which balances aesthetics with performance through the use of modern technologies like lazy-loading and dynamic resizing.1\nFeatured Images and Hero Sections # To set a preview image that appears in cards and the header of an article, Blowfish follows a strict naming convention: the file must start with feature* (e.g., feature.png, featured-image.jpg) and be located in the article\u0026rsquo;s folder.5 These images not only serve as thumbnails but are also used to generate the Open Graph metadata that social platforms read when rendering link previews.7\nThe header layout (Hero Style) can be configured globally or per post:\nHero Style Visual Effect Recommended Use basic Simple layout with title and image side-by-side. Standard informational posts.7 big Large image above the title with caption support. Cover stories or long-form articles.7 background The feature image becomes the header background. 
Impact pages or landing pages.7 thumbAndBackground Combines the background image with a thumbnail in the foreground. Strong brand identity or portfolio.7 Custom Backgrounds and System Images # Blowfish allows defining global backgrounds via the defaultBackgroundImage parameter in params.toml. To ensure fast load times, the theme automatically scales these images to a predefined width (usually 1200px), reducing data consumption for users on mobile devices.7 Furthermore, it is possible to globally disable image zooming or optimization for specific scenarios where absolute visual fidelity takes priority over performance.8\nRich Content and Advanced Shortcodes # Blowfish shortcodes extend standard Markdown capabilities, allowing the insertion of complex UI components without writing HTML code.16\nAlerts and Callouts # The alert shortcode is a fundamental tool for technical communication, allowing warnings, notes, or suggestions to be highlighted. It supports parameters for the icon, card color, icon color, and text color, ensuring the alert aligns perfectly with the semantic context of the content.16\nExample usage with named parameters: the message text (here, \u0026ldquo;Critical error message!\u0026rdquo;) is placed between an opening alert tag, which carries the named parameters, and a closing /alert tag.16\nCarousels and Interactive Galleries # For managing multiple images, the carousel shortcode offers a sliding and elegant interface. A particularly powerful feature is the ability to pass a regex string to the images parameter (e.g., images=\u0026ldquo;gallery/*\u0026rdquo;), instructing the theme to automatically load all images present in a specific subdirectory of the Page Bundle.16 This eliminates the need to manually update Markdown code every time a photo is added to the gallery.\nFigures and Video Embedding # Blowfish\u0026rsquo;s figure shortcode replaces Hugo\u0026rsquo;s native one, offering superior performance via device resolution-based image optimization (Responsive Images). 
It supports Markdown captions, hyperlinks on the image, and granular control over the zoom function.16\nRegarding video, Blowfish provides responsive wrappers for YouTube, Vimeo, and local files. Using the youtubeLite shortcode is recommended for sites aiming for maximum speed: instead of loading the entire Google iframe at page load, it loads only a lightweight thumbnail, activating the heavy player only when the user actually clicks the play button.16\nScientific Communication: Math and Diagrams # Blowfish has become a de facto standard for academic and technical blogs thanks to its native integration with high-level typesetting and data visualization tools.1\nMathematical Notation with KaTeX # The rendering of mathematical formulas is entrusted to KaTeX, known for being the fastest math typesetting engine for the web. To preserve performance, Blowfish does not load KaTeX assets globally; they are included in the page only if the shortcode is detected within the article.16\nThe supported syntax follows LaTeX standards:\nInline Notation: Formulas inserted into the text flow using the delimiters \\( and \\). Example: \\(\\nabla \\cdot \\mathbf{E} = \\frac{\\rho}{\\varepsilon_0}\\).18\nBlock Notation: Formulas centered and isolated using the delimiters $$. Example:\n$$e^{i\\pi} + 1 = 0$$\n18\nThis implementation allows writing complex equations that remain readable and searchable, with zero impact on the loading speed of non-scientific pages of the site.\nDynamic Diagrams and Charts # Through the mermaid and chart shortcodes, Blowfish allows generating complex visualizations starting from textual data.1\nMermaid.js: Allows creating flowcharts, sequence diagrams, Gantt charts, and class diagrams using simple text syntax. It is ideal for documenting software architectures or logical processes without managing external image files.1 Chart.js: Allows embedding bar, pie, line, and radar charts by providing structured data directly in the shortcode. 
Since charts are rendered on an HTML5 Canvas element, they remain sharp at any zoom level and are interactive (showing values on mouse hover).1 Dynamic Integrations and Dynamic Data Support # Despite its static nature, Blowfish can evolve into a dynamic platform thanks to intelligent integration with serverless services, particularly Firebase.1\nFirebase: Views, Likes, and Dynamic Analytics # Integration with Firebase allows adding features typical of traditional CMS systems, such as real-time view counting and a \u0026ldquo;like\u0026rdquo; system for articles.1 The configuration process involves:\nCreating a Firebase project and enabling the Firestore database in production mode.9 Configuring security rules to allow anonymous reads and writes (after enabling Anonymous Authentication).9 Inserting API keys in the params.toml file under the Firebase section.8 Once configured, Blowfish automatically handles view incrementing every time a page is loaded, storing data in the serverless database and displaying it in article lists.8\nAdvanced Search with Fuse.js # Blowfish\u0026rsquo;s internal search does not require external databases. During the build phase, Hugo generates an index.json file containing the title, summary, and content of all articles.1 Fuse.js, a lightweight fuzzy search library, downloads this index and allows instant searches directly in the user\u0026rsquo;s browser. To ensure this feature works, it is imperative that the outputs.home configuration includes the JSON format.8\nSEO, Accessibility, and Search Engine Optimization # Blowfish is built following SEO best practices to ensure contents are easily indexable and presented optimally on social media.1\nMetadata and Structured Data # The theme automatically generates Open Graph and Twitter Cards meta tags, using the article\u0026rsquo;s feature image and the description provided in the front matter. 
If no description is provided, Blowfish uses the summary automatically generated by Hugo.7 Furthermore, support for structured breadcrumbs (enableable via enableStructuredBreadcrumbs) helps search engines understand the site hierarchy and display clean navigation paths in search results.8\nPerformance and Lighthouse Scores # Performance optimization is not just a question of speed, but a critical ranking factor (Core Web Vitals). Blowfish achieves scores close to 100 in all Lighthouse categories thanks to:\nGeneration of minimal critical CSS via Tailwind.1 Native lazy-loading for all images.8 Minimization of JS assets.1 Native support for modern image formats like WebP (via Hugo Pipes).1 Deployment Strategies and Production Pipelines # The static nature of sites generated with Blowfish allows for global and economical distribution via CDN (Content Delivery Networks).12\nHosting and Continuous Deployment # Modern hosting platforms offer direct integrations with GitHub or GitLab, automating the build and deployment process.\nPlatform Build Method Technical Notes GitHub Pages GitHub Actions Requires creating a YAML workflow executing hugo --gc --minify.4 Netlify Internal Build Bot Configuration via netlify.toml; supports branch previews and forms.3 Firebase Hosting Firebase CLI Ideal if already using Firebase for views and likes.9 During deployment configuration, it is fundamental to correctly set the baseURL variable for the production environment, especially if the site resides in a subdirectory, to prevent assets (CSS, images) from being loaded from incorrect paths.4\nConclusions: Towards a Static Web without Compromises # Configuring the Blowfish theme for Hugo represents a balancing exercise between the simplicity of Markdown content management and the complexity of modern technological needs. 
Through a modular structure, maniacal attention to performance, and a series of high-level integrations for scientific and dynamic data, Blowfish confirms itself as an excellent solution for creating professional websites.1\nAdopting this theme allows developers to focus on content quality and information structure, delegating technical aspects related to accessibility, SEO, and asset optimization to the framework. In an increasingly demanding web ecosystem, Blowfish offers the necessary tools to build a solid, high-performance, and visually appealing online presence, defining the state of the art for next-generation Hugo themes.3\nBibliography # Blowfish | Hugo Themes, accessed on January 3, 2026, https://www.gohugothemes.com/theme/nunocoracao-blowfish/ Gitlab Pages, Hugo and Blowfish to set up your website in minutes - Mariano González, accessed on January 3, 2026, https://blog.mariano.cloud/your-website-in-minutes-gitlab-hugo-blowfish nunocoracao/blowfish: Personal Website \u0026amp; Blog Theme for Hugo - GitHub, accessed on January 3, 2026, https://github.com/nunocoracao/blowfish Blowfish - True Position Tools, accessed on January 3, 2026, https://truepositiontools.com/crypto/blowfish-guide Getting Started - Blowfish, accessed on January 3, 2026, https://blowfish.page/docs/getting-started/ Content Examples · Blowfish, accessed on January 3, 2026, https://blowfish.page/docs/content-examples/ Thumbnails · Blowfish, accessed on January 3, 2026, https://blowfish.page/docs/thumbnails/ Configuration - Blowfish, accessed on January 3, 2026, https://blowfish.page/docs/configuration/ Firebase: Views \u0026amp; Likes - Blowfish, accessed on January 3, 2026, https://blowfish.page/docs/firebase-views/ Installation - Blowfish, accessed on January 3, 2026, https://blowfish.page/docs/installation/ How To Make A Hugo Blowfish Website - YouTube, accessed on January 3, 2026, https://www.youtube.com/watch?v=-05mOdHmQVc A Beginner-Friendly Tutorial for Building a Blog with Hugo, 
the Blowfish Theme, and GitHub Pages, accessed on January 3, 2026, https://www.gigigatgat.ca/en/posts/how-to-create-a-blog/ Step-by-Step Guide to Creating a Hugo Website · - dasarpAI, accessed on January 3, 2026, https://main\u0026ndash;dasarpai.netlify.app/dsblog/step-by-step-guide-creating-hugo-website/ Partials - Blowfish, accessed on January 3, 2026, https://blowfish.page/docs/partials/ Build your homepage using Blowfish and Hugo · N9O - Nuno Coração, accessed on January 3, 2026, https://n9o.xyz/posts/202310-blowfish-tutorial/ Shortcodes · Blowfish, accessed on January 3, 2026, https://blowfish.page/docs/shortcodes/ Shortcodes - Hugo, accessed on January 3, 2026, https://gohugo.io/content-management/shortcodes/ Mathematical notation · Blowfish, accessed on January 3, 2026, https://blowfish.page/samples/mathematical-notation/ Hosting \u0026amp; Deployment - Deepfaces, accessed on January 3, 2026, https://deepfaces.pt/docs/hosting-deployment/ Getting Started With Hugo | FREE COURSE - YouTube, accessed on January 3, 2026, https://www.youtube.com/watch?v=hjD9jTi_DQ4 ","date":"7 January 2026","externalUrl":null,"permalink":"/guides/hugo-blowfish-theme-guide/","section":"Guides","summary":"","title":"Architecture and Advanced Configuration of the Blowfish Theme for Hugo: An Integral Technical Examination","type":"guides"},{"content":"The evolution of cloud-native operating systems has led to the emergence of solutions radically different from traditional Linux distributions. 
Talos Linux stands at the forefront of this transformation, proposing an operational model based on immutability, the absence of interactive shells, and entirely API-mediated management.1 In this ecosystem, the integration of Tailscale, a mesh network solution based on the WireGuard protocol, is not a simple software installation, but a systems engineering operation that requires a deep understanding of Talos\u0026rsquo;s kernel extension mechanisms and filesystem.3 This report analyzes the implementation methodologies, declarative configuration strategies, and the resolution of networking issues arising from the convergence of these two technologies.\nOperational Paradigms and Architecture of Talos Linux # To understand the challenges of installing Tailscale, it is necessary to analyze the fundamental structure of Talos Linux. Unlike general-purpose distributions, Talos does not use package managers such as apt or yum.1 The root filesystem is mounted as read-only, and the system is designed to be ephemeral, with the exception of the partition dedicated to persistent data.3 This approach eliminates the problem of configuration drift but prevents the execution of common Tailscale installation scripts.1\nSystem management occurs exclusively via talosctl, a CLI utility that communicates with the gRPC APIs exposed by the machined daemon.3 In this context, any additional software component must be integrated as a system extension or as a workload within Kubernetes.3\nFeature Talos Linux Traditional Distributions Package Management Absent (OCI Extensions) apt, yum, zypper, pacman Remote Access gRPC API (Port 50000) SSH (Port 22) Root Filesystem Immutable (Read-only) Mutable (Read-write) Configuration Declarative (YAML) Imperative (Script/CLI) Kernel Hardened / Minimalist General-purpose / Modular The absence of a local terminal and standard diagnostic tools like iproute2 or iptables accessible directly by the user makes the use of Tailscale indispensable not only for 
network security but also as a potential bridge for out-of-band cluster management.3\nThe System Extension Mechanism # The primary method for injecting binaries like tailscaled and tailscale into Talos Linux is the System Extensions system.9 A system extension is an OCI-compliant container image containing a predefined file structure intended to be overlaid on the root filesystem during the boot phase.12\nAnatomy of an OCI Extension # A valid extension must contain a manifest.yaml file at the root, defining the name, version, and compatibility requirements with the Talos version.3 The actual content of the binaries must be placed in the /rootfs/usr/local/lib/containers/ directory.3 Talos scans the /usr/local/etc/containers directory for service definitions in YAML format, which describe how the machined daemon should start the process.9\nThe Tailscale service, when run as an extension, operates as a privileged container with access to the host\u0026rsquo;s /dev/net/tun device, essential for creating the virtual network interface.4 Since the tailscale0 interface must be available to the host operating system and not just within an isolated network namespace, the extension uses host networking.14\nLifecycle of the ext-tailscale Service # When Talos detects the Tailscale configuration, it registers a service named ext-tailscale.9 This service enters a waiting state until network dependencies are met, such as the assignment of IP addresses to physical interfaces and connectivity to default gateways.9 The telemetry of this service can be monitored via the command talosctl service ext-tailscale, which provides details on operational status, restart events, and process health.9\nInstallation Methodologies and Image Generation # There are three primary paths for implementing Tailscale on a Talos node, each with different implications for system maintainability and stability.3\nUsing the Talos Image Factory # The Talos Image Factory represents the most modern and 
recommended approach.5 It is an API service managed by Sidero Labs that allows dynamically assembling ISO images, PXE assets, or disk images (raw) by including certified extensions.3 The user selects the Talos version, architecture (amd64 or arm64), and adds the siderolabs/tailscale extension from the official extensions list.5\nThe result of this operation is a Schematic ID.10 This hash ensures that the image is reproducible and that all nodes in a cluster use the exact combination of kernel and drivers.\nPlatform Image Format Distribution Method Bare Metal ISO / RAW USB Flash / iDRAC / IPMI Virtualization (Proxmox/ESXi) ISO Datastore Upload Cloud (AWS/GCP/Azure) AMI / Disk Image Image Import Network Boot PXE / iPXE TFTP/HTTP Server Installation is performed by providing the schematic-based installer URL in the machine configuration file, under the machine.install.image key.3 During the installation or update process, Talos retrieves the OCI image, extracts the necessary components, and persists them in the system partition.3\nInstallation via OCI Installer on Existing Nodes # For nodes already in operation, it is possible to inject Tailscale without regenerating the entire physical boot medium.3 This is done by dynamically modifying the installation image in the MachineConfig.3 However, this method carries a risk: if the specified image does not contain the extension during a subsequent operating system update, Tailscale will be removed upon reboot.3 It is therefore imperative that the schematic ID remains consistent throughout the node\u0026rsquo;s entire lifecycle.\nCustom Builds via Imager # In air-gapped environments or where maximum customization is required, operators can use Sidero Labs\u0026rsquo; imager utility to create offline images.12 This tool allows downloading necessary packages, including static network configurations, and integrating Tailscale locally before producing the final boot asset.12\nDeclarative Configuration and Identity Management # 
Once the binaries are installed, Tailscale must be configured to join the tailnet. In Talos, this does not happen through manual invocation of tailscale up, but through the ExtensionServiceConfig resource.3\nAuthentication via Auth Keys # The simplest method is the use of an authentication key pre-generated from the Tailscale control panel.4 There are several types of keys, each suitable for a specific scenario:\nReusable Keys: Ideal for the automatic expansion of worker nodes in a Kubernetes cluster. A single key can authenticate multiple machines.10 Ephemeral Keys: Recommended for Talos nodes, as they ensure that if a node is destroyed or reset, its entry is automatically removed from the tailnet, avoiding the proliferation of orphaned nodes.10 Pre-approved Keys: Allow bypassing manual device approval if the tailnet has this feature enabled.22 OAuth2 Integration for Advanced Security # For enterprise-level installations, integration with OAuth2 is the preferred solution.16 Two distinct mechanisms are involved here. The OAuth2 flow that Talos Linux supports through kernel parameters authenticates the retrieval of the machine configuration and does not by itself log the node in to the tailnet.24 On the Tailscale side, an OAuth client (a client ID and a client secret) can issue short-lived auth keys for tagged devices, reducing the need to manage long-lived keys.16\nWhichever key is used, the Tailscale settings are inserted into a YAML patch file; the example below uses a placeholder auth key:\nYAML\napiVersion: v1alpha1\nkind: ExtensionServiceConfig\nname: tailscale\nenvironment:\n  - TS_AUTHKEY=tskey-auth-abcdef123456\n  - TS_EXTRA_ARGS=--advertise-tags=tag:talos,tag:k8s --accept-dns=false\nThe patch is applied via talosctl patch mc -p @tailscale-patch.yaml -n followed by the address of the target node, which loads the parameters into the machined daemon and subsequently restarts the extension service.3\nState Persistence and Identity Stability # One of the most common issues reported by users is the creation of duplicate nodes in the Tailscale panel after each reboot.11 This happens because the Tailscale state (which includes the node\u0026rsquo;s private key 
and machine certificate) is usually stored in /var/lib/tailscale, which in system extensions is ephemeral by default.6\nPersistence Strategies on Immutable Filesystems # In Talos Linux, the /var directory is mounted on a persistent partition that survives reboots and operating system updates.6 To ensure the stability of the node\u0026rsquo;s identity, the extension must be configured to mount a persistent host directory.3\nConfiguration Parameter Value Purpose TS_STATE_DIR /var/lib/tailscale Path for storing the node key Mount Source /var/lib/tailscale Persistent directory on the Talos host Mount Destination /var/lib/tailscale Destination inside the extension container Mount Options bind, rw Allows read and write access Without this precaution, every Talos update (which involves a reboot and erasure of ephemeral state) would trigger the generation of a new cryptographic identity, breaking static routes and ACL policies configured in the tailnet.11\nAnalysis of Networking Conflicts and Multihoming # Introducing a virtual network interface like tailscale0 on a host already managing physical interfaces and Kubernetes networking (via CNI) can lead to complex routing conflicts.27\nThe Kubelet and API Server Binding Issue # By default, Kubernetes attempts to identify the primary IP address of the node for internal cluster communications.27 If Tailscale is started before the physical interface has established a stable connection, or if the Kubelet detects the tailscale0 interface as priority, it might attempt to register the node with the tailnet IP (in the 100.64.0.0/10 range).27\nThis scenario prevents the CNI (Cilium, Flannel, etc.) 
from establishing correct tunnels between pods, as encapsulated traffic might attempt to transit through the Tailscale tunnel instead of the local network, causing performance degradation or complete connectivity failure.27\nDocumented Solution:\nThe Talos configuration must explicitly instruct the kubelet and etcd to use only local network subnets for cluster traffic.27\nYAML\nmachine:\n  kubelet:\n    nodeIP:\n      validSubnets:\n        - 192.168.1.0/24 # Replace with your local subnet\ncluster:\n  etcd:\n    advertisedSubnets:\n      - 192.168.1.0/24\nThis configuration ensures that, despite the presence of Tailscale, the Kubernetes control plane and traffic between workers remain on the physical network, while Tailscale is used exclusively for remote access and management.27\nDNS and resolv.conf Management # Tailscale often attempts to take control of DNS resolution to enable MagicDNS, a service that allows contacting tailnet nodes via simple hostnames.4 In Talos Linux, the /etc/resolv.conf file is managed deterministically, and external changes are often overwritten.4\nMany users report that enabling MagicDNS breaks the resolution of internal Kubernetes names (such as kubernetes.default.svc.cluster.local).27 The technical recommendation is to disable DNS management by Tailscale via the --accept-dns=false flag and, if necessary, configure CoreDNS in the Kubernetes cluster to forward queries for the .ts.net domain to the Tailscale resolver IP (100.100.100.100).15\nPerformance, MTU, and Traffic Optimization # Tailscale uses a default MTU (Maximum Transmission Unit) value of 1280 bytes.35 This value is chosen to ensure that WireGuard packets (which add encapsulation overhead) do not exceed the standard 1500-byte MTU typical of most Ethernet networks.35\nCriticalities Related to Packet Fragmentation # In some environments, such as DSL connections with PPPoE or cellular hotspots, the underlying network MTU might be lower than 1500. 
In these cases, a 1280 MTU for Tailscale might be too high, leading to packet fragmentation.36 Since WireGuard silently drops fragmented packets for security reasons, TCP sessions (such as SSH or file transfers) might appear \u0026ldquo;frozen\u0026rdquo; or extremely slow.35\nUser reports suggest that manually setting the MTU to 1200 can resolve throughput issues in problematic networks.36\nNetwork Scenario Recommended MTU Optimization Technique Standard Ethernet (LAN) 1280 Default DSL / PPPoE 1240 - 1260 MSS Clamping Mobile Networks (LTE/5G) 1200 - 1240 TS_DEBUG_MTU Overlay on Overlay (VPN in VPN) 1100 - 1200 Manual reduction\nTo apply these optimizations on Talos, the TS_DEBUG_MTU environment variable must be set within the ExtensionServiceConfig.36 Furthermore, for traffic passing through the cluster as a Subnet Router, implementing MSS Clamping via firewall rules is fundamental (although this is complex in Talos without specific extensions for iptables or nftables).35\nSubnet Router and Exit Node Configuration on Talos # A Talos node can act as a gateway for the entire cluster or local network, allowing other tailnet members to access resources that cannot run the Tailscale client directly (such as legacy databases, printers, or individual Kubernetes Pods).32\nEnabling IP Forwarding at the Kernel Level # The absolute prerequisite for a Subnet Router to function is enabling IP packet forwarding at the kernel level.32 While in standard distributions this is done by modifying /etc/sysctl.conf, in Talos it must be defined in the MachineConfig.8\nYAML\nmachine:\n  sysctls:\n    net.ipv4.ip_forward: \"1\"\n    net.ipv6.conf.all.forwarding: \"1\"\nThis modification is applied with talosctl apply-config or a configuration patch, typically without a reboot, after which the kernel starts routing packets between physical interfaces and the tailscale0 interface.42\nAdvertising Pod and Service Routes # To expose Kubernetes services, the node must 
advertise routes corresponding to the cluster CIDRs.32 For example, if the Pod CIDR is 10.244.0.0/16, the Tailscale arguments (for instance in TS_EXTRA_ARGS) must include --advertise-routes=10.244.0.0/16.32\nIt is important to remember that advertising routes is not enough; they must be manually approved in the Tailscale control panel, unless \u0026ldquo;Auto Approvers\u0026rdquo; are configured.32 Using --snat-subnet-routes=false is recommended to preserve the original client IP address in cluster-internal communications, facilitating logging and security monitoring.32\nComparative Analysis: System Extension vs. Kubernetes Operator # There is an ongoing debate among users about the best method for integrating Tailscale into a Talos cluster.3\nThe System Extension Approach # The extension operates at the host operating system level. It is the preferred solution when the main objective is managing the node itself.3\nPros: Allows accessing the Talos API (port 50000) even if Kubernetes is not running or has crashed.3 It is ideal for the initial bootstrap of the cluster on remote networks.10 Cons: Requires managing keys and states at the individual node level, increasing administrative overhead if the cluster has many nodes.3 The Kubernetes Operator Approach # The operator is installed within Kubernetes via Helm and manages dedicated Proxy Pods for each service to be exposed.16\nPros: Native Kubernetes integration. 
Creating a Tailscale-type Ingress automatically generates an entry in the tailnet with the service name.16 Does not require modifications to the Talos MachineConfig.16 Cons: Does not provide access to the host operating system management.16 If the Kubernetes control plane fails, Tailscale access is cut off.16 Hybrid Architecture Recommendation # For a robust infrastructure, the use of both systems is recommended: the system extension on at least one control plane node for emergency access and administration via talosctl, and the Kubernetes operator to expose applications to end-users in a scalable and granular manner.20\nCommon User-Reported Errors and Documented Resolutions # Analysis of support threads and GitHub issues highlights a series of \u0026ldquo;traps\u0026rdquo; typical of Talos-Tailscale integration.\nError 1: Conflicts Between KubeSpan and Tailscale # KubeSpan is Talos\u0026rsquo;s native solution for mesh networking between nodes, also based on WireGuard.6 While theoretically compatible, simultaneous activation of both can cause performance issues and, in some setups, port conflicts: KubeSpan listens on UDP port 51820 by default and tailscaled on UDP port 41641, so a clash arises mainly when one of the two ports has been overridden.49\nSolution:\nIf Tailscale is used for inter-node connectivity, KubeSpan should be disabled.49 Alternatively, if the ports do collide, Tailscale must be configured to use a different UDP port via the --port flag or allowed to use dynamic NAT negotiation.36\nError 2: Breaking Portainer and Other Privileged Agents Networking # A specific reported case involves Tailscale installation breaking the functioning of Portainer or monitoring agents that rely on inter-pod communication.27 This happens when the agent attempts to join a cluster using the Tailscale IP instead of the pod IP, encountering a \u0026ldquo;no route to host\u0026rdquo; error.27\nResolution:\nThe error is a direct consequence of the multihoming issue discussed earlier. 
The definitive solution is setting machine.kubelet.nodeIP.validSubnets so that only the local subnets are eligible, which keeps the Tailscale IP range out of internal Kubernetes routes.27\nError 3: Invalid API Certificates Due to Dynamic IPs # If a Talos node receives a new IP from the tailnet and the user attempts to connect via that IP, talosctl might return an mTLS certificate validation error.30 Talos generates API certificates including only known IP addresses at the time of bootstrap.28\nSolution:\nIt is necessary to add the node\u0026rsquo;s Tailscale IP address (or its MagicDNS name) to the Subject Alternative Names (SAN) list in the MachineConfig.30 SAN entries are individual IP addresses or DNS names, so a CIDR range such as 100.64.0.0/10 does not match; the address below is a placeholder.\nYAML\nmachine:\n  certSANs:\n    - 100.64.0.10 # Placeholder: replace with the node's actual tailnet IP\n    - my-node.tailnet-id.ts.net\nFuture Perspectives and Final Considerations # The integration of Tailscale on Talos Linux represents the synthesis between the security of an immutable operating system and the flexibility of a modern mesh network. Despite initial challenges related to declarative configuration and multihoming management, the benefits in terms of operational simplicity and security are undeniable.\nCommunity discussions suggest a growing interest in creating even more specialized Talos images, which could include Tailscale directly in the kernel to further reduce the memory footprint and improve cryptographic performance.11 Until then, the OCI extension system remains the most robust and flexible mechanism for extending Talos\u0026rsquo;s network capabilities.9\nOperators adopting this stack must prioritize the use of the Image Factory to ensure reproducibility, implement rigorous persistence policies to maintain node identity, and pay close attention to the Kubelet subnet configuration to avoid routing conflicts that could compromise the stability of the entire Kubernetes cluster.3 With these precautions, Tailscale becomes an invisible but fundamental component for orchestrating secure and resilient cloud-native infrastructures.\nBibliography # Talos Linux - The Kubernetes Operating System, accessed on January 6, 2026, 
https://www.talos.dev/ siderolabs/talos: Talos Linux is a modern Linux distribution built for Kubernetes. - GitHub, accessed on January 6, 2026, https://github.com/siderolabs/talos Customizing Talos with Extensions - A cup of coffee, accessed on January 6, 2026, https://a-cup-of.coffee/blog/talos-ext/ Install Tailscale on Linux, accessed on January 6, 2026, https://tailscale.com/kb/1031/install-linux System Extensions - Image Factory - Talos Linux, accessed on January 6, 2026, https://factory.talos.dev/?arch=amd64\u0026amp;platform=metal\u0026amp;target=metal\u0026amp;version=1.7.6 What\u0026rsquo;s New in Talos 1.8.0 - Sidero Documentation - What is Talos Linux?, accessed on January 6, 2026, https://docs.siderolabs.com/talos/v1.8/getting-started/what\u0026rsquo;s-new-in-talos Install talosctl - Sidero Documentation - What is Talos Linux?, accessed on January 6, 2026, https://docs.siderolabs.com/omni/getting-started/how-to-install-talosctl MachineConfig - Sidero Documentation - What is Talos Linux?, accessed on January 6, 2026, https://docs.siderolabs.com/talos/v1.8/reference/configuration/v1alpha1/config Extension Services - Sidero Documentation - What is Talos Linux?, accessed on January 6, 2026, https://docs.siderolabs.com/talos/v1.8/build-and-extend-talos/custom-images-and-development/extension-services Creating a Kubernetes Cluster With Talos Linux on Tailscale | Josh Noll, accessed on January 6, 2026, https://joshrnoll.com/creating-a-kubernetes-cluster-with-talos-linux-on-tailscale/ FR: Minimal Purpose-Built OS for Tailscale · Issue #17761 - GitHub, accessed on January 6, 2026, https://github.com/tailscale/tailscale/issues/17761 How to build a Talos system extension - Sidero Labs, accessed on January 6, 2026, https://www.siderolabs.com/blog/how-to-build-a-talos-system-extension/ Package tailscale - GitHub, accessed on January 6, 2026, https://github.com/orgs/siderolabs/packages/container/package/tailscale How to make Tailscale container persistant? 
- ZimaOS - IceWhale Community Forum, accessed on January 6, 2026, https://community.zimaspace.com/t/how-to-make-tailscale-container-persistant/5987 Using Tailscale with Docker, accessed on January 6, 2026, https://tailscale.com/kb/1282/docker Kubernetes operator · Tailscale Docs, accessed on January 6, 2026, https://tailscale.com/kb/1236/kubernetes-operator talosctl - Sidero Documentation - What is Talos Linux?, accessed on January 6, 2026, https://docs.siderolabs.com/talos/v1.6/reference/cli Talos Linux Image Factory, accessed on January 6, 2026, https://factory.talos.dev/ siderolabs/extensions: Talos Linux System Extensions - GitHub, accessed on January 6, 2026, https://github.com/siderolabs/extensions Deploy Talos Linux with Local VIP, Tailscale, Longhorn, MetalLB and Traefik - Josh\u0026rsquo;s Notes, accessed on January 6, 2026, https://notes.joshrnoll.com/notes/deploy-talos-linux-with-local-vip-tailscale-longhorn-metallb-and-traefik/ Securely handle an auth key · Tailscale Docs, accessed on January 6, 2026, https://tailscale.com/kb/1595/secure-auth-key-cli Auth keys · Tailscale Docs, accessed on January 6, 2026, https://tailscale.com/kb/1085/auth-keys OAuth clients · Tailscale Docs, accessed on January 6, 2026, https://tailscale.com/kb/1215/oauth-clients Machine Configuration OAuth2 Authentication - What is Talos Linux?, accessed on January 6, 2026, https://docs.siderolabs.com/talos/v1.8/security/machine-config-oauth A collection of scripts for creating and managing kubernetes clusters on talos linux - GitHub, accessed on January 6, 2026, https://github.com/joshrnoll/talos-scripts Troubleshooting guide · Tailscale Docs, accessed on January 6, 2026, https://tailscale.com/kb/1023/troubleshooting Tailscale on Talos os breaks Portainer : r/kubernetes - Reddit, accessed on January 6, 2026, https://www.reddit.com/r/kubernetes/comments/1izy26m/tailscale_on_talos_os_breaks_portainer/ Production Clusters - Sidero Documentation - What is Talos Linux?, accessed on 
January 6, 2026, https://docs.siderolabs.com/talos/v1.7/getting-started/prodnotes Issues · siderolabs/talos - GitHub, accessed on January 6, 2026, https://github.com/siderolabs/talos/issues Troubleshooting - Sidero Documentation - What is Talos Linux?, accessed on January 6, 2026, https://docs.siderolabs.com/talos/v1.9/troubleshooting/troubleshooting Split dns on talos machine config · Issue #7287 · siderolabs/talos - GitHub, accessed on January 6, 2026, https://github.com/siderolabs/talos/issues/7287 Subnet routers · Tailscale Docs, accessed on January 6, 2026, https://tailscale.com/kb/1019/subnets Configure a subnet router · Tailscale Docs, accessed on January 6, 2026, https://tailscale.com/kb/1406/quick-guide-subnets README.md - michaelbeaumont/k8rn - GitHub, accessed on January 6, 2026, https://github.com/michaelbeaumont/k8rn/blob/main/README.md Slow direct connection, get better result with UDP + MTU tweak : r/Tailscale - Reddit, accessed on January 6, 2026, https://www.reddit.com/r/Tailscale/comments/1p5dxtq/slow_direct_connection_get_better_result_with_udp/ PSA: Tailscale yields higher throughput if you lower the MTU - Reddit, accessed on January 6, 2026, https://www.reddit.com/r/Tailscale/comments/1ismen1/psa_tailscale_yields_higher_throughput_if_you/ Unable to lower the MTU · Issue #8219 · tailscale/tailscale - GitHub, accessed on January 6, 2026, https://github.com/tailscale/tailscale/issues/8219 Site-to-site networking · Tailscale Docs, accessed on January 6, 2026, https://tailscale.com/kb/1214/site-to-site Using Tailscale and subnet routers to access legacy devices - Ryan Freeman, accessed on January 6, 2026, https://ryanfreeman.dev/writing/using-tailscale-and-subnet-routers-to-access-legacy-devices Check Linux IP Forwarding for Access Server Routing - OpenVPN, accessed on January 6, 2026, https://openvpn.net/as-docs/faq-ip-forwarding-on-linux.html Setting loadBalancer.acceleration=native causes Cilium Status to report unexpected end of JSON input 
#35873 - GitHub, accessed on January 6, 2026, https://github.com/cilium/cilium/issues/35873 2.5. Turning on Packet Forwarding | Load Balancer Administration - Red Hat Documentation, accessed on January 6, 2026, https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/6/html/load_balancer_administration/s1-lvs-forwarding-vsa Sysctl: net.ipv4.ip_forward - Linux Audit, accessed on January 6, 2026, https://linux-audit.com/kernel/sysctl/net/net.ipv4.ip_forward/ Rootless podman without privileged flag on talos/Setting max_user_namespaces · Issue #4385 - GitHub, accessed on January 6, 2026, https://github.com/talos-systems/talos/issues/4385 Set Up a Tailscale Exit Node and Subnet Router on an Ubuntu 24.04 VPS - Onidel, accessed on January 6, 2026, https://onidel.com/blog/setup-tailscale-exit-node-ubuntu Configuring tailscale subnet router using a Linux box and OpnSense : r/homelab - Reddit, accessed on January 6, 2026, https://www.reddit.com/r/homelab/comments/18zds4l/configuring_tailscale_subnet_router_using_a_linux/ OpenZiti meets Talos Linux!, accessed on January 6, 2026, https://openziti.discourse.group/t/openziti-meets-talos-linux/2988 Is there a better way than system extensions to run simple commands on boot as root? · siderolabs talos · Discussion #9857 - GitHub, accessed on January 6, 2026, https://github.com/siderolabs/talos/discussions/9857 hcloud-talos/terraform-hcloud-talos: This repository contains a Terraform module for creating a Kubernetes cluster with Talos in the Hetzner Cloud. - GitHub, accessed on January 6, 2026, https://github.com/hcloud-talos/terraform-hcloud-talos How I Setup Talos Linux. 
My journey to building a secure… | by Pedro Chang | Medium, accessed on January 6, 2026, https://medium.com/@pedrotychang/how-i-setup-talos-linux-bc2832ec87cc Talos VM Setup on macOS ARM64 with QEMU #9799 - GitHub, accessed on January 6, 2026, https://github.com/siderolabs/talos/discussions/9799 ","date":"7 January 2026","externalUrl":null,"permalink":"/guides/talos-linux-tailscale-guide/","section":"Guides","summary":"","title":"Architecture and Implementation of Tailscale on Talos Linux: Technical Analysis and Resolution of Operational Criticalities","type":"guides"},{"content":"The technological evolution of home data centers and corporate infrastructures has led to the emergence of solutions that challenge traditional paradigms of system administration. In this context, Talos OS stands out not as a simple Linux distribution, but as a radical reinterpretation of the operating system designed exclusively for Kubernetes. Its immutable, minimal, and entirely API-governed nature represents an ideal solution for those desiring a stable, secure Proxmox environment free from the technical debt associated with manual management via SSH.1 This report examines in depth every aspect necessary to take a Talos OS cluster from scratch to a production configuration on Proxmox VE, analyzing the complexities of networking, data persistence, and hypervisor-specific optimizations.\nArchitectural Fundamentals of Talos OS and the Immutable Approach # The philosophy behind Talos OS is centered on eliminating everything that is not strictly necessary for running Kubernetes. 
Unlike a traditional Linux distribution, Talos does not include a shell, has no package manager, and does not allow SSH access.1 All management occurs through a gRPC interface protected by mTLS (Mutual TLS), ensuring that every interaction with the system is authenticated and encrypted from the ground up.2\nFilesystem Structure and Layer Management # The Talos filesystem architecture is one of its most distinctive traits and ensures system resilience against accidental corruption or malicious attacks. The core of the system resides in a read-only root partition, structured as a SquashFS image.5 During boot, this image is mounted as a loop device in memory, creating an immutable base. Over this base, Talos overlays different layers to handle runtime needs:\nFilesystem Layer Type Main Function Persistence Rootfs SquashFS (Read-only) Operating system core and essential binaries. Immutable System tmpfs (In Memory) Temporary configuration files like /etc/hosts. Recreated at Boot Ephemeral XFS (On Disk) /var directory for containers, images, and etcd data. Persistent (Wipe on Reset) State Dedicated partition Machine configuration and node identity. Persistent This separation ensures that a configuration error or a corrupted temporary file never compromises the integrity of the underlying operating system. 
The EPHEMERAL partition, mounted at /var, hosts everything Kubernetes requires to function: from the etcd database in control plane nodes to images downloaded by the container runtime (containerd).5 A critical aspect of Talos\u0026rsquo;s design is that changes made to files like /etc/resolv.conf or /etc/hosts are managed via bind mounts from a system directory that is completely regenerated at every reboot, forcing the administrator to define such settings exclusively in the declarative configuration file.5\nThe API-driven Operating Model # The shift from imperative management (commands executed via shell) to declarative (desired state defined in YAML) is the heart of the Talos experience. The talosctl tool acts as the primary client communicating with the apid daemon running on each node.5 This architecture allows treating cluster nodes as \u0026ldquo;cattle\u0026rdquo; rather than \u0026ldquo;pets\u0026rdquo;, where replacing a non-functioning node is preferable to manual repair. The absence of SSH drastically reduces the attack surface, as it eliminates one of the most common entry points for malware and lateral movements within a network.2\nInfrastructure Planning on Proxmox VE # Implementing Talos on Proxmox requires careful virtual machine configuration to ensure that paravirtualized drivers and security features are properly leveraged. Proxmox, based on KVM/QEMU, offers excellent support for Talos, but some default settings can cause instability or sub-optimal performance.8\nResource Allocation and Hardware Requirements # While Talos is extremely efficient, Kubernetes requires minimum resources to manage the control plane and workloads. 
Resource distribution must take into account not only current needs but also future cluster growth.\nResource Parameter Control Plane (Minimum) Worker (Minimum) Recommended Production vCPU 2 Cores 1 Core 4+ Cores (Control Plane) RAM 2 GB 2 GB 4-8 GB+ Storage (OS) 10 GB 10 GB 40-100 GB (NVMe/SSD) CPU Type x86-64-v2 or Higher x86-64-v2 or Higher Host (Passthrough) A fundamental technical detail concerns the CPU microarchitecture. Starting from version 1.0, Talos requires support for the x86-64-v2 instruction set.10 In Proxmox, the default \u0026ldquo;kvm64\u0026rdquo; CPU type might not expose the necessary flags (such as cx16, popcnt, or sse4.2). It is highly recommended to set the VM CPU type to \u0026ldquo;host\u0026rdquo; or use a custom configuration that explicitly enables these extensions to avoid boot failure or sudden crashes during intensive workload execution.10\nVM Configuration for Optimal Performance # For a smooth integration, the virtual machine configuration must reflect modern virtualization standards. Using UEFI (OVMF) is preferable to traditional BIOS, as it allows for more secure boot management and supports larger disks with GPT partitioning.10 The chipset should be set to q35, which offers superior native PCIe support compared to the outdated i440fx. Regarding storage, using the VirtIO SCSI Single controller with the \u0026ldquo;iothread\u0026rdquo; option and enabling \u0026ldquo;discard\u0026rdquo; support (if supported by the physical backend) ensures efficient disk space management and high input/output performance.6\nImplementation: From Boot to Cluster Ready # The Talos installation process does not include a traditional interactive installer. Booting occurs via an ISO that loads the operating system entirely into RAM, leaving the node awaiting remote configuration.6\nWorkstation Preparation and talosctl # Before interacting with VMs on Proxmox, the local management environment must be prepared. 
The talosctl binary must be installed on the administrator\u0026rsquo;s workstation. This tool handles secret generation, node configuration, and cluster monitoring.6 It is critical that the talosctl version is aligned with the Talos OS version intended for deployment to avoid gRPC protocol incompatibilities.13\nBash\n# Example installation on macOS via Homebrew\nbrew install siderolabs/tap/talosctl\nOnce the Talos ISO image is downloaded (preferably customized via the Image Factory to include necessary drivers), it must be uploaded to the Proxmox ISO storage.6 Upon the first VM boot, the console will show a temporary IP address obtained via DHCP. This IP is the entry point for sending the initial configuration.6\nConfiguration File Generation and Secret Management # Talos security is based on a set of locally generated secrets. These secrets are never transmitted in clear text and form the basis for mTLS certificate signing.14 Configuration generation requires defining the Kubernetes API endpoint, which usually coincides with the IP of the first master node or a managed virtual IP.6\nBash\n# Generate the cluster secrets\ntalosctl gen secrets -o secrets.yaml\n# Generate the configuration files for control plane and worker nodes\ntalosctl gen config my-homelab-cluster https://192.168.1.50:6443 \\\n--with-secrets secrets.yaml \\\n--output-dir _out\nThis operation generates three main components:\ncontrolplane.yaml: Contains definitions for nodes that will manage etcd and the API server. worker.yaml: Contains configuration for nodes that will run workloads. talosconfig: The client file allowing the administrator to authenticate with the cluster.6 Configuration Application and etcd Bootstrap # Applying the configuration transforms the node from maintenance mode to an installed and functional operating system. 
It is essential to verify the target disk name (e.g., /dev/sda or /dev/vda) before sending the YAML file.8 The initial configuration is sent in \u0026ldquo;insecure\u0026rdquo; mode since mTLS certificates have not yet been distributed to the node.6\nBash\ntalosctl apply-config --insecure --nodes 192.168.1.10 --file _out/controlplane.yaml\nAfter reboot, the first control plane node must be instructed to initialize the Kubernetes cluster via the bootstrap command. This operation configures the etcd distributed database and starts the core control plane components.6 Only after this phase does the cluster become self-aware and the Kubernetes API endpoint become reachable.\nNetworking: Optimization and High Availability # Networking is the area where Talos expresses its maximum flexibility, allowing the administrator to choose between standard configurations and advanced eBPF-based solutions.17\nChoosing Between Flannel and Cilium # By default, Talos uses Flannel as the Container Network Interface (CNI) plugin, a simple solution providing pod-to-pod connectivity via a VXLAN overlay.17 However, Flannel lacks support for Network Policies and does not offer advanced observability features. For a production-oriented homelab, Cilium represents the gold standard.17 Thanks to intensive use of eBPF, Cilium can entirely replace the kube-proxy component, drastically improving routing performance and reducing CPU load by eliminating the thousands of iptables rules typical of traditional Kubernetes clusters.19\nImplementing Cilium requires explicitly disabling the default CNI and kube-proxy in the Talos configuration.16 This is done via a YAML patch applied during generation or configuration modification:\nYAML\ncluster:\n  network:\n    cni:\n      name: none\n  proxy:\n    disabled: true\nRemoving kube-proxy is not without challenges. Cilium must be configured to manage services via eBPF host routing. 
A critical detail often overlooked is the need to set bpf.hostLegacyRouting=true if DNS resolution or pod-to-host connectivity issues are encountered in particular kernel versions.21\nHigh Availability with kube-vip # In a cluster with multiple control plane nodes, it is essential that the API server is reachable through a single stable IP address, even if one master node fails. Talos offers an integrated Virtual IP (VIP) feature that operates at layer 2 and announces the shared address via gratuitous ARP.14 This function is based on leader election managed directly by etcd.22\nA widely used alternative is kube-vip, which can operate both as a VIP for the control plane and as a Load Balancer for Kubernetes services of type LoadBalancer.23 Kube-vip in ARP mode elects one leader node to host the virtual IP. To avoid bottlenecks, \u0026ldquo;leader election per service\u0026rdquo; can be enabled, allowing different cluster nodes to host different service IPs, thus distributing the network load.24\nFeature Native Talos VIP Kube-vip Control Plane HA Integrated, very simple to configure. Supported via Static Pods or DaemonSet. Service LoadBalancer Not natively supported. Core feature, supports various IP ranges. Dependencies Depends directly on etcd. Depends on Kubernetes or etcd. Configuration Declarative in controlplane.yaml file. Requires Kubernetes manifests or patches. 
Using the native Talos VIP is recommended for its simplicity in ensuring API server access, while kube-vip is the ideal choice for exposing internal services (like an Ingress Controller) with static IPs from your local network.23\nProxmox Optimizations and Advanced Customizations # To ensure Talos behaves as a first-class citizen within Proxmox, certain optimizations must be implemented to bridge the gap between the hypervisor and the minimal operating system.\nQEMU Guest Agent and System Extensions # The QEMU Guest Agent is a fundamental helper allowing Proxmox to manage clean shutdowns and read network information directly from the VM.4 Since Talos has no package manager, it cannot be installed with an apt install command. The solution lies in Talos\u0026rsquo;s \u0026ldquo;System Extensions\u0026rdquo;.4 Using the Image Factory (https://factory.talos.dev), a custom ISO or installer can be generated including the siderolabs/qemu-guest-agent extension.4\nOnce the extension is included, the service must be enabled in the machine configuration file:\nYAML\nmachine:\n  features:\n    qemuGuestAgent:\n      enabled: true\nThis approach ensures the agent is an integral part of the immutable system image, maintaining consistency between nodes and facilitating maintenance operations from the Proxmox web interface.4\nPersistence with iSCSI and Longhorn # In many homelabs, storage is not local but resides on a NAS or SAN. To use distributed storage solutions like Longhorn or to mount volumes via iSCSI, Talos requires the corresponding system binaries. Again, extensions play a crucial role. Adding siderolabs/iscsi-tools and siderolabs/util-linux-tools provides necessary kernel drivers and user-space utilities to manage iSCSI targets.4\nIt is also necessary to configure the kubelet to allow mounting specific directories like /var/lib/longhorn with correct permissions (rshared, rw). 
This ensures that containers managing storage have direct access to hardware or network volumes without interference from operating system isolation mechanisms.9\nLifecycle: Atomic Updates and Maintenance # Maintaining a Talos cluster differs radically from traditional systems. Updates are atomic and image-based, reducing the risk of leaving the system in an inconsistent intermediate state to near zero.2\nUpdate and Rollback Strategies # Talos implements an A-B update system. When an upgrade command is sent, the system downloads the new image to an inactive partition, updates the bootloader, and reboots.26 If booting the new version fails (e.g., due to a configuration incompatible with the new kernel), Talos automatically rolls back to the previous version.26 This mechanism, borrowed from smartphone operating systems (like Android), ensures extremely high availability.\nRecommended procedures involve updating one node at a time, starting with worker nodes and finally proceeding to control plane nodes.13 During the update, Talos automatically performs \u0026ldquo;cordon\u0026rdquo; (prevents new pods) and \u0026ldquo;drain\u0026rdquo; (moves existing pods) of the node in Kubernetes, ensuring workloads do not suffer abrupt interruptions.26\nMonitoring with the Integrated Dashboard # For immediate diagnostics, Talos provides an integrated dashboard accessible via talosctl. 
This tool provides an overview of core service health, resource usage, and system logs, eliminating the need to install heavy external monitoring agents during initial troubleshooting phases.8\nBash\n# Start the dashboard for a specific node\ntalosctl dashboard --nodes 192.168.1.10\nThis dashboard is particularly useful during the bootstrap phase to identify why a node fails to join the cluster or why etcd does not reach quorum.8\nFinal Considerations and Future Perspectives # Adopting Talos OS on Proxmox VE is an excellent choice for anyone wanting to build a robust and modern Kubernetes infrastructure. The combination of declarative management, immutability, and the absence of legacy components like SSH raises the standard of security and stability far beyond what is possible with general-purpose Linux distributions.1\nInitial challenges related to learning a new paradigm are amply compensated by the ease of managing system updates and the predictability of cluster behavior. In an ecosystem where Kubernetes complexity can often become overwhelming, Talos offers an \u0026ldquo;opinionated\u0026rdquo; approach that reduces variables, allowing administrators to focus on applications rather than the operating system. The integration with Proxmox, supported by VirtIO and System Extensions, provides the perfect balance between the power of virtualization and the agility of Cloud Native, making this configuration a benchmark for professional homelabs and edge infrastructures.\nBibliography # siderolabs/talos: Talos Linux is a modern Linux distribution built for Kubernetes. 
- GitHub, accessed on December 29, 2025, https://github.com/siderolabs/talos Introduction to Talos, the Kubernetes OS | Yet another enthusiast blog!, accessed on December 29, 2025, https://blog.yadutaf.fr/2024/03/14/introduction-to-talos-kubernetes-os/ Sidero Documentation - What is Talos Linux?, accessed on December 29, 2025, https://docs.siderolabs.com/talos/v1.7/overview/what-is-talos Customizing Talos with Extensions - A cup of coffee, accessed on December 29, 2025, https://a-cup-of.coffee/blog/talos-ext/ Architecture - Sidero Documentation - What is Talos Linux?, accessed on December 29, 2025, https://docs.siderolabs.com/talos/v1.9/learn-more/architecture Talos with Kubernetes on Proxmox | Secsys, accessed on December 29, 2025, https://secsys.pages.dev/posts/talos/ Using Talos Linux and Kubernetes bootstrap on OpenStack - Safespring, accessed on December 29, 2025, https://www.safespring.com/blogg/2025/2025-03-talos-linux-on-openstack/ Proxmox - Sidero Documentation - What is Talos Linux?, accessed on December 29, 2025, https://docs.siderolabs.com/talos/v1.9/platform-specific-installations/virtualized-platforms/proxmox Creating a Kubernetes Cluster With Talos Linux on Tailscale | Josh \u0026hellip;, accessed on December 29, 2025, https://joshrnoll.com/creating-a-kubernetes-cluster-with-talos-linux-on-tailscale/ Talos on Proxmox, accessed on December 29, 2025, https://homelab.casaursus.net/talos-on-proxmox-3/ Talos ProxMox - k8s development - GitLab, accessed on December 29, 2025, https://gitlab.com/k8s_development/talos-proxmox Getting Started - Sidero Documentation - What is Talos Linux?, accessed on December 29, 2025, https://docs.siderolabs.com/talos/v1.9/getting-started/getting-started Upgrade Talos Linux and Kubernetes | Eric Daly\u0026rsquo;s Blog, accessed on December 29, 2025, https://blog.dalydays.com/post/kubernetes-talos-upgrades/ Production Clusters - Sidero Documentation - What is Talos Linux?, accessed on December 29, 2025, 
https://docs.siderolabs.com/talos/v1.7/getting-started/prodnotes How to Deploy a Kubernetes Cluster on Talos Linux - HOSTKEY, accessed on December 29, 2025, https://hostkey.com/blog/102-setting-up-a-k8s-cluster-on-talos-linux/ “ServiceLB” with cilium on Talos Linux | by Stefan Le Breton | Dev Genius, accessed on December 29, 2025, https://blog.devgenius.io/servicelb-with-cilium-on-talos-linux-8a290d524cb7 Kubernetes \u0026amp; Talos - Reddit, accessed on December 29, 2025, https://www.reddit.com/r/kubernetes/comments/1hs6bui/kubernetes_talos/ Installing Cilium and Multus on Talos OS for Advanced Kubernetes Networking, accessed on December 29, 2025, https://www.itguyjournals.com/installing-cilium-and-multus-on-talos-os-for-advanced-kubernetes-networking/ Deploy Cilium CNI - Sidero Documentation, accessed on December 29, 2025, https://docs.siderolabs.com/kubernetes-guides/cni/deploying-cilium Install in eBPF mode - Calico Documentation - Tigera.io, accessed on December 29, 2025, https://docs.tigera.io/calico/latest/operations/ebpf/install Validating Talos Linux Install and Maintenance Operations - Safespring, accessed on December 29, 2025, https://www.safespring.com/blogg/2025/2025-04-validating-talos-linux-install/ Virtual (shared) IP - Sidero Documentation - What is Talos Linux?, accessed on December 29, 2025, https://docs.siderolabs.com/talos/v1.8/networking/vip kube-vip: Documentation, accessed on December 29, 2025, https://kube-vip.io/ Kubernetes Load-Balancer service | kube-vip, accessed on December 29, 2025, https://kube-vip.io/docs/usage/kubernetes-services/ Qemu-guest-agent - Proxmox VE, accessed on December 29, 2025, https://pve.proxmox.com/wiki/Qemu-guest-agent Upgrading Talos Linux - Sidero Documentation, accessed on December 29, 2025, https://docs.siderolabs.com/talos/v1.8/configure-your-talos-cluster/lifecycle-management/upgrading-talos omni-docs/tutorials/upgrading-clusters.md at main - GitHub, accessed on December 29, 2025, 
https://github.com/siderolabs/omni-docs/blob/main/tutorials/upgrading-clusters.md Talos OS - Documentation \u0026amp; FAQ - HOSTKEY, accessed on December 29, 2025, https://hostkey.com/documentation/marketplace/kubernetes/talos/ ","date":"7 January 2026","externalUrl":null,"permalink":"/guides/talos-proxmox-complete-guide/","section":"Guides","summary":"","title":"Architecture, Implementation, and Optimization of Talos OS on Proxmox: The Ultimate Guide for Homelabs and Production Environments","type":"guides"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/blowfish/","section":"Tags","summary":"","title":"Blowfish","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/controllers/","section":"Tags","summary":"","title":"Controllers","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/css/","section":"Tags","summary":"","title":"Css","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/distributed-storage/","section":"Tags","summary":"","title":"Distributed-Storage","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/hugo/","section":"Tags","summary":"","title":"Hugo","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/hypervisor/","section":"Tags","summary":"","title":"Hypervisor","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/immutable-os/","section":"Tags","summary":"","title":"Immutable-Os","type":"tags"},{"content":"The adoption of Kubernetes in on-premises contexts has introduced the need to manage load balancing in the absence of native services offered by public cloud providers. 
In this scenario, the combination of Proxmox Virtual Environment (VE) as the hypervisor, Talos OS as the operating system for cluster nodes, and MetalLB as the network LoadBalancer solution, represents one of the most robust, secure, and efficient architectures for managing modern workloads.1 Proxmox provides the flexibility of enterprise virtualization by combining KVM and LXC, while Talos OS redefines the concept of an operating system for Kubernetes, eliminating the complexity of traditional Linux distributions in favor of an immutable and API-driven approach.1 MetalLB steps in to fill the critical gap in bare-metal networking, allowing Kubernetes services of type LoadBalancer to receive external IP addresses reachable from the local network.3\nArchitectural Analysis of the Proxmox Virtualization Layer # Designing a Kubernetes infrastructure on Proxmox requires a deep understanding of how the hypervisor manages resources and networking. Proxmox is based on standard Linux networking concepts, primarily using virtual bridges (vmbr) to connect virtual machines to the physical network.5 When planning a MetalLB installation, the configuration of these bridges becomes the foundation upon which the entire reachability of services rests.\nHost Networking Configuration # Best practice in Proxmox involves using Linux Bridges or, in more complex scenarios, Open vSwitch. 
For most Kubernetes deployments, a correctly configured vmbr0 bridge is sufficient, provided it supports the Layer 2 traffic necessary for MetalLB\u0026rsquo;s ARP (Address Resolution Protocol) operations.4 An often overlooked aspect is the need to manage different VLANs to isolate cluster management traffic (Corosync), Kubernetes API traffic, and application data traffic.5 Latency is a critical factor for Corosync; therefore, it is recommended not to saturate the main bridge with heavy data loads that could cause instability in the Proxmox cluster quorum.5\nNetwork Component Optimal Configuration Critical Function Bridge (vmbr0) VLAN Aware, No IP (optional) Main virtual switch for VM traffic.5 Bonding (LACP) 802.3ad (if supported by the switch) Redundancy and bandwidth increase.5 MTU 1500 (standard) or 9000 (Jumbo Frames) Throughput optimization for storage and pod-to-pod traffic.5 VirtIO Model Paravirtualization Maximum network performance with minimum CPU overhead.7 MetalLB integration requires that the Proxmox bridge does not interfere with the gratuitous ARP packets sent by MetalLB speakers to announce Virtual IPs (VIPs). In some advanced routing scenarios, it might be necessary to enable proxy_arp on the host bridge interface to facilitate communication between different subnets, although this practice must be carefully evaluated for security implications.8\nTalos OS: The Evolution of the Immutable Operating System # Talos OS stands radically apart from general-purpose Linux distributions. 
It is a minimal operating system, devoid of shell, SSH, and package managers, designed exclusively to run Kubernetes.1 This reduction in attack surface, which brings the system to have only about 12 binaries compared to the usual 1500 of standard distributions, makes it ideal for environments requiring high security and maintainability.2 Talos management occurs entirely via gRPC APIs using the talosctl tool.2\nVirtual Machine Specifications for Kubernetes Nodes # Creating VMs on Proxmox to host Talos must follow rigorous technical requirements to ensure the stability of etcd and system APIs.\nVM Resource Minimum Requirement Recommended Configuration CPU Type host Enables all hardware extensions of the physical CPU.7 CPU Cores 2 Cores 4 Cores for Control Plane nodes.7 RAM Memory 2 GB 4-8 GB to ensure operational fluidity and caching.7 Disk Controller VirtIO SCSI Support for TRIM command and reduced latencies.7 Storage 20 GB 32 GB or higher for logs and ephemeral local storage.10 Using the \u0026ldquo;host\u0026rdquo; CPU type is fundamental as it allows Talos to access advanced virtualization and encryption instructions of the physical processor, improving the performance of etcd and traffic encryption processes.7 Furthermore, enabling the QEMU agent in the Proxmox VM settings allows for more granular management of the operating system, such as clean shutdowns and clock synchronization, although Talos handles many of these functions natively via its APIs.7\nMetalLB Implementation: Theory and Network Mechanisms # MetalLB solves the problem of external reachability for Kubernetes services by acting as a software implementation of a network load balancer. It works by monitoring services of type LoadBalancer and assigning them an IP address from a pool configured by the administrator.11 There are two main operational modes: Layer 2 (ARP/NDP) and BGP.\nHow Layer 2 Mode Works # In Layer 2 mode, MetalLB uses the ARP protocol for IPv4 and NDP for IPv6. 
When an IP is assigned to a service, MetalLB elects one of the cluster nodes as the \u0026ldquo;owner\u0026rdquo; of that IP.4 That node will start responding to ARP requests for the service\u0026rsquo;s External-IP with its own physical MAC address. From the perspective of the external network (e.g., the lab or office router), it looks like the node has multiple IP addresses associated with its network card.4\nThis mode is extremely popular in home labs and small businesses because it requires no configuration on existing routers; it works on any standard Ethernet switch.4 However, it has a structural limit: all inbound traffic for a given VIP is directed to a single node. Although kube-proxy then distributes this traffic to the actual pods on other nodes, inbound bandwidth is limited by the network capacity of the single leader node.4\nBorder Gateway Protocol (BGP) and Scalability # For high-traffic production environments, BGP mode is the preferred choice. In this case, each cluster node establishes a BGP peering session with the infrastructure routers.4 When a service receives an External-IP, MetalLB announces that route to the router. If the router supports ECMP (Equal-Cost Multi-Pathing), traffic can be distributed equally among all nodes announcing the route, allowing true network-level load balancing and overcoming the limits of Layer 2 mode.13\nUsing BGP on Talos requires careful configuration, especially if advanced CNIs like Cilium are used, which have their own BGP capabilities.14 It is fundamental to avoid conflicts between MetalLB and the CNI, deciding which component should manage peering with physical routers.15\nPractical Installation Guide: From Bootstrap to Configuration # The installation process begins after the Talos cluster has been successfully bootstrapped and kubectl is operational.\nTalos Preparation: System Patching # Before installing MetalLB, it is necessary to apply some changes to the Talos node configuration. 
One of MetalLB\u0026rsquo;s fundamental requirements, when operating with kube-proxy in IPVS mode, is enabling the strictARP parameter.16 In Talos, this is not done by modifying a ConfigMap, but by patching the MachineConfig.\nThe configuration file must include the kube-proxy section to force acceptance of gratuitous ARPs and correctly handle VIP routing.16 Furthermore, if it is desired that Control Plane nodes participate in IP announcements (very common in small clusters), it is necessary to remove the restrictive labels that Kubernetes and Talos apply by default.18\n# Example patch to enable strictARP and remove node restrictions cluster: proxy: config: ipvs: strictARP: true allowSchedulingOnControlPlanes: true # If using masters as workers machine: nodeLabels: node.kubernetes.io/exclude-from-external-load-balancers: \u0026#34;\u0026#34; $patch: delete This patch ensures that the control plane is not excluded from load balancing operations, allowing MetalLB to run its \u0026ldquo;speakers\u0026rdquo; on every available node.18\nInstalling MetalLB via Helm # Using Helm is the recommended method for installing MetalLB as it facilitates version management and RBAC dependencies.16\nNamespace Creation and Security Labels:\nKubernetes applies Pod Security Admissions. MetalLB, needing to manipulate the host\u0026rsquo;s network stack, requires a privileged profile. 
It is essential to label the namespace before installation.16\nkubectl create namespace metallb-system\nkubectl label namespace metallb-system pod-security.kubernetes.io/enforce=privileged\nkubectl label namespace metallb-system pod-security.kubernetes.io/audit=privileged\nkubectl label namespace metallb-system pod-security.kubernetes.io/warn=privileged\nHelm Execution:\nThe official repository is added and installation proceeds.10\nhelm repo add metallb https://metallb.github.io/metallb\nhelm repo update\nhelm install metallb metallb/metallb -n metallb-system\nDefining Custom Resources (CRD) # Once installed, MetalLB remains inactive until IP address pools and announcement modes are defined.16 These configurations are now managed via Custom Resource Definitions (CRD) and no longer via ConfigMap as in versions prior to 0.13.\nIPAddressPool: defines the range of IP addresses that MetalLB can assign. It is crucial that these IPs are not in the router\u0026rsquo;s DHCP range to avoid conflicts.11\napiVersion: metallb.io/v1beta1\nkind: IPAddressPool\nmetadata:\n  name: default-pool\n  namespace: metallb-system\nspec:\n  addresses:\n  - 192.168.1.50-192.168.1.70\nL2Advertisement: this resource associates the address pool with the Layer 2 announcement mode. Without it, MetalLB will assign IPs but will not respond to ARP requests, rendering the IPs unreachable.20\napiVersion: metallb.io/v1beta1\nkind: L2Advertisement\nmetadata:\n  name: default-advertisement\n  namespace: metallb-system\nspec:\n  ipAddressPools:\n  - default-pool\nIntegration and Security with Proxmox: Spoofing Management # A common hurdle in installing MetalLB on Proxmox is the network protection system integrated into the hypervisor. 
Proxmox\u0026rsquo;s firewall includes \u0026ldquo;IP Filter\u0026rdquo; and \u0026ldquo;MAC Filter\u0026rdquo; features aimed at preventing a VM from using IP or MAC addresses other than those officially assigned in the management panel.21\nSince MetalLB in Layer 2 mode \u0026ldquo;pretends\u0026rdquo; the node possesses the service IP addresses (VIPs), sending ARP responses for IPs not configured on the primary network interface, the Proxmox firewall might block this traffic, identifying it as ARP Spoofing.21\nResolving Proxmox Firewall Restrictions # To allow MetalLB to function, there are three main approaches:\nDisabling MAC Filter: In the VM\u0026rsquo;s (or bridge\u0026rsquo;s) firewall options, disable the MAC filter entry. This allows the VM to send traffic with IP sources other than the primary one.22\nIPSet Configuration: If maintaining a high security level is desired, an IPSet named ipfilter-net0 (where net0 is the VM interface) can be created, including all IP addresses in the MetalLB pool. In this way, the Proxmox firewall will know that those IPs are authorized for that specific VM.21\nManual ebtables Rules: In advanced scenarios, the administrator can insert ebtables rules on the Proxmox host to specifically allow ARP traffic for the MetalLB range.23\n# Example ebtables command to allow ARP on a specific VM\nebtables -I FORWARD 1 -i fwln\u0026lt;VMID\u0026gt;i0 -p ARP --arp-ip-dst 192.168.1.50/32 -j ACCEPT\nOmission of these steps is the primary cause of MetalLB installation failures on Proxmox, leading to situations where the Kubernetes service shows a correctly assigned External-IP, but that IP is unreachable (not pingable) from outside the cluster.19\nMonitoring and Troubleshooting # The immutable nature of Talos OS makes troubleshooting different from traditional systems. 
Because SSH access is unavailable and tcpdump cannot be run directly on the node, it is necessary to rely on MetalLB pod logs and talosctl tools.\nSpeaker Log Analysis # The \u0026ldquo;speaker\u0026rdquo; pods are responsible for IP announcements. If an IP is unreachable, the first step is to check the speaker logs on the node that should be the leader for that service.4\nkubectl logs -n metallb-system -l component=speaker\nIn the logs, it is possible to observe if the speaker has detected the service, if it has correctly elected a leader, and if it is encountering errors in sending gratuitous ARP packets. If logs show the announcement occurred correctly but the router does not see the address, the problem almost certainly lies in the Proxmox virtualization layer or the physical switch.8\nVerifying L2 Status (ServiceL2Status) # MetalLB provides a status resource that shows which node is currently serving a given IP.19\nkubectl get servicel2statuses.metallb.io -n metallb-system\nThis information is vital for understanding if traffic is being directed to the correct node and for verifying cluster behavior during a simulated failover (e.g., rebooting a worker node).6\nConflicts with CNI and Pod-to-Pod Routing # In some cases, traffic reaches the node but is not correctly routed to the pods. 
This can happen if the CNI (such as Cilium or Calico) has a configuration that conflicts with the routing rules created by kube-proxy in IPVS mode.12 If using Cilium, it is recommended to check if Cilium\u0026rsquo;s \u0026ldquo;L2 Announcement\u0026rdquo; feature is active; if it is, it will perform the same function as MetalLB, rendering the latter redundant or even harmful to network stability.14\nPerformance Optimization and High Availability # A professional Kubernetes cluster on Proxmox must be designed to withstand failures and scale efficiently.\nLoad Balancing and Hardware Offloading # Using VirtIO in Proxmox allows offloading some network functions (such as checksum offload) to the host CPU, reducing the load on the Talos VM.7 Furthermore, the implementation of MetalLB in BGP mode, as discussed, allows leveraging physical network hardware (enterprise routers like MikroTik or Cisco) to manage packet-level balancing, ensuring no single node becomes the bottleneck for application traffic.13\nFailover and Convergence Times # In Layer 2 mode, failover time depends on the speed at which nodes detect a peer\u0026rsquo;s failure and the rapidity with which the router updates its ARP table.6 Talos optimizes this process thanks to an extremely lean and responsive Linux kernel. To further accelerate failover, MetalLB can be configured with protocols like BFD (Bidirectional Forwarding Detection) in BGP mode, reducing failure detection times from seconds to milliseconds.13\nFinal Considerations on Day-2 Management # MetalLB integration on Talos and Proxmox does not end with initial installation. \u0026ldquo;Day-2\u0026rdquo; management concerns updates, security monitoring, and cluster expansion. Thanks to the declarative nature of Talos and MetalLB, the entire infrastructure can be managed as code (Infrastructure as Code). 
Using tools like Terraform for VM creation on Proxmox and Helm for managing Kubernetes components allows recreating the entire environment deterministically in case of disaster recovery.8\nIn conclusion, the synergy between Proxmox stability, Talos OS\u0026rsquo;s intrinsic security, and MetalLB\u0026rsquo;s versatility creates an ideal ecosystem for hosting modern applications. Attention to detail in Layer 2 networking configuration and the elimination of Proxmox\u0026rsquo;s restrictive filters are the pillars for a successful deployment that ensures services are not only operational but also constantly accessible and high-performing for end users. The continuous evolution of these tools suggests a future where the distinction between public cloud and private data center will become increasingly blurred, thanks to software-defined solutions that bring cloud agility directly to the bare metal of one\u0026rsquo;s own infrastructure.1\nBibliography # Proxmox vs Talos – Deciding on the Best Infrastructure Solution - simplyblock, accessed on January 2, 2026, https://www.simplyblock.io/blog/proxmox-vs-talos/ Using Talos Linux and Kubernetes bootstrap on OpenStack - Safespring, accessed on January 2, 2026, https://www.safespring.com/blogg/2025/2025-03-talos-linux-on-openstack/ How to setup the MetalLB | kubernetes-under-the-hood - GitHub Pages, accessed on January 2, 2026, https://mvallim.github.io/kubernetes-under-the-hood/documentation/kube-metallb.html MetalLB: A Load Balancer for Bare Metal Kubernetes Clusters | by 8grams - Medium, accessed on January 2, 2026, https://8grams.medium.com/metallb-a-load-balancer-for-bare-metal-kubernetes-clusters-ef8a9e00c2bd Networking best practice | Proxmox Support Forum, accessed on January 2, 2026, https://forum.proxmox.com/threads/networking-best-practice.163550/ MetalLB in layer 2 mode :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 2, 2026, https://metallb.universe.tf/concepts/layer2/ Talos with 
Kubernetes on Proxmox | Secsys, accessed on January 2, 2026, https://secsys.pages.dev/posts/talos/ epyc-kube/docs/proxmox-metallb-subnet-configuration.md at main \u0026hellip;, accessed on January 2, 2026, https://github.com/xalgorithm/epyc-kube/blob/main/docs/proxmox-metallb-subnet-configuration.md How I Setup Talos Linux. My journey to building a secure… | by Pedro Chang | Medium, accessed on January 2, 2026, https://medium.com/@pedrotychang/how-i-setup-talos-linux-bc2832ec87cc Highly available kubernetes cluster with etcd, Longhorn and \u0026hellip;, accessed on January 2, 2026, https://wiki.joeplaa.com/tutorials/highly-available-kubernetes-cluster-on-proxmox MetalLB Load Balancer - Documentation - K0s docs, accessed on January 2, 2026, https://docs.k0sproject.io/v1.34.2+k0s.0/examples/metallb-loadbalancer/ MetalLB - Ubuntu, accessed on January 2, 2026, https://ubuntu.com/kubernetes/charmed-k8s/docs/metallb MetalLB in BGP mode :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 2, 2026, https://metallb.universe.tf/concepts/bgp/ Kubernetes \u0026amp; Talos - Reddit, accessed on January 2, 2026, https://www.reddit.com/r/kubernetes/comments/1hs6bui/kubernetes_talos/ Talos with redundant routed networks via bgp : r/kubernetes - Reddit, accessed on January 2, 2026, https://www.reddit.com/r/kubernetes/comments/1iy411r/talos_with_redundant_routed_networks_via_bgp/ Installation :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 2, 2026, https://metallb.universe.tf/installation/ Kubernetes Homelab Series Part 3 - LoadBalancer With MetalLB \u0026hellip;, accessed on January 2, 2026, https://blog.dalydays.com/post/kubernetes-homelab-series-part-3-loadbalancer-with-metallb/ Unable to use MetalLB on TalosOS linux v.1.9.3 on Proxmox · Issue #2676 - GitHub, accessed on January 2, 2026, https://github.com/metallb/metallb/issues/2676 Unable to use MetalLB load balancer for TalosOS v1.9.3 · Issue #10291 · siderolabs/talos, accessed on 
January 2, 2026, https://github.com/siderolabs/talos/issues/10291 Configuration :: MetalLB, bare metal load-balancer for Kubernetes, accessed on January 2, 2026, https://metallb.universe.tf/configuration/ Implementing MAC Filtering for IPv4 in Proxmox Using Built-In Firewall Features, accessed on January 2, 2026, https://forum.proxmox.com/threads/implementing-mac-filtering-for-ipv4-in-proxmox-using-built-in-firewall-features.157726/ [SOLVED] - Allow MAC spoofing? - Proxmox Support Forum, accessed on January 2, 2026, https://forum.proxmox.com/threads/allow-mac-spoofing.84424/ Block incoming ARP requests if destination ip is not part of ipfilter-net[n], accessed on January 2, 2026, https://forum.proxmox.com/threads/block-incoming-arp-requests-if-destination-ip-is-not-part-of-ipfilter-net-n.144135/ Filter ARP request - Proxmox Support Forum, accessed on January 2, 2026, https://forum.proxmox.com/threads/filter-arp-request.118505/ Creating ExternalIPs in OpenShift with BGP and MetalLB | - Random Tech Adventures, accessed on January 2, 2026, https://xphyr.net/post/metallb_and_ocp_using_bgp/ Setting up a Talos kubernetes cluster with talhelper - beyondwatts, accessed on January 2, 2026, https://www.beyondwatts.com/posts/setting-up-a-talos-kubernetes-cluster-with-talhelper/ ","date":"7 January 2026","externalUrl":null,"permalink":"/guides/talos-proxmox-metallb/","section":"Guides","summary":"","title":"Integration and Optimization of MetalLB on Talos OS Kubernetes Clusters in Proxmox Virtual Environments","type":"guides"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/kube-vip/","section":"Tags","summary":"","title":"Kube-Vip","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/kvm/","section":"Tags","summary":"","title":"Kvm","type":"tags"},{"content":"","date":"7 January 
2026","externalUrl":null,"permalink":"/tags/load-balancing/","section":"Tags","summary":"","title":"Load-Balancing","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/lxc/","section":"Tags","summary":"","title":"Lxc","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/metallb/","section":"Tags","summary":"","title":"Metallb","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/production/","section":"Tags","summary":"","title":"Production","type":"tags"},{"content":"The evolution of digital infrastructures has made virtualization a cornerstone not only for large enterprise data centers but also for research contexts and home labs. Proxmox Virtual Environment (VE) stands out in this landscape as an enterprise-class virtualization management platform, completely open-source, which integrates the KVM (Kernel-based Virtual Machine) hypervisor and LXC-based containers (Linux Containers) into a single solution.1 This discussion explores every aspect of the platform in depth, starting from fundamental concepts to advanced configurations for production deployment, while providing a critical analysis compared to public cloud giants such as Amazon Web Services (AWS) and Google Cloud Platform (GCP).\nChapter 1: Fundamentals and System Architecture # Proxmox VE is a type 1 hypervisor, defined as \u0026ldquo;bare metal,\u0026rdquo; since it is installed directly on the physical hardware without the need for a pre-existing underlying operating system.1 This architecture ensures that machine resources — CPU, RAM, storage, and network connectivity — are managed directly by the virtualization software, drastically reducing overhead and improving overall performance.1\nThe Kernel and the Debian Base # Proxmox\u0026rsquo;s stability derives from its Debian GNU/Linux base, on which a modified kernel is applied to support critical virtualization and clustering 
functions.3 Integration with Debian allows Proxmox to benefit from a vast ecosystem of packages and update management via the APT (Advanced Package Tool) tool, making system maintenance familiar for Linux administrators.4\nThe Pillars of Management: pveproxy and pvedaemon # Proxmox\u0026rsquo;s operation is orchestrated by a series of specialized services working in concert to offer a smooth management interface. The pveproxy service acts as the web interface, operating on port 8006 via the HTTPS protocol.1 This component acts as the main entry point for the user, allowing total control of the datacenter via a browser.1\nThe pvedaemon, instead, represents the operational engine that executes commands given by the user, such as creating virtual machines or modifying network settings.1 In a cluster environment, pve-cluster comes into play, a service that keeps configurations synchronized between nodes using a cluster file system (pmxcfs).1 This architecture ensures that, should an administrator make a change on a node, that information is instantly available across the entire cluster, ensuring operational integrity.1\nComponent Main Function Dependencies KVM Hypervisor for full virtualization CPU Extensions (Intel VT-x / AMD-V) LXC Lightweight virtualization via containers Host kernel sharing QEMU Hardware emulation for VMs KVM for acceleration pveproxy Web interface server (Port 8006) SSL certificates pvedaemon Execution of administrative tasks System APIs pve-cluster Multi-node synchronization Corosync (Ports 5404/5405) 1\nChapter 2: Virtualization Technologies: KVM vs. LXC # Proxmox\u0026rsquo;s distinctive strength lies in its ability to offer two complementary virtualization technologies under the same roof, allowing administrators to choose the most suitable tool based on the specific workload.1\nKVM and QEMU: Full Virtualization # The KVM/QEMU pairing represents the solution for full virtualization. 
In this scenario, each virtual machine behaves like an independent physical computer, equipped with its own BIOS/UEFI and an autonomous operating system kernel.1 QEMU handles the emulation of hardware components — such as disk controllers, network cards, and video cards — while KVM leverages the CPU\u0026rsquo;s hardware capabilities to execute guest code at near-native speeds.1\nThis technology is indispensable for running non-Linux operating systems, such as Microsoft Windows, or Linux instances that require custom kernels or total isolation for security reasons.1 However, full virtualization comes with a cost in terms of resources: each VM requires a dedicated portion of RAM and CPU that cannot be easily shared with other instances, making it less efficient for lightweight services.1\nLXC: Efficiency and Speed of Containers # Linux Containers (LXC) offer a radically different approach. Instead of emulating hardware, LXC isolates processes within the host environment, sharing the Proxmox operating system kernel.1 This eliminates the need to boot an entire kernel for each application, reducing boot times to a few seconds and drastically slashing memory and CPU usage.1\nContainers are ideal for running standard Linux services, such as Nginx web servers, databases, or nested Docker instances. 
The main limitation lies in compatibility: a container can only run Linux distributions and cannot have a different kernel from the host\u0026rsquo;s.1 Nevertheless, for scalable workloads, LXC represents the preferred choice for optimizing service density on a single mini PC.1\nPerformance Analysis: Case Studies # Comparative studies indicate that LXC tends to outperform KVM in CPU- and memory-intensive tasks, thanks to lower overhead.8 However, anomalous cases have been detected: in some tests related to Java or Elasticsearch workloads, KVM VMs showed superior performance compared to LXC containers or even bare metal hardware.9 This phenomenon is often attributed to how the VM\u0026rsquo;s guest kernel manages process scheduling and memory cache more aggressively than an isolated process in a container would, suggesting that for specific applications, empirical validation is necessary before the final choice.9\nFeature KVM (Virtual Machine) LXC (Container) Isolation Hardware (Maximum) Process (High) Kernel Independent Shared with host Operating Systems Windows, Linux, BSD, etc. Linux only Boot Time 30-60 seconds 1-5 seconds RAM Usage Reserved and fixed Dynamic and shared Overhead Moderate Minimum 1\nChapter 3: The Storage Stack: Performance and Integrity # Data management in Proxmox is extremely flexible, supporting both local and distributed storage. For a homelab user on a mini PC, the choice between ZFS and LVM is decisive for hardware performance and longevity.10\nZFS: The Gold Standard for Integrity # ZFS is much more than just a file system; it is also a logical volume manager with advanced data protection features.10 The most relevant feature is end-to-end checksumming, which automatically detects silent data corruption (bit rot) and, when the pool has redundancy, corrects it.10 ZFS excels in snapshot management and native replication, allowing VM disks to be synchronized between different Proxmox nodes in minutes.10\nHowever, ZFS is resource-demanding. 
It requires direct access to disks (HBA mode), making it incompatible with traditional hardware RAID controllers, which should be avoided.10 Furthermore, ZFS uses RAM as a read cache (ARC), recommending at least 8-16 GB of system memory to operate optimally.10\nLVM and LVM-Thin: Speed and Simplicity # LVM (Logical Volume Manager) is the traditional option for disk management in Linux. Proxmox implements LVM-Thin to allow for \u0026ldquo;thin provisioning,\u0026rdquo; or the ability to virtually allocate more space than is physically available.10 LVM is extremely fast and has near-zero CPU and RAM overhead, making it ideal for mini PCs with budget processors or little memory.10 The downside is the lack of protection against bit rot and the absence of native replication between cluster nodes.10\nDistributed Storage: Ceph and Shared Storage # For more ambitious multi-node configurations, Proxmox integrates Ceph, a distributed storage system that transforms local disks of multiple servers into a single redundant and highly available storage pool.11 Although Ceph is considered the standard for enterprise production, its implementation on mini PCs requires caution: at least three nodes (preferably five) and fast networks (at least 10GbE) are necessary to avoid bottlenecks and unacceptable latencies.11\nStorage Type Type Snapshot Replication Redundancy ZFS Local/Soft RAID Yes Yes Software RAID LVM-Thin Local Yes No No (Requires Hardware RAID) Ceph Distributed Yes Yes Replication between nodes NFS / iSCSI Shared (NAS) Backend dependent No Managed by NAS 10\nChapter 4: Networking and Network Segmentation # Network configuration in Proxmox is based on the abstraction of physical components into virtual bridges, allowing for granular management of traffic between VMs and the outside world.16\nLinux Bridge and Naming Convention # At installation time, Proxmox creates a default bridge named vmbr0, which is linked to the primary physical network card.1 Modern installations use 
predictive interface names (such as eno1 or enp0s3), which avoid name changes due to kernel updates or hardware modifications.16 These names can be customized by creating .link files in /etc/systemd/network/ to ensure total consistency in multi-node configurations.16\nVLAN-Aware Bridge: The Segmentation Guide # To isolate traffic in a home lab (for example, separating IP cameras from production servers), the recommended technique is the use of \u0026ldquo;VLAN-aware\u0026rdquo; bridges.17 Instead of creating a separate bridge for each VLAN, a single bridge can handle multiple 802.1Q tags. Once the option is enabled in the bridge settings, simply specify the \u0026ldquo;VLAN Tag\u0026rdquo; in the VM\u0026rsquo;s network hardware.17\nThis approach offers several advantages:\nSimplicity: Reduces the complexity of /etc/network/interfaces configuration files.17 Flexibility: Allows for changing a VM\u0026rsquo;s network without having to modify the host\u0026rsquo;s network infrastructure.17 Security: Combined with a firewall, it prevents lateral movement between different security zones.17 The Role of OpenVSwitch (OVS) # For even more complex networking scenarios, Proxmox supports OpenVSwitch, a multilayer virtual switch designed to operate in large-scale cluster environments.19 OVS offers advanced monitoring and management features, but requires a separate installation (apt install openvswitch-switch) and manual configuration that may be superfluous for most small labs.19\nChapter 5: From Lab to Production: Maintenance and Updates # Transforming an experimental installation into a production-ready system requires moving to more rigorous management practices, especially regarding security and software integrity.21\nRepository Management: Enterprise vs. No-Subscription # Proxmox offers different channels for updates. 
By default, the system is configured with the \u0026ldquo;Enterprise\u0026rdquo; repository, which guarantees extremely stable and tested packages, but requires a paid subscription.3 For users who do not need official support, the \u0026ldquo;No-Subscription\u0026rdquo; repository is the correct choice.4\nTo switch to the free repository on Proxmox 8, it is necessary to modify the files in /etc/apt/sources.list.d/. The correct procedure involves commenting out the enterprise repository and adding the line for no-subscription, taking care to also include the correct repository for Ceph (even if not used directly, some packages are necessary) to avoid errors during the update.5\nExample configuration for Proxmox 8 (Bookworm):\n# Disable the Enterprise repository sed -i \u0026#39;s/^deb/#deb/\u0026#39; /etc/apt/sources.list.d/pve-enterprise.list # Add the No-Subscription repository cat \u0026gt; /etc/apt/sources.list.d/pve-no-subscription.list \u0026lt;\u0026lt; EOF deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription EOF It is fundamental to always use the apt dist-upgrade command instead of apt upgrade to ensure that new kernel packages and Proxmox\u0026rsquo;s structural dependencies are correctly installed.21\nHardening and Access Security # Production system security starts with administrative access. It is recommended to:\nDisable root login via SSH: Use non-privileged users with sudo.22 Implement 2FA: Proxmox natively supports TOTP and WebAuthn for GUI access.6 SSL Certificates: Replace the self-signed certificate with one issued by Let\u0026rsquo;s Encrypt via the integrated ACME protocol.24 ACME configuration can be performed directly from the GUI under Datacenter \u0026gt; ACME. 
If Proxmox is not publicly exposed, DNS challenges can be used via plugins for providers such as Cloudflare or DuckDNS, allowing for obtaining valid certificates even in isolated local networks.25\nChapter 6: Backup Strategies with Proxmox Backup Server (PBS) # A system without a backup plan cannot be considered \u0026ldquo;in production.\u0026rdquo; Proxmox revolutionized this aspect with the launch of Proxmox Backup Server (PBS), a dedicated solution that integrates perfectly with Proxmox VE.28\nDeduplication and Data Integrity # Unlike traditional backups (based on .vzdump files), PBS operates at the block level. This means that if ten virtual machines run the same Linux operating system, identical data blocks are saved only once on the backup server.28 The advantages are manifold:\nSpace saving: Reduction of necessary storage by up to 90% in homogeneous environments.29 Speed: Incremental backups transfer only modified blocks, reducing execution times from hours to minutes.28 Verification: PBS allows for scheduling periodic integrity checks (Garbage Collection and Verification) to ensure data is not corrupted.28 PBS Implementation # PBS can be installed on dedicated bare metal hardware or, for testing, as a VM (although not recommended for real production of critical backups of the host hosting it).28 A typical configuration involves a rigorous maintenance schedule:\nPruning: Automatic removal of old backups based on retention rules (e.g., keep 7 daily, 4 weekly).28 Garbage Collection: Freeing up physical space on the disk after blocks have been marked for deletion by pruning.28 Operation Recommended Time Purpose VM Backup 02:00 Copy guest data Pruning 03:00 Retention policy application Garbage Collection 03:30 Physical space recovery Verification 05:00 Block integrity check 31\nChapter 7: Optimization for Mini PCs and Energy Saving # Mini PCs are popular for home labs thanks to their low power consumption, but Proxmox is configured by default to maximize 
performance, which can lead to high temperatures and energy waste.32\nCPU Governor: Powersave vs. Performance # By default, Proxmox sets the CPU governor to performance, forcing cores to maximum frequency.33 For mini PCs, it is advisable to change this setting to powersave. Contrary to the name, in modern processors (especially recent Intel Core i5/i7), the powersave governor still allows the CPU to instantly accelerate under load, but drops it to minimum frequencies at idle, saving even 40-50W per node.33\nIt is possible to automate this change by adding a command to the host\u0026rsquo;s crontab:\n@reboot echo \u0026#34;powersave\u0026#34; | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor \u0026gt;/dev/null 2\u0026gt;\u0026amp;1 This ensures that the setting persists after every reboot.33\nAdvanced Power Management: Powertop and ASPM # To further optimize consumption, tools like powertop can be used to identify components that prevent the CPU from entering deep power-saving states (C-states).32 Often, enabling ASPM (Active State Power Management) in the BIOS or via kernel parameters can halve the idle consumption of mini PCs equipped with Intel or Realtek NICs.33\nHardware Passthrough Criticalities # A common challenge in mini PCs is hardware passthrough, for example, passing a SATA controller or an iGPU to a specific VM. 
It has been documented that on some models (such as Aoostar), passing the SATA controller can disable the host CPU\u0026rsquo;s thermal management and boosting functions, as the controller is integrated directly into the SoC.37 In these cases, the host loses the ability to read temperature sensors and, for protection, locks the CPU at the base frequency, degrading overall performance.37\nChapter 8: Clustering and High Availability (HA) # Although Proxmox can operate as a single node, its true power emerges in a cluster configuration.\nThe Science of Quorum # In a Proxmox cluster, stability is guaranteed by the concept of \u0026ldquo;quorum.\u0026rdquo; Each node has one vote and, for the cluster to be operational, a majority of votes (50% + 1) must be present.38 With only two nodes, if one fails, the cluster loses quorum and services stall to avoid the \u0026ldquo;split-brain\u0026rdquo; phenomenon.15\nThe optimal solution is a three-node cluster.38 If you do not have three identical mini PCs, a \u0026ldquo;Quorum Device\u0026rdquo; (QDevice) can be used. A QDevice can be a minimal Linux instance running on a Raspberry Pi or even in a small VM on other hardware, providing the third vote necessary to maintain quorum in a two-primary node setup.15\nLive Migration and HA # With shared storage (such as a NAS via NFS) or via ZFS replication, it is possible to perform \u0026ldquo;Live Migration\u0026rdquo; of virtual machines from one host to another without service interruptions.13 In the event of a node hardware failure, Proxmox\u0026rsquo;s high availability manager (HA Manager) will detect the node\u0026rsquo;s absence and automatically restart VMs on the surviving hosts, minimizing downtime.15\nChapter 9: Proxmox vs. Public Cloud (AWS and GCP) # Many users wonder why manage their own Proxmox server instead of using ready-to-use services like AWS or GCP. 
The answer lies in a balance between costs, control, and learning.\nCost Analysis (TCO) # AWS and GCP use a \u0026ldquo;pay-as-you-go\u0026rdquo; model that may appear cheap initially, but costs scale quickly.40 For an instance with 8 GB of RAM and 2 vCPUs, the cost in the cloud can hover around 50-70 euros per month.42 A mid-range mini PC for a home lab costs about 300-500 euros; the initial investment thus pays for itself in less than a year of continuous use.42 Furthermore, the cloud charges for outbound data traffic (egress), while in your own lab the only limit is your internet connection\u0026rsquo;s bandwidth.45\nPrivacy and Data Sovereignty # Proxmox offers total privacy. Data physically resides on the user\u0026rsquo;s mini PC, not on third-party servers subject to foreign regulations or corporate policy changes.44 This is fundamental for managing sensitive data, personal backups, or for those who wish to avoid \u0026ldquo;vendor lock-in.\u0026rdquo;2\nOperational Complexity and Learning Curve # AWS and GCP offer thousands of managed services (databases, AI, global networking) that Proxmox cannot easily replicate.40 However, learning Proxmox means understanding the fundamentals of IT: hypervisors, file systems, Linux networking, and network security.1 These are universal skills that remain valid regardless of the cloud provider used in the future.38\nDimension Proxmox VE AWS / GCP Hardware Control Total None Egress Costs Zero High Maintenance User (Self-managed) Provider (Managed) AI/ML Integration Manual Native services (Vertex AI, SageMaker) Scalability Limited by hardware Virtually infinite Data Ownership User Service provider 40\nChapter 10: Conclusions and Roadmap for the User # Proxmox VE represents the perfect bridge between home experimentation and professional reliability. 
For a user starting from scratch with a mini PC, the path to production follows precise stages that transform a simple hobby into a resilient infrastructure.\nThe strength of this platform lies not only in its technical capabilities, such as the speed of LXC containers or the integrity of ZFS, but also in its community and its open nature. While the public cloud will continue to dominate global-scale scenarios and \u0026ldquo;cloud-native\u0026rdquo; applications, Proxmox remains the preferred choice for anyone seeking technological independence, economic efficiency, and granular control over their digital environment.\nImplementing Proxmox today means investing in a system that grows with your needs, moving from a single machine to a redundant cluster, protected by state-of-the-art backup and optimized to consume only what is strictly necessary. Whether it\u0026rsquo;s hosting a Home Assistant server, a database for development, or an entire corporate infrastructure, Proxmox VE confirms itself as one of the most complete and powerful virtualization solutions available on the market.\nBibliography # Understanding the Proxmox Architecture: From ESXi to Proxmox VE 8.4 - Dev Genius, accessed on December 29, 2025, https://blog.devgenius.io/understanding-the-proxmox-architecture-from-esxi-to-proxmox-ve-8-4-0d41d300365a What Is Proxmox? 
Guide to Open Source Virtualization - CloudFire Srl, accessed on December 29, 2025, https://www.cloudfire.it/en/blog/proxmox-guida-virtualizzazione-open-source [SOLVED] - Explain please pve-no-subscription | Proxmox Support Forum, accessed on December 29, 2025, https://forum.proxmox.com/threads/explain-please-pve-no-subscription.102743/ Package Repositories - Proxmox VE, accessed on December 29, 2025, https://pve.proxmox.com/wiki/Package_Repositories How to Setup Proxmox VE 8.4 Non-Subscription Repositories + \u0026hellip;, accessed on December 29, 2025, https://ecintelligence.ma/en/blog/how-to-setup-proxmox-ve-84-non-subscription-reposi/ Proxmox VE Port Requirements: The Complete Guide | Saturn ME, accessed on December 29, 2025, https://www.saturnme.com/proxmox-ve-port-requirements-the-complete-guide/ Firewall Ports Cluster Configuration - Proxmox Support Forum, accessed on December 29, 2025, https://forum.proxmox.com/threads/firewall-ports-cluster-configuration.16210/ Proxmox VE: Performance of KVM vs. 
LXC - IKUS, accessed on December 29, 2025, https://ikus-soft.com/en_CA/blog/techies-10/proxmox-ve-performance-of-kvm-vs-lxc-75 Performance of LXC vs KVM - Proxmox Support Forum, accessed on December 29, 2025, https://forum.proxmox.com/threads/performance-of-lxc-vs-kvm.43170/ Choosing the Right Proxmox Local Storage Format: ZFS vs LVM - Instelligence, accessed on December 29, 2025, https://www.instelligence.io/blog/2025/08/choosing-the-right-proxmox-local-storage-format-zfs-vs-lvm/ Proxmox VE Storage Options: Comprehensive Comparison Guide - Saturn ME, accessed on December 29, 2025, https://www.saturnme.com/proxmox-ve-storage-options-comprehensive-comparison-guide/ [SOLVED] - Performance comparison between ZFS and LVM - Proxmox Support Forum, accessed on December 29, 2025, https://forum.proxmox.com/threads/performance-comparison-between-zfs-and-lvm.124295/ Proxmox with Local M.2 Storage: The Best Storage \u0026amp; Backup Strategy (No Ceph Needed), accessed on December 29, 2025, https://www.detectx.com.au/proxmox-with-local-m-2-storage-the-best-storage-backup-strategy-no-ceph-needed/ Mini PC Proxmox cluster with ceph, accessed on December 29, 2025, https://forum.proxmox.com/threads/mini-pc-proxmox-cluster-with-ceph.156601/ HA Best Practice | Proxmox Support Forum, accessed on December 29, 2025, https://forum.proxmox.com/threads/ha-best-practice.157253/ Network Configuration - Proxmox VE, accessed on December 29, 2025, https://pve.proxmox.com/wiki/Network_Configuration Proxmox VLAN Configuration | Bankai-Tech Docs, accessed on December 29, 2025, https://docs.bankai-tech.com/Proxmox/Docs/Networking/VLAN%20Configuration Proxmox VLAN Configuration: Linux Bridge Tagging, Management IP, and Virtual Machines, accessed on December 29, 2025, https://www.youtube.com/watch?v=stQzK0p59Fc Proxmox VLANs Demystified: Step-by-Step Network Isolation for Your Homelab - Medium, accessed on December 29, 2025, 
https://medium.com/@P0w3rChi3f/proxmox-vlan-configuration-a-step-by-step-guide-edc838cc62d8 Proxmox vlan handling - Homelab - LearnLinuxTV Community, accessed on December 29, 2025, https://community.learnlinux.tv/t/proxmox-vlan-handling/3232 How to Safely Update Proxmox VE: A Complete Guide - Saturn ME, accessed on December 29, 2025, https://www.saturnme.com/how-to-safely-update-proxmox-ve-a-complete-guide/ Proxmox server hardening document for compliance, accessed on December 29, 2025, https://forum.proxmox.com/threads/proxmox-server-hardening-document-for-compliance.146961/ [SOLVED] - converting from no subscription repo to subscription - Proxmox Support Forum, accessed on December 29, 2025, https://forum.proxmox.com/threads/converting-from-no-subscription-repo-to-subscription.164060/ How to Secure Your Proxmox VE Web Interface with Let\u0026rsquo;s Encrypt SSL - Skynats, accessed on December 29, 2025, https://www.skynats.com/blog/how-to-secure-your-proxmox-ve-web-interface-with-lets-encrypt-ssl/ Automate Proxmox SSL Certificates with ACME and Dynv6, accessed on December 29, 2025, https://bitingbytes.de/posts/2025/proxmox-ssl-certificate-with-dynv6/ Managing Certificates in Proxmox VE 8.1: A Step-by-Step Guide - BDRShield, accessed on December 29, 2025, https://www.bdrshield.com/blog/managing-certificates-in-proxmox-ve-8-1/ Step-by-step guide to configure Proxmox Web GUI/API with Let\u0026rsquo;s Encrypt certificate and automatic validation using the ACME protocol in DNS alias mode with DNS TXT validation redirection to Duck DNS. 
- GitHub Gist, accessed on December 29, 2025, https://gist.github.com/zidenis/e93532c0e6f91cb75d429f7ac7f66ba5 Proxmox Backup Server, accessed on December 29, 2025, https://homelab.casaursus.net/proxmox-backup-server/ Features - Proxmox Backup Server, accessed on December 29, 2025, https://www.proxmox.com/en/products/proxmox-backup-server/features How To: Proxmox Backup Server 4 (VM) Installation, accessed on December 29, 2025, https://www.derekseaman.com/2025/08/how-to-proxmox-backup-server-4-vm-installation.html Proxmox Backup Server - Our Home Lab, accessed on December 29, 2025, https://homelab.anita-fred.net/pbs/ Guide for Proxmox powersaving - Technologie Hub Wien, accessed on December 29, 2025, https://technologiehub.at/project-posts/tutorial/guide-for-proxmox-powersaving/ PSA How to configure Proxmox for lower power usage - Home Assistant Community, accessed on December 29, 2025, https://community.home-assistant.io/t/psa-how-to-configure-proxmox-for-lower-power-usage/323731 CPU power throttle back to save energy - Proxmox Support Forum, accessed on December 29, 2025, https://forum.proxmox.com/threads/cpu-power-throttle-back-to-save-energy.27510/ gaming rig to run proxmox server - how do i lower my idle power? 
- Reddit, accessed on December 29, 2025, https://www.reddit.com/r/Proxmox/comments/1fwphxw/gaming_rig_to_run_proxmox_server_how_do_i_lower/ Powersaving tutorial : r/Proxmox - Reddit, accessed on December 29, 2025, https://www.reddit.com/r/Proxmox/comments/1nultme/powersaving_tutorial/ WTR Pro CPU throttling - Proxmox Support Forum, accessed on December 29, 2025, https://forum.proxmox.com/threads/wtr-pro-cpu-throttling.160039/ How to Set Up a Proxmox Cluster for Free – Virtualization Basics - freeCodeCamp, accessed on December 29, 2025, https://www.freecodecamp.org/news/set-up-a-proxmox-cluster-virtualization-basics/ Building a Highly Available (HA) two-node Home Lab on Proxmox - Jon, accessed on December 29, 2025, https://jon.sprig.gs/blog/post/2885 AWS Vs. GCP: Which Platform Offers Better Pricing? - CloudZero, accessed on December 29, 2025, https://www.cloudzero.com/blog/aws-vs-gcp/ AWS vs GCP vs Azure: Which Cloud Platform is Best for Mid-Size Businesses? - Qovery, accessed on December 29, 2025, https://www.qovery.com/blog/aws-vs-gcp-vs-azure What\u0026rsquo;s the Difference Between AWS vs. Azure vs. Google Cloud? - Coursera, accessed on December 29, 2025, https://www.coursera.org/articles/aws-vs-azure-vs-google-cloud I set up a tiny PC Proxmox cluster! : r/homelab - Reddit, accessed on December 29, 2025, https://www.reddit.com/r/homelab/comments/15gkr1r/i_set_up_a_tiny_pc_proxmox_cluster/ Compare Google Compute Engine vs Proxmox VE 2025 | TrustRadius, accessed on December 29, 2025, https://www.trustradius.com/compare-products/google-compute-engine-vs-proxmox-ve Cloud Comparison AWS vs Azure vs GCP – Networking \u0026amp; Security - Exeo, accessed on December 29, 2025, https://exeo.net/en/networking-security-cloud-comparison-aws-vs-azure-vs-gcp/ AWS vs GCP: Unraveling the cloud conundrum - Proxify, accessed on December 29, 2025, https://proxify.io/articles/aws-vs-gcp AWS vs GCP - Which One to Choose in 2025? 
- ProjectPro, accessed on December 29, 2025, https://www.projectpro.io/article/aws-vs-gcp-which-one-to-choose/477 AWS vs. GCP: A Developer\u0026rsquo;s Guide to Picking the Right Cloud - DEV Community, accessed on December 29, 2025, https://dev.to/shrsv/aws-vs-gcp-a-developers-guide-to-picking-the-right-cloud-59a1 ","date":"7 January 2026","externalUrl":null,"permalink":"/guides/proxmox-complete-guide/","section":"Guides","summary":"","title":"Proxmox Virtual Environment: Architecture, Implementation, and Comparative Analysis towards the Public Cloud","type":"guides"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/static-site-generator/","section":"Tags","summary":"","title":"Static-Site-Generator","type":"tags"},{"content":"The evolution of IT infrastructures towards fully declarative and immutable paradigms has found one of its most advanced expressions in the combination of Talos OS and Kubernetes. However, adopting an immutable and shell-less operating system introduces significant challenges when integrating distributed block storage solutions like Longhorn. This technical report comprehensively analyzes the entire installation lifecycle, starting from the configuration of the Proxmox VE hypervisor, moving through the customization of Talos OS via system extensions, up to the production deployment of Longhorn, with a particular focus on performance optimization and the resolution of networking and mounting issues.\nHypervisor Configuration: Proxmox VE as the Cluster Foundation # The stability of a distributed Kubernetes cluster depends largely on the correct configuration of the underlying virtual machines. Proxmox VE offers remarkable flexibility but requires specific settings to meet the rigorous requirements of Talos OS and the input/output (I/O) needs of Longhorn.\nCPU Microarchitecture Requirements and Necessary Instructions # Starting from version 1.0, Talos OS explicitly requires the x86-64-v2 microarchitecture. 
This requirement is fundamental as many default Proxmox installations use the kvm64 CPU type to maximize compatibility during live migration, but this model lacks critical instructions such as cx16, popcnt, and sse4.2, which are necessary for the correct functioning of the Talos kernel and binaries.1\nThe choice of processor type within Proxmox directly influences Longhorn\u0026rsquo;s ability to perform encryption and volume management operations. The recommended setting is host, which exposes all physical CPU capabilities to the virtual machine, ensuring maximum performance for the storage engine.1 If live migration between nodes with different CPUs is a requirement, the administrator must manually configure CPU flags in the VM configuration file /etc/pve/qemu-server/\u0026lt;vmid\u0026gt;.conf by adding the string args: -cpu kvm64,+cx16,+lahf_lm,+popcnt,+sse3,+ssse3,+sse4.1,+sse4.2.1\nCPU Parameter Recommended Value Technical Impact Processor Type host Native x86-64-v2 support and superior cryptographic performance.1 Cores (Control Plane) Minimum 2 Necessary for managing system processes and etcd.1 Cores (Worker Node) 4 or more Support for Longhorn V2 engine polling and workloads.4 NUMA Enabled Optimization of memory access on multi-socket servers.6 Memory Management and SCSI Controller # Talos OS is designed to operate entirely in RAM during critical phases, which makes memory management a potential point of failure. A known limitation of Talos concerns the lack of support for memory hot-plugging. If this feature is enabled in Proxmox, Talos will not be able to correctly detect the total allocated memory, leading to installation errors due to insufficient memory.1 The minimum RAM allocation must be 2 GB for control plane nodes and preferably 8 GB for worker nodes hosting Longhorn, as the latter requires resources for data replication and management of instance manager pods.4\nRegarding storage, the VirtIO SCSI single controller is the preferred choice. 
This configuration allows for the use of dedicated I/O threads for each virtual disk, reducing contention between processes and improving latency, a critical factor when Longhorn must replicate data blocks across multiple nodes over the network.6 Enabling the Discard option on the virtual disk is equally essential to allow the guest operating system to send TRIM commands, ensuring that the underlying storage (especially if based on ZFS or LVM-thin in Proxmox) can reclaim unused space.7\nTalos OS Provisioning: Immutability and Customization # The immutable nature of Talos OS implies that it is not possible to install software or drivers after boot via traditional channels like apt or yum. Therefore, the preparation of the installation image must pre-emptively include all necessary tools for Longhorn.\nUsing the Image Factory and System Extensions # Longhorn depends on binaries and daemons that usually reside at the host level, such as iscsid for volume connection and various filesystem management tools. In Talos, these dependencies are met through \u0026ldquo;System Extensions\u0026rdquo;. 
Sidero Labs\u0026rsquo; Image Factory allows for generating custom ISOs and installers that integrate these extensions directly into the system image.1\nIndispensable extensions for a working Longhorn installation include:\nsiderolabs/iscsi-tools: provides the iscsid daemon and the iscsiadm utility, necessary for mapping Longhorn volumes as local block devices.4 siderolabs/util-linux-tools: includes tools like fstrim, used for filesystem maintenance and reducing the space occupation of volumes.4 siderolabs/qemu-guest-agent: fundamental in Proxmox environments to allow the hypervisor to communicate with the guest, facilitating clean shutdowns and correct display of IP addresses in the management console.1 The image generation process produces a unique schematic ID, which ensures that every node in the cluster is configured identically, fundamentally eliminating the problem of configuration drift.9\nCluster Bootstrapping and Declarative Configuration # Once the Proxmox VMs are booted with the custom ISO, the cluster enters a maintenance mode awaiting configuration. Interaction occurs exclusively through the talosctl utility from the administrator\u0026rsquo;s terminal. Configuration file generation is done via the talosctl gen config command, specifying the control plane endpoint.1\nDuring the modification phase of the controlplane.yaml and worker.yaml files, it is crucial to verify the installation disk identifier. In Proxmox, depending on the controller used, the disk might appear as /dev/sda or /dev/vda. 
The command talosctl get disks --insecure --nodes \u0026lt;IP\u0026gt; identifies the correct device with certainty before the configuration is applied.1\nCluster bootstrapping follows a rigorous sequence:\nApplication of the configuration to the control plane node: talosctl apply-config --insecure --nodes $CP_IP --file controlplane.yaml.1 Cluster initialization (etcd bootstrap): talosctl bootstrap --nodes $CP_IP.1 Retrieval of the kubeconfig file for administrative access to Kubernetes via kubectl.1 Longhorn Integration: Requirements and Volume Architecture # Installing Longhorn on Talos requires meticulous attention to privilege management and filesystem path visibility, as Talos isolates control plane processes and system services into separate mount namespaces.\nKernel Modules and Machine Parameters # Longhorn requires certain kernel modules to be loaded to manage virtual block devices and iSCSI communication. Since Talos does not load all modules by default, they must be explicitly declared in the kernel section of the worker nodes\u0026rsquo; machine configuration.11\nRequired modules include nbd (Network Block Device), iscsi_tcp, iscsi_generic, and configfs.11 Their inclusion ensures that the Longhorn manager can correctly create devices under /dev, which will then be mounted by application pods.\nmachine: kernel: modules: - name: nbd - name: iscsi_tcp - name: iscsi_generic - name: configfs This configuration snippet, once applied, forces the node to reboot to load the necessary modules, making the system ready for distributed storage.11\nMount Propagation and Kubelet Extra Mounts # One of the most common technical hurdles in installing Longhorn on Talos is the isolation of the kubelet process. 
In Talos, kubelet runs inside a container and, by default, has no visibility of user-mounted disks or specific host directories needed for CSI (Container Storage Interface) operations.10\nTo solve this problem, it is necessary to configure extraMounts for the kubelet. This setting ensures that the path where Longhorn stores data is mapped inside the kubelet namespace with mount propagation set to rshared.4 Without this configuration, Kubernetes would be unable to attach Longhorn volumes to application pods, resulting in \u0026ldquo;MountVolume.SetUp failed\u0026rdquo; errors.14\nHost Path Kubelet Path Mount Options Function /var/lib/longhorn /var/lib/longhorn bind, rshared, rw Default path for volume data.15 /var/mnt/sdb /var/mnt/sdb bind, rshared, rw Used if a second dedicated disk is employed.4 rshared propagation is fundamental: it allows a mount performed inside a container (like the Longhorn CSI plugin) to be visible to the host and, consequently, to other containers managed by the kubelet.15\nStorage Strategy: Secondary Disks and Persistence # Although Longhorn can technically store data on Talos\u0026rsquo;s EPHEMERAL partition, this practice is discouraged for production environments. Talos\u0026rsquo;s system partition is subject to changes during operating system updates, and using a secondary disk offers a clear separation between application data and the immutable operating system.4\nAdvantages of Using Dedicated Disks in Proxmox # Adding a second virtual disk (e.g., /dev/sdb) in Proxmox for each worker node offers several architectural advantages. First, it isolates storage I/O traffic from system traffic, reducing latency for sensitive applications. 
Second, it allows for simplified space management: if a node runs out of space for Longhorn volumes, the virtual disk in Proxmox can be expanded without interfering with Talos\u0026rsquo;s critical partitions.4\nTo implement this strategy, the Talos configuration must include instructions to format and mount the additional disk at boot:\nmachine: disks: - device: /dev/sdb partitions: - mountpoint: /var/mnt/sdb Once the disk is mounted at /var/mnt/sdb, this path must be communicated to Longhorn during installation via the Helm values file, setting defaultDataPath to that directory.4\nDisk Format Analysis: RAW vs QCOW2 # The choice of image file format in Proxmox directly impacts the performance of Longhorn, which already internally implements replication and snapshotting mechanisms.\nFeature RAW QCOW2 Performance Maximum (no metadata overhead).18 Lower (overhead due to Copy-on-Write).8 Space Management Occupies the entire allocated space (if not supported by FS holes).19 Supports native thin provisioning.8 Hypervisor Snapshots Not natively supported on file storage.19 Natively supported.8 In an architecture where Longhorn manages redundancy at the cluster level, using the RAW format is often preferred to avoid the \u0026ldquo;double snapshotting\u0026rdquo; phenomenon and reduce write latency.18 However, if the underlying Proxmox infrastructure is based on ZFS, it is crucial to avoid using QCOW2 on top of ZFS to prevent massive write amplification, which would rapidly degrade SSD performance.20\nSoftware Implementation and Configuration of Longhorn # After preparing the Talos infrastructure, Longhorn installation typically occurs via Helm or GitOps operators like Flux or ArgoCD.\nSecurity and Privileged Namespace # Due to the low-level operations it must perform, Longhorn requires elevated privileges. 
With the introduction of Pod Security Standards in Kubernetes, it is imperative to correctly label the longhorn-system namespace to allow pods to run in privileged mode.11\nApplying the following manifest ensures that Longhorn components are not blocked by the admission controller:\napiVersion: v1 kind: Namespace metadata: name: longhorn-system labels: pod-security.kubernetes.io/enforce: privileged pod-security.kubernetes.io/audit: privileged pod-security.kubernetes.io/warn: privileged This step is critical: without it, the Longhorn manager pods or CSI plugins would fail to start, leaving the system in a perpetual waiting state.11\nRecommended Helm Installation Parameters # During Helm installation, some parameters must be adapted for the Talos-Proxmox environment. Using a custom values.yaml file allows for automating these settings:\ndefaultSettings.defaultDataPath: set to the secondary disk path (e.g., /var/mnt/sdb).4 defaultSettings.numberOfReplicas: usually set to 3 to ensure high availability.4 defaultSettings.createDefaultDiskLabeledNodes: if set to true, allows selecting only specific nodes as storage nodes via Kubernetes labels.4 Additionally, to avoid issues during updates in Talos environments, it is often recommended to disable the preUpgradeChecker if it causes inexplicable blocks due to the immutable nature of the host filesystem.11\nPerformance Optimization and Networking # Distributed storage is inherently dependent on network performance. In a Proxmox virtualized environment, the configuration of bridges and VirtIO interfaces can make the difference between a responsive system and one plagued by timeouts.\nMTU Issues and Packet Fragmentation # A common error in Proxmox configurations concerns MTU (Maximum Transmission Unit) mismatch. 
If the physical Proxmox bridge is configured for Jumbo Frames (MTU 9000) to optimize storage traffic, but the Talos VM interfaces are left at the default value of 1500, packet fragmentation will occur, drastically increasing CPU usage and reducing throughput for Longhorn volumes.23\nMTU consistency must be guaranteed along the entire path:\nPhysical switch and Proxmox server NIC. Linux Bridge (vmbr0) or OVS Bridge in Proxmox. Network configuration in the Talos OS YAML file. CNI configuration (e.g., Cilium or Flannel) inside Kubernetes.23 In some recent Proxmox versions (8.2+), bugs related to MTU management with VirtIO drivers have been reported, which can cause TCP connections to hang during intensive transfers. In these cases, forcing the MTU to 1500 at all levels can resolve inexplicable instabilities, at the cost of a slight reduction in efficiency.24\nV2 Engine and SPDK: High Resource Requirements # Longhorn has introduced a new storage engine (V2) based on SPDK (Storage Performance Development Kit). Although it offers superior performance, the requirements for Talos nodes increase significantly. 
The V2 engine uses polling-mode drivers instead of interrupt-based ones, meaning that instance management processes will consume 100% of a dedicated CPU core to minimize latency.5\nV2 engine requirements on Talos:\nHuge Pages: it is necessary to configure the allocation of large memory pages (2 MiB) via sysctl in the Talos configuration (e.g., 1024 pages for a total of 2 GiB).5 CPU Instructions: SSE4.2 support is mandatory, reinforcing the need for the host CPU type in Proxmox.5 Activating the V2 engine must be a weighed choice based on the workload: for high-performance databases, it is recommended, while for general workloads, the V1 engine remains more resource-efficient.5\nOperational Management: Updates, Backup, and Troubleshooting # Maintaining a Longhorn cluster on Talos requires an understanding of specific workflows for immutable systems.\nManaging Talos OS Updates # Updating a Talos node involves rebooting the virtual machine with a new image. During this process, Longhorn must handle the temporary unavailability of a replica.\nSafe update procedure:\nVerify that all Longhorn volumes are in \u0026ldquo;Healthy\u0026rdquo; state and have the full number of replicas. Perform the update one node at a time using talosctl upgrade. Wait for the node to rejoin the Kubernetes cluster and for Longhorn to complete replica rebuilding before proceeding to the next node.9 It is fundamental that the image used for the update contains the same system extensions (iscsi-tools) as the original image; otherwise, Longhorn will lose the ability to communicate with the disks upon the first reboot.9\nData Backup and Disaster Recovery # Although Proxmox allows for backing up the entire VM, for data contained in Longhorn volumes, it is preferable to use the solution\u0026rsquo;s native backup function. 
Longhorn can export snapshots to an external archive (S3 or NFS).11\nIn a Talos environment, if NFS is chosen as the backup target, it is necessary to ensure that the NFSv4 client extension is present in the system image or that kernel support is enabled.15 Configuring a default BackupTarget is a best practice that avoids volume initialization errors in some Longhorn versions.11\nCommon Troubleshooting # A frequent issue concerns nodes being unable to join the cluster after the configuration is applied, often manifesting as an infinite \u0026ldquo;Installing\u0026rdquo; status in the Proxmox console. This is usually due to networking issues (wrong gateway, lack of DHCP, or non-functional DNS) that prevent Talos from downloading the final installation image.28 Reserving a fixed IP address for each node by MAC address in the DHCP server is the recommended solution to ensure consistency during the multiple reboots of the installation process.3\nAnother critical error is \u0026ldquo;Missing Kind\u0026rdquo; when using talosctl patch. This happens if the YAML patch file does not include apiVersion and kind headers. Talos requires every patch either to be a valid Kubernetes object or to match exactly the schema expected for the specific resource.9\nI/O Performance Modeling in Virtualized Environments # Longhorn\u0026rsquo;s performance can be mathematically analyzed considering the latencies introduced by various layers of abstraction. Total write latency ($L_{total}$) in a configuration with synchronous replication can be expressed as:\n$$L_{total} \\approx L_{virt} + L_{fs\\_guest} + \\max(L_{net\\_RTT} + L_{io\\_remote})$$\nWhere the maximum is taken over the remote replicas that must acknowledge the write, and:\n$L_{virt}$: latency introduced by the Proxmox hypervisor and the VirtIO driver. $L_{fs\\_guest}$: filesystem overhead inside the VM (e.g., XFS or Ext4). $L_{net\\_RTT}$: network round-trip time between worker nodes for block replication. $L_{io\\_remote}$: write latency on the physical disk of the remote node. 
In a 1 Gbps network, $L_{net\\_RTT}$ can become the primary bottleneck, especially under heavy load. Adopting a 10 Gbps network drastically reduces this value, allowing Longhorn to approach the performance of local storage.23\nSummary and Final Recommendations # Implementing Longhorn on a Kubernetes cluster based on Talos OS and Proxmox represents an excellent solution for managing stateful workloads in modern environments. The key to success lies in the meticulous preparation of the infrastructure layer and understanding Talos\u0026rsquo;s declarative nature.\nThe following actions are recommended for optimal production deployment:\nPre-emptive Customization: Always integrate iscsi-tools and util-linux-tools into Talos images via the Image Factory to avoid runtime issues.4 Hardware Configuration: Use the host CPU type and dedicated SCSI controllers with I/O threads enabled in Proxmox.1 Data Separation: Always implement secondary disks for Longhorn data storage, avoiding the use of the system partition.4 Network Monitoring: Ensure MTU consistency across all virtual and physical network levels to prevent performance degradation.23 Declarative Security: Manage all configurations, including extra mounts and kernel modules, via versioned YAML files, fully leveraging the GitOps philosophy supported by Talos.29 This architecture, although it has a steeper initial learning curve than traditional Linux distributions, offers security and reproducibility guarantees that make it ideal for the challenges of modern software engineering.\nBibliography # Proxmox - Sidero Documentation - What is Talos Linux?, accessed on December 30, 2025, https://docs.siderolabs.com/talos/v1.9/platform-specific-installations/virtualized-platforms/proxmox Talos on Proxmox, accessed on December 30, 2025, https://homelab.casaursus.net/talos-on-proxmox-3/ Talos with Kubernetes on Proxmox - Secsys, accessed on December 30, 2025, https://secsys.pages.dev/posts/talos/ Storage Solution: Longhorn, 
accessed on December 30, 2025, https://www.xelon.ch/en/docs/storage-solution-longhorn Longhorn | Prerequisites, accessed on December 30, 2025, https://longhorn.io/docs/1.10.1/v2-data-engine/prerequisites/ Best CPU Settings for a VM? I want best Per Thread Performance from 13,900k : r/Proxmox, accessed on December 30, 2025, https://www.reddit.com/r/Proxmox/comments/16i7i2w/best_cpu_settings_for_a_vm_i_want_best_per_thread/ Windows 2022 guest best practices - Proxmox VE, accessed on December 30, 2025, https://pve.proxmox.com/wiki/Windows_2022_guest_best_practices Using the QCOW2 disk format in Proxmox - 4sysops, accessed on December 30, 2025, https://4sysops.com/archives/using-the-qcow2-disk-format-in-proxmox/ Improve Documentation for Longhorn and System Extensions \u0026hellip;, accessed on December 30, 2025, https://github.com/siderolabs/talos/issues/12064 Install Longhorn on Talos Kubernetes - HackMD, accessed on December 30, 2025, https://hackmd.io/@QI-AN/Install-Longhorn-on-Talos-Kubernetes Installing Longhorn on Talos Linux: A Step-by-Step Guide - Phin3has Tech Blog, accessed on December 30, 2025, https://phin3has.blog/posts/talos-longhorn/ A collection of scripts for creating and managing kubernetes clusters on talos linux - GitHub, accessed on December 30, 2025, https://github.com/joshrnoll/talos-scripts Automating Talos Installation on Proxmox with Packer and Terraform, Integrating Cilium and Longhorn | Suraj Remanan, accessed on December 30, 2025, https://surajremanan.com/posts/automating-talos-installation-on-proxmox-with-packer-and-terraform/ Why are Kubelet extra mounts for Longhorn needed? 
· siderolabs talos · Discussion #9674, accessed on December 30, 2025, https://github.com/siderolabs/talos/discussions/9674 Longhorn | Quick Installation, accessed on December 30, 2025, https://longhorn.io/docs/1.10.1/deploy/install/ Kubernetes - Reddit, accessed on December 30, 2025, https://www.reddit.com/r/kubernetes/hot/ Longhorn | Multiple Disk Support, accessed on December 30, 2025, https://longhorn.io/docs/1.10.1/nodes-and-volumes/nodes/multidisk/ Which is better image format, raw or qcow2, to use as a baseimage for other VMs?, accessed on December 30, 2025, https://serverfault.com/questions/677639/which-is-better-image-format-raw-or-qcow2-to-use-as-a-baseimage-for-other-vms Raw vs Qcow2 Image | Storware BLOG, accessed on December 30, 2025, https://storware.eu/blog/raw-vs-qcow2-image/ RAW or QCOW2 ? : r/Proxmox - Reddit, accessed on December 30, 2025, https://www.reddit.com/r/Proxmox/comments/1jh4rlp/raw_or_qcow2/ Performance Tweaks - Proxmox VE, accessed on December 30, 2025, https://pve.proxmox.com/wiki/Performance_Tweaks Longhorn - Rackspace OpenStack Documentation, accessed on December 30, 2025, https://docs.rackspacecloud.com/storage-longhorn/ Strange Issue Using Virtio on 10Gb Network Adapters | Page 2 | Proxmox Support Forum, accessed on December 30, 2025, https://forum.proxmox.com/threads/strange-issue-using-virtio-on-10gb-network-adapters.167666/page-2 qemu virtio issues after upgrade to 9 - Proxmox Support Forum, accessed on December 30, 2025, https://forum.proxmox.com/threads/qemu-virtio-issues-after-upgrade-to-9.169625/ working interface fails when added to bridge - Proxmox Support Forum, accessed on December 30, 2025, https://forum.proxmox.com/threads/working-interface-fails-when-added-to-bridge.106271/ qemu virtio issues after upgrade to 9 | Page 2 - Proxmox Support Forum, accessed on December 30, 2025, https://forum.proxmox.com/threads/qemu-virtio-issues-after-upgrade-to-9.169625/page-2 Installing Longhorn on Talos With Helm - Josh Noll, 
accessed on December 30, 2025, https://joshrnoll.com/installing-longhorn-on-talos-with-helm/ Completely unable to configure Talos in a Proxmox VM · siderolabs \u0026hellip;, accessed on December 30, 2025, https://github.com/siderolabs/talos/discussions/9291 What Longhorn Talos Actually Does and When to Use It - hoop.dev, accessed on December 30, 2025, https://hoop.dev/blog/what-longhorn-talos-actually-does-and-when-to-use-it/ ","date":"7 January 2026","externalUrl":null,"permalink":"/guides/talos-proxmox-longhorn/","section":"Guides","summary":"","title":"Technical Architecture and Implementation of Longhorn on Kubernetes with Talos OS in Proxmox Virtualized Environments","type":"guides"},{"content":"The success of Kubernetes as the de facto standard for container orchestration lies not only in its ability to abstract hardware or manage networking, but fundamentally in its operational model based on controllers. In a massively distributed system, manual workload management would be impossible; stability is instead guaranteed by a myriad of intelligent control loops working incessantly to maintain harmony between what the user declared and what actually happens on physical or virtual servers.1 A controller in Kubernetes is, in its purest essence, an infinite loop, a daemon that observes the shared state of the cluster through the API server and makes the necessary changes to ensure the current state converges towards the desired state.1 This paradigm, borrowed from systems theory and robotics, transforms infrastructure management from an imperative approach (do this) to a declarative one (I want this to be like this).2\nFundamentals and mechanisms of the control loop # To understand controllers \u0026ldquo;from scratch\u0026rdquo;, it is necessary to visualize the cluster not as a static set of containers, but as a dynamic organism regulated by an intelligent thermostat. 
In a room, the thermostat represents the controller: the user sets a desired temperature (the desired state), the thermostat detects the current temperature (the current state) and acts by turning the heating on or off to eliminate the difference.2 In Kubernetes, this process follows a rigid pattern named \u0026ldquo;Watch-Analyze-Act\u0026rdquo;.4\nThe first pillar, the \u0026ldquo;Watch\u0026rdquo; phase, relies on the API server as the single source of truth. Controllers constantly monitor the resources within their purview, leveraging the API server\u0026rsquo;s watch mechanism (itself backed by etcd watches) to react in real time to any change.3 When a user applies a YAML manifest, the API server stores the specification (spec) in etcd, and the corresponding controller is notified of the change.2\nIn the \u0026ldquo;Analyze\u0026rdquo; phase, the controller compares the specification with the state reported in the resource\u0026rsquo;s status field. If the specification requires three replicas of an application but the status reports only two, the analysis identifies a discrepancy.2 Finally, in the \u0026ldquo;Act\u0026rdquo; phase, the controller does not act directly on the container, but sends instructions to the API server to create new objects (like a Pod) or remove existing ones.2 Other components, such as the kube-scheduler and the kubelet, will then perform the necessary physical actions.2 This decoupling ensures that each component is specialized and that the system can tolerate partial failures without losing global consistency.3\nThe Kube-Controller-Manager: the nerve center # Logically, each controller is a separate process, but to reduce operational complexity, Kubernetes groups all core controllers into a single binary called kube-controller-manager.1 This daemon runs on the control plane and manages most of the built-in control loops.1 To optimize performance, the kube-controller-manager allows configuring concurrency, i.e., the number of objects that can be synchronized 
simultaneously for each controller type.1\nController Concurrency Parameter Default Value Impact on Performance Deployment --concurrent-deployment-syncs 5 Update speed of stateless applications StatefulSet --concurrent-statefulset-syncs 5 Orderly management of stateful applications DaemonSet --concurrent-daemonset-syncs 2 Readiness of infrastructural services on new nodes Job --concurrent-job-syncs 5 Simultaneous batch processing capacity Namespace --concurrent-namespace-syncs 10 Speed of resource cleanup and termination ReplicaSet --concurrent-replicaset-syncs 5 Management of the desired number of replicas These parameters are crucial for administrators of large clusters; increasing these values can make the cluster more responsive, but it also increases the load on the control plane CPU and the network traffic towards the API server.1\nDetailed analysis of Workload controllers # Application management in Kubernetes happens through abstractions called workload resources, each governed by a specific controller designed to solve unique orchestration problems.9\nDeployment and ReplicaSet: the stateless standard # The Deployment controller is probably the most used in the Kubernetes ecosystem. 
It provides declarative updates for Pods and ReplicaSets.5 When you define a Deployment, the controller does not create Pods directly, but creates a ReplicaSet, which in turn ensures that the desired number of Pods is always running.5\nThe true power of the Deployment lies in the management of update strategies, primarily the \u0026ldquo;RollingUpdate\u0026rdquo;.11 During a rollout, the Deployment controller creates a new ReplicaSet with the new image version and begins scaling it up, while simultaneously scaling down the old ReplicaSet.15 This mechanism allows for zero-downtime updates and facilitates immediate rollback via the command kubectl rollout undo.18 Deployments are ideal for web applications, APIs, and microservices where individual Pods are considered ephemeral and interchangeable.9\nStatefulSet: identity in distributed chaos # Unlike stateless applications, many systems (such as databases or message queues) require that each instance has a persistent identity and a specific startup order.9 The StatefulSet controller manages the deployment and scaling of a set of Pods providing uniqueness guarantees.21\nEach Pod receives a name derived from an ordinal index (e.g., pod-0, pod-1, ..., pod-(N-1)) which remains constant even if the Pod is rescheduled on another node.17 Furthermore, the StatefulSet guarantees storage persistence: each Pod is associated with a specific PersistentVolume via volumeClaimTemplates.17 If Pod db-0 fails, the controller will create a new one named db-0 and attach it to the same data volume as before, preserving the application state.17\nDaemonSet: ubiquitous infrastructure # The DaemonSet controller ensures that a copy of a Pod is running on all (or some) nodes of the cluster.5 When a new node is added to the cluster, the DaemonSet controller automatically adds the specified Pod to it.9 This is fundamental for services that must reside on every physical machine, such as log collectors (Fluentd, Logstash), monitoring agents 
(Prometheus Node Exporter), or network components (Calico, Cilium).9 It is possible to limit execution to a subset of nodes using label selectors or node affinity.22\nJob and CronJob: finite execution # While the previous controllers manage services that should run indefinitely, Job and CronJob manage tasks that must terminate successfully.9 The Job controller creates one or more Pods and ensures that a specific number of them terminate successfully.24 If a Pod fails due to a container or node error, the Job controller starts a new one until the success quota or the retry limit (backoffLimit) is reached.24\nThe CronJob extends this logic by allowing the execution of Jobs on a scheduled basis, using the standard Unix crontab format.27 This is ideal for nightly backups, periodic report generation, or database maintenance tasks.28\nFeature | Deployment | StatefulSet | DaemonSet | Job\nWorkload Nature | Stateless | Stateful | Infrastructural | Batch / One-off task\nPod Identity | Random (hash) | Stable ordinal | Tied to node | Temporary\nStorage | Shared or ephemeral | Dedicated per replica | Local or specific | Ephemeral\nStartup Order | Random / Parallel | Ordered sequential | Parallel on nodes | Parallel / Sequential\nUsage Example | Nginx, Spring Boot | MySQL, Kafka, Redis | Fluentd, New Relic | DB Migration\nInternal controllers and system integrity # Beyond user-visible controllers managing Pods, the Kubernetes control plane runs numerous \u0026ldquo;system\u0026rdquo; controllers that guarantee the functioning of the infrastructure itself.5\nNode Controller # The Node Controller is responsible for managing the lifecycle of nodes within the cluster.5 Its main functions include:\nRegistration and Monitoring: Keeps track of node inventory and their health status.6 Failure Detection: If a node stops sending heartbeat signals (sign of a network or hardware failure), the Node Controller marks it as NotReady or Unknown.3 Pod Evacuation: If a node remains unreachable for a prolonged period, the controller initiates the 
eviction of Pods managed by Deployment or StatefulSet so they can be rescheduled on healthy nodes.5 Namespace Controller # Namespaces provide a logical isolation mechanism within a cluster.12 The Namespace Controller intervenes when a user requests the deletion of a namespace.5 Instead of an instant deletion, the controller starts an iterative cleanup process: it ensures that all associated resources (Pod, Service, Secret, ConfigMap) are correctly removed before definitively deleting the Namespace object from the etcd database.5\nEndpoints and EndpointSlice Controller # These controllers constitute the connective tissue between networking and workloads. The Endpoints Controller constantly monitors Services and Pods; when a Pod becomes \u0026ldquo;Ready\u0026rdquo; (according to its readiness probe), the controller adds the Pod\u0026rsquo;s IP address to the Endpoints object corresponding to the Service.5 This allows kube-proxy to correctly route traffic.3 The EndpointSlice Controller is a more modern and scalable evolution: it splits the endpoints of a Service into several smaller slices, so that a change to one Pod no longer requires rewriting a single huge object in clusters with thousands of nodes.5\nService Account and Token Controller # Security within the cluster is mediated by Service Accounts, which provide an identity to processes running in Pods.12 The Service Account Controllers automatically create a \u0026ldquo;default\u0026rdquo; account for each new namespace and generate the secret tokens necessary for containers to authenticate with the API server for monitoring or automation operations.8\nCloud Controller Manager (CCM): The interface with providers # In cloud installations (AWS, Azure, Google Cloud), Kubernetes must interact with external resources such as load balancers or managed disks.3 The Cloud Controller Manager (CCM) separates cloud-specific logic from Kubernetes core logic.6\nThe CCM runs three main control loops:\nService Controller: When a Service of type LoadBalancer is created, this controller interacts with the cloud 
provider\u0026rsquo;s APIs (e.g., AWS NLB/ALB) to instantiate an external load balancer and configure its targets towards the cluster nodes.5 Route Controller: Configures the routing tables of the cloud network infrastructure to ensure that packets destined for Pods can travel between different physical nodes.5 Node Controller (Cloud): Queries the cloud provider to determine if a node that has stopped responding has effectively been removed or terminated from the cloud console, allowing for quicker cleanup of cluster resources.5 Extreme extensibility: The Operator pattern and Custom Controllers # One of Kubernetes\u0026rsquo; strengths is its ability to be extended beyond native capabilities.7 While built-in controllers manage general abstractions (Pod, Service), the Operator pattern allows managing complex applications by introducing \u0026ldquo;domain knowledge\u0026rdquo; directly into the control plane.16\nAnatomy of an Operator # An Operator is the union of two components:\nCustom Resource Definition (CRD): Extends the API server allowing the creation of new object types (e.g., an object of type ElasticsearchCluster or PostgresBackup).7 Custom Controller: A custom control loop that watches these new resources and implements specific operational logic, such as performing a backup before a database update or managing data re-sharding.7 Operators automate tasks that would normally require expert human intervention (a Site Reliability Engineer), such as quorum management in a distributed cluster or database schema migration during an application upgrade.16\nDevelopment Tools: Operator SDK and Kubebuilder # Developing a custom controller from scratch is complex, as it requires managing caches, workqueues, and low-latency network interactions.34 Tools like Operator SDK (supported by Red Hat) and Kubebuilder (official Kubernetes SIGs project) provide Go language frameworks to generate boilerplate, manage object serialization, and implement the reconciliation loop 
efficiently.33\nTool | Supported Languages | Key Features\nOperator SDK | Go, Ansible, Helm | Integration with Operator Lifecycle Manager (OLM), ideal for enterprise integrations.33\nKubebuilder | Go | Based on controller-runtime, provides clean abstractions for CRD and Webhook generation.33\nClient-Go | Go | Low-level library for total control, but with a very steep learning curve.33\nElastic Automation: Horizontal Pod Autoscaler (HPA) # The Horizontal Pod Autoscaler (HPA) controller automates horizontal scaling, i.e., adding or removing Pod replicas in response to load.38\nOperation follows a precise mathematical formula to calculate the number of desired replicas:\n$R_{\\text{desired}} = \\left\\lceil R_{\\text{current}} \\times \\frac{\\text{current\\_value}}{\\text{target\\_value}} \\right\\rceil$\nFor example, 4 replicas averaging 90% CPU against a 70% target give $\\lceil 4 \\times 90 / 70 \\rceil = \\lceil 5.14 \\rceil = 6$ replicas. The HPA queries the Metrics Server (or an adapter for custom metrics like Prometheus) to obtain average resource usage.38 If usage exceeds the set threshold (e.g., 70% CPU), the HPA updates the replicas field of the target Deployment or StatefulSet.38 This allows the cluster to adapt to unexpected traffic spikes without manual intervention, while optimizing costs during periods of low activity.38\nPractical Guide: Installation and Configuration of Controllers # Most users interact with controllers through YAML manifest files. 
Here is how to configure and manage the main controllers with real examples.\nConfiguring a Deployment with Rollout strategies # A well-configured Deployment must clearly define how to manage updates.\nYAML\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: api-service\n  labels:\n    app: api\nspec:\n  replicas: 4\n  strategy:\n    type: RollingUpdate\n    rollingUpdate:\n      maxSurge: 25% # Number of extra pods created during rollout\n      maxUnavailable: 25% # Maximum number of pods that can be offline\n  selector:\n    matchLabels:\n      app: api\n  template:\n    metadata:\n      labels:\n        app: api\n    spec:\n      containers:\n      - name: api-container\n        image: myrepo/api:v1.0.2\n        ports:\n        - containerPort: 8080\n        readinessProbe: # Fundamental for the Deployment controller\n          httpGet:\n            path: /healthz\n            port: 8080\n13\nTo manage this controller from CLI:\nView status: kubectl rollout status deployment/api-service.19 See history: kubectl rollout history deployment/api-service.15 Perform rollback: kubectl rollout undo deployment/api-service --to-revision=2.15 Installing an Operator via Operator SDK # Installing an Operator is more involved than creating a native resource, as it requires registering new API types.36\nInstalling CRDs: kubectl apply -f deploy/crds/db_v1alpha1_mysql_crd.yaml. This teaches the API server what a \u0026ldquo;MySQLDatabase\u0026rdquo; is.34\nRBAC Configuration: kubectl apply -f deploy/role.yaml and kubectl apply -f deploy/role_binding.yaml. This gives the controller permissions to create Pods and Services.36\nController Deployment: kubectl apply -f deploy/operator.yaml. 
This starts the Pod that runs the Operator\u0026rsquo;s controller.36\nInstance Creation: Once the Operator is running, the user creates a custom resource to instantiate the application:\nYAML\napiVersion: db.example.com/v1alpha1\nkind: MySQLDatabase\nmetadata:\n  name: production-db\nspec:\n  size: 3\n  storage: 100Gi\n7\nAt this point, the Operator will take charge of the request and orchestrate the creation of necessary StatefulSets, Services, and backups.7\nController Selection: Decision Matrix for Cloud Architects # Identifying the correct controller is a critical architectural decision that influences the resilience and maintainability of the entire system.20\nUsage Scenarios | Controller to use | Why?\nAPI Gateway, Web Front-end, Stateless Microservices | Deployment | Maximum scaling speed and ease of \u0026ldquo;rolling\u0026rdquo; updates.9\nDatabases (PostgreSQL, MongoDB), Queues (RabbitMQ), Stateful AI/ML | StatefulSet | Ensures data remains coupled to the correct instances and provides the stable identities that quorum-based systems rely on.9\nMonitoring, Log Forwarding, Network Proxy (Kube-proxy) | DaemonSet | Ensures every node contributes to cluster observability and connectivity.20\nMassive data processing, ML Model Training, DB Migrations | Job | Manages tasks that must run to success, with built-in retry logic.23\nPeriodic backups, Cache cleaning, Scheduled log rotation | CronJob | Time-based automation, replaces system cron for a containerized environment.27\nComplex Software-as-a-Service (SaaS), Managed-like Database | Operator | When operational logic requires specific steps (e.g., leader promotion) not covered by StatefulSet.7\nOperational Best Practices and Troubleshooting # Correctly managing controllers requires awareness of some common pitfalls that can destabilize the production environment.41\nThe importance of Probes: Liveness and Readiness # The Deployment controller trusts the information provided by containers. 
If a container is \u0026ldquo;Running\u0026rdquo; but the application inside is stalled, the controller will not intervene unless a Liveness Probe is configured.9 Similarly, a Readiness Probe is essential during rollouts: it informs the controller when the new Pod is effectively ready to receive traffic, preventing the rollout from proceeding if the new version is failing silently.9\nResource Requests and Limits: The fuel of Controllers # The scheduler and autoscaling controllers (HPA) depend entirely on resource declarations.41 Without requests, the scheduler might overcrowd a node, leading to degraded performance.9 Without limits, a single Pod with a memory leak could consume all node memory, causing the forced restart of critical system Pods (OOM Killing).41\nLabels and Selectors: The \u0026ldquo;Collision\u0026rdquo; risk # Controllers identify resources within their purview via label selectors.5 A common mistake is using overly generic labels (e.g., app: web) in shared namespaces. If two different Deployments use the same selector, their controllers will conflict, each attempting to manage the other\u0026rsquo;s Pods, leading to continuous container creation and deletion.47 It is good practice to use unique and structured labels.\nHistory Management and Rollback # Kubernetes maintains a limited history of Deployment rollouts (by default, 10 revisions).21 It is important to monitor these limits to ensure one can revert to stable versions in case of serious incidents.15 The use of GitOps tools (like ArgoCD or Flux) that track desired state in a Git repository is the preferred recommendation for managing complex configurations without manual errors.14\nConclusions: Towards Cluster Autonomy # The controller model in Kubernetes represents the culmination of modern distributed systems engineering. 
Understanding how different controllers interact with each other — from the Node Controller detecting a failure, to the Deployment responding by rescheduling Pods, up to the HPA scaling replicas — is what differentiates a passive Kubernetes user from an orchestration expert.2\nThe future of this technology is moving towards ever-greater specialization through Operators, which allow managing not just containers, but the entire business logic lifecycle, from AI-driven databases to software-defined networks. In this ecosystem, the YAML manifest is no longer just a configuration file, but a living contract that a host of intelligent controllers pledges to honor every second, ensuring the application remains always available, secure, and ready to scale.1\nBibliography # kube-controller-manager - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/ Controllers - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/concepts/architecture/controller/ Kubernetes Control Plane: Ultimate Guide (2024) - Plural, accessed on December 31, 2025, https://www.plural.sh/blog/kubernetes-control-plane-architecture/ Kube Controller Manager: A Quick Guide - Techiescamp, accessed on December 31, 2025, https://blog.techiescamp.com/docs/kube-controller-manager-a-quick-guide/ A controller in Kubernetes is a control loop that: - DEV Community, accessed on December 31, 2025, https://dev.to/jumptotech/a-controller-in-kubernetes-is-a-control-loop-that-23d3 Basic Components of Kubernetes Architecture - Appvia, accessed on December 31, 2025, https://www.appvia.io/blog/components-of-kubernetes-architecture Understanding Custom Resource Definitions, Custom Controllers, and the Operator Framework in Kubernetes | by Damini Bansal, accessed on December 31, 2025, https://daminibansal.medium.com/understanding-custom-resource-definitions-custom-controllers-and-the-operator-framework-in-5734739e012d Kubernetes 
Components, accessed on December 31, 2025, https://kubernetes-docsy-staging.netlify.app/docs/concepts/overview/components/ The Guide to Kubernetes Workload With Examples - Densify, accessed on December 31, 2025, https://www.densify.com/kubernetes-autoscaling/kubernetes-workload/ Workload Management - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/concepts/workloads/controllers/ Deployment vs StatefulSet vs DaemonSet: Navigating Kubernetes Workloads, accessed on December 31, 2025, https://dev.to/sre_panchanan/deployment-vs-statefulset-vs-daemonset-navigating-kubernetes-workloads-190j Controllers :: Introduction to Kubernetes, accessed on December 31, 2025, https://shahadarsh.github.io/docker-k8s-presentation/kubernetes/objects/controllers/ Kubernetes Workload - Resource Types \u0026amp; Examples - Spacelift, accessed on December 31, 2025, https://spacelift.io/blog/kubernetes-workload Kubernetes Configuration Good Practices, accessed on December 31, 2025, https://kubernetes.io/blog/2025/11/25/configuration-good-practices/ How do you rollback deployments in Kubernetes? - LearnKube, accessed on December 31, 2025, https://learnkube.com/kubernetes-rollbacks Kubernetes Controllers vs Operators: Concepts and Use Cases \u0026hellip;, accessed on December 31, 2025, https://konghq.com/blog/learning-center/kubernetes-controllers-vs-operators Kubernetes StatefulSet vs. 
Deployment with Use Cases - Spacelift, accessed on December 31, 2025, https://spacelift.io/blog/statefulset-vs-deployment kubectl rollout undo - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/reference/kubectl/generated/kubectl_rollout/kubectl_rollout_undo/ kubectl rollout - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/reference/kubectl/generated/kubectl_rollout/ Kubernetes Deployments, DaemonSets, and StatefulSets: a Deep \u0026hellip;, accessed on December 31, 2025, https://www.professional-it-services.com/kubernetes-deployments-daemonsets-and-statefulsets-a-deep-dive/ StatefulSets - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/ Kubernetes DaemonSet: Examples, Use Cases \u0026amp; Best Practices - Groundcover, accessed on December 31, 2025, https://www.groundcover.com/blog/kubernetes-daemonset Mastering K8s Job Timeouts: A Complete Guide - Plural, accessed on December 31, 2025, https://www.plural.sh/blog/kubernetes-jobs/ What Are Kubernetes Jobs? 
Use Cases, Types \u0026amp; How to Run - Spacelift, accessed on December 31, 2025, https://spacelift.io/blog/kubernetes-jobs How to Configure Kubernetes Jobs for Parallel Processing - LabEx, accessed on December 31, 2025, https://labex.io/tutorials/kubernetes-how-to-configure-kubernetes-jobs-for-parallel-processing-414879 Understanding backoffLimit in Kubernetes Jobs | Baeldung on Ops, accessed on December 31, 2025, https://www.baeldung.com/ops/kubernetes-backofflimit CronJobs | Google Kubernetes Engine (GKE), accessed on December 31, 2025, https://docs.cloud.google.com/kubernetes-engine/docs/how-to/cronjobs CronJob in Kubernetes - Automating Tasks on a Schedule - Spacelift, accessed on December 31, 2025, https://spacelift.io/blog/kubernetes-cronjob CronJob - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/ How to automate your tasks with Kubernetes CronJob - IONOS UK, accessed on December 31, 2025, https://www.ionos.co.uk/digitalguide/server/configuration/kubernetes-cronjob/ Service Accounts | Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/concepts/security/service-accounts/ Operator pattern - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/concepts/extend-kubernetes/operator/ What Is The Kubernetes Operator Pattern? 
– BMC Software | Blogs, accessed on December 31, 2025, https://www.bmc.com/blogs/kubernetes-operator/ Ultimate Guide to Kubernetes Operators and How to Create New Operators - Komodor, accessed on December 31, 2025, https://komodor.com/learn/kubernetes-operator/ The developer\u0026rsquo;s guide to Kubernetes Operators | Red Hat Developer, accessed on December 31, 2025, https://developers.redhat.com/articles/2024/01/29/developers-guide-kubernetes-operators A complete guide to Kubernetes Operator SDK - Outshift | Cisco, accessed on December 31, 2025, https://outshift.cisco.com/blog/operator-sdk Build a Kubernetes Operator in six steps - Red Hat Developer, accessed on December 31, 2025, https://developers.redhat.com/articles/2021/09/07/build-kubernetes-operator-six-steps Kubernetes HPA [Horizontal Pod Autoscaler] Guide - Spacelift, accessed on December 31, 2025, https://spacelift.io/blog/kubernetes-hpa-horizontal-pod-autoscaler HPA with Custom GPU Metrics - Docs - Kubermatic Documentation, accessed on December 31, 2025, https://docs.kubermatic.com/kubermatic/v2.29/tutorials-howtos/hpa-with-custom-gpu-metrics/ Horizontal Pod Autoscaler (HPA) with Custom Metrics: A Guide - overcast blog, accessed on December 31, 2025, https://overcast.blog/horizontal-pod-autoscaler-hpa-with-custom-metrics-a-guide-0fd5cf0f80b8 7 Common Kubernetes Pitfalls (and How I Learned to Avoid Them), accessed on December 31, 2025, https://kubernetes.io/blog/2025/10/20/seven-kubernetes-pitfalls-and-how-to-avoid/ kubectl rollout history - Kubernetes, accessed on December 31, 2025, https://kubernetes.io/docs/reference/kubectl/generated/kubectl_rollout/kubectl_rollout_history/ Install the Operator on Kubernetes | Couchbase Docs, accessed on December 31, 2025, https://docs.couchbase.com/operator/current/install-kubernetes.html The Kubernetes Compatibility Matrix Explained - Plural.sh, accessed on December 31, 2025, https://www.plural.sh/blog/kubernetes-compatibility-matrix/ A pragmatic look at the 
Kubernetes Threat Matrix | by Simon Elsmie | Beyond DevSecOps, accessed on December 31, 2025, https://medium.com/beyond-devsecops/a-pragmatic-look-at-the-kubernetes-threat-matrix-d58504e926b5 Tackle Common Kubernetes Security Pitfalls with AccuKnox CNAPP, accessed on December 31, 2025, https://accuknox.com/blog/avoid-common-kubernetes-mistakes 7 Common Kubernetes Pitfalls in 2023 - Qovery, accessed on December 31, 2025, https://www.qovery.com/blog/7-common-kubernetes-pitfalls ","date":"7 January 2026","externalUrl":null,"permalink":"/guides/kubernetes-controllers-guide/","section":"Guides","summary":"","title":"The controller architecture in Kubernetes: comprehensive guide to the cloud-native automation engine","type":"guides"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/theme/","section":"Tags","summary":"","title":"Theme","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/virtualization/","section":"Tags","summary":"","title":"Virtualization","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/vpn/","section":"Tags","summary":"","title":"Vpn","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/web-development/","section":"Tags","summary":"","title":"Web-Development","type":"tags"},{"content":"","date":"7 January 2026","externalUrl":null,"permalink":"/tags/wireguard/","section":"Tags","summary":"","title":"Wireguard","type":"tags"},{"content":" Introduction: The Illusion of Simplicity # Today the goal seemed trivial: take a static blog generated with Hugo, which currently runs peacefully in a Docker container managed via Compose, and move it inside the Kubernetes cluster.\nOn paper, it\u0026rsquo;s a five-minute operation. Take the compose.yml, translate it into a Deployment and a Service, apply, done. 
In reality, this migration turned into a masterclass on the difference between local volume management (Docker) and distributed storage (Kubernetes/Longhorn), and on how file permissions can become public enemy number one.\nThis is not a \u0026ldquo;copy-paste\u0026rdquo; guide. It is the chronicle of how we dissected the problem, analyzed the failures, and built a resilient solution.\nYes, the blog you are reading right now runs on Kubernetes, self-hosted on Proxmox on my home mini PC!\nPhase 1: The Storage Paradox # The starting point was a simple docker-compose.yml that I used for local development:\nservices:\n  hugo:\n    image: hugomods/hugo:exts-non-root\n    command: server --bind=0.0.0.0 --buildDrafts --watch\n    volumes:\n      - ./:/src # \u0026lt;--- THE CULPRIT\nNote that volumes line. In Docker, I was mapping the current folder of my host inside the container. It\u0026rsquo;s immediate: I modify a file on my laptop, Hugo notices it and regenerates the site.\nThe Conceptual Problem # When we move to Kubernetes, that \u0026ldquo;my laptop\u0026rdquo; no longer exists. The Pod can be scheduled on any node of the cluster. We cannot rely on files present on the host filesystem (unless using hostPath, which is an anti-pattern because it binds the Pod to a specific node, breaking High Availability).\nThe architectural solution is to use a PersistentVolumeClaim (PVC) backed by Longhorn. Longhorn replicates data across multiple nodes, ensuring that if a node dies, the blog data survives and the Pod can restart elsewhere.\nBut here the paradox arises: a new Longhorn volume is empty. If I start the Hugo Pod attached to this empty volume, Hugo will crash instantly because it won\u0026rsquo;t find the config.toml file.\nIngestion Strategy # We had three paths:\nGit-Sync Sidecar: A side-by-side container that constantly clones the Git repo into the shared volume. Elegant, but complex for a personal blog. 
InitContainer: A container that starts before the app, clones the repo, and dies. One-Off Copy: Start the Pod, wait for it to fail (or hang), and manually copy the data once. We opted for a hybrid variant. Since the goal was to keep the \u0026ldquo;watch\u0026rdquo; mode to edit live files (maybe via remote editor in the future), we decided to treat the volume as the \u0026ldquo;Single Source of Truth\u0026rdquo;.\nPhase 2: The Manifest Architecture # Why a Deployment and not a StatefulSet?\nOne often associates the StatefulSet with applications that need storage stability. However, Hugo (in server mode) does not need stable network identities (like hugo-0, hugo-1). It only needs its files. A Deployment with Recreate strategy (to avoid two pods writing to the same RWO volume simultaneously) is sufficient and simpler to manage.\nHere is the final commented manifest:\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: hugo-blog\n  namespace: hugo-blog # Isolation first of all\nspec:\n  replicas: 1\n  strategy:\n    type: Recreate # Avoids Longhorn volume lock\n  selector:\n    matchLabels:\n      app: hugo-blog\n  template:\n    metadata:\n      labels:\n        app: hugo-blog\n    spec:\n      # THE SECRET OF PERMISSIONS\n      securityContext:\n        fsGroup: 1000\n      containers:\n        - name: hugo\n          image: hugomods/hugo:exts-non-root\n          args:\n            - server\n            - --bind=0.0.0.0\n            - --baseURL=https://blog.tazlab.net/\n            - --appendPort=false\n          ports:\n            - containerPort: 1313\n          volumeMounts:\n            - name: blog-src\n              mountPath: /src\n      volumes:\n        - name: blog-src\n          persistentVolumeClaim:\n            claimName: hugo-blog-pvc\nDeep Dive: fsGroup: 1000 # This was the critical moment of the investigation. The image hugomods/hugo:exts-non-root is built to run, as the name says, without root privileges (UID 1000). However, when Kubernetes mounts a volume (especially with certain CSI drivers like Longhorn), the mount directory can belong to root by default.\nResult? 
The container starts, tries to write to the /src folder (for cache or lock files) and receives a Permission Denied.\nThe instruction fsGroup: 1000 in the securityContext tells Kubernetes: \u0026ldquo;Hey, any volume mounted in this Pod must be readable and writable by group 1000\u0026rdquo;. Kubernetes recursively changes the group ownership and permissions of the volume\u0026rsquo;s contents at mount time, solving the problem at the root.\nPhase 3: The Network and Discovery # Once the Pod is running, it must be reachable. Here Traefik, our Ingress Controller, comes into play.\napiVersion: networking.k8s.io/v1\nkind: Ingress\nmetadata:\n  name: hugo-blog-ingress\n  annotations:\n    # The magic of Let\u0026#39;s Encrypt\n    traefik.ingress.kubernetes.io/router.tls.certresolver: myresolver\nspec:\n  ingressClassName: traefik\n  rules:\n    - host: blog.tazlab.net\n      http:\n        paths:\n          - path: /\n            pathType: Prefix\n            backend:\n              service:\n                name: hugo-blog\n                port:\n                  number: 80\nDuring setup, I had to verify the exact name of the resolver configured in Traefik. A quick check on traefik-values.yaml confirmed that the ID was myresolver. Without this exact match, SSL certificates would never be generated.\nA detail often overlooked: BaseURL. Hugo generates internal links based on its configuration. If it runs on internal port 1313, it will tend to create links like http://localhost:1313/post. But we are behind a Reverse Proxy (Traefik) serving on HTTPS port 443. 
The arguments --baseURL=https://blog.tazlab.net/ and --appendPort=false force Hugo to generate correct links for the outside world, regardless of the port the container listens on.\nPhase 4: Operation \u0026ldquo;Data Transplant\u0026rdquo; # With the manifest applied, the Pod went into Running state, but served a blank page or an error, because /src was empty.\nHere we used intelligent brute force: kubectl cp.\n# Local copy -\u0026gt; Remote Pod\nkubectl cp ./blog hugo-blog/hugo-blog-pod-xyz:/src\nThanks to the fsGroup configured earlier, the copied files kept the correct permissions to be read by the Hugo process. Immediately, the Hugo watcher detected the new files (config.toml, content/) and compiled the site in a few milliseconds.\nPost-Lab Reflections # This migration moved the blog from a \u0026ldquo;pet\u0026rdquo; entity (tied to my computer) to \u0026ldquo;cattle\u0026rdquo; (part of the cluster).\nResilience: If the node where Hugo runs dies, Longhorn has replicated the data to another node. Kubernetes reschedules the Pod, which attaches to the data replica and restarts. Downtime: seconds.\nScalability: We don\u0026rsquo;t need it now, but we could scale to more replicas (removing the --watch mode and using Nginx to serve the static files).\nSecurity: Everything runs in HTTPS, with automatically renewed certificates, and the container has no root privileges.\nToday\u0026rsquo;s lesson is that in Kubernetes, storage is a first-class citizen. It is no longer just a folder on disk; it is a network resource with its own access rules, permissions, and lifecycle. 
Ignoring this aspect is the fastest way to a CrashLoopBackOff.\n","date":"6 January 2026","externalUrl":null,"permalink":"/posts/hugo-blog-kubernetes-migration/","section":"Posts","summary":"","title":"Migrating a Hugo Blog to Kubernetes","type":"posts"},{"content":"","date":"6 January 2026","externalUrl":null,"permalink":"/tags/migration/","section":"Tags","summary":"","title":"Migration","type":"tags"},{"content":"","date":"6 January 2026","externalUrl":null,"permalink":"/it/tags/migrazione/","section":"Tags","summary":"","title":"Migrazione","type":"tags"},{"content":" Introduction: The Limit of \u0026ldquo;It Just Works\u0026rdquo; # Until yesterday, our Kubernetes cluster lived in a sort of architectural limbo. The Ingress Controller (Traefik) was configured in hostNetwork: true mode. Simply put, the Traefik Pod hijacked the entire network interface of the node it was running on, listening directly on ports 80 and 443 of the Control Plane\u0026rsquo;s physical IP.\nDoes it work? Yes. Is it a best practice? Absolutely not. This configuration creates a strong coupling between the logical service and the physical infrastructure. If the node dies, the service dies. Furthermore, it blocks those ports for anything else. In cloud providers (AWS, GCP), this problem is solved with a click: \u0026ldquo;Create Load Balancer\u0026rdquo;. But we are \u0026ldquo;on-premise\u0026rdquo; (or rather, \u0026ldquo;on-homelab\u0026rdquo;), where the luxury of ELBs (Elastic Load Balancers) does not exist.\nThe solution is MetalLB: a component that simulates a hardware Load Balancer inside the cluster, assigning \u0026ldquo;virtual\u0026rdquo; IPs to services. 
Today\u0026rsquo;s mission was simple on paper but complex in execution: install MetalLB, configure a dedicated IP zone, and migrate Traefik to make it a first-class citizen of the cluster.\nPhase 1: MetalLB and the Dance of Protocols (Layer 2) # For a home cluster where we don\u0026rsquo;t have expensive BGP routers (like Juniper or Cisco in datacenters), MetalLB offers Layer 2 mode.\nKey Concept: Layer 2 \u0026amp; ARP In this mode, one of the cluster nodes \u0026ldquo;raises its hand\u0026rdquo; and tells the local network: \u0026ldquo;Hey, IP 192.168.1.240 is me!\u0026rdquo;. It does this by sending ARP (Address Resolution Protocol) packets. If that node dies, MetalLB instantly elects another node that starts shouting \u0026ldquo;No, it\u0026rsquo;s me now!\u0026rdquo;. It\u0026rsquo;s a simple yet effective failover mechanism.\nThe Challenge of Tolerations # The first obstacle was architectural. By default, MetalLB installs pods called \u0026ldquo;speakers\u0026rdquo; (those that \u0026ldquo;shout\u0026rdquo; ARP) only on Worker nodes. But in our cluster, traffic was still predominantly entering from the Control Plane. 
If we hadn\u0026rsquo;t had a speaker on the Control Plane, we would have risked having a mute Load Balancer on half the infrastructure.\nWe had to force Helm\u0026rsquo;s hand with a specific tolerations configuration, allowing speakers to \u0026ldquo;get their hands dirty\u0026rdquo; on the Master node as well:\n# metallb-values.yaml speaker: tolerations: - key: \u0026#34;node-role.kubernetes.io/control-plane\u0026#34; operator: \u0026#34;Exists\u0026#34; effect: \u0026#34;NoSchedule\u0026#34; - key: \u0026#34;node-role.kubernetes.io/master\u0026#34; operator: \u0026#34;Exists\u0026#34; effect: \u0026#34;NoSchedule\u0026#34; controller: tolerations: - key: \u0026#34;node-role.kubernetes.io/control-plane\u0026#34; operator: \u0026#34;Exists\u0026#34; effect: \u0026#34;NoSchedule\u0026#34; Without this, no speaker would have been scheduled on the control plane, crippling failover.\nPhase 2: The DHCP Trap (Networking Surgery) # Configuring MetalLB requires an IP address pool to assign. And here we risked disaster.\nThe home router (a Sky Hub) was configured, like many consumer routers, to cover the entire 192.168.1.x subnet with its DHCP server (range .2 - .253).\nThe Danger of IP Conflict If we had told MetalLB \u0026ldquo;Use the range .50-.60\u0026rdquo; without touching the router, we would have created a ticking time bomb. Scenario:\nMetalLB assigns .50 to Traefik. Everything works. I come home, my phone connects to Wi-Fi. The router, unaware of MetalLB, assigns .50 to my phone. Result: IP Conflict. The Kubernetes cluster and my phone start fighting over who owns the address. Packets get lost, connections drop. Chaos. The Solution: \u0026ldquo;DHCP Shrinking\u0026rdquo; Before applying any YAML, we intervened on the router. We drastically reduced the DHCP range, from .2-.253 down to .2-.120. This created a \u0026ldquo;No Man\u0026rsquo;s Land\u0026rdquo; (from .121 to .254) where the router dares not venture. 
It is in this safe space that we carved out the pool for MetalLB.\n# metallb-config.yaml apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: name: main-pool namespace: metallb-system spec: addresses: - 192.168.1.240-192.168.1.245 # Safe Zone --- apiVersion: metallb.io/v1beta1 kind: L2Advertisement metadata: name: l2-adv namespace: metallb-system spec: ipAddressPools: - main-pool Phase 3: Refactoring Traefik (The Big Leap) # With MetalLB ready to serve IPs, the time came to detach Traefik from the hardware.\nThe changes to Traefik\u0026rsquo;s values.yaml were radical:\nGone hostNetwork: true: The pod now lives in the cluster\u0026rsquo;s virtual network, isolated and secure. Gone nodeSelector: We no longer force Traefik to run on the Control Plane. It can (and must) go to Workers. Service Type LoadBalancer: The keystone. We ask the cluster for an external IP. But migrations are never painless.\nPhase 4: Chronicle of a Debugging (The Struggle) # Just as we launched the Helm upgrade, we ran into two classic but educational problems.\n1. The Volume Deadlock (RWO) # Traefik uses a persistent volume (Longhorn) to save SSL certificates (acme.json). This volume is of type ReadWriteOnce (RWO), which means it can be mounted by only one node at a time.\nWhen Kubernetes tried to move Traefik from the Control Plane to the Worker:\nIt created the new pod on the Worker. The old pod on the Control Plane was still shutting down (Terminating). The volume still appeared \u0026ldquo;attached\u0026rdquo; to the old node. The new pod remained stuck in ContainerCreating with the error Multi-Attach error. Solution: Sometimes Kubernetes is too polite. We had to force delete the old pod and scale the deployment to 0 replicas to \u0026ldquo;unlock\u0026rdquo; the volume from Longhorn, then allowing the new pod to mount it cleanly.\n2. The Permission War (Root vs Non-Root) # In the hardening process, we decided to run Traefik as a non-privileged user (UID 65532), abandoning root. 
However, the existing acme.json file in the volume had been created by the old Traefik (which ran as root).\nResult? open /data/acme.json: permission denied\nUser 65532 looked at the file owned by root and couldn\u0026rsquo;t touch it. The fsGroup parameter in the SecurityContext often isn\u0026rsquo;t enough for existing files on certain storage drivers.\nSolution: The \u0026ldquo;Init Container\u0026rdquo; Pattern Instead of going back and using root (which would be a defeat for security), we implemented an Init Container. It\u0026rsquo;s a small ephemeral container that starts before the main one, executes a command, and exits.\nWe configured it to run as root (and only it!), fix permissions, and leave the field clear for Traefik:\n# traefik-values.yaml snippet initContainers: - name: volume-permissions image: busybox:latest # Brutal but effective command: \u0026#34;This is all yours, user 65532\u0026#34; command: [\u0026#34;sh\u0026#34;, \u0026#34;-c\u0026#34;, \u0026#34;chown -R 65532:65532 /data \u0026amp;\u0026amp; chmod 600 /data/acme.json || true\u0026#34;] securityContext: runAsUser: 0 # Root, necessary for chown volumeMounts: - name: data mountPath: /data Conclusions # Today the cluster took a leap in quality. It is no longer a collection of hacks to make things work at home, but an infrastructure that respects cloud-native patterns.\nWhat we achieved:\nNode Independence: Traefik can die and be reborn on any node; the service IP (192.168.1.240) will follow it thanks to MetalLB. Security: Traefik no longer has access to the host\u0026rsquo;s entire network and runs with a limited user. Order: We clearly separated the router\u0026rsquo;s responsibility (home DHCP) from the cluster\u0026rsquo;s (Static IP Pool). The main lesson? 
Automation (Helm) is powerful, but when touching persistent storage (Stateful) and permissions, surgical human intervention and log understanding (permission denied, multi-attach error) remain irreplaceable.\n","date":"4 January 2026","externalUrl":null,"permalink":"/posts/metallb-traefik-config/","section":"Posts","summary":"","title":"From HostNetwork Chaos to MetalLB Elegance","type":"posts"},{"content":" Introduction: The Persistence Paradox # In the Cloud Native paradigm, we treat workloads as cattle, not pets. Pods are ephemeral, expendable, and stateless. However, operational reality imposes an inescapable constraint: state must reside somewhere. Whether it is a database, system logs, or, as in our specific case, SSL certificates dynamically generated by an Ingress Controller, the need for persistent and distributed Block Storage is the first real obstacle that transforms a \u0026ldquo;toy\u0026rdquo; cluster into a production infrastructure.\nThe goal of this session was not trivial: to implement Longhorn, SUSE/Rancher\u0026rsquo;s distributed storage engine, on an immutable operating system like Talos Linux. The challenge is twofold: Talos, by design, prevents modification of the root filesystem and runtime package installation. This makes installing storage drivers (like iSCSI) an operation that must be planned at the OS image architecture level, not via simple apt or yum commands.\nThis chronicle documents the infrastructure hardening process, physical storage provisioning on Proxmox, and a complex troubleshooting session related to persistent volume permissions during integration with Traefik.\nPhase 1: The Immutability Obstacle and System Extensions # The first technical barrier encountered concerns the very nature of Longhorn. To function, Longhorn creates a virtual block device on each node, which is then mounted by the Pod. 
This operation relies heavily on the iSCSI (Internet Small Computer Systems Interface) protocol.\nIn a traditional Linux distribution (Ubuntu, CentOS), the Longhorn installation would check for the presence of open-iscsi and, if missing, the administrator would install it. On Talos Linux, this is impossible. The filesystem is read-only; there is no package manager.\nAnalysis and Solution: Sidero Image Factory # A preliminary check on the cluster revealed the lack of necessary extensions:\ntalosctl get extensions # Output: No critical extensions installed Without iscsi-tools and util-linux-tools, Longhorn pods would have remained indefinitely in the ContainerCreating state, unable to mount volumes.\nThe architectural solution adopted was the use of Sidero Image Factory. Instead of modifying the existing node, we generated a new OS image definition (a \u0026ldquo;schematic\u0026rdquo;) that natively included the required drivers.\nThe selected extensions were:\nsiderolabs/iscsi-tools: The daemon and user-space utilities for iSCSI. siderolabs/util-linux-tools: Essential filesystem management utilities for automatic volume formatting. siderolabs/qemu-guest-agent: To improve integration with the Proxmox hypervisor. The update was performed in \u0026ldquo;rolling\u0026rdquo; mode, one node at a time, ensuring the cluster remained operational (or nearly so) during the transition.\n# Example of the surgical upgrade command talosctl upgrade --image factory.talos.dev/installer/[ID_SCHEMA]:v1.12.0 --preserve=true This step highlights a fundamental lesson of modern DevOps: infrastructure is managed declaratively. You don\u0026rsquo;t \u0026ldquo;patch\u0026rdquo; servers; you replace the images that govern them.\nPhase 2: Physical Storage Provisioning (Proxmox \u0026amp; Talos) # Once the software was enabled to \u0026ldquo;speak\u0026rdquo; to the storage, we needed to provide the physical storage. 
Although it is possible to use the main operating system disk for data, this is a discouraged practice (anti-pattern) for several reasons:\nI/O Contention: System logs or etcd operations must not compete with database writes. Lifecycle: Reinstalling the operating system (e.g., a Talos reset) could result in formatting the /var partition, deleting persistent data. The Dedicated Disk Strategy # We opted to add a second virtual disk (virtio-scsi or virtio-blk) on each Proxmox VM. Here a critical operational risk emerged: device identification.\nOn Linux, device names (/dev/sda, /dev/sdb, /dev/vda) are not guaranteed to be persistent or deterministic, especially in virtualized environments where boot order can vary. Applying a Talos configuration that formats /dev/sdb when /dev/sdb is actually the system disk would lead to catastrophe (total data loss).\nMitigation Technique: Identification via Size # To mitigate this risk, we adopted a hardware \u0026ldquo;flagging\u0026rdquo; technique. Instead of creating disks identical to the system ones (34GB), we resized the new data disks to 43GB.\n# Pre-formatting verification NODE DISK SIZE TYPE 192.168.1.127 sda 34 GB QEMU HARDDISK (OS) 192.168.1.127 vda 43 GB (Data Target) Only after unequivocally confirming that /dev/vda was the 43GB disk on all nodes did we apply the Talos MachineConfig to partition, format in XFS, and mount the disk at /var/mnt/longhorn.\nThe Kubelet Mount Trick # A technical detail often overlooked is mount visibility. The Kubelet runs inside an isolated container. 
Mounting a disk on the host at /var/mnt/longhorn does not automatically make it visible to the Kubelet.\nWe had to explicitly configure extraMounts with rshared propagation:\nkubelet: extraMounts: - destination: /var/lib/longhorn type: bind source: /var/mnt/longhorn options: - bind - rshared - rw Without rshared, Longhorn would have attempted to mount the volumes, but the Kubelet would not have been able to pass them to the Pods, resulting in \u0026ldquo;MountPropagation\u0026rdquo; errors.\nPhase 3: Longhorn Installation and Configuration # Installation via Helm was relatively painless, thanks to meticulous preparation. However, configuring Longhorn in a two-node environment (one Control Plane and one Worker) requires specific compromises.\nReplica Configuration # By default, Longhorn tries to maintain 3 replicas of data on different nodes to guarantee High Availability (HA). In a 2-node cluster, this requirement is impossible to satisfy (Hard Anti-Affinity).\nWe had to reduce the numberOfReplicas to 2. This configures a \u0026ldquo;minimum fault tolerance\u0026rdquo; situation: if a node goes down, data is still accessible on the other, but redundancy is lost until recovery. This is an acceptable trade-off for a Homelab environment, but critical to understand for production.\nAdditionally, we customized the defaultDataPath to point to /var/lib/longhorn (the path internal to the Kubelet container that maps our dedicated disk), ensuring data never touched the OS disk.\nPhase 4: Traefik Integration and the Permissions Nightmare # The real technical battle began when we attempted to use this new storage to persist Traefik SSL certificates (acme.json file).\nThe Problem: Init:CrashLoopBackOff # After configuring Traefik to use a Longhorn PVC, the pod entered a continuous crash loop. 
Log analysis revealed: chmod: /data/acme.json: Operation not permitted\nRoot Cause Analysis # The conflict arose from three contrasting security vectors:\nKubernetes fsGroup: We instructed Kubernetes to mount the volume making it writable for group 65532 (Traefik\u0026rsquo;s non-root user). This sets permissions to 660 (Read/Write for User and Group). Let\u0026rsquo;s Encrypt / Traefik: For security, Traefik demands that the acme.json file has very strict permissions: 600 (Only the owner user can read/write). If permissions are more open (e.g., 660), Traefik refuses to start. HostNetwork \u0026amp; Privileged Ports: Since we are using hostNetwork: true to expose Traefik directly on the node IP, Traefik must be able to bind to ports 80 and 443. On Linux, ports under 1024 require Root privileges (or the NET_BIND_SERVICE capability). The Infinite Troubleshooting Loop # Initially, we tried to force permissions with an initContainer. Failed: the initContainer did not have root privileges on the mounted filesystem. We then tried changing the user (runAsUser: 65532), but this prevented binding to port 80 (bind: permission denied).\nThe situation was paradoxical:\nIf we ran as Root, we could open port 80, but Kubernetes (via fsGroup) altered file permissions to 660, angering Traefik. If we ran as Non-Root, we could not open port 80. The Definitive Solution: \u0026ldquo;Clean Slate\u0026rdquo; # Resolution required a radical approach:\nRemoval of fsGroup: We removed every fsGroup directive from the Helm values.yaml. This tells Kubernetes: \u0026ldquo;Mount the volume as is, do not touch file permissions\u0026rdquo;. Execution as Root (Temporary): We configured Traefik to run as runAsUser: 0 (Root). This resolves the port 80 binding problem. Volume Reset: Since the existing acme.json file was by now \u0026ldquo;corrupted\u0026rdquo; by previous attempts (it had 660 permissions), Traefik continued to fail even with the new configuration. 
We had to manually delete the file (rm /data/acme.json) from inside the pod. At the next restart, Traefik (running as Root) created a new acme.json. Since there was no fsGroup interfering, the file was created with the correct default permissions (600). The final log was a relief: Testing certificate renew... Register... providerName=myresolver.acme\nPost-Lab Reflections # Implementing Longhorn on a bare-metal (or low-level virtualized) Kubernetes cluster is an exercise that exposes the hidden complexity of distributed storage. It is not enough to \u0026ldquo;install the chart\u0026rdquo;. One must understand how the operating system manages devices, how the Kubelet manages mount points, and how containers manage user permissions.\nLessons Learned:\nImmutability requires planning: On systems like Talos, kernel and userspace dependencies must be \u0026ldquo;baked\u0026rdquo; into the image, not installed retrospectively. Permissions in persistent storage are tricky: Kubernetes\u0026rsquo; fsGroup mechanism is useful for standard databases but can be destructive for applications requiring paranoid file permissions (like Traefik/ACME or SSH keys). Hardware Identification: Never trust device names (/dev/sda). Use UUIDs or, during provisioning, unique disk sizes to avoid catastrophic human errors. The cluster now possesses a resilient persistence layer. 
The next logical step will be to remove the dependency on hostNetwork and Root by introducing a Load Balancer implementation like MetalLB, allowing Traefik to run as an unprivileged user and completing the security architecture.\nGenerated via Gemini CLI\n","date":"2 January 2026","externalUrl":null,"permalink":"/posts/longhorn-kubernetes-storage/","section":"Posts","summary":"","title":"Lab Chronicles: Building Persistence with Longhorn and Talos","type":"posts"},{"content":" Introduction: The (Apparent) Charm of Simplicity # Today I tackled one of those lab sessions that start with an apparently simple goal and end up turning into a masterclass in Kubernetes architecture. The goal was clear: configure a solid entry point (Ingress) for my Talos Linux cluster on Proxmox, exposed via a native VIP (Virtual IP), and install Traefik to manage HTTPS traffic with automatic Let\u0026rsquo;s Encrypt certificates.\nThe mantra of the day was \u0026ldquo;Less is More\u0026rdquo;. No MetalLB (for now). No complex external Load Balancers. I wanted to leverage Talos\u0026rsquo;s native capabilities to manage network High Availability and run Traefik \u0026ldquo;on the metal\u0026rdquo; (HostNetwork).\nWhat follows is not a sterile tutorial, but the faithful chronicle of the challenges, architectural errors, and solutions that led to success.\nPhase 1: The Talos Native VIP (Layer 2) # The first challenge was ensuring a stable IP address (192.168.1.250) that could \u0026ldquo;float\u0026rdquo; between nodes, regardless of which physical machine was powered on.\nThe Reasoning (The Why) # Why a native VIP? In a Bare Metal environment (or VMs on Proxmox), we don\u0026rsquo;t have the convenience of cloud Load Balancers (AWS ELB, Google LB) that provide us with a public IP with a click. The classic alternatives are MetalLB (which announces IPs via ARP/BGP) or Kube-VIP. However, Talos Linux offers a built-in feature to manage shared VIPs directly in the machine configuration (machine config). 
I chose this path to reduce software dependencies: if the operating system can do it, why install another pod to manage it?\nThe Analysis and the Error # I started by identifying the network interface on the nodes (ens18) and creating a patch to announce the IP 192.168.1.250.\n# vip-patch.yaml machine: network: interfaces: - interface: ens18 dhcp: true vip: ip: 192.168.1.250 Applying the patch to the Control Plane node (192.168.1.253) was an immediate success. The node started answering ARP requests for the new IP. The problem arose when I attempted to apply the same patch to the Worker node (192.168.1.127) to ensure redundancy.\nError: virtual (shared) IP is not allowed on non-controlplane nodes\nAnalysis: Talos, by design, limits the use of shared VIPs to Control Plane nodes. This is because the primary use case is High Availability for the API Server (port 6443), not generic user traffic. Impact: We had to accept that our VIP will reside, for now, only on the Control Plane. Is it a Single Point of Failure? Yes, if the CP node dies, we lose the IP. But for a home lab, it is an acceptable compromise that drastically simplifies the stack.\nPhase 2: Helm and Preparation of the Ground # With the VIP active, we needed the \u0026ldquo;engine\u0026rdquo; to install applications. Helm is the de facto standard. Installation was trivial via the official script, but essential. Helm allows us to define our infrastructure as code (Values files) instead of as imperative commands launched randomly.\ncurl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 chmod 700 get_helm.sh \u0026amp;\u0026amp; ./get_helm.sh Phase 3: Traefik and the Configuration Hell # Here the real battle began. We wanted Traefik configured in a very specific way:\nHostNetwork: Listen directly on ports 80/443 of the node (bypassing the K8s overlay network level) to intercept traffic directed to the VIP. ACME (Let\u0026rsquo;s Encrypt): Generate valid SSL certificates. 
Persistence: Save certificates to disk to avoid regenerating them at every restart (and hitting rate-limits). The First Wall: The Helm Syntax # The Traefik chart evolves rapidly. My initial values.yaml configuration used deprecated syntax for redirects (redirectTo) and port exposure. Helm responded with cryptic errors like got boolean, want object.\nSolution: I had to consult the updated documentation (via Context7) and discover that global redirect management is now more robust if passed via additionalArguments rather than trying to fit it into the ports map.\nThe Second Wall: RollingUpdate vs HostNetwork # Once the syntax was corrected, Helm refused installation with an interesting logical error:\nError: maxUnavailable should be greater than 0 when using hostNetwork\nDeep-Dive: When you use hostNetwork: true, a Pod physically occupies port 80 of the node. Kubernetes cannot start a new Pod (update) on the same node until the old one is dead, because the port is occupied. The default strategy maxUnavailable: 0 (which tries to never have downtime) is mathematically incompatible with this constraint on a single node. Solution: I had to modify the updateStrategy to allow maxUnavailable: 1.\nupdateStrategy: type: RollingUpdate rollingUpdate: maxUnavailable: 1 maxSurge: 0 The Third Wall: Pod Security Admission (PSA) # Overcoming the configuration obstacle, the Pods wouldn\u0026rsquo;t start. They remained in CreateContainerConfigError state or weren\u0026rsquo;t created by the DaemonSet. Describing the DaemonSet (kubectl describe ds), the truth emerged:\nError: violates PodSecurity \u0026quot;baseline\u0026quot;: host namespaces (hostNetwork=true)\nAnalysis: Talos and recent Kubernetes versions apply strict security standards by default. A Pod requiring hostNetwork is considered \u0026ldquo;privileged\u0026rdquo; because it can see all node traffic. 
The namespace had to be explicitly authorized.\nSolution:\nkubectl label namespace traefik pod-security.kubernetes.io/enforce=privileged --overwrite Phase 4: The Connection Paradox # Everything looked green. Pod Running. VIP active. But trying to connect to http://192.168.1.250 (or to the domain tazlab.net), I received a dry Connection Refused.\nThe Investigation (Sherlock Mode) # VIP: The VIP 192.168.1.250 is on the Control Plane node (.253). Pod: I checked where the Traefik Pod was running: kubectl get pods -o wide. It was running on the Worker node (.127). The Black Hole: Traffic arrived at node .253 (VIP), but on that node, there was no Traefik listening on port 80! The router sent packets to the right place, but no one answered. Why wasn\u0026rsquo;t Traefik running on the Control Plane? Deep-Dive: Taints \u0026amp; Tolerations. Control Plane nodes have a \u0026ldquo;Taint\u0026rdquo; (a stain) called node-role.kubernetes.io/control-plane:NoSchedule. This tells the scheduler: \u0026ldquo;Do not place any workload here, unless it is explicitly tolerated\u0026rdquo;. Traefik, by default, does not tolerate it.\nThe Definitive Architectural Solution # We had to take a drastic decision to make everything work in harmony:\nAbandon the DaemonSet (which tries to run everywhere). Switch to a Deployment with 1 single replica. Force this replica to run exclusively on the Control Plane node (where the VIP resides). Changes to values.yaml:\n# 1. Tolerate the Control Plane Taint tolerations: - key: \u0026#34;node-role.kubernetes.io/control-plane\u0026#34; operator: \u0026#34;Exists\u0026#34; effect: \u0026#34;NoSchedule\u0026#34; # 2. Force execution on the Control Plane node nodeSelector: kubernetes.io/hostname: \u0026#34;talos-unw-ifc\u0026#34; # Or use generic labels # 3. Single replica Deployment (Crucial for ACME) deployment: kind: Deployment replicas: 1 Why a single replica? 
Because the Community version of Traefik does not support sharing ACME certificates between multiple instances. If we had two replicas, both would try to renew certificates, conflicting or getting banned by Let\u0026rsquo;s Encrypt.\nConclusions and Final State # After applying this \u0026ldquo;surgical\u0026rdquo; configuration, the system came to life.\nThe home router forwards ports 80/443 to the VIP 192.168.1.250. The VIP carries traffic to the Control Plane node. Traefik (now residing on the Control Plane) intercepts the traffic. It recognizes the domain tazlab.net, requests the certificate from Let\u0026rsquo;s Encrypt, saves it to /data (hostPath volume mounted), and serves the whoami application. What have we learned? That \u0026ldquo;simple\u0026rdquo; does not mean \u0026ldquo;easy\u0026rdquo;. Removing abstraction layers (like external Load Balancers) forces us to deeply understand how Kubernetes interacts with the underlying physical network. We had to manually handle node affinity, namespace security, and update strategies.\nThe result is a lean cluster, without resource waste, perfect for a Homelab, but built with the awareness of every single gear.\nNext steps: Configure certificate backups (because now they are on a single node!) and start deploying real services.\n","date":"30 December 2025","externalUrl":null,"permalink":"/posts/talos-vip-traefik-setup/","section":"Posts","summary":"","title":"Lab Chronicles: Native VIP on Talos and Traefik Ingress","type":"posts"},{"content":"","date":"21 December 2025","externalUrl":null,"permalink":"/tags/blog/","section":"Tags","summary":"","title":"Blog","type":"tags"},{"content":"","date":"21 December 2025","externalUrl":null,"permalink":"/tags/docker-compose/","section":"Tags","summary":"","title":"Docker-Compose","type":"tags"},{"content":"This post describes the Hugo installation setup.\nDocker Compose Configuration # The Hugo site is set up using Docker Compose. 
The compose.yml file defines a service named hugo which uses the hugomods/hugo:exts-non-root Docker image. This image includes the extended version of Hugo and runs as a non-root user, enhancing security and providing essential features for a modern Hugo site.\nThe compose.yml also maps the local project directory to /src inside the container, allowing Hugo to serve content from the local files. Port 1313 is exposed to access the development server.\nservices: hugo: image: hugomods/hugo:exts-non-root container_name: hugo command: server --bind=0.0.0.0 --buildDrafts --buildFuture --watch volumes: - ./:/src ports: - \u0026#34;1313:1313\u0026#34; restart: always networks: - frontend networks: frontend: external: true Blowfish Theme Installation # The Blowfish theme is used for this blog. It\u0026rsquo;s a powerful and highly customizable theme built with Tailwind CSS. The theme is added as a git submodule in the themes/blowfish directory.\nTo install the theme and its dependencies, the following commands were used:\nAdd the Blowfish theme as a submodule:\ngit submodule add https://github.com/nunocoracao/blowfish.git themes/blowfish Install dependencies (if applicable, following theme-specific instructions).\nThe configuration for the theme is managed through the files in config/_default/.\nDeployment on Kubernetes # The blog is deployed on a Kubernetes cluster. The deployment uses a git-sync sidecar container to automatically update the blog content whenever changes are pushed to the GitHub repository. Persistent storage is provided by Longhorn.\nGenerated via Gemini CLI\n","date":"21 December 2025","externalUrl":null,"permalink":"/posts/hugo-installation/","section":"Posts","summary":"","title":"Hugo Installation Details","type":"posts"},{"content":"From the beginnings of the first electronic circuits to Kubernetes orchestration, my constant has always been only one: never accepting that a system works without having deeply understood its internal logic. 
Whether it\u0026rsquo;s hardware or the command line, my goal has always been to master the system, understand its deep mechanisms, and, where possible, push its limits a little further.\nToday many talk about Artificial Intelligence and Cloud as absolute novelties, but those who have experienced the evolution of bits \u0026ldquo;under the hood\u0026rdquo; are witnessing an extraordinary convergence. In this post, I want to tell you how I went from the screech of the 56k modem to the orchestration of a Kubernetes cluster in my living room.\nAI Before Power: The Era of SVMs (1999-2001) # At the end of the 90s, Artificial Intelligence was a challenge against the limits of physics. The computing power of that time did not allow classical neural networks to converge in reasonable times.\nIn my degree thesis in Physics, I addressed this limit by using Support Vector Machines (SVM) for the early diagnosis of radiographic tumors. The algorithms were written in C and C++ for maximum performance, with a Java interface. It was an era when AI was built with pure mathematics.\nSlackware, RedHat, and the Modem\u0026rsquo;s Screech # In those same years, I discovered Linux. There were no YouTube tutorials; there were manuals and newsgroups. I installed the first versions of RedHat and Slackware, configuring kernels while the outside world was connected via the twisted pair telephone line and the screech of the 56k modem. There I learned how an operating system \u0026ldquo;thinks.\u0026rdquo;\nFrom the Semantic Web to Enterprise Consulting # After an IT Master\u0026rsquo;s focused on the internet stack and the Semantic Web (the idea of making data understandable to machines), I arrived in Milan. Those were the years of \u0026ldquo;heavyweight\u0026rdquo; consulting in Java, where the modernity of JavaBeans had to communicate with COBOL systems and critical SQL databases. 
An incredible training ground for understanding the management of complexity on a large scale.\nThe Duality: Teaching and Web Development # My path then veered towards teaching Mathematics and Physics. But I never stopped being a developer. During my teaching qualification process, I worked as a JavaScript and PHP programmer, staying connected with the world of the dynamic web.\nIn these years, I understood that learning a new language (Python, Haskell, Rust, XML) is a very fast process if you understand its paradigms. Once you understand the logic (functional, imperative, object-oriented), the difference is almost only syntax.\nThe Spark: AI as a Study Companion # The circle began to close with the advent of the new LLMs. Initially, I used AI for educational purposes: creating personalized materials for my students and, especially, teaching them how to use AI to improve learning and not to \u0026ldquo;cheat.\u0026rdquo;\nBut AI reawakened my desire to experiment. I needed a functioning Linux system and, after 15 years, I installed Linux Mint. I was struck: the Linux world had become simple, elegant, beautiful. Thanks to AI, I didn\u0026rsquo;t have to reread tons of documentation for drivers and services; in two days, I had a perfect workstation.\nThe Escalation: From the VM to the Mini PC # At that point, curiosity became unstoppable. I wanted services active 24/7. I rented a VM, using it as a training ground with AI as a \u0026ldquo;Senior Partner\u0026rdquo; always by my side.\nBut the services grew: Docker, SQL and NoSQL databases, vector databases for LLMs, n8n for automation, Tailscale for the network\u0026hellip; The rented server was no longer enough. Instead of increasing the monthly fee, I made the leap: I bought a Mini PC with enough RAM, a domain, and started building my homelab.\nBeyond Fear: Proxmox and Kubernetes # With AI supporting me, the \u0026ldquo;fear of complexity\u0026rdquo; vanished. 
Why read 300 pages of manual when AI can extract the exact 10 lines I need at that moment?\nSo I installed Proxmox. Then, I told myself: \u0026ldquo;Why stop?\u0026rdquo; I faced the ultimate challenge: Kubernetes. Perhaps it is excessive for a home laboratory, but the subject is too fascinating to be ignored. Today my Home Lab runs on a Kubernetes cluster with Talos Linux on Proxmox.\nConclusion: A New World to Explore # Who am I today? A physicist who saw the birth of AI and who today uses it to break down time barriers. I don\u0026rsquo;t know where this road made of High Availability, security, and scalability will lead me, but I know that in front of me is a new world to explore.\nI can\u0026rsquo;t wait to discover what\u0026rsquo;s behind the next node.\nDid you enjoy this journey? Follow me on blog.tazlab.net to discover how I am configuring my cluster and what the next experiments on the border between physics and cloud are.\n","date":"20 December 2025","externalUrl":null,"permalink":"/about/","section":"Roberto Tazzoli","summary":"","title":"About","type":"page"},{"content":"","date":"20 December 2025","externalUrl":null,"permalink":"/it/tags/biografia/","section":"Tags","summary":"","title":"Biografia","type":"tags"},{"content":"","date":"20 December 2025","externalUrl":null,"permalink":"/tags/biography/","section":"Tags","summary":"","title":"Biography","type":"tags"},{"content":"","date":"20 December 2025","externalUrl":null,"permalink":"/tags/computer-science/","section":"Tags","summary":"","title":"Computer-Science","type":"tags"},{"content":"","date":"20 December 2025","externalUrl":null,"permalink":"/it/tags/fisica/","section":"Tags","summary":"","title":"Fisica","type":"tags"},{"content":"","date":"20 December 2025","externalUrl":null,"permalink":"/it/tags/ia/","section":"Tags","summary":"","title":"Ia","type":"tags"},{"content":"","date":"20 December 
2025","externalUrl":null,"permalink":"/it/tags/informatica/","section":"Tags","summary":"","title":"Informatica","type":"tags"},{"content":"","date":"20 December 2025","externalUrl":null,"permalink":"/tags/physics/","section":"Tags","summary":"","title":"Physics","type":"tags"},{"content":"","externalUrl":null,"permalink":"/authors/","section":"Authors","summary":"","title":"Authors","type":"authors"},{"content":"","externalUrl":null,"permalink":"/series/","section":"Series","summary":"","title":"Series","type":"series"}]