A space for technology notes on anything I find interesting, and a musing or two. From electrons through to developing and delivering code, as well as shiny emerging tech.
This post was written almost entirely by an LLM.
A reflection on building LLM-powered writing assistance while maintaining authentic voice
The Challenge of Authentic LLM Assistance
After well over a year of sporadic posting, I found myself facing a familiar challenge: maintaining consistency in voice and technical depth across content while leveraging the productivity benefits of LLM assistance. The solution emerged through an interesting meta-exercise: using Roo Code (the agentic coding plugin for VSCode/VSCodium) to create its own content creation persona.
After well over a year without a post…
Personal Update
At the end of March 2024, my father died. May he rest in peace. With the process of grieving taking its natural course, this site went well onto the back-burner, or even off the stove entirely for a time.
The Migration Journey
Then we moved house, and my Proxmox cluster turned back into a single node. I stayed up late one night and transitioned the site’s K3s stack onto an Azure VM as an interim measure.
This post explores renting a cloud GPU from RunPod, using the vLLM inference engine to serve a Large Language Model via an OpenAI-compatible endpoint, and then load testing that endpoint with k6.
What is RunPod?
RunPod is a paid cloud GPU provider. It offers:
Pods
We will utilise a pod in this example.
A pod is a container with one or more GPUs attached; we specify the Docker image and the configuration.
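As a preview of where the post ends up, here is a minimal sketch of querying a vLLM server through the official openai Python client. The base URL follows RunPod's pod proxy pattern and the pod ID and model name are placeholders; substitute whatever your pod actually serves. The k6 load testing itself comes later in the post.

```python
from openai import OpenAI

# Point the client at the vLLM server instead of api.openai.com.
# The pod ID, port, and model name below are placeholders.
client = OpenAI(
    base_url="https://<pod-id>-8000.proxy.runpod.net/v1",
    api_key="anything",  # vLLM ignores the key unless started with --api-key
)

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # whichever model the pod serves
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```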
A set of notes on converting a transformers model from PyTorch format to Safetensors format, then quantising it to ExLlamaV2 (Exl2) format using a code-based calibration dataset.
This was inspired by posts reporting that coding LLMs quantised to Exl2 format with the default wikitext calibration dataset produced relatively lower-quality outputs.
ExLlamaV2 is the excellent work of Turboderp: a fast inference library for running LLMs on consumer GPUs, with support for multi-GPU hosts.
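As a rough sketch of the first step, the Safetensors conversion can be done with the safetensors library directly; the filenames here are the usual transformers defaults rather than anything specific to this post.

```python
import torch
from safetensors.torch import save_file

# pytorch_model.bin is the usual filename inside a transformers
# model directory; adjust the paths for your model.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# save_file refuses tensors that share storage, so clone each
# tensor into its own contiguous buffer first.
state_dict = {k: v.contiguous().clone() for k, v in state_dict.items()}

save_file(state_dict, "model.safetensors")
```

An alternative route is to load the model with transformers and call save_pretrained with safe_serialization=True, which writes Safetensors directly.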
A guide to using the Terraform bpg provider to create virtual machines on a Proxmox instance.
The bpg provider is a wrapper for the Proxmox API. It enables the provisioning of infrastructure on Proxmox using Terraform.
bpg is one of two Terraform providers available for Proxmox at the time of writing, the other being telmate. Both are active based on their GitHub repos; at a quick glance bpg looked a bit more active, and a few positive posts about it swayed the decision.
Let’s progress from checking Kubernetes logs in a terminal to using structured log data for searching, visualising, and alerting within a web-based user interface. We will use our Nginx deployment to demonstrate.
Structured logging involves defining shapes for log data, most often represented as JSON key-value pairs.
Compared with unstructured text log entries, structured logs make it easier to find events and turn log data into insights.
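To make that concrete before we touch Nginx, here is a minimal, illustrative Python sketch that emits one JSON object per log line using only the standard library:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""
    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.getLogger("demo").info("user logged in")
# e.g. {"time": "2024-05-01 12:00:00,000", "level": "INFO", "logger": "demo", "message": "user logged in"}
```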
A reflection on how to react to unplanned downtime once services are restored.
The opportunity for growth and improvement is often highest during and directly after the times when complex systems behave unexpectedly.
The potential for damage to stakeholder relationships is also present at these times, particularly within teams or management structures. The phrase “throw someone under the bus” comes to mind: a metaphor for a very painful, and possibly fatal, experience.
Let us walk through setting up GitHub’s Actions Runner Controller (ARC) in a Kubernetes cluster. This will enable running continuous integration and continuous deployment (CI/CD) pipelines using GitHub Actions on our own infrastructure, or on cloud-based Kubernetes.
First, we’ll introduce a bit of the terminology:
Runner: a container which runs code in response to a trigger. Runners may be used to test, build, and deploy code, as well as for far more creative use cases.
This post will explore deploying HashiCorp Vault to K3s (a lightweight Kubernetes distribution) using Helm, then configuring it with Terraform. This will enable us to store our secret state data in Vault and make those secrets available to our K3s resources.
Vault is an enterprise-grade secrets manager, configurable for high availability, which integrates with Kubernetes and many CI toolsets.
In the previous two posts journaling the evolution of this site’s delivery, we have been managing a single secret: the Cloudflared tunnel token.
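For a taste of the client side, here is a hedged sketch using the hvac Python library to read a KV v2 secret such as that token; the Vault address, token, mount point, path, and key name are all placeholders rather than this site's actual configuration.

```python
import hvac

# Address and token are placeholders; inside a cluster you would
# typically use the Kubernetes auth method rather than a raw token.
client = hvac.Client(url="http://vault.vault.svc:8200", token="s.example")
assert client.is_authenticated()

# Read a KV v2 secret, e.g. a hypothetical 'cloudflared' path
# holding the tunnel token under a 'tunnel-token' key.
secret = client.secrets.kv.v2.read_secret_version(
    mount_point="secret",
    path="cloudflared",
)
tunnel_token = secret["data"]["data"]["tunnel-token"]
```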