June 10, 2026 · Architecture · RAG · AI Infrastructure
File-Based RAG Without Apology
Why I chose a 200-line knowledge module over pgvector — and when you should too.
Architecture notes and engineering decisions from real projects.
How setting CHAT_MAX_RPM_PER_IP to 2 broke my portfolio chat.
Building a self-healing monitor that recovers silently — and when to actually wake up a human.
Building a Go-based LLM proxy that routes chat, TTS, and embeddings to different backends — and why I hosted it on a Raspberry Pi 5.