Production-Grade Cloud Architecture
This portfolio is a live deployment on GCP Cloud Run: provisioned with Terraform, containerized with Docker, and shipped through automated CI/CD. It demonstrates the same infrastructure patterns I use for enterprise payment systems.
System Design
Edge & Frontend
Global Delivery Network
Next.js 16 (App Router)
React Server Components for zero-JS static pages. Streaming SSR for dynamic routes. Turbopack for sub-second HMR.
Cloud Run (SSR)
Containerized serverless deployment. Scale-to-zero eliminates idle costs. Auto-scales to 100 instances under load.
AI & Backend
Intelligent Processing
Self-Hosted LLM (Inferencia)
OpenAI-compatible inference endpoint running a quantized open-source model. Full data control, zero per-token cost.
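To illustrate what "OpenAI-compatible" means in practice, here is a minimal sketch that builds a request for the standard `/v1/chat/completions` route and reads the reply from the usual `choices[0].message.content` field. The base URL, the model name, and the helper names `buildChatRequest` and `extractReply` are hypothetical; this is not the actual client code behind this site.

```typescript
// Sketch of a client for an OpenAI-compatible chat endpoint.
// Base URL and model name are placeholders supplied by the caller.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the HTTP request for POST {baseUrl}/v1/chat/completions.
function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string
): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Self-hosted servers often run without a key, so the header is optional.
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    method: "POST",
    headers,
    body: JSON.stringify({ model, messages }),
  };
}

// Pull the assistant text out of a chat-completions response body.
function extractReply(response: unknown): string {
  const parsed = response as { choices?: { message?: { content?: string } }[] } | null;
  const content = parsed?.choices?.[0]?.message?.content;
  if (typeof content !== "string") throw new Error("unexpected response shape");
  return content;
}
```

The request object can be handed to any HTTP client. Keeping the builder free of network calls makes the wire format easy to unit test without a running model server.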
RAG Pipeline (pgvector)
Vector embeddings via Gemini text-embedding-004. Semantic search over a curated knowledge base. Local file fallback for zero-dependency mode.
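A minimal sketch of the retrieval step in local fallback mode, assuming the chunk embeddings were computed ahead of time and loaded into memory from a file. The names `Chunk`, `cosineSimilarity`, and `topK` are illustrative, not taken from the real codebase. In pgvector mode the same ranking would typically run in SQL using the cosine-distance operator `<=>`.

```typescript
// In-memory semantic search: rank stored chunks by cosine similarity
// to a query embedding and keep the best k.

interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query, best match first.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

A linear scan like this is reasonable for a small curated knowledge base. A larger corpus is where an indexed vector store earns its keep.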
FinOps & Cost Efficiency
| Component | Strategy | Monthly cost |
|---|---|---|
| Cloud Run | Scale-to-zero. Pay only for active request milliseconds. | $0.10 - $2.00 |
| Vector Database | Supabase free tier. Local file fallback for zero cost. | $0.00 |
| LLM Inference | Self-hosted on local hardware. No per-token API costs. | $0.00 |
| Total | | ~$2/mo |
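The scale-to-zero arithmetic can be sketched as a small estimator. This is a simplified model: it assumes request-based billing, ignores free-tier allowances and request concurrency, and takes the unit rates as inputs. The `Rates` and `Workload` types and every number passed to them are placeholders, not actual GCP prices.

```typescript
// Rough monthly cost model for a scale-to-zero service:
// cost accrues only while requests are being processed.

interface Rates {
  perVcpuSecond: number;      // currency units per vCPU-second (placeholder)
  perGibSecond: number;       // currency units per GiB-second (placeholder)
  perMillionRequests: number; // currency units per million requests (placeholder)
}

interface Workload {
  requests: number;          // requests served in the month
  avgRequestSeconds: number; // mean processing time per request
  vcpus: number;             // CPU limit per instance
  memoryGib: number;         // memory limit per instance
}

function estimateMonthlyCost(workload: Workload, rates: Rates): number {
  // Simplification: each request is billed alone, with no overlap.
  const billableSeconds = workload.requests * workload.avgRequestSeconds;
  const cpuCost = billableSeconds * workload.vcpus * rates.perVcpuSecond;
  const memoryCost = billableSeconds * workload.memoryGib * rates.perGibSecond;
  const requestCost = (workload.requests / 1e6) * rates.perMillionRequests;
  return cpuCost + memoryCost + requestCost;
}
```

With zero requests the estimate is zero, which is the whole point of scale-to-zero: an idle month costs nothing for compute.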
Infrastructure as Code
Zero manual provisioning. The entire stack is defined in Terraform, so every environment is reproducible, every change is auditable, and configuration drift is detectable. This is how Cloud Run deployments should work.
Defense in Depth
Cloud Armor policies mitigate DDoS at the edge. Rate limiting at the application layer. Secret Manager for all credentials. No secrets in code, ever.
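The application-layer rate limiting could look something like the token bucket below. The algorithm choice, the class name `TokenBucketLimiter`, and the capacity and refill values are all assumptions made for illustration. State is held in memory, so on Cloud Run each instance would enforce its own limit unless the buckets moved to a shared store.

```typescript
// Per-key token bucket: each key starts with `capacity` tokens,
// spends one per request, and refills at `refillPerSecond`.

interface Bucket {
  tokens: number;
  updatedAt: number; // timestamp in milliseconds
}

class TokenBucketLimiter {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly buckets = new Map<string, Bucket>();

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the request identified by `key` may proceed at `nowMs`.
  allow(key: string, nowMs: number): boolean {
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, updatedAt: nowMs };
    // Credit tokens for the time elapsed since the last request, up to capacity.
    const elapsedSeconds = Math.max(0, nowMs - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAt = nowMs;
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(key, bucket);
    return allowed;
  }
}
```

A request handler would call `allow(clientIp, Date.now())` and respond with HTTP 429 when it returns false. Passing the clock in as an argument keeps the limiter deterministic under test.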
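```hcl
resource "google_compute_security_policy" "edge" {
  name = "portfolio-edge-policy"

  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable          = true
      rule_visibility = "STANDARD"
    }
  }

  # Throttle each client IP to 500 requests per interval.
  # The 60-second interval is an assumption; tune it to real traffic.
  rule {
    action   = "throttle"
    priority = 1000
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"
      enforce_on_key = "IP"
      rate_limit_threshold {
        count        = 500
        interval_sec = 60
      }
    }
  }
}
```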
```hcl
resource "google_cloud_run_v2_service" "app" {
  name     = "gimenez-portfolio"
  location = var.region

  template {
    scaling {
      min_instance_count = 0
      max_instance_count = 100
    }

    containers {
      image = var.image

      resources {
        limits = {
          cpu    = "2000m"
          memory = "1Gi"
        }
      }
    }
  }
}
```