Live System Case Study

Production-Grade Cloud Architecture

This portfolio is a live deployment on GCP Cloud Run: provisioned with Terraform, containerized with Docker, and shipped through automated CI/CD. It demonstrates the same infrastructure patterns I use for enterprise payment systems.

System Design

Edge & Frontend

Global Delivery Network

  • Next.js 16 (App Router)

    React Server Components for zero-JS static pages. Streaming SSR for dynamic routes. Turbopack for sub-second HMR.

  • Cloud Run (SSR)

    Containerized serverless deployment. Scale-to-zero eliminates idle costs. Auto-scales to 100 instances under load.

AI & Backend

Intelligent Processing

  • Self-Hosted LLM (Inferencia)

    OpenAI-compatible inference endpoint running a quantized open-source model. Full data control, zero per-token cost.

  • RAG Pipeline (pgvector)

    Vector embeddings via Gemini text-embedding-004. Semantic search over a curated knowledge base. Local file fallback for zero-dependency mode.

FinOps & Cost Efficiency

Component | Strategy | Cost (monthly)
Cloud Run | Scale-to-zero. Pay only for active request milliseconds. | $0.10 - $2.00
Vector Database | Supabase free tier. Local file fallback for zero cost. | $0.00
LLM Inference | Self-hosted on local hardware. No per-token API costs. | $0.00
Total | | ~$2/mo

Infrastructure as Code

Zero manual provisioning. The entire stack is defined in Terraform, which makes it reproducible and auditable, and any drift from the declared state surfaces in the next plan. This is how Cloud Run deployments should work.
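
A minimal sketch of the root configuration that such a setup might start from. The variables region and image are the ones referenced in main.tf; the state bucket name, the project_id variable, and the provider version constraint are assumptions for illustration, not values taken from this deployment.

```hcl
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0" # assumed version constraint
    }
  }

  # Remote state keeps every apply auditable and shared across CI runs.
  backend "gcs" {
    bucket = "example-portfolio-tf-state" # hypothetical bucket name
    prefix = "portfolio"
  }
}

provider "google" {
  project = var.project_id # hypothetical variable
  region  = var.region
}

variable "project_id" {
  type        = string
  description = "GCP project that hosts the stack (assumed name)."
}

variable "region" {
  type        = string
  description = "Region for the Cloud Run service."
}

variable "image" {
  type        = string
  description = "Fully qualified container image to deploy."
}
```

With state stored remotely, running terraform plan against the live project reports any resource that was changed outside Terraform, which is one way to get the drift detection described above.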

Defense in Depth

Cloud Armor policies mitigate DDoS at the edge. Rate limiting at the application layer. Secret Manager for all credentials. No secrets in code, ever.

Terraform · Cloud Armor · Secret Manager · Artifact Registry · Cloud Build
main.tf
resource "google_compute_security_policy" "edge" {
  name = "portfolio-edge-policy"

  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable          = true
      rule_visibility = "STANDARD"
    }
  }

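  # Per-client rate limit. Cloud Armor expresses this with a throttle
  # action plus rate_limit_options rather than a match expression.
  rule {
    action   = "throttle"
    priority = 1000
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"
      enforce_on_key = "IP"
      rate_limit_threshold {
        count        = 500
        interval_sec = 60 # assumed window; only the 500 threshold was given
      }
    }
  }

  # Default rule, required whenever rules are declared explicitly.
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
  }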
}

resource "google_cloud_run_v2_service" "app" {
  name     = "gimenez-portfolio"
  location = var.region

  template {
    scaling {
      min_instance_count = 0
      max_instance_count = 100
    }
    containers {
      image = var.image
      resources {
        limits = {
          cpu    = "2000m"
          memory = "1Gi"
        }
      }
    }
  }
}
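
The Secret Manager piece is not shown in main.tf above. One way it could be wired up is sketched below, assuming Google provider 5.x syntax. The secret ID, the service account, and the API_KEY variable name are hypothetical; only google_cloud_run_v2_service.app comes from the configuration above.

```hcl
# Secret container; the value itself is added as a version outside of code.
resource "google_secret_manager_secret" "api_key" {
  secret_id = "portfolio-api-key" # hypothetical name

  replication {
    auto {}
  }
}

# Dedicated runtime identity for the service (hypothetical).
resource "google_service_account" "run" {
  account_id   = "portfolio-run"
  display_name = "Portfolio Cloud Run runtime"
}

# Grant read access to this one secret only.
resource "google_secret_manager_secret_iam_member" "run_access" {
  secret_id = google_secret_manager_secret.api_key.id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${google_service_account.run.email}"
}

# Additions inside google_cloud_run_v2_service.app:
#
#   template {
#     service_account = google_service_account.run.email
#     containers {
#       env {
#         name = "API_KEY"
#         value_source {
#           secret_key_ref {
#             secret  = google_secret_manager_secret.api_key.secret_id
#             version = "latest"
#           }
#         }
#       }
#     }
#   }
```

Scoping the accessor role to a single secret and a dedicated service account keeps the credential out of both the image and the Terraform variables, which matches the "no secrets in code" rule stated earlier.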