Live System Case Study

Production-Grade Cloud Architecture

This portfolio runs as a live deployment on GCP Cloud Run: provisioned with Terraform, containerized with Docker, and shipped through an automated CI/CD pipeline. It demonstrates the same infrastructure patterns I use for enterprise payment systems.

System Design

Edge & Frontend

Global Delivery Network

Next.js 16 (App Router)
React Server Components for zero-JS static pages. Streaming SSR for dynamic routes. Turbopack for sub-second HMR.
Cloud Run (SSR)
Containerized serverless deployment. Scale-to-zero eliminates idle costs. Auto-scales up to a cap of 100 instances under load.

AI & Backend

Intelligent Processing

Self-Hosted LLM (Inferencia)
OpenAI-compatible inference endpoint running a quantized open-source model. Full data control, zero per-token cost.
RAG Pipeline (pgvector)
Vector embeddings via Gemini text-embedding-004. Semantic search over a curated knowledge base. Local file fallback for zero-dependency mode.

FinOps & Cost Efficiency

Component	Strategy	Monthly cost
Cloud Run	Scale-to-zero. Pay only for active request milliseconds.	$0.10 - $2.00
Vector Database	Cloud SQL (PostgreSQL + pgvector) or file-based RAG. GCP-native.	$0.00
LLM Inference	Self-hosted on local hardware. No per-token API costs.	$0.00
Total		~$2/mo

Infrastructure as Code

Zero manual provisioning. The entire stack is defined in Terraform—reproducible, auditable, and drift-detected. This is how Cloud Run deployments should work.
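
As a minimal sketch of what "reproducible and drift-detected" can rest on, the block below pins the provider and keeps state in a remote bucket. The bucket name, state prefix, and version constraints are assumptions for illustration; none of them appear in the listing on this page.

```hcl
terraform {
  # Assumed minimum CLI version; adjust to whatever the pipeline runs.
  required_version = ">= 1.5"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0" # assumed pin, so every run resolves the same provider major
    }
  }

  # Remote state in a GCS bucket (hypothetical name). Shared state is what
  # lets a later plan compare the real infrastructure against the code.
  backend "gcs" {
    bucket = "example-portfolio-tfstate"
    prefix = "portfolio"
  }
}
```

With shared state in place, one common way to surface drift is a scheduled `terraform plan -detailed-exitcode`, which exits with code 2 when the live resources no longer match the configuration.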

Defense in Depth

Cloud Armor policies mitigate DDoS at the edge. Rate limiting at the application layer. Secret Manager for all credentials. No secrets in code, ever.
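
The "no secrets in code" rule could look roughly like the following in Terraform. This is a sketch under assumptions: the secret name `llm-api-key`, the `LLM_API_KEY` environment variable, and the `run_service_account` variable are hypothetical, and the `replication { auto {} }` syntax assumes version 5 or later of the Google provider.

```hcl
# Hypothetical input: the service account the Cloud Run service runs as.
variable "run_service_account" {
  type        = string
  description = "Email of the Cloud Run runtime service account"
}

# The secret container. The secret value itself is added out of band,
# so it never appears in the repository or in this configuration.
resource "google_secret_manager_secret" "llm_api_key" {
  secret_id = "llm-api-key" # hypothetical name

  replication {
    auto {}
  }
}

# Least privilege: only the runtime service account may read this secret.
resource "google_secret_manager_secret_iam_member" "run_reads_key" {
  project   = google_secret_manager_secret.llm_api_key.project
  secret_id = google_secret_manager_secret.llm_api_key.secret_id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${var.run_service_account}"
}

# Inside the containers block of the Cloud Run service, the secret is then
# referenced rather than inlined:
#   env {
#     name = "LLM_API_KEY"
#     value_source {
#       secret_key_ref {
#         secret  = google_secret_manager_secret.llm_api_key.secret_id
#         version = "latest"
#       }
#     }
#   }
```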

Terraform · Cloud Armor · Secret Manager · Artifact Registry · Cloud Build

main.tf

resource "google_compute_security_policy" "edge" {
  name = "portfolio-edge-policy"

  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable          = true
      rule_visibility = "STANDARD"
    }
  }

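  # Throttle any single client IP that exceeds 500 requests per window.
  # interval_sec = 60 is an assumed window; tune it to real traffic.
  rule {
    action   = "throttle"
    priority = 1000
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"
      enforce_on_key = "IP"
      rate_limit_threshold {
        count        = 500
        interval_sec = 60
      }
    }
  }

  # Default rule: a security policy needs a lowest-priority catch-all.
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
  }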
}

resource "google_cloud_run_v2_service" "app" {
  name     = "gimenez-portfolio"
  location = var.region

  template {
    scaling {
      min_instance_count = 0
      max_instance_count = 100
    }
    containers {
      image = var.image
      resources {
        limits = {
          cpu    = "2000m"
          memory = "1Gi"
        }
      }
    }
  }
}
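
A Cloud Armor policy only takes effect once it is attached to a load balancer backend, a step the listing above does not show. The sketch below assumes the service sits behind a global external Application Load Balancer; the resource names are hypothetical, and the URL map, target proxy, and forwarding rule are left out for brevity.

```hcl
# Serverless NEG that points the load balancer at the Cloud Run service.
resource "google_compute_region_network_endpoint_group" "app" {
  name                  = "portfolio-neg" # hypothetical name
  region                = var.region
  network_endpoint_type = "SERVERLESS"

  cloud_run {
    service = google_cloud_run_v2_service.app.name
  }
}

# Backend service that enforces the edge policy on traffic to that NEG.
resource "google_compute_backend_service" "app" {
  name                  = "portfolio-backend" # hypothetical name
  load_balancing_scheme = "EXTERNAL_MANAGED"
  security_policy       = google_compute_security_policy.edge.self_link

  backend {
    group = google_compute_region_network_endpoint_group.app.id
  }
}
```

For the policy to be meaningful, the Cloud Run service would also need `ingress = "INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER"`, so that requests cannot bypass the load balancer through the default `run.app` URL.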
