No.01 — AI Engineering

Open to roles

AI Engineer · LLM Systems · Platform & SRE

Building agent systems that hold up in production.

I design multi-agent orchestration (LangGraph supervisor–worker), multi-provider routing behind a Portkey gateway, and hybrid retrieval that combines vectors and a knowledge graph with a rerank step. Then I make quality a number, not a feeling, with golden-dataset evals and Langfuse tracing. All of it sits on a platform/SRE foundation hardened in fintech at 1,500+ TPS.

Agent orchestration · Retrieval & grounding · Evals & observability · Platform & SRE

— Architecting elegance in chaos.

−40%

Dev overhead

Multi-LLM

Provider routing

99.99%

Uptime

RAG + Evals

Grounded & measured

No.02 — Systems

Selected Architectures

Schematics of how I design AI and platform systems — drawn, not screenshotted.

Agentic SRE Operations

Infra and SRE operations run by agents: a supervisor routes incidents to triage, remediation, and FinOps specialists through MCP tools, with a human in the loop.

LangGraph Multi-Agent

A StateGraph supervisor routes through conditional edges to Planner, Worker, and Tool nodes — each updating shared state and cycling back until the task is done.
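
The routing loop can be sketched in plain Python. This is a minimal stand-in under stated assumptions, not the LangGraph API: the `State` fields, the `planner` and `worker` nodes, and the `supervisor` rule are all hypothetical, and the Tool node is left out for brevity.

```python
from typing import Callable, Dict, List, TypedDict


class State(TypedDict):
    task: str
    plan: List[str]      # steps still to run
    results: List[str]   # outputs collected so far


def planner(state: State) -> State:
    # Hypothetical planner: one step per comma-separated part of the task.
    steps = [part.strip() for part in state["task"].split(",") if part.strip()]
    return {**state, "plan": steps}


def worker(state: State) -> State:
    # Hypothetical worker: consume the next step and record a result.
    step, *rest = state["plan"]
    return {**state, "plan": rest, "results": state["results"] + [f"done: {step}"]}


def supervisor(state: State) -> str:
    # Conditional edge: choose the next node by inspecting shared state.
    if not state["plan"] and not state["results"]:
        return "planner"
    if state["plan"]:
        return "worker"
    return "END"


NODES: Dict[str, Callable[[State], State]] = {"planner": planner, "worker": worker}


def run(task: str, max_steps: int = 20) -> State:
    state: State = {"task": task, "plan": [], "results": []}
    for _ in range(max_steps):  # step budget guards against endless cycles
        next_node = supervisor(state)
        if next_node == "END":
            break
        state = NODES[next_node](state)
    return state
```

Every node returns a new state and control always returns to the supervisor, which is the "cycle back until done" shape; the step budget is the part worth keeping in any real graph.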

Multi-Provider LLM Gateway

A cheap classifier routes to a Portkey-style gateway with a fallback chain across Claude, GPT, and Gemini — balancing quality, latency, and cost.
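
A minimal sketch of classifier routing with a fallback chain, in plain Python. It does not use the Portkey SDK or any real provider client: `classify`, the fake `small_model`, `large_model`, and `flaky` providers, and the `CHAINS` table are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A provider is a (name, callable) pair; these callables are fakes, not SDK clients.
Provider = Tuple[str, Callable[[str], str]]


def classify(prompt: str) -> str:
    """Cheap heuristic classifier (an assumption): long or code-like prompts are 'hard'."""
    looks_hard = len(prompt) > 200 or "def " in prompt or "traceback" in prompt.lower()
    return "hard" if looks_hard else "easy"


def call_with_fallback(prompt: str, chain: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order and return (name, answer) from the first that succeeds."""
    errors: List[str] = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def route(prompt: str, chains: Dict[str, Sequence[Provider]]) -> Tuple[str, str]:
    """Pick a fallback chain by difficulty, then walk it."""
    return call_with_fallback(prompt, chains[classify(prompt)])


def flaky(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def small_model(prompt: str) -> str:
    return f"small: {prompt}"


def large_model(prompt: str) -> str:
    return f"large: {prompt}"


CHAINS: Dict[str, Sequence[Provider]] = {
    "easy": [("small", small_model), ("large", large_model)],
    "hard": [("flaky-large", flaky), ("large", large_model)],
}
```

Calling `route("def f(): pass", CHAINS)` classifies the prompt as hard, hits the simulated timeout, and falls through to the next provider in that chain. Easy prompts start on the cheaper model, which is where the cost saving would come from.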

RAG · Retrieval Stack

Hybrid grounding across Qdrant vectors, a Neo4j knowledge graph, and Typesense full-text — reranked with Cohere into a cited, trustworthy answer.
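
One common way to merge ranked lists from several retrievers before a reranker sees them is reciprocal rank fusion (RRF). The sketch below is a stand-in for the merge step only, assuming each backend returns document IDs in rank order; it does not call Qdrant, Neo4j, Typesense, or Cohere, and the document IDs are made up.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the three retrievers.
vector_hits = ["d1", "d2", "d3"]
graph_hits = ["d2", "d4"]
fulltext_hits = ["d2", "d1"]

fused = reciprocal_rank_fusion([vector_hits, graph_hits, fulltext_hits])
print(fused)  # ['d2', 'd1', 'd4', 'd3']
```

`d2` wins because all three retrievers agree on it. In a full stack the fused shortlist would then go to the reranker, and the IDs that survive become the citations.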

LangChain LCEL Pipeline

Composable retrieval-augmented chains — prompt | model | parser — where a retriever injects grounded context and the parser returns typed, structured output.
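
The `prompt | model | parser` shape can be imitated in a few lines of plain Python by overloading `|`. This is a sketch of the composition idea only, not LangChain's Runnable implementation: the `Step` class, the in-memory `DOCS` corpus, and the fake model are all assumptions.

```python
import json
from typing import Any, Callable, Dict


class Step:
    """Wraps a function so that steps compose left to right with `|`."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x: Any) -> Any:
        return self.fn(x)


DOCS: Dict[str, str] = {"refunds": "Refunds settle in 3 days."}  # hypothetical corpus


def retrieve(question: str) -> Dict[str, str]:
    # Toy retriever: keyword match against the corpus keys.
    hits = [text for key, text in DOCS.items() if key in question.lower()]
    return {"question": question, "context": " ".join(hits)}


def fake_model(prompt_text: str) -> str:
    # Stand-in for an LLM call: echoes the context line back as JSON.
    context = prompt_text.splitlines()[0][len("Context: "):]
    return json.dumps({"answer": context, "grounded": bool(context)})


retriever = Step(retrieve)
prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
model = Step(fake_model)
parser = Step(json.loads)  # returns a dict instead of raw text

chain = retriever | prompt | model | parser
result = chain.invoke("How long do refunds take?")
```

`result` is a plain dict with `answer` and `grounded` keys, so downstream code never has to parse free text.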

Event-Driven Payments

The platform foundation: PCI-DSS microservices over an event bus (SNS/SQS/Kafka) feeding a ledger and real-time reconciliation at 1.5K+ TPS.
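
Reconciliation comes down to comparing what the ledger recorded with what the processor settled. The sketch below shows that comparison as a batch over two in-memory lists; the `Entry` type, field names, and sample transactions are hypothetical, and a real-time version would run the same check per event off the bus.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Entry:
    txn_id: str
    amount_cents: int  # integer cents avoid float rounding on money


def reconcile(ledger: Iterable[Entry], settled: Iterable[Entry]) -> Dict[str, List[str]]:
    """Return transaction IDs that are missing, unexpected, or differ in amount."""
    ledger_by_id = {e.txn_id: e.amount_cents for e in ledger}
    settled_by_id = {e.txn_id: e.amount_cents for e in settled}
    shared = ledger_by_id.keys() & settled_by_id.keys()
    return {
        "missing_settlement": sorted(ledger_by_id.keys() - settled_by_id.keys()),
        "unexpected_settlement": sorted(settled_by_id.keys() - ledger_by_id.keys()),
        "amount_mismatch": sorted(t for t in shared if ledger_by_id[t] != settled_by_id[t]),
    }


ledger = [Entry("t1", 1000), Entry("t2", 2500), Entry("t3", 400)]
settled = [Entry("t1", 1000), Entry("t2", 2400), Entry("t4", 900)]
report = reconcile(ledger, settled)
```

Here `report` flags `t3` as never settled, `t4` as settled without a ledger entry, and `t2` as an amount mismatch.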

No.03 — Selected Work

Things I've built

AI Delivery Pipeline

Deuna

Agentic automation of the software lifecycle — Jira → AI code-gen → tests → PR with human review. ~40% less manual overhead.

Claude · MCP · n8n · FastAPI

Cloud Cost Optimizer

Open Source

FinOps engine ingesting AWS/Azure billing, detecting idle resources via rules, generating safe decommission plans. MCP server + Claude Agent SDK sub-agents.
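
A rule-based idle check and a conservative plan could look like the sketch below. The thresholds, the `Resource` fields, and the `keep` tag convention are illustrative assumptions rather than the project's actual rules, and nothing here calls boto3 or a billing API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    resource_id: str
    avg_cpu_pct: float          # average CPU over the lookback window
    network_mb_per_day: float
    monthly_cost_usd: float
    tags: Dict[str, str] = field(default_factory=dict)


def is_idle(r: Resource, cpu_max: float = 2.0, network_max_mb: float = 5.0) -> bool:
    """Rule: a resource is idle only if both CPU and network are below their thresholds."""
    return r.avg_cpu_pct < cpu_max and r.network_mb_per_day < network_max_mb


def decommission_plan(resources: List[Resource]) -> List[Dict[str, object]]:
    """Propose reversible actions only, most expensive first; never touch protected resources."""
    plan: List[Dict[str, object]] = []
    for r in sorted(resources, key=lambda res: -res.monthly_cost_usd):
        if not is_idle(r) or r.tags.get("keep") == "true":
            continue
        # "stop" rather than "delete" keeps the plan safe to roll back.
        plan.append({"resource_id": r.resource_id, "action": "stop", "saves_usd": r.monthly_cost_usd})
    return plan


fleet = [
    Resource("i-busy", avg_cpu_pct=45.0, network_mb_per_day=900.0, monthly_cost_usd=120.0),
    Resource("i-idle-small", avg_cpu_pct=0.5, network_mb_per_day=1.0, monthly_cost_usd=30.0),
    Resource("i-idle-big", avg_cpu_pct=1.0, network_mb_per_day=2.0, monthly_cost_usd=300.0),
    Resource("i-protected", avg_cpu_pct=0.1, network_mb_per_day=0.0, monthly_cost_usd=500.0,
             tags={"keep": "true"}),
]
plan = decommission_plan(fleet)
```

The plan lists `i-idle-big` before `i-idle-small` and skips both the busy and the protected instance. An agent would present this plan for approval, not execute it directly.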

Python · FastAPI · boto3 · MCP

AI Engineer Lab

Open Source

A runnable, line-by-line reference for production LLM systems: routing, agents, RAG, evals — real APIs, graceful degradation, interview-grade notes.

Python · RAG · Evals · Agents

Payment Infrastructure

Deuna · Kushki

Event-driven microservices processing 1,500+ TPS with PCI-DSS compliance, scaled up from a 100 TPS monolith and reaching a 99.95% success rate at 450 ms latency.

AWS · EKS · Kafka · Go

No.04 — Professional Path

The road here

2023 — Present · Remote

Head of Infrastructure & SRE — Agentic Operations · Deuna

Run the infra & SRE operation through AI agents: incident triage, remediation, and FinOps via multi-agent orchestration + MCP.
Built an agentic pipeline automating the delivery lifecycle (Jira → code-gen → tests → PR), cutting manual overhead ~40%.
Multi-provider LLM integration (Claude, OpenAI, Gemini) with structured output, retries, and fallback; token/cost optimization.
Extended SLO/SLI and Datadog observability to AI behavior, latency, and cost.

2021 — 2025 · Remote

Senior DevOps / Platform Engineer · Housecall Pro

Led enterprise platform transformation for a SaaS product serving millions of users.
Architected multi-cloud solutions (AWS/Azure) with global RDS replication.
Partnered with C-level leadership to reduce annual cloud spend by 20%.

No.05 — Capabilities

What I bring

Agentic AI & Orchestration

Multi-agent systems (supervisor + specialist), tool & function calling, MCP servers, classifier routing.

RAG & Retrieval

Embeddings, vector search (Qdrant), reranking (Cohere), knowledge graphs (Neo4j), grounded answers.

LLM Evals & Observability

Golden datasets, prompt-regression, hallucination & safety checks, Langfuse and Datadog tracing.
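
A golden-dataset check with a regression gate fits in a few lines. This sketch assumes exact-match scoring after whitespace and case normalization, a fixed baseline, and a toy `candidate` function; real evals would add semantic or model-graded scoring, and none of these names come from Langfuse.

```python
from typing import Callable, List, Tuple

GoldenSet = List[Tuple[str, str]]  # (input, expected answer)


def normalize(text: str) -> str:
    return " ".join(text.lower().split())


def pass_rate(answer_fn: Callable[[str], str], golden: GoldenSet) -> float:
    """Fraction of golden examples the candidate answers exactly (after normalization)."""
    if not golden:
        raise ValueError("golden set is empty")
    passed = sum(normalize(answer_fn(q)) == normalize(expected) for q, expected in golden)
    return passed / len(golden)


def passes_gate(score: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Prompt-regression gate: fail if the score drops more than `tolerance` below baseline."""
    return score >= baseline - tolerance


GOLDEN: GoldenSet = [
    ("capital of France?", "Paris"),
    ("2 + 2?", "4"),
    ("HTTP status for not found?", "404"),
    ("default HTTPS port?", "443"),
]

# Hypothetical candidate: a lookup table that gets one answer wrong.
ANSWERS = {"capital of France?": " paris ", "2 + 2?": "4",
           "HTTP status for not found?": "404", "default HTTPS port?": "8443"}


def candidate(question: str) -> str:
    return ANSWERS.get(question, "")


score = pass_rate(candidate, GOLDEN)
```

With three of four answers correct, `score` is 0.75, so the change would be blocked against a 0.80 baseline and allowed against a 0.76 one. That is the "number, not a feeling" turned into a CI check.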

Cloud & Platform Foundation

AWS, Kubernetes, Terraform, GitOps, SRE — the resilient base beneath the AI.

No.06 — Writing

Technical insights

The Productivity Paradox of AI

Are we entering a productivity boom that elevates our capabilities, or a bubble that erodes our engineering muscle memory?

No.07 — Contact

Currently open to AI Engineer & LLM Systems roles.

andreco87@gmail.com · LinkedIn · GitHub