POST · 2026

I'm an infrastructure engineer who ships AI products. Here's why both halves matter.

Two things sit on my plate.

Production infra at Pax8. AI products at Respondyr, on the side. This post is about why both halves are load-bearing, and why most engineers who claim both are really 80% one of them.

Most AI product shippers leave a mess

It's a familiar pattern. Someone ships a demo on Friday. It looks great. It breaks at customer #500. The platform team gets paged. A senior engineer spends three weeks rewriting the demo into something that won't get them fired at the next post-mortem.

That's not a moral failing. It's a skill those engineers were never paid to develop. The job they were hired for ends at "the demo works." Production is someone else's problem.

The catch is that someone else is expensive, and they show up after the fact. Product teams learn to ship looser. Infra teams learn to distrust shipping. The two groups end up communicating in passive-aggressive Jira tickets.

Most infra engineers can't ship an AI product to save their life

Mirror gripe. Hand an infra engineer an AI product idea and you'll get an architecture diagram, a service mesh proposal, a list of SLO questions, a retry-and-backpressure conversation, a design review, and no shipped product for three months.

This one isn't a moral failing either. It's the failure mode of a profession paid to worry about the long tail. If you spend years reasoning about what happens at 3am during a region failover, "let's just see if it works" feels actively unsafe.

Model-skepticism doesn't help. A lot of infra engineers were trained back when "AI" meant the thing that polluted your error budget with non-determinism. They have a point. They're also missing the move.

So here I am

I came up infra. GitOps, multi-provider Terraform, declarative state, every secret managed by IaC, every deployment reconciled. The instincts are baked in deep enough that I don't like shipping anything without them. Hand me a hackathon idea on a Friday afternoon and I'll still find a way to get it into a repo with a CI pipeline before Sunday.

Somewhere in the last year I figured out the other half. Not the demo. Anyone can ship a demo. The part where someone hands you a vague problem, you figure out what they actually meant, you ship a working version by Friday, and it still runs on Monday without an apology.

The hardest part of building AI products isn't the model. It's sitting with the person who'll use the thing and translating what they actually need into a system that doesn't fall over. I'd rather embed for two weeks and ship the right thing than write a Notion doc about the wrong thing for six.

I trust AI agents in critical production paths at Pax8. The family recipes stay in the family, even the ones that are my brainchild.

I built a Jira and Confluence MCP server in Python, solo. Twenty engineers at Pax8 reach for it every day. Nobody mandated it. They picked it up because it was easier than the alternative. That's the only measure of an internal tool that ever mattered.

I run Respondyr on the side. The whole product is a multi-service architecture on EKS, pure GitOps, multi-provider Terraform across AWS and Cloudflare, internal cost-tracking gateway in front of every paid LLM call. If I can't see what an agent is spending or how it's failing, I don't trust it in production. Because if I'm going to ship an AI product, I'm going to ship one I'd put my own on-call rotation against.

A quick aside on agents versus workflows, because the industry keeps blurring it. Respondyr is as much workflow as it is agentic. If not more. The marketing engine is a deterministic workflow with a Playwright MCP agent dropped in for the long tail of prospect intake forms that don't have a public email address. I love the determinism of a workflow. Cheaper to run, easier to test, easier to debug, easier to explain to whoever's on-call at 3am. I appreciate the autonomy of an agent when determinism actually runs out of road. Most teams reach for agents because "agent" is the louder word. Knowing which one to use, and when, is what's hard.

This site is the same story in miniature. Designed in Google Stitch on a Thursday afternoon. Ported to Astro. Cloudflare Pages, Terraform-managed DNS, auto-deploy via GitHub Actions. One evening, end to end. The infra is honest, the shipping was fast, and I'm not going to apologize for either.

The thesis

If you're building AI products with real customers, products that have to survive contact with people who don't read your docs, I'm useful day one.

If you're running production infrastructure that needs to ship AI, same.

There aren't a lot of people who are honestly both, and you won't find many more by waiting for one tribe to grow the other tribe's reflexes.

My email is in the footer.