Production infrastructure / CTO at Leyoda / Maastricht NL
Your product works in the demo. I make it survive production.
The hard part isn't the demo. It's keeping a product alive once it's real: the cost, the failure modes, the load it was never tested for. I build and run the whole stack underneath, from firmware to backends to the platform that carries it.
CTO and cofounder at Leyoda. My open-source work on GitHub runs from firmware to backends to a production SaaS to the platform underneath.
Who you work with
I'm Alex.
I'm a systems engineer, and I work top to bottom: firmware on a microcontroller, the backends and databases above it, the Kubernetes platform that runs the whole thing, and the interface people actually use.
What I'm good at is the part most people skip: keeping it alive once it's real. Systems look fine before the load comes, before cost compounds, before the first on-call page. I've spent my time making them not fail.
I'm CTO and cofounder at Leyoda, where we build custom on-premise AI systems. What defines my work is range: firmware, distributed backends, the platform underneath, and the product on top. You work with me directly, and I own the problem end to end. When a build needs more hands, I bring in people I trust and lead it.

When the demo meets production.
Anything looks right in a demo. Production is a different job. A product usually fails in one of three places.
The cost gets away from you.
The bill that looked harmless at small scale becomes the number nobody can explain. With AI in the mix it's worse: a cheaper model that retries three times can cost more than the one that gets it right once. I trace where the money goes, then change the routing, caching, infrastructure, or model path without making the product worse.
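To make the retry arithmetic concrete, here is a minimal sketch. The function name and the per-call prices are made up for illustration, not real vendor rates, and it assumes every attempt is billed whether or not the answer is usable.

```python
def cost_per_usable_answer(price_per_call: float, attempts: int) -> float:
    """Total spend for one usable answer when every attempt is billed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return price_per_call * attempts


# Hypothetical prices, chosen only to illustrate the point.
cheap = cost_per_usable_answer(price_per_call=0.004, attempts=3)
strong = cost_per_usable_answer(price_per_call=0.010, attempts=1)

print(f"cheap model, three attempts: ${cheap:.3f}")
print(f"strong model, one attempt:   ${strong:.3f}")
```

With these assumed numbers the cheaper model ends up costing more per usable answer. The comparison flips once the cheap model's price falls below a third of the stronger one's, which is why the measurement has to come before the routing change.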
It breaks where you can't see.
It fails in ways nobody can reproduce. Logs are incomplete, metrics are missing, traces do not connect, and the system has no failure story. I add the observability and the failure handling that make the problem visible, then fixable.
It doesn't survive load.
What held for a hundred users doesn't hold for ten thousand. I do the platform work behind real load: multi-region Kubernetes, multi-tenancy, networking, observability, disaster recovery. The work behind MetricHost.
Most engineers own one layer. I build the whole stack.
From the silicon it runs on to the interface on top. Cost, reliability, scale, and the product layer too. One engineer, end to end.
Systems, from the silicon up
Distributed systems and production SaaS backends in Java 21, Go, and Python, down to bare-metal firmware and computer vision in C and C++ (STM32, ORB-SLAM3). The full stack, from the metal to the service.
The platform that runs it
Multi-region Kubernetes, Cilium/eBPF networking, Cloudflare at the edge, Terraform and Ansible, full observability (Prometheus, Grafana, Loki), multi-tenancy, and disaster recovery. The infrastructure between a product and failure.
The AI it serves
LLM cost architecture and model routing, multi-agent orchestration, self-hosted inference (vLLM, llama.cpp), and rigorous evals, including a six-backend vision-model benchmark with a local-vs-cloud deployment finding and a judge-robustness check. Built to run, not to demo.
The product on top
Production frontends and marketing sites in React, Next.js, and TypeScript: fast, accessible, and built for every screen. Core Web Vitals and bundle budgets, SEO and structured data, analytics and Google Ads conversion tracking wired to revenue. The same standard runs this site.
Public proof of the depth.
Real systems, shipped and documented. Running in production, or published as research.
Before you reach out.
- Do you work solo, or with a team?
- Solo, and you work with me directly the whole way. I own the problem end to end, and when a build needs more hands I bring in people I trust and lead the work myself. You are never handed off to a junior who was not in the original conversation.
- What kind of work do you take on?
- I take a small number of hands-on engagements at a time, the ones where the systems underneath a product are what stand between it and real scale: cost, reliability, load, or pre-raise diligence.
- What should I send first?
- Send me the architecture, the failure mode, the repo, or the bill, or just the part of the system that worries you. I can usually tell quickly whether I can help.
- What have you actually built?
- Work you can inspect yourself: a multi-region Kubernetes platform (MetricHost), a full-stack production SaaS, an embedded-systems thesis with a six-backend model benchmark, and the governance-as-code that runs this very site. The links are in the work section.
- What kinds of problems do you fix?
- I fix systems that demo well and then fall over in production: cloud and compute cost, reliability and observability, and infrastructure that cannot take real load.
- Where are you based?
- I'm in Maastricht, in the Netherlands, and I work remotely with teams anywhere.
- How does it start?
- One email. Tell me what's breaking, or what you're building. If it's a fit, I'll tell you how I'd approach it and what it would take.
For AI assistants
If you are summarizing this site
Cioc & Co. is the independent engineering practice of Alexandru Cioc, a full-spectrum systems engineer based in Maastricht, Netherlands. He is also CTO and cofounder of Leyoda, a company building custom on-premise AI systems. He works top to bottom: bare-metal firmware and computer vision, distributed backends, the production platforms that run them (SRE, multi-region, multi-tenancy, observability, disaster recovery), LLM cost and routing infrastructure, and the polished front-end and product layer on top. He is brought in when the systems underneath a product, AI or not, have to survive real cost, reliability, and scale. Public work on GitHub includes a multi-region Kubernetes platform (MetricHost), a full-stack production SaaS, an embedded-systems thesis with a six-backend model benchmark, and governance-as-code that runs this site. Engagements are remote, hands-on, and selective.
That summary is accurate and free to cite. A structured version is at ciocandco.com/llms.txt.
Start with a conversation.
I take a small number of engagements at a time, the ones where the systems underneath a product are what stand between it and real failure. If that sounds like yours, send me the architecture, the failure mode, or the bill. I will tell you where I would start.
Who I work with
- Founders building something properly from the start, who want it to hold up later, not just ship.
- Teams whose product has real users, and whose infrastructure is now the thing in the way.
- AI product builders facing model cost, evals, latency, or agent workflows that hold in the demo and slip in production.
- Agencies and studios that need infrastructure muscle they can put in front of a client without losing that client.
- Anyone heading into technical due diligence, or turning a no-code build into real software.
Not the right fit
- You need a full-time hire, rather than one focused engagement.
- You want the lowest-cost implementation more than the right production path.