aperturio.
AI · LLM · Agents · Engineering · US

AI engineering velocity without the recruiter chaos.

Vetted US-based engineers for LLM applications, agentic workflows, AI infrastructure, AI-native full-stack, and vector / ML work, full-time or fractional. Eval-driven, cost-aware, shipping production AI features today. Pay only when you hire.

Full-time or fractional · $10,000 flat per hire · A real recruiter reviews every candidate

AI-native roles · what we lead with

Prioritized for who you are and where you're going.

LLM Application Engineer

RAG over real data, tool use, structured outputs, evals. Owns the product surface: what the user actually sees an LLM do. Ships with Anthropic / OpenAI SDKs, Vercel AI SDK, and Cursor or Claude Code in the loop.

Agentic Workflow Engineer

Multi-step agents that actually finish the work. MCP servers, LangGraph or Inngest / Mastra-style orchestration, safe tool exposure, sandboxed execution, human-in-the-loop checkpoints. Knows where agents help and where they hurt.

AI Infrastructure & Inference

vLLM / SGLang / TGI for self-hosted, Modal / Replicate / Together / Fireworks for hosted. GPU cost posture, batching, streaming, observability for LLM workloads (Langfuse, Helicone, Braintrust traces).

AI-Native Full-stack

TypeScript everywhere, Next.js / Hono / Bun, Postgres + pgvector, Drizzle / Prisma. Streaming UIs, generative UI, tool-calling baked into the product flow. Ships fast with AI in the loop.

Data + ML Engineer

Embeddings pipelines, vector DBs (pgvector / LanceDB / Qdrant / Pinecone), hybrid retrieval (BM25 + dense + rerank), fine-tuning (LoRA / QLoRA / DPO), eval datasets that catch regressions before users do.

Staff / Founding AI Engineer

The senior who owns AI across the product. Sets eval discipline, picks the model / inference stack, decides when to fine-tune vs. prompt, makes cost vs. quality calls. Often the first AI hire after the founder.

Also placing — classic, non-AI roles

Not every role needs an LLM. We still place the classic tracks from the same bench, with the same human review and the same per-hire model.

  • Front-end / Web: React/Next, Vue, Svelte. Production SPAs, marketing sites, design-system implementation, performance.
  • Back-end / API: Node/TS, Python, Go, Java. Domain modeling, REST/GraphQL APIs, queue workers, integration depth.
  • Full-stack: End-to-end feature delivery across the stack. Pragmatic, ships fast.
  • Mobile: iOS/Android, React Native, Flutter. Native feel where it matters.
  • DevOps / Cloud: AWS/GCP/Azure, Terraform, CI/CD, observability, incident readiness.
  • QA / Automation: Playwright, Cypress, load and integration testing. Reduces rollback rate.

We vet

Beyond the resume.

Culture screening, AI-assisted skills assessment, expert human review. A real recruiter signs off before any candidate reaches you.

Eval discipline

Offline + online evals, golden sets, regression detection, A/B against frontier models. If they can't answer "how would you know if your prompt got worse?", they're a no.

Prompt + context engineering

Tool design, structured outputs, RAG context construction, prompt versioning. Treats prompts like code — diffed, reviewed, tested.

LLM cost + latency tradeoffs

Model selection by task, caching, batching, streaming, cost ceilings. Knows the price-per-million-tokens of every major model.

Production agent design

When agents help vs. hurt. Tool exposure boundaries, retry / refusal handling, human-in-the-loop checkpoints, failure modes in production.

Retrieval + memory

Chunking, reranking, hybrid retrieval, structured retrieval, conversation memory. Real RAG, not a vector-DB MVP.

AI-tool fluency

Cursor, Windsurf, Claude Code, Zed AI daily. Ships with AI in the loop and reviews LLM-generated code with judgment — not "I tried ChatGPT once".

Outcomes we drive

  • AI features that actually ship — and don't regress when the model updates
  • Eval pipelines that catch quality drops before users do
  • Production agents that finish the work, with humans on the loop where it matters
  • Predictable AI cost and latency under real traffic
  • RAG and memory that work past the demo

Sample profiles

Representative of the bench — share your requirements for live profiles.

Ryan K.
Staff AI Application Engineer (FT)

RAG over 10M docs in prod · Anthropic + OpenAI SDKs · Vercel AI SDK · Braintrust evals · prior Cursor/Linear-style team · ships weekly.

Megan O.
AI Infrastructure (Fractional)

vLLM + SGLang serving · GPU cost optimization · Modal/Replicate deployments · Langfuse + Helicone observability · SLO discipline.

Christopher L.
Senior Agentic Workflow Engineer (FT)

MCP server design · LangGraph multi-step · sandboxed tool execution · production agent failure modes · prior YC AI-tools team.

New in 2026

Now callable via /hire in Slack.

Same vetted US bench. New way to brief a role. Type /hire in Slack — or any MCP-connected tool — and our sourcing agents plus human recruiters take it from there. Same $10,000 flat-per-hire model.

Add the AI engineer. Ship the model.

Common questions

Do you place classic backend / frontend engineers, not just AI?
Yes. AI-native tracks are our lead practice, but classic engineering roles (Front-end, Back-end, Full-stack, Mobile, DevOps, QA) are on the same bench with the same human review and per-hire model.

How fast can I get a vetted shortlist?
Typically days, not weeks. Time-to-shortlist depends on role specificity and bench availability for the discipline.

What does it cost to hire a US engineer through Aperturio?
$10,000 flat per direct hire. The fee is the same whether the role is a $120K junior or a $250K staff engineer. No retainer, no seats. Replacement guarantee included.