aperturio.
AI · LLM · Agents · Engineering · LATAM Nearshore

AI engineering velocity without the payroll bloat.

Vetted LATAM-based engineers across LLM applications, agentic workflows, AI infrastructure, AI-native full-stack, and vector / ML, full-time or fractional, with no placement fees. One eval-driven, cost-aware bench, aligned to US time zones.

Full-time or fractional · No placement fees · A real recruiter reviews every candidate

AI-native roles · what we lead with

Prioritized for who you are and where you're going.

LLM Application Engineer

RAG over real data, tool use, structured outputs, evals. Owns the product surface — what the user actually sees an LLM do. Ships with Anthropic / OpenAI SDKs, Vercel AI SDK, and Cursor or Claude Code in the loop.

Agentic Workflow Engineer

Multi-step agents that actually finish the work. MCP servers, LangGraph / Inngest / Mastra orchestration, safe tool exposure, sandboxed execution, human-in-the-loop checkpoints.

AI Infrastructure & Inference

vLLM / SGLang / TGI for self-hosted, Modal / Replicate / Together / Fireworks for hosted. GPU cost posture, batching, streaming, observability for LLM workloads.

AI-Native Full-stack

TypeScript everywhere, Next.js / Hono / Bun, Postgres + pgvector, Drizzle / Prisma. Streaming UIs, generative UI, tool-calling baked into the product flow.

Data + ML Engineer

Embeddings pipelines, vector DBs (pgvector / LanceDB / Qdrant / Pinecone), hybrid retrieval (BM25 + dense + rerank), fine-tuning (LoRA / QLoRA / DPO), eval datasets.

AI Product Engineer

Full-stack who can ship AI features end-to-end with judgment about cost / latency / UX tradeoffs. Comfortable across model selection, retrieval, UI, and evals.

Also placing — classic, non-AI roles

Not every role needs an LLM. We still place the classic tracks from the same bench, with the same human review and the same per-hire model.

  • Front-end / Web: React/Next, Vue, Svelte. Production SPAs, marketing sites, design-system implementation, performance.
  • Back-end / API: Node/TS, Python, Go, Java. Domain modeling, REST/GraphQL APIs, queue workers.
  • Full-stack: End-to-end feature delivery across the stack. Pragmatic, ships fast.
  • Mobile: iOS/Android, React Native, Flutter.
  • DevOps / Cloud: AWS/GCP/Azure, Terraform, CI/CD, observability.
  • QA / Automation: Playwright, Cypress, load and integration testing.

We vet

Beyond the resume.

Culture screening, AI-assisted skills assessment, expert human review. A real recruiter signs off before any candidate reaches you.

Eval discipline

Offline + online evals, golden sets, regression detection, A/B against frontier models.

Prompt + context engineering

Tool design, structured outputs, RAG context construction, prompt versioning. Prompts treated like code.

LLM cost + latency tradeoffs

Model selection by task, caching, batching, streaming, cost ceilings. Token economics literacy.

Production agent design

Tool exposure boundaries, retry / refusal handling, human-in-the-loop checkpoints, failure modes in production.

Retrieval + memory

Chunking, reranking, hybrid retrieval, structured retrieval, conversation memory. Real RAG, not a vector-DB MVP.

AI-tool fluency

Cursor, Windsurf, Claude Code, Zed AI daily. Ships with AI in the loop and reviews LLM-generated code with judgment.

Outcomes we drive

  • AI features that actually ship — and don't regress when the model updates
  • Eval pipelines that catch quality drops before users do
  • Production agents that finish the work, with humans on the loop where it matters
  • Predictable AI cost and latency under real traffic
  • RAG and memory that work past the demo

Sample profiles

Representative of the bench — share your requirements for live profiles.

Augustin M.
AI-Native Full-stack Engineer (FT or 20–30 hrs/wk)

Next.js + Vercel AI SDK · Anthropic + OpenAI SDKs · pgvector RAG · structured outputs in prod · ships weekly.

Elvis C.
AI Infrastructure (Fractional)

vLLM + SGLang · Modal / Replicate · GPU cost posture · Langfuse traces · LLM-aware SLOs.

Santiago C.
Senior LLM Application Engineer (FT)

RAG pipelines · evals (Braintrust + custom) · tool-calling design · MCP servers · prior agentic-product team.

New in 2026

Now callable via /hire in Slack.

Same vetted LATAM bench. New way to brief a role. Type /hire in Slack, or in any MCP-connected tool, and our sourcing agents plus human recruiters take it from there. Same no-placement-fee model.

Add the AI engineer. Ship the model.

Common questions

Do you place classic backend / frontend engineers, not just AI?
Yes. AI-native tracks are our lead practice, but classic engineering roles (Front-end, Back-end, Full-stack, Mobile, DevOps, QA) are on the same LATAM bench with the same human review.

How fast can I get a vetted shortlist?
Typically 1–2 weeks when matching candidates are already in our pool. Very niche specializations can take longer.

What does it cost to hire a LATAM engineer through Aperturio?
Two ways to engage. LATAM direct full-time hires are $5,000 flat per hire. Fractional or contract engagements have no placement fee: you pay a transparent margin on the contractor's rate. No retainer either way.