Discipline 03 · Applied AI

LLMs, agents, and workflows that actually pay rent.

Past the demo. We ship retrieval-augmented systems, multi-step agents, fine-tuned models and automation pipelines that survive contact with real users, real data, and real cost-per-token spreadsheets.

Evaluated. Cheap. Fast.

Median eval pass rate before launch: 94%. Median cost-per-task drop after our optimisation pass: 6.4×.

RAG over private data · Multi-step agents · Fine-tuning & evals · Workflow automation · Voice & multimodal · LLM cost optimisation
Capabilities

The shapes of system we ship.

Six recurring archetypes cover most of what we get asked to build. Pick the one closest to yours, or describe your own — we've probably done a cousin of it.

01

RAG over your private corpus

Search, chat, summarisation, citation. Built on the boring parts (chunking, hybrid retrieval, reranking, evals) — not just an OpenAI call wrapped in a sycophantic prompt.

  • Hybrid BM25 + vector
  • Cross-encoder rerank
  • Citation-grounded answers
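
To make "hybrid BM25 + vector" concrete, here is a minimal sketch of one common way to merge a keyword ranking and a vector ranking: reciprocal rank fusion. This is an illustration rather than a description of any particular production pipeline; the document IDs, the two hit lists, and the `k=60` constant are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID so output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword (BM25) index and a vector index.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Documents that both retrievers agree on rise to the top, which is the point of going hybrid. A cross-encoder reranker would then typically rescore only the first few fused results.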
02

Agentic workflows

Multi-step agents that read, plan, and act on real systems. Tool use, browser automation, human-in-the-loop checkpoints, full audit trail.

  • Tool-using agents
  • Approval gates
  • Replay & audit logs
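
As a rough sketch of how an approval gate and an audit log can fit together: every tool call is recorded, and calls flagged as having side effects wait for a reviewer's decision first. The tool names, the `TOOLS` registry, and the `reviewer` callback below are hypothetical stand-ins, not a real agent framework's API.

```python
import time


# Hypothetical tools an agent might call.
def lookup_order(order_id):
    return {"order_id": order_id, "status": "shipped"}


def issue_refund(order_id, amount):
    return {"order_id": order_id, "refunded": amount}


# Registry: tool name -> (function, needs_approval)
TOOLS = {
    "lookup_order": (lookup_order, False),  # read-only, runs freely
    "issue_refund": (issue_refund, True),   # side effect, gated
}


def run_tool(name, args, approve, audit_log):
    """Run one tool call, asking `approve` first when the tool is gated.

    Every call, executed or rejected, is appended to `audit_log`.
    """
    func, needs_approval = TOOLS[name]
    entry = {"ts": time.time(), "tool": name, "args": args}
    if needs_approval and not approve(name, args):
        entry["outcome"] = "rejected"
        audit_log.append(entry)
        return None
    entry["result"] = func(**args)
    entry["outcome"] = "executed"
    audit_log.append(entry)
    return entry["result"]


audit_log = []


# Stand-in for a human reviewer: approves refunds up to 100 only.
def reviewer(name, args):
    return args.get("amount", 0) <= 100


run_tool("lookup_order", {"order_id": "A1"}, reviewer, audit_log)
run_tool("issue_refund", {"order_id": "A1", "amount": 500}, reviewer, audit_log)
outcomes = [entry["outcome"] for entry in audit_log]
```

In a real system the `approve` callback would block on a human decision (a Slack button, a review queue) rather than a rule, and the log would go to durable storage so a run can be replayed later.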
03

Fine-tuning & evals

When prompting plateaus. SFT, DPO, RLAIF on the open-weight models that fit your task — and the eval harness that proves it works.

  • Llama / Qwen / Mistral
  • LoRA & full fine-tunes
  • Eval harness CI
04

Internal copilots

Slack-native, IDE-native, dashboard-native. The copilot your team actually uses because it knows your wiki, your issues, your codebase.

  • Slack / Teams native
  • IDE plugins
  • Org-wide RAG
05

Document & process automation

Invoices, contracts, claims, KYC packets. We replace the form-shop with structured extraction + verification + human escalation.

  • Structured extraction
  • Verification workflows
  • Human-in-the-loop UI
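
The extraction, verification, and escalation loop can be sketched in a few lines. The example assumes a model has already turned an invoice into structured fields; the `Invoice` schema, the two checks, and the routing labels are illustrative assumptions, not a fixed design.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LineItem:
    description: str
    amount: Decimal


@dataclass
class Invoice:
    invoice_number: str
    line_items: list
    total: Decimal


def verify(invoice):
    """Return a list of problems; an empty list means the invoice passes."""
    problems = []
    if not invoice.invoice_number.strip():
        problems.append("missing invoice number")
    line_sum = sum((item.amount for item in invoice.line_items), Decimal("0"))
    if line_sum != invoice.total:
        problems.append(f"line items sum to {line_sum}, total says {invoice.total}")
    return problems


def route(invoice):
    """Auto-accept clean extractions; send anything else to a human queue."""
    problems = verify(invoice)
    if problems:
        return ("human_review", problems)
    return ("auto_accept", [])


# Hypothetical extraction results.
good = Invoice(
    "INV-001",
    [LineItem("Design", Decimal("400.00")), LineItem("Build", Decimal("600.00"))],
    Decimal("1000.00"),
)
bad = Invoice("INV-002", [LineItem("Design", Decimal("400.00"))], Decimal("450.00"))

good_route = route(good)
bad_route = route(bad)
```

Cheap deterministic checks like these catch a large share of extraction mistakes before a person has to look, and the list of problems gives the reviewer a starting point when one does.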
06

Voice & multimodal

Real-time voice agents, vision QA, multimodal search. Built with eval-driven discipline — not vibe-checked.

  • Realtime voice (Pipecat)
  • Vision QA pipelines
  • Multimodal eval suites
What you walk away with

More than a demo.

Every engagement ends the same way: a production system, a team that can run it, and a written record of every decision we argued about.

D01
Production AI service
Versioned, evaluated, observable. Cost dashboard from day one.
D02
Eval harness in CI
Every PR re-runs your task suite. Regressions block merge.
D03
Prompt + model registry
Versioned prompts, model pins, rollback in one click.
D04
Cost & latency dashboards
Per-tenant, per-feature, per-model. The truth about your spend.
D05
Safety & guardrail layer
Prompt-injection defence, PII redaction, content moderation.
D06
Team enablement
We train your team on the eval-first workflow we use ourselves.
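
For a sense of what "regressions block merge" can mean in practice, here is a minimal sketch of a CI gate that compares a candidate run of the task suite against a stored baseline. The case names, the result format (case ID mapped to pass or fail), and the `tolerance` parameter are assumptions made for the example.

```python
def pass_rate(results):
    """Fraction of eval cases that passed; `results` maps case ID to bool."""
    return sum(results.values()) / len(results) if results else 0.0


def gate(baseline, candidate, tolerance=0.0):
    """Return (ok, regressions).

    A regression is a case that passed on the baseline and fails (or is
    missing) in the candidate run. The gate also requires the overall pass
    rate not to drop by more than `tolerance`.
    """
    regressions = sorted(
        case
        for case, passed in baseline.items()
        if passed and not candidate.get(case, False)
    )
    ok = not regressions and pass_rate(candidate) + tolerance >= pass_rate(baseline)
    return ok, regressions


# Hypothetical results: main branch versus the pull request under review.
baseline = {"refund_policy": True, "date_parsing": True, "long_context": False}
candidate = {"refund_policy": True, "date_parsing": False, "long_context": True}

ok, regressions = gate(baseline, candidate)
exit_code = 0 if ok else 1  # a CI job would exit with this code
```

Here the overall pass rate is unchanged, yet the gate still fails because `date_parsing` went from passing to failing. Tracking per-case regressions rather than only the aggregate is what stops one fix from quietly masking one break.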
Default stack

Boring on the inside.

A deliberately conservative core. We swap in sharp tools when the problem requires them, never because they're trending.

OpenAI
Anthropic
Llama 3
vLLM
LangGraph
LlamaIndex
Pinecone
pgvector
Weaviate
Modal
Replicate
Pydantic
How we engage

Three shapes of work.

Pick the one that maps to where you are. Or hop between them — most clients start at one and end at another.

PHASE A

Discovery sprint

2 weeks · fixed

Architecture review, prototype of the riskiest interaction, and a costed roadmap. You walk away with a plan whether or not we build it.

From AED 38,000
PHASE B · MOST POPULAR

Build pod

8–16 weeks · milestones

Senior pod of 3–5: tech lead, full-stack, design, QA. Two-week cycles, demo every Friday, you own the repo from week one.

From AED 145,000 / month
PHASE C

Embedded retainer

ongoing · monthly

Once it's live. A reduced pod stays in your stand-ups, owns the on-call rota, and ships the roadmap quarter-over-quarter.

From AED 65,000 / month
FAQ

Honest answers.

If yours isn't here, just ask. We respond to most enquiries within a working day.

Will you work with our existing engineering team?

Yes — most of our work is alongside in-house teams. We default to your repo, your code review, your standup. Where we run our own pods, we still pair-program weekly so the knowledge transfers from day one rather than at handover.

Can you take over a project that's already in trouble?

Frequently. We start with a one-week audit: code, infra, team, runway. Output is a frank document — what to keep, what to throw out, what to rewrite — and a costed path to stable. About a third of those projects we then take on. The rest we hand back with a plan they can execute themselves.

What about IP and code ownership?

You own everything we write — code, designs, infra, docs — assigned on first commit. We keep no kill-switches and no exclusivity clauses. Our standard MSA is six pages and we'll happily redline yours.

Can you work fixed-price?

For well-shaped, mostly-known scopes — yes, we offer fixed-fee with milestone billing. For genuinely exploratory work we recommend time-and-materials with a hard cap, because anyone who quotes you a fixed price for an unknown is either lying or building you something cheap.

What's the smallest engagement you'll take?

A two-week discovery sprint. Below that we're not adding much value over a good freelancer, and we'd rather refer you to one of the four we trust than do it badly ourselves.

AI pod available June 2026

Got an AI demo that needs to grow up?