Discipline 03 · Applied AI

LLMs, agents, and workflows that actually pay rent.

Past the demo. We ship retrieval-augmented systems, multi-step agents, fine-tuned models and automation pipelines that survive contact with real users, real data, and real cost-per-token spreadsheets.

evaluated.
cheap. fast.

Median eval pass rate before launch: 94%. Median cost-per-task drop after our optimisation pass: 6.4×.

RAG over private data · Multi-step agents · Fine-tuning & evals · Workflow automation · Voice & multimodal · LLM cost optimisation

Capabilities

The shapes of system we ship.

Six recurring archetypes cover most of what we get asked to build. Pick the one closest to yours, or describe your own — we've probably done a cousin of it.

01

RAG over your private corpus

Search, chat, summarisation, citation. Built on the boring parts (chunking, hybrid retrieval, reranking, evals) — not just an OpenAI call wrapped in a sycophantic prompt.

· Hybrid BM25 + vector
· Cross-encoder rerank
· Citation-grounded answers
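
To make the hybrid step concrete, here is a minimal sketch of one common way to merge a BM25 ranking with a vector ranking: reciprocal rank fusion. That technique is an assumption for illustration, since this page does not say how the two result lists are combined. The document ids and the constant `k=60` are hypothetical placeholders.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (BM25) search and a vector search.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

In a fuller pipeline the fused list would typically be trimmed and passed to the reranker; that stage is omitted here.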

02

Agentic workflows

Multi-step agents that read, plan, and act on real systems. Tool use, browser automation, human-in-the-loop checkpoints, full audit trail.

· Tool-using agents
· Approval gates
· Replay & audit logs
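
An approval gate with an audit trail can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular framework: `GatedToolRunner`, the tool names and the `approve` callback are all hypothetical, and the callback stands in for a real human reviewer.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class GatedToolRunner:
    tools: Dict[str, Callable[..., object]]
    needs_approval: Set[str]               # tools that must pass the gate
    approve: Callable[[str, dict], bool]   # stand-in for a human decision
    audit_log: List[dict] = field(default_factory=list)

    def run(self, tool_name: str, **kwargs):
        entry = {"ts": time.time(), "tool": tool_name, "args": kwargs}
        gated = tool_name in self.needs_approval
        if gated and not self.approve(tool_name, kwargs):
            # Rejected calls are logged too, so the trail is complete.
            entry["status"] = "rejected"
            self.audit_log.append(entry)
            return None
        result = self.tools[tool_name](**kwargs)
        entry["status"] = "executed"
        entry["result"] = result
        self.audit_log.append(entry)
        return result


runner = GatedToolRunner(
    tools={
        "lookup_order": lambda order_id: {"order_id": order_id, "total": 120},
        "issue_refund": lambda order_id, amount: f"refunded {amount} on {order_id}",
    },
    needs_approval={"issue_refund"},
    # Hypothetical policy: the reviewer only signs off refunds up to 100.
    approve=lambda name, args: args["amount"] <= 100,
)

runner.run("lookup_order", order_id="A1")              # read-only, not gated
runner.run("issue_refund", order_id="A1", amount=250)  # gated, rejected
print([e["status"] for e in runner.audit_log])
```

Because every call lands in `audit_log` with its arguments and outcome, the same list is what a replay tool would read from.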

03

Fine-tuning & evals

When prompting plateaus. SFT, DPO, RLAIF on the open-weight models that fit your task — and the eval harness that proves it works.

· Llama / Qwen / Mistral
· LoRA & full fine-tunes
· Eval harness CI

04

Internal copilots

Slack-native, IDE-native, dashboard-native. The copilot your team actually uses because it knows your wiki, your issues, your codebase.

· Slack / Teams native
· IDE plugins
· Org-wide RAG

05

Document & process automation

Invoices, contracts, claims, KYC packets. We replace the form-shop with structured extraction + verification + human escalation.

· Structured extraction
· Verification workflows
· Human-in-the-loop UI
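
The extraction, verification and escalation flow can be sketched as follows. Everything here is a simplified assumption: the `Invoice` fields, the regex-based `extract_invoice` (a production system would more likely call a model at that point) and the single arithmetic check are hypothetical, and standard-library dataclasses are used so the sketch runs without dependencies.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Invoice:
    invoice_number: Optional[str]
    subtotal: Optional[float]
    tax: Optional[float]
    total: Optional[float]


def extract_invoice(text: str) -> Invoice:
    """Stand-in extractor: pulls labelled fields out of plain text."""
    def amount(label: str) -> Optional[float]:
        pattern = rf"^{label}:\s*([0-9]+(?:\.[0-9]+)?)\s*$"
        m = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
        return float(m.group(1)) if m else None

    number = re.search(r"Invoice\s*#\s*(\S+)", text)
    return Invoice(
        invoice_number=number.group(1) if number else None,
        subtotal=amount("Subtotal"),
        tax=amount("Tax"),
        total=amount("Total"),
    )


def verify(inv: Invoice) -> List[str]:
    """Return a list of problems; an empty list means the record checks out."""
    problems = [
        f"missing {name}"
        for name in ("invoice_number", "subtotal", "tax", "total")
        if getattr(inv, name) is None
    ]
    if not problems and abs(inv.subtotal + inv.tax - inv.total) > 0.01:
        problems.append("subtotal + tax does not match total")
    return problems


def route(text: str) -> Tuple[str, Invoice, List[str]]:
    """Auto-approve clean records; send anything doubtful to a person."""
    inv = extract_invoice(text)
    problems = verify(inv)
    return ("human_review" if problems else "auto_approved", inv, problems)


good = "Invoice # INV-1001\nSubtotal: 100.00\nTax: 5.00\nTotal: 105.00\n"
bad = "Invoice # INV-1002\nSubtotal: 100.00\nTax: 5.00\nTotal: 150.00\n"
print(route(good)[0], route(bad)[0])
```

The design point is that escalation is driven by explicit checks, so a reviewer sees the reason a document was flagged rather than a bare confidence score.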

06

Voice & multimodal

Real-time voice agents, vision QA, multimodal search. Built with eval-driven discipline — not vibe-checked.

· Realtime voice (Pipecat)
· Vision QA pipelines
· Multimodal eval suites

What you walk away with

More than a demo.

Every engagement ends the same way: a production system, a team that can run it, and a written record of every decision we argued about.

D01

Production AI service

Versioned, evaluated, observable. Cost dashboard from day one.

D02

Eval harness in CI

Every PR re-runs your task suite. Regressions block merge.
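
A merge-blocking eval gate can be as small as the sketch below. The names `pass_rate` and `gate`, the toy system and the 0.9 baseline are hypothetical placeholders; a real harness would load the task suite from the repo and call the deployed prompt or model.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, Callable[[str], bool]]  # (input, check on the output)


def pass_rate(cases: List[Case], system: Callable[[str], str]) -> float:
    """Fraction of cases whose output satisfies its check."""
    passed = sum(1 for prompt, check in cases if check(system(prompt)))
    return passed / len(cases)


def gate(current: float, baseline: float, tolerance: float = 0.0) -> int:
    """Return a process exit code: 0 lets the merge through, 1 blocks it."""
    return 0 if current >= baseline - tolerance else 1


def toy_system(prompt: str) -> str:
    # Stand-in for the model call under test.
    return prompt.upper()


CASES: List[Case] = [
    ("hello", lambda out: out == "HELLO"),
    ("abc", lambda out: out.isupper()),
    ("ok", lambda out: out == "ok"),      # fails on purpose: output is "OK"
    ("x", lambda out: len(out) == 1),
]

rate = pass_rate(CASES, toy_system)
exit_code = gate(rate, baseline=0.9)
print(f"pass rate {rate:.2f}, exit code {exit_code}")
# In a CI job, the script would finish with sys.exit(exit_code).
```

A non-zero exit code is what most CI systems treat as a failed check, which is how a regression ends up blocking the merge.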

D03

Prompt + model registry

Versioned prompts, model pins, rollback in one click.

D04

Cost & latency dashboards

Per-tenant, per-feature, per-model. The truth about your spend.

D05

Safety & guardrail layer

Prompt-injection defence, PII redaction, content moderation.

D06

Team enablement

We train your team on the eval-first workflow we use ourselves.

Default stack

Boring on the inside.

A deliberately conservative core. We swap in sharp tools when the problem requires them, never because they're trending.

OpenAI

Anthropic

Llama 3

vLLM

LangGraph

LlamaIndex

Pinecone

pgvector

Weaviate

Modal

Replicate

Pydantic

How we engage

Three shapes of work.

Pick the one that maps to where you are. Or hop between them — most clients start at one and end at another.

PHASE A

Discovery sprint

2 weeks · fixed

Architecture review, prototype of the riskiest interaction, and a costed roadmap. You walk away with a plan whether or not we build it.

From AED 38,000

PHASE B · MOST POPULAR

Build pod

8–16 weeks · milestones

Senior pod of 3–5: tech lead, full-stack, design, QA. Two-week cycles, demo every Friday, you own the repo from week one.

From AED 145,000 / month

PHASE C

Embedded retainer

ongoing · monthly

Once it's live. A reduced pod stays in your stand-ups, owns the on-call rota, and ships the roadmap quarter-over-quarter.

From AED 65,000 / month

FAQ

Honest answers.

If yours isn't here, just ask. We respond to most enquiries within a working day.

Will you work with our existing engineering team?

Yes. Most of our work is alongside in-house teams. We default to your repo, your code review, your stand-up. Where we run our own pods, we still pair-program weekly so the knowledge transfers from day one rather than at handover.

Can you take over a project that's already in trouble?

Frequently. We start with a one-week audit: code, infra, team, runway. Output is a frank document — what to keep, what to throw out, what to rewrite — and a costed path to stable. About a third of those projects we then take on. The rest we hand back with a plan they can execute themselves.

What about IP and code ownership?

You own everything we write — code, designs, infra, docs — assigned on first commit. We keep no kill-switches and no exclusivity clauses. Our standard MSA is six pages and we'll happily redline yours.

Can you work fixed-price?

For well-shaped, mostly-known scopes — yes, we offer fixed-fee with milestone billing. For genuinely exploratory work we recommend time-and-materials with a hard cap, because anyone who quotes you a fixed price for an unknown is either lying or building you something cheap.

What's the smallest engagement you'll take?

A two-week discovery sprint. Below that we're not adding much value over a good freelancer, and we'd rather refer you to one of the four we trust than do it badly ourselves.
