Ask anyone in a large organisation where their morning goes and you'll hear the same thing: it leaks. A few minutes in the attendance portal to check yesterday's punches. A jump to the approvals queue because a permission request is waiting. A separate ERP inbox for the notifications that actually move work. The calendar app for the 11:00. Email for the reply that should have gone yesterday. None of these tasks is hard. The cost is the switching — the context you drop and rebuild every time you change tabs.
We were asked by a large public-sector group in Abu Dhabi to fix exactly that. Two tenants sit inside one government group, each with employees and managers, each with their own backend systems. The brief wasn't "build us a chatbot." It was: give people one place to run their workday. What we shipped is a conversational assistant that unifies attendance, approvals, tasks, meetings, email and notifications into a single chat surface — and the most interesting decisions we made were about what not to build.
Deterministic by design, not by limitation
Let's be precise up front, because it shapes everything else. This is not a generative LLM chatbot. There is no free-form model spinning answers out of a prompt. What sits behind the chat box is a deterministic, confidence-scored intent engine over a curated set of 37 intents spanning 8 work areas — attendance, inbox, tasks, meetings, email, to-dos, ERP and calendar.
That was a deliberate call, and we'd make it again. In an enterprise where an assistant can approve a request, submit a permission, send an email or correct a clock-in, a "usually right" generative answer is a liability, not a feature. A bot that hallucinates a leave balance or invents an approval action is worse than no bot at all. Deterministic intent matching means every action the assistant takes is one it was explicitly built to take, with a known payload and a known outcome. Being more reliable than a guessy LLM bot isn't a compromise here; it's the product.
In a system that can approve and submit on your behalf, deterministic beats clever. — Architecture decision record, Tenant A
The input side is deliberately forgiving even though the engine underneath is strict. Users can type a free-text phrase ("I forgot to clock in this morning") or tap a suggested button — both flow into the same intent matcher:
free text ──┐
            ├──▶ intent matcher ──▶ confidence score
button tap ─┘                              │
                                           ├── low ──▶ "here's what I can do" suggestions
                                           │
                                          high
                                           │
                                           ▼
                                    resolved intent ──▶ payload-driven UI
When confidence is high, the assistant acts. When it's low, it doesn't guess — it surfaces a graceful "let me suggest what I can do" fallback with example phrases. That fallback is doing a lot of quiet work: it teaches people the surface area of the assistant without a manual, and it never leaves them staring at a dead "I didn't understand that."
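To make the high/low split concrete, here is a minimal sketch of a confidence-scored matcher. It is not the production engine: the intent names, example phrases, the word-overlap (Jaccard) scoring and the 0.5 threshold are all assumptions chosen for illustration. What it does show is the shape described above: one matcher for typed text and button taps, an action only above the threshold, and suggestions instead of a guess below it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical catalogue: the real system has 37 intents; these three are made up.
INTENTS = {
    "attendance.correct_clock_in": ["i forgot to clock in", "fix my clock in"],
    "inbox.show_pending": ["show my pending approvals", "what is waiting for my approval"],
    "meetings.schedule": ["schedule a meeting", "book a meeting"],
}

THRESHOLD = 0.5  # assumed cut-off between "act" and "suggest"


@dataclass
class Match:
    intent: Optional[str]    # None when the matcher declines to guess
    confidence: float
    suggestions: list        # example phrases offered on low confidence


def _tokens(text: str) -> set:
    # Lower-case words with edge punctuation stripped.
    return {w.strip(".,!?\"'").lower() for w in text.split()} - {""}


def _score(query: set, phrase: set) -> float:
    # Jaccard overlap: shared words divided by all words seen.
    if not query or not phrase:
        return 0.0
    return len(query & phrase) / len(query | phrase)


def match_intent(text: str) -> Match:
    """Score the input against every example phrase and keep the best."""
    query = _tokens(text)
    best_intent, best_score = None, 0.0
    for intent, phrases in INTENTS.items():
        for phrase in phrases:
            score = _score(query, _tokens(phrase))
            if score > best_score:
                best_intent, best_score = intent, score
    if best_score >= THRESHOLD:
        return Match(best_intent, best_score, [])
    # Low confidence: do not act, surface what the assistant can do.
    return Match(None, best_score, [phrases[0] for phrases in INTENTS.values()])
```

In this sketch a button tap simply sends its canonical phrase through `match_intent`, so both input paths share one code path and one threshold.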
The part that makes it more than a search box
If intent matching were the whole story, this would be a nicer command line. The thing that earns the word assistant is that it's proactive and cross-module — it connects things that live in different systems and tells you about them before you ask.
Open it and you don't get a blank prompt. You get a time-aware greeting and a live Today Overview assembled on the spot: are you clocked in, what approvals are pending, what meetings you have, your to-dos, your task breakdown. That single card is the whole reason for the architecture we'll get to in a moment — it's reading several backends at once and stitching them into one glance.
The cross-module intelligence is where it stops feeling like glue and starts feeling like judgement. Think of these as three faces of one nudge engine — the assistant noticing that two facts from two systems, taken together, mean something:
None of these are answers to questions. That's the difference between a tool you query and a tool that watches your back. The weather nudge is non-obvious on purpose: the trigger lives in conditions outside any single record — the employee would never think to go looking for an entitlement they don't know exists, so the assistant has to raise it first.
A conversation, not a form maze
The second differentiator is the UX, and it leans hard on one idea: the chat is just a transport; the real intelligence is in the payloads. Every response can mix plain text, a data card, and an interactive form, and the front end renders purpose-built cards from whatever the backend sends — inboxes, tasks, attendance summaries, meeting invites, email drafts. We call it payload-driven UI: the server decides what to show, the client knows how to show each shape beautifully.
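The dispatch idea behind payload-driven UI can be sketched in a few lines. This is an illustration in Python rather than the actual front-end code, and the payload type names (`text`, `attendance_summary`, `form`) and their fields are hypothetical. The point is only that the server names a shape and the client looks up a renderer for it.

```python
from typing import Callable, Dict, List

# Registry mapping a payload "type" to the function that knows how to draw it.
RENDERERS: Dict[str, Callable[[dict], str]] = {}


def renders(kind: str):
    """Decorator that registers a renderer for one payload shape."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        RENDERERS[kind] = fn
        return fn
    return register


@renders("text")
def render_text(payload: dict) -> str:
    return payload["body"]


@renders("attendance_summary")
def render_attendance(payload: dict) -> str:
    return f"[Attendance] in {payload['clock_in']} / out {payload['clock_out']}"


@renders("form")
def render_form(payload: dict) -> str:
    names = ", ".join(f["name"] for f in payload["fields"])
    return f"[Form: {payload['title']}] fields: {names}"


def render_response(parts: List[dict]) -> List[str]:
    """Render each part of a mixed response; unknown shapes degrade safely."""
    rendered = []
    for part in parts:
        renderer = RENDERERS.get(part.get("type"))
        rendered.append(renderer(part) if renderer else "[unsupported content]")
    return rendered
```

A single response can then carry plain text, a card and a form side by side, and adding a new card type means registering one more renderer rather than touching the chat transport.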
So a permission request isn't a wall of fields. It's a guided multi-step flow — type, dates and times, remarks, attachment, a live balance preview, confirmation — with the date/time pickers, attendee and recipient autocomplete, and upstream-driven dropdowns all built in. Scheduling a meeting is a conversation: subject, time, attendee search, an online-meeting toggle, location, and the invite is created for you. Composing email is the same — recipient search, subject, body, validation, a clean send confirmation — without ever leaving the thread.
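One way to picture a guided flow like the permission request is as a small state machine that always knows which step comes next. The sketch below is an assumption about how such a flow could be modelled, not the shipped implementation: the step names follow the text, the choice of which steps are optional is invented, and the live balance preview is left out because it is a display rather than an input.

```python
from dataclasses import dataclass, field
from typing import Optional

# Step order follows the text; which steps are optional is an assumption.
STEPS = ["type", "dates_and_times", "remarks", "attachment", "confirmation"]
OPTIONAL = {"remarks", "attachment"}


@dataclass
class PermissionRequestFlow:
    answers: dict = field(default_factory=dict)

    def next_step(self) -> Optional[str]:
        """First step not yet answered, or None when the flow is complete."""
        return next((s for s in STEPS if s not in self.answers), None)

    def submit(self, step: str, value) -> Optional[str]:
        """Record one answer and return the step the UI should show next."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        if value in (None, "") and step not in OPTIONAL:
            raise ValueError(f"{step} is required")
        self.answers[step] = value
        return self.next_step()
```

Because the flow object only exposes "what is next", the chat surface can render one picker or field at a time instead of a full form.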
For managers the same surface becomes a control room. The approval queue supports bulk approve and reject with keyboard shortcuts, every request expands inline with a balance-impact preview before you commit, and date-range attendance reports come back colour-coded with a one-tap PDF download. The unified inbox is the clearest expression of the whole philosophy: one chooser across multiple approval sources with live unread counts, role-aware visibility, and the full workflow vocabulary — Approve, Reject, Transfer, Hold, Return, Assign, with remarks — plus ERP (Oracle EBS) notifications you can respond to, delegate, transfer or reassign in place.
Under the hood: fan-out, and never trusting the body
That Today Overview card is, quietly, the hardest engineering in the product, and it's where the third differentiator lives. A naïve implementation calls attendance, then approvals, then meetings, then tasks, then ERP, one after another, and the user watches a spinner for the sum of all those latencies. We fan out in parallel instead: every backend is called at once, no call waits on another, and each source fails independently. If the ERP is having a bad day, you still get your attendance, your meetings and your tasks; the one slow or broken source degrades to a friendly placeholder while everything else paints immediately.
request ──┬──▶ Time & Attendance ────┐
          ├──▶ Approvals hub ────────┤
          ├──▶ Tasks ────────────────┤
          ├──▶ Calendar / Email ─────┼──▶ merge what succeeded ──▶ render
          ├──▶ CMS ──────────────────┤
          └──▶ ERP / Oracle Fusion ──┘
          (parallel · independent failure)
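The fan-out in the diagram can be sketched with `asyncio`. Everything named here is a stand-in: the four fetch functions, their delays, the 0.2 second per-source budget and the placeholder wording are assumptions for illustration, not the real integrations. The technique itself is standard: wrap each call so its own timeout or error is caught locally, then gather the wrapped calls so one failure never cancels the rest.

```python
import asyncio

TIMEOUT_S = 0.2  # assumed per-source time budget


# Hypothetical stand-ins for the real backends.
async def fetch_attendance() -> dict:
    await asyncio.sleep(0.01)
    return {"clocked_in": True}


async def fetch_meetings() -> dict:
    await asyncio.sleep(0.01)
    return {"today": ["11:00 review"]}


async def fetch_tasks() -> dict:
    await asyncio.sleep(2.0)  # deliberately slower than the budget
    return {"open": 4}


async def fetch_erp() -> dict:
    raise ConnectionError("ERP unavailable")


SOURCES = {
    "attendance": fetch_attendance,
    "meetings": fetch_meetings,
    "tasks": fetch_tasks,
    "erp": fetch_erp,
}


async def _guarded(name: str, fetch) -> tuple:
    """Run one source with its own timeout; never let its failure escape."""
    try:
        data = await asyncio.wait_for(fetch(), TIMEOUT_S)
        return name, {"ok": True, "data": data}
    except Exception:
        # A timeout or backend error becomes a friendly placeholder.
        return name, {"ok": False, "placeholder": f"{name} is unavailable right now"}


async def today_overview() -> dict:
    """Call every source at once and merge whatever succeeded."""
    pairs = await asyncio.gather(*(_guarded(n, f) for n, f in SOURCES.items()))
    return dict(pairs)


if __name__ == "__main__":
    print(asyncio.run(today_overview()))
```

In this sketch the overview returns after roughly the timeout budget rather than the sum of all calls, with the slow tasks source and the failing ERP source replaced by placeholders. A client that paints each source as it arrives would need streaming or per-source requests on top of this, which is outside the sketch.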
On security we took the boring, correct path. The assistant is stateless per request: every call carries a token, the token is validated on its own, and identity is always derived from the token — never trusted from the request body. That last sentence is the whole game in a multi-tenant, multi-role system. A request can say it's manager X in tenant Y all it likes; the assistant only ever believes the token. No client-supplied identity, no privilege you didn't actually hold.
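The rule "identity comes from the token, never the body" is easy to show in miniature. The sketch below uses a toy HMAC-signed token built from the standard library as a stand-in; the source does not say which token format the system uses, and the secret, claim names and `handle_request` function are hypothetical. Expiry checks and key management are omitted to keep it short.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-only-secret"  # hypothetical signing key, for illustration only


def issue_token(claims: dict) -> str:
    """Toy token: base64 JSON claims plus an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def identity_from_token(token: str) -> dict:
    """Validate the token on its own and return the claims it carries."""
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        raise PermissionError("malformed token")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise PermissionError("invalid token signature")
    return json.loads(base64.urlsafe_b64decode(payload))


def handle_request(token: str, body: dict) -> dict:
    """Stateless handler: who you are is read from the token only."""
    who = identity_from_token(token)
    # Any user, role or tenant fields in the body are ignored on purpose.
    return {
        "user": who["user"],
        "tenant": who["tenant"],
        "role": who["role"],
        "action": body.get("action"),
    }
```

A body that claims to be a manager in another tenant changes nothing here: the handler never reads those fields, so there is no client-supplied identity to trust.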
The integration surface is wide on purpose: Time & Attendance, a workflow/approvals hub, tasks, calendar and email, a CMS, ERP and Oracle Fusion all sit behind the one chat box. To keep that buildable, we wrote an offline mock backend that mirrors the real APIs, so the assistant runs end-to-end with no VPN. We can develop on a plane and demo in a boardroom with no network dependency, and the same offline-capable build powers the public demo. We lean on the same discipline we described in "LLM evals in CI: stop shipping AI features on vibes": if you can't run it deterministically offline, you can't trust it.
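A mock that "mirrors the real APIs" usually comes down to both implementations satisfying one interface, with a switch deciding which is wired in. The following is a sketch under that assumption only: the `AttendanceBackend` interface, the fixture data, and the `ASSISTANT_OFFLINE` and `ATTENDANCE_URL` settings are all hypothetical names, and the real HTTP client is deliberately left unimplemented.

```python
import os
from typing import List, Mapping, Protocol


class AttendanceBackend(Protocol):
    """The one interface both the real and the mock backend satisfy."""
    def punches(self, user: str, day: str) -> List[dict]: ...


class MockAttendance:
    """Offline backend: deterministic fixtures, no network."""
    _FIXTURES = {
        ("emp-1", "2024-01-15"): [
            {"type": "in", "time": "08:02"},
            {"type": "out", "time": "16:31"},
        ],
    }

    def punches(self, user: str, day: str) -> List[dict]:
        return list(self._FIXTURES.get((user, day), []))


class HttpAttendance:
    """Placeholder for the real client; the HTTP call is omitted here."""
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def punches(self, user: str, day: str) -> List[dict]:
        raise NotImplementedError("real HTTP call omitted in this sketch")


def make_backend(env: Mapping[str, str] = os.environ) -> AttendanceBackend:
    """Pick the backend from configuration; this sketch defaults to offline."""
    if env.get("ASSISTANT_OFFLINE", "1") == "1":
        return MockAttendance()
    return HttpAttendance(env["ATTENDANCE_URL"])
```

Because the rest of the assistant only sees `AttendanceBackend`, the same intents, cards and flows run unchanged against fixtures or live systems.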
What we deliberately left out — and what's next
Four scope lines we drew on purpose, for transparency:
- No free-form LLM generation. In a system that can approve and submit on your behalf, deterministic beats clever. We'd rather say "I can't do that yet" than confidently do the wrong thing — which is why the intent engine, not a prompt, owns every action.
- English-only intent matching. Natural-language understanding currently matches intents in English only. Worth being exact: our error messages are bilingual, but the NLU is not, so typing an intent in Arabic won't resolve yet. That's the next frontier; in the meantime, the button-driven path already gives non-English users a complete, language-independent way to drive everything.
- No conversation memory across sessions. Each session starts fresh; we don't yet persist context from yesterday's chat. For a workday assistant whose state lives in the source systems anyway — your real balance, your real queue — this matters less than it sounds, but cross-session memory is on the roadmap.
- No token streaming. Answers arrive complete, not word-by-word. Given the responses are structured cards and forms rather than prose, streaming would be theatre more than value.
If we rebuilt from zero, the one thing we'd move earlier is the intent-suggestion layer — the "here's what I can do" fallback. We almost treated it as an error state. It turned out to be the single biggest driver of people discovering features, and it deserved to be a designed surface from day one, not a safety net.
What it costs you not to have this
Run the arithmetic the way it actually accrues. Picture even a quarter of an hour a day lost to tab-hopping — attendance, approvals, ERP inbox, calendar, email — and multiply it across a large organisation: that's a five-figure block of working hours leaking out every single week, before a single deadline is missed. The cost never shows up in any one dramatic task. It hides in the thousand small switches that never make it into a report.
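The arithmetic is short enough to write down. The quarter of an hour comes from the paragraph above; the five-day working week and the headcounts are assumptions, since no organisation size is given here.

```python
import math

MINUTES_LOST_PER_DAY = 15  # "a quarter of an hour a day", from the text
WORKDAYS_PER_WEEK = 5      # assumption: a five-day working week


def weekly_hours_lost(headcount: int) -> float:
    """Total working hours lost per week across the organisation."""
    return headcount * MINUTES_LOST_PER_DAY * WORKDAYS_PER_WEEK / 60


def headcount_for(weekly_hours: float) -> int:
    """Smallest headcount at which the weekly loss reaches the given hours."""
    per_person = MINUTES_LOST_PER_DAY * WORKDAYS_PER_WEEK / 60  # 1.25 hours
    return math.ceil(weekly_hours / per_person)
```

Under those assumptions each person loses 1.25 hours a week, so `weekly_hours_lost(8000)` is 10,000 hours and `headcount_for(10_000)` is 8,000 people: the loss reaches five figures a week once the organisation is around that size.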
A chat-native assistant that knows who you are, watches across your systems, and surfaces the one thing you needed before you went looking for it doesn't just save clicks. It changes what it feels like to start the day. And it does it without pretending to be smarter than it is. No hallucinated balances, no invented approvals, no confident nonsense — just a fast, reliable, proactive layer over the systems people already live in.



