A/B experiments
A/B experiments let you trial different agent personas against each other on real visitor traffic and pick the winner based on engagement, lead capture, or conversation length. Pitchbar assigns each visitor stickily (the same person always sees the same variant), so the comparison is honest.
Where it lives
Open any agent, then the A/B experiments tab in the agent nav (Beaker icon). URL: /app/agents/{id}/experiments.
Owners and Admins can create, start, and stop experiments; Editors can't.
Creating an experiment
- Click New experiment.
- Pick a kind:
  - persona: variants override the agent's name + tone in the system prompt for assigned visitors. Use this to test "Aria" vs "Max", "friendly" vs "punchy", etc.
  - cta: variants record assignment but don't yet alter the runtime CTA payload. Recorded for measurement; full runtime application ships in a future release.
  - trigger: same as cta (measurement only).
- Add at least 2 variants. Default is control + treatment at 50/50. You can change weights (any positive integers; they're normalised) and names.
- For persona kind, each variant's config JSON should hold a persona object:
    { "persona": { "name": "Aria", "tone": "warm and concise" } }
  The widget renders the variant's persona name in the chat header, and the LLM speaks under that name + tone for the assigned conversation.
- Save. Status starts at draft; no visitors are assigned yet.
- Click Start. Status flips to running. Every subsequent first-turn visitor is bucketed.
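To make "normalised" concrete, here is a minimal Python sketch of how positive integer weights could translate into traffic shares. The function name normalise_weights and the name / weight / config keys on each variant are illustrative assumptions, not Pitchbar's actual API or storage format.

```python
from fractions import Fraction


def normalise_weights(variants):
    """Convert positive integer weights into traffic shares that sum to 1.

    `variants` is a list of dicts with hypothetical keys: name, weight, config.
    """
    if len(variants) < 2:
        raise ValueError("an experiment needs at least 2 variants")
    for v in variants:
        if not isinstance(v["weight"], int) or v["weight"] <= 0:
            raise ValueError(f"weight for {v['name']!r} must be a positive integer")
    total = sum(v["weight"] for v in variants)
    return {v["name"]: Fraction(v["weight"], total) for v in variants}


# Weights 1 and 3 behave the same as 25 and 75: only the ratio matters.
variants = [
    {"name": "control", "weight": 1,
     "config": {"persona": {"name": "Aria", "tone": "warm and concise"}}},
    {"name": "treatment", "weight": 3,
     "config": {"persona": {"name": "Max", "tone": "punchy"}}},
]
shares = normalise_weights(variants)
```

With these inputs, control receives one quarter of new visitors and treatment three quarters.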
How assignment works
On the first message of a conversation, the MessageStreamController calls ExperimentResolver::resolveForConversation:
- If conversation.variant_id is already set, use it (sticky).
- Otherwise, look up the most recently started running experiment for this agent. Only ONE active experiment per agent: if you start a second one while the first is running, the resolver picks the most recent. To run a different kind, stop the previous one first.
- Hash (visitor_id + experiment_id) into a bucket on the weighted variant list (Assigner). The same visitor returning days later lands in the same variant; the assignment row is durable.
- Persist conversation.variant_id. Every future turn for this conversation reads the same variant.
- For kind = persona, the variant's config[persona] overrides the agent's default persona in PromptBuilder::build for that turn.
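The steps above can be sketched in Python as follows. This is an illustration of deterministic weighted bucketing, not Pitchbar's code: the function names, the dict shapes, the use of SHA-256, and the use of a variant name in place of a variant ID are all assumptions, since the hash function and storage details aren't documented here.

```python
import hashlib
from typing import Optional


def assign_variant(visitor_id: str, experiment_id: str,
                   variants: list[tuple[str, int]]) -> str:
    """Pick a variant deterministically from (visitor_id, experiment_id).

    `variants` is a list of (name, positive integer weight) pairs.
    SHA-256 is an assumed choice; any stable hash would do.
    """
    total = sum(weight for _, weight in variants)
    digest = hashlib.sha256(f"{visitor_id}:{experiment_id}".encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") % total  # position in [0, total)
    cumulative = 0
    for name, weight in variants:
        cumulative += weight
        if point < cumulative:
            return name
    raise AssertionError("unreachable when all weights are positive")


def resolve_for_conversation(conversation: dict, visitor_id: str,
                             running_experiment: Optional[dict]) -> Optional[str]:
    """Return the conversation's variant, assigning one on the first turn."""
    if conversation.get("variant_id") is not None:
        return conversation["variant_id"]      # sticky: already assigned
    if running_experiment is None:
        return None                            # nothing running for this agent
    variant = assign_variant(visitor_id, running_experiment["id"],
                             running_experiment["variants"])
    conversation["variant_id"] = variant       # persist for every later turn
    return variant


experiment = {"id": "exp-1", "variants": [("control", 1), ("treatment", 1)]}
conversation = {"variant_id": None}
first = resolve_for_conversation(conversation, "visitor-123", experiment)
again = resolve_for_conversation(conversation, "visitor-123", experiment)
```

Because the bucket depends only on the visitor and experiment IDs, first and again are always equal, and a new conversation from the same visitor hashes to the same variant as well.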
Seeing it in action
The fastest way to confirm the wiring:
- Create a persona experiment with two clearly different variants, e.g. { "persona": { "name": "Helpfulbot" } } vs { "persona": { "name": "Snarkbot" } }.
- Start the experiment.
- Open your widget in two different browsers (or one normal + one incognito; different cookies = different visitor_id).
- Ask the same question in each. The chat panel header should read "Helpfulbot" in one and "Snarkbot" in the other, and the answers should sound noticeably different. At 50/50 weights, two visitors land in the same variant about half the time; if that happens, retry with a fresh incognito session.
- Open /admin/conversations and confirm each conversation row has a variant_id stamped.
Measuring results
Everything is persisted: experiment_assignments rows plus the variant_id on every conversations row.
Join those two tables against messages and leads for any analysis you want, e.g.:
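    -- Per-variant totals for one experiment.
    -- Note: if a conversation can have more than one lead, the leads join
    -- repeats that conversation's row and skews AVG(c.message_count).
    SELECT v.name,
           COUNT(DISTINCT c.id) AS conversations,
           COUNT(DISTINCT l.id) AS leads_captured,
           AVG(c.message_count) AS avg_messages
    FROM variants v
    LEFT JOIN conversations c ON c.variant_id = v.id
    LEFT JOIN leads l ON l.conversation_id = c.id
    WHERE v.experiment_id = '...'
    GROUP BY v.name;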
A built-in stats panel inside /app/agents/{id}/experiments
is on the roadmap; for now you'll need to run that query manually
(workspace API token + /api/v1/db read access if
you're on the self-host build, or ask support).
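If you want to try the analysis offline first, the sketch below runs a variation of the query above against an in-memory SQLite database from Python. The table layout and sample rows are assumptions made up for illustration (only the column names used in the query above come from this page). It counts leads per conversation in a subquery before joining, so a conversation with several leads is not repeated when the average is taken.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real tables will have more columns.
db.executescript("""
CREATE TABLE variants (id TEXT PRIMARY KEY, experiment_id TEXT, name TEXT);
CREATE TABLE conversations (id TEXT PRIMARY KEY, variant_id TEXT, message_count INTEGER);
CREATE TABLE leads (id TEXT PRIMARY KEY, conversation_id TEXT);

INSERT INTO variants VALUES ('v1', 'exp1', 'control'), ('v2', 'exp1', 'treatment');
INSERT INTO conversations VALUES ('c1', 'v1', 10), ('c2', 'v1', 2), ('c3', 'v2', 4);
-- c1 captured two leads, c3 captured one, c2 none.
INSERT INTO leads VALUES ('l1', 'c1'), ('l2', 'c1'), ('l3', 'c3');
""")

QUERY = """
SELECT v.name,
       COUNT(c.id)            AS conversations,
       COALESCE(SUM(lc.n), 0) AS leads_captured,
       AVG(c.message_count)   AS avg_messages
FROM variants v
LEFT JOIN conversations c ON c.variant_id = v.id
LEFT JOIN (SELECT conversation_id, COUNT(*) AS n
           FROM leads
           GROUP BY conversation_id) lc ON lc.conversation_id = c.id
WHERE v.experiment_id = ?
GROUP BY v.name
ORDER BY v.name
"""

rows = db.execute(QUERY, ("exp1",)).fetchall()
for name, conversations, leads_captured, avg_messages in rows:
    print(f"{name}: {conversations} conversations, "
          f"{leads_captured} leads, {avg_messages:.1f} avg messages")
```

On this sample data, control averages 6.0 messages across its two conversations. Joining leads directly would count c1 twice and report about 7.3 instead.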
Stop / delete
Stop sets status to stopped and flushes the
running-experiment cache so new conversations immediately stop
getting assigned. Existing conversations keep their assigned
variant for consistency in mid-flight chats.
Delete hard-deletes the experiment row. Variant rows
cascade. Existing conversations.variant_id values become
foreign-key orphans. That's intentional; we keep the historical
record of which conversation got which variant even after the
experiment ends.
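A minimal sketch of the delete behaviour, using an in-memory SQLite database from Python. The table layout is an assumption for illustration: variants cascades from experiments, and conversations.variant_id is a plain column with no enforced constraint, which is one way the orphaning described above could come about.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces cascades when this is on
# Hypothetical schema, reduced to the columns needed for the demonstration.
db.executescript("""
CREATE TABLE experiments (id TEXT PRIMARY KEY);
CREATE TABLE variants (
    id TEXT PRIMARY KEY,
    experiment_id TEXT NOT NULL REFERENCES experiments(id) ON DELETE CASCADE,
    name TEXT NOT NULL
);
CREATE TABLE conversations (id TEXT PRIMARY KEY, variant_id TEXT);

INSERT INTO experiments VALUES ('exp1');
INSERT INTO variants VALUES ('v1', 'exp1', 'control');
INSERT INTO conversations VALUES ('c1', 'v1');
""")

# Hard-delete the experiment: its variant rows go with it.
db.execute("DELETE FROM experiments WHERE id = 'exp1'")

remaining_variants = db.execute("SELECT COUNT(*) FROM variants").fetchone()[0]
orphan = db.execute("SELECT variant_id FROM conversations WHERE id = 'c1'").fetchone()[0]
print(remaining_variants, orphan)
```

One practical consequence: the example query under Measuring results starts from the variants table, so it returns no rows for a deleted experiment. Export the numbers you need before deleting.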