
Quickstart

Create your first metric, log the events it depends on, attach it to an experiment — five minutes, end to end.

Tutorial · 5 min read · Updated May 15, 2026 · Works with Server SDK, CLI

This walks you through the metric pipeline end to end: pick a metric, log the underlying events, create the metric definition, attach it to an experiment, and read the result. By the end you'll have a primary metric that drives a real ship/no-ship decision.

The 5-minute path

define · log · attach · read
01 · DEFINE

Create a conversion metric

$ shipeasy metrics create purchase_conversion --type conversion --event purchase
02 · LOG

Log the underlying event from your code

await shipeasy.track('purchase', { userId, revenueCents })
03 · ATTACH

Attach to an experiment as the primary metric

$ shipeasy experiments metric add paywall-v2 purchase_conversion --role primary
04 · READ

Watch the lift + p-value on the dashboard

$ open https://app.shipeasy.ai/experiments/paywall-v2

1. Pick what to measure

Before you create anything, answer one question: what number tells you the experiment worked?

For example:

  • Checkout-flow rewrite: probably purchase_conversion (did exposed users buy?).
  • New paywall: probably subscription_conversion (did they sign up?).
  • Homepage redesign: typically session_engagement (did they click past the fold?).

Pick one. Two is fine if they're closely related. Five primary metrics means you haven't decided yet — go back and decide.

The metric needs to be:

  • Computable from events you already log (or are willing to start logging).
  • Specific to the change: not "DAU," which moves for a hundred reasons, but something like "did this exposed user convert after exposure?"
  • Reasonable to detect: a small effect on modest traffic (say, a 1% conversion lift on 1,000 users a day) can take a month or more to reach significance. Check the power table before committing.
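To make "reasonable to detect" concrete, here is a rough sketch of the textbook two-proportion sample-size approximation. It is not ShipEasy's power table: the function names, the 5% significance level, the 80% power target, and the 5% baseline are all assumptions chosen for illustration. It also shows how much the answer depends on whether "1%" means an absolute or a relative lift.

```typescript
// Rough users-per-arm estimate for comparing two conversion rates
// (normal approximation, two-sided alpha = 0.05, power = 0.80).
// Illustrative only; not ShipEasy's power calculation.
const Z_ALPHA = 1.96; // two-sided 5% significance
const Z_BETA = 0.8416; // 80% power

function usersPerArm(baseline: number, treated: number): number {
  const variance = baseline * (1 - baseline) + treated * (1 - treated);
  const delta = treated - baseline;
  return Math.ceil(((Z_ALPHA + Z_BETA) ** 2 * variance) / delta ** 2);
}

function daysToRun(baseline: number, treated: number, usersPerDay: number): number {
  // Two arms, with traffic split evenly between them.
  return Math.ceil((2 * usersPerArm(baseline, treated)) / usersPerDay);
}

// Hypothetical numbers: 5% baseline conversion, 1,000 users a day.
const absoluteLiftDays = daysToRun(0.05, 0.06, 1000); // 5% to 6% (one point absolute)
const relativeLiftDays = daysToRun(0.05, 0.0505, 1000); // 5% to 5.05% (1% relative)
console.log({ absoluteLiftDays, relativeLiftDays });
```

Under these assumptions, the one-point absolute lift is detectable in a matter of weeks, while the 1% relative lift would take years at this traffic level.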

2. Define the metric

The simplest case — did event X happen for this user at least once?

shipeasy metrics create purchase_conversion \
  --type conversion \
  --event purchase

You now have a metric definition. It does nothing on its own; it tells the analysis pipeline how to aggregate the underlying events per user.

For a revenue metric (sum the revenueCents property across purchase events per user):

shipeasy metrics create revenue_per_user \
  --type sum \
  --event purchase \
  --property revenueCents

For more aggregation types, see Aggregations.
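As a mental model of what the two definitions above compute, here is a small sketch of per-user aggregation. It is not ShipEasy's analysis pipeline; the TrackedEvent shape and both function names are hypothetical.

```typescript
// Hypothetical event shape, for illustration only.
interface TrackedEvent {
  name: string;
  userId: string;
  properties: Record<string, number | string>;
}

// "conversion": share of exposed users with at least one matching event.
function conversionRate(
  events: TrackedEvent[],
  exposedUserIds: string[],
  eventName: string,
): number {
  const converted = new Set(
    events.filter((e) => e.name === eventName).map((e) => e.userId),
  );
  const hits = exposedUserIds.filter((id) => converted.has(id)).length;
  return exposedUserIds.length === 0 ? 0 : hits / exposedUserIds.length;
}

// "sum": total of a numeric property per user across matching events.
function sumPerUser(
  events: TrackedEvent[],
  eventName: string,
  property: string,
): Map<string, number> {
  const totals = new Map<string, number>();
  for (const e of events) {
    const value = e.properties[property];
    if (e.name !== eventName || typeof value !== "number") continue;
    totals.set(e.userId, (totals.get(e.userId) ?? 0) + value);
  }
  return totals;
}

const sampleEvents: TrackedEvent[] = [
  { name: "purchase", userId: "u1", properties: { revenueCents: 1200 } },
  { name: "purchase", userId: "u1", properties: { revenueCents: 800 } },
  { name: "page_view", userId: "u2", properties: { loadTimeMs: 850 } },
];

// u1 purchased twice, u2 never purchased.
const rate = conversionRate(sampleEvents, ["u1", "u2"], "purchase");
const revenue = sumPerUser(sampleEvents, "purchase", "revenueCents");
```

In this sample, a user with two purchases still counts once toward conversion, but both purchases contribute to that user's revenue total.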

3. Log the underlying events

The metric is a rule for aggregating events. The events themselves come from your code:

app/checkout/success/page.tsx
import { shipeasy } from "@shipeasy/sdk/server";

export default async function CheckoutSuccess({ order }: { order: Order }) {
  await shipeasy.track("purchase", {
    userId: order.userId,
    revenueCents: order.totalCents,
    currency: order.currency,
    channel: order.acquisitionChannel,
  });
  return <ThankYou />;
}

A few rules that matter:

  • userId is required for attribution. Without it, the event can't be assigned to an experiment exposure.
  • Properties become filterable. You can later add a metric like "paid purchases from organic channel" by filtering on channel.
  • track() is fire-and-forget. It returns immediately; the event flushes asynchronously. Do not await it on the hot path expecting a delivery guarantee — it's analytics, not transactional state.

Deploy this. Events start flowing. The metric definition will pick them up on the next analysis window (daily by default).
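The rules above can be folded into a small wrapper, sketched below. The TrackFn signature is inferred from the example (track takes an event name and a properties object and returns a promise); trackSafely itself is a hypothetical helper, not part of the SDK.

```typescript
// Signature assumed from the example above, not taken from the SDK's type definitions.
type TrackFn = (
  event: string,
  properties: Record<string, unknown>,
) => Promise<unknown>;

// Hypothetical helper: never blocks the request path and never throws.
function trackSafely(
  track: TrackFn,
  event: string,
  properties: Record<string, unknown>,
): boolean {
  // Without a userId the event cannot be attributed to an exposure, so skip it.
  const userId = properties["userId"];
  if (typeof userId !== "string" || userId.length === 0) {
    console.warn(`skipping "${event}": missing userId`);
    return false;
  }
  // Deliberately not awaited. Failures (sync or async) are logged, not rethrown.
  void Promise.resolve()
    .then(() => track(event, properties))
    .catch((err: unknown) => {
      console.warn(`track("${event}") failed`, err);
    });
  return true;
}
```

A call site would pass the SDK method in, for example trackSafely((e, p) => shipeasy.track(e, p), "purchase", { userId, revenueCents }). Passing the function as a parameter keeps the helper testable with a fake.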

4. Attach the metric to an experiment

Now wire the metric into the experiment whose lift you want to read:

shipeasy experiments metric add paywall-v2 \
  purchase_conversion --role primary

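# Add a guardrail too, so the change doesn't tank page load while you're at it.
# Note: --type mean aggregates the mean of loadTimeMs, not a 95th percentile,
# despite the metric's name.
shipeasy metrics create p95_page_load_ms \
  --type mean --event page_view --property loadTimeMs --direction down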

shipeasy experiments metric add paywall-v2 \
  p95_page_load_ms --role guardrail

--role primary is the metric the experiment is judged by. --role guardrail is a metric that must not regress, even if the primary moves. See Guardrails.

5. Read the result

The dashboard's experiment page shows lift, confidence interval, and p-value per metric, refreshed once per analysis window:

metric                     control   v1       lift     p       95% CI
purchase_conversion        4.8%      5.2%     +8.3%    0.018   [1.2%, 15.6%]
p95_page_load_ms (guard)   860 ms    862 ms   +0.2%    0.71    [-1.1%, 1.5%]

Read this as: variant v1 lifted purchase conversion by 8.3% relative to control (4.8% to 5.2%, p = 0.018, and the 95% CI excludes zero), while the page-load guardrail shows no significant regression (its CI spans zero). Ship it.
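The same reading can be written down as a decision rule. This is a sketch under stated assumptions, not how the dashboard decides: the MetricResult fields are hypothetical, the primary wins only if its whole 95% CI is on the good side of zero, and a guardrail counts as regressed only if its whole CI is on the bad side.

```typescript
// Hypothetical result row; field names are not ShipEasy's API.
interface MetricResult {
  role: "primary" | "guardrail";
  direction: "up" | "down"; // which way counts as an improvement
  control: number;
  variant: number;
  ciLow: number; // 95% CI on relative lift, as a fraction
  ciHigh: number;
}

function relativeLift(r: MetricResult): number {
  return (r.variant - r.control) / r.control;
}

// Primary wins if the entire CI sits on the good side of zero.
function primaryWins(r: MetricResult): boolean {
  return r.direction === "up" ? r.ciLow > 0 : r.ciHigh < 0;
}

// Guardrail regressed if the entire CI sits on the bad side of zero.
function guardrailRegressed(r: MetricResult): boolean {
  return r.direction === "up" ? r.ciHigh < 0 : r.ciLow > 0;
}

function shouldShip(results: MetricResult[]): boolean {
  const primaries = results.filter((r) => r.role === "primary");
  const guardrails = results.filter((r) => r.role === "guardrail");
  return (
    primaries.length > 0 &&
    primaries.every(primaryWins) &&
    !guardrails.some(guardrailRegressed)
  );
}

// The numbers from the table above.
const paywallResults: MetricResult[] = [
  { role: "primary", direction: "up", control: 0.048, variant: 0.052, ciLow: 0.012, ciHigh: 0.156 },
  { role: "guardrail", direction: "down", control: 860, variant: 862, ciLow: -0.011, ciHigh: 0.015 },
];

const primaryLift = relativeLift(paywallResults[0]);
const ship = shouldShip(paywallResults);
console.log({ primaryLift, ship });
```

A stricter team might block on any guardrail whose CI upper bound exceeds a tolerance rather than waiting for a significant regression; which rule to use is a policy choice.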

What if the result is flat or noisy? Two checks:

  1. Are events actually landing? Check the Events tab on the experiment — exposed users should have non-trivial event counts. If the count is zero, your track() call isn't running for the variant.
  2. Is the experiment powered? The dashboard shows MDE alongside the lift. If MDE is ±10% and the real lift is +2%, you can't tell — run longer or accept the null.

See Power & sample size.
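For intuition about the second check, the sketch below approximates a relative MDE from the baseline rate and the users per arm, then compares an observed lift against it. The formula is the usual normal approximation at 5% significance and 80% power; the dashboard may compute MDE differently, and all names and sample numbers here are hypothetical.

```typescript
// Approximate relative MDE for a conversion metric
// (two-sided alpha = 0.05, power = 0.80, equal-sized arms).
// Illustrative only; the dashboard's MDE may be computed differently.
function approximateRelativeMde(baselineRate: number, usersInEachArm: number): number {
  const zSum = 1.96 + 0.8416;
  const standardError = Math.sqrt(
    (2 * baselineRate * (1 - baselineRate)) / usersInEachArm,
  );
  return (zSum * standardError) / baselineRate;
}

// A lift smaller than the MDE cannot be reliably told apart from noise yet.
function isUnderpowered(observedRelativeLift: number, mde: number): boolean {
  return Math.abs(observedRelativeLift) < mde;
}

// Hypothetical experiment: 4.8% baseline, 20,000 users per arm.
const mdeNow = approximateRelativeMde(0.048, 20000);
// Running longer shrinks the MDE: 50,000 users per arm.
const mdeLater = approximateRelativeMde(0.048, 50000);
console.log({ mdeNow, mdeLater, underpowered: isUnderpowered(0.02, mdeNow) });
```

With these inputs the MDE is above 10%, so an observed +2% lift is well inside the noise. More users per arm bring the MDE down, but only with the square root of the sample size.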
