Attribution & exclusion

Unlike other experimentation platforms, Split is able to consume event data from any source so long as the event can be associated with a customer key and timestamp. Learn more about sending track events to Split or using Split's analytics integration with Segment.
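
For example, here is a minimal sketch of sending an event with the Split Python SDK's track call. The SDK key, traffic type, event type, and value below are illustrative only; the SDK attaches the timestamp when the event is sent.

    # Minimal sketch, assuming the Split Python SDK; names and values are illustrative.
    from splitio import get_factory

    factory = get_factory('YOUR_SDK_KEY')
    factory.block_until_ready(5)   # wait for the SDK to be ready
    client = factory.client()

    # Each event is tied to a customer key and a traffic type; the SDK
    # timestamps the event when it is sent.
    client.track('user_1234', 'user', 'page_load_time', 83.3)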

With this architectural advantage, the Split platform lets you send data from any source, automatically identifies the sample population using split impression events, and uses event attribution logic to intelligently attribute events to each sample based on targeting rule and treatment. Learn more about these concepts below.

Sample set

We combine the event data you send to Split with the traffic assignment data collected by treatment evaluations to determine whether an event may have been influenced by the experiment, based on whether the event occurred after the user was exposed to the experiment. Learn more about event attribution below.

When Split calculates the metric impact of a particular treatment compared to the baseline treatment, it first assigns the customers exposed to the split into samples. These samples are organized by treatment.

Split then examines the events of those users after they were exposed to the treatment to derive a metric value.

The distribution of that metric value in each sample is then compared for significance. Learn more about statistical significance in Split.
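
To illustrate the idea (this is not Split's actual statistics engine), a simplified sketch might group per-user metric values into samples by treatment and test the difference between the two distributions. The data and the use of a two-sample t-test below are purely illustrative.

    # Illustrative only: group per-user metric values into samples by
    # treatment and compare the two distributions. Hypothetical data.
    from collections import defaultdict
    from scipy import stats

    # user key -> (treatment, metric value), derived from attributed events
    per_user_metric = {
        'u1': ('on', 3.0), 'u2': ('on', 5.0), 'u3': ('on', 4.5),
        'u4': ('off', 2.0), 'u5': ('off', 4.0), 'u6': ('off', 2.5),
    }

    samples = defaultdict(list)
    for treatment, value in per_user_metric.values():
        samples[treatment].append(value)

    # Compare the treatment sample against the baseline sample.
    t_stat, p_value = stats.ttest_ind(samples['on'], samples['off'], equal_var=False)
    print(f'p-value: {p_value:.3f}')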

The process of mapping user-generated events to the experiments and treatments they were exposed to is known as attribution, and is a cornerstone of running accurate experiments.

Event attribution

As mentioned above, unlike other experimentation platforms, Split is able to consume event data from any source, so long as the events can be associated with a customer key and timestamp. Split combines this event data with the impression events (i.e., traffic assignment data) collected by treatment evaluations through its attribution model.

Split's attribution model differs from other experimentation and analytics platforms in that everything is automatic and fully extensible to allow for third-party data collection. Split's attribution model is based on identifying when a user was assigned to a particular treatment, and attributing any events that occurred after this point in time.
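
The core idea can be sketched in a few lines: match events to impressions by customer key and keep only the events that occurred after the user's first exposure. The data shapes below are hypothetical and are not Split's internal representation.

    # Illustrative sketch: join events to impressions by customer key and
    # keep only events that occurred after the user's first exposure.
    first_exposure = {'u1': 1_000, 'u2': 2_000}             # user key -> first impression timestamp

    events = [('u1', 1_500), ('u2', 1_500), ('u3', 1_500)]  # (user key, event timestamp)

    attributed = [
        (key, ts) for key, ts in events
        if key in first_exposure and ts >= first_exposure[key]
    ]
    print(attributed)  # only u1's event: u2's came before exposure, u3 was never exposed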

(1) ATTRIBUTION WHEN NO RULE AND NO TREATMENT CHANGE

In most scenarios, when a user is shown a treatment, their treatment will not change for the course of the experiment or until the version changes.

In this case, Split identifies when the user first saw the treatment and attributes all events from 15 minutes before this first impression onward. If an event's timestamp is more than 15 minutes before the impression timestamp, it is not attributed to the experiment and treatment on the metrics impact page. The same logic applies to the end of the experiment: there is a 15-minute buffer for events to be received once the version changes, before the final calculation is done and the metrics impact is frozen for that version.
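
As a concrete sketch of this window (hypothetical timestamps, in epoch milliseconds): an event is attributed if it falls between 15 minutes before the user's first impression and 15 minutes after the version ends.

    # Illustrative sketch of the attribution window for this scenario.
    BUFFER_MS = 15 * 60 * 1000

    def in_attribution_window(event_ts, first_impression_ts, version_end_ts):
        """An event is attributed if it falls between 15 minutes before the
        user's first impression and 15 minutes after the version ends."""
        return first_impression_ts - BUFFER_MS <= event_ts <= version_end_ts + BUFFER_MS

    # First impression at t=1,000,000; the version ends at t=5,000,000.
    print(in_attribution_window(950_000, 1_000_000, 5_000_000))  # True: within 15 minutes before
    print(in_attribution_window(50_000, 1_000_000, 5_000_000))   # False: more than 15 minutes before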

Here’s a diagram that illustrates this attribution following a single user’s activity.

We’ll use this example to follow how Split attributes events when a user's rule and treatment do not change throughout the version.

e is an event, such as a click event. These events are represented below the timeline of the version.

r1 and on represent the user's impressions, containing the rule (r1) and the treatment (on). At these points, Split decides whether the user is bucketed into a certain treatment based on the targeting rule you've defined. In this example, all impression events in the timeline are for the same treatment and targeting rule.

The example in the timeline diagram shows the user's activity in your application. When calculating metrics, Split includes the shaded region: the events from 15 minutes before the user's first impression to 15 minutes after the end of the version. The events highlighted in pink would not be included in this example.

(2) ATTRIBUTION WHEN RULE CHANGE AND NO TREATMENT CHANGE

In some scenarios, when a user is shown a treatment, the reason, or rule, used to determine that treatment may have changed. For example, an attribute used in the evaluation may have changed while the user still received the same treatment.

In this case, Split will isolate the events and apply a 15-minute buffer to the first and last impression we've received for the unique rule and treatment combination.
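
As a sketch (hypothetical data), this amounts to keeping a separate window per unique (rule, treatment) combination, bounded by a 15-minute buffer around the first and last impression seen for that combination.

    # Illustrative sketch: one attribution window per unique (rule, treatment)
    # combination, buffered by 15 minutes on each side. Hypothetical data.
    BUFFER_MS = 15 * 60 * 1000

    impressions = [                  # (rule, treatment, timestamp) for one user
        ('r1', 'on', 1_000_000),
        ('r1', 'on', 2_000_000),
        ('r2', 'on', 3_000_000),     # the rule changed, the treatment did not
        ('r2', 'on', 4_000_000),
    ]

    windows = {}
    for rule, treatment, ts in impressions:
        first, last = windows.get((rule, treatment), (ts, ts))
        windows[(rule, treatment)] = (min(first, ts), max(last, ts))

    # Apply the 15-minute buffer to each combination's first and last impression.
    for combo, (first, last) in windows.items():
        print(combo, first - BUFFER_MS, last + BUFFER_MS)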

Here’s a diagram that illustrates this attribution following a single user’s activity.

We’ll use this example to follow how Split attributes events when a user's rule changes and their treatment does not change throughout the version.

e is an event, such as a click event. These events are represented below the timeline of the version.

r1, r2, and on represent the user's impressions, containing the rule (r1 or r2) and the treatment (on). At these points, Split decides whether the user is bucketed into a certain treatment based on the targeting rule you've defined. In this example, there are two unique combinations: the rule changes from r1 to r2 while the treatment does not change.

The example in the timeline diagram shows the user's activity in your application. When calculating metrics for targeting rule r1, Split includes the shaded region: the events from 15 minutes before the user's first impression to 15 minutes after the first impression for the second rule. When isolating to r2, the same logic applies: include the events from 15 minutes before the user's first impression under r2 to 15 minutes after the next rule change (which, in this example, is the version change). The events highlighted in pink would not be included in any rule analysis because they fall outside the buffer windows.

(3) ATTRIBUTION WHEN RULE CHANGE AND TREATMENT CHANGE

In some scenarios, both the reason, or rule, used to determine the treatment and the treatment shown may have changed. For example, an attribute used in the evaluation may have changed, and the user receives a new treatment as a result. The simplest example to consider is when the attribute you are targeting on causes the user's treatment to change. Suppose we target our free users (identified by the custom attribute "free") to encourage them to convert to paid, and show them a prompt to take a certain action (treatment "on"). Once they convert, the attribute becomes "paid"; they no longer meet the targeting condition used to show them the prompt and are shown the "off" treatment under a different rule.

In this case, Split isolates the events and applies a 15-minute buffer to the version start and to each change of the unique rule and treatment combination.
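
To see how a changed attribute can move a user between rules and treatments at evaluation time, here is a sketch using the Split Python SDK's get_treatment call; the split name, attribute name, and rules below are hypothetical.

    # Hypothetical split and attribute names; assumes the Split Python SDK.
    from splitio import get_factory

    factory = get_factory('YOUR_SDK_KEY')
    factory.block_until_ready(5)
    client = factory.client()

    # While the user's custom attribute is "free", they match rule r1 and see 'on'.
    print(client.get_treatment('user_1234', 'upgrade_prompt', {'plan': 'free'}))
    # After converting, the attribute sent on later evaluations is "paid";
    # the user now matches a different rule (r2) and sees 'off'.
    print(client.get_treatment('user_1234', 'upgrade_prompt', {'plan': 'paid'}))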

Here’s a diagram that illustrates this attribution following a single user’s activity.

We’ll use this example to follow how Split attributes events when a user's rule changes and their treatment changes throughout the version.

e is an event, such as a click event. These events are represented below the timeline of the version.

r1, r2 and on, off represent the user's impressions, containing the rule (r1 or r2) and the treatment (on or off). At these points, Split decides whether the user is bucketed into a certain treatment based on the targeting rule you've defined. In this example, there are two unique combinations: the rule changes from r1 to r2 and the treatment changes from on to off.

The example in the timeline diagram shows the user's activity in your application. When calculating metrics for targeting rule r1, Split includes the shaded region: the events from 15 minutes before the user's first impression to 15 minutes after the first impression for the second rule. When isolating to r2, the same logic applies: include the events from 15 minutes before the user's first impression under r2 to 15 minutes after the next rule change (which, in this example, is the version change). The events highlighted in pink would not be included in any rule analysis because they fall outside the buffer windows.

User exclusion

When understanding attribution, it is also important to understand what causes a user to be excluded from the analysis. In Split, there are two primary scenarios in which the platform automatically removes a user from the analysis.

(1) EXCLUSION DUE TO TREATMENT CHANGE WITHIN RULE

If a user is exposed to multiple treatments within a split targeting rule, you would want to disqualify the user from the experiment and exclude their data when looking at your metrics. Otherwise, the user's data would be applied to both sides of the experiment, in both treatments, and would cloud your results. Medically speaking, if a patient participating in a new drug trial were to switch to sugar pills halfway through the experiment, it would be inappropriate to ascribe their outcomes to either the group taking the drug or the group assigned the placebo.

In Split, this could happen for two reasons:

  • (1) If the user is moved from a segment that is whitelisted to the on treatment into a segment that is whitelisted to the off treatment. In this scenario, their targeting rule ("whitelisted segment") will not change, but their treatment will.
  • (2) If you are using matching and bucketing keys, the bucketing key could change, causing the same user (represented by the matching key) to receive a different treatment. If you are having issues, please don't hesitate to contact support@split.io.

(2) EXCLUSION DUE TO MOVING RULES MORE THAN ONCE

Split's engine allows users to move between rules and treatments once within the same version. If a user is frequently moving between rules and treatments, there may be an issue with your experiment and how you are targeting your users. To be safe, and to avoid providing misleading data, we cautiously remove the user and their data from the analysis.
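
Both exclusion checks can be sketched from a single user's ordered impressions within one version; the data shapes below are hypothetical and only illustrate the rules described in the two scenarios above.

    # Illustrative sketch of the two exclusion checks for one user's
    # impressions within a single version. Hypothetical data shapes.
    def should_exclude(impressions):
        """impressions: ordered list of (rule, treatment) for one user."""
        # (1) Multiple treatments seen within the same targeting rule.
        treatments_by_rule = {}
        for rule, treatment in impressions:
            treatments_by_rule.setdefault(rule, set()).add(treatment)
        if any(len(seen) > 1 for seen in treatments_by_rule.values()):
            return True
        # (2) The (rule, treatment) combination changed more than once.
        changes = sum(1 for prev, curr in zip(impressions, impressions[1:]) if prev != curr)
        return changes > 1

    print(should_exclude([('r1', 'on'), ('r1', 'off')]))               # True: two treatments within r1
    print(should_exclude([('r1', 'on'), ('r2', 'on')]))                # False: moved rules once
    print(should_exclude([('r1', 'on'), ('r2', 'on'), ('r1', 'on')]))  # True: moved more than once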

Version change

When performing an online experiment it may be the case that users are progressively added into the treatment throughout the course of the experiment as you edit your targeting rules. Changes to targeting rules will trigger a new version in Split. By tracking version changes, Split is able to count only the events relevant to a particular treatment, and treat each targeting rule as a distinct sample population to compare. Note that you cannot look at your data across versions.

When the version changes, Split runs a final calculation job 15 minutes after the version change and freezes the metrics impact so you can always revisit your results in the future.

Experiment window

Split allows customers to run an experiment across a 90-day time window. If your version is longer than 90 days, say 95 days, Split only calculates data for the last 90 days of the experiment. See above for how attribution works over these 90 days and what happens on version change.

If you need a longer experimental time window, please contact support@split.io.