
How to Build Custom Attribution Models in Analytics

Learn how to build custom attribution models in analytics to track real ROI. Step-by-step guide for multi-touch attribution that credits every conversion touchpoint.

Introduction: Why Last-Click Attribution Is Lying to You

You've probably seen it a thousand times: your analytics dashboard credits 100% of a conversion to the final click before purchase, completely ignoring the blog post that introduced the customer to your brand, the email that brought them back three days later, and the retargeting ad that finally pushed them over the edge. Last-click attribution is the default in most analytics platforms because it's simple—but simple doesn't mean accurate.

The problem gets worse when you're trying to justify marketing spend across channels. Your social media campaigns look worthless, your content marketing appears to contribute nothing, and everyone's fighting over budget based on fundamentally broken data. Multi-touch attribution models solve this by distributing credit across all touchpoints in a customer journey, but most platforms either don't offer custom models or lock them behind enterprise paywalls.

Here's the good news: you can build your own custom attribution models using data you already have. This guide walks through the actual process of extracting journey data, designing attribution logic, and implementing models that reflect how customers actually behave. You'll need basic SQL knowledge and access to your raw analytics data, but you don't need expensive tools or a data science PhD.

Step 1: Extract and Structure Your Customer Journey Data

Before you can attribute credit to touchpoints, you need a complete view of the customer journey. This means pulling together every interaction a user had before converting, which typically lives scattered across different tables in your analytics database.

Start by identifying what counts as a "touchpoint" in your context. Common examples include page visits, email clicks, ad impressions, social media interactions, and form submissions. Whatever you choose, you need three core data points for each touchpoint: a user identifier (cookie ID, user ID, or device ID), a timestamp, and a channel or source identifier.

Write a SQL query that unions these touchpoints into a single table. Here's the basic structure you're aiming for:

SELECT 
    user_id,
    timestamp,
    channel,
    session_id,
    conversion_flag
FROM (
    -- Website visits from analytics
    SELECT user_id, visit_timestamp as timestamp, 
           traffic_source as channel, session_id, 0 as conversion_flag
    FROM web_analytics
    
    UNION ALL
    
    -- Email clicks
    SELECT user_id, click_timestamp as timestamp,
           'email' as channel, NULL as session_id, 0 as conversion_flag
    FROM email_events
    
    UNION ALL
    
    -- Conversions
    SELECT user_id, conversion_timestamp as timestamp,
           'conversion' as channel, session_id, 1 as conversion_flag
    FROM conversions
) AS all_touchpoints
ORDER BY user_id, timestamp

The conversion_flag is crucial—it lets you filter for only the journeys that ended in conversions. You'll also want to add a journey ID by grouping touchpoints that happen within your conversion window (typically 30-90 days). Use window functions to assign sequential positions to each touchpoint within a journey.
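As a sanity check on that logic, here's a minimal Python sketch of gap-based journey assignment, assuming touchpoints for a single user are already sorted by timestamp (the sample data and 30-day window are illustrative, not prescriptive):

```python
from datetime import datetime, timedelta

# Hypothetical touchpoints for one user, sorted by timestamp.
touchpoints = [
    {"timestamp": datetime(2024, 1, 1), "channel": "organic"},
    {"timestamp": datetime(2024, 1, 5), "channel": "email"},
    {"timestamp": datetime(2024, 3, 1), "channel": "paid_search"},
]

def assign_journeys(touchpoints, window_days=30):
    """Start a new journey whenever the gap since the previous touchpoint
    exceeds the window; number each touchpoint's position within its journey.
    (In production you'd run this per user, mirroring PARTITION BY user_id.)"""
    journey_id, position, prev_ts = 0, 0, None
    for tp in touchpoints:
        if prev_ts is None or tp["timestamp"] - prev_ts > timedelta(days=window_days):
            journey_id += 1
            position = 0
        position += 1
        tp["journey_id"], tp["position"] = journey_id, position
        prev_ts = tp["timestamp"]
    return touchpoints

for tp in assign_journeys(touchpoints):
    print(tp["channel"], tp["journey_id"], tp["position"])
```

The 56-day gap before the paid_search visit exceeds the window, so it starts a second journey.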

Step 2: Choose Your Attribution Logic Based on Actual Behavior

Now comes the conceptual work: deciding how to distribute credit. Different models make sense for different business scenarios, and you can get creative beyond the standard options.

Linear attribution divides credit equally across all touchpoints. It's simple to implement (each touchpoint gets 1/N where N is the number of touchpoints) but treats a random blog visit the same as a demo request, which probably doesn't match reality.

Time-decay attribution gives more credit to touchpoints closer to conversion. A common approach uses an exponential decay function where each touchpoint gets weighted by exp(-k * days_before_conversion). You tune the decay constant k based on your typical sales cycle—higher k means faster decay, concentrating credit near the conversion.

Position-based (U-shaped) attribution allocates 40% to first touch, 40% to last touch, and splits the remaining 20% among middle touches. This tends to work well for businesses where both discovery and closing are critical, like SaaS products with long evaluation periods.

Here's where custom models get interesting: you can weight touchpoints based on actual engagement signals. If someone spent 5+ minutes on your pricing page, that probably matters more than a 10-second bounce from a display ad. Create engagement scores based on time-on-page, pages-per-session, or specific high-intent actions, then use these scores as multipliers in your attribution formula.
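Before writing the SQL, it can help to prototype the weighting schemes in plain Python. This sketch (function names are my own) computes normalized weights for the three standard models plus an engagement multiplier:

```python
import math

def linear_weights(n):
    """Equal credit: each of n touchpoints gets 1/n."""
    return [1 / n] * n

def time_decay_weights(days_before_conversion, k=0.05):
    """Weight each touchpoint by exp(-k * days), normalized to sum to 1."""
    raw = [math.exp(-k * d) for d in days_before_conversion]
    total = sum(raw)
    return [w / total for w in raw]

def u_shaped_weights(n):
    """40% first touch, 40% last touch, 20% split among the middle."""
    if n == 1:
        return [1.0]
    if n == 2:
        return [0.5, 0.5]
    return [0.4] + [0.2 / (n - 2)] * (n - 2) + [0.4]

def engagement_weighted(base_weights, engagement_scores):
    """Scale base weights by per-touchpoint engagement, then renormalize."""
    raw = [w * s for w, s in zip(base_weights, engagement_scores)]
    total = sum(raw)
    return [w / total for w in raw]

# A three-touch journey: 10 days out, 3 days out, day of conversion.
print(time_decay_weights([10, 3, 0]))
```

Note that u_shaped_weights handles one- and two-touch journeys explicitly, which is one of the edge cases Step 4 warns about.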

For implementation, add a column to your journey table that calculates the attribution weight for each touchpoint. For time-decay, that might look like:

SELECT 
    *,
    EXP(-0.05 * days_until_conversion) / 
    SUM(EXP(-0.05 * days_until_conversion)) OVER (PARTITION BY journey_id) as attribution_weight
FROM journey_touchpoints
WHERE conversion_flag = 0  -- exclude the conversion row so weights sum to 1 across real touchpoints

Step 3: Build Your Attribution Model as Reusable SQL Views

Don't try to calculate attribution on-the-fly for every report—you'll end up with inconsistent logic and slow queries. Instead, create SQL views that materialize your attribution model and can be queried like regular tables.

Create a base view that joins your journey data with the attribution weights you calculated:

CREATE VIEW attributed_touchpoints AS
SELECT 
    j.journey_id,
    j.user_id,
    j.channel,
    j.timestamp,
    j.attribution_weight,
    c.conversion_value,
    (j.attribution_weight * c.conversion_value) as attributed_value
FROM journey_touchpoints j
JOIN conversions c ON j.journey_id = c.journey_id
WHERE j.conversion_flag = 0  -- exclude the conversion event itself

Then create aggregate views that roll this up to different dimensions:

CREATE VIEW attribution_by_channel AS
SELECT 
    channel,
    COUNT(DISTINCT journey_id) as influenced_conversions,
    SUM(attributed_value) as total_attributed_value,
    AVG(attributed_value) as avg_attributed_value
FROM attributed_touchpoints
GROUP BY channel

The power of this approach is that you can create multiple attribution views implementing different models (linear, time-decay, custom) and compare them side-by-side. Name them clearly—attributed_touchpoints_linear, attributed_touchpoints_time_decay, etc.—so stakeholders understand what they're looking at.
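To see the pattern end to end, here's a self-contained sketch using SQLite through Python's sqlite3 module; the schema and sample values are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE journey_touchpoints (
    journey_id INT, user_id TEXT, channel TEXT,
    attribution_weight REAL, conversion_flag INT
);
CREATE TABLE conversions (journey_id INT, conversion_value REAL);

INSERT INTO journey_touchpoints VALUES
    (1, 'u1', 'organic', 0.25, 0),
    (1, 'u1', 'email',   0.75, 0),
    (2, 'u2', 'email',   1.0,  0);
INSERT INTO conversions VALUES (1, 100.0), (2, 40.0);

-- Base view: spread each conversion's value across its touchpoints.
CREATE VIEW attributed_touchpoints AS
SELECT j.journey_id, j.channel,
       j.attribution_weight * c.conversion_value AS attributed_value
FROM journey_touchpoints j
JOIN conversions c ON j.journey_id = c.journey_id
WHERE j.conversion_flag = 0;

-- Aggregate view: roll attributed value up to the channel level.
CREATE VIEW attribution_by_channel AS
SELECT channel,
       COUNT(DISTINCT journey_id) AS influenced_conversions,
       SUM(attributed_value) AS total_attributed_value
FROM attributed_touchpoints
GROUP BY channel;
""")

for row in conn.execute("SELECT * FROM attribution_by_channel ORDER BY channel"):
    print(row)
```

Swapping in a different weight calculation changes only the base table, never the reporting views, which is the point of the layered approach.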

Make sure to add date range filters to these views since attribution often needs to align with specific reporting periods. You might also want to create versions that filter for different conversion types if you track multiple goals (trials vs. purchases, for example).

Step 4: Validate Your Model Against Reality

Building a model is one thing—trusting it is another. Before you start making budget decisions based on custom attribution, you need to validate that it's producing sensible results.

Start with sanity checks. Sum up all attributed values across all touchpoints for a given time period—this should equal your total conversion value for that period. If it doesn't, you've got a logic error in your attribution weight calculation. Check that weights sum to 1.0 within each journey.
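These checks are easy to script. Here's a minimal sketch (the data shapes are assumed, not prescribed) that flags journeys whose weights don't sum to 1.0 or whose attributed value doesn't match the conversion value:

```python
from collections import defaultdict

def validate_attribution(touchpoints, conversions, tol=1e-6):
    """touchpoints: dicts with journey_id, attribution_weight, attributed_value.
    conversions: {journey_id: conversion_value}. Returns (journey_id, issue) pairs."""
    weight_sums = defaultdict(float)
    value_sums = defaultdict(float)
    for tp in touchpoints:
        weight_sums[tp["journey_id"]] += tp["attribution_weight"]
        value_sums[tp["journey_id"]] += tp["attributed_value"]
    problems = []
    for jid, total in conversions.items():
        if abs(weight_sums[jid] - 1.0) > tol:
            problems.append((jid, f"weights sum to {weight_sums[jid]:.4f}"))
        if abs(value_sums[jid] - total) > tol:
            problems.append((jid, f"attributed {value_sums[jid]:.2f} vs {total:.2f}"))
    return problems

touchpoints = [
    {"journey_id": 1, "attribution_weight": 0.6, "attributed_value": 60.0},
    {"journey_id": 1, "attribution_weight": 0.4, "attributed_value": 40.0},
    {"journey_id": 2, "attribution_weight": 0.7, "attributed_value": 70.0},  # broken
]
print(validate_attribution(touchpoints, {1: 100.0, 2: 100.0}))
```

Journey 2 gets flagged twice: its weights sum to 0.7 and 30% of its conversion value has gone missing.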

Next, compare your custom model against last-click for channels you have strong intuitions about. If your custom model, just like last-click, still credits your brand awareness campaigns with zero influence, yet your marketing team insists they're driving top-of-funnel traffic, something's wrong. Pull sample journeys and walk through them manually to see whether the attribution weights make sense.

Look for edge cases that might break your model: single-touch journeys (where first-touch, last-touch, and linear all converge), very long journeys with dozens of touchpoints (does your time-decay function still make sense?), and journeys with multiple conversions (do you want to attribute value to touchpoints after the first conversion?).

A useful validation technique is to calculate attribution for a historical period where you know what happened—maybe you paused a specific channel for a week, or you ran a major campaign that definitely drove conversions. See if your model reflects those known changes. If pausing paid search made zero difference in your model but you know it tanked conversions, your model needs tuning.

Step 5: Implement Incremental Improvements Based on Your Data

Your first attribution model won't be perfect, and that's fine. The goal is to make incremental improvements that bring it closer to reality over time.

Start tracking which touchpoint sequences actually lead to conversions versus which ones don't. This lets you build probabilistic weights—if 40% of journeys that include a podcast ad eventually convert, but only 10% of journeys with a display ad convert, you might weight podcast ads 4x higher in your model. This moves beyond theoretical weighting into data-driven attribution.

Here's a query that calculates conversion probability by channel:

SELECT 
    channel,
    COUNT(DISTINCT CASE WHEN max_conversion_flag = 1 THEN journey_id END) as converted_journeys,
    COUNT(DISTINCT journey_id) as total_journeys,
    COUNT(DISTINCT CASE WHEN max_conversion_flag = 1 THEN journey_id END) * 1.0 / 
    COUNT(DISTINCT journey_id) as conversion_probability
FROM (
    SELECT 
        journey_id,
        channel,
        MAX(conversion_flag) OVER (PARTITION BY journey_id) as max_conversion_flag
    FROM journey_touchpoints
) AS flagged_touchpoints
GROUP BY channel

Use these probabilities as multipliers in your attribution formula. A touchpoint's raw weight becomes base_attribution_weight * conversion_probability * engagement_score; renormalize the weights within each journey afterward so they still sum to 1.0.
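A small sketch of that combined formula (names are illustrative); the renormalization step matters, because without it the sanity checks from Step 4 would fail:

```python
def combined_weights(base_weights, channels, conv_prob, engagement_scores):
    """Final weight per touchpoint: base weight * channel conversion
    probability * engagement score, renormalized so each journey's
    weights still sum to 1."""
    raw = [w * conv_prob[ch] * e
           for w, ch, e in zip(base_weights, channels, engagement_scores)]
    total = sum(raw)
    return [r / total for r in raw]

# Two-touch journey with equal base weights, but podcast ads convert 4x as often.
w = combined_weights([0.5, 0.5], ["podcast", "display"],
                     {"podcast": 0.4, "display": 0.1}, [1.0, 1.0])
print(w)
```

With equal base weights and engagement, the 4x conversion probability shifts credit to an 80/20 split.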

Another improvement: account for cross-device journeys if you have the data. Many attribution models break when users switch from mobile to desktop because they're treated as separate journeys. If your analytics platform offers user ID tracking (not just cookie-based), make sure you're grouping touchpoints by user ID rather than device ID.

Finally, consider building different models for different customer segments. High-value B2B customers might have completely different journey patterns than low-value B2C customers, and a single attribution model might not serve both well. Create segment-specific views and compare how attribution patterns differ.

Step 6: Turn Insights Into Action With Regular Reporting

An attribution model that sits in your database doing nothing is worthless. You need to surface these insights where decision-makers will actually use them.

Build a recurring report (weekly or monthly) that shows attributed value by channel compared to last-click attribution. Include columns for both so stakeholders can see the difference. Many channels that look worthless in last-click—content marketing, social media, display ads—will show meaningful contribution in multi-touch models.

Add trend lines that show how channel contribution changes over time. If a channel's attributed value is declining, that's an early warning signal that might not show up in last-click data until weeks later. Include metrics like attributed return on ad spend (attributed value divided by channel spend) or attributed cost-per-acquisition (channel spend divided by attributed conversions) to make budget decisions clearer.

Create a "journey analysis" report that shows common path patterns for converting customers. Which channel combinations appear most frequently in successful journeys? This often reveals non-obvious insights, like "users who touch both email and paid search convert at 3x the rate of users who only see one channel."
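Counting path patterns is straightforward once each journey is an ordered list of channels. A small sketch (the sample journeys are invented) that tallies both exact paths and unordered channel combinations:

```python
from collections import Counter

# Hypothetical journeys: journey_id -> ordered list of channels touched.
journeys = {
    1: ["organic", "email", "paid_search"],
    2: ["email", "paid_search"],
    3: ["paid_search", "email"],
    4: ["organic", "direct"],
}

# Exact ordered paths.
path_counts = Counter(tuple(p) for p in journeys.values())
# Unordered combinations, which surface pairings like
# "email + paid_search" regardless of touch order.
combo_counts = Counter(frozenset(p) for p in journeys.values())

for path, n in path_counts.most_common(3):
    print(" -> ".join(path), n)
```

The unordered counts are often the more useful view: here "email + paid_search" appears in two journeys even though no two journeys share the same ordered path.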

Don't just share the data—share the methodology. Attribution modeling involves subjective choices (decay rates, engagement scoring, conversion windows), and stakeholders need to understand what assumptions underpin the numbers. A simple README document explaining your model's logic goes a long way toward building trust.

Conclusion: Start Simple, Iterate Based on Results

You don't need to build the perfect attribution model on day one. Start with a simple time-decay or position-based model, implement it as SQL views, validate it against known results, and then gradually add complexity as you learn what matters for your business.

The real value isn't in the sophistication of your math—it's in moving beyond last-click attribution to a model that reflects how customers actually find and evaluate your product. Even a basic multi-touch model will give you better insights than crediting everything to the final click.

Set a reminder to review and tune your model quarterly. Customer behavior changes, channel mix evolves, and your attribution logic should evolve with it. The goal is continuous improvement, not perfection. Now go pull that journey data and start building.
