
Attribution Modeling Needs Policing: 3 Things You Need to Know About Model Validation

December 2, 2016 | November 13th, 2019

This was originally published in MarTech Advisor.
Alison Lohse, Co-founder and COO at Conversion Logic, outlines how attribution is a data-modeling exercise and how validation takes marketers a step closer to better measurement.
The basic promise of multi-touch, cross-channel attribution is to give credit where it’s due. Among the many paths customers take to action, advertisers should be able to strategically design touch points that influence purchase behavior. Methodology plays an important role in measuring the effectiveness of these marketing exposures. Marketing attribution methodologies have evolved since the days of last touch, where 100% of the credit was given to the last touch point leading to a conversion and all prior exposures were ignored. This led to inaccurate attribution and, ultimately, to poor optimization. Last touch is a good starting point for organizations looking into marketing measurement, but it should not be where they stop. Other methodologies in the market serve as stepping stones toward better measurement, including:
1. Rules / Heuristics – Conditional or non-conditional rules are used to allocate conversion credit to each channel based on sequence, weighting and similar factors. Haircuts (reductions in credit) for certain channels such as TV are assigned based on past experience. This category also includes V-, W- and U-shaped models, which assign credit to touch points based on where they fall in the customer's path to conversion.

    Time Decay – This methodology is essentially a subset of rules-based heuristics. Credit is allocated to media exposures based on how close to or far from the conversion they occurred. For example, say the conversion occurred 7 days after exposure to a display advertisement and 1 day after exposure to a Facebook advertisement. The Facebook advertisement may receive 70% of the credit while display gets 30%.

2. Vendor Reporting – The publisher provides metrics on how its own media performed. This is similar to the fox guarding the hen house, so the trust issues with vendor reporting are evident.
3. Statistical Model – A statistics-based approach, such as linear or logistic regression, is used to allocate credit to each channel or touch point. Typically, the model is created using a single algorithm to predict the credit deserved. There are examples in the industry of multiple statistical models working in parallel, but each still relies on a single algorithm.
4. Machine Learning Predictive Ensemble – Ensemble methods use multiple algorithms to obtain better predictive performance. The algorithms are combined in the manner best suited to each organization. An organization with sparse conversion data will need a different algorithm to be dominant than an organization with a high volume of conversion data. The Ensemble model learns and adapts to changing business conditions and is able to predict variations unique to the business. A predictive machine learning model is also better equipped to identify synergies between channels and to allocate credit fairly.
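To make the rules-based end of this list concrete, here is a minimal Python sketch of the last-touch and time-decay rules, using the display and Facebook example from the Time Decay entry. The article does not specify a decay formula, so the exponential half-life weighting, the 7-day default and the function names are all assumptions made for illustration.

```python
def last_touch_credit(days_before_conversion):
    """Give 100% of the credit to the exposure closest to the conversion."""
    last = min(days_before_conversion, key=days_before_conversion.get)
    return {ch: (1.0 if ch == last else 0.0) for ch in days_before_conversion}


def time_decay_credit(days_before_conversion, half_life_days=7.0):
    """Split one conversion's credit so that recent exposures earn more.

    Assumption: an exposure's weight halves every `half_life_days`.
    The article does not prescribe this (or any) particular formula.
    """
    weights = {
        ch: 0.5 ** (days / half_life_days)
        for ch, days in days_before_conversion.items()
    }
    total = sum(weights.values())
    # Normalize so the shares add up to 1 (100% of the conversion).
    return {ch: w / total for ch, w in weights.items()}


# Days between each exposure and the conversion, as in the example above.
touches = {"display": 7, "facebook": 1}
credit = time_decay_credit(touches)
print(last_touch_credit(touches))
print(credit)
```

With a 7-day half-life, this sketch gives Facebook roughly 64% of the credit and display roughly 36%. That is in the same spirit as the 70/30 split mentioned above, but the exact shares depend entirely on the decay rate chosen, which is precisely the kind of judgment call that makes heuristics hard to defend.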
Although many systems support statistical modeling, machine learning predictive algorithms are the future of attribution modeling. Companies like Netflix, American Express and Airbnb have successfully applied these advances in data science to their businesses and reaped the benefits. A critical step, and a key benefit, of the Ensemble method’s sophisticated process is validation.
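One simple way to picture an ensemble is as a weighted average of the conversion probabilities predicted by several models, with the weights tuned per organization. The Python sketch below assumes exactly that. Real ensembles can combine algorithms in other ways, and the model names, probabilities and weights here are hypothetical values chosen only for illustration.

```python
def ensemble_predict(model_probabilities, weights):
    """Weighted average of each model's predicted conversion probability."""
    total_weight = sum(weights[name] for name in model_probabilities)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(weights[name] * p for name, p in model_probabilities.items())
    return weighted_sum / total_weight


# Hypothetical predictions from two models for the same customer path.
predictions = {"logistic_regression": 0.20, "decision_tree": 0.40}

# Hypothetical weightings: a different algorithm dominates depending on
# how much conversion data the organization has.
sparse_data_weights = {"logistic_regression": 0.75, "decision_tree": 0.25}
high_volume_weights = {"logistic_regression": 0.25, "decision_tree": 0.75}

print(ensemble_predict(predictions, sparse_data_weights))
print(ensemble_predict(predictions, high_volume_weights))
```

The same two models produce different combined predictions under the two weightings, which is the sense in which the dominant algorithm differs from one organization to the next.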
When is validation done?
Validation occurs every time the model runs, specifically after the data has been collected and the algorithm(s) have been trained on it. Apart from validation at the time of model creation, there is ongoing validation: every time the model is used, it is re-validated. Regular monitoring makes it possible to identify performance issues. More on that in the third question.
Why is validation important?
Validation is imperative for optimal model selection. The predictive accuracy of each model or algorithm needs to be evaluated on a hold-out data set that is separate from the training data set and sizable relative to it. The algorithms and parameters with the best validation performance scores are selected at the end of the process to be combined in an Ensemble. The performance score is defined by how accurately the model predicted conversions in the hold-out data set. This step helps ensure that the selected models will adapt and perform well when put into practice. If validation is done poorly or not at all, the model may simply memorize the historical data, a problem known as overfitting, which leads to great performance in training but poor performance in real-world scenarios.
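The hold-out idea can be sketched in a few lines of Python. The example below is a toy under stated assumptions, not a production attribution model: the two candidate models, the tiny hand-made data set and all function names are hypothetical. One candidate memorizes exact customer paths, and the other learns a simple per-channel conversion rate.

```python
from collections import Counter


def accuracy(predict, rows):
    """Share of rows where the predicted label matches the observed one."""
    hits = sum(1 for path, converted in rows if predict(path) == converted)
    return hits / len(rows)


def fit_memorizer(train):
    """Overfit-prone candidate: recalls exact paths, predicts 0 for unseen ones."""
    seen = {path: converted for path, converted in train}
    return lambda path: seen.get(path, 0)


def fit_channel_rate(train):
    """Simpler candidate: predict a conversion if any channel in the path
    converted at least half the time in the training data."""
    shown, won = Counter(), Counter()
    for path, converted in train:
        for channel in set(path):
            shown[channel] += 1
            won[channel] += converted
    rate = {ch: won[ch] / shown[ch] for ch in shown}
    return lambda path: int(any(rate.get(ch, 0.0) >= 0.5 for ch in path))


def select_by_holdout(candidates, train, holdout):
    """Train each candidate, score it on both sets, pick the best hold-out score."""
    scores = {}
    for name, fit in candidates.items():
        predict = fit(train)
        scores[name] = (accuracy(predict, train), accuracy(predict, holdout))
    best = max(scores, key=lambda name: scores[name][1])
    return best, scores


# Hypothetical data: (customer path, 1 if converted else 0).
train = [
    (("display",), 0), (("display", "search"), 1), (("search",), 1),
    (("social",), 0), (("social", "search"), 1), (("display", "social"), 0),
]
holdout = [
    (("search", "display"), 1), (("social", "display"), 0),
    (("email", "search"), 1), (("email",), 0),
]

candidates = {"memorizer": fit_memorizer, "channel_rate": fit_channel_rate}
best, scores = select_by_holdout(candidates, train, holdout)
print(best, scores)
```

Both candidates score perfectly on the training data, so training accuracy alone cannot separate them. Only the hold-out set reveals that the memorizer fails on paths it has never seen, which is the overfitting problem described above.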
What does this mean on an ongoing basis?
Validation doesn’t end at model selection; it is a continuous, Kaizen-style process. Model performance needs regular monitoring to keep up with changes in the business environment, customer behavior or other inputs. Some changes may be due to factors like seasonality, which can be included in the model, while others come from external factors like a price drop by a competitor. Regular monitoring allows issues to be detected in real time and incorporated into the modeling.
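As a rough illustration of ongoing monitoring, the Python sketch below compares the model's accuracy in each new period against the accuracy measured at selection time and flags any period that falls too far below it. The baseline, the weekly scores and the 5-point tolerance are hypothetical values, and a real monitoring process would likely track more than a single metric.

```python
def flag_degraded_periods(baseline_accuracy, accuracy_by_period, tolerance=0.05):
    """Return the periods where accuracy dropped more than `tolerance`
    below the baseline measured when the model was selected."""
    return [
        period
        for period, score in accuracy_by_period.items()
        if baseline_accuracy - score > tolerance
    ]


# Hypothetical hold-out accuracy at selection time and in later weeks.
baseline = 0.80
weekly_accuracy = {"week_1": 0.79, "week_2": 0.81, "week_3": 0.70}

flagged = flag_degraded_periods(baseline, weekly_accuracy)
print(flagged)
```

A flagged period is a prompt to investigate, for example to check whether seasonality or a competitor's price change has shifted customer behavior, and then to retrain or re-weight the model.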
Validation is clearly a best practice for attribution modeling, whether or not machine learning predictive algorithms are used. Make sure any attribution approach you consider includes this validation step to ensure model accuracy and transparency. Validation is part of marketers’ DNA, so why leave it out of attribution modeling?
