Advanced Analysis Techniques
Max Diff
Max Diff is a trade-off approach that is superior to just asking people ‘on a 1-10 scale how important is…’ and more respondent-friendly than ranking when you have a lot of items. Respondents are shown multiple tasks, each consisting of 4 or 5 items, and in each task they pick which item is most ‘important’ and which is least ‘important’. The results are more differentiated than those from direct approaches. Because of the way it is analysed, the results can be sliced and diced for different subgroups very easily.
Max Diff is all about items: they can be features, promotions, communication messages, needs, plate designs, pretty much anything where you might want a hierarchy from the most ‘important’ to the least.
We use ‘important’ here only as an example; it could just as well be ‘appeal’ or ‘uniqueness’. The context is key, but it can be adapted to fit the purpose you need. There are alternative ways to design Max Diffs to deal with larger sets of items (over 25-30) and keep the task manageable for respondents. The scripting can also allow a bit of on-the-fly analysis to work out an individual’s ‘preferences’, which can then be used to tailor follow-up questions to them.
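As a rough illustration of how most/least picks turn into a hierarchy, the sketch below uses a simple counting score: (times picked as most − times picked as least) / times shown. This is an assumption for illustration only; production Max Diff analysis is usually model-based, and the items, tasks and function name here are all hypothetical.

```python
from collections import defaultdict

# Hypothetical responses from one respondent: each task lists the items
# shown, plus the "most" and "least" important picks.
tasks = [
    {"shown": ["price", "quality", "design", "service"], "best": "quality", "worst": "design"},
    {"shown": ["price", "quality", "brand", "service"], "best": "price", "worst": "brand"},
    {"shown": ["design", "brand", "service", "price"], "best": "price", "worst": "brand"},
    {"shown": ["quality", "design", "brand", "service"], "best": "quality", "worst": "design"},
]

def maxdiff_count_scores(tasks):
    """Return (best picks - worst picks) / times shown per item, in [-1, 1]."""
    shown = defaultdict(int)
    net = defaultdict(int)
    for task in tasks:
        for item in task["shown"]:
            shown[item] += 1
        net[task["best"]] += 1
        net[task["worst"]] -= 1
    return {item: net[item] / shown[item] for item in shown}

scores = maxdiff_count_scores(tasks)
# Print the hierarchy from most to least 'important'.
for item, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item:8s} {score:+.2f}")
```

The same kind of running score, computed during the interview, is one simple way the on-the-fly ‘preferences’ mentioned above could be approximated.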
Conjoint
Where Max Diff is about individual items, conjoint is about fuller (priced) product ‘bundles’. Conjoint can be used for different purposes, so most projects are quite different. However, the main uses are to understand how a new or repositioned product will sit in the market (share/preference), which aspects impact that preference (inclusion/exclusion of different features, price) and who it competes with.
The most common type of conjoint is CBC (Choice Based Conjoint), which is also known as DCM (Discrete Choice Modelling), especially in the US. In surveys, respondents are typically shown around 3 products made up of different features, prices and brands, and are asked to pick which one they would be most likely to buy (or none of them). They then see a series of similar tasks in which the products are different. The norm is to go through 9-12 tasks in total.
The next version is ACBC (Adaptive Choice Based Conjoint), which ultimately gets you to the same place, but with a longer, more varied interview. Reasons why ACBC might be used instead of CBC include dominating attributes (e.g. in telecoms, whether you want Pay TV or not), better ways of dealing with pricing (especially if the product composition means widely varying costs) and a list of features that is just too long for CBC.
The final main type of conjoint is MBC (Menu Based Conjoint). This is relatively new and is for situations where there is less of a fixed product (or none at all) but lots of optional extras/add-ons, or indeed where the whole thing is custom built (the old Dell model). Examples include restaurant menus, new cars and telecoms, and more categories are moving that way. MBC is primarily about how to price all of the components and see how those prices affect whether each component is bought or not. For example, if each add-on is priced too highly, will someone buy only one add-on when, at slightly lower prices, they would have bought both and spent more in total?
There are a lot of variations of each type of conjoint, so detailed discussions are needed upfront to determine the best approach and, normally, to ‘trade off’ a bit of what makes it into the final study. In a lot of cases there is too much information ‘trying’ to be included, and/or there are requirements that are not that important but are complicating the conjoint and impacting the cost/time/approach, or even the ability to do the project at all.
The main outputs are ‘What if’ simulators, so that you can test the impact of different decisions and strategies. The simulators can include a variety of options that help with interpretation, e.g. dampening/calibration to try to bring the research results more in line with real behaviours/shares, or including costs to look at margins.
*We don’t currently support MBC projects, but we are happy to advise and see whether it’s really an MBC or whether there are other ways of approaching it.
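To give a feel for what sits underneath a ‘What if’ simulator, here is a minimal sketch that assumes a logit share-of-preference rule. The part-worth utilities, attribute names and products are hypothetical; in a real study the utilities are estimated from the choice tasks, and calibration or dampening would be layered on top.

```python
import math

# Hypothetical part-worth utilities (illustrative values, not estimates).
partworths = {
    "brand": {"A": 0.5, "B": 0.0},
    "speed": {"50Mb": -0.4, "100Mb": 0.4},
    "price": {20: 0.8, 30: 0.0, 40: -0.8},
}

def utility(product):
    """Total utility of a product: the sum of its levels' part-worths."""
    return sum(partworths[attr][level] for attr, level in product.items())

def shares(products, none_utility=None):
    """Logit share of preference; optionally include a 'none' option."""
    utils = {name: utility(p) for name, p in products.items()}
    if none_utility is not None:
        utils["none"] = none_utility
    total = sum(math.exp(u) for u in utils.values())
    return {name: math.exp(u) / total for name, u in utils.items()}

rival = {"brand": "B", "speed": "100Mb", "price": 30}
base = shares({"ours": {"brand": "A", "speed": "50Mb", "price": 30}, "rival": rival})
# What if we upgrade our speed while holding price?
what_if = shares({"ours": {"brand": "A", "speed": "100Mb", "price": 30}, "rival": rival})

print({k: round(v, 3) for k, v in base.items()})
print({k: round(v, 3) for k, v in what_if.items()})
```

Changing one level and re-running `shares` is the basic ‘what if’ loop; a full simulator would typically do this per respondent and then aggregate.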
Segmentation
There are a lot of approaches, so the key is to understand what is needed and how the segmentation will be used before getting caught up in the research aspect. We are approach ‘agnostic’ when it comes to segmentation: we pick the best approach or approaches based on the data, the exact objectives and how extensive/broad the inputs used as segmentation drivers will be.
The majority of segmentations that we do use a cluster ensemble approach, as clients tend to want segmentations to meet many requirements and so want aspects of attitudes, behaviours, needs etc. all built in.
Cluster ensemble is as much a process as a specific technique. It allows multiple dimensions to be built into the segmentation. We do this by building the segmentation from the bottom up in a number of stages, to help get discrimination across the dimensions, rather than just throwing everything into ‘one pot’.
In the analysis we create a number of mini/dimensional segmentations, e.g. an attitudinal segmentation, a behavioural segmentation, a needs segmentation etc. We then use these analyses as inputs into an ‘uber’ segmentation using a cluster ensemble technique that aims to create segments that are differentiated on each of the different dimensions.
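One common building block of cluster ensemble methods is a co-association matrix: for each pair of respondents, the share of the mini segmentations that place them in the same segment. The sketch below is an illustration under that assumption, not a description of the specific technique used; the segment labels are hypothetical.

```python
# Hypothetical segment labels for six respondents from three separate
# 'mini' segmentations (one list entry per respondent).
base_segmentations = {
    "attitudinal": ["a1", "a1", "a2", "a2", "a1", "a2"],
    "behavioural": ["b1", "b1", "b2", "b2", "b2", "b2"],
    "needs":       ["n1", "n1", "n1", "n2", "n2", "n2"],
}

def co_association(base):
    """Share of base segmentations in which each pair shares a segment."""
    label_sets = list(base.values())
    n = len(label_sets[0])
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            agree = sum(labels[i] == labels[j] for labels in label_sets)
            matrix[i][j] = agree / len(label_sets)
    return matrix

matrix = co_association(base_segmentations)
for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
```

A final clustering step over this matrix (not shown) would then produce the ‘uber’ segments, grouping respondents who tend to sit together across the dimensions.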
A key challenge is when there is a requirement for the segmentation to be attributed to another data source, e.g. a client database.
*Simpler segmentations that are based on more focused questions or a single battery can be cheaper.
Driver Analysis
The two main approaches we use are Ridge Regression and Shapley Values Analysis. Ridge Regression is akin to a more traditional regression technique, but it is designed to work with research/correlated data. In Shapley Values Analysis we conduct the analysis in two parts, to see where poor performance drives ‘dissatisfaction’ and then, separately, where great performance drives ‘delight’.
In addition, there is Correlation analysis, which is a simpler technique. However, it measures something a bit different and comes with some other caveats, so it is not one that we tend to use unless the structure of the questionnaire, the data or the sample size brings it into consideration.
However, we are familiar with many approaches and use others when they are more appropriate, or the client has a preference.
Ridge Regression
- When considered simultaneously, how much does each of the inputs drive people up the <satisfaction> scale?
- Most survey data is correlated, so traditional regression techniques tend to give unstable results. Ridge Regression was designed to cope better with this type of data and takes into account the ‘overlap’ (multicollinearity) between the input variables.
- As all data is modelled together, you need to have complete data for each respondent (i.e. no filtering / DKs), or we have to impute (if only a small amount is missing).
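The points above can be sketched with the textbook ridge estimator on standardised data, (X'X + λI)⁻¹X'y, where the penalty λ shrinks the coefficients of overlapping inputs. This is a minimal pure-Python illustration; the ratings, driver names and choice of λ are hypothetical, and a real project would use a statistics package and tune λ.

```python
def standardise(column):
    """Rescale a column to mean 0 and standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / sd for v in column]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ridge(columns, y, lam):
    """Ridge coefficients on standardised inputs: (X'X + lam*I)^-1 X'y."""
    xs = [standardise(c) for c in columns]
    ys = standardise(y)
    p = len(xs)
    xtx = [[sum(u * v for u, v in zip(xs[i], xs[j])) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(u * v for u, v in zip(xs[i], ys)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical complete-case ratings for 8 respondents (no missing data).
drivers = {
    "value":   [7, 8, 6, 9, 5, 7, 8, 4],
    "service": [6, 8, 6, 9, 5, 8, 7, 4],  # overlaps heavily with value
    "speed":   [5, 6, 8, 7, 4, 6, 9, 5],
}
satisfaction = [7, 8, 6, 9, 5, 8, 8, 4]

ols = ridge(list(drivers.values()), satisfaction, lam=0.0)     # no penalty
shrunk = ridge(list(drivers.values()), satisfaction, lam=5.0)  # with penalty
for name, b0, b1 in zip(drivers, ols, shrunk):
    print(f"{name:8s} unpenalised={b0:+.3f} ridge={b1:+.3f}")
```

With `lam=0.0` the function reduces to ordinary least squares, which makes it easy to see how much the penalty stabilises the correlated drivers.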
Shapley Values Analysis
- Understand what drives different degrees of <Satisfaction> (e.g. Delighted and Dissatisfied), i.e. how much does each of the inputs drive people into each of those <satisfaction> types.
- Effectively two models are built, and each one identifies its own key drivers.
- The analysis first looks at where poor performance/association might drive people to be <Dissatisfied>; a second stage then looks at where excellent performance drives people to be <Delighted>. This allows the client to define different types of strategy for different aspects.
- The definitions of <Delighted> and <Dissatisfied> are developed based on the distribution of the data.
- As we are effectively chunking the data up into 3 groups, we can also more easily include other aspects/inputs that are on different scales.
*<Satisfaction> is just used as an example. Driver models are often run for many different types of measures, e.g. Recommendation/NPS, consideration/preference, Brand Trust, Likelihood to Switch/Repurchase to name a few.
TURF
TURF is an old approach that looks at picking the best X items out of a larger set. It is most frequently used in FMCG, for example to determine which 4 of the 12 ice-cream flavours we make should be stocked in a given supermarket. The idea behind TURF is that it optimises the (Total) ‘Unduplicated Reach’ (and Frequency) of the selection. So if a lot of the same people like the top 2 flavours, does it make sense to supply both? You could instead pick a less well-liked flavour that appeals to a completely different customer.
More recently we have used the same approach to look at communication messaging, i.e. out of these 20 messages, which 5 should we move forward with? Alternatively, which features from this long list should we include in our product, or what mix of promotions should we run? The element that we do slightly differently from the mainstream packages is that we think it’s important to look at the ‘depth of resonance’. In most TURF packages, if you like one flavour/message/feature you are ‘reached’. In FMCG cases that might be fine, but for comms or product you might need 2 or 3 elements to resonate with you before you are interested in the product or complete message.
Often Max Diff is used as a precursor to TURF, but not always. Sometimes a composite might be used that captures different aspects of how ‘important’ an item is, e.g. appeal, uniqueness, fit, priced intent to purchase etc.
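The reach calculation, including a ‘depth of resonance’ threshold, can be sketched as follows. The respondent data are hypothetical, the search is a plain exhaustive one (fine for small item sets, but larger sets would need a smarter search), and `depth` is an illustrative parameter name rather than a term from any package.

```python
from itertools import combinations

# Hypothetical data: the set of flavours each respondent likes.
likes = [
    {"vanilla", "chocolate"},
    {"vanilla", "chocolate", "strawberry"},
    {"vanilla"},
    {"chocolate", "vanilla"},
    {"mint"},
    {"mint", "pistachio"},
    {"strawberry", "mint"},
    {"chocolate", "vanilla"},
]

def reach(portfolio, likes, depth=1):
    """Share of respondents who like at least `depth` items in the portfolio."""
    chosen = set(portfolio)
    return sum(len(chosen & liked) >= depth for liked in likes) / len(likes)

def best_portfolio(items, likes, size, depth=1):
    """Exhaustively search every portfolio of the given size for the best reach."""
    return max(combinations(sorted(items), size),
               key=lambda combo: reach(combo, likes, depth))

items = set().union(*likes)
top = best_portfolio(items, likes, size=2)

print("Two most-liked flavours:", reach(("vanilla", "chocolate"), likes))
print("Best pair by reach:", top, reach(top, likes))
print("Same pair needing 2 hits:", reach(top, likes, depth=2))
```

In this made-up data the two most popular flavours are liked by largely the same people, so a less popular flavour adds more reach; raising `depth` to 2 shows how the picture changes when one liked item is not enough.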
CHAID
CHAID is primarily used for identifying groups of people who are more or less likely to have a certain characteristic. It was initially used for direct marketing, where models were built to look at who responds to a campaign so that future waves could better target those likely to respond. The output is typically presented as a tree-like structure, and as you work down the branches you find groups who are extremely different. Typically, the inputs are items that are known about people, so that the model can be used for targeting. Sometimes survey questions are used if the objective is to find out more about the who and why.
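At each branch of the tree, CHAID chooses the split using chi-square tests between candidate inputs and the target. The sketch below shows only that core step for yes/no inputs (2x2 tables, 1 degree of freedom); the category merging and multiple-testing adjustments of full CHAID are left out, and the counts and input names are hypothetical.

```python
import math

def chi_square_2x2(rows):
    """Pearson chi-square statistic and p-value (1 d.f.) for a 2x2 count table."""
    (a, b), (c, d) = rows
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts per input: [responded, did not respond] for the
# 'yes' group, then for the 'no' group.
candidates = {
    "existing customer": [[60, 140], [40, 260]],
    "urban": [[52, 198], [48, 202]],
}

results = {name: chi_square_2x2(table) for name, table in candidates.items()}
# The first split is the input with the most significant association.
best_split = min(results, key=lambda name: results[name][1])

for name, (stat, p_value) in results.items():
    print(f"{name:18s} chi2={stat:6.2f} p={p_value:.4f}")
print("First split on:", best_split)
```

Repeating the same test within each resulting group is what produces the branching structure described above.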
Maps
Correspondence/Brand maps are used to get a quick snapshot of how brands and images interact, which brands are perceived as most similar to each other, whether there are any white spaces etc. They don’t have to be about brands and images: basically anything you can ‘crosstab’ can be mapped, for instance segments vs needs.
Although maps give what seems to be an easy-to-understand picture of the market, in virtually all cases they are misinterpreted. Maps are also very unstable and do not always work well where there is a dominant brand (as they show relative positioning, so each brand has strengths and weaknesses) or where you are looking at more mainstream brands alongside a lot of niche/new brands. So they are good when they work, but you need to know what you are looking at. You also need to be happy to forget about them when they don’t work that well.
Pricing
There are lots of approaches, from conjoint (the gold standard) to more direct/simpler approaches, including Gabor Granger, Van Westendorp and Perceived Value Pricing. Many of the simpler techniques are ones that clients can do in-house, unless they want a tool to go with it.
Market Sizing
This is a bit niche, as these projects have been survey-based and for (new) product development, where there might be 3-4 detailed concepts being considered. These concepts are usually tested in a monadic approach, either with one cell being the current offering, or as a sequential monadic with each respondent seeing the current offering and then a new one.
Using other information collected in the study, such as current behaviours, brand perceptions, switching behaviour (loyalty, inertia) and much more, we adjust a respondent’s likelihood to take up the product (or to buy that number), and also apply any overall market/distribution factors (e.g. awareness, stores shopped at). By incorporating actual sales data for the current product, we can calibrate the other concepts to create an estimate of market potential.
All of this is built into bespoke Excel simulators which, as well as the market sizing, can look at switching, cannibalisation and upgrading (new customers vs existing), and perhaps secondary product adoption (e.g. Broadband switching/uptake within a Pay TV market sizing).