Sampling often seems to be an afterthought for clients, many of whom simply state that they want a ‘nationally representative sample.’ But what does the client mean by a nationally representative sample? One client might mean representation on age and gender only, while another might expect controls on additional variables such as region, income, and education.
How do you decide the right variables to control on? The sample supplier needs to understand the objectives of the research as well as the analytic plan in order to make solid recommendations. Without this understanding it is difficult to build an appropriate sample. This understanding should include a discussion of the category and how different groups react to the category. Clients may not always know every group that is important, but most will have a general understanding of how various groups might respond.
Research-Live (Bainbridge, May 2016) recently reported an excellent example of the importance of understanding the objectives and the category. Voters in the UK will soon be voting in a referendum on whether or not to remain in the European Union. Poll results have varied greatly, and people originally thought the difference was driven by survey mode (online versus phone). With further digging, however, it was discovered that the decision to remain or leave is highly correlated with education. Many of the polls do not control on education, which can skew the results. Online respondents are also more likely to have higher education levels, which exacerbates the difference between online and phone.
Sampling differences may also account for some of the large differences in U.S. political polling for the next presidential race. It is important to look at the types of people who support each candidate and ensure those groups are appropriately represented in the sample. In some cases this may go beyond demographic variables. Certainly in U.S. politics, political party is key, as many people vote along party lines.
Some might be saying, ‘But you have just given us two political examples, and this doesn’t apply in the marketing research world.’ But it does! Say a client is testing a new idea for a high-end product with an expensive price tag. Logic suggests that those with higher incomes will be more likely to afford the product and purchase it. If the income of your sample skews low, it may appear that the product is not viable. Income might become even more important if you are comparing several product ideas, each tested with its own sample, and trying to pick a winner. If one of the samples skews high on income and the other skews low, the idea tested with the higher-income sample could look like the winner when in fact it is the sample that is driving the difference.
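To make the income example concrete, here is a small sketch with entirely made-up numbers. It assumes two product ideas that perform identically within each income group, and shows how a different income mix in each sample can still produce a "winner." The purchase-intent rates, the income splits, and the name `overall_intent` are all hypothetical.

```python
# Hypothetical purchase-intent rates by income group.
# Both product ideas are assumed to perform identically within each group.
intent_by_income = {"low": 0.10, "high": 0.40}


def overall_intent(income_mix):
    """Blend the group-level rates using a sample's income mix (shares sum to 1)."""
    return sum(share * intent_by_income[group] for group, share in income_mix.items())


# Hypothetical income mix of the sample that saw each idea.
sample_for_idea_a = {"low": 0.70, "high": 0.30}  # skews low income
sample_for_idea_b = {"low": 0.30, "high": 0.70}  # skews high income

score_a = overall_intent(sample_for_idea_a)
score_b = overall_intent(sample_for_idea_b)

print(f"Idea A: {score_a:.0%} purchase intent")
print(f"Idea B: {score_b:.0%} purchase intent")
```

Under these assumptions Idea B scores well above Idea A, even though the two ideas are equally appealing within every income group. The entire gap comes from the sample composition.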
Generally age and gender are the most common quota variables, but below are a number of examples of what might be important to control on depending on the category. For any category, the key is to think about what demographics might impact respondents’ behaviors and answers.
- Banking and finance – Income impacts the types of financial products people may own and use.
- Product consumption – Household size is key because larger households have higher consumption levels.
- Shopper study – Stores can vary by region.
- Entertainment/music – Tastes may vary by race/ethnic group.
- Insurance – Insurance needs change as life stage changes so controlling on things like marital status or presence of children is important.
- Toys – Age and gender of children can drive toy preference.
- Hispanics/Canadians – Language is important because it can drive product choice.
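Once the control variables are chosen, they are typically turned into quota targets. The sketch below is one simple way to do that, assuming you already have population shares for each cell. The region shares shown are illustrative placeholders rather than official figures, and `quota_targets` is a hypothetical helper, not part of any sampling tool.

```python
import math


def quota_targets(total_completes, population_shares):
    """Turn population shares into whole-number quota targets.

    Uses largest-remainder rounding so the targets add up to total_completes.
    """
    raw = {cell: total_completes * share for cell, share in population_shares.items()}
    targets = {cell: math.floor(value) for cell, value in raw.items()}

    # Hand out any completes lost to rounding, largest fractional part first.
    shortfall = total_completes - sum(targets.values())
    by_remainder = sorted(raw, key=lambda cell: raw[cell] - targets[cell], reverse=True)
    for cell in by_remainder[:shortfall]:
        targets[cell] += 1
    return targets


# Illustrative shares only (assumption); replace with the client's agreed targets.
region_shares = {"Northeast": 0.17, "Midwest": 0.21, "South": 0.38, "West": 0.24}

targets = quota_targets(1000, region_shares)
for region, n in targets.items():
    print(f"{region}: {n} completes")
```

The same function works for any single control variable, such as household size for a consumption study or life stage for an insurance study. Interlocking quotas (for example, age within gender) would need shares for each combined cell.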
Even when sampling is carefully done, there can still be unexpected results. This is why the demographics should be the first thing checked when a data file arrives. Do the demographics look like what is expected of the target group? Next, brand usage and category habits should be examined. Balancing on demographics reduces the chance of skews in brand usage and habits, but differences can still occur. For example, having significantly more users of the brand than expected can greatly impact key measures. When differences in demographics, brand usage, or category habits are discovered, the data can be weighted to bring the sample in line with expectations.
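As a rough illustration of that last step, the sketch below compares a sample's profile on one variable with its targets and computes simple cell weights (target share divided by achieved share). The age groups, counts, and targets are invented, and `cell_weights` is a hypothetical helper. Real studies often weight on several variables at once, which calls for a more involved method than this single-variable version.

```python
from collections import Counter


def cell_weights(sample_cells, target_shares):
    """Return one weight per cell: target share divided by achieved share."""
    counts = Counter(sample_cells)
    n = len(sample_cells)
    weights = {}
    for cell, target in target_shares.items():
        if counts[cell] == 0:
            # An empty cell cannot be fixed by weighting; it needs more sample.
            raise ValueError(f"No respondents in cell {cell!r}")
        achieved = counts[cell] / n
        weights[cell] = target / achieved
    return weights


# Hypothetical data file: one age group per respondent.
respondents = ["18-34"] * 20 + ["35-54"] * 40 + ["55+"] * 40
# Hypothetical targets for the same groups.
targets = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

weights = cell_weights(respondents, targets)
for cell, weight in weights.items():
    print(f"{cell}: weight {weight:.3f}")
```

In this made-up file the youngest group is under-represented (20% achieved versus a 30% target), so those respondents are weighted up while the other groups are weighted down slightly. Very large weights are a warning sign, since they let a handful of respondents drive the results.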
Bottom line: sampling needs the same consideration as the rest of the research design and should never be done on autopilot.
References
Bainbridge, J. (May 2016). Education not taken into account sufficiently in polls. Retrieved from https://www.research-live.com/article/news/education-not-taken-into-account-sufficiently-by-polls/id/5007442