
Archive for the ‘research’ Category

Tableau Public

June 1, 2011

After downloading but just not getting around to using the software on three separate occasions, I finally created my first chart (or should I say “visualization”) using the free Tableau Public platform. The visualization I created was for Give Me The Rock and it graphed the top 200 fantasy basketball players on a scatter plot based on how similar or dissimilar they played during the 2010-11 season.

While I am by no means an expert on using the software, here are my first impressions on Tableau Public.

The Good Stuff

I mean, let’s start with the fact that Tableau Public is free to use. In a world where I pay to drink water, that is a very good thing. Loading up the software, I noticed that the interface is very slick and creating a basic chart is fairly easy to do. I’m a guy who likes to jump in head first without reading a manual, and I appreciate I could do that and still create something cool. For the most part, the software is drag and drop with your data displayed on the left hand side of the screen and the visualization on the right. You simply start dragging your data into the appropriate place and the magic happens.

Also, the visualizations look great – on par with or better than what Excel can produce. The shapes, colors, and labels are all sharp and really pop on the screen. But the best thing about the software is that Tableau Public knows exactly what it is and what people use it for. The chart options that it provides seem spot on, and the ease and speed with which the software lets you slice and dice your data and change the way you present it is something that Excel can’t come close to touching. For example, my visualization of 200 NBA players was originally crammed on the screen, but Tableau Public has a pages option that allows you to split out your presentation into different pages which can be viewed individually. I ended up splitting my graph into pages by player position to make it more readable.

If you’re the type of person who likes to experiment with different ways to present data, then Tableau Public has the potential to save you a lot of time.

Finally, Tableau Public makes it incredibly easy to share visualizations on the web. It provides scripts that you can copy and paste right into your blog, as well as an option to email it to others. You can also download any visualization on the web onto your computer if you have Tableau Public. No more worrying about versions of Excel and compatibility issues.

The Bad Stuff

Like all cloud software that I’ve used, Tableau Public is a touch slow for my tastes, especially in regard to loading and saving data. For a one-off chart (excuse me, visualization) I can deal with it, but given the spastic way I typically work, I wouldn’t want to work with it all day long. Tableau has professional versions of the software (for $999 and $1,999 depending on the version) that I’d assume solve this problem by not requiring you to work in a cloud.

And while the software is generally easy to use, I’d say it’s still a step or two away from being perfectly intuitive. There were a few things that took me a while to figure out. I didn’t realize you could drag the interactive legends/tools around the screen and they would be presented in that exact location when you published the visualization (very handy). And while I split my graph by player position to make it easier to read, I still wanted to add a total page that presented all the data on the screen at once. I never did figure out how to do that.

Finally – and this may bother some people more than others – while Tableau Public is free, you are required to save your visualization to their servers. Once published, your data is in the public domain for all to see. Obviously if you have proprietary data, Tableau Public is not for you (although again, the paid versions will solve this problem).

The Verdict

Overall, I’m impressed with Tableau Public. As a guy who uses Excel on a daily basis, it’s not going to replace that for my day-to-day work, but it has definite advantages over Excel, like its ability to slice and dice data in any number of ways and the fact that Tableau makes it very easy to share visualizations with others. The next time I create a graph that is going to be displayed on a website, I’d hands down use Tableau Public to do that. And did I mention it’s free?

Are Extended Warranties Ever Worth It?

September 22, 2010

I dislike buying things. I dislike spending money, shopping, junk that passes as necessity. I have a difficult enough time buying crap to begin with that when a salesperson asks me if I want a three-year extended warranty on that alarm clock for an extra $20, it makes me want to punch them in the face.

Instead, I usually just politely say no. If my alarm clock breaks, I’ll just buy another one for $20.

I have purchased exactly two extended warranties in my life. The first was from my friend who was working at Circuit City (remember them?) when we were in college. Back in the day, Circuit City employees would get commissions off sales. You could tell because you’d get swarmed by a dozen employees as soon as you walked in the door. Anyway, my friend guilted me into buying the extended warranty on a surround sound system so he could get the commission. I should have just given him the money directly after he got out of work, because it’s been over 10 years and the system still works fine.

The other extended warranty I bought was far worse, because it was on a truck I bought a few years ago. Kudos to the sales guy, I guess, who put on the hard sell for the warranty. Either that or I blacked out during negotiations, because it’s not in my character to buy an extended warranty like that. Fast forward to the end of the three-year warranty coverage and I had used it once for a grand total of $100. And the best part of the whole thing – when I thought my four-wheel drive linkage needed to be replaced a couple years ago – I found out that whoops, the extended warranty didn’t cover that.

Fine print is a bitch.

Despite my hard stance on extended warranties, I was wondering recently if they are worth it, at least in certain circumstances. I know some people who have gotten plenty of use out of extended warranties and have great things to say about them.

A Consumer Reports study of 8,000 new car buyers a couple of years ago found that, surprise, surprise, those who purchased the extended warranty on their vehicles lost money, on average (although the average loss of $100 was quite a bit less than I would have expected).

CBS News expert Susan Koeppen suggests that extended warranties are generally not worth it, with the possible exception of laptops, which break all the time. She also introduces the concept of repair rate, which can be used to calculate hard numbers on the value of a warranty. For example, the average flat screen television will break 3% of the time. If a television costs $500, then the value of the warranty would be around $500 * 3%, or $15.* That’s quite a bit less than the average extended warranty on a television costs.
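The repair-rate arithmetic above is easy to play with for other products. Here is a minimal sketch in Python, assuming (as the example does) that a failure costs you the full purchase price; the function name and the $60 warranty quote are made up for illustration.

```python
def break_even_warranty_price(price, repair_rate):
    """Expected loss from a failure, assuming a failure costs the full purchase price."""
    return price * repair_rate


tv_price = 500.00    # price of the television from the example
repair_rate = 0.03   # 3% chance the television breaks
fair_value = break_even_warranty_price(tv_price, repair_rate)

quoted_warranty = 60.00  # hypothetical warranty price, not from any study
markup = quoted_warranty - fair_value

print(f"Break-even warranty price: ${fair_value:.2f}")
print(f"Premium over break-even on a ${quoted_warranty:.2f} quote: ${markup:.2f}")
```

Anything you pay above the break-even number is, on average, money you would have kept by skipping the warranty.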

PC World conducted a study last year which found that 71% of people who bought the extended warranties on their computers were happy with the decision. Of course, people are generally loath to admit they made a bad decision (especially when it comes to money-related matters) and the survey doesn’t provide any data other than that touchy-feely crap. What I’d really like to know is how many of those people used the warranty and how much money they saved.

Finally, an academic paper by some professors at Carnegie Mellon University took a look at why consumers buy extended service contracts. To the extent I understand what they are talking about, they used linear modeling to try to understand the decision making process of consumers. Among other things, they found:

  • People tend to value hedonic goods (e.g., luxury goods like flat screen televisions) over utilitarian goods (e.g., goods that fulfill basic functional needs). This overvaluation of hedonic goods means that consumers are more likely to buy warranties for these fun, pleasure making things than other stuff. The unfortunate use of the word hedonic over and over in the article kept reminding me of Hedo Rick, which was very distracting.
  • Retailers can increase the likelihood that a consumer purchases a warranty by promoting or selling said warranties.**
  • All else being equal, people associate higher-cost goods with higher quality.
  • Lower income consumers are more likely to purchase extended warranties, because they are more sensitive to the cost of replacing a product if it breaks. Let’s call it the lottery-ticket-retirement-plan corollary.
  • If you’ve purchased and used an extended warranty in the past, then you are more likely to think your products will break in the future. Ah, now I see why people who have used extended warranties seem to be such big fans.

* That number does not take into account standard warranties. Since products that break tend to break sooner rather than later, the likelihood that a product would still be covered under its standard warranty should probably be backed out of the repair rate percentage. However, that is math I am not interested in doing at the moment.
** The reason to hate academic research – the need to test, confirm AND report on the obvious.

Categories: money, research

Apple’s iPad Finally Makes Survey Research Cool

August 11, 2010

It took Steve Jobs to do it, but the iPad has finally given the survey research industry a tool that makes the average person interested in doing surveys. USA Today (the place where I get all my news, usually in picture form) reports on companies that have started using iPads to conduct face-to-face interviews at shopping malls with glorious results. According to the article:

People “are attracted by the cool factor,” says Jude Olinger, CEO of the Olinger Group, a marketing research firm that conducted surveys at 130 shopping malls for the past two months using 200 iPads. “People who haven’t seen iPads are fascinated.”

At many of the centers, he says, response was so good that survey takers collected the required information in about three weeks instead of the four they’d anticipated.

If true, a 25% reduction in data collection time is actually a significant amount. The time and money saved not having to pay interviewers an extra week probably comes close to paying for the cost of buying all those iPads to begin with. The article mentions “clipboard-wielding researchers” and “pencil-and-paper surveys” but given these companies’ early adoption of iPads, it’s more likely that they were already using laptops or tablet PCs to conduct similar surveys. A company that is still doing pencil and paper surveys in the year 2010 is probably not going to immediately jump on the Steve Jobs bandwagon.

But the long-term rub is that while iPads are a great way to get people’s attention right now, the novelty will soon wear off. If I tried to get you to do a survey on my cool, new PalmPilot, I doubt you’d be that interested. In the same way, as iPads become more ubiquitous, the cool factor is going to go out the window along with their ability to get people to want to do surveys on them.

That being said, iPads do offer a number of advantages over old-school pencil and paper surveys, the most important of which is being able to compile, verify, analyze and report on data in real time. We’ve been able to do the same thing with PDAs, laptops, and tablet PCs for a while now, but the iPad offers a unique combination of usability, portability and (most importantly) battery life. Much like Ron Burgundy, having 10 hours of battery life when you are conducting face-to-face interviews in a mall all day long is kind of a big deal.

History tells us that other manufacturers will eventually catch up to what Apple is doing, but until then, all hail the iPad as the survey research tool of the future. At least until people get bored with it.

Fake Your Data like a Machine

July 14, 2010

You may have heard the news about a little polling company called Research 2000, which has gotten into a little hot water recently for supposedly faking polling data during the 2008 election. The Daily Kos uncovered the story based on some tips from a few statistical wizards who spotted some abnormalities in the data.

To break it down, in addition to the Daily Kos, Research 2000 provided polling services for a large number of local television and newspaper affiliates. Well, “provided” in the sense that they will likely not be providing said research services much longer. Research 2000 president Del Ali quickly and unsurprisingly shot back against the charges of faking data, writing in a statement:

Every charge against my company and myself are pure lies, plain and simple and the motives as to why Kos is doing it will be revealed in the legal process and not before that. I will share one little minor reason that Kos is doing this and it pertains to the fact they owe us a significant sum of monies that is in the six figure category and payment was on June 15, 2010.

Of course, the fact that he won’t publicly release his data (likely because it doesn’t exist) does hurt his credibility a little.

But this brings us to the larger and more important issue – when you fake your survey data, don’t do it like a human being. See, the world has both a randomness and an order to it that our feeble minds can’t quite grasp. And when we try to randomize things, we do it in a much too orderly fashion.

The thing that brought down Research 2000 is that their data was much too “clean.” It didn’t have the error associated with it that one would expect from a random survey. For example, when you fake your data, don’t make all the breakdowns either even OR odd. It’s best to mix it up a little.

Bad fake data when ALL the male/female comparisons are both either even or odd
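To get a feel for why matching parity is such a giveaway, here is a rough back-of-the-envelope sketch in Python. It assumes each male/female pair of rounded percentages has about a 50/50 chance of sharing parity and that the pairs are independent of each other; the sample pairs are invented for illustration and are not Research 2000 numbers.

```python
def same_parity(a, b):
    """True when both numbers are even or both are odd."""
    return a % 2 == b % 2


def chance_all_match(n_pairs):
    """Probability that every pair shares parity, assuming independent 50/50 coin flips."""
    return 0.5 ** n_pairs


# Made-up (male %, female %) pairs, for illustration only.
pairs = [(42, 44), (37, 39), (51, 55), (28, 30), (63, 61)]
matches = sum(same_parity(male, female) for male, female in pairs)

print(f"{matches} of {len(pairs)} pairs share parity")
print(f"Chance of that happening by luck: {chance_all_match(len(pairs)):.4f}")
```

Five matching pairs in a row is unusual; dozens or hundreds in a row is the kind of thing that only happens when a person is typing in the numbers.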

Also, when you fake your data, you most likely want to make sure it is normally distributed, because that’s how the world operates. See, this Gallup poll demonstrates what a normal distribution looks like. It’s what’s referred to as the bell curve.

This Research 2000 poll on the other hand, demonstrates that humans don’t like the number 0 when faking data.

There were a number of other problems with the data that have been well documented on the Daily Kos website. It all adds up to a damning set of evidence indicating that Research 2000 faked some serious-ass data. Hundreds of thousands of dollars worth of data. And didn’t do it particularly well.

The fact that the company didn’t have a mailing address and operated out of a Kinko’s probably should have been enough of an indication that something wasn’t right.

Categories: poll, research, statistics, survey

Quantitative Analytical Techniques

July 9, 2010

Statistics is a branch of mathematics that deals with the collection, analysis, interpretation, and presentation of masses of numerical data. When a measurement is calculated for an entire population, say the average age, it’s called a parameter. When we look across a sample and calculate the same measurement, again the average age, we call it a statistic. Since people make entire careers out of the study of statistics, the point of this post is to present a bird’s-eye overview and brief description of common terms you’ll hear in conversations about quantitative analysis.
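The parameter/statistic distinction is easier to see with numbers. The sketch below uses Python's standard library and a made-up population of ages (100 people at every age from 18 to 90); the population, the seed, and the sample size are all assumptions for illustration.

```python
import random
from statistics import mean

# Hypothetical population: 100 people at each age from 18 through 90.
population = [age for age in range(18, 91) for _ in range(100)]

# Parameter: the measurement calculated across the entire population.
parameter = mean(population)

# Statistic: the same measurement calculated on a random sample.
rng = random.Random(2010)           # fixed seed so the sketch is repeatable
sample = rng.sample(population, 100)
statistic = mean(sample)

print(f"Population average age (parameter): {parameter}")
print(f"Sample average age (statistic):     {statistic}")
```

The statistic will usually land close to the parameter but rarely on it exactly, which is the whole reason we worry about sampling error.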

When discussing statistics, researchers usually talk about the data in terms of “variables.” A variable is a characteristic that may assume more than one set of values (age, income, birth place can all have more than one value). A variable can either be nominal, ordinal, interval, or ratio in its scale. Nominal variables are also referred to as categorical variables because they represent categories of responses. The color of a car would be represented by a categorical value (for example, black, red, or silver). Categorical variables have no set order, meaning that a black car is not necessarily any better than a silver car.

The level of satisfaction with one’s car on a 1 to 10 scale is an example of an ordinal variable (where a 10 is a better score than a 5 and a 5 is better than a 1). Ordinal variables have a clear, set order, but they still represent categories of responses. Interval and ratio variables are numerical variables whose numbers have direct meaning. The age of a car would be a ratio variable because it has a true zero and can be measured precisely and at equal intervals (in hours, years, or decades).

Variables can also be discrete or continuous. Continuous variables, such as time, have an infinite number of possible values, while discrete variables, such as a satisfaction scale, have a finite (in this case 10) number of possible values.

Descriptive statistics are simple portrayals of what the variables show. They are summaries of the frequency of the different values (like percentages); the central tendency (mean, median or mode); and the dispersion (like the range and the standard deviation). Cross tabs (short for tabulations) are popular for displaying the joint distribution of two or more variables. They are usually presented in a matrix called a contingency table. In a cross tab table, each cell gives the number of respondents that gave a particular combination of responses.
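As a small illustration of the summaries described above, here is a sketch using only Python's standard library. The eight respondents, their genders, car ownership answers and satisfaction scores are invented for the example.

```python
from collections import Counter
from statistics import mean, median, mode, stdev

# Hypothetical respondents: (gender, owns a car?, satisfaction on a 1-10 scale).
respondents = [
    ("female", "yes", 7), ("male", "no", 8), ("female", "yes", 8), ("male", "yes", 5),
    ("female", "no", 9), ("male", "yes", 8), ("female", "yes", 6), ("male", "no", 10),
]
scores = [score for _, _, score in respondents]

# Central tendency.
average = mean(scores)
middle = median(scores)
most_common = mode(scores)

# Dispersion.
value_range = max(scores) - min(scores)
spread = stdev(scores)  # sample standard deviation

# Cross tab: each cell counts respondents with one combination of answers.
crosstab = Counter((gender, owns_car) for gender, owns_car, _ in respondents)

print(f"mean={average}, median={middle}, mode={most_common}")
print(f"range={value_range}, standard deviation={spread:.2f}")
for (gender, owns_car), count in sorted(crosstab.items()):
    print(f"{gender:>6} / owns car: {owns_car:<3} -> {count}")
```

A real cross tab would be laid out as a grid with one variable across the top and the other down the side, but the counts in the cells are exactly these.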

Measures of association summarize the relationship between two variables (correlation and regression, for instance). Two variables are associated when information about one can help us predict information about the other. A variety of techniques to measure association are available, each better suited to different classes of variables. When analyzing data, most statisticians use multivariate analysis where the effects of many variables are considered.
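Correlation is probably the most familiar measure of association. Below is a minimal sketch of the Pearson correlation coefficient written out by hand in Python; the car ages and repair costs are made-up numbers, and Pearson's r is just one of the many association measures mentioned above (it suits two numerical variables).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical data: age of a car in years and last year's repair bill in dollars.
car_age = [1, 3, 5, 7, 9]
repair_cost = [100, 180, 260, 400, 520]

r = pearson_r(car_age, repair_cost)
print(f"Correlation between car age and repair cost: {r:.3f}")
```

A value near +1 or -1 means knowing one variable tells you a lot about the other; a value near 0 means it tells you very little.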

Tests of statistical significance are used to determine how sure we can feel about the associations found in the data — Could it just be chance? Can we infer that the result can be generalized to the study population? Confidence intervals, chi square tests and t-tests are the most common statistics used to indicate the probability of saying that there is a difference between two groups when actually there is none (level of significance).
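To make the idea of a significance test a little more concrete, here is a hand-rolled sketch of a chi-square test on a 2x2 cross tab in Python. The counts are invented; the 3.841 cutoff is the standard chi-square critical value for one degree of freedom at the 0.05 level of significance.

```python
def chi_square(table):
    """Chi-square statistic for a table of observed counts (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


CRITICAL_VALUE_05 = 3.841  # 1 degree of freedom, 0.05 level of significance

# Hypothetical counts: rows are men and women, columns are "yes" and "no".
observed = [[30, 20], [20, 30]]
stat = chi_square(observed)
significant = stat > CRITICAL_VALUE_05

print(f"chi-square = {stat:.2f}, significant at the 0.05 level: {significant}")
```

If the statistic clears the cutoff, a difference this large would show up by chance less than 5% of the time if the two groups really answered the same way.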

Measures of association can be used in very sophisticated ways. Conjoint analysis can be used to determine trade-offs customers are willing to make among product or service attributes. In addition to understanding current preferences, this technique allows modeling of the impact of the introduction of new factors on preferences.

Discrete choice analysis models selection of a product or concept with many attributes from a set of products or concepts. In essence, it models how people make decisions in the real world. For example, one could test products with varying combinations of features to assess which consumers prefer. As with conjoint analysis, discrete choice analysis allows modeling of the impact of the introduction of a new product or concept on factors such as market share.

Cluster analysis identifies population segments using groups of variables. This provides information to better understand and communicate with customers, or help you understand your place in the marketplace. In general, whenever one needs to classify a mass of information into manageable and meaningful results, cluster analysis is a technique of great usefulness.

Discriminant analysis is used to define which variables best differentiate between predefined groups. The key difference is that discriminant analysis relies on previously defined groups whereas cluster analysis uses the data to discover these groups.

Factor analysis finds the underlying construct behind answers to a series of questions. In other words, factor analysis is designed to classify variables. For clients, it simplifies the interpretation of answers to many questions to a few “factors” that seem to drive answers to all questions. It can be used to determine the key factors that drive aspects like satisfaction, image or customer retention. In addition, factor analysis is used when designing surveys. Often complex concepts (like “leadership”) need to be turned into a group of concrete questions in order to query meaningfully.

Regression Analysis (linear, non-linear and logistic) is widely used for forecasting. It compares the effects of one or more variables on another. The objective of regression analysis is to understand the relationship between several independent or predictor variables on a dependent or criterion variable. This allows forecasting or estimation of the change in a dependent variable based on the change in an independent variable.
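For the simplest case, one independent variable and a straight line, the forecasting idea can be sketched in a few lines of Python using ordinary least squares. The advertising and sales figures are made up for illustration, and real projects would normally lean on a statistics package rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: ad spend (independent variable) and sales (dependent variable),
# both in thousands of dollars.
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 15, 19, 22, 27]

slope, intercept = fit_line(ad_spend, sales)
forecast = intercept + slope * 6  # estimated sales at an ad spend of 6

print(f"Each extra unit of ad spend is associated with {slope:.1f} more in sales")
print(f"Forecast sales at an ad spend of 6: {forecast:.1f}")
```

The slope is the "change in a dependent variable based on the change in an independent variable" that the paragraph above describes.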


Qualitative Analytical Techniques

July 5, 2010

Qualitative Research employs special techniques that allow researchers to observe the ways respondents analyze and synthesize information. When attempting to understand people’s perceptions, beliefs, and behaviors, it is important to solicit information in a way that is meaningful to them. Rigorous sampling and advanced quantitative analysis will not remove bias introduced by a researcher who is subconsciously imposing his/her viewpoint on the study population. The following are a few methods that can be used to collect qualitative data.

Observational techniques are those where a researcher simply observes human behavior or actions first hand. This technique is useful to gain a full understanding of the context in which the behavior is taking place and also when people are unwilling or unable to verbalize the topic being evaluated.

Collecting nonverbal data requires such fineness of observational detail that special training is required to use the terminology and notational systems. Examples of nonverbal study include: Kinesics (observing detail of bodily movement) and Proxemics (social symbolic uses of space). A research study that includes analysis of nonverbal data is usually videotaped for repeated viewing (like focus group tapes).

Listing, Selecting and Sorting – Asking respondents to generate and/or sort lists, and make selections, is a way to explore their taxonomic systems (the way they organize information).

Projective Techniques are based on the understanding that people naturally project their beneath-the-surface perceptions, beliefs and personal themes in their verbal responses and behavioral styles. Many different tests and games have been devised and tested over time to help probe beneath the surface. Familiar examples include: Sentence Completion Tests, Thematic Apperception Tests, Personalization, and Role Playing.

Interviews or Discussion – These are good at collecting information that can be verbalized easily. Common settings include focus groups and in-depth interviews.


Introduction to Intercepts, In-person Interviews and CAPI

June 28, 2010

Intercepts, In-person Interviews and CAPI (Computer Assisted Personal Interviews) are conducted by interviewers going to locations where respondents are apt to be found, and requesting their participation. They are commonly used to gather data from respondents that would be difficult to find by any other method. Common examples include customers of specific restaurants and stores, or businesspeople at a trade show. Intercepts are often conducted in malls where it is easy to find potential respondents and easy to visually screen potential respondents by age or other characteristics.

CAPI or manually collected intercept surveys can often be fielded quickly. It is relatively easy to gather national data using the large number of available field service locations. Since intercepts are in person, almost anything can be tested: visual communications, video, and even food taste and texture. However, people are generally less patient in person than by other modes of surveying, so intercept surveys are generally shorter than other types of surveys and the amount of information you collect is less.

For more complex information gathering, it is common to screen a respondent then invite him/her to an adjacent facility for more precise interviewing. Intercepts can be completed on either paper forms that are data entered at a later time or with laptop/handheld computers.

Go to the Respondents

If you want to know what customers think about a specific location, you can randomly select a sample and screen people based on if they have visited that location. But a much easier way to conduct the research is to go to the location and survey respondents there. This is especially true when you are dealing with difficult to find respondents or a small number of people.

The Problems with in-person Interviews

When surveys are conducted out in the field, it is more difficult to monitor and control the quality of interviewing. So it is important to verify the quality of the data that is collected. Costs are also a consideration. The cost of project management, on-site interviewing and respondent incentives are generally higher than for other methods, making this approach relatively costly.

Finally, this approach can be intrusive since respondents are interrupted and asked to participate. Generally, people don’t like to be solicited in person while they go about their business. So, this type of data collection must be done carefully, especially if your company’s name is associated with the survey.

***

If you would like to conduct in-person interviews, but don’t have the capacity to send interviewers around to different locations, there are plenty of companies around the country that specialize in this type of data collection. Every major (and probably minor) city in the US will have a facility that can collect this type of data for your company. Many have multiple locations in different cities if you need some sort of national representation.

Introduction to Mail Surveys

June 9, 2010

Mail surveys can take a wide variety of forms, although they commonly consist of a paper survey that is completed by a respondent and mailed back. You will see these with various levels of quality and professionalism with product warranties, for customer service feedback, and in your mail. They are still popular because they can be very simple to construct and can often provide data at low cost.

Mail surveys are a good choice if the respondent is someone you know and/or they have an interest in the survey results, such as an existing customer or employee, or member of an organization. We also recommend the use of mail surveys where it is important that every member of the sample get the impression that their voice is important. A mail survey can also be used as a relationship building device, demonstrating that you are interested in someone’s feedback.

On the positive side, it can be easy to collect a large amount of data from mail surveys (although this depends on the exact methodology) – simply mail out a ton of surveys. They are not intrusive (as respondents are not interrupted at an inconvenient time) and they can respond when they want. And because mail is an archaic technology, you can reach nearly everyone you’ll want to survey with a mail or paper survey.

The biggest disadvantage of this type of survey is time, especially if you are mailing it both there and back. While surveys on the phone and internet can be completed in days, it usually takes two weeks at minimum to complete a mail survey (you’re looking at a week in post office transit alone). For this reason, mail surveys are often used to collect data on an ongoing basis for customer information like for product warranties, rather than for a quick snapshot type survey. You probably wouldn’t want to do your political poll by mail, for example.

Design of the Surveys

Unlike surveys conducted in person and by telephone, mail surveys don’t have an interviewer who can provide clarification or answer questions, so it is important to carefully consider the design of your survey and to troubleshoot problems that can lead to reduced question comprehension and errors in responses.

Avoid compressing and compacting questions. Too many questions on a page create confusion and can result in response errors. Survey testing shows that respondents will readily fill out a survey several pages in length provided they feel the survey is important.

Obtaining an Adequate Response Rate

Unless the survey is conducted among a highly engaged group (say employees of a company), we find that multiple mailings are almost always necessary to assure an adequate response. Our experience has shown that a first mailing of the survey, a reminder postcard, and a second mailing of the survey to non-respondents is the optimum approach and will result in a response rate that is 10-25% higher than a single mailing.

The Final Word

There is nothing sexy about conducting research via paper surveys or through the mail. But there is a reason they are still around and why you see them everywhere if you look carefully. They are a relatively inexpensive way to collect information at the respondents’ convenience, especially when you don’t have access to email addresses or not everyone you are surveying has easy access to the internet.

Categories: mail survey, research, survey

Writing Questions for Surveys

May 22, 2010

Developing questions is the aspect of research that non-researchers normally feel the most comfortable with. However, it is something that is deceptively difficult to do well and is the place where things are most likely to go wrong in the survey process. People make entire careers out of the study of questionnaire development and there are many books and scholarly articles devoted to the topic. I am clearly not one of these scholars, but I’ve at least seen enough to know all the things I don’t know about developing questions for surveys.

So, what don’t I know?

The goal of a survey question is to gather reliable and consistent information on something you want to measure, such as the number of times people have been to the doctor in the past year. Sounds easy enough. But how people define the word ‘doctor’ is very different, especially as you cross cultures. Are we talking about primary care physicians only? Does a trip to the emergency room count? How about a visit to a public clinic? And what do we mean by year? Do we mean the previous 12 months from this very moment, or the last calendar year of 2009? Subtle differences, but to the extent that survey questions can be interpreted differently by respondents, it means that our question is vague and will produce inconsistent data.

Types of Questions

There are two basic types of questions that make up the average survey: closed-ended and open-ended questions.

Closed-ended questions are those that require yes/no or a multiple choice answer, such as ‘do you love puppies?’ Open-ended questions cannot be answered in a simple yes/no response and require a longer and more varied response. Something like: ‘Please tell me the reasons why you hate puppies?’

Generally speaking, closed-ended questions are simple to ask, analyze and interpret. However, they do require you to phrase questions into something that can be answered using multiple choice. That is not always possible or practical. Open-ended questions can capture more detailed information and are useful when you don’t know what types of answers the respondents are going to provide, but they do require more coding and analysis time after the survey is completed.

Biased or Leading Questions

Ok, that question about puppies above is actually a bad survey question, because it is what’s called a leading question. By using the word ‘love’ and phrasing the question in that way, I am pushing respondents to answer a certain way (in this example, to say they love puppies). If they answer ‘no,’ then they look like a bad person.

Leading questions and biased surveys are something that I commonly run into, especially with the market research crowd (I rarely do political polling since it makes me feel dirty). These surveys seem to be designed to confirm a point of view. Political push polls are the most common example of these types of surveys, but there are other subtle ways that questions can lead respondents – such as loaded questions, biased phrasing of question wording or unbalanced response scales.

The unbalanced scale:

Please tell me your opinion of puppies, using the following scale of 1-4:

1. I love them more than life itself
2. I love them a lot
3. I love them
4. I like them

The results of my survey show that 100% of people like puppies.

Additionally, the order that you put your questions in can affect how respondents answer them; that is, one question can bias a later question. If you ask a respondent to rate a series of factors on importance, and then the respondent is asked an open-ended question on what is important, their comments will largely be limited to the factors already mentioned. Ask the open-ended question first, and a greater diversity of factors is likely to be mentioned.

Check out StatPac for a good list of more subtle ways to bias survey questions.

Ask Actionable Questions

That question about puppies above is bad for another reason too: It’s not going to give me any information that is actionable. So what if everyone loves puppies? What can I do with that data?

A mistake that people often make is asking questions simply to ask them because they think they should or because they have “space” in their survey. Make sure that every question in a survey serves a purpose and that you will use the information that comes out of it. The best way to do this is to clearly define the goals of your survey. What are your hypotheses? What are the things you want your survey to answer? Make sure every survey question builds towards those goals.

The Lowest Common Denominator

Maybe that sounds a little mean, but it is important to ensure that your survey questions can be understood by all respondents, not just those with advanced degrees. At minimum, questions need to be put in plain English (think 6th-8th grade reading level), and organizational jargon needs to be stripped away. If you conduct research on short term disability policies, your internal definition of STD is probably a little different from the average person’s. It’s usually a good idea to let someone outside your industry or area of expertise read your survey to make sure it can be understood by the average person.

For very sophisticated research, cognitive testing can be done to be sure that each question is understood by the respondent as intended. Most studies don’t have a budget to allow this, so you’ll need to rely on past research about what respondents have found confusing. And if there is no past research, well, my best advice is to keep it simple.

Response Scales

There is also the issue of which response scales are appropriate to use (another topic people have written entire books about). Scales can have either an even or an odd number of points. Even scales allow no middle choice. Research suggests that an odd scale is better because it allows a respondent to indicate that they are in the middle and could go either way.

Scales can also be anchored or unanchored. An anchored scale has words attached to it, such as “very satisfied” or “somewhat satisfied”, and anchored scales are generally preferred because they add clarity on the meaning of the divisions on the scale. Alternatively, an unanchored scale might be worded, “on a scale of 1-10 where 10 is very satisfied…” More divisions allow finer distinctions in responses, but they also make labeling every point impractical.

Should you label every point in your scale? Generally, yes if possible, as labeling every point increases understanding of your scale and leads to more consistent responses.

How many distinct points should you have in your scale? Five will usually do for most questions, but if you believe that your data is going to be skewed either negatively or positively, then a seven or ten point scale will produce a wider range of answers.

The Final Word

Now that you’ve crafted all your survey questions and given them appropriate response scales, you get to worry about the order to put them in. The order of questions in a survey can be important. If you ask a question early in a survey, the responses will be different than if the same question is asked later, after a respondent has had time to think about a topic. Later responses are more likely to provide more depth than earlier ones.

Thinking about the research objectives and the specific informational needs, an experienced researcher will craft questions in different ways to completely surround a topic. For example, you might ask the respondent to name something top of mind, and then ask about familiarity with similar items, and then ask for rankings of these items on particular characteristics. You might also include an open-ended question on why one thing is different from another.

***

So survey development is easy, right? For more information on designing questions for surveys, I’d recommend the following books:

http://rcm.amazon.com/e/cm?t=patrmadd0a-20&o=1&p=8&l=bpl&asins=0787970883&fc1=000000&IS2=1&lt1=_blank&m=amazon&lc1=0000FF&bc1=000000&bg1=FFFFFF&f=ifr
http://rcm.amazon.com/e/cm?t=patrmadd0a-20&o=1&p=8&l=bpl&asins=0749450282&fc1=000000&IS2=1&lt1=_blank&m=amazon&lc1=0000FF&bc1=000000&bg1=FFFFFF&f=ifr
http://rcm.amazon.com/e/cm?t=patrmadd0a-20&o=1&p=8&l=bpl&asins=078797546X&fc1=000000&IS2=1&lt1=_blank&m=amazon&lc1=0000FF&bc1=000000&bg1=FFFFFF&f=ifr

Categories: questions, research, survey

Sampling for Quantitative Surveys

May 15, 2010

Sampling is the process of selecting units (these units are often, but not always, individual people) from a population of interest in a way that allows us to study a smaller group, but generalize our findings back to a larger population.

Of course there are times when you don’t care about generalizing back to a larger group. If you are evaluating how satisfied your employees are, there is no need to select a sample of employees (unless you happen to be GE or something). Instead, it is easier to survey all your employees. This is called a census.

However, unless the group of people you are interested in surveying is relatively small (in the thousands or fewer), it makes financial sense to select a sample of your group, conduct a survey with those people, and generalize those results back to the larger group. In most market and social research, we are interested in generalizing to specific groups. The group you wish to generalize to is referred to as the study population and your sample will come from it.

Once you have identified your study population, you have to get a list of all the members that are accessible, and this list becomes your sampling frame. Finally, you actually draw your sample (using one of the many sampling procedures). The sample is the group of people whom you select to be in your study.

Note that the sample is not the group of people who actually complete the study. You may not be able to find all of the people you actually sample, or some could drop out over the course of the study. Unless you can get every single person in your sample to respond, the group that actually completes your study will be a portion of the sample (the people who do not complete the survey are often referred to as non-respondents or dropouts).

Sampling is a complex, multi-step process, with lots of opportunities to go wrong. There is the possibility of introducing bias when going through the process of identifying a sample. For instance, you may be able to clearly identify the population of interest, but you probably will not have access to all of them. There are opportunities for error when drawing the sample from the sampling frame. And, of those in the sample, some probably will not fully participate (drop-outs and non-responses).

Sometimes, these problems can be corrected by using weighting. Though statistically complicated, weighting allows the researcher to adjust research results based on non-response, sampling, data collection processes and population characteristics.
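To make the idea a little more concrete, here is a minimal sketch of one simple kind of weighting (post-stratification) in Python. The age groups, population shares, respondent counts and satisfaction rates are all made up for illustration, and real weighting schemes are usually more involved than this.

```python
# Post-stratification weighting sketch. Every number here is hypothetical.
population_share = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}
respondents = {"18-34": 50, "35-54": 100, "55+": 100}


def poststratification_weights(population_share, respondents):
    """Weight for each group = its population share / its share of the sample."""
    total = sum(respondents.values())
    return {
        group: population_share[group] / (respondents[group] / total)
        for group in respondents
    }


weights = poststratification_weights(population_share, respondents)

# Hypothetical share of each group that said they were satisfied.
satisfied_rate = {"18-34": 0.60, "35-54": 0.70, "55+": 0.80}

# Unweighted result: every respondent counts the same.
unweighted_rate = sum(
    respondents[g] * satisfied_rate[g] for g in respondents
) / sum(respondents.values())

# Weighted result: each respondent counts in proportion to their group's weight.
weighted_total = sum(respondents[g] * weights[g] for g in respondents)
weighted_rate = sum(
    respondents[g] * weights[g] * satisfied_rate[g] for g in respondents
) / weighted_total
```

In this made-up example the 18-34 group is under-represented among respondents, so each of its members gets a weight of 1.5, and the overall satisfaction figure moves from 72% unweighted to 70% weighted.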

Types of Sampling Approaches

Broadly speaking, there are two types of sampling: probability sampling and non-probability sampling. A probability sampling method is sampling that uses random selection, a process that assures that every unit in the study population has a known, non-zero probability of being chosen for the sample (in the simplest designs, these probabilities are also equal). A probability sample is required when the objective is to generalize results to a population rather than just those who responded to the survey.

There are a number of sampling methods in this category and each is appropriate in different circumstances.

Simple random sampling (like drawing names from a hat) is easy to accomplish and easy to explain to others: “Everyone has the same chance to be selected.” It is a fair way to select a sample so you can generalize the results from the sample back to the population. It is not the most statistically efficient method of sampling because, by the luck of the draw, you may not get good representation of population subgroups.
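For what it’s worth, the names-from-a-hat idea takes only a few lines of Python. This is just a sketch: the sampling frame of customer IDs is invented, and the seed is there only so the draw can be repeated.

```python
import random


def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n)


# Hypothetical sampling frame of 1,000 customer IDs.
frame = [f"customer_{i:04d}" for i in range(1000)]
sample = simple_random_sample(frame, n=50, seed=42)
```

Nothing in this draw guarantees that any particular subgroup of customers shows up in the 50, which is exactly the weakness described above.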

Stratified random sampling remedies this problem by dividing the study population into homogeneous subgroups (strata) and then taking a simple random sample in each subgroup. Stratified sampling has a couple of advantages over simple random sampling. First, it assures representation of subgroups of the population, particularly subgroups based on geography (for example, counties or cities). In fact, if you want to be able to talk about subgroups, this may be the only way to go. Second, if the strata are homogeneous, then this method usually is more precise statistically. The reason is that the variability within groups is usually lower than the variability in the whole population.
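Here is a rough Python sketch of that two-step process: split the frame into strata, then draw a simple random sample inside each one. The county names and the frame are hypothetical, and I’m taking the same number from every stratum just to keep it short (real designs often sample each stratum in proportion to its size).

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then take a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = {}
    for name, members in strata.items():
        # Never ask for more units than the stratum actually contains.
        k = min(n_per_stratum, len(members))
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical frame of (resident_id, county) pairs, 100 residents per county.
counties = ["County A", "County B", "County C"]
frame = [(i, counties[i % 3]) for i in range(300)]

by_county = stratified_sample(
    frame, stratum_of=lambda unit: unit[1], n_per_stratum=20, seed=1
)
```

Every county ends up with 20 residents in the sample, so you can report on each county separately.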

Probability proportional to size sampling uses a size measurement (such as the number of employees in a company or the number of students in a school) to assist in the sampling process. Using this method, each item (in this case a business or a school) is assigned a sampling probability in proportion to its size. So a company with twice as many employees as another would have twice the probability to be sampled. This method can increase the representativeness of the sample by focusing it on the larger members of your sampling frame, where more business is conducted.
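A small Python sketch shows the “twice the employees, twice the probability” idea. The company names and employee counts are invented, and for simplicity this version draws with replacement (so a big company can be picked more than once); production PPS designs typically use more careful without-replacement schemes.

```python
import random
from collections import Counter


def selection_probabilities(sizes):
    """Per-draw chance of selection, proportional to each unit's size."""
    total = sum(sizes)
    return [size / total for size in sizes]


def pps_sample(units, sizes, n, seed=None):
    """Draw n units with replacement, with probability proportional to size."""
    rng = random.Random(seed)
    return rng.choices(units, weights=sizes, k=n)


# Hypothetical companies and their employee counts.
companies = ["Small Co", "Medium Co", "Large Co"]
employees = [100, 200, 400]

probabilities = selection_probabilities(employees)

# Many draws, only to show that larger companies come up more often.
draws = pps_sample(companies, employees, n=7000, seed=3)
counts = Counter(draws)
```

Here Medium Co has exactly twice the per-draw probability of Small Co, and Large Co has twice that of Medium Co.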

Cluster or area random sampling addresses administrative efficiency problems when sampling a population that is spread out geographically. The steps in cluster sampling are: divide the population into clusters (sometimes along geographic boundaries), randomly sample clusters, then measure all units within sampled clusters.
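Those three steps translate fairly directly into Python. In this sketch the neighborhoods and households are made up, and the clusters are assumed to have already been defined as a dictionary.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep every unit inside them.

    clusters: dict mapping a cluster name to the list of units in it.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Step 1: hypothetical population divided into 10 neighborhoods of 30 households.
clusters = {
    f"neighborhood_{c}": [f"household_{c}_{h}" for h in range(30)]
    for c in range(10)
}

# Steps 2 and 3: randomly sample 3 neighborhoods and take everyone in them.
selected = cluster_sample(clusters, n_clusters=3, seed=7)
```

The interviewer only has to travel to three neighborhoods instead of chasing individual households scattered across all ten.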

Multi-stage sampling is when sampling methods are combined in useful ways to address sampling needs in the most efficient and effective manner possible. Most real marketing and social research uses complex sampling strategies that combine aspects of the ones described above.

Non-probability sampling is sampling that does not involve random selection. Non-probability samples may or may not represent the population well, and there is no sure way to tell. In general, researchers prefer probabilistic or random sampling methods and consider them to be more accurate and rigorous. However, in market and social research, there may be circumstances where it is not feasible, practical or theoretically sensible to do random sampling. For example, it may be impossible to obtain or compile a complete list of the group you want to survey, making it impossible to pull a random sample.

There are two broad types of non-probability sampling: accidental and purposive.

Convenience sampling is another name for accidental sampling. This type of sampling uses whoever is readily available or convenient to the researcher.

Purposive sampling is sampling with a purpose in mind and usually targets specific, predefined groups. Intercepts are generally this type of sampling. A researcher goes to a location and then observes the people passing by to see who appears to be in the target category. Once identified, the researcher will stop him/her to request participation. If the person agrees, the first thing the researcher must do is verify that the respondent meets the criteria for being in the sample.

Purposive sampling can be very useful for situations where you need to reach a targeted sample quickly and where sampling for proportionality is not a concern. With a purposive sample, you are likely to get the opinions of your target population, but you are also likely to overrepresent subgroups in your population that are more readily accessible.

There are a number of purposive sampling techniques, such as: modal instance sampling (targeting the “typical case”); heterogeneity sampling (diversity sampling, aiming to solicit the full range of possibilities); expert sampling (targeting people with known expertise in some area); quota sampling (targeting people according to some fixed quota); and snowball sampling (identifying someone who meets the criteria and then asking him/her to recommend others). Despite their limited ability to generate population generalizations, each of these methods may make sense under certain circumstances.
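As one illustration, quota sampling at an intercept location can be sketched in Python as “take people in the order they arrive until each group’s quota is full.” The shoppers, groups and quota sizes below are hypothetical.

```python
def quota_sample(arrivals, group_of, quotas):
    """Accept people in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    accepted = []
    for person in arrivals:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
        # Stop intercepting once all quotas are met.
        if not any(remaining.values()):
            break
    return accepted


# Hypothetical stream of shoppers as (id, group) pairs, in arrival order.
arrivals = [
    ("p1", "women"), ("p2", "women"), ("p3", "men"),
    ("p4", "women"), ("p5", "men"), ("p6", "men"),
]
accepted = quota_sample(
    arrivals, group_of=lambda person: person[1], quotas={"women": 2, "men": 2}
)
```

Note that there is no random selection anywhere in this: whoever walks by first gets in, which is why the results can’t be generalized the way a probability sample can.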

Categories: research, sample, survey