Pros and Cons to a Univariate Analysis

One purpose of our SPSS statistics forums is to effectively communicate quantitative information about sample data to your audience (e.g., your client, boss or, in your case, professor). We have already discussed how studies conducted on sample data involve a combination of descriptive and inferential statistics. The first task of a researcher is to examine what has been collected using one or more descriptive statistics. Often referred to as a “univariate” (one-variable) analysis, the goal is to describe each variable of interest in your database using the best possible statistic(s). The usual suspects include measures of central tendency (mean, median and mode) and dispersion (range and standard deviation). To visually display the distribution of a variable, a percentage distribution, bar chart or histogram is useful. The question is whether the descriptive statistics chosen fit the level of measurement that characterizes a variable (i.e., whether X or Y is nominal, ordinal or interval/ratio). For example, if you create the variable “sex” using 1=male and 2=female in SPSS, reporting its mean would be meaningless (a mean of 1.5 tells us what?). Consequently, the only useful statistic is the frequency or percentage of each sex in the sample (e.g., 60 percent of the sample were female). If you want, you can add a bar chart to graphically depict the distribution.
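To make the "fit the statistic to the level of measurement" rule concrete, here is a minimal sketch in Python rather than SPSS. The function name `describe`, the level labels and the ten-person sample are all hypothetical and chosen only for illustration; the sample is built to match the 60 percent female example above.

```python
from collections import Counter
from statistics import mean, median, stdev


def describe(values, level):
    """Return only the descriptive statistics that suit the level of measurement.

    `level` is one of "nominal", "ordinal" or "interval/ratio" (hypothetical labels).
    """
    n = len(values)
    counts = Counter(values)
    # Frequencies, percentages and the mode are meaningful at every level.
    summary = {
        "n": n,
        "percent": {code: round(100 * c / n, 1) for code, c in sorted(counts.items())},
        "mode": counts.most_common(1)[0][0],
    }
    # The median needs ordered categories.
    if level in ("ordinal", "interval/ratio"):
        summary["median"] = median(values)
    # The mean and standard deviation need equal-interval scores.
    if level == "interval/ratio":
        summary["mean"] = mean(values)
        summary["stdev"] = stdev(values)
    return summary


# Made-up sample for illustration: 1 = male, 2 = female.
sex = [1] * 4 + [2] * 6
result = describe(sex, "nominal")
print(result)
```

For the nominal variable "sex" the sketch reports only the percentages and the mode; no mean is calculated, which is the point of the paragraph above.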

So how do I evaluate a good univariate analysis? The answer is straightforward. I look for two fundamentals. First, given the variable you selected to describe, did the chosen statistics fit the variable’s level of measurement (mentioned above and covered in class)? Second, how effective was your presentation of the variable? In the case of the latter, I’m referring to the information you provided to your audience and how you presented it visually. The following is a good example of a univariate analysis of the GSS "Political Outlook" scale (i.e., polviews):

The NORC-GSS (2006) asked 1500 respondents “Do you think of yourself as a Liberal or Conservative?” Using a seven-point scale ranging from “extremely liberal” to “extremely conservative,” 1439 individuals responded to this question. Almost four-in-ten (39.1 percent) indicated they were politically “moderate.” The number of respondents on either side of the political spectrum was nearly even. One-in-six (16.1 percent) stated they were “liberal” to “extremely liberal” and nearly one-in-five (19.2 percent) indicated they were “conservative” to “extremely conservative” (see Table 1 below). This split is reflected in the variable's sample mean (4.1) and standard deviation (1.4). Two-thirds of the distribution fell in the range of “slightly liberal” to “slightly conservative,” with the largest single group placing themselves right in the center as politically “moderate.”
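The "two-thirds" statement in the example can be checked against the reported mean and standard deviation. The short Python sketch below uses only the figures quoted in the example; it assumes the scale is coded 1 ("extremely liberal") through 7 ("extremely conservative") and that the reported percentages are valid percentages summing to 100.

```python
# Figures quoted in the example write-up.
scale_mean, scale_sd = 4.1, 1.4

# Assumed coding of the seven-point scale: 1 through 7.
labels = {
    1: "extremely liberal",
    2: "liberal",
    3: "slightly liberal",
    4: "moderate",
    5: "slightly conservative",
    6: "conservative",
    7: "extremely conservative",
}

# Scale points that lie within one standard deviation of the mean.
low, high = scale_mean - scale_sd, scale_mean + scale_sd
within_one_sd = [point for point in labels if low <= point <= high]

# Everyone not in the two outer groups (16.1 and 19.2 percent) falls in points 3 to 5.
share = round(100 - 16.1 - 19.2, 1)

print([labels[p] for p in within_one_sd])
print(share)
```

One standard deviation either side of the mean runs from about 2.7 to 5.5, which covers "slightly liberal" through "slightly conservative", and the share of respondents in that range is 64.7 percent, or roughly two-thirds.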




NOTE: Data that quantify this political outlook scale (labeled "polviews" in SPSS) are measured at the ordinal level. Here, data are categorized (see range on right) with their respective percentages displayed in a pie chart. By reporting the scale's central tendency (mean) and dispersion (standard deviation), data also are treated at the interval/ratio level. With statistical insight and SPSS, you'll see data's flexibility unfold. How? Beginning with our first statistical forum, you'll be transforming (e.g., RECODING) numeric characteristics of one variable into another to best suit your research needs.
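As a rough picture of what recoding does, the Python sketch below collapses a seven-point scale into three categories. It is not SPSS syntax; the cut points (1-3, 4, 5-7), the name `RECODE_MAP` and the ten responses are hypothetical choices made for illustration, not the scheme used in the forums.

```python
from collections import Counter

# Hypothetical recoding scheme: collapse codes 1-7 into three groups.
RECODE_MAP = {
    1: "liberal", 2: "liberal", 3: "liberal",
    4: "moderate",
    5: "conservative", 6: "conservative", 7: "conservative",
}


def recode(values, mapping):
    """Return a new list with each code replaced by its new category.

    Codes not found in the mapping (e.g., missing data) become None.
    """
    return [mapping.get(value) for value in values]


# Made-up responses on the 1-7 scale, for illustration only.
polviews = [1, 2, 3, 4, 4, 4, 5, 6, 7, 4]
recoded = recode(polviews, RECODE_MAP)
counts = Counter(recoded)
print(counts)
```

The original variable is left untouched and a new one is created, so the seven-point version remains available if the mean and standard deviation are still needed.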

2 comments:

  1. In your example, you describe the results with the mean and standard deviation functions. Is this considered an acceptable method on nominal scales? I see its relevance to the results, as the average choice is centered between the other three choices on either side, and combined with the standard deviation you can determine a leptokurtic distribution even with a pie chart, but is this an accepted use of terminology in similar cases?

  2. The GSS "polviews" variable is not nominal data, Robert. While crude, it represents a political ideology scale, since there is a range of 1-7. Most would refer to it as ordinal-level data and some may go as far as calling it interval/ratio. It is, therefore, customary to use the mean on scaled measures because the information is intuitive and useful.
