How to Interpret a Multiple Regression Analysis Model

We have come a long way this semester (in spite of weather cancellations!) and each of you should be congratulated on your QBA performance to date.  Just one more hurdle to go – Business Statistics Forum #6!

The last section of Part II (see “BSF #6 Guidelines” on the left side of this site) has you analyze the potential for employment discrimination at IBM using hypothetical employee data and an SPSS procedure that produces a multiple regression model based on the data.  We have been reviewing the relationship between correlation (r and r2) and regression (R, R2 and 1-R2) in class, through lectures, blog .pdfs and a PowerPoint presentation (also found on the left side of the site), and have integrated SPSS procedures into the discussion.  Though an important part of the discussion has centered on what to do when the numbers are in, I thought I’d continue this part of our in-class dialogue here to help sharpen your data analytic skills.

Specifically, you are to test and analyze the full regression equation required in BSF #6 (all seven predictors of salary) and include the implications in the recommendations section written for your boss at IBM.


Since your task is to interpret MRA output, I will assume you have already reviewed the assigned textbook readings, class notes and .pdfs, and the PowerPoint presentation and are ready to roll up your sleeves and get to work.  For the purpose of this discussion, I will not regress the dependent variable (salary) on all seven independent variables required in this MRA.  Instead, I will use only four: age, education, previous experience in months, and months since being hired (see FIG 1: Explaining Salary from Four Predictors using MRA).  Notice that there are three tables in this figure: the Model Summary, the ANOVA table and the Coefficients table.  I will discuss the role of each table in your MRA in turn.

The model summary output stems from a default forced entry procedure in SPSS (i.e., METHOD=ENTER), rather than FORWARD, STEPWISE or some other useful method.  The implication of METHOD=ENTER is that all predictor variables are entered into the regression equation at one time and subsequent analysis then follows.  Note: the decision to choose a specific method will depend on the theoretical interests and understandings of the researcher.  As an example, refer to our fifth BSF (i.e., Laurie Burney and Nancy Swanson, "The Relationship between Balanced Scorecard Characteristics and Managers' Job Satisfaction," Journal of Managerial Issues, 22(2), Summer 2010: 166-181), which used MRA to explain managers' job satisfaction from multiple sets of predictor variables.
There are two essential pieces of information in the Model Summary table: R and R2.  The multiple correlation coefficient (R) is a measure of the strength of the relationship between Y (salary) and the four predictor variables selected for inclusion in the equation.  In this case, R=.667 which tells us there’s a moderate-to-strong relationship.  By squaring R, we identify the value of the coefficient of multiple determination  (i.e., R2).  This statistic enables us to determine the amount of explained variation (variance) in Y from the four predictors on a range from 0-100 percent. Thus, we’re able to say that 44.5 percent of the variation in Y (salary) is accounted for through the combined linear effects of the predictor variables.  THIS IS MISLEADING, however, since we don’t yet know which of the predictors has contributed significantly to our understanding of Y and which ones have not. We will address this important issue in the last table (Coefficients) when we explore each predictor’s beta (i.e., standardized regression coefficient) and its level of significance.
Question: How do we know if the MRA model itself is statistically significant or if we're just wasting our time staring at non-significant output?  The answer is found in the ANOVA table. Because R2 is not a test of statistical significance (it only measures explained variation in Y from the predictor Xs), the F-ratio is used to test whether or not an R2 this large could have occurred by chance alone.  In short, the F-ratio found in the ANOVA table tests whether the linear model explains a significant amount of the variation in Y.  On review of the output found in the ANOVA table, we can address the above-cited question in one sentence: the overall equation is statistically significant (F=93.94, p<.001).
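If you are curious how R2 and the F-ratio are connected, the F in the ANOVA table can be recovered from R2, the number of predictors (k) and the sample size (n).  Here is a minimal Python sketch of that arithmetic (the sample size of 474 is my assumption for illustration; use whatever n appears in your own output):

    # Recovering the overall F-ratio from R-squared (n = 474 is assumed for illustration).
    r_squared = 0.445      # from the Model Summary table
    k = 4                  # predictors: age, education, previous experience, months since hire
    n = 474                # assumed number of employees in the data file

    f_ratio = (r_squared / k) / ((1 - r_squared) / (n - k - 1))
    print(round(f_ratio, 2))   # roughly 94, consistent with the ANOVA table's F = 93.94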
Figure 1: Explaining Salary from Four Predictors using MRA

Finally, as I previously pointed out, we need to identify which predictors are significant contributors to the 44.5 percent of explained variance in Y (i.e., R2=.445), which ones are not, and in what way(s) the significant ones help us to explain Y.  Answers to these important questions are found in the Coefficients table shown above.  In terms of our focus, note that for each predictor variable in the equation we are only concerned with its associated (1) standardized beta and (2) the level of significance (Sig.) of its t statistic.  As always, whenever p is less than or equal to .05, we consider the result statistically significant.  For the purpose of understanding MRA output, this means that when a predictor's p-value (Sig.) is less than or equal to .05, its beta contributes significantly to the equation.  So what did we find?

From this equation, the level of education was found to be the only independent variable with a significant impact on employee salary (beta=.673, p<.001) when all of the variables were entered into the regression equation.  We found that the higher the level of employee education at IBM, the greater the salary.  The employee's age, previous experience in months, and time on the job since being hired did not meet the criteria for a statistically significant impact on salary, so they play no role at this stage of the analysis.  We should note, however, that an employee's previous experience (measured in months) did approach significance (beta=.107, p=.06).
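For those who would like to see how the same three tables can be reproduced outside of SPSS, here is a minimal Python sketch (the file name and column names are hypothetical placeholders; BSF #6 itself is done entirely in SPSS):

    # Illustrative only; column names are placeholders for the variables in your file.
    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("employees.csv")                 # hypothetical export of the SPSS data file
    X = df[["age", "education", "prev_exp_months", "months_since_hire"]]
    X = sm.add_constant(X)                            # forced entry of all four predictors at once
    model = sm.OLS(df["salary"], X).fit()

    print(model.rsquared)                             # R-squared (Model Summary table)
    print(model.fvalue, model.f_pvalue)               # F-ratio and its p-value (ANOVA table)
    print(model.summary())                            # coefficients, t statistics and Sig. values

Note that the Coefficients table in SPSS also reports standardized betas; to reproduce those in this sketch you would first convert each variable to z-scores before fitting the model.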

I hope this helps!

Professor Ziner

Calculating the Expected Frequencies (E) in a Chi-Square Test of Independence

I have had several students ask me to review how to calculate a cell's "E," the value of the expected frequency under the null hypothesis, across all cells in a Chi-Square Test of Independence.  The process is simple: to calculate each cell's E, multiply the cell's corresponding row marginal total by its corresponding column marginal total and then divide by n (the sample size).  That is, for each cell, E = (row total × column total) / n.


Note that this procedure differs from the one-variable Chi-Square "Goodness of Fit" test.  To obtain the expected frequency for each cell in the "Goodness of Fit" test (assuming equal expected frequencies under the null hypothesis), divide the sample size by the number of categories.
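To see the arithmetic for both tests in action, here is a minimal Python sketch using made-up counts (the numbers are purely illustrative and are not taken from any BSF data file):

    # Expected frequencies for a hypothetical 2x2 table and a one-variable tally.
    import numpy as np

    observed = np.array([[30, 20],      # e.g., males:   A Great Deal / Hardly Any
                         [10, 40]])     #      females:  A Great Deal / Hardly Any

    n = observed.sum()                  # sample size (100)
    row_totals = observed.sum(axis=1)   # row marginal totals
    col_totals = observed.sum(axis=0)   # column marginal totals

    # Test of Independence: E = (row total x column total) / n for each cell
    expected = np.outer(row_totals, col_totals) / n
    print(expected)                     # [[20. 30.] [20. 30.]]

    # Goodness of Fit (one variable, equal expected frequencies): E = n / number of categories
    counts = np.array([35, 25, 40])     # hypothetical one-variable tally
    print(counts.sum() / len(counts))   # 100 / 3 = 33.33...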

I hope this helps!

Professor Ziner

Role of Sample, Sampling and Theoretical Distributions in Hypothesis Testing

Our latest class poll produced the following results: 70 percent of those who responded got it wrong (see our class poll's outcome in the chart below).  Given that we just began to discuss this relationship and its connection to hypothesis testing in class, allow me to further develop the poll's central issue in modest detail.  We'll continue this discussion in class.  I hope you're taking notes.


Correct Answer to Our Class Poll: A Sampling Distribution (#4)

Let’s start off with a common activity among researchers:  We want to test a hypothesis using a specific statistic (e.g., z, t, Chi-Square, etc.) calculated from a random sample drawn from a population of interest.  That is, we want to test a hypothesis based on a statistic we calculate from raw scores that comprise our sample data (our sample distribution).  Once we use a standardized procedure to calculate the statistical outcome based on sample data (for example, t=2.13), we have the following question to address:  

Q:  How do we know if this singular outcome is statistically significant so we can reject the null hypothesis?  That is, how do we know whether the resulting outcome (t=2.13) occurred by chance (an outcome that is not significant) or is due to something other than chance (where we infer that a statistically significant outcome has occurred)?  This question is at the heart of inferential analysis.  Why?  If we meet the criteria needed to reject our null hypothesis, we can generalize our sample results to the population from which the sample was drawn, i.e., to the theoretical distribution of all raw scores in a population.

We answer this important question in three parts.

First, we need a model of all possible statistical outcomes of that specific statistical test and their associated probabilities of occurrence.  This model is called a sampling distribution of a given statistic.  Each sampling distribution of a statistic (such as z, t, Chi-Square, etc.) is generally found in the back of your statistics or research methods textbook, so you never have to create a sampling distribution on your own.  To reiterate the point, such a model gives us:
·    all possible outcomes of a given statistic and
·    the associated probabilities of occurrence of each statistical outcome
Second, the researcher then compares the statistic that (s)he computed from sample data (sample distribution) to the statistical model that gives the probabilities associated with observing all empirical outcomes (all statistical values) of that statistic.  That is, the researcher makes a comparison between (1) the original singular (one-sample) statistical outcome and (2) a statistical model (sampling distribution) that gives the probability of observing all given outcomes of that statistic.  In the example above, it would be a comparison between the calculated t statistic (2.13) and the sampling distribution of all potential outcomes of t and their associated probabilities of occurrence.  What you actually compare will be discussed shortly.   

To visualize this relationship, consider the following diagram, which divides a sampling distribution into two regions based on such a comparison.  One region is referred to as the critical region (or CR).  If your (one sample) statistical outcome falls in this region, you reject the null hypothesis of no difference in favor of your non-directional or directional research hypothesis (see my prior blog for this distinction).  The CR is defined as that portion of a sampling distribution that leads to the rejection of the null hypothesis.


Non-Chance (CR) and Chance Regions of a Sampling Distribution

Where does the CR begin?  Enter the concepts of alpha (the level of significance) and the critical value (a value that begins the CR at alpha).  When alpha is set (by convention) at .05, the critical region marks off the outer 5 percent of a sampling distribution (2.5% in each tail for a non-directional hypothesis, as shown in the diagram above, and 5% in one tail for a directional hypothesis).

Where do you find a critical value for your statistic?  In practice, a published sampling distribution is presented as a table of critical values.  All you need to find the critical value is (1) your sample's degrees of freedom (which vary by statistic; for example, df = n-1 for a one-sample t), (2) the level of significance, or alpha, and (3) the type of hypothesis, whether directional (one-tailed) or non-directional (two-tailed).  With this information and access to a statistic's sampling distribution, you will be able to find the critical value at alpha.
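If you ever want to double-check a tabled critical value, here is a minimal Python sketch that performs the same lookup (the degrees of freedom below are an assumption, chosen purely for illustration):

    # Looking up t critical values at alpha = .05; df = 24 is assumed for illustration.
    from scipy import stats

    alpha = 0.05
    df = 24                                          # e.g., n - 1 for a one-sample t

    t_two_tailed = stats.t.ppf(1 - alpha / 2, df)    # non-directional: 2.5% in each tail
    t_one_tailed = stats.t.ppf(1 - alpha, df)        # directional: 5% in one tail

    print(round(t_two_tailed, 3))                    # about 2.064
    print(round(t_one_tailed, 3))                    # about 1.711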

Finally, with knowledge of your one sample statistical outcome (from the sample distribution) and the critical value of that statistic at alpha (from the sampling distribution), you are ready to plug these values into the decision rule presented in the diagram below to see if you can generalize the hypothesized results to the overall population from which the sample was drawn (to the theoretical distribution):     



How to Determine if Your Outcome is Significant:
Accept or Reject the Null Hypothesis

The logic associated with this entire discussion can be summarized through the following six steps of hypothesis testing which we reviewed in class:

1.  State your research hypothesis
2.  State your null hypothesis
3.  Set alpha (conventionally set at .05)
4.  Identify the critical value of the test statistic at alpha
5.  Calculate the test statistic
6.  Compare your one sample statistical outcome to the critical value of that statistic at alpha.  If the absolute value of the statistic you calculate from sample data meets or exceeds the absolute value of the critical value at alpha, then you will reject the null hypothesis of no difference in favor of your non-directional or directional research hypothesis.
So, in short, a distribution of all potential outcomes of a statistic and their associated probabilities of occurrence is called a sampling distribution.  For reasons we just examined, sampling distributions are an essential part of inferential analysis.
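Putting the pieces together, here is a minimal Python sketch of the step-6 decision rule, using the t = 2.13 example from earlier (the degrees of freedom and the two-tailed test are my assumptions, chosen purely for illustration):

    # A minimal sketch of step 6.  The df and the two-tailed test are assumed for illustration.
    from scipy import stats

    t_calculated = 2.13                          # the outcome computed from your sample data
    alpha = 0.05
    df = 24                                      # e.g., n - 1 for a one-sample t

    t_critical = stats.t.ppf(1 - alpha / 2, df)  # two-tailed critical value at alpha (about 2.064)

    if abs(t_calculated) >= abs(t_critical):
        print("Reject the null hypothesis of no difference")
    else:
        print("Accept (fail to reject) the null hypothesis")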

I hope this helps.

Professor Ziner

This message will self-destruct ...

This note is sent in response to our course cancellation on March 23rd due to a snow storm.  ESU officially closed the university that day. To the students in my 5:30-6:45 p.m. class, email me before Monday if you want to know your exam scores.

See you Monday!

Professor Ziner
"Snow Storm"
Captured by Todd Klassy on the west side of Madison, Wisconsin near
Elver Park in the midst of a blinding snow storm on February 16, 2006.
(Source:
http://www.flickr.com/photos/latitudes/101340553/)

Toward a Better Understanding of Standard Scores (or z-Scores)

We have been reviewing the role of z-scores (also known as standard scores) and, to help clarify a few essential points, let me offer the following:
1.    A score derived from a test or scale is meaningless by itself unless it can be compared to the distribution of scores from some reference group (recall the "Are you impressed with someone who catches a 50 pound Northern Pike?" example).  Hence, once a reference group is established, a single measure then becomes meaningful (the average Northern Pike seldom exceeds 10 pounds so, with that knowledge, I would certainly be impressed).

2.    The process of dividing a score's deviation from the mean by the standard deviation is known as the transformation to standard scores (or z scores).  Symbolically, z = (X - x̄) / s for sample data, or z = (X - μ) / σ when the population mean and standard deviation are known, where X is the raw score.

3.    The difference between "s" and "z" is straightforward: the standard deviation (s) is a measure expressing dispersion in original units of measurement, while a standard (z) score is a raw score ("X" in the formulas above) converted into standard units, i.e., a number of standard deviation units from the mean.  Think of "s" as a single summary of the whole distribution and "z" as a value computed for each individual score.

4.    The value of this conversion to standard (z) scores is threefold: (1) it always yields a mean of zero and a standard deviation of one (note, however, that it doesn't "normalize" a non-normal distribution such as one that is highly skewed); (2) if the population of scores is normal on a given variable, we can express any raw score ("X") as a percentile rank by referring our z to the standard normal distribution (i.e., a sampling distribution of z scores; click here to see this sampling distribution); and (3) we can compare an individual's position on one variable with her or his position on a second.

5.    In short, by transforming the scores of a normally distributed variable to z-scores, we are, in effect, expressing these scores in units of the standard normal curve.  Since we reviewed the characteristics of the standard normal curve (SNC) in class, you should be familiar with them.

6.    When you attempt to calculate and interpret z-scores (click here for your latest z-score exercises), you will need the mean and standard deviation.  This information either will be calculated by you from the raw scores comprising a sample or will be given to you as a set of sample statistics or population parameters.  In either case, it will be assumed that the variable(s) to be assessed are normally distributed.

7.    I recommend following a few guidelines when you approach z-score exercises. 

a.    First, always diagram each problem so you can visualize what it is that needs to be done.  That’s how we approached this in class, using the board.  Are you seeking the percent of cases that fall between a z score and the mean?  The area that falls below or above the z?  The percentile ranking of that z in the general population?  This is necessary so you can VISUALIZE the area between a given z (which you calculated) and the mean, the area below or beyond a given z, the area between two z scores and, as previously stated, the percentile rank of a given z in the general population.  SO PLEASE DIAGRAM YOUR PROBLEMS.  You will be expected to do so on your exam.

b.    Second, recognize that whenever a raw score is below the mean, the resulting z will be negative and vice versa.  This translates into the following rule when you are identifying percentile rankings.  For a positive z, use Column B in the sampling distribution (found at Broken Pencils).  For a negative z, use Column C.

c.    Third, remember there are steps involved in this process.  Once you have a mean, standard deviation and one or more raw scores with which to work, and an idea of what problems you need to address (see 7.a above), you must transform the raw score into a standard (z) score.  Next, given your understanding of the problem that needs to be solved, the resulting z-score must be identified in TABLE A – the sampling distribution of z – to complete the exercise. 

d.    Finally, be certain that you provide your answers in their correct form.  If I request the “percent of cases,” don’t give me a proportion.  If I request a percentile rank, don’t provide the answer in the form of a proportion.  And, please, round your z-scores and proportions to the nearest 1/100th (that is, 1.08, not 1.0833).
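If you would like to check your table work after diagramming a problem, here is a minimal Python sketch of one such exercise (the mean, standard deviation and raw score are made up purely for illustration):

    # A made-up z-score exercise: mean = 100, s = 15, raw score X = 116.
    from scipy import stats

    mean, s = 100, 15
    X = 116

    z = round((X - mean) / s, 2)             # transform to a standard score: 1.07
    area_below = stats.norm.cdf(z)           # proportion of cases below z
    area_above = 1 - area_below              # proportion of cases beyond z
    area_between = area_below - 0.50         # area between the mean and z

    print(z)                                 # 1.07
    print(round(area_below * 100, 2))        # percentile rank: about 85.77
    print(round(area_between, 2))            # about 0.36 (compare with Column B of your table)
    print(round(area_above, 2))              # about 0.14 (compare with Column C of your table)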

I hope this helps!

Professor Ziner

Connecting Hypothesis Testing to Chi-Square Outcomes in SPSS

I’ve reviewed a student's BSF2 report and want to recommend a few key considerations to all of you before you turn it in.

For a Null Hypothesis, she states: “There will be no difference between sex and confidence in banks and financial institutions in the United States.” Well, a better way to state it is: “There will be no difference between men and women in their confidence in banks and financial institutions in the U.S.” The difference in confidence (Y) is between two populations: men and women (X). Please review my latest blog on “Constructing Hypotheses” for a discussion on this subject in greater detail. Once you look at the ways I worded each type of hypothesis, review your work and make any necessary changes.

Fundamental to this assignment is the interpretation of each hypothesis you constructed. Did you accept or reject the null, for each? As tempting as it may be, how you decide has nothing to do with reading the percentage outcomes of the crosstabular tables. The only way to do it is to generate the Chi-Square statistic in SPSS (which we already covered in class and in your BSF2) and refer to and interpret the table(s) generated in SPSS. For example, if you examine the table below that was generated in SPSS, you can see TWO key pieces of information. As I state in your second BSF, focus on the Pearson Chi-Square value (in this table, it's 5.77) and the significance of this Chi-Square statistical outcome found under the ASYMP. SIG. (2-sided) column (in this table, it's .450).

INTERPRETATION: If the identified significance level is less than or equal to .05, then you reject the null hypothesis of no difference and accept the research hypothesis which indicates that there is a difference. In this table, .450 is much greater than .05 (45% v. 5%), so you must accept the null hypothesis (at the same time, you’re rejecting both research hypotheses).

UPSHOT? That means you should review each of your hypothesis interpretations against its "Chi-Square Tests" table, like the one that appears below, and decide which null hypotheses should be accepted and which should be rejected. Simply put, if the two-sided "asymp. sig." is less than or equal to .05, then reject the null hypothesis.
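In code form, the decision rule is a single comparison.  Here is a minimal Python sketch using the two numbers read from the SPSS table described above:

    # The Pearson Chi-Square value (5.77) and significance (.450) come from the SPSS output;
    # the sketch simply applies the decision rule described above.
    alpha = 0.05
    pearson_chi_square = 5.77
    asymp_sig_two_sided = 0.450

    if asymp_sig_two_sided <= alpha:
        print("Reject the null hypothesis of no difference")
    else:
        print("Accept the null hypothesis (and reject the research hypotheses)")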


I hope this helps!

Professor Ziner

Tips to Consider as you move from BSF1 to BSF2: PART II

Here are a few more TIPS for you to consider.  They reflect some additional concerns I have regarding the last group of BSFs I reviewed.  Please carefully review these recommendations and incorporate them into future BSFs when it's relevant.  As always, see me with any questions.

1.  For organizational purposes, in any univariate analysis section of a report, provide your descriptive analysis (means, standard deviations, percentage breakdowns, etc.) of the independent variables before you begin your singular analysis and discussion of the dependent variable(s).

2.  When it comes to missing data (i.e., “refusals,” “don’t knows,”  “no answers,” “not applicables,” etc.) associated with a univariate or bivariate analysis, unless you’re asked to do so, you need not report when one person or a non-trivial number of persons refuse to answer a survey item.  By focusing on the VALID PERCENT column, with properly assigned missing cases, SPSS removes (excludes) those missing cases from the column for a “clean” interpretation of the results.  In national surveys, such as those conducted by the National Opinion Research Center (NORC), people who do not respond to an item generally do not invalidate either the entire survey or any subset of variables analyzed within it.

3.  Some students have incorrectly introduced the term “correlation” in our first BSF in the following way: “From the results, I cannot see any correlation between X and Y.”  As we will soon discover in class, a correlation is a statistical measure of the strength of the relationship between at least an ordinal level X and Y and should only be used in that context.  To use the term as a synonym for “relationship” or “association” would, therefore, be incorrect when conducting a crosstabular analysis.    

4.  The integration of tables and charts, while previously discussed in class and at this blog, needs additional clarification.  Never place a table or chart in the center of your text discussion – only to the right (with discussion on the left) or, preferably, below the relevant discussion.  Moreover, always label a table within its title (e.g., “TABLE 1C: Gender of MBA Students by Attitudes Toward Management Scores”) and reference the table/chart in your text as previously described (i.e., “See TABLE 1C”).  Any unlabeled table or chart, whether referenced or not somewhere in your analysis, does not belong in the report.

5.  When you report percentages associated with a single variable's (or crosstabular) categories, do not refer to a percentage found "in category one" or "in category two."  Since all categories of a variable (X or Y) must be properly labeled, report the percentages found across those labels (i.e., "55 percent have 'A Great Deal' of support for banks and financial institutions," not "55 percent were found in category one").  Related to this, be sure to present each analysis of a dependent variable (Y) in its own paragraph.  Do not present a one- or two-page-long run-on paragraph that includes analyses of multiple dependent variables.  This also includes separate paragraphs for each unique X's impact on the same Y (i.e., X1 and X2 by Y1 should involve two separate paragraphs).


6.  Finally, try to keep any personal politics out of your discussion and analysis.  Future employers are interested solely in "just the facts" of an assignment you're given – not your personal opinion.  A few students decided to include their perceptions of affirmative action policy in their reports which, while interesting, have little to do with the way the data unfolded at this descriptive stage of the course.

I hope this helps!

Professor Ziner

Tips to Consider as you move from BSF1 to BSF2: PART I

In addition to the previous two blog entries, here are a few more tips you should consider as you review my comments from the first BSF and prepare to submit BSF2:

1.   Never analyze nominal data (and many ordinal level measures) using the mean and standard deviation.  For example, a mean of 1.55 and standard deviation of .623 on the variable sex (1=male, 2=female) is an absolutely meaningless set of statistics.  Hence, when you describe categories associated with nominal data, only a percentage analysis is warranted (see the short sketch at the end of this list).  Note: When you're ready to make inferences about the population from nominal level sample data (e.g., using "Chi-Square" along with the "Contingency C Coefficient"), then you are conducting inferential analyses, not descriptive analyses.  This effort involves testing hypotheses and estimating population parameters, which we previously discussed in class.  I raise this latter issue because, in BSF2, you are also responsible for constructing non-directional and directional hypotheses of difference and association from the list of variables found in the QBA_Database.sav file.  See my previous blog for a discussion on constructing these hypotheses.

2.   Be sure to examine the VALID PERCENT column of a table, not the PERCENT column because the valid percent excludes all missing cases in the tally.  Related to this, you can report the number/percentage of missing cases tied to the PERCENT column.  However, unless you’re asked to do so or you have a pressing, legitimate reason to do so, don’t bother (or put the missing information in a footnote).  Only report those categories that appear in the VALID PERCENT column without noting information on missing cases.

3.   When you gaze at a crosstabular table with a potentially large number of row and column percentages, be selective in what you ultimately choose to report.  Do not report everything you see.  Identify and report on the highlights (maybe 4-6 key points).  Where do you begin?  As we discussed in class, center on categories of the Y (dependent variable) and, within those categories, report on the X's (independent variable) differences.  Do not analyze the X categories and look at how they vary across Y.  For example, you may find that 60 percent of the sample has "A GREAT DEAL" of confidence in America's banks and financial institutions (i.e., one category of Y).  Of that percentage, are females more likely than males to comprise this group?  What is the percentage difference between males and females who report "A GREAT DEAL"?  Note: We are not testing hypotheses when using crosstabs.  We're just gaining a two-variable (bivariate) understanding of the data and reporting what we found.

4.   When you’re requested to provide graphic evidence of the statistical outcomes in an assignment, integrate each selected table (or chart) below its respective discussion and be sure to reference the table (or chart) either in the discussion or at the paragraph’s end, i.e., “(see Table 1A).”  You should never present graphic information in a report without references to it in your discussion.  To this end, always add the corresponding table number at the top of a table (or bar chart) to the left of the table’s description.

5.   When you’re presenting graphic outcomes in a bar or pie chart, add value labels in each bar or wedge in the chart.  How?  Once the chart is generated in SPSS, click on it and another window will open.  Look for the ICON that looks like a bar chart, click and the value labels (percentages, for example) will automatically appear in the bar chart.

6.   In the univariate section of your report, please use paragraphs to logically separate parts of your discussion, especially where the discussion has become statistically tedious (e.g., lots of percentages reported) and long.  This means you should have a separate, labeled paragraph for nearly every variable in your report.  If there are two related variables, such as AGE and AGEORD (i.e., age recoded at the ordinal level), combine the discussion into one paragraph.  In this example, the result will be a paragraph that not only covers the mean and standard deviation of AGE, but also a percentage breakdown of AGEORD by the recoded variable’s categories.

7.   If you have trouble fitting a table or chart on one page in a WORD document, try the following (besides reading my prior blog on this subject):  Generate the table or chart in SPSS, edit it to make the necessary changes (such as change the title, add “TABLE #,” add percent labels to the bars in a chart, etc.), then right click to COPY the table/chart into WORD.  In your WORD document, click PASTE SPECIAL and choose PICTURE (Windows Metafile).  That will paste the graphic image into WORD.  From there, you can click on the graphic and adjust/resize it smaller to fit a page.  However, don’t make it so small that the title or data can’t be read.
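As promised in tip 1, here is a minimal Python sketch of the kinds of percentage breakdowns described in tips 1-3 (the file name, variable names and categories are hypothetical placeholders; your actual BSF work is done in SPSS):

    # Illustrative only; the file and column names are placeholders.
    import pandas as pd

    df = pd.read_csv("qba_database.csv")      # hypothetical export of QBA_Database.sav

    # Tip 1: for a nominal variable such as SEX, report percentages, not a mean.
    # dropna=True mirrors the VALID PERCENT column (missing cases are excluded).
    print(df["SEX"].value_counts(normalize=True, dropna=True) * 100)

    # Tip 3: a crosstab with column percentages lets you compare males and females
    # within each category of Y (e.g., within "A Great Deal" of confidence).
    table = pd.crosstab(df["CONFINAN"], df["SEX"], normalize="columns") * 100
    print(table.round(1))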

I hope this helps! 

Professor Ziner

On the Construction of Hypotheses

I've been asked to clarify how to specify non-directional and directional Hypotheses of Difference and Association – an essential part of your second Business Statistics Forum (BSF2).  You've already learned a few rules to this end: When X is nominal (or whenever you're using an experimental research design), then you're going to develop a Hypothesis of Difference.  When X (and Y) are measured at no less than the ordinal level, then you're going to construct a Hypothesis of Association.

In terms of how best to structure (word) each of these hypotheses, consider the following hypothetical examples (parenthetical information is added to clarify the level of measurement of X and Y and should not appear in your hypotheses):
 
Null Hypothesis of Difference:

There's no difference between Male and Female (SEX is a nominal level X) Attitudes Toward Management Scores (an interval level Y)

Research Hypothesis of Difference (Non-Directional):

There is a difference between Male and Female Attitudes Toward Management Scores

Research Hypothesis of Difference (Directional):

Males will have higher Attitudes Toward Management Scores than Females

Null Hypothesis of Association:

There's no association between Age (an I/R level X) and Attitudes Toward Management Scores (an interval level Y)

Research Hypothesis of Association (Non-Directional):

There is an association between Age and Attitudes Toward Management Scores

Research Hypothesis of Association (Directional):

As Age increases, Attitudes Toward Management Scores will decrease

As you have already learned this semester, directional hypotheses involve specification of how X and Y are related.  For a Hypothesis of Difference, you specify how a category of X (e.g., males or females, those exposed to X or those not exposed to X, etc.) will differ from the other category or categories in their responses to Y (see the example above).  For a Hypothesis of Association, you specify whether a positive or negative association will exist between X and Y (see the example above).  What's the difference between a "positive" and "negative" association?  For old time's sake:

Positive:  "As X increases, Y increases" or "As X decreases, Y decreases" (Be sure the direction of X and Y is the same)

Negative:  "As X increases, Y decreases" or "As X decreases, Y increases" (Be sure the directions of X and Y are opposite)

Finally, you should recall that, as a researcher, you determine if the relationship between X and Y is positive or negative based on your understanding of prior research.

I hope this helps!

Professor Ziner

How to Recode SOCBAR into SOCBAR2 for BSF2

Open the QBA_Database (.sav) file in SPSS and scroll over to SOCBAR in the data editor window.  You just want to be able to view it.  Once SOCBAR is located and centered on your screen, click on the VALUE LABELS button below the menu bar near the top (the button with a "1" and an "A" pictured).  The VALUE LABELS button hides and shows value labels for all variables at once with a simple click.  Click to see the value labels, not the numbers, and you'll view each of the original (pre-SOCBAR2) value labels that I mentioned in BSF2.  Click again and you'll see their associated numeric values.  Now view the table below and, under OLD VALUES, you'll see the same numbers that represent SOCBAR's value labels.
All you need to do is RECODE the variable SOCBAR into SOCBAR2 using the following scheme:
OLD VALUES            NEW VALUES
1, 2                  1
3, 4                  2
5, 6                  3
7                     4
ALL OTHER VALUES      SYSMIS
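The recode itself is done in SPSS, but if you would like to sanity-check the mapping outside SPSS, here is a minimal Python sketch (the file and column names are hypothetical placeholders):

    # Illustrative only; the file name is a placeholder for an export of QBA_Database.sav.
    import pandas as pd

    df = pd.read_csv("qba_database.csv")

    recode_scheme = {1: 1, 2: 1,     # old values 1, 2 -> new value 1
                     3: 2, 4: 2,     # old values 3, 4 -> new value 2
                     5: 3, 6: 3,     # old values 5, 6 -> new value 3
                     7: 4}           # old value 7     -> new value 4

    # Any value not listed above becomes missing (NaN), the equivalent of SYSMIS.
    df["SOCBAR2"] = df["SOCBAR"].map(recode_scheme)
    print(df["SOCBAR2"].value_counts(dropna=False))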

That's all!
Stay tuned for more TIPS to help with BSF2 due March 14, 2011.

Professor Ziner

Starting off the Semester with the Right Strategy for Success

Hello and welcome to EMGT 250!  This spring 2011 semester is designed to challenge you in several ways.  You'll need to be organized (perhaps more than usual).  Know, up front, that the syllabus is our contract.  You signed up for this class, right?  So, own it.  That means you'll need to find time to 1) study weekly reading assignments (detailed in the syllabus), 2) learn how reading assignments + SPSS assignments + PowerPoint lessons apply to each Business Statistics Forum (there are six BSFs), and 3) link reading assignments + PowerPoints to prepare for our course exams.  SPSS knowledge and skills will be assessed on BSF projects, not course exams.
You’ll also need to take advantage of the many available resources to do well in the class.  On this blog alone, there are the following useful resources:
·     Latest BSF assignment (.pdf files)
·     Memo Board (Upper right side; keeps you up-to-date on all course changes) 
·     Latest SPSS database for BSFs (.sav files)
·     Weekly PowerPoints (.pdf files)
·     SPSS User’s Guides (.pdf files)
·     Instructional Polls (These are anonymous polls; I cannot link a response to a student.  Our first poll ensures you understand the symbols <, =, >.  A fundamental use of the less than (<) symbol is found in hypothesis testing.  SPSS runs the analyses, driven by your hypotheses, that generate no less than two essential pieces of information: the statistical outcome and its associated probability of occurrence.  For example, t=2.61, p<.05 is read "an outcome of t=2.61 would occur by chance among all potential t outcomes less than five percent of the time."  Because the t's probability of occurrence is less than .05, we reject the null hypothesis of no difference in favor of the research hypothesis.  Note: We convert the stated probability of .05 to a percent in this discussion.)
·     Statistical Workshop (All lessons reviewed slide-by-slide online)
·     Video Workshop (Step-by-Step "How to" SPSS lessons taught online)
·     Blog Archive (See what's already been discussed about the course)
In addition, come by my office after our class (office hours are 6:45-8:00 p.m.) to clarify your notes, get help with an SPSS assignment or just chat.  Bring a flash drive with your SPSS (.sav) and WORD (.doc) files containing the latest draft of a BSF assignment.  We’ll go from there.  I am at 427 Normal (across from the Kemp Library).  My campus phone number is 422-3349.

Finally, let me add that nearly all of our BSF assignments involve working in the SPSS environment, generating various (descriptive and inferential) statistical outcomes, including tables and charts and then exporting this output into a Microsoft WORD document.  Once in WORD, you will review and assess the statistical output in relation to your hypotheses and overall research objectives of the given BSF assignment.  Your knowledge and skill levels will increase as BSF assignments become more complex over time.  Once you have finalized your BSF and are ready to submit it, there's still one last step: convert your work into a .pdf file.

This is an industry standard procedure that preserves the report's structure, margins and graphics (tables, charts, etc.) exactly as you planned them, across all OS platforms.  This cannot be guaranteed if you send a WORD document to your audience.  Although, with the exception of the final exam/BSF, I do not have you submit a report via email (BSF numbers 1-5 are to be handed in as hard copies), I do offer the option to submit your work via email for me to review up to two days before the assignment is due.  Enter the role of .pdf files, as I do not accept any attachments other than .pdfs.  To convert a WORD document to .pdf, be sure you have the necessary .pdf conversion file installed in WORD 2007 or 2010.  If that's not working out for you, click HERE to go to the DOC2PDF website.  Scroll to the bottom of that webpage and follow their simple instructions.  The site generates a .pdf file (to save to your computer) for each WORD file you submit.

On completing this course, you should be able to add several research reports (BSFs) to your portfolio for potential employers to review.  The goal is to add value to your existing set of skills obtained through the Business Management program – in this case, competence in SPSS (a powerful database manager, statistical package and graphics/report generator) and the confidence to inform your prospective employer that you know how to use it wisely in ways that can benefit his or her company.  You'll have the evidence to back up your claim!

Have a great semester!
Professor Ziner