
Presentation: «Assumption of linearity». File: «Assumption of linearity.ppt». ZIP archive size: 310 KB.

## Assumption of linearity


### Assumption of linearity

- Assumption of linearity
- Strategy for solving problems
- Producing outputs for evaluating linearity
- Assumption of linearity script
- Sample Problems

SW388R7 Data Analysis & Computers II Slide 1


### Assumption of linearity

The statistics that we will study this semester generally assume that the relationship between variables is linear, or they perform better if the relationships are linear. If a relationship is nonlinear, the statistics which assume it is linear will underestimate the strength of the relationship, or fail to detect the existence of a relationship.
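A small sketch of the underestimation described above, using made-up data rather than anything from the course dataset. The helper `pearson_r` and the data are illustrative assumptions: y is completely determined by x, but the relationship is U-shaped, so the linear correlation misses it while the correlation with the squared form of x does not.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect but U-shaped relationship.
x = list(range(-5, 6))
y = [v ** 2 for v in x]

r_linear = pearson_r(x, y)                       # correlation of y with x
r_square = pearson_r([v ** 2 for v in x], y)     # correlation of y with x squared

print(f"r(x, y)   = {r_linear:.3f}")
print(f"r(x^2, y) = {r_square:.3f}")
```

The first coefficient is essentially zero even though y can be predicted exactly from x, which is the sense in which a statistic that assumes linearity can fail to detect a relationship.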


### Linearity

Linearity means that the amount of change, or rate of change, between scores on two variables is constant for the entire range of scores for the variables. Some relationships are not linear. The relationship between learning and time may not be linear. Learning a new subject shows rapid gains at first, then the pace slows down over time. This is often referred to as a learning curve. Population growth may not be linear. The pattern often shows growth at increasing rates over time.


### Population growth in Texas

The increase in population for the ten years from 1860 to 1870 is relatively small compared to the increase in the population for the ten years from 1960 to 1970.

1860 to 1870: a difference of 214,364.

1960 to 1970: a difference of 1,617,053.


### Evaluating linearity

There are both graphical and statistical methods for evaluating linearity. Graphical methods include the examination of scatterplots, often overlaid with a trendline. While commonly recommended, this strategy is difficult to implement. Statistical methods include diagnostic hypothesis tests for linearity, a rule of thumb that says a relationship is linear if the difference between the linear correlation coefficient (r) and the nonlinear correlation coefficient (eta) is small, and examining patterns of correlation coefficients.
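The r versus eta rule of thumb can be sketched as follows. This is an illustration only: the data are made up, `eta` is computed here as the correlation ratio (square root of between-group over total sum of squares, grouping the dependent variable by each distinct value of the independent variable), and the slides do not say how small a difference counts as "small".

```python
import math
from collections import defaultdict

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def eta(xs, ys):
    """Correlation ratio: sqrt(between-group SS / total SS), grouping y by x."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    grand_mean = sum(ys) / len(ys)
    ss_total = sum((y - grand_mean) ** 2 for y in ys)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return math.sqrt(ss_between / ss_total)

# Hypothetical data: two observations at each of five x values, curving upward.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y = [1, 2, 4, 5, 9, 8, 16, 17, 25, 24]

r_value = abs(pearson_r(x, y))
eta_value = eta(x, y)
print(f"|r| = {r_value:.3f}  eta = {eta_value:.3f}  difference = {eta_value - r_value:.3f}")
```

Eta can never be smaller than the absolute value of r, so the difference is a measure of how much of the relationship the straight line leaves out.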


### Interpreting scatterplots

The advice for interpreting linearity is often phrased as looking for a cigar-shaped band, which is very evident in this plot.


### Interpreting scatterplots

Sometimes, a scatterplot shows a clearly nonlinear pattern that requires transformation, like the one shown in the scatterplot.


### Scatterplots that are difficult to interpret

The correlations for both of these relationships are low. The linearity of the relationship on the right can be improved with a transformation; the plot on the left cannot. However, this is not necessarily obvious from the scatterplots.


### Using correlation matrices

Creating a correlation matrix for the dependent variable and the original and transformed variations of the independent variable provides us with a pattern that is easier to interpret.

The information that we need is in the first column of the matrix which shows the correlation and significance for the dependent variable and all forms of the independent variable.


### The pattern of correlations for no relationship

The correlation between the two variables is very weak and statistically non-significant. If we viewed this as a hypothesis test for the significance of r, we would conclude that there is no relationship between these variables.

Moreover, none of the significance tests for the correlations with the transformed dependent variable are statistically significant. There is no relationship between these variables; it is not a problem with non-linearity.


### Correlation pattern suggesting transformation

The correlation between the two variables is very weak and statistically non-significant. If we viewed this as a hypothesis test for the significance of r, we would conclude that there is no relationship between these variables.

However, the probability associated with the larger correlation for the square transformation is statistically significant, suggesting that this is a transformation we might want to use in our analysis.


### Transformations

When a relationship is not linear, we can transform one or both variables to achieve a relationship that is linear. Four common transformations to induce linearity are: the logarithmic transformation, the square root transformation, the inverse transformation and the square transformation. All of these transformations produce a new variable that is mathematically equivalent to the original variable, but expressed in different measurement units, e.g. logarithmic units instead of decimal units.
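The four transformations can be written as one-line functions. This is a minimal sketch with hypothetical names and scores; it assumes strictly positive scores and a base-10 logarithm, and it does not show any adjustment (such as adding a constant) that zero or negative scores would need before the log, square root or inverse could be computed.

```python
import math

def log_transform(x):
    """Base-10 logarithm; defined only for x > 0."""
    return math.log10(x)

def sqrt_transform(x):
    """Square root; defined only for x >= 0."""
    return math.sqrt(x)

def inverse_transform(x):
    """Reciprocal; defined only for x != 0."""
    return 1.0 / x

def square_transform(x):
    """Square of the score."""
    return x ** 2

TRANSFORMATIONS = {
    "logarithmic": log_transform,
    "square root": sqrt_transform,
    "inverse": inverse_transform,
    "square": square_transform,
}

# Hypothetical scores, all positive.
scores = [1, 2, 4, 10, 100]
for name, transform in TRANSFORMATIONS.items():
    print(f"{name:12s}", [round(transform(v), 3) for v in scores])
```

Each transformed list carries the same information as the original scores, only on a different scale, which is why it can be substituted for the original variable in the analysis.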


### When transformations do not work

When none of the transformations induces linearity in a relationship, our statistical analysis will underestimate the presence and strength of the relationship, i.e. we lose power. We do have the option of changing the way the information in the variables is represented, e.g. substitute several dichotomous variables for a single metric variable. This bypasses the assumption of linearity while still attempting to incorporate the information about the relationship in the analysis.
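Substituting dichotomous variables for a metric variable might look like the sketch below. The cut points, the category labels and the choice of the lowest category as the reference group are all assumptions made for illustration; the slides do not prescribe them.

```python
from bisect import bisect_left

def to_category(value, cut_points):
    """Index of the category a value falls in; cut points are inclusive upper bounds."""
    return bisect_left(cut_points, value)

def dummy_code(values, cut_points):
    """One row of 0/1 indicators per value, omitting the first (reference) category."""
    n_categories = len(cut_points) + 1
    rows = []
    for value in values:
        category = to_category(value, cut_points)
        rows.append([1 if category == k else 0 for k in range(1, n_categories)])
    return rows

# Hypothetical metric variable and cut points: 0-1 hours, 2-3 hours, 4 or more hours.
tv_hours = [0, 1, 2, 3, 5, 8]
cut_points = [1, 3]

for hours, indicators in zip(tv_hours, dummy_code(tv_hours, cut_points)):
    print(hours, indicators)
```

Each indicator column can then enter the analysis separately, so no single straight-line relationship across the whole range of the original variable is assumed.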


### Strategy for solving problems - 1

Our strategy for determining whether or not a relationship is linear will be based on significance tests for the Pearson r correlation coefficient. If the correlation coefficient between an independent variable and a dependent variable is statistically significant (its probability is less than or equal to a specified level of significance), we will conclude that the relationship is linear.


### Strategy for solving problems - 2

If linearity cannot be supported for the untransformed independent and dependent variables, we will examine the transformations for the variables. If any of the transformations for the independent or dependent variable are statistically significant when the untransformed relationship is not statistically significant, we will conclude that the problem is non-linearity, and can be remedied by substituting the transformed variable in the analysis. If none of the transformations are statistically significant, we will conclude that there is no relationship between the variables.
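The strategy so far can be sketched in code. Several things here are assumptions: the function names are hypothetical, only the independent variable is transformed, the data are made up, and the p-value uses the Fisher z normal approximation rather than the exact t test that a statistics package would report, so values will differ somewhat from SPSS output.

```python
import math
from statistics import NormalDist

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Two-tailed p for H0: rho = 0, Fisher z approximation (not the exact t test)."""
    if abs(r) >= 1.0:
        return 0.0  # perfect correlation; atanh would be undefined
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def evaluate_linearity(x, y, alpha=0.01):
    """Return (conclusion, transformations worth substituting). Requires all x > 0."""
    n = len(x)
    forms = {
        "untransformed": x,
        "logarithmic": [math.log10(v) for v in x],
        "square root": [math.sqrt(v) for v in x],
        "inverse": [1.0 / v for v in x],
        "square": [v ** 2 for v in x],
    }
    p_values = {name: approx_p_value(pearson_r(vals, y), n) for name, vals in forms.items()}
    if p_values["untransformed"] <= alpha:
        return "linear", []
    candidates = [
        name for name, p in p_values.items()
        if name != "untransformed" and p <= alpha
    ]
    if candidates:
        return "non-linear", candidates
    return "no relationship", []

# Hypothetical data in which y falls off as the inverse of x.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
y = [1.0 / v for v in x]

conclusion, candidates = evaluate_linearity(x, y)
print(conclusion, candidates)
```

With these made-up scores the untransformed correlation is not significant at 0.01, while the inverse form is, so the sketch reports non-linearity and names the transformations that could be substituted.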


### Strategy for solving problems - 3

Even when a relationship is linear, the analysis might still be enhanced by the inclusion of a transformed version of the independent variable in the analysis, e.g. including the square of the independent variable in a regression. If the size of the correlation coefficient for a statistically significant transformation is substantially larger than the correlation coefficient for a statistically significant correlation between the untransformed variables, we will suggest that the transformed variable be included in the analysis, as well as the original form of the variables.


### Problem 1

In the dataset GSS2000.sav, is the following statement true, false, or an incorrect application of a statistic? Use 0.01 as the level of significance.

Based on a diagnostic hypothesis test of the correlation coefficient, the relationship between "hours per day watching TV" and "total hours spent on the Internet" is not linear. However, the square transformation of the independent variable "hours per day watching TV" does result in a relationship that is linear.

1. True
2. True with caution
3. False
4. Incorrect application of a statistic


### Creating the scatterplot

The most commonly recommended strategy for evaluating linearity is visual examination of a scatter plot.

To obtain a scatter plot in SPSS, select the Scatter… command from the Graphs menu.


### Selecting the type of scatterplot

First, click on the thumbnail sketch of a simple scatterplot to highlight it.

Second, click on the Define button to specify the variables to be included in the scatterplot.


### Selecting the variables

First, move the dependent variable netime to the Y Axis text box.

Second, move the independent variable tvhours to the X Axis text box.

Third, click on the OK button to complete the specifications for the scatterplot.

If a problem statement mentions a relationship between two variables without clearly indicating which is the independent variable and which is the dependent variable, the first mentioned variable is taken to be the independent variable.


### The scatterplot

The scatterplot is produced in the SPSS output viewer. The points in a scatterplot are considered linear if they form a cigar-shaped elliptical band. The pattern in this scatterplot is not really clear.


### Adding a trendline

To try to determine if the relationship is linear, we can add a trendline to the chart.

To add a trendline to the chart, we need to open the chart for editing. To open the chart for editing, double click on it.


### The scatterplot in the SPSS Chart Editor

The chart that we double clicked on is opened for editing in the SPSS Chart Editor.

To add the trend line, select the Options… command from the Chart menu.


### Requesting the fit line

In the Scatterplot Options dialog box, we click on the Total checkbox in the Fit Line panel in order to request the trend line.

Click on the Fit Options… button to request the r² coefficient of determination as a measure of the strength of the relationship.


### Requesting r²

First, the Linear regression thumbnail sketch should be highlighted as the type of fit line to be added to the chart.

Second, click on the Display R-square in Legend checkbox to add this item to our output.

Third, click on the Continue button to complete the options request.


### Completing the request for the fit line

Click on the OK button to complete the request for the fit line.


### The fit line and r²

The red fit line is added to the chart.

The value of r² (0.0460) suggests that the relationship is weak.


### Changing the shape of the fit line

We can try a trend line with a curved shape to see if it does a better job of fitting the data.

To change the trend line, select the Options… command from the Chart menu.


### Accessing the fit line options

Click on the Fit Options… button to open up the dialog for specifying the characteristics of the fit line.


### Specifying a quadratic fit line

First, click on the Quadratic regression thumbnail in the Fit Method panel. This will fit a trendline that includes a square term in the equation (x²).

Second, click on the Continue button to close the fit line options dialog.


### Completing the request for the fit line

Click on the OK button to complete the request for the fit line.


### The quadratic fit line and r²

The value of r² (0.1591) falls at the top of the weak range, indicating a stronger relationship than the one represented by the linear fit line. This result hints that a squared transformation of the independent variable may be needed.

The red fit line curves to reduce the discrepancies between the line and the data points.
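The comparison of a linear and a quadratic fit line can be reproduced outside the Chart Editor. The sketch below is not the SPSS procedure: it fits least-squares polynomials through the normal equations in plain Python, and the function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * coef = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [rv - factor * cv for rv, cv in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def polynomial_r_squared(xs, ys, degree):
    """R-square of a least-squares polynomial fit (degree 1 = linear, 2 = quadratic)."""
    powers = range(degree + 1)
    # Normal equations (X'X) coef = X'y for the columns 1, x, x^2, ...
    xtx = [[sum(x ** (i + j) for x in xs) for j in powers] for i in powers]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in powers]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * x ** k for k, c in enumerate(coef)) for x in xs]
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_residual / ss_total

# Hypothetical data that rise and then fall.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 4, 6, 8, 8, 7, 6, 3, 1]

r2_linear = polynomial_r_squared(x, y, 1)
r2_quadratic = polynomial_r_squared(x, y, 2)
print(f"linear r-square    = {r2_linear:.4f}")
print(f"quadratic r-square = {r2_quadratic:.4f}")
```

Because the quadratic equation contains the linear one as a special case, its r² can never be lower; what matters is whether the gain is large enough to suggest adding a squared term.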


### Computing the transformations

There are four transformations that we can use to achieve or improve linearity. The compute dialogs for these four transformations for linearity are shown.


### Creating the scatterplot matrix

To create the scatterplot matrix, select the Scatter… command in the Graphs menu.


### Selecting type of scatterplot

First, click on the Matrix thumbnail sketch to indicate which type of scatterplot we want.

Second, click on the Define button to select the variables for the scatterplot.


### Specifications for scatterplot matrix

First, move the dependent variable, the independent variable and all of the transformations to the Matrix Variables list box.

Second, click on the OK button to produce the scatterplot.


### The scatterplot matrix

The scatterplot matrix shows a thumbnail sketch of scatterplots for each independent variable or transformation with the dependent variable. The scatterplot matrix may suggest which transformations might be useful.


### Creating the correlation matrix

To create the correlation matrix, select the Correlate | Bivariate… command in the Analyze menu.


### Specifications for correlation matrix

First, move the dependent variable, the independent variable and all of the transformations to the Variables list box.

Second, click on the OK button to produce the correlation matrix.


### The correlation matrix

The answers to the problems are based on the correlation matrix. Before we answer the question in this problem, we will use a script to produce the output.


### The assumption of linearity script

An SPSS script to produce all of the output that we have produced manually is available on the course web site. After downloading the script, run it to test the assumption of linearity.

Select Run Script… from the Utilities menu.


### Selecting the assumption of linearity script

First, navigate to the folder containing your scripts and highlight the LinearityAssumptionAndTransformations.SBS script.

Second, click on the Run button to activate the script.


### Specifications for linearity script

First, move the dependent variable from the list of variables in the data set.

Second, move the independent variable from the list of variables in the data set.

The default output is transformations of the independent variable. To include transformations of the dependent variable, mark the checkboxes.

Third, click on the OK button to run the script.


### The correlation matrix and the original problem

The output from the script can be used to answer the problem question. The probability associated with the correlation coefficient between the untransformed variables (0.079) is greater than the level of significance (0.01), suggesting either a weak or a non-linear relationship.

The probability associated with the correlation between the dependent variable and the square transformation (0.006) is less than the level of significance. The square transformation results in a relationship that can be treated as linear.

The answer to the problem is true.


### Problem 2

In the dataset GSS2000.sav, is the following statement true, false, or an incorrect application of a statistic? Use 0.01 as the level of significance.

Based on a diagnostic hypothesis test of the correlation coefficient, there is a linear relationship between "number of hours worked in the past week" and "total hours spent on the Internet".

1. True
2. True with caution
3. False
4. Incorrect application of a statistic


### The correlation matrix

The probability associated with the correlation coefficient between "number of hours worked in the past week" and "total hours spent on the Internet" (0.486) is greater than the level of significance. The assumption of linearity is not supported.

The lack of statistical significance for all of the transformations suggests that there is no relationship between "number of hours worked in the past week" and "total hours spent on the Internet", and the lack of relationship is not attributable to non-linearity.

The answer to the problem is false.


### Problem 3

In the dataset GSS2000.sav, is the following statement true, false, or an incorrect application of a statistic? Use 0.01 as the level of significance.

Based on a diagnostic hypothesis test of the correlation coefficient, there is a linear relationship between "highest academic degree" and "occupational prestige score".

1. True
2. True with caution
3. False
4. Incorrect application of a statistic


### The correlation matrix

The probability associated with the correlation coefficient between "highest academic degree" and "occupational prestige score" (<0.001) is less than or equal to the level of significance. The assumption of linearity is supported.

Since highest academic degree is an ordinal level variable, the answer to the problem is true with caution.


### Other problems on assumption of linearity

A problem may ask about the assumption of linearity for a nominal level variable. The answer will be “Incorrect application of a statistic” since linearity does not apply to nominal variables.

A problem may ask about the assumption of linearity for an ordinal level variable. If the variable or transformed variable is linear, the correct answer to the question is “True with caution” since we may be required to defend treating an ordinal variable as metric.

Questions will specify a level of significance to use and the statistical evidence upon which you should base your answer.


### Steps in answering questions about the assumption of linearity - question 1

The following is a guide to the decision process for answering problems about linearity of the relationship:

1. Correlation for untransformed variables statistically significant? If no, the answer is False (not linear). If yes, go to the next step.
2. Either variable ordinal level? If yes, the answer is True with caution (linear). If no, the answer is True (linear).
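The guide for linearity questions can be written as a short function. The function name and argument layout are hypothetical, and the nominal-variable check is taken from the "Other problems" slide rather than from the guide itself. The p-value 0.0005 is only a stand-in for the "<0.001" reported in Problem 3.

```python
def answer_linearity_question(p_untransformed, alpha, levels):
    """Answer 'is the relationship linear?' for a pair of variables.

    levels: measurement level of each variable, "nominal", "ordinal" or "metric".
    """
    if "nominal" in levels:
        return "Incorrect application of a statistic"
    if p_untransformed > alpha:
        return "False"              # correlation not significant: not linear
    if "ordinal" in levels:
        return "True with caution"  # linear, but an ordinal variable is treated as metric
    return "True"

# Problem 2: probability 0.486, both variables treated as metric.
print(answer_linearity_question(0.486, 0.01, ("metric", "metric")))

# Problem 3: probability below 0.001 (0.0005 used as a stand-in), one ordinal variable.
print(answer_linearity_question(0.0005, 0.01, ("ordinal", "metric")))
```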

### Steps in answering questions about the assumption of linearity - question 2

The following is a guide to the decision process for answering problems about the applicability of a transformation:

1. Correlation for untransformed variables statistically significant? If yes, the answer is False. If no, go to the next step.
2. Correlation with transformed variable statistically significant? If no, the answer is False. If yes, go to the next step.
3. Either variable ordinal level? If yes, the answer is True with caution. If no, the answer is True.
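A sketch of the transformation guide as a function. The order of the checks is a reading of the strategy slides and of the answer to Problem 1 (a transformation applies only when the untransformed correlation is not significant and the transformed one is); the function name and arguments are hypothetical.

```python
def answer_transformation_question(p_untransformed, p_transformed, alpha, any_ordinal=False):
    """Answer 'the relationship is not linear, but this transformation makes it linear'."""
    if p_untransformed <= alpha:
        return "False"   # the untransformed relationship is already linear
    if p_transformed > alpha:
        return "False"   # the transformation does not produce a significant correlation
    return "True with caution" if any_ordinal else "True"

# Problem 1: probabilities 0.079 (untransformed) and 0.006 (square transformation).
print(answer_transformation_question(0.079, 0.006, 0.01))
```

With the Problem 1 probabilities this returns "True", matching the answer given for that problem.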

«Assumption of linearity»
http://900igr.net/prezentacija/bez_uroka/assumption-of-linearity-224494.html