32931 Technology Research Methods, Autumn 2017
Quantitative Research Component
Topic 3: Comparing between groups

Lecturer: Mahrita Harahap
[email protected]
B MathFin (Hons) M Stat (UNSW) PhD (UTS)
mahritaharahap.wordpress.com/teaching-areas
Faculty of Engineering and Information Technology
UTS CRICOS PROVIDER CODE: 00099F

Last Week (Week 2): Hypothesis Testing Process

Hypothesis tests give us an objective way of assessing such questions. They are based on a proof-by-contradiction form of argument.

H: We formulate a null hypothesis (H0) and an alternative hypothesis (H1), and determine whether it is a 1-tailed or 2-tailed test.
A: State the assumptions of the test and its level of significance.
T: We calculate a test statistic, which measures the compatibility of the sample obtained with H0, assuming H0 is true.
P: Find its associated p-value, which represents the probability of observing a sample at least as extreme as this one, assuming H0 is true.
C: Weigh up the conclusion based on the p-value. If p-value ≤ 0.05, we reject H0. If p-value > 0.05, we do not reject H0. State the conclusion in context so that people who don't understand statistics can still understand your conclusions.

Last Week: Parametric Tests

One group (1-Sample tests)
  - mean: 1-Sample T
  - proportion: 1-Sample Proportion
Two groups (2-Sample tests)
  - Data are paired: Paired T
  - Data are not paired: 2-Sample T
  - 2 categorical: Chi-Square Independence Test
Three or more groups (K-Sample tests)
  - Comparing means: Analysis of Variance
  - Comparing proportions: Chi-Square Goodness of Fit Test

Last Week: Nonparametric Tests

One group (1-Sample tests)
  - mean: 1-Sample Wilcoxon
Two groups (2-Sample tests)
  - Data are paired: Wilcoxon test on difference
  - Data are not paired: Mann-Whitney Test
Three or more groups (K-Sample tests)
  - Comparing means: Kruskal-Wallis Test

Last Week: Normality Test

We can test normality assumptions formally using the Kolmogorov-Smirnov test (KS for short).
The hypotheses for this test are
H0: The data are normally distributed
H1: The data are not normally distributed
In R we can use the lillie.test function from the nortest package.
In SPSS we find this test under Analyze > Nonparametric Tests > 1-Sample KS.
If we have more than one group, we could test each sample individually.
If p-value ≤ 0.05, we reject H0 and conclude that the data are not normally distributed.
If p-value > 0.05, we do not reject H0: there is no significant evidence against normality, so we proceed as if the data are normally distributed.
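To make the idea behind the KS normality test concrete, here is a minimal Python sketch of the test statistic only: the largest gap between the empirical distribution of the sample and a normal distribution fitted with the sample mean and standard deviation. The function name and the data are made up for illustration. Turning the statistic into a p-value needs special tables, which is what tools such as lillie.test or SPSS take care of.

```python
from statistics import NormalDist, mean, stdev


def ks_normal_statistic(sample):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    # Normal distribution using the sample's own mean and standard deviation.
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Illustrative (made-up) pulse rates, not the lecture data.
pulses = [62, 64, 66, 68, 70, 72, 74, 76, 78, 96]
print(round(ks_normal_statistic(pulses), 3))
```

A larger statistic means the sample sits further from the fitted normal curve; the test rejects H0 when the statistic is too large for the sample size.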

This Week

Compare between 2 means
  - Paired t-test
  - 2-sample t-test
Compare between 3 means or more
  - ANOVA test and multiple comparisons test
Nonparametric tests (if assumptions do not apply)
  - Wilcoxon test on difference
  - Mann-Whitney test
  - Kruskal-Wallis test
(We won't cover the nonparametric section in the lecture, but it is in the appendix and there are examples in the lab sheet.)

Comparing Two Groups

It is often of interest to compare two populations or two groups.
We can do this graphically by comparing two boxplots.
To compare the two populations formally we can use a 2-sample test.
The choice of test depends on:
  - whether the data are paired or not;
  - if the data are not paired, whether we can assume that the two populations have the same variance.

Week 3

What are Paired Data?

If we can match one observation in the sample from one population to one observation from the other population, then the data are said to be paired.
We usually have some sort of matching variable, such as a plot of land, a person, the same machine, or a particular specimen.
If the data are paired then the sample sizes of the two groups must be the same.
This doesn't mean that the data are paired just because the sample sizes are the same: you need to identify the matching variable!

Analysing Paired Data

Computing the differences is optional! Think about the data and the question and decide if you want to compute the differences or leave the data as is.
Since we have a 1-1 relationship in paired data, it is sensible to subtract the observed value in one group from the observed value in the other group for that unit, e.g. length of left leg minus length of right leg.
If you can't figure out how to pair the data, then it probably isn't paired.
The mean of these differences can then be treated like the mean from a single population. That is, a paired t-test is like a 1-sample t-test, but conducted on the differences.
This gives the test increased power over an unpaired test.

Example: Pulse Data

In the lecture last week, we considered a data set based on the pulses of people who either ran for one minute, or who rested for one minute.
We can use inferential techniques to test some of the theories that we may have made from our exploratory data analysis.

Pulse 1   Pulse 2   Ran   Smokes   Sex   Height   Weight   Activity
64.0      88.0      1     0        1     1.68     64.0     2
58.0      70.0      1     0        1     1.83     66.0     2
:         :         :     :        :     :        :        :
76.0      76.0      0     0        2     1.57     49.0     2

Pulse 1: first pulse measurement
Pulse 2: second pulse measurement
Ran: 1 = Yes, 0 = No
Smokes: 1 = Yes, 0 = No
Sex: 1 = Male, 2 = Female
Height: in m
Weight: in kg
Activity: 1 = slight, 2 = moderate, 3 = high

Example: Paired T test on pulse data

Suppose that I want to compare the first and second pulse rates for those participants who ran, to see whether they were different or not.
Each participant has one observation from the first pulse rate and another from the second pulse rate. Therefore the data are paired.

Step 1: Set up the hypotheses
H0: $\mu_1 = \mu_2$
H1: $\mu_1 \neq \mu_2$
Alternatively, you can compute the differences and test
H0: $\mu_{diff} = 0$
H1: $\mu_{diff} \neq 0$
using a 1-sample T test.
Step 2: Choose an appropriate test
Paired-T (remember to check assumptions).
Step 3: Execute the test in SPSS and obtain a p-value
Use Analyze > Compare Means > Paired samples T test (filter out those who did not run).

Paired T test - Example

Step 4: Make a decision
P-value = 0.000 < 0.05 (level of significance). SPSS displays p-values to three decimal places, so 0.000 means p < 0.0005.
Therefore we will reject H0.
Step 5: State the conclusion in context
Therefore there is a significant difference between the initial pulse rate and the final pulse rate for those participants who ran.

When Data is not Paired

Suppose that we do not have paired data.
Then taking differences doesn't make sense, so we cannot perform a test on $\mu_{diff}$.

Instead, we look only at the difference between the means.
We have two different test statistics: one for when the variances of the two groups are the same, and one for when they are different. Later we will look at how to decide which to use.

2-Sample T test

The 2-sample T test determines whether the means of two unrelated populations are the same or not.
In general, the hypotheses are
H0: $\mu_1 = \mu_2$
H1: $\mu_1 \neq \mu_2$ (or > or <)
where the subscripts correspond to the two groups.
The first decision we need to make is whether or not we should use a pooled variance, that is, assume that the variances of the two groups are equal and obtain a more powerful test.

2-Sample T Test

SPSS automatically conducts a test for equal variances when you run a 2-sample T test. This test is called Levene's test.
Levene's test has hypotheses
H0: $\sigma_1^2 = \sigma_2^2$
H1: $\sigma_1^2 \neq \sigma_2^2$
If we reject the null hypothesis, we cannot assume equal variances and need to use the p-value associated with "Equal variances not assumed".
If we do not reject the null hypothesis, we can assume equal variances and use the p-value associated with "Equal variances assumed".

Example: 2-Sample T test

Now suppose that I wish to test whether there is a significant difference in the final pulse rates between the males and females who ran.
In this case, the participants in one group will be males and the other females. Therefore the data won't be paired, and we should use a 2-Sample T test because these 2 samples are 2 independent groups.

Step 1: Set up the hypotheses
H0: $\mu_M = \mu_F$
H1: $\mu_M \neq \mu_F$
Step 2: Choose an appropriate test
2-Sample T
Step 3: Execute the test in SPSS and obtain a p-value
Use Analyze > Compare Means > Independent Samples T test (filter out those who did not run).

2-Sample T test - Example

Step 3.5: Levene's test for equal variances
H0: $\sigma_1^2 = \sigma_2^2$
H1: $\sigma_1^2 \neq \sigma_2^2$
P-value = 0.946 > 0.05, so we do not reject H0. Therefore we assume equal variances and use the p-value associated with "Equal variances assumed".
Step 4: Make a decision
P-value = 0.000 < 0.05 (level of significance)
Therefore we will reject H0.
Step 5: State the conclusion in context
Therefore there is a significant difference between the final pulse rates of the male and female participants who ran.
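The paired and 2-sample tests differ mainly in how the test statistic is built. The sketch below, in Python with only the standard library, computes the three t statistics discussed so far using their textbook formulas. The function names and all numbers are made up for illustration and are not the lecture's pulse data. SPSS additionally converts each statistic and its degrees of freedom into a p-value, which is not done here.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """t statistic for H0: mean difference = 0, with n - 1 degrees of freedom."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def pooled_t(x, y):
    """2-sample t statistic assuming equal variances (pooled variance)."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2


def welch_t(x, y):
    """2-sample t statistic without assuming equal variances."""
    a, b = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(a + b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (len(x) - 1) + b ** 2 / (len(y) - 1))
    return t, df


# Made-up values for illustration only.
pulse1 = [64, 58, 62, 66, 70]            # first pulse, same five people
pulse2 = [88, 70, 76, 90, 94]            # second pulse, same five people
males = [92, 88, 84, 90, 86]             # final pulse, independent groups
females = [104, 96, 110, 100, 108]

print(paired_t(pulse1, pulse2))
print(pooled_t(males, females))
print(welch_t(males, females))
```

The pooled version corresponds to the "Equal variances assumed" row in SPSS and the Welch version to "Equal variances not assumed". When both groups have the same size the two t statistics coincide and only the degrees of freedom differ.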

Comparing 2 means

[Flowchart repeated from last week's parametric tests. For two groups: paired data use the Paired T test, and data that are not paired use the 2-Sample T test.]

Analysis of Variance (ANOVA)

Next we look at testing for a difference in means across more than 2 groups using a procedure called ANOVA.
ANOVA is a generalisation of the two-sample t-test for more than two samples.
In general, if we have k groups then the hypotheses will be
H0: $\mu_1 = \mu_2 = \dots = \mu_k$
H1: at least one $\mu_i \neq \mu_j$

Assumptions for ANOVA

In order to run an ANOVA on your data, the following assumptions must be met:
1. Each group has normally distributed observations.
2. The variances for each group are similar (Levene's test). Variance = (Standard Deviation)^2
3. The differences between observations and the corresponding group means (called residuals) are normally distributed.

ANOVA

Why is it called ANOVA when we are comparing means?

We are comparing the variation between groups to the variation within each group.
If the variation between groups is large compared to the variation within groups, then we can conclude that the groups have different means.
[Figure: between-group variation versus within-group variation]

ANOVA: Behind the scenes

Variability is measured as the sum of squared deviations from the mean (sum of squares).
If all the group means happen to be exactly the same, the variability between groups (SSG) would be zero.
If the final pulse rates were always identical within each sex, then the variability within groups (SSE) would be zero.
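The two extreme cases just described can be checked numerically. This is a minimal Python sketch with made-up numbers: it splits the total sum of squares into a between-group part (SSG) and a within-group part (SSE), using the standard definitions of those quantities. The function name is hypothetical.

```python
from statistics import mean


def sums_of_squares(groups):
    """Return (SSG, SSE, SST) for a list of groups of observations."""
    all_data = [x for g in groups for x in g]
    grand = mean(all_data)                      # overall mean of all the data
    ssg = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    sst = sum((x - grand) ** 2 for x in all_data)
    return ssg, sse, sst


# Made-up data for illustration only.
same_means = [[8, 10, 12], [9, 10, 11]]     # both group means are 10
no_spread = [[10, 10, 10], [20, 20, 20]]    # identical values within each group

print(sums_of_squares(same_means))   # SSG is zero: all variation is within groups
print(sums_of_squares(no_spread))    # SSE is zero: all variation is between groups
```

In both cases the total splits exactly into the two parts, SST = SSG + SSE, which is the decomposition ANOVA relies on.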

The three sums of squares are:
  - SST: deviations of the data from the overall mean.
  - SSG: deviations of the group means from the overall mean.
  - SSE: deviations of the data from their group mean.

$$SST = \sum (x - \bar{x})^2$$
$$SSG = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2 = n_1(\bar{x}_1 - \bar{x})^2 + \dots + n_k(\bar{x}_k - \bar{x})^2$$
$$SSE = \sum (x - \bar{x}_i)^2$$

where k = number of groups, n = sample size ($n_i$ = size of group i), $\bar{x}$ = overall mean (i.e. mean of all the data) and $\bar{x}_i$ = mean of group i.

Example: ANOVA

Suppose that we consider the pulse data again, and want to test whether there is a significant difference in the changes in pulse rates between the levels of activity (slight, moderate, high) for those participants who ran.
First we need to calculate the change in pulse rate:
Change in pulse rate = Pulse2 - Pulse1

Step 1: Set up the hypotheses
H0: $\mu_{slight} = \mu_{moderate} = \mu_{high}$
H1: at least one $\mu_i$ is not equal to the others
Step 2: Choose an appropriate test
Analysis of Variance
Step 3: Execute the test in SPSS and obtain a p-value
Use Analyze > Compare Means > One-Way ANOVA (filter out those who did not run).

Example: Pulse Data

[The pulse data table and coding key shown earlier are repeated on this slide.]

Testing assumptions before running ANOVA

1. To test for normality of groups, we run a 1-sample KS test on each group.
2. To test for equality of variances in SPSS, we choose "Homogeneity of variance test" in the options section of One-Way ANOVA. This runs Levene's test (the setup is the same as for a 2-sample T test).
3. To test for normality of the residuals, we need to run the model as a Generalised Linear Model (GLM) and ask for residuals, then run a 1-Sample KS test on the residuals.

Analysis of Variance - Example

[SPSS ANOVA table, annotated as follows]
Source           Sum of Squares   df      Mean Square           F
Between Groups   SSG              k - 1   MSG = SSG / (k - 1)   F = MSG / MSE
Within Groups    SSE              n - k   MSE = SSE / (n - k)
Total            SST              n - 1

Step 4: Make a decision
F = MSG / MSE
P-value = 0.001 < 0.05 (level of significance)
Therefore we reject H0.
Step 5: State the conclusion in context
Therefore there is a significant difference in the changes of pulse rates between the levels of activity of those participants who ran.

Multiple Comparisons Tests

Analysis of Variance only considers whether or not there are differences between the means across the groups. It does not find where those differences are.
Multiple comparisons tests test differences between pairs of groups.
These can be performed using the Post-Hoc option in the ANOVA dialog in SPSS.
Which type of test you run depends on which comparisons you are interested in. If you are interested in all pairs, then Tukey's paired comparisons test is a good place to start.

But why can't we just run lots of t-tests?

If we run lots of t-tests, then the chance that we make an error in at least one of those tests is much larger than 0.05. This is called the family error rate.
So how are these different?
Multiple comparisons tests fit what are called simultaneous confidence intervals, so there is a 95% probability that all of the intervals contain their true values at the same time.
In general, the confidence intervals will be wider than those obtained using t-tests.

Example: Multiple Comparisons Tests

We used ANOVA to determine whether the changes in pulse rates differed between participants with different levels of activity. Now we would like to see which groups differed.
We will obtain the ANOVA table as well as the following multiple comparisons table.

SPSS gives 95% confidence intervals for the difference between each pair of groups.
If 0 lies in the confidence interval, then we would conclude that the means of the two groups are not significantly different.
If both bounds are positive, or both bounds are negative (i.e. 0 does not lie in the confidence interval), then we would say that the groups have different means.

We see that there is a significant difference between Moderate and High levels of activity, but no other differences.
The change in pulse rate for those with a slight level of activity is not significantly different to the other two groups.
This is clearly visualised with a multiple boxplot.
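To see why running many separate t-tests inflates the family error rate, here is a small Python sketch. It assumes, as a simplification that is not exactly true for pairwise comparisons on the same data, that the individual tests are independent. It also encodes the rule for reading a confidence interval for a difference. The function names and the interval bounds are hypothetical, not SPSS output.

```python
from math import comb


def family_error_rate(k_groups, alpha=0.05):
    """Chance of at least one false rejection across all pairwise tests,
    assuming (as a simplification) that the tests are independent."""
    m = comb(k_groups, 2)              # number of pairwise comparisons
    return m, 1 - (1 - alpha) ** m


def differs(ci_lower, ci_upper):
    """True when 0 lies outside the confidence interval for a difference."""
    return ci_lower > 0 or ci_upper < 0


for k in (3, 5, 10):
    m, rate = family_error_rate(k)
    print(k, "groups:", m, "pairwise tests, family error rate about", round(rate, 3))

# Hypothetical interval bounds for illustration only.
print(differs(-2.1, 9.4))    # 0 is inside the interval
print(differs(1.3, 12.0))    # both bounds are positive
```

Under this assumption, three groups already give three pairwise tests and a family error rate of about 0.14 rather than 0.05, and the rate keeps growing with the number of groups.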

WEEK 3 APPENDICES: Non-Parametric Tests

Non-Parametric Tests

For each of the parametric tests introduced there is an equivalent non-parametric test that does the same job when the assumptions of the parametric test fail.
Non-parametric tests are generally less powerful than parametric tests.
  - This means that if H0 is in fact false (and there is a difference between groups), then a non-parametric test is less likely to pick up the difference.
  - This means that we want to use a parametric test wherever possible.
Non-parametric tests test the population median, not the population mean.
  - This makes non-parametric tests good for very skewed data.
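A short Python illustration of why a median-based test suits skewed data: one extreme value moves the mean a long way but leaves the median where it was. The numbers are made up.

```python
from statistics import mean, median

# Made-up values for illustration only.
typical = [60, 62, 64, 66, 68]
skewed = typical[:-1] + [150]    # replace the largest value with an extreme one

print("typical:", mean(typical), median(typical))
print("skewed: ", mean(skewed), median(skewed))
```

Here the median stays at 64 in both samples, while the mean moves from 64 to 80.4 because of the single extreme value.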

Types of Non-Parametric tests

We can compare the non-parametric tests to their parametric counterparts.

Parametric Test                                   Non-parametric Test
1-Sample Z or 1-Sample t                          1-Sample Wilcoxon test
  H0: $\mu$ = value                               H0: median = value
  H1: $\mu \neq$ value                            H1: median $\neq$ value
Paired t                                          1-Sample Wilcoxon test on differences
  H0: $\mu_1 = \mu_2$ or $\mu_D = 0$              H0: median difference = value
  H1: $\mu_1 \neq \mu_2$ or $\mu_D \neq 0$        H1: median difference $\neq$ value
2-Sample t                                        Mann-Whitney test
  H0: $\mu_1 = \mu_2$                             H0: medians are equal
  H1: $\mu_1 \neq \mu_2$                          H1: medians are not equal
ANOVA                                             Kruskal-Wallis test
  H0: $\mu_1 = \mu_2 = \dots = \mu_k$             H0: medians are equal
  H1: not all $\mu_i$ are equal                   H1: not all medians are equal

Example: Mann-Whitney Test

Suppose that we repeat the analysis comparing males and females who ran, but this time use a non-parametric test.
The appropriate test will be a Mann-Whitney test.
The procedure is the same; just the hypotheses change.

Step 1: Set up the hypotheses
H0: $median_M = median_F$
H1: $median_M \neq median_F$
Step 2: Choose an appropriate test
Mann-Whitney
Step 3: Execute the test in SPSS and obtain a p-value
Use Analyze > Non-Parametric Tests > 2 Independent Samples (filter out those who did not run).

Mann-Whitney Test - Example

Step 4: Make a decision
P-value = 0.000 < 0.05 (level of significance)
Therefore we will reject H0.
Step 5: State the conclusion in context
Therefore there is a significant difference between the final pulse rates of the male and female participants who ran.

Example: Kruskal-Wallis Test

We can also repeat the activity level analysis using a nonparametric test.
First we need to calculate the change in pulse rate:
Change in pulse rate = Pulse2 - Pulse1

Step 1: Set up the hypotheses
H0: $median_1 = median_2 = median_3$
H1: not all of the medians are equal
Step 2: Choose an appropriate test
Kruskal-Wallis
Step 3: Execute the test in SPSS and obtain a p-value
Use Analyze > Nonparametric Tests > K Independent Samples (filter out those who did not run).

Kruskal-Wallis Test - Example

Step 4: Make a decision
P-value = 0.002 < 0.05 (level of significance)
Therefore we will reject H0.
Step 5: State the conclusion in context
Therefore there is a significant difference in the changes of pulse rates between the levels of activity of those participants who ran.
In this case, we reached the same conclusion as with the parametric test, but the observed p-value was larger.
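Non-parametric tests such as Mann-Whitney work on the ranks of the observations rather than their values. As a rough sketch of what happens behind the SPSS dialog, the Python code below ranks the combined data and computes the Mann-Whitney U statistics. The function names and the data are made up for illustration, and the p-value calculation is left to SPSS.

```python
def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y); U_x + U_y always equals len(x) * len(y)."""
    ranks = midranks(list(x) + list(y))
    r_x = sum(ranks[:len(x)])                   # rank sum of the first group
    u_x = r_x - len(x) * (len(x) + 1) / 2
    return u_x, len(x) * len(y) - u_x


# Made-up final pulse rates for illustration only.
males = [84, 88, 90, 92]
females = [96, 100, 104, 108, 110]
print(mann_whitney_u(males, females))
```

U_x counts the pairs in which a value from the first group exceeds a value from the second group, with ties counting one half. A U close to 0 or close to len(x) * len(y) indicates that one group tends to have larger values than the other.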