# Analyzing Health Care Dataset (Statistics Project Sample)

Nonparametric T-Test

INTRODUCTION

Most of your focus in this course thus far has been on health care situations where it is reasonable to assume that the data you are analyzing is normally distributed. What happens if you find yourself in a situation where you cannot make that assumption about the data? This may happen with interval or ratio data if your sample size is small (fewer than 30) or the groups are skewed in opposite directions. You also may not be able to assume a normal distribution if the data is measured at the ordinal level, which is less precise than interval or ratio data. If you encounter either of these scenarios, you may need to consider using nonparametric tests.

Fortunately, nonparametric tests are very flexible because they are distribution free! So why not use nonparametric tests all the time? The reason has to do with power. Like a powerful microscope that can magnify tiny differences in small objects, the parametric tests can identify significant differences in small increments of data. Like a toy microscope, nonparametric tests are great for examining bigger objects, but they do not work well on small objects.

To decide which statistical test to use for the various dependent variables to be analyzed, one must first know more about the data type (measurement level) within those variables.

Assignment 1: Analyzing Health Care Dataset

Student’s Name

Institutional Affiliation

Date

Assignment 1: Analyzing Health Care Dataset

Part One: Yoga and Stress Study Statistical Tests

1 The dependent variable in the study is the Psychological Stress Score (POST_PSS). The variable is measured on a ratio scale: it has all the characteristics of an interval scale, plus a true zero point.

2 Pre-Evaluation of the Data for Outliers

Variable Age

Looking at the above boxplot, there is no instance of a circle or a *. This means that there are no potential outliers in the variable Age.

Variable PRE_PSS

Also, looking at the above boxplot, there is no instance of a circle or a *. This means that there are no potential outliers in the variable PRE_PSS.

Variable POST_PSS

From the above boxplot, there is an instance of a circle. This means that there is a potential outlier in the variable POST_PSS.
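The circle and * markers read off the boxplots can also be reproduced numerically. The sketch below assumes the conventional boxplot rule (a circle for a value more than 1.5 box lengths outside the box, a * for more than 3 box lengths) and uses invented scores, not the study data. The quartile method is also an assumption; SPSS may compute the box edges slightly differently.

```python
from statistics import quantiles

def boxplot_flags(values):
    """Split values into mild outliers (circle) and extreme outliers (*)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # the "box length"
    mild, extreme = [], []
    for v in values:
        distance = max(q1 - v, v - q3)  # distance outside the box (<= 0 if inside)
        if distance > 3 * iqr:
            extreme.append(v)
        elif distance > 1.5 * iqr:
            mild.append(v)
    return mild, extreme

# Hypothetical stress scores, invented for illustration only.
scores = [12, 13, 14, 14, 15, 15, 16, 16, 17, 18, 19, 24]
mild, extreme = boxplot_flags(scores)
print("circle (mild) outliers:", mild)
print("* (extreme) outliers:", extreme)
```

A value flagged here is only a potential outlier; whether it matters is a separate check.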

Checking for Normal Distribution of The Dependent Variables

For POST_PSS, the significance value is lower than the alpha value (using .05 as the alpha value), indicating that there is reason to believe that the POST_PSS data does not follow a normal distribution.

3 The next step is to check whether the outlier found in the POST_PSS variable affects the data. In the Descriptive Statistics table, we compare the 5% trimmed mean with the mean. The difference between the two is small, which suggests that the outlier is unlikely to distort any further analyses we conduct, e.g., regression or correlation tests.
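The comparison of the mean with the 5% trimmed mean can be sketched as follows. This is a simplified version that drops a whole number of cases from each tail (one case per tail when n = 20); statistical packages may weight the boundary cases differently. The scores are invented for illustration and are not the study data.

```python
from statistics import mean

def trimmed_mean(values, proportion=0.05):
    """Mean after dropping `proportion` of the cases from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # cases cut per tail, rounded down
    return mean(ordered[k:len(ordered) - k])

# Hypothetical scores (n = 20) with one high value, invented for illustration.
scores = [8, 10, 11, 12, 13, 13, 14, 15, 15, 16,
          16, 17, 17, 18, 18, 19, 20, 21, 22, 34]
plain = mean(scores)
trimmed = trimmed_mean(scores)
print(f"mean = {plain:.2f}, 5% trimmed mean = {trimmed:.2f}, "
      f"difference = {plain - trimmed:.2f}")
```

A small difference between the two values suggests that the extreme cases are not pulling the mean far from the bulk of the data.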

Descriptive Statistics of the Demographic Data

The sample has 20 participants. The mean age of the participants is 39.45 years (Min = 18, Max = 60, SD = 12.64).

Chi-Square Test of Association (Variable Gender and Race)

Pearson's chi-square test is a statistical test used to determine how likely it is that an observed difference between two sets of categorical data arose by chance (Heavey, 2014).

H0: There is no association between the variables Gender and Race.

H1: There is an association between the variables Gender and Race.

SPSS Output

The Pearson Chi-Square Asymp. Sig. < 0.05

We reject H0.

In conclusion, there is a statistically significant relationship between the two categorical variables (Race and Gender).
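The decision rule above rests on the Pearson chi-square statistic, which compares each observed count with the count expected if the two variables were unrelated. A minimal sketch with a hypothetical 2 x 3 table (invented counts, not the SPSS output) is shown below. The p-value line uses a closed form that holds only for 2 degrees of freedom; other table sizes need a chi-square table or a statistics library. With expected counts this small, the chi-square approximation is rough.

```python
from math import exp

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under H0
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows are two gender groups, columns are three race groups.
observed = [[8, 1, 1],
            [2, 4, 4]]
statistic, df = pearson_chi_square(observed)

# For df = 2 only, the upper-tail probability is exp(-x / 2).
p_value = exp(-statistic / 2)
print(f"chi-square({df}) = {statistic:.2f}, p = {p_value:.3f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```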

4 Comparing Pre-Test and Post-Test Scores

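The t-test is the small-sample counterpart of the z-test, which is used with large samples (n > 30). A sample of size n < 30 is considered a small sample (Heavey, 2014). Because this study has a small sample (n = 20) and the same participants were measured before and after the intervention, a paired-samples t-test is used. The t-test assumes that the paired differences are approximately normally distributed, so the departure from normality found in the POST_PSS variable means the result should be interpreted with caution.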

H0: There is no statistically significant difference between the PRE_PSS and POST_PSS score means.

H1: There is a statistically significant difference between the PRE_PSS and POST_PSS score means.

Before the yoga intervention, the mean stress score of participants was 20.35 (SD = 7.02). After the intervention, the mean stress score fell to 16.15 (SD = 6.49).

A t-test of the mean difference between PRE_PSS and POST_PSS (t(19) = 2.68, p = .015) shows a statistically significant difference in means.

The results indicate that the yoga intervention reduced the participants' stress levels.
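The paired t statistic reported above is the mean of the pre-minus-post differences divided by its standard error. The sketch below shows the calculation on ten invented pairs (not the study data), so its output will not match the reported t(19) = 2.68. The critical value 2.262 is the standard two-tailed table value for 9 degrees of freedom at alpha = .05.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1

# Hypothetical pre- and post-intervention stress scores for 10 participants.
pre = [22, 18, 25, 30, 16, 21, 27, 19, 24, 20]
post = [18, 17, 20, 24, 17, 16, 22, 18, 19, 17]

t, df = paired_t(pre, post)
T_CRITICAL_DF9 = 2.262  # two-tailed, alpha = .05, from a standard t table
print(f"t({df}) = {t:.2f}")
print("Reject H0" if abs(t) > T_CRITICAL_DF9 else "Fail to reject H0")
```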

Part Two: Interpretive Report

1 Clinical Implications

Yoga should be considered as a supplementary therapy or an alternative to medical therapy in the treatment of anxiety, depression, stress, and other mood disorders, as it has been found to improve body image and self-confidence and create a higher sense of well-being.

2 Limitation of the Study

A drawback of research with small sample sizes is that it can be difficult to evaluate whether a given outcome is true. In some situations a type II error can occur, in which the null hypothesis is wrongly retained and no difference between research groups is reported even though one exists (Heavey, 2014).

Discussion One

1 Research Project Scenarios to Use Nonparametric Statistical Tests

Mann-Whitney U Test

When the assumptions needed to carry out an independent-samples t-test are not fulfilled, the Mann-Whitney U test can act as a nonparametric substitute for comparing two independent groups (MacFarland & Yates, 2016). It compares the ranks of the variable values rather than their averages, so it does not require scale data; the only requirement for using ranks is that the data be measured on at least an ordinal scale. The ultimate goal of the Mann-Whitney U test, like that of the independent-samples t-test, is to find statistically significant evidence that the two groups differ (MacFarland & Yates, 2016).

Mann-Whitney U Test Project Scenario

Cholesterol (a form of fat) concentration in the blood is linked to the chance of getting heart disease, with higher concentrations indicating a higher risk and lower concentrations indicating a reduced risk. An individual's chance of having heart disease can be minimized by lowering cholesterol levels in the blood. The cholesterol concentration in a person's blood rises when they are overweight and/or physically unfit. Weight loss and exercise can both help to lower the levels of cholesterol in the blood. However, it is unclear whether weight loss or exercise is the more effective way to lower cholesterol levels. As a result, researchers chose to examine whether an exercise or a weight loss intervention would be more helpful in lowering levels of cholesterol in the participants' blood. Toward this goal, the researchers collected a random sample of overweight, inactive men. The participants were then divided into two groups: Group 1 followed a calorie-controlled diet (the 'diet' group) and Group 2 followed an activity-training program (the 'exercise' group). Cholesterol concentrations were compared between the two groups at the end of the research program using the Mann-Whitney U test to determine which treatment plan was more effective.
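The comparison in this scenario can be sketched as follows. The cholesterol values are invented for illustration. The p-value uses the large-sample normal approximation without tie or continuity correction, which is an assumption of this sketch; for groups this small, statistical software would normally report an exact p-value.

```python
from math import erfc, sqrt

def average_rank(v, values):
    """1-based rank of v within values; tied values share the mean rank."""
    less = sum(x < v for x in values)
    equal = sum(x == v for x in values)
    return less + (equal + 1) / 2

def mann_whitney_u(group1, group2):
    """Return (U, approximate two-sided p-value) for two independent groups."""
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    rank_sum1 = sum(average_rank(v, pooled) for v in group1)
    u1 = rank_sum1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)  # report the smaller of the two U values
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, erfc(abs(z) / sqrt(2))  # two-sided normal tail probability

# Hypothetical end-of-study cholesterol concentrations (mmol/L).
diet = [5.8, 6.1, 5.5, 6.4, 5.9, 6.0, 6.3, 5.7]
exercise = [5.2, 5.6, 5.0, 5.4, 6.2, 5.1, 5.3, 4.9]

u, p = mann_whitney_u(diet, exercise)
print(f"U = {u:.1f}, approximate p = {p:.4f}")
```

Because the test works on ranks, a single unusually high or low reading changes the result far less than it would change a comparison of means.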

Wilcoxon Signed-Rank

The Wilcoxon signed-rank test is a nonparametric statistical method used to determine whether two matched, related, or repeated measurements on a sample have different population mean ranks (Rosner, Glynn, & Lee, 2006). It is preferred to the paired Student's t-test when the differences between the paired measurements cannot be assumed to be normally distributed. The Wilcoxon signed-rank test is well suited to determining whether two dependent samples were drawn from populations with the same distribution (Rosner, Glynn, & Lee, 2006).
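A minimal sketch of the signed-rank calculation is given below, using invented before-and-after scores. Zero differences are dropped, the absolute differences are ranked, and the ranks are summed separately for positive and negative differences. The p-value is a normal approximation without tie or continuity correction, which is an assumption of this sketch rather than what any particular package reports.

```python
from math import erfc, sqrt

def average_rank(v, values):
    """1-based rank of v within values; tied values share the mean rank."""
    less = sum(x < v for x in values)
    equal = sum(x == v for x in values)
    return less + (equal + 1) / 2

def wilcoxon_signed_rank(before, after):
    """Return (W, approximate two-sided p-value) for paired measurements."""
    diffs = [b - a for b, a in zip(before, after) if b != a]  # drop zero differences
    magnitudes = [abs(d) for d in diffs]
    w_plus = sum(average_rank(abs(d), magnitudes) for d in diffs if d > 0)
    w_minus = sum(average_rank(abs(d), magnitudes) for d in diffs if d < 0)
    w = min(w_plus, w_minus)
    n = len(diffs)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu) / sigma
    return w, erfc(abs(z) / sqrt(2))  # two-sided normal tail probability

# Hypothetical scores for 10 people measured before and after an intervention.
before = [22, 18, 25, 30, 16, 21, 27, 19, 24, 20]
after = [18, 17, 20, 24, 17, 16, 22, 18, 19, 17]

w, p = wilcoxon_signed_rank(before, after)
print(f"W = {w:.1f}, approximate p = {p:.4f}")
```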

Wilcoxon Signed-Rank Project Scenario

Pain researchers are interested in discovering non-drug approaches for reducing lower back pain in people. Ac
