Pages:

8 pages/≈2200 words

Sources:

Level:

Harvard

Subject:

Accounting, Finance, SPSS

Type:

Statistics Project

Language:

English (U.S.)

Document:

MS Word

Date:

Total cost:

$ 37.44

Topic:

# SPSS Analysis (Statistics Project Sample)

Instructions:

Content:

[Title]

By Name

Course

Instructor

Institution

Location

Date

Question 1

Model 1 shows a significant linear relationship between treatment cost and indicator A, as the significance column of the coefficients table shows. A significance value of 0.000 (that is, p < 0.001) means that the relationship is highly significant. By comparison, the significance value for indicator B is 0.082, which is above the 0.05 threshold required for the relationship to be considered significant. The R values of the two models, 0.898 and 0.926 respectively, indicate a very high degree of correlation in both models. This further supports the usefulness of the regression models chosen.

Both models appear to be of high quality, and all the relevant output is clearly available for analysis, including the model summary, ANOVA, and coefficients tables for both models.

From the prediction table, we are 95% confident that the mean cost per patient lies between £487 and £547.5 under Model 1 and between £531.7 and £582.3 under Model 2. For an individual patient, we can be fairly sure that the cost will lie between £287.9 and £746.6 under Model 1 and between £360.2 and £753.8 under Model 2. Because the predicted costs are only estimates, they will not exactly match what is experienced in practice. Different types of patients may also need different treatment approaches, which leads to differences between the actual and the predicted costs.
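As a quick consistency check on these intervals: in ordinary least-squares prediction, the confidence interval for the mean and the prediction interval for an individual are centred on the same fitted value, and the prediction interval is the wider of the two. The sketch below uses only the bounds quoted in the text; it is written in Python rather than SPSS, and all names are illustrative.

```python
# Interval bounds (in pounds) as quoted in the text; names are illustrative.
intervals = {
    "Model 1": {"mean": (487.0, 547.5), "individual": (287.9, 746.6)},
    "Model 2": {"mean": (531.7, 582.3), "individual": (360.2, 753.8)},
}


def midpoint(bounds):
    """Centre of an interval given as (low, high)."""
    low, high = bounds
    return (low + high) / 2


def width(bounds):
    """Width of an interval given as (low, high)."""
    low, high = bounds
    return high - low


for model, pair in intervals.items():
    print(
        model,
        "| centre of mean interval:", round(midpoint(pair["mean"]), 2),
        "| centre of individual interval:", round(midpoint(pair["individual"]), 2),
        "| widths:", round(width(pair["mean"]), 1),
        "vs", round(width(pair["individual"]), 1),
    )
```

Both pairs of intervals share a common centre, which is what would be expected if each pair was produced from a single fitted value.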

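| | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Actual Cost | 200 | 300 | 400 | 500 | 600 |
| Ind_a | 154.732 | 164.232 | 173.732 | 183.232 | 192.732 |
| Residual a | 45.732 | 135.768 | 226.268 | 316.768 | 407.268 |
| Ind_b | 154.679 | 204.979 | 255.279 | 305.579 | 355.879 |
| Residual b | 45.321 | 95.021 | 144.721 | 194.421 | 244.121 |

Ind_a = cost × 0.095 + 135.732

Ind_b = cost × 0.503 + 54.079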
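The two fitted equations can be applied directly to reproduce the tabulated indicator values. The sketch below is a minimal Python illustration, not part of the original SPSS output. It assumes, judging from the tabulated figures, that each residual is the actual cost minus the indicator value; the function and variable names are illustrative.

```python
# Coefficients taken from the two fitted equations in the text.
MODELS = {
    "Ind_a": {"slope": 0.095, "intercept": 135.732},
    "Ind_b": {"slope": 0.503, "intercept": 54.079},
}
COSTS = [200, 300, 400, 500, 600]


def predict(name, cost):
    """Indicator value implied by the fitted line for a given cost."""
    model = MODELS[name]
    return cost * model["slope"] + model["intercept"]


def table_rows():
    """Rows of (cost, Ind_a, residual a, Ind_b, residual b), rounded to 3 dp."""
    rows = []
    for cost in COSTS:
        ind_a = predict("Ind_a", cost)
        ind_b = predict("Ind_b", cost)
        # Assumption: residual = actual cost minus indicator value.
        rows.append((
            cost,
            round(ind_a, 3), round(cost - ind_a, 3),
            round(ind_b, 3), round(cost - ind_b, 3),
        ))
    return rows


for row in table_rows():
    print(row)
```

Recomputed this way, the first Residual a entry comes out as 45.268 rather than the 45.732 listed; the remaining entries agree with the table.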


If the costs were 20% higher than at the other hospital, the residual values would also be 20% higher, and the slope would be scaled by a corresponding 20% multiplier.

Question 2

Twenty-five percent of all the participants are below the age of 25 years, and fifty percent are below the age of 36 years. Seventy-five percent of the participants are 47 years old or younger, as shown in Table 1.

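Statistics: age

| Statistic | A | B | C | D |
|---|---|---|---|---|
| N (Valid) | 200 | 200 | 200 | 200 |
| N (Missing) | 0 | 0 | 0 | 0 |
| Mean | 35.6350 | 36.3100 | 35.5050 | 35.7100 |
| Median | 35.0000 | 36.0000 | 35.5000 | 36.0000 |
| Mode | 28.00 | 23.00 | 28.00(a) | 26.00 |
| Std. Deviation | 10.88043 | 11.57470 | 10.29465 | 10.71775 |
| Skewness | .115 | .060 | .142 | .070 |
| Std. Error of Skewness | .172 | .172 | .172 | .172 |
| Minimum | 18.00 | 18.00 | 18.00 | 18.00 |
| Maximum | 55.00 | 55.00 | 55.00 | 55.00 |
| Percentile 25 | 25.0000 | 25.0000 | 26.2500 | 26.0000 |
| Percentile 50 | 36.0000 | 36.0000 | 35.5000 | 36.0000 |
| Percentile 75 | 47.0000 | 47.0000 | 44.0000 | 45.0000 |

a. Multiple modes exist. The smallest value is shown.

Table 1: Age Statistics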

Event B saw the best performances from all the participants, followed by event D and then by events C and A, as depicted in Figure 1: Age vs. Time Scatter Plot.

Figure 1: Age vs. Time Scatter Plot

This might be because the running distance is much shorter in event B than in the other events. Combining all the events, the average time taken to complete an event is about 5,690 seconds. Figures 2 to 5 show the relationship between age and the time taken to complete each event.

In event A, participants aged 25 to 40 perform better, as shown in Figure 2.

Figure 2: Age vs. Time for event A

Similarly, in event B most of the better performers are aged 21 to 45, as displayed in Figure 3.

Figure 3: Age vs. Time for event B

Figure 4: Age vs. Time for event C

Figure 4 differs slightly: in event C, younger participants perform better than older ones. However, the majority of those who performed better are between 20 and 45 years old.

Figure 5: Age vs. Time for event D

Event D is similar to events A and B, with the majority of the better performers aged 22 to 45, as displayed in Figure 5. It is therefore reasonable to assume that middle-aged participants performed better in all the events than those who are either young or old. This supports the original hypothesis that competitors usually achieve their best times between the ages of 25 and 35 in all the events.

Figures 6 to 9 are histograms for events A to D respectively.

Figure 6: Histogram for event A

Figure 7: Histogram for event B

Figure 8: Histogram for event C

Figure 9: Histogram for event D

The histograms represent the data obtained from the participants in each event. A normal curve has been fitted to each histogram to assess the shape of the distribution. From the graphs, all events appear to be approximately normally distributed, with the majority of results falling around the mean. This further supports the original hypothesis.

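Statistics: age

| Statistic | A | B | C | D |
|---|---|---|---|---|
| N (Valid) | 200 | 200 | 200 | 200 |
| N (Missing) | 0 | 0 | 0 | 0 |
| Mean | 35.6350 | 36.3100 | 35.5050 | 35.7100 |
| Median | 35.0000 | 36.0000 | 35.5000 | 36.0000 |
| Mode | 28.00 | 23.00 | 28.00(a) | 26.00 |
| Std. Deviation | 10.88043 | 11.57470 | 10.29465 | 10.71775 |
| Skewness | .115 | .060 | .142 | .070 |
| Std. Error of Skewness | .172 | .172 | .172 | .172 |
| Minimum | 18.00 | 18.00 | 18.00 | 18.00 |
| Maximum | 55.00 | 55.00 | 55.00 | 55.00 |
| Percentile 25 | 25.0000 | 25.0000 | 26.2500 | 26.0000 |
| Percentile 50 | 35.0000 | 36.0000 | 35.5000 | 36.0000 |
| Percentile 75 | 44.7500 | 47.0000 | 44.0000 | 45.0000 |

a. Multiple modes exist. The smallest value is shown.

Table 2: Age statistics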

From Table 2, the mean and median for each event are very close to each other, which suggests that age is normally distributed. The modes are 28, 23, 28, and 26 for events A, B, C, and D respectively. Given the assumption that most participants aged 25 to 35 perform well, this could be further explained by the majority of participants falling within this age range. The skewness values of 0.115, 0.060, 0.142, and 0.070 for events A, B, C, and D respectively are very close to zero, which is another indicator that the age of the participants is normally distributed.
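One common rule of thumb, which the original analysis does not state, is to divide each skewness value by its standard error and to treat a ratio within about ±1.96 as consistent with a symmetric distribution. The sketch below applies that rule to the reported values; the threshold and the names are assumptions made for illustration.

```python
# Skewness values and their common standard error, as reported for age.
SKEWNESS = {"A": 0.115, "B": 0.060, "C": 0.142, "D": 0.070}
STD_ERROR_OF_SKEWNESS = 0.172

# Assumption: the conventional two-sided 5% cut-off for a z-type ratio.
THRESHOLD = 1.96


def skewness_ratios(skewness, std_error):
    """Return skewness divided by its standard error for each event."""
    return {event: value / std_error for event, value in skewness.items()}


ratios = skewness_ratios(SKEWNESS, STD_ERROR_OF_SKEWNESS)
for event, ratio in ratios.items():
    verdict = "within" if abs(ratio) < THRESHOLD else "outside"
    print(f"Event {event}: ratio = {ratio:.3f} ({verdict} +/-{THRESHOLD})")
```

All four ratios are below 1 in absolute value, so under this rule of thumb none of the events shows skewness that differs significantly from zero.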

Table 3 tests the significance of the association between age and time. In event A, the significance value of 0.399 is above 0.05, so any dependence between age and time is not statistically significant. In event B, age and time appear to be dependent, since the significance value of 0.029 is below 0.05. Events C and D have higher significance values of 0.536 and 0.463 respectively, which are also not significant.

Chi-Square Tests

| Event | Test | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|---|
| A | Pearson Chi-Square | 7022.540(a) | 6993 | .399 |
| A | Likelihood Ratio | 1396.818 | 6993 | 1.000 |
| A | Linear-by-Linear Association | 39.212 | 1 | .000 |
| A | N of Valid Cases | 200 | | |
| B | Pearson Chi-Square | 6807.090(b) | 6588 | .029 |
| B | Likelihood Ratio | 1370.616 | 6588 | 1.000 |
| B | Linear-by-Linear Association | 23.077 | 1 | .000 |
| B | N of Valid Cases | 200 | | |
| C | Pearson Chi-Square | 6722.962(c) | 6734 | .536 |
| C | Likelihood Ratio | 1365.400 | 6734 | 1.000 |
| C | Linear-by-Linear Association | 32.550 | 1 | .000 |
| C | N of Valid Cases | 200 | | |
| D | Pearson Chi-Square | 6781.197(d) | 6771 | .463 |
| D | Likelihood Ratio | 1364.422 | 6771 | 1.000 |
| D | Linear-by-Linear Association | 25.093 | 1 | .000 |
| D | N of Valid Cases | 200 | | |

a. 7220 cells (100.0%) have expected count less than 5. The minimum expected count is .01.
b. 6808 cells (100.0%) have expected count less than 5. The minimum expected count is .01.
c. 6954 cells (100.0%) have expected count less than 5. The minimum expected count is .01.
d. 6992 cells (100.0%) have expected count less than 5. The minimum expected count is .01.

Table 3: Pearson Chi-Square Tests
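The Pearson significance values in Table 3 can be approximately reproduced from the chi-square statistics and their degrees of freedom. The sketch below uses the Wilson-Hilferty normal approximation, which works well for large degrees of freedom; this is a swapped-in approximation chosen to keep the example dependency-free, not the method SPSS uses, and the function names are illustrative.

```python
import math

# Pearson chi-square statistic and degrees of freedom per event (Table 3).
PEARSON = {
    "A": (7022.540, 6993),
    "B": (6807.090, 6588),
    "C": (6722.962, 6734),
    "D": (6781.197, 6771),
}


def chi2_upper_tail(stat, df):
    """Approximate P(X >= stat) for X ~ chi-square(df) via Wilson-Hilferty."""
    k = 2.0 / (9.0 * df)
    # The cube root of (stat / df) is approximately normal with
    # mean (1 - k) and variance k.
    z = ((stat / df) ** (1.0 / 3.0) - (1.0 - k)) / math.sqrt(k)
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p_values = {event: chi2_upper_tail(stat, df) for event, (stat, df) in PEARSON.items()}
for event, p in p_values.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"Event {event}: p is approximately {p:.3f} ({verdict} at the 5% level)")
```

Only event B falls below the 0.05 threshold under this approximation, in line with the tabulated values.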

This may suggest that other factors influence the relationship between age and time. These factors may include the cycling distance, swimming distance, and running distance.

Variables Entered/Removed (a)

| Event | Model | Variables Entered | Variables Removed | Method |
|---|---|---|---|---|
| A | 1 | age | . | Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100). |
| B | 1 | age | . | Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100). |
| C | 1 | age | . | Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100). |
| D | 1 | age | . | Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100). |

a. Dependent Variable: time

Table 4

Based on the stepwise analysis, age is the only variable entered as a useful predictor of the time taken to complete each event, as shown in the table above.

From Table 5 below, the adjusted R² is 0.193, 0.111, 0.159, and 0.122, and the R² is 0.197, 0.116, 0.164, and 0.126, for events A, B, C, and D respectively. This means that the linear regression explains 19.7%, 11.6%, 16.4%, and 12.6% of the variance in time for events A, B, C, and D respectively.

Model Summary (b)

| Event | Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson |
|---|---|---|---|---|---|---|
| A | 1 | .444(a) | .197 | .193 | 320.48224 | 2.058 |
| B | 1 | .341(a) | .116 | .111 | 341.29700 | 2.121 |
| C | 1 | .404(a) | .164 | .159 | 322.52275 | 1.868 |
| D | 1 | .355(a) | .126 | .122 | 342.11691 | 1.768 |

a. Predictors: (Constant), age
b. Dependent Variable: time

Table 5
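The adjusted R² values can be cross-checked against the R² values using the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below assumes n = 200 cases per event (the valid N reported in the earlier tables) and k = 1 predictor (age); because the inputs are rounded to three decimals, agreement is only expected to within about 0.001.

```python
# R-squared and adjusted R-squared per event, as reported in the model summary.
R_SQUARED = {"A": 0.197, "B": 0.116, "C": 0.164, "D": 0.126}
REPORTED_ADJUSTED = {"A": 0.193, "B": 0.111, "C": 0.159, "D": 0.122}

N_CASES = 200      # assumption: valid N per event from the earlier tables
N_PREDICTORS = 1   # assumption: age is the only predictor in each model


def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n cases and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


recomputed = {
    event: adjusted_r_squared(r2, N_CASES, N_PREDICTORS)
    for event, r2 in R_SQUARED.items()
}
for event, value in recomputed.items():
    print(f"Event {event}: recomputed {value:.4f}, reported {REPORTED_ADJUSTED[event]:.3f}")
```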

The Durbin-Watson values of 2.058, 2.121, 1.868, and 1.768 for events A, B, C, and D respectively all lie between the critical values of 1.5 and 2.5. We can thus assume that there is no first-order linear autocorrelation in the regression data.
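For reference, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below shows that calculation together with the 1.5 to 2.5 check used above. The residual series in it is hypothetical, since the original residuals are not available here; only the four reported statistics come from the source.

```python
# Durbin-Watson statistics reported in the model summary.
REPORTED_DW = {"A": 2.058, "B": 2.121, "C": 1.868, "D": 1.768}


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def within_rule_of_thumb(d, low=1.5, high=2.5):
    """True if d lies inside the 1.5 to 2.5 band used in the text."""
    return low < d < high


# Hypothetical residuals, for illustration only (not from the data set).
example_residuals = [0.5, -0.3, 0.2, -0.4, 0.1]
print("Example statistic:", round(durbin_watson(example_residuals), 3))

for event, d in REPORTED_DW.items():
    print(f"Event {event}: d = {d}, within band: {within_rule_of_thumb(d)}")
```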

ANOVA (a)

| Event | Model | | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|---|
| A | 1 | Regression | 4990488.617 | 1 | 4990488.617 | 48.589 | .00... |
