+34 616 71 29 85 carsten@dataz4s.com

Independent two-sample t-test in R

Below is a walkthrough of how to run an independent two-sample t-test in R.

 

The independent two-sample t-test is a parametric method for testing whether the means of two independent populations differ.

 

Checking for equal variances

We will use the Lung Capacity dataset, with 725 observations and 6 variables, to compare the lung capacity of smokers with that of non-smokers.

 

Read in data:

# Read in data via read_excel
library(readxl)
LungCapData <- read_excel("C:/Users/Usuario/Documents/dataZ4s/R/MarinLectures/LungCapData.xlsx",
                          col_types = c("numeric", "numeric", "numeric",
                                        "text", "text", "text"))
# Attach the data frame so its columns can be referenced by name
attach(LungCapData)
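Because attach() puts the columns on the search path, the calls further down can refer to LungCap and Smoke directly. As a minimal sketch of the alternative, the formula interfaces also accept a data argument, so the data frame can be passed explicitly instead. The data frame sim below is a simulated stand-in with invented values, not the real LungCapData:

```r
# Simulated stand-in for LungCapData (hypothetical values, illustration only)
set.seed(1)
sim <- data.frame(
  LungCap = c(rnorm(40, mean = 8, sd = 2.5), rnorm(10, mean = 8.5, sd = 1.8)),
  Smoke   = rep(c("no", "yes"), times = c(40, 10))
)

# Quick structural checks after reading data in
str(sim)
table(sim$Smoke)

# Formula interfaces accept data =, so attach() is not required
boxplot(LungCap ~ Smoke, data = sim)
res <- t.test(LungCap ~ Smoke, data = sim)
res$p.value
```

Passing data = keeps each call explicit about where its variables come from, which helps once more than one data frame is loaded.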

 

Start by visualizing

To get a first impression of the spread of the data, we can draw a boxplot:

boxplot(LungCap~Smoke)

 

Figure: boxplot of LungCap by Smoke

 

Levene’s test

The boxplot suggests that the spread (the variance) is greater for non-smokers than for smokers. Let’s check this with Levene’s test from the car package:

install.packages("car")

library(car)

leveneTest(LungCap~Smoke)

With a p-value of 0.0003408 we reject the null hypothesis of equal variances: we have evidence that the two population variances differ. We will therefore not assume equal variances when we run the t-test below.

The point estimates for the two population variances:

var(LungCap[Smoke == "yes"])

## [1] 3.545292

var(LungCap[Smoke == "no"])

## [1] 7.431694
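To make it clearer what Levene’s test measures, here is a minimal base-R sketch of the idea: take each observation’s absolute deviation from its group centre and run a one-way ANOVA on those deviations. The sketch centres on the group median, which as far as I know is also the default in leveneTest(). The vectors y and g are simulated stand-ins with invented values, not the lung capacity data:

```r
# Simulated stand-in data (hypothetical values, illustration only)
set.seed(42)
y <- c(rnorm(60, mean = 8, sd = 2.7), rnorm(15, mean = 8.6, sd = 1.9))
g <- factor(rep(c("no", "yes"), times = c(60, 15)))

# Absolute deviation of each observation from its own group median
abs_dev <- abs(y - ave(y, g, FUN = median))

# One-way ANOVA on those deviations; the F test for g is the Levene statistic
lev <- anova(lm(abs_dev ~ g))
lev

# Point estimates of the two variances, for comparison with the test
tapply(y, g, var)
```

A small p-value in the row for g indicates that the typical deviation from the centre differs between the groups, which is the evidence against equal variances.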

 

 

Test

We test whether the mean lung capacity of non-smokers and smokers could be the same. This is a two-tailed test, and we will not assume equal variances, in line with the result of Levene’s test above.

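# H0: mean lung capacity of non-smokers = mean lung capacity of smokers
# Two-tailed hypothesis test
# Assuming non-equal variances
# (paired is left out: the formula interface treats the two groups as
#  independent samples, and recent R versions reject paired here)

t.test(LungCap ~ Smoke, mu = 0, alternative = "two.sided", conf.level = 0.95, var.equal = FALSE)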

##
##  Welch Two Sample t-test
##
## data:  LungCap by Smoke
## t = -3.6498, df = 117.72, p-value = 0.0003927
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.3501778 -0.4003548
## sample estimates:
##  mean in group no mean in group yes
##          7.770188          8.645455

# mu = 0, alternative = "two.sided", conf.level = 0.95 and var.equal = FALSE
# are the defaults, as is an unpaired test.
# The test can therefore be written in short:

t.test(LungCap~Smoke)

##
##  Welch Two Sample t-test
##
## data:  LungCap by Smoke
## t = -3.6498, df = 117.72, p-value = 0.0003927
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.3501778 -0.4003548
## sample estimates:
##  mean in group no mean in group yes
##          7.770188          8.645455
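The non-integer degrees of freedom in the output (df = 117.72) come from Welch’s approximation, which does not pool the two variances. The following is a minimal sketch of that calculation done by hand and compared with t.test(). The vectors x_no and x_yes are simulated stand-ins with invented values, so the numbers will not match the output above:

```r
# Simulated stand-in data (hypothetical values, illustration only)
set.seed(7)
x_no  <- rnorm(60, mean = 7.8, sd = 2.7)
x_yes <- rnorm(15, mean = 8.6, sd = 1.9)

# Squared standard error contributed by each group
se2_no  <- var(x_no)  / length(x_no)
se2_yes <- var(x_yes) / length(x_yes)

# Welch t statistic: difference in means over the unpooled standard error
t_manual <- (mean(x_no) - mean(x_yes)) / sqrt(se2_no + se2_yes)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (se2_no + se2_yes)^2 /
  (se2_no^2 / (length(x_no) - 1) + se2_yes^2 / (length(x_yes) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Compare with t.test(), which uses Welch's method by default
tt <- t.test(x_no, x_yes)
c(t = t_manual, df = df_manual, p = p_manual)
tt
```

The hand-computed t, df and p-value should agree with the values printed by t.test(), which confirms that the default test is the unequal-variance version.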

 

RPubs

View my RPubs for this page: https://rpubs.com/CarstenGrube/Independent_two-sample_t-test

 

Learning R programming

I find Mike Marin’s video very useful: Two-Sample t Test in R (Independent Groups)

 

Carsten Grube

Freelance Data Analyst
