# Independent two-sample t-test in R

Below is a walk-through of how to run an independent two-sample t-test in R.

The independent 2-sample t-test is a parametric method used for exploring the difference in means for two populations.

**Checking for equal variances**

We will be using the Lung Capacity dataset, with 725 observations and 6 variables, to compare the lung capacity of smokers with that of non-smokers.

### Read in data:

    # Read in data via read_excel
    library(readxl)

    LungCapData <- read_excel("C:/Users/Usuario/Documents/dataZ4s/R/MarinLectures/LungCapData.xlsx",
                              col_types = c("numeric", "numeric", "numeric",
                                            "text", "text", "text"))

    # Attach LungCapData so its columns can be referenced by name
    attach(LungCapData)
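
The data can also be analysed without attaching the data frame. The following is a minimal sketch of that alternative, not the approach used in this post. Since the Excel file is not available here, it uses a simulated stand-in called `lung` (a hypothetical name) with made-up values; only the column names `LungCap` and `Smoke` are taken from the post.

```r
# Simulated stand-in for LungCapData (made-up values, NOT the real data)
set.seed(1)
lung <- data.frame(
  LungCap = c(rnorm(60, mean = 7.8, sd = 2.7), rnorm(20, mean = 8.6, sd = 1.9)),
  Smoke   = rep(c("no", "yes"), times = c(60, 20))
)

# Refer to columns explicitly with $ ...
mean_no <- mean(lung$LungCap[lung$Smoke == "no"])

# ... or evaluate an expression inside the data frame with with()
mean_yes <- with(lung, mean(LungCap[Smoke == "yes"]))

# Functions with a formula interface accept a data argument
res <- t.test(LungCap ~ Smoke, data = lung)
print(res)
```

Passing `data = lung` keeps the column names out of the global search path, which avoids name clashes when several data frames are in use.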

**Start by visualizing**

To get an initial idea of the spread of the data, we can visualize it with a boxplot:

    boxplot(LungCap ~ Smoke)
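
The numbers behind a boxplot can be inspected directly, which helps when comparing the spread of the two groups. This is a small sketch on simulated data (the data frame `lung` and its values are assumptions, not the real dataset); it calls `boxplot()` with `plot = FALSE` so that only the summary statistics are returned.

```r
# Simulated stand-in for LungCapData (made-up values, NOT the real data)
set.seed(1)
lung <- data.frame(
  LungCap = c(rnorm(60, mean = 7.8, sd = 2.7), rnorm(20, mean = 8.6, sd = 1.9)),
  Smoke   = rep(c("no", "yes"), times = c(60, 20))
)

# plot = FALSE returns the statistics the boxplot would be drawn from
box_stats <- boxplot(LungCap ~ Smoke, data = lung, plot = FALSE)

# One column per group: whisker ends, hinges and median
five_num <- box_stats$stats
colnames(five_num) <- box_stats$names
rownames(five_num) <- c("lower whisker", "lower hinge", "median",
                        "upper hinge", "upper whisker")
print(five_num)

# Box height (distance between the hinges) as a rough measure of spread
box_height <- five_num["upper hinge", ] - five_num["lower hinge", ]
print(box_height)
```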

### Levene’s test

The spread (the variance) looks greater for non-smokers than for smokers. Let's check this with Levene's test from the car package:

    install.packages("car")
    library(car)

    leveneTest(LungCap ~ Smoke)

With a p-value of 0.0003408 we reject the null hypothesis that the two population variances are equal: there is evidence to believe that they differ. We will therefore not assume equal variances when we run the t-test below.
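
To see what Levene's test does, it can be reproduced in base R: it is a one-way ANOVA on the absolute deviations of each observation from its group centre. The sketch below uses the group median as centre, which to my knowledge is the default in `car::leveneTest`; treat that as an assumption. The data are simulated (made-up values in a hypothetical data frame `lung`), so the numbers will not match the p-value above.

```r
# Simulated stand-in for LungCapData (made-up values, NOT the real data)
set.seed(2)
lung <- data.frame(
  LungCap = c(rnorm(200, mean = 7.8, sd = 2.7), rnorm(50, mean = 8.6, sd = 1.2)),
  Smoke   = factor(rep(c("no", "yes"), times = c(200, 50)))
)

# Absolute deviation of each observation from the median of its own group
group_median <- ave(lung$LungCap, lung$Smoke, FUN = median)
abs_dev <- abs(lung$LungCap - group_median)

# One-way ANOVA on the absolute deviations
levene_by_hand <- anova(lm(abs_dev ~ lung$Smoke))
f_stat <- levene_by_hand[["F value"]][1]
p_val  <- levene_by_hand[["Pr(>F)"]][1]

print(levene_by_hand)
```

A small p-value here means the typical distance from the group centre differs between the groups, which is the sense in which the variances are judged unequal.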

The point estimates for the two population variances:

    var(LungCap[Smoke == "yes"])
    ## [1] 3.545292

    var(LungCap[Smoke == "no"])
    ## [1] 7.431694
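
Both variances can also be computed in one call with `tapply()`. A short sketch, again on simulated data (the data frame `lung` and its values are assumptions, so the results differ from the estimates above):

```r
# Simulated stand-in for LungCapData (made-up values, NOT the real data)
set.seed(3)
lung <- data.frame(
  LungCap = c(rnorm(60, mean = 7.8, sd = 2.7), rnorm(20, mean = 8.6, sd = 1.9)),
  Smoke   = rep(c("no", "yes"), times = c(60, 20))
)

# Variance of LungCap within each level of Smoke
group_var <- tapply(lung$LungCap, lung$Smoke, var)
print(group_var)

# Ratio of the two sample variances (non-smokers over smokers)
var_ratio <- group_var[["no"]] / group_var[["yes"]]
print(var_ratio)
```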

**Test**

We test whether the mean lung capacity of non-smokers and smokers could be the same. This is a two-tailed test, and we do not assume equal variances, following the result of Levene's test above.

    # H0: Mean lung capacity of non-smokers = mean lung capacity of smokers
    # Two-tailed hypothesis test
    # Assuming non-equal variances
    t.test(LungCap ~ Smoke, mu = 0, alternative = "two.sided", conf.level = 0.95, var.equal = FALSE, paired = FALSE)

    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  LungCap by Smoke
    ## t = -3.6498, df = 117.72, p-value = 0.0003927
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -1.3501778 -0.4003548
    ## sample estimates:
    ##  mean in group no mean in group yes
    ##          7.770188          8.645455
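
The quantities in the Welch output can be computed by hand, which shows where the non-integer degrees of freedom come from (the Welch-Satterthwaite approximation). The following is a sketch on simulated data; the data frame `lung` and its values are assumptions, so the numbers differ from the output above, but the hand calculation should agree with `t.test()` on the same data.

```r
# Simulated stand-in for LungCapData (made-up values, NOT the real data)
set.seed(4)
lung <- data.frame(
  LungCap = c(rnorm(60, mean = 7.8, sd = 2.7), rnorm(20, mean = 8.6, sd = 1.9)),
  Smoke   = rep(c("no", "yes"), times = c(60, 20))
)

x <- lung$LungCap[lung$Smoke == "no"]
y <- lung$LungCap[lung$Smoke == "yes"]

# Squared standard error of each group mean
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t statistic: difference in means over its standard error
se_diff <- sqrt(se2_x + se2_y)
t_stat  <- (mean(x) - mean(y)) / se_diff

# Welch-Satterthwaite degrees of freedom
df_welch <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# Two-sided p-value and 95% confidence interval for the difference
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, df_welch) * se_diff

# Compare with the built-in test on the same data
res <- t.test(LungCap ~ Smoke, data = lung)
print(c(t = t_stat, df = df_welch, p = p_value))
print(res)
```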

    # The arguments mu = 0, alternative = "two.sided", conf.level = 0.95,
    # var.equal = FALSE and paired = FALSE are the defaults.
    # The test can therefore be written in short:
    t.test(LungCap ~ Smoke)

    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  LungCap by Smoke
    ## t = -3.6498, df = 117.72, p-value = 0.0003927
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -1.3501778 -0.4003548
    ## sample estimates:
    ##  mean in group no mean in group yes
    ##          7.770188          8.645455
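
The printed output is convenient to read, but the individual numbers are often needed for reporting. `t.test()` returns a list of class `htest`, and its components can be extracted by name. A minimal sketch on simulated data (the data frame `lung` and its values are assumptions, not the real dataset):

```r
# Simulated stand-in for LungCapData (made-up values, NOT the real data)
set.seed(5)
lung <- data.frame(
  LungCap = c(rnorm(60, mean = 7.8, sd = 2.7), rnorm(20, mean = 8.6, sd = 1.9)),
  Smoke   = rep(c("no", "yes"), times = c(60, 20))
)

res <- t.test(LungCap ~ Smoke, data = lung)

t_value  <- unname(res$statistic)  # t statistic
df_value <- unname(res$parameter)  # degrees of freedom
p_value  <- res$p.value            # two-sided p-value
ci       <- res$conf.int           # confidence interval for the difference
means    <- res$estimate           # the two group means

# Difference in means (first group minus second group)
diff_means <- unname(means[1] - means[2])

cat(sprintf("Difference: %.3f, 95%% CI [%.3f, %.3f], p = %.4f\n",
            diff_means, ci[1], ci[2], p_value))
```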

## RPubs

View my RPubs for this page: https://rpubs.com/CarstenGrube/Independent_two-sample_t-test

**Learning R programming**

I find Mike Marin's video very useful: Two-Sample t Test in R (Independent Groups)

#### Carsten Grube

Freelance Data Analyst
