Apply function in R
The apply functions are a family of looping functions in R. They need less code than an explicit for loop, which lowers the risk of coding errors, and they typically run about as fast as, and sometimes faster than, an equivalent for loop.
This page is based on Mike Marin’s Statslectures video ’Apply Function in R’. In my RPubs doc, you can view the page in the “right” format, including the plots that are not included on this page.
Usage and arguments
See ?apply for the help page. The usage is apply(X, MARGIN, FUN, ...). X is the object to which we apply the function, typically a matrix or a data frame (a data frame is converted to a matrix first). MARGIN selects rows or columns: MARGIN = 1 means rows and MARGIN = 2 means columns. FUN is the function to apply, and ... stands for any further arguments that are passed on to FUN.
Let’s run an example with the dataset StockData, which can be downloaded from Mike Marin’s page: https://www.statslectures.com/r-scripts-datasets
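To make the MARGIN argument concrete before loading any data, here is a minimal sketch on a small made-up matrix. The matrix m and its values are assumptions for illustration only and are not part of the stock dataset.

```r
# Toy 2 x 3 matrix (hypothetical values, not from the stock data)
# matrix() fills column by column: a = 1,2  b = 3,4  c = 5,6
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

row_totals <- apply(m, 1, sum)  # MARGIN = 1: one value per row
col_totals <- apply(m, 2, sum)  # MARGIN = 2: one value per column

print(row_totals)
print(col_totals)
```

With MARGIN = 1 the result has one element per row (here two), and with MARGIN = 2 one element per column (here three).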
Read in data
# Read in data via read.table (base R, no extra package needed)
StockData <- read.table("C:/Users/Usuario/Documents/dataZ4s/R/Apply function/StockExample.txt")
StockData
## Stock1 Stock2 Stock3 Stock4
## Day1 185.74 1.47 1605 95.05
## Day2 184.26 1.56 1580 97.49
## Day3 162.21 1.39 1490 88.57
## Day4 159.04 1.43 1520 85.55
## Day5 164.87 1.42 1550 92.04
## Day6 162.72 1.36 1525 91.70
## Day7 157.89 NA 1495 89.88
## Day8 159.49 1.43 1485 93.17
## Day9 150.22 1.57 1470 90.12
## Day10 151.02 1.54 1510 92.14
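If the file is not at hand, a data frame with the same values can be typed in from the printout above. This is only a sketch: it assumes the printed values are the complete dataset, and it says nothing about how the original text file is laid out.

```r
# Rebuild StockData by hand from the printed values (assumption: the
# printout above shows the whole dataset)
StockData <- data.frame(
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87,
             162.72, 157.89, 159.49, 150.22, 151.02),
  Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42,
             1.36, NA, 1.43, 1.57, 1.54),      # Day7 is missing
  Stock3 = c(1605, 1580, 1490, 1520, 1550,
             1525, 1495, 1485, 1470, 1510),
  Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04,
             91.70, 89.88, 93.17, 90.12, 92.14),
  row.names = paste0("Day", 1:10)
)

print(dim(StockData))  # rows, columns
```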
Mean price of each stock
# We will use the apply function
# MARGIN = 2 means columns, the data is StockData and the function is mean()
# NA is returned for Stock2 because its Day7 value is missing
apply(X = StockData, MARGIN = 2, FUN = mean)
## Stock1 Stock2 Stock3 Stock4
## 163.746 NA 1523.000 91.571
Dealing with NA
# With the na.rm argument we can have NA values removed
# na.rm = TRUE is passed on to mean(), so we get the mean of all 4 stocks
apply(X = StockData, MARGIN = 2, FUN = mean, na.rm = TRUE)
## Stock1 Stock2 Stock3 Stock4
## 163.746000 1.463333 1523.000000 91.571000
# Save the result of the apply call to an object
AVG <- apply(X = StockData, MARGIN = 2, FUN = mean, na.rm = TRUE)
AVG
## Stock1 Stock2 Stock3 Stock4
## 163.746000 1.463333 1523.000000 91.571000
# When comfortable with the commands and the default argument order, we can skip the argument names
apply(StockData, 2, mean, na.rm=TRUE)
## Stock1 Stock2 Stock3 Stock4
## 163.746000 1.463333 1523.000000 91.571000
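The na.rm = TRUE in the calls above is not an argument of apply itself. It travels through the ... of apply and ends up in mean(). A minimal sketch of that forwarding, using a small made-up matrix x (an assumption for illustration, not the stock data), and comparing it with an explicit wrapper function:

```r
# Toy matrix with one missing value (hypothetical values)
x <- cbind(A = c(1, 2, 3), B = c(4, NA, 6))

# na.rm = TRUE is forwarded to mean() through the ... of apply()
via_dots <- apply(x, 2, mean, na.rm = TRUE)

# The same thing written out with an anonymous wrapper function
via_wrapper <- apply(x, 2, function(col) mean(col, na.rm = TRUE))

print(via_dots)
print(all.equal(via_dots, via_wrapper))
```

The wrapper form is longer, but it is handy when the function needs more than a simple extra argument.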
colMeans function
# The colMeans command does the same as the apply command that we used above
# Taking the mean of each column is already built into the function
# It only needs the data, plus na.rm if missing values should be removed
colMeans(StockData, na.rm = TRUE)
## Stock1 Stock2 Stock3 Stock4
## 163.746000 1.463333 1523.000000 91.571000
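A quick way to check that colMeans and apply agree is all.equal. The sketch below uses a small made-up data frame df (an assumption for illustration, not the stock data). The R documentation describes colMeans and its relatives as equivalent to apply with mean or sum but a lot faster, so they are worth preferring when they fit.

```r
# Small made-up data frame with one missing value (not the stock data)
df <- data.frame(A = c(1, 2, 3), B = c(4, NA, 6))

means_apply    <- apply(df, 2, mean, na.rm = TRUE)
means_colMeans <- colMeans(df, na.rm = TRUE)

# TRUE when both approaches give the same named vector
print(all.equal(means_apply, means_colMeans))
```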
Max values and percentiles
# Max values of the stocks
apply(X = StockData, MARGIN = 2, FUN = max, na.rm=TRUE)
## Stock1 Stock2 Stock3 Stock4
## 185.74 1.57 1605.00 97.49
# 20th and 80th percentiles
apply(X = StockData, MARGIN = 2, FUN = quantile, probs=c(0.2, 0.8), na.rm=TRUE)
## Stock1 Stock2 Stock3 Stock4
## 20% 156.516 1.408 1489 89.618
## 80% 168.748 1.548 1556 93.546
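When FUN returns more than one value per column, as quantile does here, apply returns a matrix with one column per input column and one row per returned value. A minimal sketch of that shape and of picking single values out of it, using a made-up matrix x (an assumption for illustration, not the stock data):

```r
# Toy matrix (hypothetical values): A runs 1..11, B runs 10..110
x <- cbind(A = 1:11, B = seq(10, 110, by = 10))

q <- apply(x, 2, quantile, probs = c(0.2, 0.8), na.rm = TRUE)

print(dim(q))         # rows = number of probs, columns = columns of x
print(rownames(q))    # labels created by quantile()
print(q["80%", "B"])  # a single value picked by row and column name
```

Indexing by the row label and the column name keeps the code readable when the result is used later on.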
Row sums
# Sum for each row
apply(X=StockData, MARGIN = 1, FUN = sum, na.rm=TRUE)
## Day1 Day2 Day3 Day4 Day5 Day6 Day7 Day8 Day9 Day10
## 1887.26 1863.31 1742.17 1766.02 1808.33 1780.78 1742.77 1739.09 1711.91 1754.70
# And like the colMeans command, there is a rowSums command
rowSums(StockData, na.rm = TRUE)
## Day1 Day2 Day3 Day4 Day5 Day6 Day7 Day8 Day9 Day10
## 1887.26 1863.31 1742.17 1766.02 1808.33 1780.78 1742.77 1739.09 1711.91 1754.70
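For comparison, here is roughly what the same row sums look like with an explicit for loop. This is a sketch on a small made-up matrix x (an assumption for illustration, not the stock data). It shows why the apply and rowSums versions are less error prone: the loop needs a preallocated result, an index and an assignment.

```r
# Toy matrix with one missing value (hypothetical values)
x <- rbind(Day1 = c(1, 2, NA), Day2 = c(4, 5, 6))

# Explicit loop: preallocate, then fill one row total at a time
totals_loop <- numeric(nrow(x))
names(totals_loop) <- rownames(x)
for (i in seq_len(nrow(x))) {
  totals_loop[i] <- sum(x[i, ], na.rm = TRUE)
}

# The same result in one line each
totals_apply   <- apply(x, 1, sum, na.rm = TRUE)
totals_rowSums <- rowSums(x, na.rm = TRUE)

print(totals_loop)
print(all.equal(totals_loop, totals_apply))
print(all.equal(totals_loop, totals_rowSums))
```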
Plots
# Create line plots for each of the stocks
apply(X = StockData, MARGIN = 2, FUN = plot, type="l", main="Stock", ylab="Price", xlab="Day")
## NULL
# Plot for total per day
plot(apply(X=StockData, MARGIN = 1, FUN = sum, na.rm=TRUE), type = "l", ylab = "Total Market Value", xlab = "Day", main = "Markets per day")
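The apply call with plot prints NULL because plot is called for its side effect of drawing and returns nothing useful. It also gives every plot the same title, since FUN only receives the values of a column and not its name. One possible variant, sketched here as an assumption rather than as the method from the video, is to loop over the column names so each plot gets its own title. The data frame prices below holds only the first three days of Stock1 and Stock2 from the table above.

```r
# First three days of two stocks, copied from the printed table
prices <- data.frame(
  Stock1 = c(185.74, 184.26, 162.21),
  Stock2 = c(1.47, 1.56, 1.39)
)

# Loop over column names so each plot can use its own name as the title
for (stock in colnames(prices)) {
  plot(prices[[stock]], type = "l", main = stock,
       ylab = "Price", xlab = "Day")
}
```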
View the RPubs for this page.

Carsten Grube
Freelance Data Analyst