+34 616 71 29 85 carsten@dataz4s.com

Apply function in R

The apply functions are a family of looping functions in R. They need less code than an explicit for loop, which lowers the risk of coding errors, and they are often, though not always, faster than for loops.

This page is based on Mike Marin’s StatsLectures video ’Apply Function in R’. My RPubs doc shows the page in the “right” format, including the plots that have not been included on this page.

Usage and arguments

See ?apply for the help page. The call is apply(X, MARGIN, FUN, …). X is the object the function is applied to, typically a matrix or data frame. MARGIN selects the dimension: MARGIN = 1 means rows and MARGIN = 2 means columns. FUN is the function to apply, and … holds any further arguments that are passed on to FUN.

Let’s run an example with the dataset StockData, which can be downloaded from Mike Marin’s page: https://www.statslectures.com/r-scripts-datasets
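Before turning to the stock data, here is a minimal sketch of what MARGIN does. The small matrix m and its values are made up for illustration and are not part of the StockData set.

```r
# Toy 2 x 3 matrix (hypothetical values, filled column by column)
m <- matrix(1:6, nrow = 2)

# MARGIN = 1: apply sum() to each row
row_totals <- apply(m, 1, sum)

# MARGIN = 2: apply sum() to each column
col_totals <- apply(m, 2, sum)

print(row_totals)  # one value per row (2 values)
print(col_totals)  # one value per column (3 values)
```

The length of the result follows the chosen margin: two values for the two rows, three values for the three columns.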

Read in data

# Read in the data with read.table()
# The header row is detected automatically because it has one field fewer than the data rows
StockData <- read.table("C:/Users/Usuario/Documents/dataZ4s/R/Apply function/StockExample.txt")
StockData

##       Stock1 Stock2 Stock3 Stock4
## Day1  185.74   1.47   1605  95.05
## Day2  184.26   1.56   1580  97.49
## Day3  162.21   1.39   1490  88.57
## Day4  159.04   1.43   1520  85.55
## Day5  164.87   1.42   1550  92.04
## Day6  162.72   1.36   1525  91.70
## Day7  157.89     NA   1495  89.88
## Day8  159.49   1.43   1485  93.17
## Day9  150.22   1.57   1470  90.12
## Day10 151.02   1.54   1510  92.14
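If the text file is not at hand, the same data can be typed in directly. The sketch below rebuilds StockData from the values printed above; it assumes the printed output shows the complete file and that a plain data frame with day labels as row names is an adequate stand-in for what read.table() returns.

```r
# Rebuild StockData by hand from the printed values above
StockData <- data.frame(
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87,
             162.72, 157.89, 159.49, 150.22, 151.02),
  Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42,
             1.36, NA, 1.43, 1.57, 1.54),      # Day7 is missing
  Stock3 = c(1605, 1580, 1490, 1520, 1550,
             1525, 1495, 1485, 1470, 1510),
  Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04,
             91.70, 89.88, 93.17, 90.12, 92.14),
  row.names = paste0("Day", 1:10)
)

print(dim(StockData))  # 10 rows, 4 columns
```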

Mean price of each stock

# Use apply() with X = StockData, MARGIN = 2 (columns) and FUN = mean
# NA is returned for Stock2 because its Day7 value is missing
apply(X = StockData, MARGIN = 2, FUN = mean)

##   Stock1   Stock2   Stock3   Stock4
##  163.746       NA 1523.000   91.571

Dealing with NA

# The argument na.rm = TRUE is passed on to mean(), which then drops NA values
# We thereby get the mean of all 4 stocks
apply(X = StockData, MARGIN = 2, FUN = mean, na.rm = TRUE)

##      Stock1      Stock2      Stock3      Stock4
##  163.746000    1.463333 1523.000000   91.571000

# Save the result of apply() to an object
AVG <- apply(X = StockData, MARGIN = 2, FUN = mean, na.rm = TRUE)
AVG

##      Stock1      Stock2      Stock3      Stock4
##  163.746000    1.463333 1523.000000   91.571000

# Once comfortable with the functions and their default argument order, we can skip the argument names
apply(StockData, 2, mean, na.rm = TRUE)

##      Stock1      Stock2      Stock3      Stock4
##  163.746000    1.463333 1523.000000   91.571000
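The na.rm = TRUE in these calls is not an argument of apply() itself. It travels through the … slot to mean(). As a rough sketch of that mechanism, the two calls below should give the same result; the prices matrix is a made-up toy example.

```r
# Toy data with one missing value (hypothetical numbers)
prices <- cbind(A = c(10, 12, 14), B = c(1, NA, 3))

# Extra argument forwarded to mean() through ...
via_dots <- apply(prices, 2, mean, na.rm = TRUE)

# Same thing written out with an anonymous wrapper function
via_wrapper <- apply(prices, 2, function(col) mean(col, na.rm = TRUE))

print(via_dots)
print(isTRUE(all.equal(via_dots, via_wrapper)))
```

The wrapper form is handy when the function needs more than one step per column; for a single call, passing the argument through … is shorter.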

colMeans function

# colMeans() gives the same result as the apply() call that we used above
# Taking the mean of each column is built into the function
# It only needs the data, plus na.rm = TRUE to drop missing values
colMeans(StockData, na.rm = TRUE)

##      Stock1      Stock2      Stock3      Stock4
##  163.746000    1.463333 1523.000000   91.571000
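A quick way to convince yourself that the two approaches agree is to compare them with all.equal(). The sketch below uses a made-up toy matrix rather than StockData so that it runs on its own.

```r
# Toy data with one missing value (hypothetical numbers)
prices <- cbind(A = c(10, 12, 14), B = c(1, NA, 3))

# Column means computed two ways
means_apply    <- apply(prices, 2, mean, na.rm = TRUE)
means_colMeans <- colMeans(prices, na.rm = TRUE)

# all.equal() compares with a small numeric tolerance
same <- isTRUE(all.equal(means_apply, means_colMeans))
print(same)
```

Base R has matching shortcuts for the other common cases: rowMeans(), colSums() and rowSums().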

Max values and percentiles

# Max values of the stocks
apply(X = StockData, MARGIN = 2, FUN = max, na.rm=TRUE)

##  Stock1  Stock2  Stock3  Stock4
##  185.74    1.57 1605.00   97.49

# 20th and 80th percentiles
apply(X = StockData, MARGIN = 2, FUN = quantile, probs = c(0.2, 0.8), na.rm = TRUE)

##      Stock1 Stock2 Stock3 Stock4
## 20% 156.516  1.408   1489 89.618
## 80% 168.748  1.548   1556 93.546
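Because quantile() returns two values per column here, apply() returns a matrix rather than a vector, with one row per percentile and one column per stock. The sketch below shows how a single value can be pulled out by name; the prices matrix is a made-up toy example.

```r
# Toy data (hypothetical numbers)
prices <- cbind(A = c(10, 20, 30, 40, 50), B = c(1, 2, 3, 4, 5))

# FUN returns two values per column, so the result is a 2 x 2 matrix
q <- apply(prices, 2, quantile, probs = c(0.2, 0.8))
print(dim(q))

# Row names come from quantile(), column names from the data
p80_A <- q["80%", "A"]
print(p80_A)
```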

Row sums

# Sum for each row
apply(X=StockData, MARGIN = 1, FUN = sum, na.rm=TRUE)

##    Day1    Day2    Day3    Day4    Day5    Day6    Day7    Day8    Day9   Day10
## 1887.26 1863.31 1742.17 1766.02 1808.33 1780.78 1742.77 1739.09 1711.91 1754.70

# And like the colMeans command, there is a rowSums command
rowSums(StockData, na.rm = TRUE)

##    Day1    Day2    Day3    Day4    Day5    Day6    Day7    Day8    Day9   Day10
## 1887.26 1863.31 1742.17 1766.02 1808.33 1780.78 1742.77 1739.09 1711.91 1754.70
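Note that with na.rm = TRUE the Day7 total simply leaves out the missing Stock2 value, so it is not fully comparable with the other days. One possible way to flag such rows, sketched here on a made-up toy matrix, is to count the NA values per row with rowSums(is.na(...)).

```r
# Toy data with one missing value in the second row (hypothetical numbers)
prices <- cbind(A = c(10, 12, 14), B = c(1, NA, 3))
rownames(prices) <- c("Day1", "Day2", "Day3")

# Row totals, ignoring missing values
totals <- rowSums(prices, na.rm = TRUE)

# Flag rows whose total is based on incomplete data
incomplete <- rowSums(is.na(prices)) > 0

print(totals)
print(incomplete)
```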

Plots

# Create line plots for each of the stocks
apply(X = StockData, MARGIN = 2, FUN = plot, type = "l", main = "Stock", ylab = "Price", xlab = "Day")

## NULL

# Plot for total per day
plot(apply(X = StockData, MARGIN = 1, FUN = sum, na.rm = TRUE), type = "l",
     ylab = "Total Market Value", xlab = "Day", main = "Markets per day")
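In the per-stock plots, apply() hands plot() only the values of each column, so every plot gets the same title "Stock". If each plot should carry its own stock name, one option is to loop over the column names instead. This is only a sketch on a made-up toy data frame, not the approach used in the video.

```r
# Toy data (hypothetical numbers)
prices <- data.frame(A = c(10, 12, 11), B = c(1.0, 1.2, 1.1))

# One line plot per column, titled with the column name
for (stock in names(prices)) {
  plot(prices[[stock]], type = "l", main = stock,
       ylab = "Price", xlab = "Day")
}
```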

View the RPubs for this page.

Carsten Grube

Freelance Data Analyst
