R base functions
table
table is a function used to build a contingency table, which is a table that shows counts for categorical data, from one or more categories. prop.table is a function that accepts table output, returning proportions of the counts.
Examples
In the Olympics data, which value appears in the "NOC" column the most times?
Click to see solution
myDF <- read.csv("/anvil/projects/tdm/data/olympics/athlete_events.csv")
head(sort(table(myDF$NOC), decreasing=TRUE), n=1)
USA 18853
grep & grepl
grep stands for " globally search for a regular expression and print all matches," just as in UNIX. The function allows you to use regular expressions to search for a pattern in a vector of strings or characters, and returns the index (indices) of the match(es).
Additionally, the function grepl (derived from grep-logical) uses the same inputs, but returns a logical vector, where TRUE indicates a match at that index, and FALSE indicates the opposite.
|
|
Examples
How many rows have "Denmark" in the team name ("Denmark" may or may not be the exact team name)?
Click to see solution
table(grepl("Denmark", myDF$Team))["TRUE"]
TRUE: 3496
Find the names of the teams that have "Denmark" in the team name but are not exactly "Denmark".
Click to see solution
myDF$Team[grepl("Denmark", myDF$Team) & myDF$Team != "Denmark"]
'Denmark/Sweden'
'Denmark-2'
'Denmark-1'
'Denmark-1'
'Denmark-1'
'Denmark-2'
'Denmark-1'
'Denmark-2'
'Denmark-2'
'Denmark-2'
'Miss Denmark 1964'
'Denmark-1'
'Denmark-1'
'Denmark-2'
'Denmark-2'
'Denmark-3'
'Denmark-1'
'Denmark-2'
'Denmark-2'
'Denmark-1'
'Denmark-2'
'Denmark-2'
'Denmark-1'
'Denmark-2'
'Denmark-1'
'Denmark-1'
'Denmark-1'
'Denmark-2'
'Denmark-2'
'Denmark-2'
'Denmark-1'
'Denmark-2'
'Denmark-1'
'Denmark-2'
'Denmark-2'
'Denmark-2'
'Denmark-2'
'Denmark-4'
'Denmark-2'
'Denmark-1'
'Denmark/Sweden'
'Denmark-2'
'Denmark-1'
'Denmark-2'
'Denmark-1'
'Denmark-2'
'Denmark-1'
'Denmark-1'
'Denmark-1'
'Denmark-1'
'Miss Denmark 1964'
'Denmark-1'
'Denmark-1'
'Denmark-1'
'Denmark-3'
'Denmark-2'
'Denmark-2'
'Denmark-2'
'Denmark-1'
'Denmark/Sweden'
'Denmark/Sweden'
'Denmark-2'
'Denmark/Sweden'
'Denmark-1'
'Denmark-1'
'Denmark-2'
'Denmark-1'
'Denmark-4'
'Denmark-1'
'Denmark-2'
'Denmark-1'
'Denmark/Sweden'