The two main components of the R language are objects and functions. Objects are the data structures that we use to store information, and functions are the tools that we use to manipulate these objects. To paraphrase John Chambers, creator of the S language on which R is based and a member of the R core team: everything that “exists” in R is an object, and everything that “happens” is a function call.
So far you have mostly used functions written by others. In this lesson, you will learn how to write your own functions.
Writing functions allows you to automate repetitive tasks, improve efficiency and reduce errors in your code.
In this lesson, we will learn the fundamentals of functions with simple examples. Then in a future lesson, we will write more complex functions that can automate large parts of your data analysis workflow.
By the end of this lesson, you will be able to:
‣ Use conditional statements (if, else if, and else) within functions.
Run the following code to install and load the packages needed for this lesson:
if (!require(pacman)) install.packages("pacman")
pacman::p_load(tidyverse, here, NHSRdatasets, medicaldata, outbreaks, reactable)
‣ Install and load necessary packages for this lesson.
‣ Learn to create a simple function: Converting pounds to kilograms
‣ Define a function named pounds_to_kg that multiplies the input by 0.4536:
‣ New object in environment of type function.
‣ Use pounds_to_kg to convert a value:
## [1] 90.72
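The definition and the call are not shown at this point. A minimal sketch, assuming the same function body that is printed later in this lesson and an input of 200 pounds (which is consistent with the 90.72 output above):

```r
# Define the function: one argument, pounds, converted with the factor 0.4536
pounds_to_kg <- function(pounds) {
  kg <- pounds * 0.4536
  return(kg)
}

# Call it on a single value (200 is an assumed input)
pounds_to_kg(200)
```

After running the definition, pounds_to_kg should appear in the environment as an object of type function.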
‣ Breaking down function structure:
‣ Function creation: Use the function keyword, followed by parentheses and braces.
‣ Arguments: Defined inside the parentheses. In this example the argument is pounds, but it can have any name.
‣ Body: The code to execute, placed inside the braces.
‣ Returning values: Use return() to specify the value the function returns.
‣ Naming the function: Store the function in an object, e.g., pounds_to_kg.
‣ We can use the function with named and unnamed arguments:
## [1] 90.72
## [1] 45.36
‣ And we can apply function to a vector:
## [1] 45.36 90.72 136.08
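Calls along these lines would produce the three outputs above. The input values (200, 100, and the vector 100, 200, 300) are assumptions inferred from the printed results, and the function is redefined here only so the snippet runs on its own:

```r
pounds_to_kg <- function(pounds) {
  kg <- pounds * 0.4536
  return(kg)
}

# Named argument
pounds_to_kg(pounds = 200)

# Unnamed (positional) argument
pounds_to_kg(100)

# A vector: arithmetic in R is vectorized, so each element is converted
pounds_to_kg(c(100, 200, 300))
```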
‣ To explore the function’s source code, type the function’s name without parentheses:
## function(pounds){
## kg <- pounds * 0.4536
## return(kg)
## }
## <bytecode: 0x000001dd90402978>
‣ Or view the function in RStudio with View(), e.g. View(pounds_to_kg).
(NOTE: Answers are at the bottom of the page. Try to answer the questions yourself before checking.)
Create a simple function called years_to_months that transforms age in years to age in months.
Try it out with years_to_months(12).
‣ Let’s write a more complex function: Converting Fahrenheit to Celsius.
‣ The formula: \(C = \frac{5}{9} \times (F - 32)\)
fahrenheit_to_celsius <- function(fahrenheit){
  celsius <- (5/9) * (fahrenheit - 32)
  return(celsius)
}
fahrenheit_to_celsius(32) # Should be 0
## [1] 0
‣ Testing the function on the airquality dataset.
airquality |>
  select(Temp) |> # select the Temp column
  mutate(Temp_celsius = fahrenheit_to_celsius(Temp)) # apply the function to the Temp column
## Temp Temp_celsius
## 1 67 19.44444
## 2 72 22.22222
## 3 74 23.33333
## 4 62 16.66667
## 5 56 13.33333
## 6 66 18.88889
## 7 65 18.33333
## 8 59 15.00000
## 9 61 16.11111
## 10 69 20.55556
## 11 74 23.33333
## 12 69 20.55556
## 13 66 18.88889
## 14 68 20.00000
## 15 58 14.44444
## 16 64 17.77778
## 17 66 18.88889
## 18 57 13.88889
## 19 68 20.00000
## 20 62 16.66667
## 21 59 15.00000
## 22 73 22.77778
## 23 61 16.11111
## 24 61 16.11111
## 25 57 13.88889
## 26 58 14.44444
## 27 57 13.88889
## 28 67 19.44444
## 29 81 27.22222
## 30 79 26.11111
## 31 76 24.44444
## 32 78 25.55556
## 33 74 23.33333
## 34 67 19.44444
## 35 84 28.88889
## 36 85 29.44444
## 37 79 26.11111
## 38 82 27.77778
## 39 87 30.55556
## 40 90 32.22222
## 41 87 30.55556
## 42 93 33.88889
## 43 92 33.33333
## 44 82 27.77778
## 45 80 26.66667
## 46 79 26.11111
## 47 77 25.00000
## 48 72 22.22222
## 49 65 18.33333
## 50 73 22.77778
## 51 76 24.44444
## 52 77 25.00000
## 53 76 24.44444
## 54 76 24.44444
## 55 76 24.44444
## 56 75 23.88889
## 57 78 25.55556
## 58 73 22.77778
## 59 80 26.66667
## 60 77 25.00000
## 61 83 28.33333
## 62 84 28.88889
## 63 85 29.44444
## 64 81 27.22222
## 65 84 28.88889
## 66 83 28.33333
## 67 83 28.33333
## 68 88 31.11111
## 69 92 33.33333
## 70 92 33.33333
## 71 89 31.66667
## 72 82 27.77778
## 73 73 22.77778
## 74 81 27.22222
## 75 91 32.77778
## 76 80 26.66667
## 77 81 27.22222
## 78 82 27.77778
## 79 84 28.88889
## 80 87 30.55556
## 81 85 29.44444
## 82 74 23.33333
## 83 81 27.22222
## 84 82 27.77778
## 85 86 30.00000
## 86 85 29.44444
## 87 82 27.77778
## 88 86 30.00000
## 89 88 31.11111
## 90 86 30.00000
## 91 83 28.33333
## 92 81 27.22222
## 93 81 27.22222
## 94 81 27.22222
## 95 82 27.77778
## 96 86 30.00000
## 97 85 29.44444
## 98 87 30.55556
## 99 89 31.66667
## 100 90 32.22222
## 101 90 32.22222
## 102 92 33.33333
## 103 86 30.00000
## 104 86 30.00000
## 105 82 27.77778
## 106 80 26.66667
## 107 79 26.11111
## 108 77 25.00000
## 109 79 26.11111
## 110 76 24.44444
## 111 78 25.55556
## 112 78 25.55556
## 113 77 25.00000
## 114 72 22.22222
## 115 75 23.88889
## 116 79 26.11111
## 117 81 27.22222
## 118 86 30.00000
## 119 88 31.11111
## 120 97 36.11111
## 121 94 34.44444
## 122 96 35.55556
## 123 94 34.44444
## 124 91 32.77778
## 125 92 33.33333
## 126 93 33.88889
## 127 93 33.88889
## 128 87 30.55556
## 129 84 28.88889
## 130 80 26.66667
## 131 78 25.55556
## 132 75 23.88889
## 133 73 22.77778
## 134 81 27.22222
## 135 76 24.44444
## 136 77 25.00000
## 137 71 21.66667
## 138 71 21.66667
## 139 78 25.55556
## 140 67 19.44444
## 141 76 24.44444
## 142 68 20.00000
## 143 82 27.77778
## 144 64 17.77778
## 145 71 21.66667
## 146 81 27.22222
## 147 69 20.55556
## 148 63 17.22222
## 149 70 21.11111
## 150 77 25.00000
## 151 75 23.88889
## 152 76 24.44444
## 153 68 20.00000
Great!
‣ Reusability: Writing functions for repetitive code enhances efficiency.
‣ Readability: Descriptive function names make the purpose of code clearer, though the benefit is less obvious with simple functions:
## Ozone Solar.R Wind Temp Month Day
## 1 41 190 7.4 19.44444 5 1
## 2 36 118 8.0 22.22222 5 2
## 3 12 149 12.6 23.33333 5 3
## 4 18 313 11.5 16.66667 5 4
## 5 NA NA 14.3 13.33333 5 5
## 6 28 NA 14.9 18.88889 5 6
## 7 23 299 8.6 18.33333 5 7
## 8 19 99 13.8 15.00000 5 8
## 9 8 19 20.1 16.11111 5 9
## 10 NA 194 8.6 20.55556 5 10
## 11 7 NA 6.9 23.33333 5 11
## 12 16 256 9.7 20.55556 5 12
## 13 11 290 9.2 18.88889 5 13
## 14 14 274 10.9 20.00000 5 14
## 15 18 65 13.2 14.44444 5 15
## 16 14 334 11.5 17.77778 5 16
## 17 34 307 12.0 18.88889 5 17
## 18 6 78 18.4 13.88889 5 18
## 19 30 322 11.5 20.00000 5 19
## 20 11 44 9.7 16.66667 5 20
## 21 1 8 9.7 15.00000 5 21
## 22 11 320 16.6 22.77778 5 22
## 23 4 25 9.7 16.11111 5 23
## 24 32 92 12.0 16.11111 5 24
## 25 NA 66 16.6 13.88889 5 25
## 26 NA 266 14.9 14.44444 5 26
## 27 NA NA 8.0 13.88889 5 27
## 28 23 13 12.0 19.44444 5 28
## 29 45 252 14.9 27.22222 5 29
## 30 115 223 5.7 26.11111 5 30
## 31 37 279 7.4 24.44444 5 31
## 32 NA 286 8.6 25.55556 6 1
## 33 NA 287 9.7 23.33333 6 2
## 34 NA 242 16.1 19.44444 6 3
## 35 NA 186 9.2 28.88889 6 4
## 36 NA 220 8.6 29.44444 6 5
## 37 NA 264 14.3 26.11111 6 6
## 38 29 127 9.7 27.77778 6 7
## 39 NA 273 6.9 30.55556 6 8
## 40 71 291 13.8 32.22222 6 9
## 41 39 323 11.5 30.55556 6 10
## 42 NA 259 10.9 33.88889 6 11
## 43 NA 250 9.2 33.33333 6 12
## 44 23 148 8.0 27.77778 6 13
## 45 NA 332 13.8 26.66667 6 14
## 46 NA 322 11.5 26.11111 6 15
## 47 21 191 14.9 25.00000 6 16
## 48 37 284 20.7 22.22222 6 17
## 49 20 37 9.2 18.33333 6 18
## 50 12 120 11.5 22.77778 6 19
## 51 13 137 10.3 24.44444 6 20
## 52 NA 150 6.3 25.00000 6 21
## 53 NA 59 1.7 24.44444 6 22
## 54 NA 91 4.6 24.44444 6 23
## 55 NA 250 6.3 24.44444 6 24
## 56 NA 135 8.0 23.88889 6 25
## 57 NA 127 8.0 25.55556 6 26
## 58 NA 47 10.3 22.77778 6 27
## 59 NA 98 11.5 26.66667 6 28
## 60 NA 31 14.9 25.00000 6 29
## 61 NA 138 8.0 28.33333 6 30
## 62 135 269 4.1 28.88889 7 1
## 63 49 248 9.2 29.44444 7 2
## 64 32 236 9.2 27.22222 7 3
## 65 NA 101 10.9 28.88889 7 4
## 66 64 175 4.6 28.33333 7 5
## 67 40 314 10.9 28.33333 7 6
## 68 77 276 5.1 31.11111 7 7
## 69 97 267 6.3 33.33333 7 8
## 70 97 272 5.7 33.33333 7 9
## 71 85 175 7.4 31.66667 7 10
## 72 NA 139 8.6 27.77778 7 11
## 73 10 264 14.3 22.77778 7 12
## 74 27 175 14.9 27.22222 7 13
## 75 NA 291 14.9 32.77778 7 14
## 76 7 48 14.3 26.66667 7 15
## 77 48 260 6.9 27.22222 7 16
## 78 35 274 10.3 27.77778 7 17
## 79 61 285 6.3 28.88889 7 18
## 80 79 187 5.1 30.55556 7 19
## 81 63 220 11.5 29.44444 7 20
## 82 16 7 6.9 23.33333 7 21
## 83 NA 258 9.7 27.22222 7 22
## 84 NA 295 11.5 27.77778 7 23
## 85 80 294 8.6 30.00000 7 24
## 86 108 223 8.0 29.44444 7 25
## 87 20 81 8.6 27.77778 7 26
## 88 52 82 12.0 30.00000 7 27
## 89 82 213 7.4 31.11111 7 28
## 90 50 275 7.4 30.00000 7 29
## 91 64 253 7.4 28.33333 7 30
## 92 59 254 9.2 27.22222 7 31
## 93 39 83 6.9 27.22222 8 1
## 94 9 24 13.8 27.22222 8 2
## 95 16 77 7.4 27.77778 8 3
## 96 78 NA 6.9 30.00000 8 4
## 97 35 NA 7.4 29.44444 8 5
## 98 66 NA 4.6 30.55556 8 6
## 99 122 255 4.0 31.66667 8 7
## 100 89 229 10.3 32.22222 8 8
## 101 110 207 8.0 32.22222 8 9
## 102 NA 222 8.6 33.33333 8 10
## 103 NA 137 11.5 30.00000 8 11
## 104 44 192 11.5 30.00000 8 12
## 105 28 273 11.5 27.77778 8 13
## 106 65 157 9.7 26.66667 8 14
## 107 NA 64 11.5 26.11111 8 15
## 108 22 71 10.3 25.00000 8 16
## 109 59 51 6.3 26.11111 8 17
## 110 23 115 7.4 24.44444 8 18
## 111 31 244 10.9 25.55556 8 19
## 112 44 190 10.3 25.55556 8 20
## 113 21 259 15.5 25.00000 8 21
## 114 9 36 14.3 22.22222 8 22
## 115 NA 255 12.6 23.88889 8 23
## 116 45 212 9.7 26.11111 8 24
## 117 168 238 3.4 27.22222 8 25
## 118 73 215 8.0 30.00000 8 26
## 119 NA 153 5.7 31.11111 8 27
## 120 76 203 9.7 36.11111 8 28
## 121 118 225 2.3 34.44444 8 29
## 122 84 237 6.3 35.55556 8 30
## 123 85 188 6.3 34.44444 8 31
## 124 96 167 6.9 32.77778 9 1
## 125 78 197 5.1 33.33333 9 2
## 126 73 183 2.8 33.88889 9 3
## 127 91 189 4.6 33.88889 9 4
## 128 47 95 7.4 30.55556 9 5
## 129 32 92 15.5 28.88889 9 6
## 130 20 252 10.9 26.66667 9 7
## 131 23 220 10.3 25.55556 9 8
## 132 21 230 10.9 23.88889 9 9
## 133 24 259 9.7 22.77778 9 10
## 134 44 236 14.9 27.22222 9 11
## 135 21 259 15.5 24.44444 9 12
## 136 28 238 6.3 25.00000 9 13
## 137 9 24 10.9 21.66667 9 14
## 138 13 112 11.5 21.66667 9 15
## 139 46 237 6.9 25.55556 9 16
## 140 18 224 13.8 19.44444 9 17
## 141 13 27 10.3 24.44444 9 18
## 142 24 238 10.3 20.00000 9 19
## 143 16 201 8.0 27.77778 9 20
## 144 13 238 12.6 17.77778 9 21
## 145 23 14 9.2 21.66667 9 22
## 146 36 139 10.3 27.22222 9 23
## 147 7 49 10.3 20.55556 9 24
## 148 14 20 16.6 17.22222 9 25
## 149 30 193 6.9 21.11111 9 26
## 150 NA 145 13.2 25.00000 9 27
## 151 14 191 14.3 23.88889 9 28
## 152 18 131 8.0 24.44444 9 29
## 153 20 223 11.5 20.00000 9 30
## Ozone Solar.R Wind Temp Month Day
## 1 41 190 7.4 5.2222222 5 1
## 2 36 118 8.0 8.0000000 5 2
## 3 12 149 12.6 9.1111111 5 3
## 4 18 313 11.5 2.4444444 5 4
## 5 NA NA 14.3 -0.8888889 5 5
## 6 28 NA 14.9 4.6666667 5 6
## 7 23 299 8.6 4.1111111 5 7
## 8 19 99 13.8 0.7777778 5 8
## 9 8 19 20.1 1.8888889 5 9
## 10 NA 194 8.6 6.3333333 5 10
## 11 7 NA 6.9 9.1111111 5 11
## 12 16 256 9.7 6.3333333 5 12
## 13 11 290 9.2 4.6666667 5 13
## 14 14 274 10.9 5.7777778 5 14
## 15 18 65 13.2 0.2222222 5 15
## 16 14 334 11.5 3.5555556 5 16
## 17 34 307 12.0 4.6666667 5 17
## 18 6 78 18.4 -0.3333333 5 18
## 19 30 322 11.5 5.7777778 5 19
## 20 11 44 9.7 2.4444444 5 20
## 21 1 8 9.7 0.7777778 5 21
## 22 11 320 16.6 8.5555556 5 22
## 23 4 25 9.7 1.8888889 5 23
## 24 32 92 12.0 1.8888889 5 24
## 25 NA 66 16.6 -0.3333333 5 25
## 26 NA 266 14.9 0.2222222 5 26
## 27 NA NA 8.0 -0.3333333 5 27
## 28 23 13 12.0 5.2222222 5 28
## 29 45 252 14.9 13.0000000 5 29
## 30 115 223 5.7 11.8888889 5 30
## 31 37 279 7.4 10.2222222 5 31
## 32 NA 286 8.6 11.3333333 6 1
## 33 NA 287 9.7 9.1111111 6 2
## 34 NA 242 16.1 5.2222222 6 3
## 35 NA 186 9.2 14.6666667 6 4
## 36 NA 220 8.6 15.2222222 6 5
## 37 NA 264 14.3 11.8888889 6 6
## 38 29 127 9.7 13.5555556 6 7
## 39 NA 273 6.9 16.3333333 6 8
## 40 71 291 13.8 18.0000000 6 9
## 41 39 323 11.5 16.3333333 6 10
## 42 NA 259 10.9 19.6666667 6 11
## 43 NA 250 9.2 19.1111111 6 12
## 44 23 148 8.0 13.5555556 6 13
## 45 NA 332 13.8 12.4444444 6 14
## 46 NA 322 11.5 11.8888889 6 15
## 47 21 191 14.9 10.7777778 6 16
## 48 37 284 20.7 8.0000000 6 17
## 49 20 37 9.2 4.1111111 6 18
## 50 12 120 11.5 8.5555556 6 19
## 51 13 137 10.3 10.2222222 6 20
## 52 NA 150 6.3 10.7777778 6 21
## 53 NA 59 1.7 10.2222222 6 22
## 54 NA 91 4.6 10.2222222 6 23
## 55 NA 250 6.3 10.2222222 6 24
## 56 NA 135 8.0 9.6666667 6 25
## 57 NA 127 8.0 11.3333333 6 26
## 58 NA 47 10.3 8.5555556 6 27
## 59 NA 98 11.5 12.4444444 6 28
## 60 NA 31 14.9 10.7777778 6 29
## 61 NA 138 8.0 14.1111111 6 30
## 62 135 269 4.1 14.6666667 7 1
## 63 49 248 9.2 15.2222222 7 2
## 64 32 236 9.2 13.0000000 7 3
## 65 NA 101 10.9 14.6666667 7 4
## 66 64 175 4.6 14.1111111 7 5
## 67 40 314 10.9 14.1111111 7 6
## 68 77 276 5.1 16.8888889 7 7
## 69 97 267 6.3 19.1111111 7 8
## 70 97 272 5.7 19.1111111 7 9
## 71 85 175 7.4 17.4444444 7 10
## 72 NA 139 8.6 13.5555556 7 11
## 73 10 264 14.3 8.5555556 7 12
## 74 27 175 14.9 13.0000000 7 13
## 75 NA 291 14.9 18.5555556 7 14
## 76 7 48 14.3 12.4444444 7 15
## 77 48 260 6.9 13.0000000 7 16
## 78 35 274 10.3 13.5555556 7 17
## 79 61 285 6.3 14.6666667 7 18
## 80 79 187 5.1 16.3333333 7 19
## 81 63 220 11.5 15.2222222 7 20
## 82 16 7 6.9 9.1111111 7 21
## 83 NA 258 9.7 13.0000000 7 22
## 84 NA 295 11.5 13.5555556 7 23
## 85 80 294 8.6 15.7777778 7 24
## 86 108 223 8.0 15.2222222 7 25
## 87 20 81 8.6 13.5555556 7 26
## 88 52 82 12.0 15.7777778 7 27
## 89 82 213 7.4 16.8888889 7 28
## 90 50 275 7.4 15.7777778 7 29
## 91 64 253 7.4 14.1111111 7 30
## 92 59 254 9.2 13.0000000 7 31
## 93 39 83 6.9 13.0000000 8 1
## 94 9 24 13.8 13.0000000 8 2
## 95 16 77 7.4 13.5555556 8 3
## 96 78 NA 6.9 15.7777778 8 4
## 97 35 NA 7.4 15.2222222 8 5
## 98 66 NA 4.6 16.3333333 8 6
## 99 122 255 4.0 17.4444444 8 7
## 100 89 229 10.3 18.0000000 8 8
## 101 110 207 8.0 18.0000000 8 9
## 102 NA 222 8.6 19.1111111 8 10
## 103 NA 137 11.5 15.7777778 8 11
## 104 44 192 11.5 15.7777778 8 12
## 105 28 273 11.5 13.5555556 8 13
## 106 65 157 9.7 12.4444444 8 14
## 107 NA 64 11.5 11.8888889 8 15
## 108 22 71 10.3 10.7777778 8 16
## 109 59 51 6.3 11.8888889 8 17
## 110 23 115 7.4 10.2222222 8 18
## 111 31 244 10.9 11.3333333 8 19
## 112 44 190 10.3 11.3333333 8 20
## 113 21 259 15.5 10.7777778 8 21
## 114 9 36 14.3 8.0000000 8 22
## 115 NA 255 12.6 9.6666667 8 23
## 116 45 212 9.7 11.8888889 8 24
## 117 168 238 3.4 13.0000000 8 25
## 118 73 215 8.0 15.7777778 8 26
## 119 NA 153 5.7 16.8888889 8 27
## 120 76 203 9.7 21.8888889 8 28
## 121 118 225 2.3 20.2222222 8 29
## 122 84 237 6.3 21.3333333 8 30
## 123 85 188 6.3 20.2222222 8 31
## 124 96 167 6.9 18.5555556 9 1
## 125 78 197 5.1 19.1111111 9 2
## 126 73 183 2.8 19.6666667 9 3
## 127 91 189 4.6 19.6666667 9 4
## 128 47 95 7.4 16.3333333 9 5
## 129 32 92 15.5 14.6666667 9 6
## 130 20 252 10.9 12.4444444 9 7
## 131 23 220 10.3 11.3333333 9 8
## 132 21 230 10.9 9.6666667 9 9
## 133 24 259 9.7 8.5555556 9 10
## 134 44 236 14.9 13.0000000 9 11
## 135 21 259 15.5 10.2222222 9 12
## 136 28 238 6.3 10.7777778 9 13
## 137 9 24 10.9 7.4444444 9 14
## 138 13 112 11.5 7.4444444 9 15
## 139 46 237 6.9 11.3333333 9 16
## 140 18 224 13.8 5.2222222 9 17
## 141 13 27 10.3 10.2222222 9 18
## 142 24 238 10.3 5.7777778 9 19
## 143 16 201 8.0 13.5555556 9 20
## 144 13 238 12.6 3.5555556 9 21
## 145 23 14 9.2 7.4444444 9 22
## 146 36 139 10.3 13.0000000 9 23
## 147 7 49 10.3 6.3333333 9 24
## 148 14 20 16.6 3.0000000 9 25
## 149 30 193 6.9 6.8888889 9 26
## 150 NA 145 13.2 10.7777778 9 27
## 151 14 191 14.3 9.6666667 9 28
## 152 18 131 8.0 10.2222222 9 29
## 153 20 223 11.5 5.7777778 9 30
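The code that produced the two tables above is not shown. A sketch of what it may have looked like, assuming dplyr is loaded: the Temp values in the first table match a call to fahrenheit_to_celsius(), while those in the second table match an inline formula without parentheses, (5/9) * Temp - 32, which subtracts 32 after scaling rather than before. If that reconstruction is right, the second table does not contain valid Celsius temperatures, which is a good illustration of why a tested, named function is safer than retyping a formula.

```r
library(dplyr)

fahrenheit_to_celsius <- function(fahrenheit) {
  celsius <- (5/9) * (fahrenheit - 32)
  return(celsius)
}

# Using the named function: the intent is clear and the formula lives in one place
with_function <- airquality |>
  mutate(Temp = fahrenheit_to_celsius(Temp))

# Retyping the formula inline (assumed reconstruction): the parentheses around
# (Temp - 32) are missing, so the result is not a correct conversion
inline_formula <- airquality |>
  mutate(Temp = (5/9) * Temp - 32)

head(with_function$Temp, 3)
head(inline_formula$Temp, 3)
```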
‣ Sharing: Functions facilitate code sharing, through packages or scripts. We’ll talk about sharing options later.
‣ SIDE NOTE: Functions that take data frames and produce plots are among the most useful. For example, this function creates an epidemic curve from a dataset:
# Function to plot an epidemic curve
plot_epidemic_curve <- function(data, date_column) {
  data %>%
    count({{ date_column }}) %>%
    complete({{ date_column }} := seq(min({{ date_column }}),
                                      max({{ date_column }}), by = "day")) %>%
    ggplot(aes(x = {{ date_column }}, y = n)) +
    geom_col(fill = "#4395D1")
}
# Example usage
plot_epidemic_curve(outbreaks::ebola_sierraleone_2014, date_of_sample)
‣ We’ll explore such functions in a future lesson. For now, we focus on vector manipulation functions to give you a solid foundation.
Create a function named celsius_to_fahrenheit that converts temperature from Celsius to Fahrenheit. Here is the formula for this conversion:
\[ \text{Fahrenheit} = \text{Celsius} \times 1.8 + 32 \]
# Your code here
celsius_to_fahrenheit <- function(celsius){
  fahrenheit <- celsius * 1.8 + 32
  return(fahrenheit)
}
Then test your function on the temp column of the built-in beaver1 dataset in R:
‣ Functions typically have multiple arguments. Let’s see a function with three arguments.
‣ calculate_calories calculates calories from macronutrients.
calculate_calories <- function(carb_grams, protein_grams, fat_grams){
  result <- (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
  return(result)
}
calculate_calories(carb_grams = 50, protein_grams = 25, fat_grams = 10)
## [1] 390
‣ Without all arguments, the function yields an error.
‣ Default values can be set for function arguments.
calculate_calories <- function(carb_grams, protein_grams, fat_grams) {
  result <- (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
  return(result)
}
calculate_calories(50, 25)
## Error in calculate_calories(50, 25): argument "fat_grams" is missing, with no default
‣ We can make all arguments optional with default values.
calculate_calories <- function(carb_grams = 0, protein_grams = 0, fat_grams = 0) {
  result <- (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
  return(result)
}
‣ Now we can call the function with no or some arguments.
## [1] 0
## [1] 300
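Calls like the following would give the two outputs above. The argument values in the second call are an assumption chosen to reproduce 300; any omitted argument falls back to its default of 0:

```r
calculate_calories <- function(carb_grams = 0, protein_grams = 0, fat_grams = 0) {
  result <- (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
  return(result)
}

# No arguments: every default (0) is used
calculate_calories()

# Some arguments: fat_grams is omitted and defaults to 0
calculate_calories(carb_grams = 50, protein_grams = 25)
```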
Create a function named calc_bmi that calculates the Body Mass Index (BMI) for one or more individuals. Keep in mind that BMI is calculated as weight in kg divided by the square of the height in meters. Therefore, this function requires two mandatory arguments: weight and height.
# Your code here
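calc_bmi <- function(weight, height){
  # weight in kg, height in meters; both arguments are mandatory (no defaults)
  bmi <- weight / (height^2)
  return(bmi)
}
Then, apply your function to the medicaldata::smartpill dataset to calculate the BMI for each person. Note that Height in this dataset appears to be recorded in centimeters, so convert it to meters first: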
‣ Scope refers to the visibility of variables and objects within different parts of your R code.
‣ Objects within a function have local scope (as opposed to global scope) and are not accessible outside the function.
‣ Imagine we wrote the pounds_to_kg function like this:
‣ You may be tempted to try to access the kg variable outside of the function, but you’ll get an error:
## Error in eval(expr, envir, enclos): object 'kg' not found
‣ To use a value generated within a function, it must be returned by the function.
‣ Store the function’s result in a global variable to access it.
## [1] 22.68
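A sketch of the scoping behaviour described above, assuming the same pounds_to_kg body used earlier and an input of 50 pounds (which is consistent with the 22.68 output). The name weight_kg is hypothetical:

```r
pounds_to_kg <- function(pounds) {
  kg <- pounds * 0.4536   # kg is local: it exists only while the function runs
  return(kg)
}

pounds_to_kg(50)

# Typing kg at this point would fail with "object 'kg' not found" in a fresh
# session, because the variable was never created in the global environment.

# Store the returned value in a global variable to use it later
weight_kg <- pounds_to_kg(50)
weight_kg
```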
if, else if, and else
‣ Conditionals control the flow of code execution and are especially useful in functions.
‣ R implements conditionals using if, else if, and else statements.
‣ if is used to run code only if a specific condition is true.
‣ Structure of an if statement:
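The general shape, as a minimal runnable sketch. The variable condition is a placeholder introduced for illustration and is not defined elsewhere in the lesson:

```r
condition <- TRUE  # placeholder: any expression that evaluates to a single TRUE or FALSE

if (condition) {
  # code here runs only when condition is TRUE
  message("condition was TRUE")
}
```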
‣ Example: Converting temperature from Celsius to Fahrenheit.
celsius <- 20
convert_to <- "fahrenheit"
# if convert_to is "fahrenheit", create fahrenheit, as c * 9/5 + 32, then print
if (convert_to == "fahrenheit"){
  fahrenheit <- celsius * 9/5 + 32
  print(fahrenheit)
}
## [1] 68
‣ If convert_to is "fahrenheit", the code body is executed; if not, it is skipped.
celsius <- 20
convert_to <- "kelvin"
# if convert_to is "fahrenheit", create fahrenheit, as c * 9/5 + 32, then print
if (convert_to == "fahrenheit"){
  fahrenheit <- celsius * 9/5 + 32
  print(fahrenheit)
}
‣ Add an else clause to handle the case where convert_to is not "fahrenheit".
celsius <- 20
convert_to <- "kelvin"
# add else clause to print original celsius value
if (convert_to == "fahrenheit"){
  fahrenheit <- celsius * 9/5 + 32
  print(fahrenheit)
} else {
  print(celsius)
}
## [1] 20
‣ Use else if to check multiple specific conditions.
celsius <- 20
convert_to <- "kelvin"
if (convert_to == "fahrenheit") {
  fahrenheit <- (celsius * 9/5) + 32
  print(fahrenheit)
} else if (convert_to == "kelvin") {
  kelvin <- celsius + 273.15
  print(kelvin)
} else {
  print(celsius)
}
## [1] 293.15
‣ This code handles three scenarios: converting to Fahrenheit, converting to Kelvin, or keeping the Celsius value.
‣ You can have as many else if statements as you need, but only one else statement attached to an if statement.
‣ Finally, we can encapsulate this logic into a function.
celsius_convert <- function(celsius, convert_to){
  if (convert_to == "fahrenheit"){
    out <- (celsius * 9/5) + 32
  } else if (convert_to == "kelvin"){
    out <- celsius + 273.15
  } else {
    out <- celsius
  }
  return(out)
}
‣ Let’s test the function:
## [1] 68
## [1] 293.15
‣ One problem: silent failure. If we pass in an invalid value for convert_to, the function returns the Celsius value unchanged, without an informative error message.
## [1] 20
## [1] 20
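The calls behind the outputs in this part are not shown. A sketch, assuming an input of 20 degrees Celsius; the unrecognized scale names ("celsius" and "foo") are hypothetical examples:

```r
celsius_convert <- function(celsius, convert_to) {
  if (convert_to == "fahrenheit") {
    out <- (celsius * 9/5) + 32
  } else if (convert_to == "kelvin") {
    out <- celsius + 273.15
  } else {
    out <- celsius
  }
  return(out)
}

# Valid scales
celsius_convert(20, "fahrenheit")
celsius_convert(20, "kelvin")

# Unrecognized scales fall through to the else branch and silently return the input
celsius_convert(20, "celsius")
celsius_convert(20, "foo")
```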
‣ We will need to add some error handling to the function.
A function named check_negatives is designed to analyze a vector of numbers in R and print a message indicating whether the vector contains any negative numbers. However, the function currently has syntax errors.
check_negatives <- function(numbers) {
  x <- numbers
  if (any(x < 0)) {
    print("x contains negative numbers")
  }
  else {
    print("x does not contain negative numbers")
  }
}
Identify and correct the syntax errors in the check_negatives function. After correcting the function, test it with the following vectors to ensure it works correctly:
1. c(8, 3, -2, 5)
2. c(10, 20, 30, 40)
‣ Argument checking is crucial in R functions to ensure inputs are sensible.
‣ Without checks, functions may return incorrect results, fail silently, or fail with an uninformative error message.
‣ Example: Using the celsius_convert() function for temperature conversion.
## [1] 30
‣ Issue: Fails silently when convert_to is not a valid temperature scale.
‣ Solution: Implement argument checking using the stop() function in R.
‣ Example: Validate the convert_to argument. We’ll write the checks out first, then integrate them into the function.
convert_to <- "bad scale"
# if convert_to is not "fahrenheit" or "kelvin", stop() with an error message
if (!convert_to %in% c("fahrenheit", "kelvin")){
  stop("convert_to must be one of 'fahrenheit' or 'kelvin'")
}
‣ Integration into the celsius_convert() function:
celsius_convert <- function(celsius, convert_to){
  # Checking validity
  if (!convert_to %in% c("fahrenheit", "kelvin", "centigrade")){
    stop("convert_to must be one of 'fahrenheit', 'kelvin', or 'centigrade'")
  }
  # Converting value
  if (convert_to == "fahrenheit"){
    out <- (celsius * 9/5) + 32
  } else if (convert_to == "kelvin"){
    out <- celsius + 273.15
  } else if (convert_to == "centigrade"){
    out <- celsius
  }
  return(out)
}
‣ There is no longer a need for else, since stop() will halt execution if convert_to is not valid.
‣ Result: Clear error message for invalid temperature scales.
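To see the check in action without halting a script, the error can be caught with tryCatch(). A sketch, assuming an error message that lists all three accepted scales; the variable name msg is hypothetical:

```r
celsius_convert <- function(celsius, convert_to) {
  # Argument check: stop early with an informative message
  if (!convert_to %in% c("fahrenheit", "kelvin", "centigrade")) {
    stop("convert_to must be one of 'fahrenheit', 'kelvin', or 'centigrade'")
  }
  if (convert_to == "fahrenheit") {
    out <- (celsius * 9/5) + 32
  } else if (convert_to == "kelvin") {
    out <- celsius + 273.15
  } else if (convert_to == "centigrade") {
    out <- celsius
  }
  return(out)
}

# A valid scale still works
celsius_convert(30, "kelvin")

# An invalid scale now raises an error; tryCatch() captures its message
msg <- tryCatch(
  celsius_convert(30, "bad scale"),
  error = function(e) conditionMessage(e)
)
msg
```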
PRO TIP
‣ Balancing Argument Checking: Checking should ensure reliability without overcomplicating the code or impacting performance.
‣ You will develop a sense of the right amount of checking through experience and examining others’ code. For now, note that it is usually good to err on the side of more checking.
Consider the calculate_calories function we wrote earlier:
calculate_calories <- function(carb_grams = 0, protein_grams = 0, fat_grams = 0) {
  result <- (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
  return(result)
}
Write a function called calculate_calories2() that is the same as calculate_calories() except that it checks if the carb_grams, protein_grams, and fat_grams arguments are numeric. If any of them are not numeric, the function should raise an error with a message using the stop() function.
calculate_calories2 <- function(carb_grams = 0, protein_grams = 0, fat_grams = 0) {
  # your code here
  if (!is.numeric(c(carb_grams, protein_grams, fat_grams))){
    stop("All arguments must be numeric")
  }
  result <- (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
  return(result)
}
calculate_calories2("five", 20, 30)
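‣ Important realization and source of errors: if statements are not vectorized. The condition must be a single TRUE or FALSE; since R 4.2.0, a condition of length greater than one is an error (earlier versions used only the first element, with a warning).
‣ Consider this attempt at a function classify_temp for classifying temperature readings.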
classify_temp <- function(temp) {
  if (temp < 35) {
    print("hypothermia")
  } else if (temp >= 35 & temp <= 37) {
    print("normal")
  } else if (temp > 37) {
    print("fever")
  }
}
‣ Works for a single value, but not for vectors.
## [1] "normal"
## Error in if (temp < 35) {: the condition has length > 1
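The calls that produced the output above are not shown. A sketch, assuming a single reading of 36 and a test vector of 36, 37, and 38 (consistent with the "normal" "normal" "fever" results later in this section). tryCatch() is used only so the snippet runs to completion, and the name outcome is hypothetical:

```r
classify_temp <- function(temp) {
  if (temp < 35) {
    print("hypothermia")
  } else if (temp >= 35 & temp <= 37) {
    print("normal")
  } else if (temp > 37) {
    print("fever")
  }
}

# A single value works
classify_temp(36)

# A vector does not: the if condition would have length 3
temp_vec <- c(36, 37, 38)  # assumed example values
outcome <- tryCatch(
  classify_temp(temp_vec),
  error = function(e) conditionMessage(e)  # capture the error message instead of halting
)
outcome
```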
‣ For conditional statements on vectors, we therefore use ifelse() or dplyr::case_when().
classify_temp <- function(temp){
  # if temp is less than 35, return "hypothermia"
  # otherwise, if temp is between 35 and 37, return "normal"
  # otherwise, if temp is greater than 37, return "fever"
  out <- ifelse(temp < 35, "hypothermia",
         ifelse(temp >= 35 & temp <= 37, "normal",
         ifelse(temp > 37, "fever", NA_character_)))
  return(out)
}
classify_temp(temp_vec) # Works for vector
## [1] "normal" "normal" "fever"
‣ dplyr::case_when() is a more readable alternative.
classify_temp <- function(temp) {
  case_when(
    temp < 35 ~ "hypothermia",
    temp >= 35 & temp <= 37 ~ "normal",
    temp > 37 ~ "fever",
    TRUE ~ NA_character_
  )
}
classify_temp(temp_vec) # This also works as expected
## [1] "normal" "normal" "fever"
‣ This function can be seamlessly integrated with data frames.
## # A tibble: 1,000 × 2
## temp temp_classif
## <dbl> <chr>
## 1 36.8 normal
## 2 35 normal
## 3 36.2 normal
## 4 36.9 normal
## 5 36.4 normal
## 6 35.3 normal
## 7 35.6 normal
## 8 37.2 fever
## 9 35.5 normal
## 10 35.3 normal
## # ℹ 990 more rows
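The code and dataset behind the table above are not shown. A sketch of the same pattern on a small hypothetical data frame (the patients tibble and its values are invented for illustration), assuming dplyr is loaded:

```r
library(dplyr)

classify_temp <- function(temp) {
  case_when(
    temp < 35 ~ "hypothermia",
    temp >= 35 & temp <= 37 ~ "normal",
    temp > 37 ~ "fever",
    TRUE ~ NA_character_
  )
}

# Hypothetical temperature readings in degrees Celsius, including a missing value
patients <- tibble(temp = c(36.8, 35, 37.2, 34.5, NA))

# Because classify_temp() is vectorized, it can be used directly inside mutate()
classified <- patients |>
  mutate(temp_classif = classify_temp(temp))
classified
```

The missing reading falls through every condition and ends up as NA, thanks to the final TRUE ~ NA_character_ branch.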
Let’s apply this knowledge to a practical case. Consider the following attempt at writing a function that calculates dosages of the drug isoniazid for adults weighing at least 30 kg:
calculate_isoniazid_dosage <- function(weight) {
  if (weight < 30) {
    stop("Weight must be at least 30 kg.")
  } else if (weight <= 35) {
    return(150)
  } else if (weight <= 45) {
    return(200)
  } else if (weight <= 55) {
    return(300)
  } else if (weight <= 70) {
    return(300)
  } else {
    return(300)
  }
}
This function fails with a vector of weights. Your task is to write a new function calculate_isoniazid_dosage2() that can handle vector inputs. To ensure all weights are at least 30 kg, you’ll use the any() function within your error checking.
Here’s a scaffold to get you started:
calculate_isoniazid_dosage2 <- function(weight) {
  if (any(weight < 30)) stop("Weights must all be at least 30 kg.")
  # Your code here
  out <- ifelse(weight <= 35, 150,
         ifelse(weight <= 45, 200,
         ifelse(weight <= 55, 300,
         ifelse(weight <= 70, 300, 300))))
  return(out)
}
calculate_isoniazid_dosage2(c(30, 40, 50, 100))
## [1] 150 200 300 300
To store your functions in R for future use, you can save them in a script or package. Here’s how:
‣ Create a script file:
‣ Load the script: To use the functions in future sessions, load the script using:
source("path/to/my_functions.R")
‣ Example:
# Save this in my_functions.R
calculate_calories <- function(carb_grams, protein_grams, fat_grams) {
  (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
}
Then load the file:
## [1] 78
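The loading step is not shown above. A self-contained sketch of the round trip, writing the script to a temporary file instead of my_functions.R so that it can be run anywhere. The name script_path and the input values are assumptions; the inputs were chosen so the result matches the 78 printed above:

```r
# Write the function definition to a script file (a temporary file here)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "calculate_calories <- function(carb_grams, protein_grams, fat_grams) {",
  "  (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)",
  "}"
), script_path)

# source() runs the script, which defines calculate_calories in the workspace
source(script_path)

# 10 * 4 + 5 * 4 + 2 * 9
calculate_calories(carb_grams = 10, protein_grams = 5, fat_grams = 2)
```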
‣ Save function objects:
save(calculate_calories2, calc_bmi, pounds_to_kg, years_to_months,
     celsius_convert, calculate_isoniazid_dosage2, celsius_to_fahrenheit,
     fahrenheit_to_celsius, classify_temp,
     file = "my_functions.RData")
‣ Load the file: In future sessions, you can load the .RData file:
## [1] "normal" "normal" "fever"
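The load step is not shown above. A minimal sketch of saving and restoring a single function, using a temporary file rather than my_functions.RData; the simplified classify_temp body, the name rdata_path, and the test vector are assumptions made so the snippet is self-contained:

```r
# A compact, vectorized classifier built with base R's ifelse()
classify_temp <- function(temp) {
  ifelse(temp < 35, "hypothermia",
         ifelse(temp <= 37, "normal", "fever"))
}

# Save the function object to an .RData file
rdata_path <- tempfile(fileext = ".RData")
save(classify_temp, file = rdata_path)

# Simulate a new session by removing the function, then restore it with load()
rm(classify_temp)
load(rdata_path)

classify_temp(c(36, 37, 38))
```

Unlike source(), load() restores the saved objects directly rather than re-running code, so the .RData file is not human-readable.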
If you plan to use your functions frequently across multiple projects, creating a package is a good approach.
Building packages doesn’t have to be intimidating! Thanks to the tidyverse team, getting started is simple with RStudio and the devtools and usethis packages. In this one-hour presentation, you’ll learn the fundamentals of R package development and gain the confidence to start building your own packages!
‣ Click here for slides.
‣ Click here for source code
‣ Click here for more information on devtools
‣ Click here for more information on usethis
‣ Click here for more resources on R Packages book
‣ Learn how to create a package, the fundamental unit of shareable, reusable, and reproducible R code by reading and studying the 2nd edition of R Packages, by Hadley Wickham and Jennifer Bryan
‣ The YouTube video “How to Create Your Own Package in RStudio” is also helpful for creating your own package and storing it on GitHub.
You can also define functions in your .Rprofile file, so they are available every time you start R:
‣ Edit your .Rprofile:
‣ Add your function definitions to the file.
‣ Save and restart your R session to access the functions.
Choose the method that best fits your workflow and the frequency of use for your functions!
years_to_months <- function(years) {
  months <- years * 12
  return(months)
}
# Test
years_to_months(12)
## [1] 144
celsius_to_fahrenheit <- function(celsius) {
  fahrenheit <- celsius * 1.8 + 32
  return(fahrenheit)
}
# Test
beaver1 %>%
  select(temp) %>%
  mutate(Fahrenheit = celsius_to_fahrenheit(temp))
## temp Fahrenheit
## 1 36.33 97.394
## 2 36.34 97.412
## 3 36.35 97.430
## 4 36.42 97.556
## 5 36.55 97.790
## 6 36.69 98.042
## 7 36.71 98.078
## 8 36.75 98.150
## 9 36.81 98.258
## 10 36.88 98.384
## 11 36.89 98.402
## 12 36.91 98.438
## 13 36.85 98.330
## 14 36.89 98.402
## 15 36.89 98.402
## 16 36.67 98.006
## 17 36.50 97.700
## 18 36.74 98.132
## 19 36.77 98.186
## 20 36.76 98.168
## 21 36.78 98.204
## 22 36.82 98.276
## 23 36.89 98.402
## 24 36.99 98.582
## 25 36.92 98.456
## 26 36.99 98.582
## 27 36.89 98.402
## 28 36.94 98.492
## 29 36.92 98.456
## 30 36.97 98.546
## 31 36.91 98.438
## 32 36.79 98.222
## 33 36.77 98.186
## 34 36.69 98.042
## 35 36.62 97.916
## 36 36.54 97.772
## 37 36.55 97.790
## 38 36.67 98.006
## 39 36.69 98.042
## 40 36.62 97.916
## 41 36.64 97.952
## 42 36.59 97.862
## 43 36.65 97.970
## 44 36.75 98.150
## 45 36.80 98.240
## 46 36.81 98.258
## 47 36.87 98.366
## 48 36.87 98.366
## 49 36.89 98.402
## 50 36.94 98.492
## 51 36.98 98.564
## 52 36.95 98.510
## 53 37.00 98.600
## 54 37.07 98.726
## 55 37.05 98.690
## 56 37.00 98.600
## 57 36.95 98.510
## 58 37.00 98.600
## 59 36.94 98.492
## 60 36.88 98.384
## 61 36.93 98.474
## 62 36.98 98.564
## 63 36.97 98.546
## 64 36.85 98.330
## 65 36.92 98.456
## 66 36.99 98.582
## 67 37.01 98.618
## 68 37.10 98.780
## 69 37.09 98.762
## 70 37.02 98.636
## 71 36.96 98.528
## 72 36.84 98.312
## 73 36.87 98.366
## 74 36.85 98.330
## 75 36.85 98.330
## 76 36.87 98.366
## 77 36.89 98.402
## 78 36.86 98.348
## 79 36.91 98.438
## 80 37.53 99.554
## 81 37.23 99.014
## 82 37.20 98.960
## 83 37.25 99.050
## 84 37.20 98.960
## 85 37.21 98.978
## 86 37.24 99.032
## 87 37.10 98.780
## 88 37.20 98.960
## 89 37.18 98.924
## 90 36.93 98.474
## 91 36.83 98.294
## 92 36.93 98.474
## 93 36.83 98.294
## 94 36.80 98.240
## 95 36.75 98.150
## 96 36.71 98.078
## 97 36.73 98.114
## 98 36.75 98.150
## 99 36.72 98.096
## 100 36.76 98.168
## 101 36.70 98.060
## 102 36.82 98.276
## 103 36.88 98.384
## 104 36.94 98.492
## 105 36.79 98.222
## 106 36.78 98.204
## 107 36.80 98.240
## 108 36.82 98.276
## 109 36.84 98.312
## 110 36.86 98.348
## 111 36.88 98.384
## 112 36.93 98.474
## 113 36.97 98.546
## 114 37.15 98.870
calc_bmi <- function(weight, height) {
  # weight in kg, height in meters
  bmi <- weight / (height^2)
  return(bmi)
}
# Test
library(medicaldata)
medicaldata::smartpill %>%
  as_tibble() %>%
  select(Weight, Height) %>%
  # Height is recorded in centimeters, so convert it to meters
  mutate(BMI = calc_bmi(Weight, Height / 100))
## # A tibble: 95 × 3
## Weight Height BMI
## <dbl> <dbl> <dbl>
## 1 102. 183. 30.5
## 2 102. 180. 31.4
## 3 68.0 180. 20.9
## 4 69.9 175. 22.7
## 5 44.9 152. 19.3
## 6 94.8 185. 27.6
## 7 86.2 188. 24.4
## 8 76.2 165. 28.0
## 9 74.4 173. 24.9
## 10 64.9 170. 22.4
## # ℹ 85 more rows
check_negatives <- function(numbers) {
  if (any(numbers < 0)) {
    print("x contains negative numbers")
  } else {
    print("x does not contain negative numbers")
  }
}
# Test
check_negatives(c(8, 3, -2, 5))
## [1] "x contains negative numbers"
check_negatives(c(10, 20, 30, 40))
## [1] "x does not contain negative numbers"
calculate_calories2 <- function(carb_grams = 0, protein_grams = 0, fat_grams = 0) {
  if (!is.numeric(carb_grams)) {
    stop("carb_grams must be numeric")
  }
  if (!is.numeric(protein_grams)) {
    stop("protein_grams must be numeric")
  }
  if (!is.numeric(fat_grams)) {
    stop("fat_grams must be numeric")
  }
  result <- (carb_grams * 4) + (protein_grams * 4) + (fat_grams * 9)
  return(result)
}
calculate_isoniazid_dosage2 <- function(weight) {
  if (any(weight < 30)) stop("Weights must all be at least 30 kg.")
  dosage <- case_when(
    weight <= 35 ~ 150,
    weight <= 45 ~ 200,
    weight <= 55 ~ 300,
    weight <= 70 ~ 300,
    TRUE ~ 300
  )
  return(dosage)
}
calculate_isoniazid_dosage2(c(30, 40, 50, 100))
## [1] 150 200 300 300
Some material in this lesson was adapted from the following sources:
‣ Barnier, Julien. “Introduction à R et au tidyverse.” Accessed May 23, 2022. https://juba.github.io/tidyverse
‣ Wickham, Hadley; Grolemund, Garrett. “R for Data Science.” Accessed May 25, 2022. https://r4ds.had.co.nz/
‣ Wickham, Hadley; Bryan, Jennifer. “R Packages (2e).” Accessed Nov 20, 2024.