It all seems so simple. You install R packages on your system with install.packages("package"). This downloads the code and saves it on your computer. You can see what you’ve installed with installed.packages(). And when you want to use a package in your code, you start the code with library(package).
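As a rough sketch of that workflow, the snippet below checks whether a package is installed before attaching it. It uses the stats package as a stand-in because it ships with R, so nothing needs to be downloaded. The variable names are only illustrative, and the install.packages() line is commented out because it needs a CRAN mirror and a network connection.

```r
# Name of the package to work with; "stats" ships with R, so this runs offline.
pkg <- "stats"

# installed.packages() returns a matrix with one row per installed package.
installed <- rownames(installed.packages())
is_installed <- pkg %in% installed

# Install only when missing (commented out: needs a CRAN mirror and a network).
# if (!is_installed) install.packages(pkg)

# Attach the package so its functions can be called by bare name.
# character.only = TRUE tells library() that pkg holds the name as a string.
if (is_installed) library(pkg, character.only = TRUE)
```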
But here’s where it gets crazy. Some packages require other packages to work. For example, take the package tibble. It’s pretty simple – it just makes tibbles, which are R dataframes that act more sanely and consistently and print out more nicely than standard dataframes.
But notice the third line of CRAN’s description of tibble:
Imports: cli, crayon, methods, pillar (>= 1.1.0), rlang, utils
When you install the tibble package, R will also download and install all those other packages (if you don’t already have them). But when you library the tibble package, only the functions in the tibble package itself become directly accessible to you.
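If you'd rather check a package's imports from inside R than on CRAN, something like the following should work. The helper imports_of() is a made-up name for this sketch, not part of any package; it simply reads the Imports column that installed.packages() reports.

```r
# Hypothetical helper: return the Imports field recorded for an installed
# package, or NA if the package is not installed.
imports_of <- function(pkg) {
  ip <- installed.packages()
  if (!pkg %in% rownames(ip)) return(NA_character_)
  unname(ip[pkg, "Imports"])
}

imports_of("tibble")  # the Imports string if tibble is installed, otherwise NA
imports_of("stats")   # stats ships with R, so it is always found
```

The exact list you see for tibble depends on which version you have installed, so it may not match the line quoted above.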
But wait, there’s more. Functions inside the other packages are indirectly accessible to tibble, which can get at them by means of a prefix before the function name. The prefix consists of the name of the package and a double colon (::). A package whose functions are called this way is loaded into R’s memory, but its functions aren’t directly accessible to you. R describes such a package as “loaded via a namespace”. If you look at tibble’s code for its glimpse() function on GitHub, you’ll see the following line:
data_width <- width - crayon::col_nchar(var_names) - 2
The crayon:: part is calling a function in the crayon package. If the package hasn’t been loaded into memory yet, first R will automatically and quietly load it. Your own code can also call any function in an installed package in the same way.
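Here is a small sketch of the same idea using the tools package, chosen only because it ships with R and isn't attached by default. Calling one of its functions with the prefix loads the namespace without attaching the package.

```r
# Call a function through its namespace without attaching the package.
ext <- tools::file_ext("results.csv")
print(ext)

# The namespace is now loaded into memory...
print(isNamespaceLoaded("tools"))

# ...but the package is on the search path only if something attached it.
print("package:tools" %in% search())
```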
In fact, you could never use library() at all and just refer to every function in a package with the package name and double colon prefix. So all the library() function actually does is make sure the package is installed and allow you to call its functions without the prefix! It’s really just a version of attach().
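To see that attaching is really all that happens, you can compare R's search path before and after a library() call. This is only a sketch, again using tools as a convenient package that ships with R; the variable names are illustrative.

```r
# Record the search path before attaching anything.
before <- search()

# Attach tools (ships with R, not attached by default).
library(tools)

# Entries that library() added to the search path, if any.
newly_attached <- setdiff(search(), before)
print(newly_attached)

# toTitleCase() lives in tools; once attached, the prefix is optional.
with_prefix <- tools::toTitleCase("hello world")
without_prefix <- toTitleCase("hello world")
print(identical(with_prefix, without_prefix))
```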
If you’d like to see what packages are attached and what packages are loaded at any given time, use the function sessionInfo() (note: it’s session, not system) and look at the last chunks of information it provides.
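If you want those chunks as data rather than printed text, the object sessionInfo() returns can be picked apart. A minimal sketch:

```r
info <- sessionInfo()

# Attached packages that ship with base R (base, stats, utils, ...).
print(info$basePkgs)

# Other attached packages; NULL if you have not attached any.
print(names(info$otherPkgs))

# Packages loaded via a namespace but not attached; NULL if there are none.
print(names(info$loadedOnly))
```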
The tidyverse: installed vs attached packages
Once you understand all that, how the tidyverse works will suddenly make a lot more sense. Here’s a description of the package from the tidyverse CRAN info.
The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step. Learn more about the 'tidyverse' at <https://tidyverse.org>.
The library(tidyverse) call attaches the following packages:
But install.packages("tidyverse") actually installs a whole boatload of additional packages, including a few whose functions you might want to access directly:
In the latest version of our LEMRS installation script we have R install the entire tidyverse, so we get the whole boatload of packages. But inside the Open-Meta app, library(tidyverse) doesn’t attach some of the ones we want to use. That’s why we have to library() them in separately.
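One way to find out which tidyverse packages are installed but not attached is sketched below. It assumes the tidyverse is installed and does nothing otherwise; the variable names are illustrative, and the result will vary with your tidyverse version.

```r
# Stays empty if the tidyverse is not installed.
not_attached <- character(0)

if (requireNamespace("tidyverse", quietly = TRUE)) {
  # Every package the tidyverse counts as part of itself.
  all_pkgs <- tidyverse::tidyverse_packages()

  # Attach the core packages.
  library(tidyverse)

  # Names of the packages currently attached in this session.
  attached <- .packages()

  # Tidyverse packages that are not attached.
  not_attached <- setdiff(all_pkgs, attached)
}

print(not_attached)
```

Anything listed in not_attached that you need would then get its own library() call, or you could reach its functions with the double colon prefix.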
library() vs require()
Finally, let’s talk a moment about the library() function’s twin, require(). They both attach packages. The difference is in what they do if the package isn’t installed. In that case library() will issue a fatal error and stop the program. The require() function, on the other hand, returns FALSE and issues a warning, but your program can continue.
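The difference is easy to see with a package name that doesn't exist. In this sketch the name notARealPackageXYZ is made up for the purpose, and tryCatch() is there only so the error from library() doesn't stop the script.

```r
# A package name assumed not to exist on any system.
fake <- "notARealPackageXYZ"

# library() signals an error for a missing package; catch it to keep going.
lib_result <- tryCatch({
  library(fake, character.only = TRUE)
  "attached"
}, error = function(e) "error")

# require() returns FALSE with a warning; suppress the warning here.
req_result <- suppressWarnings(require(fake, character.only = TRUE))

print(lib_result)  # "error"
print(req_result)  # FALSE
```

This is why require() tends to show up inside if statements, while library() is the safer choice at the top of a script where you want a missing package to stop everything.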
If you write a library() call like this: x <- library() and then examine x, what you find depends on the parameter. With no parameter, x is a libraryIQR object that lists all of your installed packages by the folders they reside in. With a valid package name, x is a character vector with the names of all the packages that are currently attached. An invalid package name returns that fatal error.
With x <- require(), on the other hand, you’ll get a fatal error with no parameter. Otherwise x will be TRUE for a package that is available on your system and FALSE for a package that’s never been installed.
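A quick sketch of those return values, using stats because it is always installed:

```r
# No argument: an object describing every installed package, by library folder.
x <- library()
print(class(x))

# Valid package name: the names of all currently attached packages.
y <- library(stats)
print(y)

# require() returns a single logical value.
z <- require(stats)
print(z)
```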
Both library() and require() insist that you pass in only one package name at a time. Neither accepts a vector of package names.
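If you do have a vector of package names, a common workaround is to loop over it and call library() once per name. A minimal sketch, with a package list chosen only because both ship with R:

```r
# Illustrative list; both packages ship with R.
pkgs <- c("stats", "tools")

# library() wants one name at a time, so call it once per element.
# character.only = TRUE makes library() treat each element as a name string.
invisible(lapply(pkgs, library, character.only = TRUE))

# Confirm that every package in the vector is now on the search path.
print(all(paste0("package:", pkgs) %in% search()))
```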