Manage Dependencies with the deps R Package for Docker Containers
Posted on October 15, 2022 by Peter Solymos in R bloggers
[This article was first published on R - Hosting Data Apps, and kindly contributed to R-bloggers.]
When building Docker images for your R-based applications, the biggest hurdle is knowing exactly which packages and system libraries your application depends on. Luckily, the tools have evolved quite a bit over the past few years. In this post, I show you where the deps package fits in and why it can be a great choice for dependency management in Docker-based workflows.
Reproducibility
Tools like packrat, renv, and capsule let you go to great lengths to make your R projects perfectly reproducible. This requires knowing the exact version of each package and the source it was installed from (CRAN, remotes, local files). This information is registered in a lock file, which serves as the manifest for recreating an exact replica of the environment.
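As a rough illustration of what such a manifest records, the base R sketch below collects the name, exact version, and recorded repository of a couple of installed packages. The helper record_package() and the field names are hypothetical, and this is not the lock file format written by renv or packrat.

```r
# Hypothetical helper: collect the fields a lock file typically records.
# Illustration only, not the format written by renv or packrat.
record_package <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  repo <- desc[["Repository"]]  # NULL when the field is absent (e.g. base packages)
  list(
    Package = pkg,
    Version = as.character(utils::packageVersion(pkg)),
    Source = if (is.null(repo)) "unknown" else repo
  )
}

# "stats" and "utils" ship with R, so this runs on any installation.
manifest <- lapply(c("stats", "utils"), record_package)
str(manifest)
```

A real lock file stores more than this (for example hashes and remote references), but the idea is the same: every package is tied to one version and one source.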
Full reproducibility is often required for reports, markdown-based documents, and scripts. The result is a loosely defined project combined with strict versioning requirements, often erring on the side of “more dependencies are safer”.
In our previous post, we covered how to manage dependencies with the renv package.
Package-based development
On the other end of the spectrum, we have package-based development. This is the main use case for dependency management-oriented packages, such as remotes and pak.
In this case, exact versions are managed only to the extent of avoiding breaking changes (given that testing can surface these). So what we have is a package-based workflow combined with a “no breaking changes” approach to version requirements. This often leads to leaner installations.
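The difference between the two philosophies comes down to how a version requirement is checked. Here is a minimal sketch using only base R's package_version(); the helper names satisfies_minimum() and satisfies_exact() and the version numbers are made up for illustration.

```r
# A "no breaking changes" requirement accepts any version at or above a floor.
satisfies_minimum <- function(installed, minimum) {
  package_version(installed) >= package_version(minimum)
}

# A lock-file style requirement accepts only the pinned version.
satisfies_exact <- function(installed, pinned) {
  package_version(installed) == package_version(pinned)
}

satisfies_minimum("1.7.4", "1.7.0")  # TRUE: newer than the floor
satisfies_exact("1.7.4", "1.7.0")    # FALSE: not the pinned version
```

The first check lets the installer pick whatever recent version is available, which is why package-style requirements tend to be cheaper to satisfy.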
The middle ground
What if we are not writing an R package but want to combine the best of both approaches: a loosely defined project with just-strict-enough versioning requirements? All this without having to write a DESCRIPTION file by hand. Because why would you need a DESCRIPTION file when you have no package? Also, a DESCRIPTION file won’t let you pin an exact package version or specify alternative CRAN-like repositories.
What if you could manage dependencies by decorating your existing R code with special, roxygen-style comments? Just like this:
#' @remote analythium/ [email protected]
rconfig::config()
#' @repo sf https://r-spatial.r-universe.dev
library(sf)
#' @ver rgl 0.108.3
library(rgl)
This is exactly what deps does:
helps to find all dependencies from our files,
writes these into a dependencies.json file,
performs package installs according to the decorators.
The decorators make our intent explicit, just as if we were writing an R package. But we do not need to manually write these into a file and keep it up to date. We can just rerun create() to update the JSON manifest file.
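To make the idea of decorators concrete, here is a small base R sketch that pulls roxygen-style tags out of a character vector of source lines. It is only an illustration of the concept: parse_tags() is a hypothetical helper, and this is not how deps itself is implemented.

```r
# Hypothetical helper: extract "#' @tag value" comments from lines of R code.
parse_tags <- function(lines) {
  pattern <- "^#'[[:space:]]*@([[:alnum:]_]+)[[:space:]]+(.*)$"
  tagged <- grepl(pattern, lines)
  data.frame(
    tag = sub(pattern, "\\1", lines[tagged]),
    value = trimws(sub(pattern, "\\2", lines[tagged])),
    stringsAsFactors = FALSE
  )
}

# Two of the decorated lines from the example in this post.
code <- c(
  "#' @repo sf https://r-spatial.r-universe.dev",
  "library(sf)",
  "#' @ver rgl 0.108.3",
  "library(rgl)"
)

tags <- parse_tags(code)
print(tags)
```

Each row pairs a tag with its arguments, which is the kind of information a manifest such as dependencies.json needs to carry alongside the plain list of packages found in library() calls.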
Tags
There are many different tags that you can use as part of your roxygen-style comments. The example above uses three of them: @remote, @repo, and @ver.
Using the deps package
The deps package has two main functions:
create() crawls the project directory for package dependencies. It amends the dependency list and package sources based on the comments, queries system requirements for the packages where those are known for a particular platform, and writes the summary into the dependencies.json file.
install() looks for the dependencies.json file in the root of the project directory (or runs create() when the JSON file is not found) and performs dependency installation according to the instructions in the JSON file.
In the simplest case, one might have a project folder with some R code inside. Running deps::install() will perform the package installation in one go. Additional arguments can be passed to install() so that local libraries etc. can be specified.
These arguments are passed to install.packages(). This is an important consideration when using RSPM or BSPM repositories on Linux systems. RSPM (RStudio Package Manager) provides prebuilt binaries, while BSPM (Bridge to System Package Manager) provides full system dependency resolution and integration with apt on top of binary packages.
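As a sketch of how this could look in practice, the snippet below builds an RSPM repository URL and wraps the install step. Several things here are assumptions rather than facts from the deps documentation: the URL pattern and the "focal" (Ubuntu 20.04) default, the helper names rspm_url() and install_project_deps(), and the expectation that deps picks up the repos option set beforehand. The lib argument is a regular install.packages() argument.

```r
# Build an RSPM binary repository URL for a Linux distribution.
# The URL pattern and the "focal" default are assumptions; check your
# package manager instance for the exact address.
rspm_url <- function(distro = "focal", snapshot = "latest") {
  sprintf("https://packagemanager.rstudio.com/cran/__linux__/%s/%s",
          distro, snapshot)
}

# Hypothetical wrapper: point R at the binary repository, then let deps
# install into the chosen library. Run it from the project root.
install_project_deps <- function(lib = .libPaths()[1]) {
  if (!requireNamespace("deps", quietly = TRUE)) {
    stop("The deps package is not installed.")
  }
  options(repos = c(CRAN = rspm_url()))
  deps::install(lib = lib)  # lib is passed on to install.packages()
}

rspm_url()
```

Calling install_project_deps() from the project root would then install the dependencies from binaries where the repository provides them, which can shorten Docker build times considerably.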
Docker workflow
The following example is part of the deps package examples. We will use a Shiny app that we have used before to draw a 3D surface for a bivariate Normal distribution.
3D surface of a bivariate Normal distribution.
Let's say that we have a single file app/app.R with the following content:
library(shiny)
library(MASS)
options(rgl.useNULL = TRUE)
library(rgl)
ui