Minimum R version dependency in R packages
Posted on September 11, 2022 by Posts on R-hub blog in R bloggers
[This article was first published on Posts on R-hub blog, and kindly contributed to R-bloggers.]
There has been much talk and many blog posts about R package dependencies. Yet one special dependency is more rarely mentioned, even though all packages include it: the dependency on R itself. In the same way that you can specify a dependency on a package, and optionally on a specific version, you can add a dependency on a minimum R version in the DESCRIPTION file of your package. In this post we shall explain why and how.
How & why to declare a dependency on a minimum R version?
Although the R project is in a stable state, and prides itself on its solid backward compatibility, it is far from being a dead project. Exciting new features are regularly added to R or some of its base libraries. As a package developer, you may want to use one of these newly added features (such as startsWith(), introduced in R 3.3.0).
In this situation, you should inform users (as well as automated checks from CRAN) that your package only works with R versions at least as recent as a given version [1].
To do so, you should add the required version number to your DESCRIPTION file [2]:
Depends: R (>= 3.5.0)
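As a minimal sketch of how such a declaration can be read back programmatically, the snippet below writes a throwaway DESCRIPTION file and extracts the minimum R version from its Depends field. The package name mypkg, the file contents and the regular expression are illustrative assumptions, not something prescribed by R or CRAN.

```r
# Write a throwaway DESCRIPTION file (hypothetical package, for illustration only)
desc_file <- tempfile()
writeLines(
  c("Package: mypkg", "Version: 0.1.0", "Depends: R (>= 3.5.0), methods"),
  desc_file
)

# read.dcf() parses the Debian-control format used by DESCRIPTION files
depends <- read.dcf(desc_file, fields = "Depends")[1, 1]

# Split the comma-separated field and keep only the entry that refers to R itself
parts <- trimws(strsplit(depends, ",")[[1]])
r_dep <- grep("^R\\s*\\(", parts, value = TRUE)

# Extract the version number and turn it into a comparable version object
min_r <- package_version(sub("^R\\s*\\(>=?\\s*(.+)\\)$", "\\1", r_dep))

# Does the R session running this code satisfy the declared minimum?
satisfied <- getRversion() >= min_r

unlink(desc_file)
```

Here min_r holds the declared minimum as a version object, and satisfied is a single logical value.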
Which minimum R version should your package depend on?
There are different strategies for choosing which R version your package should depend on:
Conservative approach
Some projects prefer to limit the minimum R version by design, rather than by necessity. This means that their packages might work with older R versions, but because they don’t or can’t test it, they’d rather not take the risk and limit themselves to versions for which they are sure the package is working:
this used to be the policy of usethis before 2017 (and therefore, of all packages built with usethis at that time). In the past, usethis added by default a dependency on the R version used by the developer at the time they created the package.
this is the strategy used by the tidyverse, which explicitly decided to guarantee compatibility with the 5 latest R minor releases, but no further. With the current R release cycle, this corresponds to compatibility with R versions up to 5 years old.
‘Wide net’ approach
Conversely, other projects consider that packages are by default compatible with all R versions, until they explicitly add a feature associated with a new R version, or until tests prove otherwise. This is the new policy of usethis (and therefore, of all packages built with usethis). By default, new packages don't have any constraints on the R version. It is the responsibility of the developer to add a minimum required version if necessary.
Transitive approach
Another approach is to look at your package dependencies. If your package already depends on a recent R version indirectly, via one of its recursive dependencies, there is no point in going the extra mile to keep working with older versions. So, a strategy could be to compute your package's transitive minimum R version with the following function and decide that you can use base R features up to this version:
find_transitive_minR […] dplyr::pull(Depends) |>
  strsplit(split = ",") |>
  purrr::map(~ grep("^R ", .x, value = TRUE)) |>
  unlist()

length(r_deps)
[1] 11542

tail(r_deps)
[1] "R (>= 3.5)"    "R (>= 3.1.0)"  "R (>= 2.4.0)"  "R (>= 3.2)"
[5] "R (>= 3.0.0)"  "R (>= 2.13.0)"
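To illustrate the idea behind the transitive approach without querying CRAN, here is a small self-contained sketch. It is not the function from this post: min_r_from_depends() is a hypothetical helper, and the Depends strings are made-up inputs standing in for the Depends fields of your recursive dependencies.

```r
# Hypothetical helper: given the Depends fields of a set of packages,
# return the highest minimum R version declared among them.
min_r_from_depends <- function(depends_fields) {
  # Packages without a Depends field show up as NA; drop them
  fields <- depends_fields[!is.na(depends_fields)]
  parts <- trimws(unlist(strsplit(fields, ",")))

  # Keep only the entries that refer to R itself (not e.g. Rcpp)
  r_deps <- grep("^R\\s*\\(", parts, value = TRUE)
  if (length(r_deps) == 0) {
    return(NULL)
  }

  # Extract the version numbers and return the most demanding one
  vers <- sub("^R\\s*\\(>=?\\s*(.+)\\)$", "\\1", r_deps)
  max(package_version(vers))
}

# Made-up Depends fields standing in for recursive dependencies
deps <- c("R (>= 3.5.0), methods", NA, "R (>= 3.1.0)", "utils, R (>= 4.1)", "Rcpp (>= 1.0.0)")
transitive_min <- min_r_from_depends(deps)
```

With these inputs, the most demanding dependency requires R 4.1, so nothing is gained by restricting your own code to features from older R versions.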
A first result of our analysis is that 62% of CRAN packages specify a minimum R version.
As mentioned earlier, the minimum required version can be specified with a loose or strict inequality:
(r_deps_strict […] table()
r_deps_ver
        0.65        0.99         1.1      1.14.0         1.4       1.4.0
           1           2           1           1           7           2
       1.4.1       1.5.0       1.6.0       1.6.1       1.6.2         1.7
           1           6           1           1           1           1
       1.7.0       1.8.0       1.9.0       1.9.1         2.0       2.0.0
           3          31          11           1          18          59
       2.0.1        2.01         2.1       2.1.0       2.1.1      2.1.14
          13           9           2          16           4           1
       2.1.4       2.1.5        2.10      2.10.0      2.10.1        2.11
           1           1        1578         136          22           1
      2.11.0      2.11.1        2.12      2.12.0      2.12.1        2.13
          13          10           6          45           1           8
      2.13.0      2.13.1      2.13.2        2.14      2.14.0      2.14.1
          37           4           1          38         105          22
      2.14.2        2.15      2.15.0      2.15.1      2.15.2      2.15.3
          13          47          87          60           9           9
        2.16         2.2       2.2.0       2.2.1       2.2.4        2.20
           1           2          23           9           1           1
         2.3       2.3.0       2.3.1      2.3.12       2.3.2         2.4
           2          13           3           1           1           4
       2.4.0       2.4.1         2.5       2.5.0       2.5.1       2.5.3
          24           2           3          30           1           1
        2.50         2.6       2.6.0       2.6.1       2.6.2         2.7
           2           9          40           3           2           5
       2.7.0       2.7.2         2.8       2.8.0       2.8.1         2.9
          27           1           1          23           2           2
       2.9.0       2.9.1       2.9.2         3.0       3.0-0       3.0-2
          28           4           3         236           4           1
       3.0.0       3.0.1       3.0.2       3.0.3       3.0.4        3.00
         750          94         231          33           1          19
      3.00.0         3.1       3.1-0       3.1.0       3.1.1       3.1.2
           1         182           2         583          97         144
       3.1.3        3.10      3.10.0         3.2       3.2.0       3.2.1
          33           2           1         134         397          50
       3.2.2       3.2.3       3.2.4       3.2.5       3.2.6         3.3
          95         110          27          29           1         141
       3.3.0       3.3.1       3.3.2       3.3.3         3.4       3.4.0
         461          63          39          23         181         566
       3.4.1       3.4.2       3.4.3       3.4.4         3.5       3.5-0
          12           5           2           9         339           1
       3.5.0 3.5.0-4.0.2      3.5.00       3.5.1     3.5.1.0       3.5.2
        2207           1           1           5           1           2
       3.5.3        3.50         3.6       3.6.0       3.6.2       3.6.3
           3           7         191         485           3           3
        3.60       3.7.0         4.0       4.0.0       4.0.3       4.0.4
           1           1         194         375           1           1
       4.0.5        4.00         4.1       4.1-0       4.1.0         4.2
           1           3          54           1         142           8
       4.2.0
          31
Interestingly, you can notice that some of these version numbers don't match any actual R release. To confirm this, we can use the rversions package, from R-hub:
setdiff(unique(r_deps_ver), rversions::r_versions()$version)
 [1] "2.10"        "3.0"         "3.6"         "3.5"         "3.4"
 [6] "3.2"         "3.00"        "4.1"         "2.14"        "3.1"
[11] "4.0"         "3.3"         "2.13"        "2.3.2"       "3.1-0"
[16] "2.0"         "2.5"         "2.15"        "4.00"        "3.0-0"
[21] "1.7"         "2.7"         "2.01"        "2.6"         "2.20"
[26] "2.2"         "2.2.4"       "3.50"        "4.2"         "3.10.0"
[31] "2.11"        "2.9"         "3.7.0"       "3.10"        "2.3"
[36] "1.4.0"       "2.5.3"       "3.60"        "2.50"        "2.1.4"
[41] "2.4"         "3.0.4"       "2.1"         "2.12"        "3.5.0-4.0.2"
[46] "3.5.1.0"     "2.8"         "3.00.0"      "2.3.12"      "4.1-0"
[51] "2.16"        "1.14.0"      "2.1.14"      "3.5-0"       "3.5.00"
[56] "3.0-2"       "2.1.5"       "3.2.6"
We can infer the reason for the mismatch for some examples in this list:
missing . between version components (for instance 2.01, 2.50, 3.00, 3.60, 4.00)
. replaced by - in the patch version number (for instance 3.0-0, 3.0-2, 3.1-0, 3.5-0, 4.1-0) [3].
missing patch version number (for instance 2.0, 2.2, 4.3)
extra patch version number (for instance 1.4.0)
recommended packages depend on a yet-to-be-released R version (4.3)
Note that these values are not syntactically wrong, and in some cases they might be intended by the author. They can be read and understood by the relevant functions in base R (in particular, install.packages()), but they may not correspond to what the package author was expecting, or trying to communicate. For example, in the case of R (>= 3.60): even if the author really intended to depend on R 3.6.0 as we assume here, the package cannot be installed on R versions earlier than 4.0.0, because 3.60 is read as minor version 60, which is greater than any 3.6.x release.
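The behaviour described above can be checked directly with package_version(). The sketch below is only an illustration: the releases vector is a small hand-picked subset of real R releases rather than the full list from rversions, and first_satisfying_release() is a hypothetical helper.

```r
# "3.60" is parsed as major 3, minor 60: newer than any 3.6.x, older than 4.0.0
declared <- package_version("3.60")
newer_than_3_6_3 <- declared > package_version("3.6.3")
older_than_4_0_0 <- declared < package_version("4.0.0")

# A dash in place of the last dot is purely stylistic
dash_equals_dot <- package_version("3.5-0") == package_version("3.5.0")

# Hand-picked subset of real R releases (assumption: enough for this illustration)
releases <- package_version(c("3.5.0", "3.6.0", "3.6.3", "4.0.0", "4.1.0"))

# Hypothetical helper: oldest release in `releases` that satisfies `>= declared`
first_satisfying_release <- function(declared, releases) {
  candidates <- releases[releases >= package_version(declared)]
  if (length(candidates) == 0) {
    return(NA_character_)
  }
  as.character(min(candidates))
}

effective_min_360 <- first_satisfying_release("3.60", releases)
effective_min_35_0 <- first_satisfying_release("3.5-0", releases)
```

Within this subset, a declared R (>= 3.60) is first satisfied by R 4.0.0, whereas R (>= 3.5-0) is satisfied by R 3.5.0 itself.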
To visualise the actual minimum R version corresponding to the declared R dependency, we can do the following:
r_vers […]

[…] >= 3.5.0 are by default only compatible with R >= 3.5.0. However, these are nothing more than educated guesses, and only a proper, in-depth analysis could confirm what made developers switch to a newer R version. This analysis could look at diffs between package versions and see which new R features packages are using when they bump the R version dependency.
How to avoid depending on a new version?
For the various reasons presented above, it might not always be desirable to depend on a very recent R version. In this kind of situation, you may want to use the backports package. It reimplements many of the new features from more recent R versions. This way, instead of having to depend on a newer R version, you can simply add a dependency on backports, which is easier to install than a newer R version for users in highly controlled environments.
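Conceptually, a backport is a reimplementation that is only used when the running R version lacks the original. The following sketch shows that idea by hand for startsWith(). It is not how backports is implemented internally, and starts_with_fallback() and starts_with_compat() are hypothetical names chosen for this example.

```r
# Hand-written stand-in for base::startsWith(), usable on R < 3.3.0
starts_with_fallback <- function(x, prefix) {
  substring(x, 1L, nchar(prefix)) == prefix
}

# Use the base implementation when it exists, the fallback otherwise
starts_with_compat <- if (exists("startsWith", envir = baseenv(), inherits = FALSE)) {
  base::startsWith
} else {
  starts_with_fallback
}

result <- starts_with_compat(c("apple", "banana"), "ap")

# With the backports package itself, the usual pattern is to import the
# needed functions when your package is loaded, for example in R/zzz.R:
#
# .onLoad <- function(libname, pkgname) {
#   backports::import(pkgname, c("startsWith", "endsWith"))
# }
```

In practice you would rely on backports rather than maintain such fallbacks yourself, since it already covers many functions and edge cases.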
Backports is not a silver bullet though, as some new features are impossible to reimplement in a package. Notably, this is the case for the native R pipe (|>), introduced in R 4.1.0. Roughly speaking, this is because it is not simply a new function, but rather an entirely new way to read R code.
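One way to see that the native pipe is syntax rather than a function is to look at what the parser produces. The short sketch below must be run on R 4.1.0 or later, because the string containing |> cannot be parsed on older versions.

```r
# Parse a piped expression from text; on R < 4.1.0 this call fails with a
# syntax error, which is exactly why the pipe cannot be backported.
piped <- parse(text = "c(1, 4, 9) |> sqrt()")[[1]]

# The equivalent nested call, written without the pipe
plain <- quote(sqrt(c(1, 4, 9)))

# The parser has already rewritten the pipe into an ordinary call
same_call <- identical(piped, plain)

value <- eval(piped)
```

Since the pipe disappears at parse time, there is no function that a package could define to provide it on older R versions.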
How to test that you depend on the correct version?
It is easy to make a mistake when specifying a minimum R version, and to forget that you use a recent R feature. For this reason, you should always try to verify that your minimum R version claim is accurate.
The most complete approach is to run your tests, or at least verify that the package can be built without errors, on all older R versions you claim to support. For this, locally, you could use rig, which allows you to install multiple R versions on your computer and switch between them with a single command. But a more convenient way to do so is to rely on continuous integration platforms, where existing workflows are already set up to run on multiple R versions. For example, if you choose to replicate the tidyverse policy of supporting the 5 latest minor releases of R, your best bet is probably to use the check-full.yaml GitHub Actions workflow from r-lib/actions [4].
But this extensive test may prove challenging in some cases. In particular, the actions provided by r-lib/actions use rcmdcheck, which itself depends on R 3.3 (via digest). This means that you'll have to write your own workflows if you wish to run R CMD check on older R versions. Some packages that place a high value on being compatible with older R versions, such as data.table, have taken this route and developed their own continuous integration scripts.
A more lightweight approach (although a little more prone to false negatives) is to use the backport_linter() function provided by the lintr package. It works by matching your code against a list of functions introduced in more recent R versions. Note that this approach might also produce false positives if you use functions with the same name as recent base R functions.
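To give a feel for how such name matching works, here is a deliberately tiny toy version. It is not lintr's implementation: the introduced_in table is a hand-written excerpt, and find_too_recent() is a hypothetical helper.

```r
# Hand-written excerpt: base R functions and the R version that introduced them
introduced_in <- c(
  trimws     = "3.2.0",
  startsWith = "3.3.0",
  endsWith   = "3.3.0",
  isFALSE    = "3.5.0"
)

# Hypothetical helper: names used in `code` that are newer than `min_r`
find_too_recent <- function(code, min_r) {
  # all.names() lists every symbol and function name in the parsed code
  used <- intersect(all.names(parse(text = code)), names(introduced_in))
  too_new <- package_version(unname(introduced_in[used])) > package_version(min_r)
  used[too_new]
}

flagged <- find_too_recent("if (isFALSE(x)) trimws(y)", min_r = "3.3.0")
```

Because it only compares names, this toy shares the limitation mentioned above: a user-defined function called isFALSE would be flagged too. With lintr itself, a call along the lines of lintr::lint_package(linters = lintr::backport_linter(r_version = "3.3.0")) performs the real check.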
Conclusion
As you’ve seen, there are quite a lot of strategies and subtleties in setting a minimum R dependency for your package: you could adopt the tidyverse approach of supporting the five latest R minor releases, or choose to keep compatibility with older R versions and use backports if necessary. In all cases, you should try to verify that your declared minimum R version is correct: by using the dedicated linter from the lintr package, or by actually running your tests on older R versions. Whatever you end up doing, and even if this topic may seem complex, we believe the tips we presented here are specific cases of more general software development tips:
use automated tools to assist you in your work;
try to empathize with your users and minimize the friction necessary to install and use your tool;
look at what other developers in the community are doing.
[1] Note that there is no mechanism to make your package compatible only with older R versions, and not with the more recent ones. Packages are supposed to work with the latest R versions.
[2] In theory, it is not strictly required to use >=. You could use a strict inequality (>) but as we will see later, this is a very uncommon option so we recommend you use the de facto community standard and stick to >=.
[3] However, it is interesting to note that package_version("3.5-0") == package_version("3.5.0"). The use of - instead of . is purely stylistic.
[4] Instead of manually copying this file, you can run usethis::use_github_action("check-full") in your package folder.