The future.apply package provides parallel implementations of common "apply" functions provided by base R. The parallel processing is performed via the future ecosystem, which provides a large number of parallel backends, e.g. on the local machine, a remote cluster, and a high-performance compute cluster.
Currently implemented functions are:
future_Map(): a parallel version of Map()
Reproducibility is part of the core design, which means that perfect, parallel random number generation (RNG) is supported regardless of the amount of chunking, type of load balancing, and future backend being used.
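As a minimal sketch of this guarantee, the snippet below draws random numbers once sequentially and once on two local workers, using the same fixed seed through the future.seed argument. The seed value 42, the worker count, and the variable names are arbitrary choices made for illustration.

```r
library(future.apply)

# Draw five random numbers sequentially with a fixed seed
plan(sequential)
y_seq <- future_sapply(1:5, function(i) rnorm(1), future.seed = 42)

# Draw them again on two local background R sessions with the same seed
plan(multisession, workers = 2)
y_par <- future_sapply(1:5, function(i) rnorm(1), future.seed = 42)

# Restore the default backend
plan(sequential)

# The two vectors are expected to be identical
print(identical(y_seq, y_par))
```

Passing a fixed seed makes the result independent of how the elements are split into chunks across workers.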
Because the future_*() functions have the same arguments as the corresponding base R functions, starting to use them is often as simple as renaming the function in the code. For example, after attaching the package with library(future.apply), code such as y <- lapply(x, quantile, probs = 1:3/4) can be updated to:
y <- future_lapply(x, quantile, probs = 1:3/4)
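To illustrate the renaming, here is a small self-contained sketch. The list x is a made-up input, since no particular data is given here; any list of numeric vectors would do.

```r
library(future.apply)

# Hypothetical input data, chosen only for illustration
x <- list(a = 1:10, b = c(2.5, 7, 11, 40), c = seq(0, 1, by = 0.1))

# Base R version
y_base <- lapply(x, quantile, probs = 1:3/4)

# Futurized version: same arguments, different function name
y_future <- future_lapply(x, quantile, probs = 1:3/4)

# Both calls are expected to give the same result
print(isTRUE(all.equal(y_base, y_future)))
```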
The default setting in the future framework is to process code sequentially. To run the above in parallel on the local machine (on any operating system), call plan(multisession) first. That's it!
To go back to sequential processing, use plan(sequential).
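Switching between local parallel and sequential processing can be sketched as follows. The worker count of 2 and the input vector are assumptions for the example; nbrOfWorkers() from the future package reports how many workers the current plan uses.

```r
library(future.apply)

# Run on two background R sessions on the local machine
plan(multisession, workers = 2)
n_parallel <- nbrOfWorkers()
squares <- future_sapply(1:4, function(i) i^2)

# Go back to sequential processing
plan(sequential)
n_sequential <- nbrOfWorkers()

print(c(parallel = n_parallel, sequential = n_sequential))
print(squares)
```

Only the plan() calls change; the future_sapply() call itself stays the same under either backend.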
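If you have access to multiple machines on your local network, use plan(cluster) with a workers vector of host names, repeating a name to start more than one worker on that machine. A four-element vector in which one host name appears twice sets up four workers: two on that machine and one on each of the other two (for instance n3).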
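The worker layout described above can be sketched as follows. The host names n1, n2, and n3, and the choice of which one is listed twice, are hypothetical. The plan() call is left commented out because it only works when the hosts are reachable.

```r
# Hypothetical host names; replace with machines you can actually reach
workers <- c("n1", "n2", "n2", "n3")

# Each element is one worker, so a repeated name means
# several workers on that machine
worker_counts <- table(workers)
print(worker_counts)

# Not run here, since it requires the hosts above to be reachable:
# library(future.apply)
# plan(cluster, workers = workers)
```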
If you have SSH access to some remote machines, use:
plan(cluster, workers = c("m1.myserver.org", "m2.myserver.org"))
See the future package and future::plan() for more examples.
The future.batchtools package provides support for high-performance compute (HPC) cluster schedulers such as SGE, Slurm, and TORQUE / PBS. For example,
plan(batchtools_slurm): Process via a Slurm scheduler job queue.
plan(batchtools_torque): Process via a TORQUE / PBS scheduler job queue.
This builds on top of the queuing framework that the batchtools package provides. For more details on backend configuration, please see the future.batchtools and batchtools packages.
These are just a few examples of parallel/distributed backends for the future ecosystem. For more alternatives, see the 'Reverse dependencies' section on the future CRAN package page.