Targets:
Let's consider an example: I need to run a statistical analysis that results in a Sankey diagram.
In a script.R:
# Dummy collection of ADaM datasets: ADSL and ADPASI
raw <- read.table(
  file.path(
    "https://raw.githubusercontent.com/VIS-SIG/Wonderful-Wednesdays",
    "master/data/2021/2021-04-14/WWW_SustainedResponse.csv"
  ),
  header = TRUE,
  sep = ",",
  stringsAsFactors = FALSE
)
library(tidyr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
raw <- as_tibble(raw)
adsl <- raw %>%
  distinct(USUBJID, TRT, BASELINE) %>%
  mutate(
    ARMCD = factor(
      TRT,
      levels = c(
        "COMPARATOR TREATMENT",
        "ACTIVE TREATMENT DOSE 01",
        "ACTIVE TREATMENT DOSE 02"
      ),
      labels = c("ARM A", "ARM B", "ARM C")
    )
  )
adpasi <- raw %>%
  select(-TRT) %>%
  gather(key = "AVISIT", value = "AVAL", -USUBJID) %>%
  mutate(
    AVISIT = ifelse(AVISIT == "BASELINE", "WEEK00", AVISIT),
    PARAMCD = "PASITOT"
  ) %>%
  select(USUBJID, PARAMCD, AVISIT, AVAL)
# Analysis dataset preprocessing
# Function: replace the missing values of a factor by a "Missing" level
add_missing <- function(x) {
  ll <- levels(x)
  ll <- c(ll, "Missing")
  x <- as.character(x)
  x <- ifelse(is.na(x), "Missing", x)
  factor(x, levels = ll, exclude = NULL)
}
set.seed(3)
ads <- adpasi %>%
  filter(AVISIT %in% c("WEEK00", "WEEK01", "WEEK08", "WEEK52")) %>%
  mutate(
    time = factor(AVISIT),
    rsp = cut(AVAL, breaks = 3, labels = c("Low", "Mid", "High")),
    rsp = add_missing(rsp),
    subj = gsub("^SUBJECT (.*)$", x = USUBJID, replacement = "\\1")
  ) %>%
  select(subj, time, rsp) %>%
  filter(subj %in% sample(unique(subj), 200)) %>%
  arrange(rsp, time, subj)
# Generate the graphic
library(ggalluvial)
#> Loading required package: ggplot2
color_scale <- setNames(
  viridis::viridis(4, begin = .2, end = .8, option = "C", direction = -1),
  nm = levels(ads$rsp)
)
ggplot(ads, aes(x = time, stratum = rsp, alluvium = subj, fill = rsp)) +
  geom_stratum(colour = NA) +
  geom_flow(stat = "alluvium", color = "gray85", lwd = .01) +
  scale_fill_manual(values = color_scale)
Result: it works, and:
add_missing()).
clean_slate(): figure annotations (header, title, notes and footer).
preview(): preview the generated pdf itself.
Either use devtools::create(), or follow the RStudio user interface in 5 clicks:

Click 1 - Create a new project

Click 2 - in a new directory

Click 3 - R Package using devtools

Click 4 - Create project

Click 5 - Already set up
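
The console route mentioned above can be sketched as follows. This is a minimal example, assuming devtools (and usethis, which it wraps) is installed; the package name `sankeypkg` and the temporary location are hypothetical, chosen only so that the example leaves nothing behind.

```r
# Hypothetical package name, created in a throwaway location
pkg_path <- file.path(tempdir(), "sankeypkg")

# Create the package skeleton without opening a new RStudio session
devtools::create(pkg_path, open = FALSE)

# Inspect what was generated (a DESCRIPTION file and an R/ folder, among others)
list.files(pkg_path, all.files = TRUE, no.. = TRUE)
```

For a real analysis, `pkg_path` would point to the project directory rather than to `tempdir()`.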
inst/, a permissive area of the package.
R/ contains the documented functions:
adam_ww().
add_missing(), similar to addNA().
preview()
clean_slate() family.
The markup system roxygen2 helps to write complete documentation. Example with the function add_missing() in R/add_missing.R:
#' Factor: NA is "Missing" level
#'
#' Assign the value "Missing" to missing values of a factor. Built as [addNA()].
#'
#' @param x (`factor`)
#' @param missing_lvl (`character`)
#'
#' @export
#' @examples
#'
#' animals <- as.factor(c("cat", "dog", NA))
#' animals
#' add_missing(animals)
#'
add_missing <- function(x, missing_lvl = "Missing") {
  assertthat::assert_that(is.factor(x))
  ll <- levels(x)
  ll <- c(ll, missing_lvl)
  x <- as.character(x)
  x <- ifelse(is.na(x), missing_lvl, x)
  factor(x, levels = ll, exclude = NULL)
}
Transferring a script into a package is not a drastic change:
script.R
R/.
Frequently asked questions:
<F2>: Go to function definition.
source()":
devtools::load_all(): loads the whole package; all functions become available, as after a library() call for an installed package (even the manual pages).
usethis::use_readme_rmd(), use knitr::read_chunk("inst/study.R") to screen the script sections, and use empty but named R code chunks to decide which sections to execute, and in which order.
Standard format comes with standard support tools:
<ctrl>+<shift>+<d>, equivalent to devtools::document(), updates the documentation (populates the NAMESPACE file and man/ folder).
<ctrl>+<shift>+<e>, equivalent to devtools::check(), verifies a large number of standards (e.g. functions are well documented, dependencies are well accounted for, examples are working).
Devtools: look at the Build panel
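
To see what the documentation shortcut populates, here is a minimal sketch run from the console instead of the Build panel. It assumes devtools and roxygen2 are installed; the package `demopkg` is hypothetical and lives in a temporary directory, and the function is a shortened base-R variant of add_missing() (with stopifnot() standing in for assertthat), written out only to make the example self-contained.

```r
# Hypothetical throwaway package
pkg_path <- file.path(tempdir(), "demopkg")
devtools::create(pkg_path, open = FALSE)

# Write one documented function into R/ (shortened variant of add_missing())
writeLines(
  c(
    "#' Factor: NA becomes an explicit level",
    "#'",
    "#' @param x (`factor`)",
    "#' @param missing_lvl (`character`)",
    "#' @export",
    "add_missing <- function(x, missing_lvl = 'Missing') {",
    "  stopifnot(is.factor(x))",
    "  ll <- c(levels(x), missing_lvl)",
    "  x <- as.character(x)",
    "  x[is.na(x)] <- missing_lvl",
    "  factor(x, levels = ll)",
    "}"
  ),
  file.path(pkg_path, "R", "add_missing.R")
)

# Same action as <ctrl>+<shift>+<d>: regenerate NAMESPACE and man/
devtools::document(pkg_path)

readLines(file.path(pkg_path, "NAMESPACE"))
list.files(file.path(pkg_path, "man"))
```

After this call, the NAMESPACE file should list the exported function and man/ should hold its manual page. devtools::check(pkg_path) would be the console counterpart of the second shortcut; it is left out here because it takes noticeably longer.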
lintr::lint():

Lintr: keep up with code quality (and prove it)
pkgdown:
README.md becomes the home page, functions are documented under Reference, and vignettes are added under Articles.
adam_ww(), will bring you to the right page.
And much more:
inst/rmarkdown/.
vignettes/: use (e.g. the present document).
tests/: a place for automated tests (e.g. test that, in the ADSL data returned by adam_ww(), USUBJID is the unique row identifier).
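
Such automated tests could look like the sketch below, assuming the testthat package is used. The small `adsl` data frame is a hypothetical stand-in for the data returned by adam_ww(), and add_missing() is copied here with stopifnot() in place of assertthat so that the example runs on its own; inside the package, the tests would call the package functions directly.

```r
library(testthat)

# Hypothetical stand-in for the ADSL dataset returned by adam_ww()
adsl <- data.frame(
  USUBJID = c("SUBJECT 0001", "SUBJECT 0002", "SUBJECT 0003"),
  ARMCD = factor(c("ARM A", "ARM B", "ARM C")),
  stringsAsFactors = FALSE
)

# Local copy of add_missing(), base R only, for a self-contained example
add_missing <- function(x, missing_lvl = "Missing") {
  stopifnot(is.factor(x))
  ll <- c(levels(x), missing_lvl)
  x <- as.character(x)
  x <- ifelse(is.na(x), missing_lvl, x)
  factor(x, levels = ll, exclude = NULL)
}

test_that("USUBJID is the unique row identifier of ADSL", {
  expect_false(anyNA(adsl$USUBJID))
  expect_equal(anyDuplicated(adsl$USUBJID), 0L)
})

test_that("add_missing() turns NA into an explicit level", {
  animals <- factor(c("cat", "dog", NA))
  res <- add_missing(animals)
  expect_equal(levels(res), c("cat", "dog", "Missing"))
  expect_false(anyNA(res))
})
```

In a package, a file like this would typically sit under tests/testthat/ and be run with devtools::test() or as part of devtools::check().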