Find one or more cases from a pool data.frame that match cases in a target data.frame.
Match exactly and/or by distance (sum of squared distances).
Usage
matchcases(
  target,
  pool,
  n_matches = 1,
  target_id = NULL,
  pool_id = NULL,
  exactmatch_factors = TRUE,
  exactmatch_cols = NULL,
  distmatch_cols = NULL,
  norepeats = TRUE,
  ignore_na = FALSE,
  verbosity = 1L
)
Arguments
- target
data.frame you are matching against
- pool
data.frame you are looking for matches from
- n_matches
Integer: Number of matches to return
- target_id
Character: Column name in target that holds unique case IDs. Default = NULL, in which case integer case numbers will be used.
- pool_id
Character: Same as target_id, for pool
- exactmatch_factors
Logical: If TRUE, selected cases will have to exactly match factors available in target
- exactmatch_cols
Character: Names of columns that should be matched exactly
- distmatch_cols
Character: Names of columns that should be distance-matched
- norepeats
Logical: If TRUE, cases in pool can only be chosen once.
- ignore_na
Logical: If TRUE, ignore NA values during exact matching.
- verbosity
Integer: Verbosity level.
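
The following is a minimal base-R sketch of the idea behind the exact-match and distance-match arguments, not the implementation used by matchcases. The helper name match_sketch and its arguments exact_cols and dist_cols are hypothetical. It assumes no NA values and does not scale the distance columns, so a wide-ranging column such as Age dominates the distance; this page does not state whether matchcases scales its inputs.

```r
# Hypothetical helper for illustration only; not part of the package.
# For each target row: keep pool rows that agree on all exact columns,
# then pick the rows with the smallest sum of squared differences.
match_sketch <- function(target, pool, exact_cols, dist_cols,
                         n_matches = 1, norepeats = TRUE) {
  available <- rep(TRUE, nrow(pool))
  matches <- vector("list", nrow(target))
  for (i in seq_len(nrow(target))) {
    # Exact step: compare as character so factor levels are matched by label
    eligible <- available
    for (col in exact_cols) {
      eligible <- eligible &
        as.character(pool[[col]]) == as.character(target[[col]][i])
    }
    # Distance step: subtract the target row from every pool row, square, sum
    diffs <- sweep(as.matrix(pool[, dist_cols, drop = FALSE]), 2,
                   unlist(target[i, dist_cols, drop = FALSE]))
    ssd <- rowSums(diffs^2)
    ssd[!eligible] <- Inf
    picked <- order(ssd)[seq_len(min(n_matches, sum(eligible)))]
    # With norepeats, a chosen pool row is unavailable to later target rows
    if (norepeats) available[picked] <- FALSE
    matches[[i]] <- picked
  }
  matches
}

target <- data.frame(Sex = factor(c(1, 0)), Age = c(21, 39), Var = c(0.7, 0.9))
pool <- data.frame(
  Sex = factor(c(1, 0, 0, 1)),
  Age = c(20, 40, 36, 22),
  Var = c(0.6, 1.0, 0.8, 0.7)
)
res <- match_sketch(target, pool, exact_cols = "Sex", dist_cols = c("Age", "Var"))
```

Here res is a list with one element per target row, each holding the row indices of the chosen pool cases. The sketch is greedy: target rows are processed in order, so with norepeats = TRUE an early row can take a pool case that would have been a closer match for a later row.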
Examples
if (FALSE) { # \dontrun{
set.seed(2021)
# Target: 4 cases with two factors and three numeric columns
cases <- data.frame(
  PID = paste0("PID", seq(4)),
  Sex = factor(c(1, 1, 0, 0)),
  Handedness = factor(c(1, 1, 0, 1)),
  Age = c(21, 27, 39, 24),
  Var = c(.7, .8, .9, .6),
  Varx = rnorm(4)
)
# Pool: 50 candidate controls; Handedness is sampled with probabilities .1/.9
controls <- data.frame(
  CID = paste0("CID", seq(50)),
  Sex = factor(sample(c(0, 1), 50, TRUE)),
  Handedness = factor(sample(c(0, 1), 50, TRUE, c(.1, .9))),
  Age = sample(16:42, 50, TRUE),
  Var = rnorm(50),
  Vary = rnorm(50)
)
# Request 2 matches per case, using PID and CID as the ID columns
mc <- matchcases(cases, controls, 2, "PID", "CID")
} # }