
Find one or more cases from a pool data.frame that match cases in a target data.frame. Match exactly and/or by distance (sum of squared distances).
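As a rough illustration of what matching by "sum of squared distances" means, the sketch below scores three hypothetical pool rows against one target row in base R. The data and the names target_row, pool_toy, ssd and best are assumptions made for this example; this is not the function's internal code, and whether matchcases rescales columns before computing distances is not stated here.

```r
# Hypothetical toy data; not taken from matchcases internals.
target_row <- c(Age = 21, Var = 0.7)
pool_toy <- data.frame(
  Age = c(20, 25, 40),
  Var = c(0.6, 0.9, 0.7)
)

# Subtract the target values from each pool row (column-wise),
# then square the differences and sum them per row.
diffs <- sweep(as.matrix(pool_toy), 2, target_row, "-")
ssd <- rowSums(diffs^2)

# Index of the pool row with the smallest sum of squared distances.
best <- which.min(ssd)
```

On raw values like these, a column with a larger numeric range (Age) contributes far more to the distance than one with a small range (Var).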

Usage

matchcases(
  target,
  pool,
  n_matches = 1,
  target_id = NULL,
  pool_id = NULL,
  exactmatch_factors = TRUE,
  exactmatch_cols = NULL,
  distmatch_cols = NULL,
  norepeats = TRUE,
  ignore_na = FALSE,
  verbosity = 1L
)

Arguments

target

data.frame: Cases for which matches are sought

pool

data.frame: Candidate cases from which matches are drawn

n_matches

Integer: Number of matches to return

target_id

Character: Column name in target that holds unique case IDs. Default = NULL, in which case integer case numbers will be used

pool_id

Character: Column name in pool that holds unique case IDs. Behaves the same way as target_id

exactmatch_factors

Logical: If TRUE, selected cases must exactly match the factor columns present in target

exactmatch_cols

Character: Names of columns that should be matched exactly

distmatch_cols

Character: Names of columns that should be distance-matched

norepeats

Logical: If TRUE, each case in pool can be chosen at most once.

ignore_na

Logical: If TRUE, ignore NA values during exact matching.

verbosity

Integer: Verbosity level.
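To show how exact matching, distance matching and the no-repeats rule described above might combine, here is a minimal base R sketch. It is an assumption-laden illustration, not the implementation of matchcases: the toy data, the greedy row-by-row order, and the names target_toy, pool_toy, used and matches are all hypothetical.

```r
# Hypothetical toy data for illustration only.
target_toy <- data.frame(Sex = factor(c(1, 0)), Age = c(21, 39))
pool_toy <- data.frame(
  Sex = factor(c(1, 1, 0, 0)),
  Age = c(22, 30, 38, 25)
)

used <- logical(nrow(pool_toy))        # tracks pool rows already chosen
matches <- integer(nrow(target_toy))   # chosen pool row for each target row

for (i in seq_len(nrow(target_toy))) {
  # Exact matching: keep unused pool rows with the same Sex value.
  same_sex <- as.character(pool_toy$Sex) == as.character(target_toy$Sex[i])
  candidates <- which(!used & same_sex)
  # Distance matching: squared difference in Age among the candidates.
  # (This toy loop assumes at least one candidate always remains.)
  d <- (pool_toy$Age[candidates] - target_toy$Age[i])^2
  matches[i] <- candidates[which.min(d)]
  used[matches[i]] <- TRUE             # no repeats: mark the row as taken
}
```

With repeats allowed, the used vector would simply be left unchanged, so the same pool row could be selected for more than one target case.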

Author

EDG

Examples

if (FALSE) { # \dontrun{
set.seed(2021)
cases <- data.frame(
  PID = paste0("PID", seq(4)),
  Sex = factor(c(1, 1, 0, 0)),
  Handedness = factor(c(1, 1, 0, 1)),
  Age = c(21, 27, 39, 24),
  Var = c(.7, .8, .9, .6),
  Varx = rnorm(4)
)
controls <- data.frame(
  CID = paste0("CID", seq(50)),
  Sex = factor(sample(c(0, 1), 50, TRUE)),
  Handedness = factor(sample(c(0, 1), 50, TRUE, c(.1, .9))),
  Age = sample(16:42, 50, TRUE),
  Var = rnorm(50),
  Vary = rnorm(50)
)

mc <- matchcases(cases, controls, 2, "PID", "CID")
} # }