Create "Multimodal" Synthetic Data using squares and arctangents
Usage
synth_multimodal(
  n_cases = 10000,
  init_fn = "runifmat",
  init_fn_params = list(min = -10, max = 10),
  n_groups = 4,
  n_feat_per_group = round(seq(10, 300, length.out = n_groups)),
  contrib_p = 0.33,
  linear_p = 0.66,
  square_p = 0.1,
  atan_p = 0.1,
  pair_multiply_p = 0.05,
  pair_square_p = 0.05,
  pair_atan_p = 0.05,
  verbosity = 1L,
  seed = NULL,
  filename = NULL
)
Arguments
- n_cases
Integer: Number of cases to create.
- init_fn
Character: "runifmat" or "rnormmat". Use the respective functions to generate features as random uniform and random normal variables, respectively.
- init_fn_params
Named list with arguments "min", "max" for "runifmat" and "mean", "sd" for "rnormmat".
- n_groups
Integer: Number of feature groups / modalities to create.
- n_feat_per_group
Integer vector of length n_groups: Number of features to create per group.
- contrib_p
Float (0, 1]: Ratio of features contributing to outcome per group. With the default of 0.33, a third of the features in each group will be used to produce the outcome y.
- linear_p
Float [0, 1]: Ratio of contributing features to be included linearly. With the defaults, 0.66 of the 0.33 contributing features in each group will be included linearly.
- square_p
Float [0, 1]: Ratio of contributing features to be squared. With the defaults, 0.1 of the 0.33 contributing features in each group will be squared.
- atan_p
Float [0, 1]: Ratio of contributing features whose atan will be used. These will be selected from the features that were NOT sampled for squaring, i.e. with the defaults, 0.1 of the 0.33 contributing features in each group will be transformed using atan, excluding any features already picked to be squared (see square_p).
- pair_multiply_p
Float [0, 1]: Ratio of features which will be divided into pairs and multiplied.
- pair_square_p
Float [0, 1]: Ratio of features which will be divided into pairs, multiplied and squared.
- pair_atan_p
Float [0, 1]: Ratio of features which will be divided into pairs, multiplied and transformed using atan.
- verbosity
Integer: Verbosity level.
- seed
Integer: If set, pass to set.seed for reproducibility.
- filename
Character: Path to file to save output.
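The ratio arguments nest: contrib_p is applied to each group's feature count, and linear_p, square_p and atan_p are then applied to the contributing features. The base R sketch below works through that arithmetic for the default arguments. It assumes counts are obtained with round(); the function itself may round differently, so treat the numbers as approximate.

```r
# Default group sizes, as in the Usage block
n_groups <- 4
n_feat_per_group <- round(seq(10, 300, length.out = n_groups))

# Default ratios
contrib_p <- 0.33
linear_p <- 0.66
square_p <- 0.1
atan_p <- 0.1

# Assumption: counts are rounded; synth_multimodal() may use another rule
n_contrib <- round(contrib_p * n_feat_per_group)

counts <- data.frame(
  group = seq_len(n_groups),
  n_feat = n_feat_per_group,
  n_contrib = n_contrib,
  n_linear = round(linear_p * n_contrib),
  n_square = round(square_p * n_contrib),
  n_atan = round(atan_p * n_contrib)
)
print(counts)
```

Small groups can end up with zero squared or atan features, which is worth checking before relying on a particular transform being present in every group.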
Details
There are no checks yet for compatibility among inputs, and certain combinations may not work.
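Because the function does not validate its inputs, callers may want to run a few sanity checks first. The helper below is a hypothetical sketch, not part of the package: check_synth_inputs and its other_p argument are names invented here, and the checks only cover the ranges and lengths stated in the Arguments section.

```r
# Hypothetical helper (not part of the package): basic checks on inputs
check_synth_inputs <- function(n_groups,
                               n_feat_per_group,
                               contrib_p,
                               other_p = numeric(0)) {
  # n_feat_per_group is documented as a vector of length n_groups
  if (length(n_feat_per_group) != n_groups) {
    stop("n_feat_per_group must have length n_groups")
  }
  # contrib_p is documented as a float in (0, 1]
  if (contrib_p <= 0 || contrib_p > 1) {
    stop("contrib_p must be in (0, 1]")
  }
  # All remaining ratios are documented as floats in [0, 1]
  if (any(other_p < 0 | other_p > 1)) {
    stop("all other ratios must be in [0, 1]")
  }
  invisible(TRUE)
}

# Usage with the values from the Examples section
ok <- check_synth_inputs(
  n_groups = 5,
  n_feat_per_group = c(20, 50, 100, 200, 300),
  contrib_p = 0.33,
  other_p = c(0.66, 0.1, 0.1, 0.1, 0.1, 0.1)
)
```

Passing these checks does not guarantee that a combination works; it only rules out the mismatches that the argument descriptions make explicit.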
Examples
if (FALSE) { # \dontrun{
  xmm <- synth_multimodal(
    n_cases = 10000,
    init_fn = "runifmat",
    init_fn_params = list(min = -10, max = 10),
    n_groups = 5,
    n_feat_per_group = c(20, 50, 100, 200, 300),
    contrib_p = .33,
    linear_p = .66,
    square_p = .1,
    atan_p = .1,
    pair_multiply_p = .1,
    pair_square_p = .1,
    pair_atan_p = .1,
    seed = 2019
  )
} # }
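To make the idea behind the arguments more concrete, here is a minimal base R sketch of a single feature group in which the outcome combines linear, squared and atan terms. It is an illustration under stated assumptions, not the package's implementation: unit weights, round() for counts, and the way linear features are drawn are all choices made here, and the pairwise terms are left out.

```r
set.seed(2019)

n_cases <- 100
n_feat <- 20

# Uniform features on [-10, 10], mirroring init_fn = "runifmat"
x <- matrix(runif(n_cases * n_feat, min = -10, max = 10), nrow = n_cases)

# Contributing features: contrib_p = 0.33 of the group (assumed rounding)
contrib <- sample(n_feat, round(0.33 * n_feat))

# Assumption: linear and squared features are each drawn from all
# contributing features
idx_linear <- sample(contrib, round(0.66 * length(contrib)))
idx_square <- sample(contrib, round(0.1 * length(contrib)))

# atan features come only from contributing features NOT already squared
idx_atan <- sample(setdiff(contrib, idx_square), round(0.1 * length(contrib)))

# Assumption: terms are summed with unit weights
y <- rowSums(x[, idx_linear, drop = FALSE]) +
  rowSums(x[, idx_square, drop = FALSE]^2) +
  rowSums(atan(x[, idx_atan, drop = FALSE]))
```

Using drop = FALSE keeps single-column selections as matrices so that rowSums() still works when only one feature is picked for a transform.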