The impute_data function imputes missing values in the numeric variables of a dataset using the MissForest algorithm. Variables termed "scenario-based" are excluded from the imputation. These are variables that were presented only to some participants under certain conditions, so NA values in them are expected.

Usage

impute_data(
  data,
  scenario_based_vars = NULL,
  cores = 4,
  seed = 100,
  answers_only = FALSE
)

Arguments

data

The dataset to be imputed.

scenario_based_vars

A character or numeric vector representing variable names or indices that should be excluded from the imputation process. These are the scenario-based variables. Defaults to NULL.

cores

The number of CPU cores to be used for parallel processing. Default is 4.

seed

The seed for reproducibility. Default is 100.

answers_only

Logical. If TRUE, only variables whose names contain a number or the word 'demo'/'Demo' are included in the imputation process. Default is FALSE.
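
The way scenario_based_vars and answers_only narrow down the variables to impute can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's actual code: select_impute_vars is a hypothetical helper, the toy data frame is invented, and the pattern "[0-9]|[Dd]emo" is only one plausible reading of "a number or the word 'demo'/'Demo'".

```r
# Hypothetical helper (not part of the package): returns the names of the
# columns that would be passed on to the imputation step.
select_impute_vars <- function(data, scenario_based_vars = NULL,
                               answers_only = FALSE) {
  # Numeric indices are translated to names; character names are used as given.
  excluded <- if (is.numeric(scenario_based_vars)) {
    names(data)[scenario_based_vars]
  } else {
    scenario_based_vars
  }
  candidates <- setdiff(names(data), excluded)

  # Only numeric variables are imputed.
  is_num <- vapply(data[candidates], is.numeric, logical(1))
  candidates <- candidates[is_num]

  # Assumed pattern: a digit anywhere in the name, or "demo"/"Demo".
  if (answers_only) {
    candidates <- candidates[grepl("[0-9]|[Dd]emo", candidates)]
  }
  candidates
}

# Invented toy data for illustration only.
toy <- data.frame(
  Q1 = c(1, NA, 3),
  Demo_age = c(30, 41, NA),
  Duration = c(100, NA, 120),
  Scen_1 = c(NA, 2, NA),
  Comment = c("a", NA, "c")
)

vars <- select_impute_vars(toy, scenario_based_vars = "Scen_1",
                           answers_only = TRUE)
vars_by_index <- select_impute_vars(toy, scenario_based_vars = 4)
```

In this sketch, vars holds "Q1" and "Demo_age": Scen_1 is excluded explicitly, Comment is not numeric, and Duration matches neither part of the pattern. Passing the index 4 instead of the name excludes the same column.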

Value

The dataset with imputed values, maintaining the same column order as the original data.
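
One way the original column order could be preserved is to impute a subset of columns, bind the excluded columns back on, and then reindex by the original names. The sketch below assumes this approach. It uses column-mean replacement purely as a stand-in for MissForest so that it runs without extra packages, and all object names are hypothetical.

```r
# Invented toy data; "scen" plays the role of a scenario-based variable.
data <- data.frame(
  a = c(1, NA, 3),
  scen = c(NA, 5, NA),
  b = c(NA, 2, 4)
)
excluded <- "scen"
impute_cols <- setdiff(names(data), excluded)

# Stand-in for the MissForest step: replace NA with the column mean.
# This is NOT the algorithm the function uses; it only fills the gaps here.
imputed_part <- as.data.frame(lapply(data[impute_cols], function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}))

# Bind the untouched columns back on, then restore the original order.
result <- cbind(imputed_part, data[excluded])[, names(data)]
```

Here result has the columns a, scen, b in their original order, with the NA values in scen left as they were.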

Examples

if (FALSE) {
# Impute DataM, excluding the scenario-based variables
imputed_data <- impute_data(
  DataM,
  scenario_based_vars = c(
    "WFC.CHECKLIST_1:CopeClosure",
    "PROLIFIC_PID:correct_answer_total"
  )
)
}