This function extracts the base names from the variable names in a dataset and, optionally, the names of demo variables (those containing 'dem' or 'demo').

Usage

get_base_names(
  dataset,
  underscore_count = NULL,
  keep_qualtrics_vars = FALSE,
  other_vars_removal = NULL,
  use_numbers = TRUE,
  exclude_demo = TRUE,
  exclude_check = TRUE,
  keyword_exclude = NULL,
  extract_demo = FALSE
)

Arguments

dataset

The dataset from which variable names are extracted.

underscore_count

The number of underscores to consider when extracting base names (default is NULL).

keep_qualtrics_vars

Logical value indicating whether to keep Qualtrics variables (default is FALSE).

other_vars_removal

A character vector specifying additional variables to exclude from the extraction process (default is NULL).

use_numbers

Logical value indicating whether to append numbers to base names (default is TRUE).

exclude_demo

Logical value indicating whether to exclude variables containing 'dem' or 'demo' (default is TRUE).

exclude_check

Logical value indicating whether to exclude variables containing 'check' (default is TRUE).

keyword_exclude

A string giving a keyword; variables whose names contain it are excluded (default is NULL).

extract_demo

Logical value indicating whether to also return the names of variables that contain 'dem' or 'demo' (default is FALSE).

Value

A character vector containing the unique base names extracted from the dataset variable names, or a list containing the base names and demo variable names if extract_demo is TRUE.
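The two return shapes can be pictured with a small base R sketch. This is an illustration under assumptions, not the function's implementation: the helper `sketch_split_demo` and the list element names `base_names` and `demo_vars` are hypothetical, and the actual element names returned by get_base_names may differ.

```r
# Hypothetical illustration of the two return shapes; element names are assumed.
sketch_split_demo <- function(var_names, extract_demo = FALSE) {
  # "dem" also matches "demo", so one pattern covers both keywords
  is_demo <- grepl("dem", var_names, ignore.case = TRUE)
  # Strip the final underscore and what follows it, then de-duplicate
  base <- unique(sub("_[^_]*$", "", var_names[!is_demo]))
  if (extract_demo) {
    list(base_names = base, demo_vars = var_names[is_demo])
  } else {
    base
  }
}

vars <- c("Trust_1", "Trust_2", "dem_age", "demo_gender")
res_vec  <- sketch_split_demo(vars)                       # character vector
res_list <- sketch_split_demo(vars, extract_demo = TRUE)  # list of two vectors
```

Code that consumes the result can check `is.list()` to handle both cases.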

Details

The function takes a dataset and extracts the base names from its variable names. The underscore_count argument sets the number of underscores to consider, which accommodates variable names that contain more than one underscore. By default, Qualtrics variables commonly found in survey data are removed; set keep_qualtrics_vars to TRUE to keep them. Additional variables can be excluded by name through other_vars_removal. If use_numbers is TRUE, numbers are appended to the base names. Variables containing 'dem' or 'demo' are excluded when exclude_demo is TRUE and are returned separately when extract_demo is TRUE. Variables containing 'check' (exclude_check) or a user-defined keyword (keyword_exclude) can also be excluded.
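As a rough illustration of the idea described above, and not the function's actual implementation, stripping a trailing suffix and excluding keywords can be approximated in base R. The helper `sketch_base_names` and its `exclude_pattern` argument are hypothetical, and the sketch assumes the base name is everything before the last underscore.

```r
# Hypothetical helper: an approximation of the described behaviour,
# not the package's implementation.
sketch_base_names <- function(var_names, exclude_pattern = NULL) {
  # Optionally drop names containing an excluded pattern (e.g. "dem" or "check")
  if (!is.null(exclude_pattern)) {
    var_names <- var_names[!grepl(exclude_pattern, var_names, ignore.case = TRUE)]
  }
  # Strip the final underscore and everything after it, e.g. "Age_1" -> "Age"
  base <- sub("_[^_]*$", "", var_names)
  unique(base)
}

vars <- c("Age_1", "Income_1", "Age_2", "Income_2", "demo_gender", "attn_check_1")
sketch_result <- sketch_base_names(vars, exclude_pattern = "dem|check")
sketch_result
#> [1] "Age"    "Income"
```

Because the pattern is matched anywhere in the name, a short keyword such as 'dem' can also match unrelated variables (for example, a name containing "academic"), which is worth checking before relying on keyword-based exclusion.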

Examples

dataset <- data.frame(
  Age_1    = c(25, 30, 35),
  Gender_1 = c("Male", "Female", "Male"),
  Income_1 = c(50000, 60000, 70000),
  Age_2    = c(40, 45, 50),
  Gender_2 = c("Female", "Male", "Female"),
  Income_2 = c(80000, 90000, 100000)
)

get_base_names(dataset, underscore_count = 1)
#> [1] "Age"    "Gender" "Income"