The clean_speed function identifies and removes speed outliers from a
dataset based on the Median Absolute Deviation (MAD). It can identify both
"fast" and "slow" outliers, depending on the parameter settings. It prints
the number of outliers identified, the initial number of observations, the
final number of observations after removing outliers, and the number of
observations removed.
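To make the MAD rule concrete, here is a minimal sketch in base R. It is not the implementation of clean_speed: the helper name flag_mad_outliers, its flag_low and flag_high arguments, and the cutoff of 3 scaled MADs are assumptions chosen for illustration. This page does not state the threshold clean_speed uses or which side counts as "fast" or "slow", so the sketch flags low and high values without naming them.

```r
# Hypothetical helper, NOT the package's implementation of clean_speed().
# Flags values lying more than `threshold` scaled MADs from the median.
# The cutoff of 3 is an assumed convention, not taken from this page.
flag_mad_outliers <- function(x, threshold = 3,
                              flag_low = TRUE, flag_high = TRUE) {
  centre <- stats::median(x, na.rm = TRUE)
  # stats::mad() multiplies the raw MAD by 1.4826 by default
  spread <- stats::mad(x, na.rm = TRUE)
  too_low  <- flag_low  & (x < centre - threshold * spread)
  too_high <- flag_high & (x > centre + threshold * spread)
  too_low | too_high
}

duration <- c(1, 2, 3, 4, 5, 6, 1000, 2000, 50000, -50, -200)

# Flag both tails, then report counts in the spirit of clean_speed()'s output
flagged <- flag_mad_outliers(duration)
kept <- duration[!flagged]
message("Outliers identified: ", sum(flagged))
message("Observations before: ", length(duration), ", after: ", length(kept))
```

Because the median and the MAD are themselves resistant to extreme values, the very large and negative durations do not inflate the cutoffs, which is the usual reason to prefer this rule over one based on the mean and standard deviation.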
Arguments
- data
A dataset from which speed outliers are identified and removed.
- duration_var
The name of the variable in the dataset used to identify outliers, given as a string (for example "duration").
- remove_slow
A logical value to decide if slow outliers should be removed (default is FALSE).
Examples
if (FALSE) {
  # Generate some data
  df <- data.frame(
    duration = c(1, 2, 3, 4, 5, 6, 1000, 2000, 50000, -50, -200),
    other_var = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  )
  # Remove speed outliers from the data
  clean_data <- clean_speed(df, duration_var = "duration", remove_slow = TRUE)
}