Flag respondents based on response quality indicators

Flag respondents using one or more flagging expressions.

Usage

flag_resp(x, ...)

Arguments

x: A data frame containing response quality indicators. Each column should be one response quality indicator, and each row should hold the indicator values of one respondent.
...: One or more flagging expressions. See Details.

Value

A data frame containing one column per flagging strategy and the same number of rows as x. Each column contains one TRUE or FALSE flag per respondent. An additional id column is added as the first column if a column named id is present in x.

Details

flag_resp() works much like the popular dplyr::filter() function. However, instead of filtering data, flag_resp() returns a data frame of TRUE and FALSE values indicating which respondents are flagged.

As the first argument, you provide a data frame of response quality indicators, where each column represents one response quality indicator and each row represents one respondent. In the following arguments, you provide one or more logical expressions to flag respondents. For example:

flag_resp(x, ERS > 0.5) returns a data frame with one column named ERS > 0.5. Each row represents one respondent and shows whether the statement "the extreme response style indicator is larger than 0.5" is true (TRUE) or false (FALSE).
flag_resp(x, ERS > 0.5, ii_mean < 3) returns a data frame with two columns, each indicating for which respondents the corresponding flagging expression is true or false.
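
To make the logic of the two examples concrete, here is a minimal base R sketch of what evaluating flagging expressions against a data frame amounts to. It is not resquin's implementation; the data frame x and its values are hypothetical and only illustrate the shape of the result.

```r
# Hypothetical indicator values for four respondents (not produced by resquin).
x <- data.frame(
  id      = 1:4,
  ERS     = c(0.2, 0.7, 0.9, NA),
  ii_mean = c(2.5, 3.4, 2.8, 4.1)
)

# Evaluate each flagging expression against the columns of x.
# check.names = FALSE keeps the expressions as column names.
flags <- data.frame(
  id            = x$id,
  `ERS > 0.5`   = with(x, ERS > 0.5),
  `ii_mean < 3` = with(x, ii_mean < 3),
  check.names   = FALSE
)
flags
```

In this sketch, a respondent with a missing indicator value gets NA rather than TRUE or FALSE, because a comparison with NA in R is itself NA.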

Note that flag_resp() is not restricted to indicators computed by functions from the resquin package. A flagging expression can refer to any numeric column in the data frame x. This makes it possible to compare flagging strategies based on response quality indicators from different packages and functions.
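
As an illustration of using an indicator that does not come from resquin, the sketch below computes a hand-made indicator in base R and passes it to flag_resp(). The responses data, the prop_missing column, and the 0.5 cutoff are hypothetical and chosen only for this example.

```r
# Hypothetical item responses for three respondents (NA = item skipped).
responses <- data.frame(
  q1 = c(1, NA, 3),
  q2 = c(2, NA, 3),
  q3 = c(NA, NA, 3),
  q4 = c(4, 5, 3)
)

# Hand-made indicator: share of skipped items per respondent.
indicators <- data.frame(
  id           = 1:3,
  prop_missing = unname(rowMeans(is.na(responses)))
)

# Flag respondents who skipped more than half of the items.
# The call is guarded so the sketch also runs where resquin is not installed.
if (requireNamespace("resquin", quietly = TRUE)) {
  print(resquin::flag_resp(indicators, prop_missing > 0.5))
}
```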

Use the summary() function on the results to compare flagging strategies.
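
The comparison that summary() reports can be approximated by hand. The following base R sketch counts flagged respondents per strategy and builds a pairwise agreement table. It is not the package's summary method; the flags are hypothetical, and treating NA as "not flagged" is an assumption of this sketch.

```r
# Hypothetical flags for five respondents under two strategies.
flags <- data.frame(
  `ii_mean > 3` = c(TRUE, TRUE, FALSE, NA, TRUE),
  `ii_sd < 2`   = c(TRUE, FALSE, FALSE, NA, TRUE),
  check.names   = FALSE
)

m <- 1 * as.matrix(flags)  # convert TRUE/FALSE to 1/0, keeping column names
m[is.na(m)] <- 0           # assumption: missing flags count as not flagged

n_flagged <- colSums(m)    # number of respondents flagged per strategy
agreement <- crossprod(m)  # diagonal: per-strategy counts;
                           # off-diagonal: respondents flagged by both
n_flagged
agreement
```

Read this way, each off-diagonal cell gives the number of respondents that two strategies flag jointly, while the diagonal repeats the per-strategy counts.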

For more details see the vignette: vignette("flagging_respondents", package = "resquin")

Examples

library(resquin)

res_dist_indicators <- resp_distributions(nep) # Create indicator data frame

flagged_respondents <- flag_resp(res_dist_indicators,
                                 ii_mean > 3, # Flagging strategy 1
                                 ii_sd < 2, # Flagging strategy 2
                                 ii_mean > 3 & ii_sd > 2) # Flagging strategy 3
flagged_respondents # A data frame with an id column and one column per flagging strategy
#> # A data frame: 1,222 × 4
#>       id `ii_mean > 3` `ii_sd < 2` `ii_mean > 3 & ii_sd > 2`
#>    <int> <lgl>         <lgl>       <lgl>                    
#>  1     1 TRUE          TRUE        FALSE                    
#>  2     2 TRUE          TRUE        FALSE                    
#>  3     3 FALSE         TRUE        FALSE                    
#>  4     4 TRUE          TRUE        FALSE                    
#>  5     5 TRUE          TRUE        FALSE                    
#>  6     6 NA            NA          NA                       
#>  7     7 TRUE          TRUE        FALSE                    
#>  8     8 TRUE          TRUE        FALSE                    
#>  9     9 TRUE          TRUE        FALSE                    
#> 10    10 TRUE          TRUE        FALSE                    
#> # ℹ 1,212 more rows
summary(flagged_respondents) # quickly compare flagging strategies
#> 
#> ── Number of respondents flagged (Total N: 1222) 
#>             ii_mean > 3               ii_sd < 2 ii_mean > 3 & ii_sd > 2 
#>                     832                     934                       2 
#> 
#> ── Agreement between flagging strategies 
#> 
#> 
#> Flag                      ii_mean > 3 & ii_sd > 2   ii_sd < 2   ii_mean > 3 
#> ------------------------  ------------------------  ----------  ------------
#> ii_mean > 3 & ii_sd > 2   2                                                 
#> ii_sd < 2                 0                         934                     
#> ii_mean > 3               2                         830         832