
Compute response pattern indicators for responses to multi-item scales or matrix questions.

Usage

resp_patterns(
  x,
  min_valid_responses = 1,
  defined_patterns,
  arbitrary_patterns,
  min_repetitions = 2
)

Arguments

x

A data frame containing survey responses in wide format. For more information see section "Data requirements" below.

min_valid_responses

Numeric value between 0 and 1. Defines the share of valid (non-missing) responses a respondent must have for response pattern indicators to be calculated. Default is 1.

defined_patterns

A character vector with patterns to search for. Will not be computed if not specified or if an empty vector is supplied.

arbitrary_patterns

A vector of integer values or a list containing vectors of integer values. The values determine the pattern that should be searched for. Will not be computed if not specified or if 0 is supplied.

Value

Returns a data frame with response quality indicators per respondent. Dimensions:

  • Rows: Equal to number of rows in x.

  • Columns:

Details

The following response pattern indicators are calculated per respondent:

  • n_transitions: Number of times two consecutive response options differ.

  • mean_string_length: Mean length of strings of identical answers.

  • longest_string_length: Longest length of string of identical answers.

  • (optional) defined_pattern: A list column that contains one named vector per respondent. The names of the vector are repeating patterns found in the responses of a respondent. The values of the vector are how often the pattern specified in the argument "defined_patterns" occurs. See section "Defined patterns" for more information.

  • (optional) arbitrary_patterns: A list column that contains one named vector per respondent. The names of the vector are repeating patterns found in the responses of a respondent. The values of the vector are how often the pattern occurred. See "Arbitrary patterns" for more information.
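The three run-based indicators above can be illustrated with base R's rle(), which collapses a vector into runs of identical values. This is a minimal sketch for a single respondent, not necessarily how resp_patterns() computes the indicators internally; the vector `responses` is made up for illustration.

```r
# Hypothetical answers of one respondent to six items
responses <- c(3, 3, 4, 4, 4, 2)

# Run-length encoding: values 3, 4, 2 with run lengths 2, 3, 1
runs <- rle(responses)

# A transition happens between each pair of neighbouring runs
n_transitions <- length(runs$lengths) - 1

# Mean and maximum length of runs of identical answers
mean_string_length <- mean(runs$lengths)
longest_string_length <- max(runs$lengths)

print(c(
  n_transitions = n_transitions,
  mean_string_length = mean_string_length,
  longest_string_length = longest_string_length
))
```

A respondent who gives the same answer to every item would have zero transitions and a longest string equal to the number of items.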

Defined and arbitrary pattern indicators:

Responses of an individual respondent can follow patterns, such as zig-zagging across the response scale over multiple items. There may be a priori knowledge about which response patterns could occur and might indicate low-quality responding. In this case, the defined_patterns argument can be used to specify one or more patterns whose presence will be checked for each respondent. If no a priori knowledge exists, the arbitrary_patterns argument can be used to check for all repeating patterns of a specified length.

Defined patterns:

Defined patterns are specified by providing one or more patterns in a character vector. A few examples: resp_patterns(x, defined_patterns = "123") checks how often the response pattern "123" occurs in the responses of a single respondent. resp_patterns(x, defined_patterns = c("123","321")) checks how often each of the two patterns "123" and "321" occurs in the responses of a single respondent. An arbitrary number of patterns can be supplied.
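To make the idea of counting a defined pattern concrete, the following base R sketch counts how often a pattern string occurs in one respondent's answers. It is an illustration only: the helper `count_pattern()` and the vector `responses` are hypothetical, single-digit response values are assumed so that pasting them together is unambiguous, and gregexpr() counts non-overlapping matches, which may differ from what resp_patterns() does.

```r
# Hypothetical helper (not part of the package): counts non-overlapping
# occurrences of `pattern` in one respondent's answers.
count_pattern <- function(responses, pattern) {
  # Collapse the answers into one string, e.g. c(1, 2, 3) becomes "123"
  response_string <- paste(responses, collapse = "")
  matches <- gregexpr(pattern, response_string, fixed = TRUE)[[1]]
  # gregexpr() returns -1 when there is no match
  sum(matches > 0)
}

# Made-up answers of one respondent to nine items
responses <- c(1, 2, 3, 1, 2, 3, 3, 2, 1)

# One count per pattern, named by the pattern
counts <- vapply(
  c("123", "321", "555"),
  function(p) count_pattern(responses, p),
  integer(1)
)
print(counts)
```

The result is a named vector with one count per pattern, similar in shape to the per-respondent vectors described for the defined pattern list column.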

Arbitrary patterns:

Checks for arbitrary patterns are defined by providing one or more integer values in a numeric vector. Each integer must be greater than or equal to two. A few examples: resp_patterns(x, arbitrary_patterns = 2) checks for sequences of responses of length two that repeat at least two times. resp_patterns(x, arbitrary_patterns = c(2,3,4,5)) checks for sequences of responses of length two, three, four and five that repeat at least two times.
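The search for arbitrary patterns can be sketched in base R by sliding a window of a given length over one respondent's answers and keeping the sequences that occur often enough. This is an assumption-laden illustration rather than the package's implementation: the names `pattern_length`, `min_reps` and `responses` are hypothetical, and overlapping windows are counted here, which resp_patterns() may handle differently.

```r
# Made-up answers of one respondent to seven items
responses <- c(1, 2, 1, 2, 3, 1, 2)
pattern_length <- 2  # length of the sequences to look for
min_reps <- 2        # keep sequences that occur at least this often

# Start positions of all windows of the requested length
n_windows <- max(0, length(responses) - pattern_length + 1)
starts <- seq_len(n_windows)

# Turn every window into a string such as "12"
windows <- vapply(
  starts,
  function(i) paste(responses[i:(i + pattern_length - 1)], collapse = ""),
  character(1)
)

# Count how often each distinct window occurs (named integer vector)
counts <- lengths(split(windows, windows))

# Keep only the sequences that repeat often enough
repeated <- counts[counts >= min_reps]
print(repeated)
```

Here only the sequence "12" is kept, because it is the only window of length two that appears more than once.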

Data requirements:

resp_patterns() assumes that the input data frame is structured in the following way:

  • The data frame is in wide format, meaning each row represents one respondent, each column represents one variable.

  • The variables are in the same order as the questions respondents saw while taking the survey.

  • Reverse keyed variables are in their original form. No items were recoded.

  • All responses have integer values.

  • Questions have the same number of response options.

  • Missing values are set to NA.
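Some of these requirements can be checked before calling resp_patterns(). The base R sketch below tests whether all non-missing responses are whole numbers and computes the share of valid responses per respondent, which is the quantity that min_valid_responses is described as thresholding. The data frame `survey` is made up, and comparing with `>=` is an assumption about how the threshold is applied.

```r
# Made-up wide-format data: one row per respondent, one column per item
survey <- data.frame(
  var_a = c(1, 4, 3, NA),
  var_b = c(2, 5, NA, NA),
  var_c = c(1, 2, 3, NA)
)

# All non-missing responses should be whole numbers
values <- unlist(survey)
all_integer <- all(values == round(values), na.rm = TRUE)

# Share of non-missing responses per respondent
valid_share <- rowMeans(!is.na(survey))

# Respondents that would meet a threshold of 1 (assumed comparison: >=)
meets_threshold <- valid_share >= 1

print(all_integer)
print(valid_share)
print(meets_threshold)
```

With a threshold of 1, only the first two respondents have complete data; lowering the threshold would also admit the third respondent, who answered two of three items.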

References

Curran, P. G. (2016). Methods for the detection of carelessly invalid responses in survey data. Journal of Experimental Social Psychology, 66, 4–19. https://doi.org/10.1016/j.jesp.2015.07.006

See also

resp_styles() for calculating response style indicators. resp_distributions() for calculating response distribution indicators. resp_nondifferentiation() for calculating response nondifferentiation indicators.

Author

Matthias Roth, Thomas Knopf

Examples

# A small test data set with ten respondents
# and responses to three survey questions
# with response scales from 1 to 5.
testdata <- data.frame(
  var_a = c(1,4,3,5,3,2,3,1,3,NA),
  var_b = c(2,5,2,3,4,1,NA,2,NA,NA),
  var_c = c(1,2,3,NA,3,4,4,5,NA,NA))

# Calculate response pattern indicators
resp_patterns(x = testdata) |>
    round(2)
#> # A tibble: 10 × 3
#>    n_transitions mean_string_length longest_string_length
#>            <dbl>              <dbl>                 <dbl>
#>  1             2                  1                     1
#>  2             2                  1                     1
#>  3             2                  1                     1
#>  4            NA                 NA                    NA
#>  5             2                  1                     1
#>  6             2                  1                     1
#>  7            NA                 NA                    NA
#>  8             2                  1                     1
#>  9            NA                 NA                    NA
#> 10            NA                 NA                    NA

# Include respondents with NA values by decreasing the
# necessary number of valid responses per respondent.

resp_patterns(
      x = testdata,
      min_valid_responses = 0.2) |>
   round(2)
#> # A tibble: 10 × 3
#>    n_transitions mean_string_length longest_string_length
#>            <dbl>              <dbl>                 <dbl>
#>  1             2                  1                     1
#>  2             2                  1                     1
#>  3             2                  1                     1
#>  4             2                  1                     1
#>  5             2                  1                     1
#>  6             2                  1                     1
#>  7             2                  1                     1
#>  8             2                  1                     1
#>  9             2                  1                     1
#> 10            NA                 NA                    NA