Compute response nondifferentiation indicators

Compute response nondifferentiation indicators for responses to multi-item scales or matrix questions.

Usage

resp_nondifferentiation(x, min_valid_responses = 1, id = T)

Arguments

x: A data frame containing survey responses in wide format. For more information see section "Data requirements" below.
min_valid_responses: Numeric between 0 and 1 of length 1. Defines the share of valid responses a respondent must have to calculate response quality indicators. Default is 1.
id: Default is TRUE. If TRUE, a column named id with integer ids is created. If FALSE, no id column is created. Alternatively, a numeric or character vector of unique values identifying each respondent can be supplied. The vector must have the same length as the number of rows of x.
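As a rough illustration of how min_valid_responses can be read, the sketch below computes the share of non-missing responses per respondent in base R and compares it with a threshold. The names share_valid, threshold and keep are made up for this example, and the >= comparison is an assumption rather than something taken from the package source.

```r
# Toy data: three items, NA marks a missing response.
x <- data.frame(
  var_a = c(1, 4, 3, NA),
  var_b = c(2, 5, NA, NA),
  var_c = c(1, NA, NA, NA))

# Share of valid (non-missing) responses per respondent (row).
share_valid <- rowMeans(!is.na(x))

# Hypothetical threshold, playing the role of min_valid_responses.
threshold <- 0.5

# Assumption: respondents at or above the threshold get indicators,
# all others would get NA.
keep <- share_valid >= threshold
print(data.frame(share_valid = round(share_valid, 2), keep = keep))
```

With the default of 1, only respondents without any missing response would be kept under this reading.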

Value

Returns a data frame with response nondifferentiation indicators per respondent. Dimensions:

Rows: Equal to number of rows in x.
Columns: Four response nondifferentiation indicator columns + id column (if specified).

Details

Response nondifferentiation is the result of response behavior in which respondents deviate from an ideal response process. Optimal response behavior is termed optimizing, while deviations from optimal response behavior are termed satisficing (Krosnick, 1991). Optimizing describes a behavior in which respondents go through all steps of comprehension, retrieval, judgment, and response selection. When satisficing, respondents skip all or parts of the optimal response process. Satisficing can lead to non-response, "don't know" responses, random responding, or nondifferentiation. The latter is targeted by the function resp_nondifferentiation().

Nondifferentiation is characterized by respondents choosing similar or even the same response options regardless of the content of the question. Multiple indicators for response nondifferentiation have been developed. resp_nondifferentiation() calculates the following response nondifferentiation indicators described by Kim et al. (2019) per respondent:

Simple Nondifferentiation: Respondents are assigned 1 or 0 depending on whether all responses have the same value (1) or not (0).
Mean Root of Pairs Method: Mean of the square roots of the absolute differences between all pairs of responses in a multi-item scale or matrix question. It ranges from 0 (least straightlining) to 1 (most straightlining). The indicator is rescaled using the minimum and maximum of all values. This means that including or excluding responses or respondents changes the indicator's values.
Maximum Identical Rating Method: Proportion of the most commonly selected response option among all responses in a multi-item scale or matrix question. It ranges from 0 (least straightlining) to 1 (most straightlining).
Scale Point Variation Method: The probability of differentiation is defined as $1 - \sum_{i} p_{i}^{2}$, where $p_{i}$ is the proportion of the values rated at scale point $i$ and $i$ runs over the scale points of the rating scale. The measure becomes larger if respondents use more scale points in a multi-item scale or matrix question.
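The definitions above can be sketched in base R for a single respondent. This is a minimal illustration with made-up responses and variable names, not the package's implementation. The rescaling step of the Mean Root of Pairs Method and any handling of missing values are deliberately left out.

```r
# Responses of one hypothetical respondent to five items on a 1-5 scale.
responses <- c(2, 2, 3, 2, 5)
n_items <- length(responses)

# Simple nondifferentiation: 1 if all responses are identical, else 0.
simple_nd <- as.integer(length(unique(responses)) == 1)

# Maximum identical rating: share of the most frequently chosen option.
counts <- table(responses)
max_identical <- max(counts) / n_items

# Scale point variation: 1 minus the sum of squared proportions.
p <- as.numeric(counts) / n_items
scale_point_var <- 1 - sum(p^2)

# Mean root of pairs (not rescaled): mean of the square roots of the
# absolute differences between all pairs of responses.
pairs <- combn(responses, 2)            # 2 x 10 matrix, one pair per column
pair_diffs <- abs(pairs[1, ] - pairs[2, ])
mean_root_pairs_raw <- mean(sqrt(pair_diffs))

print(c(simple = simple_nd,
        max_identical = max_identical,
        scale_point_var = scale_point_var,
        mean_root_pairs_raw = mean_root_pairs_raw))
```

In this sketch a larger unscaled mean root of pairs means more differentiation, so a rescaled indicator that runs from 0 (least straightlining) to 1 (most straightlining) would have to reverse the direction as well.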

Note that Kim et al. (2019) average the response nondifferentiation indicators to obtain an aggregate measure of response nondifferentiation. To do so, call summary() on the result of resp_nondifferentiation().
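As a small sketch of what such an average looks like, the snippet below takes the mean of one indicator across respondents while ignoring missing values. The values are the max_identical_rating results from the second call in the "Examples" section, written as fractions on the assumption that the printed 0.67 and 0.33 stand for 2/3 and 1/3.

```r
# max_identical_rating per respondent; respondent 10 has no valid responses.
max_identical_rating <- c(2/3, 1/3, 2/3, 1/3, 2/3, 1/3, 1/3, 1/3, 1/3, NA)

# Aggregate measure: mean across respondents, dropping NA values.
aggregate_mir <- mean(max_identical_rating, na.rm = TRUE)
print(round(aggregate_mir, 4))
#> [1] 0.4444
```

This agrees with the Mean reported by summary() for max_identical_rating in the "Examples" section.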

Data requirements

resp_nondifferentiation() assumes that the input data frame is structured in the following way:

The data frame is in wide format, meaning each row represents one respondent, each column represents one variable.
The variables are in the same order as the questions respondents saw while taking the survey.
Reverse keyed variables are in their original form. No items were recoded.
All responses have integer values.
Questions have the same number of response options.
Missing values are set to NA.
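Some of these requirements can be checked before calling the function. The helper below is hypothetical and not part of the package. It only tests that every column is numeric and that all non-missing values are whole numbers. Question order, reverse keying, and the number of response options cannot be verified from the data alone.

```r
# Hypothetical helper: returns TRUE if a data frame holds only
# whole-number responses (with NA allowed for missing values).
check_resp_data <- function(x) {
  stopifnot(is.data.frame(x))
  # Every column must be numeric ...
  all_numeric <- all(vapply(x, is.numeric, logical(1)))
  if (!all_numeric) return(FALSE)
  # ... and every non-missing value must be a whole number.
  vals <- unlist(x, use.names = FALSE)
  vals <- vals[!is.na(vals)]
  all(vals == round(vals))
}

good <- data.frame(var_a = c(1, 4, NA), var_b = c(2, 5, 3))
bad  <- data.frame(var_a = c(1.5, 4, NA), var_b = c("a", "b", "c"))

print(check_resp_data(good))
print(check_resp_data(bad))
```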

References

Kim, Yujin, Jennifer Dykema, John Stevenson, Penny Black, and D. Paul Moberg. 2019. “Straightlining: Overview of Measurement, Comparison of Indicators, and Effects in Mail–Web Mixed-Mode Surveys.” Social Science Computer Review 37(2):214–33. doi: 10.1177/0894439317752406.

Krosnick, Jon A. 1991. “Response Strategies for Coping with the Cognitive Demands of Attitude Measures in Surveys.” Applied Cognitive Psychology 5(3):213–36. doi: 10.1002/acp.2350050305.

Author

Matthias Roth

Examples

# A small test data set with ten respondents
# and responses to three survey questions
# with response scales from 1 to 5.
testdata <- data.frame(
  var_a = c(1,4,3,5,3,2,3,1,3,NA),
  var_b = c(2,5,2,3,4,1,NA,2,NA,NA),
  var_c = c(1,2,3,NA,3,4,4,5,NA,NA))

# Calculate response nondifferentiation indicators
resp_nondifferentiation(x = testdata) |>
  round(2)
#> # A tibble: 10 × 5
#>       id simple_nondifferentiation mean_root_pairs max_identical_rating
#>    <dbl>                     <dbl>           <dbl>                <dbl>
#>  1     1                         0            1                    0.67
#>  2     2                         0            0.21                 0.33
#>  3     3                         0            1                    0.67
#>  4     4                        NA           NA                   NA   
#>  5     5                         0            1                    0.67
#>  6     6                         0            0.21                 0.33
#>  7     7                        NA           NA                   NA   
#>  8     8                         0            0                    0.33
#>  9     9                        NA           NA                   NA   
#> 10    10                        NA           NA                   NA   
#> # ℹ 1 more variable: scale_point_variation <dbl>

# Include respondents with NA values by lowering the
# required share of valid responses per respondent.

resp_nondifferentiation(
  x = testdata,
  min_valid_responses = 0.2) |>
  round(2)
#> # A tibble: 10 × 5
#>       id simple_nondifferentiation mean_root_pairs max_identical_rating
#>    <dbl>                     <dbl>           <dbl>                <dbl>
#>  1     1                         0            0.58                 0.67
#>  2     2                         0            0.12                 0.33
#>  3     3                         0            0.58                 0.67
#>  4     4                         0            0.7                  0.33
#>  5     5                         0            0.58                 0.67
#>  6     6                         0            0.12                 0.33
#>  7     7                         0            0.79                 0.33
#>  8     8                         0            0                    0.33
#>  9     9                         0            1                    0.33
#> 10    10                        NA           NA                   NA   
#> # ℹ 1 more variable: scale_point_variation <dbl>

# Obtain aggregate measures of response nondifferentiation
resp_nondifferentiation(
  x = testdata,
  min_valid_responses = 0.2) |>
  summary()
#>        id        simple_nondifferentiation mean_root_pairs 
#>  Min.   : 1.00   Min.   :0                 Min.   :0.0000  
#>  1st Qu.: 3.25   1st Qu.:0                 1st Qu.:0.1238  
#>  Median : 5.50   Median :0                 Median :0.5774  
#>  Mean   : 5.50   Mean   :0                 Mean   :0.4966  
#>  3rd Qu.: 7.75   3rd Qu.:0                 3rd Qu.:0.7011  
#>  Max.   :10.00   Max.   :0                 Max.   :1.0000  
#>                  NA's   :1                 NA's   :1       
#>  max_identical_rating scale_point_variation
#>  Min.   :0.3333       Min.   :0.4444       
#>  1st Qu.:0.3333       1st Qu.:0.4444       
#>  Median :0.3333       Median :0.6667       
#>  Mean   :0.4444       Mean   :0.6420       
#>  3rd Qu.:0.6667       3rd Qu.:0.7778       
#>  Max.   :0.6667       Max.   :0.8889       
#>  NA's   :1            NA's   :1