Compute response distribution indicators for responses to multi-item scales or matrix questions.
Value
Returns a data frame with response quality indicators per respondent. Dimensions:
Rows: Equal to number of rows in x.
Columns: Six, one for each response distribution indicator.
Details
The following response distribution indicators are calculated per respondent:
n_na: number of intra-individual missing responses
prop_na: proportion of intra-individual missing responses
ii_mean: intra-individual mean
ii_median: intra-individual median
ii_sd: intra-individual standard deviation
mahal: Mahalanobis distance per respondent
Intra-individual response variability (ii_sd) has been proposed to measure insufficient effort responding (Dunn et al., 2018) and to distinguish between random and conscientious responding (Marjanovic et al., 2015).
Intra-individual location indicators (ii_mean, ii_median) can be used to assess the average location of responses on a set of questions.
Mahalanobis distance is an outlier detection indicator. It represents the distance of a participant's responses from the center of a multivariate normal distribution defined by the data of all respondents.
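To make the definitions above concrete, the indicators can be sketched by hand in base R. This is an illustration only, not the package's internal code: it uses the small data set from the Examples section, and it assumes that the Mahalanobis distance is computed on complete cases and reported as the square root of what stats::mahalanobis() returns (which is the squared distance). The handling of respondents with too few valid responses is not replicated here.

```r
# Same small data set as in the Examples section.
testdata <- data.frame(
  var_a = c(1, 4, 3, 5, 3, 2, 3, 1, 3, NA),
  var_b = c(2, 5, 2, 3, 4, 1, NA, 2, NA, NA),
  var_c = c(1, 2, 3, NA, 3, 4, 4, 5, NA, NA)
)

# Row-wise (intra-individual) indicators, ignoring missing values.
n_na      <- rowSums(is.na(testdata))
prop_na   <- n_na / ncol(testdata)
ii_mean   <- apply(testdata, 1, mean, na.rm = TRUE)
ii_median <- apply(testdata, 1, median, na.rm = TRUE)
ii_sd     <- apply(testdata, 1, sd, na.rm = TRUE)

indicators <- data.frame(n_na, prop_na, ii_mean, ii_median, ii_sd)

# Mahalanobis distance on complete cases only (assumption).
complete <- as.matrix(testdata[stats::complete.cases(testdata), ])
d2 <- stats::mahalanobis(
  complete,
  center = colMeans(complete),
  cov    = stats::cov(complete)
)
mahal <- sqrt(d2)  # stats::mahalanobis() returns squared distances

round(indicators, 2)
round(mahal, 2)
```

Under these assumptions, the first respondent gets a distance of about 2.04, which agrees with the first example in the Examples section.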
Data requirements
resp_distributions() assumes that data comes from multi-item scales or matrix questions, which have the same number and labeling of response options for many questions.
The input data frame must be structured in the following way:
The data frame is in wide format, meaning each row represents one respondent, each column represents one variable.
All responses have integer values.
Missing values are set to NA.
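The requirements above can be checked before calling the function. The following is a minimal sketch: check_resp_data() is a hypothetical helper that is not part of the package, and the missing-value code -99 is only an assumed example of how missing responses might be coded in raw survey data.

```r
# Hypothetical helper (not part of the package): checks that the input is a
# data frame whose columns are numeric and contain only whole numbers or NA.
check_resp_data <- function(x) {
  stopifnot(is.data.frame(x))
  is_num <- vapply(x, is.numeric, logical(1))
  if (!all(is_num)) {
    stop("Non-numeric columns: ", paste(names(x)[!is_num], collapse = ", "))
  }
  values <- unlist(x, use.names = FALSE)
  values <- values[!is.na(values)]
  if (any(values != round(values))) {
    stop("All non-missing responses must be whole numbers.")
  }
  invisible(TRUE)
}

# Wide format: one row per respondent, one column per question.
# -99 is an assumed missing-value code that is recoded to NA first.
raw <- data.frame(
  var_a = c(1, 4, -99),
  var_b = c(2, 5, 3)
)
raw[raw == -99] <- NA

check_resp_data(raw)
```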
Reverse coding of variables
The interpretation of the indicators depends on whether the response data of negatively worded questions were reversed or not:
Do not reverse data of negatively worded questions if you want to assess average response patterns (Dunn et al., 2018).
Reverse data of negatively worded questions if you want to assess whether responses are distributed randomly or not with respect to an assumed latent variable (Marjanovic et al., 2015).
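If reverse coding is wanted, it can be done before the data are passed on. A minimal sketch, assuming integer responses on a scale from 1 to 5; reverse_code() and the column name var_neg are hypothetical and only serve as illustration:

```r
# Hypothetical helper: reverse-code responses on a scale that runs from
# scale_min to scale_max. Missing values stay NA.
reverse_code <- function(x, scale_min = 1, scale_max = 5) {
  (scale_min + scale_max) - x
}

items <- data.frame(
  var_a   = c(1, 4, 3),
  var_neg = c(5, 2, NA)  # assumed to be a negatively worded question
)

# On a 1 to 5 scale, 5 becomes 1, 2 becomes 4, and NA remains NA.
items$var_neg <- reverse_code(items$var_neg)
items
```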
Mahalanobis distance could not be calculated
Under certain circumstances, the Mahalanobis distance cannot be calculated. This can happen if there is high collinearity (strong correlation between variables) or if there are too many missing values. Although this can happen in survey research data, this message can also indicate that something in the data is "off" due to one of the reasons stated above. A manual inspection for low-quality responses can be a next step.
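One way to start such a manual inspection is to look at how many complete cases remain and whether their covariance matrix has full rank. The sketch below is a hypothetical diagnostic, not part of the package, and it assumes that the distance is based on the covariance matrix of complete cases:

```r
# Hypothetical diagnostic (not part of the package): counts complete cases
# and checks whether their covariance matrix has full rank.
mahal_diagnostics <- function(x) {
  complete <- as.matrix(x[stats::complete.cases(x), , drop = FALSE])
  n <- nrow(complete)
  p <- ncol(complete)
  # A covariance matrix needs at least two complete cases.
  cov_rank <- if (n > 1) qr(stats::cov(complete))$rank else 0L
  list(
    n_complete = n,
    n_items    = p,
    cov_rank   = cov_rank,
    invertible = cov_rank == p
  )
}

# var_b is an exact linear function of var_a, so the variables are
# perfectly collinear and the covariance matrix is rank deficient.
collinear <- data.frame(
  var_a = c(1, 4, 3, 5, 2),
  var_b = 6 - c(1, 4, 3, 5, 2),
  var_c = c(2, 2, 5, 1, 4)
)

mahal_diagnostics(collinear)
```

If invertible is FALSE, inspecting stats::cor() on the complete cases can help to find the variables involved.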
References
Dunn, Alexandra M., Eric D. Heggestad, Linda R. Shanock, and Nels Theilgard. 2018. “Intra-Individual Response Variability as an Indicator of Insufficient Effort Responding: Comparison to Other Indicators and Relationships with Individual Differences.” Journal of Business and Psychology 33(1):105–21. doi: 10.1007/s10869-016-9479-0.
Marjanovic, Zdravko, Ronald Holden, Ward Struthers, Robert Cribbie, and Esther Greenglass. 2015. “The Inter-Item Standard Deviation (ISD): An Index That Discriminates between Conscientious and Random Responders.” Personality and Individual Differences 84:79–83. doi: 10.1016/j.paid.2014.08.021.
See also
resp_styles() for calculating response style indicators.
Examples
# A small test data set with ten respondents
# and responses to three survey questions
# with response scales from 1 to 5.
testdata <- data.frame(
  var_a = c(1,4,3,5,3,2,3,1,3,NA),
  var_b = c(2,5,2,3,4,1,NA,2,NA,NA),
  var_c = c(1,2,3,NA,3,4,4,5,NA,NA))
# Calculate response distribution indicators
resp_distributions(x = testdata) |>
  round(2)
#> n_na prop_na ii_mean ii_sd ii_median mahal
#> 1 0 0.00 1.33 0.58 1 2.04
#> 2 0 0.00 3.67 1.53 4 1.60
#> 3 0 0.00 2.67 0.58 3 1.38
#> 4 1 0.33 NA NA NA NA
#> 5 0 0.00 3.33 0.58 3 0.97
#> 6 0 0.00 2.33 1.53 2 1.38
#> 7 1 0.33 NA NA NA NA
#> 8 0 0.00 2.67 2.08 2 1.88
#> 9 2 0.67 NA NA NA NA
#> 10 3 1.00 NA NA NA NA
# Include respondents with NA values by decreasing the
# necessary number of valid responses per respondent.
resp_distributions(
  x = testdata,
  min_valid_responses = 0.2) |>
  round(2)
#> n_na prop_na ii_mean ii_sd ii_median mahal
#> 1 0 0.00 1.33 0.58 1.0 2.27
#> 2 0 0.00 3.67 1.53 4.0 1.68
#> 3 0 0.00 2.67 0.58 3.0 1.05
#> 4 1 0.33 4.00 1.41 4.0 2.21
#> 5 0 0.00 3.33 0.58 3.0 1.24
#> 6 0 0.00 2.33 1.53 2.0 1.29
#> 7 1 0.33 3.50 0.71 3.5 0.71
#> 8 0 0.00 2.67 2.08 2.0 2.24
#> 9 2 0.67 3.00 NaN 3.0 0.24
#> 10 3 1.00 NA NA NA NA