The present investigation is the first examination of the factor structures, reliability, external validity, longitudinal invariance, and stability of the De Jong Gierveld Loneliness Scale (DJGLS), as used with early adolescents. It is based on a two-wave, large, representative sample of Polish primary school pupils. The results demonstrate that the model most reflective of the factor structure of the DJGLS is the bifactor model, which assumes one highly reliable general factor (overall sense of loneliness) and two relatively irrelevant subfactors. Essential unidimensionality (the general factor accounting for three fourths of the common variance) suggests that interpreting the subfactors over and above the general factor is inappropriate. The longitudinal confirmatory factor analysis indicated that the bifactor structure of the DJGLS is invariant over time. Correlations with self-rated loneliness, sociometric acceptance/rejection, social self-efficacy, identification with class group, family structure, and gender provide support for the validity of the DJGLS. This implies that it could be used as a measure of loneliness in adolescence that does not involve references to the school context, making it possible to conduct studies that go beyond the school period and compare the intensity of the feeling of loneliness in that group with other age groups.
The current study investigated the utility and validity of a computerized "depression" module of the Minnesota Multiphasic Personality Inventory–Second version (MMPI-2), with and without sequential testing rules, with a college student sample. Participants completed one of three MMPI-2 test–retest administrations (i.e., conventional–conventional, conventional–module, or conventional–sequential module) as well as 15 criterion measures across two testing sessions exactly 1 week apart. The findings pointed to statistically significant and clinically meaningful time-savings in administering selected MMPI-2 scales (for both full-length and variable-length versions). Criterion measures rationally selected to represent similar (depression, anhedonia, anxiety) and dissimilar (behavioral, thought, and somatic dysfunction) psychological constructs were administered to assess the convergent and discriminant validity of the depression module. The criterion correlations suggested minimal differences in discriminant and convergent validity across administration modes, suggesting limited to no impact of administering targeted MMPI-2 scales in terms of construct validity.
A substantial amount of research has examined the developmental trajectory of antisocial behavior and, in particular, the relationship between antisocial behavior and maladaptive personality traits. However, research typically has not controlled for previous behavior (e.g., past violence) when examining the utility of personality measures, such as self-report scales of antisocial and borderline traits, in predicting future behavior (e.g., subsequent violence). Examination of the potential interactive effects of measures of both antisocial and borderline traits also is relatively rare in longitudinal research predicting adverse outcomes. The current study utilizes a large sample of youthful offenders (N = 1,354) from the Pathways to Desistance project to examine the separate effects of the Personality Assessment Inventory Antisocial Features (ANT) and Borderline Features (BOR) scales in predicting future offending behavior as well as trends in other negative outcomes (e.g., substance abuse, violence, employment difficulties) over a 1-year follow-up period. In addition, an ANT x BOR interaction term was created to explore the predictive effects of secondary psychopathy. ANT and BOR both explained unique variance in the prediction of various negative outcomes even after controlling for past indicators of those same behaviors during the preceding year.
Delay discounting has been linked to important behavioral, health, and social outcomes, including academic achievement, social functioning, and substance use, but thoroughly measuring delay discounting is tedious and time consuming. We develop and consistently validate an efficient and psychometrically sound computer adaptive measure of discounting. First, we develop a binary search–type algorithm to measure discounting using a large international data set of 4,190 participants. Using six independent samples (N = 1,550), we then present evidence of concurrent validity with two standard measures of discounting and a measure of discounting real rewards; convergent validity with addictive behavior, impulsivity, personality, and survival probability; and divergent validity with time perspective, life satisfaction, age, and gender. The new measure is considerably shorter than standard questionnaires, includes a range of time delays, can be applied to multiple reward magnitudes, and shows excellent concurrent, convergent, divergent, and discriminant validity, the last demonstrated by greater sensitivity to the effects of smoking behavior on discounting.
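The abstract above names a binary search–type algorithm but does not specify it. The following is a minimal sketch of how such an adaptive procedure could work, not the authors' implementation. It assumes the hyperbolic discounting model V = A / (1 + kD), a single delayed reward, and a simulated respondent; the function names, search bounds, and number of questions are all hypothetical.

```python
import math


def discounted_value(amount, delay_days, k):
    """Present value of a delayed reward under hyperbolic discounting (assumed model)."""
    return amount / (1.0 + k * delay_days)


def estimate_log_k(choose_delayed, delayed_amount=100.0, delay_days=30.0,
                   log_k_low=math.log(1e-5), log_k_high=math.log(1.0),
                   n_questions=8):
    """Bisect on log(k): each answer halves the interval that can contain the respondent's k."""
    low, high = log_k_low, log_k_high
    for _ in range(n_questions):
        mid = (low + high) / 2.0
        # Immediate amount that a respondent with k = exp(mid) would value
        # exactly as much as the delayed reward.
        immediate = discounted_value(delayed_amount, delay_days, math.exp(mid))
        if choose_delayed(immediate, delayed_amount, delay_days):
            # Preferring the delayed reward implies a discount rate below exp(mid).
            high = mid
        else:
            low = mid
    return (low + high) / 2.0


def make_respondent(true_k):
    """Hypothetical noise-free respondent who always picks the option with higher present value."""
    def choose_delayed(immediate, delayed_amount, delay_days):
        return discounted_value(delayed_amount, delay_days, true_k) > immediate
    return choose_delayed


estimated_k = math.exp(estimate_log_k(make_respondent(0.01)))
print(round(estimated_k, 4))
```

Searching on the log scale is a design choice made here because discount rates typically span several orders of magnitude; eight questions narrow the assumed range by a factor of 256.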
The World Health Organization Quality of Life Scale (WHOQOL-BREF) is predicated on a multidimensional perspective on quality of life (QOL); yet studies are unclear about the latent structure underlying responses. This article reports on a study conducted to investigate the structure of WHOQOL-BREF scores. Competing latent structures of the data were examined in a general population sample. In addition, the complete factorial invariance of the retained model was investigated across gender. We also investigated latent mean differences in the QOL dimensions over age as well as age-by-gender interaction effects. Based on responses to the WHOQOL-BREF, support was found for a bifactor exploratory structural equation modeling representation of the data. This measurement structure accounts for construct-relevant multidimensionality in item responses due to the presence of general and specific factors underlying the data and the fallibility of indicators as pure reflections of only the single constructs they are purported to measure. Furthermore, support was found for measurement and structural invariance across gender. Finally, evidence was obtained for a curvilinear relationship of age with QOL, characterized by a midlife nadir. Taken together, the results of the study yield important validation data for the WHOQOL-BREF and tentatively resolve the dimensionality issues in the measurement of QOL using this instrument.
Background: In the past decade, the bifactor model of attention-deficit/hyperactivity disorder (ADHD) has been extensively researched. This model consists of an ADHD general dimension and two specific factors: inattention and hyperactivity/impulsivity. All studies conclude that the bifactor is superior to the traditional two-correlated factors model, according to the fit obtained by factor analysis. However, the proper interpretation of a bifactor not only depends on the fit but also on the quality of the measurement model. Objective: To evaluate the model-based reliability, distribution of common variance and construct replicability of general and specific ADHD factors. Method: We estimated expected common variance, omega hierarchical/subscale and H-index from standardized factor loadings of 31 ADHD bifactor models previously published. Results and Conclusion: The ADHD general factor explained most of the common variance. Given the low reliable variance ratios, the specific factors were difficult to interpret. However, in clinical samples, inattention acquired sufficient specificity and stability for interpretation beyond the general factor. Implications for research and clinical practice are discussed.
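The indices named in the Method above can be computed directly from standardized loadings. The sketch below uses the standard formulas for explained common variance (ECV), omega hierarchical, omega hierarchical subscale, and construct replicability (H). It assumes orthogonal factors and that each item loads on the general factor and at most one specific factor. The loadings and the function names are invented for illustration and are not taken from any of the 31 published models.

```python
def construct_h(loadings):
    """Construct replicability H for one factor, using its nonzero standardized loadings."""
    ratio_sum = sum(l * l / (1.0 - l * l) for l in loadings if l != 0.0)
    if ratio_sum == 0.0:
        return 0.0
    return 1.0 / (1.0 + 1.0 / ratio_sum)


def bifactor_indices(general, specifics):
    """general: one loading per item; specifics: name -> loadings (0.0 where the item does not load)."""
    n_items = len(general)
    communality = [
        general[i] ** 2 + sum(s[i] ** 2 for s in specifics.values())
        for i in range(n_items)
    ]
    uniqueness = [1.0 - h2 for h2 in communality]

    general_part = sum(general) ** 2
    specific_parts = sum(sum(s) ** 2 for s in specifics.values())
    total_variance = general_part + specific_parts + sum(uniqueness)

    result = {
        # Share of common variance attributable to the general factor.
        "ecv": sum(g * g for g in general) / sum(communality),
        # Share of total-score variance attributable to the general factor.
        "omega_h": general_part / total_variance,
        "h_general": construct_h(general),
    }
    for name, s in specifics.items():
        items = [i for i in range(n_items) if s[i] != 0.0]
        g_sub = sum(general[i] for i in items) ** 2
        s_sub = sum(s[i] for i in items) ** 2
        u_sub = sum(uniqueness[i] for i in items)
        # Share of subscale-score variance unique to the specific factor.
        result["omega_hs_" + name] = s_sub / (g_sub + s_sub + u_sub)
        result["h_" + name] = construct_h(s)
    return result


# Hypothetical loadings: six items, three per specific factor.
example = bifactor_indices(
    general=[0.7, 0.7, 0.7, 0.7, 0.7, 0.7],
    specifics={
        "inattention": [0.3, 0.3, 0.3, 0.0, 0.0, 0.0],
        "hyperactivity": [0.0, 0.0, 0.0, 0.3, 0.3, 0.3],
    },
)
for key, value in example.items():
    print(key, round(value, 3))
```

With loadings like these, a dominant general factor and weak specific factors, the specific factors yield low omega hierarchical subscale and H values, which is the pattern the abstract describes as difficult to interpret.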
Positive emotionality, anhedonia, and reward sensitivity share motivational and experiential elements of approach motivation and pleasure. Earlier work has examined the interrelationships among these constructs from measures of extraversion. More recently, the Research Domain Criteria introduced the Positive Valence Systems as a primary dimension to better understand psychopathology. However, the suggested measures tapping this construct have not yet been integrated within the structural framework of personality, even at the level of self-report. Thus, this study conducted exploratory factor and exploratory bifactor analyses on 17 different dimensions relevant to approach motivation, spanning anhedonia, behavioral activation system functioning, and positive emotionality. Convergent validity of these dimensions was tested by examining associations with depressive symptoms. Relying on multiple indices of fit, our preferred model included a general factor along with specific factors of affiliation, positive emotion, assertiveness, and pleasure seeking. These factors demonstrated different patterns of association with depressive symptoms. We discuss the plausibility of this model and highlight important future directions for work on the structure of a broad Positive Valence Systems construct.
Attention deficit/hyperactivity disorder (ADHD) is a chronic disorder that afflicts individuals into adulthood. The field continues to refine diagnostic standards for ADHD in adults, complicated by the disorder’s heterogeneous presentation, subjective symptoms, and overlap with other disorders. Two key diagnostic questions are from whom to collect diagnostic information and which symptoms should be contained on an adult diagnostic checklist. Using a trifactor model, Martel et al. examine these questions in a sample of adults with and without self-identified ADHD symptoms. In this response, we highlight the importance of their finding that self and informant symptom reports differ in a sample of adults who acknowledge ADHD symptoms. We also review issues that continue to face the field related to model specification, evaluating symptom utility, and sample composition, discussing how these issues influence conclusions that may be drawn from Martel et al. and similar investigations. We conclude that the article makes an important research contribution about the nature of self and informant ADHD symptom reports but emphasize that symptom checklist refinement must occur through a broad lens that considers work from a range of sample types and clinically informative analytic strategies.
The present study examined the factor structure of the Wide Range Assessment of Memory and Learning–Second Edition (WRAML2) core battery with participants from the normative sample aged 9 to 90 years (n = 880) using higher order exploratory and confirmatory factor analytic techniques that were not reported in the WRAML2 Administration and Technical Manual. Exploratory factor analysis results suggested only one factor, whereas confirmatory factor analysis results favored the three factors posited by the test authors. Although model fit statistics were equivalent for the oblique, indirect hierarchical, and direct hierarchical measurement models, it was determined that the bifactor model best disclosed the influence of latent dimensions on WRAML2 manifest variables. In the three-factor bifactor model, the general factor accounted for 31% of the total variance and 69% of the common variance, whereas the three first-order factors combined accounted for 14% of the total variance and 31% of the common variance. Latent factor reliability coefficients (as estimated by ωh) indicated that only the general factor was measured with enough precision to warrant confident clinical interpretation. Implications for clinical interpretation of WRAML2 scores and the procedures utilized in the development of related measures are discussed.
Measures of body dysmorphic disorder symptoms have received little psychometric evaluation in adolescent samples. This study aimed to examine cross-sex measurement invariance in the Body Image Questionnaire–Child and Adolescent version (BIQ-C) to establish whether observed sex differences in total scores may be meaningful or due to differences in measurement properties. A sample of 3,057 Australian high school students completed the initial screening item of the measure (63.2% male, Mage = 14.58 years, SD = 1.37, range = 12-18 years). Of these participants, 1,512 (49.5%) reported appearance concerns and thus completed the full measure. Partial scalar measurement invariance was established for a revised two-factor, 9-item version of the BIQ-C (BIQ-C-9). Females reported significantly greater latent factor variance, higher BIQ-C-9 total and factor scores, and higher scores on most individual BIQ-C-9 items. The measure can be used with caution to compare body dysmorphic disorder symptoms between male and female adolescents, though sex-specific cutoff scores should be used.
Suicide has become an issue of great concern within the U.S. military in recent years, with recent reports indicating that suicide has surpassed combat-related deaths as the leading cause of death. One concern regarding suicide risk in the military is that existing self-report measures allow service members to conceal or misrepresent current suicidal ideation or suicide plans and preparations. Implicit association tests (IATs) are computer-based, reaction time measures that have been shown to be resilient to such masking of symptoms. The death/suicide implicit association test (d/s-IAT) is an empirically supported IAT that is specific to death and suicide. The present study examined whether the performance of 1,548 U.S. military service members on the d/s-IAT significantly predicted lifetime suicidal ideation and depression. Zero-inflated negative binomial regression analyses were used to test these associations. Results indicated that the d/s-IAT was associated with neither a history of suicidal ideation nor a history of depression.
The Diagnostic and Statistical Manual of Mental Disorders–Fifth edition (DSM-5) Personality and Personality Disorders workgroup developed the Personality Inventory for the DSM-5 (PID-5) for the assessment of the alternative trait model for DSM-5. Along with this measure, the American Psychiatric Association published an abbreviated version, the PID-5–Brief form (PID-5-BF). Although this measure is available on the DSM-5 website for use, only two studies have evaluated its psychometric properties and validity and no studies have examined the U.S. version of this measure. The current study evaluated the reliability, factor structure, and construct validity of PID-5-BF scale scores. This included an evaluation of the scales’ associations with Section II PDs, a well-validated dimensional measure of personality psychopathology, and broad externalizing and internalizing psychopathology measures. We found support for the reliability of PID-5-BF scales as well as for the factor structure of the measure. Furthermore, a series of correlation and regression analyses showed conceptually expected associations between PID-5-BF and external criterion variables. Finally, we compared the correlations with external criterion measures to those of the full-length PID-5 and PID-5–Short form. Intraclass correlation analyses revealed a comparable pattern of correlations across all three measures, thereby supporting the use of the PID-5-BF as a screening measure of dimensional maladaptive personality traits.
Severely and persistently depressed outpatients (n = 138) completed interpersonal circumplex measures of self-efficacy, problems, and values/goals. Compared with normative samples, patients showed deficits in agency: They reported less self-efficacy, especially for being assertive, tough, and influential; stronger goals, especially to avoid conflict or humiliation; and more problems, especially with being too timid, inhibited, and accommodating. Circular and structural summary indices suggested greater variability among patients in goal profiles than in efficacy or problem profiles; nonetheless, latent profile analyses identified coherent subgroups of patients with distinct patterns of efficacy (e.g., lacking confidence for speaking up vs. setting boundaries) and problems (e.g., being overly inhibited vs. self-sacrificing) as well as goals (e.g., to be included vs. unobtrusive). Women and those with more severe symptoms were overrepresented in the least agentic groups. The results show how observing patients through multiple circumplex surfaces simultaneously can help clarify their interpersonal dispositions and inform interventions.
The Stress Overload Scale (SOS) has demonstrated validity in predicting pathological stress reactions; however, at 30 items, it is lengthy for some clinical applications. Here, two studies tested a 10-item SOS–Short (SOS-S). First, the SOS-S was compared with the SOS in a longitudinal community study (n = 391), using indices of pathology as criterion measures. Results showed the SOS-S to be equivalent to the SOS in reliability and concurrent and predictive validity, although not quite as sensitive to somatic symptoms. Second, the SOS-S was compared to the 10-item Perceived Stress Scale in a cross-sectional community study (n = 249), in which symptoms and response biases were also assessed. Results showed both measures to be susceptible to biasing, and the SOS-S to demonstrate superior validity when biases were controlled. The SOS-S appears a viable alternative to the SOS and the 10-item Perceived Stress Scale for assessing stress, and risk for sequelae, across a broad demographic spectrum.
Washington state requires school districts to file court petitions on students with excessive unexcused absences, resulting in thousands of youth becoming involved in the court system. Once in the system, decisions are made about the level of risk each youth has for maladaptive behaviors. The Washington Assessment of the Risks and Needs of Students was created to assist youth service providers, courts, and schools to identify an adolescent’s needs for social, emotional, or educational intervention. However, the profile-based decisions advocated for by test developers lack empirical justification. This study employed latent profile analysis to examine risk and needs profiles of adolescents based on the Washington Assessment of the Risks and Needs of Students assessment. Profiles were developed to aid understanding of behaviors associated with school truancy, and examined across outcome variables (e.g., suspensions, arrests) to evaluate evidence in support of predictive claims. Results suggest distinct profiles that differ on important outcomes.
Sample sizes of 50 have been cited as sufficient to obtain stable means and standard deviations in normative test data. The influence of skewness on this minimum number, however, has not been evaluated. Normative test data with varying levels of skewness were compiled for 12 measures from 7 tests collected as part of ongoing normative studies in Brisbane, Australia. Means and standard deviations were computed from sample sizes of 10 to 100 drawn with replacement from larger samples of 272 to 973 cases. The minimum sample size was determined by the number at which both mean and standard deviation estimates remained within the 90% confidence intervals surrounding the population estimates. Sample sizes of greater than 85 were found to generate stable means and standard deviations regardless of the level of skewness, with smaller samples required in skewed distributions. A formula was derived to compute recommended sample size at differing levels of skewness.
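The resampling procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The 90% intervals use normal-approximation standard errors, estimates are averaged over repeated resamples at each size, and "remained within" is read as: the smallest size from which every larger size also falls inside both intervals. The function names, step size, number of draws, and the simulated skewed data are all hypothetical.

```python
import math
import random
import statistics


def interval_90(population, z=1.645):
    """Approximate 90% intervals around the population mean and SD (normal approximation)."""
    n = len(population)
    mean = statistics.fmean(population)
    sd = statistics.stdev(population)
    mean_half = z * sd / math.sqrt(n)
    sd_half = z * sd / math.sqrt(2 * (n - 1))
    return (mean - mean_half, mean + mean_half), (sd - sd_half, sd + sd_half)


def minimum_stable_n(population, sizes=range(10, 101, 5), n_draws=100, seed=0):
    """Smallest sample size from which mean and SD estimates stay inside both intervals."""
    rng = random.Random(seed)
    mean_ci, sd_ci = interval_90(population)
    stable = []
    for n in sizes:
        means, sds = [], []
        for _ in range(n_draws):
            sample = rng.choices(population, k=n)  # sampling with replacement
            means.append(statistics.fmean(sample))
            sds.append(statistics.stdev(sample))
        avg_mean = statistics.fmean(means)
        avg_sd = statistics.fmean(sds)
        stable.append(mean_ci[0] <= avg_mean <= mean_ci[1]
                      and sd_ci[0] <= avg_sd <= sd_ci[1])
    # Walk down from the largest size and stop at the first unstable one.
    answer = None
    for n, ok in reversed(list(zip(sizes, stable))):
        if not ok:
            break
        answer = n
    return answer


# Simulated right-skewed "normative" data standing in for real test scores.
data_rng = random.Random(1)
population = [data_rng.expovariate(1.0) for _ in range(500)]
print(minimum_stable_n(population))
```

The function returns `None` when no candidate size satisfies the criterion, which can happen with few resamples or narrow intervals.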
Ecologically valid indicators of executive functions are designed to capture dysfunction not easily measured in a lab setting. Here, we present two studies on the development and validity analyses of a behavioral screener for executive functions among young adults. In Study 1, we derived a four-factor (problem solving, attentional control, behavioral control, and emotional control) behavioral screener using a sample of 765 individuals. We used invariance analyses to evaluate the screener’s measurement reliability across sex. In Study 2, we replicated the screener derivation analyses using an independent sample of 197 undergraduates. To further examine the screener’s validity, we evaluated it against a well-known executive functions rating scale. The four-factor model was supported in both samples and analyses provided support for this screener as a valid and reliable measure for everyday executive functions among young adults.
There are few assessments that gather valid, highly detailed data on short-term (i.e., weekly) symptom frequency/severity retrospectively. In particular, methodologies that provide valid data for research investigating symptom changes are typically prospective, expensive, and burdensome. The purpose of this study was to evaluate a new interactive and graphical assessment tool for gathering detailed information about eating-related symptom frequency/severity retrospectively over a 3-month period. A mixed eating disorder sample (N = 113) recruited from the community provided symptom data once weekly for 12 weeks and completed the Interactive, Graphical Assessment Tool (IGAT) assessing eating disorder symptoms on three occasions to determine the test–retest reliability and concurrent validity of the IGAT. The IGAT performed marginally better than other measures for retrospective symptom frequency assessment in the eating disorders and did so at a greater level of detail than other available tools. Future research should evaluate the IGAT with other behaviors of interest.
The aim of this study was to assess the extent to which discrepancies between self-reported and clinician-rated severity of depression are due to inconsistent self-reports. Response inconsistency threatens the validity of the test score. We used data from a large sample of outpatients (N = 5,959) who completed the self-report Beck Depression Inventory–II (BDI-II) and the clinician-rated Montgomery–Åsberg Depression Rating Scale (MADRS). We used item response theory-based person-fit analysis to quantify the inconsistency of the self-report item scores. Inconsistency was weakly positively related to patient–clinician discrepancy (i.e., higher BDI-II scores relative to MADRS scores). The mediating effect of response inconsistency in the relationship between discrepancy and demographic (e.g., ethnic origin) and clinical variables (e.g., cognitive problems) was negligible. The small direct and mediating effects of response inconsistency suggest that inaccurate patient self-reports are not a major cause of patient–clinician discrepancy in outpatient samples. Future research should investigate the role of clinician biases in explaining clinician–patient discrepancy.
This study outlines the development of the Parent Experience of Assessment Scale (PEAS), which is based on principles of Therapeutic Assessment. The study includes pilot testing of a 64-item questionnaire across 134 participants, with psychometric analyses utilizing confirmatory factor analysis. The revised version consists of 24 items across five subscales with appropriate internal consistency reliability (alphas from .76 to .88). The PEAS demonstrates statistically significant relations with general parent satisfaction, with two subscales indicating significant direct effects via structural equation modeling. The PEAS has potential utility for providing more nuanced clinical and investigative feedback regarding the parent process during child psychological assessment.
The purpose of this study was to develop and provide initial validation for a measure of adult cyber intimate partner aggression (IPA): the Cyber Aggression in Relationships Scale (CARS). Drawing on recent conceptual models of cyber IPA, items from previous research exploring general cyber aggression and cyber IPA were modified and new items were generated for inclusion in the CARS. Two samples of adults 18 years or older were recruited online. We used item factor analysis to test the factor structure, model fit, and invariance of the measure structure across women and men. Results confirmed that three-factor models for both perpetration and victimization demonstrated good model fit, and that, in general, the CARS measures partner cyber aggression similarly for women and men. The CARS also demonstrated validity through significant associations with in-person IPA, trait anger, and jealousy. Findings suggest the CARS is a useful tool for assessing cyber IPA in both research and clinical settings.
Violent ideations (VIs) have potential significance across clinical, forensic, and research contexts. They feature in dominant theories of violence, are a candidate risk factor in violence prediction, and are a potential target for therapeutic intervention. Given this, there is a need for multi-item psychometrically supported measures of VIs. We report on the development and validation of the "Violent Ideations Scale" (VIS): a brief measure of VIs. In a normative sample of N = 1,276 older adolescents, we evaluated the dimensionality, sex invariance, concurrent validity, and discriminative power of the VIS. The VIS showed unidimensionality and only minor measurement differences across males and females, correlated well with a preexisting measure of VIs, and showed a strong relation to criminal violence. These features support the use of the VIS as a research tool and as a possible source of information regarding violence risk in clinical and forensic settings.
Behavioral diaries are used for observing health-related behaviors prospectively. Little is known about patterns and predictors of diary compliance to better understand differential attrition. An analytic sample of 241 young men who have sex with men (YMSM) from a 2-month diary study of substance use and sexual behavior were randomized to complete daily or weekly timeline followback diaries. Latent class growth analyses were used to analyze data. Weekly and daily diary groups produced similar compliance patterns: high, low, and declining compliance groups. Black YMSM were more likely to be in the declining compared with the high compliance group. YMSM who were randomly assigned to receive automated feedback about risk behaviors did not differ in compliance rate compared with those who did not. Risk behavior engagement did not predict compliance in the daily condition, but some substances predicted compliance in the weekly condition. Implications for observational and behavior change methods are discussed.
Despite literature highlighting the relevance of negative and positive emotions to risky behaviors, little research has examined the emotion-dependent context of risky behaviors. This study sought to develop and validate a comprehensive measure of the frequency and emotion-dependent context of distinct clinically relevant risky behaviors (the Risky Behavior Questionnaire [RBQ]), as well as to examine the unique relations of the RBQ Negative and Positive Scales (which assess the general tendency to engage in risky behaviors in the context of negative vs. positive emotions, respectively) to specific risky behaviors. Participants were 176 patients in a residential substance use disorder treatment facility (M age = 34.18; 65.3% White, 53.4% female). Results provided support for the construct and incremental validity (relative to extant measures of related constructs) of the RBQ Scales, as well as the differential relevance of RBQ Negative and Positive Scales to specific risky behaviors.
Scales to assess the eight octants and two axes of the interpersonal circumplex (IPC) using items from the revised NEO Personality Inventory were introduced by Traupman et al. Item changes in the revised and renormed third edition of the NEO instrument (NEO-PI-3) have affected item content in all eight octant scales, underscoring the need to reexamine the IPC scales. The current study examines the circumplex structure of the revised octant scales in the NEO-PI-3 and their correlations with the Dominance and Warmth scales of the Personality Assessment Inventory in 568 undergraduate students. The data show perfect fit to circumplex structure, suggesting equivalent or better assessment of the IPC with the NEO-PI-3 octant scales. Convergence of the eight octants with the Personality Assessment Inventory interpersonal scales further supports their saturation with interpersonal content and appropriate location within the IPC.
The current study tests the underlying structure of a multidimensional construct of helicopter parenting (HP), assesses reliability of the construct, replicates past relations of HP to poor emotional functioning, and expands the literature to investigate links of HP to emerging adults’ decision-making and academic functioning. A sample of 377 emerging adults (66% female; ages 17-30; 88% European American) were administered several items assessing HP as well as measures of other parenting behaviors, depression, anxiety, decision-making style, grade point average, and academic functioning. Exploratory factor analysis results suggested a four-factor, 23-item measure that encompassed varying levels of parental involvement in the personal and professional lives of their children. A bifactor model was also fit to the data and suggested the presence of a reliable overarching HP factor in addition to three reliable subfactors. The fourth subfactor was not reliable, and its item variances were subsumed by the general HP factor. HP was found to be distinct from, but correlated in expected ways with, other reports of parenting behavior. HP was also associated with poorer emotional functioning, decision making, and academic functioning. Parents’ information-seeking behaviors, when they occurred in the absence of other HP behaviors, were associated with better decision making and academic functioning.
The Multidimensional State Boredom Scale (MSBS) is a promising new self-report measure of state boredom. Two condensed versions of the scale have also been introduced. This study helped explore the psychometric qualities of these scales, using a large sample of Australian adults (N = 1,716), as well as two smaller samples (N = 199 and N = 422). Data analyses indicated strong convergent validity and very high internal consistency for the scales. Test–retest reliability over a 6- to 8-day period was moderately high. Confirmatory factor analyses of the MSBS authors’ suggested factor structure indicated good fit for this model. However, some of the data analyses raise questions as to whether the scale includes meaningful subfactors. Overall, the MSBS (and Short Form) is recommended for researchers who wish to assess state boredom.
Two studies were conducted to develop and validate a six-item scale for measuring context-specific attributions regarding the extent to which people either blame or exonerate partners during couples’ conflicts. Context-specific attributions pertain to appraisals made during a single episode of relationship conflict, and the scale was expected to be distinct from existing attribution scales measuring people’s schemas regarding the types of attributions they typically make. Study 1 included 2,452 people in marital or cohabiting relationships; Study 2 included 172 people in dating relationships, and participants in both studies completed Internet questionnaires. Item response theory was used to create an attribution scale using the fewest items needed to discriminate reliably across the full range of attribution levels. The resulting scale produced an expected pattern of convergent and divergent correlations with other context-specific measures, including two types of underlying concerns and three types of emotion. The context-specific attribution scale explained variance in these criterion variables that could not be explained by other existing scales that assess attributions at the schematic level.
Numerous intelligence tests are available to psychological diagnosticians to assess children’s intelligence, but whether they yield comparable test results has been little studied. We examined test scores of 206 typically developing children aged 6 to 11 years on five German intelligence tests (Reynolds Intellectual Assessment Scales; Snijders Oomen Nonverbal Intelligence Test; Intelligence and Development Scales; Wechsler Intelligence Scale for Children, 4th edition; Culture Fair Intelligence Test Scale 2), which were individually administered. On a sample level, the test scores showed strong correlation and little or no mean difference. These results indicate that the tests measure a similar underlying construct, which is interpreted as general intelligence. On an individual level, however, test scores significantly differed across tests for 12% to 38% of the children. Differences did not depend on which test was used but rather on unexplained error. Implications for the application of intelligence assessment in psychological practice are discussed.
The factorial structure of the Parental Bonding Instrument (PBI) has been frequently studied in diverse samples, but no study has examined its psychometric properties in large, population-based samples. In particular, important questions, such as measurement invariance across parental and offspring gender, have not been addressed. We evaluated the PBI based on responses from a large, representative population-based sample, using an exploratory structural equation modeling method appropriate for categorical data. Analysis revealed a three-factor structure representing "care," "overprotection," and "autonomy" parenting styles. In terms of psychometric measurement validity, our results supported the complete invariance of the PBI ratings across sons and daughters for their mothers and fathers. The PBI ratings were also robust in relation to personality and mental health status. In terms of predictive value, paternal care showed a protective effect on mental health at age 43 in sons. The PBI is a sound instrument for capturing perceived parenting styles, and is predictive of mental health in middle adulthood.
The psychometric properties of the paper–pencil and online versions of the Beliefs Toward Mental Illness Scale (BTMI) were examined in two studies with Latina/o individuals. In Study 1, 316 Latina/o participants completed the BTMI in a paper–pencil mode. The original three-factor model was found to be a poor fit model for the sample. Subsequent exploratory and confirmatory factor analyses identified a four-factor model as the best fitting model for the sample. The identified factors were Dangerousness, Social Dysfunction, Incurability, and Embarrassment. In Study 2, the identified best fit model was tested with 280 Latina/o participants who completed the BTMI online. The four-factor model had adequate fit. A series of measurement invariance tests on the fit model supported equal factor loadings, but rejected equivalent intercepts across paper–pencil and online administration methods, though partially equivalent intercepts and residuals were found. Consequently, modality-specific norms are recommended, depending on whether paper–pencil or online venues are utilized for administration.
Hewitt and Flett’s 45-item Multidimensional Perfectionism Scale is a widely used instrument to assess self-oriented, other-oriented, and socially prescribed perfectionism. With 45 items, it is not overly lengthy, but there are situations where a short form is useful. Analyzing data from four samples, this article compares two frequently used 15-item short forms of the Multidimensional Perfectionism Scale—Cox et al.’s and Hewitt et al.’s—by examining to what degree their scores replicate the original version’s correlations with various personality characteristics (e.g., traits, social goals, personal/interpersonal orientations). Regarding self-oriented and socially prescribed perfectionism, both short forms performed well. Regarding other-oriented perfectionism, however, Cox et al.’s short form (exclusively composed of negatively worded items) performed less well than Hewitt et al.’s (which contains no negatively worded items). It is recommended that researchers use Hewitt et al.’s short form to assess other-oriented perfectionism rather than Cox et al.’s.
In this study, we investigated the factor structure of situational fears in agoraphobia by examining four models of the Avoidance Alone items in the Mobility Inventory for Agoraphobia. A main sample of 327 agoraphobic patients and an independent control sample of 64 agoraphobic patients were studied. A confirmatory factor analysis supported a four-factor model including a public places, an enclosed spaces, a public transportation, and an open spaces factor both for pre- and posttreatment data. The convergent and divergent validity of subscales derived from the four factors were supported by an expected pattern of correlations with interview-based measures. These subscales also proved to have satisfactory internal consistencies in the independent sample.
The externalizing spectrum may explain covariation among externalizing disorders observed in childhood and adulthood. Few prospective studies have examined whether the externalizing spectrum might manifest differently across time, reporters, and gender during childhood. We used a multitrait, multimethod model with parent and teacher reports of attention-deficit/hyperactivity disorder (ADHD) symptoms, oppositional defiant disorder (ODD) symptoms, and conduct disorder (CD) symptoms from kindergarten to Grade 5 in data from the Fast Track Project, a large multisite trial for children at risk for conduct problems (n = 754). The externalizing spectrum was stably related to ADHD, ODD, and CD symptoms from kindergarten to Grade 5, with similar contributions from parents and teachers. Configural, metric, and scalar invariance were largely supported across time, suggesting that the structure of the externalizing spectrum is stable over time. Configural and partial metric invariance were supported across gender, but scalar invariance was not supported, with intercepts consistently higher for males than for females. Overall, our findings confirm other research that the externalizing spectrum can be observed early in development as covariation between ADHD, ODD, and CD, and extend that work to show that it is relatively consistent across time and reporter, but not consistent across gender.
The Hamilton Anatomy of Risk Management–Forensic Version (HARM-FV) is a structured professional judgement tool of violence risk developed for use in forensic inpatient psychiatric settings. The HARM-FV is used with the Aggressive Incidents Scale (AIS), which provides a standardized method of recording aggressive incidents. We report the findings of the concurrent validity of the HARM-FV and the AIS with widely used measures of violence risk and aggressive acts, the Historical, Clinical, Risk Management–20, Version 3 (HCR-20V3) and a modified version of the Overt Aggression Scale. We also present findings on the predictive validity of the HARM-FV in the short term (1-month follow-up periods) for varying severities of aggressive acts. The results indicated strong support for the concurrent validity of the HARM-FV and AIS and promising support for the predictive accuracy of the tool for inpatient aggression. This article provides support for the continued clinical use of the HARM-FV within an inpatient forensic setting and highlights areas for further research.
Psychometric properties of the 100-item English-language HEXACO Personality Inventory–Revised (HEXACO-PI-R) were examined using samples of online respondents (N = 100,318 self-reports) and of undergraduate students (N = 2,868 self- and observer reports). The results were as follows: First, the hierarchical structure of the HEXACO-100 was clearly supported in two principal components analyses: each of the six factors was defined by its constituent facets and each of the 25 facets was defined by its constituent items. Second, the HEXACO-100 factor scales showed fairly low intercorrelations, with only one pair of scales (Honesty–Humility and Agreeableness) having an absolute correlation above .20 in self-report data. Third, the factor and facet scales showed strong self/observer convergent correlations, which far exceeded the self/observer discriminant correlations.
Extensive research has identified various social-cognitive vulnerabilities for internalizing disorders. However, few studies have assessed multiple disorders simultaneously, so it is unclear whether these vulnerabilities are transdiagnostic or specific risk factors. Their unique associations with disorders are also uncertain, given that they correlate strongly with neuroticism and one another. Psychiatric outpatients completed self-report and interview measures of six disorders (depression, generalized anxiety disorder, posttraumatic stress disorder, social anxiety, panic, obsessive-compulsive disorder), and personality (the Big Five, neuroticism facets, and four vulnerabilities: anxiety sensitivity, intolerance of uncertainty, perfectionism, experiential avoidance). All constructs were modeled as latent variables using structural equation modeling. All four vulnerabilities were closely associated with neuroticism, loading on its anxiety facet in factor analyses. Furthermore, after accounting for the contribution of neuroticism facets, intolerance of uncertainty and experiential avoidance were not uniquely associated with any disorders, and perfectionism was only related to obsessive-compulsive disorder. However, anxiety sensitivity accounted for substantial unique variance in several disorders (i.e., depression, social anxiety, posttraumatic stress disorder, and panic). We discuss theoretical and clinical implications of these results.
In clinical neuropsychology, it is often necessary to estimate a patient’s premorbid level of cognitive functioning in order to evaluate whether his scores on cognitive tests should be considered abnormal. In practice, test results from before the onset of brain pathology are rarely available, and the patient’s level of education is used instead as an estimate of his premorbid level. Unfortunately, level of education may be expressed on many different scales of education, which are difficult to use interchangeably. Here, we introduce a new scale that has the capacity to replace existing scales and can be used interchangeably with any of them: the Universal Scale of Intelligence Estimates (USIE). To achieve this, we propose to map all levels of existing educational scales to standard IQ scores. This USIE point estimate is supplemented with an estimation interval. We assert that USIE offers some important benefits for clinical practice and research.
Self-stigma instruments investigate how people with mental illness internalize public stigma. However, information on the psychometric properties of their scores is limited, especially regarding the cross-validation of scores from different instruments. Thus, we used confirmatory factor analyses (CFAs) and item-response theory (IRT) models to examine the Internalized Stigma of Mental Illness (ISMI) scale and the Self-Stigma Scale–Short (SSS-S). Participants with mental illness (n = 347) completed both instruments. The CFAs that simultaneously accounted for both the instrument (ISMI and SSS-S) and the trait (Affect, Cognitive, and Behavior concepts) effects outperformed those that accounted only for the instrument effect or only for the trait effect. All item scores fit the IRT model and showed ordered, progressing hierarchies in their step difficulties. We conclude that both instruments are feasible for measuring self-stigma and that future research can combine the items of both.
Conventional methods for producing test norms are often plagued with "jumps" or "gaps" (i.e., discontinuities) in norm tables and low confidence for assessing extreme scores. We propose a new approach for producing continuous test norms to address these problems that also has the added advantage of not requiring assumptions about the distribution of the raw data: Norm values are established from raw data by modeling the latter ones as a function of both percentile scores and an explanatory variable (e.g., age). The proposed method appears to minimize bias arising from sampling and measurement error, while handling marked deviations from normality—such as are commonplace in clinical samples. In addition to step-by-step instructions in how to apply this method, we demonstrate its advantages over conventional discrete norming procedures using norming data from two different psychometric tests, employing either age norms (N = 3,555) or grade norms (N = 1,400).
The Affective Neuroscience Personality Scales (ANPS) is a personality instrument based on six evolutionary-related brain systems that are at the foundation of human emotions and behaviors: SEEKING, CARING, PLAYFULNESS, FEAR, ANGER, and SADNESS. We sought to assess for the short and long versions of the ANPS: (a) the longitudinal measurement invariance and long-term (4-year) stability and (b) the sex measurement invariance. Using data from a Canadian cohort (N = 518), we used single-group confirmatory factor analysis to assess longitudinal invariance and multiple-group confirmatory factor analysis to assess sex invariance, according to a five-step approach evaluating five invariance levels (configural, metric, scalar, residual, and complete). Results supported full longitudinal invariance for both versions for all invariance levels. Partial residual invariance was supported for sex invariance. The long-term stability of both versions was good to excellent. Implications for personality assessment and ANPS development are discussed.
There have been over 30 studies and two meta-analyses comparing social anxiety between Asian Americans and European Americans. However, few have investigated the invariance of social anxiety measures that would make these comparisons appropriate. In the current study, we systematically examined psychometric properties and configural, metric, and scalar invariance of five social anxiety measures and four short forms that have been used more than once to compare Asian Americans (n = 232) and European Americans (n = 193). We found that four (i.e., SPS-6, SIAS-6, SPS, and SPAI-18) of the nine scales were scalar invariant, three scales (i.e., SIAS, SPAI, and B-FNES) only achieved configural invariance, and two scales (i.e., FNES and SADS) failed to achieve configural invariance. Latent mean comparisons based on the scalar invariant measures revealed higher social anxiety scores for Asian Americans than European Americans. The findings are discussed with regard to the issues and challenges when comparing social anxiety among different cultural and ethnic groups.
The present study examined the impact of performance validity test (PVT) failure on the Test of Premorbid Functioning (TOPF) in a sample of 252 neuropsychological patients. Word reading performance differed significantly according to PVT failure status, and number of PVTs failed accounted for 7.4% of the variance in word reading performance, even after controlling for education. Furthermore, individuals failing ≥2 PVTs were twice as likely as individuals passing all PVTs (33% vs. 16%) to have abnormally low obtained word reading scores relative to demographically predicted scores when using a normative base rate of 10% to define abnormality. When compared with standardization study clinical groups, those failing ≥2 PVTs were twice as likely as patients with moderate to severe traumatic brain injury and as likely as patients with Alzheimer’s dementia to obtain abnormally low TOPF word reading scores. Findings indicate that TOPF word reading based estimates of premorbid functioning should not be interpreted in individuals invalidating cognitive testing.
Dynamic psychological processes are most often assessed using self-report instruments. This places a constraint on how often and for how long data can be collected due to the burden placed on human participants. Smartphones are ubiquitous and highly personal devices, equipped with sensors that offer an opportunity to measure and understand psychological processes in real-world contexts over the long term. In this article, we present a novel smartphone approach to address the limitations of self-report in bipolar disorder where mood and activity are key constructs. We describe the development of MoodRhythm, a smartphone application that incorporates existing self-report elements from interpersonal and social rhythm therapy, a clinically validated treatment, and combines them with novel inputs from smartphone sensors. We reflect on lessons learned in transitioning from an existing self-report instrument to one that involves smartphone sensors and discuss the potential impact of these changes on the future of psychological assessment.
Evidence suggests that the behavior inhibition system (BIS) and fight-flight-freeze system play a role in the individual differences seen in social anxiety disorder; however, findings concerning the role of the behavior approach system (BAS) have been mixed. To date, the role of revised reinforcement sensitivity theory (RST) subsystems underlying social anxiety has been measured with scales designed for the original RST. This study examined how the BIS, BAS, and fight, flight, freeze components of the fight-flight-freeze system uniquely relate to social interaction anxiety and social observation anxiety using both a measure specifically designed for the revised RST and a commonly used original RST measure. Comparison of regression analyses with the Jackson-5 and the commonly used BIS/BAS Scales revealed important differences in the relationships between RST subsystems and social anxiety depending on how RST was assessed. Limitations and future directions for revised RST measurement are discussed.
This study explored the utility of the Montreal Cognitive Assessment (MoCA) in the detection of cognitive change over time in a community sample (age ranging from 58 to 77 years). The MoCA was administered twice approximately 3.5 years apart (n = 139). Participants were classified as having mild cognitive impairment (MCI) or as cognitively intact at follow-up based on multidisciplinary consensus. We excluded 33 participants who endorsed cognitive complaints at baseline. The MCI group (n = 53) showed a significant decrease in MoCA scores (M = –1.83, p < .001, d = 0.64). When accounting for age and education, the MCI group showed a decline of 1.7 points, while cognitively intact participants remained stable. Using Reliable Change Indices established by the cognitively intact group, 42% of MCI participants demonstrated a decline in MoCA scores. Results suggest that the MoCA can detect cognitive change in MCI over a 3.5-year period and preliminarily support the utility of the MoCA as a repeatable brief cognitive screening measure.
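As an illustration of the Reliable Change Index idea mentioned in the abstract above, the following minimal sketch uses the Jacobson-Truax formulation, in which the observed change is divided by the standard error of a difference score. The abstract does not say which RCI variant the study used (practice-adjusted or regression-based variants are also common), so the formula choice, the function names, the 90% cutoff (z = 1.645), and all numeric inputs are assumptions made for illustration only.

```python
import math


def reliable_change_index(baseline, follow_up, sd_reference, retest_r):
    """Jacobson-Truax RCI: observed change / standard error of the difference.

    sd_reference and retest_r are hypothetical values that would come from a
    reference group (e.g., cognitively intact participants).
    """
    sem = sd_reference * math.sqrt(1.0 - retest_r)  # standard error of measurement
    se_diff = math.sqrt(2.0) * sem                  # standard error of a difference score
    return (follow_up - baseline) / se_diff


def classify_change(rci, z_crit=1.645):
    """Label a change as reliable decline, reliable improvement, or stable.

    z_crit = 1.645 corresponds to a two-sided 90% interval (an assumed cutoff).
    """
    if rci <= -z_crit:
        return "decline"
    if rci >= z_crit:
        return "improvement"
    return "stable"


# Hypothetical example: a 4-point drop, reference SD = 2.0, retest r = .75
example_rci = reliable_change_index(26, 22, sd_reference=2.0, retest_r=0.75)
example_label = classify_change(example_rci)
```

With these made-up inputs the standard error of measurement is 1.0, so a 4-point drop yields an RCI of about -2.83 and is labeled a reliable decline, whereas a 1-point drop would be labeled stable.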
Borderline personality disorder (BPD) is a diagnosis defined by impairments in several dynamic processes (e.g., interpersonal relating, affect regulation, behavioral control). Theories of BPD emphasize that these impairments appear in specific contexts, and emerging results confirm this view. At the same time, BPD is a complex construct that encompasses individuals with heterogeneous pathology. These features—dynamic processes, situational specificity, and individual heterogeneity—pose significant assessment challenges. In the current study, we demonstrate assessment and analytic methods that capture both between-person differences and within-person changes over time. Twenty-five participants diagnosed with BPD completed event-contingent, ambulatory assessment protocols over 21 days. We used p-technique factor analyses to identify person-specific psychological structures consistent with clinical theories of personality. Five exemplar cases are selected and presented in detail to showcase the potential utility of these methods. The presented cases’ factor structures reflect not only heterogeneity but also suggest points of convergence. The factors also demonstrated significant associations with important clinical targets (self-harm, interpersonal violence).
The primary goals of this study were to evaluate the dimensionality of the Penny et al. Sluggish Cognitive Tempo Scale and to compare model fits for parent- and youth self-report versions. Participants were 262 young adolescents (ages 10-15) comprehensively diagnosed with attention-deficit/hyperactivity disorder. Both confirmatory factor analysis (CFA) and bifactor modeling were used to determine whether the proposed three-factor structure previously identified through exploratory factor analysis could be confirmed. Results showed that although the three-factor CFA had better fit statistics than a one- or two-factor CFA, the bifactor model was the best-fitting model for both parent report and self-report. This implies that the Sluggish Cognitive Tempo Scale is best conceptualized as having an underlying general factor, with three specific factors that may represent different etiologies. Importantly, results also showed low-to-moderate correlations between raters and equivalent or better fit statistics for self-report in comparison with parent report.
Distress tolerance (DT) refers to the ability to tolerate aversive psychological states. Research has mainly focused on the link between low DT and psychopathology with little empirical work on individuals on the high end (i.e., distress overtolerance). Distress overtolerance has been conceptualized as a tendency to tolerate very high levels of distress despite the negative consequences to one’s well-being. Currently, no measures of distress overtolerance have been developed, and current measures for DT are not well-suited for measuring distress overtolerance. To establish distress overtolerance as a construct, an exploratory factor analysis (N = 251) of the distress overtolerance scale was conducted and revealed a two-factor structure (i.e., Capacity for Harm and Fear of Negative Evaluation). In Study 2 (N = 257), a confirmatory factor analysis revealed strong psychometric properties, the expected nomological network, good construct validity, and incremental criterion utility. Results showed that this scale can be used as a starting point for the theoretical framework behind distress overtolerance.
This study assessed the reliability and validity of the Stalking Risk Profile (SRP), a structured measure for assessing stalking risks. The SRP was administered at the point of assessment or retrospectively from file review for 241 adult stalkers (91% male) referred to a community-based forensic mental health service. Interrater reliability was high for stalker type, and moderate-to-substantial for risk judgments and domain scores. Evidence for predictive validity and discrimination between stalking recidivists and nonrecidivists for risk judgments depended on follow-up duration. Discrimination was moderate (area under the curve = 0.66-0.68) and positive and negative predictive values good over the full follow-up period (Mdn = 170.43 weeks). At 6 months, discrimination was better than chance only for judgments related to stalking of new victims (area under the curve = 0.75); however, high-risk stalkers still reoffended against their original victim(s) 2 to 4 times as often as low-risk stalkers. Implications for the clinical utility and refinement of the SRP are discussed.
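The area under the curve reported in the abstract above can be read as the probability that a randomly chosen recidivist received a higher risk rating than a randomly chosen nonrecidivist. The sketch below computes that quantity directly from pairwise comparisons; the numeric coding of risk judgments (1 = low, 2 = moderate, 3 = high) and the example data are assumptions for illustration, not values from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Returns the proportion of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical risk ratings (1 = low, 2 = moderate, 3 = high)
recidivists = [3, 2, 2]
nonrecidivists = [1, 2, 1]
example_auc = auc(recidivists, nonrecidivists)  # 8 of 9 pairs, counting ties as 0.5
```

A value of 0.5 corresponds to chance-level discrimination, which is why an AUC of 0.66 to 0.68 is described as moderate.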
Due to the complex and heterogeneous nature of obsessive–compulsive disorder (OCD), movement toward multimodal assessment has become necessary to more precisely understand the nature of the disorder and interrelations between symptom clusters. Thus, the present study utilized large undergraduate samples (total N = 800) to test the validity of six in vivo assessments of OC symptoms (i.e., one ordering/arranging task, two contamination fear/washing tasks, and three checking tasks). Associations between task-specific variables and self-reported symptom scores (as measured by the Obsessive–Compulsive Inventory–Revised [OCI-R]) were examined. The majority of the in vivo task variables (those presented in Studies 1-4) exhibited significant relationships with the corresponding OCI-R symptom subscale (i.e., ordering, washing, checking). However, many of the task variables demonstrated relationships with other OCI-R symptom subscales, as well. Some evidence for discriminant validity was found, as task variables were generally unrelated to past-week symptoms of depression or anxiety. While continued research is necessary to further establish the validity and utility of the tasks discussed in the current article, findings have implications for improving future empirical examination of OC symptoms.
Given the emerging body of literature demonstrating the validity of the interpersonal–psychological theory of suicide (IPTS), and the importance of increasing our understanding of the development of risk factors associated with suicidal behavior, it seems worthwhile both to expand IPTS research via Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF) correlates and to expand the availability of methods by which to assess the constructs of the IPTS. The present study attempted to do so in a large adult outpatient mental health sample by (a) inspecting associations between the IPTS constructs and the substantive scales of the MMPI-2-RF and (b) exploring the utility of MMPI-2-RF scale–based algorithms of the IPTS constructs. Correlations between the IPTS constructs and the MMPI-2-RF scale scores largely followed a pattern consistent with theory-based predictions, and we provide preliminary evidence that the IPTS constructs can be reasonably approximated using theoretically based MMPI-2-RF substantive scales. Implications of these findings are discussed.
We explored the measurement model of the adolescent version of the Centrality of Event Scale and its invariance across community (n = 1,079; 42.8% male), referred for foster care (n = 205; 58.0% male), and detained (n = 206 male) adolescent participants. Results indicated a three-factor measurement model, including all three functions that memories of significant life events may have, as a good fit to our data, particularly for male participants. This measurement model was invariant across boys taken from those different samples but not across gender. As for the short version of the instrument, a one-factor solution was the best fit to our data. It was invariant across boys taken from those different samples and across gender. Boys and girls expressed similar experiences, whereas community male adolescents reported the lowest impact of a meaningful event, in comparison with referred and with detained boys. These findings provide evidence on the validity of the scale for use with diverse adolescent samples, which may contribute to a better understanding of the impact that significant life events may have on the development of gender-specific and group-specific vulnerabilities.
There is evidence that the major anxiety and depressive disorders could reflect a single underlying internalization factor. For a group of 1,031 clinic-referred children, the study examined support for this factor, and used the two-parameter logistic model to examine the item response theory properties of the disorders in this factor. For the set of anxiety and depressive disorders, confirmatory factor analysis supported a one-factor model. The two-parameter logistic model analysis indicated that all the internalizing disorders in this factor were strong discriminators of the internalizing dimension. Also, they measured more of the internalizing dimension and with more precision in the upper half of the trait continuum. There was also support for the convergent validity of the internalizing dimension, in that it had large-to-medium effect size correlations with internalizing scores of other measures. The implications of the findings for clinical practice and clinical classification are discussed.
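To make the two-parameter logistic model in the abstract above concrete, the sketch below gives the standard 2PL response function and item information function. The parameter values are hypothetical and are not estimates from the study; they only illustrate why an indicator with a high discrimination (a) and a difficulty (b) above the mean measures most precisely in the upper half of the trait continuum.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of endorsing an item (here, meeting a disorder criterion).

    theta: latent trait level; a: discrimination; b: difficulty (location).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P), maximal at theta = b."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical indicator: strong discrimination (a = 2.0), located above the mean (b = 1.0)
info_upper = item_information(1.0, a=2.0, b=1.0)   # at the item's location
info_lower = item_information(-1.0, a=2.0, b=1.0)  # two units below the location
```

Because information peaks at theta = b and scales with the square of a, a set of highly discriminating indicators with positive b values concentrates measurement precision above the trait mean.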
In this article, we organize multimethod, multitimescale data around the interpersonal situation, a conceptual framework that can be used to integrate personality, psychopathology, and psychotherapy constructs in order to guide the assessment of clinical dynamics. We first describe the key variables of the interpersonal situation model and articulate methods for assessing those variables as they manifest (a) across different levels of personality, (b) across situations, and (c) within situations. We next use a case to demonstrate how to assess aspects of the interpersonal situation in a manner that enhances case conceptualization and facilitates the evaluation of clinical hypotheses. We also use this case to highlight challenges and decisions involved in implementing dynamic assessment in psychotherapy. We conclude by outlining areas in need of further exploration toward a more sophisticated approach to clinical practice that involves the routine assessment of dynamic processes.
The Dissociative Symptoms Scale (DSS) was developed to assess moderately severe levels of depersonalization, derealization, gaps in awareness or memory, and dissociative reexperiencing that would be relevant to a wide range of clinical populations. Structural analyses of data from four clinical and five nonclinical samples (N = 1,600) yielded four factors that reflected the domains of interest and showed good fit with the data. Sample scores were consistent with expectations and showed very good internal consistency and temporal stability. Analyses showed consistent evidence of convergent and divergent validity, and posttrauma elevations in scores and in patients with posttraumatic stress disorder provided additional evidence of construct validity. Item response theory analyses indicated that the items assessed moderately severe dissociative experiences. Overall, the results provide support for the reliability and validity of DSS total and subscale scores in the populations studied. Further work is needed to evaluate the performance of the DSS relative to structured interview measures and in samples of patients with other psychological disorders.
The nomothetic approach (i.e., the study of interindividual variation) dominates analyses of clinical data, even though its assumption of homogeneity across people and time is often violated. The idiographic approach (i.e., the study of intraindividual variation) is best suited for analyses of heterogeneous clinical data, but its person-specific methods and results have been criticized as unwieldy. Group iterative multiple model estimation (GIMME) combines the assets of the nomothetic and idiographic approaches by creating person-specific maps that contain a group-level structure. The maps show how intensively measured variables predict and are predicted by each other at different time scales. In this article, GIMME is introduced conceptually and mathematically, and then applied to an empirical data set containing the negative affect, detachment, disinhibition, and hostility composite ratings from the daily diaries of 25 individuals with personality pathology. Results are discussed with the aim of elucidating GIMME’s potential for clinical research and practice.
Alphabetic working memory (WM) tests, such as the Wechsler Adult Intelligence Scale–III and IV Letter Number Sequencing, are not appropriate for nonalphabetic cultures. This study examined the psychometric properties of the Taiwan Odd–Even Number Sequencing Test (TOENST) and identified representative norms. The TOENST and other mental screening tasks were administered to 300 randomly selected healthy participants, 32 purposively sampled patients with schizophrenia, and 32 quota-sampled controls. To investigate reliability and validity, a subset of the 300 healthy participants was randomly selected to receive a second TOENST (n = 30) or conventional WM tests (n = 42). The split-half reliability of the TOENST ranged from 0.69 to 0.95, and its test–retest reliability was 0.75. Criterion validity was demonstrated by significant correlations with conventional WM measures (all p < .05, except semantic verbal fluency), and construct validity was demonstrated by significant correlations with aging (main effect, F(10, 259) = 10.99, p < .001). Normative data were established, and performance was significantly associated with age and education. TOENST scores of patients with schizophrenia were significantly lower and correlated with frontal lobe tests, but not with demographic or clinical characteristics. The TOENST has adequate psychometric properties and clinical utility and is a viable alternative WM task for nonalphabetic cultures.
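Split-half reliability, as reported in the abstract above, is commonly estimated by correlating scores on two halves of a test and stepping the correlation up to full test length with the Spearman-Brown formula. The sketch below assumes an odd-even item split and the Spearman-Brown correction; the abstract does not state how the TOENST halves were formed or whether this correction was applied, and the item data are made up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full-length test."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores per person (assumed odd-even split)."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical pass/fail item scores for three people on a four-item test
toy_scores = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
toy_reliability = split_half_reliability(toy_scores)
```

For example, a half-test correlation of .50 steps up to a full-length estimate of about .67, which shows how much the correction matters for short tests.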
The current study developed the 60-item Multidimensional Psychological Flexibility Inventory (MPFI), a scale assessing the 12 dimensions of the Hexaflex model. We created an exhaustive pool of 554 items including 22 of the most widely used measures from the acceptance and commitment therapy and mindfulness literatures. Exploratory and confirmatory factor analyses were used in combination with item response theory and responsiveness to change analyses in 3,040 online respondents across three studies (Study 1: N = 372; Study 2: N = 2,150; Study 3: N = 518) to create the MPFI. Associations between the MPFI subscales and an array of existing measures supported its convergent and discriminant validities. The MPFI offers acceptance and commitment therapy researchers new tools for elaborating treatment effects.
Multivariate psychological processes have recently been studied, visualized, and analyzed as networks. In this network approach, psychological constructs are represented as complex systems of interacting components. In addition to insightful visualization of dynamics, a network perspective leads to a new way of thinking about the nature of psychological phenomena by offering new tools for studying dynamical processes in psychology. In this article, we explain the rationale of the network approach, the associated methods and visualization, and illustrate it using an empirical example focusing on the relation between the daily fluctuations of emotions and neuroticism. The results suggest that individuals with high levels of neuroticism had a denser emotion network compared with their less neurotic peers. This effect is especially pronounced for the negative emotion network, which is in line with previous studies that found a denser network in depressed subjects than in healthy subjects. In sum, we show how the network approach may offer new tools for studying dynamical processes in psychology.
Although theory posits a multidimensional structure of resilience, studies have supported a unidimensional solution for data obtained from the commonly used Connor–Davidson Resilience Scale (CD-RISC). This study investigated the latent structure of CD-RISC responses in a sample of postsecondary students with disabilities. Furthermore, the validity of CD-RISC scores was examined with respect to career optimism and well-being. The analyses were conducted using confirmatory factor analysis and exploratory structural equation modeling (ESEM). Results supported a bifactor-ESEM representation of the CD-RISC data that accounts for construct-relevant multidimensionality in scores due to the presence of general and specific factors and the fallibility of indicators as pure reflections of the constructs they measure. Although three specific factors showed meaningful residual specificity over and above the general factor, two specific factors were weakly defined with little meaningful residual specificity. However, these factors may retain some utility in the bifactor-ESEM model insofar as they control for limited levels of residual covariance in items. Evidence was also obtained for relations of the general and substantively interpretable specific factors with career optimism and well-being. The results of the study provide validation data for the CD-RISC and clarify recent research converging on seemingly disparate unidimensional and multidimensional solutions.
Although the Diagnostic and Statistical Manual of Mental Disorders–Fifth edition requires that attention-deficit/hyperactivity disorder (ADHD) symptoms be apparent across settings and assessed by multiple informants, there remains no standardized approach to integrating multiple sources in adult ADHD diagnosis. The goal of the study was to evaluate informant effects on adult ADHD symptom ratings. Participants were 406 adults, ages 18 to 37, and identified second reporters, recruited from the community, who completed a comprehensive diagnostic and cognitive assessment, including a clinician-administered diagnostic interview and self- and other-report questionnaires of ADHD symptoms. Structural equation modeling indicated good fit for a trifactor model of ADHD, including general ADHD, specific inattention and hyperactivity–impulsivity, and self- and other-perspective factors. Yet there were a number of symptoms on the specific hyperactive–impulsive and self-factors that exhibited nonsignificant loadings. Significant differential item functioning across self-ratings and informant ratings was also noted. The external validation indices of laboratory executive function and diagnostic team-rated impairment were significantly correlated with the specific inattentive factor. While executive function was marginally significantly correlated with the other-perspective factor, impairment was associated with the self-perspective factor. Overall, inattentive symptoms may be more sensitive measures of adult ADHD, and other and self-ratings may provide different information in relation to external criteria.
Brief measures that are comparable across disparate groups are particularly likely to be useful in primary care settings. Prior research has supported a six-item short form of the Whiteley Index (WI), a commonly used measure of health anxiety, among English-speaking respondents. This study examined the measurement invariance of the WI-6 among Black (n = 183), Latino (n = 173), and White (n = 177) respondents seeking treatment at a U.S. community health center. Results supported a bifactor model of the WI-6 among the composite sample (N = 533), suggesting the presence of a general factor and two domain-specific factors. Results supported the incremental validity of one of the domain-specific factors in accounting for unique variance in somatic symptom severity scores beyond the general factor. Multiple-groups confirmatory factor analysis supported the configural, metric, and scalar invariance of the bifactor WI-6 model across the three groups of respondents. Results provide support for the measurement invariance of the WI-6 among Black, Latino, and White respondents. The potential use of the WI-6 in primary care, and broader, settings is discussed.
We introduce a nonverbal "visceral" measure of hunger (i.e., squeezing a handheld dynamometer) and provide the first evidence of verbal overshadowing effects in this visceral domain. We presented 106 participants with popcorn and recorded their hunger levels in one of three conditions: (1) first report hunger using a traditional self-report rating scale (i.e., verbal measure) and then indicate hunger by squeezing a dynamometer (i.e., nonverbal measure), (2) first indicate hunger nonverbally and then indicate hunger verbally, or (3) indicate hunger only nonverbally. As hypothesized, nonverbal measures of hunger predicted subsequent eating behavior when they were uncontaminated by verbal measures, either because they preceded verbal measures of hunger or because they were the sole measure of hunger. Moreover, nonverbal measures of hunger were a better predictor of eating behavior than verbal measures. Implications of the study for communicating embodied experiences in a way that escapes the confines of symbolic representations are discussed.
The sociocultural differences between Western and sub-Saharan African countries make it imperative to standardize neuropsychological tests in the latter. However, Western-normed tests are frequently administered in sub-Saharan Africa because of challenges hampering standardization efforts. Yet a salient topical issue in the cross-cultural neuropsychology literature relates to the utility of Western-normed neuropsychological tests in minority groups, non-Caucasians, and by extension Ghanaians. Consequently, this study investigates the diagnostic accuracy, sensitivity, and specificity of executive function (EF) tests (the Stroop Test, Trail Making Test, and Controlled Oral Word Association Test) and a Revised Quick Cognitive Screening Test (RQCST) in a sample of 50 patients diagnosed with moderate traumatic brain injury and 50 healthy controls in Ghana. The EF test scores showed good diagnostic accuracy, with area under the curve (AUC) values of the Trail Making Test scores ranging from .746 to .902. With respect to the Stroop Test scores, the AUC values ranged from .793 to .898, while the Controlled Oral Word Association Test had an AUC value of .787. The RQCST scores discriminated between the groups, with AUC values ranging from .674 to .912. The AUC values of the composite EF score and a neuropsychological score created from EF and RQCST scores were .936 and .942, respectively. Additionally, the Stroop Test, Trail Making Test, EF composite score, and RQCST scores showed good to excellent sensitivities and specificities. In general, this study has shown that commonly used EF tests in Western countries have diagnostic accuracy, sensitivity, and specificity when administered in Ghanaian samples. The findings and implications of the study are discussed.
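To make the AUC values above concrete, the following is a minimal sketch that computes AUC as the probability that a randomly chosen patient scores higher than a randomly chosen control, with ties counted as one half. It assumes that higher scores indicate greater impairment (as with a completion time); the abstract does not describe how AUC was computed, and the function name and scores are hypothetical.

```python
def auc(patient_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Assumption: higher scores indicate greater impairment.
    """
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0   # patient correctly ranked above control
            elif p == c:
                wins += 0.5   # ties count as half
    return wins / (len(patient_scores) * len(control_scores))

# Hypothetical scores (not data from the study).
patients = [3, 4, 5]
controls = [1, 2, 3]
print(round(auc(patients, controls), 3))
```

An AUC of .5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.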
Recent discussions surrounding the Dark Triad (narcissism, psychopathy, and Machiavellianism) have centered on areas of distinctiveness and overlap. Given that interpersonal dysfunction is a core feature of Dark Triad traits, the current study uses self-report data from 562 undergraduate students to examine the interpersonal characteristics associated with narcissism, psychopathy, and Machiavellianism on four interpersonal circumplex (IPC) surfaces. The distinctiveness of these characteristics was examined using a novel bootstrapping methodology for computing confidence intervals around circumplex structural summary method parameters. Results suggest that Dark Triad traits exhibit distinct structural summary method parameters with narcissism characterized by high dominance, psychopathy characterized by a blend of high dominance and low affiliation, and Machiavellianism characterized by low affiliation on the problems, values, and efficacies IPC surfaces. Additionally, there was some heterogeneity in findings for different measures of psychopathy. Gender differences in structural summary parameters were examined, finding similar parameter values despite mean-level differences in Dark Triad traits. Finally, interpersonal information was integrated across different IPC surfaces to create profiles associated with each Dark Triad trait and to provide a more in-depth portrait of associated interpersonal dynamics.
Depression and suicidal ideation are highly intertwined constructs. A common practice in suicide research is to control for depression when predicting suicidal ideation, yet implications of this practice have not been subjected to sufficient empirical scrutiny. We explore what, precisely, is represented in a suicidal ideation variable with depression covaried out. In an adult psychiatric outpatient sample (N = 354), we computed two variables—depression with suicidal ideation covaried out, and suicidal ideation with depression covaried out—and examined correlations between these residuals, three factors comprising a variegated collection of psychological correlates of suicidal ideation, psychiatric diagnoses, and past suicidal behavior. Findings indicated that suicidal ideation with depression covaried out appears to be characterized by fearlessness about death, self-sacrifice, and externalizing pathology. We propose that suicidal ideation may comprise two distinct components: desire for death (passive ideation and depressive cognitions) and will (self-sacrifice, fearlessness, externalizing behavior). Implications, limitations, and future directions are discussed.
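One way to picture a variable "with depression covaried out" is as the residual from regressing it on depression. The following is a minimal sketch assuming simple ordinary least squares with a single predictor; the abstract does not specify the procedure used, and the function name and scores are hypothetical.

```python
def residualize(y, x):
    """Return the part of y that is not linearly predictable from x.

    Assumption: one-predictor ordinary least squares regression of y on x.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical scores for five people (not data from the study).
depression = [1, 2, 3, 4, 5]
ideation = [2, 1, 4, 3, 6]

# Suicidal ideation with depression covaried out.
ideation_resid = residualize(ideation, depression)
print([round(v, 2) for v in ideation_resid])
```

By construction, the residuals are uncorrelated with depression, so any correlates they retain reflect variance in ideation that depression does not share.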
The Five-Factor Obsessive-Compulsive Inventory (FFOCI) is an assessment of obsessive-compulsive personality disorder (OCPD) that is based on the conceptual framework of the five-factor model (FFM) of personality. The FFOCI has 12 subscales that assess those five-factor model facets relevant to the description of OCPD. Research has suggested that the FFOCI scores relate robustly to existing measures of OCPD and relevant scales from general personality inventories. Nonetheless, the FFOCI’s length—120 items—may limit its clinical utility. This study derived a 48-item FFOCI–Short Form (FFOCI-SF) from the original measure using item response theory methods. The FFOCI-SF scales successfully recreated the nomological network of the original measure and improved discriminant validity relative to the long form. These results support the use of the FFOCI-SF as a briefer measure of the lower-order traits associated with OCPD.
Mobile technologies are increasingly used to measure cognitive function outside of traditional clinic and laboratory settings. Although ambulatory assessments of cognitive function conducted in people’s natural environments offer potential advantages over traditional assessment approaches, the psychometrics of cognitive assessment procedures have been understudied. We evaluated the reliability and construct validity of ambulatory assessments of working memory and perceptual speed administered via smartphones as part of an ecological momentary assessment protocol in a diverse adult sample (N = 219). Results indicated excellent between-person reliability (≥0.97) for average scores, and evidence of reliable within-person variability across measurement occasions (0.41-0.53). The ambulatory tasks also exhibited construct validity, as evidenced by their loadings on working memory and perceptual speed factors defined by the in-lab assessments. Our findings demonstrate that averaging across brief cognitive assessments made in uncontrolled naturalistic settings provides measurements that are comparable in reliability to assessments made in controlled laboratory environments.
The present study examined the construct validity of the Violence Risk Scale–Sexual Offender version (VRS-SO) through an examination of its factor structure and convergence with psychological measures assessing conceptually relevant constructs in a sample of 732 treated incarcerated adult male sex offenders. The VRS-SO was rated prospectively pre- and posttreatment by service providers, and several of the men had completed a psychometric battery at each time point. Prospective Stable 2000 ratings were examined for comparison purposes. Results of exploratory longitudinal factor analysis, performed on VRS-SO pre- and posttreatment dynamic item scores, supported a three-factor model (comparative fit index = .990) and the measurement invariance of the loadings over time. A stringent longitudinal confirmatory factor analysis of the VRS-SO items also supported the three-factor structure. Scores from the three factors (Sexual Deviance, Criminality, and Treatment Responsivity) were correlated in conceptually meaningful ways with scores from the Stable 2000 and selected psychometric measures. The results provide evidence for the construct validity of VRS-SO test scores as providing an index of sex offender risk and, more specifically, that its item content and factor domains measure psychological constructs pertinent to sex offender risk and need.
We applied a new approach to Generalizability theory (G-theory) involving parallel splits and repeated measures to evaluate common uses of the Paulhus Deception Scales based on polytomous and four types of dichotomous scoring. G-theory indices of reliability and validity accounting for specific-factor, transient, and random-response measurement error supported use of polytomous over dichotomous scores as contamination checks; as control, explanatory, and outcome variables; as aspects of construct validation; and as indexes of environmental effects on socially desirable responding. Polytomous scoring also provided results for flagging faking as dependable as those when using dichotomous scoring methods. These findings argue strongly against the nearly exclusive use of dichotomous scoring for the Paulhus Deception Scales in practice and underscore the value of G-theory in demonstrating this. We provide guidelines for applying our G-theory techniques to other objectively scored clinical assessments, for using G-theory to estimate how changes to a measure might improve reliability, and for obtaining software to conduct G-theory analyses free of charge.
Existing measures of the five factor model (FFM) of personality are generally, if not exclusively, unipolar in their assessment of maladaptive variants of the FFM domains. However, two recently developed measures, the Five Factor Form (FFF) and the Sliderbar Inventory (SI), include items that assess for maladaptive variants at both poles of each item. This structure is unique among existing measures of personality and personality disorder, although there is a historical, infrequently used Stone Personality Trait Schema (SPTS) that also included this item structure. To facilitate an exploration of their convergent and discriminant validity, the SI and SPTS items were reorganized into FFM scales. The convergent and discriminant validity of the FFF, SI-FFM, and SPTS-FFM scales was considered in a sample of 450 adults with current or past mental health treatment. The FFF, SI-FFM, and SPTS-FFM were also compared with respect to their relationship with FFM domains. Finally, the FFF items and SI-FFM scales were tested with respect to their relationship with measures of maladaptive variants of both high and low agreeableness and conscientiousness. The implications of the results are discussed with respect to the assessment of maladaptive personality functioning, and suggestions for future research are provided.
This article describes an investigation of whether Thurstonian item response modeling is a viable method for assessment of maladaptive traits. Forced-choice responses from 420 working adults to a broad-range personality inventory assessing six maladaptive traits were considered. The Thurstonian item response model’s fit to the forced-choice data was adequate, while the fit of a counterpart item response model to responses to the same items but arranged in a single-stimulus design was poor. Monotrait heteromethod correlations indicated corresponding traits in the two formats overlapped substantially, although they did not measure equivalent constructs. A better goodness of fit and higher factor loadings for the Thurstonian item response model, coupled with a clearer conceptual alignment to the theoretical trait definitions, suggested that the single-stimulus item responses were influenced by biases that the independent clusters measurement model did not account for. Researchers may wish to consider forced-choice designs and appropriate item response modeling techniques such as Thurstonian item response modeling for personality questionnaire applications in industrial psychology, especially when assessing maladaptive traits. We recommend further investigation of this approach in actual selection situations and with different assessment instruments.
Psychopathy refers to a range of complex behaviors and personality traits, including callousness and antisocial behavior, typically studied in criminal populations. Recent studies have used self-reports to examine psychopathic traits among noncriminal samples. The goal of the current study was to examine the underlying factor structure of the Self-Report of Psychopathy Scale–Short Form (SRP-SF) across complementary samples and examine the impact of gender on factor structure. We examined the structure of the SRP-SF among 2,554 young adults from three undergraduate samples and a high-risk young adult sample. Using confirmatory factor analysis, a four-correlated factor model and a four-bifactor model showed good fit to the data. Evidence of weak invariance was found for both models across gender. These findings highlight that the SRP-SF is a useful measure of low-level psychopathic traits in noncriminal samples, although the underlying factor structure may not fully translate across men and women.
Risky behaviors increase the likelihood of premature death, long-term disability, and poor mental health outcomes. Most current measures of risky behavior only assess behaviors within a single domain, fail to evaluate affective triggers for engaging in these behaviors, do not index the consequences of these behaviors, and are often limited to a narrow developmental period. The present study developed and evaluated a new 38-item questionnaire-based measure, the Risky, Impulsive, and Self-Destructive Behavior Questionnaire (RISQ), designed to address each of these limitations by expanding the breadth and depth of previous questionnaires. A bifactor model with a general factor and eight domain-specific factors (measuring drug use, aggression, self-harm, gambling, risky sexual behavior, impulsive eating, heavy alcohol use, and reckless behavior) best fit the RISQ, and indicators of internal consistency as well as construct validity were strong. Results provide initial validation for the RISQ as a broad, yet relatively brief, measure that quantifies and qualifies risky behaviors by assessing the severity, chronicity, and triggers for a range of harmful behaviors.
The Psychopathic Personality Inventory–Revised (PPI-R) includes validity scales that assess Deviant Responding (DR), Virtuous Responding, and Inconsistent Responding. We examined the utility of these scales for identifying careless responding using data from two online studies that examined correlates of psychopathy in college students (Sample 1: N = 583; Sample 2: N = 454). Compared with those below the cut scores, those above the cut on the DR scale yielded consistently lower validity coefficients when PPI-R scores were correlated with corresponding scales from the Triarchic Psychopathy Measure. The other three PPI-R validity scales yielded weaker and less consistent results. Participants who completed the studies in an inordinately brief amount of time scored significantly higher on the DR and Virtuous Responding scales than other participants. Based on the findings from the current studies, researchers collecting PPI-R data online should consider identifying and perhaps screening out respondents with elevated scores on the DR scale.
The Temperament in Middle Childhood Questionnaire (TMCQ) is a widely used parent-report measure of temperament. However, neither its lower nor higher order structures has been tested via a bottom-up, empirically based approach. We conducted higher and lower order exploratory factor analyses (EFAs) of the TMCQ in a large (N = 654) sample of 9-year-olds. Item-level EFAs identified 92 items as suitable (i.e., with loadings ≥.40) for constructing lower order factors, only half of which resembled a TMCQ scale posited by the measure’s authors. Higher order EFAs of the lower order factors showed that a three-factor structure (Impulsivity/Negative Affectivity, Negative Affectivity, and Openness/Assertiveness) was the only admissible solution. Overall, many TMCQ items did not load well onto a lower order factor. In addition, only three factors, which did not show a clear resemblance to Rothbart’s four-factor model of temperament in middle childhood, were needed to account for the higher order structure of the TMCQ.
Three hundred sixty-two adult patients were administered the Diagnostic Interview for Anxiety, Mood, and OCD and Related Neuropsychiatric Disorders (DIAMOND). Of these, 121 provided interrater reliability data, and 115 provided test–retest reliability data. Participants also completed a battery of self-report measures that assess symptoms of anxiety, mood, and obsessive-compulsive and related disorders. Interrater reliability of DIAMOND anxiety, mood, and obsessive-compulsive and related diagnoses ranged from very good to excellent. Test–retest reliability of DIAMOND diagnoses ranged from good to excellent. Convergent validity was established by significant between-group comparisons on applicable self-report measures for nearly all diagnoses. The results of the present study indicate that the DIAMOND is a promising semistructured diagnostic interview for DSM-5 disorders.
Recent developments in personality research led to the proposition of two alternative six-factor trait models, the HEXACO model and the Big Six model. However, given the lack of direct comparisons, it is unclear whether the HEXACO and Big Six factors are distinct or essentially equivalent, that is, whether corresponding inventories measure similar or distinct personality traits. Using Structural Equation Modeling (Study 1), we found substantial differences between the traits as measured via the HEXACO-60 and the 30-item Questionnaire Big Six (30QB6), particularly for Honesty-Humility and Honesty-Propriety (both models’ critical difference to the Big Five approach). This distinction was further supported by Study 2, showing differential capabilities of the HEXACO-60 and the 30QB6 to account for several criteria representing the theoretical core of Honesty-Humility and/or Honesty-Propriety. Specifically, unlike the indicator of Honesty-Humility, the indicator of Honesty-Propriety showed low predictive power for some conceptually relevant criteria, suggesting a limited validity of the 30QB6.
Despite the forensic relevance of psychopathy and the overrepresentation of Hispanics in the United States’ criminal justice system, these two issues remain underexplored, particularly with self-report measures of psychopathy. We investigated the criterion validity of three psychopathy measures among African Americans, Caucasians, and Hispanics in a sample of 1,742 offenders. More similarity than dissimilarity emerged across groups. The factor structures of psychopathy measures among Hispanic offenders were consistent with previous findings. Few significant differences emerged between Hispanic and Caucasian offenders, with most differences emerging between African Americans and the other ethnic groups. In such instances, the correlates of psychopathy were typically weaker for African Americans. The Psychopathy Checklist–Revised yielded fewer psychopathy × ethnicity interactions than the Psychopathic Personality Inventory and Levenson Primary and Secondary Psychopathy Scales. Overall, these psychopathy measures showed reasonable validity across these cultural groups.
This study investigates the factorial structure and validity of the Quantified Behavior Test Plus (Qb+©), a computerized test designed to evaluate objectively and independently the three core symptoms of attention deficit/hyperactivity disorder: hyperactivity, inattention, and impulsivity. Confirmatory and exploratory factor analyses were conducted with an outpatient sample of 773 subjects ≥12 years old. In a second sample of 297 patients ≥16 years, a multitrait–multimethod analysis was performed to examine concurrent and discriminant validity. The discriminative power of the Qb+ was investigated using a general linear model and logistic regression analysis. The three-factor structure (Hyperactivity, Inattention, Impulsivity) was verified in the confirmatory factor analysis. Fit indices demonstrated a good model fit and factor loadings were almost all moderate to high. In the multitrait–multimethod analysis, the criterion for convergent validity was fulfilled. The discriminant validity of the Qb+ was partially supported. Significant but small gender and age effects were found. In the logistic regression analysis, omission errors and reaction time variability, belonging to the Inattention factor, were able to discriminate between subjects with and without attention deficit/hyperactivity disorder. The internal structure of the Qb+ was verified. Its validity was partially supported. Results regarding discriminative power were mixed.
Clinicians have long recognized the importance of tailoring psychotherapy interventions to the needs and characteristics of the individual patient. However, traditional approaches to clinical assessment, service delivery, and intervention research have not been conducive to such personalization. Contrary to traditional nomothetic approaches, idiographic assessment and modeling of intraindividual dynamic processes holds tremendous promise for tailoring the implementation of psychotherapy to the individual patient. In this article, we (a) present an argument for assessing person-specific dynamics, (b) provide a detailed description of a method that harnesses person-specific dynamic assessment and modeling for use in routine psychotherapy, (c) present exemplar clinical cases illustrating these methods, and (d) discuss how these methods can be translated into routine clinical assessment and psychotherapy.
Objective: The objective of this study was to create the Korean version of the Modified Practice Attitudes Scale (K-MPAS) to measure clinicians’ attitudes toward evidence-based treatments (EBTs) in the Korean mental health system. Method: Using 189 U.S. therapists and 283 members from the Korean mental health system, we examined the reliability and validity of the MPAS scores. We also conducted the first exploratory and confirmatory factor analysis on the MPAS and compared EBT attitudes across U.S. and Korean therapists. Results: Results revealed that the inclusion of both "reversed-worded" and "non–reversed-worded" items introduced significant method effects that compromised the integrity of the one-factor MPAS model. Problems with the one-factor structure were resolved by eliminating the "non–reversed-worded" items. Reliability and validity were adequate among both Korean and U.S. therapists. Korean therapists also reported significantly more negative attitudes toward EBTs on the MPAS than U.S. therapists. Conclusions: The K-MPAS is the first questionnaire designed to measure Korean service providers’ attitudes toward EBTs to help advance the dissemination of EBTs in Korea. The current study also demonstrated the negative impacts that can be introduced by incorporating oppositely worded items into a scale, particularly with respect to factor structure and detecting significant group differences.
The Levenson Self-Report Psychopathy (LSRP) scale is an efficient measure of psychopathy with promising psychometric properties. However, the cross-cultural utility of the LSRP has not been well documented, and no study has explored measurement invariance of the LSRP across East Asian and North American samples. We translated the LSRP into Chinese (Study 1) and investigated the validity and reliability of the Chinese LSRP using a sample of 226 university students in China (Study 2). Confirmatory factor analyses supported Brinkley, Diamond, Magaletta, and Heigel’s (2008) three-factor model (Egocentricity, Callousness, and Antisocial). Evidence for configural and partial metric (but not scalar) invariance of the factor structure was observed when comparing Chinese and U.S. university samples. However, response thresholds were significantly different between the two samples. The Chinese LSRP scores also demonstrated encouraging convergent and discriminant validity in terms of their associations with external criteria. We discuss the implications for cross-cultural assessment of psychopathy.
The present study is the first to investigate the Personality Assessment Screener, a brief self-report measure of risk for emotional and behavioral dysfunction, in relation to the informant report version of this instrument, the Personality Assessment Screener–Other. Among a sample of undergraduate roommate dyads (N = 174), self-report and informant report total scores on the Personality Assessment Screener/Personality Assessment Screener–Other moderately converged (r = 0.45), with generally greater agreement between perspectives observed for externalizing behaviors compared with internalizing distress. In addition, selves tended to report more psychological difficulties relative to informant ratings (d = 0.45) with an average absolute discrepancy between sources of 6.31 (SD = 4.96) out of a possible range of 66. Discrepancies between self-report and informant report were significantly associated with characteristics of the dyadic relationship (e.g., length of acquaintanceship) as well as the severity of self-reported psychological difficulties and positive impression management.
Determining whether Need for Cognition (NC) has the same meaning across age may help understand why there are dramatically different age trends for cognitive abilities and for NC in adulthood. Data from 5,004 participants aged between 18 and 99 years were used to examine both internal relations and external relations of NC. Internal relations were investigated with measures of reliability, examination of factor invariance, and test–retest coefficients across three age groups. External relations were investigated by examining relations of NC with cognitive abilities, engagement, personality, self-rated cognition, and affect. Results suggest that NC may be a broad construct that could reflect motivation to seek out intellectual challenge. In addition, examination of both internal and external relations of NC indicated that the meaning of the construct may be the same across the life span. Finally, the current article showed that the strongest predictor of NC was Openness to Experience, at any age.
Forensic assessments must always consider whether examinees are putting forth genuine effort or seeking to feign legally relevant incapacities. Miranda abilities are no exception when a putatively invalid Miranda waiver might result in the full suppression of an outright confession. Using a within-subjects simulation design, jail detainees were administered a representative Miranda warning and two Standardized Assessment of Miranda Abilities (SAMA) measures: Miranda Vocabulary Scale and Miranda Quiz. As expected, detainees had no difficulty in feigning severe deficits in their recall of the Miranda warning and portraying markedly impaired abilities on both SAMA measures. However, using floor-effect detection strategies, several feigning indicators proved effective at identifying likely feigned Miranda abilities. As an ancillary issue, the Inventory of Legal Knowledge was found to be very effective using both the traditional and revised scoring.
There is a need for brief, accurate screening scales for social anxiety disorder to enable better identification of the disorder in research and clinical settings. A five-item social anxiety screener, the Social Phobia Screener (SOPHS), was developed to address this need. The screener was validated in two samples: (a) 12,292 Australian young adults screened for a clinical trial, including 1,687 participants who completed a phone-based clinical interview and (b) 4,214 population-based Australian adults recruited online. The SOPHS (78% sensitivity, 72% specificity) was found to have comparable screening performance to the Social Phobia Inventory (77% sensitivity, 71% specificity) and Mini-Social Phobia Inventory (74% sensitivity, 73% specificity) relative to clinical criteria in the trial sample. In the population-based sample, the SOPHS was also accurate (95% sensitivity, 73% specificity) in identifying Diagnostic and Statistical Manual of Mental Disorders–Fifth edition social anxiety disorder. The SOPHS is a valid and reliable screener for social anxiety that is freely available for use in research and clinical settings.
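The sensitivity and specificity figures quoted above follow from a simple cross-tabulation of screener results against the clinical criterion. A minimal sketch is given below; the function name and the counts are hypothetical and chosen only for illustration, they are not data from the study.

```python
def screening_accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: cases correctly flagged by the screener
    fn: cases the screener missed
    tn: non-cases correctly cleared
    fp: non-cases incorrectly flagged
    """
    sensitivity = tp / (tp + fn)  # proportion of true cases that screen positive
    specificity = tn / (tn + fp)  # proportion of non-cases that screen negative
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the study).
sens, spec = screening_accuracy(tp=78, fn=22, tn=72, fp=28)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Both indices depend on the cutoff applied to the screener score, so in practice they would be recomputed for each candidate cutoff.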
The Clock Drawing Test (CDT) is a commonly used tool in clinical practice and research for cognitive screening among older adults. The main goal of the present study was to analyze the interrater reliability of three different CDT scoring systems (by Shulman et al., Babins et al., and Cohen et al.). We used a clock with a predrawn circle. The CDT was evaluated by three independent raters based on the normative data set of healthy older and very old adults and patients with nonamnestic mild cognitive impairment (naMCI; N = 438; aged 61-94). We confirmed a high interrater reliability measured by the intraclass correlation coefficients (ICCs): Shulman ICC = .809, Babins ICC = .894, and Cohen ICC = .862, all p < .001. We found that age and education levels have a significant effect on CDT performance, yet there was no influence of gender. Finally, the scoring systems differentiated between naMCI and age- and education-matched controls: Shulman’s area under the receiver operating characteristic curve (AUC) = .84, Cohen AUC = .71, all p < .001; and a slightly lower discriminative ability was shown by Babins: AUC = .65, p = .012.
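To make the interrater index concrete, the sketch below computes one common form of the intraclass correlation, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a subjects-by-raters table. The abstract does not state which ICC form was used, so this choice is an assumption, and the ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows (one row per subject, one column per rater)."""
    n = len(ratings)         # number of subjects
    k = len(ratings[0])      # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical clock-drawing scores from three raters for five drawings.
scores = [
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 1],
    [4, 4, 5],
    [2, 2, 2],
]
print(round(icc_2_1(scores), 3))
```

Other ICC forms (for example, consistency rather than absolute agreement, or average-rater versions) use the same mean squares with a different denominator.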
Information processing is typically evaluated using simple reaction time (SRT) and choice reaction time (CRT) paradigms in which a specific response is initiated following a given stimulus. The measurement of reaction time (RT) has evolved from monitoring the timing of mechanical switches to computerized paradigms. The proliferation of mobile devices with touch screens makes them a natural next technological approach to assess information processing. The aims of this study were to determine the validity and reliability of using a mobile device (Apple iPad or iTouch) to accurately measure RT. Sixty healthy young adults completed SRT and CRT tasks using a traditional test platform and mobile platforms on two occasions. The SRT was similar across test modality: 300, 287, and 280 milliseconds (ms) for the traditional, iPad, and iTouch, respectively. The CRT was similar within mobile devices, though slightly faster on the traditional: 359, 408, and 384 ms for traditional, iPad, and iTouch, respectively. Intraclass correlation coefficients ranged from 0.79 to 0.85 for SRT and from 0.75 to 0.83 for CRT. The similarity and reliability of SRT across platforms and consistency of SRT and CRT across test conditions indicate that mobile devices provide the next generation of assessment platforms for information processing.
Observational measurement plays an integral role in a variety of scientific endeavors within biology, psychology, sociology, education, medicine, and marketing. The current article provides an interdisciplinary primer on observational measurement; in particular, it highlights recent advances in observational methodology and the challenges that accompany such growth. First, we detail the various types of instruments that can be used to standardize measurements across observers. Second, we argue for the importance of validity in observational measurement and provide several approaches to validation based on contemporary validity theory. Third, we outline the challenges currently faced by observational researchers pertaining to measurement drift, observer reactivity, reliability analysis, and time/expense. Fourth, we describe recent advances in computer-assisted measurement, fully automated measurement, and statistical data analysis. Finally, we identify several key directions for future observational research to explore.
Time series analysis is a technique that can be used to analyze the data from a single subject and has great potential to investigate clinically relevant processes like affect regulation. This article uses time series models to investigate the assumed dysregulation of affect that is associated with bipolar disorder. By formulating a number of alternative models that capture different kinds of theoretically predicted dysregulation, and by comparing these in both bipolar patients and controls, we aim to illustrate the heuristic potential this method of analysis has for clinical psychology. We argue that, not only can time series analysis elucidate specific maladaptive dynamics associated with psychopathology, it may also be clinically applied in symptom monitoring and the evaluation of therapeutic interventions.
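As an illustration of what a single-subject time series model can look like, the sketch below fits a first-order autoregressive model, AR(1), by least squares. The abstract does not specify which models were compared, so AR(1) is an assumption made here for illustration, and the simulated mood series is hypothetical. The autoregressive coefficient describes how strongly the current affect rating carries over to the next one.

```python
import random


def fit_ar1(series):
    """Least-squares fit of y[t] = intercept + phi * y[t-1] + error."""
    x = series[:-1]   # lagged values y[t-1]
    z = series[1:]    # current values y[t]
    mx = sum(x) / len(x)
    mz = sum(z) / len(z)
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    phi = sxz / sxx               # autoregressive coefficient (carry-over)
    intercept = mz - phi * mx
    return intercept, phi


# Simulated daily mood ratings for one hypothetical person.
rng = random.Random(42)
mood = [5.0]
for _ in range(199):
    mood.append(2.0 + 0.6 * mood[-1] + rng.gauss(0, 1))

intercept, phi = fit_ar1(mood)
print(f"intercept={intercept:.2f}, phi={phi:.2f}")
```

Competing accounts of dysregulation could then be expressed as alternative models (for example, different coefficients or error variances per person or per group) and compared on fit.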
Emotional reactions are a vital part of the therapeutic relationship. The Feeling Word Checklist–24 (FWC-24) is an instrument asking the clinician (or the patient) to report to what degree he or she has experienced various feelings during a therapeutic interaction. The aim of this study was to assess the factor structure of the clinician-rated FWC-24 when taking dependencies in the data into account. The sample was deliberately heterogeneous and consisted of 4,443 ratings made by 101 psychotherapists working with different psychotherapy methods in relation to 191 patients of different ages, genders, and with different primary diagnoses. A random intercept-only model revealed large intraclass correlation coefficients at the therapist level, indicating that a multilevel analysis was warranted. A two-level exploratory factor analysis with therapists as the between level and patients plus sessions as the within level was conducted. The items from FWC-24 were found to be best represented by four factors on the between level and four factors on the within level. The factor structures were largely similar on the two levels and were labeled Engaged, Inadequate, Relaxed, and Moved. The different factors explained different amounts of variance on different levels, indicating that some factors are more therapist dependent and some more patient dependent.
The current study examines a bifactor model for the Youth Psychopathic Traits Inventory (YPI) in a Dutch community sample of adolescents (N = 2,874). The primary goal was to examine the latent structure of the YPI with a bifactor modeling approach. Furthermore, the study examines the dimensionality and measurement invariance of the YPI. Results show that a bifactor model at subscale level fits the YPI best. The general psychopathy factor influences the 10 subscales of the YPI strongly, indicating that the YPI seems to be unidimensional rather than multidimensional. Nevertheless, the dimensions still explain nearly one third of the variance found. Findings imply that the bifactor model of the YPI should be used when examining relations with outcome variables, with a focus on the total score of the YPI, while factor scores should be reported with caution. Furthermore, the bifactor model appears invariant for gender, age, and ethnic background.
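One common way to quantify how much of the common variance belongs to the general factor versus the specific dimensions in a bifactor solution is the explained common variance (ECV). The abstract does not name the index it used, so ECV is an assumption here, and the loadings below are hypothetical.

```python
def explained_common_variance(general_loadings, specific_loadings):
    """Share of common variance attributable to the general factor.

    general_loadings: standardized loading of each indicator on the general factor
    specific_loadings: standardized loading of each indicator on its own specific factor
    """
    general = sum(l ** 2 for l in general_loadings)
    specific = sum(l ** 2 for l in specific_loadings)
    return general / (general + specific)


# Hypothetical loadings for 10 subscales (illustration only).
general = [0.6] * 10
specific = [0.3] * 10
ecv = explained_common_variance(general, specific)
print(f"ECV = {ecv:.2f}; specific factors = {1 - ecv:.2f}")
```

A high ECV supports reporting a total score, while the remainder indicates how much common variance the specific dimensions retain.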
In recent years, significant technological advances have changed our understanding of dynamic processes in clinical psychology. A particularly important agent of change has been ambulatory assessment (AA). AA is the assessment of individuals in their daily lives, combining the twin benefits of increased ecological validity and minimized retrospective biases. These benefits make AA particularly well-suited to the assessment of dynamic processes, and recent advancements in technology are providing exciting new opportunities to understand these processes in new ways. In the current article, we briefly detail the capabilities currently offered by smartphones and mobile physiological devices, as well as some of the practical and ethical challenges of incorporating these new technologies into AA research. We then provide several examples of recent innovative applications of AA methodology in clinical research, assessment, and intervention and provide a case example of AA data generated from a study utilizing multiple mobile devices. In this way, we aim to provide a sense of direction for researchers planning AA studies of their own.
Psychological inflexibility (PI) refers to the overarching and nonadaptive avoidance of thoughts and feelings. PI is a transdiagnostic process that is present in numerous psychopathologies, such as anxiety and mood disorders, addictive behaviors, and chronic pain, as presented by American adults and adolescents. Despite the high rates of depression and depressed mood among Spanish and Latino adolescents and the observed relation between PI and adjustment problems at this age, an instrument assessing PI in Spanish-speaking adolescents is lacking. In this study, we assessed the psychometric properties of a Spanish adaptation of the Avoidance and Fusion Questionnaire for Youth with 483 students from Spain (mean age 13.89 years). The Spanish Avoidance and Fusion Questionnaire for Youth proved to be a two-factor psychometrically sound instrument. Total PI scores correlated positively with depression and negatively with satisfaction with life. The predictive validity results showed cognitive fusion and experiential avoidance to be two interrelated but distinct processes that characterize PI.
Detecting psychological distress among international students can be challenging given diverse languages, cultural backgrounds, and lack of refined measurement properties of measures tailored to international students. Despite the challenges, ensuring that a psychological distress measure works effectively has considerable potential value for assessment purposes. The current study evaluates the measurement properties of a short 10-item version of Radloff’s Center for Epidemiologic Studies Depression Scale (CES-D). Grounded in long-standing evidence on gender differences in depressive symptoms, specific attention was given to examining measurement invariance of the CES-D Short-form across women and men. Based on a large, two-cohort sample of international students (N = 468), and through multiple analyses evaluating factor structure and measurement invariance, we derived an even briefer, seven-item single-factor form of the CES-D (CES-D Short-form International) that can be used with international students.
The present study examined the convergent and predictive validity of the Jesness Inventories (JI) in a sample of 138 juvenile offenders, completed in the course of routine service delivery. JI profiles were compared with ratings on three standardized forensic clinical scales: the Youth Level of Service/Case Management Inventory, Psychopathy Checklist: Youth Version, and Violence Risk Scale–Youth Version. The JI Asocial Index and the Undersocialized Active and Group-Oriented Conformist Interpersonal Maturity Level (I-level) subtypes demonstrated the strongest pattern of convergence and most consistently predicted recidivism. The Asocial Index did not incrementally predict recidivism after controlling for scores on the standardized forensic clinical scales; however, meaningful differences among broad I-Level groups (I-3 and I-4) remained after controlling for risk. Risk-need-responsivity applications of the JI (i.e., in terms of treatment dosage, identifying treatment targets, and adaptation of services) are discussed within the context of a comprehensive forensic assessment framework to inform case formulation, service delivery, and decision making with justice involved youth.
The Pathological Narcissism Inventory (PNI) is a multidimensional measure for assessing grandiose and vulnerable features in narcissistic pathology. The aim of the present research was to construct and validate a German translation of the PNI and to provide further information on the PNI’s nomological net. Findings from a first study confirm the psychometric soundness of the PNI and replicate its seven-factor first-order structure. A second-order structure was also supported but with several equivalent models. A second study investigating associations with a broad range of measures (DSM Axis I and II constructs, emotions, personality traits, interpersonal and dysfunctional behaviors, and well-being) supported the concurrent validity of the PNI. Discriminant validity with the Narcissistic Personality Inventory was also shown. Finally, in a third study an extension in a clinical inpatient sample provided further evidence that the PNI is a useful tool to assess the more pathological end of narcissism.
We examined the utility of the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF) underreporting Validity Scales in a simulation design with a sample of 257 undergraduate college students. Extending past research by Sellbom and Bagby, we added a manipulation check to determine whether individuals complied with instructions to underreport and examined the impact of underreporting on all of the MMPI-2-RF substantive scales. Results indicated that individuals who complied with instructions to underreport produced statistically significantly and meaningfully higher scores on the MMPI-2-RF underreporting Validity Scales (Uncommon Virtues [L-r] and Adjustment Validity [K-r]) when compared with those who received standard instructions and with individuals who did not comply with instructions to underreport. Moreover, in comparisons with both groups, participants who complied with instructions to underreport had lower scores on the majority of the substantive scales. L-r and K-r added incremental predictive utility (in reference to one another) in differentiating individuals who underreported from individuals who were given standard instructions.
To help facilitate the dissemination and implementation of evidence-based assessment practices, we examined the psychometric properties of the shortened 25-item version of the Revised Child Anxiety and Depression Scale–parent report (RCADS-25-P), which was based on the same items as the previously published shortened 25-item child version. We used two independent samples of youth, a school sample (N = 967, Grades 3-12) and a clinical sample (N = 433; 6-18 years), to examine the factor structure, reliability, and validity of the RCADS-25-P scale scores. Results revealed that the two-factor structure (i.e., depression and broad anxiety factors) fit the data well in both the school and clinical samples. All reliability estimates, including test–retest indices, exceeded the benchmark for good reliability. In the school sample, the RCADS-25-P scale scores converged significantly with related criterion measures and diverged from nonrelated criterion measures. In the clinical sample, the RCADS-25-P scale scores successfully discriminated between those with and without target problem diagnoses. In both samples, child–parent agreement indices were in the expected ranges. Normative data are also reported. The RCADS-25-P thus demonstrated robust psychometric properties across both a school and a clinical sample, supporting its use as an effective brief screening instrument for depression and anxiety in children and adolescents.
Previous research suggests that the Eating Pattern Inventory for Children (EPI-C) is best conceptualized as comprising four factors: dietary restraint, emotional eating, external eating, and parental pressure to eat. This study aims to examine the psychometric properties of the EPI-C and to test gender and weight group differences. The population-based study sample comprised 1,939 children aged 11 to 12 years from the Copenhagen Child Cohort (CCC2000). Psychometric properties were evaluated using multigroup categorical data in confirmatory factor analysis (CFA) and differential item functioning (DIF) tests. CFA supported the four-factor solution for the EPI-C. Reliability estimates were satisfactory for three of the four scales. DIF with regard to weight was found for an item on weight loss intention. Girls reported higher restrained and emotional eating; overweight children reported higher restrained, emotional, and external eating, while underweight children reported higher parental pressure to eat. The results support the use of the EPI-C for measuring eating behaviors in preadolescence.
The primary goal of this study was to explicate the construct validity of the Narcissistic Personality Inventory (NPI) and the Hypomanic Personality Scale (HPS) by examining their relations both to each other and to measures of personality and psychopathology in a community sample (N = 255). Structural evidence indicates that the NPI is defined by Leadership/Authority, Grandiose Exhibitionism, and Entitlement/Exploitativeness factors, whereas the HPS is characterized by specific dimensions reflecting Social Vitality, Mood Volatility, and Excitement. Our results establish that (a) factor-based subscales from these instruments display divergent patterns of relations that are obscured when relying exclusively on total scores and (b) some NPI and HPS subscales more clearly tap content specifically relevant to narcissism and mania, respectively, than others. In particular, our findings challenge the construct validity of the NPI Leadership/Authority and HPS Social Vitality subscales, which appear to assess overlapping assertiveness content that is largely adaptive in nature.
This article introduces a new measure of resilience and five related protective factors. The Five-by-Five Resilience Scale (5x5RS) is developed on the basis of theoretical and empirical considerations. Two samples (N = 475 and N = 613) are used to assess the factor structure, reliability, convergent validity, and criterion-related validity of the 5x5RS. Confirmatory factor analysis supports a bifactor model. The 5x5RS demonstrates adequate internal consistency as evidenced by Cronbach’s alpha and empirical reliability estimates. The 5x5RS correlates positively with the Connor–Davidson Resilience Scale (CD-RISC), a commonly used measure of resilience. The 5x5RS exhibits similar criterion-related validity to the CD-RISC as evidenced by positive correlations with satisfaction with life, meaning in life, and secure attachment style as well as negative correlations with rumination and anxious or avoidant attachment styles. 5x5RS scores are positively correlated with healthy behaviors such as exercise and negatively correlated with sleep difficulty and symptomology of anxiety and depression. The 5x5RS incrementally explains variance in some criteria above and beyond the CD-RISC. Item responses are modeled using the graded response model. Information estimates demonstrate the ability of the 5x5RS to assess individuals within at least one standard deviation of the mean on relevant latent traits.
A state of loneliness describes an individual’s perception of having dissatisfying social connections to others. Though it is notable across the life span, it may have particularly deleterious effects in childhood and adolescence, leading to increased risk of emotional impairment. The current study evaluates a widely used test of loneliness, the Loneliness Questionnaire, for measurement invariance across ethnic groups in a large, representative sample of youth in the 2nd to 12th grades (N = 12,344; 41% African American) in Mississippi. Analyses were conducted using multigroup confirmatory factor analysis following a published, sequential method to examine invariance in form, factor loadings, and item intercepts. Overall, our results indicated that the instrument was invariant across ethnicities, suggesting that youth with equivalent manifest scores can be discerned as having comparable levels of latent loneliness. The loneliness scores also corresponded significantly with depression and anxiety scores for most subsamples, with one exception. These findings are discussed in the context of previous results comparing levels of loneliness across ethnicities. Additionally, the broader context of the need to expand invariance studies in instrumentation work is highlighted.
Unproctored, web-based assessments supposedly reduce social desirability distortions in self-report questionnaires because of an increased sense of privacy among participants. Three random-effects meta-analyses focusing either on social desirability (k = 30, total N = 3,746), the Big Five of personality (k = 66, total N = 2,951), or psychopathology (k = 96, total N = 16,034) compared social desirability distortions of self-reports across computerized and paper-and-pencil administration modes. Overall, a near-zero effect of 0.01 was obtained that did not indicate less socially desirable responding in computerized assessments. Moreover, moderator analyses did not identify differential effects for proctored and unproctored procedures. Thus, paper-and-pencil and computerized administrations of self-report scales yield comparable mean scores. Unproctored web-based surveys do not offer an advantage with regard to socially desirable responding in self-report questionnaires.
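For readers unfamiliar with random-effects pooling, the sketch below implements the DerSimonian-Laird estimator, one widely used approach. The abstract does not say which estimator was applied, so this choice is an assumption, and the effect sizes and sampling variances are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2.

    effects: per-study effect sizes
    variances: per-study sampling variances
    Returns (pooled_effect, standard_error, tau_squared).
    """
    w = [1 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                       # between-study variance
    w_star = [1 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1 / sum(w_star)) ** 0.5
    return pooled, se, tau2


# Hypothetical standardized mean differences between administration modes.
effects = [0.05, -0.10, 0.12, 0.00, -0.03]
variances = [0.010, 0.020, 0.015, 0.008, 0.012]
pooled, se, tau2 = dersimonian_laird(effects, variances)
print(f"pooled={pooled:.3f}, se={se:.3f}, tau2={tau2:.4f}")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and both models give the same pooled value.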
This study explored whether the Kaufman Assessment Battery for Children–Second Edition (KABC-II) predicted academic achievement outcomes of the Kaufman Test of Educational Achievement–Second Edition (KTEA-II) equally well across a representative sample of African American, Hispanic, and Caucasian school-aged children (N = 2,001) in three grade groups (1-4, 5-8, 9-12). It was of interest to study possible prediction bias in the slope and intercept of the five underlying Cattell–Horn–Carroll (CHC) cognitive factors of the KABC-II—Sequential/Gsm (Short-Term Memory), Learning/Glr (Long-Term Storage and Retrieval), Simultaneous/Gv (Visual Processing), Planning/Gf (Fluid Reasoning), and Knowledge/Gc (Crystallized Ability)—in estimating reading, writing, and math. Structural equation modeling techniques demonstrated a lack of bias in the slopes; however, four of the five CHC indexes showed a persistent overprediction of the minority groups’ achievement in the intercept. The overprediction is likely attributable to institutional or societal contributions, which limit the students’ ability to achieve to their fullest potential.
The aim was to further test the reliability and validity of a newly developed instrument designed to assess psychopathic personality traits in children, the Child Problematic Traits Inventory (CPTI). Data from the Preschool Twin Study in Sweden were used, a national general population study of 5-year-old twins (n = 1,188, 50.3% girls). Both preschool teachers and parents were used as informants. Confirmatory factor analysis replicated the intended three-factorial structure of the 28 items of the CPTI. Overall, our findings demonstrated good internal consistency and convergent validity, with all the teacher-rated CPTI scores being associated with teacher and parent ratings of externalizing psychopathology, aggressive behavior, fearlessness, and prosocial peer involvement. In conclusion, the CPTI holds promise as a teacher-rated tool for assessing psychopathic traits in childhood, though more research is needed to see if these findings can be generalized to other countries, settings, and older children.
To examine the hypothesized influence of method variance from negatively keyed items in the measurement of callous–unemotional (CU) traits, nine a priori confirmatory factor analysis model comparisons of the Inventory of Callous–Unemotional Traits were evaluated on multiple fit indices and theoretical coherence. Tested models included a unidimensional model, a three-factor model, a three-bifactor model, an item response theory–shortened model, two item-parceled models, and three correlated trait–correlated method minus one models (unidimensional, correlated three-factor, and bifactor). Data were self-reports of 234 adolescents (191 juvenile offenders, 43 high school students; 63% male; ages 11-17 years). Consistent with hypotheses, models accounting for method variance substantially improved fit to the data. Additionally, bifactor models with a general CU factor better fit the data compared with correlated factor models, suggesting a general CU factor is important to understanding the construct of CU traits. Future Inventory of Callous–Unemotional Traits analyses should account for method variance from item keying and response bias to isolate trait variance.
To extend the evidence on the reliability and construct validity of the Five-Factor Model Rating Form (FFMRF) in its self-report version, two independent samples of Italian participants, which were composed of 510 adolescent high school students and 457 community-dwelling adults, respectively, were administered the FFMRF in its Italian translation. Adolescent participants were also administered the Italian translation of the Borderline Personality Features Scale for Children–11 (BPFSC-11), whereas adult participants were administered the Italian translation of the Triarchic Psychopathy Measure (TriPM). Cronbach α values were consistent with previous findings; in both samples, average interitem r values indicated acceptable internal consistency for all FFMRF scales. A multidimensional graded item response theory model indicated that the majority of FFMRF items had adequate discrimination parameters; information indices supported the reliability of the FFMRF scales. Both categorical (i.e., item-level) and scale-level regression analyses suggested that the FFMRF scores may predict a nonnegligible amount of variance in the BPFSC-11 total score in adolescent participants, and in the TriPM scale scores in adult participants.
Psychopathy as conceptualized by the triarchic model encompasses three distinct dispositional constructs: boldness, meanness, and disinhibition. The current study sought to further validate triarchic (Tri) construct scales composed of items from the Multidimensional Personality Questionnaire (MPQ) as a foundation for advancing research on the etiology of psychopathy using existing large-scale longitudinal studies. MPQ-Tri scales were examined in three samples: mixed-gender undergraduate students (N = 346), male offenders from a residential substance abuse treatment facility (N = 190), and incarcerated female offenders (N = 216). Across these three samples, the MPQ-Tri scales demonstrated high internal consistency and clear convergent and discriminant associations with criterion measures of psychopathy and other psychopathology outcomes. Gender comparisons revealed relatively few differences in relationships with criterion measures. Findings are discussed in terms of their implications for further investigation of the causal bases of psychopathy and other forms of psychopathology utilizing data from large etiologically informative studies.
The interpersonal circumplex is a well-established structural model that organizes interpersonal functioning within the two-dimensional space marked by dominance and affiliation. The structural summary method (SSM) was developed to evaluate the interpersonal nature of other constructs and measures outside the interpersonal circumplex. To date, this method has been primarily descriptive, providing no way to draw inferences when comparing SSM parameters across constructs or groups. We describe a newly developed resampling-based method for deriving confidence intervals, which allows for SSM parameter comparisons. In a series of five studies, we evaluated the accuracy of the approach across a wide range of possible sample sizes and parameter values, and demonstrated its utility for posing theoretical questions on the interpersonal nature of relevant constructs (e.g., personality disorders) using real-world data. As a result, the SSM is strengthened for its intended purpose of construct evaluation and theory building.
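The following sketch illustrates the general idea of attaching resampling-based confidence intervals to structural summary parameters. It computes elevation, amplitude, and angular displacement from an eight-octant profile and then applies a plain percentile bootstrap over respondents. The percentile bootstrap, the octant angles, the function names, and the simulated data are all assumptions made for illustration; the abstract does not describe the authors' exact procedure.

```python
import math
import random

# Assumed octant angles in degrees; the order must match the data columns.
ANGLES = [0, 45, 90, 135, 180, 225, 270, 315]


def ssm_parameters(profile, angles_deg=ANGLES):
    """Return (elevation, amplitude, displacement in degrees) for a profile."""
    n = len(profile)
    elevation = sum(profile) / n
    x = 2 / n * sum(s * math.cos(math.radians(a)) for s, a in zip(profile, angles_deg))
    y = 2 / n * sum(s * math.sin(math.radians(a)) for s, a in zip(profile, angles_deg))
    amplitude = math.hypot(x, y)
    displacement = math.degrees(math.atan2(y, x)) % 360
    return elevation, amplitude, displacement


def bootstrap_ci(rows, index, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for elevation (index 0) or amplitude (index 1).

    rows: one list of octant scores per respondent.
    Displacement is circular, so a plain percentile interval is not
    appropriate for it and is deliberately not supported here.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]          # resample respondents
        profile = [sum(col) / len(sample) for col in zip(*sample)]
        stats.append(ssm_parameters(profile)[index])
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated respondents whose profiles follow a cosine curve plus noise.
rng = random.Random(1)
rows = [
    [0.5 + 0.3 * math.cos(math.radians(a - 60)) + rng.gauss(0, 0.2) for a in ANGLES]
    for _ in range(200)
]
mean_profile = [sum(col) / len(rows) for col in zip(*rows)]
print("parameters:", ssm_parameters(mean_profile))
print("amplitude CI:", bootstrap_ci(rows, index=1))
```

Comparing two constructs or groups would then amount to bootstrapping the difference in a parameter and checking whether its interval excludes zero.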
Elevated overreporting Validity Scale scores on the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF) are associated with higher scores on collateral measures; however, measures used in prior research lacked validity scales. We sought to extend these findings by examining associations between elevated MMPI-2-RF overreporting scale scores and Personality Assessment Inventory (PAI) scale scores among 654 non–head injury civil disability claimants. Individuals were classified as overreporting psychopathology (OR-P), overreporting somatic/cognitive complaints (OR-SC), inconclusive reporting psychopathology (IR-P), inconclusive reporting somatic/cognitive complaints (IR-SC), or valid reporting (VR). Both overreporting groups had significantly and meaningfully higher scores than the VR group on the MMPI-2-RF and PAI scales. Both IR groups had significantly and meaningfully higher scores than the VR group, as well as lower scores than their overreporting counterparts. Our findings demonstrate the utility of inventories with validity scales in assessment batteries that include instruments without measures of protocol validity.
To assess the reliability and construct validity of the Personality Inventory for DSM-5 Brief Form (PID-5-BF) among adolescents, 877 Italian high school students were administered the PID-5-BF. Participants were also administered the Measure of Disordered Personality Functioning (MDPF) as a criterion measure. In the full sample, Cronbach’s alpha values for the PID-5-BF scales ranged from .59 (Detachment) to .77 (Psychoticism); in addition, all PID-5-BF scales showed mean interitem correlation values in the .22 to .40 range. The Cronbach’s alpha value for the PID-5-BF total score was .83 (mean interitem r = .16). Although 2-month test–retest reliability could be assessed only in a small (n = 42) subsample of participants, all PID-5-BF scale scores showed adequate temporal stability, as indexed by intraclass r values ranging from .78 (Negative Affectivity) to .97 (Detachment), all ps < .001. Exploratory structural equation modeling analyses provided at least moderate support for the a priori model of PID-5-BF items. Multiple regression analyses showed that PID-5-BF scales predicted a nonnegligible amount of variance in MDPF Non-Cooperativeness, adjusted R² = .17, p < .001, and Non-Coping scales, adjusted R² = .32, p < .001. Similarly, the PID-5-BF total score was a significant predictor of both the MDPF Non-Coping and Non-Cooperativeness scales.
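The reported alpha values and mean interitem correlations can be related through the standardized alpha (Spearman-Brown) formula, alpha = k·r / (1 + (k − 1)·r). The sketch below assumes 5 items per PID-5-BF scale and 25 items for the total score; these item counts are not stated in the abstract and are supplied here as an assumption, as is the use of the standardized form of alpha.

```python
def standardized_alpha(k, mean_r):
    """Standardized Cronbach's alpha from item count k and mean interitem r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Assumed item counts: 5 items per scale, 25 items for the total score.
for label, k, r in [
    ("scale, mean r = .22", 5, 0.22),
    ("scale, mean r = .40", 5, 0.40),
    ("total, mean r = .16", 25, 0.16),
]:
    print(f"{label}: alpha = {standardized_alpha(k, r):.2f}")
```

Under these assumptions the formula gives roughly .59, .77, and .83, which illustrates why a long total score can reach a high alpha even with a low mean interitem correlation.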
The Alabama Parenting Questionnaire nine-item short form (APQ-9) is an often used assessment of parenting in research and applied settings. It uses parent and youth ratings for three scales: Positive Parenting, Inconsistent Discipline, and Poor Supervision. The purpose of this study is to examine the longitudinal invariance of the APQ-9 for both parents and youth, and the multigroup invariance between parents and youth during the transition from middle school to high school. Parent and youth longitudinal configural, metric, and scalar invariance for the APQ-9 were supported when tested separately. However, the multigroup invariance tests indicated that scalar invariance was not achieved between parent and youth ratings. Essentially, parent and youth mean scores for Positive Parenting, Inconsistent Discipline, and Poor Supervision can be independently compared across the transition from middle school to high school. However, comparing parent and youth scores across the APQ-9 scales may not be meaningful.
As has been recognized for nearly four decades, most juvenile suspects waive their Miranda rights and almost immediately provide self-incriminating evidence. Miranda-specific measures were eventually developed to understand their capacities and limitations. With extensive revisions, the Miranda Rights Comprehension Instruments (MRCI) were normed and validated. Beyond reliability, the current study addresses the convergent and discriminant validity of the MRCI. In response to Frumkin and Sellbom’s criticism of the MRCI’s norms, the current research provides representative data on 245 legally involved juveniles with percentiles to facilitate the interpretation of MRCI data. The current investigation is also the first MRCI study to link directly Miranda comprehension (i.e., the knowing prong) to Miranda reasoning (i.e., the intelligent prong) of waiver decisions.
The Paternal Adjustment and Paternal Attitudes Questionnaire (PAPA) was designed to assess paternal adjustment and paternal attitudes during the transition to parenthood. This study aimed to examine the psychometric characteristics of the Portuguese versions of the PAPA-Antenatal (PAPA-AN) and -Postnatal (PAPA-PN) versions. A nonclinical sample of 128 fathers was recruited in the obstetrics outpatient unit, and they completed both versions of the PAPA and self-report measures of depressive and anxiety symptoms during pregnancy and the postpartum period, respectively. Good internal consistency for both PAPA-AN and PAPA-PN was found. A three-factor model was found for both versions of the instrument. Longitudinal confirmatory factor analysis revealed a good model fit. The PAPA-AN and PAPA-PN subscales revealed good internal consistency. Significant associations were found between PAPA (PAPA-AN and PAPA-PN) and depressive and anxiety symptoms, suggesting good criterion validity. Both versions also showed good clinical validity, with optimal cutoffs found. The present study suggested that the Portuguese versions of the PAPA are reliable multidimensional self-report measures of paternal adjustment and paternal attitudes that could be used to identify fathers with adjustment problems and negative attitudes during the transition to parenthood.
The current article describes the adaptation of a measure of sexual orientation self-concept ambiguity (SSA) from an existing measure of general self-concept clarity. Latent "trait" scores of SSA reflect the extent to which a person’s beliefs about their own sexual orientation are perceived as inconsistent, unreliable, or incongruent. Sexual minority and heterosexual women (n = 348), ages 18 to 30, completed a cross-sectional survey. Categorical confirmatory factor analysis guided the selection of items to form a 10-item, self-report measure of SSA. In the current report, we also examine (a) reliability of the 10-item scale score, (b) measurement invariance based on respondents’ sexual identity status and age group, and (c) correlations with preexisting surveys that purport to measure similar constructs and theoretical correlates. Evidence for internal reliability, measurement invariance (based on respondent sex), and convergent validity was also investigated in an independent, validation sample. The lowest SSA scores were reported by women who self-ascribed an exclusively heterosexual or exclusively lesbian/gay sexual identity, whereas those who reported a bisexual, mostly lesbian/gay, or mostly heterosexual identity, reported relatively higher SSA scores.
Higher order factor structure of the Luria interpretive scheme on the Kaufman Assessment Battery for Children–Second Edition (KABC-II) for the 7- to 12-year and the 13- to 18-year age groups in the KABC-II normative sample (N = 2,025) is reported. Using exploratory factor analysis, multiple factor extraction criteria, and hierarchical exploratory factor analysis not included in the KABC-II manual, two-, three-, and four-factor extractions were analyzed to assess the hierarchical factor structure by sequentially partitioning variance appropriately to higher order and lower order dimensions as recommended by Carroll. No evidence for a four-factor solution was found. Results showed that the largest portions of total and common variance were accounted for by the second-order general factor and that interpretation should focus primarily, if not exclusively, at that level of measurement.
The present study applied item response theory to examine the psychometric properties of the Asian Adolescent Depression Scale and to construct a short form among 1,084 teenagers recruited from secondary schools in Hong Kong. Findings suggested that some items of the full form reflected higher levels of severity and were more discriminating than others, and the Asian Adolescent Depression Scale was useful in measuring a broad range of depressive severity in community youths. Differential item functioning emerged in several items where females reported higher depressive severity than males. In the short form construction, preliminary validation suggested that, relative to the 20-item full form, our derived short form offered significantly greater diagnostic performance and stronger discriminatory ability in differentiating depressed and nondepressed groups, and simultaneously maintained adequate measurement precision with a reduced response burden in assessing depression in the Asian adolescents. Cultural variance in depressive symptomatology and clinical implications are discussed.
The fifth edition of the Diagnostic and Statistical Manual includes a dissociative subtype of posttraumatic stress disorder, but no existing measures specifically assess it. This article describes the initial evaluation of a 15-item self-report measure of the subtype called the Dissociative Subtype of Posttraumatic Stress Disorder Scale (DSPS) in an online survey of 697 trauma-exposed military veterans representative of the U.S. veteran population. Exploratory factor analyses of the lifetime DSPS items supported the intended structure of the measure consisting of three factors reflecting derealization/depersonalization, loss of awareness, and psychogenic amnesia. Consistent with prior research, latent profile analyses assigned 8.3% of the sample to a highly dissociative class distinguished by pronounced symptoms of derealization and depersonalization. Overall, results provide initial psychometric support for the lifetime DSPS scales; additional research in clinical and community samples is needed to further validate the measure.
For the past 40 years, the conventional univariate model of self-monitoring has reigned as the dominant interpretative paradigm in the literature. However, recent findings associated with an alternative bivariate model challenge the conventional paradigm. In this study, item response theory is used to develop measures of the bivariate model of acquisitive and protective self-monitoring using original Self-Monitoring Scale (SMS) items, and data from two large, nonstudent samples (Ns = 13,563 and 709). Results indicate that the new acquisitive (six-item) and protective (seven-item) self-monitoring scales are reliable, unbiased in terms of gender and age, and demonstrate theoretically consistent relations to measures of personality traits and cognitive ability. Additionally, by virtue of using original SMS items, previously collected responses can be reanalyzed in accordance with the alternative bivariate model. Recommendations for the reanalysis of archival SMS data, as well as directions for future research, are provided.
The present study examined measurement invariance and convergent validity of a novel vignette-based measure of emotion-specific self-regulation that simultaneously assesses attributional bias, emotion-regulation, and self-efficacy beliefs about emotion regulation. Participants included 541 youth–mother dyads from three countries (Italy, the United States, and Colombia) and six ethnic/cultural groups. The mean age of the youth was 12.62 years (SD = 0.69). In response to vignettes involving ambiguous peer interactions, children reported their hostile/depressive attribution bias, self-efficacy beliefs about anger and sadness regulation, and anger/sadness regulation strategies (i.e., dysregulated expression and rumination). Across the six cultural groups, anger and sadness self-regulation subscales had full metric and partial scalar invariance for a one-factor model, with some exceptions. We found support for both a four- and three-factor oblique model (dysregulated expression and rumination loaded on a second-order factor) for both anger and sadness. Anger subscales were related to externalizing problems, while sadness subscales were related to internalizing symptoms.
We investigated the cross-cultural construct validity of hope, a factor associated with mental health protection and promotion, using the Children’s Hope Scale (CHS). The sample (n = 1,057; 48% girls) included baseline data from three cluster-randomized controlled trials with children affected by armed conflict (n = 329 Burundi; n = 403 Indonesia; n = 325 Nepal). The confirmatory factor analysis in each country indicated good fit for the hypothesized two-factor model. Analysis by gender indicated that configural invariance was supported and that scalar invariance was demonstrated in Indonesia. However, metric and scalar invariance were not supported in Burundi and Nepal. In country comparisons, configural and metric invariance were met, but scalar invariance was not supported. Evidence from this study supports the use of the CHS within various sociocultural settings and across genders, but direct comparisons of CHS scores across groups should be done with caution. Rigorous evaluations of the measurement properties of mental health protective and promotive factors are necessary to inform both research and practice.
In the current study, we fit confirmatory bi-factor models to the items of the Autism Spectrum Quotient (AQ) and Autism Spectrum Quotient Short Form (AQ-S) in order to assess the extent to which the items of each reflect general versus specific factors. The models were fit in a combined sample of individuals with and without a clinical diagnosis of autism spectrum disorders. Results indicated that, with the exception of the Attention to Details factor in the AQ and the Numbers/Patterns factors in the AQ-S, items primarily reflected a general factor. This suggests that when attempting to estimate an association between a specific symptom measured by the AQ or AQ-S and some criterion, associations will be confounded by the general factor. To resolve this, we recommend using a bi-factor measurement model or factor scores from a bi-factor measurement model whenever hypotheses about specific symptoms are being assessed.
This study presents a Rasch-derived short form of the Warwick-Edinburgh Mental Well-Being Scale for use as a screening tool in the general population. Data from 2,005 18- to 69-year-olds revealed problematic discrimination at specific thresholds. Estimation of model fit also deviated from Rasch model expectations. Following deletion of 4 items, the 10 remaining items indicated the data fitted the model. No items showed differential item functioning, thereby making comparisons of overall positive mental well-being for the different age, gender, and income groups valid and accurate. Cronbach’s alpha and Rasch Person Separation Index indicated a strong degree of reliability. Overall, the 10-item scale challenges researchers and clinicians to reconsider the assessment of positive mental well-being.
The Mindful Attention Awareness Scale was developed to measure individual differences in the tendency to be mindful. The current study examined the psychometric properties of the Mindful Attention Awareness Scale in a heterogeneous sample of 565 nonmeditators and 612 meditators using the polytomous Rasch model. The results showed that some items did not function the same way for these two groups. Overall, meditators had higher mean estimates than nonmeditators. The analysis identified a group of items as highly discriminating. Using a different model, Van Dam, Earleywine, and Borders in 2010 identified the same group of items as highly discriminating, and concluded that they were the items with the most information. Multiple pieces of evidence from the Rasch analysis showed that these items discriminate highly because of local dependence, hence do not supply independent information. We discussed how these different conclusions, based on similar findings, result from two very different paradigms in measurement.
In this rejoinder, we comment on Wright’s response to our reanalysis and reinterpretation of the data presented by Wright and colleagues. Two primary differences characterize these perspectives. First, the conceptualization of grandiose narcissism differs such that emotional and ego vulnerability, dysregulation, and pervasive impairments are more characteristic of Wright’s conception, likely due to the degree to which it is tied to clinical observations. Our conceptualization is closer to psychopathy and describes an extraverted, dominant, and antagonistic individual who is relatively less likely to be found in clinical settings. Second, our approach to construct validation differs in that we take an empirical perspective that focuses on the degree to which inventories yield scores consistent with a priori predictions. The grandiose dimension of the Pathological Narcissism Inventory (PNI-G) yields data that fail to align with expert ratings of narcissistic personality disorder and grandiose narcissism. We suggest that caution should be taken in treating the PNI-G as a gold standard measure of pathological narcissism, that revision of the PNI-G is required before it can serve as a stand-alone measure of grandiose narcissism, and that the PNI-G should be buttressed by other scales when being used as a measure of grandiose narcissism.
It has been evident for some time that the Boredom Proneness Scale (BPS), a commonly used measure of trait boredom, does not constitute a single scale. Factor analytic studies have identified anything from two to seven factors, prompting Vodanovich and colleagues to propose an alternative two-factor short-form version, the Boredom Proneness Scale–Short Form (BPS-SR). The present study further investigates the factor structure and validity of both the BPS and the BPS-SR. The two-factor solution obtained for the BPS-SR appears to be an artifact of the wording of reverse-scored items. These same items may also have contributed to the earlier complexity and inconsistency of results for the full BPS. An eight-item scale of only consistently worded items (i.e., those not requiring reverse scoring) was developed. This new scale demonstrated unidimensionality, and the scale score had good internal consistency and construct validity comparable to the original BPS score.
This study examined various psychometric properties of the items comprising the shame and guilt scales of the Test of Self-Conscious Affect–Adolescent. A total of 563 adolescents (321 females and 242 males) completed these scales, and also measures of depression and empathy. Confirmatory factor analysis provided support for an oblique two-factor model, with the originally proposed shame and guilt items comprising shame and guilt factors, respectively. Also, shame correlated with depression positively and had no relation with empathy. Guilt correlated with depression negatively and with empathy positively. Thus, there was support for the convergent and discriminant validity of the shame and guilt factors. Multiple-group confirmatory factor analysis comparing females and males, based on the chi-square difference test, supported full metric invariance, the intercept invariance of 26 of the 30 shame and guilt items, and higher latent mean scores among females for both shame and guilt. Comparisons based on the difference in root mean squared error of approximation values supported full measurement invariance and no gender difference for latent mean scores. The psychometric and practical implications of the findings are discussed.
The utility of a narrative approach to identity and its role in psychological functioning are becoming increasingly recognized across various fields of inquiry. The current study aimed to develop a quantitative, self-report measure of the awareness of narrative identity and how globally coherent one’s autobiographical memories are perceived to be, specifically, in terms of temporal ordering, causal associations, and the perception of unifying themes. The construct validity and reliability of the Awareness of Narrative Identity Questionnaire (ANIQ) were assessed across three studies. In the first study, exploratory factor analysis of the responses of a large sample (N = 441, M [age in years] = 33.1, SD = 15.2) to an initial item pool resulted in a 20-item four-factor structure congruent with the proposed subscales, and convergent and divergent validity were established. In the second study, and with a different sample (N = 320, M [age in years] = 26.2, SD = 4.0), further evidence for the factor structure was provided through confirmatory factor analysis. Validity findings from Study 1 were replicated and extended, and test–retest reliabilities were found to be high (r = .72-.79). Importantly, in the third study (N = 71, M [age in years] = 24.9, SD = 6.9), criterion validity was established, whereby the ANIQ subscales were demonstrated to be associated with dimensions of narrative coherence coded from written turning-point narratives. Across all studies, the internal reliabilities for the subscales were high (α = .86-.96). The ANIQ represents a valid, psychometrically sound, and novel method of assessing the awareness of narrative identity and autobiographical memory coherence.
In this article, we investigated the extent and nature of informant discrepancies on parent- and adolescent self-report versions of a checklist measuring youth exposure to life stressors. Specifically, we examined (a) mean-level differences, relative consistency, and consensus for family-level and youth-specific stressors and (b) the utility of parent–youth discrepancies in accounting for variance in youth temperament and psychopathology. Participants were 106 parent–child dyads (youth: 47 male, 59 female; parents: 90.6% mothers), with youth aged 13 to 18 years (M = 16.01, SD = 1.29). The results revealed evidence for both congruence and divergence in parent and youth reports, particularly with respect to respondents’ accounts of youth-specific stressors. Discrepancies for youth-specific stressors were associated with adolescents’ negative affectivity, surgency, effortful control, and internalizing problems. Discrepancies for youth stressors may therefore reveal individual differences in emotionality and self-regulation, thus reflecting meaningful variance in adolescents’ functioning.
Malingering is relatively common in criminal forensic evaluations as base rates of malingering have ranged from 20% to 30%. Given that the most prevalent criminal forensic evaluation is the assessment of competency to stand trial, the assessment of feigning during competency evaluations is necessary for accurate findings. Most of the response style literature focuses on feigning mental health symptoms, but in competency evaluations, individuals may attempt to feign legal knowledge deficits in order to be found incompetent to stand trial. The current investigation includes two studies: one using 195 students instructed to simulate feigned mental illness or incompetence to stand trial and one using a sample of 130 state psychiatric hospital residents who had been adjudicated incompetent to stand trial. The purpose of the study was to evaluate the Inventory of Legal Knowledge’s (ILK; Musick & Otto, 2010) ability to detect individuals who are feigning legal knowledge deficits. Classification utility statistics, including sensitivity, specificity, positive predictive power, and negative predictive power, are provided for each cut-score on the ILK, beginning with a cut-score of 24 (the lower end of the range of chance). The current cut-score of 47 provided in the professional manual of the ILK was shown to create a large number of false positives, suggesting that modifications to this cut-score are required.
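The classification utility statistics named above can be sketched for a single cut-score as follows. This is an illustration only: the scores are invented, the function name is hypothetical, and the rule that scores at or below the cut are flagged as suggestive of feigning is an assumption rather than something stated in the abstract.

```python
def classification_utility(feigner_scores, genuine_scores, cut):
    """Utility statistics for one cut-score.

    Assumption: a score at or below `cut` is flagged as suggestive of feigning.
    """
    tp = sum(s <= cut for s in feigner_scores)   # feigners correctly flagged
    fn = len(feigner_scores) - tp                # feigners missed
    fp = sum(s <= cut for s in genuine_scores)   # genuine responders wrongly flagged
    tn = len(genuine_scores) - fp                # genuine responders correctly cleared

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppp": ratio(tp, tp + fp),   # positive predictive power
        "npp": ratio(tn, tn + fn),   # negative predictive power
    }


if __name__ == "__main__":
    # Hypothetical scores, not data from the study.
    feigners = [20, 30, 40, 50]
    genuine = [45, 50, 55, 60]
    for cut in (24, 35, 47):
        print(cut, classification_utility(feigners, genuine, cut))
```

Sweeping the cut-score in this way makes the trade-off visible: a higher cut flags more feigners but also more genuine responders, which is the false-positive problem the abstract reports for the manual's cut-score. Positive and negative predictive power also depend on the base rate of feigning in the sample.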
This study examined the accuracy of depression cross-walk tables in a sample of people with multiple sclerosis (MS). The tables link scores of two commonly used depression measures to the Patient Reported Outcome Measurement Information System Depression (PROMIS-D) scale metric. We administered the 8-item PROMIS-D (Short-Form 8b; PROMIS-D-8), the 20-item Center for Epidemiologic Studies Depression Scale (CESD-20), and the 9-item Patient Health Questionnaire (PHQ-9) to 459 survey participants with MS. We examined correlations between actual PROMIS-D-8 scores and the scores predicted by cross-walks based on PHQ-9 and CESD-20 scores. Intraclass correlation coefficients were used to assess correspondence. Consistency in severity classification was also calculated. Finally, we used Bland–Altman plots to graphically examine the levels of agreement. The correlations between actual and cross-walked PROMIS-D-8 scores were strong (CESD-20 = .82; PHQ-9 = .74). The intraclass correlation was moderate (.77). Participants were consistently classified as having or not having at least moderate depressive symptoms by both actual and cross-walked scores derived from the CESD-20 (90%) and PHQ-9 (85%). Bland–Altman plots suggested smaller differences between actual and cross-walked scores at greater-than-average depression severity. PROMIS cross-walk tables can be used to translate depression scores of people with MS to the PROMIS-D metric, promoting continuity with previous research.
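The quantities behind a Bland–Altman plot can be sketched as below. The scores are invented, the function name is hypothetical, and the conventional mean difference ± 1.96 SD limits of agreement are assumed; the abstract does not say which limits the authors used.

```python
from statistics import mean, stdev


def bland_altman(actual, crosswalked):
    """Return per-person averages and differences, the mean bias, and the limits of agreement."""
    diffs = [a - c for a, c in zip(actual, crosswalked)]       # y-axis of the plot
    avgs = [(a + c) / 2 for a, c in zip(actual, crosswalked)]  # x-axis of the plot
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return avgs, diffs, bias, limits


if __name__ == "__main__":
    # Hypothetical score pairs, not data from the study.
    actual = [50, 55, 60, 65]
    crosswalked = [48, 56, 59, 67]
    avgs, diffs, bias, limits = bland_altman(actual, crosswalked)
    print("bias:", bias)
    print("limits of agreement:", limits)
```

Plotting `diffs` against `avgs` shows whether disagreement changes with severity, which is the pattern the abstract describes for higher depression scores.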
The Cattell–Horn–Carroll (CHC) theory of cognitive abilities guided the revision of the Wechsler Adult Intelligence Scale–Fourth Edition (WAIS-IV). In particular, the measurement of fluid reasoning (Gf) is improved. A total of five CHC abilities are included in the WAIS-IV subtests. Using confirmatory factor analysis, a five-factor model based on these CHC abilities is evaluated and compared with the four index scores in the Dutch-language version of the WAIS-IV. Both models demonstrate moderate fit; preference is given to the five-factor CHC model on both statistical and theoretical grounds. Evaluation of the WAIS-IV according to CHC terminology enhances uniformity and can be important when interpreting possible sources of index discrepancies. To optimally integrate CHC and WAIS-IV, more knowledge of the interaction of abilities is needed. This can be done by incorporating intelligence testing in neuropsychological assessment. Using this functional approach contributes to a better understanding of an individual’s cognitive profile.
This study explored the longitudinal measurement invariance of the Beck Depression Inventory–II (BDI-II) in early adolescents (junior high school students). The participants were 730 early adolescents (330 boys and 400 girls), who were followed up over 3 years (in six waves). To reduce the size of the longitudinal model and verify the stability of the findings, the Fall and Spring series data sets were analyzed separately. Each series includes three waves of data collected about 1 year apart. It was found that the three-factor model (Negative Attitude, Performance Difficulty, and Somatic Elements) best fitted the data. Results of both data sets provided support for the longitudinal measurement invariance (threshold invariance) of the three-factor model, suggesting that the BDI-II measured the same construct over 3 years. The study also examined the category function of the BDI-II on the basis of the pattern of threshold estimates. Finally, the implications of the findings on the continuing use of the BDI-II are discussed.
The Personality Inventory for DSM-5 (PID-5) measures the trait part (Criterion B) of the alternative model for personality disorders proposed in Section III of DSM-5. Although its psychometric properties have proven adequate thus far, evidence is limited in other languages and in clinical samples. The Spanish PID-5 was examined in two samples comprising 446 clinical and 1,036 community subjects. Facet scales showed good internal consistency in both samples (median α = .86 and .79) and were unidimensional under exploratory and confirmatory approaches. They were also able to distinguish between clinical and community subjects with a mean standardized difference of z = 0.81. All facets except for Risk Taking were unipolar, such that the upper poles indicated pathology and the lower poles reflected normality, rather than the opposite pole of abnormality. The entire PID-5 hierarchical structure, from one to five factors, was confirmed in both samples with Tucker’s congruence coefficients over .95.
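Tucker's congruence coefficient, used above to compare factor structures, is the cosine between two columns of factor loadings. The sketch below uses invented loadings and a hypothetical function name.

```python
from math import sqrt


def tucker_congruence(loadings_x, loadings_y):
    """Tucker's phi: sum(x*y) / sqrt(sum(x^2) * sum(y^2)) for two loading vectors."""
    cross = sum(x * y for x, y in zip(loadings_x, loadings_y))
    norm = sqrt(sum(x * x for x in loadings_x) * sum(y * y for y in loadings_y))
    return cross / norm


if __name__ == "__main__":
    # Hypothetical loadings of the same five facets on one factor in two samples.
    clinical = [0.72, 0.65, 0.58, 0.10, 0.05]
    community = [0.70, 0.60, 0.62, 0.15, 0.02]
    print(round(tucker_congruence(clinical, community), 3))
```

Because the coefficient ignores the overall scale of the loadings, it indexes similarity in the pattern of loadings across samples rather than equality of their magnitudes.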
Psychological assessments are highly dependent on the forthrightness and sincere efforts of examinees. In particular, evaluations in forensic settings must consider whether feigning or other response styles are utilized to intentionally distort the clinical presentation. The current study examines the effectiveness of the Inventory of Legal Knowledge (ILK) at detecting feigned incompetency within a sample of jail detainees. As an ancillary goal, several scales of the Standardized Assessment of Miranda Abilities were included in the same within-subjects simulation design. Results of the total ILK score raised concerns regarding the mischaracterization of genuine offenders as "suggestive of feigning." Pending cross-validation, however, a Revised ILK proved highly effective, using a floor effect detection strategy. Although intended for Miranda-specific abilities, several detection strategies on the Standardized Assessment of Miranda Abilities appeared to be very promising within a broadened context of feigned incompetency.
The psychometric properties of the 64-item Self-Report Psychopathy Scale–III (SRP-III) and its abbreviated 28-item SRP–Short Form (SRP-SF) seem promising. Still, cross-cultural evidence for its construct validity in heterogeneous community samples remains relatively scarce. Moreover, little is known about the interchangeability of both instruments. The present study addresses these research gaps by comparing the SRP-III and SRP-SF factorial construct validity and nomological network in a Belgian community sample. The four-factor model of psychopathy was evaluated (N = 1,510) and the SRP scales’ relationship with various external correlates (i.e., attachment, bullying and victimization, right-wing attitudes, right-wing authoritarianism, and response styles) was examined (n = 210). Both SRP versions demonstrated a good fit for the four-factor model and a considerable overlap with the nomological network of psychopathy. The results suggested that the SRP-SF provides a viable alternative to the SRP-III for assessment in the community. Theoretical and practical implications are discussed.
Patients’ narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four machine-learning algorithms—including decision tree, naive Bayes, support vector machine, and an alternative classification approach called the product score model—were used in combination with n-gram representation models to identify patterns between verbal features in self-narratives and psychiatric diagnoses. With our sample, the product score model with unigrams attained the highest prediction accuracy when compared with practitioners’ diagnoses. The addition of multigrams contributed most to balancing the metrics of sensitivity and specificity. This article also demonstrates that text mining is a promising approach for analyzing patients’ self-expression behavior, thus helping clinicians identify potential patients from an early stage.
The present study examined the factor structure and construct validity of the Children’s Loneliness Scale (CLS), a popular measure of childhood loneliness, in Belgian children. Analyses were conducted on two samples of fifth and sixth graders in Belgium, for a total of 1,069 children. A single-factor structure proved superior to alternative solutions proposed in the literature, when taking item wording into account. Construct validity was shown by substantial associations with related constructs, based on both self-reported (e.g., depressive symptoms and low social self-esteem), and peer-reported variables (e.g., victimization). Furthermore, a significant association was found between the CLS and a peer-reported measure of loneliness. Collectively, these findings provide a solid foundation for the continuing use of the CLS as a measure of childhood loneliness.
The Interpersonal Dependency Inventory (IDI) is a frequently used, 48-item measure of maladaptive dependency. Our goal was to develop and psychometrically evaluate a very brief version of the IDI. An exploratory factor analysis of the IDI in Study 1 (N = 838) yielded a six-item IDI (IDI-6), with three items loading on an emotional dependency factor (IDI-6-ED), and the other three items loading on a functional dependency factor (IDI-6-FD). This factor solution was validated by confirmatory factor analysis in Study 2 (N = 916). The IDI-6-ED and IDI-6-FD demonstrated good convergent and divergent validity in Study 3 (N = 100). In Study 4 (N = 22-43), the IDI-6-ED and IDI-6-FD were generally stable over 4-week and 8-week intervals and were found to be responsive to the effects of psychological treatment. These results have implications for dependency conceptualizations and support the IDI-6 as a brief, psychometrically sound instrument.
Assessment is an integral component of treatment. However, prior surveys indicate clinicians may not use standardized assessment strategies. We surveyed 1,510 clinicians and used multivariate analysis of variance to explore group differences in specific measure use. Clinicians used unstandardized measures more frequently than standardized measures, although psychologists used standardized measures more frequently than nonpsychologists. We also used latent profile analysis to classify clinicians based on their overall approach to assessment and examined associations between clinician-level variables and assessment class or profile membership. A four-profile model best fit the data. The largest profile consisted of clinicians who primarily used unstandardized assessments (76.7%), followed by broad-spectrum assessors who regularly use both standardized and unstandardized assessment (11.9%), and two smaller profiles of minimal (6.0%) and selective assessors (5.5%). Compared with broad-spectrum assessors, unstandardized and minimal assessors were less likely to report having adequate standardized measures training. Implications for clinical practice and training are discussed.
Objectified body consciousness (OBC) appears to play a crucial role in eating and body-related disturbances, which typically emerge during adolescence. The 24-item OBC Scale (OBCS) has been employed in eating disorder (ED) research and school-based adolescent samples, but evidence for its psychometric properties exists only in adult (nonclinical) populations. We evaluated (a) the construct validity and reliability of the 24-item OBCS with data collected from 1,259 adolescent girls and boys from the community (Study 1) and 643 adolescents of both genders with an ED (Study 2) and (b) whether the instrument functions similarly and equivalently measures the underlying construct(s) across gender and samples (i.e., test of measurement equivalence/invariance; Study 3). Results upheld the three-factor structure and measurement equivalence/invariance of the 24-item OBCS across gender and samples. OBCS subscale scores were internally consistent and stable over a 4-week period. OBCS subscales discriminated community participants with high and low ED symptom levels with fair accuracy, as well as community participants from those with an ED. They were also associated with five constructs closely related to both OBC and ED psychopathology. Latent mean comparisons across samples and gender were performed and discussed. Implications and directions for future research are also outlined.
The Drinking Motives Questionnaire, previously postulated and documented to exhibit a measurement structure of four correlated factors (social, enhancement, conformity, and coping), is a widely administered assessment of reasons for consuming alcohol. In the current study (N = 552), confirmatory factor analyses tested the plausibility of several theoretically relevant factor structures. Fit indices corroborated the original four-factor model, and also supported a higher-order factor model involving a superordinate motives factor that explicated four subordinate factors. A bifactor model that permitted items to double load on valence type (positive or negative reinforcement) and source type (external or internal) generated mixed results, suggesting that this 2 x 2 motivation paradigm was not entirely tenable. Optimal fit was obtained for a bifactor model depicting a general factor and four specific factors of motives. Latent factors derived from this structure exhibited criterion validity in predicting frequency and quantity of alcohol usage in a structural equation model. Findings are interpreted in the context of theoretical implications of the instrument, alternative factor structures of drinking motives, and assessment applications.
Spinal cord stimulation (SCS) has variable effectiveness in controlling chronic pain. Previous research has demonstrated that psychosocial factors are associated with diminished results of SCS. The objective of this investigation is to examine associations between pre-implant psychological functioning as measured by the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF) and SCS outcomes. SCS candidates at two sites (total N = 319) completed the MMPI-2-RF and measures of pain, emotional distress, and functional ability as part of a pre-implant psychological evaluation. At an average of 5 months post-implant, patients completed the measures of pain and emotional distress a second time. Poorer SCS outcomes and poorer patient satisfaction were associated with higher pre-implant MMPI-2-RF scores on scales used to assess emotional dysfunction, somatic/cognitive complaints, and interpersonal problems. Ways through which pre-implant psychological evaluations of spinal cord stimulator candidates can be informed by MMPI-2-RF findings are discussed.
The Trail Making Test (TMT) is used as an indicator of visual scanning, graphomotor speed, and executive function. The aim of this study was to examine the relationships of the TMT with several neuropsychological measures and to provide normative data in community-dwelling participants aged 55 years and older. A population-based Spanish-speaking sample of 2,564 participants was used. The TMT, Symbol Digit Test, Stroop Color–Word Test, Digit Span Test, Verbal Fluency tests, and the MacQuarrie Test for Mechanical Ability tapping subtest were administered. Exploratory factor analyses and linear regression models were used. Normative data for the TMT scores were obtained. A total of 1,923 participants (76.3%) participated, 52.4% were women, and the mean age was 66.5 years (SD = 8.0). The Symbol Digit Test, MacQuarrie Test for Mechanical Ability tapping subtest, Stroop Color–Word Test, and Digit Span Test scores were associated with the performance of most TMT scores, but the contribution of each measure was different depending on the TMT score. Normative tables according to significant factors such as age, education level, and sex were created. Measures of visual scanning, graphomotor speed, and visuomotor processing speed were more related to the performance of the TMT-A score, while working memory and inhibition control were mainly associated with the TMT-B and derived TMT scores.
Although obsessive-compulsive (OC) symptoms are observed along four dimensions (contamination, responsibility for harm, order/symmetry, and unacceptable thoughts), the structure of the dimensions remains unclear. The current study evaluated a bifactor model of OC symptoms among those with and without obsessive-compulsive disorder (OCD). The goals were (a) to evaluate if OC symptoms should be conceptualized as unidimensional or whether distinct dimensions should be interpreted and (b) to use structural equation modeling to examine the convergence of the OC dimensions above and beyond a general dimension with related criteria. Results revealed that a bifactor model fit the data well and that OC symptoms were influenced by a general dimension and by four dimensions. Measurement invariance of the bifactor model was also supported among those with and without OCD. However, the general OC dimension accounted for only half of the variability in OC symptoms, with the remaining variability accounted for by distinct dimensions. Despite evidence of multidimensionality, the dimensions were unreliable after covarying for the general OC dimension. However, the four dimensions did significantly converge with a latent OC spectrum factor above and beyond the general OC dimension. The implications of these findings for conceptualizing the structure of OCD are discussed.
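As an illustration of how the share of common variance attributable to a general factor can be quantified in a bifactor model, the following is a minimal Python sketch of the explained common variance (ECV) index. It is not taken from the study: the abstract does not name the index the authors used, and the function name and the loadings below are hypothetical values chosen only to show the arithmetic.

```python
def explained_common_variance(general_loadings, specific_loadings):
    """ECV: squared general-factor loadings divided by all squared loadings.

    general_loadings: standardized loading of every item on the general factor.
    specific_loadings: standardized loading of every item on its own specific factor.
    """
    general = sum(loading ** 2 for loading in general_loadings)
    specific = sum(loading ** 2 for loading in specific_loadings)
    if general + specific == 0:
        raise ValueError("all loadings are zero")
    return general / (general + specific)


# Hypothetical loadings (not the study's estimates): the general and
# specific factors contribute equally, so the general factor explains
# half of the common variance.
ecv = explained_common_variance([0.6, 0.6, 0.6, 0.6], [0.6, 0.6, 0.6, 0.6])
print(round(ecv, 2))
```

An ECV near .50, as in this made-up case, would usually be read as evidence against essential unidimensionality, whereas values well above that favor interpreting a single general dimension.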
Multiple studies have shown that performance on behavioral decision-making tasks, such as the Iowa Gambling Task (IGT) and Balloon Analogue Risk Task (BART), is influenced by external factors, such as mood. However, the research regarding the influence of worry is mixed, and no research has examined the effect of math or test anxiety on these tasks. The present study investigated the effects of anxiety (including math anxiety) and math performance on the IGT and BART in a sample of 137 undergraduate students. Math performance and worry were not correlated with performance on the IGT, and no variables were correlated with BART performance. Linear regressions indicated math anxiety, physiological anxiety, social concerns/stress, and test anxiety significantly predicted disadvantageous selections on the IGT during the transition from decision making under ambiguity to decision making under risk. Implications for clinical evaluation of decision making are discussed.
The current study examined psychometric properties of the Japanese version of Abbreviated Multidimensional Acculturation Scale (AMAS-ZABB-JP) and the 20-item Multigroup Ethnic Identity Measure (MEIM-JP) with 273 Japanese sojourners and immigrants to the United States. The theoretical six-factor structure for the AMAS-JP and two-factor structure for the MEIM-JP were consistent with the literature. The subscales of the AMAS and MEIM showed expected patterns of correlation with each other and with additional variables (i.e., number of years in the United States), providing evidence for construct validity. Cronbach’s alpha reflected high levels of reliability for both scales. Despite strong psychometric findings, there were translational and cultural-based findings that suggest the need for further research.
Existing measures of emotion dysregulation typically assess dispositional tendencies and are therefore not well suited for study designs that require repeated assessments over brief intervals. The aim of this study was to develop and validate a state-based multidimensional measure of emotion dysregulation. Psychometric properties of the State Difficulties in Emotion Regulation Scale (S-DERS) were examined in a large representative community sample of young adult women drawn from four sites (N = 484). Exploratory factor analysis suggested a four-factor solution, with results supporting the internal consistency, construct validity, and predictive validity of the total scale and the four subscales: Nonacceptance (i.e., nonacceptance of current emotions), Modulate (i.e., difficulties modulating emotional and behavioral responses in the moment), Awareness (i.e., limited awareness of current emotions), and Clarity (i.e., limited clarity about current emotions). S-DERS scores were significantly associated with trait-based measures of emotion dysregulation, affect intensity/reactivity, experiential avoidance, and mindfulness, as well as measures of substance use problems. Moreover, significant associations were found between the S-DERS and state-based laboratory measures of emotional reactivity, even when controlling for the corresponding original DERS scales. Results provide preliminary support for the reliability and validity of the S-DERS as a state-based measure of emotion regulation difficulties.
The Autobiographical Memory Test (AMT) is the most commonly used tool to assess the phenomenon of overgeneral memory. The AMT has mainly been used in adult populations; its use in preschool children is less common. The need to create an appropriate instrument to study memory specificity in the preschool years led us to develop an AMT version adapted for early childhood. The AMT–Preschool (AMT-P) was administered to a sample of preschool children aged between 3 and 6 (N = 364). The results suggest that the AMT-P functions differently in preschoolers depending on age. With children older than 53 months, results suggest that the AMT-P is appropriate for assessing overgenerality. Nevertheless, with younger children, the task is more difficult. These results concur with previous research suggesting that the ability to recall specific memories is consolidated from the age of 4½ years.
A number of measures have been developed to assess medical decision-making capacity (MDC) in adults. However, their clinical utility is limited by a lack of available normative data. In the current study, we introduce age-independent and age-adjusted normative data for a measure of MDC: the Capacity to Consent to Treatment Instrument. The sample consisted of 308 cognitively normal, community-dwelling adults ranging in age from 19 to 86 years. For age-adjusted norms, individual raw scores were first converted to age-corrected scaled scores based on position within a cumulative frequency distribution and then grouped according to empirically supported age ranges. For age-independent norms, the same method was utilized but without age corrections being applied or participants being grouped into age ranges. This study has the potential to enhance MDC evaluations by allowing clinicians to compare a patient’s performance on the Capacity to Consent to Treatment Instrument with that of adults regardless of age as well as to same-age peers. Tables containing normative corrections are supplementary material available online at http://asm.sagepub.com/supplemental.
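The conversion of raw scores to scaled scores from a cumulative frequency distribution can be sketched in Python as follows. This is only an illustrative sketch, not the authors' procedure: the mid-rank percentile, the normalizing step through the inverse normal distribution, the scaled-score metric (mean 10, SD 3, range 1 to 19), and the function name are all assumptions that the abstract does not state.

```python
from statistics import NormalDist


def scaled_score(raw, norm_sample, mean=10.0, sd=3.0):
    """Map a raw score to a scaled score via its position in the norm sample.

    norm_sample: raw scores of the normative group (for age-adjusted norms,
    only the scores of the relevant age band would be passed in).
    The mean=10, sd=3 metric is an assumed convention.
    """
    n = len(norm_sample)
    if n == 0:
        raise ValueError("normative sample is empty")
    below = sum(1 for score in norm_sample if score < raw)
    equal = sum(1 for score in norm_sample if score == raw)
    # Mid-rank cumulative proportion, kept strictly inside (0, 1) so that
    # the inverse normal transformation is defined.
    proportion = (below + 0.5 * equal) / n
    proportion = min(max(proportion, 0.5 / n), 1 - 0.5 / n)
    z = NormalDist().inv_cdf(proportion)
    # Round to the nearest whole scaled score and clip to the 1-19 range.
    return int(min(max(round(mean + sd * z), 1), 19))


# Hypothetical normative sample: raw scores 1..100, one person each.
norms = list(range(1, 101))
print(scaled_score(50, norms))
print(scaled_score(95, norms))
```

Passing the full sample gives an age-independent score, while passing only the scores from one age range gives the age-adjusted counterpart under the same assumptions.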
This study examined the measurement equivalence of the K6 across diverse racial/ethnic and linguistic groups in the U.S. Differential item functioning analyses using item response theory were conducted among 44,846 U.S. adults drawn from the California Health Interview Survey. Results show that four items ("nervous," "restless," "depressed," and "everything an effort") varied significantly across races/ethnicities and four items ("nervous," "hopeless," "restless," and "depressed") varied significantly across languages. In additional effect size analyses designed to separate effects of race/ethnicity from language, the structure of the White English group was substantially different from both the Hispanic/Latino English group and Hispanic/Latino Spanish group, whereas the Hispanic/Latino Spanish group was not different from the Hispanic/Latino English group. The findings suggest that there was evident measurement nonequivalence in the K6 among racially/ethnically and linguistically diverse adults and that the observed nonequivalence in the K6 appears to be driven by race/ethnicity rather than language.
Heart-focused anxiety (HFA) is a fear of cardiac sensations driven by worries of physical health catastrophe. HFA is impairing and distressing and has been shown to disproportionately affect individuals with noncardiac chest pain (NCCP), chest pain that persists in the absence of an identifiable source. The Cardiac Anxiety Questionnaire (CAQ) is a measure designed to assess HFA. The aim of this study was to evaluate the psychometric properties and factor structure of the CAQ in a sample of 229 adults diagnosed with NCCP. Results demonstrated that the CAQ is a useful measure of HFA in patients with NCCP and that a four-factor model including fear of cardiac sensations, avoidance of activities that elicit cardiac sensations, heart-focused attention, and reassurance seeking was the best fit for the data. Additionally, associations between CAQ subscales and two measures of health-related behaviors—pain-related interference and health care utilization—provided evidence of concurrent validity. Treatment implications are also discussed.
Narcissism continues to suffer from a lack of consensual definition. Variability in the definition is reflected in the growing multitude of measures with oftentimes diverging nomological nets. Although the themes of narcissistic grandiosity and vulnerability appear to have achieved reasonable agreement on their central importance, the lower order structure of each is not well understood and debates remain about how (and whether) they can be integrated into a coherent whole. However, it is clear that a narrow focus on higher order grandiosity without consideration of concomitant vulnerability neglects clinically important features of narcissism. Occasioned by the potential for a new personality disorder model in the Diagnostic and Statistical Manual of Mental Disorders–Fifth edition, several colleagues and I demonstrated that pathological narcissism, as measured by the Pathological Narcissism Inventory, could not be adequately summarized by the lower order traits of Grandiosity and Attention Seeking, and argued that this should be reflected in the diagnostic manual in some form. Miller, Lynam, and Campbell then subjected these same data to critical reanalysis and interpretation. I respond here to several points raised by Miller and colleagues. In so doing, I highlight areas of agreement and disagreement and suggest directions for future research.
One aspect of higher order social cognition is empathy, a psychological construct comprising a cognitive (recognizing emotions) and an affective (responding to emotions) component. The complex nature of empathy complicates the accurate measurement of these components. The most widely used measure of empathy is the Interpersonal Reactivity Index (IRI). However, the factor structure of the IRI as it is predominantly used in the psychological literature differs from Davis’s original four-factor model in that it arbitrarily combines the subscales to form two factors: cognitive and affective empathy. This two-factor model of the IRI, although popular, has yet to be examined for psychometric support. In the current study, we examine, for the first time, the validity of this alternative model. A confirmatory factor analysis showed poor model fit for this two-factor structure. Additional analyses offered support for the original four-factor model, as well as a hierarchical model for the scale. In line with previous findings, females scored higher on the IRI than males. Our findings indicate that the IRI, as it is currently used in the literature, does not accurately measure cognitive and affective empathy and highlight the advantages of using the original four-factor structure of the scale for empathy assessments.
The accurate assessment of feigning is an important component of forensic assessment. Two potential strategies of feigning include the fabrication/exaggeration of psychiatric impairments and the fabrication/exaggeration of cognitive deficits. The current study examined the relationship between psychiatric and cognitive feigning strategies using the Structured Interview of Reported Symptoms and Test of Memory Malingering among 150 forensic psychiatric inpatients adjudicated incompetent to stand trial. A greater number of participants scored within the feigning range on the Structured Interview of Reported Symptoms than on the Test of Memory Malingering. Relative risk ratios indicated that individuals shown to be feigning cognitive deficits were 1.68 times more likely to feign psychiatric symptoms than those not shown to be feigning cognitive deficits. Likewise, individuals shown to be feigning psychiatric deficits were 1.86 times more likely to feign cognitive deficits than those not shown to be feigning psychiatric symptoms. Overall, findings suggest that psychiatric feigning and cognitive feigning are related, but can be employed separately as feigning strategies. Therefore, clinicians should consider evaluating for both feigning strategies in forensic assessments where cognitive and psychiatric symptoms are being assessed.
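For readers unfamiliar with relative risk ratios, the following minimal Python sketch shows how such a ratio is computed from a 2 × 2 table. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data, which the abstract does not report at this level of detail.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of the outcome in the 'exposed' group divided by the risk in the other group.

    Here 'exposed' would mean, for example, scoring in the feigning range on
    the cognitive measure, and the outcome would be scoring in the feigning
    range on the psychiatric measure.
    """
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group totals must be positive")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ZeroDivisionError("risk in the comparison group is zero")
    return risk_exposed / risk_unexposed


# Hypothetical counts: 30 of 60 cognitive feigners also feign psychiatric
# symptoms, versus 20 of 80 non-feigners.
print(relative_risk(30, 60, 20, 80))
```

A ratio above 1 indicates that the outcome is more likely in the first group; a ratio of exactly 1 would indicate that the two feigning strategies are unrelated.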
Hostile interpretation biases are central to the development and maintenance of anger, yet have been inconsistently assessed. The Word Sentence Association Paradigm (WSAP) was used to develop a new measure of hostile interpretation biases, the WSAP-Hostility. Study 1 examined the factor structure and internal consistency of the WSAP-Hostility, as well as its relationship with trait anger. Study 2 provided convergent and divergent validity data by examining its associations with trait anger, aggression, depression, and anxiety. Study 3 examined the relationship between WSAP-Hostility and another measure of hostile interpretation biases, as well as another word sentence association measure, in a sample of community participants. Study 4 also used a sample of community participants to offer further evidence of convergent validity. Across the studies, the WSAP-Hostility demonstrated convergent and divergent validity and internal consistency, supporting its use as a measure of hostile interpretation biases.
Cognitive behavioral models of social anxiety disorder (SAD) suggest that fear of negative evaluation is a core fear or vulnerability for SAD. However, why negative evaluation is feared is not fully understood. It is possible that core beliefs contribute to the relationship between fear of negative evaluation and SAD. One of these beliefs may be a core extrusion schema: a constellation of beliefs that one’s true self will be rejected by others and therefore one should hide one’s true self. In the current study (N = 699), we extended research on the Core Extrusion Schema and created a shortened and revised version of the measure called the Core Extrusion Schema–Revised. The Core Extrusion Schema–Revised demonstrated good factor fit for its two subscales (Hidden Self and Rejection of the True Self) and was invariant across gender and ethnicity. The Hidden Self subscale demonstrated excellent incremental validity within the full sample as well as in participants diagnosed with generalized SAD. Specifically, the Hidden Self subscale may help explain severity of social interaction anxiety. This measure could be used with individuals diagnosed with generalized SAD to design exposures targeting these core beliefs.
To assess the internal consistency, factor structure, and construct validity of the Italian translation of the Youth Psychopathic Traits Inventory–Short Version (YPI-S), both the YPI-S and its full version, the YPI, as well as self-reports of delinquency, aggression, and Big Five domains, were administered to two independent samples (N = 868 and N = 881) of Italian community, nonreferred adolescents. The internal consistency of the YPI-S was adequate, and confirmatory factor analyses showed a good fit of the theoretical three-factor model of the YPI-S in both samples. Hierarchical regression models suggested the same pattern of associations with self-report measures of delinquency and aggression for the YPI-S and YPI, although the YPI was a better predictor of Big Five domains than the YPI-S. The findings support the internal consistency, factor validity, and construct validity of the YPI-S.
There are no validated measures of psychiatric disability for traumatized refugees in Western psychiatric care. This is a serious shortcoming as it precludes monitoring of global treatment outcomes in this group, as well as appropriate matching of treatment needs to the disability levels. Using Rasch analysis, we evaluated the psychometrics of the Health of the Nation Outcome Scales (HoNOS) in pretreatment data of consecutive refugee patients (N = 448) from a Danish psychiatric clinic. Then, we carried out a cross-validation of the pretreatment HoNOS model on posttreatment data from the same group. A revised 10-item HoNOS fit the Rasch model at pretreatment and also showed excellent fit within the cross-validation data. Culture, gender, and need for translation did not exert serious bias on the measure’s performance. The results establish good monitoring properties of the 10-item HoNOS as the first validated measure of psychiatric disability for traumatized refugees in Western psychiatric care.
Wisdom has been reported to be associated with better mental health and quality of life among older adults. Over the past decades, there has been considerable growth in empirical research on wisdom, including the development of standardized measures. The 39-item Three-Dimensional Wisdom Scale (3D-WS) is a useful assessment tool, given its rigorous development and good psychometric properties. However, the measure’s length can prohibit use. In this article, we used a sample of 1,546 community-dwelling adults aged 21 to 100 years (M = 66 years) from the Successful AGing Evaluation (SAGE) study to develop an abbreviated 12-item version of the 3D-WS: the 3D-WS-12. Balancing concerns for measurement precision, internal structure, and content validity, factor analytic methods and expert judgment were used to identify a subset of 12 items for the 3D-WS-12. Results suggest that the 3D-WS-12 can provide efficient and valid assessments of wisdom within the context of epidemiological surveys.
The present study examined issues related to structural modeling of abilities by the use of simulated data as well as analysis of the standardization data from the Woodcock–Johnson-III. In both cases, results were evaluated with cross-validation. Simulation results showed that cross-validation with an independent data set was more successful in identifying the model that was used to generate test scores than were several fit indices. Analysis of the Woodcock–Johnson-III standardization data with cross-validation showed that bifactor models provided better fit than hierarchical or correlated factor models. This was true considering both fit indices and cross-validation. General and specific factors shared a considerable amount of variance as evaluated by using the bifactor models to partition variance. The results of the present study suggest that there is a certain degree of ambiguity in determining the exact amount of covariance in test performance accounted for by general and specific factors. This calls into question the practice of adjusting or controlling for general abilities when evaluating measures of specific abilities. Evidence for the validity of a construct should not be limited to factor analysis of tests purported to measure that construct.
The reliability and validity of three short forms of the Dutch version of the Wechsler Memory Scale–Fourth Edition (WMS-IV-NL) were evaluated in a mixed clinical sample of 235 patients. The short forms were based on the WMS-IV Flexible Approach, that is, a 3-subtest combination (Older Adult Battery for Adults) and two 2-subtest combinations (Logical Memory and Visual Reproduction and Logical Memory and Designs), which can be used to estimate the Immediate, Delayed, Auditory and Visual Memory Indices. All short forms showed good reliability coefficients. As expected, for adults (16-69 years old) the 3-subtest short form was consistently more accurate (predictive accuracy ranged from 73% to 100%) than both 2-subtest short forms (range = 61%-80%). Furthermore, for older adults (65-90 years old), the predictive accuracy of the 2-subtest short form ranged from 75% to 100%. These results suggest that caution is warranted when using the WMS-IV-NL Flexible Approach short forms to estimate all four indices.
The triarchic model characterizes psychopathy in terms of three distinct dispositional constructs of boldness, meanness, and disinhibition. The model can be operationalized through scales designed specifically to index these domains or by using items from other inventories that provide coverage of related constructs. The present study sought to develop and validate scales for assessing the triarchic model domains using items from the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF). A consensus rating approach was used to identify items relevant to each triarchic domain, and following psychometric refinement, the resulting MMPI-2-RF-based triarchic scales were evaluated for convergent and discriminant validity in relation to multiple psychopathy-relevant criterion variables in offender and nonoffender samples. Expected convergent and discriminant associations were evident very clearly for the Boldness and Disinhibition scales and somewhat less clearly for the Meanness scale. Moreover, hierarchical regression analyses indicated that all MMPI-2-RF triarchic scales incremented standard MMPI-2-RF scale scores in predicting extant triarchic model scale scores. The widespread use of MMPI-2-RF in clinical and forensic settings provides avenues for both clinical and research applications in contexts where traditional psychopathy measures are less likely to be administered.
This study examined measurement invariance of the NEO Five-Factor Inventory (NEO-FFI), assessing the five-factor model (FFM) of personality among Euro American (N = 290) and Asian international (N = 301) students (47.8% women, Mage = 19.69 years). The full 60-item NEO-FFI data fit the expected five-factor structure for both groups using exploratory structural equation modeling, and achieved configural invariance. Only 37 items significantly loaded onto the FFM-theorized factors for both groups and demonstrated metric invariance. Threshold invariance was not supported with this reduced item set. Groups differed the most in the item–factor relationships for Extraversion and Agreeableness, as well as in response styles. Asian internationals were more likely to use midpoint responses than Euro Americans. While the FFM can characterize broad nomothetic patterns of personality traits, metric invariance with only the subset of NEO-FFI items identified limits direct group comparisons of correlation coefficients among personality domains and with other constructs, and of mean differences on personality domains.
Item response theory (IRT) was separately applied to parent- and teacher-rated symptoms of attention-deficit/hyperactivity disorder (ADHD) from a pooled sample of 526 six- to twelve-year-old children with and without ADHD. The dimensional structure of ADHD was first examined using confirmatory factor analyses, including the bifactor model. A general ADHD factor and two group factors, representing inattentive and hyperactive/impulsive dimensions, optimally fit the data. Using the graded response model, we estimated discrimination and location parameters and information functions for all 18 symptoms of ADHD. Parent- and teacher-rated symptoms demonstrated adequate discrimination and location values, although these estimates varied substantially. For parent ratings, the test information curve peaked between –2 and +2 SD, suggesting that ADHD symptoms exhibited excellent overall reliability at measuring children in the low to moderate range of the general ADHD factor, but not in the extreme ranges. Similar results emerged for teacher ratings, in which the peak range of measurement precision was from –1.40 to 1.90 SD. Several symptoms were comparatively more informative than others; for example, is often easily distracted ("Distracted") was the most informative parent- and teacher-rated symptom across the latent trait continuum. Clinical implications for the assessment of ADHD as well as relevant considerations for future revisions to diagnostic criteria are discussed.
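To make the notion of item and test information under the graded response model concrete, here is a minimal Python sketch of Samejima's logistic form. It is an illustration under stated assumptions rather than the authors' analysis: the function names and the item parameters are hypothetical, and thresholds are assumed to be supplied in ascending order.

```python
import math


def _boundary_prob(theta, a, b):
    """Probability of responding at or above the threshold b (logistic curve)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def grm_item_information(theta, a, thresholds):
    """Item information at trait level theta for one graded response item.

    a: discrimination parameter; thresholds: ascending location parameters.
    """
    # Cumulative probabilities: 1 below the lowest category, 0 above the highest.
    cum = [1.0] + [_boundary_prob(theta, a, b) for b in thresholds] + [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]  # probability of category k
        if p_k <= 0.0:
            continue  # skip empty categories to avoid division by zero
        # Derivative of the category probability with respect to theta.
        d_k = a * (cum[k] * (1 - cum[k]) - cum[k + 1] * (1 - cum[k + 1]))
        info += d_k ** 2 / p_k
    return info


def total_information(theta, items):
    """Test information: sum of item information over (a, thresholds) pairs."""
    return sum(grm_item_information(theta, a, bs) for a, bs in items)


# Hypothetical item parameters, not estimates from the study.
items = [(2.0, [0.0]), (1.5, [-1.0, 0.0, 1.0])]
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(total_information(theta, items), 3))
```

With these made-up parameters, information is highest near the middle of the trait range and falls off toward the extremes, which is the general pattern the abstract describes for the symptom ratings.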
Social anhedonia and social anxiety are two constructs with similar behaviors including avoidance of and withdrawal from social situations. In three studies, the current research aimed to test whether social anhedonia could be discriminated from social anxiety using the most common measure of social anhedonia, the Revised Social Anhedonia Scale (RSAS). In Study 1, an item-level factor analysis of the RSAS found two factors: Social Apathy/Aversion and Social Withdrawal. In Study 2, this two-factor structure was confirmed in a separate sample. In Study 3, a model with social anhedonia and anxiety scale scores loading on separate factors fit better than a model with social anhedonia and anxiety loading on a single factor. Social anhedonia and anxiety displayed differential associations with negative schizotypy and emotion processing. Findings suggest that the RSAS is successful in measuring social anhedonia distinct from social anxiety.
As the construct of moral injury has gained increased conceptual and empirical attention among military personnel and veterans, preliminary attempts to operationalize and measure the construct have emerged. One such measure is the Moral Injury Event Scale (MIES). The aim of the current study was to further evaluate the MIES’s psychometric properties in two military samples: a clinical sample of Air Force personnel and a nonclinical sample of Army National Guard personnel. Exploratory and confirmatory factor analyses across both samples supported a three-factor solution: transgressions by others, transgressions by self, and betrayal. Transgressions-Others was most strongly associated with posttraumatic stress; Transgressions-Self was most strongly associated with hopelessness, pessimism, and anger; and Betrayal was most strongly associated with posttraumatic stress and anger. Results support the construct validity of the MIES, although areas for improvement are indicated and discussed.
Kwan, John, Kenny, Bond, and Robins conceptualize self-enhancement as a favorable comparison of self-judgments with judgments of and by others. Applying a modified version of Kwan et al.’s approach to behavior observation data, we show that the resulting measure of self-enhancement bias is highly reliable, predicts self-ratings of intelligence as well as does actual intelligence, interacts with item desirability in predicting responses to questionnaire items, and also predicts general life satisfaction. Consistent with previous research, however, self-ratings of intelligence did not become more valid when controlling for self-enhancement bias. We also show that common personality scales like the Rosenberg Self-Esteem Scale reflect self-enhancement at least as strongly as do scales that were designed particularly for that purpose (i.e., "social desirability scales"). The relevance of these findings in regard to the validity and utility of social desirability scales is discussed.
In the present study, we report on the development and initial psychometric properties of the Family Aggression Screening Tool (FAST). The FAST is a brief, self-report tool that makes use of pictorial representations to assess experiences of caregiver aggression, including direct victimization and exposure to intimate partner violence. It is freely available on request and takes under 5 minutes to complete. Psychometric properties of the FAST were investigated in a sample of 168 high-risk youth aged 16 to 24 years. For validation purposes, maltreatment history was assessed using the Childhood Trauma Questionnaire; levels of current psychiatric symptoms were also assessed. Internal consistency of the FAST was good. Convergent validity was supported by strong and discriminative associations with corresponding Childhood Trauma Questionnaire subscales. The FAST also correlated significantly with multi-informant reports of psychiatric symptomatology. Initial findings provide support for the reliability and validity of the FAST as a brief, pictorial screening tool of caregiver aggression.
Extant theory posits well-differentiated dimensions of perceived social support as measured using the Social Provisions Scale (SPS). However, evidence is inconsistent with this multidimensionality perspective, with SPS factor correlations near unity and higher between-factor than within-factor item correlations. This article reports on research investigating the internal structure, gender invariance, and predictive validity of SPS scores. The analyses are conducted in a novel bifactor exploratory structural equation modeling (ESEM) framework, which is designed to account for presumed psychometric multidimensionality in SPS items due to (a) their fallibility as pure indicators of the constructs they are purported to measure and (b) the coexistence of general and specific factors. Based on 376 item responses, evidence was obtained for a bifactor-ESEM representation of the SPS data. In addition, support was found for the invariance of item thresholds and the latent mean invariance of six of the seven SPS factors in the retained solution. Only mean levels of Social Integration were found to differ by gender, with men scoring higher than women. Finally, evidence was obtained for the predictive validity of SPS scores with respect to loneliness and psychological well-being. Quite apart from yielding evidence validating the SPS, this research demonstrates the utility of bifactor ESEM for psychological assessment.
We investigated the impact of ego depletion on selected Rorschach cognitive processing variables and self-reported affect states. Research indicates acts of effortful self-regulation transiently deplete a finite pool of cognitive resources, impairing performance on subsequent tasks requiring self-regulation. We predicted that relative to controls, ego-depleted participants’ Rorschach protocols would have more spontaneous reactivity to color, less cognitive sophistication, and more frequent logical lapses in visualization, whereas self-reports would reflect greater fatigue and less attentiveness. The hypotheses were partially supported; despite a surprising absence of self-reported differences, ego-depleted participants had Rorschach protocols with lower scores on two variables indicative of sophisticated combinatory thinking, as well as higher levels of color receptivity; they also had lower scores on a composite variable computed across all hypothesized markers of complexity. In addition, self-reported achievement striving moderated the effect of the experimental manipulation on color receptivity, and in the Depletion condition it was associated with greater attentiveness to the tasks, more color reactivity, and less global synthetic processing. Results are discussed with an emphasis on the response process, methodological limitations and strengths, implications for calculating refined Rorschach scores, and the value of using multiple methods in research and experimental paradigms to validate assessment measures.
The Booklet Category Test (BCT) is a neuropsychological test of cognitive dysfunction that provides only one overall error score indicative of global impairment. It does not, however, delineate specific domains that might be impaired. The aim of this study is to concurrently validate 13 new BCT subscales using legacy instruments in patients with nonpenetrating traumatic brain injury (TBI). Eighty-nine patients with mild, moderate, and severe TBI completed a battery of neuropsychological tests. Partial correlations controlling for age revealed significant correlations between the a priori selected scores from legacy measures of major cognitive domains and both BCT total errors and subscale scores. Additional analyses showed that several subscales were able to differentiate between performance levels on the legacy measures. Overall, our results showed that the subscales measured cognitive skills beyond global impairment, supporting the use of the BCT subscales in a population with TBI.
The current study examined the psychometric characteristics of the Chinese translation of the Marital Satisfaction Inventory–Revised (MSI-R) in a community sample of 117 couples from Taiwan. The Chinese MSI-R demonstrated moderate to strong internal consistency. Confirmatory factor analysis revealed similar scale factor structures in the Taiwanese and U.S. standardization samples. Mean profile comparisons between the current Taiwanese sample and the original MSI-R standardization sample revealed statistically significant but small differences on several subscales. Overall, the psychometric characteristics of the Chinese MSI-R lend support to its use with couples from diverse cultural backgrounds whose sole or preferred language is Chinese. It may also be appropriate to use the MSI-R in clinical settings for prevention or intervention efforts directed at Chinese-speaking couples. The implications of these findings for clinical and research purposes are discussed.
In our article "Evidence for the Criterion Validity and Clinical Utility of the Pathological Narcissism Inventory" (2012), we provided incorrect values for the r_contrast-CV coefficients we presented in Table 1. In the current report, we provide correct r_contrast-CV values in Table 1 and discuss the implications of our updated results, particularly with respect to how these results differ from our initial report.
Discrete forms of repetitive thought (RT), such as worry and reflection, can be characterized along basic dimensions of valence (positive vs. negative) and purpose (searching vs. solving). In addition, people can be characterized as high or low in their tendency to engage in RT. This dimensional model has been demanding to assess, and a smaller number of items that could stand in for a large battery would make measurement more accessible. Using four samples (N = 1,588), eight items that assess RT valence, purpose, and total in a circumplex model were identified. Across these and other samples, the dimensions were adequately reliable and valid with regard to assessment via a large RT battery, other measures of RT, and depressive symptoms. The accessibility of dimensional assessment of RT using this smaller number of items should facilitate work on questions about the qualities of RT that predict mental and physical health.
Two studies were conducted to identify and cross-validate cutoff scores on the Wechsler Adult Intelligence Scale–Fourth Edition Digit Span–based embedded performance validity (PV) measures for individuals with schizophrenia spectrum disorders. In Study 1, normative scores were identified on Digit Span–embedded PV measures among a sample of patients (n = 84) with schizophrenia spectrum diagnoses who had no known incentive to perform poorly and who put forth valid effort on external PV tests. Previously identified cutoff scores resulted in unacceptable false positive rates and lower cutoff scores were adopted to maintain specificity levels ≥90%. In Study 2, the revised cutoff scores were cross-validated within a sample of schizophrenia spectrum patients (n = 96) committed as incompetent to stand trial. Performance on Digit Span PV measures was significantly related to Full Scale IQ in both studies, indicating the need to consider the intellectual functioning of examinees with psychotic spectrum disorders when interpreting scores on Digit Span PV measures.
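The cutoff revision described for Study 1 can be illustrated with a minimal sketch. It assumes that lower Digit Span scores signal invalid performance, that an examinee fails when the score is at or below the cutoff, and that specificity is the share of valid-effort examinees who pass. The function names and the scores are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def specificity_at_cutoff(valid_scores: Sequence[int], cutoff: int) -> float:
    """Share of valid-effort examinees who pass (score strictly above the cutoff)."""
    passed = sum(1 for score in valid_scores if score > cutoff)
    return passed / len(valid_scores)


def highest_cutoff_meeting_floor(valid_scores: Sequence[int],
                                 min_specificity: float = 0.90) -> Optional[int]:
    """Largest cutoff (fail if score <= cutoff) whose specificity stays at or above the floor."""
    best = None
    # Specificity can only fall as the cutoff rises, so stop at the first violation.
    for cutoff in sorted(set(valid_scores)):
        if specificity_at_cutoff(valid_scores, cutoff) < min_specificity:
            break
        best = cutoff
    return best


# Hypothetical scores from 20 examinees who passed external validity tests.
toy_scores = [4, 5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11, 11, 12, 13]
chosen = highest_cutoff_meeting_floor(toy_scores)
print(chosen, specificity_at_cutoff(toy_scores, chosen))
```

With these toy scores the sketch selects a cutoff of 5, at which 18 of the 20 valid performers pass. Taking the highest qualifying cutoff is a design choice that keeps sensitivity as high as the specificity floor allows.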
This study investigated the stability of extreme response style (ERS) and acquiescence response style (ARS) over a period of 8 years. ERS and ARS were measured with item sets drawn randomly from a large pool of items used in an ongoing German panel study. Latent-trait-state-occasion and latent-state models were applied to test the relationship between time-specific (state) response style behaviors and time-invariant trait components of response styles. The results show that across different random item samples, on average between 49% and 59% of the variance in the state response style factors was explained by the trait response style factors. This indicates that the systematic differences respondents show in their preferences for certain response categories are remarkably stable over a period of 8 years. The stability of ERS and ARS implies that it is important to consider response styles in the analysis of self-report data from polytomous rating scales, especially in longitudinal studies aimed at investigating stability in substantive traits. Furthermore, the stability of response styles raises the question of how far they might themselves be considered trait-like latent variables of substantive interest.
The Strengths and Difficulties Questionnaire (SDQ) is a widely used psychopathology screening tool that measures children’s emotional symptoms, peer problems, conduct problems, hyperactivity/inattention, and prosocial behavior. Previous psychometric studies of the SDQ focused primarily on older children in Western cultures and suffered from several methodological limitations. This study examined the reliability, factor structure, and convergent and discriminant validity of the SDQ by focusing on young Asian American children and using more rigorous methods. The five-factor structure of the SDQ was confirmed by confirmatory factor analysis. The coefficients indicated adequate reliability for all subscales except parent-rated peer problems and conduct problems. The correlated trait–correlated method minus one multitrait–multimethod model provided evidence for convergent validity and discriminant validity of all subscales except for conduct problems relative to hyperactivity/inattention. This study provided new evidence for the psychometric properties of the SDQ in young children and the cultural suitability of the SDQ for Asian Americans.
The reliability of six Minnesota Multiphasic Personality Inventory–Second edition (MMPI-2) computer-based test interpretation (CBTI) programs was evaluated across a set of 20 commonly appearing MMPI-2 profile codetypes in clinical settings. Evaluation of CBTI reliability comprised examination of (a) interrater reliability, the degree to which raters arrive at similar inferences based on the same CBTI profile and (b) interprogram reliability, the level of agreement across different CBTI systems. Profile inferences drawn by four raters were operationalized using q-sort methodology. Results revealed no significant differences overall with regard to interrater and interprogram reliability. Some specific CBTI/profile combinations (e.g., the CBTI by Automated Assessment Associates on a within normal limits profile) and specific profiles (e.g., the 4/9 profile displayed greater interprogram reliability than the 2/4 profile) were interpreted with variable consensus (α range = .21-.95). In practice, users should consider that certain MMPI-2 profiles are interpreted more or less consensually and that some CBTIs show variable reliability depending on the profile.
Test norms enable determining the position of an individual test taker in the group. The most frequently used approach to obtain test norms is traditional norming. Regression-based norming may be more efficient than traditional norming and is rapidly growing in popularity, but little is known about its technical properties. A simulation study was conducted to compare the sample size requirements for traditional and regression-based norming by examining the 95% interpercentile ranges for percentile estimates as a function of sample size, norming method, size of covariate effects on the test score, test length, and number of answer categories in an item. Provided the assumptions of the linear regression model hold in the data, for a subdivision of the total group into eight equal-size subgroups, we found that regression-based norming requires samples 2.5 to 5.5 times smaller than traditional norming. Sample size requirements are presented for each norming method, test length, and number of answer categories. We emphasize that additional research is needed to establish sample size requirements when the assumptions of the linear regression model are violated.
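The contrast between the two norming approaches can be sketched as follows. This is an illustration under stated assumptions, not the simulation design of the study: it uses a single covariate (age), ordinary least squares, and normally distributed, homoscedastic residuals. All function names and data values are hypothetical.

```python
from statistics import NormalDist, mean
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares slope and intercept for one covariate."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def traditional_percentile(ages, scores, band, score) -> float:
    """Traditional norming: percentage of the age-band subgroup scoring below `score`."""
    low, high = band
    group = [s for a, s in zip(ages, scores) if low <= a < high]
    return 100 * sum(1 for s in group if s < score) / len(group)


def regression_percentile(ages, scores, age, score) -> float:
    """Regression-based norming: percentile of the standardized residual."""
    slope, intercept = fit_line(ages, scores)
    residuals = [s - (intercept + slope * a) for a, s in zip(ages, scores)]
    # Residual SD with n - 2 degrees of freedom (two estimated coefficients).
    resid_sd = (sum(r ** 2 for r in residuals) / (len(residuals) - 2)) ** 0.5
    z = (score - (intercept + slope * age)) / resid_sd
    return 100 * NormalDist().cdf(z)


# Hypothetical norming sample: one test taker per age, scores rising with age.
ages = [6, 7, 8, 9, 10, 11, 12, 13]
scores = [10, 13, 13, 16, 18, 19, 23, 24]

print(traditional_percentile(ages, scores, (6, 10), 14))
print(regression_percentile(ages, scores, 9, 14))
```

The traditional estimate draws only on the four test takers in the 6 to 9 age band, whereas the regression-based estimate uses all eight. That pooling is the intuition behind the smaller sample size requirement, and it holds only when the regression assumptions are met.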
Anger has high prevalence in clinical and forensic settings, and it is associated with aggressive behavior and ward atmosphere on psychiatric units. Dysregulated anger is a clinical problem in Danish mental health care systems, but no anger assessment instruments have been validated in Danish. Because the Novaco Anger Scale and Provocation Inventory (NAS-PI) has been extensively validated with different clinical populations and lends itself to clinical case formulation, it was selected for translation and evaluation in the present multistudy project. Psychometric properties of the NAS-PI were investigated with samples of 477 nonclinical, 250 clinical, 167 male prisoner, and 64 male forensic participants. Anger prevalence and its relationship with other anger measures, anxiety/depression, and aggression were examined. NAS-PI was found to have high reliability, concurrent validity, and discriminant validity, and its scores discriminated the samples. High scores in the offender group demonstrated the feasibility of obtaining self-report assessments of anger with this population. Retrospective and prospective validity of the NAS were tested with the forensic patient sample regarding physically aggressive behavior in hospital. Regression analyses showed that higher scores on NAS increase the risk of having acted aggressively in the past and of acting aggressively in the future.
Background. In recent years, a number of studies focusing on the evaluation of neuropsychological deficits in individuals with schizophrenia have shown deficits that include several cognitive functions. Attention deficits as well as memory or executive function deficits are common in this kind of disorder together with sustained attention problems, working memory deficiencies, and problem-solving difficulties, among many others. Currently, the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) is gaining special importance in the evaluation of the cognitive deficits associated with schizophrenia. Method. In this article, we describe an RBANS screening in a sample of 88 Spanish patients diagnosed with schizophrenia. We also aimed to check the battery’s reliability, sensitivity, and specificity in the studied sample. We performed a comparative study with 88 healthy participants. Results. The results showed a reliability index value of α = .795 and an item value of α = .762. For total test reliability, we obtained an index value of α = .761 and an item value of α = .762. Sensitivity score was 87.5% and specificity 86.4%. Conclusions. RBANS obtained good reliability, sensitivity, and specificity scores and represents a good screening tool in detecting cognitive deficits associated with schizophrenia.
This study examined the psychometric properties and diagnostic accuracy of the Swedish translations of the Spence Children’s Anxiety Scale, self- and parent report versions, in a sample of 104 adolescents presenting at two general psychiatric outpatient units. Results showed high informant agreement and good internal reliability and concurrent and discriminant validity for both versions and demonstrated that this scale can distinguish between adolescents with and without an anxiety disorder in a non–anxiety-specific clinical setting. The relative clinical utility of different cutoff scores was compared by looking at the extent to which dichotomized questionnaire results altered the pretest probability of the presence of a diagnosis as defined by the Schedule for Affective Disorders and Schizophrenia for School-Age Children. Optimized for screening and diagnostic purposes in Sweden, cutoff scores obtained in the current study outperformed a previously identified cutoff score derived from an Australian community sample. The Spence Children’s Anxiety Scale is a useful clinical instrument for the assessment of anxiety in adolescents.
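How a dichotomized questionnaire result shifts the pretest probability of a diagnosis can be shown with the standard likelihood ratio form of Bayes' theorem. The sketch below is a generic illustration. The sensitivity, specificity, and pretest probability are hypothetical values, not figures from the study.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Positive and negative likelihood ratios for a dichotomized test result."""
    positive = sensitivity / (1 - specificity)
    negative = (1 - sensitivity) / specificity
    return positive, negative


def posttest_probability(pretest: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Hypothetical cutoff with sensitivity .80 and specificity .90, pretest probability .50.
lr_pos, lr_neg = likelihood_ratios(0.80, 0.90)
after_positive = posttest_probability(0.50, lr_pos)
after_negative = posttest_probability(0.50, lr_neg)
print(round(after_positive, 3), round(after_negative, 3))
```

Comparing candidate cutoffs then amounts to comparing how far each one moves the probability of a diagnosis after a positive and after a negative result.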
The current research sought to validate the Chernyshenko Conscientiousness Scales (CCS), a novel measure designed to assess six facets of conscientiousness. Data from 7,569 U.S. participants and 649 U.K. participants were analyzed to assess the internal reliability and factorial structure of the scales. Test–retest reliability, convergent and divergent validity, and criterion-related validity were also evaluated using a separate U.K. sample (n = 118; n = 80 for test–retest). The results showed that those items designed to measure industriousness, order, self-control, traditionalism, and virtue were best represented by a five-factor structure, broadly consistent with the five scales. However, the content and structure of the responsibility scale requires further investigation. Overall, the CCS has the potential to be a useful alternative to the faceted measures of conscientiousness that are currently available. However, future research is required to refine a number of problematic items and to clarify which facets can be better described as interstitial dimensions between conscientiousness and other Big Five domains.
The Cambridge Neuropsychological Test Automated Battery (CANTAB) is a semiautomated computer interface for assessing cognitive function. We examined whether CANTAB tests measured specific cognitive functions, using established neuropsychological tests as a reference point. A sample of 500 healthy older participants (M = 60.28 years, SD = 6.75) in the Tasmanian Healthy Brain Project completed a battery of CANTAB subtests and standard paper-based neuropsychological tests. Confirmatory factor analysis identified four factors: processing speed, verbal ability, episodic memory, and working memory. However, CANTAB tests did not consistently load onto the cognitive domain factors derived from traditional measures of the same function. These results indicate that five of the six CANTAB subtests examined did not load onto single cognitive functions. These CANTAB tests may lack the sensitivity to measure discrete cognitive functions in healthy populations or may measure other cognitive domains not included in the traditional neuropsychological battery.
The Five-Factor Borderline Inventory (FFBI) is a 120-item dimensional measure of borderline personality disorder (BPD) that was developed from the description of BPD from the perspective of the Five-Factor Model. The FFBI includes 12 subscales and 1 total score. The current study created a short form of the FFBI (FFBI-SF) using item response theory analyses based on an undergraduate student sample that completed the FFBI. Based on the results, the final FFBI-SF included 48 items, with 4 items per subscale. The construct validity of the short form was compared with the original FFBI in five additional samples. The FFBI-SF showed strong convergence with other BPD scales and comparable convergent and discriminant validity with the FFM compared with the FFBI. The correlational profiles generated by the total score and subscales were highly convergent. Results of the current study suggest that the FFBI-SF may be an accessible and useful assessment tool of BPD.
The purpose of the current study was to identify Minnesota Multiphasic Personality Inventory-2–Restructured Form (MMPI-2-RF) correlates of police officer integrity violations and other problem behaviors in an archival database with original MMPI item responses and collateral information regarding integrity violations obtained for 417 male officers. In Study 1, we estimated MMPI-2-RF scores from the MMPI item pool (which includes approximately 80% of the MMPI-2-RF items) in a normative sample, a psychiatric inpatient sample, and a police officer sample, and conducted analyses that demonstrated the comparability of estimated and full scale scores for 41 of the 51 MMPI-2-RF scales. In Study 2, we correlated estimated MMPI-2-RF scores with information about subsequent integrity violations and problem behaviors from the integrity violation data set. Several meaningful associations were obtained, predominately with scales from the emotional, thought, and behavioral dysfunction domains of the MMPI-2-RF. Application of a correction for range restriction yielded substantially improved validity estimates. Finally, we calculated relative risk ratios for the statistically significant findings using cutoffs lower than 65T, which is traditionally used to identify clinically significant elevations, and found several meaningful relative risk ratios.
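The relative risk ratio used in Study 2 compares the rate of an outcome among officers scoring at or above a cutoff with the rate among those scoring below it. The following minimal sketch assumes that definition. The cutoff of 60T and the records are hypothetical illustrations, not data or decision rules from the study.

```python
from typing import List, Tuple


def relative_risk_at_cutoff(records: List[Tuple[int, bool]], cutoff: int) -> float:
    """Risk of the outcome at or above the cutoff divided by the risk below it.

    Each record is (T score, outcome occurred). Both groups must be non-empty
    and the lower group must contain at least one outcome, or division fails.
    """
    above = [outcome for t_score, outcome in records if t_score >= cutoff]
    below = [outcome for t_score, outcome in records if t_score < cutoff]
    risk_above = sum(above) / len(above)
    risk_below = sum(below) / len(below)
    return risk_above / risk_below


# Hypothetical officers: (estimated scale T score, later integrity violation).
toy_records = [
    (62, True), (66, False), (70, True), (61, False),
    (45, False), (50, False), (55, True), (58, False), (40, False), (52, False),
]
rr = relative_risk_at_cutoff(toy_records, cutoff=60)
print(round(rr, 2))
```

In this toy example two of four officers at or above the cutoff have the outcome, against one of six below it, so the risk in the elevated group is three times that of the comparison group.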
Two studies were conducted to examine the factor structure of attitude toward ambiguity, a broad personality construct that refers to personal reactions to perceived ambiguous stimuli in a variety of contexts and situations. Using samples from two countries, Study 1 mapped the hierarchical structure of 133 items from seven tolerance–intolerance of ambiguity scales (N = 360, Italy; N = 306, United States). Three major factors (Discomfort with Ambiguity, Moral Absolutism/Splitting, and Need for Complexity and Novelty) were recovered in each country with high replicability coefficients across samples. In Study 2 (N = 405, Italian community sample; N = 366, English native speakers sample), we carried out a confirmatory analysis on selected factor markers. A bifactor model had an acceptable fit for each sample and reached construct-level invariance for general and group factors. Convergent validity with related traits was assessed in both studies. We conclude that attitude toward ambiguity can be best represented as a multidimensional construct involving affective (Discomfort with Ambiguity), cognitive (Moral Absolutism/Splitting), and epistemic (Need for Complexity and Novelty) components.
This article investigated the accuracy of six short forms of the Dutch Wechsler Preschool and Primary Scale of Intelligence–Third edition (WPPSI-III-NL) in estimating intelligence quotient (IQ) scores in healthy children aged 4 to 7 years (N = 1,037). Overall accuracy for each short form was studied by comparing IQ equivalents based on the short forms with the original WPPSI-III-NL Full Scale IQ (FSIQ) scores. Next, our sample was divided into three groups (children performing below average, average, or above average, based on the WPPSI-III-NL FSIQ estimates of the original long form) to study the accuracy of WPPSI-III-NL short forms at the tails of the FSIQ distribution. In the entire sample, all IQ estimates of the WPPSI-III-NL short forms correlated highly with the FSIQ estimates of the original long form (all rs ≥ .83). Correlations decreased significantly when only the tails of the IQ distribution were studied (rs varied between .55 and .83). Furthermore, IQ estimates of the short forms deviated significantly from the FSIQ score of the original long form when the IQ estimates were based on short forms containing only two subtests. Unlike the short forms that contained two to four subtests, the Wechsler Abbreviated Scale of Intelligence short form (containing the subtests Vocabulary, Similarities, Block Design, and Matrix Reasoning) and the General Ability Index short form (containing the subtests Vocabulary, Similarities, Comprehension, Block Design, Matrix Reasoning, and Picture Concepts) produced less variation when compared with the original FSIQ score.
Although there are many studies devoted to person-fit statistics to detect inconsistent item score patterns, most studies are difficult to understand for nonspecialists. The aim of this tutorial is to explain the principles of these statistics for researchers and clinicians who are interested in applying these statistics. In particular, we first explain how invalid test scores can be detected using person-fit statistics; second, we provide the reader practical examples of existing studies that used person-fit statistics to detect and to interpret inconsistent item score patterns; and third, we discuss a new R-package that can be used to identify and interpret inconsistent score patterns.
The factor structure and the convergent validity of the Personality Inventory for DSM-5 (PID-5), a self-report questionnaire designed to measure personality pathology as advocated in Section III of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), have already been demonstrated in general population samples but need replication in clinical samples. In 240 Flemish inpatients, we examined the factor structure of the PID-5 by means of exploratory structural equation modeling. Additionally, we investigated differences in PID-5 higher order domain scores according to gender, age, and educational level, and explored convergent and discriminant validity by relating the PID-5 to the Dimensional Assessment of Personality Pathology–Basic Questionnaire and by comparing PID-5 scores of inpatients with and without a DSM-IV categorical personality disorder diagnosis. Our results confirmed the original five-factor structure of the PID-5. The reliability and the convergent and discriminant validity of the PID-5 proved to be adequate. Implications for future research are discussed.
The Short-Term Assessment of Risk and Treatability (START) aims to assist mental health practitioners to estimate an individual’s short-term risk for a range of adverse outcomes via structured consideration of their risk ("Vulnerabilities") and protective factors ("Strengths") in 20 areas. It has demonstrated predictive validity for aggression but this is less established for other outcomes. We collated START assessments for N = 200 adults in a secure mental health hospital and ascertained 3-month risk event incidence using the START Outcomes Scale. The specific risk estimates, which are the tool developers’ suggested method of overall assessment, predicted aggression, self-harm/suicidality, and victimization, and had incremental validity over the Strength and Vulnerability scales for these outcomes. The Strength scale had incremental validity over the Vulnerability scale for aggressive outcomes; therefore, consideration of protective factors had demonstrable value in their prediction. Further evidence is required to support use of the START for the full range of outcomes it aims to predict.
We used integrated and conjoint confirmatory factor analysis of Shipley-2 and Wechsler Intelligence Scale for Children–Fourth Edition (WISC-IV) data to investigate constructs measured in the Shipley-2 for children and adolescents. We also estimated Shipley-2 composite reliability at the subtest level rather than the item level. The three Shipley-2 subtests for the most part measured what was described in the manual, although Block Patterns measured visual spatial ability in addition to fluid ability and Abstraction was best considered a measure of psychometric g. The g factors derived from the WISC-IV and Shipley-2 were similar but not identical. Internal reliability estimates for Shipley-2 composites that were based on correlations between the subtests were substantially lower than those based on the items. Last, based on WISC-IV derived g factors, 37% to 53% of the variance in Shipley-2 composites was explained by g. Some of the reliable variance in the Shipley-2 composites was due to something specific that the subtests had in common not explained by psychometric g.
Objective: The Quality Of Life after BRain Injury (QOLIBRI) consortium has developed a short six-item scale (QOLIBRI-OS) to screen health-related quality of life after traumatic brain injury. The goal of the current study is to examine further psychometric qualities of the Quality Of Life after BRain Injury-Overall Scale (QOLIBRI-OS) at the item level using Rasch analysis with particular emphasis on the operating characteristics of the items. Method: A total of 921 participants with traumatic brain injury were recruited. The analysis sample was restricted to 795 participants with Glasgow Coma Score and Glasgow Outcome Score–Extended available in order to ensure a well-characterized sample. Results: Overall fit statistics indicate sufficient reliability of the QOLIBRI-OS. The assumption of unidimensionality could be confirmed with reservation. The range of item locations is small, whereas item thresholds cover a wide range of the latent trait. The majority of parameter estimations for all class intervals of the respective test are in accordance with the model assumptions. Conclusion: The results show that, despite marginal misfits to the model, the six items representing the QOLIBRI-OS could establish a Rasch scale.
Researchers have repeatedly argued that it is important to determine whether the psychometric properties of an emotional competence measure hold in Eastern populations because there may be cultural variability in abilities linked with emotional competence. However, few studies have examined potential differences in an emotional competence measure in Eastern cultures. To fill this gap, we investigated the applicability of the Profile of Emotional Competence to a Japanese population. Results demonstrated measurement and structural invariance across our Japanese and the original Belgian data sets. As was found in the Belgian sample, this measure showed adequate convergent and criterion validity in the Japanese sample. Furthermore, the scores on this measure were stronger predictors of subjective health and happiness in the Japanese than in the Belgian population. This measure also showed incremental validity. Our results suggest that the Profile of Emotional Competence is applicable to the Japanese population, an Eastern society.
Low positive emotion distinguishes depression from most types of anxiety. Formative work in this area employed the Anhedonic Depression scale from the Mood and Anxiety Symptom Questionnaire (MASQ-AD), and the MASQ-AD has since become a popular measure of positive emotion, often used independently of the full MASQ. However, two key assumptions about the MASQ-AD—that it should be represented by a total scale score, and that it measures time-variant experiences—have not been adequately tested. The present study factor analyzed MASQ-AD data collected annually over 3 years (n = 618, mean age = 17 years at baseline), and then decomposed its stable and unstable components. The results suggested the data were best represented by a hierarchical structure, and that less than one quarter of the variance in the general factor fluctuated over time. The implications for interpreting past findings from the MASQ-AD, and for conducting future research with the scale, are discussed.
The National Institute of Mental Health Research Domain Criteria initiative (
The Narcissistic Personality Inventory (NPI) is currently the most widely used measure of narcissism in social/personality psychology. It is also relatively unique because it uses a forced-choice response format. We investigate the consequences of changing the NPI’s response format for item meaning and factor structure. Participants were randomly assigned to one of three conditions: 40 forced-choice items (n = 2,754), 80 single-stimulus dichotomous items (i.e., separate true/false responses for each item; n = 2,275), or 80 single-stimulus rating scale items (i.e., 5-point Likert-type response scales for each item; n = 2,156). Analyses suggested that the "narcissistic" and "nonnarcissistic" response options from the Entitlement and Superiority subscales refer to independent personality dimensions rather than high and low levels of the same attribute. In addition, factor analyses revealed that although the Leadership dimension was evident across formats, dimensions with entitlement and superiority were not as robust. Implications for continued use of the NPI are discussed.
Low empathy is a criterion for most externalizing disorders, and empathy training is a regular component of treatment for aggressive people, from school bullies to sex offenders. However, recent meta-analytic evidence suggests that current measures of empathy explain only 1% of the variance in aggressive behavior. A new assessment of empathy was developed to more fully represent the empathy construct and better predict important outcomes—particularly aggressive behavior and externalizing psychopathology. Across three independent samples (N = 210-708), the 36-item Affective and Cognitive measure of Empathy (ACME) was internally consistent, structurally reliable, and invariant across sex. The ACME bore significant associations to important outcomes, which were incremental relative to other measures of empathy and generalizable across sex. Importantly, the affective scales of the ACME—particularly a new "Affective Dissonance" scale—yielded moderate to strong associations with aggressive behavior and externalizing disorders. The ACME is a short, reliable, and useful measure of empathy.
The most commonly used risk assessment tools for predicting sexual violence focus almost exclusively on static, historical factors (e.g., characteristics of prior offences). Consequently, they are assumed to be unable to directly inform the selection of treatment targets or evaluate change. In this article, we argue that this limitation can be mitigated by using latent variable models as a framework to link historical risk factors to the psychological characteristics of offenders. Accordingly, we conducted a factor analysis of the 13 nonredundant items from the two most commonly used risk tools for sexual offenders (Static-99R and Static-2002R) to identify the psychological information contained in these tools. Three factors were identified: (a) persistence/paraphilia, a construct related to sexual criminality, especially of the pedophilic type; (b) youthful stranger aggression, a construct centered on young age and offence seriousness; and (c) general criminality, a construct that reflected the diversity and magnitude of criminal careers. These constructs predicted sexual recidivism with similar accuracy, but only youthful stranger aggression and general criminality predicted nonsexual recidivism. These results indicate that risk tools for sexual violence are multidimensional, and support a shift from a focus on atheoretical risk markers to the assessment of psychologically meaningful constructs.
Parent ratings of their children’s behavioral and emotional difficulties are commonly collected via the Strengths and Difficulties Questionnaire (SDQ). For the first time, this study addressed the issue of interparent agreement using a measurement invariance approach. Data from 695 English couples (mothers and fathers) who had rated the behavior of their 4.25-year-old child were used. Given the inconsistency of previous results about the SDQ factor structure, alternative measurement models were tested. A five-factor Exploratory Structural Equation Model allowing for nonzero cross-loadings fitted the data best. Subsequent invariance analyses revealed that the SDQ factor structure is adequately invariant across parents, with interrater correlations ranging from .67 to .78. Fathers reported significantly higher levels of child conduct problems, hyperactivity, and emotional symptoms, and lower levels of prosocial behavior. This suggests that mothers and fathers each provide unique information across a range of their child’s behavioral and emotional problems.
This study examined the utility of the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF) substantive scales in the prediction of premature termination and therapy no-shows while controlling for other relevant predictors in a university-based community mental health center, a sample at high risk of both premature termination and no-show appointments. Participants included 457 individuals seeking services from a university-based psychology clinic. Results indicated that Juvenile Conduct Problems (JCP) predicted premature termination and Behavioral/Externalizing Dysfunction and JCP predicted number of no-shows, when accounting for initial severity of illness, personality disorder diagnosis, therapist experience, and other related MMPI-2-RF scales. The MMPI-2-RF Aesthetic-Literary Interests scale also predicted number of no-shows. Recommendations for applying these findings in clinical practice are discussed.
The Five-Factor Model Rating Form (FFMRF) provides a brief, one-page assessment of the Five-Factor Model. An important and unique aspect of the FFMRF is that it is the only brief measure that includes scales for the 30 facets proposed by Costa and McCrae. The current study builds on existing validity support for the FFMRF by evaluating its factorial invariance across gender within a sample of 699 undergraduate students. Consistent with other measures of the Five-Factor Model, men scored lower than women on the domains of neuroticism, extraversion, agreeableness, and conscientiousness but slightly higher on openness. The novel contribution of the current study is the use of exploratory structural equation modeling to determine that the FFMRF displayed a five-factor structure that demonstrated strong measurement invariance across gender. This factorial invariance adds important support for the validity of the FFMRF as a self-report measure as it indicates that the scores assess the same latent constructs in men and women. Although future work is needed to clarify some facet-level findings and evaluate for potential predictive biases, the present results add to the increasing body of research supporting the validity of the FFMRF as a self-report measure of personality.
The purpose of this study was to investigate the predictive validity of the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF) in a sample of law enforcement officers. MMPI-2-RF scores were collected from preemployment psychological evaluations of 136 male police officers, and supervisor ratings of performance and problem behavior were subsequently obtained during the initial probationary period. The sample produced meaningfully lower and less variant substantive scale scores than the general population and the MMPI-2-RF Police Candidate comparison group, which significantly affected effect sizes for the zero-order correlations. After applying a correction for range restriction, MMPI-2-RF substantive scales demonstrated moderate to strong associations with criteria, particularly in the Emotional Dysfunction and Interpersonal Functioning domains. Relative risk ratio analyses showed that cutoffs of 45T and 50T maintained reasonable selection ratios because of the exceptionally low scores in this sample and were associated with significantly increased risk for problematic behavior. These results provide support for the predictive validity of the MMPI-2-RF substantive scales in this setting. Implications of these findings and limitations of these results are discussed.
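The range-restriction correction mentioned in the preceding abstract can be illustrated with a minimal sketch. The abstract does not say which correction was applied, so the formula below (Thorndike's Case II for direct restriction on the predictor) is an assumption, and the function name and all numeric inputs are hypothetical illustrations rather than values from the study.

```python
import math


def correct_range_restriction(r_restricted: float,
                              sd_unrestricted: float,
                              sd_restricted: float) -> float:
    """Thorndike Case II correction for direct range restriction.

    Assumption: this is one common correction; the abstract does not
    state which formula the authors used.
    """
    if sd_unrestricted <= 0 or sd_restricted <= 0:
        raise ValueError("standard deviations must be positive")
    # u is the ratio of the unrestricted to the restricted predictor SD.
    u = sd_unrestricted / sd_restricted
    r2 = r_restricted ** 2
    return (r_restricted * u) / math.sqrt(1.0 - r2 + r2 * u ** 2)


# Hypothetical inputs: an observed correlation of .20 in a sample whose
# scale SD is 6 T-score points, against a reference SD of 10.
corrected = correct_range_restriction(0.20, sd_unrestricted=10.0, sd_restricted=6.0)
print(f"observed r = 0.20, corrected r = {corrected:.3f}")
```

When the sample is less variable than the reference group (u greater than 1), the corrected coefficient is larger in magnitude than the observed one, which is why corrected and zero-order effect sizes can differ noticeably in preselected samples.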
Attention deficit/hyperactivity disorder (ADHD) is one of the most common psychiatric disorders in childhood and adolescence. Rating the severity of psychopathology and symptom load is essential in daily clinical practice and in research. The parent and teacher ADHD-Rating Scale (ADHD-RS) includes inattention and hyperactivity/impulsivity subscales and is one of the most frequently used scales in treatment evaluation of children with ADHD. An extended version, mADHD-RS, also includes an oppositional defiant disorder subscale. The partial credit Rasch model, which is based on item response theory, was used to test the psychometric properties of this scale in a sample of 566 Danish school children between 6 and 16 years of age. The results indicated that parents and teachers had different frames of reference when rating symptoms in the mADHD-RS. There was support for the unidimensionality of the three subscales when parent and teacher ratings were analyzed independently. Nonetheless, evidence for differential item functioning was found across gender and age for specific items within each of the subscales. The findings expand existing psychometric information about the mADHD-RS and support its use as a valid and reliable measure of symptom severity when used in age- and gender-stratified materials.
The Schedule for Nonadaptive and Adaptive Personality full-length (SNAP) and short versions (SNAP-SRF and SNAP-ORF) were developed as measures of normal-range and more pathological personality traits. This study investigated the validity of the SNAP Brief Self-Description Rating Form (SNAP-BSRF), an alternative version of the SNAP Self-Description Rating Form (SNAP-SRF) revised for further brevity. The scales of the SNAP-BSRF showed good convergence with the SNAP-SRF and the SNAP Other-Description Rating Form (SNAP-ORF) scales. A three-factor structure consistent with extant literature was found for the SNAP-BSRF using an exploratory structural equation modeling approach. Scales from the SNAP-BSRF showed meaningful associations with self-reported internalizing symptoms. Results suggest that this new version is a reasonable substitute for the SNAP-SRF and will be useful when a very brief measure of adaptive and maladaptive personality is needed.
Recent work has extended the idea of implicit self-theories to the realm of emotion to assess beliefs in the malleability of emotions. The current article expanded on prior measurement of emotion beliefs in a scale development project. Items were tested and revised over rounds of data collection with both students and nonstudent adult online participants. Exploratory and confirmatory factor analyses revealed a three-factor structure. The resulting scale, the Emotion and Regulation Beliefs Scale, assesses beliefs that emotions can hijack self-control, beliefs that emotion regulation is a worthwhile pursuit, and beliefs that emotions can constrain behavior. Preliminary findings suggest that the Emotion and Regulation Beliefs Scale has good internal consistency, is conceptually distinct from measures assessing individuals’ beliefs in their management of emotions and facets of emotional intelligence, and predicts clinically relevant outcomes even after controlling for an existing short measure of beliefs in emotion controllability.
Elevated levels of irritability have been reported across a range of psychiatric and medical conditions. However, research on the causes, consequences, and treatments of irritability has been hindered by limitations in existing measurement tools. This study aimed to develop a brief, reliable, and valid self-report measure of irritability that is suitable for use among both men and women and that displays minimal overlap with related constructs. First, 63 candidate items were generated, including items from two recent irritability scales. Second, 1,116 participants (877 university students and 229 chronic pain outpatients) completed a survey containing the irritability item pool and standardized measures of related constructs. Item response theory was used to develop a five-item scale (the Brief Irritability Test) with a strong internal structure. All five items displayed minimal conceptual overlap with related constructs (e.g., depression, anger), and test scores displayed negligible gender bias. The Brief Irritability Test shows promise in helping to advance the burgeoning field of irritability research.
Perfectionism cognitions capture automatic perfectionistic thoughts and have explained variance in psychological adjustment and maladjustment beyond trait perfectionism. The aim of the present research was to investigate whether a multidimensional assessment of perfectionism cognitions has advantages over a unidimensional assessment. To this aim, we examined in a sample of 324 university students how the Perfectionism Cognitions Inventory (PCI) and the Multidimensional Perfectionism Cognitions Inventory (MPCI) explained variance in positive affect, negative affect, and depressive symptoms when factor or subscale scores were used as predictors compared to total scores. Results showed that a multidimensional assessment (PCI factor scores, MPCI subscale scores) explained more variance than a unidimensional assessment (PCI and MPCI total scores) because, when the different dimensions were entered simultaneously as predictors, perfectionistic strivings cognitions and perfectionistic concerns cognitions acted as mutual suppressors thereby increasing each others’ predictive validity. With this, the present findings provide evidence that—regardless of whether the PCI or the MPCI is used—a multidimensional assessment of perfectionism cognitions has advantages over a unidimensional assessment in explaining variance in psychological adjustment and maladjustment.
Background. The Inventory of Callous-Unemotional Traits (ICU) is a self- and other report questionnaire of callous-unemotional behaviors that is increasingly widely used in research and clinical settings. Nonetheless, questions about the factor structure and validity of scales remain. Method. This study provided the first large-scale (N = 1,078) investigation of the parent report version of the ICU in a community sample of school-age (first-grade) children. Results. Confirmatory factor analysis indicated that a two-factor model that distinguished empathic-prosocial (EP) from callous-unemotional (CU) behaviors provided the best fit to the data. EP and CU were moderately to strongly correlated with each other ( = –.67, p < .001) and with oppositional defiant disorder and conduct disorder (ODD/CD) behaviors (ODD/CD, EP = –.55; ODD/CD, CU = .71, ps < .001). Individual differences in EP and CU behaviors explained unique variation, beyond that attributable to ODD/CD behaviors, in peer-, teacher-, and parent relationship quality. Moreover, whereas EP moderated the effects of ODD/CD in the prediction of student–teacher relationship quality, CU moderated the effects of ODD/CD in the prediction of peer and parent relationship quality. Conclusions. Results are discussed with respect to the use of the ICU with school-age children.
When questionnaire data with an ordered polytomous response format are analyzed in the framework of item response theory using the partial credit model or the generalized partial credit model, reversed thresholds may occur. This led to the discussion of whether reversed thresholds violate model assumptions and indicate disordering of the response categories. Adams, Wu, and Wilson showed that reversed thresholds are merely a consequence of low frequencies in the categories concerned and that they do not affect the order of the rating scale. This article applies an empirical approach to elucidate the topic of reversed thresholds using data from the Revised NEO Personality Inventory as well as a simulation study. It is shown that categories differentiate between participants with different trait levels despite reversed thresholds and that category disordering can be analyzed independently of the ordering of the thresholds. Furthermore, we show that reversed thresholds often only occur in subgroups of participants. Thus, researchers should think more carefully about collapsing categories due to reversed thresholds.
The Memory for Intentions Screening Test (MIST) is a clinical measure of prospective memory. There is emerging support for the sensitivity and ecological relevance of the MIST in clinical populations. In the present study, the construct validity of the MIST was evaluated in 40 younger (18-30 years), 24 young-old (60-69 years), and 37 old-old (70+ years) healthy adults. Consistent with expectations derived from the prospective memory and aging literature, older adults demonstrated lower scores on the MIST’s primary scale scores (particularly on the time-based scale), but slightly better performance on the seminaturalistic 24-hour trial. Among the healthy older adults, the MIST showed evidence of both convergent (e.g., verbal fluency) and divergent (e.g., visuoperception) correlations with standard clinical tests, although the magnitude of those correlations was comparable across the time- and event-based scales. Together, these results support the discriminant and convergent validity of the MIST as a measure of prospective memory in healthy older adults.
Objective. The study assesses the reliability and validity of a new Online Continuous Performance Test (OCPT) for measuring sustained attention, response inhibition, and response time consistency among children. Method. The study sample comprised 73 children (6-13 years), 47 children with attention deficit hyperactivity disorder and 24 in the control group. The Diagnostic Interview Schedule for Children was administered to participants’ parents to confirm group allocation. Children completed the OCPT in a laboratory setting, and a week later completed the OCPT at home. Results. Split-half correlation coefficients reflected high levels of reliability in the laboratory and at home. Significant correlations were found between the laboratory- and home-based OCPT scores. Significant differences in OCPT performance were found between children with and without attention deficit hyperactivity disorder on the OCPT in the two settings. Conclusions. These results support the reliability and validity of the OCPT and suggest that it may serve as an effective tool for the assessment of children’s attention function in naturalistic settings.
The performance of 100 patients with traumatic brain injury (TBI) on the Wechsler Adult Intelligence Scale–Fourth Edition (WAIS-IV) was compared with that of 100 demographically matched neurologically healthy controls. Processing Speed was the only WAIS-IV factor index that was able to discriminate between persons with moderate-severe TBI on the one hand and persons with either less severe TBI or neurologically healthy controls on the other hand. The Processing Speed index also had acceptable sensitivity and specificity when differentiating between patients with TBI who either did or did not have scores in the clinically significant range on the Trail Making Test. It is concluded that WAIS-IV Processing Speed has acceptable clinical utility in the evaluation of patients with moderate-severe TBI but that it should be supplemented with other measures to assure sufficient accuracy in the diagnostic process.
The need for efficient clinical assessment instruments has been growing during the past years. In the current application, the item information (item response theory) is used to evaluate and build fixed short versions. The method was applied to a questionnaire measuring psychological distress and data were collected from two mixed outpatient and general population samples. After fitting the partial credit model, two short versions were built: one to increase efficiency in screening applications; the other for the monitoring of high distress patients. A cross-validation bootstrap procedure is proposed to check whether the short versions are more efficient than alternative item selections. Using the partial credit model, the results from short and full versions can be compared on score level, which improves the flexibility of the assessment. The discussion focuses on the model selection and on how many items are realistically needed in routine assessments of psychological distress.
The current article evaluates exploratory structural equation modeling (ESEM) as an alternative to confirmatory factor analytic (CFA) models in personality research. We compare model fit, factor distinctiveness, and criterion associations of factors derived from ESEM and CFA models. In Sample 1 (n = 336) participants completed the NEO-FFI, the Trait Emotional Intelligence Questionnaire–Short Form, and the Creative Domains Questionnaire. In Sample 2 (n = 425) participants completed the Big Five Inventory and the depression and anxiety scales of the General Health Questionnaire. ESEM models provided better fit than CFA models, but ESEM solutions did not uniformly meet cutoff criteria for model fit. Factor scores derived from ESEM and CFA models correlated highly (.91 to .99), suggesting the additional factor loadings within the ESEM model add little in defining latent factor content. Lastly, criterion associations of each personality factor in CFA and ESEM models were nearly identical in both inventories. We provide an example of how ESEM and CFA might be used together in improving personality assessment.
Depression has robust associations with personality, showing a strong relation with neuroticism and more moderate associations with extraversion and conscientiousness. In addition, each Big Five domain can be decomposed into narrower facets. However, we currently lack consensus as to the contents of Big Five facets, with idiosyncrasies across instruments; moreover, few studies have examined associations with depression. In the current study, community participants completed six omnibus personality inventories; self-reported depressive symptoms were assessed at baseline and 5 years later. Exploratory factor analyses suggested three to five facets in each domain, and these facets served as prospective predictors of depression in hierarchical regressions, after accounting for baseline and trait depression. In these analyses, high anger (from neuroticism), low positive emotionality (extraversion), low conventionality (conscientiousness), and low culture (openness to experiences) were significant prospective predictors of depression. Results are discussed in regard to personality structure and assessment, as well as personality–psychopathology associations.
This study evaluated the specificity and false positive (FP) rates of the Rey 15-Item Test (FIT), Word Recognition Test (WRT), and Test of Memory Malingering (TOMM) in a sample of 21 forensic inpatients with mild intellectual disability (ID). The FIT demonstrated an FP rate of 23.8% with the standard quantitative cutoff score. Certain qualitative error types on the FIT showed promise and had low FP rates. The WRT obtained an FP rate of 0.0% with previously reported cutoff scores. Finally, the TOMM demonstrated low FP rates of 4.8% and 0.0% on Trial 2 and the Retention Trial, respectively, when applying the standard cutoff score. FP rates are reported for a range of cutoff scores and compared with published research on individuals diagnosed with ID. Results indicated that although the quantitative variables on the FIT had unacceptably high FP rates, the TOMM and WRT had low FP rates, increasing the confidence clinicians can place in scores reflecting poor effort on these measures during ID evaluations.
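The false positive rates in the preceding abstract follow from simple proportions in a sample of 21. The sketch below assumes that every patient was responding genuinely, so any score below a cutoff counts as a false positive. The counts of 5 and 1 flagged patients are inferred from the reported percentages and are not stated in the abstract, and the function name is a hypothetical helper.

```python
def false_positive_rate(n_flagged: int, n_total: int) -> float:
    """Proportion of genuinely responding examinees flagged by a cutoff."""
    if n_total <= 0 or not 0 <= n_flagged <= n_total:
        raise ValueError("need 0 <= n_flagged <= n_total and n_total > 0")
    return n_flagged / n_total


N_PATIENTS = 21  # sample size reported in the abstract

# Flagged counts are inferred from the reported percentages (assumption).
inferred_flagged = {
    "FIT (standard quantitative cutoff)": 5,
    "WRT": 0,
    "TOMM Trial 2": 1,
    "TOMM Retention Trial": 0,
}

for measure, flagged in inferred_flagged.items():
    fp = false_positive_rate(flagged, N_PATIENTS)
    # Specificity is the complement of the false positive rate.
    print(f"{measure}: FP rate = {fp:.1%}, specificity = {1 - fp:.1%}")
```

With only 21 patients, each additional flagged case moves the rate by about 4.8 percentage points, so the reported values are coarse estimates.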
We investigated reliability and validity of the Mediator’s Assessment of Safety Issues and Concerns (MASIC), a screening interview for intimate partner violence and abuse (IPV/A) in family mediation settings. Clients at three family mediation clinics in the United States and Australia (N = 391) provided reports of the other parent’s IPV/A. Internal consistency of the total screen was excellent. A confirmatory factor analysis provided evidence that the MASIC assesses seven types of IPV/A: psychological abuse, coercive controlling behaviors, threats of severe violence, physical violence, severe physical violence, sexual violence, and stalking. Sex differences on differing types of violence victimization were generally consistent with previous research. Higher levels of victimization predicted self-reported consequences of abuse (e.g., fear, injuries). More abusive parties, as identified by their partners on the MASIC, had more Protective Orders and No Contact Orders and criminal convictions and crimes potentially related to IPV/A. Results provide initial evidence of the reliability and validity of the MASIC but more research is needed.
A body of research has demonstrated that individuals with Asian ethnicity endorse higher levels of fear of negative evaluation compared with individuals with European ethnicity. To date, no study has examined whether this Asian-European difference may be confounded by the differential interpretation of the measures of fear of negative evaluation by the two groups. The current study thus aimed to examine the measurement equivalence of the 12-item Brief Fear of Negative Evaluation (BFNE) scale and its 8-item variant composed of straightforwardly worded items (BFNE-S) in a sample of individuals who identified with a Chinese ethnicity (n = 204) and a sample of individuals who identified with an Anglo ethnicity (n = 528). Measurement equivalence across the samples was obtained for a two-factor BFNE model and a one-factor BFNE-S model. However, the BFNE-S model demonstrated superior fit to the data. Using the BFNE-S, we found that the Chinese ethnicity sample scored significantly higher on the latent dimension of fear of negative evaluation compared with the Anglo ethnicity sample (d = 0.24). These findings disambiguate previous research on Asian-European differences in fear of negative evaluation and highlight the need for the continued examination of the validity of measures across different ethnicities and cultures.
While there are a number of short personality trait measures that have been validated for use with adults, few are specifically validated for use with adolescents. To trust such measures, it must be demonstrated that they have adequate construct validity. According to the view of construct validity as a unifying form of validity requiring the integration of different complementary sources of information, this article reports the evaluation of content, factor, convergent, and criterion validities as well as reliability of adolescents’ self-reported personality traits. Moreover, this study sought to address an inherent potential limitation of short personality trait measures, namely their limited conceptual breadth. In this study, starting with items from a known measure, after the language-level was adjusted for use with adolescents, items tapping fundamental primary traits were added to determine the impact of added conceptual breadth on the psychometric properties of the scales. The resulting new measure was named the Big Five Personality Trait Short Questionnaire (BFPTSQ). A group of expert judges considered the items to have adequate content validity. Using data from a community sample of early adolescents, the results confirmed the factor validity of the Big Five structure in adolescence as well as its measurement invariance across genders. More important, the added items did improve the convergent and criterion validities of the scales, but did not negatively affect their reliability. This study supports the construct validity of adolescents’ self-reported personality traits and points to the importance of conceptual breadth in short personality measures.
The Emotion Understanding Assessment (EUA) is based on a theoretical model of recognizing emotion expressions and reasoning about situation-based, desire-based, and belief-based emotions. While research has noted that emotion understanding predicts current and future social and academic functioning, little is known about the psychometric properties of the EUA. This research sought to test the EUA factor structure and measurement invariance across gender, across language (English and Spanish speakers), and over time (24 weeks) in 281 preschoolers attending Head Start. Results indicated that a two-factor model of emotion expression recognition and emotional perspective taking of the EUA fit the data for the total sample, for each group (gender and language), and at each time point. Furthermore, configural and scalar invariance of the EUA was demonstrated across gender, language, and time. These results offer support that the EUA is assessing emotion expression recognition and emotional perspective taking constructs equivalently in boys, girls, Spanish and English speakers, and over time. Examination of latent means across groups and time indicated no differences in emotion understanding based on gender or language or over the 24-week time frame in this sample of preschoolers attending Head Start.
The Grooved Pegboard Test (GPT) was conceived as a test of manual dexterity, upper-limb motor speed, and hand–eye coordination. The aim of our study was to test the componential structure of the GPT on an archetypal model of motor impairment, Parkinson’s disease (PD). A total of 45 PD patients (33 males, 12 females; age M = 67, range = 49-81; PD duration M = 10, range = 6-20 years; H/Y stage 2, range = 2-3) and 20 age- and education-matched controls (14 males, 6 females; age M = 66, range = 48-80) were included. All participants were investigated using the GPT, Short Falls Efficacy Scale–International, Frontal Assessment Battery (FAB), Montreal Cognitive Assessment (MoCA), and Non-Motor Symptom Scale. Patients were followed for 6 months, using fall diaries and monthly phone calls to define PD fallers (falls ≥ 1; n = 27) and PD nonfallers (falls = 0; n = 18). Using structural equation modeling, the GPT predicted performance on the MoCA (p < .001), but not on the FAB (p = .29). In conclusion, analysis of the structure of the GPT provided evidence about important cognitive features, in addition to the motor component of this test in PD.
The present study examined the structural validity of the 25-item Connor–Davidson Resilience Scale (CD-RISC) in a large sample of U.S. veterans with military service since September 11, 2001. Participants (N = 1,981) completed the 25-item CD-RISC, a structured clinical interview and a self-report questionnaire assessing psychiatric symptoms. The study sample was randomly divided into two subsamples: an initial sample (Sample 1: n = 990) and a replication sample (Sample 2: n = 991). Findings derived from exploratory factor analysis (EFA) did not support the five-factor analytic structure as initially suggested in Connor and Davidson’s instrument validation study. Although parallel analyses indicated a two-factor structural model, we tested one to six factor solutions for best model fit using confirmatory factor analysis. Results supported a two-factor model of resilience, composed of adaptability- (8 items) and self-efficacy-themed (6 items) items; however, only the adaptability-themed factor was found to be consistent with our view of resilience—a factor of protection against the development of psychopathology following trauma exposure. The adaptability-themed factor may be a useful measure of resilience for post-9/11 U.S. military veterans.
All measures of depression yield a global summary scale indicating the severity of depressive symptoms, implicitly conceptualized as a homogeneous construct. However, depression is a heterogeneous construct, with different presentations, subtypes, correlates, and responses to interventions. In response, the National Institute of Mental Health (NIMH) has suggested changes in the way depression is assessed, moving the focus to specific factors, such as cognitive, somatic, or affective symptoms. Still, there is little factor overlap between measures, and shared factors are weighted differently. To help fulfill NIMH’s strategic plan, this study used canonical correlation analysis (CCA) to explore shared latent variables and redundancy across the measures. It also analyzed the psychometric properties of factor-based subscales in the Beck Depression Inventory–2nd edition (BDI-II), Center for Epidemiologic Studies Depression scale (CES-D), Inventory for Depression and Anxiety Symptoms (IDAS), and Inventory of Depressive Symptomatology (IDS). Using a diverse sample of 218 students who reported at least mild depressive symptoms, this study found that the IDAS was best aligned with NIMH’s strategic plan; it has complete DSM-IV/DSM-5 symptom coverage and content-valid, psychometrically sound subscales. The BDI-II, CES-D, and IDS did not have consistent subscales and had incomplete or incongruent coverage of DSM criteria. Furthermore, CCA revealed low redundancy across measures (23% to 41% shared variance). These results suggest that different measures of depression do not measure the same construct. As a partial solution, empirical conversion tables were provided so that researchers and clinicians can compare total scores from different measures.
Research has demonstrated strong connections among working memory (WM), higher-level cognition, and academic achievement. Despite the importance of WM, currently available WM tests have practical limitations and lack comprehensive coverage of multiple WM components. The Working Memory Battery (WOMBAT) includes nine subtests measuring multiple content domains and processing demands, in accordance with contemporary WM theoretical frameworks. The current study evaluated the WOMBAT factor structure and identified misfitting items using confirmatory factor analysis and Rasch modeling with scores from 125 adolescents and 177 adults (N = 302). Overall, results indicated the WOMBAT measures separate Verbal, Static Visual-Spatial, and Dynamic Visual-Spatial dimensions, and that more than 98% of items contribute to measurement of those dimensions. This provides support for the theoretical organization of WM into three distinct content domains in the WOMBAT. Misfitting items were identified using infit and outfit indices for further review to improve reliability and stability. Results also demonstrated adequate person separation and Rasch person reliability and item reliability. Test–retest reliability and internal consistency coefficients suggest adequate reliability for early-stage research, but further refinement is needed before the WOMBAT can be used for individual decision making. Implications for future test development and research on the WM construct are provided.
A number of studies have attempted to identify the factor structure of the Dysfunctional Attitude Scale (DAS). However, no studies have done so using a clinical sample of outpatients likely to generalize to the clinical trials in which the DAS is commonly used. The current investigation utilized exploratory structural equation modeling in an outpatient sample (N = 982) and found support for a one-factor solution (composed of 19 items). This solution was largely confirmed in a second outpatient sample (N = 301). Construct validity was demonstrated in correlations with measures of depression, social interaction anxiety, and symptoms of obsessive-compulsive disorder.
The ability to reason with language is a highly valued cognitive capacity that correlates with IQ measures and is sensitive to damage in language areas. The Penn Verbal Reasoning Test (PVRT) is a 29-item computerized test for measuring abstract analogical reasoning abilities using language. The full test can take over half an hour to administer, which limits its applicability in large-scale studies. We previously described a procedure for abbreviating a clinical rating scale and a modified procedure for reducing tests with a large number of items. Here we describe the application of the modified method to reducing the number of items in the PVRT to a parsimonious subset of items that accurately predicts the total score. As in our previous reduction studies, a split sample is used for model fitting and validation, with cross-validation to verify results. We find that an 8-item scale predicts the total 29-item score well, achieving a correlation of .9145 for the reduced form for the model fitting sample and .8952 for the validation sample. The results indicate that a drastically abbreviated version, which cuts administration time by more than 70%, can be safely administered as a predictor of PVRT performance.
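As a rough illustration of the split-sample logic described above, the sketch below picks a short form by greedy forward selection on a fitting sample and then checks how well it predicts the full 29-item total in a separate validation sample. The greedy rule, the simulated responses, and every name here (`simulate`, `greedy_short_form`, `subset_total_r`) are assumptions made for the example; the abstract does not specify the authors' reduction procedure, and this is not it.

```python
import random
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def subset_total_r(rows, items):
    """Correlation between the short-form sum and the full-test total."""
    short = [sum(row[i] for i in items) for row in rows]
    full = [sum(row) for row in rows]
    return pearson(short, full)

def greedy_short_form(rows, n_items):
    """Assumed selection rule: repeatedly add the item that most improves
    the short-form/total correlation in the fitting sample."""
    chosen, remaining = [], list(range(len(rows[0])))
    while len(chosen) < n_items:
        best = max(remaining, key=lambda i: subset_total_r(rows, chosen + [i]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = random.Random(0)  # fixed seed so the simulated data are reproducible

def simulate(n_people, n_items=29):
    """Hypothetical right/wrong responses driven by one latent ability."""
    difficulties = [-1.5 + 3 * j / (n_items - 1) for j in range(n_items)]
    rows = []
    for _ in range(n_people):
        ability = rng.gauss(0, 1)
        rows.append([int(ability + rng.gauss(0, 1) > d) for d in difficulties])
    return rows

fit_sample, validation_sample = simulate(300), simulate(300)
short_form = greedy_short_form(fit_sample, 8)
r_fit = subset_total_r(fit_sample, short_form)
r_validation = subset_total_r(validation_sample, short_form)
print(short_form, round(r_fit, 3), round(r_validation, 3))
```

Because the short-form items are themselves part of the total, part-whole correlations of this kind run high by construction, so the validation-sample figure is the more informative of the two.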
Personality is an important predictor of various outcomes in many social science disciplines. However, when personality traits are not the principal focus of research, for example, in global comparative surveys, it is often not possible to assess them extensively. In this article, we first provide an overview of the advantages and challenges of single-item measures of personality, a rationale for their construction, and a summary of alternative ways of assessing their reliability. Second, using seven diverse samples (N total = 4,263) we develop the SIMP-G, the German adaptation of the Single-Item Measures of Personality, an instrument assessing the Big Five with one item per trait, and evaluate its validity and reliability. Third, we integrate previous research and our data into a first meta-analysis of single-item reliabilities of personality measures, and provide researchers with guidelines and recommendations for the evaluation of single-item reliabilities.
Personal growth initiative (PGI), an individual’s active and intentional desire to engage in the growth process, has been an important construct in studies of physical and mental health around the world. However, there is a dearth of research examining this construct in African American samples. In addition, PGI has recently undergone a revision of both its theory and measure; the resulting Personal Growth Initiative Scale–II (PGIS-II) has been validated for use only with European American and international college student samples. The current study examined the psychometric properties of the PGIS-II in a sample of African American college students. Confirmatory factor analyses yielded results consistent with previous studies, and the PGIS-II showed evidence of convergent and discriminant validity for three of its four factors. In addition, the PGIS-II was significantly related to aspects of Black racial identity, suggesting that it is a viable construct in this population.
There exists substantial debate about how to best assess pathological narcissism with a variety of measures designed to assess grandiose and vulnerable narcissism, as well as the DSM-IV and DSM-5 based conceptualizations of narcissistic personality disorder (NPD). Wright and colleagues published correlations between several narcissism measures (Narcissistic Personality Inventory [NPI]; Pathological Narcissism Inventory [PNI]; Personality Diagnostic Questionnaire [PDQ] NPD) with the traits comprising the DSM-5 Section III personality trait model. In the current study, we examine the agreement manifested by Wright and colleagues’ narcissism–DSM-5 trait profiles with expert ratings of the DSM-5 traits most relevant to descriptions of DSM-IV NPD. Despite concerns regarding the NPI’s ability to measure pathological narcissism, its trait profile was strongly correlated with expert ratings, as was PDQ NPD’s profile. Conversely, the trait profiles associated with the PNI were primarily uncorrelated with the expert rated NPD profile. The implications of these findings with regard to the assessment of narcissism are discussed.
In adult populations, research on methodologies to identify noncredible performance and exaggerated symptoms during neuropsychological evaluations has grown exponentially in the past two decades. Far less work has focused on methods appropriate for children. Although several recent studies have used stand-alone performance validity tests with younger populations, a near absence of pediatric work has investigated other indices to identify response bias. The present study examined the relationship between the validity scales from the self-report Behavior Assessment System for Children, Second Edition (BASC-2) and performance on the Medical Symptom Validity Test (MSVT), a stand-alone performance validity test. The sample consisted of 274 clinically referred patients with mild traumatic brain injuries aged 8 through 17 years. Fifty patients failed the MSVT based on actuarial criteria. The majority of these patients (92%) provided valid self-report BASC-2 profiles, with only three patients (6%) producing an invalid profile due to an elevated F index. Analysis of valid/invalid self-report BASC-2 profiles and MSVT pass/fail did not reveal a significant relationship (p = 0.471, two-tailed Fisher’s exact test). These findings suggest that performance validity tests like the MSVT provide substantively different information about the validity of a neuropsychological profile than that provided by the self-report validity scales of the BASC-2.
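For readers unfamiliar with the test named above, a two-tailed Fisher's exact test on a 2 × 2 table can be sketched with the hypergeometric distribution. The function name and the example counts are hypothetical (the abstract does not report the full table), and the two-tailed rule used here, summing all tables no more probable than the observed one, is one common convention rather than necessarily the one the authors' software applied.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed p value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(x):
        # Integer numerator of the hypergeometric probability of cell a == x.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / comb(n, col1)

# Hypothetical counts, not the study's data.
p_value = fisher_exact_two_tailed(8, 2, 1, 5)
print(round(p_value, 4))
```

Comparing integer numerators rather than floating-point probabilities avoids rounding problems when two tables are exactly equally likely.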
The Social Interaction Anxiety Scale and Social Phobia Scale are widely used measures of social anxiety. Using data from individuals with social anxiety disorder (n = 435) and nonanxious controls (n = 86), we assessed the psychometric properties of two independently developed short forms of these scales. Indices of convergent and discriminant validity, diagnostic specificity, sensitivity to treatment, and readability were examined. Comparisons of the two sets of short forms to each other and the original long forms were conducted. Both sets of scales demonstrated adequate internal consistency in the patient sample, showed expected patterns of correlation with measures of related and unrelated constructs, adequately discriminated individuals with social anxiety disorder from those without, and showed decreases in scores over the course of cognitive-behavioral therapy and/or pharmacotherapy. However, some significant differences in scale performance were noted. Implications for the clinical assessment of social anxiety are discussed.
The current study examined the measurement and structural invariance of the Depression Anxiety Stress Scales-21 (DASS-21) across ratings provided by men (N = 227) and women (N = 460). Multiple-group confirmatory factor analysis (CFA) supported full metric invariance and intercepts invariance for 20 of the 21 items. Invariance for all item intercepts was supported by multiple indicators multiple causes (MIMIC) procedure that controlled for the effects of age. Multiple-group CFA supported invariance for all factor variances and covariances. This procedure and the MIMIC analyses found equivalency for all latent mean scores. These findings indicate good support for measurement and structural invariance of the DASS-21 rating across men and women. The psychometric and practical implications of the findings are discussed.
Self-determination theory is potentially useful for understanding reasons why individuals with mental illness do or do not engage in psychiatric treatment. The current study examined the psychometric properties of three questionnaires based on self-determination theory—The Treatment Entry Questionnaire (TEQ), Health Care Climate Questionnaire (HCCQ), and the Short Motivation Feedback List (SMFL)—in a sample of 348 Dutch adult outpatients with primary diagnoses of mood, anxiety, psychotic, and personality disorders. Structural equation modeling showed that the empirical factor structures of the TEQ and SMFL were adequately represented by a model with three intercorrelated factors. These were interpreted as identified, introjected, and external motivation. The reliabilities of the Dutch TEQ, HCCQ, and SMFL were found to be acceptable but can be improved on; congeneric estimates ranged from 0.66 to 0.94 depending on the measure and patient subsample. Preliminary support for the construct validities of the questionnaires was found in the form of theoretically expected associations with other scales, including therapist-rated motivation and treatment engagement and with legally mandated treatment. Additionally, the study provides insights into the relations between measures of motivation based on self-determination theory, the transtheoretical model and the integral model of treatment motivation in psychiatric outpatients with severe mental illness.
Individuation is widely considered a fundamental developmental task of adolescence. It is a process through which the adolescent seeks to define new boundaries between his or her self and others, and the failure to do so has been shown to have serious consequences. Given its importance for understanding developmental transitions, it is surprising that there are few assessments of dysfunctional individuation. Over three studies, we provide evidence of a promising new measure of this important construct: the 10-item Dysfunctional Individuation Scale (DIS). Using confirmatory factor analysis and item response theory, we demonstrate that the DIS possesses a strong one-factor structure and excellent psychometric properties. Furthermore, we document the convergent, discriminant, and concurrent validity of the DIS through its relationships with indices of individuation, adjustment, and clinically relevant symptoms. Finally, we examine the incremental validity of the DIS over neuroticism as a predictor of depression (Beck Depression Inventory–II).
The psychometric properties and predictive validity of the Depression Change Expectancy Scale (DCES), a modification of an expectancy scale originally developed for patients with anxiety disorders, were examined in two studies. In Study 1, the 20-item scale was administered along with a battery of questionnaires to a sample of 416 dysphoric undergraduate students and demonstrated good internal consistency. A two-factor solution most parsimoniously accounted for the variance, with one factor containing all pessimistically worded items (DCES-P) and the second containing all optimistically worded items (DCES-O). The DCES-P showed patterns of correlations with other measures of related constructs consistent with hypothesized relationships; the DCES-O showed similar, but weaker, relationships with the other measures. Multilevel modeling was used to examine the predictive utility of the DCES in a clinical sample of 63 adults (Study 2). Improved depressive symptoms (over 6 weeks) were strongly associated with optimistic expectancies but were unrelated to pessimistic expectancies for change. The DCES appears to be a promising measure of expectancies for improvement among individuals with depressive symptoms.
The current study tests the convergent and discriminant validity of a modified version of the Five Factor Model Rating Form (FFMRF), a one-page, brief measure of the five-factor model. The Five Factor Form (FFF) explicitly identifies maladaptive variants for both poles of each of the 30 facets of the FFMRF. The purpose of the current study was to test empirically whether this modified version still provides a valid assessment of the FFM, as well as to compare its validity as a measure of the FFM to other brief FFM measures. Two independent samples of 510 and 330 community adults were sampled, one third of whom had a history of some form of mental health treatment. The FFF was compared with three abbreviated and/or brief measures of the FFM (i.e., the FFMRF, the Ten Item Personality Inventory, and the Big Five Inventory), a more extended measure (i.e., International Personality Item Pool-NEO), an alternative measure of general personality (i.e., the HEXACO-Personality Inventory-Revised), and a measure of maladaptive personality functioning (i.e., the Personality Inventory for Diagnostic and Statistical Manual of Mental Disorders, 5th edition). The results of the current study demonstrated convergent and discriminant validity, even at the single-item facet level.
The assessment and management of risk for future violence is a core requirement of mental health professionals in many settings. Despite an increasing need for violence risk assessments across diverse contexts, little is known regarding the ecological validity of many widely used risk assessment schemes or the level of reliability with which actual practicing clinicians score these instruments. The current study investigated the interrater reliability of the Historical, Clinical, and Risk Management-20 (HCR-20), a widely used structured professional tool to assess violence risk, among 21 practicing clinicians in a forensic psychiatric program in Ontario, Canada. Results suggest that clinicians with varying professional training backgrounds and experience were able to rate the HCR-20 with good to excellent levels of reliability across three patients who varied in risk level. Consistent with studies investigating rater reliability for research purposes, we found that the risk management scale of the HCR-20 was the most challenging for clinicians to rate reliably. Importantly, results from generalizability theory analyses revealed that less than 3% of the variance in HCR-20 total scores and summary risk ratings is attributable to rater effects, whereas the majority of variance is attributable to differences among patients.
Assessment of cognitive functioning is an important component of telephone surveys of health. Previous cognitive telephone batteries have been limited in scope with a primary focus on dementia screening. The Brief Test of Adult Cognition by Telephone (BTACT) assesses multiple dimensions central for effective functioning across adulthood: episodic memory, working memory, reasoning, verbal fluency, and executive function. The BTACT is the first instrument that includes measures of processing speed, reaction time, and task-switching/inhibitory control for use over the telephone. We administered the battery to a national sample (N = 4,268), aged 32 to 84 years, from the study of Midlife in the United States (MIDUS) and examined age, education, and sex differences; reliability; and factor structure. We found good evidence for construct validity with a subsample tested in person. Implications of the findings are considered for efficient neuropsychological assessment and monitoring changes in cognitive aging, for clinical and research applications by telephone or in person.
Three socially aversive traits—Machiavellianism, narcissism, and psychopathy—have been studied as an overlapping constellation known as the Dark Triad. Here, we develop and validate the Short Dark Triad (SD3), a brief proxy measure. Four studies (total N = 1,063) examined the structure, reliability, and validity of the subscales in both community and student samples. In Studies 1 and 2, structural analyses yielded three factors with the final 27 items loading appropriately on their respective factors. Study 3 confirmed that the resulting SD3 subscales map well onto the longer standard measures. Study 4 validated the SD3 subscales against informant ratings. Together, these studies indicate that the SD3 provides efficient, reliable, and valid measures of the Dark Triad of personalities.
Objective: We present a new open language analysis approach that identifies and visually summarizes the dominant naturally occurring words and phrases that most distinguish each Big Five personality trait. Method: Using millions of posts from 69,792 Facebook users, we examined the correlation of personality traits with online word usage. Our analysis method consists of feature extraction, correlational analysis, and visualization. Results: The distinguishing words and phrases were face valid and provide insight into processes that underlie the Big Five traits. Conclusion: Open-ended, data-driven exploration of large datasets combined with established psychological theory and measures offers new tools to further understand the human psyche.
The aim was to test the internal structure of scores on the short and very short forms of the Children’s Behavior Questionnaire (CBQ) scale and to study the relationship between the dimensions derived and external variables previously related to extreme temperament in a Spanish community sample. The sample comprised 622 three-year-old children participating in a longitudinal study. Data were obtained from parents and teachers through a semistructured diagnostic interview and questionnaires evaluating children’s characteristics and psychological states. Results showed a three-factor structure and moderate reliability of the scale scores for both the short and very short forms. Associations were found between the Surgency/Extraversion dimension and attention-deficit/hyperactivity disorder and externalizing problems, between Negative Affect and internalizing and emotional problems, and between Effortful Control and attention, externalizing, and social problems and other executive function measures. Salient temperamental characteristics predicted psychopathological disorders and impairment at ages 3 and 4. The short forms of the CBQ provide reliable and valid scores for assessing temperamental characteristics in the preschool years.
The Health-Related Quality of Life for Eating Disorder–Short questionnaire is one of the most suitable existing instruments for measuring quality of life in patients with eating disorders. The objective of the study was to evaluate its reliability, validity, and responsiveness in a cohort of 377 patients. A comprehensive validation process was performed, including confirmatory factor analysis and a graded response model, and assessments of reliability and responsiveness at 1 year of follow-up. The confirmatory factor analysis confirmed the two second-order latent traits: social maladjustment, and mental health and functionality. The graded response model results showed that all items discriminated well on their respective latent traits. Cronbach’s alpha coefficients were high, and responsiveness parameters showed moderate changes. In conclusion, this short questionnaire has good psychometric properties. Its simplicity and ease of application further enhance its acceptability and usefulness in clinical research and trials, as well as in routine practice.
An alternative model for diagnosing personality disorders (PDs) appears in DSM-5 Section III. This model includes a set of dimensional personality traits, which along with impairment in personality functioning can be configured to represent one of six PDs. Although specific assessment instruments for these personality traits have already been developed (e.g., the Personality Inventory for DSM-5 [PID-5]), clinicians will likely continue to use omnibus measures of psychopathology that are familiar to them to inform diagnostic decision making. One such measure, the Minnesota Multiphasic Personality Inventory–2–Restructured Form (MMPI-2-RF), will likely remain in the test armamentarium of many practitioners and be employed to assess the DSM-5 dimensional traits. In the current investigation, we examined the associations between MMPI-2-RF scale scores and the PID-5 trait scores and DSM-5 Section III PDs in a combined sample of university students (n = 668) from the United States and Canada. Our results indicated that the MMPI-2-RF scale scores mostly converge with PID-5 dimensional traits as well as the Section III PDs in a conceptually expected manner. As such, we conclude that the MMPI-2-RF is a potentially useful instrument in assessing personality psychopathology as conceptualized in DSM-5 Section III.
Background. The assessment of intervention integrity is essential in psychotherapeutic intervention outcome research and psychotherapist training. There has been little attention given to it in mindfulness-based interventions research, training programs, and practice. Aims. To address this, the Mindfulness-Based Interventions: Teaching Assessment Criteria Scale (MBI:TACS) was developed. This article describes the MBI:TACS and its development and presents initial data on reliability and validity. Method. Sixteen assessors from three centers evaluated teaching integrity of 43 teachers using the MBI:TACS. Results. Internal consistency (α = .94) and interrater reliability (overall intraclass correlation coefficient = .81; range = .60-.81) were high. Face and content validity were established through the MBI:TACS development process. Data on construct validity were acceptable. Conclusions. Initial data indicate that the MBI:TACS is a reliable and valid tool. It can be used in Mindfulness-Based Stress Reduction/Mindfulness-Based Cognitive Therapy outcome evaluation research, training and pragmatic practice settings, and in research to assess the impact of teaching integrity on participant outcome.
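The internal consistency coefficient reported above is Cronbach's alpha. A minimal sketch of the standard formula follows, assuming complete data arranged as one row per rated teacher and one column per scale domain; the function name and the ratings are invented for illustration and are not MBI:TACS data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows, each holding k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings: four teachers scored on three domains.
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 3, 5],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

The function assumes the total scores vary across rows; if every row had the same total, the denominator would be zero and alpha would be undefined.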
This study investigated inconsistent responding to survey items by participants involved in longitudinal, web-based substance use research. We also examined cross-sectional and prospective predictors of inconsistent responding. Middle school (N = 1,023) and college students (N = 995) from multiple sites in the United States responded to online surveys assessing substance use and related variables in three waves of data collection. We applied a procedure for creating an index of inconsistent responding at each wave that involved identifying pairs of items with considerable redundancy and calculating discrepancies in responses to these items. Inconsistent responding was generally low in the Middle School sample and moderate in the College sample, with individuals showing only modest stability in inconsistent responding over time. Multiple regression analyses identified several baseline variables—including demographic, personality, and behavioral variables—that were uniquely associated with inconsistent responding both cross-sectionally and prospectively. Alcohol and substance involvement showed some bivariate associations with inconsistent responding, but these associations largely were accounted for by other factors. The results suggest that high levels of carelessness or inconsistency do not appear to characterize participants’ responses to longitudinal web-based surveys of substance use and support the use of inconsistency indices as a tool for identifying potentially problematic responders.
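The index described above, built from pairs of near-redundant items and the discrepancies between their responses, might be computed along the following lines. This is only a sketch: the item names, the pairs, and the choice of the mean absolute discrepancy as the summary are assumptions, since the abstract does not give the scoring details.

```python
from statistics import mean

def inconsistency_index(responses, item_pairs):
    """Mean absolute discrepancy across pairs of near-redundant items.

    `responses` maps item name -> numeric answer for one participant;
    pairs with a missing answer are skipped.
    """
    discrepancies = [
        abs(responses[first] - responses[second])
        for first, second in item_pairs
        if first in responses and second in responses
    ]
    if not discrepancies:
        raise ValueError("no item pair was fully answered")
    return mean(discrepancies)

# Hypothetical item names and answers on a 0-4 scale.
pairs = [
    ("drinks_past_month", "drinks_last_30_days"),
    ("ever_smoked", "ever_tried_cigarette"),
]
consistent = {"drinks_past_month": 3, "drinks_last_30_days": 3,
              "ever_smoked": 1, "ever_tried_cigarette": 1}
careless = {"drinks_past_month": 0, "drinks_last_30_days": 4,
            "ever_smoked": 1, "ever_tried_cigarette": 0}

consistent_score = inconsistency_index(consistent, pairs)
careless_score = inconsistency_index(careless, pairs)
print(consistent_score, careless_score)
```

Computing the index separately at each wave, as the study did, would then allow its stability over time to be examined with ordinary correlations.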
Numerous researchers have noted that most social desirability scales seem to measure personality traits rather than response sets or styles. In two studies, we investigated the substantive interpretation of the Balanced Inventory of Desirable Responding in terms of the HEXACO model of personality. Because of its focus on honesty and integrity, the Impression Management (IM) scale was hypothesized to be mainly related to HEXACO Honesty-Humility. In the main study among 1,106 students and well-acquainted others (friends, family, or partners), positive self–other agreement correlations were found for both IM (r = .45) and Self-Deceptive Enhancement (SDE; r = .34), supporting a trait conception of IM and SDE. In both self- and other ratings, the most important predictors of SDE were (low) Emotionality, Extraversion, and Conscientiousness. IM was associated with Conscientiousness and Agreeableness, but Honesty-Humility was by far its most important predictor. In a subsample (n = 465), Honesty-Humility and IM were unrelated to GPA.
The Multidimensional Personality Questionnaire (MPQ) is a widely used personality assessment instrument informing lower- and higher-order personality dimensions. Despite recent developments of brief (MPQ-BF) and simplified wording (MPQ-SF) forms, there is relatively little work on the utility and validity of the MPQ in younger samples with lower reading levels. This study is the first to assess the reliability, factor structure, and criterion validity of the MPQ-SF in a sample of treatment-referred mid-adolescents (N = 105; 12-17 years). Results suggest adequate reliabilities for most of the lower-order primary scales and support a three-factor structure of the MPQ-SF, consistent with previous research with adult and college-aged samples. However, there were also notable cross-loadings for particular scales, which we discuss in relation to the four-factor MPQ model and the Five Factor Model of Personality. Relationships between MPQ personality dimensions and psychopathology using youth, parent, and clinician-rated psychopathology indices supported criterion-related validity. Together, these results confirm the utility of the MPQ in youth with treatment histories.
In order to assess the internal consistency, factor structure, and ability to recover DSM-IV personality disorders (PDs) of the Personality Inventory for DSM-5 (PID-5) scales, 710 Italian adult community dwelling volunteers were administered the Italian translation of the PID-5, as well as the Italian translation of the Personality Diagnostic Questionnaire–4+ (PDQ-4+). Cronbach’s alpha values were >.70 for all PID-5 facet scales and greater than .90 for all PID-5 domain scales. Parallel analysis and confirmatory factor analysis supported the theoretical five-factor model of the PID-5 trait scales. Regression analyses showed that both PID-5 trait and domain scales explained a substantial amount of variance in the PDQ-4+ PD scales, with the exception of the Passive-Aggressive PD scale. When the PID-5 was administered to a second independent sample of 389 Italian adult community dwelling volunteers, the basic psychometric properties of the scale were replicated. In this second sample, the PID-5 trait and domain scales proved to be significant predictors of psychopathy measures. As a whole, the results of the present study support the hypothesis that the PID-5 is a reliable instrument which is able to recover DSM-IV PDs, as well as to capture personality pathology that is not included in the DSM-IV (namely, psychopathy).
We used the Electronically Activated Recorder to observe 31 individuals with either borderline personality disorder (BPD; n = 20) or a history of a depressive disorder (n = 11). The Electronically Activated Recorder yielded approximately forty-seven 50-second sound clips per day for 3 consecutive days. Recordings were coded for expressed positive affect (PA) and negative affect (NA), and coder ratings were compared to participants’ reports about their PA and NA during interpersonal events. BPD participants did not differ from participants with depressive disorder in terms of their recalled levels of NA or PA across different types of interpersonal events. However, significant discrepancies between recalled and observed levels of NA and PA were found for BPD participants for all types of interpersonal events. These findings may reflect limitations in the ability of those with BPD to recall their emotional intensity during interpersonal events and may also provide some evidence for emotional invalidation experienced by those with BPD.
The Elemental Psychopathy Assessment (EPA) is a 178-item self-report measure designed to assess the basic elements of psychopathy from a Five-Factor Model perspective: Anger, Arrogance, Callousness, Coldness, Disobliged, Distrust, Dominance, Impersistence, Invulnerable, Manipulation, Opposition, Rashness, Self-Assurance, Self-Centered, Self-Contentment, Thrill-Seeking, Unconcern, and Urgency. The present article reports on the development of a short-form version of the EPA in two large undergraduate samples using item response theory. The validity of the resultant 72-item, item response theory–derived short form is compared with that of the full scale in the undergraduate samples and a smaller forensic sample. Results indicate that the 18 subscales of the EPA short form remain relatively reliable, possess an internal structure virtually identical to the full version, and manifest highly similar correlational profiles to a variety of criterion measures. The EPA short form is offered as a viable assessment of psychopathy when assessment time is limited. Implications of these findings are discussed.
Trait Negative Affect (NA) and Positive Affect (PA) are strongly associated with Neuroticism and Extraversion, respectively. Nevertheless, measures of the former tend to show substantially weaker self–other agreement—and stronger assumed similarity correlations—than scales assessing the latter. The current study separated the effects of item content versus format on agreement and assumed similarity using two different sets of Neuroticism and Extraversion measures and two different indicators of NA and PA (N = 381 newlyweds). Neuroticism and Extraversion consistently showed stronger agreement than NA and PA; in addition, however, scales with more elaborated items yielded significantly higher agreement correlations than those based on single adjectives. Conversely, the trait affect scales yielded stronger assumed similarity correlations than the personality scales; these coefficients were strongest for the adjectival measures of trait affect. Thus, our data establish a significant role for both content and format in assumed similarity and self–other agreement.
Despite widespread use, the South Oaks Gambling Screen (SOGS) has been criticized for excessive false positives as an indicator of pathological gambling (PG), and for items that misalign with PG criteria. We examine the relationship between SOGS scores and PG symptoms and convergent validity with regard to personality, mood, and addictive behaviors in a sample of 353 gamblers. SOGS scores correlated r = .66 with both DSM-IV and DSM-5 symptoms, and they manifested similar correlations with external criteria (intraclass correlation of .95). However, 195 false positives and 1 false negative were observed when using the recommended cut point, yielding an 81% false alarm rate. For uses with DSM-IV criteria, a cut point of 10 would retain high sensitivity with greater specificity and fewer false positives. For DSM-5 criteria, we advocate a cut point of 8 for use as a clinical screen and a cut point of 12 for prevalence and pseudo-experimental studies.
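The trade-off behind the cut-point recommendations above can be made concrete with a small sketch: raising the cut point removes false positives at some cost in sensitivity. The scores, diagnoses, cut points, and function name below are all invented for illustration; they are not SOGS data, and the abstract does not define how its false alarm rate was computed.

```python
def screen_metrics(scores, is_case, cut):
    """Classify score >= cut as screen-positive and summarize accuracy."""
    true_pos = sum(s >= cut and c for s, c in zip(scores, is_case))
    false_pos = sum(s >= cut and not c for s, c in zip(scores, is_case))
    false_neg = sum(s < cut and c for s, c in zip(scores, is_case))
    true_neg = sum(s < cut and not c for s, c in zip(scores, is_case))
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "false_positives": false_pos,
    }

# Hypothetical screen scores and criterion diagnoses for eight people.
scores = [3, 5, 6, 7, 9, 11, 12, 14]
is_case = [False, False, False, False, True, False, True, True]

lenient = screen_metrics(scores, is_case, cut=5)
strict = screen_metrics(scores, is_case, cut=10)
print(lenient)
print(strict)
```

In this toy data the stricter cut point removes three of the four false positives but misses one true case, which is the kind of balance a screen (favoring sensitivity) and a prevalence study (favoring specificity) would weigh differently.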
This study examined the validity and reliability of the Structured Assessment of Violence Risk in Youth (SAVRY), the Youth Level of Service/Case Management Inventory (YLS/CMI), and the Psychopathy Checklist: Youth Version (PCL:YV) in a sample of Spanish adolescents with a community sanction (N = 105). Self-reported delinquency with a follow-up period of 1 year was used as the outcome measure. The predictive validity of the three measures was compared with the unstructured judgment of the juvenile’s probation officer and the self-appraisal of the juvenile. The three measures showed moderate effect sizes, ranging from area under the curve (AUC) = .75 (SAVRY) to AUC = .72 (PCL:YV), in predicting juvenile reoffending. The two unstructured judgments had no significant predictive validity whereas the SAVRY had significantly higher predictive validity compared with both unstructured judgments. Finally, SAVRY protective factor total scores and SAVRY summary risk ratings did not add incremental validity over SAVRY risk total scores. The high base rates of both violent (65.4%) and general reoffending (81.9%) underline the need for further risk assessment and management research with this population.
This study examined the measurement structure of Child Behavior Checklist internalizing and externalizing syndrome scales in 1,146 eleven-year-old children from a birth cohort in Mauritius. We tested for measurement invariance at configural, metric, and scalar levels by gender and religioethnicity (Creole, Hindu, Muslim). A pared-down model representing five primary factors and two secondary factors met all three forms of invariance, supporting the validity of their use for group comparisons among Mauritian children. As rated by their parents, girls were higher than boys on Somatic Complaints and lower on Aggressive Behavior, Attention Problems, and Externalizing. Creoles were higher than Muslims and Hindus on all seven factors. Hindus were higher than Muslims on Somatic Complaints and lower on Aggressive Behavior. To our knowledge, this is the first study to demonstrate scalar invariance of a Child Behavior Checklist-based internalizing and externalizing factor structure among subgroups within a society.
Estimating the risk for recidivism is important for many areas of the criminal justice system. In the present study, the Youth Actuarial Risk Assessment Tool (Y-ARAT) was developed for juvenile offenders based solely on police records, so that police officers without clinical expertise can estimate the risk of general recidivism among large groups of juvenile offenders. On the basis of the Y-ARAT, juvenile offenders are classified into five risk groups based on (combinations of) 10 variables including different types of incidents in which the juvenile was a suspect, total number of incidents in which the juvenile was a suspect, total number of other incidents, total number of incidents in which co-occupants at the youth’s address were suspects, gender, and age at first incident. The Y-ARAT was developed on a sample of 2,501 juvenile offenders and validated on another sample of 2,499 juvenile offenders, showing moderate predictive accuracy (area under the receiver-operating-characteristic curve = .73), with little variation between the construction and validation samples. The predictive accuracy of the Y-ARAT was considered sufficient to justify its use as a screening instrument for the police.
The Frontal Systems Behavior Scale (FrSBe) is a 46-item questionnaire that measures behaviors associated with frontal subcortical deficits (apathy, disinhibition, and executive dysfunction) in adult neurologic populations. Based on findings from a previous exploratory factor analysis on the scale, the current study used confirmatory factor analysis to explore and potentially improve on the measurement model fit of current FrSBe scores. Model fit indices and reliabilities (measured using internal consistency reliability) were compared in the original and in several alternative models. The original scale demonstrated a generally good fitting model, although the best fitting model (referred to as the reduced model) removed eight items from the original measure and modestly improved model fit over the original FrSBe. Strong reliability was found in both versions. Results from the current study provide a critical first step in a potential FrSBe scale revision.
This study investigates the relationship between the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) and the Temperament and Character Inventory (TCI) in a combined data set (N = 491) of patients with a broad range of psychiatric disorders (n = 286) as well as alcohol use disorder (n = 205). We examined bivariate correlations between both measures. The MMPI-2-RF scales relate to the TCI dimensions as was hypothesized, and relationships between both measurements were largely similar for psychiatric patients and alcohol-dependent patients. Theoretical and clinical implications are considered.
It is widely accepted that abilities are a meaningful level of abstraction for distinguishing among individuals with respect to their levels of cognitive functioning. However, relatively little is known about the extent to which different combinations of tests reflect the same cognitive abilities, or about the relation of cognitive abilities in one test battery with specific tests in another battery. Data from two cognitive batteries were analyzed to determine the correspondence of ability factors in the two batteries, and to evaluate the relative influence of cognitive abilities from one battery on the subtest scores in the other battery. Although the batteries involved different combinations of tests, correlations between the theoretically similar ability factors in the two batteries were very high (i.e., r > .84). Furthermore, with only a few exceptions, the primary influences on the subtest scores in one battery were from the theoretically relevant ability factor in the other battery.
Mindfulness-based interventions are increasingly being used in various populations to improve well-being and reduce psychological afflictions. However, a validated mindfulness measure in the Chinese language is lacking. This study validated the Chinese version of the Five Facet Mindfulness Questionnaire (FFMQ-C) in both a community sample of 230 adults and a clinical sample of 156 patients with significant psychological distress. Results showed good test–retest reliability (.88) and high internal consistency (.83 in the community sample and .80 in the clinical sample). Mindfulness as measured by the FFMQ-C had moderate to large correlations with psychological distress and mental well-being. Two of the five subscales (describing and acting with awareness) showed incremental validity over the others in predicting psychological symptoms and mental health. Confirmatory factor analysis confirmed the five-factor structure of the FFMQ-C and demonstrated adequate model fit. A 20-item short form scale (FFMQ-SF) was developed using the proposed comprehensive criteria. These findings indicate that the FFMQ-C is a reliable and valid measure of mindfulness in a Chinese population. Further study is needed to evaluate the psychometric properties of the FFMQ-SF.
The importance of self-beliefs in prominent models of social phobia has led to the development of measures that tap this cognitive construct. The Self-Beliefs Related to Social Anxiety (SBSA) Scale is one such measure and taps the three maladaptive belief types proposed in Clark and Wells’s model of social phobia. This study aimed to replicate and extend previous research on the psychometric properties of the SBSA. Replicating previous research, in an (undiagnosed) undergraduate sample (n = 235), the SBSA was found to have a correlated three-factor structure using confirmatory factor analyses, and the SBSA and its subscales demonstrated good internal consistency and test–retest reliability. The SBSA and its subscales also had unique relationships with social anxiety and depression, the majority of which replicated previous research. Extending previous research, the SBSA and its subscales showed good incremental validity in the undergraduate sample and good discriminative validity using the undergraduate sample and a sample of individuals with social phobia (n = 33). The SBSA’s strong theoretical basis and the findings of this study suggest that the SBSA is an ideal research and clinical tool to assess the cognitions characteristic of social phobia.
This study examined the factor structure, reliability, and validity of the Antisocial Process Screening Device–Self-Report (APSD-SR), the Youth Psychopathic Traits Inventory (YPI), and the YPI–Short Version (YPI-SV) in detained female adolescents aged 12 to 17 years. The proposed three-factor structure of the YPI and YPI-SV was replicated, whereas neither the proposed three-factor structure of the APSD-SR nor alternate models yielded adequate fit. Overall, reliability indices for the YPI and YPI-SV were higher than those reported for the APSD-SR. APSD-SR and YPI scales were positively related to each other, except for the affective dimensions of the instruments. All questionnaires showed good criterion validity, but the YPI’s factor structure and reliability were superior to those of the APSD-SR. This superiority is not due to the larger number of items in the YPI, because the factor structure and reliability of the YPI-SV were also better than those of the APSD-SR.
The Miranda Rights Comprehension Instruments (MRCI) constitute a revision of Grisso’s original series of tests, the Instruments for Assessing Understanding and Appreciation of Miranda Rights (IAUAMR). We believe that the MRCI represents an improvement in many respects, including (but not limited to) a thorough discussion of admissibility issues in the test manual, simplification of language, the addition of a fifth warning, and updated normative data. We also review some potential challenges associated with the revised MRCI tests. These concerns include inconsistent and confusing terminology in the test manual, potentially nonrepresentative normative data, problematic reliability estimates, issues with scoring criteria for the CMR-II, item and test content representativeness for specific tests, and recommendations on how test effort and malingering should be assessed via the CMR-R-II. We conclude with a series of recommendations for forensic use of the MRCI.
Although recent research with the Disgust Scale–Revised (DS-R) has contributed to current knowledge regarding the structure of disgust, this line of research has exclusively employed adult samples. The current study extended existing research by examining the factor structure of the DS-R in an adolescent sample (N = 637). Exploratory factor analysis revealed three factors: Contagion, Mortality, and Contact Disgust. Subsequent to removing three items due to inadequate factor loadings, confirmatory factor analysis provided support for the 3-factor model across gender, grade level, and racial subgroups. Tests of item–intercept invariance also revealed no differences in item means across grade level. However, three and four items were associated with differences across race and gender, respectively. Latent factor means were also found to be invariant across racial groups and grade level, but not across gender. Implications of the DS-R factor structure in this adolescent sample and its domains are discussed.
Although the McLean Screening Instrument for Borderline Personality Disorder (MSI-BPD) has shown validity in adult samples, only one study has explored its validity in adolescents and, to our knowledge, the measure has not been validated with inpatient adolescents. The aim of the current study was to evaluate the reliability, and convergent and criterion validity, of the MSI-BPD in an effort to establish the clinical utility of the MSI-BPD as a screening measure for BPD in inpatient adolescents. A total of 121 adolescents from an acute care inpatient unit were recruited for the study. Convergent validity was examined with established measures of BPD in adolescents, including the use of receiver operating characteristics analyses to establish a clinical cutoff score for the MSI-BPD in predicting a diagnosis of BPD. Criterion validity was examined by using this clinical cutoff to investigate group differences in suicidal ideation and Axis I symptoms, known correlates of BPD. Findings demonstrated support for validity of the MSI-BPD when used among inpatient adolescents, and established a clinical cutoff of 5.5. Taken together, this study demonstrates adequate validity for the MSI-BPD, and suggests it is a valuable screening measure for BPD in adolescent inpatients.
Mixed group validation (MGV) is a statistical model for estimating the diagnostic accuracy of tests. Unlike the more common approach to estimating criterion-related validity, known group validation (KGV), MGV does not require a perfect external validity criterion. The present article describes MGV by (a) specifying both the standard error associated with MGV validity estimates and the effect of assumption violation, (b) recommending required sample sizes under various study conditions, (c) evaluating whether assumption violation can be identified, and (d) providing a simulated example of an MGV with imperfect base rate estimates. It is concluded that MGV will always have a wider margin of error than KGV, that MGV performs best when the research design approximates a KGV design, that the effect of assumption violation depends on both the severity of the violation and the value of the base rates, and that assumption violation may be detectable only in severe cases.
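The core idea can be sketched with the two-group linear system often used to describe mixed group designs. This is a simplified illustration, not the article's own derivation: it assumes two groups with known base rates of the condition and a test whose true positive rate (TPR) and false positive rate (FPR) are the same in both groups. The function name and all numeric values are hypothetical.

```python
def mgv_estimates(base_rate_1, base_rate_2, pos_rate_1, pos_rate_2):
    """Solve for TPR and FPR from two mixed groups.

    Each group's observed positive-test rate is modeled as
        pos_rate = base_rate * TPR + (1 - base_rate) * FPR
    which gives two linear equations in the two unknowns.
    """
    if base_rate_1 == base_rate_2:
        raise ValueError("groups must differ in base rate")
    # TPR - FPR is the slope of positive rate on base rate.
    slope = (pos_rate_1 - pos_rate_2) / (base_rate_1 - base_rate_2)
    fpr = pos_rate_1 - base_rate_1 * slope
    tpr = fpr + slope
    return tpr, fpr

# Hypothetical groups: 70% and 20% base rates, 59% and 24% test positive.
tpr, fpr = mgv_estimates(0.70, 0.20, 0.59, 0.24)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")

# With base rates of 1 and 0 the design reduces to KGV:
# the observed positive rates are the TPR and FPR directly.
print(mgv_estimates(1.0, 0.0, 0.80, 0.10))
```

Because the base-rate difference sits in the denominator, sampling error in the observed rates is amplified as the two base rates approach each other, which is consistent with the conclusion that MGV performs best when the design approximates KGV.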
The Inventory of Callous and Unemotional Traits (ICU), developed to assess callous/unemotional (CU) traits, has recently experienced increased attention in light of the proposal to add a CU specifier to the conduct disorder diagnosis in DSM-5. In a sample of 70 at-risk adolescents (ages 13-17 years) in the foster care system who received a contemplative intervention program, the present study placed the ICU within a nomological network of correlates, including anxiety, depression, hopefulness, loneliness, and physiological measures of stress (e.g., cortisol). The findings offered some support for the ICU’s construct validity, including significant negative associations with measures of compassion toward others. Nevertheless, unexpected substantial positive correlations emerged with multiple measures of psychological distress, raising questions concerning other aspects of the ICU’s construct validity. Taken together, results of the current study suggest that rather than assessing a dearth of all major emotions as implied by its name and some previous descriptions, the ICU may be heavily saturated with negative emotionality and global maladjustment.
Using a multimeasure longitudinal research design, we measured psychopathy with the Youth Psychopathic Traits Inventory (YPI) and the Psychopathy Checklist–Youth Version (PCL-YV) among 122 offending girls. We examined the psychometric properties of the YPI, investigated the association between the YPI and the PCL-YV, and assessed their concurrent and longitudinal association with externalizing problems on the Youth/Adult Self-Report and violent and delinquent behaviors on the Self-Report of Offending. Alphas for the YPI were adequate and there were small to moderate correlations between the YPI and PCL-YV, suggesting that each assesses distinctive personality features. The YPI and the PCL-YV were approximately equivalent in their association with concurrent and longitudinal outcomes with two exceptions, where the YPI demonstrated a stronger association with antisocial behavior. Concurrently, there was a divergent relationship between the psychopathy factor scores and antisocial outcomes. Within 2 years, the psychopathy affective factor, which constrained the YPI and PCL-YV to be equivalent, was associated with externalizing behaviors and the YPI affective factor was associated with violent offending. Approximately 4½ years later, neither measure was significantly related to antisocial behavior after accounting for past behavior. Reasons for continuity and discontinuity in risk identification are discussed.
Although it is understood that assessment tools require evaluation using diverse samples, such evaluations are relatively rare. There are obstacles to such work, but it remains important to pursue psychometric data in broad samples. As such, we evaluated measurement invariance and population heterogeneity of two versions of a widely used measure in the anxiety literature—the Intolerance of Uncertainty Scale (IUS)—among self-identifying White (N = 1,185) and Black (N = 301) students. Data from multiple-groups confirmatory factor analysis supported the equivalence of the equal form and factor loadings of both IUS versions in White and Black respondents. However, specific IUS items functioned differently in the two groups, with more IUS items appearing biased in the full-length relative to the short-form version. Correlations between IUS factors and worry were equivalent among White and Black respondents. We discuss the implications of these results for future research.
This study investigated the psychometric properties of a number of neuropsychological tests adapted for use in sub-Saharan Africa. A total of 308 school-age children in a predominantly rural community completed the tests. These tests were developed to assess skills similar to those measured by assessments of cognitive development published for use in Western contexts. Culturally appropriate adaptations were made to enhance within-population variability. Internal consistency ranged from .70 to .84. Scores on individual tests were related to various background factors at the level of the child, household, and neighborhood. School experience was the most consistent predictor of outcome, accounting for up to 22.9% of the variance observed. Significant associations were identified to determine salient background characteristics that should be taken into account when measuring the discrete effects of disease exposure in similar sociocultural and economic settings.
This study used the correlated trait–correlated method minus one model to examine the convergent and discriminant validity of the scales of the Strengths and Difficulties Questionnaire (SDQ). The SDQ scales are emotional symptoms (ES), conduct problems (CP), hyperactivity (HY), peer problems (PP), and prosocial behaviors (PS). A total of 202 adolescents provided self-ratings and were also rated by their mothers and teachers. The findings indicated support for convergent validity for all five SDQ scales for all three respondents. Generally there was more convergence between mother–adolescent ratings than mother–teacher and adolescent–teacher ratings, especially for ES and PP. There was support for the discriminant validity between the traits in all scales, except between CP and HY. The findings are discussed in relation to the construct validity and clinical use of the SDQ.
The VIA Inventory of Strengths (VIA-IS) has emerged as the primary instrument for gauging individual strengths and virtues. Prior studies have generated inconsistent results concerning the latent structure of the VIA-IS. The present study attempted to address some of these inconsistencies. VIA-IS results from a large sample (N = 458,998) of U.S. adults who completed the inventory online were subjected to a series of principal components and factor analyses. The sample was 66.46% female with a mean age of 34.36 years (SD = 14.13 years) and consistent with the general U.S. population in terms of geographic distribution. Information on ethnicity was not available. The size of the sample permitted both scale- and item-level analyses. The scale-level analyses produced findings similar to those of previous studies, but raised concerns about multidimensionality in the scales. Item-level analyses suggested an alternate set of 24 scales, 20 of which overlapped substantially with existing VIA-IS scales. A second-order analysis suggested five factors, including a new one labeled Future Orientation, versus the original six virtues proposed in the development of the VIA-IS. The results were used to speculate about elements of a second-generation model of strengths.
Using a multiple regression approach with a large developmental sample (N = 460) of Rorschach protocols from psychiatric, forensic, and nonclinical control groups, the authors created continuous multivariable Composite scores corresponding to the Comprehensive System (CS) Perceptual-Thinking Index, Hypervigilance Index, and Suicide Constellation. Within a validation sample (N = 230), these three new scores, called the Thought and Perception Composite, Vigilance Composite, and Suicide Concern Composite were strongly associated with the three original CS Indices. Additional analyses suggest that the new Composite scores were more reliable than and at least as valid as the original Indices. Interpretive guidelines are offered.
Appraisals of substance abuse often constitute a key component of psychological assessments affecting both diagnostic and treatment issues. Because of negative consequences, many substance users engage in outright denials and marked minimization regarding their drug use. Psychological measures, especially those with transparent items, are highly vulnerable to this denial. To address this response style, indirect items are often included on substance use measures to identify those who deny their use. The purpose of this study was to examine the effects of complete and partial denial on the Drug Abuse Screening Test–20, Substance Abuse Subtle Screening Inventory–3, and Drug Use Screening Inventory–Revised. Partial denial refers to the disacknowledgement of drug-related impairment interfering in multiple domains of a client’s functioning. The study used a mixed within- and between-subjects design with 102 inpatient substance users. Each participant completed the study under two conditions: a disclosing condition and an experimental condition (either complete denial or partial denial). Results show partial denial is distinctly different from complete denial across three self-report substance use measures. Importantly, substance users engaging in these denial conditions were often undetected by these measures. Contrary to expectations, subtle scales with indirect item content were only minimally more effective than the face valid scales alone for the assessment of denied drug use.
This study describes the development and psychometric properties of the Dutch brief form of the Multidimensional Personality Questionnaire (MPQ-BF-NL). Representative samples from the Netherlands (N = 1,055) and the United States (N = 1,153) and a Dutch student sample (N = 987) were used for development, cross- and external validation, respectively. The authors’ strategy for item selection and scale validation replicated the development of the U.S. brief form (MPQ-BF). Internal consistencies were generally good and comparable to the U.S. version, as were correlations with the U.S. full-length scales and higher order structure. Moreover, convergent and divergent patterns were consistent with prediction, with Positive Emotionality related to social and activating behavior, Negative Emotionality to anxiety, and Constraint to reversed impulsivity and externalizing behaviors. In sum, the MPQ-BF-NL provides the Dutch-Flemish language area with a personality inventory well suited for both psychopathology research and clinical practice and offers new opportunities for fundamental and cross-cultural studies on personality.
The purpose of this study was to demonstrate the application of the many-faceted Rasch model to a personality measure. The authors use the model to calibrate the Self-Talk Scale (STS). Good model–data fit supported the measurement of self-talk frequency in adults as a unidimensional construct. Results also supported the proper functioning of the original five-category STS response format. Because of evidence that different items do not contribute equally to the total score, the authors provide information for converting raw STS total scores into more appropriate logit scores. The methodology and results demonstrate how the Rasch model can provide additional support for the validity of measures. Implications for using the Rasch model for personality assessment in general and for using the STS in particular are discussed.
The internal consistency reliability of a measure can be a focal point in an evaluation of the potential adequacy of an instrument for adaptation to another cultural setting. Cronbach’s alpha (α) coefficient is often used as the statistical index for such a determination. However, alpha presumes a tau-equivalent test and may constitute an inaccurate population estimate for multidimensional tests. These notions are expanded and examined with a Japanese version of a questionnaire on nursing attitudes toward suicidal patients, originally constructed in Sweden using the English language. The English measure was reported to have acceptable internal consistency (α), although the dimensionality of the questionnaire was not addressed. The Japanese scale was found to lack tau-equivalence. An alternative to alpha, "composite reliability," was computed and found to be below acceptable standards in magnitude and precision. Implications for research application of the Japanese instrument are discussed.
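The role of tau-equivalence can be illustrated with a minimal sketch. It assumes a single-factor model with standardized loadings and uncorrelated errors, and it uses the common composite reliability formula (squared sum of loadings over squared sum of loadings plus summed error variances); the article's own computation may differ. All loadings are hypothetical.

```python
def implied_covariance(loadings, error_variances):
    """Model-implied covariance matrix for a one-factor model (factor variance 1)."""
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (error_variances[i] if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]

def cronbach_alpha(cov):
    """Alpha from a covariance matrix: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(cov)
    total_variance = sum(sum(row) for row in cov)
    item_variance = sum(cov[i][i] for i in range(k))
    return k / (k - 1) * (1 - item_variance / total_variance)

def composite_reliability(loadings, error_variances):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    return squared_sum / (squared_sum + sum(error_variances))

def compare(loadings):
    errors = [1 - l ** 2 for l in loadings]   # standardized items
    alpha = cronbach_alpha(implied_covariance(loadings, errors))
    omega = composite_reliability(loadings, errors)
    return alpha, omega

# Hypothetical loadings: equal (tau-equivalent) versus unequal (congeneric).
print("tau-equivalent:", compare([0.7, 0.7, 0.7, 0.7]))
print("congeneric:    ", compare([0.9, 0.7, 0.5, 0.3]))
```

With equal loadings the two coefficients coincide; with unequal loadings alpha falls below composite reliability. Multidimensional scales, the case the abstract highlights, need a model with more than one factor and are not covered by this sketch.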
The Chinese version of Beck Depression Inventory II (BDI-II-C) is one of the most used instruments to measure the severity of depression in Taiwan. The scarce literature regarding its psychometric properties (e.g., measurement invariance) highlighted the need and significance for such an investigation. The purpose of this study was to examine the gender-related measurement invariance of the BDI-II-C in an adolescent sample facing an entrance examination in the following two ways: (a) examining configural, metric, and scalar invariance using multigroup confirmatory factor analyses and (b) estimating the effects of any detected noninvariance on mean differences. The participants included 827 (416 boys and 411 girls) Taiwanese adolescents. Results indicate that measurement invariance was established at the level of configural, metric, and partial scalar invariance. Seven noninvariant intercepts (Items 2, 3, 7, 9, 10, 12, and 19) were identified, showing that there was differential additive response style bias for the BDI-II-C across gender groups. Additionally, the results demonstrated that the noninvariance had significant effects on interpretation based on gender latent mean difference as well as observed mean difference.
Continuous Performance Tests (CPTs) are used in research and clinical contexts to measure sustained attention and response inhibition. The reliability and validity of a new Online Continuous Performance Test (OCPT) were assessed. The OCPT is designed for delivery over the Internet, thereby opening new opportunities for research and clinical application in naturalistic settings. In Study 1, participants completed the OCPT twice over a 1-week period. One test was taken at home and one in the laboratory. Construct validity was assessed against a gold standard CPT measure. Results indicate acceptable reliability between the home- and laboratory-administered tests. Modest to high correlations were observed between the OCPT scales and the corresponding scales of the gold standard CPT. Study 2 examined whether the OCPT can discriminate participants with attention deficit hyperactivity disorder from healthy controls. Results revealed significantly higher rates of omission and commission errors and greater response time variability in participants with attention deficit hyperactivity disorder relative to healthy controls. These results support the reliability and validity of the OCPT and suggest that it may serve as an effective tool for the assessment of attention function in naturalistic settings.
Methodologically, longitudinal assessment of cognitive development in young children has proven difficult because few measures span infancy through school age. This matter is further complicated when the child presents with a sensory deficit such as hearing loss. Few measures are validated in this population, and children who are evaluated for cochlear implantation are often reevaluated annually. The authors sought to evaluate the predictive validity of subscales of the Mullen Scales of Early Learning (MSEL) on Leiter International Performance Scales–Revised (LIPS-R) Full-Scale IQ scores. To further elucidate the relationship of these two measures, comparisons were also made with the Vineland Adaptive Behavior Scale–Second Edition (VABS), which provides a measure of adaptive functioning across the life span. Participants included 35 children (14 female, 21 male) who were evaluated both as part of the precandidacy process for cochlear implantation using the MSEL and VABS and following implantation with the LIPS-R and VABS. Hierarchical linear regression revealed that the MSEL Visual Reception subdomain score significantly predicted 52% of the variance in LIPS-R Full-Scale IQ scores at follow-up, F(1, 34) = 35.80, p < .0001, R² = .52, β = .72. This result suggests that the Visual Reception subscale offers predictive validity of later LIPS-R Full-Scale IQ scores. The VABS was also significantly correlated with cognitive variables at each time point.
Moderator and mediator relationships linking variables from three different theoretical traditions—race (subcultural theory), education (life-course theory), and criminal thinking (social learning theory)—and recidivism were examined in 1,101 released male federal prison inmates. Preliminary regression analyses indicated that racial status (White, Black, Hispanic) moderated the relationship between criminal thinking, as measured by the General Criminal Thinking (GCT) score of the Psychological Inventory of Criminal Thinking Styles (PICTS), and recidivism. Further analysis, however, revealed that it was not racial status, per se, that moderated the relationship between the PICTS and recidivism, but educational attainment. Whereas the PICTS was largely effective in predicting recidivism in inmates with 12 or more years of education, it was largely ineffective in predicting recidivism in inmates with fewer than 12 years of education. When education and the GCT score were compared as possible mediators of the race–recidivism relationship only the GCT successfully mediated this relationship. Sensitivity testing showed that the GCT mediating effect was moderately robust to violations of the sequential ignorability assumption on which causal mediation analysis rests. Moderator and mediator analyses are potentially important avenues through which theoretical constructs can be integrated and assessment strategies devised.
This study examined the predictive validity of the Washington State Juvenile Court Pre-Screen Assessment (WSJCA pre-screen) in the Netherlands. Previous research conducted in the United States showed the predictive validity of the WSJCA pre-screen to be modest, as is the case with the predictive validity of most other risk assessment instruments for juveniles. Therefore, it was also examined whether the predictive validity of the WSJCA pre-screen can be improved by modifying the scoring procedure. The sample consisted of 520 youths who had been referred to the juvenile probation service by court. The present study showed the predictive validity of the WSJCA pre-screen in the Netherlands to be modest too, with an area under the receiver operating characteristic curve (AUC) of .625. Modifying the scoring procedure by means of chi-squared automatic interaction detector analyses significantly improved the predictive validity to an AUC of .702. The modified scoring procedure is time-saving because only variables that uniquely contribute to the prediction of recidivism are included, which at the same time leads to a more accurate prediction of recidivism.
This article reviews cognitive interviewing (CI) as a survey pretesting method in cross-national settings. Particularly, semi-structured cognitive interviewing (SSCI) using direct probing is advocated when CI involves multiple countries/languages. Four major groups of fundamental issues are discussed: conceptual, measurement, procedural, and practical. The conceptual issues relate to the nature of interview data, potential sources of problems, and sample size. Next, it is shown how the SSCI method can be used to informally evaluate validity, reliability, and cross-cultural equivalence. This is followed by the procedural steps and the practical issues in implementing cross-national SSCI studies. Some methodological and practical limitations are also noted. The article concludes by highlighting the implications of using the cross-national CI method in a single-country context with multiple immigrant/cultural/language groups or in monocultural settings.
This study examined the psychometric properties of the 19-item Thought–Action Fusion (TAF) Scale, a measure of maladaptive cognitive intrusions, in a large clinical sample (N = 700). An exploratory factor analysis (n = 300) yielded two interpretable factors: TAF Moral (TAF-M) and TAF Likelihood (TAF-L). A confirmatory bifactor analysis was conducted on the second portion of the sample (n = 400) to account for possible sources of item covariance using a general TAF factor (subsuming TAF-M) alongside the TAF-L domain-specific factor. The bifactor model provided an acceptable fit to the sample data. Results indicated that global TAF was more strongly associated with a measure of obsessive-compulsiveness than measures of general worry and depression, and the TAF-L dimension was more strongly related to obsessive-compulsiveness than depression. Overall, results support the bifactor structure of the TAF in a clinical sample and its close relationship to its neighboring obsessive-compulsiveness construct.
Objective. To examine psychometric properties and investigate factor structures of the Mandarin Chinese version of the Eating Disorder Inventory (C-EDI). Method. The Mandarin C-EDI and other self-administered questionnaires were completed by a group of female eating disorder (ED) patients (n = 551) and a group of female nursing students (n = 751). Internal consistency, and convergent and discriminant validities were evaluated. Exploratory and confirmatory factor analyses were conducted to examine the construct validity of the Mandarin C-EDI. Results. The Mandarin C-EDI had good internal consistency and convergent and discriminant validities. With a few exceptions, the original clinically derived eight EDI subscales were clearly identified and the factorial validity of the first-order eight-factor structure and the second-order two-factor structure showed an acceptable degree of fit to our empirical data in clinical patients. Discussion. The findings suggest that the Mandarin C-EDI is a valid tool for clinical use in Taiwan.
The Relationship dimension of the Family Environment Scale, which consists of the Cohesion, Expressiveness, and Conflict subscales, measures a person’s perception of the quality of his or her family relationship functioning. This study investigates an adaptation of the Relationship dimension of the Family Environment Scale for Alaska Native youth. The authors tested the adapted measure, the Brief Family Relationship Scale, for psychometric properties and internal structure with 284 12- to 18-year-old predominantly Yup’ik Eskimo Alaska Native adolescents from rural, remote communities. This non-Western cultural group is hypothesized to display higher levels of collectivism traditionally organized around an extended kinship family structure. Results demonstrate that a subset of the adapted items functions satisfactorily, that a three-response alternative format provides meaningful information, and that the subscales’ underlying structure is best described by three distinct first-order factors organized under one higher order factor. Convergent and discriminant validity of the Brief Family Relationship Scale was assessed through correlational analysis.
This article reports on a confirmatory factor analytic study of an adapted version of an instrument designed to assess family functioning of Chinese families. The Chinese Family Assessment Instrument, originally designed for completion by adolescents, was adapted for completion by parents. A sample of 700 parent dyads of elementary school children (382 girls and 318 boys) completed the adapted questionnaire. Initial factor analyses showed that the existing five-factor structure used for adolescents’ responses was not a good fit for these data. Instead, a four-factor solution emerged where the factors were positive family functioning, negative family functioning, tolerance for family members, and parental understanding. This structure was the same for both mothers and fathers. Further studies of the Chinese Family Assessment Instrument parent adaptation are required to test the factor structure that emerged. Following such studies, validation studies will be required.
Historically, psychopathy has been viewed as a clinical syndrome with a unitary etiology, assessed via clinical interview. However, factor analytic studies suggest that psychopathy may also be understood as a combination of two subfactors consisting of (a) interpersonal-affective and (b) lifestyle-antisocial traits. Furthermore, evidence supports the use of self-report measures to assess psychopathy and these subfactors. This investigation employed a Stroop-like task to determine the relationship of the two psychopathy factors, as assessed by both interview-based and self-report measures, to attention-related abnormalities in psychopathy. For both instruments, the factors interacted to predict performance (i.e., interference), though the unique main effects were nonsignificant. The results suggest that the anomalous selective attention of psychopathic offenders is specific to individuals with high scores on both factors. Moreover, these results have important implications for the two-factor model of psychopathy and provide preliminary support for the functional similarity of self-report and interview-based measures of psychopathy.
This study examined how implicit and explicit changes following integrative inpatient treatment of adolescents with eating disorders (EDs) may predict the posttreatment ratings of psychodynamic therapists of their patients’ openness to therapeutic processes and their change (Therapist Evaluation Inventory). The relative contribution of inpatients’ ego functions was compared with that of their mental distress and ED symptoms in two subgroups: restricting type anorexia (AN-R) and binging/purging type EDs (B/P). Data indicated that the implicit personality variable of elevated ability to modulate affects was the best predictor of therapist-rated global outcome among patients with B/P symptoms, whereas in patients with AN-R, evolving openness to implicit negative affects and a reduction in reported distress were the best predictors. In patients with AN-R, attenuated affect control was also significantly correlated with therapist posttreatment ratings. These data suggest that, in addition to addressing behavioral/symptomatic aspects, personality variables should be addressed in the psychological treatment of EDs.
Terrorism creates lingering anxiety about future attacks. In prior terror research, the conceptualization and measurement of coping behaviors were constrained by the use of existing coping scales that index reactions to daily hassles and demands. The authors created and validated the Coping with Terror Scale to fill the measurement gap. The authors emphasized content validity, leveraging the knowledge of terror experts and groups of Israelis. A multistep approach involved construct definition and item generation, trimming and refining the measure, exploring the factor structure underlying item responses, and garnering evidence for reliability and validity. The final scale comprised six factors that were generally consistent with the authors’ original construct specifications. Scores on items linked to these factors demonstrate good reliability and validity. Future studies using the Coping with Terror Scale with other populations facing terrorist threats are needed to test its ability to predict resilience, functional impairment, and psychological distress.
There are two commonly used measures of boredom: the Boredom Proneness Scale (BPS) and the Boredom Susceptibility Scale (ZBS). Although both were designed to measure the propensity to experience boredom (i.e., trait boredom), there are reasons to think they may not measure the same construct. The present research sought to evaluate this proposition in several stages. Specifically, relationships between the BPS, ZBS, and important causal (Study 1, N = 837), correlational (Study 2, N = 233), and outcome variables (Study 3, N = 137) were examined in university students. Taken together, results support the notion that the BPS and ZBS do not measure the same construct. Specifically, higher BPS scores were associated with higher levels of neuroticism, experiential avoidance, attentional and nonplanning impulsivity, anxiety, depression, dysphoria, and emotional eating. Conversely, higher ZBS scores were associated with higher levels of motor impulsivity, sensitivity to reward, gambling, and alcohol use and lower levels of neuroticism, experiential avoidance, and sensitivity to punishment.