BACKGROUND

The categorical classification system of the Diagnostic and statistical manual of mental disorders (DSM; American Psychiatric Association) has long been criticized in relation to personality disorders (Trull & Durett, 2005). Longstanding issues include a lack of diagnostic reliability (mainly owing to overlapping criteria for the different disorders and the clinically meaningless comorbidity of resulting diagnoses) and an inability to provide information about disease severity (Westen & Arkowitz-Westen, 1998). Over the years, numerous empirical studies (Krueger & Eaton, 2010) have supported this criticism of the categorical model, resulting in a gradual shift toward a more dimensional approach to personality disorders (Cierpiałkowska, 2013). As Stern et al. (2018) point out, this dimensional approach has, to varying degrees, found expression in the Alternative Model for Personality Disorders in the DSM-5 (APA, 2013), the Psychodynamic Diagnostic Manual-2 (Lingiardi & McWilliams, 2017), and the International Classification of Diseases, 11th ed. (WHO, 2020). The consensus in favor of applying a dimensional approach to personality disorders has necessarily revived the debate on the precise nature of personality dimensions (Hicks et al., 2017) and called into question the use of assessments based mainly on dispositional traits related to the Big Five (Krueger & Markon, 2014).

For many psychodynamic clinicians around the world, these moves toward a more dimensional model are welcome, but incomplete. In particular, these hybrid models fail to account for the highly unstable nature of interpersonal relationships over time and the lack of self-perception in disorders such as borderline personality disorder (Gunderson et al., 2018). The greatest risk of these hybrid and therefore partial assessments is that they adversely affect the planning or management of treatment (Gordon et al., 2019). For this reason, the dimensional approach has started to gain precedence over categorical models (Hörz-Sagstetter et al., 2021).

We therefore sought to develop a French-language version of the Inventory of Personality Organization (IPO), based on Kernberg’s model of personality organization.

In Kernberg’s (1996) diagnostic and theoretical framework, personality lies on a continuum ranging from normal to the most disturbed (psychotic) level of pathology. The level of personality organization is established by assessing three main components of personality, namely primitive psychological defenses, reality testing, and identity diffusion. Personality disorders manifest themselves in high levels of one or more of these three dimensions. Reality testing refers to the ability to distinguish self from nonself (and intrapsychic from external stimuli), and to maintain an empathetic connection with the ordinary social criteria of reality (Kernberg, 1996). Identity diffusion refers to the failure to produce cohesion in the subjective experience of self and significant others. The boundaries of the self are confused, and the self is fragmented. As a result, norms, interests, ethics and ideals (i.e., value system) are poorly integrated, if at all. The presence of primitive defense mechanisms such as splitting, projection, idealization, dissociation and denial (as opposed to more mature and elaborate defenses such as repression, isolation, and reaction formation), combined with disturbance in the perception of identity, distorts and interferes with interpersonal interactions. It is suggestive of severe psychopathology, in particular a borderline organization, while the exclusive or massive presence of primitive defenses associated with a loss of reality testing (psychotics have high scores in all three dimensions; Smits et al., 2009) is suggestive of a psychotic organization.

The Inventory of Personality Organization (IPO) is derived from Kernberg’s (1996) psychodynamic model. The original 155-item version of this self-report instrument assesses the three major dimensions of personality and features secondary scales, such as aggression and moral values (Lenzenweger et al., 2001). The three primary clinical scales, totaling 57 items, operationalize the core diagnostic components of Kernberg’s model. The IPO has been translated and assessed in several languages, including Canadian French (Biberdzic, 2017) and Japanese (Igarashi et al., 2009).

The present research only concerned the IPO’s three primary clinical scales (i.e., 57 items), which were translated into French before being evaluated. We chose to translate the English version, which has remained the most frequently used version since its evaluation by Lenzenweger et al. (2001). First, the IPO was translated into French by an English speaker who had been living in France for a long time. Next, the quality of this translation was assessed in the light of its back translation into English by two bilingual people who were not familiar with the questionnaire. Some items were then adjusted by two experts in the field of psychology, so that they were colloquially expressed in French while remaining faithful to Kernberg’s theory. During this translation process, we considered Biberdzic’s (2017) earlier translation into Canadian French, adapting some of his items to audiences who use Metropolitan French.

In the first of two studies, we assessed the reliability of the internal structure of our French form of the IPO (IPO-fr). The second study assessed the convergent validity of the questionnaire. Both studies were conducted in a nonclinical young adult population. Our methodology was consistent with that used for the reference version of the tool (Lenzenweger et al., 2001), as well as with that used to assess the translations (e.g., Igarashi et al., 2009; Smits et al., 2009).

STUDY 1: ASSESSMENT OF IPO-FR’S INTERNAL STRUCTURE

Given Lenzenweger et al.’s (2001) conclusions, refined by Smits et al. (2009), we expected the latent structure of the IPO, originally developed on the basis of Kernberg’s work as a three-dimensional construct, to fit the data better when the Identity Diffusion and Primitive Defense scales were treated as contributing to the same dimension. This expectation corresponded to what can be assessed clinically and empirically (Smits et al., 2009), insofar as a personality pathology or the absence of pathological elements is reflected in very high scores on one or the other of these two scales. By contrast, a high score on the Reality Testing scale only allows users to assume the presence of a psychotic personality organization. Based on the methodology developed by Smits et al. (2009), we only retained items with a factor loading of at least .40 (Samuels, 2016).

PARTICIPANTS AND PROCEDURE

PARTICIPANTS

Our sample comprised 602 first- and second-year psychology students (University of Angers, France): 112 men, 486 women, and 4 participants of unspecified gender. This gender ratio is typical of psychology students in France (Gaillard & Rexand-Galais, 2017). The mean age of the sample was 19.40 years (SD = 1.95).

INSTRUMENT

The IPO-fr is made up of three scales: Primitive Defenses (PD; 16 items), Identity Diffusion (ID; 21 items), and Impaired Reality Testing (RT; 20 items). All the items are rated on a 5-point Likert-like scale ranging from never true to always true. The psychometric properties and validity of the IPO have already been demonstrated in both clinical and nonclinical samples (Smits et al., 2009).

PROCEDURE

This study was approved by the research ethics committee of Angers University (registration no. UA-CER-2021-09). Data were collected from 26 November 2021 to 19 February 2022. The researchers first gave prospective respondents detailed background information. Respondents then signed an informed consent form. Students who agreed to participate provided basic demographic information and answered the questionnaire anonymously via a computer.

STATISTICAL ANALYSIS

First, we ran an exploratory factor analysis (EFA) of responses to the questionnaire provided by first-year students (n = 269). As the factors were expected to be correlated, we performed a promax rotation (i.e., an oblique rotation), in accordance with Igarashi et al. (2009). In line with Lenzenweger et al. (2001) and Smits et al. (2009), we set the number of factors at two. Furthermore, at the end of this EFA, items with factor loadings no higher than .40 on the scale to which they were supposed to belong were excluded. If an item loaded on an unexpected factor, it was also deleted. As is customary, in order to assess the stability of the IPO-fr’s factor structure identified by this EFA, we carried out a series of confirmatory factor analyses (CFA) on the data collected among second-year students (n = 333). The model fit parameters included indices (Hu & Bentler, 1999) such as the comparative fit index (CFI), whose conventional threshold of .90 is a good indicator of a reasonable fit (Awang, 2012). In addition, the root mean square error of approximation (RMSEA) and the standardized root mean square residual (SRMR) were used as absolute fit indices, with threshold values of < .08 for RMSEA and < .06 for SRMR (Hu & Bentler, 1999). Finally, the Akaike information criterion (AIC) and Bayesian information criterion (BIC) made it possible to compare the fit of the different models we tested, using Raftery’s (1995) and Burnham and Anderson’s (2004) criteria.
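The item retention rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the data layout, and the toy loadings are hypothetical, and the handling of an item that loads above the cutoff on both its expected factor and another factor (treated here as misplaced) is our reading of the rule rather than something the text specifies.

```python
def classify_item(loadings_by_factor, expected_factor, threshold=0.40):
    """Classify one item from its EFA loadings (hypothetical helper).

    loadings_by_factor: dict mapping factor name -> loading for this item
    expected_factor: the factor the item is theoretically supposed to load on
    """
    # Factors on which the item has a salient loading (above the cutoff).
    salient = {f for f, v in loadings_by_factor.items() if abs(v) > threshold}
    if salient - {expected_factor}:
        return "misplaced"      # salient loading on an unexpected factor
    if expected_factor in salient:
        return "retained"       # salient loading on the expected factor only
    return "low_loading"        # no loading above the cutoff


# Made-up loadings for four fictitious items (not the study's data).
toy_items = {
    "a": ({"PD/ID": 0.62, "RT": 0.10}, "PD/ID"),
    "b": ({"PD/ID": 0.31, "RT": 0.22}, "PD/ID"),
    "c": ({"PD/ID": 0.55, "RT": 0.05}, "RT"),
    "d": ({"PD/ID": 0.08, "RT": 0.47}, "RT"),
}
decisions = {name: classify_item(l, e) for name, (l, e) in toy_items.items()}
print(decisions)
```

Only items classified as retained would be carried forward to the confirmatory analyses.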

RESULTS AND DISCUSSION

The results showed good internal consistency for all three scales (PD: α > .86; ID: α > .91; RT: α > .89). Mean scores and standard deviations for each scale, gender, and sample are set out in Table 1.

Table 1

Mean scores and standard deviations as a function of gender and sample for the Reality Testing (RT) and Primitive Defenses (PD) / Identity Diffusion (ID) scales

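Nonclinical samples: first-year students (n = 267a), second-year students (n = 331a), total (N = 602).

| Version | Subscale | First-year males M | First-year males SD | First-year females M | First-year females SD | Second-year males M | Second-year males SD | Second-year females M | Second-year females SD | Total M | Total SD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IPO-fr (40 items) | RT | 1.99 | 0.45 | 2.07 | 0.47 | 1.73 | 0.34 | 1.71 | 0.33 | 1.87 | 0.43 |
| IPO-fr (40 items) | PD/ID | 2.40 | 0.59 | 2.49 | 0.56 | 2.41 | 0.77 | 2.35 | 0.74 | 2.41 | 0.67 |
| IPO (57 items) | RT | 1.99 | 0.45 | 2.07 | 0.47 | 1.73 | 0.34 | 1.71 | 0.33 | 1.87 | 0.43 |
| IPO (57 items) | PD | 2.32 | 0.57 | 2.41 | 0.53 | 2.33 | 0.85 | 2.25 | 0.76 | 2.32 | 0.69 |
| IPO (57 items) | ID | 2.46 | 0.62 | 2.54 | 0.60 | 2.48 | 0.76 | 2.42 | 0.75 | 2.48 | 0.69 |

Note. a Two participants in each sample (total = 4) did not specify their gender.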
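The internal consistency coefficients (α) reported just before Table 1 are presumably Cronbach's alphas. As a reminder of how such a coefficient is obtained, the sketch below computes it from raw item scores using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the toy responses are hypothetical and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses: list of rows, one per respondent; each row lists that
    respondent's scores on the k items of a single scale.
    """
    k = len(responses[0])
    item_columns = list(zip(*responses))               # scores grouped by item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Made-up answers of five respondents to a three-item scale (1-5 ratings).
toy_responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2))
```

The sample variance (n − 1 denominator) is used throughout; using the population variance for both terms would give the same ratio.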

Concerning the EFA carried out on our first-year sample (n = 269), the results (see: Supplementary materials) showed that (a) almost all the items with a significant loading on the first factor belonged to the PD/ID scale, and (b) almost all the items with a significant loading on the second factor belonged to the RT scale. Across the scales, 10 items were excluded, as they did not have a factor loading above .40. It should also be noted that items 44 and 46 (theoretically belonging to the PD/ID scale) as well as items 30, 39, 42, 56 and 12 (originally belonging to the RT scale) had a factor loading above .40, but not on the right scale. They were therefore excluded from subsequent analyses. Factor analysis of the retained items showed that they all had a loading above .40 on their respective factor.


In a second step, in accordance with Smits et al. (2009), we ran a CFA of the 40 remaining items for the sample of second-year students. For this purpose, three CFA models were fitted to the IPO-fr: (1) a single-factor model in which all the items loaded on a single factor (M1); (2) a two-factor model in which the items belonging to the ID and PD scales loaded on factor 1, and the items belonging to the RT scale loaded on factor 2 (M2); and (3) a three-factor model corresponding to what had originally been theorized (M3).

The results of these CFAs (Table 2) showed that the two- and three-factor models fitted the data, confirming previous findings (Lenzenweger et al., 2001; Smits et al., 2009). The three-factor model seemed to fit the data slightly better (χ²/df = 2.40, CFI = .904, RMSEA = .065, SRMR = .050, AIC = 26290.09, BIC = 26605.91) than the two-factor model did (χ²/df = 2.43, CFI = .902, RMSEA = .066, SRMR = .050, AIC = 26306.54, BIC = 26614.76) with respect to Raftery’s (1995) and Burnham and Anderson’s (2004) criteria. We nevertheless decided to retain the M2 model, as there was an extremely close correlation (.98) between the identity diffusion and primitive defense factors in model 3, making them difficult to differentiate (Smits et al., 2009).

Table 2

Fit indices of confirmatory factor analysis

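| Model | χ² | df | CFI | AIC | BIC | SRMR | RMSEA |
|---|---|---|---|---|---|---|---|
| M1 | 3067.99*** | 740 | .784 | 27580.25 | 27884.66 | .097 | .099 |
| M2 | 1792.28*** | 739 | .902 | 26306.54 | 26614.76 | .050 | .066 |
| M3 | 1771.83*** | 737 | .904 | 26290.09 | 26605.91 | .050 | .065 |

Note. M3 had the best fit, but we chose to retain M2; ***p < .001.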
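The comparison of the three models can be retraced from the values in Table 2. The sketch below recomputes χ²/df for each model, checks each one against the thresholds stated in the Statistical analysis section (CFI of at least .90, RMSEA < .08, SRMR < .06), and computes the AIC and BIC differences between M2 and M3. The variable and function names are hypothetical, and the code deliberately stops at the raw differences: the interpretive bands of Raftery (1995) and Burnham and Anderson (2004) are not restated in the text, so they are not encoded here.

```python
# Fit statistics copied from Table 2.
FITS = {
    "M1": {"chi2": 3067.99, "df": 740, "cfi": 0.784, "aic": 27580.25,
           "bic": 27884.66, "srmr": 0.097, "rmsea": 0.099},
    "M2": {"chi2": 1792.28, "df": 739, "cfi": 0.902, "aic": 26306.54,
           "bic": 26614.76, "srmr": 0.050, "rmsea": 0.066},
    "M3": {"chi2": 1771.83, "df": 737, "cfi": 0.904, "aic": 26290.09,
           "bic": 26605.91, "srmr": 0.050, "rmsea": 0.065},
}


def chi2_per_df(fit):
    """Normed chi-square (chi-square divided by its degrees of freedom)."""
    return fit["chi2"] / fit["df"]


def meets_stated_thresholds(fit):
    """Check the cutoffs given in the Statistical analysis section."""
    return fit["cfi"] >= 0.90 and fit["rmsea"] < 0.08 and fit["srmr"] < 0.06


def criterion_difference(criterion, model_a, model_b):
    """Value of an information criterion for model_a minus that for model_b."""
    return FITS[model_a][criterion] - FITS[model_b][criterion]


for name, fit in FITS.items():
    print(name, round(chi2_per_df(fit), 2), meets_stated_thresholds(fit))

delta_aic = criterion_difference("aic", "M2", "M3")
delta_bic = criterion_difference("bic", "M2", "M3")
print("AIC(M2) - AIC(M3) =", round(delta_aic, 2))
print("BIC(M2) - BIC(M3) =", round(delta_bic, 2))
```

Positive differences indicate that M3 has the lower (better) information criterion, which is the sense in which it fits best before parsimony and the interfactor correlation are taken into account.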

We prioritized the most balanced model, considering both goodness of fit (good fit of the model to the data) and parsimony (simplicity of the model due to the low number of parameters). This choice seems entirely consistent with Kernberg’s approach (Kernberg, 1996) and statistical findings on the IPO (Lenzenweger et al., 2001; Smits et al., 2009).

In addition to showing a good fit to the data, the two-factor model had a moderate interfactor correlation (.44), supporting the distinctness of the two dimensions. No significant gender differences were observed in the two nonclinical samples’ results on the different scales of the IPO-fr. The two-factor structure of the IPO-fr was wholly consistent with Kernberg’s (1996) theory, according to which identity diffusion involves the use of primitive defense mechanisms.

This analysis attested to the validity of the IPO-fr in a student population. The factor structure of this French translation was convergent with previously acquired reference data (Lenzenweger et al., 2001; Smits et al., 2009). Although 17 items had to be removed from the original instrument, this was in line with Smits et al. (2009). Ten of these items (1, 3, 9, 18, 26, 32, 35, 38, 45, and 50) did not have a sufficient factor loading to be retained, while the seven others (ID/PD: 44, 46; RT: 12, 30, 39, 42, 56) were excluded because they loaded on a factor to which they were not supposed to belong. To maintain the specificity of each of the factors, we chose not to move these seven items to the factor to which they appeared statistically to belong. We therefore obtained a questionnaire with 40 items, 10 belonging to the RT scale and 30 to the PD/ID scale (see: Supplementary materials), for which the validity of the internal structure and factorial singularity were established.

STUDY 2: CONVERGENT VALIDITY WITH PERSONALITY DIAGNOSTIC QUESTIONNAIRE-4+, POSITIVE AND NEGATIVE AFFECT SCHEDULE, HOSPITAL ANXIETY AND DEPRESSION SCALE, AND AGGRESSION QUESTIONNAIRE

To verify that the IPO-fr can indeed enable clinicians to locate individuals on the pathological-normal personality organization continuum, in line with Kernberg’s theory, we conducted this second study to assess its convergent validity. We compared respondents’ scores on the IPO-fr’s scales with other measures assessing (a) personality disorders, (b) positive and negative affect, (c) aggression, and (d) depressive symptoms and anxiety. Given the considerable comorbidity between personality pathology and mood disorders (Gunderson et al., 2018), we expected to observe a close link between the severity of respondents’ structural impairment and their emotional dysregulation in its affective and behavioral (aggression) components, as well as the anxious and depressive elements. High scores on these would be reflected by similar scores on the IPO-fr scales. We also expected to find a link with positive affect, as already reported by Lenzenweger et al. (2001). Furthermore, given that Igarashi et al. (2009) and Smits et al. (2009) demonstrated a link between the personality disorder categories of the DSM-IV and the IPO, we expected to observe a similar correspondence. We assumed that there would be significant differences on all the IPO-fr scales between individuals who met the criteria for cluster A personality disorders (paranoid, schizoid, and schizotypal), and those with cluster B personality disorders (antisocial, borderline, histrionic, and narcissistic) or cluster C personality disorders (avoidant, obsessive-compulsive, and dependent), as assessed with another test that is closely correlated with the DSM. In addition, we expected participants in cluster A and cluster B to differ significantly on the PD/ID scale, but not on the RT scale.

PARTICIPANTS AND PROCEDURE

PARTICIPANTS

The sample for this second study consisted of 305 students from different departments at the University of Angers. It was composed of 63 men and 242 women, with a mean age of 19.83 years (SD = 2.12). Participants completed all the tests that were administered to them.

INSTRUMENTS

For this second study, all participants were asked to respond not only to the IPO-fr, but also to four questionnaires translated and validated in French and previously used in different studies:

Personality disorders. The Personality Diagnostic Questionnaire-4+ (PDQ-4+) is a self-report questionnaire derived from the personality disorders section of the DSM-IV. It assesses 10 specific personality disorders in Axis II, plus two (passive-aggressive and depressive) in Appendix B of the DSM-IV. In France, the validated French version of the PDQ-4+ (Bouvard, 2002) is one of the questionnaires most widely used to screen for personality disorders and it is particularly useful for nonclinical populations (Wang et al., 2013).

Positive and negative affect. The Positive and Negative Affect Schedule (PANAS) is a self-report questionnaire composed of two 10-item scales that measure positive (attentive, active, alert, etc.) and negative (hostile, irritable, ashamed, etc.) affects. The French version of the PANAS was validated by Caci and Baylé (2007).

Aggression. Lack of aggression control was assessed using the French version of the Aggression Questionnaire (AQ; Masse, 2001), which was validated by Pfister et al. (2001) in a nonclinical population.

Depressive symptoms and anxiety. The French version of the Hospital Anxiety and Depression Scale (HADS) was introduced by Lepine et al. (1985). The HADS is a self-report scale that was developed to detect states of depression and anxiety in the setting of a hospital outpatient clinic.

PROCEDURE

Like Study 1, this study was approved by the research ethics committee of Angers University (registration no. UA-CER-2021-09) and followed the same protocol.

STATISTICAL ANALYSIS

After looking for correlations between the scores on the IPO-fr scales and scores on the HADS, AQ, and PANAS (Table 3), we divided the sample into clinical subgroups according to the 10 personality disorders (plus: passive-aggressive and depressive) diagnosed with the PDQ-4+ and its Clinical Significance Scale.

Table 3

Means, standard deviations, and correlations between the IPO-fr scales and the HADS, PANAS and AQ scales

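| Variable | 1 | 2 | 3 | M | SD |
|---|---|---|---|---|---|
| 1. IPO-fr Total | | | | 74.99 | 22.04 |
| 2. IPO-fr PD/ID | .99*** | | | 59.95 | 20.54 |
| 3. IPO-fr RT | .46*** | .30*** | | 15.04 | 3.98 |
| 4. HADS Total | .59*** | .58*** | .23*** | 10.61 | 6.77 |
| 5. HADS A | .44*** | .45*** | .17** | 7.07 | 4.36 |
| 6. HADS D | .70*** | .69*** | .34*** | 3.61 | 3.57 |
| 7. PANAS+ | –.59*** | –.61*** | –.12* | 38.65 | 10.85 |
| 8. PANAS– | .43*** | .42*** | .18** | 22.11 | 8.28 |
| 9. AQ Total | .67*** | .68*** | .21*** | 64.94 | 26.82 |
| 10. AQ PA | .60*** | .61*** | .19*** | 10.67 | 4.33 |
| 11. AQ VA | .58*** | .59*** | .20*** | 11.24 | 4.11 |
| 12. AQ A | .68*** | .69*** | .21*** | 9.99 | 4.84 |
| 13. AQ H | .62*** | .63*** | .17** | 11.74 | 5.56 |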

Note. *p < .05, **p < .01, ***p < .001. IPO-fr Total = total score on the French Inventory of Personality Organization; IPO-fr PD/ID = Primitive Defenses and Identity Diffusion score; IPO-fr RT = Reality Testing score; HADS Total = total score on Hospital Anxiety and Depression Scale; HADS A = anxiety subscore; HADS D = depression subscore; PANAS+ = positive affect measured by Positive and Negative Affect Schedule; PANAS– = negative affect measured by Positive and Negative Affect Schedule; AQ Total = total score on Aggression Questionnaire; AQ PA = physical aggression subscore; AQ VA = verbal aggression subscore; AQ A = anger subscore; AQ H = hostility subscore.
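For readers who wish to retrace how entries such as those in Table 3 are obtained, the sketch below computes a correlation coefficient and the t statistic commonly used to test it, t = r√(n − 2)/√(1 − r²). It assumes Pearson correlations, which the text does not specify; the function names and toy vectors are hypothetical. As a plausibility check, the smallest coefficient in the table (r = .12 in absolute value, N = 305) yields a t value slightly above 2, in keeping with its p < .05 flag.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Made-up scores for five fictitious respondents (not the study's data).
scale_a = [1, 2, 3, 4, 5]
scale_b = [2, 4, 6, 8, 10]
print(pearson_r(scale_a, scale_b))

# Absolute value of the weakest correlation reported in Table 3.
t_smallest = t_from_r(0.12, 305)
print(round(t_smallest, 2))
```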

Moreover, to verify the IPO-fr’s predictive potential, we ran analyses of variance (ANOVAs) on its two scales according to subgroups. These ANOVAs and their related post hoc tests served to isolate the diagnostic specificities of the IPO-fr scales with regard to the different personality organizations. For the sake of transparency and replicability of our research protocol, we carried out different types of ANOVAs, taking their particular parametric assumptions into account. Various authors have argued that F-tests comparing k means are robust to violations of assumptions such as homogeneity of variances and normality of distribution (Blanca et al., 2017). There is no consensus on this position, however, as relying on it can inflate the rate of false positives (i.e., Type I errors). Although it is all too seldom explained and justified, the choice of a particular approach has an impact on the construction of scientific knowledge (Delacre et al., 2019). We therefore based our choice on the characteristics of our data: the Shapiro-Wilk test and Levene’s test did not allow us to assume normal distribution and homoscedasticity of the residuals (p < .01), prompting us to favor the results of the nonparametric Kruskal-Wallis test (Ostertagova et al., 2014). All of these results are set out in Supplementary materials.
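Because the Kruskal-Wallis test works on ranks rather than raw scores, it does not rely on the normality and homoscedasticity assumptions discussed above. The sketch below computes the H statistic, with the usual correction for ties, using only the Python standard library. It is an illustration, not the software used in the study; the function names and toy groups are hypothetical, and the p value (obtained by referring H to a chi-square distribution with k − 1 degrees of freedom) is left out.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of scores."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted_sum, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_sum += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * weighted_sum - 3 * (n + 1)
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Made-up scores for three fictitious subgroups (not the study's data).
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_statistic = kruskal_wallis_h(toy_groups)
print(round(h_statistic, 2))
```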

RESULTS

No significant differences were observed between the IPO-fr scales with respect to gender. The IPO-fr’s means and standard deviations were similar to those we observed in Study 1. Furthermore, the diagnostic assessment performed using the PDQ-4+ and its Clinical Significance Scale allowed us to observe a percentage of personality disorders in the surveyed population comparable to what has been demonstrated in other studies using the same test among French students (Bouvard & Cosma, 2008). No personality disorder was detected in 224 respondents (73.44%), but 43 individuals (14.10%) had cluster C personality disorders, 33 (10.82%) could be categorized as having cluster B personality disorders, and 5 (1.64%) had cluster A personality disorders.

In line with Kernberg (1996) and Lenzenweger et al. (2001), the results tended to show highly significant correlations between IPO-fr scores, negative affect (PANAS), and the aggression components of the AQ. Correlations were also found between the anxiety and depression dimensions of the HADS and the IPO-fr scales. Furthermore, a negative correlation was observed between the RT and PD/ID scales of the IPO-fr and positive affect (PANAS).

The Kruskal-Wallis tests carried out on the sample, divided into subgroups corresponding to personality disorder clusters based on PDQ-4+ results, revealed that the subgroups with no personality disorder, cluster B disorders, and cluster C disorders differed significantly only on the PD/ID score (p < .001). A post hoc test revealed that participants with no personality disorders had the lowest PD/ID score (52.64, SD = 12.62), followed by participants in cluster C (62.51, SD = 15.60), then cluster B (95.24, SD = 8.54) and finally cluster A (132.60, SD = 10.36). This difference was significant in each case, except between clusters A and B. Furthermore, the results showed that participants in cluster A scored significantly higher on the RT scale (35.4, SD = 4.16, p < .001).

DISCUSSION

The significant correlations between the IPO-fr scales and the three convergent measures support the validity of the 40-item French version of the IPO.

One of the major benefits of the IPO-fr is that it enables individuals to be assessed in terms of the severity of their structural personality impairment. The Kruskal-Wallis test, carried out on the subgroups constituted by the clusters of personality disorders, showed that the IPO-fr’s two scales are indeed capable of distinguishing between individuals according to how their personality is organized. Participants with no personality disorder differed from those belonging to clusters B and C on the PD/ID score. Although clusters A and B did not differ on this score, the results clearly highlighted the sensitivity of the RT scale to psychotic features, and high scores on both the PD/ID and RT scales were a consistent indicator of this particular personality organization. Although the personality disorders present in our sample appeared to be consistent with their prevalence in the French student population, the low proportion of participants with disorders falling into cluster A or B suggests that no definitive conclusions should be drawn. Nevertheless, this study is quite encouraging with regard to the use of the IPO-fr for personality assessment purposes.

GENERAL CONCLUSIONS AND DIRECTIONS FOR FUTURE RESEARCH

Personality assessment is a fundamental issue in research. It is from this perspective that we set out to develop a French version of the IPO with an internal structure similar to the revised and validated English version of the original test (Lenzenweger et al., 2001). Although it is shorter than the English version (40 items instead of 57), its internal clarity (in particular, no excessive cross-loading between factors, as shown in Study 1) and its discriminatory capacity (Study 2) make it easy to interpret.

Although our results seem promising in the light of Kernberg’s theory, the IPO-fr will need to be tested with larger clinical populations. This will allow us to replicate the convergence of data between positive and negative affect, aggression, anxiety, and depression within clinical groups. The challenge here will be to establish the instrument’s predictive validity with greater confidence. Furthermore, we are aware of the low proportion of people with features suggesting psychotic organization in our sample (n = 5). For this reason, the results presented here should be treated with caution. Further investigations will be necessary to affirm or invalidate them, in particular with regard to the comparison of this population with those of people with symptoms specific to clusters B and C.

It will be important to assess the short-, medium- and long-term test-retest reliability of the IPO-fr. It will also be necessary to compare the IPO-fr with a semistructured interview such as the Structured Interview of Personality Organization (STIPO; Clarkin et al., 2016). To our knowledge, the STIPO has not yet been validated in French. In the future, such a comparison will make it possible to verify the ability of the IPO-fr to diagnose a personality disorder (Unoka et al., 2022), in a context where the convergence between the PDQ-4+ and other measures of personality disorders has sometimes proved precarious (Laconi et al., 2016).

Similarly, although the RT scale proved particularly sensitive to psychotic features, it is worth bearing in mind the recent contributions of Beatson et al. (2019) regarding the possible presence of specific psychotic symptoms (including verbal hallucinations) in borderline patients. These contributions will need to be taken into account in the future, in order to improve the accuracy of personality assessment by being as close as possible to the specific clinical characteristics of each personality organization. In conclusion, the present study allowed us to establish the relevance of the IPO-fr as a reliable and brief instrument for assessing individual personality. It could make a major contribution to the screening of personality pathologies in the French population and the assessment of treatment programs.

Supplementary materials are available on the journal’s website.