As mentioned in my previous posts, this is the fourth post in a seven-part series examining science and pseudoscience in the field of personality and psychopathology assessment. My goal is to expose the less-than-stellar underbelly of some of the most commonly used and well-known measures, particularly those that fall under the grouping of “projective” assessments. The third measure I will turn a careful and critical eye to is primarily used with children.
The third type of projective test to be discussed is not a specific measure, like the Rorschach or the Thematic Apperception Test, but instead a collection of measures. A number of methods that purport to assess personality and psychopathology require an individual to draw pictures of a person, people, or objects. The three most widely used are the Draw-A-Person test (DAP; Harris, 1963), the House-Tree-Person test (HTP; Buck, 1948), and the Kinetic Family Drawing test (KFD; Burns & Kaufman, 1970). In surveys of clinical psychologists, all rank in the top 15 most commonly used instruments (Hogan, 2005), while school psychologists use them in 26-43% of assessments (depending on the instrument; Hojnoski et al., 2006). Given the speed and ease of their administration (many take fewer than 10 minutes), it is perhaps unsurprising that they are used so frequently.
Although each test has its own set of interpretations, there are two broad approaches to scoring figure drawings: the global approach and the sign approach (Lilienfeld, Wood, & Garb, 2000; Weiner & Greene, 2008). In the global approach (Koppitz, 1968), interpretation is based on sets of indicators that are summed to yield a total score of adjustment (or lack thereof). One scoring system (Tharinger & Stark, 1990) calls for a global score based not on sets of indicators, but instead on the general impression of the psychologist. The sign approach, in contrast, relies on identification of isolated features of the drawing (e.g., eye size, size of figure, placement of figure) that are supposedly related to specific pathology or personality problems. For example, Machover (1949, 1951) identified large eyes as being linked to paranoid ideation, small figures to low self-esteem, and placing figures high on a page to high achievement striving. Purportedly, constructing these drawings could bypass conscious efforts to hide or exaggerate symptoms and provide a more complete understanding of a person.
A large amount of research over the last 60 years has examined the reliability and validity of figure drawings, with highly varied results. Interrater reliability (IRR) for the individual pieces used in the sign approach, for example, has been shown to vary widely across studies (for major reviews see Kahill, 1984; Palmer et al., 2000; Vass, 1998). Though certain signs have been shown to be reliable from rater to rater (for example, size, detail, and line heaviness in Joiner et al., 1996), others were horribly unreliable, throwing the overall IRR into question. The same type of studies examining scoring in the global approach have yielded consistently higher rates of IRR, although still quite variable (for reviews see Kahill, 1984; Thomas & Jolley, 1998). Internal consistencies for quantitative approaches have been moderate to high, with many showing high levels (Groth-Marnat & Roberts, 1998; Naglieri, McNeish, & Bardos, 1992).
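To make interrater reliability concrete, here is a minimal sketch of Cohen's kappa, one common chance-corrected agreement statistic for two raters. It is offered purely as an illustration: the reviews cited above do not necessarily report kappa, and the ratings below are made-up numbers (1 = sign judged present, 0 = absent), not data from any study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of drawings on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's own base rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten drawings for one sign (illustrative only).
rater_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
rater_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohens_kappa(rater_a, rater_b)
print(f"raw agreement = 0.80, kappa = {kappa:.2f}")
```

In this made-up case the raters agree on 8 of 10 drawings, yet kappa comes out near 0.58, because a good share of that agreement would be expected by chance alone. That gap is one reason raw agreement figures can flatter a scoring system.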
Validity studies across different projective drawings have met with a number of difficulties, particularly in the sign approach. A primary one is a lack of consistency in operational definitions. For instance, different studies or scoring systems often interpret the same feature in different ways. To illustrate, West’s (1998) meta-analysis found that head sizes were interpreted to indicate sexual abuse in some studies but physical abuse in others. Some guidelines for interpreting drawings seem to almost specialize in making non-falsifiable predictions. Hammer (1959) said that pathology could be seen in drawings that were too large or too small, lines that were too heavy or too light, and ones that had either too few or too many corrections (erasures). Others stated that those same signs could indicate either high levels of anxiety or successful coping efforts against high anxiety (Handler & Reyher, 1965). Or it might be that, as Waehler (1997) contends, an apparent lack of validity simply reflects that a given individual does not show their distress in a drawing. Making such non-falsifiable predictions and explaining away negative findings are both hallmarks of pseudoscientific thinking (Shermer, 2002).
Specific research examining the validity of the sign approach for different psychological characteristics shows the problems one would expect based on the above information. For example, only 7% (2 of 30) of Machover’s (1949, 1951) signs have been found to have support: round torsos being indicative of more stereotypically feminine personality traits, and drawings that were colored in being related to anxiety level (Kahill, 1984). Similar reviews of the KFD concluded that individual signs showed little to no relation to actual psychopathology (Handler & Habenicht, 1994). A study examining depressive and anxious symptoms in children on an inpatient psychiatric ward used both projective measures and objective measures (Joiner, Schmidt, & Barnett, 1996). Interestingly, this study found that the projective measures not only failed to relate to scores on the objective measures, but also failed to relate to one another (even one drawing measure to another). Other well-controlled studies have found a similar lack of validity for the sign approach in assessing depressive and anxious symptoms in children (Tharinger & Stark, 1990). It should be mentioned, however, that one study of school children found that those with high scores on an objective anxiety measure used significantly less pencil pressure on the DAP, resulting in light lines (LaRoque & Obrzut, 2006).
Despite the lack of validity demonstrated by the sign approach, there is a silver lining for projective drawings. In a study examining the KFD and DAP, Tharinger and Stark (1990) were able to accurately distinguish groups of children using global scoring. The KFD was able to differentiate between children with and without mood disorders, while the DAP distinguished among children who had mood disorders and mixed mood/anxiety problems (Tharinger & Stark, 1990). However, neither one was able to discriminate those with anxiety disorders from those without. The study also made no direct comparison to objective measures and their ability to distinguish between groups, even though there are mountains of evidence to support their use with children (Sattler, 2008). Further, there has been some support for the use of another global scoring procedure for the DAP, the Screening Procedure for Emotional Disturbance, to differentiate between groups of children with and without disruptive behavior problems (Naglieri & Pfeiffer, 1992). Several other studies found similar results (e.g., Matto, 2002; Matto, Naglieri, & Clausen, 2005), although one study found much smaller effect sizes and concluded that the procedure was of limited utility in the schools (Wrightson & Saklofske, 2000).
Even these positive findings, though, must be interpreted cautiously at this point. One reason is that it is not known if controlling for intelligence, which has been shown to be lower across many types of psychopathology, would reduce or eliminate the positive findings reviewed above. The lone study I found that addressed that issue (Schneider, 1978) found that controlling for intelligence eliminated the possible incremental validity of drawings given to school-age children when assessing for behavior problems. The complex role of artistic ability in influencing scores and interpretations is also not well understood, with some suggesting it may play the role of a suppressor variable (Lilienfeld et al., 2000). Also problematic is the fact that it is unknown how many practicing clinicians use a sign versus a global approach, although a small study of active practitioners (Smith & Dumont, 1995) suggests that the vast majority of those who rely on drawings for clinical hypotheses use some combination of the approaches.
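The logic of "controlling for intelligence" can be sketched with a first-order partial correlation. This is a simplified stand-in, not the analysis Schneider (1978) ran, and every number below is invented: the three lists are hypothetical standardized scores constructed so that the drawing score and the behavior score are related only through their shared link to intelligence.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical, constructed scores for eight children (illustrative only).
intelligence = [1, 1, 1, 1, -1, -1, -1, -1]
drawing_score = [2, 2, 0, 0, 0, 0, -2, -2]
behavior_score = [2, 0, 2, 0, 0, -2, 0, -2]

raw = pearson(drawing_score, behavior_score)
adjusted = partial_corr(drawing_score, behavior_score, intelligence)
print(f"raw r = {raw:.2f}, r controlling for intelligence = {adjusted:.2f}")
```

With these constructed numbers the raw correlation is 0.50, but it drops to zero once intelligence is partialled out. Real data would rarely be this tidy; the point is only that an apparently useful drawing score can carry no information beyond what an intelligence measure already provides.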
In summary, it does appear that there may be limited uses for global scoring systems for projective drawings, in particular using the DAP and KFD for assessment of general behavioral and mood problems. It is unlikely, though, that these would add incremental validity to objective measures, and as such projective drawings are unlikely to be diagnostically useful when assessing an individual. Further research on this issue, particularly regarding global scoring systems, should be conducted.
Verdict – Pseudoscience as primarily used, some support for use in broadly differentiating adjusted from maladjusted
Next time, I will be writing on the sentence completion tasks, and then the (non-projective but very widely used) Myers-Briggs Type Indicator. I will conclude the series with a look at scientifically reliable and valid measures of personality.
(For a full list of the works I’ve cited above, feel free to email me.)