
Pre-print

 

Measuring Emotional Intelligence with the MSCEIT V2.0

 

 

Also see: Emotional Intelligence as a Standard Intelligence.

   

ABOUT THIS ARTICLE: The published article should be cited as: Mayer, J.D., Salovey, P., Caruso, D.R., & Sitarenios, G. (2003). Measuring emotional intelligence with the MSCEIT V2.0. Emotion, 3, 97-105.

ABOUT PUBLISHING THIS MANUSCRIPT ON THE WEB: A copyedited version of this manuscript is scheduled to appear in the journal Emotion. Copyright is by the American Psychological Association. According to APA guidelines on internet publishing (http://www.apa.org/journals/posting.html), the article can only be posted on the author's web site; "the posted article must carry an APA copyright notice and include a link to the APA journal home page; APA does not permit archiving with any other non-APA repositories; APA does not provide electronic copies of the APA published version for this purpose; and, authors are not permitted to scan in the APA published version." Please help us by abiding by these guidelines and not posting the article elsewhere. Please refer to that APA web site (http://www.apa.org ) for more information. Thank you!
 

Measuring Emotional Intelligence with the MSCEIT V2.0

John D. Mayer
Peter Salovey
David R. Caruso
Gill Sitarenios
 

ABSTRACT

Does a recently introduced ability scale adequately measure emotional intelligence (EI) skills? Using the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), we examined (a) whether members of a general standardization sample and emotions experts identified the same test answers as correct, (b) the test’s reliability, and (c) the possible factor structures of EI. Twenty-one emotions experts endorsed many of the same answers as did 2112 members of the standardization sample, and exhibited superior agreement particularly when research provides clearer answers to test questions (e.g., emotional perception in faces). The MSCEIT achieved reasonable reliability, and confirmatory factor analysis supported theoretical models of EI. These findings help clarify issues raised in earlier articles published in Emotion.

INTRODUCTION

The past 12 years have seen a growing interest in emotional intelligence (EI), defined as a set of skills concerned with the processing of emotion-relevant information, and measured with ability-based scales. A new ability test of EI, the Mayer-Salovey-Caruso Emotional Intelligence Test, Version 2.0 (MSCEIT), potentially improves upon earlier measures, and can inform the debate over the scoring, reliability, and factor validity of such scales. The MSCEIT is intended to measure four branches, or skill groups, of emotional intelligence: (a) perceiving emotion accurately, (b) using emotion to facilitate cognitive activities, (c) understanding emotion, and (d) managing emotion.

The MSCEIT is the most recent of a series of ability scales of emotional intelligence. Its immediate predecessor was the MSCEIT Research Version 1.1 (MSCEIT RV1.1), and before that, the Multifactor Emotional Intelligence Scale. Those tests, in turn, evolved out of earlier scales measuring related constructs such as emotional creativity, social intelligence, and non-verbal perception. The MSCEIT and its predecessors are based on the idea that emotional intelligence involves problem solving with and about emotions. Such ability tests measure something relatively different from, say, self-report scales of emotional intelligence, with which correlations are rather low. The ability to solve emotional problems is a necessary, although not sufficient, ingredient to behaving in an emotionally adaptive way.

The MSCEIT V2.0 standardization process required collecting data sets relevant to several issues concerning emotional intelligence. For example, in a recent exchange of papers in this journal, Roberts, Zeidner, and Matthews raised concerns about the earlier-developed MEIS test of emotional intelligence. These concerns included whether there is one set of correct answers for an emotional intelligence test or whether expert and general (e.g., lay) opinions about answers diverge too much, whether such tests could be reliable, and whether the factor structure of such tests was fully understood and consistent with theory. In that same issue of the journal we, and others, responded. Findings from the MSCEIT standardization data reported here have the promise of more directly informing the debate, through empirical findings. The analyses we present address three questions: (a) Do general and expert criteria for correct answers to emotional intelligence test items converge? (b) What is the reliability of such tests? (c) Is the factor structure of such tests consistent with theoretical models of EI?

THREE ISSUES ABOUT EMOTIONAL INTELLIGENCE ADDRESSED IN THE PRESENT STUDY

The Criteria for Correct Answers

One must know how to score a test’s items before one can settle such issues as the test’s reliability and factor structure. Our model of EI hypothesizes that emotional knowledge is embedded within a general, evolved, social context of communication and interaction. Consequently, correct test answers often can be identified according to the consensus response of a group of unselected test-takers. For example, if the respondents identify a face as predominantly angry, then that can be scored as a correct answer. We have further hypothesized that emotions experts will identify correct answers with greater reliability than average, particularly when research provides relatively good methods for identifying correct alternatives, as in the case of facial expressions of emotion, and the meaning of emotion terms.

If the general and the expert consensus diverge too far as to the correct answers on a test, a complication arises, because the two methods yield potentially different scores for each person. Roberts et al. scored the earlier MEIS measure with both a general consensus method and an expert criterion that we had developed based on only two experts, and found that those methods did not always converge. Some aggregation of experts beyond two is necessary, however, to achieve a reliable identification of answers. Twenty-one emotions experts were employed in the present study.

Issues of Reliability

The MSCEIT V2.0 must exhibit adequate levels of reliability, as did the MEIS, MSCEIT RV1.0, and comparable psychological tests. As with its predecessor tests, the four MSCEIT V2.0 branch scores (i.e., Perception, Facilitating, Understanding, Management) draw on different tasks that include different item forms; that is, the items are non-homogeneous. Under such conditions, split-half reliability coefficients are the statistic of choice (relative to coefficient alphas), as they involve the orderly allocation of different item types to the two different halves of the test. The test-retest reliability of the total MSCEIT score has been reported elsewhere, at r(60) = .86.
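To make the idea of an "orderly allocation" of item types concrete, the following is a minimal sketch in Python, not the procedure actually used for the MSCEIT. It assumes that items are alternated between the two halves within each task, and that the half-test correlation is stepped up with the Spearman-Brown formula; neither detail is stated in this article. The function names and the toy data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_reliability(item_scores, task_of_item):
    """item_scores: one row per respondent, one scored value per item.
    task_of_item: task label for each item (same order as the columns)."""
    seen = {}
    half_a, half_b = [], []
    # Alternate items within each task, so each half receives every item type.
    for idx, task in enumerate(task_of_item):
        k = seen.get(task, 0)
        (half_a if k % 2 == 0 else half_b).append(idx)
        seen[task] = k + 1
    a = [sum(row[i] for i in half_a) for row in item_scores]
    b = [sum(row[i] for i in half_b) for row in item_scores]
    r = pearson(a, b)
    # Spearman-Brown step-up (an assumption here) to full test length.
    return 2 * r / (1 + r)

# Toy data: 4 respondents, 2 tasks with 2 items each (values are illustrative).
tasks = ["faces", "faces", "blends", "blends"]
rows = [[1, 1, 0, 1], [1, 0, 1, 1], [2, 2, 1, 2], [2, 1, 2, 2]]
reliability = split_half_reliability(rows, tasks)  # about 0.75 for this toy data
```

With heterogeneous item types, alternating within tasks keeps both halves comparable in content, which is the motivation the text gives for preferring split-half coefficients over coefficient alpha.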

Issues of Factor Structure

The factor structure of a test indicates how many entities it plausibly measures. It is important to any debate over whether EI is a coherent, unified, concept. In this specific case, it indicates how many dimensions of EI the test is "picking up" -- one unified dimension, many related dimensions, or something else. We believe that the domain of EI is well described by 1-, 2-, and 4- oblique (correlated) factor models, as well as other equivalent models. If the MSCEIT V2.0 shows similar structure to the MEIS for both expert and general scoring, it would strengthen the argument that the theory of EI we employ works across tests. Using the standardization sample, we performed confirmatory factor analyses of the full scale MSCEIT V2.0, testing 1-, 2-, and 4- factor models to examine the range of permissible factor structures for representing the EI domain.

METHOD

Participants

General Sample

The present sample consisted of 2,112 adult respondents, age 18 or older, who completed the MSCEIT V 2.0 in booklet or online forms prior to May, 2001.  The sample was composed of individuals tested by independent investigators in 36 separate academic settings from several countries. The investigators had requested pre-release versions of the MSCEIT booklet or online forms (depending on Internet availability and other criteria), and had submitted documentation of their research qualifications and of approval of their research from their sponsoring institution. Only basic demographic data were collected across samples due to the diverse nature of the research sites.

Of those reporting gender, 1,217 (58.6%) were women and 859 (41.4%) were men.  The mean age of the sample was M = 26.25; S = 10.51, with roughly half the sample college-aged (52.9%), and the rest ranging upward to 69 years old.  The participants were educationally diverse, with 0.6% reporting not completing high school, 10.3% having completed only high school, 39.2% having some college or university courses, 33.7% having completed college, and 16.1% holding Masters level or higher degrees.  The group was ethnically diverse as well, with 34.0% Asian, 3.4% Black, 2.0% Hispanic, 57.9% White, and 2.3% other or mixed ethnicity.  Most participants came from the United States (1240), with others from South Africa (231), India (194), the Philippines (170), the United Kingdom (115), Scotland (122), and Canada (37); all testing was in English.

Expert Sample

The expert sample was drawn from volunteer members of the International Society for Research on Emotions (ISRE) at its 2000 meeting. The Society was founded in 1984 with the purpose of fostering interdisciplinary scientific study of emotion. Membership is open to researchers and scholars who can demonstrate a serious commitment to the investigation of the emotions. Twenty-one experts, 10 male and 11 female, from eight Western countries, participated. The sample of experts had a mean age of 39.38 (S = 6.44; Range =30 to 52); no data about their ethnicity were collected.

The MSCEIT V2.0

The MSCEIT V2.0 is a newly developed, 141-item scale designed to measure four branches (specific skills) of emotional intelligence: (a) Perceiving Emotions, (b) Using Emotions to Facilitate Thought, (c) Understanding Emotions, and (d) Managing Emotions. Each of the four branches is measured with two tasks. Perceiving Emotions is measured with the Faces and Pictures tasks; Facilitating Thought is measured with the Sensations and Facilitation tasks; Understanding Emotions is measured with Blends and Changes; and Managing Emotions is measured with Emotion Management and Emotional Relationships tasks.

Each of the 8 MSCEIT tasks is made up of a number of item parcels or individual items. A parcel structure occurs, for example, when a participant is shown a face (in the Faces task), and asked about different emotions in the face in five subsequent items. The five items make up an item parcel because they are related to the same face, albeit each asks about a different emotion. Other items involve one response per stimulus, and are, in that sense, free-standing. Response formats were intentionally varied across tasks so as to ensure that results generalized across response methods, and to reduce correlated measurement error. Thus, some tasks, such as Pictures, employed 5-point rating scales, whereas other tasks, such as Blends, employed a multiple-choice response format.

Briefly, in the Faces task (4 item parcels; 5 responses each), participants view a series of faces and for each, respond on a five-point scale indicating the degree to which a specific emotion is present in a face. The Pictures task (6 parcels; 5 responses each) is the same as Faces except that landscapes and abstract designs form the target stimuli, and the response scale consists of cartoon faces (rather than words) of specific emotions. In the Sensations task (5 parcels; 3 responses each), respondents generate an emotion and match sensations to it. For example, they might generate a feeling of envy and decide how hot or cold it is. In the Facilitation task (5 item parcels; 3 responses each), respondents judge the moods that best accompany or assist specific cognitive tasks and behaviors, for example, whether joy might assist planning a party. In the Blends task (12 free-standing items), respondents identify emotions that could be combined to form other emotions. They might conclude, for example, that malice is a combination of envy and aggression. In the Changes task (20 free-standing items), respondents select an emotion that results from the intensification of another feeling. For example, they might identify depression as the most likely consequence of intensified sadness and fatigue. Respondents in the Emotion Management task (5 parcels; 4 responses each) judge the actions that are most effective in obtaining the specified emotional outcome for an individual in a story. They are asked to decide, for example, what a character might do to reduce her anger, or prolong her joy. Finally, in the Emotional Relationships task (3 item parcels; 3 responses each), respondents judge the actions that are most effective for one person to use in the management of another person’s feelings. See the test itself, and its manual, for more specific task information.

General and Expert Consensus Scoring

The MSCEIT yields a total score, two area scores (experiential and strategic), four branch scores corresponding to the four-branch model, and eight task scores. Each score can be calculated according to a general consensus method. In that method, each one of a respondent’s answers is scored against the proportion of the sample who endorsed the same MSCEIT answer. For example, if a respondent indicated that surprise was "definitely present" in a face, and the same alternative was chosen by 45% of the sample, the individual’s score would be incremented by the proportion, .45. The respondent’s total raw score is the sum of those proportions across the 141 items of the test. The other way to score the test is according to an expert scoring method. That method is the same, except that each of the respondent’s answers is scored against the criterion formed by the proportional responding of an expert group (in this case, the 21 ISRE members). One of the purposes of this study was to compare the convergence of these two methods.
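The proportion-based scoring just described can be illustrated with a short Python sketch. It is only an illustration of the logic under simplifying assumptions (every respondent answers every item, and the criterion group is passed in as a plain list of answer lists); the function names and the toy data are hypothetical and are not taken from the MSCEIT scoring software.

```python
from collections import Counter

def consensus_weights(criterion_responses):
    """For each item, map every answer option to the proportion of the
    criterion group (general sample or expert group) that endorsed it."""
    n = len(criterion_responses)
    n_items = len(criterion_responses[0])
    weights = []
    for i in range(n_items):
        counts = Counter(person[i] for person in criterion_responses)
        weights.append({option: c / n for option, c in counts.items()})
    return weights

def raw_score(answers, weights):
    """Sum, over items, the proportion of the criterion group that chose
    the same answer as this respondent (0 if nobody chose it)."""
    return sum(w.get(a, 0.0) for a, w in zip(answers, weights))

# Toy criterion group: 4 people, 2 items, options coded 1-5.
general_sample = [[5, 2], [5, 3], [4, 2], [5, 2]]
weights = consensus_weights(general_sample)
total = raw_score([5, 2], weights)   # 0.75 + 0.75 = 1.5
```

Expert scoring would reuse `raw_score` unchanged, with `weights` built from the expert group's answers instead of the general sample's.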

Procedure

The MSCEIT administration varied depending upon the research site at which the data were collected (see "Participants," above). The MSCEIT was given to participants to complete in large or small groups, or individually. Of the 2,112 participants, 1,368 took the test in a written form and 744 took the test in an on-line form that presented the exact same questions and response scales, by accessing a web-page. Those taking the pencil and paper version completed scannable answer sheets that were entered into a database. Web page answers were transmitted electronically. Prior research has suggested that booklet and on-line forms of tests are often indistinguishable.

RESULTS

Comparison of Test-Booklet versus On-Line Administration Groups

We first compared the MSCEIT V2.0 booklet and on-line tests. For each, there are 705 responses to the test (141 items X 5 responses each). The correlation between response frequencies for each alternative across the two methods was r(705) = .987. By comparison, a random split of the booklet sample alone, for which one would predict there would be no differences, yields almost exactly the same correlation between samples of r(705) = .998. In each case, a scatterplot of the data indicated that points fell close to the regression line throughout the full range of the joint distribution, and that the points were spread through the entire range (with more points between .00 and .50, than above .50). Even deleting 30 zero-response alternatives from the 705 lowered the correlation by only .001 (in the random split case). The booklet and on-line tests were, therefore, equivalent, and the samples were combined.

Comparison of General vs. Expert Consensus Scoring Criteria

We next examined the differences between answers identified by the experts and by the general consensus. We correlated the frequencies of endorsements to the 705 responses (141 items x 5 responses) separately for the general consensus group and the expert consensus group, and obtained an r(705) = .908 that, although quite high, was significantly lower than the r = .998 correlation for the random split (z(703) = 34.2, p < .01). The absolute difference in proportional responding for each of the 705 alternatives also was calculated. The mean value of the average absolute difference between the expert and general groups was |MD(705)| = .08; S = .086, which also was significantly greater than the difference between the booklet and on-line samples of |MD(705)| = .025; S = .027; z(705) = 16.3, p < .01.
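As a rough illustration of this comparison, the Python sketch below builds one endorsement proportion per item-by-alternative cell for each group (141 x 5 = 705 cells in the actual study), then computes the correlation and the mean absolute difference between the two vectors. It assumes answers are coded 1 to 5 for every item; the function names and toy data are hypothetical, and the significance tests reported above are not reproduced.

```python
from math import sqrt

def endorsement_vector(responses, n_options=5):
    """One proportion per (item, option) cell, in item-major order."""
    n = len(responses)
    n_items = len(responses[0])
    return [sum(1 for person in responses if person[i] == opt) / n
            for i in range(n_items)
            for opt in range(1, n_options + 1)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mean_abs_diff(x, y):
    """Average absolute difference in proportional responding."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Toy data: 2 items, options 1-5, a "general" group of 4 and 2 "experts".
general = endorsement_vector([[5, 2], [5, 3], [4, 2], [5, 2]])
experts = endorsement_vector([[5, 2], [5, 2]])
r = pearson(general, experts)
mad = mean_abs_diff(general, experts)   # 0.1 for this toy data
```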

We hypothesized that emotions experts would be more likely than others to possess an accurate shared social representation of correct test answers; their expertise, in turn, could provide an important criterion for the test. If that were the case, then experts should exhibit higher inter-rater agreement than the general group. To assess inter-rater agreement, we divided the expert group randomly into two subgroups of 10 and of 11 experts each, and computed the modal response for each of the 705 responses for the two subgroups of experts. The Kappa representing agreement controlling for chance across the two expert subgroups for the 5 responses of the 141 items was κ = .84. We then repeated this process for two groups of 21 individuals, randomly drawn from the standardization (general) samples and matched to the expert group exactly on gender and age. Two control groups, rather than one, were used to enhance our confidence in the results; the education level of the comparison groups was comparable to that of the rest of the general sample. When we repeated our reliability analysis for the two matched control groups, we obtained somewhat lower Kappas of κ = .71 and .79.
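The agreement analysis can be sketched in Python as follows. This is one plausible reading of the procedure, not a reconstruction of the authors' computation: it assumes the modal answer is taken per item within each subgroup and that Cohen's kappa is then computed across items between the two subgroups' modal answers. Function names and toy data are hypothetical.

```python
from collections import Counter

def modal_responses(responses):
    """Most frequent answer per item within a group. Ties go to the option
    encountered first, a simplification of this sketch."""
    n_items = len(responses[0])
    return [Counter(person[i] for person in responses).most_common(1)[0][0]
            for i in range(n_items)]

def cohen_kappa(a, b):
    """Chance-corrected agreement between two lists of categorical answers.
    Undefined (division by zero) when expected agreement equals 1."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Toy example: modal answers of two subgroups on six items.
subgroup_1 = [1, 2, 3, 4, 5, 1]
subgroup_2 = [1, 2, 3, 4, 5, 2]
kappa = cohen_kappa(subgroup_1, subgroup_2)   # 23/29, about .79
```

The same `cohen_kappa` function applied to every pair of individual raters, and then averaged, would correspond to the individual-level analysis.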

The same superiority of the expert group exists at an individual level as well, where disaggregated agreement will be lower. The average inter-rater Kappa coefficients of agreement across the 5 responses of the 141 items, for every pair of raters within the expert group, was κ = .43, which significantly exceeded the average Kappas of the two control groups, κs = .31 and .38 (zs = 4.8 and 1.85; p < .05 to .01, one-tailed test).

Because general and expert groups both chose similar response alternatives as correct, and experts have higher inter-rater reliability in identifying such correct alternatives, members of the standardization sample should obtain somewhat higher test scores when the experts’ criterion is used (before normative scaling corrections are applied). Moreover, the expert group should obtain the largest score advantages on skill branches where the experts most agree, owing to the experts’ greater convergence for those responses. For example, one might expect increased expert convergence for Branches 1 (emotional perception) and 3 (emotional understanding) because emotions experts have long focused on the principles of coding emotional expressions, as well as on delineating emotional understanding. By contrast, research on Branches 2 (emotional facilitation of thought) and 4 (emotion management) is newer and has yielded less consensus, and so experts might be more similar to the general sample in such domains.

To test this idea, we conducted a 4 (branch) X 2 (consensus versus expert scoring criterion) ANOVA on MSCEIT scores. The main effect for scoring criterion was significant (F(1,1984) = 3464, p < .001), indicating, as hypothesized, that participants obtained higher raw scores overall when scored against the expert criteria. The main effect for branch was significant as well (F(3,5952) = 1418, p < .001), indicating, unsurprisingly, that items on some branches were harder than others. Finally, there was a branch by scoring criterion interaction (F(3,5952) = 2611, p < .001).

Orthogonal contrasts indicated that participants scored according to the expert criterion on Branches 1 and 3 obtained significantly higher scores than when scored against the general consensus (see Table 1; F(1,1984) = 1631 and 5968, respectively; p’s < .001). Branch 2 (using emotions to facilitate thought) showed a significant difference favoring general consensus (F(1,1984) = 711, p < .001), and Branch 4 showed no difference (F(1,1984) = 1.57, n.s.). The advantage for expert convergence on Branches 1 and 3 may reflect the greater institutionalization of emotion knowledge among experts in these two areas.

In a final comparison of the two scoring criteria, participants’ tests were scored using the general criterion, on the one hand, and the expert criterion, on the other. The correlation between the two score sets ranged from r (2004-2028) = .96 to .98 across the Branches, Areas, and Total EIQ scores, as reported in Table 1.

The evidence from this study reflects that experts are more reliable judges, and converge on correct answers where research has established clear criteria for answers. If further studies bear out these results, the expert criteria may prove superior to the general consensus.

Reliability of the MSCEIT V2.0

The MSCEIT has two sets of reliabilities depending upon whether a general or expert scoring criterion is employed. That is because reliability analyses are based on participants’ scored responses at the item-level, and scores at the item-level vary depending upon whether responses are compared against the general or expert criterion. The MSCEIT full-test split-half reliability is r(1985) = .93 for general and .91 for expert consensus scoring. The two Experiencing and Strategic Area score reliabilities are r(1998) = .90 and .90, and r(2003) = .88 and .86 for general and expert scoring, respectively. The four branch scores of Perceiving, Facilitating, Understanding, and Managing range between r(2004-2028) = .76 to .91 for both types of reliabilities (see Table 1). The individual task reliabilities ranged from a low of α(2004-2111) = .55 to a high of .88. However scored, reliability at the total scale and area levels was excellent. Reliability at the branch level was very good, especially given the brevity of the test. Compared to the MEIS, reliabilities were overall higher at the task level but were sometimes lower than is desirable. We therefore recommend test interpretation at the total scale, area, and branch levels, with cautious interpretations at the task level, if at all.

Correlational and Factorial Structure of the MSCEIT V2.0

As seen in Table 2, all tasks were positively intercorrelated using both general (reported below the diagonal) and expert consensus scoring (above the diagonal). The intercorrelations among tasks ranged from r(1995-2111) = .17 to .59, p’s < .01, but with many correlations in the mid .30’s.

Confirmatory Factor Analyses

A factor analysis of the MSCEIT V2.0 can cross-validate earlier studies that support 1-, 2-, and 4-factor solutions of the EI domain. The 1-factor, "g" model, should load all eight MSCEIT tasks. The 2-factor model divides the scale into an "Experiential" area (Perception and Facilitating Thought Branches) and a "Strategic" area (Understanding and Managing Branches). The 4-factor model loads the two designated Branch tasks on each of the 4 branches. These analyses are particularly interesting given that the MSCEIT V2.0 represents an entirely new collection of tasks and items.

We tested these models using AMOS, and cross-checked them using LISREL and STATISTICA. The confirmatory models shared in common that (a) error variances were uncorrelated, (b) latent variables were correlated, i.e., oblique, and (c) all other paths were set to zero. In the 4-factor solution only, the two within-area latent variable covariances (i.e., between Perceiving and Facilitating, and between Understanding and Managing) were additionally constrained to be equal so as to reduce a high covariance between the Perceiving and Facilitating branches.

There was a progressively better fit of models from the 1- to the 4-factor model, but all fit fairly well (4 vs. 2 factors: χ² = 253, df = 4, p < .001; 2 vs. 1 factors: χ² = 279, df = 1, p < .001; see Table 2 for further details). The χ² values are a function of sample size, (N-1)F, and their size reflects the approximately 2,000 individuals involved, more so than any absolute quality of fit. Fit indices independent of sample size include the normed fit index (NFI), which ranged from .99 to .98 across models, which is excellent, as well as the Tucker-Lewis index (TLI), which ranged from .98 to .96, and was also quite good, and Steiger’s root mean square error of approximation (RMSEA), which ranged from .12 for the 1-factor solution, which was a bit high, to an adequate .05 for the 4-factor solution. A model fit using the 4-factor solution with expert scoring was equivalent to that of general scoring (e.g., NFI = .97; TLI = .96; RMSEA = .04), and this correspondence between the expert and general consensus held for the 1- and 2-factor models as well.
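For readers less familiar with these indices, the following Python sketch shows the standard textbook formulas for the chi-square statistic as (N-1)F, the NFI, the TLI, and the RMSEA point estimate. The formulas are the conventional ones and are assumed, not confirmed, to match what AMOS, LISREL, and STATISTICA reported here. The input values in the example are hypothetical and are not the values from this study.

```python
from math import sqrt

def chi_square(n, f_min):
    """Model chi-square from sample size N and the minimized fit function F."""
    return (n - 1) * f_min

def nfi(chi2_null, chi2_model):
    """Normed fit index (Bentler & Bonett): proportional reduction in
    chi-square relative to the null (independence) model."""
    return (chi2_null - chi2_model) / chi2_null

def tli(chi2_null, df_null, chi2_model, df_model):
    """Tucker-Lewis index, based on chi-square/df ratios."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1)

def rmsea(chi2_model, df_model, n):
    """RMSEA point estimate; the max() keeps it at 0 when chi2 < df."""
    return sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))

# Hypothetical inputs, for illustration only.
example_rmsea = rmsea(chi2_model=120.0, df_model=20, n=1001)   # sqrt(0.005)
example_nfi = nfi(chi2_null=2000.0, chi2_model=120.0)          # 0.94
example_tli = tli(2000.0, 28, 120.0, 20)
```

Because chi-square grows with N - 1 for a fixed F, a model with a small misfit can still yield a large, significant chi-square in a sample of about 2,000, which is why the text leans on the other indices.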

MacCallum and Austin have noted that alternative models may be found that fit well, and this was the case with a 3-factor model described elsewhere that we tested on these data. On the other hand, if one intentionally violates the 4-factor model, by shifting the second task on each branch to the next branch up (and placing Branch 4’s second task back on Branch 1), the χ² rises from 94 to 495, the fit indices become unacceptable (e.g., TLI drops from .96 to .78), and 4 of 6 correlations among branches are estimated at higher than r = 1.0. The 4-Branch model, in other words, does create a fit to the data that can be markedly superior to other models.

DISCUSSION

In this study, emotions experts converged on correct test answers with greater reliability than did members of a general sample. The experts’ convergence was better in areas where more emotions research has been conducted. If future research confirms these findings, then an expert criterion may become the criterion of choice for such tests. Critiques of the emotional intelligence concept have suggested, based on the use of one or two emotions experts, that expert and general consensus criteria might be quite different. Others have argued that, as more experts are employed, and their answers aggregated, their performance will resemble that of the consensus of a large, general group. The 21 experts in this study did exhibit superior agreement levels relative to the general sample. At the same time, the expert and general consensus criteria often agreed on the same answers as correct, r = .91. Participants’ MSCEIT scores were also similar according to the two different criteria, r = .98.

Reliabilities for Branch, Area, and Total test scores were reasonably high for the MSCEIT, with reliabilities at the level of the individual tasks ranging lower. Two-week test-retest reliabilities of r(60) = .86 are reported elsewhere. In addition, the findings from the factor analyses indicate that 1-, 2-, and 4-factor models provide viable representations of the emotional intelligence domain, as assessed by the MSCEIT V2.0.

No empirical findings, by themselves, can settle all the theoretical issues surrounding EI that were reflected in the September 2001 issue of Emotion. In addition, the applied use of emotional intelligence tests must proceed with great caution. That said, the findings here suggest that those who employ the MSCEIT can feel more confident about the quality of the measurement tool they use to assess EI. Ultimately, the value of the MSCEIT as a measure of emotional intelligence will be settled by studies of its validity and utility in predicting important outcomes over and above conventionally-measured emotion, intelligence, and related constructs. A number of such studies, related to pro-social behavior, deviancy, and academic performance, have begun to appear. In the meantime, we hope that the present findings inform and, by doing so, clarify issues of scoring, of reliability, and of viable factorial representations.
 
 

 

ACKNOWLEDGEMENTS AND AUTHOR NOTES

The authors gratefully acknowledge the assistance of Rebecca Warner and James D. A. Parker, who served as expert consultants concerning the structural equation models reported in this paper. In addition, Terry Shepard was instrumental in preparing early versions of the tables.

The MSCEIT V2.0 is available from Multi-Health Systems (MHS) of Toronto, Ontario in booklet and in web-based formats. MHS scores the test based on the standardization sample and expert criteria; researchers have the further option of developing their own independent norms. Researchers can obtain the MSCEIT through special arrangements with MHS, which has various programs to accommodate their needs.

All correspondence regarding this manuscript can be directed to: John D. Mayer, Department of Psychology, Conant Hall, 10 Library Way, University of New Hampshire, Durham, NH 03824.
 
 

 

REFERENCES

Arbuckle, J. L. (1999). Amos 4.0. Chicago, IL: SmallWaters Corp.

Averill, J. R., & Nunley, E. P. (1992). Voyages of the heart: Living an emotionally creative life. New York: Free Press.

Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.

Brackett, M., & Mayer, J. D. (2001, October). Comparing measures of emotional intelligence. Paper presented at the Third Positive Psychology Summit, Washington, DC.

Buchanan, T., & Smith, J. L. (1999). Using the internet for psychological research: Personality testing on the World Wide Web. British Journal of Psychology, 90, 125-144.

Cattell, R. B., & Burdsal, C. A. (1975). The radial parcel double factoring design: A solution to the item-vs-parcel controversy. Multivariate Behavioral Research, 10, 165-179.

Ciarrochi, J. V., Chan, A. Y., & Caputi, P. (2000). A critical evaluation of the emotional intelligence concept. Personality and Individual Differences, 28, 539-561.

Ekman, P., & Friesen, W. V. (1975). Unmasking the face: A guide to recognizing the emotions from facial cues. Englewood Cliffs, NJ: Prentice Hall.

Izard, C. E. (2001). Emotional intelligence or adaptive emotions? Emotion, 1, 249-257.

Jöreskog, K. G., & Sörbom, D. (2001). LISREL 8.51. Lincolnwood, IL: Scientific Software, Inc.

Kaufman, A. S., & Kaufman, J. C. (2001). Emotional intelligence as an aspect of general intelligence: What would David Wechsler say? Emotion, 1, 258-264.

Legree, P. I. (1995). Evidence for an oblique social intelligence factor established with a Likert-based testing procedure. Intelligence, 21, 247-266.

MacCallum, R. C., & Austin, J. T. (2000). Applications of structural equation modeling in psychological research. Annual Review of Psychology, 51, 201-226.

Mayer, J. D., Caruso, D. R., & Salovey, P. (1999). Emotional intelligence meets traditional standards for an intelligence. Intelligence, 27, 267-298.

Mayer, J. D., & Salovey, P. (1997). What is emotional intelligence? In P. Salovey & D. Sluyter (Eds.), Emotional development and emotional intelligence: Educational implications (pp. 3-31). New York: Basic Books.

Mayer, J. D., Salovey, P., & Caruso, D. R. (2002a). Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) Item Booklet. Toronto, Canada: MHS Publishers.

Mayer, J. D., Salovey, P., & Caruso, D. R. (2002b). Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) User's Manual. Toronto, Canada: MHS Publishers.

Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios, G. (2001). Emotional intelligence as a standard intelligence. Emotion, 1, 232-242.

Nunnally, J. C. (1978). Psychometric theory. New York: McGraw-Hill.

Ortony, A., Clore, G. L., & Collins, A. M. (1988). The cognitive structure of emotions. Cambridge: Cambridge University Press.

O'Sullivan, M., & Guilford, J. P. (1976). Four factor tests of social intelligence: Manual of instructions and interpretations. Orange, CA: Sheridan Psychological Services.

Roberts, R. D., Zeidner, M., & Matthews, G. (2001). Does emotional intelligence meet traditional standards for an intelligence? Some new data and conclusions. Emotion, 1, 196-231.

Rosenthal, R., Hall, J. A., DiMatteo, M. R., Rogers, P. L., & Archer, D. (1979). The PONS Test. Baltimore, MD: Johns Hopkins University Press.

Schaie, K. W. (2001). Emotional intelligence: Psychometric status and developmental characteristics: Comments on Roberts, Zeidner, and Matthews (2001). Emotion, 1, 243-248.

Scherer, K. R., Banse, R., & Wallbott, H. G. (2001). Emotion inferences from vocal expression correlate across languages and cultures. Journal of Cross-Cultural Psychology, 32, 76-92.

Statsoft. (2002). Statistica 6.0 Software. Tulsa, OK: Statsoft, Inc.

Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173-180.

Tucker, L. R., & Lewis, C. (1973). The reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1-10.

Zeidner, M., Matthews, G., & Roberts, R. D. (2001). Slow down, you move too fast: Emotional intelligence remains an "elusive" intelligence. Emotion, 1, 265-275.

TABLES

Table 1: Unscaled Score Means(a), Standard Deviations(a), Reliabilities(b), and Correlations(c) for the MSCEIT V2.0, for General and Expert Scoring(d)
 

Total EI Score

               General     Expert
M (S)          .48(.07)    .50(.08)
Reliability    .93         .91

 

               Experiential EI Area Score    Strategic EI Area Score
               General     Expert            General     Expert
M (S)          .49(.08)    .50(.09)          .47(.09)    .51(.10)
Reliability    .90         .90               .88         .86

 

               Perception Branch     Facilitation Branch   Understanding Branch  Management Branch
               Gen.       Exp.       Gen.       Exp.       Gen.       Exp.       Gen.       Exp.
M (S)          .50(.10)   .54(.13)   .47(.09)   .45(.08)   .53(.10)   .60(.13)   .42(.10)   .42(.09)
Reliability    .91        .90        .79        .76        .80        .77        .83        .81

 
Task             Faces      Pictures   Facilit.   Sensat.    Changes    Blends     Manage.    Relations.
General: M (S)   .50(.12)   .50(.13)   .44(.09)   .50(.12)   .56(.10)   .50(.12)   .41(.09)   .43(.12)
Expert: M (S)    .57(.18)   .50(.13)   .41(.07)   .50(.12)   .63(.14)   .57(.16)   .40(.09)   .43(.12)
General: Rel.    .80        .88        .64        .65        .70        .66        .69        .67
Expert: Rel.     .82        .87        .63        .55        .68        .62        .64        .64
 

Task Intercorrelations (General Consensus Scoring Below the Diagonal; Expert Above)

                 Faces   Pictures  Facilit.  Sensat.  Changes  Blends  Manage.  Relations.
Faces            1.000   .356      .300      .315     .191     .157    .191     .179
Pictures         .347    1.000     .288      .400     .286     .263    .282     .271
Facilitation     .340    .328      1.000     .313     .283     .242    .262     .262
Sensations       .336    .402      .352      1.000    .388     .374    .384     .415
Changes          .225    .282      .255      .382     1.000    .575    .437     .417
Blends           .171    .260      .224      .375     .589     1.000   .425     .424
Management       .232    .300      .299      .395     .417     .416    1.000    .542
Relationships    .191    .275      .269      .411     .395     .409    .575     1.000
  a. These M's and SD's are unscaled; final MSCEIT test scores for both general and expert scoring are converted to a standard IQ scale where M = 100 and SD = 15.
  b. Split-half reliabilities are reported at the total test, area, and branch score levels due to item heterogeneity. Coefficient alpha reliabilities are reported at the subtest level due to item homogeneity.
  c. Correlations are based on the sample for which all data at the task level were complete; N = 1985. Differences of roughly .001 between the r’s are significant at the .05 level. Significance testing is more accurate when employing Fisher z’ transformed versions of the r’s.
  d. Apart from the correlations (see note c), N for the overall scale was 2112; N's for the branch scores were Perception: 2015, with individual task N's between 2018 and 2108; Facilitating: 2028, with individual task N's between 2034 and 2103; Understanding: 2015, with individual task N's between 2016 and 2111; Managing: 2088, with individual task N's from 2004 to 2008.
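
The table note on correlations recommends Fisher z’ transformed values for significance testing. As a minimal sketch (not the authors' analysis), the Python below applies the standard Fisher transformation, z’ = atanh(r) with standard error 1/sqrt(N - 3), to put an approximate 95% confidence interval around one entry from the matrix above: the Faces-Pictures correlation under general consensus scoring (r = .347, N = 1985). The function name `fisher_ci` and the choice of a 95% interval are illustrative assumptions, not part of the article.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r via Fisher's z'.

    r      -- sample correlation
    n      -- number of cases the correlation is based on
    z_crit -- normal critical value (1.96 gives roughly a 95% interval)
    """
    z = math.atanh(r)              # Fisher z' transform of r
    se = 1.0 / math.sqrt(n - 3)    # standard error of z'
    lo, hi = z - z_crit * se, z + z_crit * se
    # Back-transform the interval limits to the correlation metric.
    return math.tanh(lo), math.tanh(hi)


# Faces-Pictures correlation, general consensus scoring (Table 1), N = 1985.
low, high = fisher_ci(0.347, 1985)
print(f"r = .347, approximate 95% CI: [{low:.3f}, {high:.3f}]")
```

Working on the z’ scale keeps the interval symmetric and the standard error independent of r, which is why the transformed values are preferred for testing.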


 

Table 2. MSCEIT V2.0 Parameter Estimates of the Observed Tasks on the Latent Variables, and Goodness-of-Fit Statistics for the One, Two, and Four Factor Models

Theoretical Arrangement            Model Tested (Factor and Loading)
Branches        Tasks              1-Factor    2-Factor    4-Factor

Branch 1:                          I           I           I
Perceiving      Faces              .40         .50         .55
                Pictures           .50         .59         .68
Branch 2:                                                  II
Facilitating    Facilitation       .46         .54         .53
                Sensations         .64         .71         .72
Branch 3:                                      II          III
Understanding   Changes            .65         .68         .77
                Blends             .64         .67         .76
Branch 4:                                                  IV
Managing        Emotion Man.       .68         .70         .76
                Emotional Rel.     .66         .68         .74

Goodness-of-Fit Index              Model Fit
                                   1-Factor    2-Factor    4-Factor
Chi Square                         626.56      347.32      94.28
Chi Square df                      20          19          15
Normed Fit Index (NFI)             .988        .993        .977
Tucker-Lewis Index (TLI)           .979        .988        .964
Root Mean Square Error of
  Approximation (RMSEA)            .124        .093        .052
N                                  1,985       1,985       1,985

Error terms in all models were uncorrelated. In the 4-Branch model, the two within-area covariances (i.e., the covariance between Perception and Using, and that between Understanding and Management) were constrained to be equal to one another; see text.
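
The RMSEA values in Table 2 can be checked against the chi-square, degrees of freedom, and N reported in the same table. The sketch below assumes the common point-estimate formula RMSEA = sqrt(max(chi-square - df, 0) / (df * (N - 1))); the article does not state which formula the software applied, so this is a consistency check rather than the authors' computation. The function name `rmsea` and the dictionary layout are illustrative.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA from a model chi-square, its df, and sample size.

    Assumes the (N - 1) form of the formula; some programs use N instead.
    """
    excess = max(chi_square - df, 0.0)   # misfit beyond what df alone predicts
    return math.sqrt(excess / (df * (n - 1)))


# Chi-square and df for each model as reported in Table 2; N = 1,985 throughout.
models = {
    "1-Factor": (626.56, 20),
    "2-Factor": (347.32, 19),
    "4-Factor": (94.28, 15),
}
N = 1985

for name, (chi2, df) in models.items():
    print(f"{name}: RMSEA = {rmsea(chi2, df, N):.3f}")
```

Under this formula the three estimates round to .124, .093, and .052, which agrees with the RMSEA row of the table.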