ABOUT THIS ARTICLE: The actual article can be found at, and cited as:
Mayer, J.D., Salovey, P., Caruso, D.R., & Sitarenios, G. (2003). Measuring
emotional intelligence with the MSCEIT V2.0. Emotion, 3, 97-105.
ABOUT PUBLISHING THIS MANUSCRIPT ON THE WEB: A copyedited version of
this manuscript is scheduled to appear in the journal Emotion. Copyright is by
the American Psychological Association. According to APA guidelines on internet
publishing (http://www.apa.org/journals/posting.html), the article can only be
posted on the author's web site; "the posted article must carry an APA copyright
notice and include a link to the APA journal home page; APA does not permit
archiving with any other non-APA repositories; APA does not provide electronic
copies of the APA published version for this purpose; and, authors are not
permitted to scan in the APA published version." Please help us by abiding by
these guidelines and not posting the article elsewhere. Please refer to that APA
web site (http://www.apa.org) for more information. Thank you!
Measuring Emotional Intelligence with the MSCEIT V2.0
John D. Mayer
Peter Salovey
David R. Caruso
Gill Sitarenios
ABSTRACT
Does a recently introduced ability scale adequately measure
emotional intelligence (EI) skills? Using the Mayer-Salovey-Caruso Emotional
Intelligence Test (MSCEIT), we examined (a) whether members of a general
standardization sample and emotions experts identified the same test answers as
correct, (b) the test’s reliability, and (c) the possible factor structures of
EI. Twenty-one emotions experts endorsed many of the same answers as did 2112
members of the standardization sample, and exhibited superior agreement
particularly when research provides clearer answers to test questions (e.g.,
emotional perception in faces). The MSCEIT achieved reasonable reliability, and
confirmatory factor analysis supported theoretical models of EI. These findings
help clarify issues raised in earlier articles published in Emotion.
INTRODUCTION
The past 12 years have seen a growing interest in emotional
intelligence (EI), defined as a set of skills concerned with the processing of
emotion-relevant information, and measured with ability-based scales. A new
ability test of EI, the Mayer-Salovey-Caruso Emotional Intelligence Test,
Version 2.0 (MSCEIT), potentially improves upon earlier measures, and can inform
the debate over the scoring, reliability, and factor validity of such scales.
The MSCEIT is intended to measure four branches, or skill groups, of emotional
intelligence: (a) perceiving emotion accurately, (b) using emotion to facilitate
cognitive activities, (c) understanding emotion, and (d) managing emotion.
The MSCEIT is the most recent of a series of ability scales
of emotional intelligence. Its immediate predecessor was the MSCEIT Research
Version 1.1 (MSCEIT RV1.1), and before that, the Multifactor Emotional
Intelligence Scale (MEIS). Those tests, in turn, evolved out of earlier scales
measuring related constructs such as emotional creativity, social intelligence,
and non-verbal perception. The MSCEIT and its predecessors are based on the
idea that emotional intelligence involves problem solving with and about
emotions. Such ability tests measure something relatively different from, say,
self-report scales of emotional intelligence, with which correlations are rather
low. The ability to solve emotional problems is a necessary, although not
sufficient, ingredient to behaving in an emotionally adaptive way.
The MSCEIT V2.0 standardization process required collecting
data sets relevant to the several issues concerning emotional intelligence. For
example, in a recent exchange of papers in this journal, Roberts, Zeidner, and
Matthews raised concerns about the earlier-developed MEIS test of emotional
intelligence. These concerns included whether there is one set of correct
answers for an emotional intelligence test or whether expert and general (e.g.,
lay) opinions about answers diverge too much, whether such tests could be
reliable, and whether the factor structure of such tests was fully understood
and consistent with theory. In that same issue of the journal we, and others,
responded. Findings from the MSCEIT standardization data reported here have the
promise of more directly informing the debate, through empirical findings. The
analyses we present address three questions: (a) Do general and expert criteria
for correct answers to emotional intelligence test items converge? (b) What is
the reliability of such tests? (c) And, is the factor structure of such tests
consistent with theoretical models of EI?
THREE ISSUES ABOUT EMOTIONAL INTELLIGENCE ADDRESSED IN THE PRESENT STUDY
The Criteria for Correct Answers
One must know how to score a test’s items before one can
settle such issues as the test’s reliability and factor structure. Our model of
EI hypothesizes that emotional knowledge is embedded within a general, evolved,
social context of communication and interaction. Consequently, correct test
answers often can be identified according to the consensus response of a group
of unselected test-takers. For example, if the respondents identify a face as
predominantly angry, then that can be scored as a correct answer. We have
further hypothesized that emotions experts will identify correct answers with
greater reliability than average, particularly when research provides relatively
good methods for identifying correct alternatives, as in the case of facial
expressions of emotion, and the meaning of emotion terms.
If the general and the expert consensus diverge too far as to
the correct answers on a test, a complication arises, because the two methods
yield potentially different scores for each person. Roberts et al. used an
expert criterion that we had developed based on only two experts, and a general
consensus method to score the earlier MEIS measure, and found those methods did
not always converge. Some aggregation of experts beyond two is necessary,
however, to achieve a reliable identification of answers. Twenty-one emotions
experts were employed in the present study.
Issues of Reliability
The MSCEIT V2.0 must exhibit adequate levels of reliability,
as did the MEIS, MSCEIT RV1.0, and comparable psychological tests. As with its
predecessor tests, the four MSCEIT V2.0 branch scores (i.e., Perception,
Facilitating, Understanding, Management) draw on different tasks that include
different item forms; that is, the items are non-homogeneous. Under such
conditions, split-half reliability coefficients are the statistic of choice
(relative to coefficient alphas), as they involve the orderly allocation of
different item types to the two different halves of the test. The test-retest
reliability of the total MSCEIT score has been reported elsewhere, at r(60)
= .86.
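As an illustration only, a split-half coefficient of the kind described above might be computed as in the following sketch. The data layout, the function names, and the use of the Spearman-Brown step-up are assumptions made for the example; the text does not specify the exact computation used for the MSCEIT.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length (assumed step)."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(item_scores, half_a, half_b):
    """item_scores[p][i] is the scored response of person p on item i.

    half_a and half_b list the item indices assigned to each half, so that
    each item type can be allocated evenly to both halves.
    """
    a = [sum(person[i] for i in half_a) for person in item_scores]
    b = [sum(person[i] for i in half_b) for person in item_scores]
    return spearman_brown(pearson(a, b))

# Hypothetical item-level scores for four people on four items.
toy_scores = [
    [0.2, 0.3, 0.2, 0.3],
    [0.4, 0.5, 0.5, 0.4],
    [0.6, 0.6, 0.7, 0.7],
    [0.9, 0.8, 0.8, 0.9],
]
reliability = split_half_reliability(toy_scores, half_a=[0, 2], half_b=[1, 3])
```

Passing the two halves as explicit index lists keeps the allocation of item types under the analyst's control, which is the property the text cites in favor of split-half coefficients over coefficient alpha.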
Issues of Factor Structure
The factor structure of a test indicates how many entities it
plausibly measures. It is important to any debate over whether EI is a coherent,
unified, concept. In this specific case, it indicates how many dimensions of EI
the test is "picking up" -- one unified dimension, many related dimensions, or
something else. We believe that the domain of EI is well described by 1-, 2-,
and 4- oblique (correlated) factor models, as well as other equivalent models.
If the MSCEIT V2.0 shows similar structure to the MEIS for both expert and
general scoring, it would strengthen the argument that the theory of EI we
employ works across tests. Using the standardization sample, we performed
confirmatory factor analyses of the full scale MSCEIT V2.0, testing 1-, 2-, and
4- factor models to examine the range of permissible factor structures for
representing the EI domain.
METHOD
Participants
General Sample
The present sample consisted of 2,112 adult respondents, age 18 or older, who
completed the MSCEIT V2.0 in booklet or online forms prior to May, 2001. The
sample was composed of individuals tested by independent investigators in 36
separate academic settings from several countries. The investigators had
requested pre-release versions of the MSCEIT booklet or online forms (depending
on Internet availability and other criteria), and had submitted documentation of
their research qualifications and of approval of their research from their
sponsoring institution. Only basic demographic data were collected across
samples due to the diverse nature of the research sites.
Of those reporting gender, 1,217 (58.6%) were women and 859 (41.4%) were
men. The mean age of the sample was M = 26.25; S = 10.51, with roughly half the
sample college-aged (52.9%), and the rest ranging upward to 69 years old. The
participants were educationally diverse, with 0.6% reporting not completing high
school, 10.3% having completed only high school, 39.2% having some college or
university courses, 33.7% having completed college, and 16.1% holding Masters
level or higher degrees. The group was ethnically diverse as well, with 34.0%
Asian, 3.4% Black, 2.0% Hispanic, 57.9% White, and 2.3% other or mixed
ethnicity. Most participants came from the United States (1240), with others
from South Africa (231), India (194), the Philippines (170), the United Kingdom
(115), Scotland (122), and Canada (37); all testing was in English.
Expert Sample
The expert sample was drawn from volunteer members of the International
Society for Research on Emotions (ISRE) at its 2000 meeting. The Society was
founded in 1984 with the purpose of fostering interdisciplinary scientific study
of emotion. Membership is open to researchers and scholars who can demonstrate a
serious commitment to the investigation of the emotions. Twenty-one experts, 10
male and 11 female, from eight Western countries, participated. The sample of
experts had a mean age of 39.38 (S = 6.44; Range = 30 to 52); no data about their
ethnicity were collected.
The MSCEIT V2.0
The MSCEIT V2.0 is a newly developed, 141-item scale designed to measure four
branches (specific skills) of emotional intelligence: (a) Perceiving Emotions,
(b) Using Emotions to Facilitate Thought, (c) Understanding Emotions, and (d)
Managing Emotions. Each of the four branches is measured with two tasks.
Perceiving Emotions is measured with the Faces and Pictures tasks; Facilitating
Thought is measured with the Sensations and Facilitation tasks; Understanding
Emotions is measured with Blends and Changes; and Managing Emotions is measured
with Emotion Management and Emotional Relationships tasks.
Each of the 8 MSCEIT tasks is made up of a number of item parcels or
individual items. A parcel structure occurs, for example, when a participant is
shown a face (in the Faces task), and asked about different emotions in the face
in five subsequent items. The five items make up an item parcel because they are
related to the same face, albeit each asks about a different emotion. Other
items involve one response per stimulus, and are, in that sense, free-standing.
Response formats were intentionally varied across tasks so as to ensure that
results generalized across response methods, and to reduce correlated
measurement error. Thus, some tasks, such as Pictures, employed 5-point rating
scales, whereas other tasks, such as Blends, employed a multiple-choice response
format.
Briefly, in the Faces task (4 item parcels; 5 responses each), participants
view a series of faces and for each, respond on a five-point scale indicating
the degree to which a specific emotion is present in a face. The Pictures task
(6 parcels; 5 responses each) is the same as Faces except that landscapes and
abstract designs form the target stimuli, and the response scale consists of
cartoon faces (rather than words) of specific emotions. In the Sensations task
(5 parcels; 3 responses each), respondents generate an emotion and match
sensations to them. For example, they might generate a feeling of envy and
decide how hot or cold it is. In the Facilitation task (5 item parcels; 3
responses each), respondents judge the moods that best accompany or assist
specific cognitive tasks and behaviors, for example, whether joy might assist
planning a party. In the Blends task (12 free-standing items), respondents
identify emotions that could be combined to form other emotions. They might
conclude, for example, that malice is a combination of envy and aggression. In
the Changes task (20 free-standing items), respondents select an emotion that
results from the intensification of another feeling. For example, they might
identify depression as the most likely consequence of intensified sadness and
fatigue. Respondents in the Emotion Management task (5 parcels; 4 responses
each) judge the actions that are most effective in obtaining the specified
emotional outcome for an individual in a story. They are asked to decide, for
example, what a character might do to reduce her anger, or prolong her joy.
Finally, in the Emotional Relationships task (3 item parcels; 3 responses each),
respondents judge the actions that are most effective for one person to use in
the management of another person’s feelings. See the test itself, and its
manual, for more specific task information.
General and Expert Consensus Scoring
The MSCEIT yields a total score, two area scores (experiential and
strategic), four branch scores corresponding to the four-branch model, and eight
task scores. Each score can be calculated according to a general consensus
method. In that method, each one of a respondent’s answers is scored against the
proportion of the sample who endorsed the same MSCEIT answer. For example, if a
respondent indicated that surprise was "definitely present" in a face, and the
same alternative was chosen by 45% of the sample, the individual’s score would
be incremented by the proportion, .45. The respondent’s total raw score is the
sum of those proportions across the 141 items of the test. The other way to
score the test is according to an expert scoring method. That method is the
same, except that each of the respondent’s answers is evaluated against the
criterion formed by proportional responding of an expert group (in this case,
the 21 ISRE members). One of the purposes of this study was to compare the
convergence of these two methods.
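The proportional scoring just described can be sketched in a few lines. This is a minimal illustration, not the publisher's scoring routine: the function names and the toy data are hypothetical, and expert scoring is obtained simply by building the weights from the expert group instead of the general sample.

```python
from collections import Counter

def consensus_weights(criterion_group):
    """For each item, the proportion of the criterion group choosing each alternative.

    criterion_group[p][i] is the alternative person p chose on item i
    (a hypothetical data layout).
    """
    n_people = len(criterion_group)
    n_items = len(criterion_group[0])
    weights = []
    for i in range(n_items):
        counts = Counter(person[i] for person in criterion_group)
        weights.append({alt: c / n_people for alt, c in counts.items()})
    return weights

def raw_score(answers, weights):
    """Sum, across items, the proportion of the criterion group giving the same answer."""
    return sum(weights[i].get(answer, 0.0) for i, answer in enumerate(answers))

# Toy criterion group: four people answering two items on a 1-5 scale.
general_sample = [[5, 2], [5, 3], [4, 2], [5, 2]]
general_weights = consensus_weights(general_sample)

# A respondent who picks the majority answer on both items earns .75 + .75.
score = raw_score([5, 2], general_weights)
```

An alternative that nobody in the criterion group endorsed contributes zero, which is why the lookup falls back to 0.0.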
Procedure
The MSCEIT administration varied depending upon the research site at which
the data were collected (see "Participants," above). The MSCEIT was given to
participants to complete in large or small groups, or individually. Of the 2,112
participants, 1,368 took the test in a written form and 744 took the test in an
on-line form that presented the exact same questions and response scales, by
accessing a web-page. Those taking the pencil and paper version completed
scannable answer sheets that were entered into a database. Web page answers were
transmitted electronically. Prior research has suggested that booklet and
on-line forms of tests are often indistinguishable.
RESULTS
Comparison of Test-Booklet versus On-Line Administration Groups
We first compared the MSCEIT V2.0 booklet and on-line tests. For each, there
are 705 responses to the test (141 items X 5 responses each). The correlation
between response frequencies for each alternative across the two methods was
r(705) = .987. By comparison, a random split of the booklet sample alone, for
which one would predict there would be no differences, yields almost exactly the
same correlation between samples of r(705) = .998. In each case, a
scatterplot of the data indicated that points fell close to the regression line
throughout the full range of the joint distribution, and that the points were
spread through the entire range (with more points between .00 and .50, than
above .50). Even deleting 30 zero-response alternatives from the 705 lowered the
correlation by only .001 (in the random split case). The booklet and on-line
tests were, therefore, equivalent, and the samples were combined.
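The comparison above rests on correlating, across all response alternatives, the proportion of each sample endorsing that alternative. A rough sketch of that computation follows; the function names and toy data are hypothetical, and with 141 items and 5 alternatives the proportion lists would each hold 705 entries.

```python
import math

def endorsement_proportions(responses, n_alternatives=5):
    """Flatten a sample's answers into one proportion per (item, alternative).

    responses[p][i] is the alternative (1..n_alternatives) that person p
    chose on item i.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    return [
        sum(1 for person in responses if person[i] == alt) / n_people
        for i in range(n_items)
        for alt in range(1, n_alternatives + 1)
    ]

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy samples: four people per administration mode, two items each.
booklet = [[1, 2], [1, 3], [2, 3], [1, 3]]
online = [[1, 3], [2, 3], [2, 2], [1, 3]]

p_booklet = endorsement_proportions(booklet)
p_online = endorsement_proportions(online)
r = pearson(p_booklet, p_online)
```

Alternatives that no one chose enter the lists as zeros, which is the situation the text addresses when it reports the effect of deleting zero-response alternatives.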
Comparison of General vs. Expert Consensus Scoring Criteria
We next examined the differences between answers identified by the experts
and by the general consensus. We correlated the frequencies of endorsements to
the 705 responses (141 items x 5 responses) separately for the general consensus
group and the expert consensus group, and obtained an r(705)= .908 that,
although quite high, was significantly lower than the r = .998 correlation for
the random split (z(703) = 34.2, p < .01). The absolute difference in
proportional responding for each of the 705 alternatives also was calculated.
The mean absolute difference between the expert and general
groups was |MD(705)| = .08; S = .086,
which also was significantly greater than the difference between the booklet and
on-line samples of |MD(705)| = .025; S = .027; z(705) = 16.3, p < .01.
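For concreteness, the mean absolute difference in proportional responding can be computed as below. The proportions are invented for the example, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form of S was reported.

```python
import math

def mean_abs_difference(props_a, props_b):
    """Mean and standard deviation of |a - b| across response alternatives."""
    diffs = [abs(a - b) for a, b in zip(props_a, props_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample standard deviation (n - 1 denominator); an assumed convention.
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    return mean, sd

# Hypothetical endorsement proportions for the five alternatives of one item.
expert_props = [0.60, 0.30, 0.10, 0.00, 0.00]
general_props = [0.45, 0.35, 0.10, 0.05, 0.05]

mean_diff, sd_diff = mean_abs_difference(expert_props, general_props)
```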
We hypothesized that emotions experts would be more likely than others to
possess an accurate shared social representation of correct test answers; their
expertise, in turn, could provide an important criterion for the test. If that
were the case, then experts should exhibit higher inter-rater agreement than the
general group. To assess inter-rater agreement, we divided the expert group
randomly into two subgroups of 10 and of 11 experts each, and computed the modal
response for each of the 705 responses for the two subgroups of experts. The
Kappa representing agreement controlling for chance across the two expert
subgroups for the 5 responses of the 141 items was κ = .84. We then repeated
this process for two groups of 21 individuals,
randomly drawn from the standardization (general) samples and matched to the
expert group exactly on gender and age. Two control groups, rather than one,
were used to enhance our confidence in the results; the education level of the
comparison groups was comparable to that of the rest of the general sample. When
we repeated our reliability analysis for the two matched control groups, we
obtained somewhat lower Kappas of κ = .71 and .79.
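One reading of the subgroup analysis above is that each subgroup's modal answer per item is treated as a single "rater," and Cohen's kappa is computed between the two resulting answer lists. The sketch below follows that reading; it is an assumption about the procedure, and the function names and toy data are hypothetical.

```python
from collections import Counter

def modal_responses(responses):
    """Most frequent answer per item; ties go to the answer seen first."""
    n_items = len(responses[0])
    return [
        Counter(person[i] for person in responses).most_common(1)[0][0]
        for i in range(n_items)
    ]

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two lists of categorical answers.

    Undefined (division by zero) if both raters use one identical category
    for every item.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)

# Toy subgroup of three raters answering two items.
subgroup = [[1, 2], [1, 2], [3, 2]]
modes = modal_responses(subgroup)

# Toy modal answers for two subgroups across six items.
modes_a = [1, 2, 3, 3, 5, 1]
modes_b = [1, 2, 3, 4, 5, 1]
kappa = cohens_kappa(modes_a, modes_b)
```

The same `cohens_kappa` function could be applied to pairs of individual raters rather than subgroup modes when agreement is examined at the individual level.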
The same superiority of the expert group exists at an individual level as
well, where disaggregated agreement will be lower. The average inter-rater
Kappa coefficient of agreement across the 5 responses of the 141 items, for
every pair of raters within the expert group, was κ = .43, which significantly
exceeded the average Kappas of the two control groups (κ’s = .31 and .38; z’s
= 4.8 and 1.85; p < .05 to .01, one-tailed test).
Because general and expert groups both chose similar response alternatives as
correct, and experts have higher inter-rater reliability in identifying such
correct alternatives, members of the standardization sample should obtain
somewhat higher test scores when the experts’ criterion is used (before
normative scaling corrections are applied). Moreover, the expert group should
obtain the largest score advantages on skill branches where the experts most
agree, owing to the experts’ greater convergence for those responses. For
example, one might expect increased expert convergence for Branches 1 (emotional
perception) and 3 (emotional understanding) because emotions experts have long
focused on the principles of coding emotional expressions, as well as on
delineating emotional understanding. By contrast, research on Branches 2
(emotional facilitation of thought) and 4 (emotion management) is newer and has
yielded less consensus, and so experts might be more similar to the general
sample in such domains.
To test this idea, we conducted a 4 (branch) X 2 (consensus versus expert
scoring criterion) ANOVA on MSCEIT scores. The main effect for scoring criterion
was significant (F(1,1984) = 3464, p < .001), indicating, as
hypothesized, that participants obtained higher raw scores overall when scored
against the expert criteria. The main effect for branch was significant as well
(F(3,5952) = 1418, p < .001), indicating, unsurprisingly, that items on
some branches were harder than others. Finally, there was a branch by scoring
criterion interaction (F(3,5952) = 2611, p < .001).
Orthogonal contrasts indicated that participants scored according to the
expert criterion on Branches 1 and 3 obtained significantly higher scores than
when scored against the general consensus (see Table 1; F(1,1984) = 1631 and
5968, respectively; p’s < .001). Branch 2 (using emotions to facilitate thought)
showed a significant difference favoring general consensus (F(1,1984) = 711,
p < .001), and Branch 4 showed no difference (F(1,1984) = 1.57, n.s.). The
advantage for expert convergence on Branches 1 and 3 may reflect the greater
institutionalization of emotion knowledge among experts in these two areas.
In a final comparison of the two scoring criteria, participants’ tests were
scored using the general criterion, on the one hand, and the expert criterion,
on the other. The correlation between the two score sets ranged from
r(2004-2028) = .96 to .98 across the Branches, Areas, and Total EIQ scores, as
reported in Table 1.
The evidence from this study reflects that experts are more reliable judges,
and converge on correct answers where research has established clear criteria
for answers. If further studies bear out these results, the expert criteria may
prove superior to the general consensus.
Reliability of the MSCEIT V2.0
The MSCEIT has two sets of reliabilities depending upon whether a general or
expert scoring criterion is employed. That is because reliability analyses are
based on participants’ scored responses at the item-level, and scores at the
item-level vary depending upon whether responses are compared against the
general or expert criterion. The MSCEIT full-test split-half reliability is r(1985)
= .93 for general and .91 for expert consensus scoring. The two Experiential and
Strategic Area score reliabilities are r(1998) = .90 and .90, and r(2003)
= .88 and .86 for general and expert scoring, respectively. The four branch
scores of Perceiving, Facilitating, Understanding, and Managing range between
r(2004-2028) = .76 to .91 for both types of reliabilities (see Table 1). The
individual task reliabilities ranged from a low of
α(2004-2111) = .55 to a high of .88. However scored, reliability at the
total scale and area levels was excellent. Reliability at the branch level was
very good, especially given the brevity of the test. Compared to the MEIS,
reliabilities were overall higher at the task level but were sometimes lower
than is desirable. We therefore recommend test interpretation at the total
scale, area, and branch levels, with cautious interpretations at the task level,
if at all.
Correlational and Factorial Structure of the MSCEIT V2.0
As seen in Table 2, all tasks were positively intercorrelated using both
general (reported below the diagonal) and expert consensus scoring (above the
diagonal). The intercorrelations among tasks ranged from r(1995-2111) =
.17 to .59, p’s < .01, but with many correlations in the mid .30’s.
Confirmatory Factor Analyses
A factor analysis of the MSCEIT V2.0 can cross-validate earlier studies that
support 1-, 2-, and 4- factor solutions of the EI domain. The 1-factor, "g"
model, should load all eight MSCEIT tasks. The 2-factor model divides the scale
into an "Experiential" area (Perception and Facilitating Thought Branches) and a
"Strategic" area (Understanding and Managing Branches). The 4-factor model loads
the two designated Branch tasks on each of the 4 branches. These analyses are
particularly interesting given that the MSCEIT V2.0 represents an entirely new
collection of tasks and items.
We tested these models using AMOS, and cross-checked them using LISREL and
STATISTICA. The confirmatory models shared in common that (a) error variances
were uncorrelated, (b) latent variables were correlated, i.e., oblique, and (c)
all other paths were set to zero. In the 4-factor solution only, the two
within-area latent variable covariances (i.e., between Perceiving and
Facilitating, and between Understanding and Managing) were additionally
constrained to be equal so as to reduce a high covariance between the Perceiving
and Facilitating branches.
There was a progressively better fit of models from the 1- to the 4-factor
model, but all fit fairly well (4 vs. 2 factors: χ² = 253, df = 4, p < .001;
2 vs. 1 factors: χ² = 279, df = 1, p < .001; see Table 2 for further details).
The χ² values are a function of sample size, (N-1)F, and their size reflects the
approximately 2,000 individuals involved, more so than any absolute quality of
fit. Fit indices independent of sample size include the normed fit index (NFI),
which ranged from .99 to .98 across models, which is excellent, as well as the
Tucker-Lewis index (TLI), which ranged from .98 to .96, and was also quite good, and
Steiger’s root mean square error of approximation (RMSEA), which ranged from .12
for the 1-factor solution, which was a bit high, to an adequate .05 for the
4-factor solution. A model fit using the 4-factor solution with expert scoring
was equivalent to that of general scoring (e.g., NFI = .97; TLI = .96; RMSEA =
.04), and this correspondence between the expert and general consensus held for
the 1- and 2-factor models as well.
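To make the dependence of the chi-square statistic on sample size concrete, the sketch below computes a model chi-square as (N - 1)F and a point estimate of the RMSEA from a chi-square, its degrees of freedom, and N. The RMSEA formula shown is one common form of Steiger's index; the degrees of freedom and the N of 2,000 are assumptions chosen for illustration, since neither is reported exactly in the text.

```python
import math

def chi_square_from_fit(f_min, n):
    """Model chi-square as (N - 1) * F, where F is the minimized fit function."""
    return (n - 1) * f_min

def rmsea(chi2, df, n):
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# The same misfit F yields a chi-square ten times larger with ten times
# as many (N - 1) observations.
small_sample_chi2 = chi_square_from_fit(0.05, 201)
large_sample_chi2 = chi_square_from_fit(0.05, 2001)

# chi2 = 94 is the 4-factor value mentioned in the text; df = 15 and
# n = 2000 are assumed values, not figures reported by the authors.
four_factor_rmsea = rmsea(chi2=94, df=15, n=2000)
```

Because the RMSEA divides the excess of chi-square over its degrees of freedom by the sample size, it remains interpretable in samples as large as the present one, where the raw chi-square is nearly always significant.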
MacCallum and Austin have noted that alternative models may be found that fit
well, and this was the case with a 3-factor model described elsewhere that we
tested on these data. On the other hand, if one intentionally violates the
4-factor model, by shifting the second task on each branch to the next branch up
(and placing Branch 4’s second task back on Branch 1), the
χ² rises from 94 to 495, the fit
indices become unacceptable (e.g., TLI drops from .96 to .78), and 4 of 6
correlations among branches are estimated at higher than r = 1.0. The
4-Branch model, in other words, does create a fit to the data that can be
markedly superior to other models.
DISCUSSION
In this study, emotions experts converged on correct test
answers with greater reliability than did members of a general sample. The
experts’ convergence was better in areas where more emotions research has been
conducted. If future research confirms these findings, then an expert criterion
may become the criterion of choice for such tests. Critiques of the emotional
intelligence concept have suggested, based on the use of one or two emotions
experts, that expert and general consensus criteria might be quite different.
Others have argued that, as more experts are employed, and their answers
aggregated, their performance will resemble that of the consensus of a large,
general, group. The 21 experts in this study did exhibit superior agreement
levels relative to the general sample. At the same time, the expert and general
consensus criteria often agreed on the same answers as correct, r = .91.
Participants’ MSCEIT scores were also similar according to the two different
criteria, r = .98.
Reliabilities for Branch, Area, and Total test scores were
reasonably high for the MSCEIT, with reliabilities at the level of the
individual tasks ranging lower. Two-week test-retest reliabilities of r(60)
= .86 are reported elsewhere. In addition, the findings from the factor
analyses indicate that 1-, 2-, and 4-factor models provide viable
representations of the emotional intelligence domain, as assessed by the MSCEIT
V2.0.
No empirical findings, by themselves, can settle all the
theoretical issues surrounding EI that were reflected in the September 2001
issue of Emotion. In addition, the applied use of emotional intelligence
tests must proceed with great caution. That said, the findings here suggest that
those who employ the MSCEIT can feel more confident about the quality of the
measurement tool they use to assess EI. Ultimately, the value of the MSCEIT as a
measure of emotional intelligence will be settled by studies of its validity and
utility in predicting important outcomes over and above conventionally-measured
emotion, intelligence, and related constructs. A number of such studies related
to pro-social behavior, deviancy, and academic performance, have begun to appear.
In the meantime, we hope that the present findings inform and, by doing so,
clarify issues of scoring, of reliability, and of viable factorial
representations.
ACKNOWLEDGEMENTS AND AUTHOR NOTES
The authors gratefully acknowledge the assistance of Rebecca Warner and James
D. A. Parker, who served as expert consultants concerning the structural
equation models reported in this paper. In addition, Terry Shepard was
instrumental in preparing early versions of the tables.
The MSCEIT V2.0 is available from Multi-Health Systems (MHS) of Toronto,
Ontario in booklet and in web-based formats. MHS scores the test based on the
standardization sample and expert criteria; researchers have the further option
of developing their own independent norms. Researchers can obtain the MSCEIT
through special arrangements with MHS, which has various programs to accommodate
their needs.
All correspondence regarding this manuscript can be directed to: John D.
Mayer, Department of Psychology, Conant Hall, 10 Library Way, University of New
Hampshire, Durham, NH 03824.
REFERENCES
Arbuckle, J. L. (1999). Amos 4.0. Chicago, IL:
SmallWaters Corp.
Averill, J. R., & Nunley, E. P. (1992). Voyages of the
heart: Living an emotionally creative life. New York: Free Press.
Bentler, P. M., & Bonett, D. G. (1980). Significance
tests and goodness of fit in the analysis of covariance structures.
Psychological Bulletin, 88, 588-606.
Brackett, M., & Mayer, J. D. (2001, October).
Comparing measures of emotional intelligence. Paper presented at the
Third Positive Psychology Summit, Washington, DC.
Buchanan, T., & Smith, J. L. (1999). Using the internet
for psychological research: Personality testing on the World Wide Web.
British Journal of Psychology, 90, 125-144.
Cattell, R. B., & Burdsal, C. A. (1975). The radial
parcel double factoring design: A solution to the item-vs-parcel
controversy. Multivariate Behavioral Research, 10, 165-179.
Ciarrochi, J. V., Chan, A. Y., & Caputi, P. (2000). A
critical evaluation of the emotional intelligence concept. Personality
and Individual Differences, 28, 539-561.
Ekman, P., & Friesen, W. V. (1975). Unmasking the
face: A guide to recognizing the emotions from facial cues. Englewood
Cliffs, NJ: Prentice Hall.
Izard, C. E. (2001). Emotional intelligence or adaptive
emotions? Emotion, 1, 249-257.
Joreskog, K. G., & Sorbom, D. (2001). LISREL 8.51.
Lincolnwood, IL: Scientific Software, Inc.
Kaufman, A. S., & Kaufman, J. C. (2001). Emotional
intelligence as an aspect of general intelligence: What would David Wechsler
say? Emotion, 1, 258-264.
Legree, P. I. (1995). Evidence for an oblique social
intelligence factor established with a Likert-based testing procedure.
Intelligence, 21, 247-266.
MacCallum, R. C., & Austin, J. T. (2000). Applications of
structural equation modeling in psychological research. Annual Review of
Psychology, 51, 201-226.
Mayer, J. D., Caruso, D. R., & Salovey, P. (1999).
Emotional intelligence meets traditional standards for an intelligence.
Intelligence, 27, 267-298.
Mayer, J. D., & Salovey, P. (1997). What is emotional
intelligence? In P. Salovey & D. Sluyter (Eds.), Emotional development
and emotional intelligence: Educational implications (pp. 3-31). New
York: Basic Books.
Mayer, J. D., Salovey, P., & Caruso, D. R. (2002a).
Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) Item Booklet.
Toronto, Canada: MHS Publishers.
Mayer, J. D., Salovey, P., & Caruso, D. R. (2002b).
Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) User's Manual.
Toronto, Canada: MHS Publishers.
Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios,
G. (2001). Emotional intelligence as a standard intelligence. Emotion, 1,
232-242.
Nunnally, J. C. (1978). Psychometric theory. New
York: McGraw-Hill.
Ortony, A., Clore, G. L., & Collins, A. M. (1988). The
cognitive structure of emotions. Cambridge: Cambridge University Press.
O'Sullivan, M., & Guilford, J. P. (1976). Four factor
tests of social intelligence: Manual of instructions and interpretations.
Orange, CA: Sheridan Psychological Services.
Roberts, R. D., Zeidner, M., & Matthews, G. (2001). Does
emotional intelligence meet traditional standards for an intelligence? Some
new data and conclusions. Emotion, 1, 196-231.
Rosenthal, R., Hall, J. A., DiMatteo, M. R., Rogers, P.
L., & Archer, D. (1979). The PONS Test. Baltimore, MD: Johns Hopkins
University Press.
Schaie, K. W. (2001). Emotional intelligence:
Psychometric status and developmental characteristics: Comments on Roberts,
Zeidner, and Matthews (2001). Emotion, 1, 243-248.
Scherer, K. R., Banse, R., & Wallbott, H. G. (2001).
Emotion inferences from vocal expression correlate across languages and
cultures. Journal of Cross-Cultural Psychology, 32, 76-92.
Statsoft. (2002). Statistica 6.0 Software. Tulsa, OK:
Statsoft, Inc.
Steiger, J. H. (1990). Structural model evaluation and
modification: An interval estimation approach. Multivariate Behavioral
Research, 25, 173-180.
Tucker, L. R., & Lewis, C. (1973). The reliability
coefficient for maximum likelihood factor analysis. Psychometrika, 38,
1-10.
Zeidner, M., Matthews, G., & Roberts, R. D. (2001). Slow
down, you move too fast: Emotional intelligence remains an "elusive"
intelligence. Emotion, 1, 265-275.
TABLES
| Table 1: Unscaled Score Means (a), Standard Deviations (a), Reliabilities (b), and Correlations (c) for the MSCEIT V2.0, for General and Expert Scoring (d) |
| | Total EI Score |
| All Statistics | General: M = .48; S = .07; Reliab. = .93 | Expert: M = .50; S = .08; Reliab. = .91 |
| | Experiential EI Area Score: General | Experiential EI Area Score: Expert | Strategic EI Area Score: General | Strategic EI Area Score: Expert |
| M (S) | .49(.08) | .50(.09) | .47(.09) | .51(.10) |
| Reliability | .90 | .90 | .88 | .86 |
| | Perception Branch: Gen. | Perception Branch: Exp. | Facilitation Branch: Gen. | Facilitation Branch: Exp. | Understanding Branch: Gen. | Understanding Branch: Exp. | Management Branch: Gen. | Management Branch: Exp. |
| M (S) | .50(.10) | .54(.13) | .47(.09) | .45(.08) | .53(.10) | .60(.13) | .42(.10) | .42(.09) |
| Reliability | .91 | .90 | .79 | .76 | .80 | .77 | .83 | .81 |
| | Faces Task | Pictures Task | Facilit. Task | Sensat. Task | Changes Task | Blends Task | Manage. Task | Relations. Task |
| General: M (S) | .50(.12) | .50(.13) | .44(.09) | .50(.12) | .56(.10) | .50(.12) | .41(.09) | .43(.12) |
| Expert: M (S) | .57(.18) | .50(.13) | .41(.07) | .50(.12) | .63(.14) | .57(.16) | .40(.09) | .43(.12) |
| General: Rel. | .80 | .88 | .64 | .65 | .70 | .66 | .69 | .67 |
| Expert: Rel. | .82 | .87 | .63 | .55 | .68 | .62 | .64 | .64 |
| Task Intercorrelations (General Consensus Scoring Below the Diagonal; Expert Above) |
| | Faces | Pictures | Facilitation | Sensations | Changes | Blends | Management | Relationships |
| Faces | 1.000 | .356 | .300 | .315 | .191 | .157 | .191 | .179 |
| Pictures | .347 | 1.000 | .288 | .400 | .286 | .263 | .282 | .271 |
| Facilitation | .340 | .328 | 1.000 | .313 | .283 | .242 | .262 | .262 |
| Sensations | .336 | .402 | .352 | 1.000 | .388 | .374 | .384 | .415 |
| Changes | .225 | .282 | .255 | .382 | 1.000 | .575 | .437 | .417 |
| Blends | .171 | .260 | .224 | .375 | .589 | 1.000 | .425 | .424 |
| Management | .232 | .300 | .299 | .395 | .417 | .416 | 1.000 | .542 |
| Relationships | .191 | .275 | .269 | .411 | .395 | .409 | .575 | 1.000 |
a. These M's and SD's are unscaled; final MSCEIT test scores
for both general and expert scoring are converted to a standard IQ scale where
M = 100 and SD = 15.
b. Split-half reliabilities are reported at the total test,
area, and branch score levels due to item heterogeneity. Coefficient alpha
reliabilities are reported at the subtest level due to item homogeneity.
c. Correlations are based on the sample for which all data at
the task level were complete; N = 1985. Differences of roughly .001
between the r's are significant at the .05 level. Significance testing is more
accurate when employing Fisher z' transformed versions of the r's.
d. Apart from the correlations (see note c), N
for the overall scale was 2112; N's for the branch scores were
Perception: 2015, with task N's between 2018 and 2108; Facilitating:
2028, with individual task N's between 2034 and 2103; Understanding: 2015,
with individual task N's between 2016 and 2111; Managing: 2088, with
individual task N's from 2004 to 2008.
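The rescaling mentioned in note a and the Fisher transformation mentioned in note c can be illustrated with a minimal sketch. It assumes a plain linear conversion to the IQ metric (the published norming procedure may differ in detail), and it shows only a confidence interval for a single correlation, not a test of the difference between two correlations from the same sample. The function names `to_standard_score` and `fisher_ci` and the raw score of .55 are hypothetical; the means, standard deviations, correlation, and N are taken from Table 1.

```python
import math
from statistics import NormalDist


def to_standard_score(raw: float, mean: float, sd: float) -> float:
    """Linearly rescale a raw score to a scale with M = 100 and SD = 15.

    Assumption: a plain linear transformation; the test publisher's
    actual norming procedure may differ.
    """
    return 100.0 + 15.0 * (raw - mean) / sd


def fisher_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Confidence interval for a single correlation via Fisher's z."""
    z = math.atanh(r)                  # Fisher r-to-z transformation
    se = 1.0 / math.sqrt(n - 3)        # standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    # Back-transform the interval limits from z to r.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


if __name__ == "__main__":
    # Hypothetical raw total score of .55 under general scoring (M = .48, S = .07).
    print(round(to_standard_score(0.55, 0.48, 0.07), 1))
    # Management-Relationships correlation under general scoring, N = 1985.
    low, high = fisher_ci(0.575, 1985)
    print(f"{low:.3f} to {high:.3f}")
```

Because the interval is built on the z scale and then back-transformed, it is asymmetric around r, which is one reason the transformed values are preferred for significance testing.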
| Table 2. MSCEIT V2.0 Parameter Estimates of the Observed Tasks on the Latent Variables, and Goodness-of-Fit Statistics for the One, Two, and Four Factor Models |
| Theoretical Arrangement | | Model Tested (Factor and Loading) | | |
| Branches | Tasks | 1-Factor | 2-Factor | 4-Factor |
| Branch 1: | | I | I | I |
| Perceiving | Faces | .40 | .50 | .55 |
| | Pictures | .50 | .59 | .68 |
| Branch 2: | | | | II |
| Facilitating | Facilitation | .46 | .54 | .53 |
| | Sensations | .64 | .71 | .72 |
| Branch 3: | | | II | III |
| Understanding | Changes | .65 | .68 | .77 |
| | Blends | .64 | .67 | .76 |
| Branch 4: | | | | IV |
| Managing | Emotion Man. | .68 | .70 | .76 |
| | Emotional Rel. | .66 | .68 | .74 |
| Goodness of Fit Index | | Model Fit | | |
| Chi Square | | 626.56 | 347.32 | 94.28 |
| Chi Square df | | 20 | 19 | 15 |
| Normed Fit Index (NFI) | | .988 | .993 | .977 |
| Tucker-Lewis Index (TLI) | | .979 | .988 | .964 |
| Root Mean Square Error of Approximation (RMSEA) | | .124 | .093 | .052 |
| N | | 1,985 | 1,985 | 1,985 |
Error terms in all models were
uncorrelated. In the 4-Branch model, the two within-area covariances (i.e., the
covariance between Perception and Using, and between Understanding and
Management) were constrained to be equal to one another (see text).
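The RMSEA row of Table 2 can be reproduced from the chi-square, degrees of freedom, and N reported in the same table. The sketch below assumes the common point-estimate formula with N - 1 in the denominator; the manuscript does not state which variant the software used, and the function name `rmsea` is hypothetical.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of the root mean square error of approximation.

    Assumption: RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1))).
    Some programs divide by N instead of N - 1.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Chi-square and df values as reported in Table 2; N = 1,985 for every model.
models = {
    "1-Factor": (626.56, 20),
    "2-Factor": (347.32, 19),
    "4-Factor": (94.28, 15),
}

for name, (chi2, df) in models.items():
    print(f"{name}: RMSEA = {rmsea(chi2, df, 1985):.3f}")
```

Under this assumption the three values round to .124, .093, and .052, matching the table.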