THE CONJUNCTION FALLACY: A DERIVED STIMULUS RELATIONS CONCEPTUALIZATION AND DEMONSTRATION EXPERIMENT
Scott T Gaynor, Yukiko Washio, Frederick Anderson. The Psychological Record. Gambier: Winter 2007. Vol. 57, Iss. 1; pg. 63, 23 pgs

Abstract (Summary)
There is a long and fruitful history within behavior analysis of providing interpretations of complex human behavior (e.g., verbal behavior, problem solving, remembering) in terms of empirically established principles (see Donahoe & Palmer, 1994; Skinner, 1953, 1957). The present paper has three goals: (a) describe a "classic" error in logical reasoning from the cognitive literature: the conjunction fallacy (CF); (b) provide an interpretation of CF responding based on a learning history and current context that promotes particular derived stimulus relations; and (c) describe a demonstration experiment testing the plausibility of the interpretation.


Copyright The Psychological Record Winter 2007

[Headnote]
The conjunction fallacy (CF) comes about when the occurrence of two events is rated as more likely than either in isolation. A typical participant in a CF study is presented with a description of a hypothetical individual (i.e., a compound sample stimulus) and then asked to make judgments as to the likelihood that that person engages in a particular vocation, avocation (i.e., single comparison stimuli), or both (i.e., a compound comparison stimulus). The CF is witnessed when the combination is judged as more likely than either the vocation or avocation alone. Commission of the CF is often attributed to participants' judgment being guided by representativeness (i.e., the representativeness heuristic) rather than the laws of probability (Tversky & Kahneman, 1983). This paper provides a behavioral interpretation of CF responding based on derived/emergent stimulus relations and presents data from 27 undergraduates who participated in a study designed to test the plausibility of the interpretation. Using nonsense words and a many-to-one training structure, prerequisite baseline relationships for establishing two 6-member stimulus equivalence classes were trained. Next was an analogue CF test in which participants rated the likelihood that former sample stimuli (now presented as comparison stimuli, either alone or in sets of 2) were a correct answer given a compound sample stimulus composed of 3 former sample stimuli (all of which were from 1 class but had never been directly related during training). Following this test phase the emergence of equivalence classes was formally assessed and a standard CF scenario presented. The incidence of CF responding under the analogue procedure (60%) was similar to that reported in standard scenarios and commission of the CF was significantly related to whether equivalence classes formed. These data constitute a preliminary demonstration in support of the derived equivalence relations interpretation.


There is a long and fruitful history within behavior analysis of providing interpretations of complex human behavior (e.g., verbal behavior, problem solving, remembering) in terms of empirically established principles (see Donahoe & Palmer, 1994; Skinner, 1953, 1957). Because these interpretations are constrained, relying only on principles and processes established under controlled laboratory conditions, they are considered scientific and beyond mere speculation (Donahoe & Palmer, 1994, p. 325). Despite the utility and conceptual elegance of such scientific interpretations, there have been a number of suggestions that behavior analysts buttress their interpretations with empirical study of complex human behavior (Fantino, 1998; Marr, 1984). Currently, empirical study of complex human behavior is a domain dominated by cognitive psychology. However, over 25 years ago Sidman (1978; p. 267) opined that "stimulus-stimulus relations ... are close to the heart of cognitive theory's subject matter" and encouraged behavior analysts to pay more attention to understanding relations among stimuli. Indeed, a great deal of behavior analytic work on stimulus relations has been done in the last 30 years and the developments in the study of what have been described as emergent (Sidman, 1994) or derived (Hayes, Barnes-Holmes, & Roche, 2001) stimulus relations appear relevant for better understanding complex "cognitive" performances.

The present paper has three goals: (a) describe a "classic" error in logical reasoning from the cognitive literature: the conjunction fallacy (CF); (b) provide an interpretation of CF responding based on a learning history and current context that promotes particular derived stimulus relations; and (c) describe a demonstration experiment testing the plausibility of the interpretation.

The Conjunction Fallacy

Human reasoning often fails to conform to expectations from probability theory and is, in that sense, considered fallacious. Consider the Bill scenario from a standard CF experiment. In the experiment the participant is first presented with descriptive framing information (a personality sketch or vignette) about a hypothetical individual named Bill.

Bill is 34 years old. He is an intelligent, but unimaginative, compulsive, and generally lifeless man who in school was strong in mathematics but weak in social studies and the humanities.

Next, the participant is shown a number of vocations and/or avocations and asked to rate how likely it is that the hypothetical individual, Bill, engages in these activities.

A. Bill is an accountant.

B. Bill plays jazz for a hobby.

C. Bill is an accountant who plays jazz for a hobby.

Probability theory, specifically the conjunction rule, holds that A and B should each be rated as at least as probable as C, because the probability of the joint occurrence of two events, P(A&B), cannot exceed the probability of occurrence of either event alone, P(A) or P(B). This holds whether or not the two events are independent. However, a common outcome is that the likelihood ratings show the following pattern: A > C > B. The sizable number of participants (40-90% across studies; see Fantino, Kulik, Stolarz-Fantino, & Wright, 1997; Gavanski & Roskos-Ewoldsen, 1991; Stolarz-Fantino, Fantino, & Kulik, 1996; Tversky & Kahneman, 1983), including those versed in statistics and logic, who rate C as more probable than B demonstrate the CF (Tversky & Kahneman, 1983).
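The conjunction rule can be checked numerically. The following is a minimal sketch, assuming a made-up joint distribution over "is an accountant" and "plays jazz"; the numbers and the function name `marginals_and_conjunction` are illustrative assumptions, not values from any of the cited studies.

```python
from fractions import Fraction


def marginals_and_conjunction(joint):
    """Return P(A), P(B), and P(A&B) from a joint distribution.

    `joint` maps (is_accountant, plays_jazz) -> probability.
    """
    p_a = sum(p for (a, _), p in joint.items() if a)
    p_b = sum(p for (_, b), p in joint.items() if b)
    p_ab = joint.get((True, True), Fraction(0))
    return p_a, p_b, p_ab


# Hypothetical joint distribution (illustrative numbers only).
joint = {
    (True, True): Fraction(5, 100),    # accountant who plays jazz
    (True, False): Fraction(60, 100),  # accountant, no jazz
    (False, True): Fraction(3, 100),   # plays jazz, not an accountant
    (False, False): Fraction(32, 100),
}

p_a, p_b, p_ab = marginals_and_conjunction(joint)

# The conjunction is one cell of each marginal sum, so it can never be larger.
print(p_ab <= p_a and p_ab <= p_b)
```

Because P(A&B) is one of the terms summed to obtain P(A) and also one of the terms summed to obtain P(B), the inequality holds for any joint distribution, which is why a rating pattern with C > B departs from the rule.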

In the cognitive literature, one prominent interpretation attributes the CF to subjects' reasoning based on representativeness (Tversky & Kahneman, 1983). According to Tversky and Kahneman "Representativeness is an assessment of the degree of correspondence between a sample and a population, an instance and a category, an act and an actor or, more generally, between an outcome and a model" (Tversky & Kahneman, 1983; p. 295). In other words, when the participant is given a job or hobby to rate, he or she assesses whether the traits in the personality sketch are prototypical (stereotypical, representative) of someone with that occupation or avocation. By design, the description of Bill was made representative of an accountant and unrepresentative of a jazz enthusiast (Tversky & Kahneman, 1983). As such, A, because it includes a highly representative possibility, is rated as more likely than B, the unrepresentative possibility. Moreover, C is rated as more likely than B because it includes one possibility that is representative of the personality description, while B contains only the unrepresentative possibility. Because representativeness "do(es) not conform to the extensional logic of probability theory ... a conjunction can be more representative than one of its constituents" (Tversky & Kahneman, 1983; p. 295).

The CF is a robust finding that has received much attention in the cognitive literature and, in the last decade, has received attention from behavior analysts (see Fantino, 1998; Fantino et al., 1997; Stolarz-Fantino et al., 1996). If developing and evaluating behavioral interpretations of complex human reasoning is of value, studying a well-documented pattern of responding such as the CF may be useful. The present analysis and study builds on the significant contributions of Fantino, Stolarz-Fantino, and colleagues as well as the work of cognitive and social psychologists studying the CF. The most important aspects of this prior work for the present purposes are reviewed below.

Selective Review of the Conjunction Fallacy Literature

The presentation of the conjunction is different from the presentation of the constituents in that the former can be considered a compound stimulus (i.e., accountant and jazzist) while the latter includes only an individual stimulus (i.e., accountant). Using a choice procedure, Fantino and Savastano (1996) found that, in the absence of direct experience with a low payoff compound stimulus, human participants showed a preference for compound stimuli over individual stimuli. Specifically, when offered a choice between a compound stimulus made up of one stimulus associated with a high probability of reinforcement (.80) coupled with one stimulus associated with a low probability of reinforcement (.20), participants preferred the compound to both individual stimuli. Participants chose the compound 88% of the time over the stimulus signaling the low rate of reinforcement and 69% of the time over the stimulus signaling the high rate of reinforcement. The choice of the compound stimulus suggested that participants summated, possibly due to a life history where stimulus values combine to predict the availability of reinforcement (Fantino & Savastano, 1996).

Although human participants may respond more to compound stimuli than to their components, neither this observation nor simple summation alone provides a sufficient analysis of CF responding. Consider the data from Stolarz-Fantino and colleagues (1996; Question 3) where responding was reported for a subset of participants who were given no descriptive framing information but were told only that there was an 80% probability Bill was an accountant and a 20% probability that he played jazz as a hobby. When these participants estimated the likelihood of the conjunction (from 0, virtually impossible, to 100, virtually certain), only 30% demonstrated the CF. There is clearly a difference between experienced probabilities (as in Fantino & Savastano, 1996) and textual statements describing probabilities (as in Stolarz-Fantino et al., 1996). The latter, by explicitly providing the probabilities, emphasizes the role of ignorance of how probabilities combine in the commission of the CF, and the data suggest that only a minority of the sample showed the CF under these conditions. Moreover, because only the percentage of participants demonstrating the CF was reported, and not the actual likelihood ratings, it is not clear whether this 30% of the sample did indeed summate. Additional relevant data come from a condition where the Bill scenario was presented without the descriptive framing information and participants were required to make a forced choice between B and C. Under these circumstances only 24% of the participants chose the conjunction, suggesting that the majority of respondents did not simply favor compound stimuli (Stolarz-Fantino et al., 1996; Experiment 1).

The preceding makes clear that the CF occurs, in a minority of participants, in the absence of the descriptive framing information. However, the framing information appears to exert a great deal of control over subsequent responding. When Stolarz-Fantino and colleagues (1996) presented the Bill scenario with the framing information and required forced choice between B and C, 72% now chose the conjunction, compared to the 24% when the framing information was absent. These data replicate the finding reported by Tversky and Kahneman (1983) that 85% of respondents chose the conjunction under a forced choice procedure. Similarly, using a between groups design and a procedure where participants gave likelihood ratings (rather than making a forced choice), 78% of those who saw the framing information displayed the CF compared to 41% who did not see the framing information (Stolarz-Fantino et al., 1996; Experiment 4). Thus, whereas the CF may occur without the framing information, the framing information appears to exert a significant amount of control over subsequent responding.

The framing information, from Kahneman & Tversky's perspective, establishes the reference class/category/mental prototype/mental model against which the representativeness/correspondence of the items A, B, and C will be evaluated. Indeed, a study reviewed by Tversky and Kahneman (1983; p. 297) showed that when participants were given the framing information and then asked to rank the representativeness of each statement, according to the degree that Bill "resembles the typical member of that class," 87% rated A > C > B; thus, the pattern based on representativeness ratings was in close correspondence with the pattern observed from probability ratings.

A number of authors have suggested that the likelihood rating for the conjunction represents an average of the constituents (Dougherty, Gettys, & Ogden, 1999; Fantino et al., 1997; Gavanski & Roskos-Ewoldsen, 1991; Yates & Carlson, 1986). It has been further suggested that averaging appears most likely when the response options contain one highly likely and one highly unlikely event, such as in the Bill scenario. However, evidence of averaging has been found even when constituents were equally likely (see Fantino et al., 1997). As an illustration, Gavanski and Roskos-Ewoldsen (1991; Experiment 2) found that more than three quarters of their sample demonstrated the CF in response to the Bill scenario, with mean probability ratings of 69% for A, 19% for B, and 42% for C (44% is the exact midpoint). These data suggest that the descriptive framing information may impact the representativeness of each individual constituent stimulus and that these may be averaged in determining the likelihood of the conjunction.
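The averaging account can be made concrete with the mean ratings quoted above. The sketch below is illustrative only: the function names are hypothetical, the CF is operationalized as the conjunction being rated above at least one constituent, and the product rule is shown purely as a contrast that assumes the two constituents are independent (an assumption not made in the source).

```python
def midpoint(p_a, p_b):
    """Averaging account: the conjunction rating falls midway between constituents."""
    return (p_a + p_b) / 2


def product_if_independent(p_a, p_b):
    """Normative value of the conjunction, assuming independent constituents (percent scale)."""
    return p_a * p_b / 100


def shows_cf(p_a, p_b, p_conj):
    """Conjunction fallacy: the conjunction is rated above at least one constituent."""
    return p_conj > min(p_a, p_b)


# Mean ratings (percent) from the Bill scenario as reported in the text.
mean_a, mean_b, mean_c = 69, 19, 42

print(midpoint(mean_a, mean_b))                # averaging prediction
print(product_if_independent(mean_a, mean_b))  # contrast under assumed independence
print(shows_cf(mean_a, mean_b, mean_c))        # observed mean pattern A > C > B
```

The observed mean for C sits close to the averaging prediction and well above both the rating for B and the value the product rule would give under the independence assumption.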

Derived Stimulus Relations Interpretation of the Conjunction Fallacy

When verbally competent individuals are given conditional discrimination (match-to-sample) training, that is, training during which in the presence of stimulus A1 selecting stimulus B1, instead of B2, is reinforced (A1→B1) and in the presence of stimulus B1 selecting stimulus C1, instead of C2, is reinforced (B1→C1), they often derive the relations B1→A1, C1→B1, A1→C1, and C1→A1 without additional training. Similarly, simultaneous training to relate A2 with B2, instead of B1, and B2 with C2, instead of C1, results in the emergence of relations B2→A2, C2→B2, A2→C2, and C2→A2. Thus, although only 4 stimulus relations are taught, 12 total relations often result. When the presence of the derived or emergent relations is documented, the stimuli are considered part of two equivalence classes (A1-B1-C1 and A2-B2-C2). The important process(es) leading to these emergent relations remains controversial (see accounts by Hayes et al., 2001; Horne & Lowe, 1996; Sidman, 1994, 2000); however, that these untrained relations are often produced following conditional discrimination training is well documented.
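The count of 4 trained and 12 total relations can be reproduced by treating each trained relation as a link between two stimuli and listing every ordered pair within the resulting classes. This is only a bookkeeping sketch with hypothetical function names; it enumerates the relations that equivalence would imply and says nothing about the behavioral process that produces them.

```python
from itertools import product


def equivalence_classes(trained):
    """Group stimuli that are connected, directly or indirectly, by trained relations."""
    classes = []
    for sample, comparison in trained:
        hits = [c for c in classes if sample in c or comparison in c]
        merged = {sample, comparison}.union(*hits)
        classes = [c for c in classes if c not in hits] + [merged]
    return classes


def derived_relations(trained):
    """Ordered pairs implied by symmetry and transitivity but never trained directly."""
    trained = set(trained)
    implied = set()
    for cls in equivalence_classes(trained):
        implied |= {(x, y) for x, y in product(cls, repeat=2) if x != y}
    return implied - trained


# The four trained relations from the text, written as (sample, comparison).
trained = [("A1", "B1"), ("B1", "C1"), ("A2", "B2"), ("B2", "C2")]
derived = derived_relations(trained)

print(len(trained), len(derived), len(trained) + len(derived))
print(sorted(derived))
```

Reflexive pairs such as A1→A1 are left out, which matches the count of 12 given in the text.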

In laboratory studies with human subjects the stimulus materials are generally nonsense words or abstract shapes, which ensures that participants do not have preexisting histories of relating the stimuli to one another prior to exposure to the conditional discrimination training. This experimental preparation is critical in establishing that the relations truly developed as a result of experimental events. The structure of the training has generated interest in its own right (Saunders & Green, 1999); however, the outcome (i.e., that equivalence classes emerge) has been of particular interest because it may provide an experimental model of the flexibility and productivity often demonstrated by humans in the cognitive/verbal realm (Hayes et al., 2001; Sidman, 1994). For instance, stimulus equivalence research suggests that once a relation between x and y is established, a relation between z and y will yield a relation between z and x. Conditional discrimination training is experimentally useful in documenting the emergence of untrained, derived relations, but would be a relatively crude and cumbersome way of teaching verbally competent humans what goes with what. Instead we might simply tell them the relations. This is exactly what happens in traditional conjunction error experiments.

Specifically, the information given in the personality sketch can be viewed as a verbal description of a set of conditional discriminations. That is, the participant is told that Bill is a stimulus (e.g., X1) in the presence of which selecting intelligent (B1), unimaginative (C1), lifeless (D1), compulsive (E1), and mathematical (F1) would be reinforced, while selecting social science or fine arts related stimuli would fail to yield reinforcement. In verbally competent humans such a description might be expected to result in derived (bidirectional) relationships between Bill and his attributes, such that not only if given Bill would the subject now choose unimaginative (over, say, imaginative), but if given unimaginative the subject would be likely to select Bill (over, say, Tom). In addition, the verbal rule may also instantiate relations between the attributes (e.g., if X is related to B and X is related to C then C is related to B). Moreover, these relations are also likely, based on the subject's preexperimental history, to have some significant strength prior to the subject entering the experiment. That is, the terms mathematics, intelligence, rigidity, and lifelessness may be correlated for the subject upon entry into the experiment.

The present account suggests that participants enter traditional CF experiments with the necessary learning history to readily derive the relevant stimulus relations described above. The personality sketch capitalizes on this history, establishing the equivalence classes against which the comparison items will be evaluated. Specifically, relations of similarity are described between certain stimuli and relations of oppositionality with others promoting the formation of (emergence of) two stimulus equivalence classes that stand in opposition to one another (see Figure 1, top panel). Participants' likelihood ratings are then essentially a match-to-class or correspondence rating. That is, the participant is asked how well, given the information presented and his or her history, would accountant, jazzist, and accountant and jazzist fit into the equivalence relations involving the other attributes ascribed to Bill (see Figure 1, middle panel).

So why do subjects rate "Bill is an accountant who plays jazz for a hobby" as more probable than "Bill plays jazz for a hobby"? As each test sentence begins with reference to Bill, the stimulus "Bill" alone cannot be the source of control over differential responding. What seems important is the set of relations between the attributes ascribed to Bill (i.e., intelligent, unimaginative, lifeless, inflexible, and mathematical) and whether accountant, jazzist, or accountant and jazzist are likely members of such a set of equivalence relations. As suggested above (and following Tversky & Kahneman, 1983), our interpretation is that subjects enter CF experiments with established stimulus relations such that accountant is more likely to share class membership with intelligent, mathematical, lifeless, unimaginative, and inflexible than jazzist. Indeed jazzist may be seen as a nonmember and more likely to fall in a class consisting of musical, creative, vital, imaginative, and flexible. Thus, selecting accountant will be the strongest, most prepotent, response with jazzist being the weakest. The conjunction, which involves two stimuli having opposing class membership (at least in the context of the experiment), typically produces a rating of less likely than accountant, but more likely than jazzist. This result is similar to what Skinner (1953) referred to as algebraic summation (see also Yates & Carlson, 1986). To illustrate algebraic summation, consider an example involving reflexive behavior offered by Skinner (1953; p. 219): "One reflex may call for the extension of a leg, another for its flexion. Under certain circumstances the occurrence of both stimuli at the same time produces an intermediate position of the leg." In the context of the conditional stimuli (mathematical, lifeless, unimaginative), accountant is a discriminative stimulus (S^D; yielding a high likelihood rating), and jazzist is an S-delta (S^Δ; yielding a low likelihood rating). When the two are presented as a conjunction, so that both an S^D and an S^Δ are present, the participant is confronted with two incompatible response tendencies, resulting in an intermediate response (a middling likelihood rating; see Figure 1, middle panel).

The Present Experiment

Based on the above interpretation, a demonstration experiment was conducted to see if the CF could be modeled using nonsense words and an analogue procedure. Specifically, participants were given conditional discrimination training to provide them with the prerequisite history for derivation of two six-member equivalence classes (see Figure 1, bottom panel). Using a match-to-sample task, subjects were taught the following unidirectional relations to 100% accuracy: B1→A1, C1→A1, D1→A1, E1→A1, F1→A1 and B2→A2, C2→A2, D2→A2, E2→A2, F2→A2. This is the prerequisite training for establishing the equivalence classes depicted in Figure 1, bottom panel. However, before testing for the emergence of the equivalence relations, participants were shown a series of compound conditional (sample) stimuli. Each compound conditional stimulus contained only members from one potential equivalence class (e.g., D1, E1, F1) and was considered analogous to the presentation of the framing information in the Bill scenario. In the presence of the compound conditional (sample) stimulus, participants were asked to rate the likelihood that various other (comparison) stimuli (e.g., B1 alone, B2 alone, and the compound B1,B2, which were presented sequentially) were a correct response. The B1, B2, and compound B1,B2 stimuli were analogous to the likely constituent (e.g., accountant), unlikely constituent (e.g., plays jazz), and conjunction (accountant who plays jazz) in the Bill scenario.

This analogue procedure allowed us to explore three primary research questions:

1. What proportion of participants would show the CF pattern under this analogue procedure and how does this compare to rates on the Bill scenario?

2. Did the formation of equivalence relations alter the propensity to demonstrate the analogue CF? Notice, at the time of testing for responding analogous to the CF, the participants had received no direct training in which B1 was the sample and D1, E1, or F1 was a comparison or vice versa. Testing for the emergence of the equivalence classes was done without feedback following the collection of likelihood ratings.

3. What was the magnitude of the CF, and did it appear to represent an average?

Method

Participants

Twenty-seven Western Michigan University undergraduate students (89% female, 82% Caucasian, 93% Freshmen or Sophomores, M age = 19.1 [1.05], M GPA = 3.3 [.35]) participated. Participants were recruited from undergraduate courses and received points exchangeable for money ($.01 for each point) for correct responses during the experiment. On average participants earned about $4.50.

Location, Apparatus, and Stimulus Materials and Presentation

Participants worked individually on a personal computer in a quiet experimental room measuring approximately 11 × 17 feet. Following receipt of consent for participation the individual was seated at a workspace located along the front wall of the laboratory in front of a personal computer. The personal computer, a Dell OptiPlex GX 150, running Visual Basic® software, presented all of the experimental instructions and stimuli and recorded participant responses, which were made by left-clicking on the mouse. (The keyboard was on a retractable accessory that slid under the workspace during the experiment.) The experimenter remained in the room during the experiment, seated quietly at a table to the back of the participant and approximately 10 feet away from the computer. From his or her position in the room the experimenter was unable to see the individual responses of the participants, but was accessible in case questions arose or the participant requested a break.

The stimuli used in the experiment were 12 nonsense words divided into two classes (see the bottom panel of Figure 1). When presented on the computer, the nonsense words were written in all capital letters, in black Arial font (size 16) and appeared centered inside a gray box measuring about .66 of an inch in height and 2 inches in length. Sample nonsense stimuli appeared centered from left-to-right and middle-to-top on the computer monitor. Comparison stimuli appeared centered from middle-to-bottom in the lower half of the screen in one of four locations evenly spaced from left-to-right.

Procedure

Training phase. The goal of the training phase was to provide the prerequisite history for the establishment of two six-member equivalence classes consisting of nonsense words. Using a match-to-sample procedure and a comparison-as-node (many-to-one; see Saunders & Green, 1999) training sequence (B1→A1, C1→A1, D1→A1, E1→A1, F1→A1 and B2→A2, C2→A2, D2→A2, E2→A2, F2→A2), initial conditional discriminations were trained.

At the start of the training phase the following instructions were provided to the participant:

During this stage of the experiment you will be asked to identify logical relationships between nonsense words. At the top of each page will be a nonsense word. You must look at this nonsense word at the top of the screen and click on it using the left mouse button. After clicking on the top nonsense word additional nonsense words will be presented at the bottom. You must choose one of the two nonsense words at the bottom. You make your choice by moving the mouse onto the nonsense word you wish to choose and clicking the left mouse button. After each choice the computer will inform you if you made a CORRECT or INCORRECT choice. Correct selections will earn points. The computer will keep track of correct responses and occasionally throughout the experiment will show you the total points you have accumulated up to that point.

Click the YES box below when you understand these instructions and are ready to begin.

On each trial the computer presented a sample stimulus on the top portion of the computer screen (e.g., "FOW" or "REK"). The participant was required to click on this sample stimulus, which then produced two comparison stimuli presented on the bottom portion of the screen (e.g., "SOJ" or "ZAB"). The participant then chose one comparison stimulus by left-clicking the mouse on it, which resulted in feedback. For example, selecting SOJ in the presence of FOW, and selecting ZAB in the presence of REK, resulted in the computer indicating "CORRECT" (in green font) and the participant earning a point (added to a cumulative point total which was briefly shown to the participant at 10-trial intervals). Selecting SOJ in the presence of REK or selecting ZAB in the presence of FOW resulted in the computer screen indicating "INCORRECT" (in red font) and no point was earned. After the feedback was given the program proceeded to the next sample stimulus. In blocks of 10, the computer presented 5 each of these B1→A1/A2 and B2→A1/A2 trials. Trials were presented in quasi-random alternation (quasi-random because no trial type could occur more than twice in a row) and continued until the subject answered all 10 trials in a block (5 of each type) correctly. After 100% correct performance on the block of 10 trials, the program displayed the cumulative point counter for 5 s and then advanced to C→A training. Across training, the sample stimuli changed (from B to C to D to E to F) as the participant met the 100% performance criterion on the previous sample set, but the comparison stimuli were always the same (this comparison-as-node or many-to-one training approach appears to lead to the most rapid acquisition of the relationships, see Saunders & Green, 1999). Using the approach described above, the following relations were taught:

FOW→SOJ and REK→ZAB

XID→SOJ and MEP→ZAB

BAX→SOJ and VAG→ZAB

TIV→SOJ and PID→ZAB

LOQ→SOJ and FUH→ZAB

To ensure that the earlier relations remained intact, the participant was next presented with a mix of all the previously learned relations. These mixed training trials were presented in blocks of 20 which contained 2 each of the training trials presented randomly. Feedback was provided for correct and incorrect responses, with correct responses earning the participant points. Mixed training continued until the subject made 20 correct responses during a block.
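The block structure described above can be sketched in code. This is a minimal illustration, not the Visual Basic program used in the study: the function names are hypothetical, rejection sampling is one assumed way to satisfy the "no more than twice in a row" constraint, and the mixed block is assumed to be a plain shuffle.

```python
import random

# Trained sample -> correct comparison, as listed in the text.
TRAINED = {
    "FOW": "SOJ", "REK": "ZAB",
    "XID": "SOJ", "MEP": "ZAB",
    "BAX": "SOJ", "VAG": "ZAB",
    "TIV": "SOJ", "PID": "ZAB",
    "LOQ": "SOJ", "FUH": "ZAB",
}


def longest_run(seq):
    """Length of the longest stretch of identical consecutive items."""
    best = run = 0
    prev = object()  # sentinel that equals nothing in seq
    for item in seq:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best


def training_block(samples, per_sample=5, max_run=2, rng=None):
    """Block of 10 trials: 5 per sample, no sample more than twice in a row."""
    rng = rng or random.Random()
    trials = [s for s in samples for _ in range(per_sample)]
    while True:  # reshuffle until the run constraint is met
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)


def mixed_block(rng=None):
    """Block of 20 trials: 2 of each of the 10 trained relations, shuffled."""
    rng = rng or random.Random()
    trials = [s for s in TRAINED for _ in range(2)]
    rng.shuffle(trials)
    return trials


def criterion_met(trials, choices):
    """100% criterion: every choice in the block matches the trained comparison."""
    return len(trials) == len(choices) and all(
        TRAINED[s] == c for s, c in zip(trials, choices)
    )


block = training_block(("FOW", "REK"), rng=random.Random(0))
print(block)
print(criterion_met(block, [TRAINED[s] for s in block]))
```

A block would be repeated until `criterion_met` returns true, after which training would move to the next pair of samples.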

The training provided the prerequisite experiences for the establishment of two six-member equivalence classes (see Figure 1, bottom panel), analogous to the kind of learning history we suspect participants have when they enter traditional CF experiments and are provided with the personality sketch (as presented in Figure 1, top panel). However, note the comparison-as-node training structure. As depicted in Figure 1, the classes were held together by the center nonsense word, and the participant received no explicit training directly linking the outer words with one another. It is unlikely that the life history of participants in traditional conjunction fallacy experiments involved explicit conditional discrimination training with all the stimulus relations in the Bill scenario. Instead their life history probably provided enough prerequisite experiences such that the conditional stimuli presented in the personality sketch along with the comparison vocation and/or avocation resulted in derived relational stimulus control. To approximate this history, in the present experiment we provided sufficient prerequisite (unidirectional) conditional discrimination training for equivalence class formation but provided no direct exposure to the emergent/derived relations until after the CF had been assessed.

Conjunction fallacy test phase. After training, the computer presented the following instructions (borrowing from Fantino et al., 1997) for the CF test phase:

During this stage of the experiment you must make judgments based on what you've previously learned about the relationships between nonsense words to respond to novel arrangements. As before, you will start by clicking on the item at the top of the screen. However, during this phase, after clicking on the top item, you will now be shown only one choice at the bottom. Below the choice, you will see a drag box which you can adjust to enter a percentage from 0-100%. You will be asked to rate how likely you think it is that each choice displayed is correct given the combination of nonsense words on the top. For example, entering "0" would suggest that given your experience you think it is virtually impossible that the arrangement of nonsense words shown on the bottom would be a correct answer given the combination of nonsense words on the top. Entering "100" would suggest that given your experience you think it is virtually certain that the arrangement of nonsense words shown on the bottom would be a correct answer given the combination of nonsense words on the top. Since each new display asks you to enter a new percentage from 0-100, it is NOT EXPECTED that the numbers you enter will sum to 100. You will not be told if specific responses are correct or incorrect, but you may be awarded bonus points. You will be told if you have been awarded bonus points after all of your ratings have been entered.

Click the "yes" box below when you understand these instructions and are ready to begin.

When the participants indicated readiness to proceed, the computer presented 18 test trials designed to be analogous to those used in traditional CF studies. Specifically, the sample now consisted of three members of one of the two stimulus classes (e.g., FOW, XID, BAX or REK, VAG, PID). For each three-member sample (analogous to the descriptive framing information), the participant was then asked to enter a likelihood rating that each of nine comparison stimuli (that differed in terms of their "representativeness"; that is, class membership) was a correct answer. The comparison stimuli were presented one-by-one and, during this phase, were centered from left-to-right and middle-to-bottom on the screen. For example, in the presence of the compound stimulus TIV, XID, BAX, the participant rated, in the following order, the likelihood that FOW (representative), FUH (nonrepresentative), FOW,FUH (conjunction), REK (nonrepresentative), LOQ (representative), REK,LOQ (conjunction), YOF (never-before-seen nonsense word), BOK (never-before-seen nonsense word), and YOF,BOK were correct answers. For the preceding example, the conjunction fallacy would be identified by the following pattern of likelihood ratings: FOW,FUH > FUH and REK,LOQ > REK. Of note, YOF, BOK, and YOF,BOK were never-before-seen nonsense words used to establish the "base rate" of conjunction responding (i.e., instances where YOF,BOK > YOF or BOK). The same procedure was then conducted for the compound stimulus MEP, VAG, PID. Thus, across the 18 test trials there were four conjunctions assessed involving stimuli used in the training and two involving never-before-seen nonsense words. During this phase, responses (the percentages entered by the participant) did not result in any feedback; however, to maintain motivation, the participants were told that their responses might produce "bonus points," and that they would be informed if they received any bonus points after all of their responses were entered. In fact, all participants received 18 additional points toward their cumulative point total, 1 for each response, independent of the percentages entered.
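The scoring rule for one compound sample can be sketched in Python. This is an illustration only, not the authors' software: the function names and all of the numeric ratings are hypothetical, and reading "YOF,BOK > YOF or BOK" as "above at least one constituent" is an assumption. Only the stimulus labels and the comparison rule come from the description above.

```python
# Hypothetical likelihood ratings (0-100) for the compound sample TIV, XID, BAX.
# Stimulus labels come from the text; the numbers are invented for illustration.
ratings = {
    "FOW": 80, "FUH": 10, "FOW,FUH": 45,  # representative, nonrepresentative, conjunction
    "REK": 15, "LOQ": 75, "REK,LOQ": 40,  # nonrepresentative, representative, conjunction
    "YOF": 20, "BOK": 25, "YOF,BOK": 15,  # never-before-seen words (base rate)
}

def shows_cf(ratings, conjunction, nonrepresentative):
    """True when the conjunction is rated above its nonrepresentative constituent."""
    return ratings[conjunction] > ratings[nonrepresentative]

def shows_base_rate_cf(ratings, conjunction, constituents):
    """True when the novel conjunction is rated above at least one novel constituent
    (an assumed reading of the base-rate rule)."""
    return any(ratings[conjunction] > ratings[c] for c in constituents)

# Two trained-stimulus opportunities and one base-rate opportunity per compound sample.
trained = [shows_cf(ratings, "FOW,FUH", "FUH"), shows_cf(ratings, "REK,LOQ", "REK")]
novel = shows_base_rate_cf(ratings, "YOF,BOK", ["YOF", "BOK"])
print(trained, novel)
```

Applying the same scoring to the second compound sample (MEP, VAG, PID) would yield the four trained-stimulus opportunities and two base-rate opportunities per participant.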

Equivalence test phase. The purpose of the equivalence test phase was to determine if the participants derived relationships between the nonsense words through their common node. That is, were equivalence classes indeed formed such that participants treated FOW, XID, BAX, TIV, and LOQ as related based on the common node SOJ and REK, VAG, PID, FUH, and MEP based on the common node ZAB?

The screen set-up during equivalence testing was identical to that in the training phase. Across trials the sample and comparison stimuli changed such that the relationships between the various members of each class were assessed. In this phase trials were presented in blocks of 50. The blocks of 50 were designed to combine the 10 trials from the mixed training phase with the 40 equivalence test items (i.e., BC, BD, BE, BF, CB, CD, CE, CF, DB, DC, DE, DF, EB, EC, ED, EF, FB, FC, FD, FE relations for both classes). However, because of a computer programming oversight, the 50-trial blocks actually included 11 mixed training trials and 39 untrained test trials. As a result, not all directions of all derived relations were assessed, but every derived relation was assessed in at least one direction (e.g., C1→D1 was not assessed; however, D1→C1 was). Because the training procedure was comparison-as-node, such that both C1→D1 and D1→C1 trials constitute tests of equivalence of the CD relation, and all derived relations were assessed, the omission of the few test items, while unfortunate, does not appear to be an oversight that would alter conclusions.
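The item counts in this paragraph can be checked with a short enumeration. The sketch below is an illustration under stated assumptions: the variable names are hypothetical, and it assumes that the single item missing from a block is C1→D1, as in the example given above.

```python
from itertools import combinations, permutations

outer = "BCDEF"   # the five outer stimuli in each class; A is the common node
classes = (1, 2)

# Directed test items: every ordered pair of distinct outer stimuli, for each class.
directed = [(f"{s}{k}", f"{c}{k}") for k in classes for s, c in permutations(outer, 2)]

# Undirected derived relations: each unordered pair counted once, for each class.
undirected = [(f"{x}{k}", f"{y}{k}") for k in classes for x, y in combinations(outer, 2)]

print(len(directed), len(undirected))

# Assumed omission: drop C1 -> D1 and confirm the CD relation is still covered by D1 -> C1.
remaining = [pair for pair in directed if pair != ("C1", "D1")]
covered = {frozenset(pair) for pair in remaining}
print(len(remaining), len(covered))
```

With 20 ordered pairs per class there are 40 directed items, and removing one direction of one pair leaves 39 items that still touch all 20 unordered relations.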

During the equivalence test, trials were presented randomly within each block and continued until 2 blocks of 50 trials (100 total trials) were presented. When the subject made his or her choice no feedback was delivered. However, to maintain motivation, participants were informed that despite the absence of feedback there were correct choices and that correct choices would be recorded by the computer and generate points that would be awarded at the conclusion of a 50-trial block (responses demonstrating derived relations were considered the correct response). After the 50-trial blocks the point counter was displayed for 5 s.

Conjunction fallacy replication test. In the final phase of the experiment the participants were presented with the Bill scenario. The computer presented the following information at the same height as the sample stimuli in the previous phases:

Read the passage in the box below closely and click on "yes" when you are finished.

Bill is 34 years old. He is intelligent but unimaginative, compulsive, and generally lifeless. In school, he was strong in mathematics but weak in social studies and humanities.

After the participant entered "yes" the following instructions appeared on the screen (in addition to the passage, which remained on the screen):

You will now be shown various statements about Bill. You will be asked to indicate the likelihood that each statement is true of Bill by entering a percentage using the drag bar. Since each statement asks you to enter a new percentage from 0-100, it is NOT EXPECTED that the numbers you enter will sum to 100. You will not be told if specific responses are correct or incorrect, but you may be awarded bonus points. You will be told if you have been awarded bonus points after all of your ratings have been entered. Please click on "yes" when you understand these instructions.

The participant was then asked to indicate the likelihood that each of the following statements, presented one at a time, in blue font, and in this order, was true by entering a percentage with the drag box.

A. Bill is an accountant

B. Bill plays jazz for a hobby

C. Bill is an accountant who plays jazz for a hobby

During this phase, responses (the percentages entered by the participant) did not result in any feedback; however, to maintain motivation, the participants were again told that their responses might produce "bonus points," and that they would be informed if they received any bonus points after all of their responses were entered. In fact, all participants received 1 point per response, independent of the percentages entered.

After the participant completed the CF replication phase, the computer indicated that the experiment was done and displayed her or his final point total. The experimenter then paid the participants and thanked them for their participation.

Results

1. What proportion of participants showed the CF pattern on the analogue procedure and how did these rates compare to rates on the Bill scenario?

Incidence during the analogue conjunction fallacy test phase. Responses analogous to the CF were identified when the conjunction was rated higher than the low likelihood constituent. Using this method, across opportunities, responding analogous to the CF occurred 60% of the time. On each of the four opportunities, 67%, 56%, 59%, and 59% of the sample evinced the conjunction fallacy pattern. In addition, showing the CF pattern on one opportunity was a significant predictor of showing the CF pattern on other opportunities (φ = .37-.47, ps = .01-.05).

We also examined how often participants rated the conjunction of never-before-seen nonsense words as more likely than either one of the constituent never-before-seen nonsense words. Across opportunities responding analogous to the CF occurred 46% of the time, with rates of 44% and 48% observed on each of the two opportunities. However, showing the CF pattern on the first opportunity was not associated with CF responding on the second occasion (φ = .18, p = .34). Moreover, there was no relationship between number of CF responses to the never-before-seen stimuli and CF responses to the stimuli to which the participant was exposed during training (tau = .02, p = .91). Thus, it appears that something very different was happening when participants responded to the stimuli to which they were exposed during training compared to those that were never encountered until the CF test phase.

Incidence during the conjunction fallacy replication test. When responding to the Bill scenario, 69% of the sample evinced the CF. A binomial analysis suggested that there was no significant difference between the observed frequency of .69 on the Bill scenario and a test proportion of .60 (the frequency observed during the analogue CF test), p = .22. Thus, the rates of CF responding were similar during the analogue CF test and the CF replication test. Moreover, commission of the CF on the Bill scenario was not predictive of greater commission of the CF during the analogue test, t(24) = -.21, p = .83. Thus, it did not appear that participants had a response set or general proclivity towards CF responding that was stable across stimulus arrangements.
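The binomial comparison can be approximated with a few lines of Python. This is a sketch under assumptions, not a reconstruction of the authors' analysis: the count of 18 CF responders out of 26 usable cases is inferred from the 69% figure and the participant counts reported elsewhere in the Results, the test is assumed to be one-sided and exact, and the function name is hypothetical. The original analysis may have used a different variant (for example, a normal approximation).

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """Exact probability of k or more successes in n trials when the true rate is p0."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# Assumed counts: 18 of 26 participants showed the CF on the Bill scenario (18/26 = .69).
p_value = binomial_upper_tail(18, 26, 0.60)
print(round(p_value, 3))
```

Under these assumptions the tail probability falls in the low .20s, in the neighborhood of the reported p = .22 and well above conventional significance thresholds.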

2. Did the formation of equivalence relations alter the propensity to demonstrate the analogue CF?

In considering these data it is important to remember the experimental timeline. At the time of testing for responding analogous to the CF, the participants had received only the comparison-as-node training (i.e., B→A, C→A, D→A, E→A, F→A). Thus, none of the members of the compound sample stimuli used in the analogue CF test were part of an explicitly trained class nor did any explicit training establish relations between any members of the compound sample stimuli and the comparison stimuli used in the analogue CF test. Testing for the emergence of the equivalence classes occurred in the subsequent phase and without feedback.

Equivalence classes were considered to have formed if participants responded correctly on 90% or more of the emergent relations items during the equivalence test phase, a criterion that has been used by others for identifying class formation (see Lazar, Davis-Lang, & Sanchez, 1984). According to this criterion, equivalence classes formed for 13/27 participants (48%). The average percentage correct for those 13 participants on the items assessing emergent relations was 97% (SD = 3%), compared to the average of 70% (SD = 15%, range 42-87%) for the remaining 14 participants, t(25) = 6.6, p < .001.

Based on these equivalence test data, two groups were formed: (a) those who were positive for equivalence class formation (n = 13), and (b) those who were negative for equivalence class formation (n = 14). We then explored whether there were group differences in the number of analogue CF responses, finding a statistically significant difference, F(1, 25) = 17.31, p < .001. Specifically, the mean number of analogue CF responses (with four being the maximum) for participants showing equivalence class formation was 3.38 (SD = 1.12, median & mode = 4), compared to a mean for the remaining participants of 1.50 (SD = 1.23, median & mode = 1). The individual participant data (presented in Figure 2) corroborated the group level analysis and suggested that this finding was fairly robust at the level of the individual participant.
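As a rough consistency check, the F ratio for a two-group comparison can be recomputed from the reported means, standard deviations, and group sizes. The sketch below is not the authors' analysis: the function name is hypothetical, and because the inputs are rounded summary statistics the result is expected to approximate, not reproduce, the reported F(1, 25) = 17.31.

```python
def f_from_summary(m1, sd1, n1, m2, sd2, n2):
    """One-way ANOVA F for two groups, computed from means, SDs, and group sizes."""
    grand = (n1 * m1 + n2 * m2) / (n1 + n2)
    ss_between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_between, df_within = 1, n1 + n2 - 2
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Reported values: equivalence group M = 3.38, SD = 1.12, n = 13;
# remaining participants M = 1.50, SD = 1.23, n = 14.
f_value, df1, df2 = f_from_summary(3.38, 1.12, 13, 1.50, 1.23, 14)
print(df1, df2, round(f_value, 2))
```

The recomputed ratio comes out near 17, with 1 and 25 degrees of freedom, which is in line with the reported statistic given the rounding of the inputs.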

Also conducted was an analysis of how often participants in each group rated the conjunction of never-before-seen nonsense words as more likely than either one of the constituent never-before-seen nonsense words. This analysis revealed no group difference, F(1, 25) = 0.25, p = .62. Furthermore, there was no correlation between group membership (≥ 90% vs. < 90% equivalence test performance) and commission of the CF on the Bill scenario, φ = .05, p = .79. In sum, the data suggest that commission of the analogue CF was governed by the nature of the stimulus relations established during the experiment, with those who showed derived relations being much more likely to evince the CF pattern.

3. What was the magnitude of the conjunction fallacy, and did it appear to represent an average?

The magnitude of the CF was determined by calculating the difference between the likelihood rating given to the conjunction and the nonrepresentative item. If this was positive, we assessed whether the difference appeared to represent an averaging of the representative and nonrepresentative items.

Magnitude and averaging on the analogue conjunction fallacy test. Across the sample (N = 27), the mean (SD) likelihood ratings on the analogue CF test for the representative, nonrepresentative, and conjunction items were 62.95 (30.69), 23.37 (22.76), and 34.00 (20.47), respectively, for a magnitude of 10.63, which was lower than would be expected by the averaging hypothesis (19.79). However, this analysis includes the entire sample, which is known to systematically differ in terms of commission of the CF based on equivalence class formation. When only the group of participants who demonstrated equivalence class formation was considered, the mean (SD) likelihood ratings on the analogue CF test for the representative, nonrepresentative, and conjunction items were 75.92 (24.87), 13.96 (15.75), and 47.23 (17.30), respectively, for a magnitude of 33.27, which was quite consistent with what would be expected by an averaging hypothesis (30.98). The upper panel of Figure 3 presents the observed conjunction likelihood ratings with what would be expected based on averaging for each participant. If these are identical they will fall on the diagonal. For 7/13 participants (54%), observed minus expected difference scores were ±10 points or less, while 3/13 (23%) had difference scores that were lower than expected and 3/13 (23%) had difference scores that were higher than expected based on averaging. Across the group, the observed minus expected difference scores were not statistically significantly different from zero, M = 2.29, SD = 13.94, t(12) = .59, p = .57.
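The magnitude and averaging figures in this paragraph follow from simple arithmetic on the reported group means, which the following Python sketch reproduces. The function name is hypothetical; the formulas are the ones described above (magnitude = conjunction minus nonrepresentative rating; expected value under averaging = mean of the representative and nonrepresentative ratings).

```python
def cf_magnitude(rep, nonrep, conj):
    """Observed CF magnitude and the magnitude predicted by simple averaging."""
    observed = conj - nonrep
    expected = (rep + nonrep) / 2 - nonrep
    return round(observed, 2), round(expected, 2)

# Reported mean ratings: (representative, nonrepresentative, conjunction).
whole_sample = cf_magnitude(62.95, 23.37, 34.00)
equivalence_group = cf_magnitude(75.92, 13.96, 47.23)
print(whole_sample)       # observed vs. expected magnitude, N = 27
print(equivalence_group)  # observed vs. expected magnitude, n = 13
```

For the whole sample the observed magnitude (10.63) falls well short of the averaging prediction (19.79), whereas for the equivalence group the two values (33.27 and 30.98) are close.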

The group who failed to demonstrate equivalence class formation showed no evidence of averaging and, in fact, the mean likelihood ratings for the conjunctions of 21.71 (14.96) were lower than the ratings for the nonrepresentative items of 32.11 (25.22). The ratings for the representative items averaged 50.91 (31.50). The middle panel of Figure 3 presents the observed conjunction likelihood ratings with what would be expected based on averaging for each participant. Across this group, the observed minus expected data were statistically significantly different from zero, M = -19.79, SD = 18.28, t(13) = -4.05, p = .001.

Magnitude and averaging on the conjunction fallacy replication test. Across the sample (N = 26; a computer error produced missing data for 1 participant), the mean (SD) likelihood ratings on the Bill scenario for the representative, nonrepresentative, and conjunction items were 80.96 (23.39), 7.0 (10.97), and 23.77 (20.11), respectively, for a magnitude of 16.77, which was lower than what would be expected by averaging. Similarly, when only the data from the 69% of the sample that showed CF responding were considered, the respective likelihood ratings were 88.89, 6.22, and 29.72; again, lower than what would be expected by averaging. The bottom panel of Figure 3 presents the observed conjunction likelihood ratings with what would be expected based on averaging for the participants demonstrating CF responding. For 7/18 participants (39%), observed minus expected difference scores were ±10 points or less, and the remaining 11/18 (61%) made likelihood ratings that were systematically lower than expected by averaging. As a result, the observed minus expected difference scores were statistically significantly different from zero, M = -17.83, SD = 14.78, t(17) = -5.12, p < .001. Thus, although the conjunction produced an intermediate response on the Bill scenario and a subset of participants' responses was consistent with averaging, the group average was significantly lower than what would be expected based on the averaging hypothesis.
Discussion

The incidence of CF responding under the analogue procedure was similar to that reported in the literature with standard CF procedures. Moreover, within our sample the rates on the analogue procedure (60%) and the Bill scenario (69%) were similar. This convergence is at least consistent with the suggestion that our preparation was analogous to other preparations employed. Importantly, CF responses during the analogue CF test phase were correlated with one another when the stimuli comprising the classes were used; however, these CF responses were not associated with CF responding to the never-before-seen stimuli or with CF responding to the Bill scenario. These data suggest responding was sensitive to experimental events and not the result of a cross-situationally stable, intra-individual proclivity to choose or not choose conjunctions. If, as the preceding suggests, our procedure was a reasonable analogue and responding during the procedure was governed by learning that occurred (or failed to occur) during the study, then our data demonstrate the possible role of derived stimulus relations in CF responding.

Furthering the suggestion that the present results provide a preliminary demonstration in support of the derived stimulus relations interpretation of CF responding is that commission of the CF was strongly related to whether equivalence classes were formed. The modal participant who demonstrated equivalence relations committed the CF on the analogue test on all four occasions, while the modal participant who failed to evince equivalence class formation committed the CF once during the analogue test. These results are viewed as a preliminary demonstration because we did not actively manipulate an independent variable to demonstrate experimental control. Instead we provided a plausible behavioral interpretation of a complex "cognitive" performance and then tried to determine if this interpretation would hold up in an analogue laboratory preparation. The derived stimulus relations interpretation and the current data are not antithetical to the main thrust of the representativeness heuristic analysis of Tversky and Kahneman (1983), and indeed might be viewed simply as a behavioral translation. However, this translation may contribute to making the analysis more complete. The representativeness heuristic explanation rests on assumptions about participants' learning histories. To recognize that this is so, remember that Tversky and Kahneman explicitly developed the Bill scenario to be more representative of an accountant than a jazzist, thus assuming an existing prototype of individuals with this vocation or avocation. In the approach taken in this experiment, no history was assumed; instead, the relevant learning history was provided during the experiment and the outcome was directly assessed and tied to the response of interest. There is no need to postulate a representativeness heuristic to explain the data, but the data can explain a representativeness heuristic, at least in part, in terms of contextual/conditionally controlled stimulus equivalence relations. The relevance of equivalence relations is evident in Tversky and Kahneman's description: "Representativeness is an assessment of the degree of correspondence between a sample and a population, an instance and a category, an act and an actor or, more generally, between an outcome and a model" (Tversky & Kahneman, 1983, p. 295). To see the importance of contextual/conditional control, imagine the framing information was "Bill is a 34-year-old who was born with severe brain damage, is profoundly mentally retarded, and required intensive training to learn to feed and dress himself." In this context CF responding appears unlikely as accountant and jazzist would now share membership in a stimulus class involving unexpected activities of Bill (i.e., they would now both be nonrepresentative items).

CF responses on both the analogue test and the Bill scenario were given a likelihood rating that fell between the representative and nonrepresentative items. However, only on the analogue test (among those who demonstrated equivalence class formation) was the CF response consistent with averaging. The likelihood ratings for these participants of 76, 14, and 47 across A, B, and C, respectively, approximated those reported by Gavanski and Roskos-Ewoldsen (1991) in their Experiment 2 where 85% of the sample made the CF, and mean likelihood ratings were 69, 19, and 42 (see also Fantino et al., 1997). However, our CF replication test data, using the Bill scenario, did not reveal averaging. The group average likelihood rating on the conjunction was lower than would be expected by averaging, both when the entire sample was considered and when only the data from the 69% who showed the CF were used. These data were consistent with those from Gavanski and Roskos-Ewoldsen's (1991) Experiment 1 where 45% of the sample made the CF, and the average conjunction rating fell in between the two constituents, but was lower than would be expected by averaging.

The reason for the differing results is not clear. With respect to the present study, one potentially relevant feature of the analogue methodology is the greater control over the learning history established. During training, the stimuli were always from the two classes, responses were always either correct or incorrect, and the amount of exposure to the stimuli was similar. As such, when a conjunction was presented (in the presence of the compound conditional stimulus), the individual items (provided equivalence classes were formed) included a roughly equal excitatory stimulus (S^D) and inhibitory stimulus (S^Δ), a set-up that might most readily lend itself to averaging. In the standard Bill scenario, the life history of participants with accountants and jazz musicians is likely more varied. However, this makes it even more interesting that those who committed the CF seemed to consistently rate the likelihood of the conjunction as lower than would be expected by averaging. In future experiments trained relations within classes could be systematically varied in strength by altering their probability of reinforcement to directly explore the effects of varied learning histories.

There are a number of other considerations in interpreting the findings from the present experiment. As mentioned above, it was not an experimental analysis but a demonstration study. The procedure was not designed to evaluate alternative accounts of CF responding, such as erroneous combining of probabilities, which in some designs (such as those where the probabilities of the constituents are explicitly stated) may be especially important to consider. Indeed it seems reasonable to conclude that across test arrangements and individuals, CF responding may occur (or fail to occur) for a number of reasons (see also Yates & Carlson, 1986): preference for compounds, erroneous combining of probabilities, availability of additional cues, and so forth. That the descriptive framing information appears to be an important variable implicates stimulus control generally and derived stimulus relations particularly. Another consideration involves the number of participants who demonstrated development of equivalence relations. Despite our use of a college student sample, a comparison-as-node training procedure, and a relatively rigorous training regimen with a 100% correct mastery criterion, less than 50% of the sample demonstrated equivalence relations. Importantly, this outcome is not unique to the present study. Others using similar samples have found a relatively low percentage of participants demonstrating equivalence relations, especially when large (six-member) classes were employed (Arntzen & Holth, 2000). In future research participants who failed to show emergence of equivalence relations could be invited back and provided additional training. Following the additional training they could again take the analogue CF test and then the equivalence test. If they now show emergence of equivalence classes their responses to the analogue CF could be compared to see if the rates of CF responding increased (as would be expected based on the current results).

The present data are also relevant to the issue of whether equivalence relations are established following training and simply revealed during standard symmetry, transitivity, and equivalence tests or whether the testing itself is necessary for the emergence of equivalence relations (McIlvane & Dube, 1990). In the present study, if equivalence relations had not formed at the time of the analogue CF tests (i.e., before equivalence testing), then they could not account for the differential performance between the positive-for-equivalence and negative-for-equivalence subgroups. This analysis suggests that for the positive-for-equivalence relations subgroup, the relations formed before the testing. The conclusion that the equivalence relations formed prior to testing is consistent with reports describing differential posttraining brain activation when participants are presented with to-be-related stimuli compared to unrelated stimuli (DiFiore et al., 2000). However, from an alternative perspective, the analogue CF test might be interpreted as just that: another test. As such, given the current methodology, there remains some risk in talking about the relations being formed before some relevant behavior has been emitted (McIlvane & Dube, 1990).

In conclusion, the present paper offered a behavioral interpretation of CF responding based on a learning history and current context promoting particular derived stimulus relations. The interpretation was then modeled in a laboratory preparation where the suspected history was established directly and the outcome of the history assessed and tied to the response of interest. The results demonstrate the potential role of derived stimulus relations in CF responding and illustrate how derived stimulus relations may contribute to a behavioral analysis of a "cognitive" phenomenon.

[Reference]
References
ARNTZEN, E., & HOLTH, P. (2000). Probability of stimulus equivalence as a function of class size vs. number of classes. The Psychological Record, 50, 79-104.
DIFIORE, A., DUBE, W. V., OROSS, S., WILKINSON, K., DEUTSCH, C. K., & MCILVANE, W. J. (2000). Studies of brain activity correlates of behavior in individuals with and without developmental disabilities. Experimental Analysis of Human Behavior Bulletin, 18, 33-35.
DONAHOE, J. W., & PALMER, D. C. (1994). Learning and complex behavior. Needham Heights, MA: Allyn & Bacon.
DOUGHERTY, M. R. P., GETTYS, C. F., & OGDEN, E. E. (1999). MINERVA-DM: A memory processes model for judgments of likelihood. Psychological Review, 106(1), 180-209.
FANTINO, E. (1998). Behavior analysis and decision making. Journal of the Experimental Analysis of Behavior, 69, 355-364.
FANTINO, E., KULIK, J., STOLARZ-FANTINO, S., & WRIGHT, W. (1997). The conjunction fallacy: A test of averaging hypotheses. Psychonomic Bulletin & Review, 4(1), 96-101.
FANTINO, E., & SAVASTANO, H. I. (1996). Humans' responses to novel stimulus compounds and the effects of training. Psychonomic Bulletin & Review, 3(2), 204-207.
GAVANSKI, I., & ROSKOS-EWOLDSEN, D. R. (1991). Representativeness and conjoint probability. Journal of Personality and Social Psychology, 61, 181-194.
HAYES, S. C., BARNES-HOLMES, D., & ROCHE, B. (2001). Relational frame theory: A post-Skinnerian account of human language and cognition. New York: Kluwer Academic/Plenum Publishers.
HORNE, P. J., & LOWE, C. F. (1996). On the origins of naming and other symbolic behavior. Journal of the Experimental Analysis of Behavior, 65(1), 185-241, 341-53.
LAZAR, R. M., DAVIS-LANG, D., & SANCHEZ, L. (1984). The formation of visual stimulus equivalences in children. Journal of the Experimental Analysis of Behavior, 41(3), 251-266.
MARR, J. (1984). Conceptual approaches and issues. Journal of the Experimental Analysis of Behavior, 42, 353-362.
MCILVANE, W. J., & DUBE, W. V. (1990). Do stimulus classes exist before they are tested? The Analysis of Verbal Behavior, 8, 13-17.
SAUNDERS, R. R., & GREEN, G. (1999). A discrimination analysis of training-structure effects on stimulus equivalence outcomes. Journal of the Experimental Analysis of Behavior, 72(1), 117-137.
SIDMAN, M. (1978). Remarks. Behaviorism, 6(2), 265-268.
SIDMAN, M. (1994). Equivalence relations and behavior: A research story. Boston, MA: Authors Cooperative, Inc., Publishers.
SIDMAN, M. (2000). Equivalence relations and the reinforcement contingency. Journal of the Experimental Analysis of Behavior, 74(1), 127-146.
SKINNER, B. F. (1953). Science and human behavior. New York: Free Press.
SKINNER, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
STOLARZ-FANTINO, S., FANTINO, E., & KULIK, J. (1996). The conjunction fallacy: Differential incidence as a function of descriptive frames and educational context. Contemporary Educational Psychology, 21, 208-218.
TVERSKY, A., & KAHNEMAN, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293-315.
YATES, J. F., & CARLSON, B. W. (1986). Conjunction errors: Evidence for multiple judgment procedures, including "signed summation." Organizational Behavior & Human Decision Processes, 37(2), 230-253.

[Author Affiliation]
SCOTT T. GAYNOR, YUKIKO WASHIO, and FREDERICK ANDERSON
Western Michigan University

[Author Affiliation]
Yukiko Washio is now at the Department of Psychology, University of Nevada, Reno. We express our thanks to Paul Castone for computer programming. We also thank Richard Saunders for helpful suggestions on an earlier version of this manuscript.
Correspondence should be addressed to Scott T. Gaynor, Department of Psychology, Western Michigan University, 1903 W. Michigan Avenue, Kalamazoo, MI 49008-5439. (Email: scott.gaynor@wmich.edu).