1. D is standardized as an effect size (d) by dividing the difference in means by the pooled standard deviation, to control for differences in variation in the studies. Average d variables are computed using the inverse of the sampling variance as weights (Hedges & Olkin, 1985).
2. 1500ms Stimulus Onset Asynchrony condition only
3. Specifically, valence ratings in Williams et al.'s (1998) experiment had the following means for positivity and negativity when rated on a scale of 1 (not emotional) to 5 (very emotional): positive=(3.62, 1.13), negative=(1.49, 3.45), neutral=(2.36, 1.28).
4. The network was designed in concert with collection of data for Siegle's (1996) thesis. Thus, reaction time estimates for nonpersonally relevant negative, positive, and neutral information were, to some extent, informed by observed data, and should not be thought of entirely as predictions. Other network behaviors are more strictly predictions.
5. This finding is not preserved using a Hebb learning rule. Since the Hebb rule does not seek to minimize errors in decision making, previous training does not affect the network's response to current stimuli. Rather, biases would be expected to be exaggerated in ruminative copers based on a Hebb-training model.
6. With no forgetting function, the same pattern emerges for late simulated dilations, but there is no decrease in simulated early dilations.
7. One of three participants tested by a research assistant rather than the primary investigator was rumored to have been tested on the same day as his words were collected, but this report was not verified.
8. The warned reaction time task and gaze task were added after the protocol had begun. The 36 participants who took these tasks were the last 36 participants run on the protocol. These tasks were performed at the end of the testing session and thus should not have confounded results obtained during the rest of the testing session.
9. This differential elimination generally had little effect on the interpretation of subsequent analyses. The effects of the differential elimination are explored in detail in exploratory analyses contained in Siegle (1999b).
10. D was used rather than the more traditional d' because most people made relatively few errors. When an individual makes no errors, calculating d' involves finite approximations for infinite z scores. As d' is effectively a normalized version of D (calculated as z(sensitivity) - z(1 - specificity)), D should provide the same information that d' would.
11. .74 without assuming non-zero population error rates
12. By analyzing each factor separately, the assumption is made that factors from the PCA represent qualitatively different independent processes, i.e., rumination involves different processes, and potentially different brain areas, than cognitive or motor processes. This analysis does not preclude the possibility that these different processes arise as a function of a single distributed system, such as the neural network model. Rather, just as different nodes in the neural network model were observed to become active at different but overlapping times, different brain processes are assumed to be active at different, possibly overlapping times.
13. These results are pursued in exploratory analyses included on the web site accompanying this dissertation in Siegle (1999b).
14. Follow-up analysis of this claim is contained in the associated web site.
15. Again, this conclusion is not dependent on empirical results showing differences in depressed people's late pupil dilation in response to personally relevant and nonrelevant information. Such late processing is assumed to occur after any information has been associated with personally relevant information, and represents individuals thinking about personally relevant information independent of the content of the original stimulus.
16. Due to their smaller network, Siegle and Ingram (1997a) used 70 epochs of overtraining.
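As a minimal sketch of the quantities described in notes 1 and 10 (not the code used for the original analyses), the Python below computes a standardized effect size d from two group means and a pooled standard deviation, averages several d values with inverse-variance weights, and shows why d' is undefined without an approximation when no errors are made. The function names are hypothetical, the sampling variances are assumed to be supplied by the caller, and d' is taken in its conventional signal-detection form, z(hit rate) - z(false-alarm rate).

```python
from math import sqrt, inf
from statistics import NormalDist


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def effect_size_d(mean1, mean2, sd1, n1, sd2, n2):
    """Difference in means divided by the pooled standard deviation."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


def weighted_mean_d(ds, sampling_variances):
    """Average d, weighting each study by the inverse of its sampling variance."""
    weights = [1.0 / v for v in sampling_variances]
    return sum(w * d for w, d in zip(weights, ds)) / sum(weights)


def z_score(p):
    """Inverse normal CDF; rates of exactly 0 or 1 map to infinite z scores."""
    if p <= 0.0:
        return -inf
    if p >= 1.0:
        return inf
    return NormalDist().inv_cdf(p)


def d_prime(hit_rate, false_alarm_rate):
    """Conventional d': z(hit rate) minus z(false-alarm rate)."""
    return z_score(hit_rate) - z_score(false_alarm_rate)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the studies cited.
    print(effect_size_d(10.0, 8.0, 2.0, 20, 2.0, 20))
    print(weighted_mean_d([1.0, 0.0], [0.5, 1.0]))
    print(d_prime(1.0, 0.0))  # error-free performance gives an infinite d'
```

With error-free performance the hit rate is 1 and the false-alarm rate is 0, so both z scores are infinite; this is the case in which d' requires a finite approximation.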
* Many sources (e.g., Coyne, 1994; Coyne &
Gotlib, 1983; Gotlib, 1984) have questioned whether results obtained
using dysphoric college students may be generalized to clinically
depressed individuals. They suggest that dysphoric college students
experience many stressors that are not representative of most
depressed individuals, such as constant evaluation by superiors,
academic problems, changing social relationships, the adjustment to
independent living, recent separation from parents, and the transition
to adulthood (e.g., Kashani & Priesmeyer, 1983). More generally, Coyne
(1994) cautions that analog populations often do not capture the whole
picture of a disorder, and thus, arguments based on a sample do not
necessarily generalize to the disorder. As an example, Coyne states
that poverty often causes distress, but rarely depression. As such,
arguments made regarding the relationship of income levels to distress
in an analog population may not generalize to the greater depressed
population. In addition, the neuroanatomy of college students
(specifically development of connections between the cortex and limbic
structures which are thought to be important to depression) is
undergoing fundamental changes irrespective of their newfound
educational status (Benes, 1989). Thus, to be sure that Siegle et al.'s
(1998) results hold not only for dysphoric but also for depressed
individuals, it will be important to examine results of the tasks in a
population of clinically depressed and nondepressed individuals who are
not in college. This strategy is adopted in the experiment described in the following sections.
* Scaling was done by scaling positivity values to 3.62...
* That is, because the standard deviation around neutral words was higher than for other
valences, depressed individuals did not display a significantly larger discrepancy in response times to negative
and neutral words than did non-depressed individuals.
* These results use the person-100cut-harmmean-rescaled data set, available from the author.
* Similarly, the larger biases towards positivity and neutrality in nondepressed
individuals are consistent with the network’s predictions.
* Planned contrasts assuming the factors represent a continuous
process were also performed. Rather than representing qualitatively different processes, the
extracted pupil dilation components may be thought of as indexing a continuous phenomenon, each component occurring at
approximately the same temporal offset from the previous component.
To test this hypothesis, contrasts were examined from ANOVAs with valence
(positive, neutral, negative) and factors believed to represent aspects
of attention and information processing after stimulus onset (Factor 1,
2, 3) on factor loadings. Tests of the linear trend in factor revealed that pupil dilations
increased over time for depressed individuals, F(1,22)=4.66,
p=.042, η²=.175, but decreased
for nondepressed individuals, F(1,24)=6.62, p=.017, η²=.216,
on the valence identification task. The difference in the linear trend between depressed and
nondepressed individuals was significant, based on a
contrast from the same MANOVA in which group was included as a between
subjects variable, F(1,46)=11.04, p=.002, η²=.192.
To examine the idea that depressed individuals would have high early
and late dilations to personally relevant negative words, in contrast
to their generally low early dilations,
Findings were weaker but similar for the lexical decision task
data for trials in which nonmatching valence ratings were
excluded. For depressed individuals, the linear trend showed a slight
but nonsignificant increase, F(1,21)=2.09, p=.163,
η²=.09. Because the first cognitive factor
(Factor 2) was higher
than the late ruminative factor (Factor 1), a quadratic trend was more
strongly represented, F(1,21)=4.5, p=.043, η²=.178.
A slight decrease in dilation over time, of relatively similar magnitude, was
observed in the nondepressed group, F(1,24)=2.58, p=.121,
η²=.097. As with the valence-identification
task, the difference in linear trends between depressed and nondepressed individuals was
significant, F(1,45)=4.5, p=.019, η²=.092.
* Specifically, the possibilities, based on the physiological model,
were that a) they could think about nonemotional aspects
of negative things, b) they could think about emotional aspects of
negative things, and c) thinking about emotional and nonemotional
aspects of negative things could be thought of as interacting.
* Exploratory analyses reported on the associated
web site suggest that depressed individuals’ attention to personally
relevant negative stimuli was sustained, while nondepressed
individuals did not sustain attention to these stimuli. This result
occurred during both tasks. Surprisingly, nondepressed people showed
greater differentiation in responses to stimuli, displaying greater
cognitive activity during the early stages of attention, and paying
greater late attention to positive information.
* Indeed, analysis of the rating data revealed that depressed
individuals reliably suggest that negative words are negative, whereas
they are not as likely to categorize positive or neutral words
consistently. Similarly, analysis of response biases also
suggested that depressed individuals appear biased to label all
types of stimuli as negative. These findings are consistent with the general
results from the neural network model in which all stimuli tend
to be rated as more negative, and most often labeled as negative by the
overtrained network. Moreover, depressed individuals seemed
particularly prone to rate words they had generated to be
positive and personally relevant as negative or neutral. This finding
further suggests that depressed individuals have a difficult time
seeing positivity, even when stimuli are relevant to them.
Results analyzing reaction times on the valence identification task
were similarly consistent with predictions derived from the neural
network model. As predicted, depressed individuals are indeed slow
to say that positive words are positive, and are quick to say that
negative words are negative. This finding suggests that depressed
individuals have an easier time processing negative than positive
information.
* The network had learned to associate a particular stimulus with a negative valence
more strongly than it had learned any other association. Thus,
connections in the network to its representation of the negative
valence, and to that stimulus were stronger than other
connections. When feedback occurred within the network, these bits
of information were thus likely to become activated no matter what the
original stimulus was. The network’s initial responses were thus most
related to the stimulus when it was the personally relevant negative
stimulus on which the network was overtrained.
* An argument against this explanation is that various
tasks that do not nominally assess semantic processing, such as the
Stroop task, in which individuals are asked to name the color in which
words are presented, often reveal effects of interference from the
semantic content of stimuli (e.g., Williams et al., 1996). As such,
even if the task could be done without conscious semantic processing,
it is likely that individuals are interpreting the semantic meaning
of stimuli. Still, they may not do so until after their reaction time.
* For example, the analog of a healthy individual
was created by training the model equally on positive, negative,
and neutral exemplars based on the assumption that nondepressed
individuals have relatively equivalent numbers of positive, negative,
and neutral experiences, while depressed people have more negative
thoughts or experiences. An alternate approach consistent with
Schwartz and Garamoni’s (1989) States of Mind (SOM) model might
suggest the analog of nondepressed individuals should involve
overtraining on positive information (Siegle, 1996; Park, 1998)
to represent their hypothesized greater numbers of positive than negative
cognitions. An analog of depression would involve subsequently
overtraining the model on negative exemplars enough to disrupt the
ratio of positive to negative training examples. Such a network
would be initially biased to respond to positive information. As it is
overtrained on negative information, the network would begin to
respond more evenly to positive and negative exemplars. The result
would be a model in which the non-overtrained analog of "normal"
functioning would respond differently to different valences,
potentially paying particular attention to positive
information. Depending on the level of overtraining, the overtrained
network might respond more evenly to various valences on some tasks.
This explanation is not wholly satisfying, however, given that depressed
individuals were biased in responding to the valence identification
task in the expected manner. Moreover, the initial training on
non-orthogonal valences done with the current model served to generate
connection weights similar to some overtraining on positive
information, using an orthogonal valence representation. Extensive
formal modeling would be necessary to establish whether this
hypothesis could explain the obtained results.
* The following experiment with the simulated neural network shows that
overtraining can be largely reversed by retraining.
The associated figure shows the network’s response to a positive
stimulus, along with connection weights between the semantic and
valence layers before overtraining, and after overtraining. The
network is then retrained on one positive, and one neutral exemplar
for two and then for five epochs. The conventions for the subfigures
on the left follow those described for Figure 5. As shown in the
figure, with more retraining, the network’s valence activation, match
accumulation, and simulated pupil dilation curves for the valence
identification task look increasingly like they did before the
overtraining. In the semantic nodes, it can be seen that the retrained
network responds to the presented stimulus by activation of the two
new stimuli on which it was overtrained. As shown in the Hinton
diagrams on the right side of the figure, the retrained network still
inhibits positive information more than it originally did, but
activation from the new personally relevant positive and neutral
patterns allows competition from valence nodes representing
positivity.
      Helping depressed people to relearn positive associations is thus
expected to lead people to think more positively, even when negative
cognitions are not challenged. The trick will be to make positive
cognitions "stick" for depressed people in the same way that
negative cognitions do. The more a depressed person associates
incoming information with learned negative exemplars, the less likely
a positive exemplar is to be learned as such. Siegle (1996; Siegle &
Ingram, 1997) has shown that the amount of feedback occurring between
the affective and semantic representations of information in the brain
governs how likely information is to be turned negative. Thus, it is
suggested that ruminative response styles be targeted in therapy
before positive retraining is engaged in. Rehearsal schedules and
other traditional methods of behavioral reinforcement may also be of
use in this respect.
      In terms of pharmacologic interventions, it was noted that the primary
function of depressive overtraining was to increase inhibition of
cognitions that are not personally relevant and negative. This
analysis suggests that a pharmacologic agent that could block
inhibition in the amygdala and hippocampal systems might be useful
in the remediation of depression. Park (1998, unpublished) presents
converging evidence suggesting that serotonergic pathways stemming
from the median raphe may serve a primarily inhibitory function, and
may thus be candidates for pharmacologic intervention. Additionally,
because biases are hypothesized to occur as a result of inhibitory
feedback between the hippocampus and amygdala systems, drugs
targeting either of these structures could break the cycle. If later
research shows that certain depressed individuals attend primarily
to the affective or semantic aspects of information, drugs specifically
targeting one or the other of these structures could be considered.
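
The effect sizes reported alongside the F tests in the planned-contrast note above appear to be partial eta squared; that reading is an assumption, not something the notes state. Under it, each value can be recovered from the F statistic and its degrees of freedom as F·df_effect / (F·df_effect + df_error). The short Python sketch below, with a hypothetical function name, applies this to three of the reported linear-trend tests.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# (label, F, df_effect, df_error) as reported for three linear-trend contrasts.
reported_tests = [
    ("valence identification, depressed", 4.66, 1, 22),
    ("valence identification, nondepressed", 6.62, 1, 24),
    ("lexical decision, nondepressed", 2.58, 1, 24),
]

if __name__ == "__main__":
    for label, f_value, df_effect, df_error in reported_tests:
        value = partial_eta_squared(f_value, df_effect, df_error)
        print(f"{label}: {value:.3f}")
```

For these three tests the computed values round to .175, .216, and .097, matching the reported effect sizes, which supports the partial eta squared reading.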