UNIVERSITY OF CALIFORNIA, SAN DIEGO
SAN DIEGO STATE UNIVERSITY
Cognitive and Physiological Aspects of Attention to
Personally Relevant Negative Information in Depression
A dissertation submitted in partial satisfaction of the
requirements for the degree Doctor of Philosophy in
Clinical Psychology
by
Greg Jeremy Siegle
 
Committee in charge:
University of California, San Diego
San Diego State University
Professor Greg Brown
Professor Rick E. Ingram, Co-chair
Professor Chris Gillin
Professor Georg E. Matt, Co-chair
Professor Eric Granholm
 
Professor John McQuaid
 
1999
 
Copyright
Greg Jeremy Siegle, 1999
All rights reserved.

 

The dissertation of Greg Siegle is approved, and it is
acceptable in quality and form for publication on microfilm

 

Eric Granholm, Ph. D.
John McQuaid, Ph. D.
Christopher Gillin, M.D.
Greg Brown, Ph. D.
Rick Ingram, Ph. D., Co-Chair
Georg Matt, Ph. D., Co-Chair
University of California, San Diego
1999

 

Dedication
This work is dedicated to Monica Barback and to any
depressed organic or silicon beings who may be helped
by its contents.
 
 
Psychiatry became real to me only when the concepts and the experiences with its facts and problems became clearer and more concretely related to life interests and especially when I had to handle patients whom I also had known without the mental disorder and who were viewed not as mere derelicts but as persons to be readjusted.
-- Adolf Meyer, 1928
Thirty-five years of psychiatry in the United States and our present outlook. American Journal of Psychiatry, 85, 1-31.
 
 

Table of Contents

Signature Page
Dedication
Table of Contents
List of Tables
List of Figures
Acknowledgements
Curriculum Vitae
Abstract of the dissertation
I. INTRODUCTION
A Physiological Framework for Understanding Information Processing Biases in Depression
Tasks to Evaluate Attention in Depression
The Need for Assessment Using Personally Relevant Information
Using Pupil Dilation to Measure Attention Continuously in Depressed Individuals
    Pupil Dilation as a Measure of Cognitive Load
    Appropriateness of Using Pupil Dilation to Study Depressed Individuals
Identification of Sustained Attention with Rumination
The Need for Simulation Before Prediction
II. A NEURAL NETWORK MODEL OF EMOTIONAL INFORMATION PROCESSING
A Short Introduction to Computational Neural Networks
Why Computer Simulations of Neural Networks are Particularly Appropriate for Investigating Depression
A Computational Framework for Investigating Affective Information Processing
    Representation of Valence
    Simulation of Normal and Depressed Experiences in the Model
    Simulating Attention and Recognition Processes
Reproducing Valence Rating Data
Use of the Network to Predict the Time Course of Attention to Emotional Information
    Predictions for Reaction Times
    Predictions for Error Rates
    Predictions for Pupil Dilation
Summary of Predictions Based on the Performance of the Neural Network Model
Experimental Design Based on the Network’s Behavior
III. METHOD
Research Participants
Self-Report Measures
Apparatus
    Target Stimulus Materials
Tasks
    Lexical Decision Task
    Valence Identification Task
    Warned Reaction Time Task
    Gaze Task
    Counterbalancing of Conditions
Procedure
Statistical Power
Selection of Relevant Stimuli for Analysis
Methods of Analysis - Reaction Times
    Data Aggregation
    Data Cleaning
    Analytic Techniques
Methods of Analysis - Signal Detection
    Data Cleaning
    Analytic Techniques
Methods of Analysis - Pupil Dilation
    Data Cleaning
    Data Aggregation - Principal Components Analysis
Type I Error Control
Type II Error Control and Exploratory Analyses
IV. RESULTS
Reaction Times
    Lexical Decision Task Planned Contrasts
Valence Identification Task Planned Contrasts
Planned Analyses Exploring the Relationship of Ruminative Coping to Reaction Time Measures
Signal Detection Rates
Valence Identification Task Response Biases
Lexical Decision Task False Alarms
Pupil Dilation
    Aggregate Curves
    Principal Components Analysis
    Planned Contrasts
    Interpretation of the late pupil dilation factor
Brief Summary
Personal Relevance
V. DISCUSSION
What This Experiment Could Say About Depressed People
What this Experiment Suggests about Depression: Depressed People Have Sustained Attentional Responses to Many Things
Operationalizing Rumination
Evaluation of Predictions from the Neural Network Model Based on the Empirical Data
Predictions Consistent With the Data: Seeing Things Negatively, and Sustaining Attention
Predictions Not Consistent with the Data: Attending Differently to Different Valences
Take Home Lessons
Potential Treatment Implications
Limitations
Limitations of the Neural Network Model
Limitations of the Sample of Research Participants
Limitations of the Sample of Stimuli
Limitations of the Analyses
Limitations of the Design
Other Generalizability Concerns
Future Directions
The Big Picture
Appendices
APPENDIX A -- TECHNICAL DETAILS OF THE NEURAL NETWORK
Construction of the Hebb trained network
Construction of the BackPropagation Network
Differences between Hebb and Backpropagation learning rules
APPENDIX B: GRAPHS OF NETWORK ACTIVATION IN RESPONSE TO EACH VALENCE
APPENDIX C: SELF-REPORT MEASURES
Beck Depression Inventory (BDI)
Beck Anxiety Inventory (BAI)
State Trait Anxiety Inventory (STAI)
Response Styles Questionnaire (RSQ)
APPENDIX D: WORD LISTS USED IN THE EXPERIMENT
APPENDIX E: WORD GENERATION FORM
APPENDIX F: CONSENT FORMS
SDSU form
UCSD form for non-veterans
UCSD form for veterans
APPENDIX G: INSTRUCTIONS FOR TASKS
Initial Directions
Lexical Decision Task - Practice
Lexical Decision Task - Task
Valence Identification Task - Practice
Valence Identification Task - Task
Word Rating Task
Warned Reaction Time Task
Gaze Task
APPENDIX H: SOFTWARE USED FOR DATA COLLECTION AND ANALYSIS
APPENDIX I: Examples of a few selected pupil dilation curves
APPENDIX J: Robustness of PCA analyses
APPENDIX K: ANOVA planned contrasts on differences between pupil dilation factors
References
 

List of Tables
1. Difference in reaction times (D) to negative and neutral words, in milliseconds
2. Median valence ratings for each stimulus from 10 simulated rating sessions
3. Numbers of research participants by ethnicity and gender
4. Age and years of formal education of participants meeting inclusion criteria
5. Mean match percentage between rated and normed or generated word valences.
6. Mean harmonic mean reaction times for the lexical decision task, in seconds.
7. Mean harmonic mean reaction times for the valence identification task, in seconds.
8. Valence Response Biases for the Valence Identification task.
9. Normed words used in the lexical decision and valence identification tasks
10. Nonwords used in the lexical decision task
11. Tests of relevant effects and contrasts for each component of pupil dilation for the valence identification task
12. Tests of relevant effects and contrasts for each component of pupil dilation for the lexical decision task

List of Figures

1. A neural network model for investigating affective and semantic information processing
2. Spread of activation through the network
3. Reaction time predictions from simulated affective and lexical decision tasks, from Siegle and Ingram (1997a)
4. Confusion and error rate predictions from simulated affective and lexical decision tasks, from Siegle and Ingram (1997b)
5. Simulated valence identification task network responses
6. Simulated pupil dilation on the valence identification task
7. Mean harmonic mean reaction times for the lexical decision task, in seconds.
8. Mean harmonic mean reaction times for the valence identification task, in seconds.
9. Mean of median pupil dilations for depressed and nondepressed individuals.
10. Factor loadings for each extracted component of pupil dilations from the lexical decision and valence identification tasks
11. Factor scores for depressed and nondepressed individuals for extracted components from the lexical decision and valence identification tasks
12. Simulated valence identification task network responses
13. Simulated lexical decision task network responses
14. Software used for management of pupil dilation database
15. Average pupil dilation curves for two depressed participants
16. Average pupil dilation curves for two nondepressed participants
17. Factor loadings for separate PCA’s for depressed and nondepressed individuals

 
ACKNOWLEDGEMENTS

Throughout the last five years an extraordinary group of individuals has supported, guided, and encouraged me in my work. Most notably, my advisors and committee members, Rick Ingram, Jörg Matt, Eric Granholm, John McQuaid, Greg Brown, and Chris Gillin, have each demonstrated consistent wisdom and generosity in helping my work to see its current plateau. I feel unique in having a committee in which each individual has helped create and mold my research program in ways I had never considered possible. Rick has shared profound and very human insights into depression and academia. Jörg has helped me to see so many different ways of making research stronger. Eric has encouraged many flights of science and fancy leading towards understanding the physiological basis of behavior, as well as upholding my sanity in our band’s blues rhythm section. John’s keen clinical insight, enthusiasm, and emotional support have been crucial for me. Greg’s critical insights into directions my models should go have been invaluable. Chris Gillin’s mobilization of the UCSD Mental Health CRC to recruit participants, fund this research, and keep the project afloat has allowed me to conduct my "ideal" study, rather than a much more limited version. You have all been constant sources of inspiration and information for me. Monica, thank you for your well placed and much needed support, acts of genius when I thought all was lost, and healthy cynicism. Gary Williams’ many hours of work on this project are also well noticed and much appreciated. Thanks also to the conscientious research participants who gave their hours and a glimpse into their souls such that this project could be completed. Mom and Dad, thanks so much for your encouragement at each stage of my training leading up to this project, as well as your support throughout its gestation. The support and unwavering faith of my in-laws have also been more valuable than they can know.
There are a number of other people whose suggestions, presence around the labs, comments, reassurances, and offhand remarks are most appreciated. Jenn Ritter, Javier Movellan, Bob McGivern, Martin Paulus, and the folks in Rick’s, Jörg’s, and Eric’s labs, thanks. Thanks also to my advisors and teachers in the past who have helped to shape me into someone who could create this document. Finally, thanks to Jonathan, Christine, and all the folks at the Clarke Institute who've stood by me during the revisions and finishing touches to this project.

 
 
Curriculum Vitae
Greg Siegle
 
Office Address:
The Clarke Institute
250 College St, 11th Floor
Toronto, Ontario M5T 1R8
Phone: 416-979-4747 x2376
Fax: 416-979-6821

Home Address:
5 Hartham Place #305
North York, Ontario M3K 1P2
Phone: 416-635-0716

Web: http://www.sci.sdsu.edu/CAL/greg.html
Email: gsiegle@psychology.sdsu.edu

Date of Birth: February 9, 1969

Education
1999 Psychology Intern
  Clarke Institute of Psychiatry
1999 Doctor of Philosophy in Clinical Psychology
  San Diego State University / University of California, San Diego Joint Doctoral Program
  Dissertation: Cognitive and Physiological Aspects of Attention to Personally Relevant Negative Information in Depression
1996 M.S. in Psychology
  San Diego State University
  Thesis: Rumination on Affect: Cause for negative attention biases in depression?
1992 Graduate courses at the Institute for the Learning Sciences
  Northwestern University, Evanston IL
1991 Sc.B. with honors in Cognitive Science, magna cum laude
  Brown University, Providence, RI
  Thesis: Time is of the Essence: A Theory of Multiple Rate
  Determined Methods for Understanding Continuous Change

Awards and Distinctions
1998, 1999 Travel Award, University of Wisconsin
1997  Sigma Xi Grant in Aid of Research
1996  Dorothy K. Fricke Award, Voted by Joint Doctoral Program faculty for the graduate student in the SDSU/UCSD Joint Doctoral Program who has made the most significant contribution to the program
1995  Travel Award, University of Maryland
1991  Elected to Sigma Xi -- a national research honor society.
1987  National Merit Scholar.
 
Editorial Responsibilities
1998- Web Master, Connectionist Models of Cognitive, Affective, Brain, and Behavioral Disorders
1997- 1998 Program Committee, International Workshop on Neural Network Models of Cognitive and Brain Disorders.
1996-  Notes and Announcements Editor, Cognitive Therapy and Research
1996-  Web Master, Cognitive Therapy and Research
1995-  Web Master, Cognitive Clinical Assessment Lab, San Diego State University
1994-  Reviewer, Cognitive Therapy and Research, Clinical Psychology Review, Journal of Affective Disorders

Memberships in Professional Societies
American Psychological Association
Association for the Advancement of Behavior Therapy
Sigma Xi

Research Experience

My research program investigates the roles of memory and attention biases in the onset and maintenance of psychopathology. My primary research has involved the creation and validation of biologically motivated computational neural network models of the roles of attention and memory in depression. Empirical validation for the models has compared the behavior of simulations to that of depressed individuals on a variety of information-processing and neuropsychological tasks, as well as physiological measurements and the results of meta-analytic research syntheses.
 
1998 - present Clarke Institute of Psychiatry, Toronto. 

As an intern at the Clarke Institute I am beginning a study measuring neuropsychological deficits in schizophrenia, and have been involved in the analysis of data for a number of ongoing projects. This work has involved creation of novel indices to represent binding kinetics for dopamine occupancy studies, an analysis of whether rumination hinders recovery from depression in cognitive therapy, and a taxometric analysis of differences between dysthymia and depressive personality disorder.

1993 – present  Cognitive Clinical Assessment Lab, San Diego State University.  

With Rick Ingram, Ph. D. I have investigated information processing biases in individuals with features of depression. Responsibilities in the lab have involved researching and developing a large battery of computer based emotional analogs of traditional neuropsychological and information processing tasks such as a Stroop task, Degraded Stimulus Task and Backward Masking task. I have overseen the collection of norms on the battery in college students with and without features of depression and depressed inpatients, and have been responsible for statistical analysis of data from these tasks. Additionally my work with the lab has involved investigating the relationships between cognitive aspects of depression and other phenomena including other psychopathologies, aspects of personality, ethnicity, and ethnic identity.

1993 – present  Pupillometry Lab, University of California, San Diego. 

With Eric Granholm, Ph. D. I have investigated cognitive and neuropsychological correlates of psychopathology, using physiological indices of information processing. Particular emphasis was placed on using pupil dilation as a reliable indicator of cognitive load. I have examined differences in pupil dilation over time in response to a number of emotional information processing tasks in individuals with and without clinical depression, and have related variation in pupil dilation to change in other physiological measures. This research has involved both the creation of software to measure pupil dilation during a number of cognitive tasks, and the creation of novel methods of aggregating and analyzing physiological data. It has also involved the establishment and maintenance of an ongoing relationship between the Pupillometry lab and the UCSD Mental Health Clinical Research Center (MHCRC). 

1993 – present  Measurement Lab, San Diego State University.  

With Georg Matt, Ph. D., I have investigated rigorous methods of representing mental processes. This research has centered on evaluating the validity of computational models of mental disorder, and using fuzzy sets to represent memory for health related behaviors. In addition we have performed meta-analyses of published studies using the affective lexical decision task in depressed and nondepressed individuals, and of psychotherapy outcomes in clinically representative conditions.

1993  Chicago Followup Study, University of Illinois at Chicago. 

With Martin Harrow, Ph. D., I used neural network theory to explain information processing in individuals with schizophrenia. Specifically we investigated the role of task difficulty in producing bizarre responses from schizophrenic individuals. Additionally I provided computer support for the Chicago Followup Study, a longitudinal study of schizophrenia and depression.

1992-1993  Institute for the Learning Sciences, Northwestern University.  

With Clark Elliott, Ph. D., I modeled emotional reasoning processes on a computer. We were especially interested in modeling mood intensity, and the relationship of depression to creativity.

1991-1993  Institute for the Learning Sciences, Northwestern University.  

With Ken Forbus, Ph. D., and Dedre Gentner, Ph. D., I created computational models of qualitative and analogical reasoning and created visual abstractions of analogical processes.

1990-1991  Cognitive Science Department, Brown University. 

With Bill Warren, Ph. D., and Heinrich Bülthoff, Ph. D., I investigated the psychological and physiological bases of visually perceived transformations. I tried to demonstrate experimentally that people have multiple cognitive mechanisms for understanding continuous visually perceived change, based on the rate at which it occurs. Results were tied to neurophysiological research and ecological perceptivism.

1989 – 1991  Artificial Intelligence Group, Brown University.  

With Tom Dean, Ph. D., I investigated models of temporal cognition in continuous event systems and robot control in such systems, and designed a graphical robot control simulator.

Clinical Experience
1998 Clarke Institute of Psychiatry, Toronto 

Supervisors: Zindel Segal, Ph. D., Laurie Gillies, Ph. D., Bruce Christensen, Ph. D. 

Providing individual cognitive behavioral therapy for depression and interpersonal therapy for depression and borderline personality disorder, as well as neuropsychological testing. Attend weekly rounds, perform intakes, suitability assessments, and structured diagnostic interviewing.

1997 Mental Health Clinical Research Center, UCSD 
Supervisor: Chris Gillin, M.D. 
Evaluating appropriateness of individuals for clinical research, screening depressed individuals for appropriateness for Cognitive Behavioral Group Therapy treatment, evaluating individuals on novel neuropsychological tasks. Attend weekly inpatient and outpatient rounds for VA depression research unit.
1995-1997  Mood Clinic, San Diego VA Medical Center 
Supervisor: John McQuaid, Ph. D. 
Providing individual, couples, and group Cognitive Behavioral Therapy to veterans with features of depression, anxiety, and PTSD. Co-led "Cognitive Therapy for Depression" group. Provided psychological consultations to the VA spinal-cord-injury unit for patients with depression. Developed an auditory virtual reality intervention for PTSD in which patients are able to control the onset, volume, and character of auditory experiences relevant to triggers for their PTSD in the course of systematic desensitization. Screened patients for appropriateness for Cognitive Behavioral Therapy. 
1995-1996  PTSD Unit, San Diego VA Medical Center 
Supervisor: Jeff Matloff, Ph. D. 
Provided outpatient therapy to veterans with PTSD. Other responsibilities included screening for appropriateness for groups and individual therapy, and administering and scoring a number of measures relevant to PTSD. Provided psychological consultations to the VA spinal-cord-injury unit. Developed computer programs to score a large number of PTSD measures regularly administered to veterans. 
1994-1995  Outpatient Clinic, San Diego State University 
Supervisor: Jeannine Feldman, Ph. D. 
Provided outpatient counseling to individuals and couples from the San Diego community. Counseling was done using a variety of theoretical orientations. 
Summer 1994  Psychiatric Inpatient Unit, UCSD Medical Center 
Supervisor: William Perry, Ph. D. 
Provided neuropsychological consultation services for an 18 bed inpatient psychiatric unit on which all patients received brief neuropsychological screens. Performed neuropsychological and psychodiagnostic evaluations of inpatients with a wide variety of disorders. Led two hour group for patients on the unit. 
Teaching Experience
 
1998 Guest Lecturer for a course in cognition and psychopathology, University of Toronto
1998 Guest Lecturer on Cognitive Behavior Therapy for Medical Residents, Clarke Institute of Psychiatry
1996-1998  Instructor for Testing and Measurement, a course for upper level psychology undergraduates, San Diego State University. 
Responsible for all aspects of instruction including course-preparation, teaching, exam construction, and grading.
1995-1996  Teaching Assistant for doctoral statistics courses in regression and analysis of variance. San Diego State University 
Responsible for creation and grading of assignments, meeting with students, teaching computer-based analysis of data, and some lectures.
1994-present  Supervisor for undergraduate research assistants in research methodology and investigation of cognitive correlates of depression and anxiety. 
Supervision has involved teaching a weekly course in research methods and the role of cognition in psychopathology as well as one-on-one research supervision and training. San Diego State University
1993 Guest Lecturer for a course in experimental mathematics, Northwestern University
1991  Instructor for Automated Learning Systems, a course involving the use of psychological data to create learning algorithms for autonomous robots. Brown University. 
Responsible for creating the course, course-preparation, and teaching.
1991  Head Teaching Assistant for Introduction to Artificial Intelligence. Brown University. 
Responsible for creation and grading of assignments, regular office hours, supervision of other teaching assistants.
1990  Teaching Assistant for Honors Third Semester Calculus, Brown University. 
Responsible for teaching a mathematics visualization laboratory 
1990  Teaching Assistant for Introduction to Artificial Intelligence and Fundamentals of Programming. Brown University. 
Responsible for grading assignments and regular office hours
1989  Led a group independent study in creating software for the visualization of mathematically generated functions. Brown University.
Other Professional Experience
 
1986 - present  President, Small Miracles Consulting (a private computer consulting firm). 
PC/Unix Programming, Training, Support, Statistical Data Analysis, Web/Internet Publishing, Graphics, Artificial Intelligence programming, Database and Spreadsheet Design and Programming, System Consulting, Computer System Maintenance.
1993  Consultant, Uni*Quality. 
Programming risk-analysis software for an international trading firm. Programming was done in Lisp and C. Graphic interfaces were developed in LispView. Database programming was done in SQL running on Sybase. Programs ran on Sun Sparc workstations.
1990 (summer), 1991 - 1992  Programmer/Analyst, Institute for the Learning Sciences. 
Modeling qualitative reasoning and human analogical reasoning, developing graphic interface software in Lisp with CLIM for a "modeler’s workbench" which allows users to analyze qualitative and numerical models of physical systems, and creating educational software for use in schools and large corporations. All programs ran on IBM RISC/6000 workstations.
1988 - 1991  Computer Consultant, Brown University Computer Science Department. 
Maintaining and supporting Apollo and Sun software and hardware
1988  Consultant/Programmer, Abbott Laboratories. 
Software, system consulting, LOTUS 123 programming, accounting.
1984 - 1985  Programmer, The OMEGA Group. 
Developing user friendly graphical analysis software for analog to digital conversion

Publications

In Preparation

Siegle, G., Granholm, E., Ingram, R., & Matt, G. (in prep). Physiological measurement of sustained attention in depression using pupil dilation.

Submitted

Siegle, G., Ingram, R. E., & Matt, G. E. (under revision). Affective interference: Cause for negative information processing biases in depression?

Hinkin, C. H., Castellon, S. A., Granholm, E., Kalechstein, A. D., Yarema, K. T., & Siegle, G. (under revision). Computerized and traditional Stroop task dysfunction in HIV-1 infection.

Siegle, G., Ingram, R. E., & Matt, G. E. (under revision). Connectionist models of psychopathology, A review. Used as a text in university courses at Fordham University and the University of Warsaw.

Available on the world wide web at www.sci.sdsu.edu/CAL/connectionist-models/

Siegle, G., Ingram, R. E., & Matt, G. E. (under revision). A connectionist model of the emotional Stroop task.

In Press and Published Peer Reviewed Articles

Brown, G., Kinderman, S., Siegle, G. J., Granholm, E., Wong, E. C., & Buxton, R. B. (in press). Brain activation and pupil response during covert performance of the Stroop color word task. Journal of the International Neuropsychological Society.

Siegle, G. (1997) Why I make models (or what I learned in graduate school about validating clinical causal theories with computational models). The Behavior Therapist, 20, 179-184.

McGivern, R. F., Huston, J. P., Byrd, D., King, T., Siegle, G., & Reilly, J. (1997). Sex differences in visual recognition memory: Support for a sex related difference in attention in adults and children. Brain and Cognition, 34, 323-336.

Shadish, W. R., Matt, G. E., Navarro, A. M., Siegle, G., Crits-Christoph, P., Hazelrigg, M., Jorm, A., Lyons, L. S., Nietzel, M. T., Prout, H. T., Robinson, L., Smith, M. L., Svartberg, M., & Weiss, B. (1997). The generalization of psychotherapy research to clinically representative conditions: A preliminary answer. Journal of Consulting and Clinical Psychology, 65, 355-365.

Ingram, R. E., Kendall, P. C., Siegle, G., & Guarino, J., (1996). Psychometric properties of the positive automatic thoughts questionnaire, Psychological Assessment, 7, 495-507.

Siegle, G., & Ingram, R. (1996). The big picture. Contemporary Psychology, 41(2), 163-164.

Banchoff, T., Achter, J., Ahmad, R., Curtis, C., Hendrickson, C., Siegle, G., & Stone, M., (1991). Student Generated Software for Differential Geometry. In W. Zimmerman & S. Cunningham, (Eds.), Visualization in Teaching and Learning Mathematics, Mathematical Association of America.

Banchoff, T., Achter, J., Ahmad, R., Curtis, C., Hendrickson, C., Siegle, G., & Stone, M., (1989). Student Generated Interactive Software for Calculus of Surfaces in a Workstation Laboratory, Undergraduate Mathematics Education Trends.

Book Chapters

Siegle, G. (in press). A neural network model of attention biases in depression. J. Reggia & E. Ruppin (Eds.). Neural Network Models of Brain and Cognitive Disorders, v. 2. Elsevier.

Siegle, G. & Ingram, R. (in press). Cognitive Science and Cognitive Therapy. K. Dobson (Ed.) Handbook of Cognitive Therapy.

Siegle, G. & Ingram, R. (1997). Modeling individual differences in negative information processing biases. In G. Matthews (Ed.), Cognitive Science Perspectives on Personality and Emotion. New York, NY: Elsevier.

Ingram, R. E., Scott, W., Siegle, G. (in press) Depression. In T. Millon (Ed.), Oxford Textbook of Psychopathology, Oxford University Press.

Dombeck, M., Siegle, G., & Ingram, R. E., (1996). Cognitive interference and coping strategies in vulnerability to negative affect: The threats to identity model. In I. G. Sarason, B. Sarason, & G. Pierce (Eds.). Cognitive interference: Theories, methods, and findings (pp. 299-323). Mahwah, NJ: Erlbaum.

Invited Address

Siegle, G. (1998). A neural network model of affective interference in depression. International Workshop on Neural Network Models of Cognitive and Brain Disorders, College Park, MD.

Presentations

Siegle, G., Ingram, R., Granholm, E., & Matt, G. (1998). Modeling the time course of attention to negative information in depression. In G. Matthews (Chair), Cognitive science perspectives on personality and emotion. Presentation at the 9th European Conference on Personality, Surrey, England.

Ritter, J., Siegle, G., & Ingram, R. (1998). Parental Bonding and Unique and Shared Aspects of Affective Symptomatology. Presentation at the meeting of the American Psychological Association, San Francisco, CA.

Williams, G., Conner, J., Siegle, G., Ingram, R., & Cole, D. (1998). Is more negative less positive? Relating dysphoria to emotion ratings. Presentation at the meeting of the Western Psychological Association, Albuquerque, New Mexico.

Siegle, G. & Ingram, R. (1997). A neural network model of inability to process emotional information in depression. Presentation at the meeting of the Society for Research in Psychopathology. Palm Springs, CA.

Felsch, J., Turingan, M., Garcia, M., Primicias, W., Siegle, G., & Matt, G. E. (1997). Comparing traditional and fuzzy retrospective self-reports of fitness behaviors. Presentation at American Evaluation Association, San Diego, CA.

Siegle, G. & Sarkin, A. (1997). Identifying latent positive and negative schizotypal taxons using depressive symptomatology. Presentation at the meeting of the American Psychological Association. Chicago, IL.

Siegle, G., Gray, J., & Ingram, R. E. (1997). Relationships of binging and purging to depressive and anxious symptomatology. Presentation at the meeting of the American Psychological Association. Chicago, IL.

Liu, P. J., Eftekhari, A., Malcarne, V., Siegle, G., & Chavira, D. (1997). Perceptions of parental bonding: Ethnic differences and relationship to depression. Presentation at the meeting of the American Psychological Association. Chicago, IL.

Ingram, R. E., Bailey, K., Siegle, G. (1997). Depressotypic information processing in individuals with disrupted parental attachment. Presentation at the meeting of the American Psychological Association. Chicago, IL.

Ingram, R.E., Bailey, K., Siegle, G., Huston, P. (1997). Development of a measure of past depressive symptoms and impairment. Presentation at the meeting of the Western Psychological Association. Seattle, WA.

Frydach, C. J., Garcia, M., Siegle, G. J., & Matt, G. E. (1997). Comparing traditional and fuzzy retrospective self-reports of health behaviors. Presentation at the Meeting of the Western Psychological Association, Seattle, WA.

Ritter, J., Siegle, G., & Ingram, R. (1997). Relationship of early parental bonding to depressive and anxious symptomatology. Presentation at the meeting of the Western Psychological Association, Seattle, WA.

Hinkin, C. H., Castellon, S. A., Wood, S., Granholm, E. L., & Siegle, G. (1996). Computerized and traditional Stroop task dysfunction in HIV-1 infection. Presentation at the meeting of the International Neuropsychological Society.

Bailey, K., Siegle, G., & Ingram, R. (1996). Attachment style: A vulnerability factor for depression? Presentation at the meeting of the Rocky Mountain Psychological Association, Park City, Utah.

Eftekhari, A., Siegle, G., & Ingram, R. (1996). Nonparticipation and nonattendance to psychology experiments. Poster session presented at the meeting of the Rocky Mountain Psychological Association, Park City, Utah.

Ingram, R. E., Malcarne, V. L., Chavira, D., Siegle, G., & Velasquez, R. (1995). Depression, anxiety, and ethnic identity/acculturation. In D. Hope (Chair), New perspectives on depression and anxiety. Symposium presented at the meeting of the Association for the Advancement of Behavior Therapy, Washington, D. C.

Siegle, G., Ingram, R. E., & Jeffers, M. (1995). Information processing differences in depression and anxiety. Presentation at the meeting of the Association for the Advancement of Behavior Therapy, Washington, D. C.

Siegle, G., Ingram, R. E., & Matt, G. E., (1995). A neural network model of information processing biases in depression. Presentation at the workshop Neural Modeling of Cognitive and Brain Disorders. College Park, Maryland.

Siegle, G., Ingram, R. E., Matt, G. E., Shibley, M., Gyll, S., & Flaherty, M. (1995). Rumination on valence in depression: Understanding the lexical decision task. Presentation at the meeting of the American Psychological Association, New York, NY.

Le, V., Takarae, Y., Matt, G. E., & Siegle, G., (1995). People’s conceptions of self-reports. A fuzzy set model. Presentation at the meeting of the Western Psychological Association.

Siegle, G. (1994) A Connectionist Model of Recall in Depression. In G. Siegle & R. Ingram (Chairs), Connectionist Models of Negative Affect. Panel discussion conducted at the meeting of the Association for the Advancement of Behavior Therapy, San Diego.

Matt, G. E., Siegle, G., Chartier, L., & Le, V. (1994). Improving self-reports of health related behaviors: A fuzzy set model. Presentation at the meeting of the Society for Behavioral Medicine, San Diego, CA.

Matt, G. E., Chartier, L., McKellar, J., & Siegle, G., (1994). Self-reports as fuzzy sets: Arguments, implications, and examples. Presentation at the meeting of the American Psychological Association. Los Angeles, CA.

Matt, G. E., Shadish, W., Navarro, A., & Siegle, G. (1994). Generalizing from the research lab to clinical practice: A reanalysis of psychotherapy meta-analyses. Presentation at the meeting Evaluation ‘94.

Matt, G. E., Shadish, W., Navarro, A., & Siegle, G., (1994). The effects of psychotherapy conducted under clinically representative conditions. Presentation at the meeting of the American Psychological Association, Los Angeles, CA.

Siegle, G. (1994). Integration of self as a modulator for depressive rumination. In M. Dombeck (Chair), Self structure and vulnerability to psychopathology, Symposium conducted at the meeting of the Rocky Mountain Psychological Association, Las Vegas.

Siegle, G., Harrow, M., Sands, J., Miller, A., & Jobe, T., (1993). Does stimulus difficulty play a role in thought pathology? Schizophrenics vs. Depressives. Presentation at the meeting of the Society for Research in Psychopathology, Chicago, IL.

Siegle, G. (1993). The CLIM Prototyping Environment. Presentation at the meeting of INTERCHI: Amsterdam.

Elliott, C., & Siegle, G. (1993). Variables affecting the intensity of simulated affective states. Presentation at the AAAI Spring Symposium on Mental States.

Elliott, C., & Siegle, G. (1993). Emotion Intensity Factors in Simulated "Believable Agents". In WAUME '93: Workshop on Architectures Underlying Motivation and Emotion. Birmingham, UK: The University of Birmingham.

Dean, T., & Siegle, G. (1990). An Approach to Reasoning About Continuous Change for Applications in Planning. Proceedings of 1990 AAAI Conference on Planning and Control.

Siegle, G., & Gunn, G. (1986). Mathematical analysis of collective motion. Presentation at the Connecticut Junior Science and Humanities Symposium.

Unpublished Scholarly Works

Siegle, G. (1996). The Big Cab Book: Manual for the Cognitive Assessment Battery

Siegle, G. (1995). Mass Testing Manual: A manual for the running and scoring of Mass Testing measures at San Diego State University

Events Organized

Chaired Panel Discussion Connectionist Models of Negative Affect, Meeting of the Association for the Advancement of Behavior Therapy. Presenters included Gerald Matthews, Warren Tryon, and Greg Siegle. Rick Ingram was co-chair and discussant.

Colloquium series on Computational Models of Mental Disorders. The series has run biweekly for the last two years with participants from both UCSD and SDSU.

ABSTRACT of the dissertation

Cognitive and Physiological Aspects of Attention to
Personally Relevant Negative Information in Depression
by
Greg Jeremy Siegle
Doctor of Philosophy in Clinical Psychology
University of California, San Diego,
San Diego State University, 1999
Professor Rick Ingram and Professor Georg Matt, Co-Chairs

Evidence suggests depressed individuals pay excessive attention to negative information. The current research investigates the nature and clinical implications of such attention biases. A computational neural network, reflecting interacting brain systems that identify emotional and nonemotional aspects of information, is described in which depression is identified with strongly learning certain negative information. The model's behavior suggested that depressed people are reminded of, and attend to, personally relevant negative information in response to many stimuli.

Predictions for depressed and nondepressed individuals' reaction times, signal detection rates, and the time course of cognitive load in response to emotional stimuli were derived from the computational model. To evaluate these predictions, pupil dilations and reaction times were collected from 24 unmedicated depressed and 25 nondepressed adults in response to emotional lexical decision and valence identification tasks. Pupil dilation was used to index cognitive load.

Mixed ANOVA planned contrasts were employed to evaluate predictions. In support of model derived predictions, depressed individuals rated many stimuli as negative more than nondepressed individuals. The network's behavior suggested that depressed individuals would be quicker to say that negative words were negative, than positive words were positive, and that this difference would be reduced in nondepressed individuals. This prediction was supported empirically.

Principal components analysis of pupil dilations revealed early attentional components (at or before reaction times) and late, possibly ruminative, components (peaking 2 and 4 seconds after reaction times). The computational model suggested cognitive load, indexed by pupil dilation, would be highest for nondepressed individuals during early stages of attention but highest for depressed individuals during later stages of attention. This prediction was supported. Contrary to predictions, differences in depressed individuals' dilations to positive and negative stimuli were not detected.

These data suggest depressed individuals may not initially attend to the content of presented information, but may quickly associate any incoming information with whatever made them depressed. Sustained attention to personally relevant negative information may characterize depressive attention biases. Targeting implicated cognitive and brain processes may improve interventions for depression.

 

 
I. INTRODUCTION

Depression is a disabling disorder characterized by negative moods, lack of interest in pleasurable activities, weight change, sleep disturbance, psychomotor retardation, fatigue, feelings of worthlessness, decreased attention, and suicidal ideation (American Psychiatric Association, 1994). Estimates for point prevalence of the phenomenon range from 5% to 44% of the population (Flaherty, Gavira, & Val, 1992). The prevalence and seriousness of the disorder make understanding factors associated with its onset and maintenance a common goal of clinical researchers. Attentional styles have been suggested to play such a maintaining role (e.g., Ingram, 1990; Teasdale, Williams, & Segal, 1993).

A great deal of research suggests that depressed people pay excessive attention to negative information, in comparison to nondepressed people. For example, research finds that depressed individuals evidence a bias to selectively attend to negative information (Matthews & Harley, 1996; Williams, Mathews, & MacLeod, 1996), remember negative information better than positive information (Blaney, 1986; Matt, Vazquez, & Campbell, 1992), and interpret information as negative that other people do not see as negative (Williams, Conner, Siegle, Ingram, & Cole, 1998). These findings mirror clinical presentations of depressed individuals who say that they "see things negatively" or "hear only the negative."

Although there is overwhelming evidence that such information processing biases exist in depressed individuals, their role in promoting or maintaining aspects of depressive symptomatology is unclear. For example, some researchers have suggested that negative attention and interpretation biases may be integral to the onset of depression (e.g., Beck, 1967, 1974; Ingram, 1984, 1990; Ingram, Miranda, & Segal, 1998). Others hypothesize that negative attention biases are epiphenomenal byproducts of the depressive state (e.g., Teasdale & Barnard, 1994). Still other researchers propose that depressed people do not differentially attend to negative, positive, and neutral information, and that results suggesting they do are a product of other cognitive processes such as negative memory or decision making biases (Hill & Dutton, 1989; MacLeod & Mathews, 1991; MacLeod, Mathews, & Tata, 1986).

Because negative attentional biases are critical aspects of several cognitive theories of depression (e.g., Beck, 1967, 1974; Ingram, 1984; Ingram et al., 1998), clarifying the function of attention in depression may have broad theoretical implications for understanding this disorder. In addition, elucidating the link between attention and depression may aid in understanding negative mood states in the normal range of experience; that is, by understanding how normal emotional information processing may become disturbed, constraints may be inferred on the mechanisms by which emotional information processing occurs.

The research described here examines the idea that thinking about personally relevant negative events in response to any negative stimulus, possibly long after the stimulus is presented, is responsible for information processing biases in people with clinical depression. The physiological basis for this theory will be described. Computational simulations of analogs of relevant physiology will be used to understand the implications of such a theory. Finally, an empirical study will be presented that examines reaction time, signal detection, and physiological correlates of affective interference.

A Physiological Framework for Understanding Information Processing Biases in Depression

To better understand the role of attention in depression, it is useful to appeal to theories regarding how emotional information is processed in the brain. LeDoux (1997) has suggested that emotional information is processed in parallel by the amygdala system, brain structures responsible for associating stimuli with an emotional valence, and the hippocampal system, brain structures responsible for associating information with non-emotional or semantic features. Feedback occurs between these structures, and eventually, information is transmitted to other brain areas associated with the generation of behaviors.

A great deal of physiological evidence supports such a model. Both hippocampal and amygdala systems receive inputs from the brain’s sensory processing areas (LeDoux, 1989). Extensive reviews by LeDoux (1992, 1995) and Halgren (1992) document the role of the amygdala in emotional perception. Other reviews document the hippocampal system’s role in semantic association, suggesting that it acts as an index to the semantic memory system, moderating activation of semantic qualities associated with stimuli in cortex (e.g., Squire, 1992). Amarel et al. (1992) review a large literature documenting extensive feedback between these systems. Tucker and Derryberry (1992) have implicated this feedback loop in the integration of affective and semantic processing.

LeDoux’s (1997) model is conceptually similar to traditional theories of emotional information processing, which suggest that cognitive activity involves nodes in a cognitive network containing semantic information (e.g., Collins & Loftus, 1975). When a node is activated (e.g., by association with an environmental stimulus) it activates other nodes to which it is connected, which in turn activate still other nodes to which they are connected. In this way, activation spreads throughout the network. Bower (1981) explained emotional responses in such a network by positing that some nodes contain exclusively emotional content. Thus, activation can occur for both the semantic and affective content of incoming information. For example, a stimulus such as a crying person might activate both "person" and "sadness" nodes in an observer’s cognitive network. Ingram (1984) suggests that people who are depressed suffer from strongly activated connections between negative affective nodes and multiple semantic concepts, creating feedback loops that serve to propagate depressive affect and cognition. According to this framework, cognitive functioning in depression directs attention toward negative stimuli as well as toward the negative features of stimuli that have both positive and negative features. This approach has been used to account for many types of biased information processing observed in depressed people as well as the onset and maintenance of depression (Ingram, 1984; Teasdale, 1988), its treatment (Ingram & Hollon, 1986; Morrow & Nolen-Hoeksema, 1990) and potential recurrence (Teasdale, 1988).
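
The spreading-activation idea described above can be illustrated with a minimal sketch. It is not the computational model developed in this dissertation; the node names, link weights, decay factor, and threshold are hypothetical values chosen only to show how a stronger link from an affective node to a semantic concept lets more activation reach that concept.

```python
def spread_activation(links, start, decay=0.5, threshold=0.05):
    """Spread activation outward from `start` through weighted links.

    links: dict mapping node -> list of (neighbor, weight) pairs,
           with weights assumed to lie between 0 and 1.
    Returns a dict mapping each reached node to the strongest
    activation it received.
    """
    activation = {start: 1.0}
    frontier = [start]
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor, weight in links.get(node, []):
                passed = activation[node] * weight * decay
                # Propagate only meaningful activation that improves on
                # what the neighbor already holds (guarantees termination).
                if passed > threshold and passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation


def build_network(sadness_to_loss):
    """Hypothetical network: a crying person activates semantic and affective nodes."""
    return {
        "crying person": [("person", 0.8), ("sadness", 0.8)],
        "person": [("friend", 0.5)],
        "sadness": [("loss", sadness_to_loss)],
    }


# Weak versus strong link from the affective node to a negative concept.
typical = spread_activation(build_network(0.3), "crying person")
depressed_like = spread_activation(build_network(0.9), "crying person")
print(typical["loss"], depressed_like["loss"])
```

In this sketch the "person" node receives the same activation in both networks, while the "loss" node receives roughly three times as much when the link from "sadness" is strong, which is the kind of asymmetry Ingram's (1984) account attributes to depression.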

There are many ways of extending Ingram’s (1984) theory to account for depressive information processing biases in LeDoux’s physiologically motivated model. Depressive attention biases could be thought of as involving excessive attention to emotional features (biased amygdala system processing). A less theoretically motivated alternative that is also possible in LeDoux’s model is that depressed individuals’ attention is directed to nonemotional or semantic features of information (biased hippocampal system processing). Similarly, depressive attention biases could result from feedback between these systems.

These different types of attention biases suggest different roles for attention in the onset and maintenance of depression. For example, if depressed individuals attend primarily to the nonemotional aspects of a negative stimulus, depression may be viewed as a process in which mood alters the types of semantic associations which are made to one’s environment (Ingram, 1984). Such processing would be exemplified by a depressed person hearing the word "guilt" and thinking of various things for which he or she feels guilty. Alternately, if depressed individuals attend primarily to the emotional aspects of negative information, depression might be thought of as allowing emotions to interfere with normal semantic associations. This thinking would be exemplified by a depressed person hearing the word "guilt" and thinking how negative guilt is. Subsequent thoughts would primarily be focused on negativity itself. If feedback between affective and semantic structures becomes disturbed in depression, semantic associations with any negative information may be those associations most closely related to a person’s representation of negativity. For example, a recently widowed depressed person might hear the word "guilt", feel guilty about their spouse’s death, and dwell on thoughts and feelings of bereavement. This phenomenon will be referred to as "affective interference." Normal semantic associations are preempted by the affective quality of a stimulus.

Tasks to Evaluate Attention in Depression

Based on LeDoux’s model, a number of tasks could be used to aid in understanding the nature of attention biases in depression. A lexical decision task, in which participants are asked to identify whether a string of letters spells a word, which may be positive, negative, or neutral, asks participants to identify aspects of a stimulus that have nothing to do with the affective content of the stimulus. In this way, participants’ attention is directed towards non-emotional aspects of stimuli. If people who are depressed primarily focus on such non-emotional aspects of negative information, negative information processing biases should be especially pronounced on a lexical decision task. In contrast, if depressed people focus more on the affective valence of information, information processing biases would be more apparent on a task designed to focus people’s attention on affective information. A valence-identification task, in which individuals are asked to name whether a word is positive, negative, or neutral, can be used. Attentional biases involving feedback between mental representations of affective and semantic aspects of information are assumed to result in biased information processing on both tasks.

Both tasks have been used in past research on psychopathology, but rarely together. Siegle, Ingram, and Matt (1998) have documented a number of studies in which depressed people’s reaction times were compared to nondepressed people’s on an affective lexical decision task. In general, they show that people who are depressed are slightly slower to say that negative words are words than they are to say that neutral words are words. In contrast, nondepressed people are not slower to react to negative words, as shown in Table 1. On this table, D represents the difference in reaction times to negative and neutral words in milliseconds.1 To explore the robustness of the finding that depressed people are slow to react to non-emotional aspects of negative information, Siegle (1996) meta-analytically examined a number of other contrasts on affective lexical decision task studies conducted with depressed and nondepressed individuals with similar results. The overall conclusion was that depressed people react slower to negative than other types of information, whereas nondepressed people do not. Siegle et al. (1998) also performed an affective lexical decision task with dysphoric and nondysphoric undergraduates. They found that dysphoric undergraduates were slower to respond to negative than neutral words, in comparison to nondysphoric undergraduates. The overall picture is one of delays in responding to negative words by depressed individuals on the task.
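
As an illustration of the difference score D reported in Table 1, the following sketch computes it for a single hypothetical participant. It assumes D is the mean reaction time to negative words minus the mean reaction time to neutral words, so that positive values indicate slower responses to negative words; the reaction times shown are invented for the example and are not data from any cited study.

```python
from statistics import mean


def rt_difference(negative_rts, neutral_rts):
    """Return D: mean RT to negative words minus mean RT to neutral words (ms)."""
    return mean(negative_rts) - mean(neutral_rts)


# Hypothetical lexical decision reaction times, in milliseconds.
negative_rts = [650, 700, 720]
neutral_rts = [640, 660, 680]

d_score = rt_difference(negative_rts, neutral_rts)
print(d_score)  # positive: slower to negative than to neutral words
```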

In contrast to the relatively broad literature examining lexical decisions in depressed individuals, the literature reveals no experiments using a valence-identification, or "affective decision" task with a depressed population. Still, many tasks do reveal group differences in processing of different emotional valences assumed to be associated with particular disorders (e.g., identifying threat by anxious individuals). Siegle et al. (1998) assessed dysphoric and nondysphoric college students on an affective valence identification task. They found that dysphoric college students were slow to say that positive words were positive, but were not particularly biased in their processing of negative information. Other uses of valence-identification tasks have used different populations. Mathews and Milroy (1994) used an affective decision task to assess sensitivity to threat words in anxious people. They found relatively little difference between affective decision times to positive and negative words. In contrast, anxious individuals were facilitated in responding to threat words. Hill and Kemp-Wheeler (1989) also described an affective decision task in which participants were asked to categorize words as threatening or nonthreatening. They found strong effects for anxious participants, and relatively small differences in affective decision times for depressed participants.

 

Table 1

Difference in reaction times (D) to negative and neutral words, in milliseconds

                                   Nondepressed             Depressed
Study                              D        d               D        d
Bradley et al. (1994)              0.1      .04
Bray (1984)                        10       .15
Challis & Krane (1988)             -62      -.62
Matthews & Southall (1991)2        18       .36             76       1.24
Stip et al. (1992, 1994)           -8       -.29            30       .11
Williamson et al. (1991)           -46      unavailable
MacLeod, Tata, & Mathews (1987)    10       unavailable     29       unavailable
Average                            -2.07    .36             27.44    .41

A similar study requiring participants to appraise the affective valence of entire sentences (Weaver & McNeill, 1992) examined differences in reaction time latencies for nondepressed subjects and subjects in whom a sad mood had been induced. The mood by valence interaction with response time latency was not statistically significant in this study; the authors did not report a statistic that would allow a calculation of effect size for this measure. It is unclear whether their apparent null results were due to a true absence of mood congruence effects or power too low on some dimension to detect a true difference between the groups of interest. Low power is suggested because the authors were analyzing a three-way interaction (Mood x Gender x Sentence affect) and their sample sizes were relatively small (N=24 for one version of the task and N=32 for another). Similarly, the consistency of participants’ reactions to the authors’ 12 happy and 12 sad sentences is unclear; if only some happy sentences were interpreted as happy by participants, true differences in responding to happy and sad material may have been obscured.

Valence identification tasks have also been used to explore the time-course of information processing using physiological measurements in non-clinical populations. Diedrich, Neumann, Maier, Becker, and Bartussek (in press) have demonstrated frontal positive slow waves in ERPs in response to positive and negative, but not neutral pictures on the task. Vanderploeg, Brown, and Marsh (1987) conducted a similar task and analyzed the data using principal components analysis. They found greater slow-wave responses (post 500ms) on ERPs to emotional than non-emotional stimuli, but did not find valence-linked responses in earlier components. In contrast, Neumann et al. (1992) used positive, negative, and neutral adjectives. They found differential responses to different valences for P3 (350ms latency) but not for the averaged amplitude in the range of 700 to 1200ms. Possibly the first use of a valence-identification task was reported by Derryberry (1988), in which nondepressed participants signaled whether a word was positive, negative, or neutral following a positive, negative, or neutral warning signal. They received positive, negative, or neutral feedback. Derryberry reported a significant valence by warning by feedback interaction effect with both reaction time and error rate, making it difficult to interpret how subjects would have responded if there were no warning or feedback tones. Simple effects analyses suggested that words were categorized faster based on an incentive to perform that was congruous with their affective tone. Positive words were categorized fastest after feedback involving loss, while negative words were categorized fastest after feedback prompting joy. These results suggest that the task is sensitive to affect-related variables, and thus examining the task in depressed versus nondepressed individuals may yield interesting results.

Only one experiment has directly contrasted results of an affective lexical decision task with those of an affective valence identification task. Siegle et al.’s (1998) study examined both tasks in a population of dysphoric and nondysphoric college students. In this study dysphoric college students were slower to say that positive words were positive than to say that negative words were negative, while the same was not true for nondepressed students. In contrast, they were slower to say that negative words were words than other valences, which was not true for nondepressed students. While it cannot be assumed that these results generalize to clinically depressed individuals (e.g., Coyne, 1994; Coyne & Gotlib, 1983; Gotlib, 1984), they suggest that the tasks could be measuring fundamentally different aspects of information processing in individuals whose mood is disrupted.

The Need for Assessment Using Personally Relevant Information

The previously described experiments all used stimuli chosen from lists of words or pictures that were chosen before participants arrived. Although such stimuli may capture how individuals react to information that most people find negative or positive, the stimuli that depressed individuals encounter and think about regularly may be idiosyncratic to their depressed states. Such stimuli would be relevant only to them. A number of authors (e.g., Dozois & Dobson, 1998; Siegle et al., 1998) have argued that the common practice of using stimuli normed such that a majority of people find them negative, may be too general to tap depressive information processing biases.

The following scenario depicts one situation in which information processing biases would be revealed by presenting personally relevant information and not other types of negative information. It stems from computer simulations of brain areas involved in the processing of affective information (Siegle, 1996; Siegle & Ingram, 1997). Suppose that some depressed individuals have one or a few very negative experiences that they have thought about a great deal. They might thereby learn to think about these particular negative experiences very easily. As connections in the brain become strongest to active areas (Hebb, 1942), connections to mental representations of these negative experiences might become particularly strong. Depressed individuals would therefore tend to think about these personally relevant negative experiences when any related information (e.g., other negative information) was presented. One consequence of such an information processing style could be that depressed individuals would be especially quick to recognize personally relevant negative information, since neurons associated with representing this information have become particularly strongly connected to other neurons that are active in perceiving and recognizing information. In contrast, the same depressed individuals might be slower to recognize other, nonpersonally relevant negative information, because they would immediately associate such incoming stimuli with personally relevant information, rather than their mental representations of the presented stimuli. No matter what the initial stimulus was, depressed individuals would be expected to sustain their attention to their thoughts of personally relevant negative information.
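
The learning mechanism in this scenario can be sketched with a simple Hebbian-style update, in which a connection grows each time the units it joins are active together. This is an illustrative simplification and not the network used in the simulations cited above; the learning rate, the starting weight, the weight ceiling, and the memory labels are all hypothetical.

```python
def hebbian_update(weight, pre, post, rate=0.1, max_weight=1.0):
    """Strengthen a connection in proportion to co-activity, capped at max_weight."""
    return min(max_weight, weight + rate * pre * post)


def rehearse(weight, times, rate=0.1):
    """Apply the update once per rehearsal, with both units fully active."""
    for _ in range(times):
        weight = hebbian_update(weight, 1.0, 1.0, rate)
    return weight


# Connections from a generic "negative" cue to two hypothetical memories.
memories = {
    "rehearsed loss": rehearse(0.1, 20),    # thought about a great deal
    "ordinary setback": rehearse(0.1, 2),   # thought about only briefly
}

# A negative cue activates each memory in proportion to its connection
# strength, so the heavily rehearsed memory comes to mind first.
most_active = max(memories, key=memories.get)
print(most_active)
```

Under these assumptions, any negative cue favors the heavily rehearsed memory, which corresponds to the prediction that depressed individuals associate incoming negative stimuli with personally relevant negative information.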

An experiment that presented such depressed individuals with normed negative words in a recognition task (e.g., a lexical decision task) might thus conclude that depressed individuals were not faster to recognize negative information than nondepressed individuals. An experiment that presented them with personally relevant negative information would, in contrast, find that the depressed individuals were particularly fast to respond to this negative information. Both conditions would allow detection of sustained attention to personally relevant negative information.

The primary lesson from this example is that if depressed individuals think a great deal about particular pieces of negative information, they might respond quickly and efficiently to this negative information, and not other negative information. Similar arguments have been made in research on other psychopathologies, and idiosyncratic stimuli have been adopted in these domains. Stronger effects have been observed when stimuli that are relevant to an individual’s psychopathology were used than when pre-selected stimuli were used in studies of anxiety (e.g., Martin et al., 1991) and PTSD (Riemann & McNally, 1995).

One way to account for possibly different effects of personally relevant and nonpersonally relevant information in information processing tasks is to incorporate both normed and personally relevant stimuli into information processing tasks (e.g., Riemann & McNally, 1995; Segal et al., 1995). This strategy would reveal whether the same depressed individuals respond differently to specific bits of information relevant to their depression and other types of negative information.

Such a finding could have wide theoretical and treatment implications. For example, if differences are observed in the way depressed individuals react to personally relevant and other negative information, clinicians might have better knowledge of whether tuning cognitive exercises to individuals’ particular negative beliefs would be important in cognitive therapy.

Experiments described in the following sections therefore employed both normed word lists, representative of the types of negative information depressed people encounter in their every day environments, and personally relevant negative information, thought to represent issues that are central to their depression. Of particular interest will be whether depressed people are especially quick to recognize personally relevant negative information, and whether they display sustained attention to personally relevant information in response to both personally relevant and non-relevant information.

Using Pupil Dilation to Measure Attention Continuously in Depressed Individuals

The affective lexical decision and valence identification tasks can provide a wealth of information regarding which aspects of negative information depressed people pay attention to. By analyzing signal detection and confusion rates on the tasks, the role of interference in interpretation can be quantified. By analyzing reaction times on the tasks, the ways in which negative information attracts attention or interferes with early attentional processes can be quantified. To effectively analyze the entire time course of attention, e.g., to examine whether depressed people continue to expend cognitive load in response to information after they respond to it, a more continuous measure of cognitive load is needed. Pupil dilation is a strong candidate for such a measure.

Pupil Dilation as a Measure of Cognitive Load

Muscles controlling pupil dilation are innervated by structures essential to both cognitive and affective information processing (Hess, 1972). For example, cognitive activity is often thought to involve activation of the frontal lobes, which project to the midbrain reticular formation, which is, in turn, connected to the oculomotor nuclei. Controlled experiments involving stimulation of the midbrain reticular formation have thus been shown to lead to changes in pupil dilation (Beatty, 1986). Emotional activity is often thought to be mediated by activity in the hypothalamo-thalamo-cortical axis. As such, activity in these structures, and in limbic structures connected to them, has been shown to result in pupil dilation (see Hess, 1972 for a review). Stimulation of the amygdala, in particular, has been shown to increase pupil dilation in cats, dogs, and monkeys (Fernandez de Molina & Hunsberger, 1962; Koikegami & Yoshida, 1953). Pupil dilation therefore provides an interesting measure of activity in the brain areas implicated by LeDoux’s (1997) model.

A number of studies have suggested that pupil dilation reliably indexes cognitive load on attention and memory tasks that tap the types of processes hypothesized to be involved in affective interference tasks (see Beatty, 1982 for a review). For example, Kahneman and Beatty (1966) show that the pupil reliably dilates one tenth of one millimeter in diameter for each digit a research participant is asked to remember in a short-term memory task. Pupils also dilate during performance of cognitively difficult tasks such as multiplication of large numbers (Hess & Polt, 1964). Other studies suggest that pupil dilation varies as a function of effort in perceiving stimuli in visual noise (Hakerem & Sutton, 1966). To show that dilation is specific to the signal detection processes and not other aspects of the task, Hakerem and Sutton (1966) also show that changing features such as the signal to noise ratio do not affect pupil dilation. Beatty (1982) shows that pupil dilation decreases over time as performance decreases on a sustained attention task, which further suggests a reliable relationship between attention and pupil dilation. Finally, Beatty (1980) found that small but reliable changes in pupil dilation occurred on selective attention tasks with respect to stimuli to which participants were supposed to attend, and not to other stimuli. Beatty (1982) also reviews a host of studies demonstrating that task-based changes in pupil size are invariant with respect to a wide range of variation in environmental factors not expected to change overall cognitive load on a task, such as task, state, or trait anxiety, or incentive to perform well. Factors that affect baseline pupil dilation, such as ambient light, can be controlled within an experiment, as Beatty (1982) demonstrates.

Pupil dilation therefore seems to be a particularly sensitive and specific way to measure many of the processes hypothesized to be involved in affective information processing. Ideally, pupil dilation could be used to create indices of initial effort at perceiving a stimulus and sustained attention to the affective valence of a stimulus.

Appropriateness of Using Pupil Dilation to Study Depressed Individuals

Three factors make pupil dilation especially appropriate for studying information processing in people with affective psychopathology. First, pupil dilation has been observed to be greater when individuals are asked to read negative or positive words than neutral words (Vacchiano, Strauss, Ryan, & Hockman, 1968), suggesting that it may index cognitive effort devoted to both affective and semantic processing. More generally, Janisse (1973) concludes that pupil dilation is linearly related to the affective intensity of presented stimuli, though differences in pupil dilation in response to positive and negative stimuli in nondepressed individuals are not reliable across studies throughout the literature.

Second, the differential pupil dilations to affective and neutral words appear to be related to reaction times on recognition tasks. Hutt and Anderson (1967) found a small but statistically significant negative correlation (r=-.19) between reaction times to a recognition task and pupil dilation to a presented stimulus, using the difference between responses to two negative and two neutral words as a dependent measure. These results suggest that longer response latencies for negative words on a recognition task are associated with smaller pupil dilations, and hence less thought regarding negative words. Because this study did not involve depressed individuals, and only two stimuli of each word valence were used, it is difficult to interpret the meaning of this result for depressed individuals who are assumed to think a great deal about negative information. Still, it is encouraging to note that pupil dilation may be somewhat comparable to more traditional indices of cognitive effort for purposes of comparison.

Finally, pupil dilation is non-invasive, and relatively easy to measure. Research participants are not inconvenienced in any way beyond the pressures they might encounter while performing traditional information processing tasks. Unlike other physiological measures that may involve abrasion of skin, attaching electrodes or machines to body parts, or anxiety-provoking pharmacological insults, pupil dilation is not expected to engender any specific affective reaction in research participants. As such, measurement of pupil dilation is not expected to obscure measurement of physiological correlates of affective reactions on an affective lexical decision or valence identification task.

A number of previous studies have examined pupil dilation in depressed versus nondepressed individuals, but none has examined these differences on attentional tasks involving stimuli with varying affective valences. The majority of this research has involved two goals. First, measurement of "resting" or baseline dilation in depressed versus control groups has been used to examine whether depressed individuals evidence greater psychophysiological activity than nondepressed individuals (e.g., Liakos & Crisp, 1971). Results from this research suggest that many subgroups of depressed individuals have larger resting pupil dilations than nondepressed individuals (Liakos & Crisp, 1971). Second, pupil dilation in depressed individuals has been studied as a way of gauging the mechanism of action of various antidepressant medications. Pupil dilation closely reflects sensitivity of the α-1 adrenergic receptor (e.g., Ghose, 1976; Turner, 1975). Thus, to examine whether antidepressants change α-1 adrenergic sensitivity, pupil dilation is often used as a dependent measure (e.g., Muijen et al., 1989; Shur & Checkley, 1982).

One study has examined pupil dilation in depressed and nondepressed individuals while they watched a non-emotional video, and examined the same individuals on a reaction time task with respect to recognition memory for words of different hedonic tone (Deijen, Orlebeke, & Rijsdijk, 1993). The authors found that while depressed individuals were less accurate than controls at the recognition memory task, and took longer than controls to respond, no group by word-type interactions were found. In general, pupil size was the same for members of both groups that watched the movie. The authors did not report pupil dilations with regard to the recognition memory task.

Identification of Sustained Attention with Rumination

The idea that depressed people "ruminate" pervades both clinical reports and the discussion sections of empirical articles describing cognitive mechanisms underlying information processing biases and deficits. Yet, the idea of rumination is rarely defined explicitly and it has often been used to mean different things. For example, Nolen-Hoeksema (1984) suggests that rumination involves thinking excessively about the symptoms of one’s depression. Her research over the past fifteen years, and a host of related research using a measure of rumination she devised, has investigated the construct of rumination from this perspective. In contrast, Ingram’s (1984) theory of depression engenders a broader sense of rumination in which depressed individuals are explained as preserving activation of constructs in a semantic network related to sadness long after a stimulus has been presented. The one feature shared by nearly all conceptions of rumination seems to be the notion of sustained attention to some negative information. It is this broad sense, involving sustained attention and devotion of cognitive resources, in which the term "rumination" will be employed throughout the remainder of this document. Specifically, rumination will be defined as sustained feedback between cognitive and affective processing systems associated with prolonged cognitive load. Implicit in this definition is the idea that rumination happens in response to a stimulus, which could be internal or external. This usage, while different from the way this term is employed in many specific articles, captures the general sense of sustained cognitive and affective processing in which the term is usually used. Moreover, it is consistent with the types of feedback assumed to exist during consideration of emotional material in cognitive theories of emotional information processing (e.g., Ingram, 1984) as well as corresponding physiological theories (LeDoux, 1997).

Sustained pupil dilation after the presentation of a stimulus is assumed to represent sustained attention to information. Sustained pupil dilation will therefore be interpreted as an analog for a broad notion of "rumination". Importantly, the topic of rumination, indexed by sustained pupil dilation, is unclear (i.e., possibly representing something other than focus on symptoms). Similarly, whether this type of rumination is a cognitive process, and whether this type of rumination reflects feedback between particular physiological structures cannot be determined from pupil dilation measures.

The Need for Simulation Before Prediction

The collection of reaction times, signal detection rates, and various indices associated with pupil dilation allows the generation of multitudes of theoretically motivated hypotheses regarding the role of affect in attention. Yet, LeDoux’s model involves complex interactions between highly nonlinear systems. The flow of information through these systems is difficult, if not impossible, to predict just by thinking about the model. Similarly, the flow of information through a network such as Bower’s (1981) model is quite complex; the role of feedback between components of such a model, especially if there is any noise in processing, is notoriously hard to predict (Movellan & McClelland, 1997). Although making predictions regarding the maintenance of depression, or even about how depressed people will behave in experimental settings, is certainly a noble goal, doing so without formal representations of relevant model components could lead to predictions that do not follow from assumptions about the construction of the model, i.e., that would not be internally consistent with the model. Similarly, if a model is not represented formally, people reasoning from the same theoretical model might expect depressed individuals to behave differently in response to negative information.

For example, based on speculation from Bower’s (1981) model, a number of apparently contradictory predictions about how depressed people will behave on an affective lexical decision task have been advanced. Some researchers have suggested that if depressed people have a high level of spreading activation from sadness nodes, recognition of information that is well-linked to these nodes will be facilitated. They therefore predicted that depressed people would respond faster to negative stimuli than to positive or neutral stimuli on tests of attention such as an affective lexical decision task (e.g., Challis & Krane, 1988; MacLeod, Mathews, & Tata, 1986; Matthews & Southall, 1991; Ruiz Caballero & Bermudez Moreno, 1992). Clark, Teasdale, Broadbent, and Martin (1983) suggest that instead,

semantic information about a word and information about an individual’s emotional experience with that word may be stored separately, with the former being accessed more quickly, but the latter being more likely to be activated by a congruent mood state. The quickly accessed semantic and presemantic information could be used for lexical decisions; hence we would not expect a mood-congruency effect on such decisions. (p. 178)

Similar logic may be used to predict not only a lack of congruence, but strong mood incongruence effects on the lexical decision task for some people. If depressive symptomatology is related to a particular event such as a significant loss (Ingram, 1984; Paykel, 1979), it might be hypothesized that only certain personally relevant bits of information, which are associated with this event, will be especially strongly linked to sadness nodes. In this case, presentation of negative information that is not personally relevant might lead to the activation of this personally relevant information. The depressed person might focus on the personally relevant negative information and be slow to identify the lexicality of the presented negative information. Such associations might not occur for positive information, leading to an apparent mood-incongruence effect on the task.

The idea that affectively valenced stimuli may interfere with normal information processing is not particularly new. Research on tasks which are traditionally considered measures of interference (e.g., the affective Stroop task, in which an individual is asked to name the color of an affectively valenced stimulus) has long suggested that affective stimuli interfere with color naming in distressed individuals (see Williams, Mathews, & MacLeod, 1996 for a review). Yet, this research does not suggest what about an affectively valenced stimulus causes the interference. Potentially the depressed person is distracted from the color by strong associations with the semantic content of the stimulus. Alternately, the distraction might occur because the person focuses entirely on the emotion conveyed by the stimulus (i.e., the affective content of the stimulus). The affective interference hypothesis specifically predicts that the affective content of stimuli (i.e., their negativity), rather than semantic content of stimuli, captures the depressed person’s attention.

Computational modeling can provide a formal and internally consistent basis for understanding how negative experiences could impact information processing in models such as Bower’s or LeDoux’s. The strength of computational models stems from their ability to explicitly illustrate interactions between modeled nonlinear components. Of course, computational models of theories such as LeDoux’s or Bower’s notions of affective information processing are only as useful, and have only as much validity, as the theories themselves. Moreover, because computational modeling forces implicit assumptions to be made explicit, some arbitrary decisions are often necessary to implement such vague theories mathematically.

The next section presents a model of possible interactions between the types of components found in Bower’s and LeDoux’s models and thus, allows predictions to be made about how depressed individuals will perform on the affective lexical decision and valence identification tasks. Distinctions between personally relevant and non-personally relevant stimuli are captured within the model. The model also accounts for the entire time-course of attention, to allow prediction of continuous quantities such as pupil dilation. In order to minimize arbitrary decisions in the creation of the model, the assumed function of components in LeDoux’s model is captured rather than including aspects of all relevant brain areas. Empirical parameter estimation was used when possible.

 

II. A NEURAL NETWORK MODEL OF EMOTIONAL INFORMATION PROCESSING
A Short Introduction to Computational Neural Networks

Computer simulations are becoming increasingly popular as a way of understanding behavior, empirical data, and the internal consistency of theories. Computational models a) aid in the translation of theory to a rigorously specified, empirically testable causal model, b) aid in understanding causal processes that lead to behaviors, c) help to generate hypotheses about abnormal processes, d) promote the creation of novel clinical interventions, and e) help to integrate various granularities of research (e.g., physiological and behavioral; Siegle, 1997).

One type of model that has sparked a great deal of interest among the computational community is called a neural network model. Neural network models have become popular for many reasons including a) their predictive power (Sarle, 1994), b) their biological congruity (Cohen & Servan-Schreiber, 1992), c) their ability to handle noisy data (Cohen & Servan-Schreiber, 1992), d) the natural way in which they can be used to model information processing tasks (McClelland, Rumelhart, & Hinton, 1986), and e) their implicit resolution of traditional schisms in psychology such as the mind/body problem (Tryon, 1993). Additionally, neural networks can be used to understand phenomena that are difficult to represent using more traditional symbolic modeling techniques, such as behaviors for which explicit governing rules are not known (Hecht-Nielsen, 1990). The following section discusses some of the basic concepts necessary for understanding neural network models. A number of excellent review articles are available which summarize specific types of network models (Rumelhart, Hinton, & McClelland, 1986), and the mathematics and mechanics of computational simulations of neural networks (e.g., Arbib, 1987; Hecht-Nielsen, 1990).

Neural network models represent information throughout a connected network of individually meaningless units, or nodes. Information is represented in a distributed fashion, as a function of the simultaneous activation of multiple nodes. Each node receives "activation" from other nodes to which it is connected in response to the activation of these nodes. The node then sends a transfer function of the activations coming into it to other nodes to which it is connected, much as a biological neuron sends a function of activations from its dendrites to other neurons through its axons. For example, a collection of nodes, when activated together, may be said to represent a concept such as "sun". Activation of these nodes may spread to another collection of nodes which, together, represent the concept "moon". This process might be interpreted as a computational analog of a mental association between sun and moon. The pattern of connections between nodes, or the network’s "architecture", governs what types of information may be represented with the network and the manner in which nodes may activate each other.
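
The spread of activation from one collection of nodes to another can be illustrated with a minimal sketch. It assumes a logistic transfer function, a fixed bias, and hand-picked connection strengths; the function names and all numeric values are hypothetical and are not taken from the model described later in this chapter.

```python
import math

def logistic(x):
    """Squash summed input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def propagate(activations, weights, bias=-2.0):
    """Return the activation of each receiving node.

    weights[j][i] is the connection strength from sending node i to
    receiving node j. The bias and weights are illustrative assumptions.
    """
    return [logistic(sum(w * a for w, a in zip(row, activations)) + bias)
            for row in weights]

# Two sending nodes that jointly stand for "sun" (hypothetical coding).
sun = [1.0, 1.0]
# Receiving node 0 is strongly linked to both "sun" nodes; node 1 is not.
weights = [[3.0, 3.0],
           [0.1, 0.1]]
moon = propagate(sun, weights)
```

With these assumed weights, the strongly connected receiving node becomes highly active when both "sun" nodes are active and stays near its resting level when they are silent, which is the sense in which one pattern can be said to evoke another.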

By strategically modifying the strengths of connections between nodes, the network can be made to produce a specific set of activations in response to another set of activations. This process has been likened to making the network "learn" an association of a stimulus with a response. Numerous procedures for allowing a network to learn associations in this fashion have been proposed (e.g., Fahlman & Lebiere, 1991; Rumelhart, Hinton, & Williams, 1986). The most basic type of learning in neural networks follows Hebb’s (1949) rule that when two neurons are active at the same time, the strength of the synapse between them increases. To capture this law in a computational neural network, connections between nodes active in corresponding inputs and outputs can be increased as a way of "training" the network. Another common training procedure called "back-propagation" (Rumelhart, Hinton, & Williams, 1986) is devoted to minimizing discrepancies (error) between the response, or output, of a network and its expected response to stimuli. By strategically modifying connection strengths so that the network is more likely to produce expected responses to the presentation of stimuli, the network can be said to learn associations between stimuli and responses. This conception is intuitive for an information processing system, in that learning is often considered a process by which people gradually become more efficient, e.g., making fewer and fewer mistakes in evaluating information and attending to information in an adaptive manner.
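
The two styles of learning described above can be contrasted in a short sketch. The first function is a plain Hebbian update. The second is the delta rule for a single linear layer, which is used here only as a simplified stand-in for error-correcting procedures such as back-propagation; it is not back-propagation itself, which also passes error through hidden layers. The function names, learning rates, and the two-node stimulus and response are assumptions made for illustration.

```python
def hebb_update(weights, pre, post, rate=0.1):
    """Strengthen weights[j][i] in proportion to co-activation of pre[i] and post[j]."""
    return [[w + rate * post_j * pre_i for w, pre_i in zip(row, pre)]
            for row, post_j in zip(weights, post)]

def delta_update(weights, pre, target, rate=0.1):
    """One error-correcting step for a linear, single-layer network.

    Returns the adjusted weights and the squared error measured
    before the adjustment.
    """
    output = [sum(w * p for w, p in zip(row, pre)) for row in weights]
    error = [t - o for t, o in zip(target, output)]
    new_weights = [[w + rate * e * p for w, p in zip(row, pre)]
                   for row, e in zip(weights, error)]
    return new_weights, sum(e * e for e in error)

stimulus = [1.0, 0.0]   # hypothetical input pattern
response = [0.0, 1.0]   # hypothetical expected output pattern

# Hebbian learning: only the connection between co-active nodes grows.
hebbian = hebb_update([[0.0, 0.0], [0.0, 0.0]], stimulus, response)

# Error correction: repeated presentations shrink the squared error.
w = [[0.0, 0.0], [0.0, 0.0]]
errors = []
for _ in range(50):
    w, squared_error = delta_update(w, stimulus, response, rate=0.5)
    errors.append(squared_error)
```

In the Hebbian case the weight keeps growing for as long as the pair is presented, whereas the error-correcting weight stops changing once the response is produced correctly. That difference matters later in this chapter, where prolonged training on a single stimulus is used as an analog of depressive experience.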

A neural network’s behavior can be evaluated on a number of dimensions, reflecting processes that the network is designed to simulate. For example, if the network is designed to simulate performance on some information processing task, associations made by the network could be compared to associations made by humans to a stimulus; the frequency of the network’s erroneous associations could be measured as an analog of human error rates. Similarly, a network may take a number of associative steps to settle on a learned association. The number of associative processing cycles or "epochs" the network needs to associate a stimulus with a particular response can be examined as an analog of reaction time.

As may be inferred from this discussion, connectionist models are generally associated with a number of parameters that can be varied, such as the relative strengths of connections within the network, the rate at which the network can "learn", and the number of nodes devoted to representing types of information within the network. Different values for these parameters can be used to make the network respond in different ways to similar stimuli. Manipulating network parameters may thus be seen as one way of approximating individual differences. Often, changing network parameters is one way in which neural networks are made to function in a manner associated with psychopathology (see Siegle, 1998b, for a review).

At the same time, the plethora of parameters available to the neural network modeler can lead to limitations in the validity of models. Not every parameter in a neural network has clear correlates to biological or cognitive phenomena. Similarly, parameter values that do have correspondences to human variables are not necessarily independent or interdependent in the same way as the analogous human variables. Thus, some level of arbitrariness is often present in the development of models. This arbitrariness can lead to the use of apparently different sets of parameters in simulations that nominally represent the same phenomenon. In fact, different sets of parameters, and slightly different models developed over the last five years were used to create the different simulations discussed in the following sections. This practice could be construed as problematic in that a consistent set of parameters was not used to make a consistent set of predictions.

Alternately, the models can be considered to be advanced tools for hypothesis generation, in which sets of differential equations help to formalize intuitions about the behavior of groups and to generate predictions about ways these groups may behave under certain conditions. In this sense, variation in parameters that do not clearly correspond to aspects of human functioning may, in fact, help to contribute to making robust predictions based on the central and invariant behaviors of the models. Variation in these model parameters could, perhaps, be considered an analog for individual differences, in which factors not accounted for by models of group behavior contribute to variation in individuals’ behavior. It is therefore suggested that the models’ behaviors should, at most, be interpreted as interesting ways to understand aspects of behavior, one at a time, that allow predictions about aspects of the behavior of possibly different, and variable, depressed and nondepressed individuals.

Why Computer Simulations of Neural Networks are Particularly Appropriate for Investigating Depression

Computational models, specifically neural networks, are especially useful for understanding attention in depression because they are natural extensions of the semantic networks on which Bower’s (1981) network theory is based (Anderson, 1990; Blank, Meeden, & Marshall, 1991; Hinton, 1991; Yates & Nasby, 1993). For example, Yates and Nasby (1993) identify six fundamental assumptions common to semantic and connectionist models of associative memory: 1) Both types of models use nodes representing aspects of propositional knowledge, 2) Propositions are associated at the time of encoding, 3) Information in the network becomes conscious when it is activated above some threshold, 4) Environmental stimuli activate some nodes in the network, 5) Activation can spread in varying degrees, and 6) Consciousness may be understood as involving the activation of nodes. Neural networks also offer a number of advantages over semantic networks. These include the ability to handle incomplete information as inputs, the ability to demonstrate emergent rule-like behavior, and the ability to demonstrate competition for activation of representations of constructs (Barnden, 1995). Their biological congruity allows an intuitive representation of LeDoux’s (1997) model to be integrated with the semantic network approach.

More practically, many types of experiments regarding depression can be performed using a computational neural network model that it would be unethical or difficult to perform on a depressed human. For example, it would often be desirable to test theories of depression by progressively depressing a person, bit by bit, until he or she were on the brink of complete despair. At each step of the induction, the person could be assessed on a number of variables to ascertain whether the person’s behavior follows that predicted by a theory. Such an experiment would, of course, be unethical. As a substitute, if a computational model of analogs of processes thought to operate in depression could be created, the computer’s behavior on some task could be analyzed at various levels of operation of these processes with relatively few ethical complications.

Similarly, experiments aimed at identifying factors that may lead to depression can be examined within a neural network. By exposing the network to different types of stimuli, and allowing it to "learn" from them, the effects of different types of stimuli can be examined with regard to subsequent information processing biases in the network which are analogous to those found in depressed individuals. Such experimentation is possible because the idea of learning and representational change in neural networks has been well explored in the past. This style of experimentation would be difficult to perform in a semantic network approach in which there is no established mechanism for allowing the network to learn in a way which is thought to be analogous to human learning.

Neural network models of a number of psychopathologies have been used to make theoretical advances in the understanding of information processing in disorders including anxiety, schizophrenia, and bipolar disorder (see Siegle, 1998b, for a review). Neural network models of aspects of unipolar depression have been used with great success by researchers attempting to represent memory processes in depression (Williams & Oaksford, 1992), patterns of symptom remission (Luciano, 1997; Park, 1998), and attentional interference. The following sections augment this literature by presenting a computational neural network model that embodies the essential features of Bower’s (1981) and LeDoux’s (1997) models of emotional processing. Implications for understanding the neuropsychology of information processing biases in depression are drawn from the model.

A Computational Framework for Investigating Affective Information Processing

Siegle and colleagues (Siegle, 1999a; Siegle, Ingram, Granholm, & Matt, 1998; Siegle, 1996; Siegle & Ingram, 1997a,b; Siegle, Ingram, & Matt, 1995) have developed a computational neural network model of affective information processing that embodies the essential features of Bower’s (1981) and LeDoux’s (1997) notions of emotional information processing. The model can be used to make predictions regarding the nature and time course of emotional information processing in depression. Implications of these predictions for depressive information processing as well as vulnerability to depression and recovery from depression are discussed elsewhere (e.g., Siegle & Ingram, 1997a). The model is shown in Figure 1. In the figure, each small circle corresponds to an individual node. A large ellipse corresponds to a cluster of nodes that perform the same conceptual function.
 
 

In this model, nine nodes representing perceptual characteristics of stimuli are fully connected to, and feed activation forward to nine nodes representing the nonaffective features (e.g., semantic content) of stimuli and two nodes representing the affective content of stimuli, in parallel. Feedback occurs between the nodes representing affective and nonaffective features. Both nonaffective and affective feature nodes feed activation forward to 12 nodes representing the network’s outputs (one for each semantic concept and one for each valence). Additionally, inputs from task units, representing the context in which the stimulus is to be interpreted (either as a lexical decision or valence identification) feed to the output nodes. These nodes are represented on the right side of Figure 1 to note that they are an internal cognitive phenomenon, rather than perceptual inputs. Summative indices are kept representing the squared difference in activation between the output nodes and each learned output, so that the network can eventually be considered to have settled (i.e., decided) on an output, based on its closeness to some learned pattern. The nonaffective nodes are intended to roughly correspond to LeDoux’s notion of hippocampal system processing. The affective nodes are intended to correspond to LeDoux’s notion of amygdala system processing. The feedback between them captures LeDoux’s notion of feedback between these brain areas, as well as Ingram’s (1984) notion of feedback between mental representations of affective and semantic features. The summative output indices represent products of decision processes assumed to occur in the frontal lobes. Perceptual, semantic, and output features of the network are coded as normalized vectors in which one field is more highly activated than the others. Thus, nine perceptual and nonaffective features were used so that nine different stimuli could be represented within the network.
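
The layer sizes and the summative settling index described above can be made concrete with a brief sketch. The layer sizes come from the description in the text; everything else is an assumption made for illustration, including the helper names, the background activation level of 0.1, and the way the twelve output nodes are coded. The model's actual equations are given in Appendix A.

```python
# Layer sizes as described in the text.
N_INPUT, N_SEMANTIC, N_VALENCE, N_OUTPUT = 9, 9, 2, 12

def one_hot(index, size, high=1.0, low=0.0):
    """Unit-length vector in which one field is more active than the rest."""
    raw = [high if i == index else low for i in range(size)]
    norm = sum(v * v for v in raw) ** 0.5
    return [v / norm for v in raw]

def squared_distance(a, b):
    """Summed squared difference in activation between two patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def closest_pattern(output, learned):
    """Return the index of the nearest learned pattern and all distances."""
    distances = [squared_distance(output, p) for p in learned]
    best = min(range(len(learned)), key=distances.__getitem__)
    return best, distances

# Nine hypothetical learned output patterns, one per stimulus.
learned = [one_hot(i, N_OUTPUT, low=0.1) for i in range(N_SEMANTIC)]

# An output state close to, but not identical with, learned pattern 3.
output = list(learned[3])
output[0] += 0.05
best, distances = closest_pattern(output, learned)
```

Tracking one distance per learned pattern, rather than a single winner, is what lets the network be described as gradually settling on an output over time.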

The recognition of a presented stimulus in the network is thus the result of a diffusion process, as suggested by Ratcliff (1978). Conceptually, this diffusion is illustrated in Figure 2, and might proceed as follows. At the beginning of a simulation, activations of the input nodes are set to predetermined values representing a stimulus, subject to some perceptual noise (Figure 2A). As activation from the input units feeds to the affective and nonaffective units over time, a pattern of activation in the semantic units would be formed, corresponding to some non-emotional features the network has learned (e.g., if the stimulus was "death", the notion of the end of life might be retrieved). At the same time, the positive and negative valence units would take on activations suggesting that the stimulus is either positive or negative (Figure 2B). Feedback between the affective and nonaffective units might lead the network to change the pattern of activations in the nonaffective units, suggesting that the network has associated a different set of non-emotional features with the stimulus. Similarly, the feedback could lead to a different pattern of activations occurring in the valence units, suggesting, for example, that the network originally perceived the stimulus as positive, but now perceives it as negative. During this process the output units are becoming active in proportion to activation of the nonaffective and affective nodes, with additional contextual information from the task units (Figure 2C). The fit of these activations to each stored pattern is evaluated simultaneously. Eventually, when an overwhelming proportion of evidence for one output pattern is accumulated, the network can be said to have "recognized" the stimulus as that pattern. This event would correspond to a person having recognized a stimulus, i.e., having recognized its non-emotional features, and having assigned it an affective valence.
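
The idea of accumulating evidence until one pattern clearly dominates can be sketched with a simple noisy accumulator. This is offered in the spirit of a diffusion process only; it is neither Ratcliff's (1978) model nor the network detailed in Appendix A. The function name, the decision threshold, the noise level, and the per-cycle drift values are all hypothetical.

```python
import random

def settle(drift, threshold=1.0, noise_sd=0.1, max_cycles=1000, seed=0):
    """Accumulate noisy evidence for each candidate pattern until one wins.

    drift[k] is the average evidence gained per cycle for pattern k.
    Returns (index of the recognized pattern, cycles taken), or
    (None, max_cycles) if no pattern ever leads by `threshold`.
    """
    rng = random.Random(seed)
    evidence = [0.0] * len(drift)
    for cycle in range(1, max_cycles + 1):
        for k, d in enumerate(drift):
            noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
            evidence[k] += d + noise
        leader = max(range(len(evidence)), key=evidence.__getitem__)
        runner_up = max(e for i, e in enumerate(evidence) if i != leader)
        if evidence[leader] - runner_up >= threshold:
            return leader, cycle
    return None, max_cycles

# Noise-free runs: stronger evidence settles in fewer cycles.
fast = settle([0.25, 0.0], noise_sd=0.0)
slow = settle([0.125, 0.0], noise_sd=0.0)
# A noisy run; the seed makes it repeatable.
noisy = settle([0.05, 0.01], noise_sd=0.1, seed=1)
```

The number of cycles returned plays the role of a reaction time, and a run that ends on a pattern other than the one presented would count as an error.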

 

The model is thus one of the time course of attention to a presented stimulus. To the extent that the model is valid, it provides information about how relationships between emotional and non-emotional aspects of information could influence attention, and can provide insight into what aspects of an emotional stimulus a person might pay attention to, both before and after the stimulus is recognized. By training the network in different ways, hypothesized effects of different experiences on emotional information processing can be examined. Technical details of the model are presented in Appendix A. A few intuitions that are especially relevant to examining the model are discussed here.

Representation of Valence

Positive and negative valences are often thought of as orthogonal. To represent positivity and negativity orthogonally, two nodes could be used. High activation of one node represents positive information while high activation of the other node represents negative information. Low activation of both nodes represents neutral information.
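
A minimal sketch of this two-node coding follows. It uses idealized activations of one and zero rather than the empirically derived values described in the next paragraph, and the read-out function and its threshold are hypothetical conveniences rather than part of the model.

```python
# Idealized (positive node, negative node) activations for each valence.
VALENCE_CODES = {
    "positive": (1.0, 0.0),
    "negative": (0.0, 1.0),
    "neutral": (0.0, 0.0),
}

def dot(a, b):
    """Dot product; zero means the two codes are orthogonal."""
    return sum(x * y for x, y in zip(a, b))

def read_valence(positive_act, negative_act, threshold=0.5):
    """Label a pair of valence-node activations (threshold is an assumption)."""
    if max(positive_act, negative_act) < threshold:
        return "neutral"
    return "positive" if positive_act >= negative_act else "negative"

orthogonal = dot(VALENCE_CODES["positive"], VALENCE_CODES["negative"]) == 0.0
```

Because the two nodes vary independently, this coding can also represent a stimulus that is somewhat positive and somewhat negative at once, which a single bipolar node could not.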

The validity of using a near orthogonal representation of positivity and negativity can be supported empirically. Williams, et al. (1998) had 600 undergraduates rate the positivity and negativity of words that had been previously categorized as positive, negative, or neutral. They found that positive words were generally rated as somewhat positive and not negative. Negative words were rated as less positive and more negative. Neutral words were rated as slightly positive but lacking in negativity.3 To best approximate empirical relationships between the valences, activation of the valence units in the network was made proportional to Williams et al.’s (1998) means.

Simulation of Normal and Depressed Experiences in the Model

Following the idea that nondepressed individuals are exposed to a variety of positive, negative, and neutral information, an analog of normal experience was induced in the network by training it on equal numbers of positive, negative, and neutral exemplars. Many theorists suggest that the onset of depression can involve one or a few pervasive negative life events or loss experiences (e.g., Beck, 1974; Paykel, 1979) that are continuously thought about. This process was operationalized in the neural network model by training the network on a single negative stimulus for a prolonged period after it had been trained on equal numbers of positive, negative, and neutral stimuli. No claim is made here that nondepressed people actually have equal numbers of positive, negative, and neutral experiences. Rather, the important claim is that the ratio of training on negative and non-negative exemplars is different for depressed and nondepressed individuals. Even rather large changes in the amount of training on each valence given to the network that represented nondepressed learning did not change the network’s behavior qualitatively. Additionally, no claim is made that all depressions involve this sort of overlearning of a few negative experiences. Depressions brought on through accumulation of many less intense negative life experiences would be approximated in the model by overtraining it on many negative exemplars (Siegle, 1994). Purely biologically mediated depressed states would have to be accounted for by other vehicles than biased training on environmental stimuli. Rather, the current model attempts to simulate processes operating in a specific type of cognitively mediated depression.
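
The difference between the two training regimes can be expressed as a schedule of stimulus presentations. The sketch below assumes nine exemplars split evenly across the three valences, ten balanced passes, and thirty additional presentations of one negative exemplar; the stimulus labels and all of these counts are hypothetical and chosen only to show the change in ratio.

```python
from collections import Counter

# Nine hypothetical exemplars, three of each valence.
EXEMPLARS = [(f"{valence}_{i}", valence)
             for valence in ("positive", "negative", "neutral")
             for i in range(3)]
VALENCE_OF = dict(EXEMPLARS)

def training_schedule(exemplars, epochs, overtrained=None, extra=0):
    """Balanced passes over all exemplars, then optional overtraining on one."""
    schedule = [name for _ in range(epochs) for name, _ in exemplars]
    if overtrained is not None:
        schedule += [overtrained] * extra
    return schedule

def negative_share(schedule):
    """Proportion of presentations that involve a negative exemplar."""
    counts = Counter(VALENCE_OF[name] for name in schedule)
    return counts["negative"] / len(schedule)

normal = training_schedule(EXEMPLARS, epochs=10)
depressed = training_schedule(EXEMPLARS, epochs=10,
                              overtrained="negative_0", extra=30)
normal_share = negative_share(normal)
depressed_share = negative_share(depressed)
```

Only the ratio differs between the two schedules. The stimuli, the network, and the learning rule stay the same, which is what allows later differences in the network's behavior to be attributed to the overtraining.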

A number of different algorithms for allowing the network to "learn" initial exemplars, as well as negative information, have been explored. In initial descriptions of the network (Siegle, 1996; Siegle & Ingram, 1997a, b), a back-propagation learning algorithm was used. This algorithm treats learning as an error correction process, in which connections within the network are adjusted to minimize the error between the network’s actual outputs and its expected outputs. This procedure is widely used throughout the neural modeling literature, but can be criticized on two grounds. First, there is little evidence suggesting that back-propagation is a biologically plausible learning rule (Jobe, Fitchner, Port, & Gavira, 1995). Second, given the small number of exemplars on which the current network is trained, unless a great deal of noise is included in the network, it learns all training exemplars perfectly very quickly. Thus, overtraining modifies weights in the network very little. As such, it is difficult to simulate the effects of various levels of overtraining without a great deal of noise. To overcome these barriers, new simulations conducted here were done using a Hebb learning rule by which connections between nodes that were active at the same time (e.g., affective and non-affective feature nodes) were strengthened. The resulting network’s behavior is qualitatively similar to that of the back-propagation trained network. For these simulations, noise could be considerably reduced. Details of the Hebb and back-propagation trained networks are presented separately in Appendix A.
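The Hebb rule described above can be illustrated with a minimal sketch. It is not the implementation detailed in Appendix A: the function name `hebb_update`, the learning rate, and the multiplicative `retain` term (used here as one simple way to let old associations fade) are assumptions made only for illustration.

```python
def hebb_update(weights, pre, post, learning_rate=0.1, retain=0.9):
    """Return a new weight matrix after one Hebbian step.

    weights[i][j] links input unit j (pre) to output unit i (post).
    Each weight is first shrunk by `retain` (an assumed form of forgetting)
    and then strengthened in proportion to the co-activation post[i] * pre[j].
    """
    return [
        [retain * w_ij + learning_rate * post[i] * pre[j]
         for j, w_ij in enumerate(row)]
        for i, row in enumerate(weights)
    ]


# Hypothetical toy network: two nonaffective feature nodes, two valence nodes.
weights = [[0.0, 0.0], [0.0, 0.0]]
semantic = [1.0, 0.0]  # only the first feature node is active
valence = [0.0, 1.0]   # only the second valence node is active

for _ in range(5):     # repeated presentation of the same pairing
    weights = hebb_update(weights, semantic, valence)
```

Only the connection between the two co-active nodes grows; repeated presentation of one pairing is a toy analog of overtraining on a single exemplar.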

Simulating Attention and Recognition Processes

The network’s activation of nodes in the nonaffective layer can be thought of as corresponding to the process of recognizing non-emotional features of a stimulus. The activation of valence nodes can be thought of as the process of recognizing emotional aspects of a stimulus. To understand the time course of attention in the network, activation of these areas can be compared for positive, negative, or neutral information. The sum of activations throughout the network (i.e., the network’s "energy", a common measure of processing in neural networks) is thus a rough estimate of cognitive load (i.e., activation throughout the brain), and hence, may be expected to correspond to pupil dilation.
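As a rough sketch of the energy measure just described, the sum of activations can be computed at each processing cycle to yield a waveform comparable to a pupil dilation trace. The function names and the activation values below are hypothetical and are not output from the model.

```python
def network_energy(nonaffective, valence):
    """Sum of activations across both layers at one processing cycle."""
    return sum(nonaffective) + sum(valence)


def energy_time_course(history):
    """history holds one (nonaffective, valence) pair of activation lists per cycle."""
    return [network_energy(n, v) for n, v in history]


# Hypothetical activations for three cycles: a rise followed by decay.
history = [
    ([0.25, 0.0], [0.25, 0.0]),
    ([0.75, 0.25], [0.5, 0.0]),
    ([0.5, 0.0], [0.25, 0.0]),
]
simulated_dilation = energy_time_course(history)
peak_cycle = simulated_dilation.index(max(simulated_dilation)) + 1
```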

There is a great deal of debate regarding how to obtain an analog of reaction time from a neural network model (Bullinaria, 1994). In general, though, reaction times are thought of as the culmination of a diffusion process, in which evidence is accumulated for various possible responses, until one reaches some threshold. To capture that intuition, the network can be said to make a lexical decision when evidence for some learned pattern in the nonaffective layer reaches an arbitrary threshold. Similarly, the network can be judged to have identified the valence associated with a stimulus when activation in one of the valence units reaches some threshold. Error and confusion rates can be estimated by examining whether the pattern of activation in relevant sections of the network, at its simulated reaction time, is closer to the expected pattern, than other erroneous patterns of activation.
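The threshold idea above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the reported simulations: the function names, the threshold value, the evidence trace, and the use of squared Euclidean distance to decide which pattern is "closer" are all hypothetical.

```python
def simulated_reaction_time(evidence_by_cycle, threshold):
    """Return (cycle, unit_index) for the first unit whose evidence reaches
    threshold, or None if no unit reaches it."""
    for cycle, evidence in enumerate(evidence_by_cycle, start=1):
        best = max(range(len(evidence)), key=lambda k: evidence[k])
        if evidence[best] >= threshold:
            return cycle, best
    return None


def nearest_pattern(activation, patterns):
    """Name of the stored pattern closest to the current activation."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(patterns, key=lambda name: sq_dist(activation, patterns[name]))


# Hypothetical evidence for [positive, negative, neutral] over three cycles.
trace = [[0.1, 0.0, 0.05], [0.3, 0.1, 0.1], [0.6, 0.2, 0.1]]
patterns = {"positive": [1, 0, 0], "negative": [0, 1, 0], "neutral": [0, 0, 1]}

rt = simulated_reaction_time(trace, threshold=0.5)
response = nearest_pattern(trace[rt[0] - 1], patterns)
is_error = response != "positive"  # the presented stimulus is assumed positive
```

A response counts as an error or confusion when the nearest pattern at the simulated reaction time is not the expected one.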

Reproducing Valence Rating Data

To check that the processes used to represent valence in the network, as well as to induce an analog of depression in the network, were effective, an analog of the valence-rating procedure can be used. The network was presented with each stimulus on which it was trained, for 200 processing cycles (long enough to reach asymptotic activations in the valence units). The activation values for valence units representing positivity and negativity were recorded as an analog of ratings of how positive and how negative each stimulus was. These values were scaled to be equivalent to the 1 (low emotionality) to 5 (high emotionality) ratings used in Williams et al.’s (1998) study. The median scaled valence ratings for stimuli of each valence for 10 trials are shown in Table 2. For each valence, ratings for all three stimuli were within one tenth of a point. As expected, positive stimuli were primarily positive. Negative stimuli were somewhat positive. Neutral stimuli were more positive.*
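One way such a scaling could be carried out is sketched below. The exact scaling used for Table 2 is not specified here, so the linear mapping, the assumption that asymptotic activations fall between 0 and 1, and the activation values themselves are hypothetical.

```python
from statistics import median


def activation_to_rating(activation, low=0.0, high=1.0):
    """Linearly map an activation in [low, high] onto the 1 to 5 rating scale.

    Values outside [low, high] are clipped first (an assumption).
    """
    clipped = min(max(activation, low), high)
    return 1.0 + 4.0 * (clipped - low) / (high - low)


# Hypothetical asymptotic activations of one valence unit over 10 trials.
trials = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47, 0.50, 0.50]
median_rating = median(activation_to_rating(a) for a in trials)
```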

Whether the overtraining procedure behaved as expected can also be tested using this method. Williams et al. (1998) found that dysphoric college students generally rated all stimuli more negatively and less positively than nondysphoric individuals. Inspection showed that the rate of forgetting in the Hebb network governs the magnitude of resulting weights after overlearning. With too little forgetting, all ratings go up after overtraining (i.e., too much positivity for all stimuli). With too much forgetting, previously learned stimuli are no longer recognized after overlearning. The desired effect was obtained for a minimum forgetting rate of .89, which was therefore adopted for subsequent simulations. Table 2 also presents the simulated valence ratings for the network, overtrained five times, with a forgetting rate of .89. In each case, ratings are more negative for the overtrained network. While ratings for the positive information are lower on positivity than in the original network, ratings for neutral stimuli are similar, and positivity ratings for negative stimuli are higher in the overtrained network.

Table 2.

Median valence ratings for each stimulus from 10 simulated rating sessions. Values are scaled between 1 (low emotionality) and 5 (high emotionality).
 
                                 Nonovertrained            Overtrained (5 epochs)
Stimulus valence                 Positivity   Negativity   Positivity   Negativity
Positive                         3.6          1.4          2.5          1.9
Negative                         2.2          2.3          1.9          3.1
Negative – personally relevant   2.2          2.3          5.0          5.0
Neutral                          2.7          1.1          2.0          1.9
 

Use of the Network to Predict the Time Course of Attention to Emotional Information

Predictions for Reaction Times

The neural network model described here can be used to understand the course of attention expected to emerge from affective lexical decision and valence identification tasks. Earlier simulations with similar back-propagation networks (e.g., Siegle & Ingram, 1997a; Siegle, 1996) have shown that after overtraining, the network tends to associate any incoming information with the negative stimulus on which it has been overtrained. If the network’s behaviors are analogous to human behaviors, then when any stimuli are presented, depressed individuals may think of personally relevant negative information, since this information is most closely associated with their depression. Effects of this feedback on reaction times are presented for simulations done by Siegle and Ingram (1997a) in Figure 3.4 The horizontal axis represents valence. The vertical axis represents the network’s simulated reaction time, in processing epochs. The network’s performance on each task, for each level of overtraining, is represented by a separate marker.

Because any incoming information is most easily associated with a negative valence, the overtrained network responds more slowly than the non-overtrained network to positive information, and more quickly to negative information on the simulated valence-identification task. The network’s behavior suggests that depressed individuals will be slow to report that positive words are positive, but quick to report that negative words are negative, on a valence identification task. If neutral decisions are made in the same way, these would also be biased. Alternately, Siegle and Ingram (1997a) suggest that neutral decisions may be exclusionary, made when neither a positive nor a negative decision is reached after a variable temporal threshold. If depressed and nondepressed individuals have the same threshold for neutral decisions, neutral decision making would not be biased. On a simulated lexical decision task, the network is slowest to make associations with negative stimuli on which it is not overtrained because the representation of negative information on which it was overtrained quickly becomes activated and competes for activation. It is fastest at making associations with the negative stimuli on which it is overtrained. The network’s behavior thus suggests that depressed individuals will be slow to say that negative non-personally-relevant words are words, because they will be reminded so strongly of personally relevant information that they will not immediately respond to the task. In contrast, depressed individuals are expected to be especially fast at responding to negative personally-relevant words on a lexical decision task.

The network’s reaction times lead to predictions regarding neuropsychological correlates of depressive information processing biases. Biases observed in the network happen as a function of feedback between structures responsible for identifying affective and semantic features of information. In humans, increased connections between structures responsible for cognitive and affective processing are presumed to lead to such biases. Clinically, stronger connections between the semantic processing system, and the mental representation of negativity in the affective processing system might lead depressed individuals to interpret any negative information in light of whatever has made them depressed, a phenomenon clinical researchers have discussed for a number of years (e.g., Teasdale & Barnard, 1993). Based on the model’s performance, depressed individuals are also expected to have difficulty processing positive information.

A second set of predictions regarding reaction times involves the role of rumination. Siegle and Ingram (1997a) have suggested that depressive rumination, the idea that depressed people think excessively about negative information, may involve feedback between mental representations of affective and semantic information. They show that information processing biases on the valence identification task tend to increase when feedback between the network’s affective and semantic representations is increased throughout the network’s training. This type of rumination may represent a coping style in which individuals think excessively about emotional information throughout their lives. Such a finding would be in keeping with clinical observations that depressed individuals think about the sad state of their lives and the future a great deal (e.g., Beck, 1967), and statements by theoreticians that repeated thoughts on these topics might maintain depression (e.g., Teasdale, 1988). In the current framework, thinking about personally relevant negative experiences a great deal would make these experiences particularly well learned. Mental representations of many concepts would become associated with these pieces of negative information. As a result, thoughts of any information could lead to thoughts of personally relevant negative information.

In contrast, Siegle and Ingram (1997a) have shown that when excessive feedback, representative of ruminative coping, is invoked only during overtraining, information processing biases on the valence identification task decrease. This situation may represent individuals who think intensely about negative events for only a short time after they occur, as a way of dealing with them. The decrease in biases comes because the would-be stressor is immediately associated with already-learned information, and thus little new learning of the stressful information takes place using a back-propagation learning rule.5 Siegle and Ingram (1997a) suggest that such a coping strategy may be protective against initially getting depressed, but may also hinder recovery from depression, after its onset.

Predictions for Error Rates

Siegle (1996) showed that a variant of the computational model tends to make the most valence confusions (i.e., responding that a word of one valence, e.g., positive, has a different valence, e.g., negative) on neutral words on the valence identification task, followed by positive, and then negative words. Because the model associates most information with negative information, the model rarely misidentifies negative information. Because the model’s representation of neutrality is close to that for negativity, it tends to associate neutral information with negative information sometimes, and positive with negative information less frequently. Siegle and Ingram (1997b) have used a variant of the model to explain the occurrence of individuals who are "too depressed" to complete information processing tasks, or who make excessive errors on these tasks. Towards this end they have shown that errors on the tasks tend to increase with overtraining, approaching results similar to those documented by Siegle (1996). The predicted error rates for various levels of overtraining, based on Siegle and Ingram’s (1997b) model, are presented in Figure 4. The x axis represents the amount of overtraining in a back-propagation trained model. The y axis for the left panel represents the proportion of mistaken attributions (e.g., seeing a word, and judging it as another word, or nonword) for each valence on the lexical decision task. The y axis for the right panel represents the rate of valence confusions for each valence. The model never made a valence confusion for negative words. Thus, no line representing errors for negative words appears on the graph. As shown in the graphs, error and confusion rates all increase with overtraining. The greatest number of misattributions occur for negative words; after overtraining many negative words are recognized as personally relevant negative words. Similar misattributions happen, to a lesser degree, for neutral and positive exemplars. Similarly, positive and neutral words are also often labeled as negative.

As with predictions for reaction times, an implication of these results is that depressed individuals will rarely make valence confusions in perceiving negative information. In contrast they may make many mistakes in valence identification, and thus have difficulty processing nonnegative information. They may tend to see even positive information as negative. This type of bias may help to maintain depressive affect, as few environmental stimuli would appear positive.


Predictions for Pupil Dilation

To make predictions for pupil dilation, it is useful to examine the expected cognitive activity of the network in response to affective information before and after a moderate amount of overtraining. The network responds qualitatively similarly to stimuli of all valences when each task is simulated. As an example of this pattern of responding Figure 5 presents the network’s behavior in response to a positive stimulus on the valence identification task. The top sub-figure represents the network’s behavior before overtraining. The bottom sub-figure represents the network’s behavior after 4 epochs of overtraining on a single negative stimulus. The x-axis in each panel represents time, and the y-axis represents activation. In each of the sub-figures, the top left panel represents the activation of the network’s nonaffective features. The activation of each of the nine trained exemplars is represented by a single line in the panel. The top right panel is the activation of the network’s affective features. Solid lines are negative, and dotted lines are positive. The bottom left panel is the accumulation of evidence for a given valence or semantic exemplar. In representing output from the valence units, dashed lines are neutral, dotted lines are positive, and solid lines are negative. On the bottom right is the sum of activations for these layers, used as an analog of pupil dilation.

As shown in the figure, before overtraining, the network generally responds to the presentation of stimuli by activating its representation of nonaffective aspects of the incoming stimulus and its valence. This activation falls off after a period. This behavior is seen by the one peak in the top left panel of the top sub-figure. Similarly, for positive or negative stimuli, there is a peak and decay for the appropriate valence unit (top right panel). Activation of the appropriate output units leads to a sustained match for the correct output (bottom left panel).

Consequently, there is a peak and dip in the expected pupil dilation waveform (bottom right panel). When the network is overtrained on negative information, its activation of the appropriate valence and semantic pattern is initially kindled, but after a short time the network’s representation of the negative information on which it has been overtrained becomes more highly activated. Similarly, the negative valence unit becomes activated.

The only difference between the network’s responses to stimuli of different valences is how quickly activation in response to the presented stimulus falls off and the activation of nonaffective and affective features of the overtrained stimulus begins. The reversal happens most quickly for negative words on which the network has not been overtrained, followed by neutral words, and finally by positive words. It is thus predicted that long after a stimulus is taken away, on either task, dilations will be highest to negative personally relevant words, followed by nonpersonally relevant negative stimuli, neutral, and positive stimuli. These predictions are consistent with the idea that consideration of personally relevant negative information interferes with the processing of environmental stimuli by depressed individuals. For completeness, the network’s simulated responses to stimuli of all valences on each of the tasks are presented in Appendix B, using the same conventions as in Figure 5.

More intriguing predictions emerge as a direct consequence of allowing newly learned information to occlude previously learned information, when components of the network’s activations are examined as a function of depressive overtraining. Figure 6 presents the sum of the network’s activation (i.e., simulated pupil dilation) as a function of time and overtraining on the valence identification task. The network’s simulated pupil dilation in response to the lexical decision task looked almost identical. As shown in the figure, before the network is overtrained, it initially activates strongly in response to the presentation of a stimulus. This activation falls off quickly. In contrast, after overtraining, the network reacts weakly to the stimulus, but its activation is preserved over time. Decreased initial activation occurs due to the diffuse decrease in simulated synaptic weights throughout the network during overtraining. Increased sustained activation occurs because the overtrained network’s activation of personally relevant information happens subsequent to the presentation of any stimulus. Momentary variation in the network’s energy is due to noise. Additional simulations suggested that this pattern becomes stronger as overtraining increases. The prediction for pupil dilations is thus that nondepressed people should have relatively high early dilations on both tasks, but should have considerably lower late dilations, for all stimuli, on both tasks. Depressed individuals should have lower early dilations and higher late dilations for all non-personally relevant information. Depressed individuals are expected to have high early and late dilation for personally relevant negative information. This prediction is consistent with the idea of depressive rumination (i.e., feedback between cognitive and affective information processing) long after information is presented.6

Summary of Predictions Based on the Performance of the Neural Network Model

Thus far, it has been suggested that depressed individuals pay attention to negative information. Because of its vagueness, this insight alone leaves unclear how depressed individuals will think or behave differently in their environment from nondepressed individuals, how depressed individuals respond physiologically to different types of information, and how best to treat depression based on these observations. For example, which aspects of negative information depressed individuals pay attention to, and how long they pay attention to them, have traditionally been unclear. The computational model presented in this section has proposed answers to these questions that allow speculations to be advanced regarding specific ways in which cognitive and physiological processes will differ in depressed and nondepressed individuals in response to different types of affective information. If valid, the model could then be used to develop broader notions of physiological correlates of depression and an understanding of how depressed people would behave in negative situations throughout their environment, and could be used to predict responses to novel treatments for depression (e.g., Siegle, 1999a). The path towards validation for the model involves using it to predict behaviors on tasks that are measurable in a lab.

Because traditional cognitive theories are generally unclear regarding aspects of negative information to which depressed individuals pay attention, it has been difficult to predict how depressed individuals will differ from nondepressed individuals on information processing tasks, or physiological measures of cognitive load, in response to affective information on controlled information processing tasks. In contrast, formalizing relevant intuitions regarding the nature of attention to emotional information from popular contemporary models of emotional information processing and depression allowed predictions to be made about how depressed and nondepressed people were expected to perform on an affective lexical decision and valence identification task. Specifically, experiments with the computational model suggest that depressed individuals are likely to think about personally relevant negative information when presented with any type of information. The following patterns of behavioral and physiological responses are expected to be associated with this style of information processing on an affective lexical decision and valence identification task. On the valence identification task, depressed individuals are expected to be quicker to say that negative words are negative than to say that positive words are positive. Depressed individuals are expected to react fastest to personally relevant negative words on the valence identification task. They should make the fewest valence confusions for negative words, and the most for neutral words.

On the lexical decision task, depressed individuals are expected to be slower to say that non-personally relevant negative words are words than to say that positive words are words. As on the valence identification task, depressed individuals are expected to react fastest to personally relevant negative words on the lexical decision task. They may make greater numbers of false-alarms to non-words, associating them with personally relevant negative information. These biases are not expected to be present for nondepressed individuals; potentially they will be opposite, with nondepressed individuals showing the shortest reaction times to positive words.

Depressed individuals are expected to show the greatest cognitive load in response to personally relevant negative words, followed by nonpersonally relevant negative, and all neutral and positive words on both tasks. Cognitive load is expected to be low in the early stages of attention, for all words that are not negative and personally relevant. Cognitive load in the late stages of attention is expected to be high. In contrast, nondepressed individuals are expected to show less differentiation in cognitive load to stimuli of different affective valences. Nondepressed individuals are expected to show large cognitive load in the early stages of attention and small cognitive load in the late stages of attention.

Experimental Design Based on the Network’s Behavior

To test these hypotheses, reaction times, signal detection rates, and cognitive load, indexed by pupil dilation, were measured on an affective lexical decision and valence identification task in depressed and nondepressed individuals. To distinguish between personally relevant and non-relevant words, both normed word-lists and words generated by participants in the days preceding the experiment were employed. In addition to the information processing tasks, a number of other measures allowed for detailed understanding of the obtained patterns of results. Specifically, a state-measure of mood (the Beck Depression Inventory; Beck, 1967) was used to establish that individuals were symptomatic at the time of testing, which may have occurred over a week after they were initially diagnosed. A measure of ruminative coping (Nolen-Hoeksema & Morrow, 1991) was included to understand whether the feedback within the model, which was assumed to represent sustained attention to negative information, corresponded to traditional notions of ruminative coping. Because of the extensive overlap between depressive and anxious symptomatology, two measures of anxious symptomatology were included to allow for analysis of whether depressive symptomatology contributed to information processing biases above and beyond current anxious symptomatology.

 

III. METHOD

To evaluate the role of attention to negative information in depression, and its correspondence to the neural network model presented here, affective lexical decision and valence identification tasks were presented to clinically depressed and nondepressed individuals.

Research Participants

Twenty-eight outpatients and two inpatients meeting DSM-IV (American Psychiatric Association, 1994) criteria for Major Depressive Disorder and twenty-six nondepressed controls were recruited through the UCSD Mental Health Clinical Research Center (CRC). Inclusion criteria for depressed participants included being diagnosed with Unipolar Depression using the SCID and having scored above 14 on the Beck Depression Inventory (BDI; Beck, 1967) within two weeks of testing. Inclusion criteria for nondepressed controls included endorsing no current symptoms of depression and having no psychiatric diagnoses on a SCID interview, having no known relatives with psychiatric diagnoses, and never endorsing having had a psychiatric diagnosis. Exclusion criteria for research participants included being medicated with psychotropic medications (exclusive of St. John’s Wort, taken in doses too low to be considered therapeutic), having a history of drug or alcohol abuse within the last six months, being diagnosed as having any psychotic disorder (e.g., Schizophrenia, or Schizoaffective disorder) or a Bipolar mood disorder using the SCID (Structured Clinical Interview for the DSM-IV), or having any eye problems or difficulties in corrected vision. Additionally, all participants had to be medically cleared for research after a full physical examination, chemistry panel, thyroid test, complete blood count, urinalysis, HIV test, and EKG.

The decision to include only unmedicated depressed patients was made to ensure against effects of medications on pupil response. As the pupil response is dependent upon cholinergic receptor activity, medications that affect the cholinergic system (e.g., tricyclic antidepressants) have been observed to affect pupil dilation. In addition, pilot testing with medicated depressed individuals suggested that some selective serotonin reuptake inhibitors affected pupil responsivity on the tasks.

Depressed participants were selected from the CRC pool at a weekly SCID diagnosis consensus meeting. Nondepressed participants were recruited from the San Diego Metropolitan Area (approximately 2.5 million people) via newspaper advertisements placed by the CRC. SCID interviews were conducted by trained CRC staff. The CRC staff undergoes SCID reliability checks every six months, and all diagnoses are reviewed at a consensus meeting with a senior medical doctor trained in DSM-IV diagnosis present.

Of these participants, 23 depressed individuals had BDI scores over 14 at the time of testing, and had acceptable eyesight as measured by reading line seven of a standard Snellen eye chart from ten feet, corresponding to 20/30 vision. Of these individuals, all were unmedicated at the time of testing. Twenty-five nondepressed individuals had BDI scores under 8 (below a usual measure of mild dysphoria) at testing. These participants were retained for subsequent analyses. The gender and ethnic distributions of these individuals are reported in Table 3. The distribution of ethnicities was representative of the San Diego metropolitan area. Descriptive statistics for age and education of participants are reported in Table 4.

Table 3

Numbers of research participants by ethnicity and gender

Group  Depressed  Nondepressed 
African American 
Asian 
Caucasian  16  20
Hispanic and Latino 
Other 
Male  14 
Female  16
 

 

Table 4

Age and years of formal education of participants meeting inclusion criteria

Group       Measure     M       SD      Minimum   Maximum
Depressed   Age         47.83   9.56    32        68
            Education   14.52   2.37    11        21
Control     Age         42.08   12.30   23        62
            Education   16.20   2.02    12        20
 
Self-Report Measures

The Beck Depression Inventory (BDI; Beck, 1967) is a 21 item self-report inventory, given to assess current depressive affect in the days preceding testing. Individual items are worded so that responses reflect increasing degrees of severity and are given values of 0-3. The total possible score ranges from 0-63. It has acceptable validity and reliability (Beck, 1967; Beck, Steer, & Garbin, 1988). The scale is presented in Appendix C.

The Beck Anxiety Inventory (BAI) was given to assess current anxious affect. Like the BDI, the BAI is a 21 item self-report inventory. Individual items are worded so that responses reflect increasing degrees of severity and are given the values of 0-3. The total possible score ranges from 0-63. The inventory is frequently used to assess anxious affect. It has acceptable validity and reliability (Steer et al., 1993). The scale is presented in Appendix C.

The State Trait Anxiety Inventory (STAI—trait form; Spielberger, 1983) was given to assess trait anxiety. The trait form of the STAI consists of twenty statements that assess how anxious individuals generally feel. Each item is given a weighted score of 1-4 in which 4 indicates the presence of high levels of anxiety for some questions, and low levels of anxiety for other questions. The latter set of questions is scored in reverse yielding scores in the range of 20-80. The STAI was developed to assess trait anxiety in adults, college students, and high school students. Its reliability and validity have been demonstrated for assessing clinical anxiety in a large number of studies (Spielberger, 1983). The scale is presented in Appendix C.
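The reverse-scoring step described above can be sketched as follows. The function name and, in particular, the set of reverse-keyed item numbers are placeholders chosen for illustration; the actual keying is given in the STAI manual (Spielberger, 1983) and is not reproduced here.

```python
def score_stai_trait(responses, reversed_items):
    """Total a 20 item form scored 1-4, reversing the items in reversed_items.

    responses: list of 20 integers from 1 to 4, in item order.
    reversed_items: set of 1-based item numbers scored in reverse (5 - response).
    """
    if len(responses) != 20:
        raise ValueError("expected 20 responses")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 4:
            raise ValueError("responses must be between 1 and 4")
        total += (5 - response) if item_number in reversed_items else response
    return total


# Placeholder keying, NOT the published STAI key.
hypothetical_reversed = {1, 3, 6}
lowest_raw = score_stai_trait([1] * 20, hypothetical_reversed)
```

With every item answered 1, the three placeholder reversed items each contribute 4 and the remaining seventeen contribute 1. Whatever the keying, totals stay within the 20-80 range noted above.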

The Response Styles Questionnaire (RSQ; Nolen-Hoeksema, 1998, personal communication) is a 71 item self report measure given to assess the level at which individuals engage in various cognitive coping styles. It has subscales for rumination and distraction. The questionnaire asks test takers to endorse thoughts and behaviors they engage in while in a depressed mood. Items are responded to on a four item frequency scale: "almost never", "sometimes", "often", and "almost always". Items on the rumination scale ask about how often individuals think of aspects of their depression, e.g., "think ‘I am ruining everything’". Items on the distraction scale ask about how often the individual engages in distracting behaviors, e.g., "watch TV to distract yourself". The rumination and distraction subscales of this questionnaire have been shown to have adequate internal consistency, and to be predictive of behaviors associated with rumination and distraction (Nolen-Hoeksema & Morrow, 1991; Nolen-Hoeksema, Morrow, & Fredrickson, 1993; Nolen-Hoeksema, Parker, & Larson, 1994). The scale is presented in Appendix C.

The Structured Clinical Interview for the DSM-IV (SCID) is a structured interview used to diagnose DSM-IV disorders. The interview consists of standardized questions assessing each inclusion and exclusion criterion for each DSM-IV disorder, and is administered by a trained interviewer. Interviewers can probe ambiguous answers with standardized probe questions. Each symptom is rated as either present, absent, or ambiguous. Diagnoses are established by counting symptoms rated as present, according to DSM-IV guidelines. The interview takes between 30 minutes and three hours, depending on how many symptoms are endorsed. Previous versions of the interview have been shown to have adequate reliability and validity for outpatient samples (Spitzer et al., 1992).

A computer-based word rating questionnaire was used to allow participants to rate the emotionality of each word from the standard and idiosyncratic sets. On this questionnaire, the following directions were presented to the participant:

In the following task you will be asked to rate each of a series of words on how emotional it is for you, using a 7 point scale. When each word appears, please press a number, 1 through 7 corresponding to the following scale:

1 -- Very Negative
2 -- Negative
3 -- Somewhat negative
4 -- Neutral, not emotional, at all
5 -- Somewhat positive
6 -- Positive
7 -- Very Positive

This questionnaire then presented each word in the standard set, followed by each word in the idiosyncratic set, one at a time to the participant. Psychometric properties of this questionnaire have not yet been examined on another sample.

Apparatus

Stimuli were displayed on an IBM PC compatible 386 computer with a 14 inch color monitor. Research participants sat approximately 28 inches from the bottom of the stimulus. Stimuli were drawn in lowercase letters approximately 5/8 inches high on the monitor, subtending approximately 1.21 degrees of visual angle.

Reaction times were recorded using a modified Battle Gear™ game pad. The game pad is capable of reading input from up to 4 buttons with millisecond resolution. It was modified from its original form to contain only three buttons, arranged in a triangle, so that prior to responding, respondents’ fingers were equidistant from each possible response.

Pupil dilation was recorded using a Micromeasurements System 1200 pupillometer. The pupillometer consisted of a video camera and infrared light source that were pointed at a participant’s eye, and a device that tracked the location and size of the pupil using these tools. Pupil size and location were recorded at 60Hz (every 16.7ms) and were passed from the pupillometer to both the 386 computer, which controlled the display of stimuli, and a 486 computer which stored the acquired data. Additionally, signals were transmitted from the 386 computer to the 486 to signal the beginning and ending of trials as well as the end of fixation, stimulus onset time, and reaction time.

Target Stimulus Materials

For the computer-administered tasks, 10 positive, 10 negative, and 10 neutral words balanced for normed affect, word frequency, and word length were chosen. The selection of words for the tasks was done using a computer program (Siegle, 1994) designed to create word lists of an arbitrary length balanced for normed affect, word frequency, and word length. To determine whether word valence covaried with other characteristics of the chosen stimulus words, two analyses were performed. A one-way Analysis of Variance (ANOVA) using word valence (positive, negative, neutral) as the independent variable and word frequency as the dependent variable did not reveal a statistically significant effect of valence, F(1,29)=1.7, p>.05, nor did a similar ANOVA on word length reveal a statistically significant effect of valence, F(1,29)=1.5, p>.05. Nonwords were created by perturbing the spelling of 6 positive, 6 negative, and 6 neutral words such that they were not words, but were pronounceable and did not violate English syntactic conventions. The list of words and nonwords is included in Appendix D. To obtain personally relevant stimuli, participants were asked to generate 10 positive, 10 negative, and 10 neutral personally relevant words, using the form provided in Appendix E, at least one day in advance of testing. Word lists were generated between four weeks and one night before the final testing session.7

Tasks

Lexical Decision Task

For the lexical decision task, each word and nonword was presented to the participant as follows. A fixation square appeared and remained on the screen until the participant’s gaze was fixed within one degree of visual angle for 200ms. At this time, the fixation square was replaced by a row of X’s (forward mask) for 2000ms. The X’s in the middle of the string were replaced by letters spelling a word or nonword. After a stimulus duration of 150ms, the letters were masked by a row of X’s again, and the participant was allowed to respond. Pupil dilation continued to be recorded for six seconds after the initial onset of the stimulus, regardless of when the participant responded. Research participants were instructed to push buttons labeled "Yes" or "No" as quickly and accurately as they could. Labels for these responses were on a card in the participant’s vision. The research participant’s pupil dilation throughout the trial, as well as reaction time and response were recorded on the computer for each stimulus.

Valence Identification Task

For the valence identification task, each word was presented to the participant as in the lexical decision task. Research participants were instructed to push buttons labeled "+", "-", or "N" (standing for "Positive", "Negative", or "Neutral", respectively) as quickly and accurately as they could, in response to each stimulus. Labels for these responses were on a card in the participant’s vision.

Warned Reaction Time Task

To account for the possibility that depressed individuals generally reacted slower to all stimuli on the basis of psychomotor retardation, and to provide an understanding of differences in pupil dilation related solely to motor, rather than cognitive, phenomena, a subset of 20 depressed and 16 nondepressed participants8 were given a warned reaction time task. Nonwords consisting of between three and five "a"s were presented to the participant as in the lexical decision task. Research participants were instructed to push the middle game-pad button as quickly as they could in response to each stimulus.

Gaze Task

To account for variability in resting pupil dilation, the same subset of participants who took the warned reaction time task were given a gaze task. Nonwords consisting of between three and five "a"s were presented to the participant as in the lexical decision task. Research participants were instructed to watch as the stimuli were presented.

Counterbalancing of Conditions

A number of features of the experiment were counterbalanced using modular arithmetic, which ensured that conditions were assigned cyclically. In modular arithmetic the function mod(a,b) represents the remainder when a is divided by b. Thus, if a increases sequentially, and b is 5, mod(a,b) will repeatedly count from 0 to 4. Identification numbers were assigned to participants based on recruitment order by the Mental Health Clinical Research Center (MHCRC); control participants had IDs from 9000 to 10000, and depressed participants had IDs from 5000 to 6000. The order in which the tasks were completed was assigned based on the function mod(id/6,2). The order in which buttons for responses on the game pad were labeled for the lexical decision task was assigned as mod(id/12,2). The button order for the valence identification task (using all six possible orderings of the buttons) was assigned as mod(id,6). This strategy ensured that the distribution of each of these variables would be pseudorandom, and that each variable would approach a correlation of zero with the other variables as the number of participants grew. The order of words within each task, and the order in which a given word was shown, were assigned randomly by computer for each participant.
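As a minimal sketch of the cyclic assignment described above, the following Python computes the three counterbalancing indices from a participant ID. It assumes that id/6 and id/12 denote integer (floor) division, which the text does not state; the function name and dictionary keys are illustrative only.

```python
def counterbalance(participant_id: int) -> dict:
    """Return the three counterbalancing indices for one participant ID."""
    return {
        # mod(id, 6): which of the six button orderings (valence identification)
        "valence_button_order": participant_id % 6,
        # mod(id/6, 2): task order (assumes floor division)
        "task_order": (participant_id // 6) % 2,
        # mod(id/12, 2): lexical decision button labeling (assumes floor division)
        "lexical_button_order": (participant_id // 12) % 2,
    }
```

Under these assumptions, any 24 consecutive IDs cover each of the 6 x 2 x 2 combinations exactly once, which is what keeps the three variables uncorrelated as recruitment proceeds.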

Procedure

Depressed participants were selected from the CRC pool at a weekly consensus meeting. Selection involved reviewing SCID summaries to make sure potential participants met inclusion and exclusion criteria, and were medically cleared for research. Nondepressed participants were selected from a database of individuals who had been given SCID interviews within the last six months, after having responded to a newspaper advertisement placed by the CRC.

Individuals who had received a SCID diagnosis of depression and nondepressed participants were contacted by phone. During the phone conversation, the purpose, length, and reimbursement for the study were explained briefly, and appointments were scheduled for participants to come to two testing sessions. The first testing session was conducted by the author, a research assistant, or a trained member of the CRC staff. At this session participants were asked to read and sign human subjects consent forms for UCSD, SDSU, and the VA research council (Appendix F).

Next, participants were given the form included as Appendix E, and were asked to generate sets of personally relevant emotional stimuli in the blank lines on the form. The initial line of directions was read out loud, after which the participant was left to read subsequent lines to himself or herself. The form asked individuals to generate a list of 10 negative words which best represented what they thought about when they were down, depressed, or upset, along with 10 analogous positive and neutral words. Filling out the form was untimed, and was completed during the first meeting. Individuals conducting the first meeting were instructed not to coach participants, but to gently encourage them if they appeared to have trouble filling out the form. Some CRC staff reported having talked to depressed individuals about things which had, at one time, been positive or neutral in their life, when they had particular trouble filling out the form. Words from the form were entered into the computer by the researcher after the participant left. This session took between 15 and 30 minutes. The second testing session occurred at least one day later to insure against priming effects from the word generation task.*

In this session, a brief vision test was administered, asking participants to stand with their heels against a cabinet, 10 feet away from a standard eye-chart, and to read each line on the chart, beginning with line 7, with both eyes open. This test assured that participants could adequately see experimental stimuli. The pupillometer was calibrated to assure fixation by asking the participants to look at each of 9 locations on the screen, and waiting until a fixation was achieved at each location. Parameters representing the extent of pupillary movement during these fixations were stored automatically by the calibration program.

Participants were then introduced to the lexical decision and valence identification tasks via directions on the computer (Appendix G). They were then given a practice session consisting of three trials, with feedback (e.g., "The word was birthday. You pressed the middle button, saying the word was positive") which were repeated until all of the practice trials were correctly performed and the participant reported that he or she was ready to proceed with the task. After querying for any questions, participants were instructed to complete the first task. The identical procedures were used for the second task. For each task, all stimuli in both the standard and personally relevant word sets were displayed, and then the task ended. For the lexical decision task research participants were asked to place their right palm below two buttons on the game pad. The task proceeded as described previously. For the valence identification task, research participants were asked to place their right palm below the game pad with their index finger equidistant from the three buttons, arrayed in a triangle, on the game pad. The task proceeded as described previously.

The tasks paused every six trials to allow participants to rest. During this time, the experimenter wrote down any notable behavioral observations or errors in data collection (e.g., not setting the pupillometer for data collection during the task). Notes on errors that precluded the use of collected data, or that indicated missing data, were used to remove unusable data from the database, and to make sure that behavioral responses (e.g., reaction times) were matched correctly to obtained pupil dilation data.

After the completion of both tasks, participants were asked to rate the emotionality of each word from the standard set, followed by the idiosyncratic word set, using the word rating questionnaire. Next, participants completed computer-administered versions of the BDI, BAI, STAI, and RSQ. The analogous paper versions are included in Appendix C.

All but the first 10 participants then received the gaze and warned reaction time tasks. For the warned reaction time task, participants were asked to place their right palm below the game pad with their index finger equidistant from the three buttons.

Following this procedure, participants were debriefed. Debriefing consisted of discussing the theory behind the tasks, as well as walking participants through their reaction time and pupil dilation data on the valence identification task. Participants were given the opportunity to discuss what factors they believed contributed to long or short reaction times, and high or low dilations. After the debriefing, participants received $20 to cover transportation and participation in the approximately 2 hour experimental session.

Statistical Power

The sample size was based on practical considerations as well as an a priori analysis of statistical power for multivariate split-plot ANOVA planned contrasts. Because the current experiment required more severely disordered participants than Siegle et al.’s (1997) study, and because it included various methodological improvements over that study, it was assumed that effect sizes for reaction times would be larger than those observed with dysphoric and nondysphoric undergraduates by Siegle et al. (1997). In Siegle et al.’s (1997) study on the lexical decision task, relevant effect sizes (d; Cohen, 1977) were around .55. On the valence identification task, relevant effect sizes were in the neighborhood of .60. Furthermore, it was assumed that effect sizes for pupil dilations would generally be larger than those for reaction times, as it was assumed to be a more sensitive measure of similar cognitive events.

Initial power estimates suggested that with 40 participants per group, power to detect effects in reaction times for relevant contrasts on the lexical decision task would be .82 with relevant effect sizes of .65. After over a year of data collection, only 20 depressed participants meeting criteria had been recruited. Preliminary analyses of reaction times on the first 15 participants in each group suggested that effect sizes were either at least as large as expected or so near zero that considerably more than 40 participants per group would have been necessary to detect them. Relevant tests of pupil dilation variables yielded similar results. Thus, data collection was terminated after 25 participants per group had been collected. Given the current sample, power for relevant contrasts was .60 to detect effects of magnitude d=.65. Power increased to .8 for effects of magnitude d=.85.
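The relation between sample size, effect size, and power can be illustrated with a rough sketch. The Python below uses a normal approximation to a two-tailed, two-group comparison at alpha = .05. This is a simplifying assumption, not the multivariate split-plot contrast analysis actually used, so it yields values near but not identical to those reported; the function name is illustrative.

```python
from statistics import NormalDist


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power for a two-group mean comparison."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected test statistic under the alternative hypothesis
    noncentrality = d * (n_per_group / 2) ** 0.5
    # Probability of landing beyond the critical value in either tail
    upper = 1 - z.cdf(z_crit - noncentrality)
    lower = z.cdf(-z_crit - noncentrality)
    return upper + lower


# The three scenarios discussed in the text
planned = approx_power(0.65, 40)
obtained = approx_power(0.65, 25)
larger_effect = approx_power(0.85, 25)
```

Because the normal approximation ignores the extra uncertainty in estimated variances, it tends to run slightly above exact t-based or multivariate power values.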

Selection of Relevant Stimuli for Analysis

Trials with reaction times below 150ms were discarded as outliers because previous results suggest that reaction times in this range indicate that a response was made without regard for the stimulus (Matthews & Southall, 1991). Similarly, trials with reaction time latencies over 5000ms were discarded as outliers because it is assumed that if participants took more than 5 seconds to respond, they did not attend to the trial. This procedure eliminated very little data. Nine participants had one long or short outlier on the lexical decision task. One participant had two, and one had three. Eight participants had one long outlier on the valence identification task. One participant had two, one had four, and one had five outliers. Trials in which stimuli were incorrectly identified on the lexical decision task, and trials for which the valence rating was incongruent with the normed valence on the valence identification task were not removed from the data set. These trials were retained because the tasks were intended to reflect features of cognitive processes used to make lexical decisions and valence identifications; the same processes were assumed to operate during trials in which correct and incorrect decisions were reached.

To be sure that individuals considered all positive words to be positive, negative words to be negative, and neutral words to be neutral, analysis was restricted to those words that met the following criteria based on the word-rating task given at the end of the experiment. Positive words were used only when they were rated somewhat positive, positive, or very positive (5-7). Negative words were used only when they were rated somewhat negative, negative, or very negative (1-3). Neutral words were used only when they were rated somewhat negative, neutral, or somewhat positive (3-5). To illustrate the amount of data that were eliminated due to this restriction, Table 5 presents descriptive statistics for match rates for each valence in depressed and nondepressed individuals.

Table 5

Mean match percentage between rated and normed or generated word valences, M (SD).

Valence     Depressed      Nondepressed
Positive    .701 (.253)    .874 (.129)
Negative    .861 (.147)    .804 (.161)
Neutral     .752 (.192)    .853 (.173)

To examine whether data were eliminated differently for depressed and nondepressed individuals, a valence (3: positive, negative, neutral) x status (2: depressed, nondepressed) multivariate split-plot ANOVA with match percentage as the dependent variable was performed. This analysis revealed a statistically significant valence x status interaction, F(2,46)=5.41, p=.008. Simple effects analysis revealed that statistically significantly more trials were eliminated based on incorrect matches for depressed individuals than nondepressed individuals for positive words, t(33.8)=2.89, p=.005, D=.17, but not for negative or neutral words. Put another way, because depressed individuals rated positive words as positive less often than nondepressed individuals did, fewer of their data for positive words were analyzed.9
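The two selection rules described in this section (reaction time limits and agreement between rated and intended valence) can be sketched as follows. This is an illustrative Python outline, not the original analysis code; all names are hypothetical, and the thresholds are the ones stated in the text.

```python
MIN_RT_MS = 150    # faster responses were treated as made without regard to the stimulus
MAX_RT_MS = 5000   # slower responses were treated as inattentive

# Ratings on the 1-7 scale accepted for each intended valence
ACCEPTED_RATINGS = {
    "positive": range(5, 8),  # 5-7
    "negative": range(1, 4),  # 1-3
    "neutral": range(3, 6),   # 3-5
}


def rt_in_range(rt_ms: float) -> bool:
    """Keep trials whose reaction time is neither below 150 ms nor above 5000 ms."""
    return MIN_RT_MS <= rt_ms <= MAX_RT_MS


def rating_matches(valence: str, rating: int) -> bool:
    """True when a participant's rating agrees with the word's intended valence."""
    return rating in ACCEPTED_RATINGS[valence]


def match_rate(valence: str, ratings: list) -> float:
    """Proportion of words of one valence whose ratings matched (as in Table 5)."""
    return sum(rating_matches(valence, r) for r in ratings) / len(ratings)
```

Note that a rating of 3 or 5 satisfies the neutral criterion as well as the negative or positive one, so the neutral range overlaps its neighbors by design.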

Methods of Analysis—Reaction Times

Data Aggregation

The harmonic mean of reaction times was used to index the central tendency of an individual’s reaction times. As noted by Ratcliff (1993), this measure is expected to yield lower standard deviations around measurements of central tendency (i.e., is less sensitive to outliers) than means, medians, or trimmed means when the same individual’s reaction times are measured repeatedly.
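A brief Python sketch shows why the harmonic mean is less sensitive to slow outliers than the arithmetic mean. The reaction times are made-up illustrative values, and the function name is hypothetical.

```python
from statistics import harmonic_mean, mean


def central_rt(reaction_times_ms):
    """Harmonic mean of one participant's reaction times in one condition."""
    return harmonic_mean(reaction_times_ms)


# Illustrative data: four typical responses and one very slow response
example_rts = [600, 650, 700, 720, 2400]
harmonic = central_rt(example_rts)
arithmetic = mean(example_rts)
# The slow trial pulls the arithmetic mean upward far more than the harmonic mean
```

The harmonic mean averages response speeds (1/RT) rather than times, so a single very long reaction time contributes a small reciprocal and has limited influence.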

Data Cleaning

Outliers more than 1.5 times the interquartile range from the median harmonic mean on any variable were scaled to the closest obtained value below this cutoff plus the difference between this value and the next closest value. For example, if reaction times of 1.8 and 1.9 seconds were the highest values below the cutoff value of 1.5 times the interquartile range from the median, and the data contained an outlier of 4.0 seconds, this outlier would have been scaled back to 1.9+(1.9-1.8)=2.0 seconds. This technique preserved relative rank-ordering of data-points and helped ensure that statistical assumptions for subsequent hypothesis tests were not violated. On inspection, plots of mean harmonic means were, in general, close to those for median harmonic means, after this transformation.
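The rescaling rule can be sketched in Python as below. Several details are assumptions not given in the text: the quartile definition (Python's "inclusive" method), symmetric treatment of low outliers, and mapping every outlier on one side to the same replacement value. The function name is hypothetical.

```python
from statistics import median, quantiles


def rescale_outliers(values, k=1.5):
    """Pull values beyond median +/- k*IQR back to just outside the inlier range."""
    ordered = sorted(values)
    q1, _, q3 = quantiles(ordered, n=4, method="inclusive")  # assumed quartile method
    center = median(ordered)
    spread = k * (q3 - q1)
    low_cut, high_cut = center - spread, center + spread
    inside = [v for v in ordered if low_cut <= v <= high_cut]
    if len(inside) < 2:
        return list(values)  # not enough inliers to define a replacement value
    # Closest inlier plus its gap to the next closest inlier
    high_value = inside[-1] + (inside[-1] - inside[-2])
    low_value = inside[0] - (inside[1] - inside[0])
    return [
        high_value if v > high_cut else low_value if v < low_cut else v
        for v in values
    ]


# Worked example from the text: 1.8 and 1.9 are the highest inliers, 4.0 is an outlier
example = [1.0, 1.2, 1.4, 1.5, 1.6, 1.8, 1.9, 4.0]
rescaled = rescale_outliers(example)  # last value becomes 1.9 + (1.9 - 1.8)
```

The replacement stays above every inlier, so an outlier keeps its rank relative to the rest of the data while losing its extreme leverage.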

Analytic Techniques

Because the neural network was only used to make predictions about nonpersonally relevant positive, negative, and neutral words, and personally relevant negative words, planned comparisons were done based on MANOVAs incorporating only these conditions. That is, a number of multivariate split-plot ANOVA planned contrasts with valence (four levels: positive nonpersonally relevant, neutral nonpersonally relevant, negative nonpersonally relevant, negative personally relevant) and status (two levels: depressed, nondepressed) were performed for the valence identification and lexical decision tasks.

Planned contrasts were performed using ANOVAs in which no nuisance covariates (e.g., age, education) were entered, because they often interacted with valence. To make sure that various demographic and physical characteristics did not affect results, exploratory analyses were also conducted in which gender, age, education, and simple reaction times were covaried out. In no case did the inclusion of age, education, or gender as covariates change results qualitatively. Equality of variance was tested for all contrasts using a Levene test. When variances were not statistically significantly different, contrasts assuming equal variances were used; otherwise contrasts assuming unequal variances were used.

Methods of Analysis—Signal Detection

Data Cleaning

For signal detection analyses all trials were used, regardless of whether their normed valence matched their rated valence. This technique allowed investigation of whether individuals were biased to rate words in ways other than their normed valence as a function of their depression.

Analytic Techniques

Computing response biases on the valence identification task. The neural network predicts that depressed people will be biased to interpret positive and neutral information as negative (i.e., to confuse their valences). Williams et al.’s (1998) study suggests that negative and neutral words may be more similar to each other than to positive words. Luce and Narens (1983) present an approach that is useful for understanding whether there are biases to respond to all stimuli as negative, even in the presence of such asymmetries. This approach has been validated through a wide literature (Marley, 1995). They suggest that confusions in categorizing stimuli are functions of both biases b towards some response (e.g., how likely an individual is to respond to all stimuli as negative) and discriminabilities h between categories (e.g., how similar the individual considers positive and negative words to be). Response biases b for a response i can be calculated independent of differences in discriminability according to the formula: bj = where r is a response, s is a stimulus, and n is the number of alternatives (Movellan, 1998, personal communication). p(ri|sj) represents the proportion of times that response i (e.g., "negative") was chosen when stimulus j (e.g., "love") was presented. Response biases are 1 when there is no bias present. Terms greater than one represent a tendency to name all stimuli as belonging to a valence. Terms less than one represent a tendency to avoid naming stimuli as belonging to a valence.

Ideally, biases would be computed for each person in the sample, allowing statistical testing of hypotheses regarding their reliability. Because many individuals made no confusions of negative words with positive words, this technique was not possible. Instead, biases were computed on the mean confusion matrices. Because no nondepressed individuals ever labeled a negative word as positive, the bias of nondepressed individuals to label things as negative could not be computed without some assumption; doing so would have yielded zero denominator terms for some cells. To be conservative, a rate of .002, slightly lower than that for depressed individuals and equivalent to fewer than two people making this confusion on one trial, was used. This technique assumes that in a larger sample, there would be a few individuals, but not many, who made valence confusions on negative words.

Computing signal detection rates for the lexical decision task. The network model predicts that depressed individuals will see all stimuli as negative, to the extent that they may even interpret nonwords as negative information. Were this phenomenon to occur, depressed individuals would make large numbers of false alarms on the lexical decision task, identifying nonwords as words (specifically, as negative words). Thus, they would be expected to have lower signal detection rates than nondepressed individuals on the task, as signal detection rates are a function of false alarms. To test this prediction, signal detection rates (D), defined as a classifier’s sensitivity+specificity-1, were calculated for each valence. Sensitivity was defined as the proportion of words an individual correctly identified as words. Specificity was defined as the proportion of non-words an individual correctly identified as non-words.10 D values lie between -1 and 1. Classifiers with D values of .3 and above are considered adequate (Friedenberg, 1997). Individuals with D values below .3 would thus be said to have trouble discriminating between words and non-words, reflecting a tendency either to say that non-words were words, or vice-versa.
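The signal detection rate defined above is simple to compute from response counts. The following Python is a minimal sketch; the function and argument names are illustrative, and the counts in the example are invented.

```python
def detection_rate(hits: int, n_words: int,
                   correct_rejections: int, n_nonwords: int) -> float:
    """D = sensitivity + specificity - 1 for lexical decisions."""
    sensitivity = hits / n_words                  # words correctly called words
    specificity = correct_rejections / n_nonwords  # nonwords correctly called nonwords
    return sensitivity + specificity - 1


# Invented example: 27 of 30 words and 15 of 18 nonwords classified correctly
example_d = detection_rate(27, 30, 15, 18)

# A participant who calls every stimulus a word has perfect sensitivity
# but zero specificity, so D is 0
always_yes = detection_rate(30, 30, 0, 18)
```

The second case shows why false alarms lower D: responding "word" to everything leaves sensitivity at 1 but drives specificity to 0.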

Methods of Analysis - Pupil Dilation

Because pupil dilation waveforms contain a great deal of noise, extensive data cleaning must be done before they can be analyzed. In addition, the methods by which waveforms are aggregated and segmented into relevant components can also affect results. A number of strategies for cleaning and aggregating dilation waveforms were thus adopted.

Data Cleaning

Blinks were identified as apparent large changes in pupil dilation occurring too rapidly to signify actual dilation or contraction. Trials consisting of over 50% blinks were removed from consideration for pupil data. Linear interpolations replaced blinks throughout the data set. This technique preserved the time course of pupil dilations during blinks. The data were then subjected to a 10 point rolling average to remove small irrelevant fluctuations in pupil size. This technique has the effect of a low-pass filter in the frequency domain, stripping out very high frequency variations. Baseline pupil dilations, measured as the average dilation over the one second preceding the onset of the stimulus, were subtracted from pupil dilations after stimulus onset to produce pupil dilation curves representing an increase in pupil dilation from baseline. Subsequent analyses were performed on the resulting curves.
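The cleaning steps above can be outlined in Python. This is a sketch under stated assumptions: blink samples are supplied as a precomputed boolean mask (the rate-of-change threshold used to detect them is not given in the text), the rolling average is a trailing window that is shorter at the start of the trial, and sampling is at 60 Hz. All function names are hypothetical.

```python
def interpolate_blinks(samples, is_blink):
    """Replace each run of blink samples with a line between its valid neighbors."""
    cleaned = list(samples)
    n = len(cleaned)
    i = 0
    while i < n:
        if not is_blink[i]:
            i += 1
            continue
        start = i
        while i < n and is_blink[i]:
            i += 1
        end = i  # first valid index after the blink, or n
        left = cleaned[start - 1] if start > 0 else None
        right = cleaned[end] if end < n else None
        if left is None and right is None:
            raise ValueError("trial contains no valid samples")
        left = right if left is None else left    # blink at trial start
        right = left if right is None else right  # blink at trial end
        span = end - start + 1
        for k in range(start, end):
            cleaned[k] = left + (k - start + 1) / span * (right - left)
    return cleaned


def rolling_mean(samples, window=10):
    """Trailing moving average (assumed form of the 10 point rolling average)."""
    out, running = [], 0.0
    for i, value in enumerate(samples):
        running += value
        if i >= window:
            running -= samples[i - window]
        out.append(running / min(i + 1, window))
    return out


def preprocess_trial(samples, is_blink, onset_index, rate_hz=60):
    """Return a baseline-corrected post-onset curve, or None for a rejected trial."""
    if sum(is_blink) / len(samples) > 0.5:
        return None  # over 50% blinks: trial removed
    smoothed = rolling_mean(interpolate_blinks(samples, is_blink))
    pre_onset = smoothed[onset_index - rate_hz:onset_index]  # one second before onset
    baseline = sum(pre_onset) / len(pre_onset)
    return [value - baseline for value in smoothed[onset_index:]]
```

Interpolating before smoothing matters: averaging over the artificially low blink samples would otherwise drag neighboring values downward.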

Data Aggregation - Principal Components Analysis

To test predictions regarding the time course of attention, it is useful to segment pupil dilation curves into windows representing early and late stages of attention. Since this type of analysis has not been done with pupil dilation curves before, there is little theoretical basis for deciding what parts of a pupil dilation curve should be considered as representing early or late stages of attention. Principal components analysis (PCA), an empirical approach frequently used to identify relevant temporal components of other evoked potentials (e.g., ERPs) in emotion research (e.g., Neumann et al., 1992; Vanderploeg et al., 1987), was therefore adopted as a way of identifying parts of pupil dilation curves that could represent early and late stages of attention.

PCA is often used in the social sciences to group variables that measure a similar construct under a single aggregate index, so that each variable does not have to be examined separately. This procedure involves analyzing patterns of covariance among variables in a matrix containing data on each variable for each of a large number of individuals. In the current analysis, each time at which a measurement of pupil dilation was taken was considered a variable. Whereas traditional factor-analytic procedures often reduce data along variables contained in a variable by person matrix, the current procedure reduces data along time-points in a time-point by person matrix. This type of analysis has been recommended over other techniques for understanding relevant aspects of physiological data (Coles, Gratton, Kramer, and Miller, 1986) and has been used with similar tasks for understanding event-related brain potentials (e.g., Neumann et al., 1992; Vanderploeg et al., 1987).

The goal of the PCA was to derive a small number of indices that could be used to represent relevant aspects of the over 600 pupil dilation measurements taken on each trial on the lexical decision and valence identification tasks. Each such index, or component, thus represented groups of times at which pupil dilation was measured that had high bivariate correlations. That is, if a number of times at which pupil dilation was measured consistently had similar magnitudes relative to the rest of the waveform they were identified as a component. The chief product of such a principal components analysis is a set of factor loadings for each time at which a measurement was taken on each extracted component, representing the degree of association of each time of measurement with each factor.

The PCA was performed upon six curves for each person, each representing a person’s median pupil dilation curve (approximately 300 points) to personally relevant and nonpersonally relevant positive, negative, and neutral words. Specifically, the median dilation for each valence, for each task, for each participant, at each time point after the stimulus onset, relative to a participant’s reaction time, was obtained. Responses were time-locked to reaction times rather than stimulus onsets so that sustained pupil dilation in depressed individuals could not be attributed exclusively to psychomotor retardation. That is, if depressed people were observed to have greater dilations at a number of seconds relative to stimulus onset, the cause could either be due to later reactions to the stimulus, or late attentional processes occurring after individuals reacted to stimuli. Time-locking to reaction times effectively removes variance due to the first possibility, since all late dilation is known to happen a given amount of time after individuals react to a stimulus.

During aggregation, outlying dilations were excluded as in the reaction time analysis. That is, outliers more than 1.5 times the interquartile range from the median harmonic mean on any condition were scaled to the highest value below this cutoff plus the difference between this value and the next highest value.

The average pre-stimulus dilation was subtracted from dilations throughout each trial to obtain a dilation relative to a baseline. Since some individuals had primarily short reaction times (i.e., most were under 2.5 seconds) reliable estimates of pupil dilation could not be derived for these individuals for pupil dilation occurring more than 2.5 seconds before reaction times. Similarly, since some individuals had primarily longer reaction times, there was not always enough data collected to allow reliable estimates of average pupil dilation more than 5 seconds past individuals’ reaction times. Thus, pupil dilation curves were analyzed only for the region from 2.5 seconds prior to individuals’ reaction times to 5.0 seconds after their reaction times. Reliable estimates of pupil dilation were available throughout this interval for all participants.

By regressing dilation scores on loadings, factor scores could thus be obtained for each person’s responses to a given condition for each extracted component. These scores represented a person’s average dilation at each area of interest (i.e., component) along the pupil dilation curve. Varimax rotation was used to produce orthogonal factors, in which times at which pupil dilation was measured were strongly associated with only one factor. Thus, measurements around a certain time (e.g., one second into the window in which the PCA is performed) appeared strongly associated with only one factor. Factor labels were then assigned to each factor representing cognitive processes assumed to be engaged during the time at which strongly loading measurements occurred.

This procedure produced factor scores for each person, for each valence condition. The factor scores represented the strength with which reactions to each valence were associated with each extracted component. Differential responses to each valence by depressed and nondepressed individuals could thus be analyzed using ANOVAs.

Type I Error Control

A conservative type I error control strategy was adopted for controlling error in planned contrasts. Family-wise alpha for all planned pairwise contrasts was controlled at 0.05 using a Bonferroni correction, unless all pairwise contrasts were tested, in which case a Tukey correction was performed.
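As an illustration of the Bonferroni step, the per-contrast threshold is the family-wise alpha divided by the number of planned contrasts in the family. The sketch below assumes a family of named contrasts with hypothetical p-values; it does not reproduce any contrast family from this study, and it does not cover the Tukey correction.

```python
def bonferroni_alpha(family_alpha, n_contrasts):
    """Per-contrast alpha that keeps the family-wise rate at family_alpha."""
    return family_alpha / n_contrasts


def significant_contrasts(p_values, family_alpha=0.05):
    """Map each contrast name to True if it survives Bonferroni correction."""
    per_test = bonferroni_alpha(family_alpha, len(p_values))
    return {name: p <= per_test for name, p in p_values.items()}


# Hypothetical family of three planned contrasts.
family = {"negative_vs_positive": 0.004,
          "negative_vs_neutral": 0.030,
          "positive_vs_neutral": 0.200}
print(significant_contrasts(family))
```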

Type II Error Control and Exploratory Analyses

To control for the possibility of type II error, a series of exploratory analyses was also performed. These analyses examined all effects present in omnibus tests reflecting the full experimental design (e.g., personal relevance x valence x depressive status) rather than specific planned contrasts. This technique allowed understanding of the context in which significant planned contrasts occurred (e.g., within three- or four-way interactions). The impact of accounting for various covariates (e.g., age and education) on relevant effects was examined. Differences in depressed and nondepressed individuals’ performance on rationally-derived (as opposed to empirically derived) indices of pupil dilation waveforms (e.g., mean dilation, peak dilation, time to maximum pupil dilation) in response to information of different valences were also examined. Exploratory analyses also included sensitivity analyses involving aggregation of behavioral and physiological indices a number of different ways. In addition, these analyses accounted for conditions that were included for completeness but did not impact planned contrasts (e.g., personally relevant positive information). They also examined tasks that were administered to better understand effects that could have qualified results but that did not enter into planned contrasts (e.g., gaze task and warned reaction time task).

As these analyses did not directly address questions discussed in previous sections, their results are not included in this document. The text from these exploratory analyses can be obtained from the author, or, currently, from the "Exploratory Analyses" section of Siegle (1999b), a world wide web site associated with this dissertation.

IV. RESULTS

The neural network model was used to make predictions for reaction times, signal detection rates, and pupil dilation during the lexical decision and valence identification tasks. The current section presents descriptive statistics for the lexical decision and valence identification tasks. Planned contrasts derived from the neural network model’s performance are then tested for each domain. Indices and methods of aggregation that were, a priori, considered to be most likely to represent dimensions of behavior addressed by relevant theory and approximated by the computational model are used. The software used to perform these analyses is described in detail in Appendix H.

Reaction Times

Lexical Decision Task Planned Contrasts

The neural network model predicted that depressed individuals would be slower to respond to negative stimuli that are not personally relevant than to stimuli of other valences, in comparison to nondepressed people. Table 6 and Figure 7 present the mean harmonic mean reaction times for depressed and nondepressed individuals for the lexical decision task, after outlier removal and rescaling, in seconds.

 

Table 6 Mean harmonic mean reaction times for the lexical decision task, in seconds.

                                     Depressed      Nondepressed
Relevance                Valence     M      SD      M      SD
Personally relevant      Negative    .75    .19     .68    .11
                         Positive    .74    .21     .59    .11
                         Neutral     .72    .18     .64    .12
Nonpersonally relevant   Negative    .77    .20     .69    .16
                         Positive    .76    .22     .65    .16
                         Neutral     .81    .22     .69    .16

As shown in Figure 7, the neural network’s prediction was not confirmed. Planned contrasts did not reveal significant differences between depressed and nondepressed individuals in the time it took to respond to negative non-personally-relevant words and words of other valences (p>.115, η²<.054 for all contrasts).

The model also predicted that depressed individuals would be quicker to respond to negative personally relevant information than to information of other valences, and that this discrepancy would be larger for depressed than nondepressed individuals. Planned contrasts did not reveal significant differences between depressed and nondepressed individuals in the time it took to respond to negative personally-relevant words and words of other valences (p>.19, η²<.046 for all contrasts).

Valence Identification Task Planned Contrasts

The neural network model predicted that depressed people would be slower to respond to positive words (and in some versions, also to neutral words) than to negative words, whereas differences in response times for positive and negative words would not be as large for nondepressed individuals. To test these predictions, planned contrasts were performed examining whether the difference in reaction times for positive and other valences were larger for depressed than nondepressed individuals. Table 7 and Figure 8 present the mean harmonic mean reaction times for depressed and nondepressed individuals for the valence identification task, after outlier removal, in seconds.

Table 7

Mean harmonic mean reaction times for the valence identification task, in seconds.

                                     Depressed       Nondepressed
Relevance                Valence     M       SD      M         SD
Personally relevant      Negative    1.36    .51     1.00¹¹    .23
                         Positive    1.16    .34      .79      .21
                         Neutral     1.56    .56     1.15      .27
Nonpersonally relevant   Negative    1.18    .31     1.03      .25
                         Positive    1.22    .38      .85      .23
                         Neutral     1.46    .43     1.08      .28

As shown in Table 7 and Figure 8, the network’s predictions were largely confirmed for nonpersonally relevant information. As predicted, depressed people did respond more slowly to positive than to negative nonpersonally relevant words, whereas nondepressed individuals responded more quickly to positive than negative words, F(1,44)=14.81, p=.0004, η²=.252. The same contrast was not significant for neutral words, p=.87, η²=.001, or personally relevant negative words, p=.985, η²<.001. Depressed individuals did not respond faster to negative than positive personally relevant words.*

The latter observation also disconfirms the network’s prediction that depressed individuals would be faster to identify the valence of personally relevant negative information than information of other valences. In fact, planned contrasts revealed that the difference in response times for personally relevant negative and nonpersonally relevant negative words was reliably larger for depressed individuals than nondepressed individuals, F(1,44)=10.60, p=.002, η²=.194. This result suggests that depressed individuals responded especially slowly to personally relevant negative words in comparison to non-personally relevant negative words.
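The effect sizes reported alongside each F test above can be recovered from the F value and its degrees of freedom. The sketch below assumes the reported values are partial eta squared, computed as F·df_effect / (F·df_effect + df_error); the function name is illustrative only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# The two significant contrasts reported above for the valence identification task.
print(round(partial_eta_squared(14.81, 1, 44), 3))  # 0.252
print(round(partial_eta_squared(10.60, 1, 44), 3))  # 0.194
```

Both values agree with the effect sizes given in the text, which supports reading them as partial eta squared.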

Planned Analyses Exploring the Relationship of Ruminative Coping to Reaction Time Measures

The performance of the neural network model suggested that if rumination is represented by feedback between mechanisms responsible for affective and semantic association, rumination may be associated with larger information processing biases on the valence identification task. To the extent that conventional measures of rumination, e.g., RSQ rumination scores, capture this notion of rumination, they were expected to correlate with information processing biases. Based on the network’s performance, the difference between reaction times to non-personally relevant positive and negative words was expected to be the pairwise comparison that best reflected such biased performance. To test the network’s prediction, rumination as measured by the RSQ was related to biased performance on the valence identification task, as measured by individuals’ difference in reaction times to positive and personally relevant negative words.

Scores on the rumination subscale of the RSQ were significantly associated with biases on the valence identification task, r²=.27, F(1,20)=16.94, p<.0005, though the relationship was somewhat mediated by depression.* That is, more depressed individuals were both more biased on the tasks and had higher RSQ rumination scores. As such, a hierarchical regression on valence identification biases in which depression status was entered on the first step and an individual’s score on the rumination scale of the RSQ was entered on the second step revealed that depression accounted for 20.3% of the variation in valence identification biases, F(1,43)=11.47, p=.001. Rumination, as measured by the RSQ, was also positively linked to depressive information processing biases after controlling for depressive status, accounting for an additional 7.6% of the variation in biases, F change(1,44)=4.63, p change=.037. Within the depressed group, rumination, measured by scores on the rumination subscale of the RSQ, was not significantly associated with biases on the tasks, r²=.11, F(1,20)=2.39, p=.14, though the effect size was similar to that obtained for RSQ rumination scores after controlling for depression in the previous analysis. The main finding from this analysis is thus that individuals who score higher on a conventional measure of rumination appear to have more biased information processing. This result is consistent with predictions from the neural network model in which rumination was considered a personality factor that operated throughout an individual’s lifetime.
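The F test for the second step of a hierarchical regression compares the variance added by the new predictor with the variance left unexplained by the full model. The sketch below applies the standard R-squared change formula to the proportions reported above (20.3% at step one, an additional 7.6% at step two), taking the residual degrees of freedom of 44 from the reported test. The function name is hypothetical, and because the inputs are rounded the result only approximates the reported value of 4.63.

```python
def f_change(r2_reduced, r2_full, n_added, df_residual):
    """F statistic for the increase in R-squared when n_added predictors enter."""
    gained = (r2_full - r2_reduced) / n_added
    unexplained = (1.0 - r2_full) / df_residual
    return gained / unexplained


r2_step1 = 0.203             # depression status alone
r2_step2 = r2_step1 + 0.076  # after adding RSQ rumination scores
print(round(f_change(r2_step1, r2_step2, n_added=1, df_residual=44), 2))
```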

Somewhat surprisingly, similar results were obtained for the lexical decision task in that RSQ rumination scores were, again, associated with biases (faster reactions to nonpersonally relevant negative than positive words) on the task above and beyond the nonsignificant contribution of depression, r² change=.17, F(1,44)=6.52, p=.01, b=.003, standardized b=.55. These results suggest that depressed people who score higher on rumination on the RSQ responded faster to negative than positive words, whereas they were predicted to respond more slowly.

Signal Detection Rates

Signal detection rates allow inspection of whether the products of individuals’ judgement processes are affected by their negative information processing biases. The network’s predictions generally suggest that depressed individuals will not only be biased during their decision processes (e.g., creating delays in reaction times) but that their tendency to see information as negative will affect their decisions. Such a tendency can be examined directly on the valence identification task by examining whether depressed individuals label many stimuli as negative. On the lexical decision task, explicit valence judgements are not available, so the effects of such biases must be inferred from error rates.

Valence Identification Task Response Biases

The network’s performance suggested that depressed individuals would be biased to see things as negative. As a result, on the valence identification task, it was expected that they would label positive, neutral, and negative stimuli as negative more often than would nondepressed people. To evaluate this hypothesis, "response biases" for each valence, representing the rate at which individuals were likely to label all words (positive, negative, and neutral words) as having that valence, were computed.

The resulting biases are presented in Table 8. Terms of 1.0 represent an unbiased observer. Terms greater than one represent a tendency to name all stimuli as belonging to a valence. Terms less than one represent a tendency to avoid naming stimuli as belonging to a valence. Since biases are unitless, ratios of biases are generally examined. Although the number of people making no classification errors precludes evaluating the statistical significance of these biases, these descriptive results appear consistent with the performance of the neural network. The bias towards negative responding in the depressed group is over four times the deviation from unbiased performance of any other bias in the table. This result seems indicative of a general predisposition in depressed individuals to make negative associations.*

Table 8

Valence response biases for the valence identification task.

Valence     Depressed   Nondepressed
Positive     .83        1.00
Negative    1.80        1.07
Neutral      .82         .96
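The descriptive claim that the depressed group's negative bias deviates from unbiased responding more than four times as much as any other entry can be checked directly from the values in Table 8. The short sketch below does only that arithmetic; the dictionary layout and variable names are illustrative.

```python
# Response biases from Table 8; 1.0 represents an unbiased observer.
biases = {
    ("depressed", "positive"): 0.83,
    ("depressed", "negative"): 1.80,
    ("depressed", "neutral"): 0.82,
    ("nondepressed", "positive"): 1.00,
    ("nondepressed", "negative"): 1.07,
    ("nondepressed", "neutral"): 0.96,
}

# Distance of each bias from unbiased responding.
deviations = {cell: abs(value - 1.0) for cell, value in biases.items()}

largest = max(deviations, key=deviations.get)
next_largest = max(d for cell, d in deviations.items() if cell != largest)
ratio = deviations[largest] / next_largest

print(largest, round(ratio, 2))
```

The largest deviation belongs to the depressed group's negative bias, and its ratio to the next largest deviation exceeds four.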

One way to test whether this apparent difference in biases reflects significant differences between the depressed and nondepressed groups is to examine whether depressed individuals labeled non-negative words as negative more than they labeled negative words as non-negative (i.e., whether they were similarly biased to mis-label each type of word). This analysis does not rely on the computation of response biases for each group, but is subject to the problem that observed differences could be a function of either differential response biases or differential semantic distances between different valences (Luce, 1963), so it is only used to support the result obtained above. Towards this end, contrasts from a valence x personal-relevance x status MANOVA using the rate of mislabeling of valences as the dependent variable suggested that depressed individuals displayed a greater difference in positive and negative confusions than nondepressed individuals, F(1,46)=52.27, p=.026, η²=.103. This difference was largely due to the fact that depressed people made many confusions on positive personally relevant words, while nondepressed individuals did not.

Lexical Decision Task False Alarms

If depressed individuals were so preoccupied with personally relevant negative information that they associated nonwords with personally relevant negative information, it was expected that they would evidence a large number of false alarms on the lexical decision task, mistaking nonwords for negative words they were thinking about. This phenomenon was not predicted to occur for nondepressed individuals. The mean false-alarm rate for depressed individuals was 11.4% (SD=12.1%) whereas the mean false-alarm rate for nondepressed individuals was 14.7% (SD=14.7%). The difference in false-alarm rates was not significant, t(46)=-.85, p=.40. Thus, the depressed individuals did not seem more prone than nondepressed individuals to consider nonwords to be negative words. Together, the signal detection findings from the valence identification and lexical decision tasks suggest that depressed individuals may see words in their vocabulary as negative, but are not so severely biased as to see strings of letters that are not in their vocabulary as negative.

Pupil Dilation

Greater cognitive load is believed to result in greater pupil dilation. Based on the results of the neural network simulations, it was expected that a tendency to associate incoming information with personally relevant negative information would result in low initial cognitive load for depressed individuals, since the actual information in stimuli will not be processed. Hence, little pupil dilation in the early stages of attention for all stimuli except personally relevant negative stimuli was predicted in depressed individuals. In contrast, depressed individuals were expected to have especially high pupil dilation in late stages of attention, reflecting sustained attention to personally relevant negative information. Additionally, it was expected that dilation in the late stages of attention would be approximately equal for all valences for nondepressed individuals. Based on the simulations, depressed individuals were expected to have their highest pupil dilation in response to personally relevant negative information, followed by non-personally relevant negative information, neutral, and positive information.

To test these hypotheses, principal components analysis (PCA) was used to identify early and late stages of attention. ANOVAs on factor scores from the PCA were used to identify depressive information processing biases.

Aggregate Curves

To obtain a visual representation of observed pupil dilation data, aggregate curves representing pupil dilation over time for depressed and nondepressed individuals were obtained for each valence. These curves were generated using the median pupil dilation for each group for each relevant valence, on each task, after cleaning. Curves were time-locked to reaction times to give a picture of how dilations differed long after individuals responded to the tasks. As curves for the different valences were visually indistinguishable, curves for each valence were further aggregated into single curves for each group. The resulting aggregate curves for depressed and nondepressed individuals on the valence identification and lexical decision task are shown in Figure 9. As can be seen from the figure, curves generally increased from stimulus onset until slightly after individuals’ reaction time (zero on the time axis), after which dilation returned to baseline. Depressed individuals appear to dilate less initially in response to stimuli, and appear to continue to dilate after the stimulus has been removed on both tasks.

Considerable variability from these averaged curves was observed for averaged curves within individuals, and even more variability was observed for individual curves within a valence, for given individuals. In fact, few followed the pattern presented by the "average" curves in Figure 9 (a number of such divergent pupil dilation curves, obtained for different individuals in different conditions, are presented in Appendix I). Thus, to examine whether the pattern of differences between obtained pupil dilation curves for depressed and nondepressed individuals was reliable, data had to be aggregated quantitatively. Principal components analysis was used for this purpose.

Principal Components Analysis

Principal components analysis was performed on pupil dilation data, recorded at each time interval, to identify early and late components of dilation. For the valence identification task, eighteen factors had eigenvalues greater than one. A scree plot revealed differences between the first five factors and the rest. The first five factors accounted for 82.36% of the total variance. Similarly, for the lexical decision task, 23 factors had eigenvalues over one, of which five were distinguishable on a scree plot, accounting for 72.06% of the variance. A second principal components analysis was therefore performed for each task, restricting extraction to just five factors. Factor loadings for each of the extracted factors for each task are presented in Figure 10. In this figure, each line represents the strength with which one factor, or group of measurements with high bivariate correlations, is associated with each recorded time point. Lighter markings indicate higher factor loadings. Lines are plotted relative to individuals’ reaction times (marked zero), representing the time at which a button was pressed in response to a displayed stimulus.
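The two retention checks mentioned above, counting eigenvalues greater than one and computing the share of variance carried by the retained factors, can be sketched as follows. The eigenvalues in the example are hypothetical and far fewer than the roughly 300 time points analyzed here; the function names are illustrative, and the scree-plot judgment itself is visual and is not automated.

```python
def kaiser_count(eigenvalues):
    """Number of components with eigenvalues greater than one."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


def percent_variance(eigenvalues, n_retained):
    """Percentage of total variance carried by the n_retained largest components."""
    ordered = sorted(eigenvalues, reverse=True)
    return 100.0 * sum(ordered[:n_retained]) / sum(ordered)


# Hypothetical eigenvalues from a PCA of eight variables.
eigenvalues = [3.2, 1.9, 1.1, 0.7, 0.5, 0.3, 0.2, 0.1]
print(kaiser_count(eigenvalues))
print(round(percent_variance(eigenvalues, n_retained=2), 2))
```

When many components pass the eigenvalue-greater-than-one rule, as in the analyses above, the scree plot serves as the stricter criterion.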

As shown in the loading plots, the fifth factor (accounting for 7.48% and 6.05% of the variation on the valence identification and lexical decision tasks respectively) loads primarily at the beginning of the waveform, and probably represents pre-attentive or exclusively preparatory processing. The fourth factor (accounting for 16.16% and 12.57% of the variation) occurs up to one second before the reaction time, and may represent early attentional or preparatory processes. The third factor (accounting for 16.52% and 14.73% of the variation) peaks near the reaction time and thus may be viewed as a motor and early cognitive component. The second factor (accounting for 22.44% and 12.57% of the variation) peaks at approximately 1 second post-reaction-time. This factor can be thought of as cognitive processes associated with stimulus identification, representing the pupil’s tendency to lag behind cognitive events by up to 1.5 seconds. The first factor (accounting for 22.71% and 21.81% of the variation) was a late factor, peaking around 4 seconds post-reaction-time. It is assumed to represent late attentional processes, possibly reflecting feedback between cognitive and affective processing systems. Together, the five extracted factors thus accounted for 85.31% of the variation in the valence identification task data and 67.73% of the variation in lexical decision task data. Factor loadings were similar for both depressed and nondepressed individuals, as described in Appendix J.

 
Figure 11 shows differences in the average factor loadings for depressed and nondepressed individuals for stimuli of each valence on each task. The left-hand graphs represent average loadings for nondepressed individuals and the right-hand graphs represent average loadings for depressed individuals. The graphs each have five lines; each line represents average loadings on one factor, for each valence.

Planned Contrasts

The network model suggested that depressed individuals would show small early dilations and large late dilations, representing sustained attention, on all non-personally relevant stimuli. Nondepressed individuals were predicted to show the opposite pattern: large early dilations followed by a quick decay leading to small late dilations. To support this hypothesis, differences would need to be found between depressed and nondepressed individuals on factors representing both early and late processing. Thus, the hypothesis was evaluated by examining main effects from multivariate split plot ANOVAs in which valence (positive, neutral, negative, personally relevant negative) and depression status were independent variables, and factor scores on each extracted component were dependent variables. The neural network model also suggested that depressed individuals should pay more attention to negative personally relevant information than other types of information in all stages of attention. To test this hypothesis, ANOVA planned contrasts examining differences in factor scores between responses to personally relevant negative words and other words were conducted.¹² The statistical significance of each planned contrast, representing tests of whether responses were different between depressed and nondepressed individuals for each valence, is presented in Appendix K.

These planned contrasts revealed that depressed individuals scored reliably higher on the first, or late attentional factor, than did nondepressed individuals, F(1,45)=5.85, p=.02, η²=.12, though there were no significant differences in responses to one valence or another. This finding suggests that depressed individuals displayed greater sustained attention in response to all stimuli than did nondepressed individuals. On the second factor, assumed to represent early cognitive phenomena, nondepressed individuals displayed comparatively more attention to personally relevant negative words versus nonpersonally relevant positive words than did depressed individuals, F(1,45)=6.51, p=.01, η²=.13. As expected, depressed individuals were reliably lower on factor three, the motor and early attentional component, than nondepressed individuals, F(1,45)=7.20, p=.01, η²=.14. No reliable differences between the groups in responding to one valence were observed. Depressed individuals were reliably higher on factor four, the early attentional component, than nondepressed individuals, F(1,45)=5.37, p=.03, η²=.11.

Results were less consistent for the lexical decision task. While depressed and nondepressed individuals did not differ reliably on the first, or late attentional factor, all individuals generally evidenced greater sustained attention to personally relevant negative words than to neutral nonpersonally relevant words, F(1,45)=5.55, p=.02, η²=.11, negative nonpersonally relevant words, F(1,45)=8.70, p=.01, η²=.16, and marginally, to positive nonpersonally relevant words, F(1,45)=3.40, p=.07, η²=.07. Nondepressed individuals were reliably higher on the second, or cognitive factor, F(1,45)=5.15, p=.03, η²=.10, and to a lesser, nonsignificant extent, on the third or motor component, F(1,45)=3.19, p=.08, η²=.07, than were depressed individuals. No main effects or interactions were observed for the fourth, or early attentional component.*

Interpretation of the late pupil dilation factor

The first pupil dilation factor is assumed to represent sustained attention to aspects of presented information four seconds after an individual reacts to the information (generally between five and seven seconds after the information has been removed from view). The extent to which this factor is an index of more traditional notions of a ruminative response style can be tested in a number of ways. To assess whether aspects of pupil dilation were predictably related to conventional assessments of ruminative coping processes, bivariate correlations between the first factor score and the rumination scale of the RSQ were examined. RSQ rumination scores were not significantly correlated with the first factor for nonpersonally relevant words, .19<r<.24, p>.05 for all valences. For personally relevant words, they were significantly correlated with the first factor only for positive words, r=.33, p=.02; the correlation for personally relevant negative words was in the same range as for nonpersonally relevant words, r=.26, p=.08.

Still, RSQ rumination scores were highly correlated with depressive severity, so hierarchical regressions on late factor scores were examined in which covariates such as age and gender, as well as depression, were entered first and rumination was entered second. These regressions did not reveal an independent relationship of RSQ rumination scores to pupil dilation factor scores (maximum R² change<.03).

Another way of examining whether the late pupil dilation factor represents rumination involves understanding whether late factor responses are related to other variables, assumed to be affected by early aspects of rumination, defined by feedback between cognitive and affective processing systems, such as reaction times. Correlations of late factor responses with reaction times did not reveal significant relationships between scores on the late pupil dilation factor and reaction times for any valence (r<.16 for all pairs).

Brief Summary

Together these findings suggest that depressed individuals generally respond similarly to all words, having little early dilation and higher late dilation. In contrast, nondepressed individuals pay attention to both the affective and semantic content of stimuli very quickly after they are presented. Nondepressed individuals thus respond differently to different types of stimuli based on their valence and personal relevance.

Personal Relevance

The neural network model predicted that depressed people would process personally relevant negative information differently than nonpersonally relevant information on every available measure. Specifically, it predicted they should react more quickly to, make fewer errors on, and continue to think about personally relevant information more than non-personally relevant information. These predictions are representative of a broader issue addressed by this dissertation, involving whether individuals process personally relevant information differently from non-personally relevant information, and whether this phenomenon warrants the inclusion of personally relevant information in future experiments. To answer this broader question, it is useful to consider whether there were predicted and non-predicted interactions with personally relevant information on many of the observed measures. A number of omnibus tests were performed to answer this question that are not followed up here,¹³ so as to keep the amount of analysis consistent with the exploratory nature of the question.

On the lexical decision task, omnibus tests of valence (3: positive, negative, neutral) x personal relevance (2: relevant, non-relevant) x depression status (2: depressed, nondepressed) interactions on reaction times revealed a personal-relevance x valence interaction, F(2,42)=4.41, p<.018, η²=.174. On the valence identification task, the same test revealed a significant valence x personal-relevance x depression status interaction, F(2,41)=4.70, p=.015, η²=.186. The same test examining rates of mislabeling valences on the valence identification task revealed a personal-relevance x valence interaction, F(2,45)=29.6, p<.001, η²=.568. Similarly, on the lexical decision task, a significant personal relevance x valence interaction was observed using signal detection rates (D) as the dependent variable, F(2,45)=5.89, p=.005, η²=.207.

The analogous test for the pupil dilation factor scores was a valence (3) x personal relevance (2) x component (3: late-sustained attention, cognitive, motor) x depression status (2) multivariate split-plot ANOVA on component loadings. On the valence identification task, the ANOVA revealed a personal relevance x component x valence x depression status interaction, F(4,39)=2.75, p=.042, η²=.22. While this ANOVA was difficult to interpret in the context of the relatively low power for decomposing a four-way interaction, it does suggest that personal relevance could be important to understanding physiological as well as behavioral responses.

 

 

V. DISCUSSION

This dissertation has investigated a problem that at first seems quite simple: what do depressed people think about when they think about negative things?* Assumptions about relevant neuroanatomy were formalized in a computational model. The model suggested that depressed individuals relate incoming stimuli to personally relevant negative information, and think about that, instead of the presented information. Based on this process, the model predicted that depressed people would pay little early attention to environmental stimuli, but as time went on, they would begin to think about personally relevant negative information in response to any internal or external stimuli. Depressed and nondepressed individuals were given an affective lexical decision task and valence identification task to examine these hypotheses.

Results supported a number of predictions from the neural network model. Depressed people displayed little cognitive effort during the early stages of information processing and more cognitive effort during the late stages of processing, as measured by pupil dilation. In contrast, nondepressed individuals expended cognitive effort during earlier stages of attention and not as much during later stages of attention. The finding that depressed individuals devoted the most attention to the earliest pupil dilation component may reflect early perceptual vigilance. Results analyzing signal detection rates and reaction times further suggested that depressed people appeared to pay particular attention to negative information. Fewer differences in reaction times and signal detection rates were observed for nondepressed individuals. Results, along with their implications, are summarized in more detail in the following sections.

What This Experiment Could Say About Depressed People

Depression leaves people miserable, thinking about negative things, feeling bad, and frequently, becoming suicidal. By understanding the processes that transform normal patterns of attention and association to be very negative, insight can be gained regarding the experiences of depressed people. This study specifically lends insight into aspects of attention occurring in the seconds after people encounter information. Specifically, results suggest that depressed people may tend not to pay attention to emotional information immediately after it is presented, but may continue to attend to it, even after it is taken away. This tendency is associated with reacting quickly to negative information when individuals’ attention is directed towards emotional aspects of information. Together these findings suggest that depression is a disorder characterized by sustained attention, and potentially, by relating that information to previously well-learned negative information. Were phenomena observed on a millisecond scale exaggerated in a more ecologically relevant environment, a depressed person’s experience might be characterized by thinking for a prolonged period in response to environmental stimuli, and only acting on them if they are negative. If they are positive, the stimuli might be associated with something negative, before the individual reacts. These conclusions are explored in more detail in the following sections.

What This Experiment Suggests About Depression: Depressed People Have Sustained Attentional Responses to Many Things

The lexical decision and valence identification tasks were evaluated on a number of dimensions to understand whether depressed individuals processed negative information in a unique manner. Signal detection theory was used to interpret behavioral responses. This analysis suggested that, in comparison to nondepressed people, depressed people tended to evaluate positive, neutral, and negative stimuli as negative more often than they interpreted them as either positive or neutral on the valence identification task.
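
As a minimal sketch of the kind of signal detection indices referred to above, sensitivity (d') and response criterion (c) can be computed from counts of hits and false alarms for one response category (here, calling a word "negative"). The function name, the log-linear correction, and all counts below are illustrative assumptions; they are not the analysis or the data reported in this study.

```python
from statistics import NormalDist


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for one response category.

    A log-linear correction (adding 0.5 to each cell) is assumed here
    so that rates of exactly 0 or 1 do not yield infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


# Hypothetical counts: 30 negative words (signal) and 60 positive or
# neutral words (noise), scored for the response "negative".
liberal = sdt_indices(hits=28, misses=2, false_alarms=15, correct_rejections=45)
conservative = sdt_indices(hits=24, misses=6, false_alarms=3, correct_rejections=57)

print("liberal responder:      d' = %.2f, c = %.2f" % liberal)
print("conservative responder: d' = %.2f, c = %.2f" % conservative)
```

Under these assumed counts, the first responder has a lower (more liberal) criterion for answering "negative" than the second, which is the form a bias toward evaluating stimuli as negative would take in these indices.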

This relative discrepancy in response biases is in keeping with the idea that depressed people think about negative things in response to any presented information. To support the idea that this finding was not just a function of the valence identification task, it may be noted that initial analysis of the word-rating data showed that depressed individuals’ ratings for negative words were in keeping with normed ratings (i.e., the normed category from which words were chosen, as reported by Siegle (1994)) more often (81% of the time) than ratings for positive (70% of the time) or neutral words (75% of the time), which were often rated as negative, as shown in table five.14 Yet, depressed individuals rarely interpreted non-words as (negative) words on the lexical decision task, suggesting there is some recognition threshold below which they do not deem information as negative.

The next level of detail at which results were interpreted was also behavioral. The time it took individuals to make a response on the tasks was used as an index of the amount that thinking about negative things interfered with various decisions. Analysis of reaction times on the lexical decision and valence identification tasks suggested that depressed individuals were slow to say that positive words were positive, and were comparatively quick to say that negative words were negative, except when they were personally relevant, but only on the valence identification task (Figures 7-8). The implication for depression is that depressed people will very quickly respond to negative aspects of negative information in their environment, when their attention is drawn to the emotional aspects of things. This is true unless the information directly pertains to them (potentially, to their depression). Such information may be so powerful a stimulus that depressed individuals will find themselves thinking about it, rather than acting. That is, taking a long time to respond to personally relevant information on information processing tasks may correspond to being paralyzed with inaction in the presence of personally relevant information in more ecological situations.

To better understand processes leading to such sustained attention to negative information, a third level of granularity was explored. The time-course of attention was evaluated using pupil dilation as a measure of cognitive load. Results from this analysis were somewhat consistent with hypotheses. In comparison to nondepressed individuals, depressed individuals showed relatively little early attention to most stimuli, but as time progressed to later stages of attention they displayed more cognitive activity than nondepressed individuals, regardless of the valence of the presented information. As results were time-locked to reaction times rather than stimulus onset, it is suggested that this result is due to sustained expenditure of cognitive resources rather than psychomotor slowing.*
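
The logic of time-locking to the response rather than to stimulus onset can be sketched as follows. This is an illustrative toy, not the processing pipeline used in the study: the function names, the window sizes, and the two sample traces are assumptions.

```python
def response_locked(trace, rt_index, before, after):
    """Return the samples from rt_index - before through rt_index + after.

    Returns None when the requested window runs off either end of the trace.
    """
    start = rt_index - before
    stop = rt_index + after + 1
    if start < 0 or stop > len(trace):
        return None
    return trace[start:stop]


def average_waveform(windows):
    """Average equal-length windows sample by sample, skipping None entries."""
    kept = [w for w in windows if w is not None]
    if not kept:
        return []
    return [sum(column) / len(kept) for column in zip(*kept)]


# Two hypothetical trials with the same dilation shape around the response,
# but different reaction times (response at sample 2 versus sample 4).
fast_trial = [0, 0, 1, 3, 2, 1, 0, 0]
slow_trial = [0, 0, 0, 0, 1, 3, 2, 1]

windows = [
    response_locked(fast_trial, 2, before=1, after=3),
    response_locked(slow_trial, 4, before=1, after=3),
]
locked_average = average_waveform(windows)
stimulus_average = average_waveform([fast_trial, slow_trial])

print(locked_average)    # shape preserved when aligned to the response
print(stimulus_average)  # shape smeared when aligned to stimulus onset
```

In this toy case, aligning to the response recovers the common waveform, whereas averaging from stimulus onset flattens and delays it, so that a slower responder would appear to show later dilation for reasons unrelated to sustained processing.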

Taken together, these findings present a remarkably consistent picture of depression. Depressed individuals generally appear to pay rather little attention to information as it is presented. As time passes they begin to devote cognitive resources, possibly to thinking about the affect associated with the information. They turn things negative. Seconds after a stimulus is presented, they are still thinking, but about whatever negative information is central to their depression, rather than the presented stimulus. To the extent that the presented computational model of affective information processing is correct, this phenomenon might be considered indicative of feedback between cognitive and affective processing systems. Nondepressed individuals processed the stimulus earlier, and were done with it more quickly.

An important point for understanding implications of these results is that depressed individuals could associate stimuli with personally relevant information, and continue to think about that information, whether or not greater pupil dilation was observed in response to personally relevant than to nonrelevant information in the late stages of attention. If nonpersonally relevant information is associated with personally relevant information in the early stages of attention, both types of information would be expected to yield the same attentional profile in later stages of attention. That is, in either case, the depressed individual would be expected to continue to think about personally relevant information. For the interpretation to be consistent with the data, the only observation that needs to hold is that depressed individuals show greater dilation in the late stages of attention, in response to all types of stimuli, than do nondepressed individuals. This pattern of information processing could explain an aspect of the phenomenology of depression that is often noted clinically. Depressed people often suggest that they can only think of personally relevant negative information. Cognitive therapies often challenge this notion. Based on the current model, it is suggested that depressed people might be right. The current model, like a number of earlier models (e.g., Teasdale, Segal, & Williams, 1995), predicts that depressed individuals will pay little attention to environmental stimuli when they are presented. Seconds after any information is perceived, it is suggested that depressed individuals associate it with whatever negative thoughts are central to their depression.

This pattern of information processing could also explain a number of the deficits often observed in depressed people. For example, depressed people’s reaction times on the current tasks were often slower than nondepressed people’s (Figures 7-8). A common explanation for this phenomenon involves motor slowing. The current explanation suggests instead that a lack of early attention to stimuli could make depressed people begin to process them later, and hence respond to them later. An analogous mechanism produced the apparent slower responses in the computational neural network model after overtraining; there is no motor component represented in the neural network model. More generally, depressed people with this pattern of information processing could find it difficult to attend to presented information (because they think of personally relevant negative information too much), have difficulty solving problems (because they are distracted by their cognitions regarding personally relevant negative information), and have difficulty remembering anything but personally relevant negative information.

Operationalizing Rumination

One exciting aspect of this project has been the identification of physiological responses (i.e., sustained pupil dilation) that may reflect rumination, defined as prolonged feedback between cognitive and affective processing systems. That is, pupil dilation indexes sustained attention in response to presented information long after the information has been taken away. When pupil dilation data were analyzed using principal components analysis, the component that accounted for the greatest variance in pupil dilation waveforms peaked at about four seconds after participants’ reactions to a stimulus. This component occurs much later than would be expected for processes leading to individuals’ decisions. Many individuals, in fact, remarked that by the end of the measurement period for a stimulus, they had begun to think of things other than the stimulus. As such, it seems reasonable to consider this late dilation factor as representing internal cognitive processes. That the component was greater for depressed individuals suggests that it represents cognitive processes that are present to a greater extent in depressed than nondepressed individuals. Based on the computational model, it is suggested that such sustained attention may reflect sustained feedback between cognitive and affective processes, which may be consistent with the clinical phenomenon of "rumination".
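
The idea of extracting a dominant component from a set of pupil waveforms, and locating the time at which it peaks, can be sketched as below. This is a simplified illustration under stated assumptions: it uses power iteration on the covariance matrix to recover only the first unrotated component, which is a swapped-in technique and not necessarily the procedure used in this study, and the waveforms are synthetic.

```python
import math


def first_component(waveforms, iterations=200):
    """Return (loadings, scores) for the first principal component.

    waveforms: list of equal-length lists, one waveform per trial.
    Assumes the waveforms are not all identical (nonzero variance).
    """
    n, t = len(waveforms), len(waveforms[0])
    means = [sum(w[j] for w in waveforms) / n for j in range(t)]
    centered = [[w[j] - means[j] for j in range(t)] for w in waveforms]
    # Covariance between every pair of time points.
    cov = [[sum(row[a] * row[b] for row in centered) / (n - 1)
            for b in range(t)] for a in range(t)]
    # Power iteration: repeated multiplication converges on the eigenvector
    # with the largest eigenvalue, provided the start is not orthogonal to it.
    vec = [1.0 / math.sqrt(t)] * t
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(t)) for a in range(t)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    scores = [sum(row[j] * vec[j] for j in range(t)) for row in centered]
    return vec, scores


# Synthetic waveforms: trials differ only in the size of a late bump.
late_shape = [0, 0, 0, 1, 2, 1]
waveforms = [[amplitude * s for s in late_shape] for amplitude in (0, 1, 2, 3)]

loadings, scores = first_component(waveforms)
peak_sample = max(range(len(loadings)), key=lambda j: abs(loadings[j]))
print("component peaks at sample", peak_sample)
print("scores:", [round(s, 2) for s in scores])
```

In this toy example, the component's loadings peak at the sample where the late bump is largest, and trials with a larger bump receive higher scores, which is the sense in which a "late component" score can index sustained dilation.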

Yet, even if late sustained attention is consistent with clinical descriptions of rumination, it is not clear that the rumination measured by the late component of pupil dilation is the same type of rumination discussed by depression and personality researchers. Nolen-Hoeksema’s Response Styles Questionnaire (RSQ) was used as a self-report measure of rumination. The RSQ was designed to measure a type of rumination characterized by thinking excessively about one’s symptoms of depression, when one is depressed. That is, the questionnaire instructs participants to respond by rating how strongly they endorse certain behaviors when they are depressed. Behaviors on the rumination scale were intended to reflect thinking about depressive symptomatology (Nolen-Hoeksema & Morrow, 1991), e.g., "think about how you don’t seem to feel anything anymore." RSQ rumination scores were not strongly correlated with scores on the late component of pupil dilation. This late component also does not seem to measure the same "ruminative" processes that cause interference in reaction times on the tasks, as its peak occurred long after reaction times, the component was barely represented in the time interval around the reaction time, and the component was not correlated with reaction times.

Another possibility is that sustained attention, as indexed by sustained pupil dilation after the presentation of a stimulus, may not represent rumination at all. For example, depressed individuals may have difficulty disengaging attention from information, or their pupils may take longer to reflect attentional disengagement. Alternately, pupil dilation may reflect sustained affective engagement with presented information that does not have a cognitive component. All of these possibilities, along with the weak associations between sustained dilation and traditional measures of rumination, point to the need for further research to better understand the relationship of sustained pupil dilation to rumination.

A final consideration regarding potential correspondences between pupil dilation and rumination involves the time-scale on which rumination is expected to occur. Rumination, considered as prolonged feedback between cognitive and affective processing systems, is expected to begin on the order of seconds after a stimulus is presented. Conceivably, this process could last until another stimulus needs to be processed. Rumination, as conventionally conceived and as described clinically, is often thought of as occurring over days or weeks. Potentially, this discrepancy points to fundamental differences between the current formulation of rumination and more conventional notions.

Evaluation of Predictions from the Neural Network Model Based on the Empirical Data

To better understand the validity of the computational neural network model, it is useful to examine which of the predictions based on the neural network model were borne out empirically, and which were not. Because all predictions were based on the output of a computational model that reflected a conceptual understanding of depression, mechanisms behind each confirmed or disconfirmed hypothesis can be traced explicitly to aspects of the model. In turn, suggestions can be made for what changes to the model must be made to achieve a conceptual understanding of depression that is consistent with the data.

Predictions Consistent With the Data: Seeing Things Negatively, and Sustaining Attention

Some of the model’s predictions were borne out. These predictions are described in the following section along with the aspects of the model that they likely support. More speculative conclusions regarding implications of the model are saved until non-confirmed predictions are also discussed.

Behavioral indices on the Valence Identification Task. The model’s behavior suggested that depressed people would interpret most information somewhat negatively and would tend to associate any information with whatever made them depressed (Figure 5). It also suggested that they would therefore be quick to identify the valence of negative information, and slow to identify the valence of positive information (Figure 3). These predictions were largely confirmed empirically (Figure 8). Because the model’s valence identification task performance was dependent primarily on activation in a system responsible for identifying affect, these observed biases can be seen as consistent with disruption in cognitive or brain systems responsible for affect identification or feedback between systems responsible for affective and semantic identification.*
 

Low initial pupil dilation, and high late pupil dilation in depressed individuals. In comparison to nondepressed individuals, depressed people were observed to show relatively little cognitive load, indexed by pupil dilation, in the early stages of attention, and more cognitive load in the late stages of attention on the valence identification task (Figure 9). This behavior matches the output of the network for all but personally relevant negative stimuli. The decrease in early attention in the network is due to the network’s learning rule, in which new information prevents access to previously learned information. If the same mechanisms operate in people, this finding would suggest that, if depression really involves overlearning, this overlearning could not only strengthen connections between depressed individuals’ mental representations of negativity and personally relevant negative information, but that connections to other information, within their semantic networks, could become weaker as they become depressed.15 Such a process would be in keeping with depressed individuals’ frequent reports that they "can’t think straight," and with findings outlined by MacLeod and Matthews (1991) detailing a number of difficulties depressed individuals have in cognitive tasks ranging from free recall to problem solving. It would also suggest that a possible therapy for depression might involve relearning some of the connections that have become weak during the induction of depression. Supporting this idea, Siegle (1998a; 1999) has shown that retraining the overtrained network on positive and neutral exemplars can lead to the absence of information processing biases in the overtrained network.

Importance of measuring personally relevant information. Although not all predictions regarding the nature of individuals’ reactions to personally relevant information were supported, the notion that individuals would react differently to personally relevant and nonrelevant material across a variety of measures was supported. Reaction times were sensitive to personal relevance on both tasks. Depressed individuals were especially slow to recognize that personally relevant negative information was negative, but were quick to recognize that other information was negative (Figure 8). People were generally slower to respond to non-personally relevant negative and neutral information than to personally relevant negative and neutral information on the lexical decision task. Similarly, personal relevance x valence interactions were present on error rates for both tasks. Analysis of pupil dilations during the tasks showed that personal relevance interacted with factor scores, valence, and status on the valence identification task.

Predictions Not Consistent with the Data: Attending Differently to Different Valences

Some of the network’s predictions were not borne out. Each of these predictions is analyzed in detail in the following section, with particular emphasis on mechanisms that could have led to differences between the model and reality.

Before separate factors accounting for each difference between model predictions and observed data are discussed, the overarching possibility that the entire model is largely incorrect can be addressed as a hypothesis that could explain all observed inconsistencies. There are a number of fundamental weaknesses in the model that could lead to the adoption of this conclusion. A simulated affective recognition system containing two nodes, and a simulated semantic recognition system containing only nine nodes without feedback between them, are hopelessly inadequate representations of the human cognitive system. The neurons in the model are not very neural (i.e., they are simply summative indices of incoming activations, and do not have simulated dendritic hairs, ion-gated channels, etc.), and the systems they create together do not interestingly represent known nuances in the cognitive attentional system. Similarly, without a module representing cortex, it is optimistic at best to think that the model would accurately reflect complex cognitive processes that go into generating behaviors such as motor movements in response to cognitive tasks. The idea that depressed individuals are overtrained on negative information, along with the notion that depression has anything to do with learning or connection weights, could be completely wrong, in which case it would lead to incorrect predictions for the current experiment. Any of these errors could be seen as justification for wholesale dismissal of the model.

Still, discarding the entire model based on inconsistencies with observed data may involve rejecting valid as well as invalid aspects of the model. A more fine-grained approach to examining each inconsistency may allow aspects of the model to be preserved, and conclusions to be drawn about depression based on these aspects. This strategy can lead to iterative refinement of the model, and generation of new experiments to test relevant refinements. In short, this strategy fuels a potentially productive research cycle with theoretically motivated questions, rather than simply negating it. It also encourages the model to be expanded in theoretically useful and parsimonious directions rather than leading to a model that is built up unnecessarily by initially incorporating all aspects of brain function, or accounting for irrelevant cognitive phenomena. A final point along these lines is that all models are technically "wrong" in that there are aspects of reality they fail to mimic. Understanding how wrong a model is can thus be more useful than simply acknowledging the notion that the model is "wrong".

Personally Relevant Negative Information. The network’s behavior made clear predictions regarding differences in how depressed individuals should respond to personally relevant negative information, and other information, which were often not borne out in either the reaction time or the physiological data. For reaction times, the network’s performance suggested that depressed individuals should always respond quickly to personally relevant negative words. In general, depressed people responded significantly more slowly to personally relevant negative information than to positive or nonpersonally relevant information (Table 7). The network also predicted that pupil dilation should always be greatest for personally relevant negative information. Pupil dilation was not always greatest for personally relevant negative information. The only time pupil dilation appeared to truly increase in response to personally relevant negative information was on the late component on the lexical decision task, and in this case, it was high for both depressed and nondepressed individuals. Counter to predictions, nondepressed individuals showed a greater difference in pupil responsivity to personally relevant negative and nonpersonally relevant positive words than did depressed individuals. To understand mechanisms that could have produced these non-predicted behaviors in people, it may be useful to understand why predicted behaviors occurred in the computational model. The network’s quick responses to personally relevant negative information came because it had received the most training on (i.e., the most experience perceiving) this information and thus had developed the strongest connections from aspects of the personally relevant stimulus to a response. A direct explanation for the failure of depressed individuals to respond quickly to negative information is, thus, that they did not have more experience with personally relevant information than other types of information. 
By this logic, the current results might suggest that nondepressed individuals had more experience with personally relevant negative information than positive information.* As little literature has previously suggested this explanation, other explanations for the same data might be warranted.

One possibility is that even if depressed individuals had a great deal of experience with personally relevant information, non-modeled variables prevented them from responding to it quickly. In support of this idea, at least three depressed individuals who displayed slow reaction times to personally relevant negative words suggested that they were so affected by the personally relevant information that they became momentarily distracted from the task. They reported that this distraction contributed to their delayed responses. Two depressed individuals broke down in tears repeatedly during the presentation of personally relevant negative information, before responding to it. As such, discussion with depressed individuals suggests that this slowing may result from cognitive mechanisms that inhibit (or distract an individual from) motor responses such as button presses. According to this hypothesis, when depressed people think hard about negative information, their entire attention is drawn to the thinking, and away from their motor response. No mechanism for cognitive inhibition of motor responses was represented in the neural network, and thus none was accounted for in initial predictions. This explanation would necessitate that more attention is actually paid to personally relevant negative information than other information. Other mechanisms that could have prevented depressed people from responding quickly to personally relevant information seem less consistent with an integrative interpretation of the data. For example, the theory of perceptual defense (Powell & Helmsley, 1987) posits that depressed individuals systematically ignore negative information. This explanation is not consistent with depressed individuals’ especially quick reaction times to non-personally relevant negative words, in comparison to nondepressed individuals. 
Similarly, it is possible that negative words generated by depressed individuals were not representative of the same information on which they were overtrained. Again, were this the case, it would be unclear why depressed individuals responded to these words not merely less quickly than predicted, but more slowly than to other negative words.

Lexical decision task signal detection rates. Because signal detection rates were generally fairly high on the lexical decision task, depressed individuals did not appear to experience more difficulty in recognizing words of some valence. Disruption in signal detection on the lexical decision task in the neural network happened only when it was overtrained so much that all stimuli were immediately associated only with the negative association on which the network was overtrained. One explanation for the observed lack of false alarms that is consistent with the network model is that while depressed individuals tended to see things somewhat negatively, not all incoming stimuli were immediately associated with personally relevant negative information.

Lexical decision reaction times. Very few behavioral predictions regarding performance on the lexical decision task were confirmed empirically. That is, depressed individuals did not react more slowly to non-personally-relevant negative words than other words (they responded more quickly to them than neutral words). Depressed individuals did not respond significantly faster to personally relevant negative words than other words (Figure 7). On measures of cognitive load, depressed individuals showed equivalent load in response to negative and positive information, while nondepressed individuals showed larger responses to positive than negative information (Figure 11).

Potentially, these findings suggest that the neural network model was not a valid model of depressive information processing. Alternately, the model may represent aspects of depressive processing that are not reflected in reaction times on the lexical decision task. For example, the prediction of differential reaction times to positive and negative information on the task in the network was predicated on the assumption that the task reflected semantic processing of information. Arguably, the lexical decision task could instead be done entirely through syntactic recognition processes, without ever causing semantic activation to occur. Were the lexical decision task not to involve any semantic recognition before a response is made, there would be no reason to assume that any biases would be present. The importance of this notion is that reaction times may not be a powerful method for assessing differences in cognitive load in response to different valences on an easy and relatively automatic task that involves little more than reading a word.

There are also a number of ways to explain the obtained data based on the model. One promising candidate involves how many negative experiences are involved in making someone depressed. This possibility can be investigated by examining how the analog of depression was established in the network. Based on the idea that depressed individuals have just a few salient negative experiences, the network was overtrained on just one (or, in early simulations, just a few) negative stimuli. Ingram (1984) has suggested that some depressed states may, instead, involve overexposure to many negative stimuli. Siegle (1994) has shown that in this case, all negative stimuli would act as personally relevant negative stimuli. The network would respond quickly to them. Siegle (1996) suggests that a population of individuals in which some have just a few personally relevant negative exemplars, and others have many could effectively cancel each other out; no biases on the task would be observed for non-personally relevant information.*
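
The cancellation argument above is arithmetic in nature and can be illustrated with a toy example. All values below are hypothetical reaction time biases (reaction time to non-personally relevant negative words minus reaction time to neutral words, in milliseconds) invented for illustration; they are not data from this study or output from the network.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical subgroup with only a few personally relevant negative
# exemplars: slowed responses to other negative words (positive bias).
few_exemplars = [40, 55, 35, 50]

# Hypothetical subgroup overexposed to many negative stimuli: all negative
# words behave as personally relevant, so responses speed up (negative bias).
many_exemplars = [-45, -50, -40, -45]

pooled = few_exemplars + many_exemplars
subgroup_means = (mean(few_exemplars), mean(many_exemplars))
pooled_mean = mean(pooled)

print("subgroup means:", subgroup_means)
print("pooled mean:", pooled_mean)
```

With these assumed values, each subgroup shows a sizable bias, but the pooled sample shows none, which is why a null group-level result on the lexical decision task need not imply the absence of biases in individuals.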

Another explanation for the relative dearth of biases in depressed individuals involves the number of negative experiences needed to engender biased processing in depression. Potentially, adjusting the ratio of training the network underwent on positive and negative information could lead it to mimic empirical results on the lexical decision task. Such a process would undoubtedly change simulated results on the valence identification task as well. Extensive formal modeling would be necessary to establish whether this process could help to better understand the experiences of depressed individuals.*

Differential pupillary response. Based on the network’s behavior (Figure 5) it was expected that depressed individuals would generate different patterns of pupil dilation in response to different valences. Empirical data did not support this hypothesis; depressed individuals tended to respond similarly to all valences on each of the extracted components of pupil dilation.

As with other non-confirmed predictions, the lack of confirmation for these predictions might reflect fundamental errors in the model. For example, the model’s simulated analog of pupil dilation might not reflect important mechanisms involved in pupil dilation in people. Similarly, were overtraining not a good analog of what makes depressed individuals different from nondepressed individuals, differential responses to different valences might not be expected. Other explanations for the observed null results allow more of the model to be preserved. Further testing is required to evaluate the relative merits of adopting such explanations rather than dispensing with the entire model.

For example, when the network’s simulated pupil dilations are examined as a function of overtraining (Figure 6), it is interesting to note that the pattern of differences between the overtrained network on early and late dilations is quite evident. Differences in the network’s final dilations to items of different valences after overtraining on negative information are negligible. Formally, the network’s behavior suggests that depressed individuals should not selectively experience prolonged cognitive engagement in response to negative information (in fact, the network’s final dilation to even personally relevant negative information is comparable to that for other valences). Rather, they should continue to attend in response to all information. That is, depressed people should think equally and intensely about personally relevant negative information in response to all other information, long after it is presented. The prediction most consistent with the network’s behavior was that sustained attention would increase for all valences on the tasks. Perhaps human behavior corresponded more to the network’s behavior in the very late stages of attention than in the rather large window over which dilations were initially summed. Even so, the network’s performance did seem to suggest that depressed individuals would respond differently throughout the course of attention to negative stimuli that were closely tied to their depression, versus other stimuli. As above, one explanation for the lack of especially high pupil dilation in response to personally relevant negative words would be that these words did not represent negative concepts that were actually closely associated with participants’ depressions, even if they were otherwise personally relevant.

The same explanation given to explain the relative absence of biases in behavioral indices in depressed individuals on the lexical decision task can also be considered as an explanation for the lack of differentiation in depressive pupillary responses. If nondepressed individuals have more positive than negative experiences, and depressed individuals have more even ratios of experiences it would be expected that depressed individuals would respond more evenly to negative and positive stimuli than nondepressed individuals. A seemingly fatal flaw for this explanation is that overtraining on positive information would suggest that nondepressed people should show more dilation in the late stages of attention to positive stimuli than depressed individuals; that is, they should experience excessive attention to positive information. This phenomenon was not suggested by the data. Moreover this expectation does not intuitively capture the phenomenology associated with depression. Clinically, depressed people complain of thinking excessively about negative information (i.e., not being balanced) whereas nondepressed people don’t often present for treatment with complaints of excessive thought about positive information.

Equivalence of lexical decision and valence identification task behavior. Based on the network’s behavior, the time course of pupil dilation was predicted to be equivalent for the lexical decision and valence identification tasks, even if behavioral measures were not. While the appearance of aggregate curves for each task was similar (Figure 9), the pattern of greater sustained attention in depressed versus nondepressed individuals was only significant on the valence identification task. The lack of a significant difference on the lexical decision task may be due to low power to detect this difference above and beyond the main effect of valence. Both depressed and nondepressed individuals reacted very differently on the lexical decision task to words of different valences. The large spread between these means could have obscured overall differences in group means. Were this study replicated with more power, specific valence x status interaction contrasts could be tested in a planned fashion. Alternately, as above, the obtained null results may suggest any number of fundamental errors in the computational model.

Take Home Lessons

In summary, the model’s predictions regarding reaction times and response biases on the valence identification task were largely confirmed. In addition, the notion that depressed individuals would generally have low initial pupil dilations and higher sustained dilations, in comparison to nondepressed individuals, was supported. Finally, the notion that people would react differently to personally relevant and nonrelevant information was supported. In contrast, predictions for reaction times and signal detection rates on the lexical decision task were not confirmed. Specific predictions regarding the nature of differential attention to personally relevant information were not confirmed. Predictions regarding differential pupil dilation in response to different emotional valences were also not confirmed.

Based on aspects of the model that were and were not supported empirically, it is suggested that the model may be useful in understanding some aspects of depressed individuals’ allocation of attention to emotional information. Since biases were different on the valence identification and lexical decision tasks, and because predictions were confirmed for the valence identification task, the notion that individuals process affective and non-affective aspects of information in parallel, and that depressed individuals differentially attend to affective aspects of the information, is supported. Conclusions cannot be drawn regarding differential attention to non-affective aspects of information.

Depressed individuals’ sustained attention to all stimuli may, as in the model, reflect an association of all incoming stimuli with a single type of information, e.g., personally relevant negative information. Potentially, this finding suggests that depressed individuals do not think a great deal about information as it is presented, but later, continue to think about it and to relate it to singular, potentially irrelevant associations. Such an information processing style might be targeted for cognitive intervention. Finally, differences in depressed and nondepressed information processing are not consistent for personally relevant and nonrelevant information. Because theories of depression so often regard depressed individuals thinking about things relevant to their depression (e.g., what made them depressed), and because tests of information processing so often do not employ such information, this finding seems important evidence that future experiments should attempt to assess aspects of personally relevant information processing, possibly in addition to nonpersonally relevant information processing.

Potential Treatment Implications

Ideally, information about depressive attentional styles can be used to inform interventions. For example, it might be useful for clinicians to address aspects of information upon which depressed individuals focus during cognitive interventions. Better understanding of aspects of emotional information on which depressed individuals focus, e.g., through assessment of individuals’ performance on the currently examined tasks, might aid this endeavor. Simulating factors that lead to remediation of cognitive bias in the presented computational model could assist in the creation of such interventions.

A specific question sparked by the current model involves the role of identifying negative thoughts and emotions during cognitive interventions, a central tenet of some cognitive therapies (e.g., Greenberger & Padesky, 1995). The current model suggests that focussing excessively on such information might strengthen cognitive connections to it, and thus increase the chances that depressed individuals make negative associations with environmental stimuli. Based on the current model, it is speculated that training that allowed individuals to manage their attentional focus might increase the benefit of such therapies for depressed individuals who tend to maintain attention to negative aspects of information once they consider them. Strategies for training attentional control such as mindfulness meditation techniques may be useful in this regard (Teasdale et al., 1995). Similarly, attentional control training strategies from the cognitive rehabilitation literature may be used to help individuals learn to direct and sustain their attention to important aspects of a stimulus, rather than the aspects that have become most natural for them (e.g., the emotional valence). Pharmacological interventions that target brain areas responsible for feedback between affective and non-affective aspects of information may also be useful in this regard. Of course, a great deal of empirical research is necessary to support the use of any of these techniques.*

Limitations

A number of limitations in the methodology used in this dissertation, and in the generalizability of conclusions based on the presented experiments, should be addressed.

Limitations of the Neural Network Model

The computational models described in this manuscript are admittedly extremely simplified and very speculative. The analogs of very complex phenomena which they display capture few of the actual intricacies of depression. As such they are intended only as springboards for thinking about a set of apparently disparate biases that frequently appear impenetrable. McDermott’s (1981) caution that we should not let "artificial intelligence" approach "natural stupidity" by positing computers that "think" or actually get "depressed" must be interpreted in the strongest sense.

Additionally, the following concerns regarding the network’s performance even in the limited domain it does try to model are notable. The model does not account for relationships among semantic features. No feedback occurs within the semantic and within the affective processing units. Thus, hypotheses regarding whether depressive information processing biases stem from feedback between these structures versus within the structures cannot be addressed. The model does not differentiate between short- and long-term memory and thus, mood state effects cannot be differentiated in the network from trait effects. Finally, the model is very small. While humans have on the order of 10^10 neurons, years of life experiences, and numerous contexts in which stimuli can be interpreted, the model has under 50 (i.e., 5*10^1) simulated neurons, is taught only nine patterns, and has exactly two contexts in which information can be interpreted - valence identification and lexical decision. Needless to say, the model is surely reductively simple. Again McDermott’s words of caution suggest that the model’s performance be used as a way to generate testable hypotheses rather than as an end in itself.

The model is further subject to the limitations of any modeling effort. Decisions for inclusion and exclusion of components are largely a function of the whims of the model’s creator. That is, modelers decide what aspects of the universe are important enough to include in the model; a modeler who is not convinced that some features are important may not include them whereas another modeler will. As such, the parameters a modeler chooses to put into the model affect not only the model’s structure but the predictions that result from it. Moreover, the model will likely fail in many cases for which it was not specifically designed (as well as some of those for which it was designed). The model represents the behavior of an idealized group of individuals that may not exist in reality.

Another example of ways in which the model’s creator could have erred in its application involves the population that its performance was assumed to reflect. The model has been used throughout the current manuscript as a template for understanding clinical depression. As noted previously, the model may better represent a subset of clinically depressed individuals whose depressions are mediated by having learned some piece of negative information too strongly. Alternately, as the model does not account for many of the features of clinical depression (e.g., eating and sleeping disturbance, suicidality, decreased energy) the model may best represent a subclinical population. Conclusions specific to clinical depression, as opposed to a more broad or more specific condition, should therefore be considered especially tentative.

For all these reasons, the model is probably best considered a tool for hypothesis generation that reflects intuitions of its creator. By formalizing aspects of the programmer’s intuitions, the model allows predictions to be created that are necessarily consistent with the programmer’s assumptions, to the extent that the assumptions are implemented in the model. The ultimate test of the model’s utility may thus not be how accurately it portrays behaviors or physiological phenomena, but what interesting hypotheses it can be used to generate, what experiments result from its use, and whether these experiments can be used to explain something interesting about depression, regardless of the model’s predictions.

A final concern regarding the model involves its revision subsequent to the current study. Previously, the possibility of changing the model to better account for observed data has been suggested. This strategy is only useful to the extent that findings are valid. Modeling findings due to error would be both problematic and unwarranted. Thus care will need to be taken in the revision of the model based on the current study.

Limitations of the Sample of Research Participants

A number of aspects of the sample could affect the interpretation of obtained results. Because recruitment of an unmedicated clinically depressed population was difficult, the sample was smaller than initial power estimates suggested was necessary to detect effects for reaction times on the lexical decision task. Although obtained effect sizes suggest that no new reliable effects are likely to be detected, even with moderately increased power, this concern could be addressed empirically in future studies. There are also potential concerns involving the individuals who were recruited. The nondepressed controls in the experiment were recruited for having no current depressive symptomatology, and no past episodes of depression. The mean BDI score for this group was lower than for the general population (M_nondepressed = .68, M_depressed = 26.87). Eighteen control individuals scored zero on the BDI. Arguably, such symptom-free individuals are not representative of the general population of nondepressed individuals (Kendall, Hollon, Beck, Hammen, & Ingram, 1987). Potentially, differences between the control and depressed population could be as much a function of the unusual characteristics of the control group as the depressed group.

Additionally, the depressed group contained considerably more men than women. The nondepressed group contained more women than men. Aside from the inherent confounding of gender with depression in this recruiting strategy, the ratio of men to women in the depressed group is the opposite of expectations suggested by epidemiological studies; estimates of the point prevalence of major depression vary from 5-9% for women and 2-3% for men (American Psychiatric Association, 1994). The low sample sizes did not yield sufficient power to investigate the effects of gender in a rigorous manner. Were this study to be replicated in the future, the issue of gender representation could be addressed formally.

A final concern of the current sample involves whether the depressed sample was representative of the type of depression for which predictions were generated. As noted previously, depression was operationalized in the computational neural network model to reflect a cognitively mediated depression in which individuals were assumed to have had one or a few particularly negative experiences that they thought about a great deal, and hence learned very well. It was not meant to account for other types of cognitively or biologically mediated depressions. In contrast, the current sample consisted of a heterogeneous group of depressed individuals. To the extent that their depressions were based on other etiologies, they might not have been expected to perform according to the model’s predictions on administered tasks. Future studies that included a more homogeneous depressed sample could thus find stronger effects.

Limitations of the Sample of Stimuli

The stimuli used in this experiment were normed and idiosyncratically generated words. The ecological validity of using words to represent the types of emotional stimuli that individuals experience in their lives has been repeatedly questioned. Moreover, the normed words that were used may not have best represented the valences they were intended to represent. Williams et al. (1998) renormed the word-set used in this study on over 500 undergraduates. The spread of positive and negative words was fairly wide, with nominally negative words such as "tears" often being rated as both fairly positive and fairly negative. Restricting analyzed words to those that were rated by individuals in the same way as their normed valence may have skirted this issue to some extent, though valences were only rated on one dimension; the neural network model suggested they should be rated on both positivity and negativity. Similarly, by randomly selecting the normed words it is likely that the positive and negative words were more highly inter-related than neutral words. These relationships may have allowed previously presented positive and negative words to act as primes for subsequent affective stimuli, artificially increasing differences in indices of information processing to positive and negative words.

A final concern involves whether the idiosyncratically generated stimuli represented the stimuli that are specifically relevant to the depressions of the depressed participants. The word generation form (Appendix E) asked for words representative of what individuals thought about when they were down or depressed. Potentially individuals had little insight into such stimuli. Similarly, what individuals think about consciously may not be the most active representation after prolonged cognitive-affective feedback. A final possibility is that if depressed individuals consciously attempt to avoid thinking about such information (e.g., as suggested by Powell & Helmsley, 1987), they may not have written down the most personally relevant stimuli.

Limitations of the Analyses

Analyses reported in this dissertation involve only tests of planned contrasts. No omnibus tests or sensitivity analyses are included to aid in the interpretation of reported results. Planned contrasts that are significant may be qualified by interactions that are of interest. Similarly, obtained null results may be significant when examined in other ways. These analyses were left out of the dissertation because they did not specifically address predictions made by the computational neural network model. To aid in the interpretation of reported results, relevant omnibus tests and sensitivity analyses may be obtained from the author, or from the exploratory analyses provided in Siegle (1999b).

Limitations of the Design

The current study is cross-sectional. Although its results have been used to describe the relationship between attention and the onset and maintenance of depression, the cross-sectional nature of the study precludes the drawing of causal inferences from the data. Specifically, the model cannot be used to say whether attentional factors actually maintain depression. Longitudinal research would be more useful in establishing the role of negative life events in causing negative attention biases in depression.

Other Generalizability Concerns

A number of other factors concerning how constructs were measured or operationalized may also be of concern in evaluating the generalizability of the current results.

Comparing tasks. Comparisons of reaction times and cognitive effort on the valence identification and lexical decision tasks may be confounded by the different motor and discrimination requirements of the two tasks. Specifically, the valence identification task required participants to respond by pressing one of three buttons, and the lexical decision task required participants to respond by pressing one of two buttons. As it is unknown how much more effort a motor task involving discriminating between three options takes than discriminating between two options, it is unclear how to compare overall reaction times and pupil dilations for the tasks.

Comparing factor structures. The same PCA was used to derive factor scores for depressed and nondepressed individuals. This technique was largely supported by the similar appearance of factor loadings when PCAs were done independently for each population. Still, to test rigorously whether the same PCA should be applied to both populations, more formal methods could be employed. Such a test might compare structural equation models in which depression is and is not allowed to vary as a free parameter. Similarly, it was assumed that the same PCA applied to positive, negative, and neutral words which were and were not idiosyncratic. Various three-way factor analysis procedures (e.g., PARAFAC, Kiers & Krijnen, 1991; Snyder, Walsh, & Pamment, 1983) would be useful for examining whether the different valences actually yield different factor structures.

Comparing indices of rumination. A final factor that may limit conclusions that can be drawn from the experiment involves how rumination was measured. Nolen-Hoeksema’s Response Styles Questionnaire was used. This questionnaire asks individuals to rate how often they engage in various ruminative responses to depressive symptoms, involving focusing on symptoms of depression, when they feel sad or depressed. This strategy does not allow conclusions to be made regarding whether individuals ruminate in general, or just when they are in a sad mood. Additionally, it may not tap rumination on anything but an individual’s symptoms; people who ruminate primarily on negative events, other people, or other negative environmental information may not be labeled as ruminative by the questionnaire. Thus, reported conclusions regarding relationships between "rumination" and sustained attention, as indexed, e.g., by pupil dilation, could better be considered relationships between sustained attention and Nolen-Hoeksema’s particular notion of rumination.

Future Directions

The research described in this dissertation is part of a larger research program aimed at identifying cognitive and physiological correlates of attention biases in depression. By formalizing mechanisms for these biases in a computational model, and refining the model based on research similar to the current experiment, it is hoped that the computational model can be used to generate hypotheses that will lead to the creation of new treatments for depression, and new ways to prevent the disorder. Towards this end, a number of directions can be pursued that will further help to refine the model. A first step will involve further examination of how the model must be modified to account for the obtained data. Once this step is accomplished, new predictions may be made for similar experiments, and replication studies can be used to test them. Some refinements of the model may also help to make it better represent cognitive processes. For example, the only feedback within the model that is simulated occurs between the affective and semantic representations of stimuli within the network. This type of feedback does not allow for any interaction between the semantic representations of different stimuli, except through their affective valence. In contrast, most theories of spreading activation in semantic networks, including Bower’s (1981) network theory, suggest that feedback between the mental representations of different semantic concepts does occur. Recurrent connections in the CA1 area of the hippocampus also point to the need for feedback between semantic units. Even if nodes representing the affective valence of a stimulus are not activated, this feedback may be thought of as allowing interference with the detection of a given semantic concept. This phenomenon could be captured in the current framework by allowing feedback between the units responsible for the semantic representation of a stimulus.
A number of popular models of lexical processing in which the lexical decision task has been simulated (e.g., Seidenberg & McClelland, 1989) allow feedback to occur between units on the same layer with many intuitive results, e.g., the easier identification of common than uncommon words. Thus, to improve the correspondence between the current model and cognitive theories of associative memory, feedback could be allowed between semantic units, as well as between the semantic and affective units. The effects of this manipulation on simulated reaction times and signal detection rates can be observed. This feedback was not implemented in the current model because estimates of connection strengths between the various represented semantic concepts could not be obtained. To empirically estimate these connection strengths, it will be useful to effectively map the semantic distance between concepts in individuals’ semantic networks (e.g., Jencius, 1998; Seidenberg & McClelland, 1989).

A second direction suggested by the current research involves identifying contributions of different physiological structures to sustained attention to affective and non-affective aspects of information. The computational model was patterned after interactions between the hippocampal and amygdala systems, so these might be useful systems to start with. Functional imaging technologies seem especially promising in this regard.

A third direction involves explicitly identifying a subtype of depressed individuals for whom the current conceptualization is appropriate. Previous sections have advanced the notion that the current model applies only to cognitive depressions characterized by learning associations with one or a few negative stimuli very well. Validation for this subtype of depression could help to better understand certain people’s depressions and to tune interventions more specifically for given patients. Similarly, the current data suggest that pupil dilation might be used to aid in understanding attentional styles of depressed individuals before they begin therapy. For example, therapies suited to remediating sustained attention biases could be applied to only those individuals who actually have these biases.

The Big Picture

The work presented in this dissertation has brought together a number of converging lines of research. A computational neural network model was used as a bridge between high level cognitive and low level physiological descriptions of attention to emotional information. A theory of depression, originally advanced for semantic networks (Ingram, 1984), was shown to correspond, with only slight modification, to a plausible set of physiological structures. By implementing the model computationally, predictions regarding the time course of attention, for depressed and nondepressed individuals, were advanced. Using behavioral and physiological measures, these predictions were tested. Based on the results of the research, integrative conclusions regarding the behavioral, cognitive, and physiological underpinnings of depression were advanced. Specifically, data supported the notion that depressed individuals display sustained attention to many types of information. Potentially this sustained physiological responsivity represents a physiological analog of prolonged feedback between cognitive and affective processing systems, which, in turn, may be associated with clinical notions of rumination.

Based on this research, speculations were advanced regarding what depressed individuals pay attention to when they attend to negative information. Specifically, it was suggested that depressed individuals may tend to associate any incoming information with personally relevant negative information, and may continue to think about that information even after stimuli are removed. It is my hope that the impact of this work will involve greater clinical attention to the impact of sustained attention biases on the cognitive and behavioral functioning of depressed individuals. Potentially the measurement and modeling techniques developed in this dissertation can be used to develop or refine interventions for depression that target sustained attention biases.

Depression is, by definition (American Psychiatric Association, 1994), a disorder that affects people’s behavior, cognitions, and physiology. It therefore seems necessary to account for all three domains in explaining the onset and maintenance of the disorder. It is my hope that this type of integrative research, tied together using computational models, may prove to be a powerful research tool in future studies of depression. Integrating behavioral, cognitive, and physiological research on depression, through computational modeling and empirical model evaluation, may thus have the power to advance our understanding of depression in all of these fields.

 
 

 

 
Appendices
 
APPENDIX A -- TECHNICAL DETAILS OF THE NEURAL NETWORK

Construction of the Hebb trained network

The Hebb trained network was implemented in Matlab on an Intel Pentium II computer. The Matlab code used to implement the network is available from the author upon request.

Representation. Nine locally coded orthographic nodes, 9 locally coded semantic nodes, 2 valence nodes, 2 task nodes, and 12 output nodes were used. Orthographic, semantic, and output features were bipolar and normalized, such that one node was activated with strength 1 and all others were activated with strength -2/(vectorlength-2). Valence was represented as: positive: 1, .31; negative: .41, .95; neutral: .65, .35. These valences were empirically determined by allowing over 600 undergraduates to rate the positivity and negativity of 30 words normed for affective valence. The lexical decision task was represented in the task nodes as activations: 1, -1. Valence identification was represented as activations -1, 1.
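
As a minimal illustrative sketch (in Python rather than the Matlab used for the original network), the coding scheme described above might be written as follows. All function and variable names are hypothetical, and the ordering of each valence pair as (positivity, negativity) is an assumption based on the rating procedure; the numeric values are those reported in the text.

```python
def local_code(active_index, vector_length):
    """Locally coded vector: the active node is 1 and every other node is
    -2/(vector_length - 2), the off value reported in the text."""
    off_value = -2.0 / (vector_length - 2)
    return [1.0 if i == active_index else off_value
            for i in range(vector_length)]


# Activations of the two valence nodes, assumed to be ordered as
# (positivity, negativity).
VALENCE_CODES = {
    "positive": (1.0, 0.31),
    "negative": (0.41, 0.95),
    "neutral": (0.65, 0.35),
}

# Activations of the two task nodes, as reported.
TASK_CODES = {
    "lexical_decision": (1.0, -1.0),
    "valence_identification": (-1.0, 1.0),
}

# Orthographic code for one of nine stimuli (the index 2 is arbitrary).
example_stimulus = local_code(2, 9)
```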

Architecture. Orthographic nodes feed forward to semantic and valence nodes. Semantic and valence nodes feed forward to each other and to the output nodes. Task nodes feed forward only to the output nodes.

Training. Initial training was done by multiplying input vectors by the transpose of desired output vectors to obtain a weight matrix, for each set of connections. This technique is equivalent to using Hebb training with the network on one presentation of each stimulus with no noise and no forgetting.
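
The one-shot training step can be sketched as a sum of outer products. The following Python fragment is an illustration only, not the original Matlab code; the function names, the toy patterns, and the orientation of the weight matrix (one row per target node) are assumptions made for the example.

```python
def hebb_weights(inputs, targets):
    """Sum over patterns of the outer product of target and input.

    Returns a matrix with one row per target node and one column per
    input node (this orientation is an assumption of the sketch).
    """
    n_out, n_in = len(targets[0]), len(inputs[0])
    weights = [[0.0] * n_in for _ in range(n_out)]
    for x, y in zip(inputs, targets):
        for i in range(n_out):
            for j in range(n_in):
                weights[i][j] += y[i] * x[j]
    return weights


def propagate(x, weights):
    """Row vector x multiplied by the transpose of the weight matrix."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]


# Toy patterns (hypothetical): two orthogonal inputs, each paired with a
# two-node valence target.
inputs = [[1.0, 0.0], [0.0, 1.0]]
targets = [[0.41, 0.95], [1.0, 0.31]]
input_valence = hebb_weights(inputs, targets)
recalled = propagate(inputs[0], input_valence)
```

In this toy case recall is exact only because the two inputs are orthogonal; with overlapping input codes, each recalled pattern would also contain contributions from the other trained patterns.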

Induction of Depression. Overtraining occurred by repeatedly adding products of the valence and semantic features for a single negative stimulus to connections between the valence and semantic units, approximating thinking about a negative stimulus. This technique implemented a Hebb rule. To bound the increase in weights, a slight decay factor on previously learned information (a forgetting rule) was imposed making the full rule:

NewConnectionStrength=.89*(OldConnectionStrength)+NewConnectionStrength.

The decay factor was estimated empirically, as the maximum value that would stop activations from growing with each step of overtraining. This value allowed assessments of positive valence to decrease, and negative valence to increase, with overtraining. When no forgetting was imposed, all reported simulation results were qualitatively similar, with the exception of decreased early activation after overtraining.
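
A sketch of the overtraining rule, under one reading of the formula above, is given below in Python. It assumes that the second term on the right-hand side is the new Hebbian product of the valence and semantic features of the overtrained stimulus, scaled by the assimilation rate of 1.0 listed in the parameter table. The function name, the toy vectors, and the number of epochs are hypothetical.

```python
def overtrain(weights, valence, semantic, epochs,
              retention=0.89, assimilation=1.0):
    """Repeatedly apply new = retention * old + assimilation * (valence x semantic).

    `weights` has one row per valence node and one column per semantic node.
    """
    for _ in range(epochs):
        weights = [
            [retention * w + assimilation * v * s
             for w, s in zip(row, semantic)]
            for row, v in zip(weights, valence)
        ]
    return weights


# Toy example (hypothetical): one negative stimulus, three semantic nodes.
valence = [0.41, 0.95]
semantic = [1.0, 0.0, 0.0]
start = [[0.0] * len(semantic) for _ in valence]
after = overtrain(start, valence, semantic, epochs=500)

# With retention below 1 the weights stay bounded: each weight approaches
# assimilation * v * s / (1 - retention) instead of growing without limit.
limit = 0.95 * 1.0 / (1.0 - 0.89)
```

Under this reading, the decay term is what keeps the overtrained connections from growing indefinitely, which is the bounding role the text assigns to the forgetting rule.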

Network Activation During Tasks. The network is cascaded, meaning that each node’s activation is a squashed function of its input over time plus noise. Unless otherwise noted, all multiplication described below is matrix multiplication. Activation of a layer is represented by that layer’s name. Connections are represented by the concatenated names of the layers they join. For example, "InputSemantic" represents connections from the input to the semantic nodes.

Before and after stimulus presentation, only noise entered the system. Initial activation of the semantic and valence units occurred according to the following rules. Activation of nodes in the network represents the average firing rate of a population of neurons at a given time t.

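$$\mathrm{Semantic}_{t} = (1 - t)\,\mathrm{Semantic}_{t-1} + t\,(\mathrm{noise} \cdot \mathrm{InputSemantic}^{T})$$
$$\mathrm{Valence}_{t} = (1 - t)\,\mathrm{Valence}_{t-1} + t\,(\mathrm{noise} \cdot \mathrm{InputValence}^{T})$$

where the coefficient t is the diffusion rate for inputs (distinct from the subscript t, which indexes time). Noise was bipolar and uniformly distributed with a magnitude of .15. During the presentation of a stimulus, activation also accounted for the input: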

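$$\mathrm{Semantic}_{t} = (1 - t)\,\mathrm{Semantic}_{t-1} + t\,((\mathrm{input} + \mathrm{noise}) \cdot \mathrm{InputSemantic}^{T})$$
$$\mathrm{Valence}_{t} = (1 - t)\,\mathrm{Valence}_{t-1} + t\,((\mathrm{input} + \mathrm{noise}) \cdot \mathrm{InputValence}^{T})$$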
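
The cascaded update can be sketched in Python as a leaky average of the previous activation and the current noisy drive. This is an illustration rather than the original Matlab code: the function names and toy weights are hypothetical, and adding the noise to the input before propagation follows the form of the equations above.

```python
import random


def propagate(x, weights):
    """Row vector x multiplied by the transpose of the weight matrix."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]


def cascade_step(previous, stimulus, weights, rate=0.1,
                 noise_magnitude=0.0, rng=random):
    """One step of new = (1 - rate) * previous + rate * ((stimulus + noise) * W^T).

    Passing an all-zero stimulus reproduces the noise-only case used
    before and after stimulus presentation.
    """
    noisy_input = [s + rng.uniform(-noise_magnitude, noise_magnitude)
                   for s in stimulus]
    drive = propagate(noisy_input, weights)
    return [(1.0 - rate) * p + rate * d for p, d in zip(previous, drive)]


# Toy example (hypothetical): identity weights, a stimulus on the first
# node, and no noise, stepped for the 10 epochs of stimulus presentation.
identity = [[1.0, 0.0], [0.0, 1.0]]
activation = [0.0, 0.0]
for _ in range(10):
    activation = cascade_step(activation, [1.0, 0.0], identity)
```

With a diffusion rate of 0.1 and a constant drive of 1, activation after n steps is 1 - 0.9^n, so the driven node reaches roughly 65% of its asymptote by the end of a 10-epoch presentation.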

Stimuli were presented for 10 epochs, representing the brief (150 ms) presentation time for empirical stimuli, after which the network operated entirely on noise input plus feedback between the semantic and affective feature units for 250 epochs. Feedback between semantic and valence nodes was operationalized according to the differential equations:

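$$\mathrm{Semantic}_{t} = (1 - b)\,\mathrm{Semantic}_{t-1} + b \cdot \mathrm{lyapunov} \cdot (\mathrm{Valence} \cdot \mathrm{ValenceSemantic}^{T})$$
$$\mathrm{Valence}_{t} = (1 - b)\,\mathrm{Valence}_{t-1} + b \cdot \mathrm{lyapunov} \cdot (\mathrm{Semantic} \cdot \mathrm{SemanticValence}^{T})$$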

where b governed the amount of feedback between the structures and lyapunov governed the Lyapunov exponent of the system. Values below one act as a decay factor, allowing activations to approach zero. Values above one tend to preserve and increase activation, creating a positive feedback loop between the affective and semantic structures.
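
The effect of the feedback gain can be illustrated with a small Python sketch. It is not the original implementation: the function names are hypothetical, both layers are assumed to be updated simultaneously from the previous step's activations, and the toy network has a single semantic node and a single valence node joined by unit weights.

```python
def propagate(x, weights):
    """Row vector x multiplied by the transpose of the weight matrix."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]


def feedback_step(semantic, valence, valence_semantic, semantic_valence,
                  b=0.02, gain=0.2):
    """One simultaneous update of the semantic and valence layers.

    `gain` plays the role of the lyapunov parameter in the equations above.
    """
    new_semantic = [(1.0 - b) * s + b * gain * d
                    for s, d in zip(semantic,
                                    propagate(valence, valence_semantic))]
    new_valence = [(1.0 - b) * v + b * gain * d
                   for v, d in zip(valence,
                                   propagate(semantic, semantic_valence))]
    return new_semantic, new_valence


# Toy example (hypothetical): one semantic and one valence node, unit weights.
unit = [[1.0]]
decayed = feedback_step([1.0], [1.0], unit, unit, gain=0.2)
grown = feedback_step([1.0], [1.0], unit, unit, gain=2.0)
```

In this toy case a gain of 0.2 moves both activations from 1 to about 0.984 in one step, whereas a gain of 2.0 moves them to about 1.02, matching the description of values below and above one.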

Activation of the output units was based on the activation of all units feeding to them as:

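$$\mathrm{Output}_{t} = \mathrm{Semantic} \cdot \mathrm{SemanticOutput}^{T} + \mathrm{Valence} \cdot \mathrm{ValenceOutput}^{T} + \mathrm{TaskPriority} \cdot (\mathrm{Task} \cdot \mathrm{TaskOut}^{T})$$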

Nonlinearity was introduced by limiting the magnitude of node activations to 2 (i.e., to the range -2 to 2). This technique was used rather than a sigmoid activation function because even small deviations from zero, using a sigmoid, tended to magnify on feedback as a function of the squashing function rather than other properties of the network. Using a piecewise linear function allowed all observed biasing effects to be based on the architecture and training of the network.
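
The piecewise linear limit amounts to clipping each activation, as in the following Python sketch. The function name is hypothetical; the symmetric bounds of -2 and 2 follow the maximum and minimum network activations listed in the parameter table.

```python
def clip_activations(activations, limit=2.0):
    """Leave activations within [-limit, limit] unchanged and clamp the rest."""
    return [max(-limit, min(limit, a)) for a in activations]


# Values inside the range pass through untouched; values outside saturate.
clipped = clip_activations([-3.0, 0.5, 2.7])
```

Because the function is the identity inside the bounds, small activations are neither amplified nor compressed, which is the property the text contrasts with a sigmoid.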

Soft competition was introduced for output nodes by subtracting the maximum activation of any other node in the output layer from each node’s activation. Matches were determined in a manner analogous to that used by Cohen, Dunbar, and McClelland (1990) to represent word and color naming in a connectionist model of the Stroop task. Following Ratcliff’s (1978) notion that semantic identification is a diffusion process, they suggest that a semantic identification occurs when the activation of the mental representation of a stimulus reaches a threshold. Counters were therefore defined to represent the accumulated evidence for each possible item the network might identify. The counters added evidence for a given item proportional to the difference between the activation of the representation of that item and the maximum activation of any other representation, subject to Gaussian noise (magnitude 0 for these simulations). When any counter exceeded a threshold (arbitrarily set to 1), the network was said to have made an identification. That time was counted as the network’s reaction time. Together, match and output neurons were assumed to represent one competitive system of neurons.
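
The accumulation of evidence toward a reaction time can be sketched in Python as follows. This is an illustration under stated assumptions, not the original code: the function names are hypothetical, the proportionality constant (`rate`) is set to 1 because its value is not reported, and the Gaussian noise term is omitted because its magnitude was 0 in the reported simulations.

```python
def soft_competition(activations):
    """Subtract from each node the maximum activation among the other nodes."""
    result = []
    for i, a in enumerate(activations):
        others = activations[:i] + activations[i + 1:]
        result.append(a - max(others))
    return result


def reaction_time(activation_trace, threshold=1.0, rate=1.0):
    """Accumulate competitive evidence per item until a counter passes threshold.

    Returns (epoch, winning_index), or (None, None) if no counter ever
    exceeds the threshold within the trace.
    """
    counters = [0.0] * len(activation_trace[0])
    for epoch, activations in enumerate(activation_trace, start=1):
        evidence = soft_competition(activations)
        counters = [c + rate * e for c, e in zip(counters, evidence)]
        winner = max(range(len(counters)), key=counters.__getitem__)
        if counters[winner] > threshold:
            return epoch, winner
    return None, None


# Toy trace (hypothetical): item 0 leads item 1 by 0.4 on every epoch, so
# its counter passes the threshold of 1 on the third epoch.
trace = [[0.5, 0.1]] * 10
epoch, winner = reaction_time(trace)
```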

Pupil dilation was simulated as the sum of positive activations in the Valence, Semantic, and Match nodes. Input node activations were not simulated so as to minimize the effects of stimulus energy on simulated dilations. As the magnitude of task node inputs was constant, these were not added. Output nodes were not added because real neuronal systems were assumed to have exclusively competitive accumulating neurons, represented by the match nodes, rather than the non-competitive, non-accumulating output nodes.
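
The dilation index reduces to a sum over positive activations, sketched here in Python. The function name and the example activations are hypothetical.

```python
def simulated_dilation(valence, semantic, match):
    """Sum of the positive activations across the three contributing layers."""
    return sum(a for layer in (valence, semantic, match)
               for a in layer if a > 0)


# Negative activations contribute nothing to the simulated dilation.
dilation = simulated_dilation([0.5, -0.2], [1.0, -1.0], [0.25])
```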

Parameters used in the Hebb trained neural network simulations
 
Parameter  Value 
Network construction   
Number of input nodes 
Number of semantic nodes 
Number of Valence nodes 
Activation parameters   
t (input diffusion rate)  0.1 
b (affective-semantic loop diffusion rate)  0.02 
lyupanov (Lyapunov exponent)  0.2 
maximum network activation  2.0 
minimum network activation  -2.0 
noisemag  0.05 
Task parameters   
stimulus duration  10 epochs 
Total measured duration  250 epochs 
accumulation noise  0.0 
positive determination accumulation threshold  1.0 
negative determination accumulation threshold  1.0 
Learning parameters   
additional epochs of training on negative stimuli 
rate at which new training exemplars are assimilated  1.0 
preservation of old learning during new learning (i.e., the forgetting rate)  .89 
Training set   
Number of stimuli 
Number of negative stimuli representing depressogenic loss 

Construction of the BackPropagation Network

To ensure that conclusions based on the network’s performance were not due to characteristics of the modeling environment or the specific learning law, a second network, based on Siegle’s (1996) earlier simulations, was used. This network was implemented in the PlaNet modeling environment (Miyata, 1991) on a Sun SPARC 1 computer. PlaNet is an environment in which neural network simulations may be constructed using a language developed specifically for that purpose. Users can interactively examine activations within the network and its contents, interactively train the network to associate inputs with outputs, and observe the resulting error rates. The PlaNet code representing the network and presenting stimuli is available from the first author upon request. Because the code for the backprop network is similar to the code for the Hebb trained network, only salient differences from the network above will be described.

Representation. In most simulations activation fed forward from orthographic inputs, through a layer of generalization nodes, to activate semantic feature nodes and 2 valence nodes. Siegle (1996) did not connect inputs to valence nodes, and used 18 input and semantic nodes and 12 generalization nodes. Orthographic and semantic features of stimuli were determined randomly, and were valued at either 0.5 or -0.5. Siegle and Ingram (1997a) also did not connect inputs to valence nodes, eliminated the hidden nodes, and included only 10 orthographic and 10 semantic nodes. For most simulations, only the first nine nodes were used. The tenth node was reserved for simulations involving "novel" stimuli to which the network was not exposed during its initial training period. Following Ingram’s (1984) and Derryberry and Tucker’s (1992) models of feedback between affective and semantic memory representations, the model’s positive or negative affective determinations feed back to the semantic units creating a loop, allowing affective information to affect semantic processing.

Siegle’s (1996) network was trained on twelve positive, twelve negative and twelve neutral stimuli. Stimuli were generated as pseudo-random strings of 0’s and 1’s in which 2/3 of the stimuli were made to be 0’s. Three positive, three negative, and three neutral stimuli were used for Siegle & Ingram’s (1997a) study. Stimuli were represented in a localist fashion in which only one orthographic, one semantic and one valence node was expected to be active for a given stimulus. While the restriction to a localist representation was not essential for the simulations, it was useful for illustrating how network connections changed when various aspects of personality were simulated.
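The pseudo-random stimulus strings described for Siegle’s (1996) network could be generated along the following lines. This sketch reads the 2/3 proportion as applying to the elements of each string, which is an assumption; the function name and the default of 18 nodes (the input layer size reported for that network) are chosen for illustration only.

```python
import random

def make_stimulus(n_nodes=18, zero_fraction=2 / 3, rng=None):
    """Return a shuffled list of 0s and 1s with the requested share of 0s."""
    rng = rng or random.Random()
    n_zeros = round(n_nodes * zero_fraction)
    bits = [0] * n_zeros + [1] * (n_nodes - n_zeros)
    rng.shuffle(bits)
    return bits
```

With the defaults, each string holds twelve 0s and six 1s in a random order; passing a seeded `random.Random` makes the strings reproducible.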

In Siegle’s (1996) study, valences were represented orthogonally (positive: 1 0, negative: 0 1, neutral: 0 0). In subsequent simulations, positive affective features were coded as activations of 0.2, -0.2; negative features were coded as -0.2, 0.1; and neutral features were coded as 0, -0.2. These valences were empirically determined by allowing over 600 undergraduates to rate the positivity and negativity of 30 words normed for affective valence.

Training. Training involved presenting a simulated orthographic representation of a stimulus to the network using the presentation parameters for the tasks described below, for 10 cycles, observing the network’s semantic and valence nodes, and adjusting the weights within the network until the desired semantic and valence representations were achieved using a modified back-propagation learning algorithm (Rumelhart, Hinton, and Williams, 1986). Weights were modified based on the error after 10 cycles rather than according to the standard backpropagation through time algorithm which updates weights based on the average error at each cycle, because it is assumed that learning only occurs after associations are made. No claim is made here that human learning actually takes place via a back-propagation algorithm. Rather, it is assumed that as in the back-propagation algorithm, humans change their associations with stimuli based on their experiences, and more association to some stimulus means that it is learned better. Training continued until the sum of the mean squared error in the semantic and valence nodes was below 0.001 for Siegle’s (1996) study, for a block of all inputs. Due to the greater error incurred by not using hidden nodes, Siegle & Ingram (1997a) used an error threshold of 0.004 for all stimuli.

Induction of Depression. To represent the induction of depression, the network was trained on a single negative stimulus for 100 epochs after the network's initial training was complete.16

Network Activation During Tasks. The rules governing the network’s activation were as above, with the exception that nonlinearity was introduced as a logistic function rather than a piecewise linear function. t was generally set to 0.5, and b was generally set to 0.1. The threshold for affective and semantic determinations on match filters was 0.46. Gaussian noise was incorporated on all layers. After a specified stimulus onset asynchrony, network inputs were eliminated entirely, rather than propagating noise through the network as in the Hebb network simulations. To allow for neutral judgements in a network with unipolar weights (i.e., neutrality is represented as the absence of positivity and negativity), the network was said to judge a stimulus to be neutral when little evidence was accumulated for either valence (both accumulators less than 0.8) after a temporal threshold of 132 epochs plus Gaussian noise.
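The neutral decision rule described above can be sketched as follows. The function name and return values are hypothetical, the accumulation threshold of 1.0 is taken from the parameter table, the Gaussian noise on the temporal threshold is omitted, and treating the deadline as reached at exactly 132 epochs is an assumption.

```python
def judge_valence(pos_acc, neg_acc, epoch, threshold=1.0,
                  neutral_ceiling=0.8, deadline=132):
    """Return 'positive', 'negative', 'neutral', or None if undecided."""
    # A valence is chosen once its accumulator exceeds the threshold.
    if pos_acc > threshold and pos_acc >= neg_acc:
        return "positive"
    if neg_acc > threshold:
        return "negative"
    # Neutral: the deadline has passed with little evidence for either valence.
    if epoch >= deadline and pos_acc < neutral_ceiling and neg_acc < neutral_ceiling:
        return "neutral"
    return None
```

Under this reading, a stimulus with accumulators at 0.2 and 0.3 is undecided at epoch 50 and judged neutral at epoch 132, while an accumulator between 0.8 and 1.0 blocks the neutral judgement without yet producing a valence decision.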
 
Parameter  Value 
Network construction   
Number of input nodes  10 
Number of semantic nodes  10 
Number of Valence nodes 
Activation parameters   
t (input diffusion rate)  0.5 
b (affective-semantic loop diffusion rate)  0.2 
maximum network activation  1.0 
minimum network activation 
network noise  0.05 
Task parameters   
accumulation noise  1.0 
temporal threshold for "Nonword" decisions  200 epochs 
temporal threshold noise  10 
positive determination accumulation threshold  1.0 
negative determination accumulation threshold  1.0 
Learning parameters   
eta (learning rate)  0.2 
alpha (learning momentum)  0.4 
error threshold for initial learning  0.004 
additional epochs of training on negative stimuli  70 
Activations in one training epoch  10 
Training set   
Number of stimuli 
Number of negative stimuli representing depressogenic loss 

Relevant differences between the models trained using backpropagation. The original model described by Siegle et al. (1995) did not allow the model to decide that a stimulus was a nonword. This addition was incorporated by Siegle (1996). Both Siegle et al. (1995) and Siegle (1996) used distributed representations of stimuli. Because some of the results obtained in these papers could have occurred as a result of the particular distributed representations which were chosen, Siegle and Ingram (1997a,b) performed a number of simulations on a similar network using a localist representation. Additionally, Siegle & Ingram (1997a) discontinued the use of a hidden layer in order to make their simulations more interpretable. Finally, Siegle & Ingram (1997a) allowed less feedback between the network’s representation of semantic and valence identification than did Siegle (1996). The network performed relatively similarly to Siegle’s (1996) original network with the exception that when overtrained on negative stimuli it was facilitated on negative stimuli on the valence identification task with respect to the network which was not overtrained.

Differences between Hebb and Backpropagation learning rules

Network behaviors, their interpretation, and the types of parameters available for changing network behavior all differ depending on whether learning is done using Hebb or backpropagation rules. There are theoretical arguments for the biological relevance of each system, e.g., many people argue there is no biological analog of backpropagation; others argue that we surely have hidden layers, and it is unclear how to incorporate hidden layers in a supervised Hebb learning system. The more interesting issues occur at a more highly theoretical level.

Differences in assumptions about the nature of learning. Backpropagation learning rules assume that learning is a procedure for minimizing error between actual and expected responses to stimuli. The consequence of this learning rule is that when information is learned sufficiently well, little new learning takes place, unless there is a great deal of noise in a system (i.e., unless error is always introduced into the system’s output, so that there is something to be minimized). Two consequences of this approach affected modeling efforts. First, enough noise was added to the system that its behavior was often erratic. Second, when feedback was increased in the network after original training, no new learning took place during overtraining, since the network’s outputs tended to resemble learned patterns; the feedback acted like a "cleanup" system, minimizing the network’s errors. Siegle & Ingram (1997a) interpret this behavior to suggest that rumination could act as an adaptive method for coping with depression; by considering negative events more, someone might not distort and internalize them. The downside of this rule is that someone who continues to have negative experiences would not be expected to become progressively more depressed, based on the network’s behavior.

In contrast, Hebb learning assumes that each experience strengthens connections, regardless of how strong those connections were previously. Thus, new learning can always occur. In this case, rumination could not be considered a coping mechanism. More feedback would strengthen associations with negativity, and thus, would allow the network to relearn negative associations more strongly. Someone who has more negative experiences would be predicted to become progressively more depressed.
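The contrast between the two learning assumptions can be illustrated on a single connection. This is a deliberately simplified sketch, not the update rules used in the simulations: the function names are hypothetical, the error-driven rule is a plain delta rule standing in for backpropagation, and the way the rate (1.0) and retention (.89) values from the Hebb parameter table enter the Hebb update is an assumption.

```python
def hebb_update(w, pre, post, rate=1.0, retention=0.89):
    """Hebb-style step: keep a fraction of the old weight, add the co-activation."""
    return retention * w + rate * pre * post

def delta_update(w, pre, target, eta=0.2):
    """Error-driven step on one linear connection (stand-in for backpropagation)."""
    error = target - w * pre
    return w + eta * error * pre
```

When the connection already produces the target output, `delta_update(1.0, 1.0, 1.0)` leaves the weight at 1.0, since there is no error to minimize. The Hebb step on the same connection, `hebb_update(1.0, 1.0, 1.0)`, raises the weight to about 1.89, and repeated exposure continues to change it until it settles at the equilibrium fixed by the retention factor.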

Parameter differences. Back-propagation and Hebb learning also offer the researcher different parameters to investigate as analogs of cognitive variables. In a backpropagation environment, two parameters representing the rate at which new stimuli are learned, and the effect of recent previous learning on new learning, are available. Siegle & Ingram (1997a) interpret these parameters as representing different dimensions of the personality variable Openness to Experience. In Hebb learning, a parameter representing the effect of new experiences on connection strengths is relatively analogous to the learning rate in back-propagation. A parameter representing the rate at which new learning occludes old learning, i.e., a forgetting function, is also available. The role of this parameter is discussed extensively in simulations contained in this dissertation.

A nonjudgmental conclusion. The extensive differences between Hebb and back-propagation learning rules do not immediately suggest that one is "better" than another. Rather, they afford different interpretations of similar phenomena involving exposure to new information. Simulations using both architectures are thus valuable for better understanding disorders that may involve overlearning information, such as depression.

 

 


 
 
 
 
 
 
APPENDIX B: GRAPHS OF NETWORK ACTIVATION IN RESPONSE TO EACH VALENCE

The following figures follow the same conventions as for Figure 5, described in the text, and on the legend for that figure.


 

 
 
 
 
 

APPENDIX C: SELF-REPORT MEASURES

The following measures were used, but are not included on the web version of the dissertation due to copyright restrictions:

Beck Depression Inventory (BDI)

Beck Anxiety Inventory (BAI)

State Trait Anxiety Inventory (STAI)

Response Styles Questionnaire (RSQ)



 
APPENDIX D: WORD LISTS USED IN THE EXPERIMENT

 

Table 9

Normed words used in the lexical decision and valence identification tasks
 
Word  Valence  Length  Frequency 
amazed  positive  10 
bliss  positive 
excited  positive  21 
happy  positive  97 
laugh  positive  22 
cheer  positive  10 
pleased  positive  30 
relieved  positive  21 
vitality  positive  17 
warmth  positive  28 
ashamed  negative  17 
depressed  negative  11 
doom  negative  10 
failure  negative  93 
fright  negative 
helpless  negative  21 
hopeless  negative  14 
poverty  negative  20 
tears  negative  34 
unhappy  negative  26 
context  neutral  37 
library  neutral  20 
margin  neutral  16 
measure  neutral  87 
moth  neutral 
package  neutral  14 
pave  neutral 
recruit  neutral  17 
slope  neutral  19 
submarine  neutral  15 

 

 

Table 10

Nonwords used in the lexical decision task
 
Nonword  Original Valence  Length  Original Frequency 
amazid  positive  10 
chuer  positive  10 
fabthile  positive  10 
jry  positive  47 
sparole  positive 
tonder  positive  11 
culse  negative  10 
guset  negative  16 
hoielets  negative  14 
silemn  negative  12 
widewed  negative 
sorwy  negative  47 
cousip  neutral  13 
inid  neutral  39 
mendion  neutral  18 
pon  neutral  18 
qen  neutral  18 
weefly  neutral  14 

 
 

 
APPENDIX E: WORD GENERATION FORM

The following form was used to obtain personally relevant positive, negative, and neutral stimuli from participants.

Before the experiment, we ask that you provide us with the following lists of personally relevant words. We ask that you think of words which are not articles (i.e., not "the", "a", etc.) and which are between 3 and 10 letters long. Thank you for your assistance!

Please list the following:

10 personally relevant negative words which best represent what you think about when you are upset, down or depressed.

1. _____________________________

2. _____________________________

3. _____________________________

4. _____________________________

5. _____________________________

6. _____________________________

7. _____________________________

8. _____________________________

9. _____________________________

10. _____________________________

10 personally relevant positive words which best represent what you think about when you are happy or in a good mood.

1. _____________________________

2. _____________________________

3. _____________________________

4. _____________________________

5. _____________________________

6. _____________________________

7. _____________________________

8. _____________________________

9. _____________________________

10. _____________________________

10 personally relevant neutral (i.e., not positive or negative) words which best represent what you think about when you are neither happy nor upset/down or depressed.

1. _____________________________

2. _____________________________

3. _____________________________

4. _____________________________

5. _____________________________

6. _____________________________

7. _____________________________

8. _____________________________

9. _____________________________

10. _____________________________

 

 

 



 

 

APPENDIX F: CONSENT FORMS
 
SDSU form
UCSD form for non-veterans
UCSD form for veterans

SDSU form

SAN DIEGO STATE UNIVERSITY
INFORMED CONSENT AGREEMENT
Study title: Validation for the Affective Interference Hypothesis.
You are being asked to participate in a research study. Before you give your consent to be a volunteer, it is important that you read the following information and ask as many questions as necessary to be sure you understand what you will be asked to do.

Investigators

The principal investigator of this study is Greg Siegle, M.S., a student in the SDSU/UCSD Joint Doctoral Program in Clinical Psychology. Co-investigators include Rick Ingram, Ph. D., Georg Matt, Ph. D., Eric Granholm, Ph. D., Chris Gillin, Ph. D., Greg Brown, Ph. D., Gary Williams, Stephanie Ortiz, and Ellen Alvarez.

Purpose of the Study

The study is designed to look at how fast individuals recognize characteristics of strings of letters. These characteristics include recognition of whether the letters spell a word, and if they do spell a word, whether the word is "positive", "negative", or "neutral" in tone.

Description of the Study

You will be asked to generate a number of words which are positive, negative, or neither for you. You will then be asked to complete tasks in which you will press buttons after seeing groups of letters flashed on a computer screen. In one task the letters will spell a word, and you will be asked to judge whether the word which has just been shown to you is "positive" (e.g., "happy"), "negative" (e.g., "sad"), or "neutral" (neither positive nor negative), as quickly as you can. In the other task you will be asked to judge whether or not the group of letters spells a word. During these tasks, you will be asked to sit with your chin in a chin rest with a video camera pointed at your eye which will record the size of your pupil. This gives us information regarding cognitive effort expended during the task. Each of these tasks should take no more than thirty minutes.

Additionally you will be asked to complete a series of questionnaires which ask about your current mood and about the positivity or negativity of a series of words, for you. These questionnaires typically take about 20 minutes, and pupil dilation will not be recorded during this time. In total the tasks and questionnaires should take approximately one and one half hours.

The experiment will be conducted in room 2252 on the East Wing of the 2nd floor of the San Diego VA Medical Center.

What is Experimental About this Study

None of the procedures or questionnaires used in this study are experimental in nature. The only experimental aspect of this study is the gathering of information for the purpose of analysis.

Risks or Discomforts

During the experiment you will be asked to generate negative words which are personally relevant for you, and you will be asked a number of questions regarding how you "feel". You may become aware of feelings of happiness, sadness, or other mood states which you had not considered before. In this event, or if the experiment makes you uncomfortable in any way, you may discontinue your participation in the experiment, either temporarily or permanently. You should also feel free to talk to the experimenter about how the experiment has made you feel. Additionally, we can provide a list of counseling services at your request.

You will be asked to remain in one position, with your chin in a chin rest throughout the two computer-based tasks. Participants have been known to experience boredom and fatigue during this time. If this position becomes uncomfortable or tiring you are free to suspend the experiment momentarily and change positions.

Benefits of the Study

You will receive an incentive payment of $20 at the completion of the experiment. No other benefits or consequences of the experiment for you are foreseen except the satisfaction of having advanced clinical research.

Confidentiality

Since the questionnaires will be anonymous and your name will not be entered into any of the computer programs which you will use, your name will not be associated with this research in any way. Any information that is obtained in connection with this study and that can be identified with you will remain confidential to the extent allowed by law, and will otherwise be disclosed only with your permission.

Results of this research are expected to be published in scientific journals. The principal investigator is therefore obligated to provide data collected from this experiment to other researchers who wish to re-analyze this data. In the event that data collected in this experiment is requested, only your responses to the computer tasks and questionnaires will be provided. Your name will not be disclosed.

Voluntary Nature of Participation

Participation in this study is voluntary. Your decision of whether or not to participate will not prejudice your future relations with San Diego State University. If you decide to participate, you are free to withdraw your consent and to discontinue your participation at any time without penalty or loss of benefits to which you are otherwise entitled.

Questions About the Study

If you have any questions about the research now, please ask. If you have questions later about the research and/or research-related injuries, you may contact Greg Siegle at the following address:

Greg Siegle
Doctoral Training Facility, Suite #103
6363 Alvarado Ct.
San Diego, CA 92120

or by telephone at 619-594-4840.

If you have questions regarding your rights as a human subject and participant in this study, you may call the office of the Committee on Protection of Human Subjects at San Diego State University for information. The telephone number of the committee is 619-594-6622. You may also write to the following address:
Committee on Protection of Human Subjects
San Diego State University
San Diego, CA 92182
This consent form has been approved by the Committee on Protection of Human Subjects at San Diego State University, as signified by the Committee’s stamp. The consent form must be reviewed annually and expires on the date indicated on the stamp.

Your signature below indicates that you have read the information above and have had a chance to ask any questions you have about the study. You agree to be in the study and have been told that you can change your mind and withdraw your consent to participate at any time. You have been told that by signing this consent form you are not giving up any of your legal rights.

 

 

____________________________ _________________________

Research Participant Signature Date

 

 

____________________________ _________________________

Signature of Investigator Date

 



 

UCSD form for non-veterans

UNIVERSITY OF CALIFORNIA - SAN DIEGO
MENTAL HEALTH CLINICAL RESEARCH CENTER
CONSENT TO ACT AS A RESEARCH SUBJECT

Greg Siegle, M.S., is conducting a research study to learn more about the way people pay attention and process information, how the pupil and eye change when people process information, and how information processing patterns and pupil responses change in response to information of different emotional contents. You have been asked to participate in this study as a healthy subject or because you have a mood disorder.

If you agree to participate, the following will happen. You will be asked to generate a number of words which are positive, negative, or neither for you. You will be asked to name letters and colors on a printed chart. You will be asked to sit still with your chin in a chin rest so a camera can record changes in your pupil while you do attention tasks. During the tasks you will be asked to press buttons after seeing groups of letters flashed on a computer screen. In one task the letters will spell a word, and you will be asked to judge whether the word which has just been shown to you is "positive" (e.g., "happy"), "negative" (e.g., "sad"), or "neutral" (neither positive nor negative), as quickly as you can. In another task you will be asked to judge whether or not the group of letters spells a word. In a final task, you will be asked to judge the color in which the words are printed. These tasks each typically take between 15 and 20 minutes. Additionally you will be asked to complete a series of questionnaires which ask about your current mood and about the positivity or negativity of a series of words, for you. The questionnaires typically take between 20 and 30 minutes. There will not be a change in your medication nor will you be subjected to any medical procedure as a result of participation in this study. If you are about to begin Cognitive Behavior Therapy, you may be asked to participate in the study again at a later time. Your participation now is not a commitment to participate again at another time.

The tasks in this study are not expected to involve risks or discomforts beyond those of a standard testing situation. You may become bored, tired, and/or frustrated during this study. During the experiment you will be asked to generate negative words which are personally relevant for you, and you will be asked a number of questions regarding how you "feel". You may become aware of feelings of happiness, sadness, or other mood states which you had not considered before. In this event, or if the experiment makes you uncomfortable in any way, you may discontinue your participation in the experiment, either temporarily or permanently. In the event of a serious emotional reaction, a recommendation for independent professional counseling will be provided.

You may not personally benefit from this research project. However, the results of such studies may help provide a better understanding of the causes and treatments of serious illnesses, such as depression.

If you are injured as a result of participation in this research, the University of California will provide any medical care you need to treat those injuries. The University will not provide any other form of compensation to you if you are injured. You may call the UCSD Human Subjects Office at (619) 534-4520 for more information about this, or to inquire about your rights as a research subject, or to report research-related problems.

The testing session is expected to take between one and two hours. You will receive $20.00 per testing session at the Mental Health Clinical Research Center for all services that you provide as a subject in this study. There will be no charge to you or to your insurance company for these tests. You may refuse to participate or may withdraw from this study at any time, and you have the right to refuse to answer any question(s) without any negative consequences. The investigator may also stop the study at any time.

Research records will be kept confidential to the extent provided by law. An identification number, instead of your name, is used to identify the information you provide, and the list connecting your name to your identification number is kept in a separate, locked cabinet. No identifying information will be released to federal or other agencies, except as required by law. Your name will not be used in correspondence or publications as a result of participation.

The principal investigator of this study, Greg Siegle, will be available to answer any questions you may have regarding this research or this form. If you have any questions, comments, or concerns about the study or the informed consent process, you may reach Greg Siegle at (619) 594-4840.

Participation in research is entirely voluntary. You may refuse to participate or withdraw at any time without jeopardy to the medical care you will receive at this institution.

You have received a copy of this consent document to keep and a copy of the "Experimental Subject’s Bill of Rights."

You agree to participate.

 

 

_________________________________ ________________________________
Subject                                                    Date     Witness                                             Date

 



 

UCSD form for veterans

VETERANS AFFAIRS MEDICAL CENTER - SAN DIEGO
MENTAL HEALTH CLINICAL RESEARCH CENTER
CONSENT TO ACT AS A RESEARCH SUBJECT

Greg Siegle, M.S., is conducting a research study to learn more about the way people pay attention and process information, how the pupil and eye change when people process information, and how information processing patterns and pupil responses change in response to information of different emotional contents. You have been asked to participate in this study as a healthy subject or because you have a mood disorder.

If you agree to participate, the following will happen. You will be asked to generate a number of words which are positive, negative, or neither for you. You will be asked to name letters and colors on a printed chart. You will be asked to sit still with your chin in a chin rest so a camera can record changes in your pupil while you do attention tasks. During the tasks you will be asked to press buttons after seeing groups of letters flashed on a computer screen. In one task the letters will spell a word, and you will be asked to judge whether the word which has just been shown to you is "positive" (e.g., "happy"), "negative" (e.g., "sad"), or "neutral" (neither positive nor negative), as quickly as you can. In another task you will be asked to judge whether or not the group of letters spells a word. In a final task, you will be asked to judge the color in which the words are printed. These tasks each typically take between 15 and 20 minutes. Additionally you will be asked to complete a series of questionnaires which ask about your current mood and about the positivity or negativity of a series of words, for you. The questionnaires typically take between 20 and 30 minutes. There will not be a change in your medication nor will you be subjected to any medical procedure as a result of participation in this study. If you are about to begin Cognitive Behavior Therapy, you may be asked to participate in the study again at a later time. Your participation now is not a commitment to participate again at another time.

The tasks in this study are not expected to involve risks or discomforts beyond those of a standard testing situation. You may become bored, tired, and/or frustrated during this study. During the experiment you will be asked to generate negative words which are personally relevant for you, and you will be asked a number of questions regarding how you "feel". You may become aware of feelings of happiness, sadness, or other mood states which you had not considered before. In this event, or if the experiment makes you uncomfortable in any way, you may discontinue your participation in the experiment, either temporarily or permanently. In the event of a serious emotional reaction, a recommendation for independent professional counseling will be provided.

You may not personally benefit from this research project. However, the results of such studies may help provide a better understanding of the causes and treatments of serious illnesses, such as depression.

If you are injured as a result of being in this study, treatment will be available. If you are eligible for veteran's benefits, the costs of such treatment will be covered by the Veteran's Administration. If not, the costs of such treatment may be covered by the Veteran's Administration and the University of California, depending on a number of factors. The Veteran's Administration and the University do not normally provide any other form of compensation for injury. For further information about this, you may call the VA Regional Counsel at (619) 680-4899 or the UCSD Human Subjects Program Office at (619) 534-4520 to inquire about your rights as a research subject, or to report research-related problems.

The testing session is expected to take between one and two hours. You will receive $20.00 per testing session at the Mental Health Clinical Research Center for all services that you provide as a subject in this study. There will be no charge to you or to your insurance company for these tests. You may refuse to participate or may withdraw from this study at any time, and you have the right to refuse to answer any question(s) without any negative consequences. The investigator may also stop the study at any time. Your present or future care at the San Diego VAMC will not be affected in any way if you refuse participation or withdraw before completion of the study.

Research records will be kept confidential to the extent provided by law. An identification number, instead of your name, is used to identify the information you provide, and the list connecting your name to your identification number is kept in a separate, locked cabinet. No identifying information will be released to federal or other agencies, except as required by law. Your name will not be used in correspondence or publications as a result of participation.

The principal investigator of this study, Greg Siegle, will be available to answer any questions you may have regarding this research or this form. If you have any questions, comments, or concerns about the study or the informed consent process, you may reach Greg Siegle at (619) 594-4840.

Participation in research is entirely voluntary. You may refuse to participate or withdraw at any time without jeopardy to the medical care you will receive at this institution.

You have received a copy of this consent document to keep and a copy of the "Experimental Subject's Bill of Rights."

You agree to participate.

 
_________________________________ ________________________________
Subject                                                    Date     Witness                                             Date

 

  


 

 

APPENDIX G: INSTRUCTIONS FOR TASKS

Initial Directions

Hi there and welcome to our experiment.

In the following hour you will be asked to give us a series of words. You will then be asked to rate the emotional content of a different set of words. Then you will complete two computer tasks and a questionnaire.

In each task you will be shown a screen with a line full of x’s. The x’s will momentarily be replaced by a series of letters, and then by x’s once again.

You will then be asked to answer one of two questions. In one task you will be asked "What’s the Emotion?" By this question we’re asking whether the word was positive, negative, or neutral (neither positive nor negative) for you. Respond by pressing one of the labeled buttons on the keyboard.

In the other task you will be asked "Is it a word?". By this we're asking whether the letters which were on the screen spell a word.

Respond as fast as you can. Until the experiment is over you will be given another word as soon as you respond.

If you have any questions, please ask the experimenter.

Lexical Decision Task—Practice

We will now try three practice items for the "Is it a word?" task. This task asks you to determine whether strings of letters spell a word. In the following task you will be shown X’s which turn into letters, which turn back into X’s.

After seeing a stimulus, answer the question: "Is it a word?" by pressing either "Y" for "Yes" or "N" for "No".

Please respond as quickly and accurately as you can.

You will probably have to blink during the task. Try to blink as little as possible, except when the square is present.

Lexical Decision Task—Task

REMEMBER: In the following task you will be shown X’s which turn into letters, which turn back into X’s.

After seeing a stimulus, answer the question: "Is it a word?" by pressing either "Y" for "Yes" or "N" for "No".

Please respond as quickly and accurately as you can.

You will probably have to blink during the task. Try to blink as little as possible, except when the square is present.

Valence Identification Task—Practice

We will now try three practice items for the "What’s the emotion?" task. In the following task you will be shown X’s which turn into letters, which turn back into X’s.

After seeing a stimulus, answer the question: "What’s the emotion?" by pressing "+" for positive, "-" for negative, or "N" for neutral (not positive or negative).

Please respond as quickly and accurately as you can.

You will probably have to blink during the task. Try to blink as little as possible, except when the square is present.

Valence Identification Task—Task

REMEMBER: In the following task you will be shown X’s which turn into letters, which turn back into X’s. After seeing a stimulus, answer the question: "What’s the emotion?" by pressing "+" for positive, "-" for negative, or "N" for neutral (not positive or negative).

Please respond as quickly and accurately as you can.

You will probably have to blink during the task. Try to blink as little as possible, except when the square is present.

Word Rating Task

For each word set: In the following task you will be asked to rate each of a series of words on how emotional it is for you, using a 7 point scale. When each word appears, please press a number, 1 through 7 corresponding to the following scale:

1 -- Very Negative

2 -- Negative

3 -- Somewhat negative

4 -- Neutral, not emotional at all

5 -- Somewhat positive

6 -- Positive

7 -- Very Positive

Between the normed and idiosyncratic words: Now we’ll try another word set. Please rate the words in the exact same way you did the previous set.

Warned Reaction Time Task

In the following task you will be shown X’s which turn into letters, which turn back into X’s.

As soon as the stimulus (letters) appear, press the middle button as quickly as you can.

You will probably have to blink during the task. Try to blink as little as possible, except when the square is present.

Gaze Task

In the following task you will be shown X’s which turn into letters, which turn back into X’s. You do not have to press anything in this task. Just watch the stimuli as they appear on the screen. You will probably have to blink during the task. Try to blink as little as possible, except when the square is present.

 


 

APPENDIX H: SOFTWARE USED FOR DATA COLLECTION AND ANALYSIS

An extensive set of software was written to allow collection and analysis of reaction time, error rate, and pupil data. I wrote programs to generate stimuli and collect behavioral data. Stored data included the presented stimuli, their valence and personal relevance, participants’ responses, and reaction times. The Pupil program, distributed by Micromeasurements for use with the pupillometer, was used to record data from the pupillometer. Software written by members of Dr. Eric Granholm’s lab was used to transform data collected by the Pupil program into ASCII files, to do initial smoothing of pupil data, and to interpolate blinks. I wrote software that imported all of the behavioral and pupil data into a Microsoft Access™ database.

I wrote a program to manage this database, with the following features. First, it scores all data from the questionnaires. The next set of features is devoted to assuring that the stored data are valid. The program checks the behavioral and pupil data for consistency, eliminating missing trials (entered by the user or detected automatically by the software) and removing any trials for which the pupillometer reaction times were not consistent with the recorded behavioral reaction times. Very few trials were removed for this reason; it happened only when trials were recorded behaviorally but not on the pupillometer.
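The consistency check described above can be illustrated with a minimal sketch. It is not the original database code: the `Trial` record, the field names, and the 50 ms tolerance are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    trial_id: int
    behavioral_rt_ms: Optional[float]  # reaction time logged by the stimulus program
    pupil_rt_ms: Optional[float]       # reaction time marked on the pupillometer record


def consistent_trials(trials: List[Trial], tolerance_ms: float = 50.0) -> List[Trial]:
    """Keep trials present in both records whose two reaction times agree.

    tolerance_ms is an assumed threshold, not a value taken from the study.
    """
    kept = []
    for trial in trials:
        # A trial missing from either record cannot be cross-checked: drop it.
        if trial.behavioral_rt_ms is None or trial.pupil_rt_ms is None:
            continue
        # Drop trials whose two reaction-time records disagree.
        if abs(trial.behavioral_rt_ms - trial.pupil_rt_ms) > tolerance_ms:
            continue
        kept.append(trial)
    return kept


trials = [
    Trial(1, 812.0, 815.0),   # consistent
    Trial(2, 640.0, None),    # recorded behaviorally but not on the pupillometer
    Trial(3, 701.0, 1290.0),  # inconsistent reaction times
]
kept_ids = [t.trial_id for t in consistent_trials(trials)]
```

In this toy example only the first trial survives the check.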

The next set of features allows aggregate statistics to be calculated for each waveform. The user can specify whether or not simple-reaction time curves should be selected from pupil data. The user enters time thresholds for early and late windows (early-threshold and late-threshold) on the pupil trial, and an offset after which post-reaction-time dilations are recorded. The following data are recorded for each waveform: stimulus, valence, personal relevance, rated valence (1-7), reaction time, response, whether the response was an error, and recorded pupillometer reaction time.

The following statistics are calculated from pupil waveforms: peak amplitude, peak latency, amplitude of peak after reaction-time plus offset, amplitude of peak after late-threshold, average pupil area, average area pre-stimulus (baseline), average area in the last five measurements pre-stimulus (a different baseline), average area post-stimulus, average area post-stimulus but pre-reaction-time, average area post reaction-time plus offset, average area pre-peak, average area post-peak, average area post-stimulus but pre-early-threshold, average area post-late-threshold, whether a blink occurs during the baseline, whether a blink occurs during the peak, percent of trial with blinks, dilation at reaction-time, slope pre-user-defined-time-threshold, slope post-user-defined-time-threshold, slope pre-reaction-time, slope post-reaction-time plus user-defined-time-threshold, and slope post-peak.

The next set of features allows data from waveforms to be aggregated for each person subject to any combination of the following constraints: using only correct responses, using only responses to trials in which the valence rating matched the normed or collected valence (described in detail in the Results section), elimination of outliers after a user-defined time-threshold, subtraction of baselines from each pupil waveform, and rescaling of outliers as described in the Results section. Data can be aggregated using means, medians, or harmonic means. Calculated statistics for each person include the central tendency of each waveform variable and the standard deviation of a number of waveform variables.
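A few of the per-waveform statistics listed above can be sketched as follows. This is an illustrative sketch only, not the original program: the function name, the dictionary keys, and the 20 ms sample spacing are assumptions, and values are reported without baseline subtraction (which the text describes as a separate option).

```python
from statistics import mean
from typing import Dict, List


def waveform_stats(samples: List[float], stim_index: int, rt_ms: float,
                   sample_ms: float = 20.0) -> Dict[str, float]:
    """Summarize one pupil waveform (one trial).

    samples    -- pupil measurements, evenly spaced in time
    stim_index -- index of the first post-stimulus sample
    rt_ms      -- reaction time in ms, measured from stimulus onset
    sample_ms  -- assumed spacing between samples (hypothetical default)
    """
    pre, post = samples[:stim_index], samples[stim_index:]
    # Offset (in samples) of the largest post-stimulus measurement.
    peak_offset = max(range(len(post)), key=lambda i: post[i])
    # Sample closest to the reaction time, clipped to the recorded window.
    rt_offset = min(int(round(rt_ms / sample_ms)), len(post) - 1)
    return {
        "baseline": mean(pre),                  # average pre-stimulus
        "baseline_last5": mean(pre[-5:]),       # last five pre-stimulus samples
        "peak_amplitude": post[peak_offset],
        "peak_latency_ms": peak_offset * sample_ms,
        "mean_post_stimulus": mean(post),
        "mean_pre_rt": mean(post[:max(rt_offset, 1)]),  # post-stimulus, pre-reaction-time
        "dilation_at_rt": post[rt_offset],
    }


# Toy waveform: five flat baseline samples, then a dilation peaking 40 ms after onset.
samples = [4.0] * 5 + [4.0, 4.2, 4.6, 4.4, 4.1]
stats = waveform_stats(samples, stim_index=5, rt_ms=40.0)
```

Baseline subtraction, blink flags, and the slope measures would be added in the same style; they are left out here to keep the sketch short.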

Additionally, the user can generate the mean, median, or harmonic mean of pupil dilation curves for each individual in each condition (task, valence, personal relevance), as well as aggregate curves for depressed and nondepressed individuals. A final set of features is devoted to transferring these computed data to external files or to SPSS. That is, the program can automatically send various commands to SPSS, allowing large sets of graphs or statistics to be created at one time. The interface for this software is shown in Figure 14, for no other reason than I want my advisors to see the result of the months of programming I did for this dissertation. Statistics were calculated in SPSS. All of the software and SPSS syntax that I have written for this project is available upon request.

 

Figure 14. Software used for management of pupil dilation database
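The condition-wise averaging of dilation curves described in this appendix can be sketched as below. The condition tuple, the function name, and the toy values are hypothetical; the three aggregation options (mean, median, harmonic mean) are the ones named in the text.

```python
from collections import defaultdict
from statistics import harmonic_mean, mean, median
from typing import Callable, Dict, List, Tuple

# (task, valence, personally_relevant): an assumed way of labelling conditions.
Condition = Tuple[str, str, bool]

AGGREGATORS: Dict[str, Callable[[List[float]], float]] = {
    "mean": mean,
    "median": median,
    "harmonic_mean": harmonic_mean,  # requires non-negative values, e.g. pupil area
}


def aggregate_curves(trials: List[Tuple[Condition, List[float]]],
                     how: str = "mean") -> Dict[Condition, List[float]]:
    """Collapse the trials of each condition into one curve, time point by time point."""
    aggregate = AGGREGATORS[how]
    by_condition: Dict[Condition, List[List[float]]] = defaultdict(list)
    for condition, curve in trials:
        by_condition[condition].append(curve)
    # zip(*curves) walks the curves in parallel and stops at the shortest one.
    return {
        condition: [aggregate(list(column)) for column in zip(*curves)]
        for condition, curves in by_condition.items()
    }


trials = [
    (("valence_id", "negative", True), [4.0, 4.4, 5.0]),
    (("valence_id", "negative", True), [4.2, 4.8, 5.0]),
    (("valence_id", "neutral", False), [4.0, 4.1, 4.2]),
]
mean_curves = aggregate_curves(trials, how="mean")
median_curves = aggregate_curves(trials, how="median")
```

Group-level curves for depressed and nondepressed individuals could be produced by applying the same function to the per-person curves, with group membership added to the condition label.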

 


 
 
 
APPENDIX I: Examples of a few selected pupil dilation curves

 

 

 
  
 
 
 
APPENDIX J: Robustness of PCA analyses

The Principal Components Analyses (PCAs) of pupil dilations reported in the text assume that the same factor structure applies to both depressed and nondepressed individuals. It is possible, however, that depressed and nondepressed individuals do not share information processing mechanisms to such an extent. In that case, using one PCA for both depressed and nondepressed individuals would not be valid. To check whether the same factor structure likely applies to both groups, separate PCAs were conducted for depressed and nondepressed individuals. Results of each PCA are shown in the following figure. The PCAs for each group revealed factors with similar peaks (near +250ms, +120ms, +10ms, -60ms, -140ms) relative to the individuals’ reaction time. A few differences in the shape and relative strength of factors were evident. In the depressed group, the late, or "ruminative" factor, having a peak near 250ms, accounted for more variance, began earlier, and was more strongly represented the entire time after the stimulus was presented. In contrast, the "cognitive" factor, having a peak approximately 120ms after the reaction time, accounted for the most variance in the nondepressed group and was more strongly represented throughout the post-reaction-time window in that group. The overall conclusion from these factor analyses is thus that qualitatively similar processes appear to operate during the valence identification task in depressed and nondepressed individuals, though depressed individuals may engage more in rumination throughout the task.
 
 

 

 
 
 
Figure 17: Separate PCAs of pupil dilation for depressed and nondepressed individuals
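The logic of running a separate PCA in each group and comparing where the resulting factors peak can be sketched on toy data. This is not the SPSS procedure used for the reported analyses: the sketch extracts only the first unrotated component by power iteration, the waveforms are invented, and the use of Tucker's congruence coefficient to compare loadings is an added illustration rather than something reported in the text.

```python
from typing import List, Sequence


def first_component(waveforms: Sequence[Sequence[float]],
                    iterations: int = 100) -> List[float]:
    """Loadings of the first principal component across time points."""
    n, t = len(waveforms), len(waveforms[0])
    means = [sum(w[j] for w in waveforms) / n for j in range(t)]
    centered = [[w[j] - means[j] for j in range(t)] for w in waveforms]
    # Covariance between every pair of time points.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(t)]
           for a in range(t)]
    vec = [1.0] * t
    for _ in range(iterations):  # power iteration toward the top eigenvector
        nxt = [sum(cov[a][b] * vec[b] for b in range(t)) for a in range(t)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    # Fix the arbitrary sign so that the largest loading is positive.
    if max(vec, key=abs) < 0:
        vec = [-x for x in vec]
    return vec


def peak_time(loadings: Sequence[float], times_ms: Sequence[int]) -> int:
    """Time (relative to reaction time) at which the component loads most strongly."""
    return times_ms[max(range(len(loadings)), key=lambda i: abs(loadings[i]))]


def congruence(a: Sequence[float], b: Sequence[float]) -> float:
    """Tucker's congruence coefficient between two loading vectors."""
    num = sum(x * y for x, y in zip(a, b))
    return num / ((sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5)


# Invented waveforms, time-locked to the reaction time (0 ms).
times_ms = [-100, 0, 100, 200, 300]
late_shape = [0.0, 0.1, 0.3, 1.0, 0.4]    # hypothetical late-peaking response
early_shape = [0.0, 0.2, 1.0, 0.5, 0.1]   # hypothetical earlier-peaking response
scales = (0.5, 1.0, 1.5, 2.0)
depressed = [[s * x for x in late_shape] for s in scales]
nondepressed = [[s * x for x in early_shape] for s in scales]

dep_loadings = first_component(depressed)
non_loadings = first_component(nondepressed)
dep_peak = peak_time(dep_loadings, times_ms)
non_peak = peak_time(non_loadings, times_ms)
similarity = congruence(dep_loadings, non_loadings)
```

With real data, similar peak times and high congruence across groups would support applying a single factor structure to both, which is the check this appendix describes.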

 
 
 
 
APPENDIX K: ANOVA planned contrasts on differences between pupil dilation factors

Results from relevant tests and contrasts on the described ANOVAs are presented in the following tables. Interaction contrasts test whether the difference in responses to personally relevant negative words and other words was different for depressed and nondepressed individuals. Main effect contrasts test whether the difference was statistically significant for all individuals, interpreted as a group. All interaction contrasts were planned. All main effect contrasts are exploratory and were done post-hoc. The tables present only the results from stimuli for which valence ratings were consistent with normed valences; using nonmatching stimuli as well did not suggest qualitatively different results. Statistically significant contrasts are highlighted with an asterisk in the statistical significance column.
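The effect sizes in the final column of the tables below appear consistent with partial eta squared computed from each F ratio and its degrees of freedom. That reading is an inference from the printed values, not something stated in the text, and the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Three rows from Table 11, given as (F, df effect, df error).
depression_factor1 = round(partial_eta_squared(5.85, 1, 45), 2)
interaction_factor2 = round(partial_eta_squared(3.31, 3, 43), 2)
valence_factor2 = round(partial_eta_squared(5.07, 3, 43), 2)
```

Rounded to two decimals, these three calls give 0.12, 0.19, and 0.26, matching the corresponding table entries.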

Table 11.

Tests of relevant effects and contrasts for each component of pupil dilation for the valence identification task. (p) = personally relevant

Factor / Test              Effect or contrast         F      df      p       η²
Factor 1: Sustained Attention
  Effects:                 Valence * Depression       0.26   3, 43   0.85    0.02
                           Valence                    0.18   3, 43   0.91    0.01
                           Depression                 5.85   1, 45   0.02*   0.12
  Interaction Contrasts:   Positive v. Negative(p)    0.73   1, 45   0.40    0.02
                           Neutral v. Negative(p)     0.12   1, 45   0.73    0.00
                           Negative v. Negative(p)    0.20   1, 45   0.66    0.00
Factor 2: Cognitive
  Effects:                 Valence * Depression       3.31   3, 43   0.03*   0.19
                           Valence                    5.07   3, 43   0.00*   0.26
                           Depression                 0.01   1, 45   0.94    0.00
  Interaction Contrasts:   Positive v. Negative(p)    6.51   1, 45   0.01*   0.13
                           Neutral v. Negative(p)     0.35   1, 45   0.55    0.01
                           Negative v. Negative(p)    1.21   1, 45   0.28    0.03
Factor 3: Motor
  Effects:                 Valence * Depression       0.56   3, 43   0.64    0.04
                           Valence                    1.23   3, 43   0.31    0.08
                           Depression                 7.20   1, 45   0.01*   0.14
  Interaction Contrasts:   Positive v. Negative(p)    0.01   1, 45   0.94    0.00
                           Neutral v. Negative(p)     0.73   1, 45   0.40    0.02
                           Negative v. Negative(p)    0.11   1, 45   0.75    0.00
Factor 4: Early Attention / Perception
  Effects:                 Valence * Depression       0.53   3, 43   0.67    0.04
                           Valence                    3.87   3, 43   0.02*   0.21
                           Depression                 5.37   1, 45   0.03*   0.11
  Interaction Contrasts:   Positive v. Negative(p)    1.55   1, 45   0.22    0.03
                           Neutral v. Negative(p)     0.20   1, 45   0.66    0.00
                           Negative v. Negative(p)    0.62   1, 45   0.43    0.01
  Main Effect Contrasts:   Positive v. Negative(p)    1.18   1, 45   0.28    0.03
                           Neutral v. Negative(p)     4.95   1, 45   0.03*   0.10
                           Negative v. Negative(p)    0.93   1, 45   0.34    0.02

 

 

Table 12.

Tests of relevant effects and contrasts for each component of pupil dilation for the lexical decision task. (p) = personally relevant

Factor / Test              Effect or contrast         F      df      p       η²
Factor 1: Sustained Attention
  Effects:                 Valence * Depression       1.21   3, 43   0.32    0.08
                           Valence                    2.94   3, 43   0.04*   0.17
                           Depression                 1.37   1, 45   0.25    0.03
  Interaction Contrasts:   Positive v. Negative(p)    2.21   1, 45   0.14    0.05
                           Neutral v. Negative(p)     0.37   1, 45   0.55    0.01
                           Negative v. Negative(p)    0.01   1, 45   0.91    0.00
  Main Effect Contrasts:   Positive v. Negative(p)    3.40   1, 45   0.07    0.07
                           Neutral v. Negative(p)     5.55   1, 45   0.02*   0.11
                           Negative v. Negative(p)    8.70   1, 45   0.01*   0.16
Factor 2: Cognitive
  Effects:                 Valence * Depression       1.57   3, 43   0.21    0.10
                           Valence                    1.96   3, 43   0.13    0.12
                           Depression                 5.15   1, 45   0.03*   0.10
  Interaction Contrasts:   Positive v. Negative(p)    4.31   1, 45   0.04*   0.09
                           Neutral v. Negative(p)     0.84   1, 45   0.37    0.02
                           Negative v. Negative(p)    0.65   1, 45   0.42    0.01
Factor 3: Motor
  Effects:                 Valence * Depression       0.68   3, 43   0.57    0.05
                           Valence                    0.65   3, 43   0.59    0.04
                           Depression                 3.19   1, 45   0.08    0.07
  Interaction Contrasts:   Positive v. Negative(p)    0.00   1, 45   0.96    0.00
                           Neutral v. Negative(p)     0.55   1, 45   0.46    0.01
                           Negative v. Negative(p)    0.20   1, 45   0.65    0.00
Factor 4: Early Attention / Perception
  Effects:                 Valence * Depression       0.95   3, 43   0.43    0.06
                           Valence                    1.97   3, 43   0.13    0.12
                           Depression                 0.68   1, 45   0.41    0.01
  Interaction Contrasts:   Positive v. Negative(p)    0.54   1, 45   0.46    0.01
                           Neutral v. Negative(p)     0.89   1, 45   0.35    0.02
                           Negative v. Negative(p)    0.11   1, 45   0.77    0.00
  Main Effect Contrasts:   Positive v. Negative(p)    6.01   1, 45   0.02*   0.12
                           Neutral v. Negative(p)     1.91   1, 45   0.17    0.04
                           Negative v. Negative(p)    1.96   1, 45   0.17    0.04

 

References

Amaral, D., Price, J., Pitkanen, A., & Carmichael, S. T. (1992). Anatomical organization of the primate amygdaloid complex. In J. P. Aggleton (Ed.), The amygdala: Neurobiological aspects of emotion, memory, and mental dysfunction (pp. 191-228). New York, NY: Wiley-Liss.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: American Psychiatric Association.

Anderson, J. A. (1990). Hybrid computation in cognitive science: Neural networks and symbols. Applied Cognitive Psychology, 4, 337-347.

Arbib, M. (1987). Brains, Machines, and Mathematics. New York: Springer-Verlag.

Barnden, J. (1995). Artificial intelligence and neural networks. In M. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks. Cambridge, MA: MIT Press.

Beatty, J. (1980). Pupillometric signs of selective attention in man. (Technical Report 10). Los Angeles: University of California, Los Angeles, Human Neurophysiology Laboratory.

Beatty, J. (1982). Task-evoked pupillary responses, processing load, and the structure of processing resources. Psychological Bulletin, 91, 276-292.

Beatty, J. (1986). The pupil system. In M. G. H. Coles et al. (Eds.), Psychophysiology: Systems, Processes, and Applications. New York: Guilford.

Beck, A. T. (1967). Depression: Clinical, experimental, and theoretical aspects. New York: Hoeber.

Beck, A. T. (1974). The development of depression. In R. J. Friedman & M. M. Katz (Eds.), The psychology of depression. New York: Winston-Wiley.

Beck, A., Steer, R., & Garbin, M. (1988). Psychometric properties of the Beck Depression Inventory: Twenty-five years of evaluation. Clinical Psychology Review, 8, 77-100.
 

Blaney, P. (1986). Affect and memory: A review. Psychological Bulletin, 99, 229-246.

Blank, D. S., Meeden, L. A., & Marshall, J. B. (1991). Exploring the symbolic/subsymbolic continuum: A case study of RAAM. In J. Dinsmore (Ed.), Closing the Gap: Symbolism vs. Connectionism. Hillsdale, NJ: Erlbaum.

Bower, G. (1981). Mood and memory. American Psychologist, 36, 129-148.

Bradley, B. P., Mogg, K., & Williams, R. (1994). Implicit and explicit memory for emotional information in non-clinical subjects. Behaviour Research & Therapy, 32(1), 65-78.

Bray, D. E. (1988). The effects of hedonic manipulations on the perceptual processing of linguistic material. Dissertation Abstracts International, 49, 1413.

Bullinaria, J. A. (1994). Modelling reaction times. In L. Smith & P. J. B. Hancock (Eds.), Neural Computation and Psychology. London: Springer.

Challis, B. H. & Krane, R. V. (1988). Mood induction and the priming of semantic memory in a lexical decision task: Asymmetric effects of elation and depression. Bulletin of the Psychonomic Society, 26(4), 309-312.

Clark, D. M., Teasdale, J. D., Broadbent, D. E., & Martin, M. (1983). Effect of mood on lexical decisions. Bulletin of the Psychonomic Society, 21(3), 175-178.

Cohen, J. D., Dunbar, K., & McClelland, J. (1990). On the control of automatic processes: A parallel distributed processing account of the Stroop effect. Psychological Review, 97, 332-361.

Cohen, J. D., & Servan-Schreiber, D. (1992). Introduction to neural network models in psychiatry. Psychiatric Annals, 22(3), 113-118.

Cohen, J. (1977). Statistical power analysis for the behavioral sciences, revised edition. New York: Academic Press.

Coles, M. G. H., Gratton, G., Kramer, A. F., & Miller, G. A. (1986). Principles of signal acquisition and analysis. In M. G. H. Coles, E. Donchin & S. W. Porges (Eds.), Psychophysiology: Systems, processes, and applications. New York: Guilford Press.

Collins, A., & Loftus, E. (1975). A spreading-activation theory of semantic processing. Psychological Review, 82, 407-428.

Coyne, J. C. (1994). Self-reported distress: Analog or ersatz depression? Psychological Bulletin, 116, 29-45.

Coyne, J. C. & Gotlib, I. H. (1983). The role of cognition in depression: A critical appraisal. Psychological Bulletin, 94, 472-505.

Davidson, R. (1997). Affective style and affective disorders: Perspectives from affective neuroscience. Address given at the meeting of the Society for Research in Psychopathology, Palm Springs, CA.

Davidson, R. (1998). Affective style and affective disorders: Perspectives from affective neuroscience. Address given at the Fourth Annual Wisconsin Symposium on Emotion: Affective Neuroscience, Madison, WI.

Deijen, J. B., Orlebeke, J. F., & Rijsdijk, F. V. (1993). Effect of depression on psychomotor skills, eye movements and recognition-memory. Journal of Affective Disorders, 29, 33-40.

Derryberry, D. (1988). Emotional influences on evaluative judgements: Roles of arousal, attention, and spreading activation. Motivation and Emotion, 12(1), 23-55.

Diedrich, O., Naumann, E., Maier, S., Becker, G., & Bartussek, D. (in press). A frontal positive slow wave in the ERP in the context of emotional slides. Journal of Psychophysiology.

Dozois, D. J. A. & Dobson, K. S. (1998). A review of the Stroop task in psychopathology. Unpublished manuscript.

Fahlman, S. E., & Lebiere, C. (1991). The cascade-correlation learning architecture. (Tech. Rep. No. CMU-CS-90-100). Pittsburgh, PA: Carnegie Mellon University, Computer Science Department.

Fernandez de Molina, A. & Hunsperger, R. W. (1962). Organization of the subcortical system governing defence and flight reactions in the cat. Journal of Physiology, 7, 200-213.

Flaherty, J. A., Gavira, F. M. & Val, E. R. (1982). Diagnostic considerations. In E. R. Val, F. M. Gavira, & J. A. Flaherty (Eds.), Affective disorders: Psychopathology and treatment. Chicago: Year Book Medical Publishers.

Friedenberg, L. (1995). Psychological testing: Design, analysis, and use. Needham Heights, MA: Allyn & Bacon.

Ghose, K. (1976). Correlation of pupil reactivity to tyramine or hydroxyamphetamine and tyramine pressor responses in patients treated with amitriptyline or mianserin. British Journal of Clinical Pharmacology, 3, 666-667.

Gotlib, I. H. (1984). Depression and general psychopathology in university students. Journal of Abnormal Psychology, 93, 19-30.

Greenberger, D. & Padesky, C. A. (1995). Mind over mood: A cognitive therapy treatment manual for clients. New York: Guilford.

Hakerem, G. & Sutton, S. (1966). Pupillary response at visual threshold. Nature, 212, 485-486.

Halgren, E. (1992). Emotional neurophysiology of the amygdala within the context of human cognition. In J. P. Aggleton (Ed.), The amygdala: Neurobiological aspects of emotion, memory, and mental dysfunction (pp. 191-228). New York, NY: Wiley-Liss.

Hebb, D. O. (1949). The organization of behavior: A neuropsychological theory. New York: Wiley.

Hecht-Nielsen, R. (1990). Neurocomputing. CA: Addison-Wesley Publishing Company.

Hedges, L. V. & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.

Hess, E. H. (1972). Pupillometrics: A method of studying mental, emotional, and sensory processes. In N. S. Greenfield & R. A. Sternbach (Eds.), Handbook of psychophysiology (pp. 491-531). New York, N.Y.: Holt, Rinehart & Winston.

Hess, E. H. & Polt, J. H. (1964). Pupil size in relation to mental activity during simple problem solving. Science, 182, 177-180.

Hill, A. B. & Dutton, F. (1989). Depression and selective attention to self-esteem threatening words. Personality and Individual Differences, 10, 915-917.

Hill, A. B., & Kemp-Wheeler, S. M. (1989). The influence of anxiety on lexical and affective decision time for emotional words. Personality and Individual Differences, 10, 1143-1149.

Hinton, G. E. (Ed.). (1991). Connectionist symbol processing. Cambridge, MA: MIT Press.

Hinton, G. E., McClelland, J. L., & Rumelhart, D. E. (1986). Distributed Representations. In J. L. McClelland, & D. E. Rumelhart (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol 1) (pp. 77-109). Cambridge, MA: MIT Press.

Hutt, L. D., & Anderson, J. P. (1967). The relationship between pupil size and recognition threshold. Psychonomic Science, 9, 477-478.

Ingram, R. (1984). Toward an information processing analysis of depression. Cognitive Therapy and Research, 8, 443-478.

Ingram, R. (1990). Self-focused attention in clinical disorders: Review and a conceptual model. Psychological Bulletin, 107, 156-176.

Ingram, R. E. & Hollon, S. D. (1986). Cognitive therapy of depression from an information processing perspective. In R. Ingram (Ed.), Information processing approaches to clinical psychology (pp. 261-284). New York: Academic Press.

Ingram, R. E., Miranda, J., & Segal, Z. V. (1998). Cognitive vulnerability to depression. New York, NY: Guilford.

Janisse, M. P. (1973). Pupil size and affect: A critical review of the literature since 1960. Canadian Psychologist, 14, 311-329.

Jencius, S. T. (1998). Personality: Simple versus complex systems. Presentation at the 9th European conference on personality. Surrey, England.

Jobe, T. H., Fichtner, C. G., Port, J. D., & Gavira, M.M. (1995). Neuropoiesis: Proposal for a connectionistic neurobiology. Medical Hypotheses, 45, 147-163.

Kahneman, D. & Beatty, J. (1966). Pupil diameter and load on memory. Science, 154, 1583-1585.

Kendall, P. C., Hollon, S. D., Beck, A. T., Hammen, C. L., & Ingram, R. E. (1987). Issues and recommendations regarding use of the Beck Depression Inventory. Cognitive Therapy and Research, 11(3), 289-299.

Kiers, H. A. & Krijnen, W. P. (1991). An efficient algorithm for PARAFAC of three-way data with large numbers of observation units. Psychometrika, 56, 147-152.

Koikegami, H. & Yoshida, K. (1953). Pupillary dilation induced by stimulation of amygdaloid nuclei. Folia Psychiatrica Neurologica Japonica, 7, 109-125.

LeDoux, J. E. (1989). Cognitive-emotional interactions in the brain. Cognition and Emotion, 3, 267-289.

LeDoux, J. E. (1992). Emotion and the amygdala. In J. P. Aggleton (Ed.), The amygdala: Neurobiological aspects of emotion, memory, and mental dysfunction (pp. 339-351). New York, NY: Wiley-Liss.

LeDoux, J. E. (1995). Emotion: Clues from the brain. Annual Review of Psychology, 46, 209-235.

LeDoux, J. E. (1997). Emotion, memory, and the brain. Presentation at the meeting of the American Psychological Association, New York, New York.

Liakos, A. & Crisp, A. H. (1971). Pupil size in psychoneurotic patients: A psychophysiological and psychometric investigation. Psychotherapy and Psychosomatics, 19, 104-110.

Luce, R. D. (1963). A threshold theory for simple detection experiments. Psychological Review, 70, 61-79.

Luce, R. D. & Narens, L. (1983). Symmetry, scale types, and generalizations of classical physical measurement. Journal of Mathematical Psychology, 27, 44-85.

Luciano, J. S. (1997). A neural network model of major unipolar depression based on anatomical, pharmacological, and psychiatric data. Unpublished doctoral dissertation, University of Boston, MA.

MacLeod, C., & Mathews, A. M. (1991). Cognitive-experimental approaches to the emotional disorders. In Paul R. Martin, (Ed.), Handbook of behavior therapy and psychological science: An integrative approach, 164 (pp. 116-150). New York: Pergamon Press.

MacLeod, C., Mathews, A. M., & Tata, P. (1986). Attentional bias in emotional disorder. Journal of Abnormal Psychology, 95(1), 15-20.

MacLeod, C., Tata, P., & Mathews, A. (1987). Perception of emotionally valenced information in depression. British Journal of Clinical Psychology, 26, 67-68.

Marley, A. A. (Ed.). (1995). Choice, decision, and measurement: Essays in honor of R. Duncan Luce. Mahwah, NJ: Erlbaum.

Martin, M., Williams, R. M., & Clark, D. M. (1991). Does anxiety lead to selective processing of threat related information? Behaviour Research and Therapy, 29, 147-160.

Mathews, A. & Milroy, R. (1994). Processing of emotional meaning in anxiety. Cognition and Emotion, 8, 535-553.

Matt, G., Vazquez, C., & Campbell, W. (1992). Mood-congruent recall of affectively toned stimuli: A meta-analytic review. Clinical Psychology Review, 12, 227-255.

Matthews, G. & Harley, T. A. (1996). Connectionist models of emotional distress and attentional bias. Cognition and Emotion, 10, 561-600.

Matthews, G. & Southall, A. (1991). Depression and the processing of emotional stimuli: A study of semantic priming. Cognitive Therapy and Research, 15, 283-302.

McClelland, J. L., Rumelhart, D. E., & Hinton, G. E. (1985). The appeal of parallel distributed processing. In J. L. McClelland, & D. E. Rumelhart (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition, (Vol 2) (pp. 3-44). Cambridge, MA: MIT Press.

Miyata, Y. (1991). A user’s guide to PlaNet version 5.6: A tool for constructing, running, and looking into a PDP network. (Available from Yoshiro Miyata, Department of Computer Science, University of Colorado at Boulder, Boulder, CO 80309-0430).

Morrow, J., & Nolen-Hoeksema, S. (1990). Effects of responses to depression on the remediation of depressive affect. Journal of Personality and Social Psychology, 58, 519-527.

Movellan, J. R., & McClelland, J. L. (1994). Stochastic interactive processing, channel separability, and optimal perceptual interference: An examination of Morton’s law. Department of Psychology, Carnegie Mellon University, Technical Report PDP.CNS.95.4.

Muijen, M., Jones, D. P., Roy, D., Silverstone, T. & Mehmet, A. (1989). Mianserin withdrawal and the pupil response of depressed and recovered patients: A preliminary report. Biological Psychiatry, 25, 810-814.

Naumann, E., Bartussek, D., Diedrich, O., & Laufer, M. E. (1992). Assessing cognitive and affective information processing functions of the brain by means of the late positive complex of the event-related potential. Journal of Psychophysiology, 6, 285-298.

Nolen-Hoeksema, S. & Morrow, J. (1991). A prospective study of depression and posttraumatic stress symptoms after a natural disaster: The 1989 Loma Prieta earthquake. Journal of Personality and Social Psychology, 61, 115-121.

Nolen-Hoeksema, S., Morrow, J. & Fredrickson, B. L. (1993). Response styles and the duration of episodes of depressed mood. Journal of Abnormal Psychology, 102, 20-28.

Nolen-Hoeksema, S., Parker, L. E., & Larson, J. (1994). Ruminative coping with depressed mood following loss. Journal of Personality and Social Psychology, 67, 92-104.

Park, B. (1998). A connectionist account of antidepressant action. Unpublished manuscript. Available from the Connectionist models of cognitive, affective, brain, and behavioral disorders web site at www.sci.sdsu.edu/CAL/connectionist-models/.

Paykel, E. S. (1979). Causal relationships between clinical depression and life events. In Barrett, J. E. (Ed.), Stress and mental disorder (pp. 71-86). New York: Raven Press.

Podgorny, P. & Garner, W. R. (1979). Reaction time as a measure of inter- and intraobject visual similarity: Letters of the alphabet. Perception and Psychophysics, 26, 37-52.

Powell, M. & Helmsley, D. R. (1984). Depression: a breakdown of perceptual defense? British Journal of Psychiatry, 145, 358-362.

Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85(2), 59-108.

Ratcliff, R. (1993). Methods for dealing with reaction time outliers. Psychological Bulletin, 114, 510-532.

Riemann, B.C. & McNally, R. J. (1995). Cognitive processing of personally relevant information. Cognition and Emotion, 9, 325-340.

Ruiz Caballero, J. A. & Bermudez Moreno, J. (1992). Individual differences in depression, induced mood, and perception of emotionally toned words. European Journal of Personality, 6(3), 215-224.

Rumelhart , Hinton, & McClelland (1986). A general framework for parallel distributed processing. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition, (Vol. 1) (pp. 45-76). MA: MIT Press.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1, pp. 318-362). Cambridge, MA: MIT Press.

Sarle, W. S. (1994). Neural networks and statistical models. Proceedings of the Nineteenth Annual SAS Users Group International Conference (pp. 1538-1550). Cary, NC: SAS Institute.

Segal, Z. V., Gemar, M., Truchon, C., Guirguis, M., & Horowitz, L. M. (1995). A priming methodology for studying self-representation in major depressive disorder. Journal of Abnormal Psychology, 104, 205-213.

Seidenberg, M. S. & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.

Shur, E., & Checkley, S. (1982). Pupil studies in depressed patients: An investigation of the mechanism of action of desipramine. British Journal of Psychiatry, 140, 181-184.

Siegle, G. J. (1994). The Balanced Affective Word List Creation Program. Available on the World Wide Web at www.sci.sdsu.edu/CAL/wordlist.html.

Siegle, G. J. (1996). Rumination on affect: Cause for negative attention biases in depression? Unpublished Master’s Thesis, San Diego State University.

Siegle, G. J. (1997). Why I Make Models (or What I Learned in Graduate School About Validating Clinical Causal Theories With Computational Models). The Behavior Therapist, 20, 179-184.

Siegle, G. J. (1998a). A neural network model of affective interference in depression. Presentation at the International Workshop on Neural Network Models of Cognitive and Brain Disorders, College Park, MD.

Siegle, G. J. (1998b). Connectionist models of cognitive, affective, brain, and behavioral disorders. Available on the World Wide Web at http://www.sci.sdsu.edu/CAL/connectionist-models.

Siegle, G. J., Ingram, R. E., & Matt, G. E. (1995). A neural network model of information processing biases in depression. Presentation at the workshop Neural Modeling of Cognitive and Brain Disorders, College Park, Maryland.

Siegle, G. J., Ingram, R. E., & Matt, G. E. (1999). Affective interference: Cause for negative attention biases in depression? Manuscript submitted for publication.

Siegle, G. J. & Ingram, R. E. (1997a). Modeling individual differences in negative information processing biases. In G. Matthews (Ed.), Cognitive science perspectives on personality and emotion. New York, NY: Elsevier.

Siegle, G. J. & Ingram, R. E. (1997b). A neural network model of inability to process emotional information in depression. Presentation at the meeting of the Society for Research in Psychopathology, Palm Springs, CA.

Siegle, G., Ingram, R., Granholm, E., & Matt, G. (1998). Modeling the time course of attention to negative information in depression. In G. Matthews (Chair), Cognitive science perspectives on personality and emotion. Presentation at the 9th European Conference on Personality, Surrey, England.

Siegle, G. J. (1999). A neural network model of attention biases in depression. In J. Reggia & E. Ruppin (Eds.), Disorders of brain, behavior, and cognition: The neurocomputational perspective (pp. 415-441). New York, NY: Elsevier.

Siegle, G. (1999b). World wide web site associated with Greg Siegle’s dissertation. Available on the World Wide Web at http://www.sci.sdsu.edu/CAL/greg/dissert/. While the link to this web site may change in the future, every effort will be made to preserve the presence of these analyses on the Web, and a link to them will be maintained on the first author’s web site (currently http://www.sci.sdsu.edu/CAL/greg/). As such, a search for the first author’s web site on most search engines should allow access to the associated information in the future.

Snyder, C. W., Walsh, W. D., & Pamment, P. R. (1983). Three-mode PARAFAC factor analysis in applied research. Journal of Applied Psychology, 68, 572-583.

Spielberger, C. D. (1983). State-trait anxiety inventory for adults: Permissions set, manual, test booklet, and scoring key. Palo Alto, CA: Mind Garden.

Spitzer, R. L., Williams, J. B., Gibbon, M., & First, M. B. (1992). The Structured Clinical Interview for DSM-III-R (SCID): I. History, rationale, and description. Archives of General Psychiatry, 49, 624-629.

Squire, L. R. (1992). Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans. Psychological Review, 99, 195-231.

Steer, R. A., Ranieri, W. F., Beck, A. T., & Clark, D. A. (1993). Further evidence for the validity of the Beck Anxiety Inventory with psychiatric outpatients. Journal of Anxiety Disorders, 7(3), 195-205.

Steinhauer, S. R. (1982). Emitted and evoked pupillary responses and event-related potentials as a function of reward and task involvement. Unpublished Doctoral Dissertation, City University of New York, New York.

Stip, E. & Lecours, A. R. (1992). Fonctionnement neuropsychologique du deprime. Epreuve de decision lexicale dans la depression majeure. Encephale, 18, 575-583.

Stip, E., Lecours, A. R., Chertkow, H., Elie, R., & O’Connor, K. (1994). Influence of affective words on lexical decision task in major depression. Journal of Psychiatry and Neuroscience, 19(3), 202-207.

Teasdale, J. D. (1988). Cognitive vulnerability to persistent depression. Cognition and Emotion, 2, 247-274.

Teasdale, J. D. & Barnard, P. (1993). Affect, cognition, and change: Remodelling depressive thought. Hillsdale, N.J.: Erlbaum.

Teasdale, J. D., Segal, Z., & Williams, J. M. G. (1995). How does cognitive therapy prevent depressive relapse and why should attentional control (mindfulness) training help? Behaviour Research and Therapy, 33, 25-39.

Tryon, W. W. (1993). Neural networks: I. Theoretical unification through connectionism. Clinical Psychology Review, 13, 341-352.

Tryon, W. (1994). Encoding emotion into a bidirectional associative memory model. In G. Siegle & R. Ingram (Chairs), Connectionist models of negative affect. Panel discussion conducted at the meeting of the Association for the Advancement of Behavior Therapy.

Tucker, D. M. & Derryberry, D. (1992). Motivated attention: Anxiety and the frontal executive functions. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 5, 233-252.

Turner, P. (1975). The human pupil as a model for clinical pharmacological investigation. Journal of the Royal College of Physicians of London, 9, 165-171.

Vacchiano, R. B., Strauss, P. S., Ryan, S., & Hockman, L. (1968). Pupillary response to value linked words. Perceptual and Motor Skills, 27, 207-210.

Vanderploeg, R. D., Brown, W. S., & Marsh, J. T. (1987). Judgments of emotion in words and faces: ERP correlates. International Journal of Psychophysiology, 5, 193-205.

Weaver, K. A., & McNeill, A. N. (1992). Null effect of mood as a semantic prime. Journal of General Psychology, 119(3), 295-301.

Williams, G., Conner, J., Siegle, G., Ingram, R., & Cole, D. (1998). Is more negative less positive? Relating dysphoria to emotion ratings. Presentation at the meeting of the Western Psychological Association, Albuquerque, New Mexico.

Williams, J. M. G., Mathews, A., & MacLeod, C. (1996). The emotional Stroop task and psychopathology. Psychological Bulletin, 120, 3-24.

Williams, J., & Oaksford, M. (1992). Cognitive science, anxiety, and depression: From experiments to connectionism. In Stein & Young (Eds.), Cognitive science and the clinical disorders. San Diego, CA: Academic Press.

Williamson, S., Harpur, T. J., & Hare, R. D. (1991). Abnormal processing of affective words by psychopaths. Psychophysiology, 28(3), 260-273.

Yates, J., & Nasby, W. (1993). Dissociation, affect, and network models of memory: An integrative proposal. Journal of Traumatic Stress, 6, 305-326.