In my blog post of July 24, 2010, Counting Symptoms That Don’t Count, I wrote:
“So what does a
doctor who spends so little time with a patient do to save time? I mean besides
completely ignoring the patient's relationships, history of trauma, humanity, etc…Well,
one thing they can do is ask only about symptoms, and blindly accept the
patient's yes or no answer without even checking to see if the patient
understands the difference between a transient mood state and a psychiatric
symptom. Better yet, before the doctor even sees the patient, he or she can
have the patient fill out a symptom checklist, and base his diagnosis entirely
on that. (Of course, his secretary could make a diagnosis doing that, so the
patient really wouldn't even have to talk to the doctor at all).”
The inappropriate use of self-report screening tests as actual diagnostic
instruments has become more of an issue than ever. As you may recall from earlier posts, such instruments are purposely
designed to cast a wide net so as not to miss someone in need of treatment, and as such, they snare many patients who do not, in fact, need
treatment.
As managed care tightens its ever-present grip, full psychiatric diagnostic interviews are being marginalized. This is especially true in the so-called “collaborative care” models, in which psychiatrists merely advise primary care physicians without necessarily seeing the patient themselves.
Fellow blogger
George Dawson, M.D. beautifully describes the problems with the use of a depression screening instrument in wide use called
the PHQ-9: “…let's talk about
what is really happening here. This is all about a patient coming in and
being given a PHQ-9 depression screening inventory… It generally takes
most patients anywhere from 1 - 3 minutes to check off the boxes.
Conceivably that could lead to a diagnosis of depression in a few more
minutes in the primary care clinic. At that point the patient enters the
antidepressant algorithm and they are officially being treated [they may be given an antidepressant on the
basis of the PHQ-9 results alone - DA]. The care manager reports the
PHQ-9 scores of those who do not improve to the "supervising"
psychiatrist and gets a recommendation to modify treatment.”
No determination
of whether the symptoms are clinically significant. No determination of whether the symptoms
reported are merely relatively normal reactions to adverse environmental events. No nothing.
To appreciate why
symptom checklists are so problematic, I need to discuss something called a Likert Scale. A Likert Scale asks the patient to “rate” a
symptom by level of severity, frequency, importance, or how strongly the test
taker agrees with a statement. There is
usually a 4 to 7 point scale with a number attached. Examples:
Not at all - 0
Several days - 1
More than half the days - 2
Nearly every day - 3
(the PHQ-9 Likert Scale)

Very Frequently = 5
Frequently = 4
Occasionally = 3
Rarely = 2
Never = 1

Not difficult at all = 0
Somewhat difficult = 1
Very difficult = 2
Extremely difficult = 3

Very Important = 5
Important = 4
Moderately Important = 3
Of Little Importance = 2
Unimportant = 1
Notice that the questions
are asking the test taker to make a judgment about a symptom, but do not
really define each level. It is
therefore up to the test taker to decide whether the symptom occurs “often” or is “difficult”
compared to some standard. But compared to what? Most people will use their own experience as reference points, and apply the terms according to this subjective standard.
So how is this a problem? Well, for depression inventories,
most people have never seen someone with a severe melancholic depression who is
thinking, moving and talking at a snail’s pace and who is totally and constantly overwhelmed with his or her depression all day every day for weeks at a time.
Having never seen this, the average person does not know how bad depressive symptoms can be – unlike an experienced psychiatrist who has seen the whole gamut of depressed feelings. They therefore will not compare themselves to that, which is actually the relevant comparison!
So each test taker is, in effect, creating his or her own scale. What seems like "often" to them might not seem like very often at all to someone else. This makes the results next to meaningless for making a real diagnosis.
For those interested in
statistics, the issue was neatly summed up by John Knight, a commenter on a
Psychology Today Blog Post that criticized
another post I had written. He wrote:
“Firstly, there is nothing more subjective
than self-reporting. How on earth can we treat what a client reports as
objective data? Can any patient really detach themselves and report their...
'status' objectively, and interpret their symptoms and place scores on a
Likert-type scale in the same manner as everyone else? What about the issue of
the relationship to the practitioner? Can a patient be trusted to report
objectively without trying to spare the practitioner's feelings? Or the
opposite - what if they are annoyed and want to give negative feedback to
someone they don't like?
Secondly, these Likert-type scales are often
being processed as interval-level data rather than ordinal data. For the
statistically uninitiated, ordinal data generally consists of "an
arbitrary numerical scale where the exact numerical quantity of a particular
value has no significance beyond its ability to establish a ranking over a set
of data points" (thank you Wikipedia), whereas interval data will be
something like degrees, metres, kilometres, and so on.
A Likert-type scale is ordinal data, but weak
arguments and statistical trickery are being employed to treat it as interval
data, which is easier to process and looks more scientifically impressive.
To lay off the accountant language for a
moment, many CBT practitioners are treating patient self-reports with the same
kind of measurable, real-world objectivity that one would treat degrees
celsius, metres, kilometres, and so on. That is quite simply disgusting, and
should trouble the conscience of any scientist willing to employ the method.”
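For readers who like to see the arithmetic, here is a minimal Python sketch of the ordinal-versus-interval point the commenter is making. The two groups of answers and the alternative coding are invented purely for illustration and do not come from any study. The idea is that an ordinal scale only fixes the order of the answer choices, so any relabeling that keeps that order is just as legitimate, yet a comparison of averages can flip under such a relabeling while a comparison of medians does not.

```python
from statistics import mean, median

# Hypothetical answers to a single Likert item, coded 0-3 (illustration only).
group_a = [1, 1, 1, 1]
group_b = [0, 0, 0, 3]

# A different coding of the same four answer choices that keeps their order.
# Nothing about an ordinal scale says the top answer is "worth" 3 rather than 10.
recode = {0: 0, 1: 1, 2: 2, 3: 10}
group_a_recoded = [recode[x] for x in group_a]
group_b_recoded = [recode[x] for x in group_b]

# Comparing averages treats the codes as interval data, and the verdict flips.
print(mean(group_a) > mean(group_b))                      # True
print(mean(group_a_recoded) > mean(group_b_recoded))      # False

# Comparing medians uses only the ordering, and the verdict is stable.
print(median(group_a) > median(group_b))                  # True
print(median(group_a_recoded) > median(group_b_recoded))  # True
```

With the original codes group A appears to score higher on average; with the relabeled codes group B does. The answers themselves never changed, only the numbers attached to them.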
Another problem
is that instruments like the PHQ-9 ask on how many days over the past two
weeks a person experienced a symptom, but do not ask how long the
symptom lasts on a given day when present, let alone about the circumstances in
which it makes an appearance.
Let’s look at the
questions, and I’d like the reader to envision two scenarios. The first is the
melancholic depressive described above. The other is a man who gets involved and preoccupied with his duties at work and feels
fine there, but every night after he gets home he becomes embroiled in a
continuing conflict with his wife, who is threatening divorce, and only then starts to
become extremely upset.
I think the reader will see how it is quite possible that both of these very different individuals might answer the PHQ-9 questions in almost the exact same way, and come out with identical scores. The first would benefit from an antidepressant. The other would not, and probably needs marriage counseling instead.
PHQ-9
Patient Depression Questionnaire
Over the last 2 weeks, how often have you been bothered by any of the following problems?
Not at all - 0
Several days - 1
More than half the days - 2
Nearly every day - 3
1. Little interest or pleasure in doing things
2. Feeling down, depressed, or hopeless
3. Trouble falling or staying asleep, or sleeping too much
4. Feeling tired or having little energy
5. Poor appetite or overeating
6. Feeling bad about yourself - or that you are a failure or have let yourself or your family down
7. Trouble concentrating on things, such as reading the newspaper or watching television
8. Moving or speaking so slowly that other people could have noticed. Or the opposite - being so fidgety or restless that you have been moving around a lot more than usual
9. Thoughts that you would be better off dead, or of hurting yourself
add columns
TOTAL:
10. If you checked off any problems, how difficult have these problems made it for you to do your work, take care of things at home, or get along with other people?
Not difficult at all
Somewhat difficult
Very difficult
Extremely difficult
Copyright © 1999 Pfizer Inc.
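To make the point about identical scores concrete, here is a minimal Python sketch of the form's "add columns" step. The function name and both sets of answers are hypothetical, made up only to illustrate the two scenarios described above; they are not real patient data. Notice that the total records nothing about how long a symptom lasts on a given day or the circumstances in which it shows up.

```python
def phq9_total(item_scores):
    """Add up the nine symptom items, each scored 0-3 as on the form."""
    if len(item_scores) != 9:
        raise ValueError("the PHQ-9 has nine scored symptom items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item is scored 0, 1, 2, or 3")
    return sum(item_scores)

# Hypothetical answers to items 1-9 (illustration only).
# Scenario 1: the melancholic depressive, symptomatic all day, every day.
melancholic_patient = [3, 3, 3, 3, 2, 2, 3, 3, 1]

# Scenario 2: the man who feels fine at work but is extremely upset every
# evening during the conflict with his wife. "Nearly every day" is still
# an honest answer for him, because the form only counts days.
man_in_marital_conflict = [3, 3, 3, 3, 2, 3, 3, 2, 1]

print(phq9_total(melancholic_patient))      # 23
print(phq9_total(man_in_marital_conflict))  # 23
```

The two men differ on individual items and in almost everything that matters clinically, yet the number that reaches the care manager is the same.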
I encountered this with my psychiatrist--5 minute questionnaire followed by her diagnosis. Several months later I asked her to change my antidepressant because...well, why not? She complied, but I never had a whole lot of confidence in the whole process. Probably killed the placebo effect, too.