Tuesday, October 1, 2013

Counting Symptoms that Don’t Count, Part II: Compared to What?

In my blog post of July 24, 2010, Counting Symptoms That Don’t Count, I wrote:

“So what does a doctor who spends so little time with a patient do to save time? I mean besides completely ignoring the patient's relationships, history of trauma, humanity, etc…Well, one thing they can do is ask only about symptoms, and blindly accept the patient's yes or no answer without even checking to see if the patient understands the difference between a transient mood state and a psychiatric symptom. Better yet, before the doctor even sees the patient, he or she can have the patient fill out a symptom checklist, and base his diagnosis entirely on that. (Of course, his secretary could make a diagnosis doing that, so the patient really wouldn't even have to talk to the doctor at all).”

The inappropriate use of self-report screening tests as actual diagnostic instruments has become more of an issue than ever. As you may recall from earlier posts, such instruments are purposely designed to cast a wide net so as not to miss anyone in need of treatment, and as a result they snare many patients who do not, in fact, need treatment.

As managed care tightens its ever-present grip, full psychiatric diagnostic interviews are being marginalized. This is especially true in the so-called “collaborative care” models, in which psychiatrists merely advise primary care physicians without necessarily seeing the patients themselves.

Fellow blogger George Dawson, M.D. beautifully describes the problems with using a widely used depression screening instrument called the PHQ-9: “…let's talk about what is really happening here. This is all about a patient coming in and being given a PHQ-9 depression screening inventory… It generally takes most patients anywhere from 1 - 3 minutes to check off the boxes. Conceivably that could lead to a diagnosis of depression in a few more minutes in the primary care clinic. At that point the patient enters the antidepressant algorithm and they are officially being treated [they may be given an antidepressant on the basis of the PHQ-9 results alone - DA]. The care manager reports the PHQ-9 scores of those who do not improve to the "supervising" psychiatrist and gets a recommendation to modify treatment.”

No determination of whether the symptoms are clinically significant. No determination of whether the symptoms reported are merely relatively normal reactions to adverse environmental events. No nothing.

To appreciate why symptom checklists are so problematic, I need to discuss something called a Likert Scale. A Likert Scale asks the patient to “rate” a symptom by severity, frequency, or importance, or to indicate how strongly he or she agrees with a statement. There is usually a 4- to 7-point scale, with a number attached to each choice. Examples:

Not at all - 0
Several days - 1
More than half the days - 2
Nearly every day - 3
(the PHQ-9 Likert Scale)

Very Frequently = 5
Frequently = 4
Occasionally = 3
Rarely = 2
Never = 1

Not difficult at all = 0
Somewhat difficult = 1
Very difficult = 2
Extremely difficult = 3

Very Important = 5
Important = 4
Moderately Important = 3
Of Little Importance = 2
Unimportant = 1

Notice that the questions are asking the test taker to make a judgment about a symptom, but do not really define each level. It is therefore up to the test taker to decide whether the symptom occurs “often” or is “difficult” compared to some standard. But compared to what? Most people will use their own experience as reference points, and apply the terms according to this subjective standard.

So how is this a problem? Well, for depression inventories, most people have never seen someone with a severe melancholic depression who is thinking, moving and talking at a snail’s pace and who is totally and constantly overwhelmed with his or her depression all day every day for weeks at a time.  

Having never seen this, the average person does not know how bad depressive symptoms can be, unlike an experienced psychiatrist who has seen the whole gamut of depressed feelings. Most people therefore will not compare themselves to that extreme, which is actually the relevant comparison!

So each test taker is, in effect, creating his or her own scale. What seems like "often" to them might not seem like very often at all to someone else. This makes the results next to meaningless for making a real diagnosis.
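For readers who like to see this spelled out, here is a minimal Python sketch of the "own scale" problem, using the frequency scale shown earlier (Never = 1 through Very Frequently = 5). Everything specific in it is an assumption made up for illustration: the function name `self_rating`, the two sets of personal cutoffs, and the count of ten episodes a month. It is not taken from any real instrument or data.

```python
from bisect import bisect_right

# Labels from the frequency scale shown earlier; scores run from 1 to 5.
LABELS = ["Never", "Rarely", "Occasionally", "Frequently", "Very Frequently"]


def self_rating(episodes_per_month, personal_cutoffs):
    """Return (label, score) using one person's private cutoffs.

    personal_cutoffs holds four ascending counts (hypothetical): the minimum
    number of episodes at which this person moves up to the next label.
    """
    index = bisect_right(personal_cutoffs, episodes_per_month)
    return LABELS[index], index + 1


# Two hypothetical test takers with different private ideas of "often".
cutoffs_a = [1, 3, 8, 15]
cutoffs_b = [1, 6, 15, 25]

# Both actually have the same number of episodes (an invented figure).
episodes = 10
rating_a = self_rating(episodes, cutoffs_a)
rating_b = self_rating(episodes, cutoffs_b)

print("Test taker A:", rating_a)
print("Test taker B:", rating_b)
```

With these made-up cutoffs, the same ten episodes come out as "Frequently" (4) for one person and "Occasionally" (3) for the other. The form records only the final number, so the difference in yardsticks is invisible to whoever reads the score.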

For those interested in statistics, the issue was neatly summed up by John Knight, a commenter on a Psychology Today blog post that criticized another post I had written. He wrote:

“Firstly, there is nothing more subjective than self-reporting. How on earth can we treat what a client reports as objective data? Can any patient really detach themselves and report their... 'status' objectively, and interpret their symptoms and place scores on a Likert-type scale in the same manner as everyone else? What about the issue of the relationship to the practitioner? Can a patient be trusted to report objectively without trying to spare the practitioner's feelings? Or the opposite - what if they are annoyed and want to give negative feedback to someone they don't like?

Secondly, these Likert-type scales are often being processed as interval-level data rather than ordinal data. For the statistically uninitiated, ordinal data generally consists of "an arbitrary numerical scale where the exact numerical quantity of a particular value has no significance beyond its ability to establish a ranking over a set of data points" (thank you Wikipedia), whereas interval data will be something like degrees, metres, kilometres, and so on.

A Likert-type scale is ordinal data, but weak arguments and statistical trickery are being employed to treat it as interval data, which is easier to process and looks more scientifically impressive.

To lay off the accountant language for a moment, many CBT practitioners are treating patient self-reports with the same kind of measurable, real-world objectivity that one would treat degrees celsius, metres, kilometres, and so on. That is quite simply disgusting, and should trouble the conscience of any scientist willing to employ the method.”
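The ordinal-versus-interval point in that comment can be made concrete with a small Python sketch. The two groups of answers and the "stretched" coding below are invented purely for illustration. The idea is that if the numbers on a Likert-type scale only establish a ranking, then any other set of numbers that keeps the same ranking is equally legitimate, and a conclusion that flips when the numbers are swapped was never really supported.

```python
from statistics import mean, median

# Hypothetical answers from two groups, recorded as category ranks 0-3
# (for example: Not at all, Several days, More than half the days,
# Nearly every day).
group_x = [1, 1, 1, 1, 3, 3]
group_y = [2, 2, 2, 2, 2, 2]

# Two codings that preserve the same ordering of the categories.
standard = {0: 0, 1: 1, 2: 2, 3: 3}    # the usual 0-3 numbering
stretched = {0: 0, 1: 1, 2: 2, 3: 10}  # same ranking, different spacing


def summarize(ranks, coding):
    """Return (mean, median) of the answers under a given numeric coding."""
    values = [coding[r] for r in ranks]
    return mean(values), median(values)


for name, coding in (("standard", standard), ("stretched", stretched)):
    x_mean, x_median = summarize(group_x, coding)
    y_mean, y_median = summarize(group_y, coding)
    print(name, "means:", x_mean, y_mean, "medians:", x_median, y_median)
```

Under the usual numbering, group Y has the higher average; under the stretched numbering, group X does. The medians, which depend only on rank order, point the same way both times. That is roughly what it means to say that averaging such scores treats ordinal data as if it were interval data.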

Another problem is that instruments like the PHQ-9 ask on how many days during the past two weeks a person experienced a symptom, but do not ask how long the symptom lasts on a given day when present, let alone about the circumstances in which it makes an appearance.

Let’s look at the questions, and I’d like the reader to envision two scenarios. The first is the melancholic depressive described above. The other is a man who gets involved and preoccupied with his duties at work and feels fine there, but every night after he gets home he becomes embroiled in a continuing conflict with his wife, who is threatening divorce, and only then starts to become extremely upset. 

I think the reader will see how it is quite possible that both of these very different individuals might answer the PHQ-9 questions in almost the exact same way, and come out with identical scores. The first would benefit from an antidepressant. The other would not, and probably needs marriage counseling instead.
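To put numbers on the two scenarios, here is a rough Python sketch. The answers I have assigned to each person are hypothetical, as is the `hours_per_day` field, which stands in for exactly the kind of information the form never asks about. The helper name `phq9_total` is my own; all it does is add up the nine item scores using the 0 to 3 scale shown earlier.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    description: str
    answers: list          # nine item scores, each 0-3, as checked on the form
    hours_per_day: float   # NOT asked on the form; hypothetical context only


def phq9_total(answers):
    """Add up the nine item scores, which is all the total reflects."""
    if len(answers) != 9 or any(a not in (0, 1, 2, 3) for a in answers):
        raise ValueError("expected nine answers, each scored 0-3")
    return sum(answers)


# Invented answers for the two scenarios described above.
melancholic = Respondent("melancholic depression",
                         [3, 3, 3, 3, 2, 3, 3, 3, 1], hours_per_day=16.0)
husband = Respondent("upset only during nightly marital conflict",
                     [3, 3, 3, 3, 3, 3, 3, 2, 1], hours_per_day=4.0)

for person in (melancholic, husband):
    print(person.description, "total:", phq9_total(person.answers),
          "hours per day (unrecorded):", person.hours_per_day)
```

Both totals come out the same, because "nearly every day" is scored identically whether the symptom lasts all day or only for the few hours after work. Nothing in the total distinguishes the two men.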

PHQ-9 Patient Depression Questionnaire

Over the last 2 weeks, how often have you been bothered by any of the following problems?

Not at all - 0
Several days - 1
More than half the days - 2
Nearly every day - 3

1. Little interest or pleasure in doing things
2. Feeling down, depressed, or hopeless
3. Trouble falling or staying asleep, or sleeping too much
4. Feeling tired or having little energy
5. Poor appetite or overeating
6. Feeling bad about yourself - or that you are a failure or have let yourself or your family down
7. Trouble concentrating on things, such as reading the newspaper or watching television
8. Moving or speaking so slowly that other people could have noticed. Or the opposite - being so fidgety or restless that you have been moving around a lot more than usual
9. Thoughts that you would be better off dead, or of hurting yourself

(Add the columns to get the total score.)

10. If you checked off any problems, how difficult have these problems made it for you to do your work, take care of things at home, or get along with other people?

Not difficult at all
Somewhat difficult
Very difficult
Extremely difficult

Copyright © 1999 Pfizer Inc.

Oh gee, look at who came up with this scale. A drug company. How convenient!!

1 comment:

  1. I encountered this with my psychiatrist--5 minute questionnaire followed by her diagnosis. Several months later I asked her to change my antidepressant because...well, why not? She complied, but I never had a whole lot of confidence in the whole process. Probably killed the placebo effect, too.