A while back (11/2/11) I reviewed the book Stats.con by James Penston. That book discussed how the statistics used in randomized clinical trials can be highly deceptive. How Not to be Wrong also covers some aspects of statistical misuse, in more detail, and certainly in a much more entertaining way. Some of the author's comments are funny as hell.
Another issue he mentions is that, if your sample size is too small, the chances increase dramatically that one of your subjects will be an outlier who artificially shifts the average for whatever characteristic you are measuring. With a small sample, you are more likely to get a few extra prodigies or slackers in a study of people's ability to perform certain tasks. A famous example: if Bill Gates walks into a bar with a few other people, the average guy in the room is a billionaire.
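To put rough numbers on the Bill Gates example, here is a small Python sketch. The net worth figures are made up purely for illustration, and comparing the mean with the median is my own addition rather than something taken from the book.

```python
from statistics import mean, median

# Hypothetical net worths in dollars; these figures are invented for illustration.
regulars = [50_000] * 9            # nine ordinary patrons
billionaire = 100_000_000_000      # one extreme outlier walks in

bar = regulars + [billionaire]

# The mean is dragged far away from what a typical patron is worth,
# while the median (the middle value) is not affected by the outlier.
print(f"mean:   {mean(bar):,.0f}")
print(f"median: {median(bar):,.0f}")
```

With only ten people in the room, one outlier decides the average. In a much larger group the same person would shift it far less, which is the sample size point.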
In addition to the sample size considerations described above, you can get into trouble if you start with a sample that contains people who are higher or larger on average on the relevant variable than the average person in the general population.
I cannot say for certain, but I wonder if studies on borderline personality disorder (BPD) yield misleading results because of regression to the mean. Long-term follow-up studies on patients with the disorder seem to indicate that it goes away after a few years in a significant percentage of subjects. This finding is misleading, however, when you look closer.
To make the BPD diagnosis, the subject needs to exhibit 5 of the 9 possible criteria. Many of the "improved" subjects merely went from 5 criteria down to 4 of them, and were therefore not diagnosed with BPD any longer. Actually, they became just what we call "subthreshold" for the disorder. Their problematic relationships, however, were still pretty much the same.
These results could mean that subjects with BPD may naturally vacillate between meeting criteria for the disorder and being subthreshold, or between exhibiting a high number of the criteria and a lower one. That would mean that, if they qualified for the diagnosis at the beginning of the long-term follow-up study, a significant proportion of the subjects were at their worst when they enrolled. If so, the study results may indicate regression to the mean, and therefore say nothing else significant about the long-term prognosis for the disorder.
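The possibility I am describing can be illustrated with a toy simulation. This is only a sketch under assumptions I am making up: each simulated subject has a stable underlying severity that never changes, the number of criteria observed at any visit fluctuates randomly around that level, and only people who meet 5 or more criteria at the first visit are enrolled. None of the numbers come from real BPD studies.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

THRESHOLD = 5  # criteria needed for the diagnosis, as described above


def observed_criteria(stable_level):
    """Criteria count at one visit: stable level plus random fluctuation, kept within 0..9."""
    noisy = stable_level + random.gauss(0, 1.0)
    return max(0, min(9, round(noisy)))


# Hypothetical population whose underlying severity never changes between visits.
stable_levels = [random.uniform(2, 7) for _ in range(20_000)]

enrolled = []
for level in stable_levels:
    baseline = observed_criteria(level)
    if baseline >= THRESHOLD:                 # only diagnosed people enter the study
        followup = observed_criteria(level)   # same underlying level, new fluctuation
        enrolled.append((baseline, followup))

mean_baseline = sum(b for b, _ in enrolled) / len(enrolled)
mean_followup = sum(f for _, f in enrolled) / len(enrolled)
remitted = sum(1 for _, f in enrolled if f < THRESHOLD) / len(enrolled)

print(f"mean criteria at enrollment: {mean_baseline:.2f}")
print(f"mean criteria at follow-up:  {mean_followup:.2f}")
print(f"share no longer diagnosable: {remitted:.1%}")
```

In this toy model nobody actually gets better, yet a share of the enrolled subjects drops below the threshold at follow-up simply because they were selected on a high reading. Whether real BPD follow-up data behave this way is exactly the part I cannot say for certain.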
Other important statistical issues the author discusses clearly and brilliantly include assumptions that two variables are related in a linear fashion when they are not (non-linearity - cause and effect relationships that are not based purely on an increase in one variable always leading to either an increase or decrease in another); torturing the data until it confesses (running multiple tests on your study data, controlling for different things, until something significant seems to pop up); and the following problem inherent in studies designed to see if two things like being married and smoking are correlated:
"Surely the chance is very small that the proportion of married people is exactly the same as the proportion of smokers in the whole population. So, absent a crazy coincidence, marriage and smoking will be correlated, either positively or negatively."
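A quick way to see a sample version of this point is to generate marriage and smoking completely independently and then compare smoking rates between the two groups. The sketch below assumes made-up rates (50% married, 20% smokers) and a made-up group size; it is my illustration, not the author's.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

N = 10_000
P_MARRIED = 0.5   # hypothetical rates, chosen only for illustration
P_SMOKER = 0.2

# The two traits are drawn independently, so by construction there is no real link.
people = [(random.random() < P_MARRIED, random.random() < P_SMOKER) for _ in range(N)]


def smoking_rate(group):
    """Fraction of smokers in a list of (is_married, is_smoker) pairs."""
    return sum(smokes for _, smokes in group) / len(group)


married = [p for p in people if p[0]]
unmarried = [p for p in people if not p[0]]

gap = smoking_rate(married) - smoking_rate(unmarried)
print(f"smoking rate gap (married - unmarried): {gap:+.4f}")
```

Even with no connection built in, the two rates will almost never come out exactly equal, so some small positive or negative gap is what you should expect to see. The mere existence of a nonzero correlation tells you very little by itself.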
Anyone who is serious about critically evaluating the medical literature owes it to themselves to read this book.