Thursday 24 April 2014

Does sun make you unhappy?

On the 24th of April a news article was published on a Belgian website, reporting on a study among people who migrated from northern countries to the south of Europe.

According to this news item, 300 people who migrated from Belgium, the Netherlands, Germany, the UK, France and Switzerland to a Mediterranean country were included in the study. When asked how happy they were, on a scale with 10 as the maximum, these expats gave themselves an average score of 7.3. In a study among 56,000 people in Northern Europe, the average score on this scale was 7.5.

Do these results justify the headline of the article: “Verhuizen naar de zon maakt niet gelukkiger” [Moving to a sunnier place does not make you happier; translation by your cow]?

What could it mean?

The news article presents a comment from a sociologist involved in the research. His hypothesis about this assumed unhappiness is that moving can disrupt social lives. That is of course a possibility, but there is (at least) one other explanation: sampling bias.

Sampling bias is a frequent problem in statistics. For instance, a lot of sociological studies are conducted by means of a survey. People are invited to participate in the study by filling in a questionnaire that has to be sent back to the researchers. Even if the researchers ask a random sample of the population to fill in the forms, the people who actually participate can be a selective group. A lot of people are simply not interested in, or motivated for, this kind of study. The results of the study are then only applicable to this selected group of people, who might be quite different from the average person.

If you find this difficult to understand, try the following. Look at the figure in this article. Close your eyes and put your finger somewhere on the picture. Imagine that you can only observe the part of the picture at which you are pointing. What color is the picture where you are pointing? As you cannot observe the other parts of the picture, you may get the impression that the picture has only one color. The same happens if you are looking at a small part of the population. Researchers tend to assume that the people who did not participate in a study are similar to the people who did, which can be a huge mistake.

Is this a one-colored picture?


In the case of the migrated people, a similar thing could have happened, for instance if mainly people who are already unhappy migrate. People who are completely satisfied with their lives will probably be less motivated to change anything. People who feel unhappy and hope that they can change that feeling by moving to another country will be more motivated. The researchers who are studying the expats may well be looking at the unhappier part of the population. It is even possible that these people did indeed become happier. Maybe they had an average happiness score of only 6.5 while living in their home countries. That could mean that these countries became happier on average because the unhappy people left, while the unhappy people became happier because they changed their lives.
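The selection argument can be made concrete with a small simulation. This is only a sketch under invented assumptions: the home-country scores (mean 7.4, spread 1.2), the migration probabilities and the half-point gain from moving are all made-up numbers, not figures from the study.

```python
import random
import statistics

def simulate_migration(n=100_000, seed=1):
    """Toy model: unhappy people migrate more often, and migrating helps them."""
    rng = random.Random(seed)
    everyone, stayers, movers_before, movers_after = [], [], [], []
    for _ in range(n):
        # Hypothetical happiness score at home, clipped to the 0-10 scale.
        score = min(10.0, max(0.0, rng.gauss(7.4, 1.2)))
        everyone.append(score)
        # Assumption: people scoring below 7 are five times as likely to move.
        p_move = 0.10 if score < 7.0 else 0.02
        if rng.random() < p_move:
            movers_before.append(score)
            # Assumption: moving adds half a point of happiness.
            movers_after.append(min(10.0, score + 0.5))
        else:
            stayers.append(score)
    return {
        "home countries before": statistics.fmean(everyone),
        "home countries after": statistics.fmean(stayers),
        "expats before": statistics.fmean(movers_before),
        "expats after": statistics.fmean(movers_after),
    }

result = simulate_migration()
for label, mean in result.items():
    print(f"{label:>22}: {mean:.2f}")
```

In this toy world the expats still score lower than the people who stayed behind, although no expat is less happy than before the move and the average in the home countries has gone up as well.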

So don't worry if you are moving to the south, just go for it. Everyone will be happier that way.



Tuesday 15 April 2014

Feminists

On January 17th an article on the time to promotion for female
and male history professors was published on the website of The Atlantic
(article). The article refers to a paper published in Perspectives on History. (Which is unfortunately not available to the public.)
In a survey among history professors, the effect of
marriage on the time to promotion was found to be reversed in female professors
compared to male professors. Single female professors seem to get promoted
faster than their married colleagues. Male professors, on the
contrary, get promoted faster when they are married than their
single colleagues do.

The "conclusions"
First, here are a few of the most remarkable statements in the article.
The article quotes a female professor who stated that a female professor
with a stay-at-home spouse is quite rare, but that she often sees men with
stay-at-home wives, allowing them to fully commit themselves to their professions.
This might very well be true, but it does not imply anything about the
cause of this phenomenon. Politically correct thinking would oblige us to say
that it is due to discrimination against female professors. However, it is quite
possible that female professors have a different mindset than male professors.
Consider the possibility that they just don't want a husband depending on
them.

Further, the article states that it happened several times that women
turned down positions at Brown University when their husbands could not
leave their own jobs. "The fact that it has happened on more than one
occasion would certainly contribute to the assertion that marriage does not
help a female professor progress in their field." Before the women among
you, the readers, start burning their bras, consider the possibility that women actually
care more about their family than about a job as a professor and therefore
turn down the job when it is not compatible with their family life. One of
my professors in statistics (who is kind of sort of brilliant) turned down a position at Harvard because her
boyfriend lived on the other side of the ocean. She stated, literally: "I didn't
want to go, because I had a boyfriend." [Emphasis added.] So she did what she wanted, and
did not go.

The next subject addressed in the article is maternity/paternity leave.
Apparently only 3.4 percent of male professors took paternity leave. "A
much higher 33.6 percent of women, on the other hand, took time off after
the birth of a child." For those who are surprised by these numbers, I
recommend a talk with your parents about the birds and the
bees.

What the author of the article wants to demonstrate with the next statement is a mystery. "Perhaps the vaguest statement in the survey is the most
illuminating: 'Female faculty members are treated fairly at this institution.'
55.4 percent of female professors agreed, as compared to 84.7 percent of male
professors." The suggestion in the article seems to be that male professors
are treated better than their female colleagues. As this survey question
is purely about perception, one can just as well conclude from this statement that
female professors complain more than their male colleagues. (Please don't
shoot me: it is the green cow, not the green bull and certainly not the green
ox.)

The possibility that women are just complaining more often than men
is illustrated by the next quote from the article. (I am intentionally reproducing the errors the author of the article made.) About the representation
of women in all kinds of non-teaching, non-research activities, the author
states: "The gender breakdown within a department plays a significant role.
Typically, there are more men than women within a discipline, and yet
committees seek as much diversity as possible. Women, then, are often
asked to do double the amount of service as men, a number that increases
for women of color." When there are few women involved in these activities,
it is said that they are discriminated against. Too much involvement, and women are
kept from their main tasks. So men: you are damned if you do and damned
if you do not.

A last quote is the following: "When we look at these kinds of issues,
whether it is the wage gap or child care, it becomes increasingly clear that
there is a fundamental problem with the professional workplace, which is
still best structured for single males, or males with wives who support their
careers." With these kinds of statements we will end up in a situation where
it is forbidden for women to take care of their families. Why does the author
not consider the possibility that women with a family do not have the desire
for a fast career? Nothing is said in this article about the possibilities for
women to become professor. Apparently, there is no problem there. It takes
a bit longer, probably because they take some time off to bear and nurse children, which is something nature reserved for them.

The problem
All the statements in the article are based on a very basic, though common,
error in statistics. The author of the article does not distinguish correlation
from causality. For those who are not familiar with the concept of correlation,
here is an easy example. In 2000 a paper was published describing the
relation between the population of storks and the human birth rate in European
countries. The author of this paper found a statistically significant correlation
between the number of stork breeding pairs and the birth rate. (Meaning, more or
less, that a correlation this strong would show up by coincidence less than five
percent of the time if storks and births were unrelated.) However, this does
not mean that the storks are causing the higher number of babies. Probably it is
industrialization that causes both a lower number of storks and a
lower birth rate. The
same is going on in this article: a correlation is observed between gender
and the time it takes to get promoted. Whether or not this relation is
significant is not mentioned in the article. The analysis in the original paper
probably did not include a survival analysis, which should be done in order
to know for sure whether there is a significant correlation or not. If anyone
has access to the paper, I would be glad to get it, in order to verify my
suspicion. This
correlation does not mean that there is a causal relation. So, dear female
history professors: go write some papers if you want to get promoted. Or
go have some children first and write your papers afterwards. Or blame nature for the fact that men cannot bear and nurse children.
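
The stork example can be imitated with a small simulation. The sketch below is a toy model, not the analysis from the stork paper: the 200 regions, the industrialization index and all coefficients are invented. Storks and births each depend only on industrialization and never on each other, yet they come out clearly correlated.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient, written out to keep the sketch self-contained."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)
storks, births = [], []
for _ in range(200):  # 200 hypothetical regions
    industrialization = rng.random()  # 0 = rural, 1 = heavily industrialized
    # Both quantities are driven by industrialization plus noise, nothing else.
    storks.append(1000 * (1 - industrialization) + rng.gauss(0, 100))
    births.append(20 - 8 * industrialization + rng.gauss(0, 1.5))

r = pearson(storks, births)
print(f"correlation between storks and births: {r:.2f}")
```

Removing the storks from this model would change nothing about the births, which is exactly the difference between correlation and causality.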

Monday 14 April 2014

Why average survival is a sloppy description

On April 29, 2010 the American Cancer Society published an article on its website with the title "FDA Approves Prostate Cancer Vaccine" (article). The article mentions that the approval comes mainly after a study (randomized, phase III, for those who are interested) showed that patients receiving infusions with Provenge, the newly approved drug, lived on average four months longer than patients who got the placebo treatment instead. The problem with this description is the term "average."

Average is described as a measure of the "middle" or "typical" value of a data set (Wikipedia). Unfortunately, this term is not very specific, as it can refer to the arithmetic mean, the median or the mode. Most people will think of an arithmetic mean when speaking about an average. Mean and median can be very different. How big the difference is depends on the distribution, as is illustrated in figures 1 and 2. (The distribution of a dataset is just a way to describe the histogram of the data, and the histogram is the kind of graph used in figures 1 and 2.) The first figure presents a normal distribution, the second one a so-called negative binomial distribution. In the first dataset, with normally distributed data, the mean is 100.69 and the median 100.98. So in that case it does not really matter whether the mean or the median is used. In the second dataset the mean is 98.34 and the median 75. This is a clear difference. (Especially when we are speaking about survival and life expectancy.)
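
The same effect shows up in a tiny example. The two lists below are made-up numbers, not the datasets behind figures 1 and 2; they are chosen so that both have the same mean while only the second one is skewed to the right.

```python
import statistics

# Hypothetical data: roughly symmetric around 100.
symmetric = [96, 98, 99, 100, 101, 102, 104]
# Hypothetical data: same total, but with a long tail to the right.
skewed = [20, 35, 50, 75, 90, 160, 270]

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    mean = statistics.fmean(data)
    median = statistics.median(data)
    print(f"{name:>9}: mean = {mean:.1f}, median = {median}")
```

Both lists have a mean of 100, but the median of the skewed list is only 75, because the few large values pull the mean upwards and leave the median where it is.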





In studies analyzing survival data, it is most common to use the median survival time and not the mean. The logic is very simple. Median survival time is defined as the time at which 50% of the patients have died and 50% are still alive. The mean survival time is calculated by adding the survival times of all patients and dividing the result by the number of patients. In order to calculate the mean survival time, it is necessary to know the survival time of each patient. It can take quite a long time to get all this information, as it can take years before all patients have died. (The longer it takes, the better, actually.) However, as soon as 50% of the patients have died, the median survival time is known. Therefore it takes less time to complete a study when the median survival time is used. As survival data are typically not normally distributed, the median will be different from the mean. (As survival data are skewed to the right, the median will be lower than the mean.)
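
A minimal sketch of this argument, using invented survival times for nine hypothetical patients and ignoring censoring altogether: two years into the study the median can already be reported, while the mean has to wait for the last patient.

```python
import math

def median_survival(deaths_so_far, n_patients):
    """Time by which at least half of the patients had died, or None if not yet reached."""
    needed = math.ceil(n_patients / 2)
    if len(deaths_so_far) < needed:
        return None
    return sorted(deaths_so_far)[needed - 1]

def mean_survival(deaths_so_far, n_patients):
    """Mean survival time, only defined once every patient's survival time is known."""
    if len(deaths_so_far) < n_patients:
        return None
    return sum(deaths_so_far) / n_patients

# Hypothetical survival times in months for nine patients.
all_deaths = [3, 7, 12, 18, 22, 31, 45, 70, 110]
n = len(all_deaths)

# What the researchers know 24 months into the study.
known_after_two_years = [t for t in all_deaths if t <= 24]

print("after 24 months: median =", median_survival(known_after_two_years, n),
      "| mean =", mean_survival(known_after_two_years, n))
print("after 110 months: median =", median_survival(all_deaths, n),
      "| mean =", round(mean_survival(all_deaths, n), 1))
```

With these numbers the median of 22 months is fixed once the fifth patient has died, whereas the mean only becomes available after 110 months and turns out higher than the median, as expected for right-skewed data.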

Furthermore, statisticians use a technique called "censoring" in order to be able to come to a conclusion in a reasonable amount of time. With this technique, data from patients who are still alive at the end of a survival study can be included in the analyses. At the end of the study, the total survival time of patients who are still alive at that time is not known. However, the minimal survival time of those patients is known and can be used in the analyses. (For instance: if all patients treated with an old medicine died by the end of the study and all patients treated with a new medicine are still alive, it is possible to draw some conclusions from this information. That the exact survival time is not known does not affect the conclusion.) It is possible to calculate a mean survival time from the censored data. The true mean survival time will always be underestimated with this technique. The reason is that patients who live longer have a higher chance of being censored. Therefore, the longer survival times will not be recorded in the data, resulting in an estimated mean which is lower than the true one.
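
The underestimation can be shown with a deliberately simplified sketch. The survival times are invented, and the only censoring assumed here is a single cutoff at the end of the study; a real analysis would typically use a Kaplan-Meier estimate rather than this naive average.

```python
# Hypothetical true survival times in months (unknown to the researchers in practice).
true_times = [3, 7, 12, 18, 22, 31, 45, 70, 110]
study_end = 36  # assumption: the study stops after 36 months

# What is actually recorded: the survival time, or the cutoff for patients still alive.
observed = [min(t, study_end) for t in true_times]
censored = [t > study_end for t in true_times]

true_mean = sum(true_times) / len(true_times)
censored_mean = sum(observed) / len(observed)

print("patients censored:", sum(censored))
print(f"mean from censored data: {censored_mean:.1f} months")
print(f"true mean:               {true_mean:.1f} months")
```

The three longest survivors are exactly the ones whose times get cut off at 36 months, so the mean computed from the recorded data is lower than the true mean.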

In the original article on the prostate cancer vaccine the authors mention a median survival of 25.8 months in patients receiving Provenge (Sipuleucel-T immunotherapy) and 21.7 months in the placebo group. (P.W. Kantoff, C.S. Higano et al., Sipuleucel-T immunotherapy for castration-resistant prostate cancer, New England Journal of Medicine 2010 Jul 29;363(5):411-22.) From what I wrote above, you will now see that the mean survival times in both groups can be further apart than the median survival times are. The author of the article on the website of the American Cancer Society was thus sloppy in writing that the patients lived on average four months longer with the new medicine.