Statistics help, tutorials and a shoulder to cry on…

Hypothesis Testing 1 – Writing a hypothesis

Hang onto your hats, statfans! Hopefully you’ve read the rest of the ‘statistical basics’ posts and you know that what we are trying to do is estimate the value of some important statistic (which could be a mean or average, for example), which we will calculate from some data organised into variables. We can safely estimate from a sample thanks to our BFF the Central Limit Theorem. If any of that is news to you, please go back and refresh those concepts.

Statisticians get very worked up about hypothesis testing. Things have to be done in a very particular way and there are a million opportunities for you to stuff it up. Get excited, because here comes my complete guide to hypothesis testing. You’re welcome! This first part is about writing hypotheses.

What is a hypothesis?

A ‘hypothesis’ is some statement about how the world works. I might have a hypothesis that “Women weigh less than men, on average”, or “Aliens from space are controlling my brain”. A hypothesis has to be a statement about how things actually are, or might be. “People should not murder each other” is not a hypothesis; it’s a statement about how someone thinks the world should be. My golden rule to keep you out of trouble is that every hypothesis is a statement beginning with the word ‘that’, like these:

  • That giraffes are taller than monkeys, on average
  • That elderly people sleep less than middle-aged people, on average
  • That this drug I have improves average cancer survival times
  • That this drug I have does not improve cancer survival times
  • That this new drug is no different to this other drug

A hypothesis should only contain one ‘that’ – it might sound obvious, but if you have multiple ‘thats’ you have multiple hypotheses: ‘That my brother is older than me and that he likes to play golf’ is two hypotheses. Your hypotheses need to be bite-sized: able to be tested and digested in a single experiment.

Your hypotheses need to be falsifiable. What does this mean? It means that there has to be some possible way of unearthing evidence which debunks it, or proves it false. With this in mind, my hypothesis about aliens controlling my brain is no good – there is no experiment we can do (with current technology!) which will disprove this hypothesis. If you can’t disprove a statement, it doesn’t count as a scientific hypothesis.

Your hypotheses need to be precise in two ways. “That pizza tastes good” is simply not going to cut it. “Good” is not a thing we can test with an experiment. So, first, you need to use very clear language: subjective terms like ‘good’ and ‘bad’ are out. Second, you usually need to say who you’re talking about. It is very unusual that your research population is all of humanity, so include the population in the hypothesis.

A better hypothesis about pizza is: “That pizza is preferred to hot dogs by middle-aged, single Scottish men”. This hypothesis suggests the experiment which could test it, which is the hallmark of a very precise hypothesis.
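To see why the precise version is easier to work with, here is a minimal sketch of the quantity it points at: the share of surveyed men who say they prefer pizza. The responses are made up purely for illustration, and the helper name `preference_share` is my own invention. This only computes the statistic; actually testing the hypothesis is a job for later posts.

```python
from collections import Counter

# Hypothetical survey answers (invented for illustration, not real data).
# Each entry is one middle-aged, single Scottish man's stated preference.
responses = [
    "pizza", "hot dog", "pizza", "pizza",
    "hot dog", "pizza", "pizza", "hot dog",
]

def preference_share(answers, option):
    """Return the fraction of answers that equal `option`."""
    counts = Counter(answers)
    return counts[option] / len(answers)

pizza_share = preference_share(responses, "pizza")
print(f"Share preferring pizza: {pizza_share:.3f}")
```

Notice that “That pizza tastes good” gives you nothing to count, while the precise version tells you who to ask and what to tally.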

You need to avoid causal overtones. Unless you are running a randomised, experimental study you are not allowed to suggest that something is causing some other thing. Ever. Students are often trying to make their hypotheses sound interesting, or important, and so unintentionally introduce language which suggests they are hunting for causal relationships. I have taken a lot of marks off students for this simple mistake, and it is very easy to make by accident. Here are some examples:

  • That drinking coffee causes headaches among middle-aged women
  • That having a baby increases the chances you will not graduate from university
  • That depression leads to poor performance on undergraduate exams

None of this language – or any other variation implying that something is the direct result of something else – is ok in any scientific discipline unless you have run a randomised experiment. Weed it out. Here are the improved versions:

  • That coffee consumption is associated with a higher frequency of headaches among middle-aged women
  • That women who become mothers during undergraduate study graduate at lower rates than women without children
  • That depression is associated with poor performance on undergraduate exams

Why do we do this hypothesis hose-down? Science is inherently conservative in its statements about how the world works. We know that things which are correlated aren’t always caused by each other, and so to avoid making errors we have set up a fancy list of things you need to do in order to be allowed to claim that the relationship you’ve found is causal. This skepticism is a big part of thinking like a scientist.
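If you want to see correlation without causation in action, here is a rough simulation sketch. Everything in it is an assumption made up for illustration: a hidden ‘stress’ variable drives both coffee drinking and headaches, and coffee has no effect on headaches at all. The two still end up clearly correlated.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

n_people = 2000

# Hidden common cause (hypothetical): each person's stress level.
stress = [random.gauss(0, 1) for _ in range(n_people)]

# Both outcomes depend on stress plus independent noise.
# Coffee never appears in the headache formula, so it causes nothing here.
coffee = [s + random.gauss(0, 1) for s in stress]
headaches = [s + random.gauss(0, 1) for s in stress]

r = pearson(coffee, headaches)
print(f"Correlation between coffee and headaches: {r:.2f}")
```

A study that only measured coffee and headaches would find an association and nothing more, which is exactly why the ‘is associated with’ wording is the honest one.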

Ok, so you know what’s what with hypotheses – congratulations! In the next post we’ll meet the enemy of students everywhere – the “Null Hypothesis.”

