Monday, November 19, 2018

Hypothesis Testing Errors

Hypothesis testing allows us to quantify relationships between known samples and the unknown population from which they were taken. What do I mean by that? Let's say I am interested in the returns from investing in a particular stock. The daily return might be calculated as the ratio of today's closing price to yesterday's closing price, minus one. Whether I were to take a sample of 30 returns (over 31 days) or 60 returns (over 61 days), I still couldn't know the population mean return, but I could hypothesize about it... so I do.
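As a rough sketch of that return calculation, here is how a list of closing prices could be turned into daily returns. The function name `daily_returns` and the prices are made up for illustration; they are not taken from any real stock.

```python
def daily_returns(closes):
    """Simple daily returns: today's close / yesterday's close - 1."""
    # Pair each close with the one that follows it: (yesterday, today).
    return [today / yesterday - 1 for yesterday, today in zip(closes, closes[1:])]


# Hypothetical closing prices, purely illustrative.
closes = [100.0, 102.0, 99.96, 104.958]
returns = daily_returns(closes)

# n closing prices always give n - 1 returns, so 31 days give 30 returns.
print(len(closes), "closes ->", len(returns), "returns")
print([round(r, 4) for r in returns])
```

This is why a sample of 30 returns needs 31 days of closing prices.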

I choose a "null hypothesis" that the population mean is 4.5%. Given that my sample mean was 5% with a 2% standard deviation, 30 observations would produce a test statistic of \(\frac{0.05 - 0.045}{\frac{0.02}{\sqrt{30}}} = 1.37\) standard errors. In other words, under a standard normal distribution about 8.5% of the probability lies above the statistic and 91.5% below it (a one-tailed p-value of 8.5%). For an 80% confidence two-tailed test (10% in the left tail and 10% in the right tail) we would reject the hypothesis, since 8.5% is less than 10%, but at 90% confidence (5% in each tail) we would accept it. Note how we've already accepted or rejected the hypothesis regardless of its truth.
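The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes a normal approximation (a z-test) using the standard library's `statistics.NormalDist`, and the helper names `z_statistic` and `reject_two_tailed` are my own. A t-distribution would move the cut-offs slightly.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, hypothesized_mean, sample_std, n):
    """Distance between sample and hypothesized mean, in standard errors."""
    standard_error = sample_std / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


def reject_two_tailed(z, confidence):
    """True if |z| falls beyond the critical value for a two-tailed test."""
    tail = (1 - confidence) / 2               # e.g. 10% per tail at 80% confidence
    critical = NormalDist().inv_cdf(1 - tail)
    return abs(z) > critical


z = z_statistic(0.05, 0.045, 0.02, 30)
upper_tail = 1 - NormalDist().cdf(z)          # probability mass above the statistic

print(round(z, 2))                  # 1.37
print(round(upper_tail, 3))         # 0.085
print(reject_two_tailed(z, 0.80))   # True  (reject at 80% confidence)
print(reject_two_tailed(z, 0.90))   # False (accept at 90% confidence)
```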

Now, imagine an all-seeing and all-knowing supernatural bystander watching the same events unfurl... they could know the population mean... and even though they wouldn't be obliged to share the exact value with me, let's say that they'd at least tell me if my hypothesis was true or false; that is to say: if I hypothesized that the population mean was 4.5% and it actually was 4.5% then the hypothesis would be true, otherwise it would be false (it could be 4.2% or 4.3% or even -5% or 63%; the point is we don't know).

If we take our two test results and combine them with the two possible truth values, it produces this 2×2 matrix of outcomes.

|       | Accept       | Reject       |
| True  | Correct      | Type 1 Error |
| False | Type 2 Error | Correct      |

  • True/False: does the actual population mean match the hypothesized mean?
  • Reject/Accept: does our statistic fall outside/inside the confidence interval?
  • Correct/Incorrect: did we accept a true (or reject a false) hypothesis or did we commit an error?
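The matrix can also be written as a tiny lookup function. This is just an illustrative sketch; the name `classify` and its two boolean parameters are my own invention.

```python
def classify(hypothesis_is_true, rejected):
    """Map (truth of the hypothesis, test decision) to an outcome in the matrix."""
    if hypothesis_is_true and rejected:
        return "Type 1 Error"   # rejected a true hypothesis
    if not hypothesis_is_true and not rejected:
        return "Type 2 Error"   # accepted a false hypothesis
    return "Correct"            # accepted a true one or rejected a false one


# Walk through all four cells of the matrix.
for truth in (True, False):
    for rejected in (False, True):
        decision = "Reject" if rejected else "Accept"
        print(f"{truth!s:5} / {decision}: {classify(truth, rejected)}")
```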

Let's ask the bystander if our hypothesis was indeed true or if it was false.

  • Yes, it's true:
    • At 90% we accepted it
    • At 80% we rejected it (Type 1 Error)
  • No, it's false:
    • At 90% we accepted it (Type 2 Error)
    • At 80% we rejected it.
This last pair of possibilities deserves more analysis: when the bystander tells us our hypothesis was false, it doesn't seem to matter why we were correct to reject the hypothesis; all that matters is that we did.

This video tries to explain it but I am not confident the author is correct. I'd prefer to side with authors of the CFA curriculum who say - in short - "it's complicated".
