What on earth is a likelihood estimator?

Well, think of a discrete data set. What is the likelihood that the data you've got came from a parametric probability distribution with a given parameter, for instance a Poisson distribution with mean 3 (X ~ Poisson(3))?

To spell it out even more explicitly, here is an example.

Suppose you count how many people die per day in a hospital. The number of days with no deaths (220) becomes x0, the number of days with 1 death (110) becomes x1, and so on...

x0 = 220, x1 = 110, x2 = 23, x3 = 9, x4 = 0 are all members of the set X

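We might readily assume that this follows a Poisson distribution. To calculate the parameter for this model we just find the average number of deaths per day: (0×220 + 1×110 + 2×23 + 3×9 + 4×0) / (220 + 110 + 23 + 9 + 0) = 183 / 362 ≈ 0.5055.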
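As a quick check of that average, here is a minimal Python sketch. The names `days_with_k_deaths` and `sample_mean` are made up for illustration; the counts are the ones listed above.

```python
# Number of days observed with k deaths, keyed by k (figures from the text).
days_with_k_deaths = {0: 220, 1: 110, 2: 23, 3: 9, 4: 0}

def sample_mean(counts):
    """Average deaths per day from a {deaths: number_of_days} table."""
    total_days = sum(counts.values())
    total_deaths = sum(k * n for k, n in counts.items())
    return total_deaths / total_days

mean_deaths = sample_mean(days_with_k_deaths)
print(round(mean_deaths, 4))  # 183 deaths over 362 days
```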

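The Poisson probability mass function (pmf) looks like this:

$$P(X = k) = \frac{e^{-\lambda}\,\lambda^{k}}{k!}, \quad k = 0, 1, 2, \ldots$$

So the model with the parameter 0.5055 looks like this:

$$P(X = k) = \frac{e^{-0.5055}\,0.5055^{k}}{k!}$$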
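To see what the model predicts for each count, the pmf can be evaluated directly. This is only a sketch using Python's standard library; `poisson_pmf` is a helper name invented here, and 0.5055 is the parameter quoted above.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 0.5055  # the parameter quoted in the text
probabilities = [poisson_pmf(k, lam) for k in range(5)]
for k, p in enumerate(probabilities):
    print(k, round(p, 4))
```

Each printed value is the modelled probability of a day with k deaths.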

and if we multiply each of these numbers by the total number of days observed at the hospital we obtain the expected number of days for each death count

x0 = 214.00, x1 = 114.28, x2 = 30.51, x3 = 5.44 and x4 = 0.73

vs the actual figures

x0 = 220, x1 = 110, x2 = 23, x3 = 9, x4 = 0

Which isn't a bad match. But what is the likelihood that the above deaths data came from a Poisson distribution with mean 0.534?

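The likelihood can be seen as the probability of obtaining these figures if they had come from a Poisson distribution with the given mean.

Taking x = 0, of which there are 220 days, the likelihood calculation looks like this:

$$\left(\frac{e^{-0.534}\,0.534^{0}}{0!}\right)^{220}$$

and so on for the other counts, which gives:

$$L(0.534) = \left(\frac{e^{-0.534}\,0.534^{0}}{0!}\right)^{220} \left(\frac{e^{-0.534}\,0.534^{1}}{1!}\right)^{110} \left(\frac{e^{-0.534}\,0.534^{2}}{2!}\right)^{23} \left(\frac{e^{-0.534}\,0.534^{3}}{3!}\right)^{9} \left(\frac{e^{-0.534}\,0.534^{4}}{4!}\right)^{0}$$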

Which, if you calculate it, gives something very, very small.

But if you take the natural logarithm you get a fairly large negative number.
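The calculation can be sketched in Python, working on the log scale first and exponentiating afterwards. The function name `poisson_log_likelihood` is an illustrative choice, and the counts are assumed to be exactly the ones listed earlier.

```python
import math

# Number of days observed with k deaths, keyed by k (figures from the text).
days_with_k_deaths = {0: 220, 1: 110, 2: 23, 3: 9, 4: 0}

def poisson_log_likelihood(lam, counts):
    """Log-likelihood of grouped counts under Poisson(lam).

    Each k contributes n_k * log P(X = k), where
    log P(X = k) = -lam + k*log(lam) - log(k!).
    """
    return sum(
        n * (-lam + k * math.log(lam) - math.lgamma(k + 1))  # lgamma(k+1) = log(k!)
        for k, n in counts.items()
    )

log_lik = poisson_log_likelihood(0.534, days_with_k_deaths)
likelihood = math.exp(log_lik)  # tiny, but still representable as a float here
print(log_lik, likelihood)
```

Summing logs rather than multiplying probabilities is the usual way to avoid underflow once the data set gets larger.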

Interesting! What would happen, you might wonder, if I kept changing the suggested parameter fed into the equation until I got the highest possible value of the likelihood?

Doing this is the basis of finding the Maximum Likelihood Estimator (MLE).

For the above model the plot of the likelihood against the suggested parameter looks like:

By inspection we can see that there is a maximum likelihood value somewhere around the 0.51824 mark.
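The "keep changing the parameter" idea can be sketched as a simple grid search. The grid bounds, step count and function names below are arbitrary choices for illustration, not part of the original method. For a Poisson model the maximum should land at, or very near, the sample mean of whatever counts are fed in.

```python
import math

# Number of days observed with k deaths, keyed by k (figures from the text).
days_with_k_deaths = {0: 220, 1: 110, 2: 23, 3: 9, 4: 0}

def poisson_log_likelihood(lam, counts):
    """Log-likelihood of grouped counts under Poisson(lam)."""
    return sum(
        n * (-lam + k * math.log(lam) - math.lgamma(k + 1))
        for k, n in counts.items()
    )

def grid_search_mle(counts, low=0.01, high=2.0, steps=19900):
    """Try evenly spaced candidate means and keep the one with the
    largest log-likelihood (grid spacing is (high - low) / steps)."""
    candidates = [low + (high - low) * i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda lam: poisson_log_likelihood(lam, counts))

best_lam = grid_search_mle(days_with_k_deaths)
print(best_lam)
```

Maximising the log-likelihood gives the same parameter as maximising the likelihood itself, because the logarithm is an increasing function.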

The more general form can be seen here:

$$L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$

Which means: given any set of sample data, how likely is it that the data was obtained from a population with an assumed parametric distribution?

If we have a set of data and we think it's Normal or Poisson or otherwise distributed, what value of the parameter/s gives the best fit for the given data and distribution?

To find this value we find the maximum value of the Likelihood Function, usually denoted $L(\theta)$. As such, $\hat{\theta}$ represents the parameters obtained by maximising the likelihood function for the model and is called the Maximum Likelihood Estimator, or MLE for short.
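For the Poisson example the maximisation can also be done by hand. The derivation below is a sketch that assumes n independent daily counts, written x_1, ..., x_n (notation introduced here, not taken from the text above), and it shows why the estimate coincides with the sample mean.

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \frac{e^{-\lambda}\,\lambda^{x_i}}{x_i!} \\
\log L(\lambda) &= -n\lambda + \Big(\sum_{i=1}^{n} x_i\Big)\log\lambda - \sum_{i=1}^{n}\log(x_i!) \\
\frac{d}{d\lambda}\log L(\lambda) &= -n + \frac{1}{\lambda}\sum_{i=1}^{n} x_i = 0
\quad\Longrightarrow\quad
\hat{\lambda} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}
\end{align*}
```

The second derivative is negative whenever at least one count is positive, so this stationary point is a maximum.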

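Using as a head-bending example the Normal distribution (using beta instead of theta):

$$L(\beta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\frac{\left(y_i - \sum_{j} x_{ij}\beta_j\right)^{2}}{2\sigma^{2}}\right)$$

Notice that in the above:

  • i indexes individuals (aka the dependent variable or response)
  • j indexes explanatory variables (independent variables).

One consequence of the above equation is that the exponential component can never be greater than 1, as the component

$$\frac{\left(y_i - \sum_{j} x_{ij}\beta_j\right)^{2}}{2\sigma^{2}}$$

will never be negative, so

$$\exp\left(-\frac{\left(y_i - \sum_{j} x_{ij}\beta_j\right)^{2}}{2\sigma^{2}}\right)$$

will always be between 0 and 1, and so will L(β) whenever the constant in front of the exponential is no bigger than 1.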
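The bound on the exponential component is easy to check numerically. The sketch below assumes a Normal linear model in which individual i has response y_i and explanatory values x_ij, with a common standard deviation sigma. The toy data and the function name are hypothetical.

```python
import math

def normal_exponential_factors(y, X, beta, sigma):
    """exp(-(y_i - sum_j x_ij * beta_j)^2 / (2 * sigma^2)) for each individual i."""
    factors = []
    for y_i, x_i in zip(y, X):
        fitted = sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
        factors.append(math.exp(-((y_i - fitted) ** 2) / (2 * sigma ** 2)))
    return factors

# Hypothetical toy data: three individuals, an intercept column and one explanatory variable.
y = [1.0, 2.1, 2.9]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
beta = [1.0, 1.0]
sigma = 0.5

factors = normal_exponential_factors(y, X, beta, sigma)
print(factors)
print(all(0.0 < f <= 1.0 for f in factors))
```

A factor equals 1 only when the fitted value matches the response exactly. The bound applies to the exponential part alone, because the constant 1/sqrt(2πσ²) in front can exceed 1 when sigma is small.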

Anyway I digress...