
> However, the idea is that often much of the probability mass will be concentrated around the maximum likelihood estimate, which is why it makes a good estimate and is worth using.

This is a Bayesian point of view. The other answers are more frequentist, pointing out that the likelihood at a parameter theta is NOT the probability of theta being the true parameter (given the data), so we can't and don't interpret it as a probability.
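
One way to see this concretely: for a Bernoulli model the likelihood L(theta) = theta^k * (1 - theta)^(n - k) peaks at the MLE k/n, but it does not integrate to 1 over theta, so it is not a probability density for theta. A minimal Python sketch (the 7-heads-out-of-10 data is made up for illustration):

    import numpy as np
    from scipy.integrate import quad

    n, k = 10, 7  # made-up coin-flip data: 7 heads in 10 tosses

    def likelihood(theta):
        # L(theta | data) = theta^k * (1 - theta)^(n - k)
        return theta**k * (1 - theta)**(n - k)

    thetas = np.linspace(0, 1, 1001)
    print(thetas[np.argmax(likelihood(thetas))])  # 0.7, the MLE k/n

    area, _ = quad(likelihood, 0.0, 1.0)
    print(area)  # ~0.00076 = 1/(11 * C(10, 7)), nowhere near 1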



Given enough data, Bayesian and frequentist estimates tend to converge to the same answer anyway; the Bernstein-von Mises theorem formalizes this, with the posterior concentrating around the MLE as the sample grows.
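
A quick sketch of that convergence for a Bernoulli parameter, assuming a Beta(2, 2) prior (an arbitrary choice; any fixed prior gets swamped the same way):

    import numpy as np

    rng = np.random.default_rng(0)
    theta_true = 0.3
    a, b = 2.0, 2.0  # assumed Beta(2, 2) prior, purely for illustration

    for n in (10, 100, 10_000):
        heads = rng.binomial(n, theta_true)
        mle = heads / n                        # frequentist point estimate
        post_mean = (a + heads) / (a + b + n)  # conjugate posterior mean
        print(n, round(mle, 4), round(post_mean, 4))
    # As n grows the prior pseudo-counts (a, b) are swamped by the data,
    # and the posterior mean converges to the MLE (and to theta_true).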

Bayesian priors have a similar effect to regularization: e.g. the ridge regression penalty on large parameter values corresponds to a zero-mean Gaussian prior on the weights.
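
For ridge regression specifically, the penalized least-squares solution coincides with the MAP estimate under a zero-mean Gaussian prior on the weights. A small sketch under assumed settings (lam, sigma2, and the toy data are arbitrary illustration choices):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
    lam, sigma2 = 0.5, 1.0  # penalty strength / noise variance, chosen arbitrarily

    # Ridge: argmin ||y - Xw||^2 + lam * ||w||^2, in closed form.
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

    def neg_log_posterior(w):
        # Gaussian likelihood N(Xw, sigma2 I) and prior w ~ N(0, (sigma2 / lam) I),
        # constants dropped: this is the ridge objective up to scaling.
        return (np.sum((y - X @ w) ** 2) + lam * np.sum(w ** 2)) / (2 * sigma2)

    w_map = minimize(neg_log_posterior, np.zeros(3)).x
    print(np.allclose(w_ridge, w_map, atol=1e-4))  # True: same optimum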


That's not a Bayesian point of view; you can reword it in terms of a confidence interval / coverage probability. It is true that in frequentist statistics parameters don't have probability distributions, but their estimators very much do, and one of the main properties of a good estimator, consistency, is formulated in terms of convergence in probability to the true parameter value.
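
Both properties are easy to check by simulation; a minimal sketch with an arbitrarily chosen normal model (mu, sigma, n are all illustration values), using the usual normal-approximation 95% interval:

    import numpy as np

    rng = np.random.default_rng(2)
    mu, sigma = 5.0, 2.0  # true parameter values, chosen arbitrarily
    n, trials = 30, 10_000

    # Coverage: across repeated samples, ~95% of intervals contain mu.
    hits = 0
    for _ in range(trials):
        x = rng.normal(mu, sigma, size=n)
        se = x.std(ddof=1) / np.sqrt(n)
        hits += (x.mean() - 1.96 * se) <= mu <= (x.mean() + 1.96 * se)
    print(hits / trials)  # near 0.95 (slightly under, since 1.96 skips the t correction)

    # Consistency: the sample mean converges in probability to mu.
    for m in (10, 1_000, 100_000):
        print(m, rng.normal(mu, sigma, size=m).mean())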



