From my experience the biggest hindrance to the future of data science is how crapp...

nickdavidhaynes · on Feb 13, 2017

To be fair, learning statistics is hard for the same reason that doing statistics is hard: any statistic involves assumptions, and the assumptions underlying different models can differ in very subtle ways. Even professional academic statisticians disagree about fundamental concepts like p-values [1] and how to quantify uncertainty under multiple hypothesis testing [2]. Unfortunately, I don't see any of this getting easier any time soon (although I would love to be proven wrong).

[1] http://www.tandfonline.com/doi/full/10.1080/00031305.2016.11...

[2] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1112991/
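
To make the multiple-testing issue concrete, here is a minimal sketch in Python (standard library only). It assumes the tests are independent and that every null hypothesis is true; the function names and the 0.05 level are illustrative choices, and Bonferroni is just one simple correction among several.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test level that keeps the family-wise error rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    alpha = 0.05  # illustrative significance level
    for m in (1, 5, 20, 100):
        uncorrected = familywise_error_rate(alpha, m)
        corrected = familywise_error_rate(bonferroni_threshold(alpha, m), m)
        print(f"m={m:3d}  uncorrected={uncorrected:.3f}  bonferroni={corrected:.3f}")
```

The uncorrected rate climbs quickly as the number of tests grows, which is why "how to quantify uncertainty" stops being a one-line answer once more than one hypothesis is involved.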

ddysgath · on Feb 13, 2017

Moving away from Null Hypothesis Testing and towards a more Bayesian approach is a good first step. For me, and I'm sure many others, NHT is a very backwards way of approaching inference. I don't care about an imaginary distribution with mean 0, I have real data I can fit to a distribution directly--what can you tell me about it? Conditioning on the data itself rather than an unobservable parameter of interest is much more intuitive and makes it much easier to report results to non-statisticians.
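
A rough sketch of the contrast, in Python with only the standard library. Everything specific here is an assumption for illustration: the data (14 successes in 20 trials), the null value of 0.5, the uniform Beta(1, 1) prior, and the function names. The first function answers "how surprising is this data if the rate were 0.5?"; the other two answer "given this data, what do I believe about the rate?".

```python
import math
import random


def binomial_two_sided_p(k: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial test: total probability, under rate p0,
    of outcomes no more likely than the observed count k."""
    pmf = [math.comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small slack for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


def beta_posterior(k: int, n: int, prior_a: float = 1.0, prior_b: float = 1.0):
    """Conjugate update: Beta(prior_a, prior_b) prior plus k successes in n trials."""
    return prior_a + k, prior_b + (n - k)


def prob_rate_exceeds(a: float, b: float, threshold: float,
                      draws: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(rate > threshold) under a Beta(a, b) posterior."""
    rng = random.Random(seed)
    return sum(rng.betavariate(a, b) > threshold for _ in range(draws)) / draws


if __name__ == "__main__":
    k, n = 14, 20  # hypothetical data
    a, b = beta_posterior(k, n)
    print("NHT p-value against rate 0.5:", binomial_two_sided_p(k, n))
    print("Posterior mean of the rate:  ", a / (a + b))
    print("Estimated P(rate > 0.5):     ", prob_rate_exceeds(a, b, 0.5))
```

The Bayesian output is a statement about the parameter itself, which is usually the thing a non-statistician wants to hear about. The price is that the answer depends on the prior, so that choice has to be stated and defended.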

claytonjy · on Feb 13, 2017

I completely agree; I've found it much harder to self-learn the stats than the software side of things. The sibling post makes a good point, but I think the history of stats vs. comp sci carries weight here too: many people wanting to learn stats outside academia is a much newer phenomenon than people doing the same with programming.

Anyone have any good resources for self-teaching stats? I have a BS in math but only took one stats course, and it was as terrible as all intro-stats classes are. I have a strong, proof-based understanding of probability theory, but haven't found a similar approach to stats. It all seems to be "if data looks like this, use this test, watch for these pitfalls" which is terrible for building intuition.

parul · on Feb 13, 2017

Try the Khan Academy stats resources - https://www.khanacademy.org/math/statistics-probability

Datacamp also launched a bunch of new stats courses recently. I haven't checked them out yet, but their courses are usually good quality. https://www.datacamp.com/courses/topic:probablity_and_statis...

johnmoberg · on Feb 13, 2017

If you like proofs and rigor, take a look at "Statistical Inference" by Casella and Berger.