There is a lot of interesting stuff the human eye can't see. One of the challenges in medical imaging is getting accurate labelling of images. Many labelling tasks show large inter- and intra-observer variability: humans both flag things that are not interesting and miss things that are.
I currently work on estimating emphysema extent in CT lung scans. Emphysema can be very diffuse and it is not possible to label individual pixels, so instead we try to learn the local emphysema pattern from a global label. Neural networks are interesting for this problem because they learn the features, but that is also a "problem" because the features might not make physical sense, which could make it hard to transfer the model and to convince clinicians that they should use it.
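The global-label setup described above is essentially a weakly supervised, multiple-instance-style learning problem. A minimal sketch of the pooling idea, assuming hypothetical patch features and a made-up linear scoring function (not the actual model):

```python
import numpy as np

def patch_scores(patches, w):
    """Score each local patch with a simple linear model (hypothetical weights)."""
    return patches @ w

def scan_level_prediction(patches, w, pooling="mean"):
    """Pool local patch scores into one global, scan-level estimate."""
    scores = patch_scores(patches, w)
    if pooling == "max":
        # "is there any diseased patch?" reading of the global label
        return float(scores.max())
    # "extent = average involvement" reading of the global label
    return float(scores.mean())
```

The choice of pooling encodes an assumption about what the global label means; for an extent estimate, mean pooling is the more natural reading.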
For that kind of task, you might want to filter out other things.
We should just be realistic.
We want to take a real image, except it might have been tinkered with, and have a neural net tell us what we see in it, except we also want it to see what we can't see, and we want it to answer as accurately as possible, except we also want a short and definitive answer.
We also kind of want it to admit that an image always contains more than one thing, but kind of don't.
You can say that about almost anything, and the world is still full of factory workers.
As a PhD student in medical imaging, you must also know that getting fully automated segmentation methods to work to the standard required in the clinic is really hard. And once you solve it for one clinic, you will likely not be able to transfer the trained model to another, because scan parameters, patients and workflows differ.
But when we do solve the segmentation task, I think most radiologists will clap their hands and move on.
Stating it as "thousands of patient images a day" is misleading. It would be the same as saying you inspected "thousands of parts each day". As the radiologist further down notes, CT scans contain many slices.
While computers don't get tired, they also have a really hard time solving tasks like annotation automatically. One thing is getting good enough general performance; another is never making critical errors. I see huge potential for ML approaches in health care, but primarily as an aid for health care professionals, not as a full replacement.
Maybe not right now, but there is nothing to say that it can't eventually surpass humans in effectiveness and critical error rate. 10-15 years ago people would have said "yeah self-driving cars are good, but as an aid for the driver, never as a full replacement".
Yes, it is obviously an error by the doctor and the pharmacists; no one is saying otherwise. The whole point of the article is to investigate how this error occurred. We live in a world where these errors happen all the time, and the best thing we can do is learn from them. If the error analysis boils down to "incompetent doctor", then it is likely that the same error will occur again.
That is a good question, and one we should discuss more actively, because if it can go wrong, it will go wrong. What happens when an overeager politician learns that "we can predict with X% accuracy whether a person will do something bad next year"? I might be cynical, but I do not expect the result to be an increased interest in how society can help people before they do bad stuff. It would not surprise me if instead the argument became that extensive surveillance is a great benefit to society because it can identify the bad guys with X% accuracy.
I think the points are good, but I am not very happy about this statement:
"When dealing with small amounts of data, it’s reasonable to try as many algorithms as possible and to pick the best one since the cost of experimentation is low. But as we hit “big data”, it pays off to analyze the data upfront and then design the modeling pipeline (pre-processing, modeling, optimization algorithm, evaluation, productionization) accordingly."
If done correctly, then I agree. But we have to be careful about overfitting when we try out several models or make an initial analysis to determine which model to use. In this sense, choosing a model is no different from fitting the parameters of a model.
If you are disciplined and separate your data into training and testing sets, you can try as many models as you want without fear of overfitting. Indeed, optimizing over the parameters of a model on the training set is essential (pruning parameters in a tree, regularization weights, etc.) and can be thought of as training a large number of models.
If you aren't doing this correctly, then you can't really interpret the performance of even a single model. I've seen people screw this up in so many ways; my favorite recent one, which made it quite high on HN, was someone using the full dataset for variable selection before doing the training-testing split.
If you use performance on the test set for model selection, this is not true. It follows from simple probabilistic reasoning: the more models you try, the higher the chance that one will score well on both the training set and the test set by "luck", and this is especially true with small datasets. In fact, it is best practice to use a separate validation set for model selection and to use the test set only for final performance evaluation; see e.g. the answer to this question:
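This train/validation/test discipline can be sketched with a toy example: several candidate "models" (polynomial degrees, chosen arbitrarily here) are compared on a validation split, and the test split is touched only once at the end. The data and splits are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(-1, 1, 120)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)  # true model is degree 1

# Split: 60% train, 20% validation (model selection), 20% test (final report only)
idx = rng.permutation(x.size)
tr, va, te = idx[:72], idx[72:96], idx[96:]

def mse(deg, fit_idx, eval_idx):
    """Fit a polynomial of the given degree on fit_idx, evaluate MSE on eval_idx."""
    coefs = np.polyfit(x[fit_idx], y[fit_idx], deg)
    resid = np.polyval(coefs, x[eval_idx]) - y[eval_idx]
    return float(np.mean(resid ** 2))

# Pick the degree with the best *validation* error...
degrees = [1, 3, 5, 9]
best = min(degrees, key=lambda d: mse(d, tr, va))
# ...and only then touch the test set, once, for the final number.
test_error = mse(best, tr, te)
```

The crucial point is that the test split never influences which degree gets picked, so `test_error` remains an honest estimate.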
I personally love the topic of Bayesian optimization over all the possible parameters, including model choice. My point was more that, since resources are always constrained, it typically pays off long term for practitioners to analyze the data and understand the underlying mechanics before jumping into modeling.
I thought exactly the same thing. Statistics is about uncertainty, and it's very easy to be misled when you don't correct for trying lots of hypotheses.
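A quick simulation shows how easy it is to be misled. Assuming purely random "models" (coin flips) evaluated on a small 20-sample test set, the best of 50 of them almost always looks good (the numbers here are made up for illustration):

```python
import random

def prob_spurious_win(n_models=50, n_trials=2000, seed=0):
    """Estimate the chance that the *best* of n random 'models' reaches
    >= 60% accuracy on a 20-sample test set purely by luck."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        best = max(
            sum(rng.random() < 0.5 for _ in range(20))  # correct guesses of one model
            for _ in range(n_models)
        )
        if best >= 12:  # 12/20 = 60% accuracy
            hits += 1
    return hits / n_trials
```

With 50 random models, the best one clears 60% in essentially every trial, while a single random model does so only about a quarter of the time; that gap is exactly what multiple-comparison corrections account for.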
A specific critique raised in the press release from IARC is that the study has an
"emphasis on very rare cancers (e.g. osteosarcoma, medulloblastoma) that together make only a small contribution to the total cancer burden."
and that it
"excludes [...] common cancers for which incidence differs substantially between populations and over time."
So it sounds like the generalization hinted at in the abstract shows a bigger misunderstanding of statistics than anything in the press release. It would be nice if the paper were not paywalled, so we could actually read it.
You're right. You might say "most cancers are caused by bad luck," and across the set of types of cancers... that might be the case. But if you were to say, "Most cases of cancer are caused by behavioral or environmental factors" you'd be saying something entirely different.
Yeah, when you look at total cancer burden and the associated epidemiology, you can explain something like 90% of cancers from environmental sources (which include things like obesity). I don't have a reference on me at the moment, so feel free to disagree.
A 2005 Lancet paper [1] says that ~35% of cancer deaths can be attributed to a modifiable risk factor, and that's only risk factors likely to be causal, not known to be causal. If accurate, that would support "most cancer due to luck". However, this is the first time I've looked into it, and if it's higher than 50% I would be quite interested to see data supporting that.
That is a nice illustration of premature optimization. Instead of thinking "let's parallelize", one should measure and find out what causes the performance problems. Should one choose to go down the parallel path, it's a good idea to test whether hyper-threading degrades performance. In my experience it can be expensive to use more than the physical cores.
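The "measure first" advice can be as simple as timing the suspected hot path before reaching for threads. A minimal sketch using Python's stdlib `timeit` (the workload here is a made-up stand-in, not the code from the article):

```python
import timeit

def hot_path(n=10_000):
    # Stand-in for the code you suspect is slow; replace with the real workload.
    return sum(i * i for i in range(n))

# Measure first: if this isn't the bottleneck, parallelizing it won't help.
seconds = timeit.timeit(hot_path, number=100)
```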
edit:
Another issue is that I really don't like when people present speedup in %. How should a 540% speedup be interpreted? It makes more sense as a ratio: sequential/parallel = 10067483333/1583584841 ≈ 6.36, so the parallel version achieves a speedup factor of 6.36.
Yes, I can do basic arithmetic. My point is that the ratio is easy to interpret: the program ran 6.36 times faster. The only reason I can think of for giving a percentage is to make it sound more impressive.
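Spelling out both readings from the timings quoted above shows exactly why the percentage is ambiguous (this is just the parent comment's arithmetic, made explicit):

```python
sequential = 10_067_483_333  # timings quoted in the parent comment
parallel = 1_583_584_841

speedup = sequential / parallel           # ~6.36: "ran 6.36 times faster"
percent_faster = (speedup - 1) * 100      # ~536%: "X% faster" reading
percent_of_original = speedup * 100       # ~636%: "X% of original speed" reading
```

The "540% speedup" in the article only makes sense under the first reading, and the two readings differ by a full 100 percentage points, which is the ambiguity being complained about.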