A quotable Domingos paper
I’ve been (re-)reading Pedro Domingos’ paper, A Few Useful Things to Know About Machine Learning, and wanted to share some quotes that I like.
- (…) much of the “folk knowledge” that is needed to successfully develop machine learning applications is not readily available in [textbooks].
- Most textbooks are organized by representation [rather than the type of evaluation or optimization] and it’s easy to overlook the fact that the other components are equally important.
- (…) if you hire someone to build a classifier, be sure to keep some of the data to yourself and test the classifier they give you on it.
- Farmers combine seeds with nutrients to grow crops. Learners combine knowledge with data to grow programs.
- What if the knowledge and data we have are not sufficient to completely determine the correct classifier? Then we run the risk of just hallucinating a classifier (…)
- (…) strong false assumptions can be better than weak true ones, because a learner with the latter needs more data to avoid overfitting.
- Even with a moderate dimension of 100 and a huge training set of a trillion examples, the latter cover only a fraction of about 10^-18 of the input space. This is what makes machine learning both necessary and hard.
- (…) the most useful learners are those that facilitate incorporating knowledge.
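The 10^-18 figure in the quote above is easy to check with a few lines of Python, assuming the 100 dimensions are boolean features (as in the paper), so the input space has 2^100 possible examples:

```python
from fractions import Fraction

dimension = 100        # boolean features, as in the paper
examples = 10 ** 12    # a "huge" training set of a trillion examples

input_space = 2 ** dimension          # number of distinct possible inputs
coverage = Fraction(examples, input_space)

print(float(coverage))  # ~7.9e-19, i.e. roughly 10^-18 of the input space
```

Even a trillion examples leave essentially all of the input space unseen, which is exactly why a learner must generalize rather than memorize.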
Another interesting recent paper by Domingos is What’s Missing in AI: The Interface Layer.
Domingos has also done a lot of interesting earlier work, for instance on why Naïve Bayes often works well even though its assumptions are not fulfilled, and on why bagging works well. Those are just the ones I remember; I’m sure there is a lot more.