I have been using R, mostly happily, for the past 6 or 7 years, for its variety of statistical and machine learning packages and the relative ease of producing nice-looking plots. At the same time I am a big user of Python for things that R really doesn’t do that well, such as large-scale string manipulation. I had been aware of scikit-learn (or sklearn) for a while as a potential way to be able to do “everything” in Python including stats and plotting, but never really felt the pull to start using it. In the beginning, it felt too immature; later, it felt too messy when I looked at the documentation.
Last week, however, I came across a really good tutorial by Jake Vanderplas that finally made sklearn click for me and perhaps will push me over the edge to start using it. (I don’t expect to leave R any time soon, though…) The tutorial shows, step by step, how to divide your data set into training and test sets, fit models and make predictions, perform grid searches for parameter settings, plot learning curves etc.
Deep learning is another subject (although much bigger than sklearn of course) that I have kept up a passing interest in but never really looked into properly, because I wasn’t sure where to start. The new book Deep learning: Methods and applications (PDF link) by Li Deng and Dong Yu seems like a good place to start. I’ve only read a few chapters, but so far it has done a good job of clarifying terms and putting deep learning methods into a historical context.