Data science in China
China’s been on my mind lately, as I’ve been putting together my visa application to go there in October. I feel that the “(big) data (science)” mindset has really taken root there, which is maybe not so surprising in such a huge and populous country, where it’s natural to think of billions of potential customers and where engineering and the quantitative sciences are appreciated (e.g. many of the Communist Party leaders are science PhDs and engineers).
For instance, I first heard about Mayer-Schönberger and Cukier’s book Big Data: A Revolution That Will Transform How We Live, Work, and Think, before it had appeared in English, from Professor 周涛 (Zhou Tao), who told me that it was already being translated into Chinese. In fact, if you look at the publication dates on Amazon, it looks like the Chinese version was published first, but that can hardly have been the case, can it?
I learned the abbreviation BAT (Baidu, Alibaba and Tencent) for the three big Chinese internet companies from a Quora thread, where they were also pinpointed as the main big data players in China right now. Baidu, of course, has made waves with the relatively recent announcement that Andrew Ng (deep learning and general machine learning guru of Stanford, Google and Coursera) has joined their artificial intelligence lab, which he will head and where he intends to implement some truly visionary projects. (You can hear more about Ng’s plans for the future of AI here; the audio is pretty bad, though.) Tencent is the company that developed WeChat and QQ, huge platforms in China and some other parts of the world, but I don’t know much about their data science efforts, so I’ll pass over them in silence. Finally, Alibaba of course recently went public to great fanfare, and offers a very interesting service (connecting customers directly to manufacturers).
Here is a recent blog post about how Alibaba use Spark and GraphX to analyze their e-commerce platform, which is one of the largest in the world, collecting hundreds of petabytes of data.
Alibaba are currently looking for senior data scientists for their Hangzhou office. Looks like a fun gig.
By the way, the International Conference on Machine Learning (ICML) 2014 was held in Beijing. Here is a link to the videos (some of which seem quite interesting).
I’d love to learn more about data science in China and welcome any comments.