- The BigML blog has been on a roll lately with many interesting posts. I particularly liked this one, Bedtime for Boosting, which goes pretty deep into benchmarking various versions of the boosting algorithms we all know and love (?).
- Mark Gerstein of Yale University has a nice slide deck about the big data blizzard in genomics (PDF link). There are lots of ideas here about how to build predictive models based on, for example, ENCODE data. I won’t get into the ongoing controversy around ENCODE here; suffice it to say that I think the ENCODE data sets are a good resource for starting to build statistical models of genomic regulation on a larger scale.
- The O’Reilly Radar has a good post about how Python data tools just keep getting better.
- An “ultra-tricky” bioinformatics challenge will be run by Genome Biology on DNA Day (April 25), with a “truly awesome” prize. Intriguing.