Learning about deep learning through album cover classification
In the past month, I’ve spent some time on my album cover classification project. The goal of this project is for me to learn about deep learning by working on an actual problem. This post covers my...
Goodbye, Parse.com
Over the past year, I’ve been using Parse’s free backend-as-a-service and web hosting to serve BCRecommender (music recommendation service) and Price Dingo (now-closed shopping comparison engine). The...
You don’t need a data scientist (yet)
The hype around big data has caused many organisations to hire data scientists without giving much thought to what these data scientists are going to do and whether they’re actually needed. This is a...
The wonderful world of recommender systems
I recently gave a talk about recommender systems at the Data Science Sydney meetup (the slides are available here). This post roughly follows the outline of the talk, expanding on some of the key...
Miscommunicating science: Simplistic models, nutritionism, and the art of...
I recently finished reading the book In Defense of Food: An Eater’s Manifesto by Michael Pollan. The book criticises nutritionism – the idea that one should eat according to the sum of measured...
Migrating a simple web application from MongoDB to Elasticsearch
Bandcamp Recommender (BCRecommender) is a web application that serves music recommendations from Bandcamp. I recently switched BCRecommender’s data store from MongoDB to Elasticsearch. This has made it...
The hardest parts of data science
Contrary to common belief, the hardest part of data science isn’t building an accurate model or obtaining good, clean data. It is much harder to define feasible problems and come up with reasonable...
This holiday season, give me real insights
Merriam-Webster defines an insight as an understanding of the true nature of something. Many companies seem to define an insight as any piece of data or information, which I would call a...
The joys of offline data collection
Many modern data scientists don’t get to experience data collection in the offline world. Recently, I spent a month sailing down the northern Great Barrier Reef, collecting data for the Reef Life...
Why you should stop worrying about deep learning and deepen your...
Everywhere you go these days, you hear about deep learning’s impressive advancements. New deep learning libraries, tools, and products get announced on a regular basis, making the average data...
The rise of greedy robots
Given the impressive advancement of machine intelligence in recent years, many people have been speculating on what the future holds when it comes to the power and roles of robots in our society. Some...
Diving deeper into causality: Pearl, Kleinberg, Hill, and untested assumptions
Background: I have previously written about the need for real insights that address the why behind events, not only the what and how. This was followed by a fairly popular post on causality, which was...
Making Bayesian A/B testing more accessible
Much has been written in recent years on the pitfalls of using traditional hypothesis testing with online A/B tests. A key issue is that you’re likely to end up with many false positives if you...
Is Data Scientist a useless job title?
Data science can be defined as either the intersection or union of software engineering and statistics. In recent years, the field seems to be gravitating towards the broader unifying definition, where...
If you don’t pay attention, data can drive you off a cliff
You’re a hotshot manager. You love your dashboards and you keep your finger on the beating pulse of the business. You take pride in using data to drive your decisions rather than shooting from the hip...
Ask Why! Finding motives, causes, and purpose in data science
Some people equate predictive modelling with data science, thinking that mastering various machine learning techniques is the key that unlocks the mysteries of the field. However, there is much more to...
Customer lifetime value and the proliferation of misinformation on the internet
Suppose you work for a business that has paying customers. You want to know how much money your customers are likely to spend to inform decisions on customer acquisition and retention budgets. You’ve...
Exploring and visualising reef life survey data
Last year, I wrote about the Reef Life Survey (RLS) project and my experience with offline data collection on the Great Barrier Reef. I found that using auto-generated flashcards with an increasing...
My 10-step path to becoming a remote data scientist with Automattic
About two years ago, I read the book The Year without Pants, which describes the author’s experience leading a team at Automattic (the company behind WordPress.com, among other products). Automattic is...
Advice for aspiring data scientists and other FAQs
Aspiring data scientists and other visitors to this site often repeat the same questions. This post is the definitive collection of my answers to such questions (which may evolve over time). How do I...