The Basics of Machine Learning for Business
Machine Learning is sexy. It’s a buzzword, but it’s also changing businesses across all industries in a very real and rapid way. It feels like voodoo even to me, a trained engineer; maybe it’s all the hype and my proclivity for science fiction.
It’s not voodoo. Let’s break it down.
At the core of Machine Learning (ML) are so-called models. ML models are functions, you know, like f(x) = 5x + 10. Only they get super complex, with lots of parameters, not just a lonely x variable.
In essence, ML is a bunch of math algorithms running on lots of data with the purpose of building a model, aka figuring out all the parameters of a complex function.
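To make "a model is a function" concrete, here is a minimal Python sketch. The names simple_model and bigger_model, and all the numbers, are made up for illustration; real models have far more parameters, and those parameters are learned from data rather than typed in by hand.

```python
def simple_model(x):
    """The toy function from the text: f(x) = 5x + 10."""
    return 5 * x + 10


def bigger_model(features, weights, bias):
    """Same idea with many inputs: each input gets its own parameter (weight)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


print(simple_model(2))  # 20

# Hypothetical inputs and parameters, purely for illustration.
features = [3.0, 1.0, 4.0]
weights = [2.0, -1.0, 0.5]
print(bigger_model(features, weights, bias=10.0))  # 2*3 - 1*1 + 0.5*4 + 10 = 17.0
```

"Training" a model means letting the algorithm pick values like those weights and that bias for you.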
No magic, this is just math. Math can be scary, but good ol’ Pavel will protect you, don’t fret.
We use ML models (functions) for three kinds of tasks, usually to predict something:
Regression
I’ve got a bunch of data; I want to fit a curve to it. For example: given f(x) = mx + b, find m and b.
Classification
Are these customers likely to churn? Is this an image of a dog or a muffin?
Clustering
Segmenting populations or customers, arranging results by category (as a search engine does), discovering similar items.
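Taking the regression case as an example, "find m and b" can be sketched in a few lines of plain Python. This uses ordinary least squares, one common way to fit a straight line; the function name fit_line and the data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a straight line; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


# Made-up data that happens to sit exactly on f(x) = 5x + 10.
xs = [0, 1, 2, 3, 4]
ys = [10, 15, 20, 25, 30]
m, b = fit_line(xs, ys)
print(m, b)  # 5.0 10.0
```

Real data is noisy, so the fitted m and b will only approximate the underlying relationship.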
Basically, those are your three styles of models. You’ll pick an approach based on the specific business problem or question you’re trying to solve.
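The difference between classification and clustering is easiest to see side by side. In the rough sketch below, the classifier learns from examples that already carry a label, while the clustering function gets no labels and has to find the groups itself. Both are deliberately tiny (a nearest-average classifier and a one-dimensional k-means with two groups), the function names and numbers are hypothetical, and the clustering function assumes at least two distinct values.

```python
def train_classifier(values, labels):
    """Classification: learn the average value for each known label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}


def classify(centers, value):
    """Predict the label whose learned average is closest to the new value."""
    return min(centers, key=lambda label: abs(centers[label] - value))


def cluster_two_groups(values, rounds=10):
    """Clustering: a tiny 1-D k-means with k=2; no labels are given."""
    low, high = min(values), max(values)  # starting guesses for the two centers
    for _ in range(rounds):
        # Assign each value to its nearest center, then move each center
        # to the average of the values assigned to it.
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return sorted(near_low), sorted(near_high)


# Classification: made-up monthly logins per customer, with known outcomes.
logins = [1, 2, 3, 20, 22, 25]
outcomes = ["churned", "churned", "churned", "stayed", "stayed", "stayed"]
centers = train_classifier(logins, outcomes)
print(classify(centers, 4))   # churned
print(classify(centers, 18))  # stayed

# Clustering: made-up monthly spend per customer, no labels at all.
spend = [10, 12, 11, 95, 100, 105]
print(cluster_two_groups(spend))  # ([10, 11, 12], [95, 100, 105])
```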
We can think about the overall machine learning lifecycle in three stages:
1. Prepare data, get your cowboy gear on and do some wrangling.
2. Feed the data into the math monster, build and train a model/function.
3. Deploy the model, feed live data into the function and do something with the result of that function (for example, detect a fraudulent transaction and block it).
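The three stages above can be sketched end to end with a toy fraud check. Everything here is an assumption made for illustration: the transaction amounts are invented, the "model" is a single learned cutoff (the average amount plus three standard deviations), and a real fraud model would look at far more than the amount.

```python
from statistics import mean, stdev

# Stage 1: prepare data. Dropping records with missing amounts stands in
# for real-world wrangling.
raw_history = [42.0, 38.5, None, 51.0, 47.25, None, 40.0, 55.5]
clean_history = [amount for amount in raw_history if amount is not None]


# Stage 2: train. The "model" is one learned parameter: a cutoff amount.
def train(amounts, num_stdevs=3.0):
    return mean(amounts) + num_stdevs * stdev(amounts)


cutoff = train(clean_history)


# Stage 3: deploy. Feed live data into the function and act on the result.
def handle_transaction(amount, cutoff):
    return "block" if amount > cutoff else "allow"


print(handle_transaction(45.0, cutoff))   # allow
print(handle_transaction(900.0, cutoff))  # block
```

Retraining is just running stages 1 and 2 again on newer data and swapping in the new cutoff.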
This is a cyclical and iterative process. Once deployed, we take the latest data, check whether our target metrics are improving, then feed more data in to train an even more precise model.
Why are AI and ML so HOT HOT HOT right now?
More data is being generated and captured. Machine learning needs data; without it, there is nothing to learn from. More data tends to produce more accurate models.
More compute at cheaper rates. With the cloud, we can spin up 2,000 GPUs (specialized processors) to train a model for a couple of hours and pay only for that time. Imagine having to build your own computing infrastructure instead, real estate lease and all.
We have free access to state-of-the-art algorithms, tools and frameworks. TensorFlow, PyTorch, scikit-learn, you get the idea: the cutting-edge ML algorithms are open source.
Bottom line: don’t be afraid, it’s just some math. Most businesses will use ML either directly or through vendor software to improve operations and sales. For lots of businesses, much of the data still lies there untapped. State-of-the-art tools are freely available.
This article is completely inspired (borderline plagiarized) by Matt Winkler’s video, see the full thing here (scroll down a bit).