As an addendum to last week’s post about machine learning, here is an article by the BBC: https://www.bbc.co.uk/news/technology-48825761. It tells the story of an Amazon employee who built a cat-flap. Fed up with receiving “presents”, he made this clever device to recognise whether his cat has come home with prey in its jaws. If it has, the flap refuses to open and the cat must stay outside.
I thought it neatly encapsulated some of the problems I am finding with machine learning, and also pointed to some areas where this technology could be applied to generate more value.
The training problem and need for clean data
It took 23,000 photos to train the algorithm. Each had to be hand-sorted to determine whether it showed a cat, a cat with prey, and so on.
The oil and gas industry has the same problem: machine learning needs a lot of clean training data, which may not be available without a lot of manual input.
The Bayesian stats can work against you
The frequency of event-occurrence in this case is once every 10 days. The maker ran a trial for 5 weeks (so should have seen 3-4 instances in that trial, though in this case it’s stated as seven). There was one false positive and one false negative, giving an error rate which may be around 20% (though there are not enough samples to have a lot of confidence). Suppose the algorithm continues to learn and the cat uses the flap 5 times a day on average. In a year it will then have 1,825 additional samples: about 35 positives, of which 7 will be missed (false negatives), and 1,790 negatives, of which 358 will be false positives. Manual correction will be required 365 times (i.e. once per day on average), and at that rate it will take about 12 years to duplicate the original training set. I don’t know, but I suspect there are diminishing returns on adding new data, so I’ve no idea how much smarter it will be in 12 years.
Disclaimer – a good statistician will know my maths above is not quite right, but the principle is.
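For anyone who wants to check the arithmetic, here is a minimal Python sketch of the back-of-envelope sum. All the inputs are the rough figures from this post (5 uses a day, about 35 prey events a year, a guessed 20% error rate applied equally to positives and negatives); the function and variable names are made up for illustration, and none of this comes from the cat-flap's real software.

```python
def yearly_flap_estimate(uses_per_day=5, prey_events=35, error_rate=0.20,
                         training_set=23_000, days=365):
    """Back-of-envelope yearly figures, using the post's rough assumptions."""
    samples = uses_per_day * days                    # new photos per year
    negatives = samples - prey_events                # cat arrives without prey
    false_negatives = round(prey_events * error_rate)  # prey missed, flap opens
    false_positives = round(negatives * error_rate)    # flap wrongly stays shut
    return {
        "samples": samples,
        "negatives": negatives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "manual_corrections": false_negatives + false_positives,
        "years_to_duplicate": round(training_set / samples, 1),
    }


estimate = yearly_flap_estimate()
for name, value in estimate.items():
    print(f"{name}: {value}")
```

Changing error_rate or uses_per_day shows how sensitive the correction workload is to those guesses.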
Do you trust the alarm? Do you take notice?
So in this example, quite like oil and gas operations, the issue was getting hold of clean training data. The event being detected was comparatively rare, meaning that a lot of false positives are likely. In the example above an “alarm” would have sounded about 386 times in the year (28 real catches plus 358 false positives), and on only 28 of those occasions would it have been real. With events this sparse, the algorithm will not learn very fast and I think the alarms will be ignored.
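The "do you trust the alarm" question can be put as a single number: the share of alarms that are real. The sketch below is illustrative only. It counts an alarm as any time the flap refuses to open (true positives plus false positives), uses the rough yearly figures from the example above as defaults, and the function name is hypothetical.

```python
def alarm_precision(true_positives=28, false_positives=358):
    """Return (total alarms, fraction of alarms that were real)."""
    alarms = true_positives + false_positives
    return alarms, true_positives / alarms


alarms, precision = alarm_precision()
print(f"{alarms} alarms in a year, of which {precision:.1%} were real")
```

On these assumptions only about 7 alarms in every 100 are genuine, which is why a rare event combined with a modest error rate makes an alarm easy to ignore.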
So where can the applications be better?
Parallel learning helps build better predictive models. If we had the same cat-flap for every cat, every owner corrected the false signals, and the right algorithm could learn quickly and disseminate the results to all, this would speed up the learning process. Self-driving cars are a good example of where this is possible, and Google search is a great example of the power of parallel experience.
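To put a rough number on the benefit of pooling, here is a small Python sketch. It assumes, purely for illustration, that every flap sees 5 uses a day, that every owner corrects every mistake, and that all corrected photos feed one shared model; the function name and the flap counts are hypothetical.

```python
def years_to_match_training_set(n_flaps, training_set=23_000,
                                uses_per_day=5, days=365):
    """Years for n_flaps pooled flaps to collect as many photos as the original set."""
    if n_flaps < 1:
        raise ValueError("need at least one flap")
    return training_set / (n_flaps * uses_per_day * days)


for n in (1, 100, 1000):
    years = years_to_match_training_set(n)
    print(f"{n:>5} flaps: {years:.3f} years ({years * 365:.0f} days)")
```

One flap needs over 12 years, while a thousand pooled flaps would get there in under a week, though this ignores any diminishing returns from adding more data.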
Products and services impossible for a human
Situations where there is so much data that manual processing is impossible. Here I don’t mean that you can collect so much data on an ongoing manual operation that you cannot analyse all the extra information. What I do mean is that there is intrinsically so much information that it would be impossible to analyse by hand, and it never has been. So an ML approach is the only one possible. For instance, looking at real-time changes in patterns in data networks.
Simple situations where it’s expensive and boring for a human
Automated first-line help systems, call screening, password resetting etc. These are all tasks that humans can do, but they are simple tasks which are too boring for smart people, and automated help can often provide a better experience. They are also tasks where “sorry I didn’t get that, did you say yes?” is only mildly irritating, not the cause of major corporate loss.
There are places where machine learning will be revolutionary, but I suspect that much of ML will either be embedded to make normal tasks a bit easier (automatic spell-checking, voice recognition etc.) or will tackle classes of problem such as IT security or on-line shopping behaviour, where there is inherently a lot of fast-moving data and manual monitoring is neither feasible nor fast enough to work.