Machine Learning — Data First

With all the attention on Machine Learning (ML) that I encountered at London Tech Week, I thought I had better find out a bit more about it. I wanted to test my view that it won’t have a dramatic impact on the Oil and Gas industry.

My findings, so far, are that ML might:

  • Speed up some analysis
  • Change the spreadsheet paradigm (for better or for worse)
  • Enable people with the right level of expertise to create predictive analysis (whether this will be at all valuable is a different matter)
  • In a myriad of small ways, change minor tasks (removing annoyances, reducing low-value-add data and reporting work, changing interfaces)

None of these applications is dramatic in itself, but over time they may add incremental benefits that provide a moving improvement front. A bit like Six Sigma or Kanban.

What will you need in order to drive broad value from ML?

You’ll only be able to take advantage of these incremental benefits if four things are true:

  1. You have clean, historical data available
  2. You can access and combine quality-controlled, time-dependent data in near real-time (ideally from multiple sources)
  3. You have widespread knowledge of how to apply the new analysis tools, like those based on “R”. (Think how many people can use Excel today for collecting, analysing, querying and reporting on data, with varying degrees of proficiency. But who do you know that can use “R”?)
  4. You are prepared to reorganise the way work is performed to take advantage of the new possibilities: data analysis that leads to management decisions and employee actions based on demonstrated facts and assessed probabilities.
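
To make the second point more concrete, here is a minimal sketch of what "combining quality-controlled, time-dependent data from multiple sources" might look like in practice. Everything in it is an assumption for illustration: the function names (asof_join, quality_filter), the pressure and temperature readings, the plausible-range limits and the five-second tolerance are all hypothetical, and a real system would more likely use a historian or a data-frame library than hand-written code.

```python
from bisect import bisect_right


def quality_filter(records, low, high):
    """Keep only readings inside a plausible physical range."""
    return [(t, v) for t, v in records if low <= v <= high]


def asof_join(left, right, tolerance_s):
    """Pair each left record with the latest right record at or before it.

    left, right: lists of (timestamp_seconds, value), sorted by timestamp.
    Returns a list of (timestamp, left_value, right_value_or_None).
    """
    right_times = [t for t, _ in right]
    joined = []
    for t, value in left:
        # Index of the latest right-hand record with timestamp <= t
        i = bisect_right(right_times, t) - 1
        match = None
        if i >= 0 and t - right_times[i] <= tolerance_s:
            match = right[i][1]
        joined.append((t, value, match))
    return joined


# Hypothetical data: pressure (bar) and temperature (deg C) from two sources
pressure = [(0, 101.2), (10, 101.4), (20, -999.0), (30, 101.9), (50, 102.0)]
temperature = [(0, 64.0), (9, 64.3), (29, 65.1)]

# -999.0 is a typical "bad value" placeholder; the range check removes it
clean_pressure = quality_filter(pressure, low=0.0, high=500.0)
combined = asof_join(clean_pressure, temperature, tolerance_s=5)
print(combined)
```

The reading at 50 seconds has no temperature within the tolerance, so it is paired with None rather than with stale data. Deciding what counts as "too old" is exactly the kind of quality-control rule that has to be agreed before any ML tool can use the combined data.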

Low cost hardware is the trigger

The interest in machine learning stems from the dramatic drop in the cost of the hardware and software required to perform the number crunching. Because of this, the complexity of the problems that can be addressed has increased. Cheap computing power also makes less efficient code affordable, which makes the tools and techniques easier to use and leads to their application by practitioners outside pure decision sciences.

If you attend any of the IoT conferences, or speak to the large vendors of real-time industrial data, you’ll hear a lot about how edge computing and Machine Learning will change things for industry. “Edge” means placing computing power in the field at low power and low cost.

Putting this computing power alongside the sensor allows information to be pre-processed locally so that only the results are sent back. This helps to reduce bandwidth requirements and speed up response times.
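
As a rough illustration of that pre-processing idea, the sketch below reduces a window of raw sensor readings to a small summary record and compares the size of the two payloads. The function name summarise_window, the one-minute window of pressure readings and the alarm threshold are all made up for the example; it is not taken from any particular edge device or vendor product.

```python
import json
import statistics


def summarise_window(readings, alarm_threshold):
    """Reduce a window of raw readings to a small summary record."""
    return {
        "count": len(readings),
        "mean": round(statistics.fmean(readings), 2),
        "min": min(readings),
        "max": max(readings),
        # Flag the window locally so the field device can react at once
        "alarm": max(readings) > alarm_threshold,
    }


# Hypothetical one-minute window of pressure readings sampled once a second
raw = [100.0 + (i % 5) * 0.5 for i in range(60)]
summary = summarise_window(raw, alarm_threshold=105.0)

# Compare what would be sent back with and without edge pre-processing
raw_bytes = len(json.dumps(raw))
summary_bytes = len(json.dumps(summary))
print(summary)
print(f"raw payload: {raw_bytes} bytes, summary payload: {summary_bytes} bytes")
```

Only the summary crosses the network, and the alarm flag is computed next to the sensor, which is where the faster response comes from. The trade-off is that the raw readings are gone unless the device also stores them locally.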

For a view on how cheap this type of technology now is, and how undramatic the applications of ML really are, I invite you to have a look at this video (from the hobbyist market) showing what can be done for less than $100. Listen out for the references to “TensorFlow”; one day that will be important. There are also some passing references to cloud-based resources that may be of interest.