Machine Learning – more learning….

As an addendum to last week’s post about machine learning – here is an article by the BBC: https://www.bbc.co.uk/news/technology-48825761. It tells the story of an Amazon employee who built a cat-flap. Fed up with receiving “presents”, he made this clever device to recognise whether his cat has come home with prey in its jaws. If it has, the flap refuses to open and the cat must stay outside.

I thought it neatly encapsulated some of the problems I am finding with machine learning, and also pointed to some possible areas where this technology could be applied to generate more value.

The training problem and need for clean data

It took 23,000 photos to train the algorithm. Each had to be sorted by hand according to whether it showed a cat, a cat with prey, and so on.

This is like the oil and gas industry in that it needs a lot of clean training data that may not be available without a lot of manual input.

The Bayesian stats can work against you

The frequency of event-occurrence in this case is once every 10 days. The maker ran a trial for 5 weeks (so should have seen 3-4 instances in that trial, though in this case it’s stated as seven). There was one false positive and one false negative – giving an error rate which may be around 20% (though there are not enough samples to have a lot of confidence). Suppose the algorithm continues to learn, and say that the cat uses the flap 5 times a day on average. This means that in a year it will have 1,825 additional samples, of which 35 will be positives (7 of them missed as false negatives) and 1,790 will be negatives (358 of them flagged as false positives). Manual correction will be required 365 times (i.e. once per day on average), and at that rate it will take about 12 years to duplicate the original training set. I don’t know, but I suspect there are diminishing returns on adding new data, so I’ve no idea how much smarter it will be in 12 years.

Disclaimer – a good statistician will know my maths above is not quite right, but the principle is.
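
The yearly figures above can be reproduced in a few lines of Python. This is only a sketch under the same rough assumptions as the text: 5 uses a day, 35 prey events a year, and a single 20% error rate applied to both positives and negatives. The function name and its parameters are made up for illustration.

```python
def yearly_counts(uses_per_day=5, positives_per_year=35, error_rate=0.20,
                  training_set_size=23_000):
    """Back-of-envelope yearly sample counts (all inputs are assumptions)."""
    samples = uses_per_day * 365                      # new photos per year
    negatives = samples - positives_per_year          # cat without prey
    false_negatives = round(positives_per_year * error_rate)  # prey missed
    false_positives = round(negatives * error_rate)   # flap wrongly locked
    return {
        "samples": samples,
        "negatives": negatives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        # every wrong call needs a human to relabel it
        "corrections": false_negatives + false_positives,
        # years of new photos needed to match the original training set
        "years_to_duplicate": training_set_size / samples,
    }


if __name__ == "__main__":
    for name, value in yearly_counts().items():
        print(f"{name}: {value}")
```

Changing `error_rate` or `uses_per_day` shows how sensitive the "once a day" correction burden is to those guesses.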

Do you trust the alarm? Do you take notice?

So in this example, quite like oil and gas operations, the issue was getting hold of clean training data. The event being detected was comparatively rare, meaning that a lot of false positives are likely. In the example above an “alarm” would have sounded roughly once a day, and on only 28 occasions in the year would it have been real. With events this sparse, the algorithm will not learn very fast and I think the alarms will be ignored.
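
The reason the alarm is so easy to ignore is Bayes' theorem at work: when the event is rare, most alarms are false even if the detector is right 80% of the time. Here is a minimal sketch, assuming the same rough numbers as before (35 prey events in 1,825 uses, 80% of prey caught, 20% of clean entries wrongly flagged). The function name is hypothetical.

```python
def alarm_precision(prevalence, sensitivity, false_positive_rate):
    """Probability that an alarm is real, P(prey | alarm), by Bayes' theorem."""
    true_alarm = sensitivity * prevalence                  # prey and flagged
    false_alarm = false_positive_rate * (1.0 - prevalence)  # no prey but flagged
    return true_alarm / (true_alarm + false_alarm)


if __name__ == "__main__":
    prevalence = 35 / 1825   # assumed share of uses where the cat carries prey
    precision = alarm_precision(prevalence, sensitivity=0.8,
                                false_positive_rate=0.2)
    print(f"Chance that an alarm is real: {precision:.1%}")
```

Under these assumptions only about 7% of alarms are real: 28 true alarms against 358 false ones, 386 in total. That is slightly more than the 365 corrections because the 7 missed events never raise an alarm.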

So where can the applications be better?

Distributed learning

Parallel learning helps build better predictive models. If we had the same cat-flap for every cat, and every owner corrected the false signals, the right algorithm could learn quickly and disseminate the results to all, which would speed up the learning process. Self-driving cars are a good example of where this is possible, and Google Search is a great example of the power of parallel experience.
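
To put a rough number on the speed-up, here is a small sketch. It assumes every flap produces the same 1,825 labelled photos a year as in the example above, that every owner corrects the mistakes, and that pooling the data costs nothing. All of those are simplifications, and the function name is made up.

```python
def years_to_collect(target_samples, samples_per_flap_per_year, n_flaps):
    """Years for n_flaps pooled cat-flaps to gather target_samples photos."""
    return target_samples / (samples_per_flap_per_year * n_flaps)


if __name__ == "__main__":
    for n_flaps in (1, 10, 100, 1000):
        years = years_to_collect(23_000, 1_825, n_flaps)
        print(f"{n_flaps:>5} flaps: {years * 365:8.1f} days")
```

On these assumptions a single flap needs more than 12 years to match the original training set, while a thousand pooled flaps would get there in under a week.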

Products and services impossible for a human

These are situations where there is so much data that manual processing is impossible. Here I don’t mean that you can collect so much data on an ongoing manual operation that you cannot analyse all the extra information. What I do mean is that there is intrinsically so much information that it would be impossible to analyse by hand, and never has been, so an ML approach is the only one possible. An example is looking at real-time changes in patterns in data networks.

Simple situations where it’s expensive and boring for a human

Automated first-line help systems, call screening, password resetting etc. These are all tasks that humans can do, but they are simple and too boring for smart people, and automated help can often provide a better experience. They are also tasks where “sorry I didn’t get that, did you say yes?” is only mildly irritating, not the cause of major corporate loss.

Conclusion

There are places where machine learning will be revolutionary, but I suspect that much of ML will either be embedded to make normal tasks a bit easier (such as auto-spell checking, voice recognition etc.) or will tackle classes of problem such as IT security or on-line shopping behaviour, where there is inherently a lot of fast-moving data and manual monitoring is neither feasible nor fast enough to work.

To innovation and beyond – 2019+

My first post of the year – a look ahead for 2019 – was a bit tongue-in-cheek. Now that the World Economic Forum (WEF) is meeting in Davos, Switzerland, I thought I would provide a more insightful analysis.

The WEF will be considering the implications of the 4th Industrial Revolution as the headline theme for their annual conference. If you’re new to all this, here is an I4.0 primer from CNBC [Link]. 2019 is going to be a year where industrial innovation takes centre stage.

The thinking from WEF is always good, detailed and thorough. I think that some of the crucial themes for unlocking innovative value will be focussed around opportunities and risks. Here are some of my current favourites.

The Opportunities

  1. Using information and reconfigurable platforms to provide new solutions to stakeholder experience. This will establish new ways to create, deliver and consume the core outputs from industrial processes.
  2. Removing the idea of separation between “IT and the Business”. The two are now conjoined. Being good at tech will be a prerequisite of being good in business. Technology will be embedded in every way that work is done, products are created, consumed and delivered.
  3. Empowering the front-line will be crucial. The winners will be faster organisations where workers make autonomous decisions and are rewarded for outcomes. As an analogy think of Deliveroo drivers. For many reasons, more refined models of work-coordination are required but the core autonomous nature of the work is being previewed here. Decentralised decision-making and autonomous action guided by technology removes many of the tasks performed by middle management. I hope we will start to see teachers, dentists, doctors and nurses no longer filling in spreadsheets and working as reluctant automatons directed by ill-informed command-and-control resource-allocation systems.
  4. With power comes responsibility. Without middle management, new forms of controls (and motivation) will be needed to spot problems and reward behaviour. Surprisingly for some, I don’t believe it is the front-line worker, but middle management, that is most under threat from AI, visual computing and big-data. I hope the CFO won’t push progress only on AIQ but that marketing and talent managers will push the AEQ agenda. It’s important we understand not only economics but also pride, satisfaction and feelings of accomplishment.
  5. Innovation may not be in new forms of technology. The tech available to us now is far ahead of our application of it. Deployment options are already available but not used. Innovation will come from the application of existing technology to new areas of business. Those stuck with old infrastructure will not be able to reconfigure fast enough to keep up. Value will arise from designing new ways of working. Capturing the value will rest on finding ways to get the rest of us to work that way too.

And now the risks

  1. Innovation will come from networks. Big companies will look to small companies for ideas; small companies will be formed from collaborative networks of individuals. Ideas will be mashed-up to cross-fertilise creativity. Guards must be in place to avoid exploitative situations – if they arise unchecked it will mean that the small guys can’t and won’t play for long. Without them, brilliant ideas will never be used. Rights management is crucial for the distribution of the value created. In the way that song-writing credits generate performance fees for artists, licenses for ways-of-working are needed to stimulate innovation, and society needs to enable easy access to legal enforcement to uphold claims against copying without permission.
  2. Massive generalisation follows: Young people are frustrated by older people’s inability to embrace new ways of working. Technology-savvy folks are orders of magnitude more productive than their peers. They are quicker to make decisions and to multi-task. This leads not only to high productivity but also to high error rates. Iterative short-cycle experimentation and learning-by-doing is the hallmark of agile strategy. This is not an approach that has been adapted to high-risk industrial work settings. This leads to a clash of culture and an inability to attract and retain talent.
  3. Innovative individuals will continue to pursue independent careers in increasing numbers. Old industries will die, vested interests will be disenfranchised. The world of work, taxation, social contracts, pensions and access to finance will have to evolve to cope with this. To create a consensus and establish a sense of fairness, new politicians will need not only wisdom but also to deploy the old tools of oratory and persuasion. There will be big disagreements across society and between nations. It will be necessary to create hope for those who fear being disenfranchised. They will not go quietly into that good night.
  4. Politics of property will come to the fore – the control of assets will be important. That may be physical real-estate, where low-paid but important workers are unable to afford to live where the people who need them reside; property in the form of accumulated historical data that provides an unassailable lead and monopoly positions; or the “IDEA” that one person has spent 10 years creating and that is then exploited by a large corporation without reward. Society will need to find ways to address the control and distribution of property in a world where labour and working-time may not function as a distribution & motivation method.

I will spend time exploring these themes during the year – I have a number of initiatives already kicking off and I hope that you’ll be able to help.

Five Digital Vectors

Frameworks for Digitalisation – Part 1

I’ve been working on frameworks that help me describe concepts around Digitalisation in upstream oil and gas. I plan to publish these in several formats but so far I’ve been too busy to do this to my satisfaction – so I’m going to put them out here for comment and then work them up as packaged tools.

This first framework – five digital vectors – is designed to set the context for the strategic intent of a digitalisation initiative. This is important because senior management had better know why they are embarking on a programme of change, what they expect to get from it and where threats to it will come from.

I was recently talking to the CEO of a multinational engineering consultancy based in Norway. To slightly protect his identity, I’ll call him Egil.

Egil: “Gareth, you know [insert Big 4 consultancy here] was just in my office telling me that digitalisation was going to radically alter my business. They said just look what Netflix did for the video store. It must be important or they wouldn’t be here. But I’m busy and, frankly, I don’t get it”.

Communicating strategic intent is important. I am as guilty as anybody of trotting out tired lines about how digitalisation will disrupt industries and then helpfully pointing out that Uber has no cars, Airbnb no property and Amazon no shops. This may be intriguing but it’s no longer precisely true (as all three are busy making strategic bets in traditional assets), and it’s of very little help if you’re in Oil and Gas wondering how this applies to your business.

The Five Digital Vectors framework provides a way to classify the objectives of an initiative, to see how innovation in the area may cause competitive shifts, and to explain where to look in order to measure success. There are five main vectors for digitalisation. They are:

  1. Pure Digital
  2. Digitally Enhanced Products and Services
  3. Digitally Efficient Operations
  4. Digitally Effective Supply Chain
  5. Digital License to Operate

I’ll explain a little about each of these, and then hopefully you’ll get the idea. If you take each in turn you can look for potential disrupters and initiatives and decouple them. Some of these will be more likely to impact your business than others. At least now you can decide which few to concentrate on first.

Vector 1: Pure Digital

Pure Digital strategies work when a product can be codified as information. Think Music, E-books, Films. Once the physical product is removed massive scale economies accrue to storage and distribution. What is called “long-tail” economics kicks in around inventory and specialisation, customisation and choice. In Oil and Gas, we may see some spare parts digitised, emailed and then 3D printed on-site. This will reduce carrying costs and delays. We may also see pure information products trade more freely (such as production forecasting, planning, sub-surface models, training data sets and educated machine-learning algorithms).

Vector 2: Digitally Enhanced Products and Services

Digitally enhanced strategies arise when the fundamental “product” becomes augmented with information. For instance, Uber generates a fair portion of its demand not only on price, but also because it provides information about where the cars are, when they will arrive, the route they take and the price you will pay. They then ease the transaction by collecting payment and supplying receipts. However, all the digitalisation in the world will be useless without the underlying physical product (in this case, a car to take you home). In upstream oil and gas we may see that a supplier of products such as spare parts, services or even crude oil becomes a preferred option when they supply accompanying information before their wares arrive and when they keep you informed while they are in service.

Vector 3: Digitally Efficient Operations

In oil and gas this is the area where I am witnessing most digitalisation activity.

Using information within your own business to reduce waste and increase accuracy is hardly a new idea, but digitalisation changes the game. As more information becomes available – because of better connections, more sensors and accumulated history – so it becomes possible to change the way you do things. Prioritisation, scheduling, just-in-time: these concepts work better when you can access more information and use it sensibly. Today’s engineers entering the workplace can probably not remember a world that didn’t have an iPhone and Google (Google is almost 20 years old). So, they are used to being able to think of a question and get an answer quickly. If you can harness this creative real-time problem-solving ability (by making information available) you can improve your operations.

Vector 4: Digitally Effective Supply Chain

Both vertically and horizontally there is potential to add value through more efficient exchange. The digitally efficient operation strategy will reduce the waste and hence cost within a single company (see Porter on what it will do for price). Supply chain strategies focus on removing friction between companies so inter-company waste will also reduce. This is, in many ways, a move from Digitally Efficient Operations to Digitally Efficient Industry. It is about expanding the focus from the individual company to the collection of companies.

This requires standards, data compatibility and platforms where buyers and sellers can transact. Some suppliers (think about a stationery company) will supply various industries – say automotive and oil and gas. So eventually some standards will need to be cross-industry, whereas others (say for drilling services) won’t be. Though the benefits can be large, there are two main problems: co-ordination of participants; and allocation of cost and benefit.

Vector 5: Digital License to Operate

This is an interesting insight that came to me when I was discussing the apocryphal case of a town inviting bids from contractors to build a pipeline through it. One bidder offered to expose in real time the contents of the pipe, the corrosion status, inspection procedures and compliance, the leaks and seeps and other such. The other company claimed it was confidential. Guess who got the permission to build.

Whether the information was confidential or whether the quality of it and how to access it was suspect, I don’t know. But we see similar exposure of operational data for services such as trains and buses through simple APIs. This data is then “mashed up” by active citizens for public good to help people plan journeys or avoid breakdowns.

In the future, perhaps regulators will require that operational, safety and environmental data is made available to the public in real time; if it isn’t, you won’t be allowed to operate your field. Once that data’s out there you can expect to be held to account for your actions. Welcome to CSR in Industry 4.0.

Summary

The five vectors described here help to provide a primary direction for an initiative. For maximum impact, like all good vector mathematics, the magnitude of value delivered will increase as the directions of the vectors align. This tool helps to focus the mind on the primary vector and provides insights into the effect on the others to enable informed choices to be made.
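
For anyone who wants the vector mathematics behind that analogy, here is a minimal sketch using the law of cosines. The two "initiatives" and their sizes (3 and 4 units of value) are purely illustrative, and the function name is made up.

```python
import math


def resultant_magnitude(a, b, angle_degrees):
    """Length of the sum of two vectors of lengths a and b.

    angle_degrees is the angle between them: 0 means fully aligned,
    180 means pulling in opposite directions.
    """
    theta = math.radians(angle_degrees)
    squared = a * a + b * b + 2 * a * b * math.cos(theta)
    return math.sqrt(max(0.0, squared))  # guard against tiny negative rounding


if __name__ == "__main__":
    for angle in (0, 45, 90, 135, 180):
        combined = resultant_magnitude(3.0, 4.0, angle)
        print(f"angle {angle:>3} degrees: combined magnitude {combined:.2f}")
```

Fully aligned, the two add up to 7. At right angles they give only 5, and pulling in opposite directions they almost cancel out, leaving 1.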

As always, email me direct or leave comments here and I’ll do my best to respond.

Image credit http://www.kimonmatara.com/vector_ops/