Keeping ahead of the data curve

The amount of data that is generated, communicated and manipulated in modern industries is truly staggering. For instance, over a terabyte of data is saved each day in power production alone, and this is quickly increasing as time marches on. Here, George Walker, managing director of industrial automation expert Novotek UK & Ireland, discusses what goes into these multi-terabyte data handling systems.

It’s almost hard to believe that merely 30 years ago, a single terabyte would have filled roughly 700,000 standard 1.44 MB floppy disks. As a species we’ve become quite desensitised to these dizzying figures. A terabyte just isn’t all that much in the modern world; you can even get it on a fingernail-sized SD card.

Aside from the modern internet of on-demand video and conference calls, industrial data systems have been driving this trend upwards for decades. From 1960s punch-card computers to 1980s magnetic tape reels to modern SSDs, industrial data collection has always taken advantage of the latest developments to push the limits.

This trend of increased data collection has been cemented by modern automation and by health and safety standards, which rely on comprehensive data handling to operate robotics and to keep workers safe, respectively.

According to predictions recently published by industry analyst IDC, sectors such as semiconductor manufacturing and automotive manufacturing are producing upwards of two terabytes of data every day — as much as the combined water, energy and gas utilities sector. Both these figures are expected to skyrocket in the next five years, with semiconductor manufacturing expected to quadruple.

Treading data

Many organisations, from a wide range of industries, collect data from their processes on these huge scales. The question is, what do they then do with it?

The simplest, most traditional and most obvious approach is to generate a regular report for plant managers and engineers to examine. For instance, a bottling plant might create a report covering pasteurisation temperatures and conveyor motor speeds, among other parameters. That way, problems can be addressed whenever a reading falls outside its expected range.

While reporting is easy, it has pitfalls. Even with daily or hourly reports, the system cannot predict where or when the next fault will appear, because a report-based approach is inherently reactive. Furthermore, in a plant with potentially tens of thousands of sensors, how do engineers decide what to include in the report and what to leave out? It takes just one oversight or mistake to wreak havoc.
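To make the report-based approach concrete, here is a minimal Python sketch of an exception report. It is an illustration only: the tag names, the limit values and the `build_report` function are all hypothetical and are not taken from any particular plant or product.

```python
from dataclasses import dataclass


@dataclass
class Limit:
    low: float
    high: float


# Hypothetical limits for illustration only; real values depend on the plant.
LIMITS = {
    "pasteuriser_temp_c": Limit(72.0, 75.0),
    "conveyor_motor_rpm": Limit(1400.0, 1500.0),
}


def build_report(readings):
    """Return (tag, value, limit) for every reported reading outside its limits."""
    exceptions = []
    for tag, value in readings.items():
        limit = LIMITS.get(tag)
        if limit is None:
            # Tags nobody chose to include are silently ignored.
            continue
        if not (limit.low <= value <= limit.high):
            exceptions.append((tag, value, limit))
    return exceptions
```

Note that any sensor missing from `LIMITS` never appears in the report, however badly it misbehaves. That is the oversight risk in miniature, and the report still only describes what has already happened.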

What computers are best at

Luckily, industrial data handling software has kept pace with industrial data growth.

Modern SCADA and industrial historian software, such as that supplied by Novotek UK & Ireland, is well equipped to crunch the terabytes of data created by production lines around the world today. Instead of collecting data, generating a regular report and expecting the engineer on the day to parse it, these systems keep a watchful eye on all connected devices.

With integrated machine-learning algorithms, the systems can get a feel for the normal, smooth operation of the plant. If anything deviates from that learned baseline, be that a hot-running motor, an increased pump cycle time or any other aberrant parameter, the system alerts engineers that an issue is developing.
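The idea of learning what normal looks like and flagging departures from it can be sketched in a few lines of Python. This is a deliberately simple statistical stand-in, a rolling mean and standard deviation, and not the algorithm used by any particular SCADA or historian product. The `BaselineMonitor` class and its window, threshold and minimum-sample parameters are assumptions made for illustration.

```python
from collections import deque
from statistics import mean, stdev


class BaselineMonitor:
    """Flags readings that stray too far from a rolling baseline."""

    def __init__(self, window=60, threshold=3.0, min_samples=10):
        self.history = deque(maxlen=window)  # recent normal readings
        self.threshold = threshold           # allowed deviation, in standard deviations
        self.min_samples = min_samples       # readings needed before judging

    def update(self, value):
        """Return True if value looks anomalous; otherwise add it to the baseline."""
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                return True  # anomalous readings are kept out of the baseline
        self.history.append(value)
        return False
```

In use, one monitor would be created per sensor and fed each new reading as it arrives, for example `motor_temp.update(60.2)`, with an alert raised whenever the call returns `True`. A production system would also need to cope with slow drift, seasonality and sensor dropouts, which this sketch ignores.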

This way, problems can be identified and corrected before they start to cause trouble. For instance, a SCADA system monitoring a municipal sewerage plant detected a potentially catastrophic blockage and notified engineers more than twelve hours before it would have occurred. Under a report-based system, the subtle operational effects of the blockage building up might have been missed, or the entire fault could have developed between reports. Either way, the result is at best a more stressful maintenance experience and at worst a horrendous mess.

So, while terabytes might make modern teenagers yawn, that amount of data can quickly become overwhelming when it is flying at you uncompressed, in real time. It’s best to hand the parsing off to those far better equipped for these logic-intensive tasks: computers with industrial data handling software.