
Industry Embraces Big Data

We blogged about Big Data six years ago. Back then, pushing industrial data to the cloud in real time was a novel idea. Collecting industrial data within the plant for on-site use had been going on for decades, but few companies were integrating that data with enterprise IT or analytical systems.

Today, all that is changing. IoT and Industrie 4.0 are ideal for connecting industrial processes to Big Data. Progressive companies routinely use digital transformation to feed analytical systems to improve performance across the enterprise. Others are taking notice, trying to catch up. A recent research project by Automation World points to the growing rate of acceptance and adoption of Big Data among system integrators and end users, and how they leverage it.

Half of the system integrators in the study report that most or all of their clients collect production data to run improvement analysis. A quarter of the end-users surveyed say that they collect data from over 76% of their systems and devices.

While most of the data being collected is for in-plant improvements in equipment and maintenance operations, somewhere between 40% and 54% is also being used for Industry 4.0, smart manufacturing, or digital transformation initiatives. Pulling Big Data from the shop floor has become that important in just a few years’ time.

Data collection technologies

Despite the move towards Big Data, the most widely used approaches to collecting data are still hand-written notes entered into a spreadsheet and on-site data historians, according to the report. For many users, then, the technology hasn’t changed significantly since the 1980s. However, cloud and edge technologies are gaining acceptance, and are in use at some level in about one quarter of the facilities covered in the report.

The survey didn’t specifically address it, but we see that some technologies originally developed for in-plant use—most notably data historians—are now widely used in edge and cloud scenarios. Some of the most well-known real-time data historians have cloud equivalents, or can be run on cloud servers. As a result, there is no clear line between traditional data collection and IoT-based systems, and there doesn’t need to be.

What is needed is secure, real-time data communication between the plant and the office or cloud. As high-quality data communication is more widely adopted, and as companies implement digital transformation in more areas, we can expect to see a huge growth in Big Data applications to optimize resource use, increase production efficiencies, and bolster the profits of the enterprise.

IoT for All

With each passing year the IoT (Internet of Things) becomes more familiar, more of a household word. What once seemed a futuristic dream, having billions of devices connected and chattering over the Internet, is now almost taken for granted. A case in point is the IoT For All website, whose very name speaks volumes. It seems that everyone is using, or is at least touched by, the IoT in one way or another.

At the beginning of the year, IoT For All published an article, Where Is IoT Headed in 2019?, that collects and distills the thoughts of industry experts regarding the near future of the IoT. Although the article is not specific to Industrial IoT, it contains significant discussion of several themes that are of interest to us here at Skkynet:

Secure by Design

Several experts have predicted that the rapid development of the IoT, with little attention being paid to security, will lead to widespread attacks in the coming year, often directed at industrial and infrastructure targets. At the same time, they lament the lack of robust security solutions built into hardware, software, and services. James Goepel, CEO and General Counsel for Fathom Cyber, mentioned new regulations in California that mandate a secure-by-design approach to the IoT. “I think we’re going to see many more states, and possibly the federal government, following California’s lead and creating legislation that imposes new cybersecurity-by-design requirements on IoT manufacturers,” he said. Skkynet’s customers will be ready, as they have been employing our secure-by-design approach to the IoT for years.

Edge and Hybrid Computing

This year “will be a defining year for edge and hybrid computing strategies as IoT and the global network of sensors pile on more data than the average cloud has had to handle in the past,” according to Alan Conboy, working in the Office of the CTO at Scale Computing. “This transition will officially crown edge computing as the next big thing.” This has certainly been our experience. As interest in edge computing grows, we are seeing a corresponding demand for Skkynet’s edge computing and hybrid cloud solutions.

Remote Access

“Experienced engineers are hard to find and those they do have can only visit so many remote sites in a year. Enabled by 5G and the speed with which data can travel through the air, AR (augmented reality) will enable engineers-in-training to be able to have instant intelligence about a device on which they may be working just by pointing their tablet towards it,” said Jeff Travers, Head of IoT Connectivity Management at Ericsson. Much of this remote connectivity will depend on secure, real-time, two-way data flow. Again, Skkynet’s unique approach to Industrial IoT solves problems that many managers and executives are only now beginning to realize exist.

In short, the future continues to brighten for IoT in general, and Industrial IoT in particular. At least part of our mission is to make the move to IoT as smooth and easy as possible. We want it to become the logical choice for anyone who considers it—so that it really does become IoT for all.

What is Edge Processing anyway?

Part 12 of Data Communication for Industrial IoT

Edge processing refers to the execution of aggregation, data manipulation, bandwidth reduction and other logic directly on an IoT sensor or device.  The idea is to put basic computation as close as possible to the physical system, making the IoT device as “smart” as possible.

Is this a way to take advantage of all of the spare computing power in the IoT device?  Partially.  The more work the device can do to prepare the data for the cloud, the less work the cloud needs to do.  The device can convert its information into the natural format for the cloud server, and can implement the proper communication protocols.  There is more, though.

Data Filter

Edge processing means not having to send everything to the cloud.  An IoT device can deal with some activities itself.  It can’t rely on a cloud server to implement a control algorithm that would need to survive an Internet connection failure.  Consequently, it should not need to send to the cloud all of the raw data feeding that algorithm.

Let’s take a slightly contrived example.  Do you need to be able to see the current draw of the compressor in your smart refrigerator on your cell phone?  Probably not.  You might want to know whether the compressor is running constantly – that would likely indicate that you left the door ajar.  But really, you don’t even need to know that.  Your refrigerator should recognize that the compressor is running constantly, and it should decide on its own that the door is ajar.  You only need to know that final piece of information, the door is ajar, which is two steps removed from the raw input that produces it.
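As a rough illustration, here is a minimal sketch of that kind of on-device reasoning in Python. The sensor driver, the notification call, and the thresholds are all hypothetical; the point is that only the conclusion ever leaves the device.

    import time

    COMPRESSOR_ON_AMPS = 0.5       # assumed threshold: compressor is drawing current
    DOOR_AJAR_SECONDS = 30 * 60    # assumed rule: 30 minutes of continuous running

    def read_compressor_current():
        # Stand-in for a real sensor driver; here it just simulates a stuck-on compressor.
        return 0.8

    def notify_cloud(event):
        # Stand-in for a real cloud call; only this tiny event ever leaves the device.
        print("sending to cloud:", event)

    run_started = None
    reported = False
    while True:
        amps = read_compressor_current()           # raw data stays on the device
        if amps > COMPRESSOR_ON_AMPS:
            run_started = run_started or time.time()
            if not reported and time.time() - run_started > DOOR_AJAR_SECONDS:
                notify_cloud({"door_ajar": True})  # two steps removed from the raw current
                reported = True
        else:
            run_started = None
            reported = False
        time.sleep(5)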

Privacy

This has privacy and information security implications.  If you don’t send the information to the Internet, you don’t expose it.  The more processing you can do on the device, the less you need to transmit on the Internet.  That may not be a big distinction for a refrigerator, but it matters a lot when the device is a cell tower, a municipal water pumping station or an industrial process.

Bandwidth

Edge processing also has network bandwidth implications. If the device can perform some of the heavy lifting before it transmits its information, it has the opportunity to reduce the amount of data it produces. That may be something simple, like applying a deadband to a value coming from an A/D converter, or something complex, like performing motion detection on an image. In the case of the deadband, the device reduces bandwidth simply by not transmitting every little jitter from the A/D converter. In the case of motion detection, the device can avoid sending the raw images to the cloud and instead send only an indication of whether motion was detected. Instead of requiring a broadband connection, the device could use a cellular connection and never come close to its monthly data quota.
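To make the deadband case concrete, here is a minimal sketch in Python, assuming a made-up deadband of 0.5 units. Only readings that move more than the deadband from the last transmitted value would be sent over the network.

    DEADBAND = 0.5   # assumed: ignore changes smaller than this

    def apply_deadband(readings, deadband=DEADBAND):
        """Yield only readings that move more than `deadband` from the last sent value."""
        last_sent = None
        for value in readings:
            if last_sent is None or abs(value - last_sent) > deadband:
                last_sent = value
                yield value        # this is all that goes over the network

    # Simulated A/D converter output: mostly jitter around 20.0, with one real change.
    raw = [20.0, 20.1, 19.9, 20.2, 20.1, 23.7, 23.8, 23.6]
    print(list(apply_deadband(raw)))   # -> [20.0, 23.7]

Eight raw samples become two transmitted values; at industrial scan rates the savings add up quickly.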

Data Protocol

There is just one thing to watch for. In our motion detection example, the device probably wants to send one image frame to the cloud when it detects motion. That cannot be represented as a simple number. Generally, the protocol being used to talk to the cloud server needs to be rich enough to accept the processed data the device wants to produce. That rules out most industrial protocols like Modbus, but fits most REST-based protocols as well as higher-level protocols like OPC UA and MQTT.
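As one illustration of a protocol that is rich enough, here is a minimal sketch using the Eclipse Paho MQTT client for Python. The broker address, topic names, and the get_camera_frame() helper are hypothetical, and the code assumes the paho-mqtt 1.x client API.

    import json
    import paho.mqtt.client as mqtt   # assumes the paho-mqtt 1.x client API

    def get_camera_frame():
        # Hypothetical helper: returns the JPEG bytes of the frame that triggered detection.
        return b"...jpeg bytes..."

    client = mqtt.Client()
    client.connect("broker.example.com", 1883)   # hypothetical broker

    # A small, structured event instead of a raw video stream...
    client.publish("plant/camera1/motion", json.dumps({"motion": True}), qos=1)
    # ...plus the single frame that triggered it, as a binary payload on its own topic.
    client.publish("plant/camera1/frame", payload=get_camera_frame(), qos=1)

    client.disconnect()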


Where does Blockchain fit into the IIoT?

Part 11 of Data Communication for Industrial IoT

Nothing I’ve read suggests that blockchain will replace SSL for IoT security. Blockchains are "distributed ledgers" that are regarded as tamper-proof (though in practice they can be tampered with by anyone who controls enough of the computing power validating the transactions). This design works fine for certain Internet applications like bitcoin, but I don’t see the blockchain fitting well into the IIoT.

Size matters

First of all, since there is no central ledger, all participating devices must contain, or have access to, the entire ledger. No entry can ever be removed from the ledger. As the number of devices grows, and with it the number of transactions the ledger contains, the size of the ledger grows geometrically. The size of the bitcoin blockchain is roughly doubling every year and is currently over 60 GB. For an IoT node to fully trust the blockchain, it would need a geometrically growing amount of storage. That’s obviously not possible.
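To put that growth in perspective, here is a quick back-of-envelope projection of the 60 GB figure, assuming it really does keep doubling every year:

    size_gb = 60
    for year in range(1, 6):
        size_gb *= 2
        print(f"after year {year}: {size_gb} GB")
    # after five doublings: 1920 GB, roughly 2 TB, far beyond what a small IoT node can store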

So, individual devices can prune the blockchain and store only the last few minutes or seconds of it, hoping that nearby peer devices will provide independent confirmation that their little piece of the blockchain is cryptographically secure. That opens a possible line of attack on the device, where nearby devices could lie and together produce a satisfactory probability of truth in the "mind" of the target device.

Thus security is based on the availability of massive storage, and attempts to reduce that storage requirement diminish security.  As far as I can tell this is an unsolved problem right now.

Too much connectivity?

The second problem with blockchains is that they assume that every transaction in the system must be transmitted to every participant in the blockchain. Yes, when somebody’s fridge turns on in Paris, every one of the billions of devices participating in the blockchain must be told. If they are not, then their local copy of the blockchain is inconsistent and they cannot trust the next transaction, which they might actually be interested in. As the number of devices and transactions rises, the amount of worldwide network bandwidth required to maintain the integrity of the blockchain grows geometrically. One article I read says that on a 10 Mbit Internet connection, the theoretical maximum number of transactions the entire bitcoin universe could sustain, if every participant had to keep up over such a connection, would be 7 per second. Seven.

The result of these two limitations is that a blockchain probably cannot be used to carry the actual data that the devices produce. Instead, it is more likely to be used as an authentication mechanism. That is, a device that is legitimately on the blockchain can be verified as being what it claims to be, based on something that the blockchain knows. My personal opinion is that it sounds very much like the blockchain would become a distributed certificate authority. Instead of having the current SSL "chain of trust" of certificates, you would have a "blockchain of trust". But since an individual device could not contain the entire blockchain, you would still need a server to provide the equivalent of certificate validation, and there’s your point of attack.

Some examples of IoT devices using blockchains, like a washing machine that buys detergent using bitcoins, rely on misdirection in claiming to use blockchains. Yes, they use blockchains in their bitcoin transactions, because that’s how bitcoin works, but the maintenance data they produce (the real point of the blockchains-for-IoT conversation) is not being transmitted via blockchain at all.

I have yet to see a practical application of blockchains to IoT data or even to IoT authentication.  The conversation at the moment is in the realm of “it would be nice” but the solutions to the implementation problems are not clear.  Incidentally the same problems exist for bitcoin and there are no clear solutions in that space either.


Is REST the Answer for IIoT?

Part 10 of Data Communication for Industrial IoT

As we’ve stated previously, the IIoT is imagined as a client-server architecture where the “things” can be smart devices with embedded micro-controllers.  The devices generate data based on sensors, and send that data to a server that is usually elsewhere on the Internet.  Similarly, a device can be controlled by retrieving data from the server and acting upon it, say to turn on an air conditioner.

The mechanism typically used by devices to communicate with servers over the Internet is REST (Representational State Transfer) over HTTP.  Every communication between the device and server occurs as a distinct HTTP request.  When the device wants to send data to the server it makes an HTTP POST call.  When it wants to get data (like a new thermostat setting) it makes an HTTP GET call.  Each HTTP call opens a distinct socket, performs the transaction, and then closes the socket.  The protocol is said to be “connectionless”.  Every transaction includes all of the socket set-up time and communication overhead.  Since there is no connection, all transactions must take the form of “request/response”, where the device sends a request to the server and collects the response.  The server generally does not initiate a transaction with the device, as that would expose the device to attack from the Internet.
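For concreteness, here is a minimal sketch of that pattern in Python using the requests library. The server URL, device ID, and JSON fields are all hypothetical.

    import time
    import requests

    SERVER = "https://iot.example.com/api"   # hypothetical cloud endpoint

    while True:
        # Send the latest reading: one complete HTTP transaction, headers and all.
        requests.post(SERVER + "/device/42/temperature", json={"value": 21.7})

        # Poll for a new setpoint: another complete transaction, even if nothing has changed.
        resp = requests.get(SERVER + "/device/42/setpoint")
        setpoint = resp.json().get("value")

        time.sleep(10)   # the polling interval sets the worst-case latency for new settings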

HTTP does define a keep-alive connection, where several HTTP transactions are sent on a single socket.  This definitely reduces the amount of time spent creating and destroying TCP connections, but does not change the basic request/response behaviour of the HTTP protocol.  Scalability issues and trade-offs between latency and bandwidth still overwhelm any benefit gained from a keep-alive connection.

One of the identifying features of the IIoT is the data volume.  Even a simple industrial system contains thousands of data points.  REST APIs might be fine for a toaster, but at industrial scale they run into problems:

Bandwidth

REST messages typically pay the cost of socket setup on every message or group of messages.  Then they send HTTP headers before transmitting the data payload.  Finally, they demand a response, which contains at least a few required headers.  Writing a simple number requires hundreds of bytes to be transmitted across multiple IP packets.

Latency

Latency measures the amount of time that passes between an event occurring and the user receiving notification.  In a REST system, the latency is the sum of:

  • The client’s polling rate
  • Socket set-up time
  • Network transmission latency to send the request
  • Transmission overhead for HTTP headers
  • Transmission time for the request body
  • Network transmission latency to send the response
  • Socket take-down time

By comparison, for an efficient persistent connection the latency is the sum of:

  • Network transmission latency to send the request
  • Transmission time for the request body
  • Network transmission time for an optional response body

The largest sources of latency in a REST system (polling rate, socket set-up, response delivery) are all eliminated with the persistent connection model.  This allows it to achieve transmission latencies that are mere microseconds above network latencies.
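As a contrast to the REST polling sketch above, here is a minimal event-driven sketch using the Python websockets library over a single persistent connection. The URL and message format are hypothetical; the point is that there is no polling and no per-message socket set-up.

    import asyncio
    import json
    import websockets   # third-party package: pip install websockets

    async def main():
        # One long-lived connection; no per-message socket set-up or tear-down.
        async with websockets.connect("wss://iot.example.com/stream") as ws:
            await ws.send(json.dumps({"subscribe": "device/42/setpoint"}))
            async for message in ws:
                # Updates arrive as events, the moment the server has them.
                print("new value:", json.loads(message))

    asyncio.run(main())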

REST’s latency problems become even clearer in systems where two devices are communicating with one another through an IoT server.  Low-latency, event-driven systems can achieve practical data rates hundreds or thousands of times faster than REST.  REST was never designed for the kind of data transmission the IIoT requires.

Scalability

One of the factors in scalability is the rate at which a server responds to transactions from the device.  In a REST system a device must constantly poll the server to retrieve new data.  If the device polls the server quickly then it causes many transactions to occur, most of which produce no new information.  If the device polls the server slowly, it may miss important data or will experience a significant time lag (latency) due to the polling rate.  As the number of devices increases, the server quickly becomes overloaded and the system must make a choice between the number of devices and the latency of transmission.
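A rough back-of-envelope calculation shows the trade-off. The fleet size, polling interval, and change rate below are made-up numbers, but the pattern holds generally:

    devices = 10_000              # assumed fleet size
    poll_interval_s = 5           # assumed polling interval
    changes_per_device_s = 0.01   # assumed: a real change roughly every 100 seconds

    requests_per_s = devices / poll_interval_s
    useful_fraction = changes_per_device_s * poll_interval_s
    avg_added_latency_s = poll_interval_s / 2

    print(f"{requests_per_s:.0f} requests/s hit the server")               # 2000
    print(f"only {useful_fraction:.0%} of polls carry new data")           # 5%
    print(f"each change still waits ~{avg_added_latency_s} s on average")  # 2.5 s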

All systems have a maximum load.  The question is, how quickly does the system approach this maximum and what happens when the maximum is reached?  We have all seen suddenly popular websites become inaccessible due to overloading.  While those systems experienced unexpectedly high transaction volumes, a REST system in an IIoT setting will be exposed to that situation in the course of normal operation.  Web systems suffer in exactly the scenarios where the IIoT is likely to be useful.  Event-driven systems scale much more gradually, as adding clients does not necessarily add significant resource cost.  For example, we have been able to push REST systems to about 3,000 transactions per minute.  We have pushed event-driven systems to over 5,000,000 transactions per minute on the same hardware.

Symmetry

REST APIs generally assume that the data flow will be asymmetrical.  That is, the device will send a lot of data to the server, but retrieve data from the server infrequently.  In order to maintain reasonable efficiency, the device will typically transmit frequently, but poll the server infrequently.  This causes additional latency, as discussed earlier.  In some systems this might be a reasonable sacrifice, but in IIoT systems it usually is not.

For example, a good IIoT server should be capable of accepting, say, 10,000 data points per second from an industrial process and retransmitting that entire data set to another industrial process, simulator, or analytics system without introducing serious alterations to the data timing.  To do that, the server must be capable of transmitting data just as quickly as it receives it.  A good way to achieve this is through establishing persistent, bidirectional connections to the server.  This way, if the device or another client needs to receive 10,000 data changes per second the communication mechanism will support it.

Robustness

Industrial applications are often mission-critical to their owners.  This is one of the big issues holding back the IIoT.  What happens if the Internet connection goes down?

In the typical IoT scenario, the device is making REST calls to a server running in the cloud.  In some ways this is a by-product of the cloud vendor’s business model, and in some ways it is due to the REST implementation.  A REST server is typically a web server with custom URL handlers, tightly coupled to a proprietary server-side application and database.  If the Internet connection is lost, the device is cut off, even if the device is inside an industrial plant providing data to a local control system via the cloud.  If the cloud server is being used to issue controls to the device, then control becomes impossible, even locally.  This could be characterized as “catastrophic degradation” when the Internet connection is lost.

Ideally, the device should be able to make a connection to a local computer inside the plant, integrate directly with a control system using standard protocols like OPC and DDE, and also transmit its data to the cloud.  If the Internet connection is lost, the local network connection to the control system is still available.  The device is not completely cut off, and control can continue.  This is a “graceful degradation” when the Internet connection is lost.
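A minimal sketch of that failover idea might look like the following. The endpoints and payload are hypothetical, and a real device would more likely speak a native local protocol such as OPC to the control system rather than HTTP; the point is simply that the local path does not depend on the Internet.

    import requests

    LOCAL_ENDPOINT = "http://192.168.1.10:8080/data"     # in-plant control system (hypothetical)
    CLOUD_ENDPOINT = "https://iot.example.com/api/data"  # cloud service (hypothetical)

    def publish(reading):
        # Always feed the local control system first; it must keep working on its own.
        requests.post(LOCAL_ENDPOINT, json=reading, timeout=2)

        # Then try the cloud; losing the Internet degrades the system gracefully
        # instead of cutting the device off from local control.
        try:
            requests.post(CLOUD_ENDPOINT, json=reading, timeout=2)
        except requests.exceptions.RequestException:
            pass   # e.g. buffer locally and forward when the connection returns

    publish({"tag": "pump1.flow", "value": 12.4})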

In conclusion, REST systems work reasonably well in low-speed transactional systems.  However, they have a number of disadvantages when applied to high speed, low latency systems, and to systems where data transfer from the server to the device is frequent.  Industrial IoT systems are characterized by exactly these requirements, making REST an inappropriate communication model for IIoT.


Cisco Study Shows Most IoT Projects Unsuccessful

One of the big take-aways from the annual Internet of Things World Forum (IoTWF) held in London last week was the result of a new Cisco study showing that only about one third of IoT projects were considered completely successful, technically.  Financially, the success rate was even worse, just 15%, according to the business executives surveyed.  The study was conducted among over 1,300 executives in medium and large size companies in the manufacturing, energy, health care, transportation, and similar sectors.  The findings suggest several reasons for low IoT project completion rates and, more importantly, point to specific remedies.

Unexpected Difficulties

As we have seen in the past, one of the primary reasons for project failure or lackluster results for IoT projects has been that those initiating the project were not aware at the outset how difficult implementation would be.  This is illustrated in the Cisco study results, where cost overruns and the need to extend timelines to completion were common.  Many respondents noted that they lacked the necessary internal IoT expertise.  As a result, over half of the IoT initiatives didn’t make it past the Proof of Concept phase, and of those that did, many ended up with poor IoT integration and/or low quality of data.

Need for Partnerships

These results underlined, according to the majority of survey respondents, the need for IoT partnerships.  At every stage of the project, from planning and design, through implementation and deployment, and during the management and maintenance phases, those organizations that engaged with IoT partners were more successful.  This applied to general areas of technical consulting and support, as well as specific aspects such as data analytics.

Commenting on this kind of relationship, the final report stated: “Our study found that the most successful organizations engage the IoT partner ecosystem at every stage, implying that strong partnerships throughout the process can smooth out the learning curve.”

Learning from Failure

The good news in all of this is that companies are willing and able to learn from mistakes.  Most survey respondents are optimistic for the future of the IoT, and they see its potential.  Over sixty percent believe that they “have barely begun to scratch the surface of what IoT technologies can do for their businesses.”

Among the participants who have completed projects, most said that they are using data from the IoT to improve their business.  Two out of three of them have seen the greatest benefits in improved customer satisfaction, more efficient operations, and better quality of products and/or services.  The most unexpected benefit was improved profitability for the company.

These results corroborate our experience.  The companies that we partner with report a much higher success rate than most of those participating in the Cisco study.  We agree with the finding that “strong partnerships throughout the process can smooth out the learning curve,” and we take seriously the challenge of removing the difficulties that may crop up when embarking on an IoT project.