
Hybrid Cloud

Part 15 of Data Communication for Industrial IoT

A hybrid cloud is a combination of in-house and cloud functionality in the same system.  For industrial applications, this usually means the use of a public cloud for remote access, and a locally managed, private “cloud” for interaction among applications within the plant.

A public cloud is a service like SkkyHub, Azure or GE Predix, hosted by a separate company and available to the general public.  It provides a point of contact for a remote user to access the data collected from the industrial system.  Depending on the cloud service, it might also provide redistribution, storage, analysis, alerts and other processing of the data.

I put private “cloud” in quotes because the local server does not need to be a cloud service in the sense of on-demand resource balancing on a distributed collection of virtual machines.  Usually it is an aggregation and distribution point for data from the local network that allows various plant subsystems to communicate with one another.  It may fulfill the same communication role as a cloud server without the implied implementation of a cloud service.

A private cloud is run by the company that uses it, often on company hardware at a company location.  It might, for example, run inside the control system network, in a DMZ between different plant locations, or between the management offices and the shop floor.  In terms of security and reliability, one key difference is that data communications for the private cloud typically run across the company network, while for the public cloud communications go across the Internet.

A Very Good Approach

There are several reasons why a hybrid cloud might be the best approach to data communications for the Industrial IoT.  First, a good hybrid cloud system can provide separate data paths for in-house applications and publicly available services.  Implemented correctly, this can mean exposing lower-level data to plant engineers and managers, while providing access to higher-level data like accumulated statistics for the IT department, or aggregated supply levels to raw materials vendors.  This kind of data separability, where everyone gets just the data they need, is a hallmark of a secure system.

In a similar way, running separate servers for the in-plant and public clouds keeps the critical data paths on the LAN, where higher data rates and greater bandwidth are typically available.  It also means that if the Internet connection goes down, the plant can continue to function unimpeded.  To achieve this, the design would need to ensure that none of the primary control necessary to run the plant is in the public cloud.  And if the system is designed correctly, when the connection comes back the data should continue to flow to remote users via the public cloud, using the latest values.

Despite these advantages, there are a few practical considerations.  Costs can be somewhat higher, as you are effectively running two systems.  That may not be a huge consideration since you probably need the communication infrastructure within your plant in any case.  The requirement then becomes ensuring that your in-plant communication software can interoperate with your choice of public cloud provider.  A software or hardware protocol gateway may be all you need to act as a private “cloud” server.

Avoid Cloud Dependencies

Some cloud service companies may not encourage a hybrid cloud solution just due to their business model.  For example, consider a typical REST-based cloud service.  These services offer the ability to transmit data via HTTP directly from a sensor or embedded device to the cloud server.  However, if that sensor needs to be incorporated into a larger control system, sending the data to a cloud service and then back to the control system is both impractical and fragile.  Yet the cloud provider will not let you run a copy of their server software in your own network, as that would eliminate their service revenue.

This brings up an important requirement when selecting an IIoT cloud provider – a good IIoT device should be equally comfortable connecting to a local system or a cloud system.  The cloud service provider should formally support local receipt of data using the same protocol they use in the public cloud, through cloud-compatible software or hardware, so that the device can transmit either to the local system or the public cloud with just a configuration change.  Local cloud-compatible servers should support the protocols that the plant uses, like OPC, Modbus, etc.
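As a minimal sketch of what that configuration change might look like, assuming an MQTT-style transport and the Python paho-mqtt library (the hostnames, ports and topic names here are hypothetical), the choice of destination reduces to a single configuration entry:

```python
# Sketch only: the same publishing code serves the local "cloud" or the
# public cloud; nothing changes but the configuration entry selected.
# Hostnames, ports and topic names are hypothetical examples.
import json
import paho.mqtt.publish as publish

CONFIG = {
    "local": {"host": "datahub.plant.local", "port": 1883},
    "cloud": {"host": "broker.example-cloud.com", "port": 1883},
}

def publish_reading(target: str, point: str, value: float) -> None:
    """Send one data point to whichever server the configuration selects."""
    server = CONFIG[target]
    publish.single(
        f"plant/{point}",
        payload=json.dumps({"value": value}),
        hostname=server["host"],
        port=server["port"],
    )

publish_reading("local", "compressor/current", 12.7)  # in-plant path
publish_reading("cloud", "compressor/current", 12.7)  # public path
```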

When a hybrid cloud system is implemented correctly it can eliminate dependencies on the Internet connection, yet still provide all of the remote accessibility benefits of the cloud.  It can make it possible to re-purpose devices between the cloud and the local system trivially, and provide for graceful degradation in the event of a network failure by maintaining local communication and control even though remote visibility has been lost.  When the purpose of the system is process control, you don’t want to lose control of your process.

The cloud is good, except when it’s not.

Part 14 of Data Communication for Industrial IoT

Cloud computing can be quite useful in industrial systems for gathering data and, in some application spaces, doing supervisory control.  “Big data” services help managers and engineers to locate inefficiencies, coordinate predictive maintenance, and boost productivity.  The cloud model of software as a service (SaaS) offers a convenient way to add new functionality to existing systems, and it shifts costs from capital to operating expenses.

Despite the advantages of cloud systems, system integrators and key decision-makers in industrial facilities are reluctant to adopt them.  Some of the reasons for this might include:

  • License enforcement — “Will this cloud-based system be used to ensure software license compliance, in the same way my kids need an internet connection to play a single-player computer game?”
  • Vendor lock-in — “If all the processing power of the system is in the cloud service, how can I switch services?”
  • No edge processing — “There are too many cloud services that are basically just Internet-accessible databases. That’s not flexible enough for me.”
  • Security — “Once my data leaves my plant, is it safe from prying eyes?  And if I connect my plant to the cloud, will my plant be open to attack?”
  • Loss of connectivity — “If my Internet connection goes down, will I lose my ability to control my plant?”

So should we avoid cloud services altogether?  No.  They provide capability and efficiency you can’t get any other way.  In addition to data-gathering, cloud services can be used to support remote connectivity over the Internet.

Cloud as Intermediary

If we link an operation center in one city to a production system in another, there must be a network.  If we make a direct connection, then one or the other must accept an inbound connection from the Internet.  Using a cloud system as an intermediary means that neither the operation center nor the production system needs to open its firewall, thereby improving security by moving the point of attack outside either system.
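A rough sketch of the pattern, again assuming MQTT as the transport and a hypothetical broker address: both ends dial out, and neither accepts an inbound connection.

```python
# Both ends make OUTBOUND connections to the intermediary, so neither
# the plant nor the operation center opens a firewall port.
# The broker hostname and topic are hypothetical.
import paho.mqtt.publish as publish
import paho.mqtt.subscribe as subscribe

def send_status(value: float) -> None:
    """Production system side: outbound publish only."""
    publish.single("site/line1/temperature", payload=str(value),
                   hostname="broker.example-cloud.com")

def on_message(client, userdata, message):
    print(f"{message.topic}: {message.payload.decode()}")

def watch_status() -> None:
    """Operation center side: outbound subscribe only (blocks)."""
    subscribe.callback(on_message, "site/line1/temperature",
                       hostname="broker.example-cloud.com")
```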

Limited Data Sets

Should IIoT devices send all of their data to the cloud?  No, it’s usually not necessary.  Only the data needed for remote monitoring and control has to be exposed.  Device information is not monolithic – you should be able to pick and choose what the cloud has access to.
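As a simple illustration (the point names are invented), the selection can be as small as a whitelist applied before anything leaves the plant:

```python
# Sketch: forward only an explicit whitelist of points to the cloud.
CLOUD_POINTS = {"line1/throughput", "line1/alarm_state"}

def cloud_view(all_points: dict) -> dict:
    """Return only the data the cloud is allowed to see."""
    return {name: value for name, value in all_points.items()
            if name in CLOUD_POINTS}

readings = {
    "line1/throughput": 420.0,
    "line1/alarm_state": "OK",
    "line1/pid_gain": 0.8,     # never leaves the plant
    "line1/valve_raw": 3172,   # never leaves the plant
}
print(cloud_view(readings))   # {'line1/throughput': 420.0, 'line1/alarm_state': 'OK'}
```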

Support for Local Capability

But what happens when the cloud is not available?  What happens if the cloud provider goes out of business (think Google/Nest Revolv)?  The system should degrade in such a way that essential functions remain available.  The goal should be to support fundamental local capability, enhanced with cloud services.  We should still be able to use our devices when the Internet is not available.

Like most things in life, the cloud has its strong points and its weaknesses.  The most successful implementations will take full advantage of the strong points, and design around the weaknesses.  For industrial applications, that means keeping remote devices and in-plant systems behind closed firewalls and protecting them from any network slowdowns or outages.  This can be accomplished through the edge and fog processing mentioned previously, and/or by implementing a hybrid cloud, which we will discuss next.

What is Edge Processing anyway?

Part 12 of Data Communication for Industrial IoT

Edge processing refers to the execution of aggregation, data manipulation, bandwidth reduction and other logic directly on an IoT sensor or device.  The idea is to put basic computation as close as possible to the physical system, making the IoT device as “smart” as possible.

Is this a way to take advantage of all of the spare computing power in the IoT device?  Partially.  The more work the device can do to prepare the data for the cloud, the less work the cloud needs to do.  The device can convert its information into the natural format for the cloud server, and can implement the proper communication protocols.  There is more, though.

Data Filter

Edge processing means not having to send everything to the cloud.  An IoT device can deal with some activities itself.  It can’t rely on a cloud server to implement a control algorithm that would need to survive an Internet connection failure.  Consequently, it should not need to send to the cloud all of the raw data feeding that algorithm.

Let’s take a slightly contrived example.  Do you need to be able to see the current draw of the compressor in your smart refrigerator on your cell phone?  Probably not.  You might want to know whether the compressor is running constantly – that would likely indicate that you left the door ajar.  But really, you don’t even need to know that.  Your refrigerator should recognize that the compressor is running constantly, and it should decide on its own that the door is ajar.  You only need to know that final piece of information, the door is ajar, which is two steps removed from the raw input that produces it.

Privacy

This has privacy and information security implications.  If you don’t send the information to the Internet, you don’t expose it.  The more processing you can do on the device, the less you need to transmit on the Internet.  That may not be a big distinction for a refrigerator, but it matters a lot when the device is a cell tower, a municipal water pumping station or an industrial process.

Bandwidth

Edge processing also has network bandwidth implications.  If the device can perform some of the heavy lifting before it transmits its information, it has the opportunity to reduce the amount of data it produces.  That may be something simple, like applying a deadband to a value coming from an A/D converter, or something complex, like performing motion detection on an image.  In the case of the deadband, the device reduces bandwidth simply by not transmitting every little jitter from the A/D converter.  In the case of the motion detection, the device can avoid sending the raw images to the cloud and instead just send an indication of whether motion was detected.  Instead of requiring a broadband connection, the device could use a cellular connection and never get close to its monthly data quota.
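For the deadband case, the logic is only a few lines.  Here is a minimal sketch in Python (the threshold and sample values are arbitrary):

```python
# Minimal deadband filter: suppress A/D jitter smaller than the band.
class Deadband:
    def __init__(self, band: float):
        self.band = band
        self.last_sent = None

    def filter(self, value: float):
        """Return the value if it moved beyond the deadband, else None."""
        if self.last_sent is None or abs(value - self.last_sent) > self.band:
            self.last_sent = value
            return value
        return None  # change too small -- not worth transmitting

db = Deadband(band=0.5)
samples = [20.01, 20.02, 19.98, 20.70, 20.72, 21.30]
sent = [v for v in samples if db.filter(v) is not None]
print(sent)  # [20.01, 20.7, 21.3] -- the jitter never leaves the device
```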

Data Protocol

There is just one thing to watch for.  In our example of the motion detection, the device probably wants to send one image frame to the cloud when it detects motion.  That cannot be represented as a simple number.  Generally, the protocol being used to talk to the cloud server needs to be rich enough to accept the processed data the device wants to produce.  That rules out most industrial protocols like Modbus, but fits most REST-based protocols as well as higher-level protocols like OPC UA and MQTT.
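For instance, here is a hedged sketch of the motion-detection case using MQTT via paho-mqtt (the broker, topic, and file name are hypothetical).  The payload is raw bytes, so a JPEG frame needs no special treatment:

```python
# Sketch: when motion is detected, publish one JPEG frame to the cloud.
# MQTT payloads are byte strings, so an image fits the protocol directly;
# a register-based protocol like Modbus has no natural way to carry it.
import paho.mqtt.publish as publish

def report_motion(frame_jpeg: bytes) -> None:
    publish.single("camera/door3/motion_frame",
                   payload=frame_jpeg, qos=1,
                   hostname="broker.example-cloud.com")

with open("frame.jpg", "rb") as f:   # assumes a captured frame on disk
    report_motion(f.read())
```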

Where does Blockchain fit into the IIoT?

Part 11 of Data Communication for Industrial IoT

Nothing I’ve read suggests that blockchain will replace SSL for IoT security.  Blockchains are “distributed ledgers” designed to be tamper-proof (though in practice they can be tampered with if you control enough of the computing power validating the transactions).  This design works fine for certain Internet applications like bitcoin, but I don’t see the blockchain fitting well into the IIoT.

Size matters

First of all, since there is no central ledger, all participating devices must contain, or have access to, the entire ledger.  No entry can ever be removed from the ledger.  As the number of devices grows, and the number of transactions the ledger contains grows, its size grows geometrically.  The size of the bitcoin blockchain is roughly doubling every year and currently stands at over 60GB.  For an IoT node to fully trust the blockchain it would need a geometrically growing amount of storage.  That’s obviously not possible.
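Taking the doubling-per-year figure above at face value, a quick back-of-the-envelope calculation shows how fast this gets out of hand:

```python
# Back-of-the-envelope, using the figures cited above: a ~60 GB ledger
# that roughly doubles every year grows as 60 * 2**years.
for years in range(6):
    print(f"after {years} years: {60 * 2**years} GB")
# After 5 years, a device that fully trusts the chain would need
# roughly 1,920 GB (~2 TB) of storage -- and it keeps doubling.
```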

So, individual devices can prune the blockchain and store only the last few minutes or seconds of it, hoping that nearby peer devices will provide independent confirmation that their little piece of the blockchain is cryptographically secure.  That produces a possible line of attack on the device, where nearby devices could lie and produce a satisfactory probability of truth in the “mind” of the target device.

Thus security is based on the availability of massive storage, and attempts to reduce that storage requirement diminish security.  As far as I can tell this is an unsolved problem right now.

Too much connectivity?

The second problem with blockchains is that they assume that every transaction in the system must be transmitted to every participant in the blockchain.  Yes, when somebody’s fridge turns on in Paris, every one of the billions of devices participating in the blockchain must be told.  If they are not, then their local copy of the blockchain is inconsistent and they cannot trust the next transaction, which they might actually be interested in.  As the number of devices and transactions rises, the amount of worldwide network bandwidth required to maintain the integrity of the blockchain grows geometrically.  One article I read calculated that a 10Mbit Internet connection could sustain a theoretical maximum of 7 transactions per second across the entire bitcoin universe.  Seven.

The result of these two limitations is that a blockchain probably cannot be used to carry the actual data that the devices produce.  Instead it is more likely to be used as an authentication mechanism.  That is, a device that is legitimately on the blockchain can be verified as being itself based on something that the blockchain knows.  My personal opinion is that it sounds very much like the blockchain would become a distributed certificate authority.  Instead of having the current SSL “chain of trust” of certificates, you would have a “blockchain of trust”.  But since an individual device could not contain the entire blockchain you would still need a server to provide the equivalent of certificate validation, so there’s your point of attack.

There are some examples of IoT devices using blockchains, like a washing machine that buys detergent using bitcoins, that are using misdirection to claim the use of blockchains.  Yes, they are using blockchains in their bitcoin transactions because that’s how bitcoin works, but the maintenance data they produce (the real point of the blockchains-for-IoT conversation) are not being transmitted via blockchain at all.

I have yet to see a practical application of blockchains to IoT data or even to IoT authentication.  The conversation at the moment is in the realm of “it would be nice” but the solutions to the implementation problems are not clear.  Incidentally the same problems exist for bitcoin and there are no clear solutions in that space either.

Is REST the Answer for IIoT?

Part 10 of Data Communication for Industrial IoT

As we’ve stated previously, the IIoT is imagined as a client-server architecture where the “things” can be smart devices with embedded micro-controllers.  The devices generate data based on sensors, and send that data to a server that is usually elsewhere on the Internet.  Similarly, a device can be controlled by retrieving data from the server and acting upon it, say to turn on an air conditioner.

The communication mechanism typically used for devices to communicate with servers over the Internet is REST (Representational State Transfer) over HTTP.  Every communication between the device and server occurs as a distinct HTTP request.  When the device wants to send data to the server it makes an HTTP POST call.  When it wants to get data (like a new thermostat setting) it makes an HTTP GET call.  Each HTTP call opens a distinct socket, performs the transaction, and then closes the socket.  The protocol is said to be “connectionless”.  Every transaction includes all of the socket set-up time and communication overhead.  Since there is no connection, all transactions must take the form of “request/response”, where the device sends a request to the server and collects the response.  The server generally does not initiate a transaction with the device, as that would expose the device to attack from the Internet.
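Concretely, a typical device-side exchange looks something like this sketch using Python’s requests library (the endpoint URLs and JSON fields are hypothetical):

```python
# The REST pattern described above: each exchange is a complete,
# self-contained HTTP transaction. URLs and JSON fields are hypothetical.
import requests

BASE = "https://iot.example-cloud.com/api/v1"

# Device sends a reading: one full request/response cycle.
resp = requests.post(f"{BASE}/devices/thermostat-7/data",
                     json={"temperature": 21.5})
resp.raise_for_status()

# Device polls for a new setting: another full cycle, even if
# nothing has changed since the last poll.
resp = requests.get(f"{BASE}/devices/thermostat-7/setpoint")
print(resp.json())   # e.g. {"setpoint": 20.0}
```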

HTTP does define a keep-alive connection, where several HTTP transactions are sent on a single socket.  This definitely reduces the amount of time spent creating and destroying TCP connections, but does not change the basic request/response behaviour of the HTTP protocol.  Scalability issues and trade-offs between latency and bandwidth still overwhelm any benefit gained from a keep-alive connection.

One of the identifying features of the IIoT is the data volume.  Even a simple industrial system contains thousands of data points.  REST APIs might be fine for a toaster, but at industrial scale they run into problems:

Bandwidth

REST messages typically pay the cost of socket setup on every message or group of messages.  Then they send HTTP headers before transmitting the data payload.  Finally, they demand a response, which contains at least a few required headers.  Writing a simple number requires hundreds of bytes to be transmitted in multiple IP packets.

Latency

Latency measures the amount of time that passes between an event occurring and the user receiving notification.  In a REST system, the latency is the sum of:

  • The client’s polling rate
  • Socket set-up time
  • Network transmission latency to send the request
  • Transmission overhead for HTTP headers
  • Transmission time for the request body
  • Network transmission latency to send the response
  • Socket take-down time

By comparison, an efficient persistent connection measures latency as the sum of:

  • Network transmission latency to send the request
  • Transmission time for the request body
  • Network transmission time for an optional response body

The largest sources of latency in a REST system (polling rate, socket set-up, response delivery) are all eliminated in the persistent-connection model.  This allows it to achieve transmission latencies mere microseconds above the network latency itself.

REST’s latency problems become even clearer in systems where two devices are communicating with one another through an IoT server.  Low-latency event-driven systems can achieve practical data rates hundreds or thousands of times faster than REST.  REST was never designed for the kind of data transmission the IIoT requires.

Scalability

One of the factors in scalability is the rate at which a server responds to transactions from the device.  In a REST system a device must constantly poll the server to retrieve new data.  If the device polls the server quickly then it causes many transactions to occur, most of which produce no new information.  If the device polls the server slowly, it may miss important data or will experience a significant time lag (latency) due to the polling rate.  As the number of devices increases, the server quickly becomes overloaded and the system must make a choice between the number of devices and the latency of transmission.

All systems have a maximum load.  The question is, how quickly does the system approach this maximum and what happens when the maximum is reached?  We have all seen suddenly-popular web sites become inaccessible due to overloading.  While those systems experienced unexpectedly high transaction volumes, a REST system in an IIoT setting will be exposed to that situation in the course of normal operation.  Web systems suffer in exactly the scenarios where the IIoT is likely to be useful.  Event-driven systems scale much more gradually, as adding clients does not necessarily add significant resource cost.  For example, we have been able to push REST systems to about 3,000 transactions per minute.  We have pushed event-driven systems to over 5,000,000 transactions per minute on the same hardware.
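To make the polling trade-off concrete, here is a toy simulation (in-process asyncio queues stand in for real network connections, and the timings are arbitrary) contrasting the two delivery models:

```python
import asyncio
import time

latest = None  # the "mailbox" a polling client must keep checking

async def producer(queue):
    """Generate an event every 700 ms, visible to both consumers."""
    global latest
    for i in range(3):
        await asyncio.sleep(0.7)
        latest = (i, time.monotonic())
        await queue.put(latest)

async def poller():
    """REST-style: ask once per second whether anything changed.
    If events arrived faster than the poll rate, some would be missed."""
    seen = -1
    while seen < 2:
        await asyncio.sleep(1.0)  # the polling interval
        if latest is not None and latest[0] != seen:
            seen, t = latest
            print(f"poll latency: {time.monotonic() - t:.2f}s")

async def pusher(queue):
    """Event-driven: wakes the moment data arrives."""
    for _ in range(3):
        i, t = await queue.get()
        print(f"push latency: {time.monotonic() - t:.4f}s")

async def main():
    queue = asyncio.Queue()
    await asyncio.gather(producer(queue), poller(), pusher(queue))

asyncio.run(main())
```

Even in this toy, the pushed events arrive in fractions of a millisecond while the polled ones lag by a large fraction of the polling interval, and every empty poll is still a wasted transaction on the server.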

Symmetry

REST APIs generally assume that the data flow will be asymmetrical.  That is, the device will send a lot of data to the server, but retrieve data from the server infrequently.  In order to maintain reasonable efficiency, the device will typically transmit frequently, but poll the server infrequently.  This causes additional latency, as discussed earlier.  In some systems this might be a reasonable sacrifice, but in IIoT systems it usually is not.

For example, a good IIoT server should be capable of accepting, say, 10,000 data points per second from an industrial process and retransmitting that entire data set to another industrial process, simulator, or analytics system without introducing serious alterations to the data timing.  To do that, the server must be capable of transmitting data just as quickly as it receives it.  A good way to achieve this is by establishing persistent, bidirectional connections to the server.  This way, if the device or another client needs to receive 10,000 data changes per second, the communication mechanism will support it.
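A sketch of what such a persistent, bidirectional connection might look like from the device side, using the Python websockets library (the server URL and message format are hypothetical):

```python
# One socket, held open: writes go out and server pushes come back in,
# with no per-message connection setup. URL and fields are hypothetical.
import asyncio
import json
import websockets

async def run_device():
    async with websockets.connect("wss://hub.example-cloud.com/feed") as ws:
        await ws.send(json.dumps({"point": "tank4/level", "value": 72.3}))
        async for message in ws:          # the server can push at any time
            update = json.loads(message)
            print("server pushed:", update)

asyncio.run(run_device())
```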

Robustness

Industrial applications are often mission-critical to their owners.  This is one of the big issues holding back the IIoT.  What happens if the Internet connection goes down?

In the typical IoT scenario, the device is making REST calls to a server running in the cloud.  In some ways this is a by-product of the cloud vendor’s business model, and in some ways it is due to the REST implementation.  A REST server is typically a web server with custom URL handlers, tightly coupled to a proprietary server-side application and database.  If the Internet connection is lost, the device is cut off, even if the device is inside an industrial plant providing data to a local control system via the cloud.  If the cloud server is being used to issue controls to the device, then control becomes impossible, even locally.  This could be characterized as “catastrophic degradation” when the Internet connection is lost.

Ideally, the device should be able to make a connection to a local computer inside the plant, integrate directly with a control system using standard protocols like OPC and DDE, and also transmit its data to the cloud.  If the Internet connection is lost, the local network connection to the control system is still available.  The device is not completely cut off, and control can continue.  This is a “graceful degradation” when the Internet connection is lost.
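A minimal sketch of that failover behaviour, reusing the hypothetical MQTT hostnames from earlier in this series:

```python
# Sketch: local delivery is the critical path; cloud delivery is
# best-effort. Hostnames and topic names are hypothetical.
import paho.mqtt.publish as publish

def send_reading(point: str, value: float) -> None:
    # In-plant delivery: let any failure here surface loudly.
    publish.single(f"plant/{point}", payload=str(value),
                   hostname="datahub.plant.local")
    try:
        publish.single(f"plant/{point}", payload=str(value),
                       hostname="broker.example-cloud.com")
    except OSError:
        pass  # Internet down: remote visibility lost, control continues

send_reading("tank4/level", 72.3)
```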

In conclusion, REST systems work reasonably well in low-speed transactional systems.  However, they have a number of disadvantages when applied to high speed, low latency systems, and to systems where data transfer from the server to the device is frequent.  Industrial IoT systems are characterized by exactly these requirements, making REST an inappropriate communication model for IIoT.

Who Owns the Factory?

My local Toyota dealer owns my car.  My name may appear on the ownership papers, but I know better.  The dealership tells me when I’m due for maintenance, what each thing will cost, and why it’s important to repair or replace it.  Sometimes I think they care more about my car than I do.  Of course, they get paid for this service, but it is also in their best interest to keep my car running in tip-top shape, because a satisfied customer is a repeat customer.

It wasn’t always this way.  In younger days when money was scarce and time was free, and I could do anything I put my mind to, I got a few books and set about doing my own car repairs.  After some trial and error, I was able to do normal maintenance, and even undertake a few more complicated repairs like changing a radiator core or rebuilding a carburetor.  But over the years cars have gotten more complex, and time has become more valuable.  Now I’m more than happy to turn the whole project over to the experts.  As far as I’m concerned, the dealership owns the car.

Who owns the project?

Seems like factories may be going in the same direction.  To get the most out of “smart” manufacturing, the IIoT, and Industrie 4.0, factory owners and operators are relying more and more on outside expertise.  System integrators are stepping in to fill the gap, and some of them are realizing that they can provide the most value to their customers by taking ownership.  Maybe not the factory itself, but the projects they implement.  The question, “Who owns the project?” really boils down to, “Who takes responsibility for it?”

Robert Lowe, co-founder and CEO of Loman Control Systems Inc., a certified member of the Control System Integrators Association (CSIA), recently suggested this idea in an Automation World blog, End-User Asset ‘Owned’ by a System Integrator. He sees a need for system integrators to take on more responsibility by supporting their clients “beyond the project.”  He proposes a new acronym, SIaaS, for System Integration as a Service.  Providing “service and support for maintenance, machine monitoring, machine performance, process performance, reporting, technology upgrades, cybersecurity and so forth” frees the end-user to “focus on making its product and not be dependent on inside resources for sustainable performance.”

Lowe goes on to explain how system integrators are in a unique position to partner with companies on a project they have completed, because they understand well how it works.  Not only did they build it, but they have more experience monitoring, maintaining, and upgrading similar systems.  Rather than finding, training, and maintaining specialized staff to keep the system running, the plant owner can keep his or her people focused on the bigger picture of getting their product out the door.  And the system integrator who owns the asset will ensure that it performs well, because a satisfied customer is a repeat customer.

Skkynet supports system integrators who want to provide their expertise as a service.  Our technical solutions—DataHub, SkkyHub, and ETK—are all available “as a Service”.  More significantly, research and experience have shown that many IoT projects run into unexpected difficulties.  Rather than expending the resources to build and maintain a secure and reliable IIoT system on their own, plant management and system integrators can hand that responsibility over to those with the expertise, and cut their costs as well.