Turning IIoT Data into Value: The 5D Architecture

What’s in it for me? Sure, the Industrial IoT is getting a lot of press—it’s been riding high on the Gartner Hype Cycle for years. But now that most people have beheld the vision and survived the deluge of glowing predictions, they are starting to ask some down-to-earth questions. In particular, engineers who have to assemble the pieces and managers who need to justify the costs are asking, “What are we going to get out of it?”

The benefit of the IoT, according to Finbar Gallagher, CEO and Founder of Fraysen Systems, is its ability to turn data into value. To explain how that happens, Gallagher has boiled down every IoT implementation into a common “5D architecture.” In his article, The 5D Architecture – A Standard Architecture for IoT, he says, “IoT systems are complex, very large scale and present many pitfalls for the system architect. Thinking about these systems in terms of the problem to be solved: turning data into value…”

The article breaks down the process of turning data into value through the interaction of five core elements, the 5D of the architecture, which can be summarized as follows:

  1. Data collection
  2. Detecting events based on changes in the data, and analysis
  3. Dispatching (decide and plan) an action based on events
  4. Delivering the action
  5. Developing value, which underlies and unites all of the above
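
As a rough illustration only, the first four elements can be pictured as stages in a small pipeline. The Python sketch below is not Gallagher's design: the function names, the temperature limit of 80.0, and the action text are all hypothetical, chosen just to show data flowing from collection through detection and dispatch to delivery.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class Event:
    tag: str
    value: float


def collect(readings: Iterable[Tuple[str, float]]) -> Iterator[Tuple[str, float]]:
    """Data collection: yield (tag, value) pairs from any source."""
    yield from readings


def detect(tag: str, value: float, limit: float = 80.0) -> Optional[Event]:
    """Detection: raise an event when a value exceeds a hypothetical limit."""
    return Event(tag, value) if value > limit else None


def dispatch(event: Event) -> str:
    """Dispatch: decide on an action for the event."""
    return f"reduce load on {event.tag}"


def deliver(action: str, delivered: List[str]) -> None:
    """Delivery: carry out the action (here it is only recorded)."""
    delivered.append(action)


def run(readings: Iterable[Tuple[str, float]]) -> List[str]:
    delivered: List[str] = []
    for tag, value in collect(readings):
        event = detect(tag, value)
        if event is not None:
            deliver(dispatch(event), delivered)
    return delivered


actions = run([("pump1.temp", 72.5), ("pump2.temp", 91.0)])
print(actions)
```

The fifth element, developing value, is deliberately absent from the code. It is what a person does with the delivered actions, which is the point Gallagher makes.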

Surrounding, connecting, and acting upon these 5D core elements are four services:

  1. Communication
  2. Presenting information
  3. Storing data and information
  4. Managing the 5 core elements.

Although these services are sometimes considered to be core elements, Gallagher separates them, because he says they do not in themselves create value. Each of these services relies on a person to extract value from them. Ultimately, value is not intrinsic to the data, analysis, plans, or actions either, but rather depends on human interaction to derive it. To make his point, Gallagher quotes a production manager who once said to him, “So if I don’t look at the charts this system presents, the system doesn’t deliver any value, does it?”

Be that as it may, people still need an IIoT system to access their data for extracting value.  And the better it functions, the more value they get. A good IIoT service will provide optimal data collection, event detection, dispatching, and delivery of action through secure and rapid communication, accurate presentation, and fully-integrated storage of data and information. Gallagher suggests some specific criteria, such as:

  • The ability to collect data from a wide range of sources, including legacy PLCs, log files, historians, and devices that may use different protocols.
  • Low latency data communication through direct, real-time connections whenever possible, avoiding high-latency approaches such as having a sender write data to files and requiring the receiver to read them.
  • Consistent event detection: repeatable and verifiable.
  • The ability to provide feedback (with or without human input) so that the system supports the ability to learn and modify action plans.
  • Data communication should be easy to use, resilient, and able to preserve structure. To these we would also add secure by design.
  • Data storage should be flexible, fully integrated, and have minimal latency.

Anyone familiar with Skkynet’s approach to Industrial IoT will see that it meets the criteria that Gallagher proposes. On our own, we can’t turn data into value. That depends on you, the user. But we can provide you with easy, quick, and secure access to your data, so that you can make the most of it.

Remote Control without a Direct Connection

Part 5 of Data Communication for Industrial IoT

As discussed previously, the idea of using a cloud service as an intermediary for data resolves the problems of securing the device and securing the network.  If both the device and the user make outbound connections to a secure cloud server, there is no need to open ports on firewalls, and no need for a VPN. But this approach brings up two important questions for anyone interested in remote control:

  1. Is it fast enough?
  2. Does it still permit a remote user to control his device?

The answer to the first question is fairly simple.  It’s fast enough if the choice of communication technology is fast enough.  Many cloud services treat IoT communication as a data storage problem, where the device populates a database and then the client consults the contents of the database to populate web dashboards.  The communication model is typically a web service over HTTP(S).  Data transmission and retrieval both essentially poll the database.

The Price of Polling

Polling introduces an inevitable trade-off between resource usage on the server and polling rate, where the polling rate must be set with a reasonable delay to avoid overloading the cloud server or the user’s network.  This polling does two things – it introduces latency, a gap in time between an event occurring on the device and the user receiving notification of it, and it uses network bandwidth in proportion to the number of data items being handled.  Remote control of the device is still possible through polling if you are willing to pay the latency and bandwidth penalty of having the device poll the cloud.  This might be fine for a device with 4 data values, but it scales exceptionally poorly for an industrial device with hundreds of data items, or for an entire plant with tens of thousands of data items.
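
To put rough numbers on this trade-off, here is a back-of-the-envelope sketch in Python. It assumes, for simplicity, that every data item is polled individually with one request and one reply, and that changes occur at random moments between polls. Real services may batch requests, so treat the figures as illustrative rather than measured.

```python
def polling_cost(items: int, interval_s: float) -> dict:
    """Estimate message load and added latency for per-item polling."""
    return {
        # One request plus one reply per item, every interval.
        "messages_per_second": 2 * items / interval_s,
        # A change waits half an interval on average before the next poll.
        "average_latency_s": interval_s / 2,
        # A change just after a poll waits a full interval.
        "worst_case_latency_s": interval_s,
    }


small = polling_cost(items=4, interval_s=1.0)
plant = polling_cost(items=20_000, interval_s=1.0)
print(small)
print(plant)
```

Under these assumptions the four-value device generates 8 messages per second, while a 20,000-item plant generates 40,000 messages per second at the same polling rate, whether or not any value has changed.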

Publish/Subscribe Efficiency

By contrast, some protocols implement a publish/subscribe mechanism where the device and user both inform the cloud server that they have an interest in a particular data set.  When the data changes, both the device and user are informed without delay.  If no data changes, no network traffic is generated.  So, if the device updates a data value, the user gets a notification.  If the user changes a data value the device gets a notification.  Consequently, you have bi-directional communication with the device without requiring a direct connection to it.

This kind of publish/subscribe protocol can support bidirectional communication with latencies as low as a few milliseconds over the background network latency.  On a reasonably fast network or Internet connection, this is faster than human reaction time.  Thus, the publish/subscribe approach has the potential to support remote control without a direct connection.
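
The mechanism can be sketched with a toy, in-process broker standing in for the cloud server. This is a minimal Python illustration, not any particular product's protocol: the Broker class, its method names, and the choice not to echo a change back to its sender are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class Broker:
    """Toy stand-in for the cloud server (hypothetical API)."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}
        self._subscribers: Dict[str, List[Tuple[str, Callable]]] = defaultdict(list)

    def subscribe(self, name: str, who: str, callback: Callable[[str, float], None]) -> None:
        """Register interest in a data item."""
        self._subscribers[name].append((who, callback))

    def publish(self, name: str, value: float, sender: str) -> None:
        """Store a new value and notify every other subscriber."""
        if self._values.get(name) == value:
            return  # no change, so no traffic is generated
        self._values[name] = value
        for who, callback in self._subscribers[name]:
            if who != sender:  # assumption: do not echo back to the origin
                callback(name, value)


broker = Broker()
device_inbox: list = []
user_inbox: list = []
broker.subscribe("setpoint", "device", lambda n, v: device_inbox.append((n, v)))
broker.subscribe("setpoint", "user", lambda n, v: user_inbox.append((n, v)))

broker.publish("setpoint", 50.0, sender="device")  # device reports a value
broker.publish("setpoint", 55.0, sender="user")    # user changes it remotely
broker.publish("setpoint", 55.0, sender="user")    # unchanged: nothing is sent

print(user_inbox)
print(device_inbox)
```

Both parties only ever talk to the broker, yet the user's change reaches the device. In a real deployment each callback would be a message on an outbound connection that the device or user opened to the cloud server.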


Secure by Design for IIoT

Securing the Industrial IoT is a big design challenge, but one that must be met. Although the original builders of industrial systems did not anticipate a need for Internet connectivity, companies now see the value of connecting to their plants, pipelines, and remote devices, often over the Internet. The looming question: How to maintain a high level of security for a mission-critical system while allowing remote access to the data?

As you can imagine, the answer is not simple. What’s called for is a totally new approach, one that is secure by design. This blog entry, published on the ARC Advisory’s Industrial IoT/Industrie 4.0 Viewpoints blog, gives an overview of why standard industrial system architecture is not adequate to ensure the security of plant data on the Internet, and introduces the two main considerations that must go into creating a more secure design.

Top Performance for Industrial IoT

The Industrial IoT is different from the regular IoT. Mission-critical industrial systems are not like consumer or business IT applications. Performance is crucial. Most IT systems are built around a relational database, a repository of data that clients can add to or access, where a response time of a second or two is acceptable. IT data is typically sent across a network via HTML or XML, which adds complexity to the raw data, and consumes bandwidth. Although fine for office or home use, these technologies are not sufficient for the Industrial IoT.

In a typical industrial system, the data flows in real time. It moves from a sensor, device, or process through the system, often combining with other data along the way, and may end up in an operator’s control panel, another machine or device, or special-purpose data historian. As plant or field conditions change, the data arrives in real time, and the system or operator must react. A robotic arm or other device can send hundreds of data changes per second. Tiny, millisecond fluctuations in the data set can have significant effects or trigger alarms, and often each minute detail needs to be accessed in a trend chart or historical database.

Achieving this kind of performance on the Industrial IoT demands an exceptional approach to data communication.

  • A real-time, in-memory database keeps the data moving. The data needs to flow quickly and effortlessly through the system, and an in-memory database is needed to support these rapid value changes. A relational database, the familiar workhorse of the IT world, is not built for this specialized task. It takes too long to write records, process queries, and retrieve information. Thus, an in-memory, flat-file database is a good choice, allowing for higher data throughput.
  • High-speed data integration connects any data source with any user. A key task of the in-memory database is to integrate all sources of incoming data. If all communication is data-centric (see below), then every data source can be pooled together into a single, universal data set. This design keeps the data handling as simple as possible, allowing any authorized user to connect to any specified combination of data inputs in real time.
  • Publish/subscribe beats polling. In a publish/subscribe, event-driven model, a user makes a one-time request to connect to a data source, then gets updates whenever they occur. By contrast, polling sends regular, timed requests for data. This wastes resources when data changes are infrequent, because multiple requests might return with the same value. At the same time, polling is also inaccurate during rapid change, because a burst of several value changes may occur between polling cycles, and will be completely lost.
  • High-speed “push” data sources are most effective. The data should be pushed out to the system, and then pushed to the user. In addition to being a better security model, this approach is also more efficient. To “pull” data from a source requires polling, which takes longer and uses too much bandwidth, because each data update requires two messages: a request and a reply. Push technology only requires one message, which is more efficient, consumes less bandwidth, and also enables machine-to-machine communication.
  • Data-centric, not web-centric, design gives the best performance on the cloud. Transcoding data at the source takes time, and requires resources on the device which many smaller sensors may not have. By keeping the data in its simplest format, with no HTML or XML code, the lowest possible latency can be achieved. The raw data flows from the source, through the cloud, to the user as quickly as possible. When it arrives it can be converted to other formats, such as HTML, XML, SQL, etc. Different users, such as web browsers, databases, spreadsheets, and machine-to-machine systems can access a single data source at the point of its arrival, reducing the volume of data flow in the system.
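
The difference between polling and publish/subscribe during a burst of changes can be seen in a small simulation. The Python sketch below uses made-up timestamps (in milliseconds) and a hypothetical one-second polling interval. It compares what a subscriber would receive with what a poller would observe.

```python
def polled_values(changes, interval_ms, end_ms):
    """Return the values a poller sees when sampling every interval_ms.

    `changes` is a time-sorted list of (timestamp_ms, value) pairs.
    """
    seen = []
    for t in range(interval_ms, end_ms + 1, interval_ms):
        current = None
        for timestamp, value in changes:
            if timestamp <= t:
                current = value  # latest value at the moment of the poll
        if current is not None:
            seen.append(current)
    return seen


# A burst of four changes within 60 ms, then one more change much later.
changes = [(100, 1), (120, 2), (140, 3), (160, 4), (2500, 5)]

pushed = [value for _, value in changes]  # a subscriber gets every change
polled = polled_values(changes, interval_ms=1000, end_ms=3000)

print("pushed:", pushed)
print("polled:", polled)
```

In this example the poller never sees values 1, 2, or 3, and its second poll returns a value it already had. The subscriber receives five messages, one per change, while the poller exchanges six (three requests and three replies) and still ends up with less information.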

Skkynet’s implementation

Following these principles, Skkynet’s SkkyHub™ and DataHub® provide in-plant or IoT networking speeds of just a few milliseconds over network latency, with a throughput of up to 50,000+ data changes per second. Their high level of performance is achieved by combining real-time, in-memory database technology with publish/subscribe, pushed data collection and a data-centric approach to communication.

The “Hub” technology in DataHub and SkkyHub is a real-time, in-memory, flat-file database, used in hundreds of mission-critical systems worldwide for over 15 years. Designed from the ground up for industrial data communications, the DataHub and ETK work by converting all incoming data into a simple, internal, raw-data format. This raw data can be integrated and transmitted at very high speeds.

At the plant level, the DataHub collects, integrates and redistributes process data in real time. Selected sets of data can be passed seamlessly to the IoT simply by connecting the DataHub or ETK to SkkyHub. At the cloud level, SkkyHub provides the same real-time data collection, integration, and distribution. IoT performance now approaches the actual network propagation speeds of the Internet, with virtually no added latency.

Quite honestly, we shouldn’t expect the typical IoT platform to provide this level of performance. Few, if any, were designed for the Industrial IoT. It should come as no surprise that a concept as disruptive as “Industrial Internet of Things” may require new approaches for proper implementation. And in addition to performance, industrial applications have unique security and compatibility requirements. When choosing a solid, robust platform for Industrial IoT, these are all critical factors to consider.

Tunnelling OPC DA – Know Your Options

Since OPC was introduced over fifteen years ago, it has seen a steady rise in popularity within the process control industry. Using OPC, automation professionals can now select from a wide range of client applications to connect to their PLCs and hardware devices. The freedom to choose the most suitable OPC client application for the job has created an interest in drawing data from more places in the plant. Industry-wide, we are seeing a growing need to connect OPC clients on one computer to OPC servers on other, networked computers. As OPC has grown, so has the need to network it.

The most widely-used OPC protocol for real-time data access is OPC DA. However, anyone who has attempted to network OPC DA knows that it is challenging, at best. The networking protocol for OPC DA is DCOM, which was not designed for real-time data transfer. DCOM is difficult to configure, responds poorly to network breaks, and has serious security flaws. Using DCOM between different LANs, such as connecting between manufacturing and corporate LANs, is sometimes impossible to configure. Using OPC DA over DCOM also requires more network traffic than some networks can handle because of bandwidth limitations, or due to the high traffic already on the system. To overcome these limitations, there are various tunnelling solutions on the market. This article will look at how tunnelling OPC DA solves the issues associated with DCOM, and show you what to look for in a tunnelling product.

Eliminating DCOM

The goal of tunnelling OPC DA is to eliminate DCOM, which is commonly done by replacing the DCOM networking protocol with TCP. Instead of connecting the OPC client to a networked OPC server, the client program connects to a local tunnelling application, which acts as a local OPC server. The tunnelling application accepts requests from the OPC client and converts them to TCP messages, which are then sent across the network to a companion tunnelling application on the OPC server computer. There the request is converted back to OPC DA and is sent to the OPC server application for processing. Any response from the server is sent back across the tunnel to the OPC client application in the same manner.

OPC Tunnelling

This is how most tunnellers for OPC DA work, in principle. A closer look will show us that although all of them eliminate DCOM, there are some fundamentally different approaches to tunnelling architecture that lead to distinctly different results in practice. As you review tunnelling solutions, here are four things to look out for:

  1. Does the tunnelling product extend OPC transactions across the network, or does it keep all OPC transactions local?
  2. What happens to the OPC client and server during a network break?
  3. How does the tunnel support multiple client-server connections?
  4. Does the tunnelling product provide security, including data encryption, user authentication, and authorization?

1. Extended or Local OPC Transactions?

There are two basic types of tunnelling products on the market today, each with a different approach to the problem. The first approach extends the OPC transaction across the network link, while the second approach keeps all OPC transactions local to the sending or receiving computer.

OPC Tunnelling Comparison

Extending the OPC transaction across the network means that a typical OPC client request is passed across the network to the OPC server, and the server’s response is then passed all the way back to the client. Unfortunately, this approach preserves the synchronous nature of DCOM over the link, with all of its negative effects. It exposes every OPC client-server transaction to network issues like timeouts, delays, and blocking behavior. Link monitoring can reduce these effects, but it doesn’t eliminate them, as we shall see below.

On the other hand, the local OPC transaction approach limits the client and server OPC transactions to their respective local machines. For example, when the tunnelling program receives an OPC client request, it responds immediately to the OPC client with data from a locally cached copy. At the other end, the same thing happens. The tunnelling program’s job is then to maintain the two copies of the data (client side and server side) in constant synchronization. This can be done very efficiently without interfering with the function of the client and server. The result is that the data crosses the network as little as possible, and both OPC server and OPC client are protected from all network irregularities.
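
The local-transaction idea can be sketched in a few lines of Python. This is a simplified illustration rather than any vendor's implementation: the TunnelEnd class is hypothetical, the item name is made up, and a direct method call stands in for the TCP message that would cross the network.

```python
class TunnelEnd:
    """One end of a hypothetical tunnel, holding a local copy of the data."""

    def __init__(self) -> None:
        self.cache = {}
        self.peer = None
        self.sent = 0  # number of messages sent across the "network"

    def read(self, item):
        """Answer a local OPC request from the cache, with no network trip."""
        return self.cache.get(item)

    def write(self, item, value) -> None:
        """Update the local copy and forward only genuine changes."""
        if self.cache.get(item) == value:
            return
        self.cache[item] = value
        self.sent += 1
        self.peer.receive(item, value)  # stands in for a TCP message

    def receive(self, item, value) -> None:
        """Apply a change that arrived from the other end."""
        self.cache[item] = value


server_side, client_side = TunnelEnd(), TunnelEnd()
server_side.peer, client_side.peer = client_side, server_side

server_side.write("FIC101.PV", 42.0)   # value arrives from the OPC server
server_side.write("FIC101.PV", 42.0)   # unchanged, so nothing is sent
reads = [client_side.read("FIC101.PV") for _ in range(1000)]

print(reads[0], server_side.sent)
```

A thousand client reads are satisfied from the client-side copy, and only one message crosses the link.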

2. Handling Network Issues

There is a huge variety of network speeds and capabilities, ranging from robust LANs, to WANs running over T1 lines on multi-node internets, and on down to low-throughput satellite connections. The best tunnelling products give the best possible performance over any given kind of network.

To protect against network irregularities and breaks, any good tunnelling application will offer some kind of link monitoring. Typically this is done with a “heartbeat” message, where the two tunnel programs send messages to one another on a timed interval, for example every few seconds. If a reply isn’t received back within a user-specified time, the tunnelling application assumes that the network is down. The OPC client and server may then be informed that the network is broken.

In principle this sounds simple. The problem arises when you have to specify the timeout used to identify a network disconnection. If you set the timeout too long, the client may block for a long time waiting for a reply, only to discover that the network is down. On the other hand, setting the timeout too short will give you false indications of a network failure if for some reason the connection latency exceeds your expectations. The slower the network, the greater the timeout must be.

However, this balancing act is only necessary if the tunnelling product uses the extended OPC approach. A product that offers local OPC transactions still provides link monitoring, but the OPC client and server are decoupled from the network failure detection. Consequently, the timeout can be set appropriately for the network characteristics—from a few hundred milliseconds for highly robust networks to many seconds, even minutes for extremely slow networks—without the risk of blocking the OPC client or server.
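
A heartbeat monitor of the kind described above might look something like the following Python sketch. The LinkMonitor class and the timeout values are assumptions for illustration. Times are passed in explicitly so that the example is repeatable.

```python
class LinkMonitor:
    """Declare the link down when no heartbeat arrives within timeout_s."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_heartbeat = None

    def heartbeat(self, now: float) -> None:
        """Record the arrival time of a heartbeat from the other end."""
        self.last_heartbeat = now

    def is_up(self, now: float) -> bool:
        if self.last_heartbeat is None:
            return False  # nothing heard yet
        return (now - self.last_heartbeat) <= self.timeout_s


# The same slow link, watched with two different timeouts.
impatient = LinkMonitor(timeout_s=0.5)
patient = LinkMonitor(timeout_s=5.0)
for monitor in (impatient, patient):
    monitor.heartbeat(now=0.0)

# Suppose the next heartbeat is delayed until t = 3.5 s. At t = 3.0 s:
print(impatient.is_up(now=3.0))  # short timeout reports a failure
print(patient.is_up(now=3.0))    # long timeout still trusts the link
```

With extended OPC transactions, the long timeout is also how long a client may sit blocked. With local transactions, only this monitor waits, so the timeout can be chosen to suit the network.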

How the tunnelling product informs your OPC client of the network break also varies according to the tunnel product design. Products that extend the OPC transactions generally do one of two things:

  1. Synthesize an OPC server shutdown. The OPC client receives a shutdown message that appears to be coming from the server. Unaware of the network failure, the client instead operates under the assumption that the OPC server itself has stopped functioning.
  2. Tell the client nothing, and generate a COM failure the next time the client initiates a transaction. This has two drawbacks. First, the client must be able to deal with COM failures, the most likely event to crash a client. Worse yet, since OPC clients often operate in a “wait” state without initiating transactions, the client may think the last data values are valid and up-to-date, never realizing that there is any problem.

Products that provide local OPC transactions offer a third option:

  3. Maintain the COM connection throughout the network failure, and alter the quality of the data items to “Not Connected” or something similar. This approach keeps the OPC connection open in a simple and robust way, and the client doesn’t have to handle COM disconnects.
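
As a sketch of this option, the client-side copy of the data can simply mark every item when the link monitor reports a failure. The Python below is a simplification under stated assumptions: the class and item names are hypothetical, and quality is shown as a plain string, whereas OPC DA itself encodes quality as a numeric code.

```python
from dataclasses import dataclass


@dataclass
class Item:
    value: float
    quality: str = "Good"


class ClientSideCache:
    """Hypothetical client-side copy of the data held by the tunnel."""

    def __init__(self) -> None:
        self.items = {}

    def update(self, name: str, value: float) -> None:
        """A fresh value arrived over the tunnel."""
        self.items[name] = Item(value, "Good")

    def link_lost(self) -> None:
        """Keep the last values, but flag them as no longer trustworthy."""
        for item in self.items.values():
            item.quality = "Not Connected"

    def read(self, name: str):
        item = self.items[name]
        return item.value, item.quality


cache = ClientSideCache()
cache.update("TIC205.PV", 118.4)
before = cache.read("TIC205.PV")
cache.link_lost()
during = cache.read("TIC205.PV")
cache.update("TIC205.PV", 119.0)  # link restored, new value arrives
after = cache.read("TIC205.PV")

print(before, during, after)
```

The client's connection to the local tunnel program never breaks. It keeps reading as usual and sees the change in quality instead of a COM error.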

3. Support for Multiple Connections

Every tunnelling connection has an associated cost in network load. Tunnelling products that extend OPC transactions across the network may allow many clients to connect through the same tunnel, but each client sends and receives data independently. For each connected client the network bandwidth usage increases. Tunnelling products that satisfy OPC transactions locally can handle any number of clients and servers on either end of the tunnel, and the data flows across the network only once. Consequently, adding clients to the system will not add load to the network. In a resource-constrained system, this can be a crucial factor in the success of the control application.
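
The arithmetic behind this is simple enough to write down. The Python sketch below assumes every client is interested in the same data and ignores protocol overhead. The function name and the figures are illustrative only.

```python
def network_updates_per_second(clients: int, changes_per_second: int,
                               local_transactions: bool) -> int:
    """Estimate how many data updates cross the network each second."""
    if local_transactions:
        # The data crosses once and is shared by all local clients.
        return changes_per_second
    # Each client sends and receives its own copy of every update.
    return clients * changes_per_second


extended = network_updates_per_second(10, 500, local_transactions=False)
local = network_updates_per_second(10, 500, local_transactions=True)
print(extended, local)
```

Ten clients watching 500 changes per second would put roughly 5,000 updates per second on the link with extended transactions, against 500 when transactions are satisfied locally.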

If you are considering multiple tunnelling connections, be sure to test for cross-coupling between clients. Does a time-intensive request from a slow client block other requests from being handled? Some tunnelling applications serialize access to the OPC server when multiple clients are connected, handling the requests one by one. This may simplify the tunnel vendor’s code, but it can produce unacceptable application behavior. If one client makes a time-consuming request via the tunnel, then other clients must line up and wait until that request completes before their own requests will be serviced. All clients block for the duration of the longest request by any client, reducing system performance and increasing latency dramatically.
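
A quick worked example shows the effect. The Python sketch below assumes three requests arrive at the same moment, with the slow one at the front of the queue. The durations are invented for illustration and are given in milliseconds.

```python
from itertools import accumulate


def serialized_finish_times(durations_ms):
    """Each request waits for every request ahead of it in the queue."""
    return list(accumulate(durations_ms))


def independent_finish_times(durations_ms):
    """Each request is handled without waiting for the others."""
    return list(durations_ms)


# One slow 5-second request followed by two quick 100 ms requests.
requests_ms = [5000, 100, 100]

serialized = serialized_finish_times(requests_ms)
independent = independent_finish_times(requests_ms)
print(serialized)
print(independent)
```

When access is serialized, the two quick requests finish after 5,100 and 5,200 milliseconds instead of 100 milliseconds each. This is the kind of cross-coupling worth testing for.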

On the other hand, if the tunnel satisfies OPC requests locally, this situation simply does not happen. The OPC transactions do not cross the network, so they are not subject to network effects nor to serialization across the tunnel.

4. What About Security?

Whenever you get involved in networking plant data, security is a key concern. In fact, security is a primary reason for choosing tunnelling over DCOM. DCOM was never intended for use over a wide area network, so its security model is primarily designed to be easily configured only on a centrally administered LAN. Even making DCOM security work between two different segments of the same LAN can be extremely difficult. One approach to DCOM security is to firewall the whole system, so that nothing gets in or out, then relax the security settings on the computers inside the firewall. This is perhaps the best solution on a trusted network, but it is not always an option. Sometimes you have to transmit data out through the firewall to send your data across a WAN or even the Internet. In those cases, you are going to want a secure connection. Relaxed DCOM settings are simply not acceptable.

Most experts agree that there are three aspects to network security:

  • Data encryption is necessary to prevent anyone who is sniffing around on the network from reading your raw data.
  • User authentication validates each connecting user, based on their user name and password, or some other shared secret such as a private/public key pair.
  • Authorization establishes permissions for each of those authenticated users, and gives access to the appropriate functionality.

There are several options open to tunnelling vendors to provide these three types of security. Some choose to develop their own security solution from the ground up. Others use standard products or protocols that many users are familiar with. These include:

SSL (Secure Sockets Layer) – Provides data encryption only, but is very convenient for the user. Typically, you just check a box in the product to activate SSL data encryption. The tunnelling product must provide user authentication and authorization separately.

VPN (Virtual Private Network) – Provides both encryption and authentication. VPN does not come as part of the product, per se, but instead is implemented by the operating system. The tunnelling product then runs over the VPN, but still needs to handle authorization itself.

SSH (Secure Shell) Tunnelling – Provides encryption and authentication to a TCP connection. This protocol is more widely used in UNIX and Linux applications, but can be effective in MS-Windows. SSH Tunnelling can be thought of as a kind of point-to-point VPN.

As none of these standard protocols covers all three areas, you should ensure that the tunnelling product you choose fills in the missing pieces. For example, don’t overlook authorization. The last thing you need is for some enterprising young apprentice or intern to inadvertently link in to your live, production system and start tweaking data items.
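
To make the distinction between authentication and authorization concrete, here is a minimal Python sketch using only the standard library. It is not how any particular tunnelling product works: the AccessControl class, the user names, and the "read" and "write" permissions are assumptions for the example, and encryption of the connection is left out entirely.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so that plain passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class AccessControl:
    """Hypothetical user table: who you are, and what you may do."""

    def __init__(self) -> None:
        self._users = {}  # name -> (salt, password_hash, permissions)

    def add_user(self, name: str, password: str, permissions) -> None:
        salt = os.urandom(16)
        self._users[name] = (salt, hash_password(password, salt), set(permissions))

    def authenticate(self, name: str, password: str) -> bool:
        """Authentication: is this really the named user?"""
        record = self._users.get(name)
        if record is None:
            return False
        salt, stored, _ = record
        return hmac.compare_digest(stored, hash_password(password, salt))

    def authorize(self, name: str, action: str) -> bool:
        """Authorization: is this user allowed to perform the action?"""
        record = self._users.get(name)
        return record is not None and action in record[2]


acl = AccessControl()
acl.add_user("engineer", "correct horse", {"read", "write"})
acl.add_user("intern", "battery staple", {"read"})

intern_ok = acl.authenticate("intern", "battery staple")
intern_can_write = acl.authorize("intern", "write")
engineer_can_write = acl.authorize("engineer", "write")
print(intern_ok, intern_can_write, engineer_can_write)
```

The intern in this example logs in successfully but is still refused write access, which is exactly the case that authentication alone would miss.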

How Can You Know? Test!

The concept of tunnelling OPC DA is still new to many of us. Vendors of tunnelling products for OPC DA spend a good deal of time and energy just getting the basic point across: eliminate the hassles of DCOM by using TCP across the network. Less attention has been put on the products themselves, and their design. As we have seen, though, these details can mean all the difference between a robust, secure connection, or something significantly less.

How can you know what you are getting? Gather as much information as you can from the vendor, and then test the system. Download and install a few likely products. (Most offer a time-limited demo.) As much as possible, replicate your intended production system. Put a heavy load on it. Pull out a network cable and see what happens. Connect multiple clients, if that’s what you plan to do. Configure the security. Also consider other factors such as ease of use, OPC compliance, and how the software works with other OPC-related tasks you need to do.

If you are fed up with DCOM, tunnelling OPC DA provides a very good alternative. It is a handy option that any engineer or system integrator should be aware of. At the very least, you should certainly find it an improvement over configuring DCOM. And with the proper tools and approach, you can also make it as robust and secure as your network will possibly allow.
