What is a Digital Twin and why do I care?
A Digital Twin uses data from sensors installed on physical systems to represent their near real-time status, working condition or position. This modelling technology allows us to see what is happening inside the system without having to be able to get inside the system. It forms a critical step in the information value chain without which it is often impossible to get from raw data to insight, and therefore to value. As the Internet of Things grows, Digital Twins will become a standard tool for Data Scientists and Engineers wishing to use all this new data to automatically understand and respond to what is going on in the real world.
What problem does the Digital Twin solve?
Control systems engineering gives us a very strong theoretical and conceptual framework for understanding Digital Twins, how to build them and how to apply them. Here is how that discipline looks at any real-world system:
- Inputs combined with the previous state of the system change the system state. Note that many real-world systems are stateful in some way: what they are doing now depends on what has happened to them previously, not just on their current inputs.
- These inputs change the system in some useful way.
- Some, at least, of the internal state of the system appears at its outputs. To the extent that the state can be measured at the outputs, the system is said to be observable.
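As a rough illustration of the three points above, here is a minimal Python sketch of a stateful system. The tank, its level switch and all the numbers are assumptions of mine, chosen only to make the idea concrete: the level is the internal state, the flows are the inputs, and the only output is a single high-level switch.

```python
from dataclasses import dataclass


@dataclass
class Tank:
    """Hypothetical stateful system: a tank with one level switch."""
    level: float = 0.0          # internal state, in litres (not measured directly)
    alarm_level: float = 50.0   # threshold of the only sensor we have (assumed)

    def step(self, inflow: float, outflow: float) -> bool:
        # The new state depends on the previous state and the current inputs.
        self.level = max(0.0, self.level + inflow - outflow)
        # Only part of the state reaches the output: a high-level switch.
        return self.level >= self.alarm_level


tank = Tank()
# The same input is applied three times, yet the output changes on the third step.
outputs = [tank.step(inflow=20.0, outflow=0.0) for _ in range(3)]
print(outputs)     # [False, False, True]
print(tank.level)  # 60.0
```

The switch tells me the level is above 50 litres, but not what the level is, so this toy system is only partly observable from its output.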
The two concepts I emphasize above are key to getting useful information out of an IoT solution:
- Observability: my IoT solution must be able to provide the information that is important to me, not just the data that is available to measure.
- Statefulness: my IoT solution must be able to tell me correctly what the state of my system is right now, even if that state is affected by things that happened a long time ago.
A system is observable if at any time its internal state can be reconstructed from no more information than its current outputs.
In the Internet of Things, this means that if a formula or formulas can be applied to the available output data to tell me everything I need to know about the system at any given time, then the system is observable. Great. In that case we are in the realm of analytics, and need not bother ourselves further about a Digital Twin for this system. A good example of this might be that I need to know the efficiency of a chiller unit. I have measurements for voltage, current, temperature and flow rate. Then at any given time
Efficiency[T] = (temp[T] * flow rate[T]) / (voltage[T] * current[T]).
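For an observable system like this, the whole calculation is a stateless function of the current measurements. The sketch below is only an illustration of that point: the `ChillerSample` type, the field names and the sample values are my own assumptions, and no units are implied beyond what the formula above states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChillerSample:
    """One set of measurements taken at the same instant T (hypothetical type)."""
    temp: float
    flow_rate: float
    voltage: float
    current: float


def efficiency(sample: ChillerSample) -> float:
    """Apply the formula above to a single sample; no history is needed."""
    electrical_power = sample.voltage * sample.current
    if electrical_power == 0:
        # Chiller is drawing no power, so the ratio is undefined.
        return float("nan")
    return (sample.temp * sample.flow_rate) / electrical_power


# Made-up readings, purely for illustration.
sample = ChillerSample(temp=6.0, flow_rate=2.0, voltage=400.0, current=10.0)
print(efficiency(sample))
```

Because the function needs nothing but the sample it is given, I can run it over any time window I like and get the same answer for each instant.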
However if I cannot extract all the information I need from the current system outputs, then my system is unobservable, or at least not completely observable. For example, if I want to know if a production machine is faulted, and why, and my only measurable output is the number of units produced in the last minute, and that number is zero (0), then I can infer that there is possibly something wrong, but I cannot say for sure, and I certainly can’t say what. To make my system observable I would need access to the fault code register.
The Digital Twin is a model that responds to its inputs in the same ways as the real-world system, but with the advantage that I can look anywhere inside it that I want to, hence the “represent” part of the Digital Twin definition. This means that if I force the Digital Twin to follow the same inputs and outputs as the real-world system, then the previously unobservable information I need about the real-world system will be right there at my fingertips.
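To make this concrete, here is a minimal sketch of a twin for the production machine example. Everything specific in it is hypothetical: the event names (`start`, `stop`, `feed_empty`, `reset`), the latching fault rule, the `NO_MATERIAL` code and the nominal rate of 12 units per minute are assumptions I have made for illustration, not the behaviour of any real machine.

```python
from typing import Optional


class MachineTwin:
    """Hypothetical twin of a production machine (all rules are assumptions)."""

    def __init__(self) -> None:
        self.running = False
        self.fault: Optional[str] = None  # internal state the real machine does not expose

    def apply(self, event: str) -> None:
        """Feed the twin the same input event the real machine received."""
        if event == "start" and self.fault is None:
            self.running = True
        elif event == "stop":
            self.running = False
        elif event == "feed_empty" and self.running:
            # Assumed rule: running out of material latches a fault.
            self.running = False
            self.fault = "NO_MATERIAL"
        elif event == "reset":
            self.fault = None

    def expected_units(self) -> int:
        # Assumed nominal rate of 12 units per minute while running.
        return 12 if self.running else 0

    def explain(self, measured_units: int) -> str:
        """Compare the twin with the measured output, then report the hidden state."""
        if measured_units != self.expected_units():
            return "twin and machine disagree: model needs attention"
        if self.fault is not None:
            return f"faulted: {self.fault}"
        return "running" if self.running else "stopped"


twin = MachineTwin()
for event in ["start", "feed_empty", "start"]:
    twin.apply(event)
print(twin.explain(measured_units=0))           # faulted: NO_MATERIAL
print(MachineTwin().explain(measured_units=0))  # stopped
```

Both calls see the same measured output of zero units, but the twin can tell a machine that was never started from one that has latched a fault, which is exactly the information the output alone could not give me.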
Obviously, the answer I get from my Digital Twin model depends on when I start my analysis from. Let’s imagine a system that produces an output "Y" if it receives the input "B", unless it has received an "A" first, in which case its output will be "X". In the real world, this might be a machine going into different fault and run states, where the effect of an input on the machine’s state depends on the state the machine is in at the time. If I go far enough back in time, I realize that my system did receive an input "A", and so by the rules of my system, the later "B" results in my model producing the output "X". However if I don’t go back far enough, I will think that I only got a "B", and the output should be "Y". But how far back is “far enough”? The input "A" might have arrived 100 milliseconds ago, or it might have arrived yesterday, or just before the week-end. Which means that I cannot just pick up and run my model over a selected time period any time I want to get an answer, quite apart from the sheer impracticality of crunching the numbers while the User waits for an answer. My Digital Twin must run continuously to hope to be accurate: of course I can store its results for later historical analysis. Hence real-time.
If we have a Digital Twin for a real-world system, we can interrogate the Digital Twin to discover the otherwise unobservable state of the system, even if we cannot directly measure that state. For this to be successful though, we must run our Digital Twin continuously, although this has the side benefit that its insights are always available to us in real, or near-real, time.
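The "A"/"B" system described earlier is small enough to sketch in a few lines, which makes the case for running continuously easy to see. This is only an illustration under my own assumptions: I treat an "A" as remembered indefinitely, use "-" to mean "no relevant input", and the names `ABTwin` and `replay` are hypothetical.

```python
from typing import Iterable, Optional


class ABTwin:
    """Twin of the "A"/"B" system; assumes an "A" is remembered indefinitely."""

    def __init__(self) -> None:
        self.seen_a = False  # the hidden state

    def step(self, symbol: str) -> Optional[str]:
        if symbol == "A":
            self.seen_a = True
            return None
        if symbol == "B":
            return "X" if self.seen_a else "Y"
        return None  # any other symbol has no effect


def replay(inputs: Iterable[str]) -> Optional[str]:
    """Run a fresh twin over a chosen window and return its last output."""
    twin = ABTwin()
    last = None
    for symbol in inputs:
        out = twin.step(symbol)
        if out is not None:
            last = out
    return last


history = ["A", "-", "-", "-", "B"]
print(replay(history))       # X  (window reaches back to the "A")
print(replay(history[-2:]))  # Y  (window starts too late and misses the "A")
```

A twin that has been stepped on every input as it arrived never has to guess where the window should start, because its state already carries the effect of the "A".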