Deep Reinforcement Learning


Article summary

How to use a Deep Reinforcement Learning model in your control system.

Deep Reinforcement Learning (DRL) models are well suited to advanced control of complex, non-linear processes. They allow for multi-objective control schemes that balance efficiency, safety, and quality simultaneously. A DRL model can drive an analog output, such as a control valve or a VFD, or discrete equipment, such as a solenoid or a motor starter. Below we dive into some common practices and things to be aware of when programming around a DRL model.

Fallback Control Scheme

When adding a DRL model to your control system, it is crucial to plan a fallback control scheme in case of a model failure. DRL models are only as reliable as the data they were trained on; if the process values fall outside every scenario the model was trained on, the model will enter a failed state, and it is up to the control system to determine how to continue.
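
To illustrate, the permissive that gates DRL control might look like the following minimal sketch. The signal names, the trained range, and the watchdog timeout are all hypothetical stand-ins for whatever health signals your integration actually exposes.

```python
# Sketch of a model-health watchdog (all names and limits are hypothetical).
# DRL control is only permitted while the model reports healthy, its output
# is fresh, and the process value sits inside the range it was trained on.

from dataclasses import dataclass

@dataclass
class ModelStatus:
    healthy: bool  # model's self-reported state
    age_s: float   # seconds since the last model output arrived

TRAINED_PV_RANGE = (10.0, 90.0)  # hypothetical envelope seen in training
STALE_LIMIT_S = 5.0              # hypothetical watchdog timeout

def drl_permitted(status: ModelStatus, pv: float) -> bool:
    """True only when it is safe to leave the DRL model in control."""
    in_range = TRAINED_PV_RANGE[0] <= pv <= TRAINED_PV_RANGE[1]
    fresh = status.age_s <= STALE_LIMIT_S
    return status.healthy and in_range and fresh
```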

Mode Selection

A typical way to implement a DRL model in your control scheme is to create a selector block that allows the operator or logic to swap between the fallback control scheme and DRL control. The mode selector is typically presented to the operator as a faceplate that clearly indicates which mode is currently in control. It should also only allow DRL control if the model is in a good state and the connection to Koios is valid. Below is a simple example of a selector control scheme that uses a PID loop as the fallback mode in the event of an interlock.
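
Here is a minimal sketch of that selector logic. In a PLC or DCS this would usually be a standard selector/switch block rather than code, but the permissive logic is the same; the signal names (drl_ok, koios_connected, interlock_active) are illustrative, not a vendor API.

```python
# Sketch of the selector: the DRL output wins only when the operator has
# selected DRL AND the model is healthy AND the Koios connection is valid
# AND no interlock is active. Otherwise the PID fallback takes over.

def select_output(operator_mode: str,
                  drl_cv: float,
                  pid_cv: float,
                  drl_ok: bool,
                  koios_connected: bool,
                  interlock_active: bool) -> tuple[str, float]:
    """Return (active_mode, cv) to send to the final control element."""
    drl_allowed = (operator_mode == "DRL"
                   and drl_ok
                   and koios_connected
                   and not interlock_active)
    if drl_allowed:
        return "DRL", drl_cv
    return "FALLBACK", pid_cv

# Example: an active interlock forces the PID fallback regardless of mode.
mode, cv = select_output("DRL", drl_cv=62.0, pid_cv=58.5,
                         drl_ok=True, koios_connected=True,
                         interlock_active=True)
assert mode == "FALLBACK" and cv == 58.5
```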

Bumpless Transfer: Output Tracking

Output tracking is a crucial part of mode switching in any control scheme. Here, the output of the fallback control method should track the DRL output whenever DRL is in control, allowing a bumpless transfer when swapping from DRL to the fallback. In the other direction, from fallback control to DRL, tracking is usually handled in the model's training: the DRL agent typically takes the final control output as a model input, which continuously keeps it aligned with the active output. Some control systems will automatically back-calculate and track the selector's output from the upstream logic; if yours does not, you can implement tracking explicitly, exposing a Set CV pin that allows the block's CV to be overwritten while the DRL model is in control.
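
To make the idea concrete, here is a sketch of a PI loop with a track input standing in for the Set CV pin. Back-calculating the integral term while tracking is one common way to achieve bumpless transfer; the class and its names are illustrative, not a vendor function block.

```python
# Sketch of a PI controller with a "Set CV" style track input. While DRL
# is in control (track=True), the loop's CV is overwritten with the DRL
# output and the integral term is back-calculated, so a switch back to
# fallback resumes from the current output with no bump.

class TrackingPID:
    def __init__(self, kp: float, ki: float, dt: float):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0
        self.cv = 0.0

    def update(self, sp: float, pv: float,
               track: bool = False, track_cv: float = 0.0) -> float:
        error = sp - pv
        if track:
            # Set CV pin: follow the DRL output and back-calculate the
            # integral so the next PI update starts from this same CV.
            self.cv = track_cv
            self.integral = self.cv - self.kp * error
        else:
            self.integral += self.ki * error * self.dt
            self.cv = self.kp * error + self.integral
        return self.cv
```

The back-calculation is the key design choice: by forcing the integral to whatever value reproduces the tracked CV, the first update after the transfer produces essentially the same output the DRL model was last commanding.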

