Understanding Fault Detection & Diagnostics
What is a Fault?
A fault is a period of time in which a specified condition is true on a specific device.
For example, this is a fault:
@8:00am AHU-1 Could not maintain discharge temperature within setpoint for 5 hours.
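To break this fault down against the definition: the period of time is the 5-hour span tied to the 8:00am timestamp, the specified condition is "could not maintain discharge temperature within setpoint", and the specific device is AHU-1.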
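As a rough illustration, the definition can be read as a record with three parts: a device, a condition, and a period of time. The sketch below is not the KODE Labs data model. The `Fault` class and its field names are hypothetical, the calendar date is made up (the example gives only 8:00am), and it assumes the timestamp marks when the condition became true.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Fault:
    """Hypothetical record: a condition that was true on a device for a period of time."""
    device: str          # the specific device, e.g. "AHU-1"
    condition: str       # the specified condition that was true
    start: datetime      # assumed: when the condition became true
    duration: timedelta  # how long the condition stayed true

    @property
    def end(self) -> datetime:
        # The period of time runs from start to start + duration.
        return self.start + self.duration


fault = Fault(
    device="AHU-1",
    condition="Could not maintain discharge temperature within setpoint",
    start=datetime(2024, 1, 15, 8, 0),  # date is invented for illustration
    duration=timedelta(hours=5),
)
print(fault.device, fault.start.time(), "to", fault.end.time())
```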
A number of terms are used within the industry to identify this concept, and different parties do not always use them consistently. It is therefore important to understand what these terms mean when they are used within the KODE Labs platform.
🚨 Alarms
Alarms are the lowest and simplest level of condition checking. An alarm is usually a check programmed within the BMS or provided by the device itself, and it uses only one or two points of data to detect the condition. Most commonly, an alarm is either an error or failure condition that the device detects and reports, or a check for exceeding prescribed limits, such as a temperature out of range. Alarms are calculated and communicated instantaneously and do not need to exist outside the bounds of that evaluation.
What they are good at
Alarms are great at identifying critical events and immediately notifying the necessary parties.
What they are not good at
Because alarms are not meant to carry meaning outside the block of time in which they are triggered, they are not the best inputs for extracting holistic, actionable, big-picture insights.
Examples of Alarms
- Zone Temperature Exceeds Maximum
- Discharge Fan has Failed (error from device)
- Freeze Status is in Alarm
- Fire Alarm is Active
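To make the "one or two points of data" idea concrete, here is a minimal sketch of a limit check in the spirit of "Zone Temperature Exceeds Maximum". The function name, the Fahrenheit units, and the 80.0 default limit are assumptions chosen for illustration, not values from any BMS or from the KODE Labs platform.

```python
def zone_temp_exceeds_max(zone_temp_f: float, max_temp_f: float = 80.0) -> bool:
    """Instantaneous alarm check using a single data point and a fixed limit.

    The limit of 80.0 degrees F is a hypothetical default.
    """
    return zone_temp_f > max_temp_f


# Each evaluation stands alone: no history is kept between readings.
for reading in (72.0, 83.5):
    state = "ALARM" if zone_temp_exceeds_max(reading) else "normal"
    print(f"zone temp {reading} F: {state}")
```

The check has no memory of earlier readings, which is why alarms suit immediate notification better than long-term analysis.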
🌋 Events
Events are the next level of condition checking. An Event identifies a specific device condition that usually requires checking several different inputs and/or devices. Events should be set up to detect a wide array of conditions across varying levels of importance. Because of this, a proper Events module must offer several ways to filter and sort the data to help the user weed through the noise.
What they are good at
An Event is useful both as an instantaneous indicator and, when analyzed over a period of time, as a source of insights from any patterns that arise.
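One simple way to look for patterns over time is to count how often each condition recurs on each device. The sketch below assumes events are available as plain `(device, condition, timestamp)` tuples. That shape and the sample data are invented for illustration and do not reflect the platform's actual export format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event log: (device, condition, timestamp)
events = [
    ("VAV-3", "Discharge Air Flow Lower than Expected", datetime(2024, 1, 15, 9, 0)),
    ("VAV-3", "Discharge Air Flow Lower than Expected", datetime(2024, 1, 16, 9, 5)),
    ("AHU-1", "Simultaneous Heating and Cooling", datetime(2024, 1, 16, 14, 0)),
    ("VAV-3", "Discharge Air Flow Lower than Expected", datetime(2024, 1, 17, 8, 55)),
]

# Count occurrences of each (device, condition) pair across the whole period.
counts = Counter((device, condition) for device, condition, _ in events)

# Most frequent pairs first: recurring conditions stand out at the top.
for (device, condition), n in counts.most_common():
    print(f"{n}x {device}: {condition}")
```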
What they are not good at
By themselves, Events can be very powerful. However, because such broad condition coverage can produce excessive output, the user must put significant effort into sorting through it to derive larger insights and/or diagnose issues.
Examples of Events
- Heating Not Active when Heating Required
- Discharge Air Flow Lower than Expected
- Simultaneous Heating and Cooling
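Unlike an alarm, an event check typically combines several inputs. The following is a minimal sketch of how "Simultaneous Heating and Cooling" might be evaluated. The point names (`heating_valve_pct`, `cooling_valve_pct`, `supply_fan_on`) and the 5% threshold are assumptions for illustration, not the rule the KODE Labs platform actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AhuSnapshot:
    """Hypothetical set of point values read from one air handler at one moment."""
    heating_valve_pct: float
    cooling_valve_pct: float
    supply_fan_on: bool


def simultaneous_heating_and_cooling(s: AhuSnapshot, threshold_pct: float = 5.0) -> bool:
    """True when the unit is running and both valves are open past the threshold.

    The 5% threshold is an assumed deadband to ignore valves that are barely open.
    """
    return (
        s.supply_fan_on
        and s.heating_valve_pct > threshold_pct
        and s.cooling_valve_pct > threshold_pct
    )


print(simultaneous_heating_and_cooling(AhuSnapshot(40.0, 35.0, True)))   # True
print(simultaneous_heating_and_cooling(AhuSnapshot(40.0, 0.0, True)))    # False
```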
🛠️ Incidents
Incidents are the highest level of condition checking within the KODE Labs FDD platform. Incidents are created by finding commonalities between groups of events through a number of different criteria.
- Reference-Based Grouping. For this grouping type, the program looks for events along ontology relationships.
  - For example, if a VAV is detected as unable to meet its room temperature setpoint, the incident engine checks whether there is a problem on the AHU associated with that VAV and also looks for issues among that VAV’s siblings. If any additional events have occurred along those relationships, they are all grouped together under a single incident.
The benefit of this feature is that it performs some of the root-cause analysis that nobody has time to do, ultimately reducing the number of identified conditions to a manageable set that guides the user toward the real problem.
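The VAV and AHU example can be sketched as a grouping by shared parent. This is a deliberately simplified illustration, not the incident engine itself: the `served_by` mapping stands in for the ontology, the device names and events are invented, and real grouping would also need to account for timing and other criteria.

```python
from collections import defaultdict

# Hypothetical ontology: each VAV maps to the AHU that serves it.
served_by = {"VAV-1": "AHU-1", "VAV-2": "AHU-1", "VAV-7": "AHU-2"}

# Hypothetical events as (device, condition) pairs.
events = [
    ("VAV-1", "Room temperature setpoint not met"),
    ("AHU-1", "Discharge Air Flow Lower than Expected"),
    ("VAV-2", "Room temperature setpoint not met"),
    ("VAV-7", "Heating Not Active when Heating Required"),
]


def group_by_reference(events, served_by):
    """Group events that share an AHU, covering the parent and sibling VAVs."""
    groups = defaultdict(list)
    for device, condition in events:
        # A VAV is keyed by its AHU; an AHU (no parent listed) is keyed by itself.
        root = served_by.get(device, device)
        groups[root].append((device, condition))
    return dict(groups)


incidents = group_by_reference(events, served_by)
for root, grouped in incidents.items():
    print(f"Incident around {root}: {len(grouped)} event(s)")
```

Here four events collapse into two candidate incidents, with the three events around AHU-1 pointing at a likely shared cause upstream of the VAVs.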
What they are good at
Bottom line: Incidents minimize time needed for troubleshooting and investigation, and maximize the effectiveness of maintenance operations.
What they are not good at
Incidents are designed to make lives easier, but no engine like this can catch 100% of issues. Engineers still need to sort through events and derive their own insights using their experience and knowledge. For that type of analysis, you are better served by Events.