CAUTION: Service Disruption Possible – Rethink OT Security

Delicate OT Environments Require Security Aligned to Operational Best Practices

As a pioneer in a new breed of OT security (aka ICS, aka Critical Infrastructure, aka SCADA), Claroty is sought out and engaged by clients across nearly every corner of the globe. 

This is partially because we’ve brought to the table a unique understanding of the OT domain and, arguably, a team comprising some of the best cybersecurity minds in the world…but it’s also because we bring a burning desire to LISTEN and LEARN from our clients as to what is and isn’t acceptable inside their complex, mission-critical operational networks. 

We’re a partner more than a vendor – we’re helping to bridge the needs of IT and OT teams to drive better visibility and security…and we understand that each of our clients’ environments is unique and requires a unique approach. 

We knew going into this – and it has been reaffirmed by every single client to this point – that securing OT networks requires the same Hippocratic oath found in medicine… 

Rule number 1: whatever you do, do no harm.

OT networks require reliability, safety, and uptime. More so than in the IT domain, these networks cannot suffer unplanned interruptions or downtime. If they do, the consequences can range from catastrophic events that endanger the physical safety of employees and the public at large, to massive financial losses, to disruptions of the critical infrastructure that drives the economy, to the inability to restore operations rapidly. 

Recently we spoke with a client that suffered a significant loss simply by conducting an activity that happens every day in the IT domain…active scanning of the network for vulnerabilities. 

When we heard about this, we knew that our approach to building a solution that delivers passive, real-time monitoring of OT networks was absolutely the right decision.

Making a Mistake with Active Scanning…

It started out simply enough. A power company wanted to conduct a vulnerability assessment of their network. It ended badly when the firm they hired to conduct this assessment failed to align to OT operational best practices.

Background:

· The site is a natural gas-fired power station producing approximately $1–2 million worth of electricity each day.

· As with any industrial site, the power station is subject to periodic maintenance activities (equipment replacement, software upgrades, electrical work, etc.). Any one of these activities requires production to be paused until the work is complete. 

· Due to the high cost of every non-production hour, site maintenance takes place within a strictly defined execution timeframe. Any deviation from this timeframe results in tangible monetary loss and impact to the public. To maximize utilization, the maintenance window is usually also leveraged to perform additional activities that cannot be done while the system is in production mode.

· A recent government mandate required that security assessments be conducted across the energy sector – the company decided to conduct this assessment within the normal maintenance window.

Incident and Response Description:

· The security assessment was conducted using a very well-known and popular active scanning tool, with SCADA plugins adjusted to the local network (a generic sketch of this kind of active probing follows this list).

· Two hours into the site assessment, the OT team discovered that the active network scanning had damaged an Ovation DCS PLC. The IT security assessment team was completely unaware of the damage.

· Upon discovery, the OT team switched to a redundant backup PLC, as is standard procedure. However, the main PLC was completely dead and could not provide the redundant one with the required synchronization data. As a result, the redundant PLC could not provide full operational backup. 

· Due to the redundant PLC failure, the OT team had to fully clear the main PLC’s memory (ROM/RAM), reset it to factory default, locate the backup of the program that had been running on it, and download it again.

· The damage resulted in a restoration procedure that took approximately six hours…that’s six hours of unplanned downtime with no production…very far from optimal.

· Several days after the incident, the OT team discovered additional damage to remote I/O. The full scope of the incident’s implications is not yet known.
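
For context on what “active” means here: an active scanner discovers and assesses assets by sending unsolicited traffic to them and observing how they respond. The minimal sketch below is purely illustrative – it is not the tool used in this incident, the target address and port list are placeholders, and it should only ever be run in an isolated lab – but it shows the kind of TCP probing involved.

```python
# Illustrative only: a minimal TCP connect sweep of the kind an active
# scanner performs. This is NOT the tool used in the incident above, and
# it must never be pointed at a production OT network -- even this much
# unsolicited traffic can upset purpose-built controllers.
import socket

# Placeholder lab-only target (TEST-NET address) and a handful of ports
# commonly probed in assessments (102 = S7comm, 502 = Modbus/TCP,
# 44818 = EtherNet/IP).
TARGET = "192.0.2.10"
PORTS = [21, 22, 80, 102, 443, 502, 44818]

def probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Attempt a TCP handshake; even the attempt is 'active' traffic."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for port in PORTS:
        state = "open" if probe(TARGET, port) else "closed/filtered"
        print(f"{TARGET}:{port} -> {state}")
```

Even a bare handshake attempt like this is unexpected traffic from the controller’s point of view, and as the incident shows, there is no general way to predict how a purpose-built device will handle it.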

Takeaways:

· The OT/IT Convergence Trend Brings Operational Risk: The IT security staff that conducted the assessment did not regard its actions as potentially harmful. After all, these were standard procedures in IT network scanning. Security teams and security vendors need to understand that the OT domain requires a rethink of strategy…what works for IT does not necessarily work for OT. As these different worlds converge, new approaches are needed.

· This Wasn’t Age…It Was Design: In OT networks, there are definitely “legacy” technologies in place…but in this case, the age of the damaged PLCs had nothing to do with what happened. The damaged PLCs were by no means “legacy”; they were only a few years old, running the one-before-latest software version. The inability to withstand active scanning is not related to a device’s age but to its original design as a purpose-built machine that lacks the flexibility to handle anything beyond what it was built for.

· OT Asset Sensitivity – The Risk of Active Scanning: There is no way to know in advance how a PLC will react to active scanning. In this case, the IT security staff and the team engaged to conduct the assessment were using a very popular, widely deployed tool…but the active scanning approach that works in IT does not work in OT. This is why Claroty has created a solution built on passive, real-time monitoring of the OT networks it secures; a conceptual sketch of that passive approach follows below.
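
By contrast, a passive approach only listens. The sketch below is a rough, conceptual illustration of passive asset discovery – it assumes the open-source scapy library and a SPAN/mirror port feeding traffic to an interface named “eth1” (both assumptions for illustration only, and in no way Claroty’s actual, proprietary implementation). The point it makes is simple: the monitoring host builds an inventory from traffic it observes without ever sending a packet toward a controller.

```python
# A minimal sketch of passive OT asset discovery (assumes scapy and a
# SPAN/mirror port on interface "eth1"; capture typically requires root).
# The script only listens -- it never transmits to the devices it inventories.
from collections import defaultdict
from scapy.all import sniff, IP, TCP

# Well-known industrial protocol ports, used purely as passive hints.
OT_PORTS = {102: "S7comm", 502: "Modbus/TCP", 20000: "DNP3", 44818: "EtherNet/IP"}

assets = defaultdict(set)  # destination IP -> industrial protocols it is addressed on

def observe(pkt):
    """Tag hosts that are addressed on well-known industrial ports (likely controllers)."""
    if pkt.haslayer(IP) and pkt.haslayer(TCP):
        proto = OT_PORTS.get(pkt[TCP].dport)
        if proto:
            assets[pkt[IP].dst].add(proto)

if __name__ == "__main__":
    # store=False keeps memory flat during long captures; nothing is sent.
    sniff(iface="eth1", prn=observe, store=False, timeout=60)
    for ip, protos in sorted(assets.items()):
        print(f"{ip}: {', '.join(sorted(protos))}")
```

Because nothing is transmitted, the worst case for the controllers is an incomplete inventory – not a PLC going down mid-shift.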

This is a new world, folks – as we embark upon the mission to enhance OT security, we need to treat it as such. We’re here to help, and we want to hear from you – let’s start talking today!