Project

Resilience and Security by Design of Water Networks

Research hypothesis

Modern societies depend on complex and interconnected infrastructures to deliver clean water to consumers. Water distribution networks (WDNs) are vulnerable to cyberattacks, deliberate contamination, and distribution failures. These critical infrastructures increasingly rely on information systems (IS), for which security is a major concern. Failure detection and any appropriate countermeasures depend not only on the design of the WDN, but also on the integrity of the IS, composed of the sensor network, the monitoring system, and their data.

The topology of the WDN must change to fully adapt to adverse conditions, as must the supervision tools monitoring the system for potential cyberattacks. It is therefore important that water quality and quantity observations are sufficient and informative, with sensors placed at the locations where they provide the most information. Questioning the WDN design and optimizing sensor placement defines a new resilience-by-design paradigm. The resilience-by-design solution must be approached in terms of multiple objectives, such as economic and technical ones. Multi-objective problems are generally difficult to solve and require many evaluations to calculate the Pareto fronts. Each evaluation should be model-based to be realistic, which requires a high-fidelity or a reduced-order hydraulic and transport model. This approach requires the development of a digital twin (DT) of the WDN, which must be calibrated to reduce the uncertainty of the input data and linked to accurate sensor data. The DT is relatively insensitive to lack of data integrity and availability because it is based on physical laws and its topology is well described by the network graph.
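As a minimal sketch of what calculating a Pareto front involves, the following Python fragment filters a set of candidate designs down to the non-dominated ones. The two objectives (a cost and a service deficit, both to be minimised) and all numerical values are hypothetical illustrations, not project data; in practice each pair would come from one model-based evaluation of a candidate design.

```python
from typing import List, Tuple

# Hypothetical objective vector: (cost, service_deficit), both minimised.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(designs: List[Design]) -> List[Design]:
    """Return the non-dominated designs using a naive O(n^2) pairwise filter."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Illustrative values only (assumption): five candidate designs.
candidates = [(10.0, 0.30), (12.0, 0.20), (11.0, 0.35), (15.0, 0.20), (20.0, 0.05)]
front = pareto_front(candidates)
print(front)
```

The pairwise filter is quadratic in the number of candidates, which illustrates why many evaluations and efficient front-extraction strategies matter for this class of problem.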

It is also important to consider cyber risk to ensure robust detection of malicious activity. In the event of a cyberattack, the availability, confidentiality and integrity of sensitive data and of the IS can be directly compromised, with irreparable consequences for the WDN. It is therefore essential that cybersecurity experts analyse digital traces (in this case system logs) using machine learning tools. Beyond filtering and machine learning, which provide outstanding results in finding known attacks, there is a need for tools to extract unknown attacks, i.e. uncommon and anomalous behaviour. CoRREau extracts these behaviours through log analysis in a data-driven approach that constitutes a digital shadow (DS) of the IS. The logs constitute a massive set of discrete events in which these anomalies are hidden. The project takes advantage of the exploration capability of genetic algorithms (GAs) to define a novel anomaly detection scheme based on the analysis of the topology of trace data: anomaly detection is expressed as a multi-criteria optimisation problem, which makes it possible to evaluate the impact of diverse features of the IT system (communication type and load, origin and destination of the traffic, interconnection between machines). The DS cybersecurity analysis efficiently complements the DT resilience-by-design assessment to ensure the physical and digital robustness of both interconnected networks.

The CoRREau project aims to answer three challenging research questions: i) how to obtain an observable reduced-order model, the DT; ii) how to distinguish sensor faults from true malicious attacks; and iii) what countermeasures should be taken.

As illustrated in Figure 1, the CoRREau project aims to leverage both model-based DT and data-based DS approaches to improve resilience through WDN design as well as IS security: resilience is key to the security of supply, and cyber-physical security is mandatory to reduce WDN exposure to remote malicious attacks. CoRREau exploits the complementarity of both approaches at the level of the water distribution and IT & OT supervision networks, in order to pave the way for a complete digital twin integrating the two systems.

Position of the project as it relates to the state of the art

Contribution and limitation of past research works

Network resilience to failure is a key concept that has been used with multiple meanings and covers multiple dimensions (not only technical). In general, and from a mathematical point of view, it amounts to quantifying the time (or rapidity) of return to normal, but it is also important to consider the system’s performance, its robustness, its structural redundancy, and the water utility’s resourcefulness. Several authors recognize that the WDN configuration is key in qualifying resilience; high-fidelity hydraulic modeling has been used, as well as trophic coherence and a variety of complex-network centrality and spectral properties. Very few studies propose changing the network topology to increase resilience. Those that do, for example to realize self-cleaning networks, to add valves that control the velocity, or to adapt district metered areas (DMAs) to real-time conditions and hazards with flexible and changing boundary conditions, minimizing the average pressure, consider a single key performance indicator. In some respects they explore the design of the network only to a limited extent, as they do not consider additional sensors, all the hydraulic devices, or all the possible reconfigurations. Moreover, these research works were intended for normal WDN operations and not for cyber-physical attacks.
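To make the notion of "time of return to normal" concrete, the sketch below computes two simple indicators from a performance time series: the recovery time after a disruption and the cumulative performance shortfall. This is only one possible quantification among the many meanings mentioned above; the performance measure (fraction of demand served), the 0.95 threshold and the sample values are assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def recovery_time(times: Sequence[float], performance: Sequence[float],
                  threshold: float = 0.95) -> Optional[float]:
    """Time between the first drop below `threshold` and the first return to it.

    Returns None if performance never drops or never recovers.
    """
    start = None
    for t, p in zip(times, performance):
        if start is None:
            if p < threshold:
                start = t  # disruption begins
        elif p >= threshold:
            return t - start  # back to normal
    return None


def performance_loss(times: Sequence[float], performance: Sequence[float],
                     nominal: float = 1.0) -> float:
    """Trapezoidal integral of the shortfall (nominal - performance), clipped at 0."""
    loss = 0.0
    for i in range(1, len(times)):
        a = max(nominal - performance[i - 1], 0.0)
        b = max(nominal - performance[i], 0.0)
        loss += 0.5 * (a + b) * (times[i] - times[i - 1])
    return loss


# Hypothetical hourly series: fraction of demand served during a failure.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
served = [1.0, 1.0, 0.5, 0.5, 1.0, 1.0]
t_rec = recovery_time(times, served)
loss = performance_loss(times, served)
```

A single indicator of this kind captures rapidity only; robustness, redundancy and resourcefulness would need separate indicators, which is one reason a multi-objective formulation is relevant.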

The hydraulic response to a cyber-physical attack may depend not only on the attack features but also on the initial conditions, and the same hydraulic response can be obtained with different attacks; the location of sensors and actuators may not be sufficient to discriminate among attack situations. It is important to use the right model for water quantity and quality, and the network model must be calibrated with online pressure and flow rate measurements. In this respect, metamodels from complex network theory can lead to large model prediction errors, although it is still possible to make decisions and place sensors using network graph analysis. The right model should explore the various acceptable reconfigurations of the system and its regulation, while mitigating impacts and identifying countermeasures. For the CoRREau project, the existing high-fidelity hydraulic model (the Porteau software of the partner INRAE) will be simplified and represented by a metamodel, namely the DT. The DT will be used to search for optimal network designs by solving the resilience-by-design optimisation problem.

Additionally, to support the identification of attacks in the WDN monitoring system, a new data-driven model is proposed to secure the IS. This model detects abnormal behaviours by expressing anomaly detection as an optimisation problem: the further the traces (or, more exactly, their relevant features) are from previously observed system behaviours, the more likely the associated operations are malicious. The CoRREau approach combines the latest results in Pareto-front optimisation for multi-criteria optimisation with novel discrete genetic algorithms. Identifying Pareto fronts with controlled computational complexity is key to the performance of multi-criteria GAs. ASREA reduces the complexity of previous schemes such as NSGA-III through a size-bound archive, thus avoiding quadratic complexity. FastEMO extends the archive-based approach of ASREA using low-dependency individuals and a large front, thus greatly limiting the number of iteration rounds. However, these approaches have mostly been validated for continuous analysis only. Discrete optimisation relies on an estimation-of-distribution approach [18], which copes well with noise and local optima and is expected to support good optimisation times and good representations of the structure of the fitness landscape. However, although such algorithms prove efficient in practice, they require further investigation to deepen their theoretical understanding, in particular regarding the impact of data formats such as non-binary-string representations for multivariate estimation-of-distribution algorithms (EDAs). Building efficient Pareto-front extraction strategies for multi-criteria optimisation on discrete data, which will make it possible to track malicious activities in the IS, relies on efficient and very recent models, whose integration will both provide significant progress beyond the state of the art and efficiently support the results of the CoRREau project.
This approach relies on the instrumentation of the cyber-physical system (CPS) to generate a reference model from the logs. This instrumentation for log extraction is in essence a major innovation for WDNs, for which the only data available so far is sensor data.
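The idea that traces lying further from previously observed behaviour are more likely to be malicious can be illustrated with a deliberately simple stand-in: a nearest-neighbour Hamming distance between a new log event and a reference set. This is not the GA-based multi-criteria scheme described above, only a sketch of the underlying distance intuition; the feature layout (protocol, source, destination) and the sample events are hypothetical.

```python
from typing import Sequence, Tuple

# Hypothetical discrete features of one log event: (protocol, source, destination).
Event = Tuple[str, ...]


def hamming(a: Event, b: Event) -> int:
    """Number of feature positions in which two events differ."""
    return sum(x != y for x, y in zip(a, b))


def novelty_score(event: Event, reference: Sequence[Event]) -> int:
    """Distance to the closest previously observed event (0 means already seen)."""
    return min(hamming(event, r) for r in reference)


# Illustrative reference model built from past logs (assumed values).
reference = [
    ("modbus", "scada", "plc1"),
    ("modbus", "scada", "plc2"),
    ("https", "hmi", "scada"),
]

seen = novelty_score(("modbus", "scada", "plc1"), reference)
unseen = novelty_score(("ssh", "unknown", "plc1"), reference)
```

A higher score flags an event for inspection; with several such criteria (communication type, load, traffic origin and destination), ranking events becomes a multi-criteria problem of the kind the Pareto-front strategies above are meant to address.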

Methodology

WP1 is dedicated to project management and dissemination, with all partners involved. The four other scientific WPs are shown in Figure 2. WP2 is central and is led by the end user Eurométropole Strasbourg (CUS). Its aim is to define attack scenarios of interest and to validate the consortium's methods and tools developed in WP3, WP4 and WP5. The academic research will be articulated around two doctoral studies, the first in engineering sciences for the DT, the second in computer science for the DS.

Figure 2: Links between the different scientific WPs of the CoRREau project; all partners will participate in all WPs, to allow better assimilation of the results and to facilitate integration in WP5.

Modification date: 26 February 2025 | Publication date: 17 June 2022 | By: INRAE | S. Sabatié
