Securing Critical Infrastructure Through Innovative Use Of Merged Hierarchical Deep Neural Networks

Lav Gupta, University of Missouri-St. Louis

Abstract

Multi-clouds are becoming central to large, modern applications, including those in business, industry and critical infrastructure sectors. Designers usually deploy these clouds hierarchically to combine the low latency of the edge clouds with the high processing capabilities of the core clouds. The data that flows into the clouds for processing and storage, and moves out to other system domains for further use, must cross multiple trust boundaries and, as a result, faces a large attack surface. This gives malicious actors abundant opportunity to penetrate and potentially harm organizations or bring down critical services, causing widespread disruptions and mayhem. Deep neural network models can be used in innovative ways to protect the confidentiality and integrity of dataflows in the clouds. However, the use of deep learning comes with some challenges. In large multi-location and multi-cloud environments, deep learning models grow rapidly in size and complexity, inhibiting fast training of cloud models and making it difficult to maintain accuracy of detection of known and unknown attacks on the data-in-motion. This impedes their use in critical infrastructure services. We propose innovative distributed-hierarchical-merged models, which make use of cooperative training at the edge and the core clouds, and the power of data and model parallelism, to achieve rapid training with high accuracy. Our broad objectives in this paper are twofold. First, we show that merged hierarchical deep learning models, working cooperatively in the multi-cloud, significantly reduce the number of parameters to be trained and result in shorter core cloud training time. Second, we show that training the merged core model with a distribution strategy for data parallelism on CPUs and GPUs reduces the training time significantly further. We also show that the merged models achieve an improvement of about 25% over unmerged models, with accuracy of detection of unknown attacks in the range 96.9-99.5%.