| dc.description.abstract |
The past decade has seen an increase in the adoption of the Internet of Things (IoT) ecosystem.
The number of IoT devices is estimated to reach around 30.9 billion by the end of 2025, as reported by Statista. This number is almost four times the current population of the world. The massive adoption
of the IoT system in domains such as healthcare, energy, transportation, home, and industry, coupled
with the heterogeneity, computational constraints, and the insecure nature of some of these IoT
devices, has made it an attractive target for cyber-attacks. Over the years, various techniques have been
explored either to mitigate or reduce the number of attacks against the IoT system. Some of these
techniques are encryption, access control, secure architecture, and intrusion detection systems
(IDS). IDSs have proven very effective in detecting attacks in traditional computing systems. Over the past ten years, much work has focused on using IDSs to detect attacks within the IoT ecosystem. However, IDSs for traditional computing systems do not meet the requirements
of the IoT ecosystem because of its heterogeneous, protocol-specific, and dynamic nature
and the limited computational
capacity of IoT devices. This has led the IoT security research community to develop
IoT-specific IDSs that can overcome the challenges mentioned above. Most of the work done
in the quest to design these IoT-specific IDSs uses machine learning (ML) based techniques, with
the majority of these approaches using offline ML techniques. Using offline ML algorithms to
design an IDS for the IoT ecosystem leads to problems such as the IDS becoming obsolete when
the data changes from that used to train the model, computational complexity, and
an inability to adapt to real-time network traffic. Online ML algorithms have shown the ability to
produce lightweight models, adapt to real-time environments, and handle drifts in domains such as recommender systems.
In this thesis, we designed a lightweight IDS using online ML techniques that can run on the
edge of an IoT network, such as on a gateway device. In addition, the proposed IDS is designed to
adapt to changes in network traffic in real time.
The study proposed an IoT-based IDS that uses an online ML algorithm to remain
lightweight and self-learning by dynamically adapting to changes in traffic.
The study is divided into five stages. The first stage involved investigating various ML algorithms
to identify which techniques are most suitable for developing a self-learning IDS for the IoT
ecosystem. Our investigation revealed that online ML algorithms can be used to design IDS models
that are lightweight and can adapt to network traffic in real time without having to retrain
the model. To validate this, we used an ensemble of Gaussian Naïve Bayes and Hoeffding Tree
to design an online ensemble model. The results show that the proposed IDS recorded an average
accuracy of 99.98% with a memory usage between 122.38 KB and 650.11 KB.
During the second stage, we focused on using a lightweight data preprocessing technique to
reduce the memory and computational requirements of the proposed IDS. We used
incremental principal component analysis (IPCA) as the data preprocessing technique and used
the Self-Adjusting Memory k-Nearest Neighbour (SAMKNN) to model our IDS. We used an
on-device (Raspberry Pi Model B) training approach to build the proposed IDS. The results show that
the proposed model could record an accuracy as high as 98.91%, using 1.4% of the
total memory allocated on the device, 1.6% of the CPU, and an average energy usage of 2%.
In the third stage, we developed an adaptive version of the SAMKNN algorithm that dynamically
updates and reacts to drifts. Our proposed Adaptive SAMKNN dynamically adjusts its
memory allocation. The approach not only allows the IDS to detect real-time threats but also uses
minimal memory. The results show that the proposed IDS outperforms the non-adaptive version
of SAMKNN in terms of memory efficiency.
In the fourth stage of the study, we explored the ability of the proposed IDS to detect zero-day
attacks launched by adversaries in the form of adversarial attacks. The results show that the IDS built using
Adaptive SAMKNN can detect synthetic zero-day attacks injected into the various datasets used for the
experimental validation.
Finally, we developed an online deep-learning-based IDS with dynamic quantization. The
results show that the proposed system achieved high performance and used minimal computational
resources after quantization. |
en_US |