Anomaly Detection in Smart Home IoT Devices Using Unsupervised Learning
Keywords:
Smart home, IoT security, anomaly detection, unsupervised learning, autoencoder, Isolation Forest, OC-SVM, time series, network telemetry, edge computing
Abstract
Smart homes concentrate dozens of heterogeneous Internet-of-Things (IoT) devices—thermostats, cameras, door locks, voice assistants, motion sensors—into a single, always-connected environment. This heterogeneity, coupled with weak device security and intermittent connectivity, widens the attack surface and increases the chance of silent malfunctions. Traditional intrusion detection systems struggle here because labeled attack data are scarce, device firmware and behavior change rapidly, and “normal” routines vary widely across households. This manuscript investigates unsupervised learning methods for anomaly detection in smart homes, focusing on network-centric and device-telemetry signals. We design a modular pipeline that (i) ingests raw network flows and device logs, (ii) normalizes and featurizes time-series windows, (iii) learns baseline behavior using Isolation Forest, One-Class Support Vector Machines (OC-SVM), and deep autoencoder variants (feed-forward and LSTM), and (iv) raises alerts using statistically principled thresholds (Median Absolute Deviation and Extreme Value Theory).
A simulation study blends realistic benign traffic with injected anomalies (port scans, command-and-control beacons, ARP spoofing, rogue firmware updates, packet floods, and sensor drift/freeze faults). We evaluate using AUROC/AUPRC, F1 at a threshold chosen on a small validation subset, precision@k for triage, false-positive rate at a fixed true-positive rate, and detection delay. Results show that temporal autoencoders (LSTM-AE) excel on slow-drift operational faults and bursty command sequences, while Isolation Forest offers a robust, interpretable baseline with low computational cost, suitable for deployment on home gateways. OC-SVM performs competitively on stationary features but is sensitive to scaling and concept drift. We discuss threshold calibration, household personalization, privacy-preserving deployment, and human-in-the-loop feedback. The study suggests that a hybrid approach (Isolation Forest for coarse screening, followed by LSTM-AE for fine-grained confirmation) can reduce false positives by roughly 25% at comparable recall, making unsupervised anomaly detection practical for resource-constrained smart homes.
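As an illustration of the Median Absolute Deviation thresholding mentioned in step (iv), the following is a minimal sketch and not the implementation used in the study. The function names `mad_threshold` and `flag_anomalies`, the multiplier `k = 3.0`, and the example scores are assumptions introduced here; the anomaly scores could come from any of the detectors named above.

```python
import statistics

# Scale factor that makes the MAD comparable to a standard deviation
# when the baseline scores are roughly normally distributed.
MAD_SCALE = 1.4826

def mad_threshold(baseline_scores, k=3.0):
    """Return median + k * scaled MAD of the baseline anomaly scores.

    `k` is a hypothetical sensitivity setting, not a value from the paper.
    """
    scores = list(baseline_scores)
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    return med + k * MAD_SCALE * mad

def flag_anomalies(baseline_scores, new_scores, k=3.0):
    """Return the indices of new scores that exceed the MAD-based threshold."""
    threshold = mad_threshold(baseline_scores, k)
    return [i for i, s in enumerate(new_scores) if s > threshold]

if __name__ == "__main__":
    # Made-up scores for illustration only (higher means more anomalous).
    baseline = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.11]
    incoming = [0.11, 0.50, 0.12, 0.16]
    print(flag_anomalies(baseline, incoming))
```

Because the median and MAD are robust to a few extreme values, a threshold of this kind is less distorted by contaminated baseline windows than a mean-and-standard-deviation rule, which is one plausible reason to prefer it for per-household calibration.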
License
Copyright (c) 2025 The journal retains copyright of all published articles, ensuring that authors have control over their work while allowing wide dissemination.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Articles are published under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which allows others to distribute, remix, adapt, and build upon the work for non-commercial purposes, provided they credit the original author.
