International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 13 Issue: 01 | Jan 2026
p-ISSN: 2395-0072
www.irjet.net
Voice-Controlled Smart Home Automation: Integrating Large Language Models with IoT Devices for Enhanced Usability
Bikash Katuwal¹, Abhishek Ghimire¹
¹Saint Cloud State University, Saint Cloud, Minnesota, USA
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - This paper presents a novel approach to smart home automation by integrating Large Language Models (LLMs) with Internet of Things (IoT) devices to create a responsive, context-aware voice control system for home appliances. The developed system uses an ESP32-S3 microcontroller as the central processing unit, coupled with a MEMS microphone for audio capture, Deepgram's Speech-to-Text API for voice recognition, Groq LLM for natural language understanding and response generation, and Deepgram's Aura for Text-to-Speech conversion. A 4-way relay system enables control of multiple household appliances. The system demonstrates significant improvements over traditional voice-controlled home automation systems, with an average response time of 1.2 seconds from voice command to relay activation and a command recognition accuracy above 96% in normal ambient noise conditions. Real-time conversation capabilities enable more intuitive human-computer interaction, and the system can maintain context across multiple exchanges. This research contributes to the growing field of ambient intelligence by showing how cutting-edge language models can be effectively deployed on resource-constrained IoT devices to enhance home automation systems with natural language processing capabilities.
1. INTRODUCTION
The integration of smart technology into residential environments has evolved rapidly over the past decade, moving from simple remote-controlled devices to sophisticated interconnected systems that anticipate and respond to user needs. Voice-controlled home automation is one of the most intuitive interfaces for human-computer interaction: it eliminates the need for physical controls or mobile applications and enables hands-free operation for users of all abilities.
The Internet of Things (IoT) has proven useful across a wide range of domains. For example, it plays an important role in detecting and reporting tree poaching in real time by using low-power sensors and reliable communication technologies. Its impact goes beyond environmental protection; IoT has also transformed fields like agriculture, healthcare, and smart homes. By enabling devices to communicate and share data without human involvement, IoT effectively bridges the gap between the physical and digital worlds [1].
Early voice assistants like Amazon's Alexa, Google Assistant, and Apple's Siri introduced more conversational interactions but remained limited by their closed ecosystems, reliance on cloud processing, and inability to perform complex reasoning tasks [2].
The evolution of voice-controlled home automation systems has progressed through several distinct phases, each marked by advances in both hardware capabilities and software sophistication. Recent advances in Large Language Models (LLMs) present an opportunity to fundamentally transform how humans interact with their home environments [3][20]. These models demonstrate unprecedented abilities in natural language understanding, contextual awareness, and human-like response generation. However, deploying such sophisticated models in resource-constrained IoT environments presents significant technical challenges related to processing power, memory limitations, and real-time response requirements [4].
This research is motivated by the gap between the potential of LLMs to revolutionize human-machine interaction and the practical constraints of implementing these capabilities in affordable, accessible smart home systems. By developing a system that bridges this gap, we aim to contribute to the advancement of ambient intelligence technologies that integrate seamlessly into everyday living environments while providing intuitive, responsive control through natural language.
© 2026, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 174
Manohar and Sivaprakasam (2019) demonstrated early implementations of voice control in smart homes using Arduino microcontrollers and Bluetooth connectivity, achieving basic command recognition but lacking conversational abilities [5]. Jasim (2022) advanced this approach by incorporating cloud-based natural language processing, improving recognition accuracy but introducing latency issues and privacy concerns due to constant cloud connectivity requirements [6].
In parallel, research into edge computing solutions for voice processing has shown promise. Lyashenko (2021) implemented compressed neural network models on ESP32 devices for keyword spotting and basic command recognition, achieving response times under 500 ms but with limited vocabulary and no contextual awareness [7].
The application of LLMs to IoT environments remains an emerging field. Chen et al. (2024) demonstrated the feasibility of deploying quantized language models on