This project is a Python-based voice and text AI assistant.
It can recognize speech, take manual input, detect your mood, and generate answers with the Google Gemini API.
The assistant also keeps a conversation log and tries to adapt its tone based on how you feel.
It can also pause or resume media playback, adjust system volume by percentage,
and perform Google searches directly when you ask.
- Voice input with hotkey (F7)
- Manual input with hotkey (F8)
- Mood detection (happy, sad, angry, calm, surprised, flirty, gamer, neutral)
- Google search integration
- Saves all conversations into conversation_log.txt
- Clone the repository:
    git clone https://github.com/HamzaDonmez/voice-ai-assistant.git
    cd voice-ai-assistant
- Install requirements:
    pip install -r requirements.txt
- Open the file FinalVer.py and add your Google API Key:
    client = genai.Client(api_key="Your_Api_Key")
- Run the program:
    python FinalVer.py
- Voice Input
  Hotkey: F7 → Activates microphone listening (listen_and_recognize() in code).
- Manual Input
  Hotkey: F8 → Lets you type directly (manual_input() in code).
- In both modes, the assistant answers back and logs everything.
- Web Search
  If the user says "Google", "search", or "on internet", the assistant performs a Google search
  (handled by parse_intent() and webbrowser.open()).
- Mood Detection
  Detects emotions from text: angry, sad, happy, calm, surprised, flirty, gamer, neutral.
  The detect_mood() function decides based on keywords.
  Example: Saying "I’m sad" sets mood = "sad".
- Conversation Logging
  All dialogues are saved in conversation_log.txt
  (the append_log() and ensure_log_file() functions handle this).
- Media Control (via keyboard events)
  The assistant can pause or play media with system hotkeys.
  For example, intent handling can call keyboard.send("play/pause media").
  You can expand parse_intent() to map phrases like "pause music" or "resume music".
- Volume Control
  The assistant can adjust volume by percentage when instructed.
  Example: "Set volume to 30" → triggers the system volume command (implemented in your OS section).
  In code, you’d extend parse_intent() to detect "volume" plus a percentage and call the system command.
- AI-Powered Responses
  Uses the Google Gemini API (get_ai_response()) to generate contextual replies.
  Takes the mood and the last 15 conversation lines into account.
- Interrupt & Exit
  Stop the program anytime with Ctrl+C (KeyboardInterrupt handling inside main() and processing_loop()).
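The web search, media, and volume rules described above could be combined in a single intent parser. The following is a minimal sketch, not the code in FinalVer.py: the (action, payload) return shape, the build_search_url() helper, and the exact trigger phrases for media control are assumptions made for illustration.

```python
import re
from urllib.parse import quote_plus

# Trigger phrases taken from the description above; the real parse_intent()
# in FinalVer.py may use different rules.
SEARCH_TRIGGERS = ("google", "search", "on internet")
# Hypothetical phrases for media control.
MEDIA_PHRASES = ("pause music", "resume music", "pause media", "resume media")


def build_search_url(query: str) -> str:
    """Return a Google search URL for the given query text (hypothetical helper)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def parse_intent(text: str):
    """Map a user utterance to an assumed (action, payload) pair."""
    lowered = text.lower().strip()

    # "Set volume to 30" -> ("volume", 30), clamped to the 0-100 range.
    if "volume" in lowered:
        match = re.search(r"(\d{1,3})\s*%?", lowered)
        if match:
            return ("volume", max(0, min(100, int(match.group(1)))))

    # "pause music" / "resume music" -> the key name passed to keyboard.send().
    if any(phrase in lowered for phrase in MEDIA_PHRASES):
        return ("media", "play/pause media")

    # Search triggers: remove the trigger words and search for what is left.
    if any(trigger in lowered for trigger in SEARCH_TRIGGERS):
        query = lowered
        for trigger in sorted(SEARCH_TRIGGERS, key=len, reverse=True):
            query = query.replace(trigger, " ")
        query = " ".join(query.split())
        return ("search", build_search_url(query or lowered))

    # Anything else goes to the AI model as normal conversation.
    return ("chat", text)
```

A caller would then dispatch on the action: pass the URL to webbrowser.open() for "search", pass the key name to keyboard.send() for "media", and hand the percentage to the OS-specific volume command for "volume". The plain substring matching is deliberately simple and will also match trigger words inside longer words.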
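Keyword-based mood detection, as described for detect_mood() above, can be sketched in a few lines. The keyword lists below are hypothetical placeholders, and the matching order (first mood with a hit wins) is an assumption; the actual function may weigh keywords differently.

```python
import re

# Hypothetical keyword lists; the real detect_mood() may use other words.
MOOD_KEYWORDS = {
    "angry": ("angry", "furious", "annoyed"),
    "sad": ("sad", "unhappy", "depressed"),
    "happy": ("happy", "glad", "great"),
    "calm": ("calm", "relaxed", "peaceful"),
    "surprised": ("surprised", "wow", "shocked"),
    "flirty": ("flirty", "cutie", "darling"),
    "gamer": ("gamer", "gg", "respawn"),
}


def detect_mood(text: str) -> str:
    """Return the first mood whose keyword appears as a whole word, else 'neutral'."""
    # Whole-word matching keeps "unhappy" from being counted as "happy".
    words = set(re.findall(r"[a-z']+", text.lower()))
    for mood, keywords in MOOD_KEYWORDS.items():
        if words.intersection(keywords):
            return mood
    return "neutral"
```

With these lists, detect_mood("I’m sad") returns "sad", and text without any keyword falls back to "neutral". Negation ("not happy") is not handled in this sketch.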
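The logging and context steps above fit together roughly as follows. This sketch reuses the names ensure_log_file() and append_log() from the description, but their signatures, the "speaker: message" line format, the build_prompt() helper, and the prompt wording are all assumptions.

```python
from pathlib import Path

LOG_FILE = Path("conversation_log.txt")
CONTEXT_LINES = 15  # number of recent log lines passed to the model


def ensure_log_file(path: Path = LOG_FILE) -> None:
    """Create an empty log file if it does not exist yet."""
    path.touch(exist_ok=True)


def append_log(speaker: str, message: str, path: Path = LOG_FILE) -> None:
    """Append one 'speaker: message' line to the log (assumed format)."""
    ensure_log_file(path)
    with path.open("a", encoding="utf-8") as log:
        log.write(f"{speaker}: {message}\n")


def build_prompt(user_text: str, mood: str, path: Path = LOG_FILE) -> str:
    """Combine the mood, the last 15 log lines, and the new message (hypothetical helper)."""
    ensure_log_file(path)
    history = path.read_text(encoding="utf-8").splitlines()[-CONTEXT_LINES:]
    return (
        f"The user currently seems {mood}. Adapt your tone accordingly.\n"
        "Recent conversation:\n"
        + "\n".join(history)
        + f"\nUser: {user_text}\nAssistant:"
    )
```

A function like get_ai_response() would send the resulting string to Gemini and then call append_log() for both the user message and the reply, so the next prompt sees them as context.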
The project uses:
- speechrecognition
- keyboard
- google-generativeai
- pyaudio
- Feel free to fork, open issues, or improve the code. Any suggestions are welcome!
📌 Author