Gesture Controlled Temple Run using OpenCV & MediaPipe — No Keyboard, Just Your Hands!

Published: 6 May 2025
on the channel: codewithashutosh

Ever imagined playing Temple Run with just your hand gestures? No keyboard. No joystick. Just your camera and a bit of Python magic. Well, I did exactly that — and the result is a hands-free gaming experience powered by AI.
Welcome to the future of intuitive gameplay.

🎯 Project Overview
This project is a Gesture-Controlled Temple Run system built using:

🧠 OpenCV: For video frame processing
✋ MediaPipe: For real-time hand landmark detection
💻 Python: The glue that binds it all
🎮 Temple Run (or any similar endless runner game): As the playground
What started as a simple idea became an exciting experiment in computer vision, delivering immersive control without ever needing to touch the keyboard.

🔍 Why I Built This
As someone deeply invested in AI, computer vision, and the future of interaction, I’ve always been intrigued by human-computer interfaces beyond the mouse and keyboard.

Temple Run is nostalgic. But adding hand gestures to it made it futuristic. I wanted to create something that's not just fun but also a learning playground for AI-based gesture recognition.

And honestly? It’s so much fun to control characters just by swiping your hand.

🧰 Tech Stack
Here’s a breakdown of the tools and libraries used:

Python 3.9
OpenCV: For accessing camera feed, drawing, and visualization
MediaPipe Hands: To detect hand landmarks in real-time
PyAutoGUI: To simulate keyboard key presses like left, right, space, etc.
Subway Surfers / Temple Run: As the game being controlled
🎥 Yes, the code works with both Temple Run and Subway Surfers on web or desktop.

🎬 How It Works
The idea is simple — use a webcam to track your hand, interpret your gestures, and simulate key presses accordingly.

1. Capture Frame from Webcam
camera_video = cv2.VideoCapture(0)
OpenCV reads frames continuously from the webcam.
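For context, a minimal capture loop (assuming the standard OpenCV API; the window name and exit key here are illustrative, not from the original script) might look like this:

import cv2

camera_video = cv2.VideoCapture(0)  # open the default webcam
while camera_video.isOpened():
    ok, frame = camera_video.read()  # grab the next frame
    if not ok:
        break  # stream ended or camera disconnected
    cv2.imshow("Gesture Control", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # press Esc to quit
        break
camera_video.release()
cv2.destroyAllWindows()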

2. Detect Hand using MediaPipe
mpHands = mp.solutions.hands
hands = mpHands.Hands()
MediaPipe provides fast, efficient landmark detection — it identifies 21 key points on your hand (fingers, joints, wrist, etc.) in real-time.
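Here is a sketch of that detection step, assuming the frame comes from the capture loop above (the confidence and hand-count values are illustrative assumptions):

import cv2
import mediapipe as mp

mpHands = mp.solutions.hands
hands = mpHands.Hands(max_num_hands=1, min_detection_confidence=0.7)
mpDraw = mp.solutions.drawing_utils

# MediaPipe expects RGB input, while OpenCV delivers BGR frames
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
results = hands.process(rgb)

if results.multi_hand_landmarks:
    for handLms in results.multi_hand_landmarks:
        # each of the 21 landmarks carries normalized x, y, z coordinates
        wrist = handLms.landmark[mpHands.HandLandmark.WRIST]
        mpDraw.draw_landmarks(frame, handLms, mpHands.HAND_CONNECTIONS)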

3. Interpret Gestures
Based on movement and finger positions, we define gestures:

Swipe Left ⬅️ → Move left
Swipe Right ➡️ → Move right
Show 5 fingers 🖐️ → Jump
Show 1 finger ☝️ → Slide (crouch)
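The post doesn't show the exact landmark logic, so here is one common way to count extended fingers, reusing the handLms object from the previous step (the count_fingers helper is hypothetical):

# Fingertip landmark indices in MediaPipe: index=8, middle=12,
# ring=16, pinky=20 (thumb=4 is handled separately below)
FINGER_TIPS = [8, 12, 16, 20]

def count_fingers(handLms):
    lm = handLms.landmark
    count = 0
    for tip in FINGER_TIPS:
        # a finger counts as extended if its tip sits above the joint
        # two indices below it (y grows downward in image coordinates)
        if lm[tip].y < lm[tip - 2].y:
            count += 1
    # thumb: compare x instead of y (assumes one fixed hand orientation)
    if lm[4].x < lm[3].x:
        count += 1
    return count

fingers = count_fingers(handLms)
if fingers == 5:
    gesture = "jump"
elif fingers == 1:
    gesture = "slide"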
4. Trigger Game Actions
Using pyautogui.press("left") or "right", the code simulates keyboard input based on detected gestures.

So you never touch the keyboard — your hand becomes the controller.
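Putting it together, a minimal dispatch might look like this (the post only confirms "left" and "right"; the jump and slide keys in KEY_MAP are assumptions):

import pyautogui

# Hypothetical mapping from detected gesture labels to simulated keys
KEY_MAP = {
    "left": "left",
    "right": "right",
    "jump": "up",     # could also be "space", depending on the game
    "slide": "down",
}

if gesture in KEY_MAP:
    pyautogui.press(KEY_MAP[gesture])  # taps the key once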

📦 Project Source Code
You can access the full, clean source code here:

📄 Source Code (Google Doc):
👉 Click here to view the code

📱 Customizing the Display
To mimic a mobile display, we resized the webcam output to a vertical, phone-like aspect ratio:

resized_frame = cv2.resize(frame, (360, 640))
This enhances the immersive feel — like you're watching and interacting through a mobile screen.

🧪 Challenges Faced
1. Gesture Conflicts
Some gestures looked too similar (e.g., 1 finger vs. 2 fingers). I solved this by refining the landmark logic and adding cooldown timers.

2. Swipe Detection
Accurately detecting a horizontal hand swipe across frames was tricky. I improved it by comparing the hand's position in the previous frame with its position in the current frame.
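Here is a sketch of that fix, combined with the cooldown idea from the previous challenge (the threshold and cooldown values are illustrative; MediaPipe normalizes the wrist x-coordinate to 0–1):

import time

SWIPE_THRESHOLD = 0.08  # normalized x-distance that counts as a swipe
COOLDOWN = 0.5          # seconds to ignore new gestures after firing one

prev_x = None
last_action = 0.0

def detect_swipe(wrist_x):
    global prev_x, last_action
    gesture = None
    now = time.time()
    if prev_x is not None and now - last_action > COOLDOWN:
        dx = wrist_x - prev_x
        if dx > SWIPE_THRESHOLD:
            gesture = "right"
            last_action = now
        elif dx < -SWIPE_THRESHOLD:
            gesture = "left"
            last_action = now
    prev_x = wrist_x
    return gesture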

3. Lighting Sensitivity
MediaPipe performs best in good lighting. In dim conditions, landmark detection was less reliable; I fixed this by adding an external light.

💡 Learning Outcomes
This project taught me:

Real-time hand tracking using MediaPipe
Interfacing hardware inputs (camera) with software outputs (keyboard presses)
Designing smooth, intuitive gesture recognition systems
Debugging live vision-based AI models in constrained environments
🔥 Showcase & Demo
🎥 I recorded the gameplay as a demo video and posted it on LinkedIn. The video showcases Temple Run being played with just my hand gestures — no contact, no controller.

The response? Incredible.
From AI enthusiasts to recruiters and developers, everyone was intrigued by the practicality of the system.

🗣️ How You Can Build It
Want to build it yourself?

Just follow these steps:

1. Install the dependencies:
pip install opencv-python mediapipe pyautogui
2. Clone or copy the source code.
3. Run the script:
python app.py
4. Open your browser and launch Temple Run or Subway Surfers.
5. Play using just your hands!
🚀 Future Improvements
Voice + Gesture Combo: Imagine saying "Jump" while raising your hand.
Cross-platform integration: To work with mobile games too.
AI-powered gesture customization: So users can define their own gestures.

