Gesture Controlled Temple Run using OpenCV & MediaPipe — No Keyboard, Just Your Hands!

Published: 06 May 2025
on channel: codewithashutosh

Ever imagined playing Temple Run with just your hand gestures? No keyboard. No joystick. Just your camera and a bit of Python magic. Well, I did exactly that — and the result is a hands-free gaming experience powered by AI.
Welcome to the future of intuitive gameplay.

🎯 Project Overview
This project is a Gesture-Controlled Temple Run system built using:

🧠 OpenCV: For video frame processing
✋ MediaPipe: For real-time hand landmark detection
💻 Python: The glue that binds it all
🎮 Temple Run (or any similar endless runner game): As the playground
What started as a simple idea became an exciting experiment in computer vision, delivering immersive control without ever needing to touch the keyboard.

🔍 Why I Built This
As someone deeply invested in AI, computer vision, and the future of interaction, I’ve always been intrigued by human-computer interfaces beyond the mouse and keyboard.

Temple Run is nostalgic. But adding hand gestures to it made it futuristic. I wanted to create something that's not just fun but also a learning playground for AI-based gesture recognition.

And honestly? It’s so much fun to control characters just by swiping your hand.

🧰 Tech Stack
Here’s a breakdown of the tools and libraries used:

Python 3.9
OpenCV: For accessing camera feed, drawing, and visualization
MediaPipe Hands: To detect hand landmarks in real-time
PyAutoGUI: To simulate keyboard key presses like left, right, space, etc.
Subway Surfers / Temple Run: As the game being controlled
🎥 Yes, the code works with both Temple Run and Subway Surfers on web or desktop.

🎬 How It Works
The idea is simple — use a webcam to track your hand, interpret your gestures, and simulate key presses accordingly.

1. Capture Frame from Webcam
camera_video = cv2.VideoCapture(0)
OpenCV reads frames continuously from the webcam.
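For context, a minimal capture loop might look like the sketch below; the mirrored flip, window name, and Esc-to-quit key are illustrative choices, not necessarily what the original script does.

import cv2

camera_video = cv2.VideoCapture(0)  # open the default webcam

while camera_video.isOpened():
    ok, frame = camera_video.read()
    if not ok:
        continue
    frame = cv2.flip(frame, 1)  # mirror so left/right gestures feel natural
    # ... hand detection and gesture logic would run here ...
    cv2.imshow("Gesture Controller", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # press Esc to quit
        break

camera_video.release()
cv2.destroyAllWindows()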

2. Detect Hand using MediaPipe
mpHands = mp.solutions.hands
hands = mpHands.Hands()
MediaPipe provides fast, efficient landmark detection — it identifies 21 key points on your hand (fingers, joints, wrist, etc.) in real-time.
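A minimal detection step, assuming the frame comes from the capture loop above (the max_num_hands and confidence values are illustrative assumptions):

import cv2
import mediapipe as mp

mpHands = mp.solutions.hands
hands = mpHands.Hands(max_num_hands=1, min_detection_confidence=0.7)
mpDraw = mp.solutions.drawing_utils

def detect_hand(frame):
    # MediaPipe expects RGB input, while OpenCV frames are BGR
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = hands.process(rgb)
    if results.multi_hand_landmarks:
        hand_landmarks = results.multi_hand_landmarks[0]
        mpDraw.draw_landmarks(frame, hand_landmarks, mpHands.HAND_CONNECTIONS)
        return hand_landmarks  # 21 landmarks: wrist, joints, finger tips
    return None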

3. Interpret Gestures
Based on movement and finger positions, we define gestures:

Swipe Left ⬅️ → Move left
Swipe Right ➡️ → Move right
Show 5 fingers 🖐️ → Jump
Show 1 finger ☝️ → Slide (crouch)
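As a rough sketch of the finger-count part (the landmark indices are MediaPipe's standard tip/joint indices; ignoring the thumb is a simplification I'm assuming here, not necessarily the original logic):

FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky tips
FINGER_PIPS = [6, 10, 14, 18]   # the joints just below those tips

def count_fingers(hand_landmarks):
    # A finger counts as "up" when its tip sits above the joint below it
    # (image y grows downward, so "above" means a smaller y value)
    lm = hand_landmarks.landmark
    return sum(1 for tip, pip in zip(FINGER_TIPS, FINGER_PIPS) if lm[tip].y < lm[pip].y)

An open palm then reads as all non-thumb fingers raised and maps to jump, a single raised index finger maps to slide, and the swipe gestures come from tracking the hand's horizontal movement (a sketch appears under Challenges below).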
4. Trigger Game Actions
Using pyautogui.press("left") or "right", the code simulates keyboard input based on detected gestures.

So you never touch the keyboard — your hand becomes the controller.
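A hedged sketch of that mapping (the gesture names and the down-arrow key for slide are my assumptions; the post itself mentions left, right, and space):

import pyautogui

# Map each recognized gesture to the key the game listens for
KEY_MAP = {
    "swipe_left": "left",
    "swipe_right": "right",
    "open_palm": "space",   # jump
    "one_finger": "down",   # slide / crouch
}

def trigger_action(gesture):
    key = KEY_MAP.get(gesture)
    if key:
        pyautogui.press(key)  # simulate a single key press for the game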

📦 Project Source Code
You can access the full, clean source code here:

📄 Source Code (Google Doc):
👉 Click here to view the code

📱 Customizing the Display
To give the webcam feed a mobile-phone feel, we resized the output to a vertical, portrait-style frame:

resized_frame = cv2.resize(frame, (360, 640))
This enhances the immersive feel — like you're watching and interacting through a mobile screen.
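In the display loop, the resized frame is simply what gets shown (the window name here is illustrative):

# show the portrait-shaped frame instead of the raw camera frame
cv2.imshow("Gesture Controller", resized_frame)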

🧪 Challenges Faced
1. Gesture Conflicts
Some gestures looked too similar (e.g., 1 finger vs 2 fingers). Solved by refining landmark logic and adding cooldown timers.
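A simple time-based cooldown looks like this (the 0.8-second window is an assumed value, not necessarily the one used in the project):

import time

COOLDOWN_SECONDS = 0.8   # ignore new gestures briefly after each action
last_action_time = 0.0

def can_trigger():
    global last_action_time
    now = time.time()
    if now - last_action_time >= COOLDOWN_SECONDS:
        last_action_time = now
        return True
    return False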

2. Swipe Detection
Accurately detecting a horizontal hand swipe across frames was tricky. I improved it by comparing the hand's position in the previous frame with its position in the current frame.
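Sketched out, that comparison can be as simple as tracking the wrist's normalized x-coordinate (landmark 0) between frames; the 0.15 threshold is an assumption:

SWIPE_THRESHOLD = 0.15   # fraction of the frame width the wrist must move between frames
prev_wrist_x = None

def detect_swipe(hand_landmarks):
    global prev_wrist_x
    wrist_x = hand_landmarks.landmark[0].x   # normalized 0..1 across the frame
    swipe = None
    if prev_wrist_x is not None:
        dx = wrist_x - prev_wrist_x
        if dx > SWIPE_THRESHOLD:
            swipe = "swipe_right"
        elif dx < -SWIPE_THRESHOLD:
            swipe = "swipe_left"
    prev_wrist_x = wrist_x
    return swipe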

3. Lighting Sensitivity
MediaPipe performs best in good lighting. In dim conditions, landmark detection was less reliable — fixed using external light.

💡 Learning Outcomes
This project taught me:

Real-time hand tracking using MediaPipe
Interfacing hardware inputs (camera) with software outputs (keyboard presses)
Designing smooth, intuitive gesture recognition systems
Debugging live vision-based AI models in constrained environments
🔥 Showcase & Demo
🎥 I recorded the gameplay as a demo video and posted it on LinkedIn. The video showcases Temple Run being played with just my hand gestures — no contact, no controller.

The response? Incredible.
From AI enthusiasts to recruiters and developers, everyone was intrigued by the practicality of the system.

🗣️ How You Can Build It
Want to build it yourself?

Just follow these steps:

1. Install the dependencies:
pip install opencv-python mediapipe pyautogui
2. Clone or copy the source code.
3. Run the script:
python app.py
4. Open your browser and visit Temple Run or Subway Surfers (web or desktop versions both work).
5. Play using just your hands!
🚀 Future Improvements
Voice + Gesture Combo: imagine saying "Jump" while raising your hand.
Cross-platform integration: To work with mobile games too.
AI-powered gesture customization: So users can define their own gestures.

