
Interactive Wand Gesture Tracking Using OpenCV and a Raspberry Pi 5

Spring 2025

Highlight Reel

Project Summary

This interactive wand gesture recognition system is inspired by the Harry Potter wands found at Universal Studios theme parks around the world.

It detects spellcasting gestures in real time using the same interactive wand Universal Studios uses in its parks. It recognizes and responds to two specific spells:
 

  • "Alohamora" - Unlocking spell that opens a magic box

  • "Colloportus" - Locking charm that closes a magic box


A Raspberry Pi 5 runs a custom-trained SVM classifier to detect which of these spells the wizard has drawn. The system sees the drawn path by flooding the scene with infrared light and using a camera (without an infrared filter) to detect the reflective tip of the wand, which appears to the camera as a bright white blob. After some filtering, OpenCV blob detection traces the wand's path and saves it as an image for the classifier to analyze.

 

Once classified, the Raspberry Pi triggers:


  • A servo that opens or closes the box

  • An LED strip that illuminates with fiery colors tied to each spell type

  • Themed sound effects for each spell that duck the volume of the looping background music

 

All of this runs on-device using multi-threaded Python, allowing a drawn spell path to be classified while the next one is being drawn.
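As a rough sketch of that concurrency pattern (the names trace_queue, trigger_effects, and classification_worker are illustrative, not the project's actual identifiers), the main tracking loop can hand finished traces to a background worker over a queue:

```python
import queue
import threading

trace_queue = queue.Queue()          # finished 28x28 traces from the tracking loop

def trigger_effects(spell_label):
    """Placeholder for the servo / LED / audio reactions described above."""
    print("Alohamora!" if spell_label == 0 else "Colloportus!")

def classification_worker(model):
    """Classify traces in the background so wand tracking never blocks."""
    while True:
        trace = trace_queue.get()                       # blocks until a trace arrives
        spell = model.predict(trace.reshape(1, -1))[0]  # 784-element feature vector
        trigger_effects(spell)
        trace_queue.task_done()

# Usage, once the trained SVM has been loaded as `model`:
# threading.Thread(target=classification_worker, args=(model,), daemon=True).start()
# trace_queue.put(finished_trace)    # the main loop immediately resumes tracking
```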
 

Flow Chart of Overall Process


Creating the Dataset by Hand

In order to train a model to recognize the wand patterns, I drew over 200 example paths for each spell using a Python script and saved each trace as a 28×28 grayscale PNG. I deliberately kept the drawings imperfect so the model would be less prone to overfitting during training.
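A minimal sketch of what such a capture script can look like, assuming OpenCV mouse drawing on a black canvas (the window name, stroke width, keybindings, and filenames are my assumptions, not the originals):

```python
import cv2
import numpy as np

CANVAS = 280                          # draw large, then downscale to 28x28
canvas = np.zeros((CANVAS, CANVAS), dtype=np.uint8)
drawing = False

def draw(event, x, y, flags, param):
    global drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
    elif event == cv2.EVENT_MOUSEMOVE and drawing:
        cv2.circle(canvas, (x, y), 6, 255, -1)   # white stroke on black
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False

cv2.namedWindow("draw")
cv2.setMouseCallback("draw", draw)

count = 0
while True:
    cv2.imshow("draw", canvas)
    key = cv2.waitKey(10) & 0xFF
    if key == ord("s"):                          # save the current trace
        small = cv2.resize(canvas, (28, 28), interpolation=cv2.INTER_AREA)
        cv2.imwrite(f"alohamora_{count:03d}.png", small)
        count += 1
        canvas[:] = 0                            # clear for the next example
    elif key == ord("q"):
        break

cv2.destroyAllWindows()
```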


Training the Data

Once I had built up a dataset of over 400 hand-drawn wand traces, I converted them into numerical arrays and organized them into input features (X) and corresponding labels (y) — with 0 representing the “Alohamora” spell and 1 representing “Colloportus.” Each image was flattened into a 784-element vector to serve as the input for classification.
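A sketch of that loading step, assuming the traces sit in two folders named after the spells (the folder layout is my assumption):

```python
import glob

import cv2
import numpy as np

def load_traces(pattern, label):
    """Load 28x28 grayscale PNGs and flatten each into a 784-element vector."""
    features, labels = [], []
    for path in sorted(glob.glob(pattern)):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        features.append(img.flatten())
        labels.append(label)
    return features, labels

X_aloha, y_aloha = load_traces("alohamora/*.png", 0)
X_collo, y_collo = load_traces("colloportus/*.png", 1)

X = np.array(X_aloha + X_collo, dtype=np.float32)   # shape: (n_samples, 784)
y = np.array(y_aloha + y_collo)                     # 0 = Alohamora, 1 = Colloportus
```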

To train the model, I used scikit-learn’s train_test_split to divide the dataset into training and testing sets, with 75% of the data used for training and the remaining 25% reserved for evaluation. I then built a pipeline that first normalized the feature data using StandardScaler, followed by a Support Vector Classifier (SVC). This preprocessing ensured that all input dimensions were on a consistent scale, which is particularly important for distance-based classifiers like SVMs.
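Continuing from the X and y arrays above, the split and pipeline look roughly like this (random_state and stratify are my additions for reproducibility and class balance):

```python
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

pipeline = Pipeline([
    ("scaler", StandardScaler()),   # put all 784 pixel dimensions on a common scale
    ("svc", SVC()),
])
```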

For optimal performance, I used GridSearchCV to perform an exhaustive search over a range of hyperparameters including kernel type, regularization strength (C), gamma, and polynomial degree. The grid search evaluated 180 different parameter combinations using 5-fold cross-validation, totaling 900 fits. The best-performing configuration used a polynomial kernel with C=0.1, gamma=0.01, and degree=2.
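The exact grid isn't listed, but a grid of this shape over the parameters named above yields 3 × 4 × 5 × 3 = 180 combinations, i.e. 900 fits at 5-fold cross-validation:

```python
from sklearn.model_selection import GridSearchCV

param_grid = {
    "svc__kernel": ["linear", "rbf", "poly"],
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1, "scale"],
    "svc__degree": [2, 3, 4],          # only used by the polynomial kernel
}

grid = GridSearchCV(pipeline, param_grid, cv=5, n_jobs=-1, verbose=1)
grid.fit(X_train, y_train)

print(grid.best_params_)               # best found: poly kernel, C=0.1, gamma=0.01, degree=2
print(grid.score(X_test, y_test))      # accuracy on the held-out 25%
```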

After training, the model achieved an accuracy of over 99% on the held-out test data. I then saved the final model to a .pkl file using joblib, allowing it to be loaded and used in real time by the wand recognition system running on the Raspberry Pi.
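Saving and reloading is a couple of lines with joblib (the .pkl filename here is illustrative):

```python
import joblib

# On the training machine:
joblib.dump(grid.best_estimator_, "wand_svm.pkl")

# On the Raspberry Pi, at startup:
model = joblib.load("wand_svm.pkl")
```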


Implementing Real-Time Wand Tracking

With the model trained and saved, the next step was bringing the wand system to life in real time. I used a Raspberry Pi 5 to run the entire pipeline, beginning with blob detection through the PiCamera. Once a bright, circular blob (representing the wand tip) is detected and remains active for a set duration, the system starts tracing the wand’s motion using OpenCV. As the trace is drawn, the system monitors movement and stillness to determine when the gesture is complete.
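A hedged sketch of the core of that tracking step: the exact detector settings aren't shown, but OpenCV's SimpleBlobDetector tuned for bright, roughly circular blobs is one reasonable reading (camera frame capture is omitted, and the thresholds are my assumptions):

```python
import cv2

params = cv2.SimpleBlobDetector_Params()
params.filterByColor = True
params.blobColor = 255               # bright IR reflections only
params.filterByCircularity = True
params.minCircularity = 0.5
params.filterByArea = True
params.minArea = 10
detector = cv2.SimpleBlobDetector_create(params)

def track_tip(gray_frame, path, trace_canvas):
    """Append the detected wand tip to the path and extend the drawn trace."""
    keypoints = detector.detect(gray_frame)
    if keypoints:
        x, y = map(int, keypoints[0].pt)
        if path:                                           # connect to the last point
            cv2.line(trace_canvas, path[-1], (x, y), 255, thickness=8)
        path.append((x, y))
    return path, trace_canvas
```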

Once a valid gesture is captured, the trace is converted into a 28×28 binary image and passed into the trained SVM classifier for prediction. Based on whether the result is "Alohamora" or "Colloportus," the system then animates a servo motor, lights up a high-density LED strip with spell-specific fire effects, and plays custom sound effects — all controlled in sync using Python threads to ensure smooth, responsive feedback. Additional logic was added to filter out false triggers like reflections or short movements, allowing for consistent and immersive spell recognition.
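Roughly, the hand-off from a finished trace to the effects looks like this; the effect functions below are empty placeholders standing in for the servo, LED, and audio code, not the project's real names:

```python
import threading

import cv2
import numpy as np

def open_or_close_box(spell): ...     # servo: open for Alohamora, close for Colloportus
def run_led_effect(spell): ...        # fiery LED animation per spell
def play_spell_audio(spell): ...      # spell sound effect + background music ducking

def classify_trace(model, trace_canvas):
    """Shrink the drawn trace to 28x28, binarize it, and predict the spell."""
    small = cv2.resize(trace_canvas, (28, 28), interpolation=cv2.INTER_AREA)
    _, binary = cv2.threshold(small, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    return model.predict(binary.reshape(1, -1).astype(np.float32))[0]

def cast_spell(model, trace_canvas):
    spell = classify_trace(model, trace_canvas)           # 0 = Alohamora, 1 = Colloportus
    for effect in (open_or_close_box, run_led_effect, play_spell_audio):
        threading.Thread(target=effect, args=(spell,), daemon=True).start()
```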


Spell Sound Effects
(Click to Play)

Alohamora
Colloportus

Background Loop Snippet
(Includes Environmental Wind Noise)

Loop Snippet

Putting It All Together

To complete the illusion of magic, I housed all of the electronics in a repurposed shoebox that I painted to resemble an old wizard chest. I installed an MG996R servo behind the box and used thin music wire to connect it to the lid, pulling it open or pushing it closed based on the recognized spell. The motion was subtle but effective, especially when paired with the synchronized lights and sounds.

To enhance the theatricality and hide the inner workings, I surrounded the box with a collection of Harry Potter books and themed blankets, which helped conceal the wiring and servo linkage from plain view. I also placed the RGB LED strip just inside the inner lip of the box, casting colorful, fire-like glow patterns that lit up the interior during each spell effect. These details helped the final experience feel more like a real bit of magic than just a tech demo, blending handmade props with interactive software and hardware.
