Smoothly animating virtual characters is a challenging task; that is why motion capture systems exist, which translate real-world movement into virtual environments.
One of the many tracking systems suitable for the job is an optical one, which uses a set of trackers placed on the body and followed by several cameras.
OptiTrack
Tracking
Predicting
Python;
OpenCV;
Unreal Engine 5;
Learn to work with optical tracking data.
Also find a way to make the tracking more robust to information loss: a sensor's data is lost when not enough cameras can capture its position.
Move to a virtual environment such as Unreal Engine 5, learning the tool and modelling a scene with an animated virtual character.
Extract spatial information from it and achieve a 3D to 2D projection of the skeleton joints onto the camera image plane.
This project was part of the Computer Vision course at the University of Trento, a.y. 2023/2024, during the first year of my Master's degree in AI Systems.
I have always found the possibilities of virtual environments fascinating and practically limitless, so I was quite happy to work in such a context for a university project.
Optical tracking systems are able to gather information about trackers at an impressive speed (those we have in the labs go up to 360 fps).
The information provided by these sensors that we mostly care about is the 3D position of each tracker over time.
A possible solution to this loss problem is to apply some form of external tracking, so that when a sensor's information is lost we can try to predict it.
A simple way to achieve this is to apply naive filters! We propose two: a Kalman filter (the most popular and obvious choice) and a particle filter.
Starting with the Kalman filter:
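As a reference, here is a minimal sketch of how such a filter could be set up for a single tracker using OpenCV's `cv2.KalmanFilter`, assuming a constant-velocity motion model; the noise covariances below are illustrative placeholders, not the values we actually tuned:

```python
import numpy as np
import cv2

# Minimal constant-velocity Kalman filter for one 3D tracker:
# state = [x, y, z, vx, vy, vz], measurement = [x, y, z].
dt = 1.0 / 360.0  # frame period at 360 fps

kf = cv2.KalmanFilter(6, 3)
kf.transitionMatrix = np.array([
    [1, 0, 0, dt, 0,  0],
    [0, 1, 0, 0,  dt, 0],
    [0, 0, 1, 0,  0,  dt],
    [0, 0, 0, 1,  0,  0],
    [0, 0, 0, 0,  1,  0],
    [0, 0, 0, 0,  0,  1],
], dtype=np.float32)
kf.measurementMatrix = np.hstack([np.eye(3), np.zeros((3, 3))]).astype(np.float32)
kf.processNoiseCov = np.eye(6, dtype=np.float32) * 1e-4      # illustrative
kf.measurementNoiseCov = np.eye(3, dtype=np.float32) * 1e-2  # illustrative
kf.errorCovPost = np.eye(6, dtype=np.float32)

def kf_step(measurement):
    """Advance one frame; `measurement` is a 3-vector, or None when occluded."""
    prediction = kf.predict()
    if measurement is not None:
        kf.correct(np.asarray(measurement, dtype=np.float32).reshape(3, 1))
    # When the tracker is occluded we simply keep the predicted position.
    return prediction[:3].ravel()
```

The particle filter can be sketched just as compactly; a bootstrap version with random-walk motion and a Gaussian likelihood (particle count and noise scales again being illustrative) looks like this:

```python
import numpy as np

N = 1000
particles = np.zeros((N, 3))   # initialise at the first known position
weights = np.full(N, 1.0 / N)

def pf_step(measurement, motion_std=0.005, meas_std=0.01):
    """Advance one frame; `measurement` is a 3-vector, or None when occluded."""
    global particles, weights
    # Predict: diffuse every particle with random-walk noise.
    particles += np.random.normal(0.0, motion_std, particles.shape)
    if measurement is not None:
        # Update: weight each particle by its likelihood under the measurement.
        d2 = np.sum((particles - np.asarray(measurement)) ** 2, axis=1)
        weights = np.exp(-0.5 * d2 / meas_std ** 2)
        weights /= weights.sum()
        # Resample to avoid weight degeneracy.
        idx = np.random.choice(N, size=N, p=weights)
        particles = particles[idx]
        weights = np.full(N, 1.0 / N)
    # Estimate = weighted mean of the particles.
    return np.average(particles, axis=0, weights=weights)
```

What both filters share is the property we need: the predict step keeps producing an estimate even on frames where there is no measurement to correct it with, which is exactly what happens when a tracker is occluded.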
By simulating a virtual environment inside Unreal Engine 5.4 (UE), the goal is to extract the pose information necessary
to achieve a 3D to 2D projection of all the joints, together with the skeletal structure, onto the camera image plane.
That was our first time playing around with UE, but after a while we got the hang of it: UE makes it possible to interact with level (scene) components either
via C++ code or via Blueprints (BP), a visual representation of code functions through node graphs. As it was our first time with the engine, we went for the BP approach.
First of all we modelled the scene, inserting two core Blueprints: one containing our main actor and the other containing the camera.
Using the animation retargeting feature provided by UE5, we were also easily able to map the provided animation onto another free skeleton from
Adobe Mixamo characters.
We chose OpenCV as our image processing framework, as it provides everything we need to perform a 3D to 2D projection of a set of points onto the camera image plane.
We had a few things to deal with first: UE5 and OpenCV use two different coordinate systems. UE adopts a left-handed, Z-up convention (X forward, Y right, Z up) with units in centimeters, while OpenCV's camera frame is right-handed, with X pointing right, Y down and Z forward along the optical axis, so points and camera poses must be converted before projecting.
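To make this concrete, here is a minimal sketch of the projection step, assuming the axis mapping just described and pinhole intrinsics derived from the UE camera's resolution and field of view; the names, camera pose and values are illustrative rather than our exact pipeline:

```python
import numpy as np
import cv2

def ue_to_cv(points_ue):
    """Convert Nx3 UE world points (X fwd, Y right, Z up, cm, left-handed)
    to the OpenCV camera convention (X right, Y down, Z fwd, meters)."""
    p = np.asarray(points_ue, dtype=np.float64) / 100.0   # cm -> m
    # Axis swap for a camera looking along UE +X: (x, y, z) -> (y, -z, x).
    return np.stack([p[:, 1], -p[:, 2], p[:, 0]], axis=1)

# Illustrative pinhole intrinsics: 1920x1080 camera with a 90-degree horizontal FOV.
w, h = 1920, 1080
fx = fy = (w / 2) / np.tan(np.radians(90) / 2)
K = np.array([[fx, 0, w / 2],
              [0, fy, h / 2],
              [0,  0,     1]], dtype=np.float64)

joints_ue = np.array([[120.0, 30.0, 170.0]])  # e.g. one joint, in UE centimeters
rvec = np.zeros(3)   # identity camera pose for the sketch; in practice this
tvec = np.zeros(3)   # comes from the UE camera transform
pixels, _ = cv2.projectPoints(ue_to_cv(joints_ue), rvec, tvec, K, None)
print(pixels.reshape(-1, 2))                  # 2D joint locations in pixels
```

Repeating this for every joint of the skeleton, frame by frame, gives the 2D skeleton overlay on the camera image plane.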
Instead of evaluating the results in Matplotlib, it is much better to forward the data to a more suitable environment such as Blender. To achieve this we used the deep-motion-editing repository as a basis, which provides a framework for building skeleton-aware neural networks and for interacting with Blender's Python APIs.
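As an illustration, here is a minimal sketch of how per-frame joint positions can be pushed into Blender as animated empties through the bpy API (run inside Blender's bundled Python); the file name and array layout are assumptions:

```python
import bpy
import numpy as np

# positions: (num_frames, num_joints, 3) array of joint locations,
# e.g. the filtered tracker output of our pipeline (hypothetical file name).
positions = np.load("joint_positions.npy")

# Create one empty object per joint and link it to the active collection.
empties = []
for j in range(positions.shape[1]):
    empty = bpy.data.objects.new(f"joint_{j}", None)
    bpy.context.collection.objects.link(empty)
    empties.append(empty)

# Keyframe every joint's location on every frame.
for f in range(positions.shape[0]):
    for j, empty in enumerate(empties):
        empty.location = tuple(positions[f, j])
        empty.keyframe_insert(data_path="location", frame=f + 1)
```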