Robotic manipulator

A pick and place system for an anthropomorphic arm

Project overview

Problems

Even in a "simple" pick & place application robotic manipulators might need to be responsive to changes in the target object's position.

Considered such a scenario how do we achieve so?

Methods

Synthetic dataset generation;

Object detection and classification;

Object pose estimation;

Robotic kinematics;

ROS architecture design.

Tools

Python;

C++;

ROS;

Blender;

YOLO;

Gazebo.

Goals

Make a UR5 robotic arm grasp randomly placed Mega Blocks (big LEGO-like bricks) and place them one by one into a specific location according to their class, making the dynamics computation and the sensing components work fluently within a ROS environment.

Developed in collaboration with Sergio Brodesco and Dennis Cattoni (2023).

Context

This project was part of the "Introduction to robotics" course at the University of Trento, a.y. 2022/2023, during the last year of my Bachelor's degree in Computer Science.

From an academic perspective, the objective was to put the knowledge acquired during the course into practice in a complex and varied context demanding coding, communication and engineering skills.

Naturally, the academic context involves, as always, some constraints. In our case we couldn't use the kinematics functions provided by the manipulator's manufacturer; we had to re-implement them from scratch instead, which required us to understand the mathematical details behind them.

Design process

1. ROS architecture design;

2. Computer vision;

3. Robot kinematics & dynamics;

4. Simulated environment testing;

5. Testing on the real robot.

The problem

Robotic manipulators working in assembly-line scenarios are generally programmed to perform specific, precise, pre-defined sequences of actions in order to achieve their goal... But that's not always the case! What if the workspace is non-static and the robot needs to first assess the situation and then act accordingly? That's the type of scenario we're dealing with in this project.

What we have at our disposal is a UR5 robotic arm, a ZED2 stereo camera and the 3D models of the blocks.

The robot must pick all the blocks scattered on a table and move them, one by one, to a specific location according to their class.

Addressing the problem

ROS is the backbone framework we use to develop the building blocks that compose the whole system:



There are basically 3 main nodes handling the major challenges we need to face (a minimal node sketch follows the list):

Image processor: in charge of classifying the objects and estimating their position & rotation in space
Motion processor: in charge of handling the robot's kinematics (trajectory computation, collision detection, ...)
Task planner: intermediate node which instructs the robot given the current state of the environment
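
To give a rough idea of how these nodes talk to each other, here is a minimal rospy sketch of the task planner node; topic names and message types are placeholders chosen for illustration, not the interfaces of the actual project.

#!/usr/bin/env python
# Minimal sketch of the task-planner node (placeholder topics and message
# types, not the actual interfaces used in the project).
import rospy
from std_msgs.msg import String

def on_detections(msg):
    # msg.data is assumed to carry the detected blocks (class + pose);
    # the planner turns it into a command for the motion processor.
    rospy.loginfo("received detections: %s", msg.data)
    motion_pub.publish("pick_and_place " + msg.data)

if __name__ == "__main__":
    rospy.init_node("task_planner")
    motion_pub = rospy.Publisher("/motion_processor/command", String, queue_size=10)
    rospy.Subscriber("/image_processor/detections", String, on_detections)
    rospy.spin()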

Computer vision

Nowadays object detection and classification are fairly easy tasks, with deep models such as YOLO capable of performing accurate detections in real-time systems. The first problem we're facing is that those models require good, labelled data to perform well. What we decided to do was make use of the 3D models we had to generate our data synthetically inside Blender, which is perhaps the most famous 3D modelling software on the market.

For this reason, a replica of the real-world setting has been modelled inside Blender, together with a script able to automatically render and annotate YOLO-friendly images showing random placements of the 3D models, while also varying other scene parameters.
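
Below is a condensed sketch of what such a script can look like; object and camera names, ranges, sample count and output paths are illustrative placeholders rather than the project's actual ones, and lighting/material randomization is omitted for brevity.

# Condensed Blender data-generation sketch (names, ranges and paths are
# placeholders; lighting/material randomization omitted for brevity).
import random
import bpy
from bpy_extras.object_utils import world_to_camera_view

scene = bpy.context.scene
cam = bpy.data.objects["Camera"]
blocks = [o for o in bpy.data.objects if o.name.startswith("block_")]

for i in range(100):                          # number of rendered samples
    # random placement on the table plane + random yaw for every block
    for obj in blocks:
        obj.location.x = random.uniform(-0.4, 0.4)
        obj.location.y = random.uniform(-0.3, 0.3)
        obj.rotation_euler.z = random.uniform(0.0, 6.283)
    bpy.context.view_layer.update()           # refresh matrix_world before projecting

    labels = []
    for cls_id, obj in enumerate(blocks):
        # project mesh vertices into the camera to get a 2D bounding box
        xs, ys = [], []
        for v in obj.data.vertices:
            co = world_to_camera_view(scene, cam, obj.matrix_world @ v.co)
            xs.append(co.x)
            ys.append(1.0 - co.y)             # YOLO's origin is the top-left corner
        x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
        labels.append(f"{cls_id} {(x_min + x_max) / 2:.6f} {(y_min + y_max) / 2:.6f} "
                      f"{x_max - x_min:.6f} {y_max - y_min:.6f}")

    scene.render.filepath = f"/tmp/dataset/img_{i:04d}.png"
    bpy.ops.render.render(write_still=True)
    with open(f"/tmp/dataset/img_{i:04d}.txt", "w") as f:
        f.write("\n".join(labels))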



After training on a fully synthetic dataset, the YOLO classifier proved to generalize well in the real world too.
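
For reference, the training itself is then mostly a matter of pointing a YOLO implementation at the generated dataset; the snippet below uses the ultralytics API as one possible option, with placeholder file names and hyperparameters rather than the ones we actually used.

# One possible way to train on the synthetic set (ultralytics API shown for
# illustration; dataset file and hyperparameters are placeholders).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                       # start from a pretrained backbone
model.train(data="megablocks.yaml", epochs=100, imgsz=640)
metrics = model.val()                            # evaluate on the validation split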



Obtaining an object's position after a good detection step is easy, since the ZED2 camera has a depth sensor: it boils down to a simple deprojection (a generic version is sketched below). We still need to acquire the object's 3D rotation in space, though. The easiest way to achieve that would probably be to use a different deep architecture and feed it rotation information as well, but we decided to go another way.
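
As an aside on the position part, the deprojection reduces to the pinhole camera model; the snippet below is a generic version of that computation and does not use the ZED SDK, with placeholder intrinsics.

# Generic pinhole deprojection of a pixel + depth to a 3D point in the
# camera frame (placeholder intrinsics, not the actual ZED2 calibration).
import numpy as np

def deproject(u, v, depth, fx, fy, cx, cy):
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# e.g. the center of a detected bounding box, depth in meters
point_cam = deproject(u=640, v=360, depth=0.85, fx=700.0, fy=700.0, cx=640.0, cy=360.0)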

What we do is basically consider the point cloud captured by the ZED2 inside an object's bounding box, then sample a good amount of points from the 3D model of the corresponding class and register these two point clouds by extracting FPFH features, running RANSAC and refining its result with ICP.

As long as the 3D models are accurate enough and the ZED2 captures a decent amount of points on the object, we're able to extract the rotation matrix produced by the point cloud registration and use it as the object's rotation.
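
A registration pipeline of this kind can be sketched with Open3D, for instance; the library choice, voxel size and RANSAC parameters below are illustrative and not necessarily the exact implementation used in the project.

# FPFH + RANSAC + ICP registration sketch (Open3D shown for illustration;
# parameter values are indicative, not the project's tuned ones).
import open3d as o3d

def estimate_object_rotation(scene_pcd, model_pcd, voxel=0.005):
    def preprocess(pcd):
        down = pcd.voxel_down_sample(voxel)
        down.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 2, max_nn=30))
        fpfh = o3d.pipelines.registration.compute_fpfh_feature(
            down, o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 5, max_nn=100))
        return down, fpfh

    src, src_fpfh = preprocess(model_pcd)    # points sampled from the class's 3D model
    tgt, tgt_fpfh = preprocess(scene_pcd)    # points cropped from the ZED2 point cloud

    # global alignment: FPFH feature matching + RANSAC
    ransac = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        src, tgt, src_fpfh, tgt_fpfh, True, voxel * 1.5,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False), 3,
        [o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(voxel * 1.5)],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))

    # local refinement with ICP, starting from the RANSAC estimate
    icp = o3d.pipelines.registration.registration_icp(
        src, tgt, voxel * 0.4, ransac.transformation,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())

    return icp.transformation[:3, :3]        # rotation part of the 4x4 transform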

Motion processor

Here follows the dynamics scheme we opted for:
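
Since the manufacturer's kinematics functions were off limits (see the constraint above), the kinematics had to be written from scratch; as a rough idea of what that involves, here is a minimal forward-kinematics sketch based on the UR5's nominal standard DH parameters, with values to be treated as indicative rather than calibrated.

# UR5 forward kinematics from standard DH parameters (nominal datasheet
# values; indicative, not a calibrated model).
import numpy as np

# (a, alpha, d) per joint, standard DH convention
DH = [
    (0.0,      np.pi / 2, 0.089159),
    (-0.425,   0.0,       0.0),
    (-0.39225, 0.0,       0.0),
    (0.0,      np.pi / 2, 0.10915),
    (0.0,     -np.pi / 2, 0.09465),
    (0.0,      0.0,       0.0823),
]

def dh_transform(a, alpha, d, theta):
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(q):
    # q: array of 6 joint angles; returns the 4x4 end-effector pose
    T = np.eye(4)
    for (a, alpha, d), theta in zip(DH, q):
        T = T @ dh_transform(a, alpha, d, theta)
    return T

print(forward_kinematics(np.zeros(6)))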

Further information & source code available on GitHub!

My contribution to the project

I was mainly responsible for the entire computer vision part, but I also joined my colleagues in developing all the other components.