For ENME480: Introduction to Robotics at the University of Maryland, I programmed a Universal Robots UR3e 6-DOF robotic arm to autonomously detect ArUco-tagged blocks using an overhead camera and stack them into a tower. The full system integrated forward kinematics, inverse kinematics, computer vision, and a ROS2 control pipeline — all running on the physical robot.
The project was built progressively across four lab assignments over the semester, each adding one layer of the final system, described in turn below.
The UR3e has 6 joints, each introducing one rotational degree of freedom. To determine the position and orientation of the end effector from a given set of joint angles, I derived the forward kinematics using the Denavit-Hartenberg (DH) convention. Each consecutive frame-to-frame transformation is defined by four parameters — link length (a), link twist (α), link offset (d), and joint angle (θ) — and chained together to produce the full base-to-end-effector transformation matrix.
The UR3e physical dimensions used in the model (in mm): d1 = 152, a2 = 244, a3 = 213, d4 = 93, d5 = 83, d6 = 83, with an additional tool offset of 53.5 mm lateral and 59 mm axial for the suction gripper. The world frame origin was offset 150 mm × 150 mm from the robot base corner, with base plate thickness z = 10 mm.
I implemented calculate_dh_transform() in Python — a function that takes a list of six joint angles and returns the 4×4 homogeneous transformation matrix. This was validated in Gazebo by comparing computed end effector positions against the simulator's reported pose via the /ur3/position ROS2 topic.
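The FK chain described above can be sketched as follows. The per-joint DH matrix is standard; the `calculate_dh_transform` name comes from the lab code, but its exact signature and the `(a, α, d)` tuple layout here are assumptions for illustration:

```python
import numpy as np

def dh_matrix(a, alpha, d, theta):
    """Single DH transform from frame i-1 to frame i."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    ct, st = np.cos(theta), np.sin(theta)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def calculate_dh_transform(joint_angles, dh_params):
    """Chain the frame-to-frame transforms into the full base-to-end-effector
    4x4 homogeneous matrix. dh_params is a list of (a, alpha, d) per joint."""
    T = np.eye(4)
    for (a, alpha, d), theta in zip(dh_params, joint_angles):
        T = T @ dh_matrix(a, alpha, d, theta)
    return T
```

With the UR3e dimensions listed above filled into `dh_params`, the translation column `T[:3, 3]` is the end-effector position that was checked against the `/ur3/position` topic in Gazebo.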
With the FK model established, I derived an analytical inverse kinematics solution to compute the six joint angles needed to reach a desired end effector position (x, y, z) and yaw. This was implemented in the inverse_kinematics() function and validated on the physical robot.
A laser pointer was attached to the end effector. I commanded the robot to five target Cartesian positions and measured where the laser hit the table, comparing against my FK model's predictions. Discrepancies were analyzed and attributed to joint backlash, DH parameter approximations, and tool offset calibration error. Singularity conditions — such as wrist alignments and fully extended configurations — were identified and handled in the IK solution.
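The full analytical UR3e solution is lengthy, but its core step — solving the shoulder/elbow pair in the arm plane with the law of cosines, using the a2 and a3 link lengths from the DH table — can be sketched as below. The function names and the `branch` parameter are illustrative, not the lab code; note that the fully extended singularity mentioned above appears here as |cos θ₃| = 1:

```python
import numpy as np

A2, A3 = 0.244, 0.213  # upper-arm and forearm lengths from the DH table (m)

def planar_arm_ik(x, z, branch=1):
    """Shoulder/elbow angles reaching (x, z) in the arm plane.
    branch = +1 or -1 selects one of the two elbow configurations."""
    r2 = x**2 + z**2
    # Law of cosines for the elbow angle; |c3| = 1 is the fully
    # extended (or fully folded) singular configuration.
    c3 = (r2 - A2**2 - A3**2) / (2 * A2 * A3)
    if abs(c3) > 1.0:
        raise ValueError("target out of reach")
    q3 = branch * np.arccos(c3)
    # Shoulder angle: target direction minus the elbow's contribution.
    q2 = np.arctan2(z, x) - np.arctan2(A3 * np.sin(q3), A2 + A3 * np.cos(q3))
    return q2, q3

def planar_fk(q2, q3):
    """Forward check for the two-link plane."""
    x = A2 * np.cos(q2) + A3 * np.cos(q2 + q3)
    z = A2 * np.sin(q2) + A3 * np.sin(q2 + q3)
    return x, z
```

Running the recovered angles back through `planar_fk` is a quick consistency check of the IK branch selection.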
An overhead USB camera was mounted above the workspace facing straight down. To locate each block in the robot's coordinate frame, I needed to convert pixel coordinates from the camera image into real-world table-frame coordinates in mm.
Using get_perspective_warping_with_aruco.py, I clicked four known reference points on the table edges (clockwise from bottom-left) and entered their real-world coordinates in mm. OpenCV's getPerspectiveTransform() computed a 3×3 perspective matrix, saved as perspective_matrix.npy, which corrects for the camera's viewing angle (though not for lens distortion) so that any pixel coordinate maps to an accurate world-frame position.
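The four-point mapping is a planar homography. A minimal sketch of the same solve that getPerspectiveTransform() performs, reproduced here with a small NumPy linear system for illustration (the point values are made up, not the lab's calibration data):

```python
import numpy as np

def perspective_matrix(src_px, dst_mm):
    """Solve for the 3x3 homography H mapping four pixel points to four
    table-frame points (mm) -- the same matrix getPerspectiveTransform
    returns. Builds the standard 8x8 DLT system with H[2,2] fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src_px, dst_mm):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)
```

Mapping a pixel then means multiplying by H in homogeneous coordinates and dividing by the third component.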
Each block had a unique ArUco marker on its top face. Using OpenCV's ArUco detection module, I detected the markers in each camera frame, took each marker's pixel-space center, and passed it through image_frame_to_table_frame() to convert pixel coordinates to table-frame (x, y) positions in mm.
This detection logic was wrapped into a ROS2 node (block_detection_aruco.py) that continuously published detected block positions to /aruco_detection/positions.
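The pixel-to-table conversion the node relies on reduces to one homogeneous multiply and divide. A sketch, assuming perspective_matrix.npy stores the 3×3 matrix from getPerspectiveTransform() (the function body is illustrative; only the name comes from the lab code):

```python
import numpy as np

def image_frame_to_table_frame(px, py, M):
    """Map a pixel (px, py) through the 3x3 perspective matrix M to
    table-frame (x, y) in mm. The homogeneous divide by w handles the
    projective part of the warp."""
    u, v, w = M @ np.array([px, py, 1.0])
    return u / w, v / w

# Typical use inside the detection node:
# M = np.load("perspective_matrix.npy")
# x_mm, y_mm = image_frame_to_table_frame(cx, cy, M)
```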
The final integration was main_pipeline.py, which orchestrated the full autonomous sequence. I implemented four key functions:
The first publishes a set of joint angles to the /ur3e/command topic using the CommandUR3e message structure, driving the arm to a target configuration.
The second toggles the suction gripper on or off while holding the current arm pose, publishing to the gripper control topic.
The third sequences the full pick-and-place motion for one block: approach above the block → lower to pick height → activate suction → raise → translate to the destination → lower → release suction → raise. The robot never drags blocks across the table surface.
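That approach → pick → lift → place sequence can be sketched as a waypoint generator. The hover and pick heights here are illustrative constants, not the lab's tuned values, and the actual publishing to the arm and gripper topics is omitted:

```python
SAFE_Z, PICK_Z = 120.0, 15.0  # hover and pick heights in mm (assumed values)

def pick_and_place_waypoints(block_xy, dest_xyz):
    """Cartesian waypoints plus gripper state for one block.
    Each step is ((x, y, z), suction_on)."""
    bx, by = block_xy
    dx, dy, dz = dest_xyz
    return [
        ((bx, by, SAFE_Z), False),  # approach above the block
        ((bx, by, PICK_Z), False),  # lower to pick height
        ((bx, by, PICK_Z), True),   # activate suction
        ((bx, by, SAFE_Z), True),   # raise -- never drag along the table
        ((dx, dy, SAFE_Z), True),   # translate above the destination
        ((dx, dy, dz), True),       # lower onto the stack
        ((dx, dy, dz), False),      # release suction
        ((dx, dy, SAFE_Z), False),  # retreat
    ]
```

Keeping the raise/translate steps at SAFE_Z is what guarantees blocks are carried, not dragged.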
The fourth, given the detected block IDs and their table-frame positions, determines the order of operations and the destination for each block. For the stacking task, each subsequent block was placed on top of the previous one with an incrementing height offset.
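The incrementing-height logic can be sketched as below. The block height, base position, and the choice to stack in ascending marker-ID order are all assumptions for illustration; only the "each layer one block height above the last" rule comes from the description:

```python
BLOCK_H = 25.0  # assumed block height in mm

def plan_stack(blocks, base_xy, base_z=0.0):
    """Assign each detected block a destination on the growing tower.
    blocks maps marker id -> (x, y) table-frame position; stacking
    proceeds in ascending id order, each layer BLOCK_H above the last."""
    plan = []
    for i, marker_id in enumerate(sorted(blocks)):
        dest = (base_xy[0], base_xy[1], base_z + i * BLOCK_H)
        plan.append((marker_id, blocks[marker_id], dest))
    return plan
```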
The complete pipeline ran as four concurrent ROS2 nodes, including the block detection node and the main pipeline node.
The entire stack ran inside a Docker container using ROS2 Humble, with Gazebo simulation used for pre-lab testing and logic validation before deploying to the physical UR3e in the Robotics and Autonomous Laboratory (RAL).