Flying Drone DIY, part IV: custom autopilot programs
Executive summary
According to my development schedule, the next version of the drone was supposed to cover the camera and video transmission. However, I received a request to help with the programming part, which moved me ahead to work that had been planned only for the 4th version of the drone.
One request is to help automate the drone's return when it loses control in an area affected by electronic warfare equipment. Another is to automate drone navigation by optical flow. This publication covers the start of programming the drone's behavior and the mid-term results.
I successfully set up and tuned all the necessary emulation software and developed my first program for the drone. It helped me evaluate the SDK, the API, and the nuances of air vehicle navigation.
Of the two requests above, I started with the second one. I decomposed the problem and worked on it step by step. I have implemented recognition of visual objects in the video stream, achieved satisfactory detection of AprilTag markers, packed it all into a pipeline, and supplied it to the drone.
Technically it works, but the results still need improvement: (a) altitude should not change when the focal point changes; (b) changes in the tilt of the surface should be recognized more reliably. These are the next steps.
First autopilot program
Operating the drone goes over the MAVLink protocol. There are several options for working with it: the plain low-level protocol, MAVSDK, DroneKit, and ROS/ROS2. DroneKit is a superior toolset, but its development has been discontinued, and the ROS toolkit is narrowly specialized, so MAVSDK is the natural choice, with a fallback to the low-level API when necessary.
import asyncio

from mavsdk import System


async def connect_drone():
    drone = System()
    # udpin: bind a local UDP port and wait for the simulator/autopilot to send to it
    await drone.connect(system_address="udpin://127.0.0.1:14550")
    print("Connecting to drone...")
    async for state in drone.core.connection_state():
        if state.is_connected:
            print("Drone discovered!")
            break
    return drone
The code snippet is self-explanatory. It is worth noting the special format of the UDP address: udpin. When you run the operator script and the simulation environment on the same machine, you have to distinguish the listener and the writer in this manner.
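To make the distinction concrete, here is a minimal sketch: udpin binds a local port and waits for the autopilot or simulator to send packets to it, while udpout actively pushes packets to a given host and port (the udpout address below is a hypothetical remote endpoint, not part of my setup).

    # udpin: bind locally and wait for the SITL/autopilot to reach us
    await drone.connect(system_address="udpin://127.0.0.1:14550")
    # udpout: connect out to a remote endpoint instead (hypothetical address)
    # await drone.connect(system_address="udpout://192.168.1.10:14550")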
async def arm_and_takeoff(drone):
    default_alt = 1.7  # by default 2 meters is a target alt on takeoff (1.7 for gazebo iris_runaway)
    print("Arming drone...")
    await drone.action.arm()
    print(f"Taking off to {default_alt} meters...")
    await drone.action.takeoff()
    async for pos in drone.telemetry.position():
        alt = pos.relative_altitude_m
        print(f"  Altitude: {alt:.1f} m")
        if alt >= default_alt * 0.95:
            print("Reached target altitude")
            break
        await asyncio.sleep(0.5)
The code is self-explanatory, except for a few interesting moments. The first is that arming can fail: arming a drone requires several parameters to be satisfied as part of the safety checks. The process can also take time, so in a production-ready scenario you would run extra checks with an exponential backoff algorithm before the arm command succeeds (see the sketch below). The second interesting moment is the physical nature of the subject being programmed here: you have to wait until the physical motors do their job, and this waiting has to be implemented one way or another.
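As an illustration of that first point, here is a minimal sketch of arming with retries and exponential backoff (the retry count and delays are arbitrary; ActionError is the exception MAVSDK-Python raises when a command is rejected):

    from mavsdk.action import ActionError

    async def arm_with_backoff(drone, retries=5, base_delay=1.0):
        for attempt in range(retries):
            try:
                await drone.action.arm()
                return True
            except ActionError as e:
                # Arming was rejected (pre-arm checks not satisfied yet) -- wait and retry
                delay = base_delay * (2 ** attempt)
                print(f"Arm rejected ({e}), retrying in {delay:.0f} s...")
                await asyncio.sleep(delay)
        return False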
from mavsdk.offboard import PositionNedYaw


async def reach_altitude(drone, target_alt):
    # Note: the vehicle follows offboard setpoints only after drone.offboard.start()
    # has been called (with an initial setpoint already sent).
    async for position in drone.telemetry.position():
        alt = position.relative_altitude_m
        print(f"Altitude: {alt:.1f} m")
        if alt >= target_alt:
            print("Reached target altitude")
            break
        await drone.offboard.set_position_ned(PositionNedYaw(0.0, 0.0, -target_alt, 0.0))
        await asyncio.sleep(0.5)
The code is self-explanatory, except for NED. NED (North-East-Down) is a local Cartesian coordinate system referenced to the drone's starting position:
X → North
Y → East
Z → Down (so altitude above the origin is a negative Z value)
It is used in almost all autopilots (ArduPilot, PX4). The key point is that all coordinates are absolute with respect to the origin of this local frame. Calling "send to north" twice with the same value of 80 m makes the drone fly to 80 m north of the origin, not 160 m.
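A small sketch to illustrate this (the 80 m / 10 m numbers are just examples, and offboard mode is assumed to be already started):

    # Both calls target the same absolute point in the local NED frame:
    # 80 m north of the origin, 0 m east, 10 m above it (negative Z = up).
    await drone.offboard.set_position_ned(PositionNedYaw(80.0, 0.0, -10.0, 0.0))
    # ... later, the very same call again:
    await drone.offboard.set_position_ned(PositionNedYaw(80.0, 0.0, -10.0, 0.0))
    # The drone keeps holding ~80 m north of the origin; it does not fly another 80 m.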
As a first step in drone programming, I work in a simulation environment. There are several options, but QGroundControl is Linux-native. Altogether it has shown good results. The whole code snippet, together with instructions for deployment, is available at this link.
Auto Stabilization R&D
Visual control
QGroundControl is good in many respects, but it is a 2D map visualization; for my next task I needed visual confirmation of the drone's tilt, which requires a 3D perspective.
There are several solutions for this, and Gazebo is among the most popular ones. It has several caveats on installation, e.g. the mandatory option --model json, otherwise it will start the wrong server, but the community is rich and helpful, so you should not hit serious blockers.
Object recognition
The initial task was decomposed into several smaller parts. At first we have to stabilize the drone not by the image of the surface itself, but by ArUco markers. During development it turned out that the demo video I used contains AprilTag markers instead of ArUco (they have many similarities, so it is easy to mix them up), and a different set of algorithms is needed to work with them.
The first goal is to recognize this marker in a single frame.
Directly launching the classic OpenCV algorithms gave unsatisfactory results. During R&D I ended up with the approach: prepare images -> find squares as tag candidates -> pass each candidate to the AprilTag detector.
def find_squares(self, frame, min_area=2000, max_area=15000, aspect_tol=0.3):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blur, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    squares = []
    for cnt in contours:
        epsilon = 0.02 * cv2.arcLength(cnt, True)
        approx = cv2.approxPolyDP(cnt, epsilon, True)
        if len(approx) == 4 and cv2.isContourConvex(approx):
            x, y, w, h = cv2.boundingRect(approx)
            area = w * h
            ratio = w / float(h) if h else 0
            if area >= min_area and area <= max_area and abs(ratio - 1) < aspect_tol:
                squares.append((x, y, w, h))
    return squares
| Stage | Operation | Algorithm / Method | Purpose |
|---|---|---|---|
| Preprocessing | Color to Grayscale | Color space conversion | Reduces image dimensionality and prepares data for intensity-based gradient analysis. |
| Preprocessing | Gaussian Blur | Gaussian convolution | Suppresses high-frequency noise and stabilizes gradient computation for edge detection. |
| Edge Detection | Canny Edge Detector | Canny (Sobel + hysteresis) | Detects strong and weak edges based on image gradients, producing a binary edge map. |
| Contour Extraction | Find Contours | Suzuki–Abe algorithm | Extracts connected edge boundaries as ordered point sets representing object outlines. |
| Shape Approximation | Polygon Approximation | Douglas–Peucker | Simplifies contours by reducing the number of points while preserving the overall shape. |
| Geometric Validation | Convexity Check | Convex hull test | Filters out non-convex shapes that cannot represent valid square-like objects. |
| Geometric Filtering | Bounding Box & Aspect Ratio | Axis-aligned bounding box | Applies area and aspect-ratio constraints to reject objects that do not match expected square geometry. |
def detect_tags_in_roi(self, roi):
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    return self.apriltag_detector.detect(gray)

def detect_apriltag(self, frame, squares):
    best = None
    for (x, y, w, h) in squares:
        roi = frame[y:y+h, x:x+w]
        tags = self.detect_tags_in_roi(roi)
        for t in tags:
            corners = np.array(t.corners)
            side = np.mean([
                np.linalg.norm(corners[0] - corners[1]),
                np.linalg.norm(corners[1] - corners[2])
            ])
            cx = corners[:, 0].mean() + x
            cy = corners[:, 1].mean() + y
            if not best or side > best.px_size:
                best = TargetDetection(cx, cy, side, "tag")
    return best
After the squares (potential AprilTag candidates) have been found, each of them is passed to the AprilTag detector for actual tag recognition.
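Chained together, the per-frame flow looks roughly like this (a sketch; `tracker` is just a placeholder for the object holding the methods above, and the distance helper is shown right below):

    squares = tracker.find_squares(frame)           # coarse square candidates
    best = tracker.detect_apriltag(frame, squares)  # largest confirmed AprilTag, if any
    if best:
        distance_m = tracker.estimate_distance_from_px(best.px_size)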
def estimate_distance_from_px(self, px_size):
    if px_size <= 0:
        return None
    return (self.focal_length_px * self.tag_size_m) / px_size
The distance is estimated using a pinhole camera model, where
the real object size and its projected size in pixels are
related through the camera focal length.
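Plugging in some illustrative numbers (assumed, not taken from the real camera calibration): with a focal length of 800 px and a 16 cm tag that appears 80 px wide in the frame, the estimated distance is 800 * 0.16 / 80 = 1.6 m.

    focal_length_px = 800.0   # assumed calibration value
    tag_size_m = 0.16         # assumed physical tag size
    px_size = 80.0            # apparent tag side in the frame
    distance_m = (focal_length_px * tag_size_m) / px_size   # -> 1.6 m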
async def _iteration(self, target_alt):
    frame = await self.video.read()
    if frame is None:
        await self._on_frame_lost()
        return
    detection = self.tracker.process(frame)
    if self.tracker.is_lost():
        print("[Safety] Target lost → landing")
        await self.hardware.land()
        self._run = False
        return
    if detection:
        vx, vy, vz, _ = self.velocity_ctrl.compute(detection, target_alt)
        await self._send_velocity_safe(vx, vy, vz)
    else:
        await self._hover_if_needed()
    self._draw_debug(frame, detection)
    self._display(frame)
This code snippet shows the high-level loop of the process described above. Once the distance has been calculated from the recognized AprilTag, the data is turned into velocity commands and sent to the drone to stabilize its flight.
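The velocity controller itself is not shown above. As a rough illustration only, a minimal proportional controller could map the tag's pixel offset from the image centre and the altitude error to NED velocity commands. The class below is a sketch with made-up gains; the cx/cy/px_size field names follow the TargetDetection constructor above, and the signature differs slightly from `velocity_ctrl.compute` in the loop (it also takes the current altitude):

    class SimpleVelocityController:
        # Sketch of a P-controller: pixel offset -> horizontal velocity,
        # altitude error -> vertical velocity. Gains are arbitrary examples.
        def __init__(self, k_xy=0.002, k_z=0.5, frame_w=640, frame_h=480):
            self.k_xy, self.k_z = k_xy, k_z
            self.cx, self.cy = frame_w / 2, frame_h / 2

        def compute(self, detection, current_alt, target_alt):
            err_x = detection.cx - self.cx   # tag offset from image centre, px
            err_y = detection.cy - self.cy
            vx = -self.k_xy * err_y          # forward/back to centre the tag (camera looking down)
            vy = self.k_xy * err_x           # right/left to centre the tag
            vz = -self.k_z * (target_alt - current_alt)  # NED: negative vz climbs
            return vx, vy, vz, 0.0           # yaw rate kept at zero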
There are many more things behind the scenes, but the format of this publication only allows focusing on the key points. The whole code is available on GitHub.
A few related publications:
Flying Drone DIY, Part III: Assembling and Tuning
Flying drone DIY, part II: configuration for the 1st version
Flying drone DIY, part I: discovery phase