6. AI Visual Recognition & Tracking Course
6.1 Single Color Recognition
In this section, the camera detects colors. When a red ball is recognized, the buzzer will emit a beep, and the red ball will be highlighted in the transmitted image with “Color: red” displayed.
6.1.1 Program Description
The implementation of color recognition consists of two parts: color detection and execution feedback after recognition.
First, for the color detection part, Gaussian filtering is applied to the image to reduce noise. The image is then converted to the Lab color space (you can learn more about the Lab color space in the “OpenCV Vision Basic Course” section of the tutorial materials).
Next, the object’s color within the circle is recognized using color thresholding, followed by masking (masking involves using selected images, shapes, or objects to globally or locally obscure the image being processed).
After performing morphological operations such as opening and closing on the object image, the object with the largest contour is circled.
Opening: The image undergoes erosion followed by dilation. This operation removes small objects, smooths shape boundaries, and preserves the area. It can eliminate small noise particles and separate connected objects.
Closing: The image undergoes dilation followed by erosion. This operation fills small holes within objects, connects nearby objects, closes broken contour lines, and smooths boundaries while preserving the area.
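The two operations above can be sketched with NumPy alone. This is a minimal illustration, not the program's implementation (the program calls cv2.erode() and cv2.dilate()); the helper names erode and dilate and the 7x7 test masks are hypothetical, and pixels outside the image are assumed to be background.

```python
import numpy as np

def erode(mask: np.ndarray) -> np.ndarray:
    """3x3 erosion: keep a pixel only if it and all 8 neighbors are set."""
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=0)  # assumption: outside = background
    out = np.ones_like(mask)
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def dilate(mask: np.ndarray) -> np.ndarray:
    """3x3 dilation: set a pixel if it or any of its 8 neighbors is set."""
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=0)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

# A solid 3x3 object in a 7x7 image (values are 0 or 1).
block = np.zeros((7, 7), dtype=np.uint8)
block[2:5, 2:5] = 1

# Opening (erode, then dilate) removes an isolated noise pixel.
noisy = block.copy()
noisy[0, 6] = 1
opened = dilate(erode(noisy))

# Closing (dilate, then erode) fills a one-pixel hole inside the object.
holed = block.copy()
holed[3, 3] = 0
closed = erode(dilate(holed))
```

In both cases the result equals the clean 3x3 block: the noise pixel is gone after opening, and the hole is filled after closing.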
After recognition, the servo and buzzer are set up to provide feedback based on the detected color. For example, when red is detected, the buzzer will emit a sound.
For detailed feedback behavior, please refer to 6.1.3 Program Outcome of this document.
6.1.2 Start and Close the Game
Note
The input command is case-sensitive, and keywords can be auto-completed using the Tab key.
(1) Power on the device and, following the instructions in “Remote Desktop Installation and Connection\3.1 VNC Installation and Connection”, use the VNC remote connection tool to connect.
(2) Click the terminal icon in the top left corner of the system desktop, or press the shortcut “Ctrl+Alt+T”, to open the Terminator terminal.
(3) Execute the command to navigate to the directory where the program is located, then press Enter:
cd spiderpi/functions
(4) Enter the command and press Enter to start the program:
python3 color_recognition.py
(5) To close the program, simply press “Ctrl+C” in the LX terminal. If it does not close, press it multiple times.
6.1.3 Program Outcome
After starting the game, the camera will be used to detect colors. When a red ball is recognized, the buzzer will emit a beep sound, and the ball will be circled in the transmitted image, with “Color: red” printed.
Note
During the recognition process, ensure the environment is well-lit to avoid inaccurate recognition due to poor lighting conditions.
Ensure that no objects with similar or matching colors to the target are present in the background within the camera’s visual range, as this may cause misrecognition.
If color recognition is inaccurate, refer to the section “6.1.5 Function Extensions -> Adjusting Color Thresholds” in this document to adjust the color threshold settings.
6.1.4 Program Analysis
The source code of this program is saved in: /home/pi/spiderpi/functions/color_recognition.py
Import Function Library
import sys
import cv2
import math
import time
import threading
import numpy as np
from common import misc
from common import yaml_handle
from calibration.camera import Camera
from sensor.ultrasonic_sensor import Ultrasonic
(1) Import Libraries for OpenCV, Time, Math, and Threading
To use a function from a library, call it with the syntax:
library_name.function_name(parameter1, parameter2, ...)
For example, to call the sleep function from the time library, we use:
time.sleep(0.01)
Libraries such as time and math are part of the Python standard library and can be imported directly. cv2 (OpenCV) and numpy are third-party packages that must be installed on the system before they can be imported. You can also create your own libraries, like the yaml_handle file-reading library mentioned above.
(2) Instantiate a Class from a Library
Some module paths can be long and hard to remember. To simplify calls, we often import the class we need by name. For example:
from calibration.camera import Camera
After this import, we can create a Camera object and call its functions using the shorter syntax:
camera = Camera()
camera.function_name(parameter1, parameter2, …)
This makes it much easier and more convenient to use.
1.4.2 Main Function Analysis
In a Python program, the if __name__ == '__main__': block is the entry point that runs when the file is executed directly. Here the program starts by opening the camera and reading images.
(1) Image Processing
camera = Camera()
When the play mode starts, a Camera object is created and stored in “camera”; the video frames are then read from it.
(2) Entering Image Processing
When an image is read, the run() function is called for image processing.
while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()
        Frame = run(frame)
        cv2.imshow('Frame', Frame)
        key = cv2.waitKey(1)
        if key == 27:
            break
① The function img.copy() is used to copy the content of img to frame.
② The function run() performs image processing.
def run(img):
    global draw_color
    global color_list
    global detect_color
    global action_finish
    global count
    img_copy = img.copy()
    img_h, img_w = img.shape[:2]

    frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)
    frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)
    frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space
(3) Resizing the Image for Easier Processing
frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)
The first parameter img_copy is the input image.
The second parameter size is the size of the output image, given as (width, height). You can set it as needed.
The third parameter interpolation=cv2.INTER_NEAREST is the interpolation method. The available options include:
INTER_NEAREST: Nearest-neighbor interpolation.
INTER_LINEAR: Bilinear interpolation. This method is used by default if the parameter is omitted.
INTER_CUBIC: Bicubic interpolation within a 4x4 pixel neighborhood.
INTER_LANCZOS4: Lanczos interpolation within an 8x8 pixel neighborhood.
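To make nearest-neighbor interpolation concrete, the following NumPy sketch picks, for every output pixel, the source pixel at the scaled-down index. The function name resize_nearest is hypothetical, and the result is not guaranteed to match cv2.resize() pixel for pixel; it only illustrates the idea.

```python
import numpy as np

def resize_nearest(img: np.ndarray, size: tuple) -> np.ndarray:
    """Nearest-neighbor resize; size is (width, height), as in cv2.resize."""
    out_w, out_h = size
    in_h, in_w = img.shape[:2]
    # For each output row/column, compute the source index by integer scaling.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows[:, None], cols]

# A 4x4 test image whose pixel values are 0..15.
img = np.arange(16).reshape(4, 4)
small = resize_nearest(img, (2, 2))   # keeps rows 0 and 2, columns 0 and 2
```

Because no pixel values are blended, this method is fast but can look blocky, which is acceptable here since the resized frame is only used for color thresholding.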
(4) Gaussian Filtering
Images always contain some noise, which degrades image quality and makes features less prominent. Different filtering methods suit different types of noise; common ones include Gaussian filtering, median filtering, and mean filtering.
Gaussian filtering is a linear smoothing filter, suitable for eliminating Gaussian noise and widely used in the noise reduction process of image processing.
frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)
The first parameter frame_resize is the input image.
The second parameter (3, 3) is the size of the Gaussian kernel.
The third parameter 3 is the standard deviation of the Gaussian kernel in the X direction. Because no Y value is given, OpenCV uses the same standard deviation in the Y direction.
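As a rough illustration of what the kernel size and standard deviation mean, the sketch below builds a normalized 3x3 Gaussian kernel directly from the Gaussian formula. The function name gaussian_kernel is hypothetical, and this is not a claim about how OpenCV computes its kernel internally.

```python
import numpy as np

def gaussian_kernel(ksize: int, sigma: float) -> np.ndarray:
    """Build a normalized ksize x ksize Gaussian kernel."""
    # Offsets from the kernel center, e.g. [-1, 0, 1] for ksize = 3.
    ax = np.arange(ksize) - (ksize - 1) / 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    g /= g.sum()                 # 1-D weights sum to 1
    return np.outer(g, g)        # separable 2-D kernel

kernel = gaussian_kernel(3, 3.0)
```

With a standard deviation of 3 on a kernel only 3 pixels wide, the weights are nearly uniform (the center weight is only slightly larger than the corner weights), so the blur behaves much like a small mean filter.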
(5) Converting the Image to LAB Color Space
The function cv2.cvtColor() performs color space conversion.
frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space
The first parameter frame_gb is the input image.
The second parameter cv2.COLOR_BGR2LAB is the conversion code. cv2.COLOR_BGR2LAB converts from BGR format to LAB format. If you want to convert to RGB, you can use cv2.COLOR_BGR2RGB.
(6) Converting the Image into a Binary Image
A binary image contains only two values, 0 and 255, which makes the image simpler, reduces the data volume, and makes it easier to process.
The inRange() function in the cv2 library is used to binarize the image.
frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # build a binary mask of the pixels within the threshold range
The first parameter frame_lab is the input image;
The second parameter (lab_data[i]['min'][0],lab_data[i]['min'][1],lab_data[i]['min'][2]) is the lower color threshold;
The third parameter (lab_data[i]['max'][0],lab_data[i]['max'][1],lab_data[i]['max'][2]) is the upper color threshold;
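The thresholding rule can be written out in NumPy as follows. This is a sketch of the behavior described above, not the program's code; the function name in_range and the threshold values in lab_data are hypothetical (the real thresholds come from the configuration file).

```python
import numpy as np

def in_range(img: np.ndarray, lower, upper) -> np.ndarray:
    """Return 255 where every channel lies within [lower, upper], else 0."""
    lower = np.array(lower)
    upper = np.array(upper)
    inside = np.all((img >= lower) & (img <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# Hypothetical threshold entry, using the same min/max layout as lab_data[i].
lab_data = {'red': {'min': [0, 150, 130], 'max': [255, 255, 255]}}

# A 1x2 "Lab image": the first pixel is inside the range, the second is not
# (its A channel, 100, is below the lower limit of 150).
frame_lab = np.array([[[120, 180, 160], [120, 100, 160]]], dtype=np.uint8)
mask = in_range(frame_lab, lab_data['red']['min'], lab_data['red']['max'])
```

The resulting mask is white (255) for the first pixel and black (0) for the second.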
(7) To reduce interference and make the image smoother, erosion and dilation operations need to be performed on the image. Erosion and dilation are two basic morphological operations, often used in image processing, especially in binary image processing. These two operations are usually used to remove small noise, separate and identify objects in the image, and adjust the size of objects in the image.
eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate
The first parameter is the input image;
The second parameter is the structuring element (also known as the kernel), which defines the nature of the operation. The size and shape of the kernel determine the degree of erosion and dilation.
(8) Obtaining the Contour with the Largest Area
contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours
The first parameter dilated is the input image;
The second parameter cv2.RETR_EXTERNAL is the contour retrieval mode;
The third parameter cv2.CHAIN_APPROX_NONE is the contour approximation method. The trailing [-2] selects the list of contours from the values returned by findContours().
The contour with the largest area is then selected from the obtained contours. To avoid interference, a minimum area is set, and the target contour is considered valid only when its area is larger than this value.
areaMaxContour, area_max = get_area_max_contour(contours)  # find the largest contour
if areaMaxContour is not None:
    if area_max > max_area:  # keep the largest area found so far
        max_area = area_max
        color_area_max = i
        areaMaxContour_max = areaMaxContour
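The body of get_area_max_contour() is not shown in this document. The following pure-Python sketch shows one way such a selection could work, assuming each contour is a list of (x, y) points and using the shoelace formula for the area; the names polygon_area and largest_contour and the min_area default are hypothetical.

```python
def polygon_area(points):
    """Area of a closed polygon given as (x, y) points (shoelace formula)."""
    area = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def largest_contour(contours, min_area=100.0):
    """Return (contour, area) for the largest contour at or above min_area."""
    best, best_area = None, 0.0
    for contour in contours:
        area = polygon_area(contour)
        if area >= min_area and area > best_area:
            best, best_area = contour, area
    return best, best_area

small = [(0, 0), (5, 0), (5, 5), (0, 5)]        # area 25, below the minimum
large = [(0, 0), (20, 0), (20, 10), (0, 10)]    # area 200
contour, area = largest_contour([small, large])
nothing = largest_contour([small])              # no contour is large enough
```

Here the 20x10 rectangle is selected, and a list containing only the small square yields no valid contour.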
(9) Displaying the Returned Image
frame = img.copy()
Frame = run(frame)
cv2.imshow('Frame', Frame)
key = cv2.waitKey(1)
if key == 27:
    break
The function cv2.imshow() displays the image in a window; 'Frame' is the window name, and Frame is the image to display. It must be followed by a call to cv2.waitKey(); otherwise, the image cannot be displayed.
The function cv2.waitKey() waits for key input. The parameter “1” is the delay time in milliseconds, and the loop exits when the returned key code is 27 (the Esc key).
1.4.3 Drive the Buzzer
board.set_buzzer(2400, 0.1, 0.2, 1)
time.sleep(0.2)
The function set_buzzer() is used to drive the buzzer.
The call time.sleep(0.2) is a delay, and 0.2 is the buzzing time in seconds.
6.1.5 Function Extensions
Adjusting Color Thresholds
The color recognition program is pre-configured to recognize three colors: red, green, and blue. By default, the program identifies red, triggering the buzzer to emit a beep and drawing a circle around the red ball in the transmitted image, displaying “Color: red”.
To change the recognized color to green, follow these steps:
(1) Enter the following command and press Enter to navigate to the source code directory:
cd spiderpi/functions
(2) Then, enter the following command and press Enter to open the program file:
sudo vim color_recognition.py
(3) Locate the code shown in the image below:
(4) Press the “i” key on the keyboard to enter edit mode.
(5) Replace “red” (highlighted in red in the image) with “green”, as shown in the image below:
(6) To save your changes, press the “Esc” key, then type “:wq” (note the colon before “wq”) and press Enter to save and exit.
(7) Enter the following command and press Enter to start the color recognition functionality:
sudo python3 color_recognition.py
6.2 Color Recognition
6.2.1 Program Logic
For humans, it is easy to distinguish different colors in the world. How can a robot recognize the color of an object? On SpiderPi Pro, we can install a camera vision module and use visual recognition to identify different colors.
The overall implementation process is as follows:
First, SpiderPi Pro recognizes colors in the Lab color space. The image is converted from the RGB color space to Lab, then binarized and processed with operations such as erosion and dilation to obtain a contour that contains only the target color.
Lastly, the obtained contour is circled, and the robot performs an action according to the color recognition result.
6.2.2 Start and Close the Game
Note
Commands are case-sensitive and space-sensitive, so enter them exactly as shown.
(1) Start the SpiderPi Pro robot and connect to the Raspberry Pi desktop remotely via VNC.
(2) Click the terminal icon at the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.
(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/functions
(4) Enter command, then press “Enter” to start the game.
python3 color_detect.py
(5) To exit the program, press “Ctrl+C” in the LX terminal. If the program does not exit, press it a few more times.
6.2.3 Project Outcome
Note
The default recognition color is red. If you want to change it to blue or green, please refer to “6.2.5 Function Extension -> Change the Default Recognition Color”.
Place the red ball in front of SpiderPi Pro’s camera, and the robot will nod when it recognizes the red ball. It will shake its head when it detects a green or blue ball.
6.2.4 Program Analysis
The source code of this program is located at: /home/pi/spiderpi/functions/color_detect.py
Import Function Libraries
import sys
import cv2
import math
import time
import threading
import numpy as np
from common import misc
from common import yaml_handle
from calibration.camera import Camera
from sensor.ultrasonic_sensor import Ultrasonic
2.4.2 Image Processing
(1) Gaussian Filtering
Before converting the image from BGR into the LAB color space, denoise it with the GaussianBlur() function in the cv2 library, which applies Gaussian filtering.
frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)
The parameters in parentheses are as follows:
The first parameter frame_resize is the input image;
The second parameter (3, 3) is the size of the Gaussian kernel;
The third parameter 3 is the standard deviation of the Gaussian kernel in the X direction. The larger this value, the more weight neighboring pixels receive and the stronger the blur; the smaller the value, the more the result is dominated by the center pixel.
(2) Binarization Processing
The inRange() function in the cv2 library is used to perform binarization processing on the image.
frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # build a binary mask of the pixels within the threshold range
The first parameter in parentheses is the input image.
The second and third parameters are the lower and upper limits of the threshold. When the Lab value of a pixel lies between the lower and upper limits, the pixel is assigned 255 (white); otherwise, it is assigned 0 (black).
(3) Erosion and Dilation
To reduce interference and make the image smoother, erosion and dilation are performed on the image.
eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate
The erode() function performs erosion. Take eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))) as an example.
The parameters in parentheses are as follows:
The first parameter frame_mask is the input image.
The second parameter cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)) is the structuring element (kernel), which determines the nature of the operation. Its first argument is the kernel shape, and its second argument is the kernel size.
The dilate() function performs dilation. Its parameters have the same meaning as those of the erode() function.
(4) Acquire the maximum contour
After processing the image, obtain the contour of the target to be recognized using the findContours() function in the cv2 library.
contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours
The first parameter in parentheses is the input image;
The second parameter is the retrieval mode of the contour;
The third parameter is the approximation method of the contour.
Next, find the contour with the largest area among the obtained contours. To avoid interference, a minimum area is set, and the target contour is valid only when its area is larger than this value.
if max_area > 100:  # a largest-area contour has been found
    ((centerX, centerY), radius) = cv2.minEnclosingCircle(areaMaxContour_max)  # obtain the minimum enclosing circle
    centerX = int(misc.map(centerX, 0, size[0], 0, img_w))
    centerY = int(misc.map(centerY, 0, size[1], 0, img_h))
    radius = int(misc.map(radius, 0, size[0], 0, img_w))
    cv2.circle(img, (centerX, centerY), radius, range_rgb[color_area_max], 2)  # draw the circle
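The circle is found on the resized frame, so its center and radius are mapped back to the original image size before drawing. The implementation of misc.map() is not shown in this document; the sketch below assumes it is a plain linear range mapping. The name map_range and both resolutions are hypothetical.

```python
def map_range(x, in_min, in_max, out_min, out_max):
    """Linearly map x from [in_min, in_max] to [out_min, out_max]."""
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min

size = (320, 240)          # assumed processing resolution (width, height)
img_w, img_h = 640, 480    # assumed camera resolution

# A circle found at (160, 60) with radius 20 in the resized frame...
center_x = int(map_range(160, 0, size[0], 0, img_w))
center_y = int(map_range(60, 0, size[1], 0, img_h))
radius = int(map_range(20, 0, size[0], 0, img_w))
# ...corresponds to (320, 120) with radius 40 in the original image.
```

Under these assumed resolutions every coordinate simply doubles; with other sizes the same formula scales each axis by its own ratio.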
2.4.3 Feedback Information
After the contour with the largest area is obtained, the circle() function in the cv2 library is called to circle the recognized target. The color of the circle matches the color of the object.
cv2.circle(img, (centerX, centerY), radius, range_rgb[color_area_max], 2)  # draw the circle
To improve the accuracy of the recognition result, it is necessary to make several judgments.
    if color_area_max == 'red':  # red is the largest
        color = 1
    elif color_area_max == 'green':  # green is the largest
        color = 2
    elif color_area_max == 'blue':  # blue is the largest
        color = 3
    else:
        color = 0
    color_list.append(color)

    if len(color_list) == 3:  # judge over multiple frames
        # take the mean
        color = int(round(np.mean(np.array(color_list))))
        color_list = []
        if color == 1:
            detect_color = 'red'
            draw_color = range_rgb["red"]
        elif color == 2:
            detect_color = 'green'
            draw_color = range_rgb["green"]
        elif color == 3:
            detect_color = 'blue'
            draw_color = range_rgb["blue"]
        else:
            detect_color = 'None'
            draw_color = range_rgb["black"]
else:
    detect_color = 'None'
    draw_color = range_rgb["black"]

cv2.putText(img, "Color: " + detect_color, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, draw_color, 2)
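The multiple-judgment step can be traced with a small standalone sketch. Only the 1/2/3 color coding and the rounding of the mean follow the listing above; the helper name vote and the two lookup tables are hypothetical.

```python
CODES = {'red': 1, 'green': 2, 'blue': 3}
NAMES = {1: 'red', 2: 'green', 3: 'blue'}

def vote(labels):
    """Average the color codes of several frames and round to one result."""
    codes = [CODES.get(label, 0) for label in labels]   # unknown colors -> 0
    mean_code = int(round(sum(codes) / len(codes)))
    return NAMES.get(mean_code, 'None')

stable = vote(['red', 'red', 'red'])      # three agreeing frames
outlier = vote(['red', 'red', 'green'])   # one stray frame is averaged away
mixed = vote(['red', 'blue', 'blue'])     # codes 1, 3, 3 average to 2.33
```

When the three frames agree, or only one frame differs by a neighboring code, the result is the expected color. Note that strongly mixed frames can average to a code that was never observed: red, blue, blue rounds to 2, which is reported as green.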
After the judgment is completed, the color of the recognized target is printed on the feedback image using the putText() function in the cv2 library.
cv2.putText(img, "Color: " + detect_color, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, draw_color, 2)
The parameters are as follows:
The first parameter img is the input image.
The second parameter "Color: " + detect_color is the text to display.
The third parameter (10, img.shape[0] - 10) is the position of the bottom-left corner of the text.
The fourth parameter cv2.FONT_HERSHEY_SIMPLEX is the font type.
The fifth parameter 0.65 is the font scale.
The sixth parameter draw_color is the text color.
The seventh parameter 2 is the line thickness of the text.
2.4.4 Main Function Analysis
In this Python program, the if __name__ == '__main__': block is the main entry point. First, the function init() is called for initialization. In this program, initialization includes returning the servos to their initial positions and reading the color threshold file. In general, the configuration of ports, peripherals, timer interrupts, and so on is also done during initialization.
if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board
(1) Read the Camera Image
while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()
When the game starts, each camera frame is read into “img”.
(2) Enter Image Processing
When the captured image is read, the run() function is called to process the image.
frame = img.copy()
Frame = run(frame)
cv2.imshow('Frame', Frame)
key = cv2.waitKey(1)
if key == 27:
    break
① The function img.copy() is used to copy the content of img to frame.
② The function run() performs image processing.
if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board

    board = Board()
    ultrasonic = Ultrasonic()

    debug = False
    if debug:
        print('Debug Mode')

    init()
    start()
    camera = Camera()
    camera.camera_open(correction=True)  # enable distortion correction (disabled by default)

    while True:
        img = camera.frame
        if img is not None:
            frame = img.copy()
            Frame = run(frame)
            cv2.imshow('Frame', Frame)
            key = cv2.waitKey(1)
            if key == 27:
                break
        else:
            time.sleep(0.01)
    camera.camera_close()
    cv2.destroyAllWindows()
2.4.5 Subthread Analysis
Run the move() function of the SpiderPi Pro as a subthread. When a color is recognized, the move() function is executed.
The function mainly involves processing the image results, making a judgment, and executing different feedback accordingly.
def move():
    global draw_color
    global detect_color
    global action_finish

    while True:
        if debug:
            return
        if __isRunning:
            if detect_color != 'None':
                action_finish = False
                if detect_color == 'red':
                    board.pwm_servo_set_position(0.2, [[1, 1200]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[1, 1800]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[1, 1200]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[1, 1800]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[1, 1500]])
                    time.sleep(0.1)
                    detect_color = 'None'
                    draw_color = range_rgb["black"]
                    time.sleep(1)
                elif detect_color == 'green' or detect_color == 'blue':
                    board.pwm_servo_set_position(0.2, [[2, 1200]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[2, 1800]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[2, 1200]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[2, 1800]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[2, 1500]])
                    time.sleep(0.1)
                    detect_color = 'None'
                    draw_color = range_rgb["black"]
                    time.sleep(1)
                else:
                    time.sleep(0.01)
                action_finish = True
                detect_color = 'None'
            else:
                time.sleep(0.01)
        else:
            time.sleep(0.01)
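The code that starts move() as a subthread is not shown in this section. The sketch below illustrates the general pattern with Python's threading module, under several assumptions: a dictionary replaces the global variables, a list append stands in for the servo commands, and the worker stops after one action so the example terminates (the real move() loops forever).

```python
import threading
import time

state = {'detect_color': 'None'}   # shared between the main thread and the worker
handled = []                       # records which colors triggered an action
done = threading.Event()

def move():
    """Poll the shared state and react once a color has been detected."""
    while not done.is_set():
        color = state['detect_color']
        if color != 'None':
            handled.append(color)            # stand-in for the servo action
            state['detect_color'] = 'None'   # reset, as the listing does
            done.set()
        else:
            time.sleep(0.01)                 # avoid a busy loop while idle

worker = threading.Thread(target=move, daemon=True)
worker.start()

state['detect_color'] = 'red'      # the image-processing side reports a color
done.wait(timeout=2)
worker.join(timeout=2)
```

Running the action in its own thread keeps the camera loop responsive: the main thread can keep reading and displaying frames while the servos are moving.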
6.2.5 Function Extensions
Change the Default Recognition Color
There are three built-in colors in the color recognition program: red, green, and blue. By default, the robot nods when it recognizes red.
The following steps change the default recognition color to green as an example.
(1) Enter the command and press “Enter” to navigate to the directory where the game programs are stored.
cd spiderpi/functions
(2) Enter the command and press “Enter” to open the program file.
vim color_detect.py
(3) Locate the code shown below:
Note
To jump to a line in vim, type the line number and press “Shift+G”. This note only introduces the quick jump method, so the line numbers shown are for reference only; use the actual position in your file.
(4) Press “i” to enter editing mode, then change “red” in if detect_color == 'red': to “green”, and change “green” in elif detect_color == 'green' or detect_color == 'blue': to “red”. You can set blue as the default in the same way.
After modification, press “Esc”, type “:wq”, and then press “Enter” to save the file and exit the editor.
:wq
(5) After the modification is completed, you can follow the steps in “6.2.2 Start and Close the Game” to check the game performance.
Add New Recognition Colors
In addition to the built-in recognition colors, you can set other recognition colors in the program. Take orange as an example.
(1) Open VNC and enter the command to open the Lab color setting file.
vim spiderpi/config/lab_config.yaml
Note
It is recommended to screenshot the initial value for recording.
(2) Double-click the debugging tool icon on the system desktop. If a prompt box pops up, choose “Execute”.
Click the “Connect” button. When the interface displays the image returned by the camera, the connection is successful. Select “red” in the drop-down box.
(3) Point the camera at the color to be recognized. Drag the L, A, and B sliders until the object to be recognized turns white in the left screen and all other areas turn black.
For example, to recognize orange, place the orange ball within the camera’s field of view. Adjust the L, A, and B sliders until the orange part in the left screen turns white and the other colors turn black, and then click the “Save” button to keep the modified data.
(4) After the modification is completed, check whether the modified data was written successfully. Enter the command “vim spiderpi/config/lab_config.yaml” again to open the Lab color setting file.
vim spiderpi/config/lab_config.yaml
Note
To avoid affecting the game performance, it is recommended to use the LAB_Tool tool to change the values back to their initial values after you finish.
(5) Once you have confirmed that the modified data was written to the configuration file, press “Esc”, type “:wq”, and then press “Enter” to save and exit.
:wq
(6) According to the steps in “6.2.5 Function Extension -> Change the Default Recognition Color”, set the default recognition color as red.
(7) Start the game again and put the orange object in front of the camera. SpiderPi Pro will nod when it recognizes the color. To add other recognition colors, follow the same steps.
6.3 Target Position Recognition
In this lesson, the camera will be used to recognize red, green, and blue balls. The detected balls will be highlighted in the live feed, and their XY coordinates will be displayed.
6.3.1 Brief Analysis of the Task
The implementation of target tracking can be divided into two parts: color recognition and position marking.
First, for the color recognition part, Gaussian filtering is applied to the image for noise reduction. The image is then converted to the Lab color space (for more details on the Lab color space, please refer to the “OpenCV Vision Basic Course”).
Next, color thresholding is used to identify the color of objects within the circle. The image is then masked (masking involves using a selected image, shape, or object to globally or locally occlude the processed image).
After performing morphological operations (open and close operations) on the object’s image, the largest contour is outlined with a circle.
Opening operation: The image is eroded first and then dilated. This operation is used to remove small objects, smooth shape boundaries, and preserve the overall area. It helps remove small noise particles and separate objects that are connected.
Closing operation: The image is dilated first and then eroded. This operation is used to fill small holes within the objects, connect adjacent objects, and reconnect broken contour lines while smoothing the boundaries without changing the area.
Position marking requires specific detection algorithms. The basic principle is to search for areas in the image that match predefined features or patterns, then return the position and bounding box of these areas.
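As a minimal illustration of returning a position and bounding box from a detected region, the sketch below works on a binary mask with NumPy. It uses the centroid of the white pixels, which is a simpler stand-in rather than the method used by the program; the function name locate and the test mask are hypothetical.

```python
import numpy as np

def locate(mask: np.ndarray):
    """Return ((cx, cy), (x_min, y_min, x_max, y_max)) of the white region."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None                      # nothing detected
    center = (int(round(xs.mean())), int(round(ys.mean())))
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return center, box

# A 10x10 mask with a white rectangle covering rows 2..4 and columns 4..8.
mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:5, 4:9] = 255

center, box = locate(mask)               # center is (x, y) = (6, 3)
empty = locate(np.zeros((10, 10), dtype=np.uint8))
```

The X coordinate comes from the column index and the Y coordinate from the row index, which is the same (x, y) order used when drawing on the image.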
6.3.2 Start and Close the Game
Note
The input of commands must strictly distinguish between uppercase and lowercase letters, as well as spaces. Additionally, you can use the “Tab” key on the keyboard to auto-complete keywords.
(1) Power on the device and, following the instructions in “Remote Desktop Installation and Connection\3.1 VNC Installation and Connection”, use the VNC remote connection tool to connect.
(2) Click the terminal icon in the top left corner of the system desktop, or press the shortcut “Ctrl+Alt+T”, to open the LX terminal.
(3) In the terminal, enter the command to navigate to the directory where the program is located, then press Enter:
cd spiderpi/functions
(4) Enter the command and press Enter to start the program:
python3 color_position_recognition.py
(5) To close the program, simply press “Ctrl+C” in the LX terminal. If it does not close, press it multiple times.
6.3.3 Program Outcome
The program defaults to recognizing red, green, and blue balls. After recognition, it will highlight the objects in the transmitted image and display their XY coordinates.
Note
During the recognition process, ensure the environment is well-lit to avoid inaccurate recognition due to lighting issues.
Ensure there are no objects with similar or identical colors to the target colors within the camera’s field of view to prevent misrecognition.
If color recognition is inaccurate, refer to the section “6.3.5 Function Extension -> Adjusting Color Threshold” in this document to adjust the color threshold settings.
6.3.4 Program Description
The source code for this program is located at: /home/pi/spiderpi/functions/color_position_recognition.py
Importing Libraries
import sys
import cv2
import math
import time
import threading
import numpy as np
from common import misc
from common import yaml_handle
from calibration.camera import Camera
from sensor.ultrasonic_sensor import Ultrasonic
(1) Import the necessary libraries, including OpenCV, time, math, threading, and inverse kinematics.
To call a function from a library, use the format LibraryName.FunctionName(Parameters). For example:
time.sleep(0.01)
This calls the sleep function from the time library, which is used for adding delays.
Libraries such as time and math are part of the Python standard library and can be imported directly. cv2 (OpenCV) and numpy are third-party packages that must be installed on the system before they can be imported. You can also create your own libraries, such as the “yaml_handle” file reading library.
(2) Instantiating Classes from Libraries
Sometimes, module paths are long and hard to remember. To make calls more convenient, we often import the class we need by name. For example:
from calibration.camera import Camera
After this import, a Camera object can be created and its functions called as:
camera = Camera()
camera.FunctionName(Parameters)
This makes calling functions much easier.
Main Function Analysis
In a Python program, the if __name__ == '__main__': block indicates the main function. The program starts by opening the camera and reading the video stream. The read() method captures each frame of the image, where the program searches for and marks the color of the ball, then displays the result. The video is displayed through a loop, and once the display is finished, the release() function is called to release the resources.
if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board

    board = Board()
    ultrasonic = Ultrasonic()

    load_config()
    init_move()
    reset()
    camera = Camera()
    camera.camera_open(correction=True)  # enable distortion correction (disabled by default)
(1) Capturing Camera Image
camera = Camera()
When the program starts, the camera is initialized.
(2) Image Processing
① The run() function handles image processing.
Frame = run(frame)
def run(img):
    global draw_color
    global color_list
    global detect_color
    global action_finish

    img_copy = img.copy()
    img_h, img_w = img.shape[:2]

    frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)
    frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)
    frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space
② Resize the image to make it easier to process.
frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)
The first parameter img_copy is the input image.
The second parameter size is the size of the output image, which can be set as needed.
The third parameter interpolation=cv2.INTER_NEAREST is the interpolation method.
Options include:
INTER_NEAREST: Nearest-neighbor interpolation.
INTER_LINEAR: Bilinear interpolation (default if no other method is specified).
INTER_CUBIC: Bicubic interpolation in a 4x4 pixel neighborhood.
INTER_LANCZOS4: Lanczos interpolation in an 8x8 pixel neighborhood.
③ Apply Gaussian Blur to reduce noise
Gaussian blur is a linear smoothing filter used to eliminate Gaussian noise and is widely used in image denoising.
frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)
The first parameter frame_resize is the input image.
The second parameter (3, 3) is the size of the Gaussian kernel.
The third parameter 3 is the standard deviation of the Gaussian kernel in the X-direction.
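To make the kernel size and standard deviation more concrete, the following minimal sketch computes the weights of a 1-D Gaussian kernel in plain Python. The helper name gaussian_kernel_1d is hypothetical and is not part of the program; it only illustrates how the two parameters shape the filter.

```python
import math

def gaussian_kernel_1d(ksize, sigma):
    """Return normalized 1-D Gaussian weights (illustrative helper, not from the program)."""
    center = (ksize - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]

kernel = gaussian_kernel_1d(3, 3)  # same kernel size and sigma as the call above
print(kernel)
```

With a 3-tap kernel and a standard deviation of 3, the three weights are nearly equal, so the filter behaves much like a simple average. A smaller standard deviation concentrates the weight on the center pixel.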
④ Convert the image to LAB color space.
frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space
The first parameter frame_gb is the input image.
The second parameter cv2.COLOR_BGR2LAB specifies the conversion from BGR to LAB format. To convert to RGB, use cv2.COLOR_BGR2RGB.
⑤ Convert the image to a binary image containing only the values 0 and 255, which simplifies the image and reduces the amount of data to process.
The cv2.inRange() function is used for binarization:
frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # perform a bitwise operation on the original image and the mask
The first parameter frame_lab is the input image.
The second parameter (lab_data[i]['min'][0], lab_data[i]['min'][1], lab_data[i]['min'][2]) is the lower threshold for the color.
The third parameter (lab_data[i]['max'][0], lab_data[i]['max'][1], lab_data[i]['max'][2]) is the upper threshold for the color.
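The thresholding rule can be sketched in plain Python as follows. This only illustrates the behavior of cv2.inRange() on a tiny image stored as nested lists; the function name in_range and the sample threshold values are assumptions made for the example, not values taken from the program's configuration.

```python
def in_range(image, lower, upper):
    """Return a mask: 255 where every channel lies within [lower, upper], else 0."""
    mask = []
    for row in image:
        mask_row = []
        for pixel in row:
            inside = all(lo <= value <= hi for value, lo, hi in zip(pixel, lower, upper))
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask

# A 1x3 "Lab" image; the thresholds below are made-up example values.
lab_image = [[(120, 180, 150), (200, 120, 125), (90, 170, 140)]]
example_min = (0, 150, 130)
example_max = (255, 255, 255)
mask = in_range(lab_image, example_min, example_max)
print(mask)
```

Note that cv2.inRange() marks matching pixels with 255 rather than 1, which is why the sketch does the same.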
⑥ Perform erosion and dilation to smooth the image and reduce interference.
Erosion reduces the size of foreground objects and eliminates small objects, while dilation increases the size of foreground objects and fills small holes.
eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate
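The effect of eroding and then dilating (an opening) can be illustrated with a small pure-Python sketch. The erode() and dilate() helpers below are simplified stand-ins written for this example, using a 3x3 square neighborhood, 0/1 values instead of 0/255, and ignoring out-of-bounds neighbors; they are not the OpenCV implementations.

```python
def erode(mask):
    """3x3 erosion: a pixel stays 1 only if all in-bounds neighbours are 1."""
    h, w = len(mask), len(mask[0])
    return [[int(all(mask[ny][nx]
                     for ny in range(max(0, y - 1), min(h, y + 2))
                     for nx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]

def dilate(mask):
    """3x3 dilation: a pixel becomes 1 if any in-bounds neighbour is 1."""
    h, w = len(mask), len(mask[0])
    return [[int(any(mask[ny][nx]
                     for ny in range(max(0, y - 1), min(h, y + 2))
                     for nx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]

mask = [[0] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        mask[y][x] = 1          # a 3x3 blob standing in for the object
mask[5][5] = 1                  # an isolated noise pixel

opened = dilate(erode(mask))
for row in opened:
    print(row)
```

In this example the isolated pixel disappears during erosion and is not restored by dilation, while the 3x3 blob shrinks to one pixel and then grows back to its original size.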
⑦ Find the contour with the largest area
After the image processing steps, use the cv2.findContours() function to find contours:
contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours
The first parameter dilated is the input image.
The second parameter cv2.RETR_EXTERNAL specifies the contour retrieval mode.
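The third parameter cv2.CHAIN_APPROX_NONE specifies the contour approximation method. The trailing [-2] selects the contour list from the returned tuple, which keeps the call compatible across OpenCV versions.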
The program searches for the largest contour and sets a threshold area to ensure the detected contour is valid.
areaMaxContour, area_max = get_area_max_contour(contours)  # find the largest contour
if areaMaxContour is not None:
    if area_max > max_area:  # find the maximum area
        max_area = area_max
        color_area_max = i
        areaMaxContour_max = areaMaxContour
if max_area > 100:  # the maximum area has been found
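The body of get_area_max_contour() is not shown in this document. The sketch below is a hypothetical pure-Python stand-in that illustrates the idea: compute each contour's area, keep the largest, and reject it if it falls below a minimum area. It uses the shoelace formula on lists of (x, y) points, whereas the real program would most likely rely on OpenCV for the area calculation; the function names and sample contours are assumptions.

```python
def polygon_area(points):
    """Area of a closed polygon given as (x, y) vertices, via the shoelace formula."""
    total = 0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def largest_contour(contours, min_area=100):
    """Return (contour, area) for the largest contour above min_area, else (None, 0)."""
    best, best_area = None, 0
    for contour in contours:
        area = polygon_area(contour)
        if area > best_area:
            best, best_area = contour, area
    if best_area > min_area:
        return best, best_area
    return None, 0

small = [(0, 0), (5, 0), (5, 5), (0, 5)]        # area 25, treated as noise
large = [(0, 0), (20, 0), (20, 10), (0, 10)]    # area 200
print(largest_contour([small, large]))
```

The minimum-area check plays the same role as the threshold in the listing above: small blobs caused by noise are ignored.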
⑧ Extract the position information
Use cv2.putText() to draw text on the image:
cv2.putText(img, "Color: " + detect_color, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, draw_color, 2)
The first parameter img is the input image.
The second parameter "Color: " + detect_color is the text to display (e.g., the detected color).
The third parameter (10, img.shape[0] - 10) specifies the coordinates of the bottom-left corner of the text.
The fourth parameter cv2.FONT_HERSHEY_SIMPLEX specifies the font type.
The fifth parameter 0.65 is the scaling factor for the font size.
The sixth parameter draw_color is the color of the text.
The seventh parameter 2 specifies the thickness of the text line.
(3) Displaying the Return Image
while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()
        Frame = run(frame)
        cv2.imshow('Frame', Frame)
        key = cv2.waitKey(1)
        if key == 27:
            break
The cv2.imshow() function is used to display the image in a window. The first parameter is the window name (e.g., ‘Frame’), and the second parameter is the image to display.
The cv2.waitKey() function waits for a key press; the parameter 1 is the delay in milliseconds. If the returned key code is 27 (the Esc key), the loop exits.
6.3.5 Function Extension
Adjusting Color Threshold
During the game experience, if the color recognition of objects is not accurate, you may need to adjust the color threshold. This section uses adjusting the red color as an example; the process for adjusting other colors is similar. Follow the steps below:
(1) Double-click the debugging tool icon on the system desktop and click “Execute” in the pop-up window.
(2) Once the interface opens, click “Connect.”
(3) After a successful connection, select “red” from the color options in the bottom-right corner of the interface.
(4) If the transmitted image does not appear in the pop-up window, it indicates the camera is not connected properly. Check the camera connection cable to ensure it is securely connected.
The image on the right side of the interface shows the real-time transmitted video, and the left side shows the color to be captured.
Point the camera at the red color block, and then adjust the six sliders at the bottom to ensure that the red color block on the left side of the screen turns completely white, while other areas remain black.
Finally, click the “Save” button to save the data.
6.4 Target Tracking
6.4.1 Program logic
First, SpiderPi Pro is programmed to recognize colors in the Lab color space. The image is converted from RGB to Lab and binarized, and then erosion and dilation are applied to obtain a contour containing only the target color. The contour is then circled.
After color recognition, the X and Y coordinates of the image center are used as the set point, and the X and Y coordinates of the target are used as the input to update the PID controller.
Finally, the PID output, computed from the target's position in the image, drives SpiderPi Pro to follow the target, achieving color tracking.
6.4.2 Operation steps
Note
Commands are case-sensitive and space-sensitive.
(1) Boot up SpiderPi Pro, then remotely connect to the Raspberry Pi desktop through VNC.
(2) Click the terminal icon at the upper-left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.
(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/functions
(4) Enter the command, then press “Enter” to start the game.
python3 color_track.py
(5) To exit the game program, press “Ctrl+C” in the LX terminal. If the exit fails, try again a few more times.
6.4.3 Project outcome
Note
The default tracked color is green. To change it to blue, refer to “6.4.5 Function Extension -> Modify Default Recognized Color”. Also, do not move the ball too quickly or out of the camera’s field of view.
After the game starts, move the green ball slowly, and the robotic arm of SpiderPi Pro will move with the green ball.
6.4.4 Program Analysis
The source code of this program is located in: /home/pi/spiderpi/functions/color_track.py
Import Function Library
import sys
import cv2
import math
import time
import numpy as np
from common import misc
from common.pid import PID
from common import yaml_handle
from calibration.camera import Camera
from calibration.CalibrationConfig import *
from sensor.ultrasonic_sensor import Ultrasonic
import arm_ik.arm_move_ik as AMK
(1) Gaussian filtering
Before converting the image from RGB into LAB space, denoise it by applying Gaussian filtering with the GaussianBlur() function in the cv2 library.
frame_gb = cv2.GaussianBlur(frame_resize, (5, 5), 5)
The parameters in the parentheses are as follows:
The first parameter frame_resize is the input image.
The second parameter (5, 5) is the size of the Gaussian kernel.
The third parameter 5 is the standard deviation of the Gaussian kernel in the X direction. The larger the value, the more weight neighboring pixels receive and the stronger the blur; the smaller the value, the closer the result stays to the original image.
(2) Binarization
Use the inRange() function in the cv2 library to binarize the image.
frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # perform a bitwise operation on the original image and the mask
The first parameter in the parentheses is the input image. The second and third parameters are the lower and upper limits of the threshold, respectively. When the Lab value of a pixel lies between the lower and upper limits, the pixel is set to 255 (white); otherwise, it is set to 0 (black).
(3) Erosion and dilation
To reduce interference and make the image smoother, erosion and dilation are performed on the image.
eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate
The erode() function performs erosion. Take eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))) as an example. The parameters in the parentheses are as follows:
The first parameter frame_mask is the input image.
The second parameter cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)) is the structuring element (kernel) that determines the nature of the operation. Its first argument is the kernel shape and its second argument is the kernel size.
The dilate() function performs dilation. Its parameters have the same meaning as those of the erode() function.
(4) Acquire the maximum contour
After processing the image, obtain the contour of the target to be recognized using the findContours() function in the cv2 library.
contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours
The first parameter in the parentheses is the input image, the second is the contour retrieval mode, and the third is the contour approximation method.
Among the obtained contours, find the one with the largest area. To avoid interference, a minimum area is set; the target contour is considered valid only when its area exceeds this value.
if area_max > 50:  # the maximum area has been found
    (centerX, centerY), radius = cv2.minEnclosingCircle(areaMaxContour)  # obtain the minimum enclosing circle
    centerX = int(misc.map(centerX, 0, size[0], 0, img_w))
    centerY = int(misc.map(centerY, 0, size[1], 0, img_h))
    radius = int(misc.map(radius, 0, size[0], 0, img_w))
    cv2.circle(img, (int(centerX), int(centerY)), int(radius), range_rgb[detect_color], 2)
Feedback Information
After the contour with the largest area is obtained, call the minEnclosingCircle() function in the cv2 library to obtain the minimum enclosing circle of the target contour.
(centerX, centerY), radius = cv2.minEnclosingCircle(areaMaxContour)  # obtain the minimum enclosing circle
Then circle the recognized target using the circle() function in the cv2 library.
cv2.circle(img, (int(centerX), int(centerY)), int(radius), range_rgb[detect_color], 2)
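Because detection runs on a resized image, the circle's center and radius are scaled back to the original image size before drawing. The implementation of misc.map() is not shown in this document; the following sketch assumes it is a plain linear rescaling. The function name map_value and the two resolutions are assumptions chosen for the example.

```python
def map_value(x, in_min, in_max, out_min, out_max):
    """Linearly rescale x from [in_min, in_max] to [out_min, out_max]."""
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min

size = (320, 240)          # assumed processing resolution
img_w, img_h = 640, 480    # assumed original resolution

center_x = int(map_value(160, 0, size[0], 0, img_w))
center_y = int(map_value(60, 0, size[1], 0, img_h))
print(center_x, center_y)
```

A point at (160, 60) on the small image therefore corresponds to (320, 120) on the full-size image.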
Drive the servo
The X and Y coordinates of the image center are used as the set point, and the X and Y coordinates of the recognized target are used as the input to update the PID controller.
# use_time = 0
x_pid.SetPoint = img_w/2  # set point
x_pid.update(centerX)  # current value
dx = int(x_pid.output)
# use_time = abs(dx*0.00025)
x_dis += dx  # output

x_dis = 0 if x_dis < 0 else x_dis
x_dis = 1000 if x_dis > 1000 else x_dis

y_pid.SetPoint = img_h/2
y_pid.update(centerY)
dy = int(y_pid.output)
# use_time = round(max(use_time, abs(dy*0.00025)), 5)
y_dis += dy

y_dis = 0 if y_dis < 0 else y_dis
y_dis = 1000 if y_dis > 1000 else y_dis

if not debug:
    board.bus_servo_set_position(0.02, [[24, y_dis], [21, x_dis]])
    time.sleep(0.02)
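The update step above can be tried out without the robot using the sketch below. The implementation of common.pid.PID is not shown in this document, so SimplePID is a hypothetical minimal controller that merely exposes the same names used above (SetPoint, update(), output); the gains, image width, and starting servo position are also assumed values.

```python
class SimplePID:
    """Minimal positional PID controller (illustrative stand-in, not common.pid.PID)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.SetPoint = 0.0
        self.output = 0.0
        self._integral = 0.0
        self._last_error = 0.0

    def update(self, feedback):
        error = self.SetPoint - feedback
        self._integral += error
        derivative = error - self._last_error
        self._last_error = error
        self.output = self.kp * error + self.ki * self._integral + self.kd * derivative

def clamp(value, low=0, high=1000):
    """Keep a servo position inside the 0-1000 range used in the listing above."""
    return max(low, min(high, value))

img_w = 640                                  # assumed image width
x_pid = SimplePID(kp=0.5, ki=0.0, kd=0.0)    # assumed gains
x_dis = 500                                  # assumed starting servo position

x_pid.SetPoint = img_w / 2                   # set point: image center
x_pid.update(420)                            # target currently seen at x = 420
x_dis = clamp(x_dis + int(x_pid.output))
print(x_dis)
```

With these assumed values the target is 100 pixels to the right of the center, so the controller output is negative and the servo position is reduced by 50.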
Drive the specified servos to the designated positions by calling the bus_servo_set_position() function in the Board library.
if not debug:
    board.bus_servo_set_position(0.02, [[24, y_dis], [21, x_dis]])
    time.sleep(0.02)
Take bus_servo_set_position(0.02, [[24, y_dis], [21, x_dis]]) as an example. The parameters in the parentheses are as follows:
The first parameter 0.02 is the rotation time, in seconds.
The second parameter [[24, y_dis], [21, x_dis]] is a list of [servo ID, position] pairs. Here 24 is the ID of a servo to be driven and y_dis is its target position.
Likewise, 21 is the ID of the second servo and x_dis is its target position.
6.4.5 Function extension
Modify Default Recognized Color
The color tracking program has two built-in colors, green and blue. The robotic arm moves with a target of the selected color.
The following steps change the default recognition color to blue.
(1) Enter the command and press “Enter” to navigate to the directory where the game programs are stored.
cd spiderpi/functions
(2) Enter command and press “Enter” to open the program file.
vim color_track.py
(3) Locate the code shown below:
Note
In vim, type the line number and then press “Shift+G” to jump directly to that line. This note only introduces the quick-jump method; the line numbers shown are for reference only and may differ in your file.
(4) Press “i” to enter editing mode, and change “green” in “__target_color = (‘green’,)” to “blue”.
(5) After the modification, press “Esc”, type “:wq”, and press Enter to save and exit.
:wq
Add New Recognition Color
Note
For better game performance, please do not add red as a recognition color.
In addition to the built-in recognition colors, you can set other recognition colors in the program. Take orange as an example.
(1) Open VNC and enter the command to open the Lab color configuration file.
vim spiderpi/config/lab_config.yaml
Note
It is recommended to take a screenshot of the initial values as a record.
(2) Double-click the debugging tool icon on the system desktop. If a prompt box pops up, choose “Execute”.
(3) Click the “Connect” button. When the interface displays the image returned by the camera, the connection is successful. Select “green” in the drop-down box.
Point the camera at the color to be recognized. Drag the L, A, and B sliders until the target color in the left screen becomes white and other areas become black.
For example, to recognize orange, place the orange ball within the camera’s field of view. Adjust the L, A, and B sliders until the orange part in the left screen turns white and other colors turn black, and then click the “Save” button to keep the modified data.
(4) After the modification is completed, check whether the modified data was written successfully. Enter the command “vim spiderpi/config/lab_config.yaml” again to open the Lab color configuration file.
vim spiderpi/config/lab_config.yaml
Note
To avoid affecting game performance, it is recommended to use the LAB_Tool tool to restore the initial values after the modification.
(5) Once you have confirmed that the modified data was written to the configuration file, press “Esc”, type “:wq”, and press Enter to save and exit.
:wq
(6) Following the steps in “6.4.5 Function Extension -> Modify Default Recognized Color”, set the default recognition color to green.
(7) Start the game again and put the orange object in front of the camera. SpiderPi Pro will nod when it recognizes the color. To add other recognition colors, follow the same steps.
6.5 Line Following
6.5.1 Program Logic
Line following is common in robot competitions, where it is usually implemented with a two-channel or four-channel line follower. In contrast, SpiderPi Pro recognizes the line color through its vision module and processes the image with image algorithms to follow the line.
First, SpiderPi Pro is programmed to recognize colors in the Lab color space. The image is converted from RGB to Lab and binarized, and then erosion and dilation are applied to obtain a contour containing only the target color. The contour is then outlined.
After color recognition, the position of the line in the image is used as feedback to make SpiderPi Pro move along the line, achieving line following.
6.5.2 Operation Steps
Note
Commands are case-sensitive and space-sensitive.
(1) Boot up SpiderPi Pro, then remotely connect to the Raspberry Pi desktop through VNC.
(2) Click the terminal icon at the upper-left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.
(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/functions
(4) Enter the command, then press “Enter” to start the game.
python3 visual_patrol.py
(5) If you want to exit the game program, press “Ctrl+C” in the LX terminal interface. If the exit fails, please try it a few more times.
6.5.3 Project Outcome
Note
The default recognition color is red. If you want to change it to white or black, please refer to “6.5.5 Function Extension -> Modify Default Recognition Color”.
Paste red electrical tape to form a path. Then place SpiderPi Pro on the red line. After the game starts, the robot will move along the red line.
6.5.4 Program Analysis
The source code of this program is stored in: /home/pi/spiderpi/functions/visual_patrol.py
Import Function Library
import sys
import cv2
import time
import math
import threading
import numpy as np
from common import yaml_handle
from calibration.camera import Camera
from calibration.CalibrationConfig import *
from common import kinematics
from sensor.ultrasonic_sensor import Ultrasonic
import arm_ik.arm_move_ik as AMK
(1) Import the libraries related to OpenCV, time, math, and threads.
To call a function from a library, use the form “library name.function name(parameter, parameter)”. For example:
time.sleep(0.01)
This calls the sleep() function in the time library, which is used to add a delay. Some libraries, such as time and math, are built into Python, and others, such as cv2, are installed on the system, so they can be imported and called directly. You can also write your own library, such as yaml_handle.
(2) Instantiate Function Library
Library names can be long and hard to remember. To make calls easier, import the required class directly and instantiate it. For example:
from calibration.camera import Camera
After instantiation, for example board = Board() for the Board library, functions can be called as board.function name(parameter, parameter).
Define Global Variable
if sys.version_info.major == 2:
    print('Please run this program with python3!')
    sys.exit(0)

lab_data = None
servo_data = None
def load_config():
    global lab_data, servo_data

    lab_data = yaml_handle.get_yaml_data(yaml_handle.lab_file_path)

load_config()

__target_color = ('red',)
# set target color
def setLineTargetColor(target_color):
    global __target_color

    __target_color = target_color
    return (True, ())
Main Function Analysis
In a Python program, the if __name__ == '__main__': block is the main function of the program. First, the function “init()” is called to initialize the program. Initialization in this program includes returning the servos to their initial positions and reading the color threshold file. In general, configuration of ports, peripherals, timer interrupts, and so on is also done during initialization.
if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board
    from sensor.ultrasonic_sensor import Ultrasonic

    board = Board()
    ik = kinematics.IK(board)  # instantiate the inverse kinematics library
    ultrasonic = Ultrasonic()
    ak = AMK.ArmIK()
(1) Read the Captured Image
while True:
    img = camera.frame
    if img is not None:
When the game starts, each captured frame is stored in img.
(2) Enter Image Processing
Once the captured image has been read, the run function is called to process it.
if img is not None:
    frame = img.copy()
    frame = cv2.remap(frame, mapx, mapy, cv2.INTER_LINEAR)  # distortion correction
    Frame = run(frame)
    cv2.imshow('Frame', Frame)
    key = cv2.waitKey(1)
    if key == 27:
        break
The function img.copy() copies the content of img to frame.
(3) Gaussian filtering
Before converting the image from RGB into LAB space, denoise it by applying Gaussian filtering with the GaussianBlur() function in the cv2 library.
frame_gb = cv2.GaussianBlur(img, (3, 3), 3)
The parameters in the parentheses are as follows:
The first parameter img is the input image.
The second parameter (3, 3) is the size of the Gaussian kernel.
The third parameter 3 is the standard deviation of the Gaussian kernel in the X direction. The larger the value, the more weight neighboring pixels receive and the stronger the blur; the smaller the value, the closer the result stays to the original image.
(4) Binarization
Use the inRange() function in the cv2 library to binarize the image.
frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # perform a bitwise operation on the original image and the mask
The first parameter in the parentheses is the input image. The second and third parameters are the lower and upper limits of the threshold, respectively. When the Lab value of a pixel lies between the lower and upper limits, the pixel is set to 255 (white); otherwise, it is set to 0 (black).
(5) Erosion and dilation
To reduce interference and make the image smoother, erosion and dilation are performed on the image.
eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate
The erode() function performs erosion. Take eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))) as an example.
The parameters in the parentheses are as follows:
The first parameter frame_mask is the input image.
The second parameter cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)) is the structuring element (kernel) that determines the nature of the operation. Its first argument is the kernel shape and its second argument is the kernel size.
The dilate() function performs dilation. Its parameters have the same meaning as those of the erode() function.
(6) Acquire the maximum contour
After processing the image, obtain the contour of the target to be recognized using the findContours() function in the cv2 library.
cnts = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_TC89_L1)[-2]  # find all contours
The first parameter in the parentheses is the input image, the second is the contour retrieval mode, and the third is the contour approximation method.
cnt_large, area = get_area_maxContour(cnts)  # find the largest contour
if area > 10:
    rect = cv2.minAreaRect(cnt_large)  # minimum bounding rectangle

    box = np.intp(cv2.boxPoints(rect))  # four corner points of the minimum bounding rectangle
    for j in range(4):
        box[j, 1] = box[j, 1] + r[0]

    cv2.drawContours(img, [box], -1, (0, 255, 255), 2)  # draw the rectangle formed by the four points
(7) Obtain location
The minAreaRect() function in the cv2 library is used to obtain the minimum bounding rectangle of the target contour, and the boxPoints() function returns the coordinates of its four vertices. The center point of the rectangle is then calculated as the midpoint of two diagonally opposite vertices.
box = np.intp(cv2.boxPoints(rect))  # four corner points of the minimum bounding rectangle
for j in range(4):
    box[j, 1] = box[j, 1] + r[0]

cv2.drawContours(img, [box], -1, (0, 255, 255), 2)  # draw the rectangle formed by the four points

# obtain the diagonal points of the rectangle
pt1_x, pt1_y = box[0, 0], box[0, 1]
pt3_x, pt3_y = box[2, 0], box[2, 1]
line_center_x, line_center_y = (pt1_x + pt3_x) / 2, (pt1_y + pt3_y) / 2  # center point
cv2.circle(img, (int(line_center_x), int(line_center_y)), 5, (0, 0, 255), -1)  # draw the center point
line_center = line_center_x
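The center calculation can be checked on its own with a few lines of plain Python. In this sketch the helper name rect_center and the corner coordinates are made up for illustration; the only assumption carried over from the program is that the four corners are listed in order around the rectangle, so that the corners at index 0 and index 2 are diagonally opposite.

```python
def rect_center(box):
    """Center of a rectangle from two opposite corners (box[0] and box[2])."""
    (x1, y1), (x3, y3) = box[0], box[2]
    return (x1 + x3) / 2, (y1 + y3) / 2

# Four corners in order around the rectangle (made-up example values).
box = [(10, 40), (10, 20), (110, 20), (110, 40)]
print(rect_center(box))
```

Because the diagonals of a rectangle bisect each other, the midpoint of any diagonal is the center, which also holds when the rectangle is rotated.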
(8) Line Following
After image processing, SpiderPi Pro is controlled to move by calling functions in the kinematics.IK library.
if line_center >= 0:
    if abs(line_center - img_center_x) < 60:
        ik.go_forward(ik.initial_pos, 2, 60, 50, 1)
    elif line_center - img_center_x >= 60:
        ik.turn_right(ik.initial_pos, 2, 30, 50, 1)
    else:
        ik.turn_left(ik.initial_pos, 2, 30, 50, 1)
    last_line_center = line_center

elif line_center == -1:
    if last_line_center >= img_center_x:
        ik.turn_left(ik.initial_pos, 2, 30, 50, 1)
    else:
        ik.turn_right(ik.initial_pos, 2, 30, 50, 1)
else:
    time.sleep(0.01)
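The branching above can be exercised without the robot by separating the decision from the motion call. The sketch below mirrors the same conditions but returns the name of the movement instead of calling the kinematics library; the function name choose_action, the "wait" label, and the image center value are assumptions made for the example.

```python
def choose_action(line_center, img_center_x, last_line_center, dead_zone=60):
    """Return the name of the movement to perform, mirroring the branches above."""
    if line_center >= 0:
        offset = line_center - img_center_x
        if abs(offset) < dead_zone:
            return "go_forward"
        if offset >= dead_zone:
            return "turn_right"
        return "turn_left"
    if line_center == -1:
        # Line lost: decide from where it was last seen, as in the original code.
        return "turn_left" if last_line_center >= img_center_x else "turn_right"
    return "wait"

img_center_x = 320   # assumed image center
for center in (330, 400, 200, -1):
    print(center, choose_action(center, img_center_x, last_line_center=400))
```

The 60-pixel band around the image center acts as a dead zone: small offsets keep the robot walking straight, and only larger offsets trigger a turn.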
The functions used to control the SpiderPi Pro’s movement are listed below.
| Function | Usage |
|---|---|
| ik.go_forward(ik.initial_pos, 2, 50, 80, 1) | robot moves straight forward 50mm |
| ik.back(ik.initial_pos, 2, 100, 80, 1) | robot moves straight backward 100mm |
| ik.turn_left(ik.initial_pos, 2, 30, 100, 1) | turn left on the spot 30 degrees |
| ik.turn_right(ik.initial_pos, 2, 30, 100, 1) | turn right on the spot 30 degrees |
| ik.left_move(ik.initial_pos, 2, 100, 100, 1) | move left 100mm |
| ik.right_move(ik.initial_pos, 2, 100, 100, 1) | move right 100mm |
Take ik.go_forward(ik.initial_pos, 2, 50, 80, 1) as an example. The parameters in the parentheses are as follows:
The first parameter ik.initial_pos is the posture.
The second parameter 2 is the mode; 2 is spider mode.
The third parameter 50 is the stride, in mm when moving straight and in degrees when turning.
The fourth parameter 80 is the speed, in mm/s.
The fifth parameter 1 is the number of times the action is executed. When it is “0”, the robot repeats the action in a loop.
6.5.5 Function Extension
Modify Default Recognition Color
The program has three built-in colors: red, black, and white. The following steps change the default recognition color to white.
(1) Enter the command and press Enter to navigate to the directory where the game programs are stored.
cd spiderpi/functions
(2) Enter command and press Enter to open the program file.
vim visual_patrol.py
(3) Locate the code shown below:
Note
In vim, type the line number and then press “Shift+G” to jump directly to that line. This note only introduces the quick-jump method; the line numbers shown are for reference only and may differ in your file.
(4) Press the “i” key to enter editing mode, and change “red” in “__target_color = (‘red’,)” to “white”. You can also change it to “black” if you prefer.
(5) After the modification, press the “Esc” key, type “:wq”, and press Enter to save and exit.
Add New Recognition Color
In addition to the three built-in tracked colors, you can set other colors in the program. Take blue as an example.
(1) Open VNC and enter the command to open the Lab color configuration file.
vim spiderpi/config/lab_config.yaml
Note
It is recommended to take a screenshot of the initial values as a record.
(2) Double-click the debugging tool icon on the system desktop. If a prompt box pops up, choose “Execute”.
(3) Click the “Connect” button. When the interface displays the image returned by the camera, the connection is successful. Select “red” in the drop-down box.
(4) Point the camera at the color to be recognized. Drag the L, A, and B sliders until the target color area in the left screen becomes white and other areas become black.
For example, to change the default color to blue, place the blue line within the camera’s field of view. Adjust the L, A, and B sliders until the blue part in the left screen turns white and other colors turn black, and then click the “Save” button to keep the modified data.
Note
To avoid affecting game performance, it is recommended to use the “LAB_Tool” tool to restore the initial values after the modification.
(5) After the modification is completed, check whether the modified data was written successfully. Enter the command “vim spiderpi/config/lab_config.yaml” again to open the Lab color configuration file.
vim spiderpi/config/lab_config.yaml
(6) Once you have confirmed that the modified data was written to the configuration file, press “Esc”, type “:wq”, and press Enter to save and exit.
:wq
(7) Following the steps in “6.5.5 Function Extension -> Modify Default Recognition Color”, set the default recognition color to red.
(8) Start the line following game again according to the steps in “6.5.2 Operation Steps”. SpiderPi Pro will then move along the blue line.
6.6 Tag Detection
6.6.1 Brief Game Description
When the robot detects a tag, the buzzer emits a sound, and the feedback image is returned.
AprilTag, a visual fiducial marker, is similar to a QR code or barcode. It can be used to quickly detect markers and calculate relative positions, meeting real-time requirements. It is widely used in various applications such as augmented reality (AR), robotics, and camera calibration. Currently, AprilTags can be printed using a standard printer, and their detection programs can calculate precise 3D position, orientation, and ID relative to the camera.
In this lesson, we will combine OpenCV with AprilTag to complete a small project for detecting AprilTag markers. When the camera detects the tag, the robot’s onboard buzzer will sound as a prompt, and the feedback image will be displayed.
6.6.2 Start and Close the Game
Note
The input of commands must strictly distinguish between uppercase and lowercase letters, as well as spaces.
(1) Power on the device and, following the instructions in “Remote Desktop Installation and Connection\3.1 VNC Installation and Connection”, use the VNC remote connection tool to connect.
(2) Click the terminal icon in the top-left corner of the system desktop, or press the shortcut “Ctrl+Alt+T”, to open the LX terminal.
(3) In the terminal, enter the command to navigate to the directory where the program is located, then press Enter:
cd spiderpi/functions
(4) Enter the command and press Enter to start the program:
python3 apriltag_recognition.py
(5) To close the program, simply press “Ctrl+C” in the LX terminal. If it does not close, press it multiple times.
6.6.3 Program Outcome
Note
For optimal tag detection, place the tag against a solid-colored or white background. Dark backgrounds (e.g., black) may interfere with tag recognition.
Once the game is activated, position the included AprilTag tag in front of the camera. When the robot detects the tag, the buzzer will sound as a prompt. The feedback image will display the captured tag, outline it, and show the tag’s tag_id and tag_family information.
6.6.4 Program Parameter Explanation
The source code for this program is located at: /home/pi/spiderpi/functions/apriltag_recognition.py
(1) Image Acquisition and Processing
The first step is image processing, which involves working with digital image data. We begin by importing the necessary packages.
import sys
import time
import cv2
import numpy as np
from common import yaml_handle
from calibration.camera import Camera
import common.apriltag as apriltag
from common.ros_robot_controller_sdk import Board
from sensor.ultrasonic_sensor import Ultrasonic
Next, we initialize and start the camera to acquire the image, then proceed to copy, remap, and display the image.
while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()
        Frame = run(frame)
        cv2.imshow('Frame', Frame)
Afterward, we need to convert the image from BGR format (the channel order OpenCV uses) to grayscale. The corresponding code is as follows:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
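To illustrate what the grayscale conversion computes, here is a minimal sketch for a single pixel. It assumes the standard luminance weights that OpenCV documents for this conversion (0.299 R + 0.587 G + 0.114 B); the helper name bgr_to_gray is hypothetical and is not part of the program.

```python
def bgr_to_gray(pixel):
    """Return the 8-bit gray level of one (B, G, R) pixel."""
    b, g, r = pixel
    # Weighted sum of the three channels; green contributes the most.
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


if __name__ == "__main__":
    print(bgr_to_gray((0, 0, 255)))      # a pure red pixel in BGR order
    print(bgr_to_gray((255, 255, 255)))  # a white pixel
```

In the program itself, cv2.cvtColor applies this conversion to every pixel of the frame at once.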
(2) Tag Detection
Once the image has been processed, we need to detect the tag. This is done by using the tag library to detect the tag in the acquired image. The code implementation is as follows:
# detect apriltag
detector = apriltag.Detector(searchpath=apriltag._get_demo_searchpath())
After detection, the program will obtain the four corner points of the tag.
corners = np.rint(detection.corners)  # obtain the four corner points
Next, we need to draw the contours of the tag. In OpenCV, we use the cv2.drawContours function to accomplish this. The program code is as follows:
cv2.drawContours(img, [np.array(corners, np.intp)], -1, (0, 255, 255), 2)
This function takes five parameters, each with the following meanings:
img: The image to draw on.
[np.array(corners, np.intp)]: A list containing one contour, made of the four corner points converted to integer pixel coordinates.
-1: The contour index. -1 indicates that all contours in the list should be drawn.
(0, 255, 255): The color of the contour in B, G, R order (yellow here).
2: The thickness of the contour line.
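The outline that is drawn can be pictured as four line segments joining consecutive corners, with the last corner joined back to the first. The following is a minimal sketch of that idea in plain Python; closed_outline is a hypothetical helper used only for illustration, and the corner values are made up.

```python
def closed_outline(corners):
    """Round corner coordinates and pair them into the edges of a closed polygon."""
    pts = [(round(x), round(y)) for x, y in corners]
    # Connect each corner to the next one; the last corner connects back to the first.
    return [(pts[i], pts[(i + 1) % len(pts)]) for i in range(len(pts))]


if __name__ == "__main__":
    # Made-up sub-pixel corner coordinates of a roughly square tag.
    sample = [(0.4, 0.6), (9.7, 0.2), (10.2, 9.9), (0.1, 10.4)]
    for start, end in closed_outline(sample):
        print(start, "->", end)
```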
(3) Retrieving Tag Information
The program uses the AprilTag library to perform encoding and decoding to retrieve the tag’s information. Depending on the encoding method, different inner point coordinates are generated.
Once the quadrilateral is identified, the grid coordinates are clarified. To verify the reliability of the encoding, the tag must be matched against a known encoding library.
tag_family = str(detection.tag_family, encoding='utf-8')  # obtain tag_family
tag_id = int(detection.tag_id)  # obtain tag_id

return tag_family, tag_id
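The two conversions above turn the raw detection fields into a text string and an integer. Here is a small sketch of that step; FakeDetection is a hypothetical stand-in for the detection object, and the family name and ID are sample values chosen for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a detection result; only the two fields used here.
FakeDetection = namedtuple("FakeDetection", ["tag_family", "tag_id"])


def tag_info(detection):
    """Return (tag_family, tag_id) as a str and an int."""
    family = str(detection.tag_family, encoding="utf-8")  # bytes -> str
    return family, int(detection.tag_id)


if __name__ == "__main__":
    print(tag_info(FakeDetection(b"tag36h11", 1)))
```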
6.7 Tag Recognition
6.7.1 Program Logic
AprilTag is a visual positioning marker, similar to a QR code or barcode. The marker can be detected quickly and its position calculated from the image, so it is mainly applied to AR, robotics, and camera calibration.
First, the AprilTag is detected through positioning, image segmentation, and contour searching. Once the contour is located, the four corner points are obtained and connected with straight lines to form a closed loop.
The detected tag is then decoded to obtain its ID. Finally, SpiderPi Pro executes the action that corresponds to the Tag ID.
6.7.2 Operation Steps
Note
Commands are case-sensitive, and spaces must be entered exactly as shown.
(1) Boot up SpiderPi Pro, then connect remotely to the Raspberry Pi desktop through VNC.
(2) Click the terminal icon at the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.
(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/functions
(4) Enter the command, then press “Enter” to start the game.
python3 apriltag_detect.py
(5) To exit the program, press “Ctrl+C” in the LX terminal. If it does not exit, press it a few more times.
6.7.3 Project Outcome
Note
Please run this game against a solid-colored or white background. A dark background such as black will degrade tag recognition.
Please keep the tag intact, because dirt and wrinkles will affect recognition.
When a tag is recognized, the robot executes the corresponding action. In addition, the tag is marked with a yellow box, and the Tag ID and tag family are printed on the image returned by the camera.
The actions corresponding to the different Tag IDs are listed below.
| Tag ID | Action |
|---|---|
| 1 | wave hands |
| 2 | mark time |
| 3 | twist |
6.7.4 Program Analysis
The source code of the program is located in: /home/pi/spiderpi/functions/apriltag_detect.py
Import Function Library
import sys
import math
import threading
import time
import cv2
import numpy as np
from common import yaml_handle
from calibration.camera import Camera
from calibration.CalibrationConfig import *
from common import kinematics
import common.apriltag as apriltag
(1) Import the libraries related to OpenCV, time, math, and threads. To call a function in a library, write “library name.function name(parameter, parameter)”. For example:
time.sleep(0.01)
This calls the sleep() function in the time library, which delays execution for the given number of seconds. Libraries such as time and math are built into Python and can be imported directly; cv2 (OpenCV) is a third-party library that must be installed before it can be imported. You can also write your own library, such as yaml_handle.
(2) Instantiate Function Library
Library names can be long and hard to memorize. To make functions easier to call, import the class by name and then create an instance of it. For example:
from calibration.camera import Camera
After importing, create an instance with camera = Camera() and call its functions as camera.function name(parameter, parameter).
Main Function Analysis
The block under if __name__ == '__main__': is the entry point of the program. First, the function init() is called to initialize. The initialization in this program includes returning the servos to their initial positions and reading the color threshold file. Configuration of ports, peripherals, timing interrupts, and so on is generally also done during initialization.
if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board
    from sensor.ultrasonic_sensor import Ultrasonic
    from common.action_group_controller import ActionGroupController
    import arm_ik.arm_move_ik as AMK


    board = Board()
    ik = kinematics.IK(board)  # instantiate inverse kinematics library
    ultrasonic = Ultrasonic()
    agc = ActionGroupController(board)
    ak = AMK.ArmIK()
Obtain Corner Point Information
detection.corners holds the four corner points of the tag. np.rint() rounds their coordinates to the nearest integers.
# detect apriltag
detector = apriltag.Detector(searchpath=apriltag._get_demo_searchpath())
def apriltagDetect(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    detections = detector.detect(gray, return_image=False)

    if len(detections) != 0:
        for detection in detections:
            corners = np.rint(detection.corners)  # obtain the four corner points
            cv2.drawContours(img, [np.array(corners, np.int64)], -1, (0, 255, 255), 2)

            tag_family = str(detection.tag_family, encoding='utf-8')  # obtain tag_family
            tag_id = int(detection.tag_id)  # obtain tag_id

            object_center_x, object_center_y = int(detection.center[0]), int(detection.center[1])  # center point

            object_angle = int(math.degrees(math.atan2(corners[0][1] - corners[1][1], corners[0][0] - corners[1][0])))  # calculate rotation angle

            return tag_family, tag_id
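The rotation angle in the listing above is derived from the first two corner points with atan2. The sketch below isolates that calculation so it can be tried with simple coordinates; tag_rotation_deg is a hypothetical helper name, and the sample corners are made up. Note that image coordinates have the y axis pointing down.

```python
import math


def tag_rotation_deg(corners):
    """Angle in whole degrees of the vector from corner 1 to corner 0."""
    (x0, y0), (x1, y1) = corners[0], corners[1]
    return int(math.degrees(math.atan2(y0 - y1, x0 - x1)))


if __name__ == "__main__":
    print(tag_rotation_deg([(10, 0), (0, 0)]))  # corner 0 to the right of corner 1
    print(tag_rotation_deg([(0, 10), (0, 0)]))  # corner 0 below corner 1 in image coordinates
```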
Tag Detection
(1) After the corner points of the tag are obtained, outline the tag by calling the drawContours() function in the cv2 library.
cv2.drawContours(img, [np.array(corners, np.int64)], -1, (0, 255, 255), 2)
The parameters in the parentheses are as follows.
The first parameter img is the input image.
The second parameter [np.array(corners, np.int64)] is the contour itself, passed as a Python list.
The third parameter -1 is the index of the contour. -1 means that all contours in the list are drawn.
The fourth parameter (0, 255, 255) is the color of the contour. The values correspond to B, G, and R respectively, so the color here is yellow.
The fifth parameter 2 is the line width of the contour.
(2) Obtain the type of the tag (tag_family) and ID (tag_id)
tag_family = str(detection.tag_family, encoding='utf-8')  # obtain tag_family
tag_id = int(detection.tag_id)  # obtain tag_id
(3) Through calling putText() function in cv2 library, print the ID and category of the tag on the camera returned image.
if tag_id is not None:
    cv2.putText(img, "tag_id: " + str(tag_id), (10, img.shape[0] - 30), cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)
    cv2.putText(img, "tag_family: " + tag_family, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)
else:
    cv2.putText(img, "tag_id: None", (10, img.shape[0] - 30), cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)
    cv2.putText(img, "tag_family: None", (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)
The parameters in the parentheses are as follows.
The first parameter img is the input image.
The second parameter "tag_id: " + str(tag_id) is the text to display.
The third parameter (10, img.shape[0] - 30) is the position of the bottom-left corner of the text.
The fourth parameter cv2.FONT_HERSHEY_SIMPLEX is the font type.
The fifth parameter 0.65 is the font scale.
The sixth parameter [0, 255, 255] is the color of the text. The values correspond to B, G, and R respectively, so the color here is yellow.
The seventh parameter 2 is the line thickness of the text.
Action Controlling
After the tag ID is obtained, SpiderPi Pro is controlled to execute the corresponding action group by calling the agc.run_action_group() function.
82      while True:
83          if debug:
84              return
85          if __isRunning:
86              if tag_id is not None:
87                  action_finish = False
88                  time.sleep(0.5)
89                  if tag_id == 1:
90                      agc.run_action_group('wave', lock_servos=LOCK_SERVOS)  # wave
91                      tag_id = None
92                      time.sleep(1)
93                      action_finish = True
94                  elif tag_id == 2:
95                      agc.run_action_group('stepping', lock_servos=LOCK_SERVOS)  # stepping
96                      tag_id = None
97                      time.sleep(1)
98                      action_finish = True
99                  elif tag_id == 3:
100                     agc.run_action_group('twist_l', lock_servos=LOCK_SERVOS)  # twist
101                     tag_id = None
102                     time.sleep(1)
103                     action_finish = True
104                 else:
105                     action_finish = True
106                     time.sleep(0.01)
107             else:
108                 time.sleep(0.01)
109         else:
110             time.sleep(0.01)
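The if/elif chain above is effectively a lookup from Tag ID to action group name. As a hedged alternative view, not how the program itself is written, the same mapping can be sketched as a dictionary. TAG_ACTIONS and action_for_tag are hypothetical names; the action group names are the ones used in the listing.

```python
# Tag ID -> action group name, taken from the branches in the listing above.
TAG_ACTIONS = {1: "wave", 2: "stepping", 3: "twist_l"}


def action_for_tag(tag_id, actions=TAG_ACTIONS):
    """Return the action group name for a tag ID, or None if the ID is not mapped."""
    return actions.get(tag_id)


if __name__ == "__main__":
    for tag in (1, 2, 3, 9):
        print(tag, action_for_tag(tag))
```

With this layout, supporting another tag means adding one dictionary entry instead of copying a whole elif branch.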
6.7.5 Function Extension
Modify Action Corresponding to the Tag
By default, SpiderPi Pro “waves hands” when the ID 1 tag is detected, but you can modify this in the program. For example, the feedback action can be changed to kicking.
(1) Enter command and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/functions
(2) Enter command and press Enter to open the program file.
vim apriltag_detect.py
(3) Locate the line that calls agc.run_action_group('wave', lock_servos=LOCK_SERVOS).
Note
In vim, type the line number and then press “Shift+G” to jump directly to that line. This section only introduces the quick jump method, so the code line numbers are for reference only. Please refer to the actual file.
(4) Press the “i” key to enter editing mode.
Change “wave” in agc.run_action_group('wave', lock_servos=LOCK_SERVOS) to “kick”. To use a different action group, enter its name instead; the available action groups can be found in “/home/pi/spiderpi/action_groups”.
kick
(5) After modification, press “Esc” and input “:wq” and then press Enter to save and exit.
:wq
Modify/ Add the Tag
The tag images are in the “AprilTag Collection” folder, which must be extracted first.
Note
There is no need to download tag images online. 200 tags are provided in “AprilTag Collection”.
Print the tag at a suitable size, neither too large nor too small, so that the robot can recognize it. The tag is outlined in yellow when it is recognized.
The recognition background should be white. A dark background will degrade recognition.
Take adding Tag ID 4 as an example, with “Stand at Attention in High Posture” as its action. Follow the steps below.
(1) Following “6.7.5 Function Extension -> Modify Action Corresponding to the Tag”, enter the directory of the game program and open the program file.
(2) Move the cursor to the line “elif tag_id == 3:” (line 99 in the listing in 6.7.4) and type “5yy” to copy the five lines of that branch (lines 99-103).
(3) The message “5 lines yanked” indicates that the lines were copied successfully.
(4) Move the cursor to the last line of the tag_id == 3 branch (the position marked by the red frame in the original figure) and press “p” to paste the copied lines below it.
(5) Press the “i” key to enter editing mode. In the pasted lines, change “3” in “elif tag_id == 3:” to “4”, change “twist_l” in agc.run_action_group('twist_l', lock_servos=LOCK_SERVOS) to “stand_high”, and change the comment at the end of that line to “stand at attention in high posture”. To use a different action group, enter its name instead; the available action groups can be found in “/home/pi/spiderpi/action_groups”.
(6) After modification, press “Esc” key, enter “:wq”, and then press “Enter” to save and exit.
:wq
(7) Find the Tag ID 4 image in the “AprilTag Collection” folder and print it.
(8) Start the game as described in “6.7.2 Operation Steps” and check whether the modification works.
6.8 Face Recognition
6.8.1 Brief Description of the Activity
When no face is detected, the robotic arm rotates left and right to scan the area. Once a face is detected, the claw moves up and down as a greeting.
Face recognition is one of the most widely used applications of image recognition in artificial intelligence. It is common in scenarios such as smart locks and facial unlocking on mobile phones.
In this activity, we first train the face recognition model. The system then detects faces by scaling the image. After detection, the coordinates of the recognized face are converted back to the original scale, and the largest face is identified. The recognized face is then outlined with a frame.
Next, the pan-tilt servos are set to rotate left and right to locate the face. Finally, the robot executes the feedback action based on the recognition results.
6.8.2 Start and Close the Game
Note
The input of commands must strictly distinguish between uppercase and lowercase letters.
(1) Power on the device and, following the instructions in “Remote Desktop Installation and Connection\3.1 VNC Installation and Connection”, use the VNC remote connection tool to connect.
(2) Click the terminal icon in the top left corner of the system desktop, or press the shortcut “Ctrl+Alt+T”, to open the LX terminal.
(3) In the terminal, enter the command to navigate to the directory where the program is located, then press Enter:
cd spiderpi/functions
(4) Enter the command and press Enter to start the program:
python3 face_recongition.py
(5) To close the program, simply press “Ctrl+C” in the LX terminal. If it does not close, press it multiple times.
6.8.3 Program Outcome
Note
For optimal performance, please avoid using this activity under strong lighting conditions, such as direct sunlight or close proximity to incandescent lights, as intense light can affect face recognition accuracy. It is recommended to conduct this activity indoors, with the face positioned within a range of 50 cm to 1 meter from the camera.
Once the activity begins, the camera’s pan-tilt will rotate left and right. If no face is detected, the robotic arm will scan by rotating left and right. Upon detecting a face, the claw will move up and down to greet the user.
6.8.4 Program Brief Analysis
The source code of the program is saved in: /home/pi/spiderpi/functions/face_recongition.py
Function Logic
(1) Importing Libraries
At this initialization step, necessary libraries are imported to facilitate future function calls within the program.
import sys
import cv2
import time
import sys
import threading
import mediapipe as mp
from common import yaml_handle
from calibration.camera import Camera
from common.action_group_controller import ActionGroupController
from common.ros_robot_controller_sdk import Board
from calibration.camera import Camera
from common import kinematics
(2) Setting Initial State
debug = False
iHWSONAR = None
board = None
if sys.version_info.major == 2:
    print('Please run this program with python3!')
    sys.exit(0)

# import facial recognition module
Face = mp.solutions.face_detection
# customize the face recognition method; the minimum face detection confidence is 0.8
faceDetection = Face.FaceDetection(min_detection_confidence=0.8)

lab_data = None
servo_data = None
(3) Color Space Conversion
The BGR image is converted to an RGB image.
imgRGB = cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB)  # convert BGR image to RGB image
(4) Using Mediapipe Face Model for Recognition.
The system performs face detection and draws a rectangle around each detected face whose score is above 0.75, then computes the center and area of the bounding box. If start_greet is not already set, the buzzer sounds and start_greet is set to True to trigger the action group. When no face is detected, start_greet is set to False.
if results.detections:  # if the face is not detected, return None

    for index, detection in enumerate(results.detections):  # return the face index (which face) and the coordinate information of the keypoints
        scores = list(detection.score)
        if scores and scores[0] > 0.75:

            bboxC = detection.location_data.relative_bounding_box  # set a bounding box to receive xywh and keypoint information for all boxes

            # convert the coordinates, width, and height of the bounding box from proportional coordinates to pixel coordinates
            bbox = (
                int(bboxC.xmin * img_w),
                int(bboxC.ymin * img_h),
                int(bboxC.width * img_w),
                int(bboxC.height * img_h)
            )
            cv2.rectangle(img, bbox, (0, 255, 0), 2)  # draw a rectangle on each frame of the image

            # get information about the recognition box, where xy is the coordinates of the upper left corner
            x, y, w, h = bbox
            center_x = int(x + (w / 2))
            center_y = int(y + (h / 2))
            area = int(w * h)
            if not start_greet:
                board.set_buzzer(2400, 0.1, 0.2, 1)
                start_greet = True

else:
    start_greet = False
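The coordinate conversion in the listing above can be tried on its own. The following minimal sketch repeats the same arithmetic with plain numbers; to_pixel_box and box_center_and_area are hypothetical helper names, and the 640x480 image size is an assumed example value.

```python
def to_pixel_box(xmin, ymin, width, height, img_w, img_h):
    """Convert a relative (0..1) bounding box to pixel units."""
    return (int(xmin * img_w), int(ymin * img_h),
            int(width * img_w), int(height * img_h))


def box_center_and_area(bbox):
    """Return (center_x, center_y, area) of an (x, y, w, h) box in pixels."""
    x, y, w, h = bbox
    return int(x + w / 2), int(y + h / 2), int(w * h)


if __name__ == "__main__":
    # A face that covers the middle half of an assumed 640x480 frame.
    box = to_pixel_box(0.25, 0.25, 0.5, 0.5, 640, 480)
    print(box, box_center_and_area(box))
```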
(5) Face Recognition
If a face is detected, the Board.setPWMServoPulse function is used to control the servo motor by setting the PWM (Pulse Width Modulation) to perform the waving action.
The first parameter 0.05 is the pulse interval or duration.
The second parameter 3 refers to the pin number connected to the servo.
The third parameter 500 represents the pulse width, which typically corresponds to the servo’s position.
while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()
        Frame = run(frame)
        cv2.imshow('Frame', Frame)
        key = cv2.waitKey(1)
        if key == 27:
            break
    else:
        time.sleep(0.01)
(6) Display the Transmitted Image
Call the imshow() function in the cv2 library to display the processed frame in real time in the transmitted image. Pressing the Esc key (key code 27) exits the loop.
frame = img.copy()
Frame = run(frame)
cv2.imshow('Frame', Frame)
key = cv2.waitKey(1)
if key == 27:
    break
When a face is detected, the buzzer makes a sound.
board.set_buzzer(2400, 0.1, 0.2, 1)
6.9 Face Detection
6.9.1 Program logic
In image recognition, face recognition technology is very popular and is often used in scenarios such as door locks and facial unlocking on mobile phones.
To realize face detection, the first step is to scale the image.
Next, the coordinates of the recognized face are converted back to the coordinates before scaling, and the target face is marked with a box.
Lastly, SpiderPi Pro is controlled to execute the corresponding action. When no face is recognized, the robotic arm rotates left and right to search for a face.
6.9.2 Operation steps
Note
Commands are case-sensitive, and spaces must be entered exactly as shown.
(1) Boot up SpiderPi Pro, then connect remotely to the Raspberry Pi desktop through VNC.
(2) Click the terminal icon at the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.
(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/functions
(4) Enter the command, then press “Enter” to start the game.
python3 face_detect.py
(5) To exit the program, press “Ctrl+C” in the LX terminal. If it does not exit, press it a few more times.
6.9.3 Project outcome
Note
Strong light degrades face detection, so please do not play this game under strong light such as sunlight or incandescent light. It is recommended to run this game indoors, with the face within 1m of the camera.
After the game starts, the camera is raised to a specific angle and then rotates left and right to search for a face. When a face is recognized, the robotic arm stops rotating and SpiderPi Pro “waves”.
6.9.4 Program Analysis
The source code of this program is located in: /home/pi/spiderpi/functions/face_detect.py
Import Function Library
import sys
import cv2
import time
import sys
import threading
import mediapipe as mp
from common import yaml_handle
from calibration.camera import Camera
from common.action_group_controller import ActionGroupController
from common.ros_robot_controller_sdk import Board
from calibration.camera import Camera
from common import kinematics
Define Global Variable
debug = False
iHWSONAR = None
board = None
if sys.version_info.major == 2:
    print('Please run this program with python3!')
    sys.exit(0)

# import facial recognition module
Face = mp.solutions.face_detection
# customize the face recognition method; the minimum face detection confidence is 0.8
faceDetection = Face.FaceDetection(min_detection_confidence=0.8)

lab_data = None
servo_data = None
def load_config():
    global lab_data, servo_data

    lab_data = yaml_handle.get_yaml_data(yaml_handle.lab_file_path)
    servo_data = yaml_handle.get_yaml_data(yaml_handle.servo_file_path)

load_config()
Image Processing
(1) Convert color space
Convert the BGR image to an RGB image.
imgRGB = cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB)  # convert BGR image to RGB image
The cvtColor() function is used to convert an image from one color space to another. The parameters in the parentheses are as follows:
The first parameter img_copy is the input image.
The second parameter cv2.COLOR_BGR2RGB is the type of conversion, which in this case is a conversion from BGR to RGB.
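A BGR to RGB conversion only reverses the order of the three channel values in each pixel. The sketch below shows this on a tiny image stored as nested Python lists; bgr_to_rgb is a hypothetical helper for illustration, whereas the program uses cv2.cvtColor on the full frame.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a nested-list image."""
    return [[pixel[::-1] for pixel in row] for row in image]


if __name__ == "__main__":
    # One row with a blue pixel and a red pixel, both written in BGR order.
    tiny = [[(255, 0, 0), (0, 0, 255)]]
    print(bgr_to_rgb(tiny))
```

The conversion is needed because OpenCV delivers frames in BGR order, while the MediaPipe face detector expects RGB input.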
(2) Call face detector
After completing the image processing steps mentioned above, the image is passed to a face detector for further processing.
results = faceDetection.process(imgRGB)  # transmit the image of each frame to the facial recognition module
if results.detections:  # if the face is not detected, return None

    for index, detection in enumerate(results.detections):  # return the face index (which face) and the coordinate information of the keypoints
        scores = list(detection.score)
        if scores and scores[0] > 0.75:
(3) Display transmitted image
Call the imshow() function in the cv2 library to display the processed frame in the live camera feed. Pressing the Esc key (key code 27) exits the loop.
while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()
        Frame = run(frame)
        cv2.imshow('Frame', Frame)
        key = cv2.waitKey(1)
        if key == 27:
            break
Action Controlling
When a face is recognized, the AGC.run_action() function is called to make SpiderPi Pro execute the designated action group.
AGC.run_action('wave')  # if the face is detected, execute the action
When no face is detected, board.pwm_servo_set_position() is called to rotate the robotic arm of SpiderPi Pro left and right.
board.pwm_servo_set_position(0.05, [[2, servo2_pulse]])
time.sleep(0.05)
Main Function Analysis
(1) Call the init() function to initialize SpiderPi Pro. The initial position is set by initMove():
# initial position
def initMove():
    ultrasonic.setRGBMode(0)
    ultrasonic.setRGB(1, (0, 0, 0))
    ultrasonic.setRGB(2, (0, 0, 0))

    board.pwm_servo_set_position(0.5, [[1, 1800], [2, servo_data['servo2']]])
(2) Call the reset() function to reset variables such as the servo parameters.
# reset variables
def reset():
    global d_pulse
    global start_greet
    global x_pulse
    global action_finish


    start_greet = False
    action_finish = True
    x_pulse = 500
    init_move()
(3) Call the start() function to start the face detection game.
def start():
    global __isRunning
    __isRunning = True
    print("FaceDetect Start")
(4) Instantiate the Camera class and call the camera_open() function with correction=True to open the camera with distortion correction enabled.
camera = Camera()
camera.camera_open(correction=True)
Subthread Analysis
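Run a sub-thread that calls the move() function to control the movement of the pan-tilt servo.
# run sub-thread
th = threading.Thread(target=move)
In the move() function, while no face is detected, the pulse width servo2_pulse is changed by d_pulse on each pass, and the direction is reversed whenever the pulse width goes beyond the 1000 to 2000 range, so the pan-tilt servo sweeps back and forth.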
def move():
    global start_greet
    global action_finish
    global d_pulse, servo2_pulse

    while True:
        if __isRunning:
            if start_greet:
                start_greet = False
                action_finish = False
                AGC.run_action('wave')  # if the face is detected, execute the action

                action_finish = True
                time.sleep(0.5)
            else:
                if servo2_pulse > 2000 or servo2_pulse < 1000:
                    d_pulse = -d_pulse

                servo2_pulse += d_pulse
                board.pwm_servo_set_position(0.05, [[2, servo2_pulse]])
                time.sleep(0.05)
        else:
            time.sleep(0.01)
The meanings of the parameters in the parentheses of the code board.pwm_servo_set_position(0.05, [[2, servo2_pulse]]) are as follows:
The first parameter 0.05 is the runtime of the servo in seconds.
The second parameter is a list of [servo number, pulse width] pairs. Here 2 is the servo number, which is servo 2.
servo2_pulse is the pulse width of the servo, which is swept between 1000 and 2000 by the move() function.
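The back-and-forth search motion can be simulated without hardware. The sketch below reproduces only the pulse-width arithmetic of the sweep with the 1000 to 2000 limits from the listing; the function name sweep, the starting pulse, and the step size of 20 are assumptions for illustration, since the value of d_pulse is not shown in this section.

```python
def sweep(pulse, step, n, low=1000, high=2000):
    """Return the next n pulse widths of a back-and-forth sweep."""
    out = []
    for _ in range(n):
        if pulse > high or pulse < low:
            step = -step  # reverse direction once a limit has been passed
        pulse += step
        out.append(pulse)
    return out


if __name__ == "__main__":
    # Start near the upper limit with an assumed step of 20.
    print(sweep(1990, 20, 3))
```

Because the limit is checked before the step is added, the pulse width can overshoot a limit by one step before the direction reverses.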
6.9.5 Function extension
Note
The built-in action group files can be found in the directory “/home/pi/spiderpi/action_groups”.
When a face is recognized, SpiderPi Pro “waves hands” by default. The program can be modified so that SpiderPi Pro reacts differently, for example with “twist body”. Follow the steps below.
(1) Enter the command and press “Enter” to go to the directory where the game programs are stored.
cd spiderpi/functions
(2) Enter the command “vim face_detect.py” and press “Enter” to open the program file.
vim face_detect.py
(3) Locate the line that calls AGC.run_action('wave').
Note
In vim, type the line number and then press “Shift+G” to jump directly to that line. This section only introduces the quick jump method, so the code line numbers are for reference only. Please refer to the actual file.
(4) Press the “i” key to enter editing mode.
(5) Change “wave” in AGC.run_action('wave') to “twist”. To use a different action group, check the available action group names in the directory “/home/pi/spiderpi/action_groups”.
(6) After modification, press the “Esc” key, enter “:wq”, and then press “Enter” to save and exit.
:wq
6.10 Auto Obstacle Avoidance
6.10.1 Program Logic
The ultrasonic sensor measures the distance between SpiderPi Pro and the object ahead. The data obtained from the sensor is processed and compared with a set distance threshold. When the distance is shorter than the threshold, SpiderPi Pro turns to avoid the obstacle in front. Otherwise, the robot moves forward.
6.10.2 Operation Steps
Note
Commands are case-sensitive, and spaces must be entered exactly as shown.
(1) Boot up SpiderPi Pro, then connect remotely to the Raspberry Pi desktop through VNC.
(2) Click the terminal icon at the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.
(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/functions
(4) Enter the command, then press “Enter” to start the game.
python3 avoidance.py
(5) To exit the game program, press “Ctrl+C” in the LX terminal. If it does not exit, press it a few more times.
6.10.3 Project Outcome
Note
The default distance threshold is 40cm. To change it to another value, refer to “6.10.5 Function Extension -> Modify Default Distance Threshold”.
After the game starts, the measured distance is displayed on the image returned by the camera. When the distance between SpiderPi Pro and the obstacle is shorter than 25cm, the robot steps back and then turns left. When the distance is between 25cm and 40cm, the robot turns left. When the distance is longer than 40cm, the robot moves forward.
6.10.4 Program Analysis
The source code of this program is located in: /home/pi/spiderpi/functions/avoidance.py
Import Function Library
import os
import sys
import cv2
import time
import threading
import numpy as np
import pandas as pd
from common import yaml_handle
from common import kinematics
from calibration.camera import Camera
from calibration.CalibrationConfig import *
from sensor.ultrasonic_sensor import Ultrasonic
import arm_ik.arm_move_ik as AMK
Define Global Variable
if sys.version_info.major == 2:
    print('Please run this program with python3!')
    sys.exit(0)


def load_config():
    global lab_data, servo_data

    lab_data = yaml_handle.get_yaml_data(yaml_handle.lab_file_path)

load_config()

Threshold = 40.0  # default threshold is 40cm
TextColor = (0, 255, 255)
TextSize = 12

__isRunning = False
distance = 0
Main Function Analysis
(1) Initialize and Instantiate
if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board


    board = Board()
    ik = kinematics.IK(board)
    ultrasonic = Ultrasonic()
    ak = AMK.ArmIK()
① Call the init() function to initialize SpiderPi Pro.
init()
start()
camera = Camera()
camera.camera_open()
② Call the reset() function to return the robotic arm to its initial position.
def reset():
    ak.setPitchRangeMoving((0, 15, 30), 0, -90, 100, 1)
③ Instantiate the Camera class and call the camera_open() function to open the camera.
camera = Camera()
camera.camera_open()
Distance Ranging
(1) Distance threshold setting
Set a Threshold to determine whether to perform obstacle avoidance. Its unit is cm.
Threshold = 40.0  # default threshold is 40cm
(2) Acquire and process the measured distance
Obtain the distance measured by the ultrasonic sensor by calling the getDistance() function.
# process data and filter abnormal values
distance_ = ultrasonic.getDistance() / 10.0
distance_data.append(distance_)
data = pd.DataFrame(distance_data)
data_ = data.copy()
u = data_.mean()  # calculate mean
std = data_.std()  # calculate standard deviation

data_c = data[np.abs(data - u) <= std]
distance = data_c.mean()[0]
The readings are then filtered to obtain a more stable distance. Recent readings are kept in a list, values that deviate from the mean by more than one standard deviation are discarded, and the remaining values are averaged. Once the list holds 5 readings, the oldest one is removed.
distance_ = ultrasonic.getDistance() / 10.0
distance_data.append(distance_)
data = pd.DataFrame(distance_data)
data_ = data.copy()
u = data_.mean()  # calculate mean
std = data_.std()  # calculate standard deviation

data_c = data[np.abs(data - u) <= std]
distance = data_c.mean()[0]
if len(distance_data) == 5:
    distance_data.remove(distance_data[0])
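The filtering idea can be sketched without pandas. The version below uses the standard statistics module and the sample standard deviation; filtered_distance is a hypothetical helper name, the readings are made-up values, and returning a single reading unchanged is an assumption of this sketch rather than the behavior of the program.

```python
import statistics


def filtered_distance(readings):
    """Average the readings that lie within one standard deviation of the mean."""
    if len(readings) < 2:
        return float(readings[0])  # assumption: a single reading is returned as-is
    mean = statistics.mean(readings)
    std = statistics.stdev(readings)  # sample standard deviation
    kept = [r for r in readings if abs(r - mean) <= std]
    return statistics.mean(kept)


if __name__ == "__main__":
    # Four consistent readings and one outlier, in cm (made-up values).
    print(filtered_distance([30, 31, 29, 30, 100]))
```

In this example the outlier of 100 pulls the mean up to 44, but it lies more than one standard deviation away and is dropped before the final average is taken.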
(3) Feedback information
By calling the putText() function in the cv2 library, the measured distance is printed on the image returned by the camera.
cv2.putText(img, "Dist:%.1fcm" % distance, (30, 480 - 30), cv2.FONT_HERSHEY_SIMPLEX, 1.2, TextColor, 2)
The parameters in the parentheses are as follows.
The first parameter img is the input image.
The second parameter "Dist:%.1fcm" % distance is the text to display.
The third parameter (30, 480 - 30) is the position of the bottom-left corner of the text.
The fourth parameter cv2.FONT_HERSHEY_SIMPLEX is the font type.
The fifth parameter 1.2 is the font scale.
The sixth parameter TextColor is the color of the text.
The seventh parameter 2 is the line thickness of the text.
Action Controlling
Compare the measured distance with the set threshold. SpiderPi Pro will execute the corresponding action according to the result.
    if 0 < distance < Threshold:
        while distance < 25:  # back up when the distance is less than 25cm
            ik.back(ik.initial_pos, 2, 80, 50, 1)
        for i in range(6):  # turn left 6 times with 15 degrees each time, a total of 90 degrees
            if __isRunning:
                ik.turn_left(ik.initial_pos, 2, 50, 50, 1)
    else:
        ik.go_forward(ik.initial_pos, 2, 80, 50, 1)
else:
    time.sleep(0.01)
The corresponding actions of different distance range are listed below.
| Distance | Action |
|---|---|
| 0cm < distance < 25cm | move backwards and then turn left |
| 25cm < distance < 40cm | turn left |
| 40cm < distance | move forward |
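The table can be expressed as a small decision function. This is only a sketch of the logic; choose_action and its parameter names are hypothetical, and the function returns a description instead of driving the robot.

```python
def choose_action(distance_cm, threshold_cm=40.0, back_up_below_cm=25.0):
    """Map a measured distance to the action described in the table."""
    if 0 < distance_cm < threshold_cm:
        if distance_cm < back_up_below_cm:
            return "move backwards and then turn left"
        return "turn left"
    # Readings at or beyond the threshold (and non-positive readings,
    # as in the listing's condition) lead to moving forward.
    return "move forward"


for d in (10, 30, 45):
    print(d, "->", choose_action(d))
```

Raising threshold_cm, for example to 50.0, makes the robot start turning earlier, which is the effect of the modification described in 6.10.5.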
The movement of SpiderPi Pro is controlled by calling functions in the kinematics.IK library. Use the table below to decide which function to call.
| Function | Usage |
|---|---|
| ik.back(ik.initial_pos, 2, 80, 50, 1) | move backwards 80 mm |
| ik.turn_left(ik.initial_pos, 2, 15, 50, 1) | turn left 15 degrees on the spot |
| ik.go_forward(ik.initial_pos, 2, 80, 50, 1) | move forward 80 mm |
The parameters in the parentheses have the following meanings.
The first parameter is the posture.
The second parameter is the mode. 2 is Spider mode.
The third parameter is the stride. When the robot moves forward or backwards, the unit is mm; when it turns, the unit is degrees.
The fourth parameter is the speed in mm/s.
The fifth parameter is the number of executions. 0 means that the action is repeated in a loop.
6.10.5 Function Extension
Modify Default Distance Threshold
The default distance threshold is 40 cm, and it can be set to any value from 30 to 60 cm. The steps below change it to 50 cm.
(1) Enter the command below and press “Enter” to go to the directory of the game program.
cd spiderpi/functions
(2) Enter the command “vim avoidance.py” and press “Enter” to open the program file.
vim avoidance.py
(3) Locate the line “Threshold = 40.0”.
Note
To jump directly to a line, type its line number and then press “Shift+G”. The line numbers given in this document are for reference only; please check the actual file.
(4) Press the “i” key to enter editing mode and change “40.0” in “Threshold = 40.0” to “50.0”.
(5) After the modification, press “Esc”, type “:wq”, and then press “Enter” to save and exit.
:wq
6.11 Shape Recognition under Single Color
6.11.1 Program Logic
First, SpiderPi Pro recognizes colors in the Lab color space. The camera image is converted to Lab, and then binarization, dilation, erosion and other operations are performed in sequence to obtain a contour that contains only the target color. The contour is then outlined on the image to complete the color recognition.
Next, the shape of the contour is judged, and SpiderPi Pro gives the corresponding response.
6.11.2 Operation Steps
Note
When entering commands, pay strict attention to case sensitivity and spaces.
(1) Boot up SpiderPi Pro, then remotely connect to the Raspberry Pi desktop through VNC.
(2) Click the terminal icon at the upper left corner of the desktop to open Terminator.
(3) Enter the command below and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/advanced
(4) Enter “python3 shape_recognition_plain.py”, and then press “Enter” to start the game.
python3 shape_recognition_plain.py
(5) To quit the game, press “Ctrl+C”. If the game does not quit, press it again.
6.11.3 Project Outcome
After the game starts, place a blue object in front of SpiderPi Pro’s camera. When the shape of the object is recognized, the shape name is printed on the terminal and the buzzer beeps: once for a triangle, twice for a rectangle, and three times for a circle.
6.11.4 Program Parameter Description
The source code of this program is located at: /home/pi/spiderpi/advanced/shape_recognition_plain.py
Importing Function Libraries
import sys
import cv2
import math
import time
import signal
import threading
import numpy as np
from common import yaml_handle
from calibration.camera import Camera
from calibration.CalibrationConfig import *
from common import kinematics
from common.ros_robot_controller_sdk import Board
from common.action_group_controller import ActionGroupController
import arm_ik.arm_move_ik as AMK
from sensor.ultrasonic_sensor import Ultrasonic
import sensor.dot_matrix_sensor as DMS
(1) The program imports the libraries related to OpenCV, time, math, and threading. To call a function in a library, write “library name.function name(parameter, parameter)”. For example:
time.sleep(3)
This calls the sleep() function in the “time” library, which delays execution for the given number of seconds. Libraries such as “time” and “math” are part of the Python standard library, while “cv2” (OpenCV) and “numpy” are third-party packages; once installed, all of them can be imported directly. You can also write your own library, such as “yaml_handle”.
(2) Importing Classes from Function Libraries
Some module paths are too long to type repeatedly. To make calls easier, the required module or class can be imported by name. For example:
from common import kinematics
from common.ros_robot_controller_sdk import Board
from common.action_group_controller import ActionGroupController
After the import, the class can be referred to directly as Board. The functions of an object created from it are called in the form “object name.function name(parameter, parameter)”, for example board.set_buzzer(2400, 0.1, 0.4, 1).
Main Function Analysis
The statement if __name__ == '__main__': marks the entry point of the program. It first loads the camera calibration parameters and builds the undistortion maps, then calls load_config() to read the color threshold file and init_move() to return the servos to the initial position, and finally opens the camera. In general, initialization also covers the configuration of ports, peripherals, timer interrupts, and so on.
if __name__ == '__main__':
    # load parameters
    param_data = np.load(calibration_param_path + '.npz')

    # obtain parameters
    mtx = param_data['mtx_array']
    dist = param_data['dist_array']
    newcameramtx, _ = cv2.getOptimalNewCameraMatrix(mtx, dist, (640, 480), 0, (640, 480))
    mapx, mapy = cv2.initUndistortRectifyMap(mtx, dist, None, newcameramtx, (640, 480), 5)

    load_config()
    init_move()

    camera = Camera()
    camera.camera_open()
Parameters of Color Detection
Shape recognition is realized through detecting the color of the object. The detected color is blue.
color = 'blue'
The main parameters involved in detecting the color of the object are as follows:
(1) Before the image is converted to the LAB space, it is denoised with Gaussian filtering using the GaussianBlur() function.
frame_gb = cv2.GaussianBlur(img, (3, 3), 3)
The first parameter img is the input image.
The second parameter (3, 3) is the size of the Gaussian kernel. A larger kernel gives stronger filtering, which produces a blurrier output image and requires more computation.
The third parameter 3 is the standard deviation of the Gaussian function in the X direction. It controls how quickly the weights fall off away from the kernel center: a larger value spreads the weights more evenly over the neighboring pixels and blurs more, while a smaller value concentrates the weights near the center and blurs less.
(2) Use the inRange() function to binarize the input image.
frame_mask = cv2.inRange(frame_lab,
                         (lab_data[color]['min'][0],
                          lab_data[color]['min'][1],
                          lab_data[color]['min'][2]),
                         (lab_data[color]['max'][0],
                          lab_data[color]['max'][1],
                          lab_data[color]['max'][2]))
(3) To reduce interference and make the image smoother, use the cv2.morphologyEx() function to process the image.
opened = cv2.morphologyEx(frame_mask, cv2.MORPH_OPEN, np.ones((6, 6), np.uint8))
closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, np.ones((6, 6), np.uint8))
Take opened = cv2.morphologyEx(frame_mask, cv2.MORPH_OPEN, np.ones((6,6),np.uint8)) as an example.
The first parameter frame_mask is the input image.
The second parameter is the type of morphological operation. cv2.MORPH_OPEN is the opening operation: erosion is performed first and then dilation, which removes small bright spots. cv2.MORPH_CLOSE is the closing operation: dilation is performed first and then erosion, which fills small dark holes.
The third parameter np.ones((6,6),np.uint8) is the structuring element (kernel), here a 6×6 square.
(4) Find the largest contour of the object.
# Find the contour with the maximum area
def get_area_maxContour(contours):
    contour_area_temp = 0
    contour_area_max = 0
    area_max_contour = None
    for c in contours:
        contour_area_temp = math.fabs(cv2.contourArea(c))
        if contour_area_temp > contour_area_max:
            contour_area_max = contour_area_temp
            if contour_area_temp > 50:
                area_max_contour = c
    return area_max_contour, contour_area_max
To filter out interference, the condition if contour_area_temp > 50 is used: the largest contour is considered valid only when its area is greater than 50.
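The selection logic can be tried without OpenCV. The sketch below follows the same steps but computes the area with the shoelace formula instead of cv2.contourArea(); the names polygon_area and largest_contour and the two sample squares are assumptions made for illustration.

```python
def polygon_area(points):
    """Area of a closed polygon given as a list of (x, y) vertices (shoelace formula)."""
    twice_area = 0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def largest_contour(contours, min_area=50):
    """Return (contour, max_area); the contour is None unless its area exceeds min_area."""
    best, max_area = None, 0.0
    for c in contours:
        area = polygon_area(c)
        if area > max_area:
            max_area = area
            if area > min_area:
                best = c
    return best, max_area


small = [(0, 0), (5, 0), (5, 5), (0, 5)]      # area 25, treated as interference
big = [(0, 0), (10, 0), (10, 10), (0, 10)]    # area 100
print(largest_contour([small, big]))
```

With only the small square as input, the function reports its area but returns None as the contour, which is how the minimum-area check rejects interference.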
Color Recognition Parameters
When the robot recognizes a blue object, the cv2.drawContours() function is used to draw the contour of the object.
cv2.drawContours(img, areaMaxContour, -1, (0, 0, 255), 2)
The first parameter img is the input image;
The second parameter areaMaxContour is the contour to be drawn (the function expects a list of contours);
The third parameter -1 is the index of the contour. A negative value means that all the contours in the list are drawn;
The fourth parameter (0, 0, 255) is the color of the contour. OpenCV uses the order B, G, R, so this value is red;
The fifth parameter 2 is the line thickness of the contour.
Shape Judgment Parameters
(1) After the object contour is obtained, cv2.approxPolyDP() is used to approximate the object shape with a polygon.
approx = cv2.approxPolyDP(areaMaxContour, epsilon, True)
The first parameter areaMaxContour is the set of points of the contour.
The second parameter epsilon is the approximation accuracy, that is, the maximum allowed distance between the original contour and the approximating polygon. If the distance d from a contour point to the approximating line segment is smaller than epsilon, the point is discarded; otherwise it is kept as a vertex.
The third parameter True indicates that the approximated curve is closed. False indicates an open curve.
For example, take a curve that runs from A through B to C. The segment AC is processed first. If d, the distance from B to AC, is greater than epsilon, B is kept as a vertex, and the segments AB and BC are then processed in the same way.
Note
You can adjust the value of epsilon. In this program it is set to 0.035 times the contour perimeter. A smaller value keeps the approximation closer to the original contour and yields more vertices; a larger value yields fewer vertices.
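The A, B, C example corresponds to the Douglas-Peucker simplification idea. The following is a simplified pure-Python sketch for an open curve, written for illustration; it is not the OpenCV implementation, and the function names are assumptions. It measures the distance to the infinite line through the two end points rather than to the segment.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - (ax - px) * dy) / length


def douglas_peucker(points, epsilon):
    """Simplify an open polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    index, d_max = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], a, b)
        if d > d_max:
            index, d_max = i, d
    if d_max > epsilon:
        # Keep the farthest point and process both halves in the same way.
        left = douglas_peucker(points[:index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right
    return [a, b]


curve = [(0, 0), (5, 3), (10, 0)]   # A, B, C; B lies 3 units from AC
print(douglas_peucker(curve, 1.0))  # B is kept
print(douglas_peucker(curve, 5.0))  # B is discarded
```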
(2) Obtain the number of sides of the approximating polygon. Once 24 values have been collected in shape_list, their rounded mean is taken as the number of sides.
140 if len(shape_list) == 24:
141 shape_length = int(round(np.mean(shape_list)))
142 shape_list = []
143 #print(shape_length)
144 else:
145 shape_length = 0
146 return img
(3) Judge the shape of the object from the number of sides and print it on the terminal. At the same time, the buzzer beeps a different number of times depending on the shape.
# Main control function
def move():
    #global shape_length, board
    while move_st:
        if shape_length == 3:
            print('三角形')  # triangle
            board.set_buzzer(2400, 0.1, 0.4, 1)  # 2400 Hz, sound for 0.1 s, pause for 0.4 s, repeat once
            time.sleep(3)

        elif shape_length == 4:
            print('矩形')  # rectangle
            board.set_buzzer(2400, 0.1, 0.4, 2)  # 2400 Hz, sound for 0.1 s, pause for 0.4 s, repeat twice
            time.sleep(3)

        elif shape_length >= 6:
            print('圆')  # circle
            board.set_buzzer(2400, 0.1, 0.4, 3)  # 2400 Hz, sound for 0.1 s, pause for 0.4 s, repeat three times
            time.sleep(3)

        else:
            time.sleep(1)
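The mapping from the averaged number of sides to a shape and a beep count can be sketched as a pure function. The name classify_shape and the returned English labels are assumptions for illustration; the real program reads the global shape_length and drives the buzzer directly.

```python
from statistics import mean


def classify_shape(vertex_counts):
    """Average the per-frame vertex counts and map the result to (shape, beeps)."""
    sides = int(round(mean(vertex_counts)))
    if sides == 3:
        return "triangle", 1
    if sides == 4:
        return "rectangle", 2
    if sides >= 6:
        return "circle", 3
    return None, 0  # no shape recognized (for example 5 sides)


# 24 samples, as in the listing; a few noisy frames do not change the result.
samples = [4] * 20 + [5] * 4
print(classify_shape(samples))
```

Averaging over several frames is what makes the result tolerant of an occasional frame in which the contour approximation returns one vertex too many.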
6.11.5 Function Extension
Changing the Default Recognition Color
The default recognition color of this game is blue. The steps below change the default recognition color to red as an example:
(1) Enter the command below to go to the directory where the game programs are stored.
cd spiderpi/advanced
(2) Enter the command below to open the program file.
sudo vim shape_recognition_plain.py
(3) Locate the line “color = 'blue'”.
Note
To jump directly to a line, type its line number and then press “Shift+G”. The line numbers given in this document are for reference only; please check the actual file.
(4) Press the “i” key to enter editing mode, then change “blue” in “color = 'blue'” to “red”.
(5) After the modification, press “Esc”, type “:wq”, and press “Enter” to save the file and exit.
:wq
(6) Execute the steps in “6.11.2 Operation Steps” to check the modification effect.
Changing the Feedback Sound
By default, the buzzer beeps once when a triangle is recognized, twice when a rectangle is recognized, and three times when a circle is recognized. The steps below make the buzzer beep twice when a circle is recognized as an example.
(1) Enter the command below and press “Enter” to go to the directory where the game programs are stored.
cd spiderpi/advanced
(2) Enter the command below and press “Enter” to open the program file.
sudo vim shape_recognition_plain.py
(3) Scroll down to the line board.set_buzzer(2400, 0.1, 0.4, 3) in the circle branch of the move() function.
(4) Press the “i” key to enter editing mode and change the “3” in board.set_buzzer(2400, 0.1, 0.4, 3) to “2”.
(5) After the modification, press “Esc”, type “:wq”, and press “Enter” to save and exit.
:wq
(6) Execute the steps in “6.11.2 Operation Steps” to check the modification effect.
6.12 Shape Recognition
6.12.1 Program logic
First, the real-time camera image is processed with OpenCV: binarization, erosion, dilation and other operations are performed to obtain a contour that contains only the target color, and the contour is marked on the image.
After the target contour is obtained, the corresponding shape is deduced from the contour approximation result. The recognition result is displayed on the dot matrix screen, which completes the shape recognition.
6.12.2 Operation steps
Note
Commands are case-sensitive, and spaces must be entered exactly as shown.
(1) Boot up SpiderPi Pro, then remotely connect to the Raspberry Pi desktop through VNC.
(2) Click the terminal icon at the upper left corner of the desktop to open Terminator.
(3) Enter the command below and press “Enter” to navigate to the directory where the game program is located.
cd spiderpi/advanced
(4) Enter the command below, and then press “Enter” to start the game.
python3 shape_recognition.py
(5) To close the game, press “Ctrl+C” in the terminal. If the game does not quit, press it again.
6.12.3 Project outcome
Note
The default recognition colors are red, green and blue. The recognizable shapes are triangle, rectangle and circle.
When a shape is recognized, the corresponding pattern is displayed on the dot matrix screen. In addition, the number of sides and the shape name are printed on the terminal.
6.12.4 Program Parameter Description
The source code of this program is located at /home/pi/spiderpi/advanced/shape_recognition.py
Import Function Library
import sys
import cv2
import math
import time
import signal
import threading
import numpy as np
from calibration.camera import Camera
from calibration.CalibrationConfig import *
from common import yaml_handle
from common import kinematics
from common.ros_robot_controller_sdk import Board
from common.action_group_controller import ActionGroupController
import arm_ik.arm_move_ik as AMK
from sensor.ultrasonic_sensor import Ultrasonic
import sensor.dot_matrix_sensor as DMS
(1) The program imports the libraries related to OpenCV, time, math, and threading. To call a function in a library, write “library name.function name(parameter, parameter)”. For example:
time.sleep(0.01)
This calls the sleep() function in the “time” library, which delays execution for the given number of seconds. Libraries such as time and math are part of the Python standard library, while cv2 (OpenCV) and numpy are third-party packages; once installed, all of them can be imported directly. You can also write your own library, such as yaml_handle.
(2) Importing Classes from Function Libraries
Some module paths are too long to type repeatedly. To make calls easier, the required class can be imported by name, or a module can be given a short alias. For example:
from common.ros_robot_controller_sdk import Board
from common.action_group_controller import ActionGroupController
import arm_ik.arm_move_ik as AMK
After the import, the class can be referred to directly as Board, and the module arm_ik.arm_move_ik can be referred to as AMK. The functions of an object created from a class are called in the form “object name.function name(parameter, parameter)”.
Analysis of the Main Function
In a Python program, the statement if __name__ == '__main__': marks the entry point of the program. The program first loads the camera calibration parameters and builds the undistortion maps. The initialization then returns the servos to the initial position and reads the color threshold file. In general, initialization also covers the configuration of ports, peripherals, timer interrupts, and so on.
if __name__ == '__main__':
    # load parameters
    param_data = np.load(calibration_param_path + '.npz')

    # obtain parameters
    mtx = param_data['mtx_array']
    dist = param_data['dist_array']
    newcameramtx, _ = cv2.getOptimalNewCameraMatrix(mtx, dist, (640, 480), 0, (640, 480))
    mapx, mapy = cv2.initUndistortRectifyMap(mtx, dist, None, newcameramtx, (640, 480), 5)
Defining Global Variables
# Read the color thresholds
def load_config():
    global lab_data

    lab_data = yaml_handle.get_yaml_data(yaml_handle.lab_file_path)

# Initial position
def init_move():
    ultrasonic.setRGBMode(0)
    ultrasonic.setRGB(0, (0, 0, 0))
    ultrasonic.setRGB(1, (0, 0, 0))
    ik.stand(ik.initial_pos)
    ak.setPitchRangeMoving((0, 12, 18), -60, -90, 100, 2)
(1) Gaussian Filtering
Before the image is converted from the BGR color space to the LAB space, it is denoised with the GaussianBlur() function in the cv2 library. The line below then converts the filtered image frame_gb to the LAB space.
frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space
The parameters of the GaussianBlur() call have the following meanings.
The first parameter img is the input image.
The second parameter (3, 3) is the size of the Gaussian kernel.
The third parameter 3 is the standard deviation of the Gaussian function. The larger the value, the more evenly the weights are spread over the neighboring pixels and the stronger the blur; the smaller the value, the more the weights concentrate at the kernel center and the weaker the blur.
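The effect of the third parameter can be seen by computing the weights of a one-dimensional Gaussian kernel by hand. This sketch evaluates the textbook Gaussian formula and normalizes the weights; gaussian_kernel_1d is a hypothetical helper, and the exact coefficients used inside OpenCV may differ slightly.

```python
import math


def gaussian_kernel_1d(ksize, sigma):
    """Normalized Gaussian weights for a kernel of odd size ksize."""
    center = (ksize - 1) / 2.0
    weights = [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
               for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]


wide = gaussian_kernel_1d(3, 3.0)    # sigma used in the program
narrow = gaussian_kernel_1d(3, 0.5)  # smaller sigma for comparison
print([round(w, 3) for w in wide])
print([round(w, 3) for w in narrow])
```

With a standard deviation of 3 and a kernel size of 3, the three weights are almost equal, so the filter behaves much like a simple average; with 0.5, most of the weight stays on the center pixel.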
(2) Binarization Processing
Use the inRange() function in the cv2 library to binarize the image.
frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # threshold the image to obtain the mask
The first parameter in the parentheses is the input image. The second and third parameters are the lower limit and upper limit of the threshold. When the LAB value of a pixel lies between the lower limit and the upper limit, the pixel is set to 255 (white); otherwise it is set to 0 (black).
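The per-pixel rule can be written out in plain Python. In this sketch the function in_range_mask, the threshold values and the tiny 2×2 image are all made up for illustration; the real thresholds come from the color threshold file.

```python
def in_range_mask(pixels, lower, upper):
    """Return 255 where every channel lies within [lower, upper], else 0."""
    mask = []
    for row in pixels:
        mask.append([
            255 if all(lo <= v <= hi for v, lo, hi in zip(px, lower, upper)) else 0
            for px in row
        ])
    return mask


# Hypothetical LAB thresholds and a 2x2 test image of (L, A, B) values.
lower = (0, 150, 130)
upper = (255, 255, 255)
image = [
    [(120, 180, 160), (120, 100, 160)],
    [(30, 150, 130), (200, 200, 90)],
]
print(in_range_mask(image, lower, upper))
```

A pixel becomes white only if all three channels are inside their ranges, and the limits themselves count as inside.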
(3) Erosion and Dilation
Erosion removes burrs from the edges of the object in the image. Dilation expands the edges of the object and fills in non-target pixels at the edge of or inside the target object.
To reduce interference and make the image smoother, use the morphologyEx() function in the OpenCV library to perform the opening and closing operations in sequence on the binary image obtained after binarization.
opened = cv2.morphologyEx(frame_mask, cv2.MORPH_OPEN, np.ones((6,6),np.uint8))  # opening operation
closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, np.ones((6,6),np.uint8))  # closing operation
The opening operation erodes first and then dilates. It eliminates small bright areas and separates objects at thin connections, and it smooths the boundary of larger objects without noticeably changing their area.
The closing operation dilates first and then erodes. It bridges narrow gaps and thin cracks, eliminates small holes, repairs breaks in contour lines, and also has a certain smoothing effect on contours.
The parameters in the parentheses of the morphologyEx() function have the following meanings.
The first parameter is the input image.
The second parameter is the morphological method used. cv2.MORPH_OPEN is the opening operation, and cv2.MORPH_CLOSE is the closing operation.
The third parameter is the kernel of the morphological operation. np.ones((6,6),np.uint8) is a 6×6 square structuring element.
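The two operations can be demonstrated on a small binary grid without OpenCV. This is a simplified sketch with a 3×3 square neighborhood (the program uses a 6×6 kernel); pixels outside the image are ignored, and all function names are chosen for illustration.

```python
def _apply(img, reducer):
    """Apply min (erosion) or max (dilation) over each pixel's 3x3 neighborhood."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = reducer(window)
    return out


def erode(img):
    return _apply(img, min)


def dilate(img):
    return _apply(img, max)


def opening(img):
    return dilate(erode(img))   # erosion first, then dilation


def closing(img):
    return erode(dilate(img))   # dilation first, then erosion


# A 3x3 object in the middle of a 9x9 image.
block = [[0] * 9 for _ in range(9)]
for y in range(3, 6):
    for x in range(3, 6):
        block[y][x] = 1

noisy = [row[:] for row in block]
noisy[1][1] = 1                 # isolated bright speck
holed = [row[:] for row in block]
holed[4][4] = 0                 # small dark hole inside the object

print(opening(noisy) == block)  # the speck is removed
print(closing(holed) == block)  # the hole is filled
```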
(4) Acquire the maximum contour
After the image is processed, the contours of the target to be recognized are obtained with the findContours() function in the cv2 library.
contours = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours
The first parameter in the parentheses is the input image; the second parameter is the contour retrieval mode; the third parameter is the contour approximation method. The index [-2] selects the list of contours from the returned tuple.
The contour with the largest area is then selected from the obtained contours. To avoid interference, a minimum area is set: the target contour is valid only when its area is larger than this value.
if contour_area_temp > 50:  # the contour with the maximum area is valid only when its area is greater than 50, to filter out interference
    area_max_contour = c
After the contour with the largest area is obtained, the drawContours() function in the cv2 library is used to mark the contour.
cv2.drawContours(img, areaMaxContour_max, -1, (0, 0, 255), 2)
(5) Shape Recognition
The perimeter of the contour is calculated with the arcLength() function in the cv2 library, and the approxPolyDP() function is used for contour approximation.
# Shape recognition
# The factor 0.035 of the perimeter can be adjusted according to the detection performance; the better the detection, the smaller the value
epsilon = 0.035 * cv2.arcLength(areaMaxContour_max, True)
# Contour approximation
approx = cv2.approxPolyDP(areaMaxContour_max, epsilon, True)
Based on the contour approximation result, the number of sides of the recognized object is obtained and used to judge its shape.
162 shape_list.append(len(approx))
163 if len(shape_list) == 24:
164 shape_length = int(round(np.mean(shape_list)))
165 shape_list = []
166 else:
167 shape_length = 0
Dot Matrix Display
According to the recognition result, the corresponding pattern is displayed on the dot matrix screen.
if shape_length == 3:
    print('三角形')  # triangle
    ## display 'triangle'
    tm.display_buf = (0x80, 0xc0, 0xa0, 0x90, 0x88, 0x84, 0x82, 0x81,
                      0x81, 0x82, 0x84, 0x88, 0x90, 0xa0, 0xc0, 0x80)
    tm.update_display()

elif shape_length == 4:
    print('矩形')  # rectangle
    ## display 'rectangle'
    tm.display_buf = (0x00, 0x00, 0x00, 0x00, 0xff, 0x81, 0x81, 0x81,
                      0x81, 0x81, 0x81, 0xff, 0x00, 0x00, 0x00, 0x00)
    tm.update_display()

elif shape_length >= 6:
    print('圆')  # circle
    ## display 'circle'
    tm.display_buf = (0x00, 0x00, 0x00, 0x00, 0x1c, 0x22, 0x41, 0x41,
                      0x41, 0x22, 0x1c, 0x00, 0x00, 0x00, 0x00, 0x00)
    tm.update_display()

else:
    ## clear the screen
    tm.display_buf = [0] * 16
    tm.update_display()
    print('None')
There are 16 columns of LEDs on the dot matrix screen, and each column is controlled by one byte written as a hexadecimal value. For example, 0x88 is “10001000” in binary, and the status of the LEDs in that column, from top to bottom, is “on off off off on off off off”.
Calling the update_display() function in the HiwonderSDK.tm1640 library writes the contents of the tm.display_buf buffer to the dot matrix screen, so setting the buffer and then calling this function displays the desired pattern.
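A buffer can be previewed on the terminal before it is sent to the screen. The sketch below decodes each column byte with the most significant bit at the top, as described above; column_states and render are hypothetical helpers and do not belong to the dot matrix library.

```python
def column_states(value):
    """Return the 8 LED states of one column, top to bottom (1 = on, 0 = off)."""
    return [(value >> bit) & 1 for bit in range(7, -1, -1)]


def render(buffer):
    """Draw a 16-column buffer as 8 text rows, '#' for on and '.' for off."""
    rows = []
    for bit in range(7, -1, -1):
        rows.append("".join("#" if (col >> bit) & 1 else "." for col in buffer))
    return "\n".join(rows)


# The rectangle pattern from the listing above.
rectangle = (0x00, 0x00, 0x00, 0x00, 0xff, 0x81, 0x81, 0x81,
             0x81, 0x81, 0x81, 0xff, 0x00, 0x00, 0x00, 0x00)

print(column_states(0x88))
print(render(rectangle))
```

The same helper can be used to check a custom pattern: write the 16 bytes, render them as text, and adjust the values until the preview looks right.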