6. AI Visual Recognition & Tracking Course

6.1 Single Color Recognition

In this section, the camera detects colors. When a red ball is recognized, the buzzer will emit a beep, and the red ball will be highlighted in the transmitted image with “Color: red” displayed.

6.1.1 Program Description

The implementation of color recognition consists of two parts: color detection and execution feedback after recognition.

First, in the color detection part, Gaussian filtering is applied to the image to reduce noise. The image is then converted to the Lab color space (you can learn more about the Lab color space in the “OpenCV Vision Basic Course” section of the tutorial materials).

Next, the object’s color within the circle is recognized using color thresholding, followed by masking (masking involves using selected images, shapes, or objects to globally or locally obscure the image being processed).

After performing morphological operations such as opening and closing on the object image, the object with the largest contour is circled.

Opening: The image undergoes erosion followed by dilation. This operation removes small objects, smooths shape boundaries, and preserves the area. It can eliminate small noise particles and separate connected objects.

Closing: The image undergoes dilation followed by erosion. This operation fills small holes within objects, connects nearby objects, closes broken contour lines, and smooths boundaries while preserving the area.

After recognition, the servo and buzzer are set up to provide feedback based on the detected color. For example, when red is detected, the buzzer will emit a sound.

For detailed feedback behavior, please refer to 6.1.3 Program Outcome of this document.

6.1.2 Start and Close the Game

Note

The input command is case-sensitive, and keywords can be auto-completed using the Tab key.

(1) Power on the device and, following the instructions in “Remote Desktop Installation and Connection\3.1 VNC Installation and Connection”, use the VNC remote connection tool to connect.

(2) Click the icon in the top left corner of the system desktop, or press the shortcut “Ctrl+Alt+T”, to open the Terminator terminal.

(3) Execute the command to navigate to the directory where the program is located, then press Enter:

cd spiderpi/functions

(4) Enter the command and press Enter to start the program:

python3 color_recognition.py

(5) To close the program, simply press “Ctrl+C” in the LX terminal. If it does not close, press it multiple times.

6.1.3 Program Outcome

After starting the game, the camera will be used to detect colors. When a red ball is recognized, the buzzer will emit a beep sound, and the ball will be circled in the transmitted image, with “Color: red” printed.

Note

  • During the recognition process, ensure the environment is well-lit to avoid inaccurate recognition due to poor lighting conditions.

  • Ensure that no objects with similar or matching colors to the target are present in the background within the camera’s visual range, as this may cause misrecognition.

  • If color recognition is inaccurate, refer to the section “6.1.5 Function Extensions -> Adjusting Color Thresholds” in this document to adjust the color threshold settings.

6.1.4 Program Analysis

The source code of this program is saved in: /home/pi/spiderpi/functions/color_recognition.py

  • Import Function Library

import sys
import cv2
import math
import time
import threading
import numpy as np
from common import misc
from common import yaml_handle
from calibration.camera import Camera
from sensor.ultrasonic_sensor import Ultrasonic

(1) Import Libraries for OpenCV, Time, Math, and Threading

To use functions from a library, we can call them with the syntax:

library_name.function_name(parameter1, parameter2, ...)

For example, to call the sleep function from the time library, we use:

time.sleep(0.01)

The time and math libraries are part of the Python standard library and can be imported directly. cv2 (OpenCV) and numpy are third-party packages that must be installed before they can be imported. You can also create your own libraries, like the yaml_handle file-reading library mentioned above.

(2) Instantiate a Library

Some module paths can be long and hard to remember. To simplify calls, we import the class we need directly and then create an instance of it. For example:

from calibration.camera import Camera

After creating an instance with camera = Camera(), we can call its methods using the shorter syntax:

camera.method_name(parameter1, parameter2, …)

This makes it much easier and more convenient to use.

  • Main Function Analysis

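In a Python program, the code under if __name__ == '__main__': runs only when the file is executed directly, so it serves as the main function of the program. Here, the program starts by creating the camera object and reading an image from it.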

(1) Image Processing

camera = Camera()

When the game starts, a Camera object is created and stored in “camera”. The latest frame of the video stream is then read from camera.frame.

(2) Entering Image Processing

When an image is read, the run() function is called for image processing.

while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()
        Frame = run(frame)
        cv2.imshow('Frame', Frame)
        key = cv2.waitKey(1)
        if key == 27:
            break

① The function img.copy() is used to copy the content of img to frame.

② The function run() performs image processing.

def run(img):
    global draw_color
    global color_list
    global detect_color
    global action_finish
    global count
    img_copy = img.copy()
    img_h, img_w = img.shape[:2]

    frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)
    frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)
    frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space

(3) Resizing the image for easier processing.

frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)

The first parameter img_copy is the input image.

The second parameter size is the size of the output image, given as (width, height). You can choose this value yourself.

The third parameter interpolation=cv2.INTER_NEAREST is the interpolation method. The common options are:

INTER_NEAREST: Nearest-neighbor interpolation.

INTER_LINEAR: Bilinear interpolation. If you do not specify the last parameter, this method will be used by default.

INTER_CUBIC: Bicubic interpolation within a 4x4 pixel neighborhood.

INTER_LANCZOS4: Lanczos interpolation within an 8x8 pixel neighborhood.

(4) Gaussian Filtering

There is always noise mixed in the image, which affects the image quality and makes the features less prominent. Different filtering methods are selected according to different types of noise, common ones include: Gaussian filtering, median filtering, mean filtering, etc.

Gaussian filtering is a linear smoothing filter, suitable for eliminating Gaussian noise and widely used in the noise reduction process of image processing.

frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)

The first parameter frame_resize is the input image.

The second parameter (3, 3) is the size of the Gaussian kernel.

The third parameter 3 is the standard deviation of the Gaussian kernel in the X direction.
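To make the kernel size and standard deviation more concrete, the sketch below computes the weights of a 3-tap one-dimensional Gaussian kernel with standard deviation 3 from the standard Gaussian formula. The helper name gaussian_kernel_1d is hypothetical; OpenCV builds its own kernel internally, so this only illustrates the idea.

```python
import math

def gaussian_kernel_1d(ksize, sigma):
    """Return normalized 1-D Gaussian weights for an odd kernel size."""
    center = (ksize - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]

kernel = gaussian_kernel_1d(3, 3)
print([round(w, 3) for w in kernel])  # [0.327, 0.346, 0.327]
```

With a standard deviation this large relative to the kernel size, the three weights are almost equal, so the filter behaves much like a simple 3x3 average.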

(5) Converting the Image to LAB Color Space

The function cv2.cvtColor() is a color space conversion function.

frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space

The first parameter frame_gb is the input image.

The second parameter cv2.COLOR_BGR2LAB is the conversion format. cv2.COLOR_BGR2LAB converts from BGR format to LAB format. If you want to convert to RGB, you can use cv2.COLOR_BGR2RGB.

(6) Converting the Image into a Binary Image

A binary image contains only two values (0 and 255 in OpenCV), which makes the image simpler, reduces the data volume, and makes it easier to process.

The inRange() function in the cv2 library is used to binarize the image.

frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # threshold the Lab image to obtain a binary mask

The first parameter frame_lab is the input image;

The second parameter (lab_data[i]['min'][0],lab_data[i]['min'][1],lab_data[i]['min'][2]) is the lower color threshold;

The third parameter (lab_data[i]['max'][0],lab_data[i]['max'][1],lab_data[i]['max'][2]) is the upper color threshold;
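The per-pixel rule applied by inRange() can be sketched in plain Python as follows. The function name in_range and the threshold values are hypothetical and chosen only for illustration; the real thresholds are read from the color threshold file.

```python
def in_range(pixel, lower, upper):
    """Return 255 if every channel lies within [lower, upper], else 0."""
    inside = all(lo <= value <= hi for value, lo, hi in zip(pixel, lower, upper))
    return 255 if inside else 0

# Hypothetical Lab thresholds for "red" (not the values in the config file).
red_min = (0, 150, 130)
red_max = (255, 255, 255)

row = [(136, 208, 195), (0, 128, 128), (224, 42, 211)]  # example Lab pixels
mask_row = [in_range(p, red_min, red_max) for p in row]
print(mask_row)  # [255, 0, 0]
```

Only the first pixel has all three channels inside the range, so it is the only one that turns white in the mask.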

(7) To reduce interference and make the image smoother, erosion and dilation are performed on the image. Erosion and dilation are two basic morphological operations, often used in image processing, especially on binary images. They are usually used to remove small noise, separate and identify objects in the image, and shrink or grow the regions of objects.

eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate

The first parameter is the input image;

The second parameter is the structural element (also known as the kernel), which defines the nature of the operation. The size and shape of the kernel determine the degree of erosion and dilation.

(8) Obtaining the Contour with the Largest Area

contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours

The first parameter dilated is the input image;

The second parameter cv2.RETR_EXTERNAL is the contour retrieval mode;

The third parameter cv2.CHAIN_APPROX_NONE is the contour approximation method. The trailing index [-2] selects the list of contours from the return value, which works whether findContours() returns two values or three, depending on the OpenCV version.

Among the obtained contours, the one with the largest area is selected. To avoid interference, a minimum area is set, and the target contour is considered valid only when its area is larger than this value.

areaMaxContour, area_max = get_area_max_contour(contours)  # find the largest contour
if areaMaxContour is not None:
    if area_max > max_area:  # find the maximum area
        max_area = area_max
        color_area_max = i
        areaMaxContour_max = areaMaxContour
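The body of get_area_max_contour() is not shown in this section. The following is only a sketch of the idea, assuming the helper returns the largest contour whose area exceeds a minimum value. The names polygon_area and largest_contour are hypothetical, contours are represented here as plain lists of (x, y) points, and the area is computed with the shoelace formula; a real implementation would more likely use cv2.contourArea().

```python
def polygon_area(points):
    """Area of a closed polygon given as [(x, y), ...] (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def largest_contour(contours, min_area=0.0):
    """Return (contour, area) for the largest contour above min_area, or (None, 0.0)."""
    best, best_area = None, 0.0
    for contour in contours:
        area = polygon_area(contour)
        if area > min_area and area > best_area:
            best, best_area = contour, area
    return best, best_area

small = [(0, 0), (2, 0), (2, 2), (0, 2)]      # area 4, treated as noise
large = [(0, 0), (10, 0), (10, 5), (0, 5)]    # area 50
contour, area = largest_contour([small, large], min_area=10)
print(area)  # 50.0
```

The minimum area acts as a noise filter: if only the small contour were present, the function would return (None, 0.0) and no target would be reported.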

(9) Displaying the Returned Image

frame = img.copy()
Frame = run(frame)
cv2.imshow('Frame', Frame)
key = cv2.waitKey(1)
if key == 27:
    break

The function cv2.imshow() displays the image in a window; 'Frame' is the window name, and Frame is the content to display. It must be followed by cv2.waitKey(), otherwise the image cannot be displayed.

The function cv2.waitKey() waits for key input. The parameter “1” is the delay time in milliseconds, and the return value 27 corresponds to the Esc key, which exits the loop.

  • Driving the Buzzer

board.set_buzzer(2400, 0.1, 0.2, 1)
time.sleep(0.2)

The function set_buzzer() is used to drive the buzzer.

The call time.sleep(0.2) is a delay; it pauses the program for 0.2 seconds, which is the buzzing time.

6.1.5 Function Extensions

  • Adjusting Color Thresholds

The color recognition program is pre-configured to recognize three colors: red, green, and blue. By default, the program identifies red, triggering the buzzer to emit a beep and drawing a circle around the red ball in the transmitted image, displaying “Color: red”.

To change the recognized color to green, follow these steps:

(1) Enter the following command and press Enter to navigate to the source code directory:

cd spiderpi/functions

(2) Then, enter the following command and press Enter to open the program file:

sudo vim color_recognition.py

(3) Locate the code shown in the image below:

(4) Press the “i” key on the keyboard to enter edit mode.

(5) Replace “red” (highlighted in red in the image) with “green”, as shown in the image below:

(6) To save your changes, press the “Esc” key, then type “:wq” (note the colon before “wq”) and press Enter to save and exit.

(7) Enter the following command and press Enter to start the color recognition functionality:

sudo python3 color_recognition.py

6.2 Color Recognition

6.2.1 Program Logic

For humans, it is easy to distinguish different colors. How can robots recognize object colors? For SpiderPi Pro, we can install a camera vision module on it and control it to identify different colors through visual recognition.

The overall implementation process is as follows:

First, program SpiderPi Pro to recognize colors in the Lab color space. Convert the image from RGB to Lab, then binarize it and apply operations such as erosion and dilation to obtain a contour containing only the target color.

Lastly, circle the obtained color contour and control the robot to take action according to the result of color recognition.

6.2.2 Start and Close the Game

Note

Commands are case-sensitive and space-sensitive, so enter them exactly as shown.

(1) Start the SpiderPi Pro robot and connect to the Raspberry Pi desktop remotely via VNC.

(2) Click the icon in the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.

(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.

cd spiderpi/functions

(4) Enter the command, then press “Enter” to start the game.

python3 color_detect.py

(5) To exit the game, press “Ctrl+C” in the LX terminal. If the exit fails, try a few more times.

6.2.3 Project Outcome

Note

The default recognition color is red. If you want to change it to blue or green, please refer to “6.2.5 Function Extension -> Change the Default Recognition Color”.

Place the red ball in front of SpiderPi Pro’s camera, and the robot will nod when it recognizes the ball. It will shake its head when it detects a green or blue ball.

6.2.4 Program Analysis

The source code of this program is located at: /home/pi/spiderpi/functions/color_detect.py

  • Import Function Libraries

import sys
import cv2
import math
import time
import threading
import numpy as np
from common import misc
from common import yaml_handle
from calibration.camera import Camera
from sensor.ultrasonic_sensor import Ultrasonic

  • Image Processing

(1) Gaussian Filtering

Before the image is converted from RGB to the LAB color space, it is denoised with Gaussian filtering, using the GaussianBlur() function in the cv2 library.

frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)

The meaning of the parameters in the brackets is as follows:

The first parameter frame_resize is the input image;

The second parameter (3, 3) is the size of the Gaussian kernel;

The third parameter 3 is the standard deviation of the Gaussian kernel in the X direction. The larger this value, the more weight neighboring pixels receive and the stronger the smoothing; the smaller the value, the weaker the smoothing.

(2) Binarization Processing

The inRange() function in the cv2 library is used to perform binarization processing on the image.

frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # threshold the Lab image to obtain a binary mask

The first parameter in the brackets is the input image.

The second and third parameters are the lower and upper limits of the threshold, respectively. When the Lab value of a pixel lies between the lower and upper limits, the pixel is set to 255 (white); otherwise it is set to 0 (black).

(3) Erosion and Dilation

To reduce interference and make the image smoother, erosion and dilation are performed on the image.

eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate

The erode() function performs erosion. Take eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))) as an example.

The meaning of the parameters in the brackets is as follows.

The first parameter frame_mask is the input image.

The second parameter cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)) is the structuring element (kernel), which determines the nature of the operation. Its first argument is the kernel shape, and its second argument is the kernel size.

The dilate() function performs dilation. Its parameters have the same meaning as those of erode().

(4) Acquire the Maximum Contour

After processing the image, the contour of the target to be recognized is obtained with the findContours() function in the cv2 library.

contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours

The first parameter in parentheses is the input image;

The second parameter is the retrieval mode of the contour;

The third parameter is the approximation method of the contour.

Next, the contour with the maximum area is found among the obtained contours. To avoid interference, a minimum area is set, and the target contour is valid only when its area is larger than this value.

if max_area > 100:  # the maximum area has been found
    ((centerX, centerY), radius) = cv2.minEnclosingCircle(areaMaxContour_max)  # obtain the minimum enclosing circle
    centerX = int(misc.map(centerX, 0, size[0], 0, img_w))
    centerY = int(misc.map(centerY, 0, size[1], 0, img_h))
    radius = int(misc.map(radius, 0, size[0], 0, img_w))
    cv2.circle(img, (centerX, centerY), radius, range_rgb[color_area_max], 2)  # draw circle
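The calls to misc.map() convert coordinates measured on the resized image back to the original frame. The source of misc.map() is not shown here; the sketch below assumes it is a plain linear rescaling. The name map_range and both resolutions are hypothetical.

```python
def map_range(x, in_min, in_max, out_min, out_max):
    """Linearly rescale x from [in_min, in_max] to [out_min, out_max]."""
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min

size = (320, 240)        # assumed processing resolution (width, height)
img_w, img_h = 640, 480  # assumed original frame resolution

center_x = int(map_range(160, 0, size[0], 0, img_w))
center_y = int(map_range(60, 0, size[1], 0, img_h))
print(center_x, center_y)  # 320 120
```

Processing a smaller image is faster, and this rescaling lets the circle be drawn at the right place on the full-size frame.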

  • Feedback Information

After the contour with the maximum area is obtained, the circle() function in the cv2 library is called to circle the recognized target. The color of the circle matches the color of the object.

cv2.circle(img, (centerX, centerY), radius, range_rgb[color_area_max], 2)  # draw circle

To improve the accuracy of the recognition result, the program does not act on a single frame. Each frame's result is recorded as a number (1 for red, 2 for green, 3 for blue, 0 otherwise), and once three results have been collected, their rounded average decides the detected color.

        if color_area_max == 'red':  # red is the maximum
            color = 1
        elif color_area_max == 'green':  # green is the maximum
            color = 2
        elif color_area_max == 'blue':  # blue is the maximum
            color = 3
        else:
            color = 0
        color_list.append(color)

        if len(color_list) == 3:  # multiple judgements
            # get mean
            color = int(round(np.mean(np.array(color_list))))
            color_list = []
            if color == 1:
                detect_color = 'red'
                draw_color = range_rgb["red"]
            elif color == 2:
                detect_color = 'green'
                draw_color = range_rgb["green"]
            elif color == 3:
                detect_color = 'blue'
                draw_color = range_rgb["blue"]
            else:
                detect_color = 'None'
                draw_color = range_rgb["black"]
    else:
        detect_color = 'None'
        draw_color = range_rgb["black"]

cv2.putText(img, "Color: " + detect_color, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, draw_color, 2)
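The averaging step can be tried in isolation with the sketch below. It mirrors the logic of the code above using only built-in functions; the names COLOR_NAMES and decide are hypothetical.

```python
COLOR_NAMES = {1: 'red', 2: 'green', 3: 'blue'}

def decide(codes):
    """Average per-frame color codes and map the rounded mean to a name."""
    mean_code = int(round(sum(codes) / len(codes)))
    return COLOR_NAMES.get(mean_code, 'None')

print(decide([1, 1, 1]))  # red
print(decide([1, 1, 3]))  # green
```

Three consistent readings give the expected color. Because the codes are averaged rather than counted, mixed readings such as two reds and one blue can round to a code that was never observed, so steady lighting and a clean background help keep the readings consistent.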

After the judgment is completed, the color of the recognized target is printed on the feedback image using the putText() function in the cv2 library.

cv2.putText(img, "Color: " + detect_color, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, draw_color, 2)

The meaning of the parameters is as follows.

The first parameter img is the image to draw on.

The second parameter "Color: " + detect_color is the text to display.

The third parameter (10, img.shape[0] - 10) is the position of the bottom-left corner of the text.

The fourth parameter cv2.FONT_HERSHEY_SIMPLEX is the font type.

The fifth parameter 0.65 is the font scale.

The sixth parameter draw_color is the color of the text.

The seventh parameter 2 is the line thickness of the text.

  • Main Function Analysis

In a Python program, the block under if __name__ == '__main__': is the main function of the program. First, the function init() is called to initialize. Initialization in this program includes returning the servo to its initial position and reading the color threshold file. In general, configuration of ports, peripherals, timing interrupts, and so on is also done during initialization.

if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board

(1) Read the Camera Image

while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()

When the game starts, the image is stored in “img”.

(2) Enter Image Processing

When the captured image is read, the run() function is called to process the image.

frame = img.copy()
Frame = run(frame)
cv2.imshow('Frame', Frame)
key = cv2.waitKey(1)
if key == 27:
    break

① The function img.copy() is used to copy the content of img to frame.

② The function run() performs image processing.

if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board

    board = Board()
    ultrasonic = Ultrasonic()

    debug = False
    if debug:
        print('Debug Mode')

    init()
    start()
    camera = Camera()
    camera.camera_open(correction=True)  # enable distortion correction (disabled by default)

    while True:
        img = camera.frame
        if img is not None:
            frame = img.copy()
            Frame = run(frame)
            cv2.imshow('Frame', Frame)
            key = cv2.waitKey(1)
            if key == 27:
                break
        else:
            time.sleep(0.01)
    camera.camera_close()
    cv2.destroyAllWindows()

  • Subthread Analysis

The move() function of SpiderPi Pro runs as a subthread. It loops continuously, checks the result of image processing, and executes the matching feedback: when red is detected, servo 1 swings back and forth, and when green or blue is detected, servo 2 swings back and forth. After the action finishes, the detected color is reset to 'None'.

def move():
    global draw_color
    global detect_color
    global action_finish

    while True:
        if debug:
            return
        if __isRunning:
            if detect_color != 'None':
                action_finish = False
                if detect_color == 'red':
                    board.pwm_servo_set_position(0.2, [[1, 1200]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[1, 1800]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[1, 1200]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[1, 1800]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[1, 1500]])
                    time.sleep(0.1)
                    detect_color = 'None'
                    draw_color = range_rgb["black"]
                    time.sleep(1)
                elif detect_color == 'green' or detect_color == 'blue':
                    board.pwm_servo_set_position(0.2, [[2, 1200]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[2, 1800]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[2, 1200]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[2, 1800]])
                    time.sleep(0.2)
                    board.pwm_servo_set_position(0.2, [[2, 1500]])
                    time.sleep(0.1)
                    detect_color = 'None'
                    draw_color = range_rgb["black"]
                    time.sleep(1)
                else:
                    time.sleep(0.01)
                action_finish = True
                detect_color = 'None'
            else:
                time.sleep(0.01)
        else:
            time.sleep(0.01)
6.2.5 Function Extensions

  • Change the Default Recognition Color

The color recognition program has three built-in colors: red, green, and blue. By default, the robot nods when it recognizes red.

The following steps use changing the default recognition color to green as an example.

(1) Enter the command and press “Enter” to navigate to the directory where the game programs are stored.

cd spiderpi/functions

(2) Enter the command and press “Enter” to open the program file.

vim color_detect.py

(3) Locate the code shown below:

Note

In vim, you can type a line number and press “Shift+G” to jump to that line. This note only introduces the quick jump method, so any line numbers given are for reference only; check the actual position in your file.

(4) Press “i” to enter the editing mode, then change “red” in if detect_color == 'red': to “green”, and change “green” in elif detect_color == 'green' or detect_color == 'blue': to “red”. You can switch to blue in the same way.

After modification, press “Esc”, input “:wq”, and then press “Enter” to save the file and exit the editor.

:wq

(5) After the modification is completed, you can follow the steps in “6.2.2 Start and Close the Game” to check the game performance.

  • Add New Recognition Colors

In addition to the built-in recognition colors, you can set other recognition colors in the program. Take orange as an example.

(1) Open VNC and enter the command to open the Lab color setting file.

vim spiderpi/config/lab_config.yaml

Note

It is recommended to screenshot the initial value for recording.

(2) Double-click the icon of the debugging tool on the system desktop. If a prompt box pops up, choose “Execute”.

Click the “Connect” button. When the interface displays the image returned by the camera, the connection is successful. Select “red” in the drop-down box.

(3) Point the camera at the color to be recognized. Drag the L, A, and B sliders until the object to be recognized turns white in the left screen and other areas turn black.

For example, to recognize orange, put the orange ball within the camera’s field of view. Adjust the L, A, and B sliders until the orange part in the left screen turns white and other colors turn black, and then click the “Save” button to keep the modified data.

(4) After the modification is completed, check whether the modified data was successfully written. Enter the command “vim spiderpi/config/lab_config.yaml” again to open the Lab color setting file.

vim spiderpi/config/lab_config.yaml

Note

To avoid affecting the performance of the other games, it is recommended to use LAB_Tool to restore the initial values after you finish.

(5) Once you have confirmed that the modified data was written to the configuration file, press “Esc”, input “:wq”, and then press “Enter” to save and exit.

:wq

(6) Following the steps in “6.2.5 Function Extensions -> Change the Default Recognition Color”, set the default recognition color to red (the orange thresholds were saved under “red” in step 2).

(7) Start the game again and put the orange object in front of the camera. SpiderPi Pro will nod when it recognizes the color. To add another recognition color, follow the same steps.

6.3 Target Position Recognition

In this lesson, the camera will be used to recognize red, green, and blue balls. The detected balls will be highlighted in the live feed, and their XY coordinates will be displayed.

6.3.1 Brief Analysis of the Task

The implementation of target position recognition can be divided into two parts: color recognition and position marking.

First, for the color recognition part, Gaussian filtering is applied to the image for noise reduction. The image is then converted to the Lab color space (for more details on the Lab color space, please refer to the “OpenCV Vision Basic Course”).

Next, color thresholding is used to identify the color of objects within the circle. The image is then masked (masking involves using a selected image, shape, or object to globally or locally occlude the processed image).

After performing morphological operations (opening and closing) on the object’s image, the largest contour is outlined with a circle.

Opening operation: The image is eroded first and then dilated. This operation is used to remove small objects, smooth shape boundaries, and preserve the overall area. It helps remove small noise particles and separate objects that are connected.

Closing operation: The image is dilated first and then eroded. This operation is used to fill small holes within the objects, connect adjacent objects, and reconnect broken contour lines while smoothing the boundaries without changing the area.

Position marking requires specific detection algorithms. The basic principle is to search for areas in the image that match predefined features or patterns, then return the position and bounding box of these areas.
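As a rough illustration of position marking, the sketch below takes a small binary mask and returns the center and bounding box of its white region. It only shows the basic principle in plain Python; the function name locate and the mask are hypothetical, and it is not the program's own method.

```python
def locate(mask):
    """Return (center_x, center_y, box) of the nonzero pixels, or None if the mask is empty."""
    points = [(x, y) for y, row in enumerate(mask) for x, value in enumerate(row) if value]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    box = (min(xs), min(ys), max(xs), max(ys))   # left, top, right, bottom
    center_x = sum(xs) / len(xs)
    center_y = sum(ys) / len(ys)
    return center_x, center_y, box

mask = [
    [0, 0, 0, 0, 0],
    [0, 255, 255, 255, 0],
    [0, 255, 255, 255, 0],
    [0, 0, 0, 0, 0],
]
print(locate(mask))  # (2.0, 1.5, (1, 1, 3, 2))
```

The returned center is the kind of XY coordinate that can be drawn next to the highlighted object in the live feed.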

6.3.2 Start and Close the Game

Note

The input of commands must strictly distinguish between uppercase and lowercase letters, as well as spaces. Additionally, you can use the “Tab” key on the keyboard to auto-complete keywords.

(1) Power on the device and, following the instructions in “Remote Desktop Installation and Connection\3.1 VNC Installation and Connection”, use the VNC remote connection tool to connect.

(2) Click the icon in the top left corner of the system desktop or press the shortcut “Ctrl+Alt+T” to open the LX terminal.

(3) In the terminal, enter the command to navigate to the directory where the program is located, then press Enter:

cd spiderpi/functions

(4) Enter the command and press Enter to start the program:

python3 color_position_recognition.py

(5) To close the program, simply press “Ctrl+C” in the LX terminal. If it does not close, press it multiple times.

6.3.3 Program Outcome

The program defaults to recognizing red, green, and blue balls. After recognition, it will highlight the objects in the transmitted image and display their XY coordinates.

Note

  • During the recognition process, ensure the environment is well-lit to avoid inaccurate recognition due to lighting issues.

  • Ensure there are no objects with similar or identical colors to the target colors within the camera’s field of view to prevent misrecognition.

  • If color recognition is inaccurate, refer to the section “6.3.5 Function Extension -> Adjusting Color Threshold” in this document to adjust the color threshold settings.

6.3.4 Program Description

The source code for this program is located at: /home/pi/spiderpi/functions/color_position_recognition.py

  • Importing Libraries

import sys
import cv2
import math
import time
import threading
import numpy as np
from common import misc
from common import yaml_handle
from calibration.camera import Camera
from sensor.ultrasonic_sensor import Ultrasonic

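(1) Import the necessary libraries, including OpenCV (cv2), math, time, threading, and NumPy, together with the project's own modules for utility functions (misc), configuration reading (yaml_handle), the camera, and the ultrasonic sensor.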

To call a function from a library, use the format LibraryName.FunctionName(Parameters). For example:

time.sleep(0.01)

This calls the sleep function from the time library, which pauses execution for the given number of seconds.

Python ships with standard libraries such as time and math, which can be imported directly. Third-party packages such as cv2 (OpenCV) and numpy must be installed before they can be imported. You can also create your own libraries, such as the “yaml_handle” file reading library.

(2) Importing Names Directly

Module paths can be long and hard to remember. To make calls more convenient, a class can be imported directly from its module. For example:

from calibration.camera import Camera

After this import, the class is available under its short name and can be instantiated with Camera(). The Board class is handled the same way: once it is imported and instantiated as board = Board(), its functions are called as:

board.FunctionName(Parameters)

This makes calling functions much easier.

  • Main Function Analysis

In a Python program, the if __name__ == '__main__': block is the program's entry point. The program starts by opening the camera and reading the video stream. Each frame is read from the camera, the run() function searches for and marks the colored ball in it, and the result is displayed. Frames are displayed in a loop, and once the loop ends, the program releases its resources.

if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board

    board = Board()
    ultrasonic = Ultrasonic()

    load_config()
    init_move()
    reset()
    camera = Camera()
    camera.camera_open(correction=True)  # enable distortion correction (disabled by default)

(1) Capturing Camera Image

camera = Camera()

When the program starts, the camera is initialized.

(2) Image Processing

① The run() function handles image processing.

Frame = run(frame)

def run(img):
    global draw_color
    global color_list
    global detect_color
    global action_finish

    img_copy = img.copy()
    img_h, img_w = img.shape[:2]

    frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)
    frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)
    frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space

② Resize the image to make it easier to process.

frame_resize = cv2.resize(img_copy, size, interpolation=cv2.INTER_NEAREST)

The first parameter img_copy is the input image.

The second parameter size is the size of the output image, which can be set as needed.

The third parameter interpolation=cv2.INTER_NEAREST is the interpolation method.

Options include:

INTER_NEAREST: Nearest-neighbor interpolation.

INTER_LINEAR: Bilinear interpolation (default if no other method is specified).

INTER_CUBIC: Bicubic interpolation in a 4x4 pixel neighborhood.

INTER_LANCZOS4: Lanczos interpolation in an 8x8 pixel neighborhood.

③ Apply Gaussian Blur to reduce noise

Gaussian blur is a linear smoothing filter used to eliminate Gaussian noise and is widely used in image denoising.

frame_gb = cv2.GaussianBlur(frame_resize, (3, 3), 3)

The first parameter frame_resize is the input image.

The second parameter (3, 3) is the size of the Gaussian kernel.

The third parameter 3 is the standard deviation of the Gaussian kernel in the X-direction.
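
To make the kernel size and standard deviation more concrete, here is a minimal pure-Python sketch that computes the normalized weights of a one-dimensional Gaussian kernel. The helper name gaussian_kernel_1d is hypothetical and is not part of the program or of OpenCV; the two-dimensional (3, 3) kernel is the outer product of these weights with themselves.

```python
import math

def gaussian_kernel_1d(ksize, sigma):
    """Return ksize normalized Gaussian weights centered on the middle tap."""
    center = (ksize - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(ksize)]
    total = sum(weights)
    # Normalize so the weights sum to 1 and the blur keeps overall brightness.
    return [w / total for w in weights]

kernel = gaussian_kernel_1d(3, 3)
print([round(w, 3) for w in kernel])  # [0.327, 0.346, 0.327]
```

With a standard deviation of 3 on a kernel only 3 pixels wide, the three weights are nearly equal, so the filter behaves almost like a simple average of neighboring pixels.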

④ Convert the image to LAB color space.

frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # convert the image to LAB space

The first parameter frame_gb is the input image.

The second parameter cv2.COLOR_BGR2LAB specifies the conversion from BGR to LAB format. To convert to RGB, use cv2.COLOR_BGR2RGB.
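
To get a feel for the numbers that the Lab thresholds refer to, the following pure-Python sketch approximates the 8-bit BGR to Lab mapping for a single pixel. It assumes sRGB input, a D65 white point, and the 8-bit scaling in which L is stretched to 0..255 and 128 is added to a and b; the function name bgr_to_lab_8bit is hypothetical and the result may differ from cv2.cvtColor() by a rounding step.

```python
def bgr_to_lab_8bit(b, g, r):
    """Approximate the 8-bit BGR -> Lab mapping for one pixel (illustrative only)."""
    def linearize(c):
        # Undo the sRGB transfer curve.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear RGB -> XYZ, with X and Z normalized by the D65 white point.
    x = (0.412453 * rl + 0.357580 * gl + 0.180423 * bl) / 0.950456
    y = 0.212671 * rl + 0.715160 * gl + 0.072169 * bl
    z = (0.019334 * rl + 0.119193 * gl + 0.950227 * bl) / 1.088754

    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116

    lightness = 116 * f(y) - 16 if y > 0.008856 else 903.3 * y
    a_lab = 500 * (f(x) - f(y))
    b_lab = 200 * (f(y) - f(z))
    # 8-bit scaling: L mapped to 0..255, a and b shifted by 128.
    return round(lightness * 255 / 100), round(a_lab + 128), round(b_lab + 128)

lab_red = bgr_to_lab_8bit(0, 0, 255)        # pure red in BGR order
lab_white = bgr_to_lab_8bit(255, 255, 255)
print(lab_red, lab_white)  # (136, 208, 195) (255, 128, 128)
```

A strongly red pixel therefore has a high A value, while neutral colors sit near 128 on both the A and B channels, which is why the red thresholds mainly constrain the A channel.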

⑤ Convert the image to a binary image that contains only two values, 0 (black) and 255 (white), which simplifies the image and reduces the data to process.

The cv2.inRange() function is used for binarization:

frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # build a mask of pixels inside the threshold range

The first parameter frame_lab is the input image.

The second parameter (lab_data[i]['min'][0], lab_data[i]['min'][1], lab_data[i]['min'][2]) is the lower threshold for the color.

The third parameter (lab_data[i]['max'][0], lab_data[i]['max'][1], lab_data[i]['max'][2]) is the upper threshold for the color.
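
The per-pixel rule behind this thresholding can be sketched in pure Python as follows. The helper in_range and the threshold values are hypothetical and only illustrate the idea; the real thresholds are read from the configuration file.

```python
def in_range(pixel, lower, upper):
    """Return 255 if every channel lies within [lower, upper] (inclusive), else 0."""
    inside = all(lo <= v <= hi for v, lo, hi in zip(pixel, lower, upper))
    return 255 if inside else 0

# Hypothetical Lab thresholds for "red", for illustration only.
red_min, red_max = (0, 150, 130), (255, 255, 255)

# One row of Lab pixels: a red pixel, a white pixel, and a bluish pixel.
row = [(136, 208, 195), (255, 128, 128), (90, 120, 60)]
mask_row = [in_range(p, red_min, red_max) for p in row]
print(mask_row)  # [255, 0, 0]
```

Only the first pixel has all three channels inside the range, so it is the only one that becomes white in the mask.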

⑥ Perform erosion and dilation to smooth the image and reduce interference.

Erosion reduces the size of foreground objects and eliminates small objects, while dilation increases the size of foreground objects and fills small holes.

eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate
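
The effect of eroding and then dilating with a 3x3 rectangular kernel can be illustrated on a tiny binary mask in pure Python. The functions below are a simplified sketch written for this example (they are not the OpenCV implementation), and pixels outside the image are simply ignored.

```python
def morph(mask, pick):
    """Apply a 3x3 min (erosion) or max (dilation) filter to a 0/1 mask."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = pick(
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            )
    return out

def erode(mask):
    return morph(mask, min)   # a pixel survives only if its whole neighborhood is 1

def dilate(mask):
    return morph(mask, max)   # a pixel is set if any neighbor is 1

mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],   # the lone 1 at column 5 is noise
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
opened = dilate(erode(mask))
for row in opened:
    print(row)
```

After the two steps the isolated noise pixel is gone, while the 3x3 block is restored to its original size.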

⑦ Find the contour with the largest area

After the image processing steps, use the cv2.findContours() function to find contours:

contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours

The first parameter dilated is the input image.

The second parameter cv2.RETR_EXTERNAL specifies the contour retrieval mode.

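The third parameter cv2.CHAIN_APPROX_NONE specifies the contour approximation method; with this setting every point of the contour is stored. The trailing [-2] is not part of the parameter: it selects the list of contours from the function's return value, which works whether the installed OpenCV version returns two or three values.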

The program searches for the largest contour and sets a threshold area to ensure the detected contour is valid.

        areaMaxContour, area_max = get_area_max_contour(contours)  # find the largest contour
        if areaMaxContour is not None:
            if area_max > max_area:  # keep the largest area found so far
                max_area = area_max
                color_area_max = i
                areaMaxContour_max = areaMaxContour
if max_area > 100:  # a sufficiently large area has been found
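
The body of get_area_max_contour() is not shown in this section, so the following is only a hypothetical pure-Python stand-in that illustrates the idea: compute the area of each contour, keep the largest one, and reject it if it does not exceed a minimum area. Contours are represented here as plain lists of (x, y) points, and the area is computed with the shoelace formula.

```python
def polygon_area(points):
    """Area of a closed polygon given as a list of (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def largest_contour(contours, min_area=0):
    """Return (contour, area) for the largest contour above min_area, else (None, 0)."""
    best, best_area = None, 0
    for contour in contours:
        area = polygon_area(contour)
        if area > best_area and area > min_area:
            best, best_area = contour, area
    return best, best_area

noise = [(0, 0), (2, 0), (0, 2)]                      # tiny triangle, area 2
ball = [(10, 10), (30, 10), (30, 30), (10, 30)]       # 20 x 20 square, area 400
contour, area = largest_contour([noise, ball], min_area=100)
print(area)  # 400.0
```

If only the small triangle were present, the function would return (None, 0), which corresponds to the case where no valid target is detected.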

⑧ Extract the position information

Use cv2.putText() to draw text on the image:

cv2.putText(img, "Color: " + detect_color, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, draw_color, 2)

The first parameter img is the input image.

The second parameter "Color: " + detect_color is the text to display (e.g., the detected color).

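The third parameter (10, img.shape[0] - 10) is the position of the bottom-left corner of the text, here 10 pixels in from the left and bottom edges of the image. A position such as (centerX, centerY - 20) instead places the text just above the detected object.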

The fourth parameter cv2.FONT_HERSHEY_SIMPLEX specifies the font type.

The fifth parameter 0.65 is the scaling factor for the font size.

The sixth parameter draw_color is the color of the text.

The seventh parameter 2 specifies the thickness of the text line.

(3) Displaying the Return Image

while True:
    img = camera.frame
    if img is not None:
        frame = img.copy()
        Frame = run(frame)
        cv2.imshow('Frame', Frame)
        key = cv2.waitKey(1)
        if key == 27:
            break

The cv2.imshow() function is used to display the image in a window. The first parameter is the window name (e.g., ‘Frame’), and the second parameter is the image to display.

The cv2.waitKey() function waits for a key press; the parameter 1 is the wait time in milliseconds. If the returned key code is 27 (the Esc key), the loop exits.

6.3.5 Function Extension

  • Adjusting Color Threshold

During the game experience, if the color recognition of objects is not accurate, you may need to adjust the color threshold. This section uses adjusting the red color as an example; the process for adjusting other colors is similar. Follow the steps below:

(1) Double-click the system desktop icon and click “Execute” in the pop-up window.

(2) Once the interface opens, click “Connect.”

(3) After a successful connection, select “red” from the color options in the bottom-right corner of the interface.

(4) If the transmitted image does not appear in the pop-up window, it indicates the camera is not connected properly. Check the camera connection cable to ensure it is securely connected.

The image on the right side of the interface shows the real-time transmitted video, and the left side shows the color to be captured.

Point the camera at the red color block, and then adjust the six sliders at the bottom to ensure that the red color block on the left side of the screen turns completely white, while other areas remain black.

Finally, click the “Save” button to save the data.

6.4 Target Tracking

6.4.1 Program logic

First, SpiderPi Pro recognizes colors in the Lab color space. The camera image is converted to Lab and binarized, and then erosion and dilation are applied to obtain a contour that contains only the target color. The contour is then circled.

After color recognition, the X and Y coordinates of the image center are used as the PID setpoints, and the X and Y coordinates of the target are fed in as the measured values to update the PID controllers.

Finally, the PID outputs, which reflect the position of the target relative to the image center, are used to drive SpiderPi Pro so that it follows the target, achieving color tracking.
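
The PID update described above can be sketched in pure Python. The class below is a hypothetical minimal controller that only mirrors the attribute names used by the program (SetPoint, update(), output); it is not the PID class shipped in common.pid, and the gain values are placeholders.

```python
class SimplePID:
    """Minimal PID controller sketch (hypothetical, for illustration only)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.SetPoint = 0.0      # desired value, e.g. the image center
        self.output = 0.0
        self._integral = 0.0
        self._last_error = 0.0

    def update(self, feedback):
        """Recompute the output from the latest measured value."""
        error = self.SetPoint - feedback
        self._integral += error
        derivative = error - self._last_error
        self._last_error = error
        self.output = self.kp * error + self.ki * self._integral + self.kd * derivative

x_pid = SimplePID(kp=0.5, ki=0.0, kd=0.0)   # placeholder gains
x_pid.SetPoint = 640 / 2                     # center of a 640-pixel-wide image
x_pid.update(420)                            # target seen 100 pixels right of center
print(x_pid.output)  # -50.0
```

The sign of the output tells the program in which direction to adjust the servo, and the output shrinks to zero as the target approaches the image center.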

6.4.2 Operation steps

Note

Commands are case-sensitive and space-sensitive, so enter them exactly as shown.

(1) Power on SpiderPi Pro, then connect remotely to the Raspberry Pi desktop through VNC.

(2) Click the icon in the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.

(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.

cd spiderpi/functions

(4) Enter the command, then press “Enter” to start the game.

python3 color_track.py

(5) To exit the program, press “Ctrl+C” in the LX terminal. If it does not exit, press it a few more times.

6.4.3 Project outcome

Note

The default recognition and tracking color is green. To change it to blue, refer to “6.4.5 Function Extension ->Modify Default Recognition Color”. Do not move the ball too fast or out of the camera's field of view.

After the game starts, move the green ball slowly, and the robotic arm of SpiderPi Pro will follow it.

6.4.4 Program Analysis

The source code of this program is located at: /home/pi/spiderpi/functions/color_track.py

  • Import Function Library

import sys
import cv2
import math
import time
import numpy as np
from common import misc
from common.pid import PID
from common import yaml_handle
from calibration.camera import Camera
from calibration.CalibrationConfig import *
from sensor.ultrasonic_sensor import Ultrasonic
import arm_ik.arm_move_ik as AMK

(1) Gaussian filtering

Before converting the image from BGR to the LAB color space, denoise it with the GaussianBlur() function in the cv2 library.

frame_gb = cv2.GaussianBlur(frame_resize, (5, 5), 5)

The parameters are as follows:

The first parameter frame_resize is the input image.

The second parameter (5, 5) is the size of the Gaussian kernel.

The third parameter 5 is the standard deviation of the Gaussian kernel in the X direction. The larger the value, the more weight neighboring pixels receive and the stronger the blur; the smaller the value, the more the result is dominated by the center pixel.

(2) Binarization

Use the inRange() function in the cv2 library to binarize the image.

frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # build a mask of pixels inside the threshold range

The first parameter is the input image. The second and third parameters are the lower and upper limits of the threshold. When all three Lab channel values of a pixel lie between the lower and upper limits, the pixel is set to 255 (white); otherwise it is set to 0 (black).

(3) Erosion and dilation

To reduce interference and make the image smoother, erosion and dilation are applied to the image.

eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate

The erode() function performs erosion. Take eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))) as an example. The parameters are as follows:

The first parameter frame_mask is the input image.

The second parameter cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)) is the structuring element (kernel) that determines the nature of the operation. Its first argument is the kernel shape and its second argument is the kernel size.

The dilate() function performs dilation. Its parameters have the same meaning as those of erode().

(4) Acquire the maximum contour

After processing the image, obtain the contours of the target to be recognized with the findContours() function in the cv2 library.

contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # find contours

The first parameter is the input image, the second is the contour retrieval mode, and the third is the contour approximation method.

Among the contours found, the one with the largest area is selected. To avoid interference, a minimum area is set; the target contour is considered valid only when its area exceeds this value.

if area_max > 50:  # a sufficiently large area has been found
    (centerX, centerY), radius = cv2.minEnclosingCircle(areaMaxContour)  # obtain the minimum enclosing circle
    centerX = int(misc.map(centerX, 0, size[0], 0, img_w))
    centerY = int(misc.map(centerY, 0, size[1], 0, img_h))
    radius = int(misc.map(radius, 0, size[0], 0, img_w))
    cv2.circle(img, (int(centerX), int(centerY)), int(radius), range_rgb[detect_color], 2)
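
The misc.map() calls in this listing rescale coordinates measured on the resized processing image back to the original image. The implementation of misc.map() is not shown here, so the sketch below assumes it is a plain linear mapping; the function name linear_map and the two resolutions are assumptions made for illustration.

```python
def linear_map(x, in_min, in_max, out_min, out_max):
    """Linearly rescale x from [in_min, in_max] to [out_min, out_max]."""
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min

size = (320, 240)          # assumed processing resolution
img_w, img_h = 640, 480    # assumed camera resolution

# A circle found at (160, 60) with radius 25 on the small image...
center_x = int(linear_map(160, 0, size[0], 0, img_w))
center_y = int(linear_map(60, 0, size[1], 0, img_h))
radius = int(linear_map(25, 0, size[0], 0, img_w))
print(center_x, center_y, radius)  # 320 120 50
```

Processing a smaller image is faster, and mapping the result back lets the circle be drawn at the right place on the full-size frame.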

  • Feedback Information

After the contour with the largest area is obtained, call the minEnclosingCircle() function in the cv2 library to obtain the minimum enclosing circle of the target contour.

(centerX, centerY), radius = cv2.minEnclosingCircle(areaMaxContour)  # obtain the minimum enclosing circle

Then circle the recognized target with the circle() function in the cv2 library.

cv2.circle(img, (int(centerX), int(centerY)), int(radius), range_rgb[detect_color], 2)

  • Drive the servo

The X and Y coordinates of the image center are used as the PID setpoints, and the X and Y coordinates of the recognized target are used as the measured values to update the PID controllers.

# use_time = 0
x_pid.SetPoint = img_w/2  # setpoint
x_pid.update(centerX)  # current value
dx = int(x_pid.output)
# use_time = abs(dx*0.00025)
x_dis += dx  # output

x_dis = 0 if x_dis < 0 else x_dis
x_dis = 1000 if x_dis > 1000 else x_dis

y_pid.SetPoint = img_h/2
y_pid.update(centerY)
dy = int(y_pid.output)
# use_time = round(max(use_time, abs(dy*0.00025)), 5)
y_dis += dy

y_dis = 0 if y_dis < 0 else y_dis
y_dis = 1000 if y_dis > 1000 else y_dis

if not debug:
    board.bus_servo_set_position(0.02, [[24, y_dis], [21, x_dis]])
    time.sleep(0.02)

The servos are driven to the designated positions by calling the bus_servo_set_position() function of the Board library.

if not debug:
    board.bus_servo_set_position(0.02, [[24, y_dis], [21, x_dis]])
    time.sleep(0.02)

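Take bus_servo_set_position(0.02, [[24, y_dis], [21, x_dis]]) as an example.

The parameters are as follows:

The first parameter 0.02 is the rotation time, here taken to be in seconds (it matches the time.sleep(0.02) call that follows).

The second parameter [[24, y_dis], [21, x_dis]] is a list of [servo ID, position] pairs.

In each pair, the first value (24 or 21) is the ID of the servo to be driven, and the second value (y_dis or x_dis) is the position to rotate to, which the code above limits to the range 0 to 1000.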
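
How the two servo positions are accumulated, limited, and packed into the list argument can be sketched without any hardware. The helper names clamp and build_servo_command are hypothetical; only the servo IDs 24 and 21 and the 0 to 1000 limits are taken from the listing above.

```python
def clamp(value, low=0, high=1000):
    """Keep a servo position inside its valid range."""
    return max(low, min(high, value))

def build_servo_command(x_dis, y_dis, dx, dy):
    """Apply the PID corrections and return the new positions plus the pair list."""
    x_dis = clamp(x_dis + dx)
    y_dis = clamp(y_dis + dy)
    # Same layout as in the listing: servo 24 takes y_dis, servo 21 takes x_dis.
    return x_dis, y_dis, [[24, y_dis], [21, x_dis]]

x_dis, y_dis, positions = build_servo_command(500, 980, dx=-50, dy=40)
print(positions)  # [[24, 1000], [21, 450]]
```

The vertical correction would have pushed the position to 1020, so it is held at the upper limit of 1000 before the command is sent.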

6.4.5 Function extension

  • Modify Default Recognized Color

The color tracking program has two built-in colors, green and blue, and the robotic arm follows a target of the selected color.

The following steps change the default recognition color to blue.

(1) Enter the command and press “Enter” to go to the directory where the game programs are stored.

cd spiderpi/functions

(2) Enter the command and press “Enter” to open the program file.

vim color_track.py

(3) Locate the code shown below:

Note

In vim, type the line number and then press “Shift+G” to jump directly to that line. This step is meant to introduce the quick jump method, so any line numbers given are for reference only; check the actual position in your file.

(4) Press “i” to enter editing mode, then change “green” in “__target_color = ('green',)” to “blue”.

(5) After the modification, press “Esc”, type “:wq”, and press Enter to save and exit.

:wq

  • Add New Recognition Color

Note

For better game performance, please do not add red as a recognition color.

In addition to the built-in recognition colors, you can set other recognition colors in the program. Take orange as an example.

(1) Open VNC and enter the command to open the Lab color configuration file.

vim spiderpi/config/lab_config.yaml

Note

It is recommended to screenshot the initial value for recording.

(2) Double-click the debugging tool icon on the system desktop. If a prompt box pops up, choose “Execute”.

(3) Click the “Connect” button. When the interface displays the image returned by the camera, the connection is successful. Select “green” in the drop-down box.

Point the camera at the color to be recognized. Drag the L, A, and B sliders until the target color on the left screen turns white and all other areas turn black.

For example, to recognize orange, place the orange ball within the camera's field of view. Adjust the L, A, and B sliders until the orange part on the left screen turns white and all other colors turn black, then click the “Save” button to keep the modified data.

(4) After the modification is completed, check whether the modified data was written successfully. Enter the command “vim spiderpi/config/lab_config.yaml” again to open the Lab color configuration file.

vim spiderpi/config/lab_config.yaml

Note

To avoid affecting game performance, it is recommended to use the LAB_Tool tool to restore the initial values after the modification.

(5) Once you have confirmed that the modified data was written to the configuration file, press “Esc”, type “:wq”, and press Enter to exit.

:wq

(6) Following the steps in “6.4.5 Function Extension ->Modify Default Recognition Color”, set the default recognition color to green.

(7) Start the game again and place the orange object in front of the camera. SpiderPi Pro will nod when it recognizes the color. To add another recognition color, repeat the steps above.
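
The program reads its thresholds as lab_data[color]['min'] and lab_data[color]['max'], each holding three values for L, A, and B. As a minimal sketch, the hypothetical helper below checks a dictionary with that layout for malformed entries after editing; all threshold values shown are placeholders, not the values shipped with the robot.

```python
def find_bad_thresholds(lab_data):
    """Return the names of colors whose min/max thresholds are malformed."""
    bad = []
    for name, limits in lab_data.items():
        lower, upper = limits['min'], limits['max']
        ok = (
            len(lower) == 3 and len(upper) == 3
            and all(0 <= lo <= hi <= 255 for lo, hi in zip(lower, upper))
        )
        if not ok:
            bad.append(name)
    return bad

# Placeholder values for illustration only.
lab_data = {
    'green': {'min': [0, 0, 140], 'max': [255, 110, 255]},
    'blue': {'min': [0, 120, 0], 'max': [255, 255, 100]},
    'broken': {'min': [0, 200, 0], 'max': [255, 100, 255]},  # A minimum above A maximum
}
print(find_bad_thresholds(lab_data))  # ['broken']
```

An entry whose minimum exceeds its maximum can never match any pixel, so a check like this catches a slider mistake before the game is started.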

6.5 Line Following

6.5.1 Program Logic

Line following is common in robot competitions, where it is usually implemented with a two-channel or four-channel line-following sensor. SpiderPi Pro instead recognizes the color of the line with its vision module and processes the image with vision algorithms to follow the line.

First, SpiderPi Pro recognizes colors in the Lab color space. The camera image is converted to Lab and binarized, and then erosion and dilation are applied to obtain a contour that contains only the target color. The contour is then outlined.

After color recognition, the position of the line in the image is used as feedback to control SpiderPi Pro so that it moves along the line.

6.5.2 Operation Steps

Note

Commands are case-sensitive and space-sensitive, so enter them exactly as shown.

(1) Power on SpiderPi Pro, then connect remotely to the Raspberry Pi desktop through VNC.

(2) Click the icon in the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.

(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.

cd spiderpi/functions

(4) Enter the command, then press “Enter” to start the game.

python3 visual_patrol.py

(5) To exit the program, press “Ctrl+C” in the LX terminal. If it does not exit, press it a few more times.

6.5.3 Project Outcome

Note

The default recognition color is red. If you want to change it to white or black, please refer to “6.5.5 Function Extension -> Modify Default Recognition Color”.

Paste red electrical tape to form a path. Then place SpiderPi Pro on the red line. After the game starts, the robot will move along the red line.

6.5.4 Program Analysis

The source code of this program is located at: /home/pi/spiderpi/functions/visual_patrol.py

  • Import Function Library

import sys
import cv2
import time
import math
import threading
import numpy as np
from common import yaml_handle
from calibration.camera import Camera
from calibration.CalibrationConfig import *
from common import kinematics
from sensor.ultrasonic_sensor import Ultrasonic
import arm_ik.arm_move_ik as AMK

(1) Import the libraries for OpenCV, time, math, and threading.

To call a function from a library, use the format LibraryName.FunctionName(Parameters). For example:

time.sleep(0.01)

This calls the sleep function in the time library, which is used to add a delay. Python ships with standard libraries such as time and math, which can be imported directly. Third-party packages such as cv2 (OpenCV) must be installed first, and you can also write your own library, such as yaml_handle.

(2) Import Names Directly

Module paths can be long and hard to remember. To make calls easier, a class can be imported directly from its module. For example:

from calibration.camera import Camera

After this import, the class is available under its short name, for example Camera(). The Board class is handled the same way: once it is imported and instantiated as board = Board(), its functions are called as board.FunctionName(Parameters).

  • Define Global Variable

if sys.version_info.major == 2:
    print('Please run this program with python3!')
    sys.exit(0)

lab_data = None
servo_data = None
def load_config():
    global lab_data, servo_data

    lab_data = yaml_handle.get_yaml_data(yaml_handle.lab_file_path)

load_config()

__target_color = ('red',)
# set target color
def setLineTargetColor(target_color):
    global __target_color

    __target_color = target_color
    return (True, ())

  • Main Function Analysis

In a Python program, the if __name__ == '__main__': block is the program's entry point. First, the function “init()” is called to perform initialization, which in this program includes returning the servos to their initial positions and reading the color threshold file. Initialization generally also covers the configuration of ports, peripherals, timer interrupts, and so on.

if __name__ == '__main__':
    from common.ros_robot_controller_sdk import Board
    from sensor.ultrasonic_sensor import Ultrasonic

    board = Board()
    ik = kinematics.IK(board)  # instantiate the inverse kinematics library
    ultrasonic = Ultrasonic()
    ak = AMK.ArmIK()

(1) Read the Captured Image

while True:
    img = camera.frame
    if img is not None:

When the game starts, the latest camera frame is stored in img.

(2) Enter Image Processing

Once a captured image has been read, the run() function is called to process it.

if img is not None:
    frame = img.copy()
    frame = cv2.remap(frame, mapx, mapy, cv2.INTER_LINEAR)  # distortion correction
    Frame = run(frame)
    cv2.imshow('Frame', Frame)
    key = cv2.waitKey(1)
    if key == 27:
        break

The function img.copy() is used to copy the content of img to frame.

(3) Gaussian filtering

Before converting the image from BGR to the LAB color space, denoise it with the GaussianBlur() function in the cv2 library.

frame_gb = cv2.GaussianBlur(img, (3, 3), 3)

The parameters are as follows:

The first parameter img is the input image.

The second parameter (3, 3) is the size of the Gaussian kernel.

The third parameter 3 is the standard deviation of the Gaussian kernel in the X direction. The larger the value, the more weight neighboring pixels receive and the stronger the blur; the smaller the value, the more the result is dominated by the center pixel.

(4) Binarization

Use the inRange() function in the cv2 library to binarize the image.

frame_mask = cv2.inRange(frame_lab,
                         (lab_data[i]['min'][0],
                          lab_data[i]['min'][1],
                          lab_data[i]['min'][2]),
                         (lab_data[i]['max'][0],
                          lab_data[i]['max'][1],
                          lab_data[i]['max'][2]))  # build a mask of pixels inside the threshold range

The first parameter is the input image. The second and third parameters are the lower and upper limits of the threshold. When all three Lab channel values of a pixel lie between the lower and upper limits, the pixel is set to 255 (white); otherwise it is set to 0 (black).

(5) Erosion and dilation

To reduce interference and make the image smoother, erosion and dilation are applied to the image.

eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # erode
dilated = cv2.dilate(eroded, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))  # dilate

The erode() function performs erosion. Take eroded = cv2.erode(frame_mask, cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))) as an example.

The parameters are as follows:

The first parameter frame_mask is the input image.

The second parameter cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)) is the structuring element (kernel) that determines the nature of the operation. Its first argument is the kernel shape and its second argument is the kernel size. The dilate() function performs dilation, and its parameters have the same meaning as those of erode().

(6) Acquire the maximum contour

After processing the image, obtain the contours of the target to be recognized with the findContours() function in the cv2 library.

cnts = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_TC89_L1)[-2]  # find all contours

The first parameter is the input image, the second is the contour retrieval mode, and the third is the contour approximation method.

cnt_large, area = get_area_maxContour(cnts)  # find the largest contour
if area > 10:
    rect = cv2.minAreaRect(cnt_large)  # the minimum bounding rectangle

    box = np.intp(cv2.boxPoints(rect))  # the four corner points of the minimum bounding rectangle
    for j in range(4):
        box[j, 1] = box[j, 1] + r[0]

    cv2.drawContours(img, [box], -1, (0, 255, 255), 2)  # draw the rectangle formed by the four points

(7) Obtain location

The minAreaRect() function in the cv2 library returns the minimum-area bounding rectangle of the target contour, and the boxPoints() function returns the coordinates of its four vertices. The center of the rectangle is then computed as the midpoint of two diagonally opposite vertices.

box = np.intp(cv2.boxPoints(rect))  # the four corner points of the minimum bounding rectangle
for j in range(4):
    box[j, 1] = box[j, 1] + r[0]

cv2.drawContours(img, [box], -1, (0, 255, 255), 2)  # draw the rectangle formed by the four points

# obtain the diagonal points of the rectangle
pt1_x, pt1_y = box[0, 0], box[0, 1]
pt3_x, pt3_y = box[2, 0], box[2, 1]
line_center_x, line_center_y = (pt1_x + pt3_x) / 2, (pt1_y + pt3_y) / 2  # center point
cv2.circle(img, (int(line_center_x), int(line_center_y)), 5, (0, 0, 255), -1)  # draw the center point
line_center = line_center_x
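
The center calculation in this listing relies on the fact that the two diagonals of a rectangle cross at its center, even when the rectangle is rotated. A minimal pure-Python sketch, using a hypothetical helper rect_center and made-up corner coordinates in the same order as the four box points:

```python
def rect_center(box):
    """Center of a rectangle as the midpoint of corners 0 and 2 (a diagonal)."""
    (x1, y1), (x3, y3) = box[0], box[2]
    return (x1 + x3) / 2, (y1 + y3) / 2

# Axis-aligned rectangle: corners listed in order around the shape.
upright = [(100, 200), (100, 150), (180, 150), (180, 200)]
# Rotated rectangle (a diamond): the diagonal midpoint is still the center.
rotated = [(50, 100), (100, 50), (150, 100), (100, 150)]

print(rect_center(upright))  # (140.0, 175.0)
print(rect_center(rotated))  # (100.0, 100.0)
```

Only the X coordinate of this center is kept as line_center, because the robot steers according to how far the line is to the left or right of the image center.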

(8) Line following

After the image is processed, SpiderPi Pro is moved by calling functions of the kinematics.IK library.

    if line_center >= 0:
        if abs(line_center - img_center_x) < 60:
            ik.go_forward(ik.initial_pos, 2, 60, 50, 1)
        elif line_center - img_center_x >= 60:
            ik.turn_right(ik.initial_pos, 2, 30, 50, 1)
        else:
            ik.turn_left(ik.initial_pos, 2, 30, 50, 1)
        last_line_center = line_center

    elif line_center == -1:
        if last_line_center >= img_center_x:
            ik.turn_left(ik.initial_pos, 2, 30, 50, 1)
        else:
            ik.turn_right(ik.initial_pos, 2, 30, 50, 1)
else:
    time.sleep(0.01)
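
The branching in this listing can be restated as a small pure function that returns the name of the action instead of moving the robot, which makes the decision rule easy to test on a desk. The function name choose_action is hypothetical; the 60-pixel tolerance and the branch order mirror the listing above.

```python
def choose_action(line_center, img_center_x, last_line_center, tolerance=60):
    """Return the movement the listing would trigger for the given line position."""
    if line_center >= 0:
        offset = line_center - img_center_x
        if abs(offset) < tolerance:
            return 'go_forward'      # line is close enough to the image center
        if offset >= tolerance:
            return 'turn_right'      # line is to the right of center
        return 'turn_left'           # line is to the left of center
    # line_center == -1: no line detected, fall back to the last known position
    if last_line_center >= img_center_x:
        return 'turn_left'
    return 'turn_right'

print(choose_action(330, 320, 0))    # go_forward
print(choose_action(400, 320, 0))    # turn_right
print(choose_action(-1, 320, 100))   # turn_right
```

Returning a name rather than calling the ik functions directly keeps the sketch runnable without the robot; on the robot each name corresponds to one of the ik calls in the listing.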

The functions used to control the SpiderPi Pro’s movement are listed below.

Function                                         Usage
ik.go_forward(ik.initial_pos, 2, 50, 80, 1)      Move straight forward 50 mm
ik.back(ik.initial_pos, 2, 100, 80, 1)           Move straight backward 100 mm
ik.turn_left(ik.initial_pos, 2, 30, 100, 1)      Turn left on the spot by 30 degrees
ik.turn_right(ik.initial_pos, 2, 30, 100, 1)     Turn right on the spot by 30 degrees
ik.left_move(ik.initial_pos, 2, 100, 100, 1)     Move left 100 mm
ik.right_move(ik.initial_pos, 2, 100, 100, 1)    Move right 100 mm

Take ik.go_forward(ik.initial_pos, 2, 50, 80, 1) as an example. The parameters are as follows:

The first parameter ik.initial_pos is the posture.

The second parameter 2 is the mode; 2 is spider mode.

The third parameter 50 is the stride, in mm when moving straight and in degrees when turning.

The fourth parameter 80 is the speed in mm/s.

The fifth parameter 1 is the number of times the action is executed. A value of 0 means the action is repeated in a loop.

6.5.5 Function Extension

  • Modify Default Recognition Color

The program has three built-in colors: red, black, and white. The steps below change the default recognition color to white as an example.

(1) Enter the command and press Enter to go to the directory where the game programs are stored.

cd spiderpi/functions

(2) Enter the command and press Enter to open the program file.

vim visual_patrol.py

(3) Locate the code shown below:

Note

Enter the line number and then press “Shift+G” to jump directly to that line. This note only introduces the quick-jump method, so the line numbers given are for reference only; check the actual position in your file.

(4) Press the “i” key to enter editing mode, then change “red” in “__target_color = (‘red’,)” to “white”. You can change it to “black” instead if you prefer.

(5) After modification, press “Esc” key and input “:wq” and then press Enter to save and exit.

  • Add New Recognition Color

In addition to the three built-in colors, you can add other colors to the program. Take blue as an example.

(1) Open VNC and enter the command to open the Lab color setting file.

vim spiderpi/config/lab_config.yaml

Note

It is recommended to take a screenshot of the initial values so that they can be restored later.

(2) Double-click the debugging tool icon on the system desktop. If a prompt box pops up, choose “Execute”.

(3) Click the “Connect” button. When the interface shows the camera image, the connection is successful. Select “red” in the drop-down box.

(4) Point the camera at the color to be recognized. Drag the L, A, and B sliders until the target color area in the left screen turns white and all other areas turn black.

For example, to make the robot recognize blue, place the blue line within the camera’s field of view. Adjust the L, A, and B sliders until the blue part in the left screen turns white and everything else turns black, then click the “Save” button to store the new values. Because “red” is selected in the drop-down box, the blue thresholds are saved under the name “red”.

Note

To avoid affecting the performance of other games, it is recommended to use the “LAB_Tool” tool to restore the initial values once you have finished.

(5) After the modification, check whether the new values were written successfully. Enter the command again to open the Lab color setting file.

vim spiderpi/config/lab_config.yaml

(6) Confirm that the modified data has been written to the configuration file. Then press “Esc”, input “:wq”, and press Enter to save and exit.

:wq

(7) Following the steps in “6.5.5 Function Extension -> Modify Default Recognition Color”, set the default recognition color to red.

(8) Start the line following game again by following the steps in “6.5.2 Operation Steps”. SpiderPi Pro will now move along the blue line.
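
What the L, A, and B sliders in step (4) adjust can be pictured as a per-channel range test: a pixel is shown white when all three of its Lab values fall between the chosen minimum and maximum. The sketch below is illustrative only. The names in_lab_range and make_mask are hypothetical, and the threshold numbers are made up for this example; they are not values from lab_config.yaml.

```python
def in_lab_range(pixel, lower, upper):
    """True if every channel of an (L, A, B) pixel lies within [lower, upper]."""
    return all(lo <= value <= hi for value, lo, hi in zip(pixel, lower, upper))


def make_mask(pixels, lower, upper):
    """Map each pixel to 255 (white, kept) or 0 (black, rejected)."""
    return [255 if in_lab_range(p, lower, upper) else 0 for p in pixels]


# Hypothetical thresholds for a bluish color (8-bit Lab values)
blue_min, blue_max = (0, 120, 0), (255, 160, 110)

# One row of three made-up Lab pixels; the middle one is a neutral gray
row = [(90, 140, 80), (200, 128, 128), (60, 150, 100)]
mask = make_mask(row, blue_min, blue_max)
print(mask)
```

Narrowing a range turns more pixels black, and widening it turns more pixels white, which is the effect seen in the left screen of the tool while dragging a slider.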

6.6 Tag Detection

6.6.1 Brief Game Description

When the robot detects a tag, the buzzer emits a sound, and the feedback image is returned.

AprilTag, a visual fiducial marker, is similar to a QR code or barcode. It can be used to quickly detect markers and calculate relative positions, meeting real-time requirements. It is widely used in various applications such as augmented reality (AR), robotics, and camera calibration. Currently, AprilTags can be printed using a standard printer, and their detection programs can calculate precise 3D position, orientation, and ID relative to the camera.

In this lesson, we will combine OpenCV with AprilTag to complete a small project for detecting AprilTag markers. When the camera detects the tag, the robot’s onboard buzzer will sound as a prompt, and the feedback image will be displayed.

6.6.2 Start and Close the Game

Note

The input of commands must strictly distinguish between uppercase and lowercase letters, as well as spaces.

(1) Power on the device and, following the instructions in “Remote Desktop Installation and Connection\3.1 VNC Installation and Connection”, use the VNC remote connection tool to connect.

(2) Click the icon in the top left corner of the system desktop or press the shortcut “Ctrl+Alt+T” to open the LX terminal.

(3) In the terminal, enter the command to navigate to the directory where the program is located, then press Enter:

cd spiderpi/functions

(4) Enter the command and press Enter to start the program:

python3 apriltag_recognition.py

(5) To close the program, simply press “Ctrl+C” in the LX terminal. If it does not close, press it multiple times.

6.6.3 Program Outcome

Note

For optimal tag detection, place the tag against a solid-colored or white background. Dark backgrounds (e.g., black) may interfere with tag recognition.

Once the game is activated, position the included AprilTag tag in front of the camera. When the robot detects the tag, the buzzer will sound as a prompt. The feedback image will display the captured tag, outline it, and show the tag’s tag_id and tag_family information.

6.6.4 Program Parameter Explanation

The source code for this program is located at: /home/pi/spiderpi/functions/apriltag_recognition.py

(1) Image Acquisition and Processing

The first step is image processing, which involves working with digital image data. We begin by importing the necessary packages.

 4import sys
 5import time
 6import cv2
 7import numpy as np
 8from common import yaml_handle
 9from calibration.camera import Camera 
10import common.apriltag as apriltag
11from common.ros_robot_controller_sdk import Board
12from sensor.ultrasonic_sensor import Ultrasonic

Next, we initialize and start the camera to acquire the image, then proceed to copy, remap, and display the image.

 95    while True:
 96        img = camera.frame
 97        if img is not None:
 98            frame = img.copy()
 99            Frame = run(frame)          
100            cv2.imshow('Frame', Frame)

Afterward, we need to convert the image from BGR format to grayscale. The corresponding code is as follows:

54    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

(2) Tag Detection

Once the image has been processed, we need to detect the tag. This is done by using the tag library to detect the tag in the acquired image. The code implementation is as follows:

51# 检测apriltag(detect apriltag)
52detector = apriltag.Detector(searchpath=apriltag._get_demo_searchpath())

After detection, the program will obtain the four corner points of the tag.

59            corners = np.rint(detection.corners)  # 获取四个角点(obtain the four corner points)

Next, we need to draw the contours of the tag. In OpenCV, we use the cv2.drawContours function to accomplish this. The program code is as follows:

62            cv2.drawContours(img, [np.array(corners, np.intp)], -1, (0, 255, 255), 2)

This function takes five parameters, each with the following meanings:

img: The image to be processed.

[np.array(corners, np.intp)]: The contour points, converted to integers and wrapped in a list.

-1: The contour index. -1 indicates that all contours should be drawn.

(0, 255, 255): The color of the contour.

2: The thickness of the contour line.

(3) Retrieving Tag Information

The program uses the AprilTag library to perform encoding and decoding to retrieve the tag’s information. Depending on the encoding method, different inner point coordinates are generated.

Once the quadrilateral is identified, the grid coordinates are clarified. To verify the reliability of the encoding, the tag must be matched against a known encoding library.

62            tag_family = str(detection.tag_family, encoding='utf-8')  # 获取tag_family(obtain tag_family)
63            tag_id = int(detection.tag_id)  # 获取tag_id(obtain tag_id)
64            
65            return tag_family, tag_id

6.7 Tag Recognition

6.7.1 Program Logic

AprilTag is a visual fiducial marker similar to a QR code or barcode. It allows the marker to be detected quickly and its position to be calculated. It is mainly used in AR, robotics, camera calibration, and similar applications.

First, the AprilTag is detected through positioning, image segmentation, and contour searching. Once the contour is located, the corner point information is obtained, and the four corner points are connected with straight lines to form a closed loop.

The detected tag is then decoded. Finally, SpiderPi Pro is controlled to execute the action that corresponds to the tag ID.

6.7.2 Operation Steps

Note

Commands are case-sensitive and space-sensitive.

(1) Boot up SpiderPi Pro, then remotely connect to the Raspberry Pi desktop through VNC.

(2) Click the icon in the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.

(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.

cd spiderpi/functions

(4) Enter the command, then press “Enter” to start the game.

python3 apriltag_detect.py

(5) To exit the game program, press “Ctrl+C” in the LX terminal. If the program does not exit, press it a few more times.

6.7.3 Project Outcome

Note

  • Please run this game against a solid-color or white background. A dark background such as black will affect tag recognition.

  • Please keep the tag intact and clean, because dirt and wrinkles affect recognition.

When a tag is recognized, the robot executes the corresponding action. In addition, the tag is marked with a yellow box, and the tag ID and family are printed on the camera image.

The actions corresponding to the different tag IDs are listed below.

Tag ID Action
1 wave hands
2 mark time
3 twist

6.7.4 Program Analysis

The source code of the program is located in: /home/pi/spiderpi/functions/apriltag_detect.py

  • Import Function Library

 4import sys
 5import math
 6import threading
 7import time
 8import cv2
 9import numpy as np
10from common import yaml_handle
11from calibration.camera import Camera 
12from calibration.CalibrationConfig import *
13from common import kinematics
14import common.apriltag as apriltag

(1) Import the libraries related to OpenCV, time, math, and threads. To call a function in a library, use the form “library name.function name(parameter, parameter)”. For example:

199            time.sleep(0.01)

This calls the sleep() function in the time library, which delays execution. Libraries such as time and math are built into Python and can be imported directly; cv2 (OpenCV) is a third-party library that must be installed before it can be imported. You can also write your own library, such as yaml_handle.

(2) Import Names from a Library

The full module path is long and hard to memorize. To make calls easier, a class can be imported by name. For example:

11from calibration.camera import Camera 

After this import, you can refer to the class directly as Camera, without the calibration.camera prefix.

  • Main Function Analysis

The block beginning with if __name__ == '__main__': is the main entry point of the program. First, the init() function is called to initialize the robot. Initialization in this program includes returning the servos to their initial positions and reading the color threshold file. Initialization generally also covers the configuration of ports, peripherals, timer interrupts, and so on.

159if __name__ == '__main__':
160    from common.ros_robot_controller_sdk import Board
161    from sensor.ultrasonic_sensor import Ultrasonic
162    from common.action_group_controller import ActionGroupController
163    import arm_ik.arm_move_ik as AMK
164
165
166    board = Board()
167    ik = kinematics.IK(board)  # 实例化逆运动学库(instantiate inverse kinematics library)
168    ultrasonic = Ultrasonic()
169    agc = ActionGroupController(board)
170    ak = AMK.ArmIK()

  • Obtain Corner Point Information

detection.corners contains the four corner points of the tag; np.rint() rounds their coordinates to the nearest integers.

116# 检测apriltag(detect apriltag)
117detector = apriltag.Detector(searchpath=apriltag._get_demo_searchpath())
118def apriltagDetect(img):   
119    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
120    detections = detector.detect(gray, return_image=False)
121
122    if len(detections) != 0:
123        for detection in detections:                       
124            corners = np.rint(detection.corners)  # 获取四个角点(obtain the four corner points)
125            cv2.drawContours(img, [np.array(corners, np.int64)], -1, (0, 255, 255), 2)
126
127            tag_family = str(detection.tag_family, encoding='utf-8')  # 获取tag_family(obtain tag_family)
128            tag_id = int(detection.tag_id)  # 获取tag_id(obtain tag_id)
129
130            object_center_x, object_center_y = int(detection.center[0]), int(detection.center[1])  # 中心点(center point)
131            
132            object_angle = int(math.degrees(math.atan2(corners[0][1] - corners[1][1], corners[0][0] - corners[1][0])))  # 计算旋转角(calculate rotation angle)
133            
134            return tag_family, tag_id
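
The rotation-angle line in the listing above applies atan2 to the difference between the first two corner points. A minimal sketch of that calculation follows, using made-up corner coordinates; edge_angle_deg is a hypothetical helper name. As in the listing, int() truncates toward zero rather than rounding.

```python
import math


def edge_angle_deg(corners):
    """Angle of the vector from corner 1 to corner 0, in whole degrees."""
    (x0, y0), (x1, y1) = corners[0], corners[1]
    return int(math.degrees(math.atan2(y0 - y1, x0 - x1)))


# Hypothetical pixel coordinates of a tag whose first edge is horizontal
level = [(200, 200), (100, 200), (100, 100), (200, 100)]
angle = edge_angle_deg(level)
print(angle)
```

Because only corners 0 and 1 are used, the result describes the orientation of one edge of the tag in the image plane, not its full 3D pose.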

  • Tag Detection

(1) After the corner points of the tag are obtained, outline the tag by calling the drawContours() function in the cv2 library.

125            cv2.drawContours(img, [np.array(corners, np.int64)], -1, (0, 255, 255), 2)

The parameters in the parentheses have the following meanings.

The first parameter, img, is the input image.

The second parameter, [np.array(corners, np.int64)], is the list of contours to draw, passed as a Python list.

The third parameter, -1, is the index of the contour to draw. A value of -1 means that all contours in the list are drawn.

The fourth parameter, (0, 255, 255), is the color of the contour. The values correspond to B, G, and R respectively, so the color here is yellow.

The fifth parameter, 2, is the line thickness of the contour.

(2) Obtain the type of the tag (tag_family) and ID (tag_id)

127            tag_family = str(detection.tag_family, encoding='utf-8')  # 获取tag_family(obtain tag_family)
128            tag_id = int(detection.tag_id)  # 获取tag_id(obtain tag_id)

(3) Call the putText() function in the cv2 library to print the tag ID and family on the camera image.

150    if tag_id is not None:
151        cv2.putText(img, "tag_id: " + str(tag_id), (10, img.shape[0] - 30), cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)
152        cv2.putText(img, "tag_family: " + tag_family, (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)
153    else:
154        cv2.putText(img, "tag_id: None", (10, img.shape[0] - 30), cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)
155        cv2.putText(img, "tag_family: None", (10, img.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)

The parameters in the parentheses have the following meanings.

The first parameter, img, is the input image.

The second parameter, "tag_id: " + str(tag_id), is the text to display.

The third parameter, (10, img.shape[0] - 30), is the position of the bottom-left corner of the text.

The fourth parameter, cv2.FONT_HERSHEY_SIMPLEX, is the font type.

The fifth parameter, 0.65, is the font scale.

The sixth parameter, [0, 255, 255], is the color of the text. The values correspond to B, G, and R respectively, so the color here is yellow.

The seventh parameter, 2, is the line thickness of the text.

  • Action Controlling

After the tag ID is obtained, control SpiderPi Pro to execute the corresponding action group by calling the agc.run_action_group() function.

 82    while True:
 83        if debug:
 84            return
 85        if __isRunning:
 86            if tag_id is not None:
 87                action_finish = False
 88                time.sleep(0.5)
 89                if tag_id == 1:               
 90                    agc.run_action_group('wave',lock_servos=LOCK_SERVOS)#招手(wave)
 91                    tag_id = None
 92                    time.sleep(1)                  
 93                    action_finish = True                
 94                elif tag_id == 2:                    
 95                    agc.run_action_group('stepping',lock_servos=LOCK_SERVOS)#原地踏步(stepping)
 96                    tag_id = None
 97                    time.sleep(1)
 98                    action_finish = True          
 99                elif tag_id == 3:                   
100                    agc.run_action_group('twist_l',lock_servos=LOCK_SERVOS)#扭腰(twist)
101                    tag_id = None
102                    time.sleep(1)
103                    action_finish = True
104                else:
105                    action_finish = True
106                    time.sleep(0.01)
107            else:
108               time.sleep(0.01)
109        else:
110            time.sleep(0.01)
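
The if/elif chain in the listing above maps each tag ID to one action group name. The same mapping can be written as a lookup table, which is one possible way to keep additions (such as a new tag ID) in a single place. This is a sketch, not the program's actual structure: TAG_ACTIONS and handle_tag are hypothetical names, and a plain list stands in for agc.run_action_group so the example runs without the robot.

```python
# Tag ID -> action group name, taken from the listing above
TAG_ACTIONS = {
    1: "wave",      # wave hands
    2: "stepping",  # mark time
    3: "twist_l",   # twist
}


def handle_tag(tag_id, run_action_group):
    """Run the action group mapped to tag_id; return its name, or None."""
    action = TAG_ACTIONS.get(tag_id)
    if action is not None:
        run_action_group(action)
    return action


# Stand-in for agc.run_action_group: just record what would be executed
performed = []
handle_tag(2, performed.append)
handle_tag(9, performed.append)  # unknown ID: nothing runs
print(performed)
```

With this layout, supporting another tag would only require one more entry in the dictionary.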

6.7.5 Function Extension

  • Modify Action Corresponding to the Tag

By default, SpiderPi Pro “waves hands” when the tag with ID 1 is detected, but you can change this. For example, the feedback action can be changed to kicking.

(1) Enter the command and press “Enter” to navigate to the directory where the game program is located.

cd spiderpi/functions

(2) Enter the command and press Enter to open the program file.

vim apriltag_detect.py

(3) Locate the code shown below:

Note

Enter the line number and then press “Shift+G” to jump directly to that line. This note only introduces the quick-jump method, so the line numbers given are for reference only; check the actual position in your file.

(4) Press the “i” key to enter editing mode.

Change “wave” in agc.run_action_group('wave', lock_servos=LOCK_SERVOS) to “kick”. To use a different action group, enter its name instead; the available action groups can be found in “/home/pi/spiderpi/action_groups”.

kick

(5) After modification, press “Esc” and input “:wq” and then press Enter to save and exit.

:wq

  • Modify/Add the Tag

You can find the tag images in the “AprilTag Collection” folder, which must be extracted first.

Note

  • There is no need to download materials online. 200 tags are provided in “AprilTag Collection”.

  • Print the tag at a suitable size, neither too large nor too small, so that the robot can recognize it. The tag is outlined in yellow when it is recognized.

  • The recognition background should be white. A dark background will reduce recognition performance.

Take adding the tag with ID 4 as an example, with “Stand at Attention in High Posture” as its action. Follow the steps below.

(1) Following “6.7.5 Function Extension -> Modify Action Corresponding to the Tag”, go to the directory of the game program and open the program file.

(2) Move the cursor to the line “elif tag_id == 3:” (around line 99 in the listing above) and input “5yy” to copy the five lines of this branch.

(3) When the hint “5 lines yanked” appears, the lines have been copied successfully.

(4) Move the cursor to the last line of the copied branch and enter “p” to paste the copied lines below it.

(5) Press the “i” key to enter editing mode. In the pasted lines, change “3” in “elif tag_id == 3:” to “4” and “twist_l” in “agc.run_action_group('twist_l', lock_servos=LOCK_SERVOS)” to “stand_high”, and change the trailing comment to “stand at attention in high posture”. To use a different action group, enter its name instead; the available action groups can be found in “/home/pi/spiderpi/action_groups”.

(6) After modification, press “Esc” key, enter “:wq”, and then press “Enter” to save and exit.

:wq

(7) Find Tag ID4 in folder “AprilTag Collection” and print it directly.

(8) Start the game by following “6.7.2 Operation Steps” and check whether the modification works.

6.8 Face Recognition

6.8.1 Brief Description of the Activity

When no face is detected, the robotic arm rotates left and right to scan the area. Once a face is detected, the claw moves up and down as a greeting.

Face recognition is one of the most widely used applications of artificial intelligence in image recognition, commonly found in scenarios such as smart locks and face unlock on mobile phones.

In this activity, we first train the face recognition model. The system then detects faces by scaling the image. After detection, the coordinates of the recognized face are converted back to the original scale, and the largest face is identified. The recognized face is then outlined with a frame.

Next, the pan-tilt servos are set to rotate left and right to locate the face. Finally, the robot executes the feedback action based on the recognition results.

6.8.2 Start and Close the Game

Note

The input of commands must strictly distinguish between uppercase and lowercase letters.

(1) Power on the device and, following the instructions in “Remote Desktop Installation and Connection\3.1 VNC Installation and Connection”, use the VNC remote connection tool to connect.

(2) Click the icon in the top left corner of the system desktop or press the shortcut “Ctrl+Alt+T” to open the LX terminal.

(3) In the terminal, enter the command to navigate to the directory where the program is located, then press Enter:

cd spiderpi/functions

(4) Enter the command and press Enter to start the program:

python3 face_recongition.py

(5) To close the program, simply press “Ctrl+C” in the LX terminal. If it does not close, press it multiple times.

6.8.3 Program Outcome

Note

For optimal performance, please avoid using this activity under strong lighting conditions, such as direct sunlight or close proximity to incandescent lights, as intense light can affect face recognition accuracy. It is recommended to conduct this activity indoors, with the face positioned within a range of 50 cm to 1 meter from the camera.

Once the activity begins, the camera’s pan-tilt will rotate left and right. If no face is detected, the robotic arm will scan by rotating left and right. Upon detecting a face, the claw will move up and down to greet the user.

6.8.4 Program Brief Analysis

The source code of the program is saved in: /home/pi/spiderpi/functions/face_recongition.py

  • Function Logic

(1) Importing Libraries

At this initialization step, necessary libraries are imported to facilitate future function calls within the program.

 4import sys
 5import cv2
 6import time
 7import sys
 8import threading
 9import mediapipe as mp
10from common import yaml_handle
11from calibration.camera import Camera 
12from common.action_group_controller import ActionGroupController
13from common.ros_robot_controller_sdk import Board
14from calibration.camera import Camera 
15from common import kinematics

(2) Setting Initial State

19debug = False
20iHWSONAR = None
21board = None
22if sys.version_info.major == 2:
23    print('Please run this program with python3!')
24    sys.exit(0)
25 
26# 导入人脸识别模块(import facial recognition module)
27Face = mp.solutions.face_detection
28# 自定义人脸识别方法,最小的人脸检测置信度0.8(Customize face recognition method, and the minimum face detection confidence is 0.8)
29faceDetection = Face.FaceDetection(min_detection_confidence=0.8)
30
31lab_data = None
32servo_data = None

(3) Color Space Conversion

The BGR image is converted to an RGB image.

79 imgRGB = cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB) # 将BGR图像转为RGB图像(convert BGR image to RGB image)

(4) Using Mediapipe Face Model for Recognition.

The system performs face detection and, for each detection whose score is above 0.75, converts the relative bounding box to pixel coordinates and draws a rectangle around the face. The center point and area of the box are then computed. If start_greet is not already set, the buzzer beeps and start_greet is set to True to trigger the action group.

 81if results.detections:  # 如果检测不到人脸那就返回None(If the face is not detected, return None)
 82
 83        for index, detection in enumerate(results.detections):  # 返回人脸索引index(第几张脸),和关键点的坐标信息(Return the face index (which face) and the coordinate information of the keypoints)
 84            scores = list(detection.score)
 85            if scores and scores[0] > 0.75:
 86                
 87                bboxC = detection.location_data.relative_bounding_box  # 设置一个边界框,接收所有的框的xywh及关键点信息(Set a bounding box to receive xywh and keypoint information for all boxes)
 88                
 89                # 将边界框的坐标点,宽,高从比例坐标转换成像素坐标(Convert the coordinates' width and height of the bounding box from proportional coordinates to pixel coordinates)
 90                bbox = (
 91                    int(bboxC.xmin * img_w),
 92                    int(bboxC.ymin * img_h),
 93                    int(bboxC.width * img_w),
 94                    int(bboxC.height * img_h)
 95                )
 96                cv2.rectangle(img, bbox, (0, 255, 0), 2)  # 在每一帧图像上绘制矩形框(draw a rectangle on each frame of the image)
 97                
 98                # 获取识别框的信息, xy为左上角坐标点(Get information about the recognition box, where xy is the coordinates of the upper left corner)
 99                x, y, w, h = bbox
100                center_x = int(x + (w / 2))
101                center_y = int(y + (h / 2))
102                area = int(w * h)
103                if not start_greet: 
104                    board.set_buzzer(2400, 0.1, 0.2, 1)
105                    start_greet = True 
106                    
107            else :
108                start_greet = False
109                
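
The bounding-box arithmetic in the listing above can be tried on its own. The sketch below assumes, as in the listing, that the detector reports the box as fractions of the image width and height; to_pixel_bbox and bbox_center_and_area are hypothetical helper names, and the input numbers are made up.

```python
def to_pixel_bbox(xmin, ymin, width, height, img_w, img_h):
    """Convert a relative bounding box (fractions of the image) to pixel (x, y, w, h)."""
    return (int(xmin * img_w), int(ymin * img_h),
            int(width * img_w), int(height * img_h))


def bbox_center_and_area(bbox):
    """Return (center_x, center_y, area) of a pixel bounding box."""
    x, y, w, h = bbox
    return int(x + w / 2), int(y + h / 2), int(w * h)


# A made-up detection on a 640x480 frame
bbox = to_pixel_bbox(0.25, 0.5, 0.5, 0.25, img_w=640, img_h=480)
center_x, center_y, area = bbox_center_and_area(bbox)
print(bbox, center_x, center_y, area)
```

Here (x, y) is the upper-left corner of the box, so the center is found by adding half the width and half the height.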

(5) Face Recognition

If a face is detected, the Board.setPWMServoPulse function is used to control the servo motor by setting the PWM (Pulse Width Modulation) to perform the waving action.

The first parameter 0.05 is the pulse interval or duration.

The second parameter 3 refers to the pin number connected to the servo.

The third parameter 500 represents the pulse width, which typically corresponds to the servo’s position.

130    while True:
131        img = camera.frame
132        if img is not None:
133            frame = img.copy()
134            Frame = run(frame)           
135            cv2.imshow('Frame', Frame)
136            key = cv2.waitKey(1)
137            if key == 27:
138                break
139        else:
140            time.sleep(0.01)

(6) Display the Transmitted Image

The frame returned by run() is displayed in real time with the imshow() function in the cv2 library. waitKey(1) polls the keyboard, and pressing Esc (key code 27) exits the loop.

133        frame = img.copy()
134        Frame = run(frame)           
135        cv2.imshow('Frame', Frame)
136        key = cv2.waitKey(1)
137        if key == 27:
138            break

When a face is detected, the buzzer makes a sound.

104                    board.set_buzzer(2400, 0.1, 0.2, 1)

6.9 Face Detection

6.9.1 Program Logic

In image recognition, face recognition technology is very popular and is often used in scenarios such as smart door locks and face unlock on mobile phones.

To detect a face, the program first scales the image.

Next, it converts the coordinates of the recognized face back to the coordinates before scaling and marks the target face with a box.

Lastly, it controls SpiderPi Pro to execute the corresponding action. When no face is recognized, the robotic arm rotates left and right to search for one.

6.9.2 Operation Steps

Note

Commands are case-sensitive and space-sensitive.

(1) Boot up SpiderPi Pro, then remotely connect to the Raspberry Pi desktop through VNC.

(2) Click the icon in the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.

(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.

cd spiderpi/functions

(4) Enter the command, then press “Enter” to start the game.

python3 face_detect.py

(5) To exit the game program, press “Ctrl+C” in the LX terminal. If the program does not exit, press it a few more times.

6.9.3 Project Outcome

Note

Strong light affects face detection, so do not play this game under strong light such as direct sunlight or close incandescent lighting. It is recommended to run this game indoors with the face within 1 m of the camera.

After the game starts, the camera rises to a set angle and then rotates left and right to search for a face. When a face is recognized, the robotic arm stops rotating and SpiderPi Pro “waves”.

6.9.4 Program Analysis

The source code of this program is located in: /home/pi/spiderpi/functions/face_detect.py

  • Import Function Library

 4import sys
 5import cv2
 6import time
 7import sys
 8import threading
 9import mediapipe as mp
10from common import yaml_handle
11from calibration.camera import Camera 
12from common.action_group_controller import ActionGroupController
13from common.ros_robot_controller_sdk import Board
14from calibration.camera import Camera 
15from common import kinematics

  • Define Global Variables

20debug = False
21iHWSONAR = None
22board = None
23if sys.version_info.major == 2:
24    print('Please run this program with python3!')
25    sys.exit(0)
26 
27# 导入人脸识别模块(import facial recognition module)
28Face = mp.solutions.face_detection
29# 自定义人脸识别方法,最小的人脸检测置信度0.8(Customize face recognition method, and the minimum face detection confidence is 0.8)
30faceDetection = Face.FaceDetection(min_detection_confidence=0.8)
31
32lab_data = None
33servo_data = None
34def load_config():
35    global lab_data, servo_data
36    
37    lab_data = yaml_handle.get_yaml_data(yaml_handle.lab_file_path)
38    servo_data = yaml_handle.get_yaml_data(yaml_handle.servo_file_path)
39
40load_config()

  • Image Processing

(1) Convert color space

Convert the BGR image to an RGB image.

134    imgRGB = cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB) # 将BGR图像转为RGB图像(convert BGR image to RGB image)

The cvtColor() function converts an image from one color space to another. In imgRGB = cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB), the first parameter img_copy is the input image, and the second parameter cv2.COLOR_BGR2RGB is the conversion type, here from BGR to RGB. OpenCV delivers frames in BGR order, while the MediaPipe face detector expects RGB input.

(2) Call face detector

After completing the image processing steps mentioned above, the image is passed to a face detector for further processing.

135    results = faceDetection.process(imgRGB) # 将每一帧图像传给人脸识别模块(transmit the image of each frame to facial recognition module)
136    if results.detections:  # 如果检测不到人脸那就返回None(If the face is not detected, return None)
137
138        for index, detection in enumerate(results.detections):  # 返回人脸索引index(第几张脸),和关键点的坐标信息(Return the face index (which face) and the coordinate information of the keypoints)
139            scores = list(detection.score)
140            if scores and scores[0] > 0.75:

(3) Display transmitted image

The frame returned by run() is displayed in the live camera feed with the imshow() function in the cv2 library. waitKey(1) polls the keyboard, and pressing Esc (key code 27) exits the loop.

182    while True:
183        img = camera.frame
184        if img is not None:
185            frame = img.copy()
186            Frame = run(frame)           
187            cv2.imshow('Frame', Frame)
188            key = cv2.waitKey(1)
189            if key == 27:
190                break

  • Action Controlling

When a face is recognized, call the AGC.run_action() function to make SpiderPi Pro execute the designated action group.

102                AGC.run_action('wave') # 识别到人脸时执行的动作(If the face is detected, execute the action)

When no face is detected, call board.pwm_servo_set_position() to rotate the robotic arm of SpiderPi Pro left and right in search of a face.

111                board.pwm_servo_set_position(0.05, [[2, servo2_pulse]])
112                time.sleep(0.05)

  • Main Function Analysis

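(1) Call the init() function to initialize SpiderPi Pro. The initMove() routine shown below turns off the RGB lights on the ultrasonic module and moves PWM servos 1 and 2 to their initial positions.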

42# 初始位置(initial position)
43def initMove():
44    ultrasonic.setRGBMode(0)
45    ultrasonic.setRGB(1, (0, 0, 0))
46    ultrasonic.setRGB(2, (0, 0, 0)) 
47
48    board.pwm_servo_set_position(0.5, [[1, 1800] , [2, servo_data['servo2']]])

(2) Call the reset() function to reset the state variables and return the servos to their initial positions.

57# 变量重置(reset variables)
58def reset():
59    global d_pulse
60    global start_greet
61    global x_pulse    
62    global action_finish
63
64 
65    start_greet = False
66    action_finish = True
67    x_pulse = 500 
68    initMove()

(3) Call the start() function to start the face detection game.

77def start():
78    global __isRunning
79    __isRunning = True
80    print("FaceDetect Start")

(4) Instantiate the Camera class and call the camera_open() function with correction=True to open the camera with distortion correction enabled.

180    camera = Camera()
181    camera.camera_open(correction=True)

  • Subthread Analysis

A sub-thread runs the move() function, which controls the movement of the pan-tilt servo.

116# 运行子线程(run sub-thread)
117th = threading.Thread(target=move)

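In the move() function, when a face has been detected (start_greet is True), the “wave” action group is run. Otherwise servo 2 is swept back and forth: servo2_pulse is changed by d_pulse every 0.05 s, and the direction is reversed whenever the pulse width goes above 2000 or below 1000.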

 92def move():
 93    global start_greet
 94    global action_finish
 95    global d_pulse, servo2_pulse    
 96    
 97    while True:
 98        if __isRunning:
 99            if start_greet:
100                start_greet = False
101                action_finish = False
102                AGC.run_action('wave') # 识别到人脸时执行的动作(If the face is detected, execute the action)
103
104                action_finish = True
105                time.sleep(0.5)
106            else:
107                if servo2_pulse > 2000 or servo2_pulse < 1000:
108                    d_pulse = -d_pulse
109            
110                servo2_pulse += d_pulse       
111                board.pwm_servo_set_position(0.05, [[2, servo2_pulse]])
112                time.sleep(0.05)
113        else:
114            time.sleep(0.01)

The parameters in the parentheses of board.pwm_servo_set_position(0.05, [[2, servo2_pulse]]) are as follows:

The first parameter 0.05 is the running time of the servo, in seconds.

The second parameter 2 is the servo number, that is, servo 2.

The third parameter servo2_pulse is the pulse width of the servo. In move() it is kept between roughly 1000 and 2000.
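The sweep logic in move() can be tried without the robot. The following is a minimal sketch, assuming a hypothetical helper sweep_positions() that only reproduces the pulse arithmetic of the else branch in move(); it sends nothing to a servo, and the start value and step are made-up example values.

```python
def sweep_positions(start, step, count, low=1000, high=2000):
    """Return the pulse widths produced by `count` iterations of the sweep.

    Hypothetical helper: it mirrors the arithmetic in move() but does not
    talk to the board.
    """
    pulse, d_pulse = start, step
    positions = []
    for _ in range(count):
        # Reverse direction once the previous step went past a limit.
        if pulse > high or pulse < low:
            d_pulse = -d_pulse
        pulse += d_pulse
        positions.append(pulse)
    return positions


positions = sweep_positions(start=1990, step=20, count=4)
print(positions)
```

Because the limit is checked before the step is added, the pulse width can pass a limit by one step before the direction is reversed.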

6.9.5 Function extension

Note

The built-in action group files can be found in the directory “/home/pi/spiderpi/action_groups”.

When a human face is recognized, SpiderPi Pro “waves” by default. The program can be modified so that SpiderPi Pro reacts differently, for example by twisting its body. Follow the steps below to modify it.

(1) Enter the command and press “Enter” to go to the directory where the game programs are stored.

cd spiderpi/functions

(2) Enter command “vim face_detect.py” and press “Enter” to open the program file.

vim face_detect.py

(3) Locate the code shown below:

Note

Enter the line number and then press “Shift+G” to jump directly to that line. The line numbers shown here only illustrate the quick-jump method; check the actual position in your own file.

(4) Press “i” key to enter the editing mode.

(5) Change “wave” in AGC.run_action('wave') to “twist”. To use a different action group, check the available action group names in the directory “/home/pi/spiderpi/action_groups”.

(6) After modification, press the “Esc” key, enter “:wq” and then press “Enter” to save and exit.

:wq

6.10 Auto Obstacle Avoidance

6.10.1 Program Logic

The ultrasonic sensor measures the distance between SpiderPi Pro and the object ahead. The program processes this reading and compares it with the set distance threshold. When the distance is shorter than the threshold, SpiderPi Pro turns to avoid the obstacle; otherwise, it moves forward.

6.10.2 Operation Steps

Note

Commands are case-sensitive and space-sensitive. Enter them exactly as shown.

(1) Boot up SpiderPi Pro, then remotely connect to Raspberry Pi desktop through VNC.

(2) Click the terminal icon at the upper left corner of the desktop, or press “Ctrl+Alt+T”, to open the LX terminal.

(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.

cd spiderpi/functions

(4) Enter the command, then press “Enter” to start the game.

python3 avoidance.py

(5) If you want to exit the game program, press “Ctrl+C” in the LX terminal interface. If the exit fails, please try a few more times.

6.10.3 Project Outcome

Note

The default distance threshold is 40cm. To change it, refer to “6.10.5 Function Extension -> Modify Default Distance Threshold”.

After the game starts, the measured distance is displayed on the camera image. When the distance between SpiderPi Pro and the obstacle is shorter than 25cm, the robot steps back and then turns left. When the distance is at least 25cm but shorter than 40cm, the robot turns left. When the distance is 40cm or longer, the robot moves forward.

6.10.4 Program Analysis

The source code of this program is located at: /home/pi/spiderpi/functions/avoidance.py

  • Import Function Library

 4import os
 5import sys
 6import cv2
 7import time
 8import threading
 9import numpy as np
10import pandas as pd
11from common import yaml_handle
12from common import kinematics
13from calibration.camera import Camera 
14from calibration.CalibrationConfig import *
15from sensor.ultrasonic_sensor import Ultrasonic
16import arm_ik.arm_move_ik as AMK
  • Define Global Variable

19if sys.version_info.major == 2:
20    print('Please run this program with python3!')
21    sys.exit(0)
22
23
24def load_config():
25    global lab_data, servo_data
26    
27    lab_data = yaml_handle.get_yaml_data(yaml_handle.lab_file_path)
28
29load_config()
30
31Threshold = 40.0 # 默认阈值40cm(default threshold is 40cm)
32TextColor = (0, 255, 255)
33TextSize = 12
34
35__isRunning = False
36distance = 0
  • Main Function Analysis

(1) Initialize and Instantiate

117if __name__ == '__main__':
118    from common.ros_robot_controller_sdk import Board
119
120
121    board = Board()
122    ik = kinematics.IK(board)
123    ultrasonic = Ultrasonic()
124    ak = AMK.ArmIK()

① Call init() function to initialize SpiderPi Pro.

135    init()
136    start()
137    camera = Camera()
138    camera.camera_open()

② Call reset() function to reset servo variable.

38def reset():
39    ak.setPitchRangeMoving((0, 15, 30), 0, -90, 100, 1)

③ Instantiate the Camera class and call camera_open() function to open the camera.

137    camera = Camera()
138    camera.camera_open()
  • Distance Ranging

(1) Distance threshold setting

Set a Threshold to determine whether to perform obstacle avoidance. Its unit is cm.

31Threshold = 40.0 # 默认阈值40cm(default threshold is 40cm)

(2) Acquire and process the measured distance

Obtain the distance measured by the ultrasonic sensor by calling the getDistance() function. The reading is divided by 10 before it is compared with Threshold, which is in cm (this suggests that the sensor reports millimeters).

103        distance_ = ultrasonic.getDistance() / 10.0

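Process the obtained data for a more accurate distance: the most recent readings are kept in distance_data, readings that deviate from the mean by more than one standard deviation are discarded, and the remaining readings are averaged.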

103        distance_ = ultrasonic.getDistance() / 10.0
104        distance_data.append(distance_)
105        data = pd.DataFrame(distance_data)
106        data_ = data.copy()
107        u = data_.mean()  # 计算均值(calculate mean)
108        std = data_.std()  # 计算标准差(calculate standard deviation)
109
110        data_c = data[np.abs(data - u) <= std]
111        distance = data_c.mean()[0]
112        if len(distance_data) == 5:
113            distance_data.remove(distance_data[0])
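The filtering step can be sketched without pandas. This is a minimal illustration using only the standard library, not the code from avoidance.py; the function name filter_distance() and the sample readings are assumptions. Like the pandas std() call, statistics.stdev() computes the sample standard deviation.

```python
import statistics


def filter_distance(readings):
    """Average the readings that lie within one standard deviation of the mean."""
    if len(readings) < 2:
        # A standard deviation needs at least two readings; returning the
        # single reading is a choice made for this sketch.
        return readings[0]
    mean = statistics.mean(readings)
    std = statistics.stdev(readings)
    kept = [r for r in readings if abs(r - mean) <= std]
    return statistics.mean(kept)


# Four plausible readings and one outlier, in cm (made-up values).
readings = [30.0, 31.0, 29.0, 30.0, 100.0]
distance = filter_distance(readings)
print("Dist:%.1fcm" % distance)
```

The outlier pulls the plain mean up to 44cm, while the filtered value stays at the level of the four consistent readings.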

(3) Feedback information

By calling the putText() function in the cv2 library, the measured distance is drawn on the camera image.

115        cv2.putText(img, "Dist:%.1fcm" % distance, (30, 480 - 30), cv2.FONT_HERSHEY_SIMPLEX, 1.2, TextColor, 2)

The parameters in the parentheses are as follows:

The first parameter img is the input image.

The second parameter "Dist:%.1fcm" % distance is the text to display.

The third parameter (30, 480 - 30) is the position of the bottom-left corner of the text.

The fourth parameter cv2.FONT_HERSHEY_SIMPLEX is the font type.

The fifth parameter 1.2 is the font scale.

The sixth parameter TextColor is the text color.

The seventh parameter 2 is the line thickness of the text.

  • Action Controlling

Compare the measured distance with the set threshold. SpiderPi Pro will execute the corresponding action according to the result.

78            if 0 < distance < Threshold:
79                while distance < 25: # 小于25cm时后退(back up when the distance is less than 25cm)
80                    ik.back(ik.initial_pos, 2, 80, 50, 1)
81                for i in range(6): # 左转6次,每次15度,一共90度(Turn left 6 times with 15 degrees each time, a total of 90 degrees)
82                    if __isRunning:
83                        ik.turn_left(ik.initial_pos, 2, 50, 50, 1)
84            else: 
85                ik.go_forward(ik.initial_pos, 2, 80, 50, 1)
86        else:
87            time.sleep(0.01)

The actions for each distance range are listed below.

Distance                      Action
0cm < distance < 25cm         move backwards and then turn left
25cm <= distance < 40cm       turn left
distance >= 40cm              move forward
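The decision can be written as a small pure function, which makes the boundaries easy to check. This is a minimal sketch, assuming a hypothetical function choose_action() that returns a description of the action instead of driving the robot; the default values 40.0 and 25.0 are the ones used in the program.

```python
def choose_action(distance, threshold=40.0, back_limit=25.0):
    """Return the action the program takes for a measured distance in cm."""
    if 0 < distance < threshold:
        if distance < back_limit:
            return "move backwards and then turn left"
        return "turn left"
    # Readings of 0 or less also end up here, as in the original condition.
    return "move forward"


for d in (10, 25, 39.9, 40, 0):
    print(d, "->", choose_action(d))
```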

The movement of SpiderPi Pro is controlled by calling functions in the kinematics.IK library. Choose the function from the table below.

Function                                        Usage
ik.back(ik.initial_pos, 2, 80, 50, 1)           move backwards 80mm
ik.turn_left(ik.initial_pos, 2, 15, 50, 1)      turn left 15 degrees on the spot
ik.go_forward(ik.initial_pos, 2, 80, 50, 1)     move forward 80mm

The parameters in the parentheses are as follows:

The first parameter is the posture.

The second parameter is the mode. 2 is Spider mode.

The third parameter is the stride. When the robot moves in a straight line, the unit is mm; when it turns, the unit is degrees.

The fourth parameter is the speed in mm/s.

The fifth parameter is the number of executions. 0 means the action is executed in a continuous loop.

6.10.5 Function Extension

Modify Default Distance Threshold

The default distance threshold is 40cm, and it can be set to a value from 30 to 60. The following steps change it to 50cm as an example.

(1) Enter the command and press “Enter” to go to the directory of the game program.

cd spiderpi/functions

(2) Input the command “vim avoidance.py” and press “Enter” to open the program file.

vim avoidance.py

(3) Locate the code shown below:

Note

Enter the line number and then press “Shift+G” to jump directly to that line. The line numbers shown here only illustrate the quick-jump method; check the actual position in your own file.

(4) Press the “i” key to enter editing mode, and change “40.0” in “Threshold = 40.0” to “50.0”.

(5) After modification, press “Esc” and enter “:wq” and then press “Enter” to save and exit.

:wq

6.11 Shape Recognition under Single Color

6.11.1 Program Logic

First, SpiderPi Pro recognizes colors in the Lab color space. The image is converted from RGB to Lab, and then binarization, dilation, erosion and other operations are performed in sequence to obtain a contour that contains only the target color. The contour is then outlined to complete color recognition.

Next, the shape of the contour is determined and SpiderPi Pro gives the corresponding response.

6.11.2 Operation Steps

Note

When entering commands, pay strict attention to case sensitivity and spaces.

(1) Boot up SpiderPi Pro, then remotely connect to Raspberry Pi desktop through VNC.

(2) Click the terminal icon at the upper left corner of the desktop to open the Terminator.

(3) Enter the command to navigate to the directory where the game program is located and press Enter.

cd spiderpi/advanced

(4) Enter “python3 shape_recognition_plain.py”, and then press “Enter” to start the game.

python3 shape_recognition_plain.py

(5) To quit the game, press “Ctrl+C”. If the game does not quit, try again.

6.11.3 Project Outcome

After the game starts, place a blue object in front of SpiderPi Pro’s camera. When the shape of the object is recognized, the shape name is printed on the terminal and the buzzer beeps: once for a triangle, twice for a rectangle, and three times for a circle.

6.11.4 Program Parameter Description

The source code of this program is located at: /home/pi/spiderpi/advanced/shape_recognition_plain.py

  • Importing Function Libraries

 4import sys
 5import cv2
 6import math
 7import time
 8import signal
 9import threading
10import numpy as np
11from common import yaml_handle
12from calibration.camera import Camera
13from calibration.CalibrationConfig import *
14from common import kinematics
15from common.ros_robot_controller_sdk import Board
16from common.action_group_controller import ActionGroupController
17import arm_ik.arm_move_ik as AMK
18from sensor.ultrasonic_sensor import Ultrasonic
19import sensor.dot_matrix_sensor as DMS

(1) Import the libraries related to OpenCV, time, math, and threads. To call a function in a library, use the form “library name.function name(parameter, parameter)”. For example:

78            time.sleep(3)

This calls the sleep() function in the “time” library, which is used to delay. Libraries such as “time” and “math” are part of the Python standard library, and “cv2” (OpenCV) is an installed third-party library, so all of them can be imported directly. You can also write your own library, such as “yaml_handle”.

(2) Instantiating Function Libraries

Full library names are long to type. To make calls easier, a class can be imported by name with “from ... import ...”, or a library can be given a short alias with “import ... as ...”. For example:

14from common import kinematics
15from common.ros_robot_controller_sdk import Board
16from common.action_group_controller import ActionGroupController

After importing, create an instance, for example board = Board(), and then call its functions in the form board.function name(parameter, parameter).
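The import forms used in this program can be compared with standard-library stand-ins. This sketch uses math, statistics and collections.deque only for illustration; they take the place of the robot libraries, which are assumed to be available only on the robot.

```python
import math                      # import a whole library
import statistics as st          # import a library under a short alias
from collections import deque    # import one class by name

# library name.function name(parameter)
print(math.fabs(-3))

# alias.function name(parameter), in the same way as AMK is used in the program
print(st.mean([1, 2, 3]))

# Create an instance of the imported class, in the same way as
# board = Board(), and then call the instance's functions.
window = deque(maxlen=3)
for value in (1, 2, 3, 4):
    window.append(value)
print(list(window))
```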

  • Main Function Analysis

In a Python program, the block under if __name__ == '__main__': acts as the main function. It first performs initialization: the camera calibration parameters are loaded, load_config() reads the color threshold file, init_move() returns the servos to the initial position, and the camera is opened. In general, the configuration of ports, peripherals, timer interrupts and so on is also done during initialization.

148if __name__ == '__main__':
149    #加载参数(load parameter)
150    param_data = np.load(calibration_param_path + '.npz')
151
152    #获取参数(obtain parameter)
153    mtx = param_data['mtx_array']
154    dist = param_data['dist_array']
155    newcameramtx, _ = cv2.getOptimalNewCameraMatrix(mtx, dist, (640, 480), 0, (640, 480))
156    mapx, mapy = cv2.initUndistortRectifyMap(mtx, dist, None, newcameramtx, (640, 480), 5)
157
158    load_config()
159    init_move()
160    
161    camera = Camera()
162    camera.camera_open()
  • Parameters of Color Detection

Shape recognition relies on first detecting the color of the object. The color to detect is set to blue.

123    color = 'blue'

The main detection parameters involved in the process of detecting the color of the object are as follows:

(1) Before converting the image into LAB space, denoise the image and use GaussianBlur() function for Gaussian filtering.

118    frame_gb = cv2.GaussianBlur(img, (3, 3), 3)

The first parameter img is the input image.

The second parameter (3, 3) is the size of the Gaussian kernel. A larger kernel gives stronger filtering, which makes the output image blurrier and increases the amount of computation.

The third parameter 3 is the standard deviation of the Gaussian function in the X direction. It controls how quickly the weights fall off away from the kernel center: a larger value spreads the weights more evenly over the neighboring pixels (more blur), while a smaller value concentrates the weight on the center pixel (less blur).
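To see what the kernel size and the standard deviation mean in numbers, the weights of a one-dimensional Gaussian kernel can be computed by hand. This is a minimal sketch, assuming a hypothetical helper gaussian_kernel_1d() that samples a Gaussian and normalizes the weights so that they sum to 1; it illustrates the idea and is not a call into OpenCV.

```python
import math


def gaussian_kernel_1d(ksize, sigma):
    """Return `ksize` normalized Gaussian weights centered on the middle tap."""
    center = (ksize - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
               for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]


wide = gaussian_kernel_1d(3, 3)      # the size and sigma used in the program
narrow = gaussian_kernel_1d(3, 0.5)  # a much smaller sigma, for comparison
print([round(w, 3) for w in wide])
print([round(w, 3) for w in narrow])
```

With a standard deviation of 3 the three weights are nearly equal, so the filter behaves almost like a plain average; with a small standard deviation most of the weight stays on the center pixel.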

(2) Use the inRange() function to binarize the input image.

124    frame_mask = cv2.inRange(frame_lab,
125                             (lab_data[color]['min'][0],
126                              lab_data[color]['min'][1],
127                              lab_data[color]['min'][2]),
128                             (lab_data[color]['max'][0],
129                              lab_data[color]['max'][1],
130                              lab_data[color]['max'][2]))

(3) To avoid interference and make the image smoother, use cv2.morphologyEx function to process the image.

131    opened = cv2.morphologyEx(frame_mask, cv2.MORPH_OPEN, np.ones((6, 6), np.uint8))
132    closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, np.ones((6, 6), np.uint8))

Take opened = cv2.morphologyEx(frame_mask, cv2.MORPH_OPEN, np.ones((6, 6), np.uint8)) as an example.

The first parameter frame_mask is the input image.

The second parameter is the type of morphological operation. cv2.MORPH_OPEN is the opening operation: erosion is performed first and then dilation, which removes small bright spots. cv2.MORPH_CLOSE is the closing operation: dilation is performed first and then erosion, which fills small dark holes.

The third parameter np.ones((6, 6), np.uint8) is the structuring element (kernel), here a 6×6 square.

(4) Find out the maximum contour of the object.

54# 找出面积最大的轮廓(find the contour with the maximum area)
55def get_area_maxContour(contours):
56    contour_area_temp = 0
57    contour_area_max = 0
58    area_max_contour = None
59    for c in contours:
60        contour_area_temp = math.fabs(cv2.contourArea(c))
61        if contour_area_temp > contour_area_max:
62            contour_area_max = contour_area_temp
63            if contour_area_temp > 50:
64                area_max_contour = c
65    return area_max_contour, contour_area_max

To filter out interference, the condition if contour_area_temp > 50 is used: the contour with the maximum area is valid only when its area is greater than 50.
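The selection logic can be tried without OpenCV. In this minimal sketch, plain lists of (x, y) vertices stand in for contours, and a hypothetical polygon_area() helper based on the shoelace formula stands in for cv2.contourArea(); both names are assumptions made for the illustration.

```python
def polygon_area(points):
    """Area of a closed polygon given as (x, y) vertices (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def largest_contour(contours, min_area=50):
    """Return (contour, area) for the largest contour.

    The contour is None when the largest area does not exceed min_area.
    """
    best, best_area = None, 0.0
    for contour in contours:
        area = polygon_area(contour)
        if area > best_area:
            best_area = area
            if area > min_area:
                best = contour
    return best, best_area


big = [(0, 0), (10, 0), (10, 10), (0, 10)]   # square with area 100
small = [(0, 0), (5, 0), (5, 5), (0, 5)]     # square with area 25
print(largest_contour([small, big]))
print(largest_contour([small]))
```

When only the small square is present, the function still reports its area but returns no contour, which is how small patches of noise are ignored.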

  • Color Recognition Parameters

When the robot recognizes a blue object, the cv2.drawContours() function is used to draw the contour of the object.

136        cv2.drawContours(img, areaMaxContour, -1, (0, 0, 255), 2)

The first parameter img is the input image.

The second parameter areaMaxContour is the contour to draw.

The third parameter -1 is the contour index. A negative value means that all contours in the list are drawn.

The fourth parameter (0, 0, 255) is the color of the contour. OpenCV uses the order B, G, R, so this value is red.

The fifth parameter 2 is the line thickness of the contour.

  • Shape Judgment Parameters

(1) After the contour of the object is drawn, approximate it with a polygon using cv2.approxPolyDP().

138        approx = cv2.approxPolyDP(areaMaxContour, epsilon, True)

The first parameter areaMaxContour is the set of points of the contour.

The second parameter epsilon is the approximation accuracy, that is, the maximum allowed distance d between the original contour and the approximating polygon. Contour points whose distance d is smaller than epsilon are filtered out; points farther away are kept as vertices.

The third parameter True means that the approximated contour is closed. False means that it is open.

Consider three contour points A, B and C, where B lies between A and C. Segment AC is processed first. If the distance d from B to AC is greater than epsilon, B is kept as a vertex and segments AB and BC are then processed in the same way. Otherwise B is discarded and AC is kept.

Note

You can set the value of epsilon. In this game program, epsilon is set to 0.035 times the contour perimeter. A smaller value makes the polygon follow the contour more closely and keeps more vertices, so lower it only when the contour is detected cleanly.

(2) Obtain the number of sides of the approximating polygon. The side counts of 24 detections are collected in shape_list, and their rounded average is stored in shape_length.

140        if len(shape_list) == 24:
141            shape_length = int(round(np.mean(shape_list)))
142            shape_list = []
143            #print(shape_length)
144    else:
145        shape_length = 0
146    return img

(3) Judge the shape of the object from the number of sides and print the shape name on the terminal. At the same time, make the buzzer beep a different number of times depending on the shape.

71# 主要控制函数(main control function)
72def move():
73    #global shape_length, board
74    while move_st:
75        if shape_length == 3:
76            print('三角形')
77            board.set_buzzer(2400, 0.1, 0.4, 1)  # 以2400Hz的频率,0.1秒开始响,0.4秒停止响,重复1次(The buzzer sounds at a frequency of 2400Hz for 0.1 seconds, followed by a pause of 0.4 seconds, and it repeats this pattern once)
78            time.sleep(3)
79            
80        elif shape_length == 4:
81            print('矩形')
82            board.set_buzzer(2400, 0.1, 0.4, 2)  # 以2400Hz的频率,0.1秒开始响,0.4秒停止响,重复2次(The buzzer sounds at a frequency of 2400Hz for 0.1 seconds, followed by a pause of 0.4 seconds, and it repeats this pattern twice)
83            time.sleep(3)
84            
85        elif shape_length >= 6:
86            print('圆')
87            board.set_buzzer(2400, 0.1, 0.4, 3)  # 以2400Hz的频率,0.1秒开始响,0.4秒停止响,重复3次(The buzzer sounds at a frequency of 2400Hz for 0.1 seconds, followed by a pause of 0.4 seconds, and it repeats this pattern three times)
88            time.sleep(3)
89            
90        else:
91            time.sleep(1)
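The averaging and the shape decision can be checked in isolation. This is a minimal sketch with hypothetical function names; it returns the shape name and the number of beeps instead of driving the buzzer, and the sample side counts are made up.

```python
def average_sides(side_counts):
    """Rounded average of the side counts collected over several frames."""
    return int(round(sum(side_counts) / len(side_counts)))


def classify(shape_length):
    """Map a side count to (shape name, number of beeps), or None."""
    if shape_length == 3:
        return ("triangle", 1)
    if shape_length == 4:
        return ("rectangle", 2)
    if shape_length >= 6:
        return ("circle", 3)
    return None


# 24 made-up detections: mostly 4 sides, with a few noisy 5-sided results.
counts = [4] * 20 + [5] * 4
shape_length = average_sides(counts)
print(shape_length, classify(shape_length))
```

Averaging over many frames keeps a few noisy detections from changing the result. A side count of 5 matches none of the branches, so no shape is reported in that case.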

6.11.5 Function Extension

  • Changing the Default Recognition Color

The default recognition color of this game is blue. The following steps change it to red as an example:

(1) Enter the command to go to the directory where the game programs are stored.

cd spiderpi/advanced

(2) Enter command to open the program file.

sudo vim shape_recognition_plain.py

(3) Locate the code shown below:

Note

Enter the line number and then press “Shift+G” to jump directly to that line. The line numbers shown here only illustrate the quick-jump method; check the actual position in your own file.

(4) Press the “i” key to enter editing mode, then change “blue” in “color = 'blue'” to “red”.

(5) After modification, press “Esc” and input “:wq” to save the file and exit.

:wq

(6) Execute the steps in “6.11.2 Operation Steps” to check the modification effect.

  • Changing the Feedback Sound

By default, the buzzer beeps once when a triangle is recognized, twice for a rectangle, and three times for a circle. The following steps make the buzzer beep twice when a circle is recognized, as an example.

(1) Enter the command and press “Enter” to go to the directory where the game programs are stored.

cd spiderpi/advanced

(2) Enter the command and press “Enter” to open the program file.

sudo vim shape_recognition_plain.py

(3) Scroll down to find the board.set_buzzer() calls in the move() function.

(4) Press “i” key to enter the editing mode and modify the “3” in board.set_buzzer(2400, 0.1, 0.4, 3) to “2”.

(5) After modification, press the “Esc” key, enter “:wq” and press Enter to save and exit.

:wq

(6) Execute the steps in “6.11.2 Operation Steps” to check the modification effect.

6.12 Shape Recognition

6.12.1 Program logic

First, the real-time camera image is processed with OpenCV: binarization, erosion, dilation and other operations are performed to obtain a contour that contains only the target color, and the contour is marked.

After the target contour is obtained, the corresponding shape is deduced from the contour approximation result, and the recognition result is displayed on the dot matrix screen.

6.12.2 Operation steps

Note

Commands are case-sensitive and space-sensitive. Enter them exactly as shown.

(1) Boot up SpiderPi Pro, then remotely connect to Raspberry Pi desktop through VNC.

(2) Click the terminal icon at the upper left corner of the desktop to open the Terminator.

(3) Enter the command and press “Enter” to navigate to the directory where the game program is located.

cd spiderpi/advanced

(4) Enter command, and then press “Enter” to start the game.

python3 shape_recognition.py

(5) To close the game, press “Ctrl+C” in the terminal. If the game does not quit, try again.

6.12.3 Project outcome

Note

The default recognition colors are red, green and blue. The recognizable shapes are triangle, rectangle and circle.

When a shape is recognized, the corresponding pattern is displayed on the dot matrix screen. In addition, the number of sides and the shape name are printed on the terminal.

6.12.4 Program Parameter Description

The source code of this program is located at /home/pi/spiderpi/advanced/shape_recognition.py

  • Import Function Library

 4import sys
 5import cv2
 6import math
 7import time
 8import signal
 9import threading
10import numpy as np
11from calibration.camera import Camera
12from calibration.CalibrationConfig import *
13from common import yaml_handle
14from common import kinematics
15from common.ros_robot_controller_sdk import Board
16from common.action_group_controller import ActionGroupController
17import arm_ik.arm_move_ik as AMK
18from sensor.ultrasonic_sensor import Ultrasonic
19import sensor.dot_matrix_sensor as DMS

(1) Import the libraries related to OpenCV, time, math, and threads. To call a function in a library, use the form “library name.function name(parameter, parameter)”. For example:

198            time.sleep(0.01)

This calls the sleep() function in the “time” library, which is used to delay. Libraries such as time and math are part of the Python standard library, and cv2 (OpenCV) is an installed third-party library, so all of them can be imported directly. You can also write your own library, such as yaml_handle.

(2) Instantiating Function Libraries

Full library names are long to type. To make calls easier, a class can be imported by name with “from ... import ...”, or a library can be given a short alias with “import ... as ...”. For example:

15from common.ros_robot_controller_sdk import Board
16from common.action_group_controller import ActionGroupController
17import arm_ik.arm_move_ik as AMK

After importing, create an instance, for example board = Board(), and then call its functions in the form board.function name(parameter, parameter).

  • Analysis of the Main Function

In a Python program, the block under if __name__ == '__main__': acts as the main function. It first performs initialization. In this program, initialization includes loading the camera calibration parameters, returning the servos to the initial position and reading the color threshold file. In general, the configuration of ports, peripherals, timer interrupts and so on is also completed during initialization.

172if __name__ == '__main__':
173    #加载参数(load parameter)
174    param_data = np.load(calibration_param_path + '.npz')
175
176    #获取参数(obtain parameter)
177    mtx = param_data['mtx_array']
178    dist = param_data['dist_array']
179    newcameramtx, _ = cv2.getOptimalNewCameraMatrix(mtx, dist, (640, 480), 0, (640, 480))
180    mapx, mapy = cv2.initUndistortRectifyMap(mtx, dist, None, newcameramtx, (640, 480), 5)
  • Defining Global Variables

42# 读取颜色阈值函数(read color threshold and parameters of coordinate transformation)
43def load_config():
44    global lab_data
45    
46    lab_data = yaml_handle.get_yaml_data(yaml_handle.lab_file_path)
47    
48# 初始位置(initial position)
49def init_move():
50    ultrasonic.setRGBMode(0)
51    ultrasonic.setRGB(0, (0, 0, 0))
52    ultrasonic.setRGB(1, (0, 0, 0))
53    ik.stand(ik.initial_pos)
54    ak.setPitchRangeMoving((0, 12, 18), -60, -90, 100, 2)

(1) Gaussian Filtering

Before converting the image from BGR to LAB space, denoise it with the GaussianBlur() function in the cv2 library. The blurred image frame_gb is then converted to LAB space with cvtColor().

132    frame_lab = cv2.cvtColor(frame_gb, cv2.COLOR_BGR2LAB)  # 将图像转换到LAB空间(convert the image to LAB space)

The parameters of the GaussianBlur() call, which has the same form as in 6.11.4, cv2.GaussianBlur(img, (3, 3), 3), are as follows:

The first parameter img is the input image.

The second parameter (3, 3) is the size of the Gaussian kernel.

The third parameter 3 is the standard deviation of the Gaussian function. The larger the value, the more evenly the weights are spread over the neighboring pixels and the stronger the blur; the smaller the value, the more the weight is concentrated on the center pixel.

(2) Binarization Processing

Use the inRange() function in the cv2 library to binarize the image.

139            frame_mask = cv2.inRange(frame_lab,
140                             (lab_data[i]['min'][0],
141                              lab_data[i]['min'][1],
142                              lab_data[i]['min'][2]),
143                             (lab_data[i]['max'][0],
144                              lab_data[i]['max'][1],
145                              lab_data[i]['max'][2]))  #对原图像和掩模进行位运算(perform bitwise operation to original image and mask)

The first parameter in the parentheses is the input image. The second and third parameters are the lower limit and the upper limit of the threshold. When all three Lab values of a pixel lie between the lower limit and the upper limit, the pixel is set to 255 (white) in the output mask; otherwise it is set to 0 (black).
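The thresholding rule can be illustrated on a few pixels without OpenCV. This is a minimal sketch, assuming a hypothetical in_range() helper that applies the same inclusive test to a list of pixels; the threshold and pixel values are made up and are not taken from the color threshold file.

```python
def in_range(pixels, lower, upper):
    """Return 255 for pixels whose channels all lie within [lower, upper], else 0."""
    mask = []
    for pixel in pixels:
        inside = all(lo <= value <= hi
                     for value, lo, hi in zip(pixel, lower, upper))
        mask.append(255 if inside else 0)
    return mask


# Made-up Lab threshold and pixels, for illustration only.
lower, upper = (0, 0, 0), (255, 255, 110)
pixels = [(120, 130, 90), (120, 130, 150), (40, 200, 110)]
mask = in_range(pixels, lower, upper)
print(mask)
```

A pixel passes only if every channel is inside its range, and a value equal to a limit counts as inside.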

(3) Erosion and dilation

Erosion removes burrs from the edges of objects in the image. Dilation expands the edges of objects and fills in non-target pixels at the edge or inside of the target object.

To reduce interference and make the image smoother, use the morphologyEx() function in the OpenCV library to perform the opening and closing operations in sequence on the binary image obtained after binarization.

146            opened = cv2.morphologyEx(frame_mask, cv2.MORPH_OPEN, np.ones((6,6),np.uint8))  #开运算(opening operation)
147            closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, np.ones((6,6),np.uint8)) #闭运算(Closing operation)

The opening operation erodes first and then dilates. It eliminates small bright areas and separates objects joined at thin points, and it smooths the boundary of larger objects without noticeably changing their area.

The closing operation dilates first and then erodes. It bridges narrow gaps and thin cracks, eliminates small holes, repairs breaks in contour lines, and also smooths contours to some extent.

The parameters in the parentheses of the morphologyEx() function are as follows:

The first parameter is the input image.

The second parameter is the morphological operation to apply. cv2.MORPH_OPEN is the opening operation, and cv2.MORPH_CLOSE is the closing operation.

The third parameter is the kernel of the morphological operation. np.ones((6,6),np.uint8) is a 6×6 square structuring element.
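The effect of opening and closing is easiest to see on a single row of binary pixels. The following is a minimal one-dimensional sketch with a 3-pixel window and hypothetical helper names; the program itself works on two-dimensional masks with a 6×6 kernel through cv2.morphologyEx().

```python
def erode(row, size=3):
    """Each output value is the minimum of its neighborhood."""
    half = size // 2
    return [min(row[max(0, i - half):i + half + 1]) for i in range(len(row))]


def dilate(row, size=3):
    """Each output value is the maximum of its neighborhood."""
    half = size // 2
    return [max(row[max(0, i - half):i + half + 1]) for i in range(len(row))]


def opening(row, size=3):
    """Erode first, then dilate."""
    return dilate(erode(row, size), size)


def closing(row, size=3):
    """Dilate first, then erode."""
    return erode(dilate(row, size), size)


noisy = [0, 1, 0, 0, 1, 1, 1, 1, 0, 0]   # isolated bright pixel at index 1
holed = [0, 0, 1, 1, 0, 1, 1, 0, 0]      # one-pixel hole at index 4
print(opening(noisy))
print(closing(holed))
```

Opening removes the isolated bright pixel while the long run of ones keeps its length, and closing fills the one-pixel hole without widening the object.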

(4) Acquire the maximum contour

After processing the image, acquire the contour of the target to be recognized, which involves findContours() function in cv2 library.

148            contours = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  #找出轮廓(find contours)

The first parameter in parentheses is the input image; the second parameter is the retrieval mode of the contour; the third parameter is the approximation method of the contour.

Find the contour with the maximum area among the obtained contours. To avoid interference, a minimum area is set: the target contour is valid only when its area is larger than this value.

68            if contour_area_temp > 50:  # 只有在面积大于50时,最大面积的轮廓才是有效的,以过滤干扰(Only when the area is greater than the set value, the contour with the maximum area is considered valid to filter out interference)
69                area_max_contour = c

After obtaining the contour with largest area, use drawContours() function in cv2 library to mark the contour.

68        cv2.drawContours(img, areaMaxContour_max, -1, (0, 0, 255), 2)

(5) Shape Recognition

Calculate the perimeter of the contour with the arcLength() function in the cv2 library, and use the approxPolyDP() function for contour approximation.

157        # 识别形状(shape recognition)
158        # 周长  0.035 根据识别情况修改,识别越好,越小(Perimeter 0.035. Adjust according to the detection performance, the better the detection, the smaller the value)
159        epsilon = 0.035 * cv2.arcLength(areaMaxContour_max, True)
160        # 轮廓相似(contours are similar)
161        approx = cv2.approxPolyDP(areaMaxContour_max, epsilon, True)

Based on the contour approximation result, obtain the number of sides of the recognized shape to determine which shape it is.

162        shape_list.append(len(approx))
163        if len(shape_list) == 24:
164            shape_length = int(round(np.mean(shape_list)))                            
165            shape_list = []
166    else:
167        shape_length = 0
  • Dot Matrix Display

According to the recognition result, the corresponding pattern will be displayed on the dot matrix screen.

 75        if shape_length == 3:
 76            print('三角形')
 77            ## 显示'三角形'(display 'triangle')
 78            tm.display_buf = (0x80, 0xc0, 0xa0, 0x90, 0x88, 0x84, 0x82, 0x81,
 79                              0x81, 0x82, 0x84,0x88, 0x90, 0xa0, 0xc0, 0x80)
 80            tm.update_display()
 81            
 82        elif shape_length == 4:
 83            print('矩形')
 84            ## 显示'矩形'(display 'rectangle')
 85            tm.display_buf = (0x00, 0x00, 0x00, 0x00, 0xff, 0x81, 0x81, 0x81,
 86                              0x81, 0x81, 0x81,0xff, 0x00, 0x00, 0x00, 0x00)
 87            tm.update_display()
 88            
 89        elif shape_length >= 6:           
 90            print('圆')
 91            ## 显示'圆形'(display 'circle')
 92            tm.display_buf = (0x00, 0x00, 0x00, 0x00, 0x1c, 0x22, 0x41, 0x41,
 93                              0x41, 0x22, 0x1c,0x00, 0x00, 0x00, 0x00, 0x00)
 94            tm.update_display()
 95            
 96        else:
 97            ## 清屏(clear the screen)
 98            tm.display_buf = [0] * 16
 99            tm.update_display()
100            print('None')

There are 16 columns of LEDs on the dot matrix screen, and each column is controlled by one hexadecimal value. For example, 0x88 is “10001000” in binary, so the LEDs in that column, from top to bottom, are “on off off off on off off off”.

Calling the update_display() function sends the pattern stored in the tm.display_buf buffer to the dot matrix screen, so you can display any pattern by filling the buffer and then refreshing the display.
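A pattern can be previewed as text before it is sent to the screen. This is a minimal sketch, assuming a hypothetical render() helper and assuming that the most significant bit of each value is the top LED, as in the “10001000” example; the actual orientation on the hardware may differ.

```python
def render(display_buf, on="#", off="."):
    """Return 8 text rows for a 16-column buffer, top row first.

    Assumes the most significant bit of each value is the top LED.
    """
    rows = []
    for row in range(8):
        bit = 7 - row
        rows.append("".join(on if (column >> bit) & 1 else off
                            for column in display_buf))
    return rows


# The triangle pattern from the program.
triangle = (0x80, 0xc0, 0xa0, 0x90, 0x88, 0x84, 0x82, 0x81,
            0x81, 0x82, 0x84, 0x88, 0x90, 0xa0, 0xc0, 0x80)
rows = render(triangle)
print("\n".join(rows))
```

Each character column of the output corresponds to one value in display_buf, so editing a single value changes exactly one column of the pattern.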