{"slug": "rk3576-hailo-8-12x-ai-performance-boost", "title": "RK3576 + Hailo-8: 12x AI Performance Boost", "summary": "Seeed Studio integrated a Hailo-8 M.2 AI accelerator with the reComputer RK3576, boosting YOLOv11n inference from 2.3 FPS to 28.1 FPS, a 12x performance increase, enabling real-time edge video analysis for surveillance, inspection, and robotics without cloud dependency.", "body_md": "Edge AI devices are often constrained by their built-in compute capabilities. The Seeed Studio reComputer RK3576, powered by Rockchip's RK3576 processor, features a respectable 6 TOPS NPU. However, when running YOLOv11n for video object detection, it achieves only ~2.3 FPS – far from usable for real-time applications.\n\nThis project addresses that bottleneck by integrating the Hailo-8 M.2 AI Accelerator Module, which delivers an additional 26 TOPS of compute power. The result? We pushed YOLOv11n inference to 28.1 FPS – a 12x+ performance uplift – enabling smooth, real-time video analysis.\n\nThe solution leverages:\n\n- reComputer RK3576\n- Hailo-8 AI Accelerator\n- Docker-based deployment\n- AI Lab model workflow\n- YOLO object detection\n\nThe entire system runs locally at the edge without relying on cloud services, making it suitable for:\n\n- Smart surveillance\n- Industrial inspection\n- Traffic monitoring\n- Robotics perception\n- Smart retail analytics\n\nBefore deploying the AI application, the Hailo-8 AI accelerator must be installed into the M.2 PCIe expansion slot of the reComputer RK3576.\n\nThe reComputer RK3576 provides an internal M.2 expansion interface that enables high-speed PCIe communication with AI accelerators such as the Hailo-8.\n\n**Step 1: Open the reComputer RK3576 Enclosure**\n\nTo access the internal expansion interface, remove the enclosure cover according to the hardware guide.\n\nAfter opening the chassis, the mainboard and expansion interfaces become accessible for hardware upgrades and AI accelerator installation.\n\n**Step 2: Install the 
Hailo-8 Accelerator**\n\nAlign the Hailo-8 M.2 connector with the PCIe slot and insert it at a slight angle.\n\nPush the module gently into the connector until it is fully seated.\n\nThe RK3576 already includes an NPU for AI workloads. However, when running multiple object detection streams or higher frame-rate video, additional acceleration becomes beneficial.\n\nHailo-8 provides dedicated AI inference acceleration that can significantly increase throughput while maintaining low power consumption. Hailo's architecture is specifically optimized for edge AI vision workloads.\n\nSystem Architecture\n\n```\nUSB Camera\n      │\n      ▼\nreComputer RK3576\n      │\n      ▼\nHailo-8 Accelerator\n      │\n      ▼\nYOLO Inference Engine\n      │\n      ▼\nBounding Box Visualization\n      │\n      ▼\nLocal Display / Web Stream\n```\n\nDeploying the Hailo Software Package\n\n**Step 1: Access the Hailo Software Download Center**\n\nVisit the Hailo website and search for **\"Software Downloads\"** using the search bar.\n\n**Step 2: Select Your Hardware Platform**\n\nOn the software download page, choose the appropriate hardware platform. 
For this project, select Hailo-8.\n\n**Step 3: Download the Required Software Packages**\n\nDownload the required software packages for your operating system, including:\n\n- HailoRT Runtime\n- PCIe Driver\n- Model Zoo (optional)\n- Development Tools and SDK (if needed)\n\nMake sure to download the versions that are compatible with your target platform and operating system before proceeding with the installation.\n\nTransfer Required Files to reComputer RK3576\n\nOpen **Windows Terminal** or **PowerShell** on your PC and transfer the required installation packages to the reComputer RK3576 using SCP.\n\n```\nscp C:\\Users\\seeed\\Downloads\\hailort-pcie-driver_4.24.0_all.deb seeed@192.168.10.230:/home/seeed/\nscp C:\\Users\\seeed\\Downloads\\hailort-4.24.0-cp311-cp311-linux_aarch64.whl seeed@192.168.10.230:/home/seeed/\nscp C:\\Users\\seeed\\Downloads\\librknnrt.so seeed@192.168.10.230:/home/seeed/\n```\n\nThe files will be copied from the Windows host to the home directory of the RK3576 device. Note that the installation step below also uses `hailort_4.24.0_arm64.deb`, which is not in the list above; transfer it the same way.\n\nInstall Hailo Runtime and PCIe Driver\n\nOpen a terminal on the reComputer RK3576 and execute the following commands to install the Hailo PCIe driver, runtime environment, and Python SDK.\n\n```\n# Install the PCIe driver\nsudo dpkg -i hailort-pcie-driver_4.24.0_all.deb\n\n# Reboot the system\nsudo reboot\n\n# After reboot, verify the driver is loaded\nlsmod | grep hailo\n\n# Install HailoRT\nsudo dpkg -i hailort_4.24.0_arm64.deb\n\n# Scan and verify device status\nhailortcli scan\n\n# Create and activate a virtual environment\npython3 -m venv hailo_env\nsource hailo_env/bin/activate\n\n# Install HailoRT Python library (same 4.24.0 wheel that was transferred above)\npip install hailort-4.24.0-cp311-cp311-linux_aarch64.whl\n\n# Verify installation and device connection\npython3 -c \"from hailo_platform import VDevice; vdev = VDevice(); print('Successfully connected via VDevice! 
Device info:', vdev)\"\n```\n\nIf the final verification command successfully detects the accelerator, the Hailo-8 PCIe device has been installed correctly and is ready for AI model deployment.\n\nInstall the Hailo Model Zoo\n\nNext, install the official Hailo Model Zoo toolkit. This toolkit provides utilities for downloading, compiling, converting, and running pretrained AI models optimized for Hailo accelerators, including object detection, image classification, and segmentation models.\n\n```\n# 1. Install required system libraries\nsudo apt update\nsudo apt install -y git libglib2.0-0 libgl1-mesa-glx\n\n# 2. Clone the official repository (latest branch recommended)\ngit clone https://github.com/hailo-ai/hailo_model_zoo.git\ncd hailo_model_zoo\npip install -e .\n```\n\nAfter installation, the Model Zoo tools can be used to download and deploy AI models directly onto the Hailo-8 accelerator.\n\nVerify Camera Detection\n\nBefore running inference, verify that the camera is correctly detected by the operating system.\n\n```\nv4l2-ctl --list-devices\n```\n\nIf the camera is connected successfully, it should appear in the device list.\n\n**Figure X.** Camera detection result on the reComputer RK3576.\n\nTo deploy YOLO11n on the Hailo-8 accelerator, first download and compile the model using the Hailo Model Zoo tools.\n\nRun the following command inside the `hailo_model_zoo` directory with the virtual environment activated:\n\n```\nhailomz compile yolov11n\n```\n\nAfter the compilation process is completed, the generated `yolov11n.hef` model file can be used for real-time AI inference on the Hailo-8 accelerator.\n\nCreate a new Python file named:\n\n```\nwebcam_yolo11n.py\n```\n\nCopy the following code into the file:\n\n``` python\nimport numpy as np\nimport cv2\nimport time\nfrom hailo_platform import (VDevice, HEF, InferVStreams, ConfigureParams,\n                            HailoStreamInterface, InputVStreamParams, OutputVStreamParams)\n\n# ================= Configuration 
=================\nHEF_PATH = 'yolov11n.hef'\nDEVICE_ID = \"/dev/video0\"  # Update based on v4l2-ctl output\nCONF_THRESHOLD = 0.45\n\n# COCO Dataset 80 Class Labels\nCOCO_CLASSES = [\n    \"person\", \"bicycle\", \"car\", \"motorcycle\", \"airplane\", \"bus\", \"train\", \"truck\", \"boat\", \"traffic light\",\n    \"fire hydrant\", \"stop sign\", \"parking meter\", \"bench\", \"bird\", \"cat\", \"dog\", \"horse\", \"sheep\", \"cow\",\n    \"elephant\", \"bear\", \"zebra\", \"giraffe\", \"backpack\", \"umbrella\", \"handbag\", \"tie\", \"suitcase\", \"frisbee\",\n    \"skis\", \"snowboard\", \"sports ball\", \"kite\", \"baseball bat\", \"baseball glove\", \"skateboard\", \"surfboard\",\n    \"tennis racket\", \"bottle\", \"wine glass\", \"cup\", \"fork\", \"knife\", \"spoon\", \"bowl\", \"banana\", \"apple\",\n    \"sandwich\", \"orange\", \"broccoli\", \"carrot\", \"hot dog\", \"pizza\", \"donut\", \"cake\", \"chair\", \"couch\",\n    \"potted plant\", \"bed\", \"dining table\", \"toilet\", \"tv\", \"laptop\", \"mouse\", \"remote\", \"keyboard\", \"cell phone\",\n    \"microwave\", \"oven\", \"toaster\", \"sink\", \"refrigerator\", \"book\", \"clock\", \"vase\", \"scissors\", \"teddy bear\",\n    \"hair drier\", \"toothbrush\"\n]\n# ==================================================\n\ndef main():\n    # 1. 
Initialize Hailo Hardware\n    hef = HEF(HEF_PATH)\n    input_vstream_info = hef.get_input_vstream_infos()[0]\n    input_h, input_w = input_vstream_info.shape[:2]\n\n    cap = cv2.VideoCapture(DEVICE_ID)\n    if not cap.isOpened():\n        print(\"Cannot open webcam\")\n        return\n\n    # Setup inference variables\n    prev_time = 0\n\n    with VDevice() as target:\n        config_params_dict = ConfigureParams.create_from_hef(hef, HailoStreamInterface.PCIe)\n        network_group = target.configure(hef, config_params_dict)[0]\n\n        with network_group.activate():\n            vstream_params = (InputVStreamParams.make_from_network_group(network_group),\n                              OutputVStreamParams.make_from_network_group(network_group))\n\n            with InferVStreams(network_group, vstream_params[0], vstream_params[1]) as vstreams:\n                print(\"[INFO] Initialization successful! Running YOLOv11n real-time detection...\")\n\n                while True:\n                    start_time = time.time()  # Record start time for FPS\n                    ret, frame = cap.read()\n                    if not ret:\n                        break\n\n                    # Preprocessing (Convert to RGB based on previous validation)\n                    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)\n                    resized = cv2.resize(frame_rgb, (input_w, input_h))\n                    input_tensor = np.expand_dims(resized, axis=0)\n\n                    # Inference\n                    outputs = vstreams.infer(input_tensor)\n\n                    # Parsing and Drawing\n                    h, w, _ = frame.shape\n                    for name, class_list in outputs.items():\n                        # Iterate through 80 classes\n                        for class_id, detections in enumerate(class_list[0]):\n                            if len(detections) > 0:\n                                for det in detections:\n                                    if len(det) >= 5:\n                                        ymin, xmin, ymax, xmax, 
confidence = det[:5]\n                                        if confidence > CONF_THRESHOLD:\n                                            # Coordinate Mapping\n                                            left, top = int(xmin * w), int(ymin * h)\n                                            right, bottom = int(xmax * w), int(ymax * h)\n\n                                            # Get class name, display ID if out of bounds\n                                            class_name = COCO_CLASSES[class_id] if class_id < len(COCO_CLASSES) else f\"ID {class_id}\"\n\n                                            # Draw bounding box\n                                            cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 0), 2)\n\n                                            # Draw label text\n                                            label = f\"{class_name}: {confidence:.2f}\"\n                                            cv2.putText(frame, label, (left, top - 10),\n                                                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)\n\n                    # Calculate and display real-time FPS\n                    curr_time = time.time()\n                    fps = 1 / (curr_time - start_time)\n\n                    # Print in the top left corner\n                    cv2.putText(frame, f\"FPS: {fps:.1f}\", (20, 40),\n                                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)\n\n                    # Display window\n                    cv2.imshow('reComputer RK3576 - Hailo YOLOv11n', frame)\n                    if cv2.waitKey(1) & 0xFF == ord('q'):\n                        break\n\n    cap.release()\n    cv2.destroyAllWindows()\n\nif __name__ == \"__main__\":\n    main()\n```\n\nRunning result:\n\nThen create a Python file named `webcam_npu_save` to run the model on the built-in NPU of the RK3576. 
The code is as follows:\n\n``` python\nimport numpy as np\nimport cv2\nimport time\nimport os\nfrom rknnlite.api import RKNNLite\n\n# ================= Configuration =================\nRKNN_MODEL_PATH = 'yolo11n.rknn'  # Path to the RKNN model\nDEVICE_ID = 0  # Camera device index, corresponds to /dev/video0\nCONF_THRESHOLD = 0.45\nOUTPUT_DIR = \"npu_detection_results\"\nos.makedirs(OUTPUT_DIR, exist_ok=True)\n\n# COCO Dataset 80 Class Labels\nCOCO_CLASSES = [\n    \"person\", \"bicycle\", \"car\", \"motorcycle\", \"airplane\", \"bus\", \"train\", \"truck\", \"boat\", \"traffic light\",\n    \"fire hydrant\", \"stop sign\", \"parking meter\", \"bench\", \"bird\", \"cat\", \"dog\", \"horse\", \"sheep\", \"cow\",\n    \"elephant\", \"bear\", \"zebra\", \"giraffe\", \"backpack\", \"umbrella\", \"handbag\", \"tie\", \"suitcase\", \"frisbee\",\n    \"skis\", \"snowboard\", \"sports ball\", \"kite\", \"baseball bat\", \"baseball glove\", \"skateboard\", \"surfboard\",\n    \"tennis racket\", \"bottle\", \"wine glass\", \"cup\", \"fork\", \"knife\", \"spoon\", \"bowl\", \"banana\", \"apple\",\n    \"sandwich\", \"orange\", \"broccoli\", \"carrot\", \"hot dog\", \"pizza\", \"donut\", \"cake\", \"chair\", \"couch\",\n    \"potted plant\", \"bed\", \"dining table\", \"toilet\", \"tv\", \"laptop\", \"mouse\", \"remote\", \"keyboard\", \"cell phone\",\n    \"microwave\", \"oven\", \"toaster\", \"sink\", \"refrigerator\", \"book\", \"clock\", \"vase\", \"scissors\", \"teddy bear\",\n    \"hair drier\", \"toothbrush\"\n]\n# ==================================================\n\ndef post_process(outputs, frame, conf_threshold):\n    \"\"\"Decode raw YOLOv8-style outputs (decoupled head).\"\"\"\n    h, w = frame.shape[:2]\n    detections = []\n\n    # Feature-map sizes for the three scales\n    scales = [(80, 80), (40, 40), (20, 20)]\n\n    # Split the outputs by index\n    # 0,3,6: regression (reg) -> [64, 80, 80] etc.\n    # 1,4,7: classification (cls) -> [80, 80, 80] etc.\n    # 2,5,8: objectness (obj) -> [1, 80, 80] etc.\n    reg_outputs = [outputs[0], outputs[3], outputs[6]]\n    cls_outputs = [outputs[1], outputs[4], outputs[7]]\n    obj_outputs = [outputs[2], outputs[5], outputs[8]]\n\n    
# Iterate over the three scales\n    for reg, cls, obj, (h_feat, w_feat) in zip(reg_outputs, cls_outputs, obj_outputs, scales):\n        # Flatten each tensor and reorder its dimensions\n        reg = reg.squeeze(0).transpose(1, 2, 0).reshape(-1, 64)  # [num_boxes, 64]\n        cls = cls.squeeze(0).transpose(1, 2, 0).reshape(-1, 80)  # [num_boxes, 80]\n        obj = obj.squeeze(0).transpose(1, 2, 0).reshape(-1, 1)   # [num_boxes, 1]\n\n        # Decode each feature-map cell\n        for i in range(reg.shape[0]):\n            # 1. Objectness score\n            obj_conf = obj[i][0]\n            if obj_conf < conf_threshold:\n                continue\n\n            # 2. Class scores\n            cls_scores = cls[i] * obj_conf  # class score * objectness score\n            class_id = np.argmax(cls_scores)\n            confidence = cls_scores[class_id]\n            if confidence < conf_threshold:\n                continue\n\n            # 3. Decode the bounding box (YOLO format)\n            # Grid coordinates of this cell in the feature map\n            row = i // w_feat\n            col = i % w_feat\n\n            # Take the x, y, w, h offsets from the regression head\n            reg_vals = reg[i]\n            dx, dy, dw, dh = reg_vals[0], reg_vals[1], reg_vals[2], reg_vals[3]\n\n            # Compute center and size (normalized coordinates on the feature map)\n            cx = (col + dx) / w_feat\n            cy = (row + dy) / h_feat\n            width = dw\n            height = dh\n\n            # Convert to original-image coordinates\n            left = int((cx - width / 2) * w)\n            top = int((cy - height / 2) * h)\n            right = int((cx + width / 2) * w)\n            bottom = int((cy + height / 2) * h)\n\n            # Clamp to the image bounds\n            left = max(0, left)\n            top = max(0, top)\n            right = min(w, right)\n            bottom = min(h, bottom)\n\n            class_name = COCO_CLASSES[class_id] if class_id < len(COCO_CLASSES) else f\"ID {class_id}\"\n            detections.append((class_name, float(confidence), left, top, right, bottom))\n\n    # Redundant code removed (NMS is not used here, though keeping it is recommended)\n    # Note: if too many boxes are returned, consider keeping the NMS logic\n    return detections\n\ndef main():\n    # 1. 
Initialize RKNN\n    rknn = RKNNLite()\n\n    # Load the model\n    print(f\"[INFO] Loading RKNN model from {RKNN_MODEL_PATH}...\")\n    ret = rknn.load_rknn(RKNN_MODEL_PATH)\n    if ret != 0:\n        print(f\"[ERROR] Load RKNN model failed: {ret}\")\n        return\n\n    # Initialize the runtime\n    print(\"[INFO] Initializing RKNN runtime...\")\n    ret = rknn.init_runtime()\n    if ret != 0:\n        print(f\"[ERROR] Init runtime failed: {ret}\")\n        return\n    print(\"[INFO] RKNN model loaded successfully\")\n\n    # 2. Open the camera\n    cap = cv2.VideoCapture(DEVICE_ID)\n    if not cap.isOpened():\n        print(\"Cannot open webcam\")\n        return\n\n    print(\"[INFO] Camera opened. Model input: 640x640\")\n    print(\"[INFO] Press Ctrl+C to stop.\")\n\n    frame_count = 0\n    save_interval = 5  # Save one image every 5 frames\n\n    try:\n        while True:\n            start_time = time.time()\n            ret, frame = cap.read()\n            if not ret:\n                break\n\n            # Preprocessing\n            frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)\n            resized = cv2.resize(frame_rgb, (640, 640))\n            input_tensor = np.expand_dims(resized, axis=0).astype(np.float32) / 255.0\n\n            # Inference\n            outputs = rknn.inference(inputs=[input_tensor])\n\n            # Post-processing\n            detections = post_process(outputs, frame, CONF_THRESHOLD)\n\n            # Compute and overlay FPS\n            fps = 1 / (time.time() - start_time)\n            cv2.putText(frame, f\"NPU FPS: {fps:.1f}\", (20, 40),\n                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)\n\n            # Save one frame every N frames (FPS overlay only; detection boxes are not drawn in this script)\n            if frame_count % save_interval == 0:\n                img_path = os.path.join(OUTPUT_DIR, f\"npu_frame_{frame_count:06d}.jpg\")\n                cv2.imwrite(img_path, frame)\n                print(f\"[Frame {frame_count}] Saved -> {img_path}\")\n\n            # Print detection info to the terminal\n            if detections:\n                print(f\"[Frame {frame_count}] Found {len(detections)} objects, FPS: {fps:.1f}\")\n                for cls, conf, l, t, r, b in detections[:3]:  # Print at most 3\n                    print(f\"  - {cls}: {conf:.2f} at ({l},{t})-({r},{b})\")\n\n            frame_count += 1\n            time.sleep(0.01)\n\n    except KeyboardInterrupt:\n        print(\"\\n[INFO] Stopped by user.\")\n\n    cap.release()\n    rknn.release()  # Free the NPU runtime\n    print(f\"[INFO] Done. Total frames: {frame_count}\")\n\nif __name__ == \"__main__\":\n    main()\n```\n\nRunning result:\n\nRK3576 built-in NPU: YOLOv11n achieves only 2.3 FPS, which is too slow for smooth real-time video detection.\n\nHailo-8 PCIe accelerator card: The same model reaches 28.1 FPS, roughly a 12x frame-rate improvement (28.1 / 2.3 ≈ 12.2), which meets the real-time inference requirements of standard video streams.\n\nThis project demonstrates how the reComputer RK3576 and Hailo-8 accelerator can be combined to build a powerful edge AI vision platform capable of real-time object detection without cloud dependency. The combination of Rockchip computing resources, Hailo acceleration, and containerized deployment provides an accessible path for developers building next-generation intelligent edge applications.", "url": "https://wpnews.pro/news/rk3576-hailo-8-12x-ai-performance-boost", "canonical_source": "https://www.hackster.io/Myang/rk3576-hailo-8-12x-ai-performance-boost-6ff665", "published_at": "2026-06-25 01:30:11+00:00", "updated_at": "2026-06-25 09:16:55.787000+00:00", "lang": "en", "topics": ["artificial-intelligence", "computer-vision", "ai-products"], "entities": ["Seeed Studio", "Rockchip", "Hailo", "reComputer RK3576", "Hailo-8", "YOLOv11n", "RK3576"], "alternates": {"html": "https://wpnews.pro/news/rk3576-hailo-8-12x-ai-performance-boost", "markdown": "https://wpnews.pro/news/rk3576-hailo-8-12x-ai-performance-boost.md", "text": "https://wpnews.pro/news/rk3576-hailo-8-12x-ai-performance-boost.txt", "jsonld": "https://wpnews.pro/news/rk3576-hailo-8-12x-ai-performance-boost.jsonld"}}