RK3588-Based AI Security Gateway with 8K Video and 6 TOPS Edge Performance

RK3588-based Intelligent Security Gateway Solution: Enabling Closed-loop Multi-channel Video AI Inference and Real-time Streaming

As smart cities and Industry 4.0 develop at an accelerating pace, security monitoring systems are being upgraded from simply "seeing" to truly "understanding" what they observe. In intelligent transportation, smart buildings, and public safety in particular, real-time video analytics and rapid decision-making have become core requirements for operational efficiency.

To address the dual challenges of real-time performance and intelligence in multi-channel video surveillance, Forlinx Embedded has launched an intelligent multi-channel security gateway solution based on the RK3588 system on module (SoM). Using the on-chip hardware video codec unit (VPU) and neural network processing unit (NPU), it encodes and decodes multiple video channels in real time and runs inference with a variety of AI models, forming a closed loop from video capture to intelligent analysis.

1. Performance Analysis of FET3588-C SoM

Let's first look at the core configuration of the main controller. The FET3588-C SoM is built around Rockchip's flagship RK3588 processor, whose big.LITTLE architecture of 4×Cortex-A76 + 4×Cortex-A55 provides ample performance for multitasking. Its built-in triple-core NPU, developed in-house by Rockchip, can run its cores collaboratively or independently, so computing power can be allocated flexibly and not wasted.

The NPU delivers up to 6 TOPS of combined computing power, ample for edge-side AI. The processor also incorporates the new-generation 48-megapixel ISP 3.0, which supports lens shading correction, 2D/3D noise reduction, sharpening and dehazing, fisheye correction, gamma correction, and wide dynamic range contrast enhancement, significantly improving image quality.

In addition, it provides a wide range of high-speed data communication interfaces to suit different deployment needs.

2. System Data Flow

The FET3588-C SoM, multiple network cameras, and the PC are in a local area network formed by a switch. The FET3588-C SoM is responsible for the entire process of pulling and decoding the video streams input from the network cameras, conducting AI inference, and encoding and pushing the streams. The processed video streams are pushed to the designated streaming media server. The PC pulls and plays the corresponding videos from this streaming media server.
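
The per-stream flow described above (pull, decode, infer, encode, push) can be sketched roughly as follows. This is only an illustration and not the solution's actual code: the `Frame` type, `run_stream`, and the `fake_*` stages are hypothetical stand-ins so the example runs on any machine. On the real SoM, decoding and encoding would go through the VPU and inference through the NPU.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple

Packet = Tuple[str, int]  # (stream id, frame index), a stand-in for an RTSP packet


@dataclass
class Frame:
    stream_id: str
    index: int
    labels: List[str] = field(default_factory=list)


def run_stream(
    packets: Iterable[Packet],
    decode: Callable[[Packet], Frame],
    infer: Callable[[Frame], List[str]],
    encode: Callable[[Frame], bytes],
    push: Callable[[bytes], None],
) -> int:
    """Run pull -> decode -> infer -> encode -> push for one camera stream."""
    pushed = 0
    for packet in packets:            # pull: packets arrive from the camera
        frame = decode(packet)        # hardware decode on the real device
        frame.labels = infer(frame)   # NPU inference on the real device
        push(encode(frame))           # encode, then push to the media server
        pushed += 1
    return pushed


# Hypothetical stand-in stages (no camera, VPU, or NPU required).
def fake_decode(packet: Packet) -> Frame:
    stream_id, index = packet
    return Frame(stream_id, index)


def fake_infer(frame: Frame) -> List[str]:
    return ["person"] if frame.index % 2 == 0 else []


def fake_encode(frame: Frame) -> bytes:
    return f"{frame.stream_id}:{frame.index}:{','.join(frame.labels)}".encode()


media_server: List[bytes] = []  # stands in for the streaming media server
pushed = run_stream(
    [("cam1", i) for i in range(3)],
    fake_decode, fake_infer, fake_encode, media_server.append,
)
```

Passing the stages in as callables keeps the loop itself independent of any particular codec or inference library, so each stage can be swapped for a hardware-backed implementation.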

3. Highlights of the Solution

01. Efficient real-time video processing

The FET3588-C SoM uses Rockchip's fourth-generation codec technology. In addition to decoding mainstream H.265 at 8K@60fps and H.264 at 8K@30fps, it also decodes VP9 at 8K@60fps and AV1 at 4K@60fps.

Video encoding/decoding and AI inference are both performed entirely on the SoM. With multiple camera inputs, decoding takes about 15 ms per frame and inference takes about 30-200 ms per frame. By skipping frames, the solution can process multi-camera video in real time at around 20 fps.
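
To see why frame skipping is needed, note that a 20 fps stream leaves 50 ms per frame, while inference alone can take up to 200 ms. The source does not describe the exact skipping policy, so the sketch below assumes one plausible approach: run inference on every Nth frame and reuse the most recent result for the frames in between. The function names are hypothetical, and the calculation covers a single stream only, ignoring contention between streams that share the NPU.

```python
import math
from typing import Callable, List, Optional, Tuple


def inference_stride(infer_ms: float, fps: float) -> int:
    """Smallest N such that one inference fits within N frame intervals."""
    frame_interval_ms = 1000.0 / fps
    return max(1, math.ceil(infer_ms / frame_interval_ms))


def annotate_stream(
    num_frames: int,
    stride: int,
    infer: Callable[[int], str],
) -> List[Tuple[int, Optional[str]]]:
    """Infer on every `stride`-th frame; other frames reuse the last result."""
    last_result: Optional[str] = None
    annotated = []
    for index in range(num_frames):
        if index % stride == 0:
            last_result = infer(index)
        annotated.append((index, last_result))
    return annotated


fast_stride = inference_stride(infer_ms=30, fps=20)    # fits in one 50 ms slot
slow_stride = inference_stride(infer_ms=200, fps=20)   # needs four 50 ms slots
frames = annotate_stream(5, slow_stride, lambda i: f"result@{i}")
```

Under these assumptions, every frame is still decoded, encoded, and pushed, so the output stays at the camera's frame rate while the overlays update less often for slower models.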

02. Real-time inference of multiple AI models

The NPU on the FET3588-C SoM supports INT4/INT8/INT16/FP16 mixed-precision operations, which improve the MAC utilization rate by over 28%. It is also supported by the RKNN-Toolkit2 suite, which offers strong compatibility and can meet the edge-computing needs of most terminal devices.

This solution supports the deployment of multiple AI models to meet the requirements of different application scenarios. For example, one video stream can use an object detection model while another video stream uses a keypoint detection model. Currently, it supports YOLOv8 object detection models, keypoint detection models, and image segmentation models.
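
One simple way to route each stream to its own model is a lookup table from model type to a handler function. The sketch below is hypothetical: the stream names, the type strings, and the three handler functions are placeholders that return strings where real code would call into the inference runtime.

```python
from typing import Callable, Dict


# Placeholder handlers; real ones would run a model and draw its results.
def detect_objects(frame: str) -> str:
    return f"boxes({frame})"


def detect_keypoints(frame: str) -> str:
    return f"keypoints({frame})"


def segment_image(frame: str) -> str:
    return f"masks({frame})"


RUNNERS: Dict[str, Callable[[str], str]] = {
    "detection": detect_objects,
    "keypoint": detect_keypoints,
    "segmentation": segment_image,
}

# Hypothetical assignment of one model type per video stream.
STREAM_MODEL_TYPE: Dict[str, str] = {
    "cam1": "keypoint",
    "cam2": "detection",
    "cam3": "segmentation",
}


def process_frame(stream_id: str, frame: str) -> str:
    """Run the model assigned to this stream on one frame."""
    model_type = STREAM_MODEL_TYPE[stream_id]
    return RUNNERS[model_type](frame)
```

Adding a new model type then only means registering one more handler, without touching the per-frame loop.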

03. Flexible configuration through an external configuration file

Information such as the cameras' RTSP addresses and the paths and types of the corresponding AI models can be set directly in a configuration file that is independent of the program. There is no need to modify or recompile the program, which makes the solution convenient to adjust and use.
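
The source does not specify the configuration file's format or field names. As an illustration only, the sketch below assumes a JSON file with a `streams` list whose entries carry `rtsp_url`, `model_path`, and `model_type`; the addresses and model paths are made-up examples.

```python
import json
from dataclasses import dataclass
from typing import List

ALLOWED_MODEL_TYPES = {"detection", "keypoint", "segmentation"}  # assumed names


@dataclass(frozen=True)
class StreamConfig:
    rtsp_url: str
    model_path: str
    model_type: str


def load_streams(text: str) -> List[StreamConfig]:
    """Parse and validate the per-stream settings from JSON text."""
    streams = []
    for entry in json.loads(text)["streams"]:
        cfg = StreamConfig(
            entry["rtsp_url"], entry["model_path"], entry["model_type"]
        )
        if not cfg.rtsp_url.startswith("rtsp://"):
            raise ValueError(f"not an RTSP address: {cfg.rtsp_url}")
        if cfg.model_type not in ALLOWED_MODEL_TYPES:
            raise ValueError(f"unknown model type: {cfg.model_type}")
        streams.append(cfg)
    return streams


# Hypothetical file contents; addresses and paths are placeholders.
EXAMPLE_CONFIG = """
{
  "streams": [
    {"rtsp_url": "rtsp://192.168.0.11/stream1",
     "model_path": "models/pose.rknn", "model_type": "keypoint"},
    {"rtsp_url": "rtsp://192.168.0.12/stream1",
     "model_path": "models/detect.rknn", "model_type": "detection"}
  ]
}
"""

streams = load_streams(EXAMPLE_CONFIG)
```

Validating the file at startup means a mistyped address or model type is reported immediately, before any stream is opened.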

4. Application Examples

The FET3588-C SoM, four network cameras, and the PC are connected via Ethernet cables and a switch, and all devices are on the same network segment. The cameras output H.264-encoded RTSP streams at 1080p and 20 fps. On the PC, players such as FFplay and VLC can pull and play the four video streams.

Screen 1: Demonstrates the human keypoint detection model.

Screen 2: Shows the object detection model.

Screen 3: Displays the image segmentation model.

Screen 4: Performs area-warning detection (when a target enters the marked area, the area's frame changes from green to red).
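
The area-warning logic on Screen 4 can be approximated with a simple geometric test. The source does not say how "entering the area" is decided, so this sketch assumes a rectangular warning area and treats a target as inside when the bottom-center point of its bounding box falls within it. All names are hypothetical, and the colors are written in BGR order, as used by OpenCV drawing functions.

```python
from typing import Iterable, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
Color = Tuple[int, int, int]      # BGR order (an assumption)

GREEN: Color = (0, 255, 0)
RED: Color = (0, 0, 255)


def foot_point(box: Box) -> Tuple[int, int]:
    """Bottom-center of a bounding box, roughly where a target stands."""
    x1, _, x2, y2 = box
    return ((x1 + x2) // 2, y2)


def is_inside(point: Tuple[int, int], area: Box) -> bool:
    x, y = point
    ax1, ay1, ax2, ay2 = area
    return ax1 <= x <= ax2 and ay1 <= y <= ay2


def area_color(area: Box, targets: Iterable[Box]) -> Color:
    """Red if any target's foot point is inside the area, otherwise green."""
    if any(is_inside(foot_point(box), area) for box in targets):
        return RED
    return GREEN


warning_area: Box = (100, 100, 300, 300)
intruder: Box = (150, 120, 200, 250)    # foot point (175, 250) is inside
bystander: Box = (0, 0, 50, 50)         # foot point (25, 50) is outside
```

An irregular warning area would need a point-in-polygon test instead of the rectangle check, but the color-switching logic stays the same.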

5. Summary

Overall, the intelligent gateway solution based on Forlinx Embedded's FET3588-C SoM forms a complete technical closed loop, from multi-channel video input and decoding, through real-time AI inference, to encoding and pushing the streams to the media server. It can provide efficient edge-computing support for industrial video surveillance.