%0 Journal Article
%T Real-Time Semantic Segmentation Network Based on Multi-Scale Information
%A 胡家虎
%A 佘玉梅
%J Artificial Intelligence and Robotics Research
%P 19-29
%@ 2326-3423
%D 2024
%I Hans Publishing
%R 10.12677/AIRR.2024.131003
%X In tasks with limited processor resources, such as autonomous driving and UAVs, a model's parameter count and inference speed must be considered while maintaining good accuracy. Some semantic segmentation models that use a parallel structure to extract multi-scale information replace standard convolution with depthwise separable or grouped convolution to reduce computation. However, these operations increase network latency and reduce inference speed. To address this problem, a real-time semantic segmentation model based on an encoder-decoder structure is proposed. In the encoder, partial convolution is combined with dilated convolution to build different parallel modules that extract multi-scale information at different stages. In the decoder, upsampled features are fused. Evaluated on the Cityscapes and CamVid datasets, the model achieves a mean intersection over union (mIoU) of 71.3% and 66.8%, at running speeds of 97 frames/s and 98 frames/s, respectively. The results show that the model achieves a good balance between segmentation accuracy and running speed.
%K Real-Time Semantic Segmentation
%K Partial Convolution
%K Multi-Scale Information
%K Encoder-Decoder Structure
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=80524