%0 Journal Article
%T CasNet: A Cascade Coarse-to-Fine Network for Semantic Segmentation
%A Zhenyang Wang
%A Zhidong Deng
%A Shiyao Wang
%J Tsinghua Science and Technology
%@ 1878-7606
%D 2019
%R 10.26599/TST.2018.9010044
%X Semantic segmentation is a fundamental topic in computer vision. Because it requires dense predictions over an entire image, a network can hardly achieve good performance across diverse kinds of scenes. In this paper, we propose a cascade coarse-to-fine network called CasNet, which focuses on regions whose pixels are difficult to label. CasNet comprises three branches. The first branch produces coarse predictions for easy-to-label pixel regions. The second learns to distinguish the relatively difficult-to-label pixels from the rest of the image. The third generates the final predictions by combining the coarse and fine prediction results through a weighting coefficient estimated by the second branch. Each of the three branches focuses on its own objective, and together they learn to predict in a coarse-to-fine manner. To evaluate the proposed network, we conduct experiments on two public datasets: SIFT Flow and Stanford Background. We show that the three branches can be trained in an end-to-end manner. The experimental results demonstrate that the proposed CasNet outperforms existing state-of-the-art models, achieving prediction accuracies of 91.6% and 89.7% on SIFT Flow and Stanford Background, respectively.
%K semantic segmentation
%K convolutional neural network
%K hard negative mining
%U http://tst.tsinghuajournals.com/EN/10.26599/TST.2018.9010044