%0 Journal Article
%T CasNet: A Cascade Coarse-to-Fine Network for Semantic Segmentation
%A Zhenyang Wang
%A Zhidong Deng
%A Shiyao Wang
%J Tsinghua Science and Technology
%@ 1878-7606
%D 2019
%R 10.26599/TST.2018.9010044
%X Semantic segmentation is a fundamental topic in computer vision. Because it requires dense predictions for an entire image, a network can hardly achieve good performance across diverse kinds of scenes. In this paper, we propose a cascade coarse-to-fine network called CasNet, which focuses on regions that are difficult to label at the pixel level. CasNet comprises three branches. The first branch is designed to produce coarse predictions for easy-to-label pixel regions. The second learns to identify the relatively difficult-to-label pixels within the entire image. The last branch generates the final predictions by combining the coarse and fine prediction results through a weighting coefficient estimated by the second branch. The three branches focus on their own objectives and collaboratively learn to refine predictions from coarse to fine. To evaluate the proposed network, we conduct experiments on two public datasets: SIFT Flow and Stanford Background. We show that the three branches can be trained in an end-to-end manner, and the experimental results demonstrate that the proposed CasNet outperforms existing state-of-the-art models, achieving prediction accuracies of 91.6% and 89.7% on SIFT Flow and Stanford Background, respectively.
%K semantic segmentation
%K convolutional neural network
%K hard negative mining
%U http://tst.tsinghuajournals.com/EN/10.26599/TST.2018.9010044