ATFF: Advanced Transformer with Multiscale Contextual Fusion for Medical Image Segmentation

DOI: 10.4236/jcc.2024.123015, PP. 238-251

Keywords: Medical Image Segmentation, Advanced Transformer, Deep Supervision, Attention Mechanism


Abstract:

Deep convolutional neural networks (CNNs) have greatly advanced the automatic segmentation of medical images. However, owing to the inherent locality of convolution operations, CNNs usually cannot establish long-range dependencies, which limits segmentation performance. Transformers have been successfully applied to various computer vision tasks, using the self-attention mechanism to model long-range interactions and thereby capture global information. However, self-attention lacks spatial locality and is computationally expensive. To address these problems, we develop a new medical transformer with multi-scale context fusion for medical image segmentation. The proposed model combines convolution operations and attention mechanisms in a U-shaped framework, so it can capture both local and global information. First, the conventional transformer block is improved into an advanced transformer block, which uses post-layer normalization to obtain mild activation values and scaled cosine attention with a shifted window to obtain accurate spatial information. Second, we introduce a deep supervision strategy that guides the model to fuse multi-scale feature information, further enabling it to propagate feature information effectively across layers. Thanks to this, the model achieves better segmentation performance while being more robust and efficient. The proposed model is evaluated on multiple medical image segmentation datasets. Experimental results demonstrate that, on a challenging dataset (ETIS), it outperforms existing methods that rely only on convolutional neural networks, transformers, or a combination of both, improving mDice and mIoU by 2.74% and 3.3%, respectively.
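The scaled cosine attention mentioned in the abstract (following Swin Transformer v2 [23]) can be sketched as below. This is a minimal single-head NumPy illustration, not the paper's implementation: the fixed temperature `tau`, the tensor shapes, and the omission of window partitioning and the relative-position bias are simplifying assumptions.

```python
import numpy as np

def scaled_cosine_attention(q, k, v, tau=0.1):
    """Scaled cosine attention: similarity is the cosine between query
    and key vectors divided by a temperature tau, instead of the usual
    dot product scaled by sqrt(d). Normalizing q and k bounds the
    similarities to [-1/tau, 1/tau], yielding milder attention logits."""
    # L2-normalize queries and keys so their dot products become cosines.
    qn = q / np.linalg.norm(q, axis=-1, keepdims=True)
    kn = k / np.linalg.norm(k, axis=-1, keepdims=True)
    sim = qn @ kn.T / tau                      # (N, N) scaled cosine logits
    sim -= sim.max(axis=-1, keepdims=True)     # subtract row max for stability
    attn = np.exp(sim)
    attn /= attn.sum(axis=-1, keepdims=True)   # row-wise softmax
    return attn @ v                            # weighted sum of values

# Toy example: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((4, 8))
v = rng.standard_normal((4, 8))
out = scaled_cosine_attention(q, k, v)
print(out.shape)  # (4, 8)
```

In the full model this operation would be applied per attention head inside each (shifted) local window, with a learnable per-head temperature and a relative-position bias added to the logits before the softmax.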

References

[1]  Xia, S., Zhu, H., Liu, X., Gong, M., Huang, X., Xu, L., Zhang, H. and Guo, J. (2019) Vessel Segmentation of X-Ray Coronary Angiographic Image Sequence. IEEE Transactions on Biomedical Engineering, 67, 1338-1348.
https://doi.org/10.1109/TBME.2019.2936460
[2]  Park, S. and Chung, M. (2022) Cardiac Segmentation on CT Images through Shape-Aware Contour Attentions. Computers in Biology and Medicine, 147, Article ID: 105782.
https://doi.org/10.1016/j.compbiomed.2022.105782
[3]  Huo, Y., Liu, J., Xu, Z., Harrigan, R., Assad, A., Abramson, R. and Landman, B. (2017) Robust Multicontrast MRI Spleen Segmentation for Splenomegaly Using Multi-Atlas Segmentation. IEEE Transactions on Biomedical Engineering, 65, 336-343.
https://doi.org/10.1109/TBME.2017.2764752
[4]  Ungi, T., Greer, H., Sunderland, K., Wu, V., Baum, Z., Schlenger, C., Oetgen, M., Cleary, K., Aylward, S. and Fichtinger, G. (2020) Automatic Spine Ultrasound Segmentation for Scoliosis Visualization and Measurement. IEEE Transactions on Biomedical Engineering, 67, 3234-3241.
https://doi.org/10.1109/TBME.2020.2980540
[5]  Cai, Y. and Wang, Y. (2022) MA-Unet: An Improved Version of Unet Based on Multi-Scale and Attention Mechanism for Medical Image Segmentation. 3rd International Conference on Electronics and Communication; Network and Computer Technology, Volume 12167, 205-211.
https://doi.org/10.1117/12.2628519
[6]  Ronneberger, O., Fischer, P. and Brox, T. (2015) U-Net: Convolutional Networks for Biomedical Image Segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, Berlin, 234-241.
https://doi.org/10.1007/978-3-319-24574-4_28
[7]  Oktay, O., Schlemper, J., Folgoc, L.L., et al. (2018) Attention U-Net: Learning Where to Look for the Pancreas.
[8]  Isensee, F., Jaeger, P.F., Kohl, S.A.A., et al. (2021) nnU-Net: A Self-Configuring Method for Deep Learning-Based Biomedical Image Segmentation. Nature Methods, 18, 203-211.
https://doi.org/10.1038/s41592-020-01008-z
[9]  Yap, M.H., Pons, G., Marti, J., et al. (2017) Automated Breast Ultrasound Lesions Detection Using Convolutional Neural Networks. IEEE Journal of Biomedical and Health Informatics, 22, 1218-1226.
https://doi.org/10.1109/JBHI.2017.2731873
[10]  Zhou, Z., Siddiquee, M.M.R., Tajbakhsh, N., et al. (2019) UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation. IEEE Transactions on Medical Imaging, 39, 1856-1867.
https://doi.org/10.1109/TMI.2019.2959609
[12]  Huang, H., Lin, L., Tong, R., et al. (2020) UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation. ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, 4-8 May 2020, 1055-1059.
https://doi.org/10.1109/ICASSP40776.2020.9053405
[13]  Valanarasu, J.M.J. and Patel, V.M. (2022) UNeXt: MLP-Based Rapid Medical Image Segmentation Network. Medical Image Computing and Computer Assisted Intervention-MICCAI 2022: 25th International Conference, Singapore, 18-22 September 2022, 23-33.
https://doi.org/10.1007/978-3-031-16443-9_3
[14]  Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Le, Q. and Salakhutdinov, R. (2019) Transformer-XL: Attentive Language Models beyond a Fixed-Length Context. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, July 2019, 2978-2988.
https://doi.org/10.18653/v1/P19-1285
[15]  Raghu, M., et al. (2021) Do Vision Transformers See like Convolutional Neural Networks? Advances in Neural Information Processing Systems, 34, 12116-12128.
[16]  Nguyen, T., Raghu, M. and Kornblith, S. (2020) Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth.
[17]  Oktay, O., Schlemper, J., Folgoc, L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y. and Kainz, B. (2019) Attention U-Net: Learning Where to Look for the Pancreas. Medical Image Analysis, 53, 197-207.
https://doi.org/10.1016/j.media.2019.01.012
[18]  Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A. and Zhou, Y. (2021) TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation.
[19]  Zhou, H.Y., Guo, J., Zhang, Y., et al. (2021) nnFormer: Interleaved Transformer for Volumetric Segmentation.
[20]  Huang, X., Deng, Z., Li, D., et al. (2021) MISSFormer: An Effective Medical Image Segmentation Transformer.
[21]  Vaswani, A., Shazeer, N., Parmar, N., et al. (2017) Attention Is All You Need. arXiv: 1706.03762.
[22]  Wang, H., Zhu, Y., Green, B., Adam, H., Yuille, A. and Chen, L. (2020) Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation. European Conference on Computer Vision, Glasgow, 23-28 August 2020, 108-126.
https://doi.org/10.1007/978-3-030-58548-8_7
[23]  Liu, Z., Hu, H., Lin, Y., Yao, Z., Xie, Z., Wei, Y., Ning, J., Cao, Y., Zhang, Z., Dong, L., et al. (2022) Swin Transformer v2: Scaling up Capacity and Resolution. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, 18-24 June 2022, 12009-12019.
https://doi.org/10.1109/CVPR52688.2022.01170
[25]  Vazquez, D., Bernal, J., Sanchez, F.J., Fernandez-Esparrach, G., Lopez, A.M., Romero, A., Drozdzal, M. and Courville, A. (2017) A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images. Journal of Healthcare Engineering, 2017, Article ID: 4037190.
https://doi.org/10.1155/2017/4037190
[26]  Jha, D., Smedsrud, P.H., Riegler, M.A., et al. (2020) Kvasir-SEG: A Segmented Polyp Dataset. MultiMedia Modeling: 26th International Conference, MMM 2020, Daejeon, 5-8 January 2020, 451-462.
https://doi.org/10.1007/978-3-030-37734-2_37
[27]  Bernal, J., Sanchez, F.J., Fernandez-Esparrach, G., Gil, D., Rodríguez, C. and Vilariño, F. (2015) WM-DOVA Maps for Accurate Polyp Highlighting in Colonoscopy: Validation vs. Saliency Maps from Physicians. Computerized Medical Imaging and Graphics, 43, 99-111.
https://doi.org/10.1016/j.compmedimag.2015.02.007
[28]  Tajbakhsh, N., Gurudu, S.R. and Liang, J. (2015) Automated Polyp Detection in Colonoscopy Videos Using Shape and Context Information. IEEE Transactions on Medical Imaging, 35, 630-644.
https://doi.org/10.1109/TMI.2015.2487997
[29]  Silva, J., Histace, A., Romain, O., Dray, X. and Granado, B. (2014) Toward Embedded Detection of Polyps in WCE Images for Early Diagnosis of Colorectal Cancer. The International Journal of Computer Assisted Radiology and Surgery, 9, 283-293.
https://doi.org/10.1007/s11548-013-0926-3
[30]  Han, Z., Jian, M. and Wang, G. (2022) ConvUNeXt: An Efficient Convolution Neural Network for Medical Image Segmentation. Knowledge-Based Systems, 253, Article ID: 109512.
https://doi.org/10.1016/j.knosys.2022.109512
[31]  Fu, Z., Li, J. and Hua, Z. (2022) DEAU-Net: Attention Networks Based on Dual Encoder for Medical Image Segmentation. Computers in Biology and Medicine, 150, Article ID: 106197.
https://doi.org/10.1016/j.compbiomed.2022.106197
[32]  Valanarasu, J., Oza, P., Hacihaliloglu, I. and Patel, V. (2021) Medical Transformer: Gated Axial-Attention for Medical Image Segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, 27 September-1 October 2021, 36-46.
https://doi.org/10.1007/978-3-030-87193-2_4
[33]  Tomar, N., Jha, D., Bagci, U. and Ali, S. (2022) TGANet: Text-Guided Attention for Improved Polyp Segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Singapore, 18-22 September 2022, 151-160.
https://doi.org/10.1007/978-3-031-16437-8_15
[34]  Cao, H., Karlinsky, L., Michaeli, T., Nishino, K., et al. (2023) Swin-Unet: UNet-Like Pure Transformer for Medical Image Segmentation. European Conference on Computer Vision, Tel Aviv, 23-27 October 2022, 205-218.
https://doi.org/10.1007/978-3-031-25066-8_9
