VM-UNet++ research on crack image segmentation based on improved VM-UNet

Scritto il 16/03/2025
da Wenliang Tang

Sci Rep. 2025 Mar 15;15(1):8938. doi: 10.1038/s41598-025-92994-7.

ABSTRACT

Cracks are common defects in physical structures, and if not detected and addressed in a timely manner, they can pose a severe threat to the overall safety of the structure. In recent years, with advancements in deep learning, particularly the widespread use of Convolutional Neural Networks (CNNs) and Transformers, significant breakthroughs have been made in the field of crack detection. However, CNNs still face limitations in capturing global information due to their local receptive fields when processing images. On the other hand, while Transformers are powerful in handling long-range dependencies, their high computational cost remains a significant challenge. To effectively address these issues, this paper proposes an innovative modification to the VM-UNet model. This modified model strategically integrates the strengths of the Mamba architecture and UNet to significantly improve the accuracy of crack segmentation. In this study, we optimized the original VM-UNet architecture to better meet the practical needs of crack segmentation tasks. Through comparative experiments on the Crack500 and Ozgenel public datasets, the results clearly demonstrate that the improved VM-UNet achieves significant advancements in segmentation accuracy. Compared to the original VM-UNet and other state-of-the-art models, VM-UNet++ shows a 3% improvement in mDS and a 4.6-6.2% increase in mIoU. These results fully validate the effectiveness of our improvement strategy. Additionally, VM-UNet++ demonstrates lower parameter count and floating-point operations, while maintaining a relatively satisfactory inference speed. These improvements make VM-UNet++ advantageous for practical applications.

PMID:40089495 | DOI:10.1038/s41598-025-92994-7