UAV-based Deep Learning Applications for Automated Inspection of Civil Infrastructure

Modern technologies such as Unmanned Aerial Vehicle (UAV)-based inspection and deep learning (DL) algorithms introduce new opportunities and challenges in civil engineering. To facilitate the adoption and advancement of UAV-based detection technologies, this paper conducts a systematic literature review of a large body of articles and performs a comprehensive investigation and comparison across four topics: (1) the technical specifications of currently utilized UAV platforms and their on-board sensors, (2) the categories of inspected infrastructure and the corresponding defects, (3) publicly available datasets of infrastructure defects, and (4) DL algorithms designed for defect detection. Based on the analysis of the collected work, challenges hindering the development of UAV-based infrastructure inspection are identified, and solutions and potential future opportunities are proposed. This review aims to help researchers and practitioners accelerate progress towards more efficient and safer autonomous UAV-based structural inspection in civil engineering.

References

Reviewed papers included only in the statistics:

[1] Z. Wu, R. Kalfarisi, F. Kouyoumdjian, and C. Taelman. “Applying deep convolutional neural network with 3D reality mesh model for water tank crack detection and evaluation”. In: Urban Water Journal 17.8 (2020), pp. 682–695. DOI: 10.1080/1573062x.2020.1758166.

[2] F. Nex, D. Duarte, A. Steenbeek, and N. Kerle. “Towards real-time building damage mapping with low-cost UAV solutions”. In: Remote Sensing 11.3 (2019), p. 287. DOI: 10.3390/rs11030287.

[3] A. Alzarrad, I. Awolusi, M. T. Hatamleh, and S. Terreno. “Automatic assessment of roofs conditions using artificial intelligence (AI) and unmanned aerial vehicles (UAVs)”. In: Frontiers in Built Environment 8 (2022), p. 1026225. DOI: 10.3389/fbuil.2022.1026225.

[4] J. Ding, J. Zhang, Z. Zhan, X. Tang, and X. Wang. “A precision efficient method for collapsed building detection in post-earthquake UAV images based on the improved NMS algorithm and Faster R-CNN”. In: Remote Sensing 14.3 (2022), p. 663. DOI: 10.3390/rs14030663.

[5] J. Wang, P. Wang, L. Qu, Z. Pei, and T. Ueda. “Automatic detection of building surface cracks using UAV and deep learning-combined approach”. In: Structural Concrete (2024). DOI: 10.1002/suco.202300937.

[6] K. Lee, S. Lee, and H. Y. Kim. “Bounding-box object augmentation with random transformations for automated defect detection in residential building façades”. In: Automation in Construction 135 (2022), p. 104138. DOI: 10.1016/j.autcon.2022.104138.

[7] T. Ghosh Mondal, M. R. Jahanshahi, R.-T. Wu, and Z. Y. Wu. “Deep learning-based multi-class damage detection for autonomous post-disaster reconnaissance”. In: Structural Control and Health Monitoring 27.4 (2020), e2507. DOI: 10.1002/stc.2507.

[8] Y. Wang, W. Feng, K. Jiang, Q. Li, R. Lv, and J. Tu. “Real-time damaged building region detection based on improved YOLOv5s and embedded system from UAV images”. In: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2023). DOI: 10.1109/jstars.2023.3268312.

[9] S. Tilon, F. Nex, N. Kerle, and G. Vosselman. “Post-disaster building damage detection from earth observation imagery using unsupervised and transferable anomaly detecting generative adversarial networks”. In: Remote Sensing 12.24 (2020), p. 4193. DOI: 10.3390/rs12244193.

[10] M. Vlaminck, R. Heidbüchel, W. Philips, and H. Luong. “Region-based CNN for anomaly detection in PV power plants using aerial imagery”. In: Sensors 22.3 (2022), p. 1244. DOI: 10.3390/s22031244.

[11] J. Starzyński, P. Zawadzki, and D. Haraczyński. “Machine learning in solar plants inspection automation”. In: Energies 15.16 (2022), p. 5966. DOI: 10.3390/en15165966.

[12] P. Kuznetsov, D. Kotelnikov, L. Yuferev, V. Panchenko, V. Bolshev, M. Jasiński, and A. Flah. “Method for the automated inspection of the surfaces of photovoltaic modules”. In: Sustainability 14.19 (2022), p. 11930. DOI: 10.3390/su141911930.

[13] A. Barrett, D. Bratanov, N. Amarasingam, D. Sera, and F. Gonzalez. “Machine learning based damage detection in photovoltaic arrays using UAV-acquired infrared and visual imagery”. In: 2024 International Conference on Unmanned Aircraft Systems (ICUAS). IEEE, 2024, pp. 264–271. DOI: 10.1109/icuas56888.2024.10065684.

[14] D. Langenkämper, T. Möller, D. Brüin, and T. W. Nattkemper. “Efficient visual monitoring of offshore windmill installations using image annotation and deep learning techniques”. In: Global Oceans 2020: Singapore–US Gulf Coast. IEEE, 2020, pp. 1–6. DOI: 10.1109/ieeec38869.2020.9389035.

[15] N. Kerle, F. Nex, D. Duarte, and A. Vetrivel. “UAV-based structural damage mapping – results from 6 years of research in two European projects”. In: The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-3/W8 (2019), pp. 187–194. DOI: 10.5194/isprs-archives-XLII-3-W8-187-2019.

[16] Y. Wu, P. Chen, Y. Qin, Y. Qian, F. Xu, and L. Jia. “Automatic railroad track components inspection using hybrid deep learning framework”. In: IEEE Transactions on Instrumentation and Measurement 72 (2023), pp. 1–15. DOI: 10.1109/TIM.2023.3265636.

[17] Y. Wu, Y. Qin, Y. Qian, F. Guo, Z. Wang, and L. Jia. “Hybrid deep learning architecture for rail surface segmentation and surface defect detection”. In: Computer-Aided Civil and Infrastructure Engineering 37.2 (2022), pp. 227–244. DOI: 10.1111/mice.12710.

[18] P. Bojarczak and P. Lesiak. “UAVs in rail damage image diagnostics supported by deep-learning networks”. In: Open Engineering 11.1 (2021), pp. 339–348. DOI: 10.1515/eng-2021-0033.

[19] H. S. Munawar, F. Ullah, D. Shahzad, A. Heravi, S. Qayyum, and J. Akram. “Civil infrastructure damage and corrosion detection: An application of machine learning”. In: Buildings 12.2 (2022), p. 156. DOI: 10.3390/buildings12020156.

[20] H.-H. von Benzon and X. Chen. “Mapping damages from inspection images to 3D digital twins of large-scale structures”. In: Engineering Reports 7.1 (2025), e12837. DOI: 10.1002/eng2.12837.

[21] S. Zhou, S. Tai, L. Zhang, D. Cheng, L. Zhu, Y. Li, and X. Ye. “Application improvement of deep learning algorithm in small-sized fittings, voltage balancing ring and bare conductor detection of transmission lines”. In: International Journal of Pattern Recognition and Artificial Intelligence 37.11 (2023), p. 2352017. DOI: 10.1142/s0218001423520171.

[22] C. Xu, M. Xin, Y. Wang, and J. Gao. “An efficient YOLO v3-based method for the detection of transmission line defects”. In: Frontiers in Energy Research 11 (2023), p. 1236915. DOI: 10.3389/fenrg.2023.1236915.

[23] B. J. Souza, S. F. Stefenon, G. Singh, and R. Z. Freire. “Hybrid-YOLO for classification of insulators defects in transmission lines based on UAV”. In: International Journal of Electrical Power & Energy Systems 148 (2023), p. 108982. DOI: 10.1016/j.ijepes.2023.108982.

[24] G. Singh, S. F. Stefenon, and K.-C. Yow. “Interpretable visual transmission lines inspections using pseudo-prototypical part network”. In: Machine Vision and Applications 34.3 (2023), p. 41. DOI: 10.1007/s00138-023-01390-6.

[25] H. Cheng, Y. Li, and Y. Li. “Embankment surface crack pixel-wise identification in UAV images based on a lightweight U-Network with transfer learning”. In: Structures 58 (2023), p. 105640. DOI: 10.1016/j.istruc.2023.105640.

[26] H. Cheng, Y. Li, H. Li, and Q. Hu. “Embankment crack detection in UAV images based on efficient channel attention U2Net”. In: Structures 50 (2023), pp. 430–443. DOI: 10.1016/j.istruc.2023.02.010.

[27] Z. Zhang, Z. Shen, J. Liu, J. Shu, and H. Zhang. “A binocular vision-based crack detection and measurement method incorporating semantic segmentation”. In: Sensors 24.1 (2024), p. 3. DOI: 10.3390/s24010003.

[28] A. C. Loerch, D. A. Stow, L. L. Coulter, A. Nara, and J. Frew. “Comparing the accuracy of sUAS navigation, image co-registration and CNN-based damage detection between traditional and repeat station imaging”. In: Geosciences 12.11 (2022), p. 401. DOI: 10.3390/geosciences12110401.

[29] G. Bhattacharya, N. B. Puhan, and B. Mandal. “Kernelized dynamic convolution routing in spatial and channel interaction for active concrete defect recognition”. In: Signal Processing: Image Communication 108 (2022), p. 116818. DOI: 10.1016/j.image.2022.116818.

[30] Q. Zhu, T. H. Dinh, M. D. Phung, and Q. P. Ha. “Hierarchical convolutional neural network with feature preservation and autotuned thresholding for crack detection”. In: IEEE Access 9 (2021), pp. 60201–60214. DOI: 10.1109/access.2021.3073921.

[31] V. Hoskere, Y. Narazaki, T. A. Hoang, and B. Spencer Jr. “MaDnet: multi-task semantic segmentation of multiple types of structural materials and damage in images of civil infrastructure”. In: Journal of Civil Structural Health Monitoring 10.5 (2020), pp. 757–773. DOI: 10.1007/s13349-020-00409-0.

[32] S. Bhowmick, S. Nagarajaiah, and A. Veeraraghavan. “Vision and deep learning-based algorithms to detect and quantify cracks on concrete surfaces from UAV videos”. In: Sensors 20.21 (2020), p. 6299. DOI: 10.3390/s20216299.

[33] S. Egodawela, A. Khodadadian Gostar, H. S. Buddika, A. Dammika, N. Harischandra, S. Navaratnam, and M. Mahmoodian. “A deep learning approach for surface crack classification and segmentation in unmanned aerial vehicle assisted infrastructure inspections”. In: Sensors 24.6 (2024), p. 1936. DOI: 10.3390/s24061936.

[34] I. O. Agyemang, X. Zhang, D. Acheampong, I. Adjei-Mensah, G. A. Kusi, B. C. Mawuli, and B. L. Y. Agbley. “Autonomous health assessment of civil infrastructure using deep learning and smart devices”. In: Automation in Construction 141 (2022), p. 104396. DOI: 10.1016/j.autcon.2022.104396.

[35] S. Jiang and J. Zhang. “Real-time crack assessment using deep neural networks with wall-climbing unmanned aerial system”. In: Computer-Aided Civil and Infrastructure Engineering 35.6 (2020), pp. 549–564. DOI: 10.1111/mice.12519.

[36] S. Jiang, Y. Wu, and J. Zhang. “Bridge coating inspection based on two-stage automatic method and collision-tolerant unmanned aerial system”. In: Automation in Construction 146 (2023), p. 104685. DOI: 10.1016/j.autcon.2022.104685.

[37] Z.-f. Wang, Y.-f. Yu, J. Wang, J.-q. Zhang, H.-l. Zhu, P. Li, L. Xu, H.-n. Jiang, Q.-m. Sui, L. Jia, et al. “Convolutional neural-network-based automatic dam-surface seepage defect identification from thermograms collected from UAV-mounted thermal imaging camera”. In: Construction and Building Materials 323 (2022), p. 126416. DOI: 10.1016/j.conbuildmat.2022.126416.

[38] F. Song, Y. Sun, and G. Yuan. “Autonomous identification of bridge concrete cracks using unmanned aircraft images and improved lightweight deep convolutional networks”. In: Structural Control and Health Monitoring 2024.1 (2024), p. 7857012. DOI: 10.1155/2024/7857012.

[39] S. Jiang, J. Zhang, and C. Gao. “Bridge deformation measurement using unmanned aerial dual camera and learning-based tracking method”. In: Structural Control and Health Monitoring 2023.1 (2023), p. 4752072. DOI: 10.1155/2023/4752072.

[40] S. Jiang, J. Zhang, W. Wang, and Y. Wang. “Automatic inspection of bridge bolts using unmanned aerial vision and adaptive scale unification-based deep learning”. In: Remote Sensing 15.2 (2023), p. 328. DOI: 10.3390/rs15020328.

[41] F. P. García Márquez, P. J. Bernalte Sánchez, and I. Segovia Ramírez. “Acoustic inspection system with unmanned aerial vehicles for wind turbines structure health monitoring”. In: Structural Health Monitoring 21.2 (2022), pp. 485–500. DOI: 10.1177/14759217211004822.

[42] Z. Yuqing. “A hybrid convolutional neural network and relief-f algorithm for fault power line recognition in internet of things-based smart grids”. In: Wireless Communications and Mobile Computing 2022.1 (2022), p. 4911553. DOI: 10.1155/2022/4911553.

[43] M. W. Khan, M. S. Obaidat, K. Mahmood, D. Batool, H. M. S. Badar, M. Aamir, and W. Gao. “Real-time road damage detection and infrastructure evaluation leveraging unmanned aerial vehicles and tiny machine learning”. In: IEEE Internet of Things Journal (2024). DOI: 10.1109/jiot.2024.3385994.

[44] S. Feng, M. Gao, X. Jin, T. Zhao, and F. Yang. “Fine-grained damage detection of cement concrete pavement based on UAV remote sensing image segmentation and stitching”. In: Measurement 226 (2024), p. 113844. DOI: 10.1016/j.measurement.2023.113844.

[45] A. Ji, X. Xue, Y. Wang, X. Luo, and L. Wang. “Image-based road crack risk-informed assessment using a convolutional neural network and an unmanned aerial vehicle”. In: Structural Control and Health Monitoring 28.7 (2021), e2749. DOI: 10.1002/stc.2749.

[46] Z. Chen, M. Wagner, J. Das, R. K. Doe, and R. S. Cerveny. “Data-driven approaches for tornado damage estimation with unpiloted aerial systems”. In: Remote Sensing 13.9 (2021), p. 1669. DOI: 10.3390/rs13091669.

[47] M. E. Mohammadi, D. P. Watson, and R. L. Wood. “Deep learning-based damage detection from aerial SfM point clouds”. In: Drones 3.3 (2019), p. 68. DOI: 10.3390/drones3030068.

[48] M. Rahnemoonfar, T. Chowdhury, A. Sarkar, D. Varshney, M. Yari, and R. R. Murphy. “FloodNet: A high resolution aerial imagery dataset for post flood scene understanding”. In: IEEE Access 9 (2021), pp. 89644–89654. DOI: 10.1109/access.2021.3090981.

[49] C. Luo, L. Yu, J. Yan, Z. Li, P. Ren, X. Bai, E. Yang, and Y. Liu. “Autonomous detection of damage to multiple steel surfaces from 360° panoramas using deep neural networks”. In: Computer-Aided Civil and Infrastructure Engineering 36.12 (2021), pp. 1585–1599. DOI: 10.1111/mice.12686.

[50] L. Zhang, X. Wu, Z. Liu, P. Yu, and M. Yang. “ESD-YOLOv8: An efficient solar cell fault detection model based on YOLOv8”. In: IEEE Access 12 (2024), pp. 138801–138815. DOI: 10.1109/access.2024.3466209.

[51] S. A. Fakhri, M. Satari Abrovi, H. Zakeri, A. Safdarinezhad, and S. A. Fakhri. “Pavement crack detection through a deep-learned asymmetric encoder-decoder convolutional neural network”. In: International Journal of Pavement Engineering 24.1 (2023), p. 2255359. DOI: 10.1080/10298436.2023.2255359.

[52] Y. Li, J. Ma, Z. Zhao, and G. Shi. “A novel approach for UAV image crack detection”. In: Sensors 22.9 (2022), p. 3305. DOI: 10.3390/s22093305.

[53] Y. Pan, X. Chen, Q. Sun, and X. Zhang. “Monitoring asphalt pavement aging and damage conditions from low-altitude UAV imagery based on a CNN approach”. In: Canadian Journal of Remote Sensing 47.3 (2021), pp. 432–449. DOI: 10.1080/07038992.2020.1870217.

[54] Y. Jiang, S. Han, and Y. Bai. “Building and infrastructure defect detection and visualization using drone and deep learning technologies”. In: Journal of Performance of Constructed Facilities 35.6 (2021), p. 04021092. DOI: 10.1061/(asce)cf.1943-5509.0001652.

[55] M.-M. Naddaf-Sh, S. Hosseini, J. Zhang, N. A. Brake, and H. Zargarzadeh. “Real-time road crack mapping using an optimized convolutional neural network”. In: Complexity 2019 (2019), pp. 1–17. DOI: 10.1155/2019/2470735.

[56] C. Feng, H. Zhang, Y. Li, S. Wang, and H. Wang. “Efficient real-time defect detection for spillway tunnel using deep learning”. In: Journal of Real-Time Image Processing 18.6 (2021), pp. 2377–2387. DOI: 10.1007/s11554-021-01130-x.

[57] V. De Arriba López, M. Maboudi, P. Achancaray, and M. Gerke. “Automatic non-destructive UAV-based structural health monitoring of steel container cranes”. In: Applied Geomatics (2023), pp. 1–21. DOI: 10.1007/s12518-023-00542-7.

References for deep learning algorithms:

1. CNN variants:

ResNet: K. He, X. Zhang, S. Ren, and J. Sun. “Deep residual learning for image recognition”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016, pp. 770–778. DOI: 10.1109/cvpr.2016.90.

VGG: K. Simonyan and A. Zisserman. “Very deep convolutional networks for large-scale image recognition”. In: arXiv preprint arXiv:1409.1556 (2014). DOI: 10.48550/arXiv.1409.1556.

Inception: C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. “Going deeper with convolutions”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015, pp. 1–9. DOI: 10.1109/cvpr.2015.7298594.

Xception: F. Chollet. “Xception: Deep learning with depthwise separable convolutions”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, pp. 1800–1807. DOI: 10.1109/cvpr.2017.195.

ConvNeXt: Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie. “A ConvNet for the 2020s”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2022, pp. 11976–11986. DOI: 10.1109/cvpr52688.2022.01167.

HRNet: J. Wang, K. Sun, T. Cheng, B. Jiang, C. Deng, Y. Zhao, D. Liu, Y. Mu, M. Tan, X. Wang, et al. “Deep high-resolution representation learning for visual recognition”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 43.10 (2020), pp. 3349–3364. DOI: 10.1109/tpami.2020.2983686.

EfficientNet: M. Tan and Q. Le. “EfficientNet: Rethinking model scaling for convolutional neural networks”. In: International Conference on Machine Learning (ICML). PMLR. 2019, pp. 6105–6114. URL: https://proceedings.mlr.press/v97/tan19a.html.

ShuffleNet: X. Zhang, X. Zhou, M. Lin, and J. Sun. “ShuffleNet: An extremely efficient convolutional neural network for mobile devices”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018, pp. 6848–6856. DOI: 10.1109/cvpr.2018.00716.

MobileNet: A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. “MobileNets: Efficient convolutional neural networks for mobile vision applications”. In: arXiv preprint arXiv:1704.04861 (2017). URL: https://arxiv.org/abs/1704.04861.

DenseNet: G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger. “Densely connected convolutional networks”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, pp. 4700–4708. DOI: 10.1109/cvpr.2017.243.

2. Transformer variants:

Vision Transformer (ViT): A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al. “An image is worth 16×16 words: Transformers for image recognition at scale”. In: arXiv preprint arXiv:2010.11929 (2020). URL: https://arxiv.org/abs/2010.11929.

Swin Transformer: Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo. “Swin transformer: Hierarchical vision transformer using shifted windows”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2021, pp. 10012–10022. DOI: 10.1109/iccv48922.2021.00986.

Mask2Former: B. Cheng, I. Misra, A. G. Schwing, A. Kirillov, and R. Girdhar. “Masked-attention mask transformer for universal image segmentation”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2022, pp. 1290–1299. DOI: 10.1109/cvpr52688.2022.00135.

SegFormer: E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, and P. Luo. “SegFormer: Simple and efficient design for semantic segmentation with transformers”. In: Advances in Neural Information Processing Systems 34 (2021), pp. 12077–12090. URL: https://proceedings.neurips.cc/paper_files/paper/2021/file/64f1ff27bf1b4ec22924fd0acb550c25-Paper.pdf.

DEtection TRansformer (DETR): N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko. “End-to-end object detection with transformers”. In: European Conference on Computer Vision (ECCV). Springer. 2020, pp. 213–229. DOI: 10.1007/978-3-030-58452-8_13.

3. Object detection models:

SSD: W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. “SSD: Single shot multibox detector”. In: European Conference on Computer Vision (ECCV). Springer. 2016, pp. 21–37. DOI: 10.1007/978-3-319-46448-0_2.

YOLO: J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. “You only look once: Unified, real-time object detection”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016, pp. 779–788. DOI: 10.1109/cvpr.2016.91.

EfficientDet: M. Tan, R. Pang, and Q. V. Le. “EfficientDet: Scalable and efficient object detection”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020, pp. 10781–10790. DOI: 10.1109/cvpr42600.2020.01079.

R-CNN: R. Girshick, J. Donahue, T. Darrell, and J. Malik. “Rich feature hierarchies for accurate object detection and semantic segmentation”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014, pp. 580–587. DOI: 10.1109/cvpr.2014.81.

4. Semantic segmentation models:

FCN: E. Shelhamer, J. Long, and T. Darrell. “Fully convolutional networks for semantic segmentation”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 39.4 (2017), pp. 640–651. DOI: 10.1109/tpami.2016.2572683.

U-Net: O. Ronneberger, P. Fischer, and T. Brox. “U-net: Convolutional networks for biomedical image segmentation”. In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference. Springer. 2015, pp. 234–241. DOI: 10.1007/978-3-319-24574-4_28.

U2-Net: X. Qin, Z. Zhang, C. Huang, M. Dehghan, O. R. Zaiane, and M. Jagersand. “U2-Net: Going deeper with nested U-structure for salient object detection”. In: Pattern Recognition 106 (2020), p. 107404. DOI: 10.1016/j.patcog.2020.107404.

SegNet: V. Badrinarayanan, A. Kendall, and R. Cipolla. “SegNet: A deep convolutional encoder-decoder architecture for image segmentation”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 39.12 (2017), pp. 2481–2495. DOI: 10.1109/tpami.2016.2644615.

DeepLab: L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. “DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 40.4 (2017), pp. 834–848. DOI: 10.1109/tpami.2017.2699184.

CrackNet: L. Zhang, F. Yang, Y. D. Zhang, and Y. J. Zhu. “Road crack detection using deep convolutional neural network”. In: 2016 IEEE International Conference on Image Processing (ICIP). IEEE. 2016, pp. 3708–3712. DOI: 10.1109/icip.2016.7533052.

DeepCrack: Y. Liu, J. Yao, X. Lu, R. Xie, and L. Li. “DeepCrack: A deep hierarchical feature learning architecture for crack segmentation”. In: Neurocomputing 338 (2019), pp. 139–153. DOI: 10.1016/j.neucom.2019.01.036.

PSPNet: H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. “Pyramid scene parsing network”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, pp. 2881–2890. DOI: 10.1109/cvpr.2017.660.

Segment Anything Model (SAM): A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y. Lo, et al. “Segment anything”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2023, pp. 4015–4026. DOI: 10.1109/iccv57321.2023.00371.

5. Instance segmentation models:

Mask R-CNN: K. He, G. Gkioxari, P. Dollár, and R. Girshick. “Mask R-CNN”. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2017, pp. 2961–2969. DOI: 10.1109/iccv.2017.322.

Mask Scoring R-CNN: Z. Huang, L. Huang, Y. Gong, C. Huang, and X. Wang. “Mask scoring R-CNN”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, pp. 6409–6418. DOI: 10.1109/cvpr.2019.00657.