{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T22:17:52Z","timestamp":1775081872305,"version":"3.50.1"},"reference-count":52,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,4,30]],"date-time":"2025-04-30T00:00:00Z","timestamp":1745971200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"<jats:sec><jats:title>Background<\/jats:title><jats:p>Wireless Capsule Endoscopy (WCE) enables non-invasive imaging of the gastrointestinal tract but generates vast video data, making real-time and accurate abnormality detection challenging. Traditional detection methods struggle with uncontrolled illumination, complex textures, and high-speed processing demands.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>This study presents a novel approach using Real-Time Detection Transformer (RT-DETR), a transformer-based object detection model, specifically optimized for WCE video analysis. The model captures contextual information between frames and handles variable image conditions. It was evaluated using the Kvasir-Capsule dataset, with performance assessed across three RT-DETR variants: Small (S), Medium (M), and X-Large (X).<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>RT-DETR-X achieved the highest detection precision. RT-DETR-M offered a practical trade-off between accuracy and speed, while RT-DETR-S processed frames at 270 FPS, enabling real-time performance. 
All three models demonstrated improved detection accuracy and computational efficiency compared to baseline methods.<\/jats:p><\/jats:sec><jats:sec><jats:title>Discussion<\/jats:title><jats:p>The RT-DETR framework significantly enhances precision and real-time performance in gastrointestinal abnormality detection using WCE. Its clinical potential lies in supporting faster and more accurate diagnosis. Future work will focus on further optimization and deployment in endoscopic video analysis systems.<\/jats:p><\/jats:sec>","DOI":"10.3389\/frai.2025.1529814","type":"journal-article","created":{"date-parts":[[2025,4,30]],"date-time":"2025-04-30T05:41:21Z","timestamp":1745991681000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Precision enhancement in wireless capsule endoscopy: a novel transformer-based approach for real-time video object detection"],"prefix":"10.3389","volume":"8","author":[{"given":"Tsedeke Temesgen","family":"Habe","sequence":"first","affiliation":[]},{"given":"Keijo","family":"Haataja","sequence":"additional","affiliation":[]},{"given":"Pekka","family":"Toivanen","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,4,30]]},"reference":[{"key":"B1","article-title":"A robust pipeline for classification and detection of bleeding frames in wireless capsule endoscopy using swin transformer and rt-detr","author":"Alavala","year":"2024","journal-title":"arXiv preprint arXiv:2406.08046"},{"key":"B2","article-title":"Transformer-based wireless capsule endoscopy bleeding tissue detection and classification","author":"Alawode","year":"2024","journal-title":"arXiv preprint arXiv:2412.19218"},{"key":"B3","doi-asserted-by":"publisher","first-page":"1085","DOI":"10.1016\/j.gie.2010.11.010","article-title":"Management of ingested foreign bodies and food impactions","volume":"73","year":"2011","journal-title":"Gastrointest. 
Endosc"},{"key":"B4","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1186\/s12938-023-01186-9","article-title":"Wireless capsule endoscopy multiclass classification using three-dimensional deep convolutional neural network model","volume":"22","author":"Bordbar","year":"2023","journal-title":"Biomed. Eng. Online"},{"key":"B5","doi-asserted-by":"publisher","first-page":"29391","DOI":"10.1109\/ACCESS.2023.3260983","article-title":"Enhanced classification of gastric lesions and early gastric cancer diagnosis in gastroscopy using multi-filter autoaugment","volume":"11","author":"Chae","year":"2023","journal-title":"IEEE Access"},{"key":"B6","doi-asserted-by":"publisher","first-page":"107782","DOI":"10.1016\/j.cmpb.2023.107782","article-title":"Pact-net: parallel CNNS and transformers for medical image segmentation","volume":"242","author":"Chen","year":"2023","journal-title":"Comput. Methods Programs Biomed"},{"key":"B7","doi-asserted-by":"publisher","first-page":"e210110","DOI":"10.1148\/ryai.210110","article-title":"Deep learning for the detection, localization, and characterization of focal liver lesions on abdominal us images","volume":"4","author":"Dadoun","year":"2022","journal-title":"Radiol. Artif. 
Intell"},{"key":"B8","doi-asserted-by":"publisher","first-page":"76108","DOI":"10.1109\/ACCESS.2023.3297097","article-title":"A two-stage method for polyp detection in colonoscopy images based on saliency object extraction and transformers","volume":"11","author":"de Moura Lima","year":"2023","journal-title":"IEEE Access"},{"key":"B9","volume-title":"Sleisenger and Fordtran's Gastrointestinal and Liver Disease: Pathophysiology, Diagnosis, Management","author":"Feldman","year":"2020"},{"key":"B10","volume-title":"Clinical Gastrointestinal Endoscopy E-Book","author":"Ginsberg","year":"2011"},{"key":"B11","doi-asserted-by":"publisher","first-page":"e0144023","DOI":"10.1128\/spectrum.01440-23","article-title":"Automatic patient-level recognition of four plasmodium species on thin blood smear by a real-time detection transformer (RT-DETR) object detection algorithm: a proof-of-concept and evaluation","volume":"12","author":"Guemas","year":"2024","journal-title":"Microbiol. Spectr"},{"key":"B12","doi-asserted-by":"publisher","first-page":"1217","DOI":"10.3390\/bioengineering11121217","article-title":"Breast tumor detection and diagnosis using an improved faster R-CNN in dce-MRI","volume":"11","author":"Gui","year":"2024","journal-title":"Bioengineering"},{"key":"B13","doi-asserted-by":"publisher","first-page":"126793","DOI":"10.1109\/ACCESS.2024.3456100","article-title":"Efficiency meets accuracy: benchmarking object detection models for pathology detection in wireless capsule endoscopy","volume":"12","author":"Habe","year":"2024","journal-title":"IEEE Access"},{"key":"B14","author":"Hao","year":"2024","journal-title":"A novel method of video object detection based on improved rt-detr"},{"key":"B15","doi-asserted-by":"publisher","first-page":"108","DOI":"10.1186\/s12876-015-0337-8","article-title":"Major predictors and management of small-bowel angioectasia","volume":"15","author":"Igawa","year":"2015","journal-title":"BMC 
Gastroenterol"},{"key":"B16","doi-asserted-by":"publisher","first-page":"6987","DOI":"10.3390\/s24216987","article-title":"Research on microscale vehicle logo detection based on real-time detection transformer (RT-DETR)","volume":"24","author":"Jin","year":"2024","journal-title":"Sensors"},{"key":"B17","doi-asserted-by":"publisher","DOI":"10.1109\/ICBPE.2009.5384106","author":"Khun","year":"2009","journal-title":"2009 International Conference on Biomedical and Pharmaceutical Engineering"},{"key":"B18","doi-asserted-by":"publisher","first-page":"1654","DOI":"10.1053\/j.gastro.2006.09.047","article-title":"Endoscopy of the upper GI tract: a training manual","volume":"131","author":"Kimberly","year":"2006","journal-title":"Gastroenterology"},{"key":"B19","first-page":"59","article-title":"The prevalence of helicobacter pylori in peptic ulcer disease","volume":"9","author":"Kuipers","year":"1995","journal-title":"Aliment. Pharmacol. Therap"},{"key":"B20","doi-asserted-by":"publisher","first-page":"11191","DOI":"10.1038\/s41598-024-60897-8","article-title":"Multi-scale coupled attention for visual object detection","volume":"14","author":"Li","year":"","journal-title":"Sci. 
Rep"},{"key":"B21","doi-asserted-by":"publisher","DOI":"10.1109\/IECBES.2012.6498194","article-title":"A training based support vector machine technique for blood detection in wireless capsule endoscopy images","author":"Li","year":"2012","journal-title":"2012 IEEE-EMBS Conference on Biomedical Engineering and Sciences"},{"key":"B22","doi-asserted-by":"publisher","first-page":"3676","DOI":"10.3390\/electronics13183676","article-title":"A hessian-based deep learning preprocessing method for coronary angiography image analysis","volume":"13","author":"Li","year":"","journal-title":"Electronics"},{"key":"B23","doi-asserted-by":"publisher","first-page":"106454","DOI":"10.1016\/j.bspc.2024.106454","article-title":"The intelligent gastrointestinal metaplasia assessment based on deformable transformer with token merging","volume":"95","author":"Li","year":"","journal-title":"Biomed. Signal Process. Control"},{"key":"B24","doi-asserted-by":"publisher","first-page":"116829","DOI":"10.1109\/ACCESS.2024.3446613","article-title":"Enhancing gastrointestinal stromal tumor (gist) diagnosis: an improved yolov8 deep learning approach for precise mitotic detection","volume":"12","author":"Liang","year":"2024","journal-title":"IEEE Access"},{"key":"B25","article-title":"Rt-detrv2: improved baseline with bag-of-freebies for real-time detection transformer","author":"Lv","year":"2024","journal-title":"arXiv preprint arXiv:2407.17140"},{"key":"B26","doi-asserted-by":"publisher","first-page":"1007","DOI":"10.1007\/978-3-319-90761-1_7-1","article-title":"Epidemiology of gastrointestinal diseases","volume":"24","author":"Machicado","year":"2020","journal-title":"Geriatr. 
Gastroenterol"},{"key":"B27","doi-asserted-by":"publisher","first-page":"21621","DOI":"10.1109\/ACCESS.2024.3363413","article-title":"Enhancing uav aerial image analysis: integrating advanced sahi techniques with real-time detection models on the visdrone dataset","volume":"12","author":"Muzammul","year":"2024","journal-title":"IEEE Access"},{"key":"B28","doi-asserted-by":"publisher","first-page":"3133","DOI":"10.3390\/diagnostics13193133","article-title":"Video analysis of small bowel capsule endoscopy using a transformer network","volume":"13","author":"Oh","year":"2023","journal-title":"Diagnostics"},{"key":"B29","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR56361.2022.9956652","article-title":"Time-coherent embeddings for wireless capsule endoscopy","author":"Pascual","year":"2022","journal-title":"2022 26th International Conference on Pattern Recognition (ICPR)"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.1145\/3083187.3083212","article-title":"Kvasir: a multi-class image dataset for computer aided gastrointestinal disease detection","author":"Pogorelov","year":"2017","journal-title":"Proceedings of the 8th ACM Multimedia Systems Conference, MMSys"},{"key":"B31","doi-asserted-by":"publisher","first-page":"106582","DOI":"10.1016\/j.compbiomed.2023.106582","article-title":"Real-time gastric intestinal metaplasia diagnosis tailored for bias and noisy-labeled data with multiple endoscopic imaging","volume":"154","author":"Pornvoraphat","year":"2023","journal-title":"Comput. Biol. 
Med"},{"key":"B32","doi-asserted-by":"publisher","first-page":"110927","DOI":"10.1016\/j.dib.2024.110927","article-title":"Pixel-wise annotation for clear and contaminated regions segmentation in wireless capsule endoscopy images: a multicentre database","volume":"57","author":"Sadeghi","year":"2024","journal-title":"Data Brief"},{"key":"B33","doi-asserted-by":"publisher","first-page":"13732","DOI":"10.1038\/s41598-022-17502-7","article-title":"Edge artificial intelligence wireless video capsule endoscopy","volume":"12","author":"Sahafi","year":"2022","journal-title":"Dental Sci. Rep"},{"key":"B34","unstructured":"Saltzman J. R. Angiodysplasia of the Gastrointestinal Tract 2024"},{"key":"B35","doi-asserted-by":"publisher","first-page":"134","DOI":"10.1007\/s11554-024-01512-x","article-title":"Real-time medical lesion screening: accurate and rapid detectors","volume":"21","author":"Shao","year":"2024","journal-title":"J. Real-Time Image Proc"},{"key":"B36","doi-asserted-by":"publisher","first-page":"113972","DOI":"10.1109\/ACCESS.2024.3438799","article-title":"Gastro intestinal disease classification using hierarchical spatio pyramid tranfonet with pittree fusion and efficient-condconv swishnet","volume":"12","author":"Sharmila","year":"2024","journal-title":"IEEE Access"},{"key":"B37","doi-asserted-by":"publisher","first-page":"102973","DOI":"10.1016\/j.media.2023.102973","article-title":"A deep weakly semi-supervised framework for endoscopic lesion segmentation","volume":"90","author":"Shi","year":"2023","journal-title":"Med. Image Anal"},{"key":"B38","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1038\/s41597-021-00920-z","article-title":"Kvasir-capsule, a video capsule endoscopy dataset","volume":"8","author":"Smedsrud","year":"2021","journal-title":"Sci. 
Data"},{"key":"B39","doi-asserted-by":"publisher","first-page":"1643","DOI":"10.1172\/JCI105656","article-title":"Intestinal lymphangiectasia: a protein-losing enteropathy with hypogammaglobulinemia, lymphocytopenia and impaired homograft rejection","volume":"46","author":"Strober","year":"1967","journal-title":"J. Clin. Invest"},{"key":"B40","doi-asserted-by":"publisher","first-page":"106723","DOI":"10.1016\/j.compbiomed.2023.106723","article-title":"Transformer-based multi-task learning for classification and segmentation of gastrointestinal tract endoscopic images","volume":"157","author":"Tang","year":"2023","journal-title":"Comput. Biol. Med"},{"key":"B41","doi-asserted-by":"publisher","first-page":"105262","DOI":"10.1109\/ACCESS.2023.3319068","article-title":"Wireless capsule endoscopy image classification: an explainable AI approach","volume":"11","author":"Varam","year":"2023","journal-title":"IEEE Access"},{"key":"B42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-024-81842-9","article-title":"Crh-yolo for precise and efficient detection of gastrointestinal polyps","volume":"14","author":"Wan","year":"","journal-title":"Sci. Rep"},{"key":"B43","doi-asserted-by":"publisher","first-page":"15478","DOI":"10.1038\/s41598-024-66642-5","article-title":"A semantic feature enhanced yolov5-based network for polyp detection from colonoscopy images","volume":"14","author":"Wan","year":"","journal-title":"Sci. 
Rep"},{"key":"B44","doi-asserted-by":"publisher","first-page":"32","DOI":"10.3390\/technologies13010032","article-title":"Vision transformers for image classification: a comparative survey","volume":"13","author":"Wang","year":"2025","journal-title":"Technologies"},{"key":"B45","unstructured":"Weerakkody Y. Bell D. Morgan M. Ampulla of vater 2024"},{"key":"B46","doi-asserted-by":"publisher","first-page":"1416","DOI":"10.3390\/bioengineering10121416","article-title":"High-speed and accurate diagnosis of gastrointestinal disease: Learning on endoscopy images using lightweight transformer with local feature attention","volume":"10","author":"Wu","year":"2023","journal-title":"Bioengineering"},{"key":"B47","article-title":"Understanding differences in applying detr to natural and medical images","author":"Xu","year":"2024","journal-title":"arXiv preprint arXiv:2405.17677"},{"key":"B48","doi-asserted-by":"publisher","first-page":"913","DOI":"10.32604\/cmc.2024.058467","article-title":"Pd-yolo: colon polyp detection model based on enhanced small-target feature extraction","volume":"82","author":"Yu","year":"2025","journal-title":"Comput. Mater. Continua"},{"key":"B49","doi-asserted-by":"publisher","first-page":"741252","DOI":"10.1016\/j.aquaculture.2024.741252","article-title":"Rapid detection of salmon louse larvae in seawater based on machine learning","volume":"592","author":"Zhang","year":"","journal-title":"Aquaculture"},{"key":"B50","doi-asserted-by":"publisher","first-page":"107969","DOI":"10.1016\/j.cmpb.2023.107969","article-title":"Si-vit: shuffle instance-based vision transformer for pancreatic cancer rose image classification","volume":"244","author":"Zhang","year":"","journal-title":"Comput. 
Methods Programs Biomed"},{"key":"B51","doi-asserted-by":"publisher","first-page":"192","DOI":"10.1007\/s10462-024-10762-x","article-title":"From single to universal: tiny lesion detection in medical imaging","volume":"57","author":"Zhang","year":"","journal-title":"Artif. Intell. Rev"},{"key":"B52","article-title":"Detrs beat yolos on real-time object detection","author":"Zhao","year":"2023","journal-title":"arXiv [Preprint]. arXiv:2304.08069"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2025.1529814\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,30]],"date-time":"2025-04-30T05:41:25Z","timestamp":1745991685000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2025.1529814\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,30]]},"references-count":52,"alternative-id":["10.3389\/frai.2025.1529814"],"URL":"https:\/\/doi.org\/10.3389\/frai.2025.1529814","relation":{},"ISSN":["2624-8212"],"issn-type":[{"value":"2624-8212","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,30]]},"article-number":"1529814"}}