Improving quality of Colon Capsule Endoscopy: Steps toward clinical implementation

Maria Magdalena Buijs

Research output: Thesis › Ph.D. thesis


Introduction: Colon Capsule Endoscopy (CCE) was introduced in 2006 as a less invasive diagnostic investigation of the colon. The most recent guideline of the European Society of Gastrointestinal Endoscopy (ESGE) approves the use of CCE in patients for whom colonoscopy is inappropriate or not possible, and after incomplete colonoscopy not caused by stenosis.
Since 2015 our research group at Odense University Hospital has been studying the use of CCE in different clinical settings. The main focus of these studies is the detection of polyps, especially adenomas, and cancers. Colorectal adenomas can develop into cancer and this risk increases with polyp size. Surveillance colonoscopies are indicated after removal of large adenomas (over 1 cm), since these are regarded as high-risk adenomas.
A recent systematic review and meta-analysis reports a sensitivity and specificity of 85 % for CCE in the detection of colorectal polyps. These estimates use colonoscopy as the gold standard, so polyps detected only by CCE are counted as false positives. Some studies performed second colonoscopies in patients with such polyps and, in the majority of these patients, found polyps that had been missed during the primary colonoscopy. This indicates that the sensitivity of CCE for colorectal polyps might be even better than 85 %. However, multiple challenges for the implementation of CCE as a routine investigation have been demonstrated. The challenge addressed in this thesis is variability. What is the repeatability of evaluations of CCE videos when performed twice by the same doctor, or by different doctors? What is the effect of experience on the quality of these assessments? Would it be possible to use artificial intelligence (AI) to assess CCE videos in order to reduce this variability? The thesis consists of a review of relevant literature and the following five publications:

Study I: Size estimation of large colorectal polyps during colonoscopy was assessed in repeat colonoscopies in order to determine interobserver variability in daily practice. The first colonoscopy was performed by general endoscopists, whereas the second was performed by a dedicated expert who used an instrument of known size as a reference for polyp size. Polyp sizes estimated during colonoscopy were subsequently compared to post-fixation measurements by a pathologist. The study showed a variability of up to 17 mm in the primary colonoscopy compared to post-fixation sizes, which was reduced to 9 mm in the secondary colonoscopy. This study indicates that clinical size estimation of large polyps is likely to be unreliable, which calls for critical appraisal of polyp sizes and of clinical decisions based on them.

Study II: Evaluations of CCE videos were performed by three experts and two beginners in order to assess variability in these evaluations. The experts agreed moderately on the number of polyps and the detection of large polyps, as did the beginners compared to the experts. Intraobserver agreement was excellent for the number of polyps in experts, and for the detection of large polyps in one expert. Beginners had poor to moderate intraobserver agreement for the detection of polyps. The intra- and interobserver agreement on bowel cleansing quality was poor to moderate in all observers. This study indicates excellent intraobserver and acceptable interobserver agreement in experts, but difficulties in determining bowel cleansing classification in both experts and beginners.
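
The abstract does not state which agreement statistic Study II used. As an illustration only, the sketch below computes Cohen's kappa, one common chance-corrected measure of agreement between two readers. The function name `cohen_kappa` and the example ratings are hypothetical and are not data from the thesis.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one single category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ratings: "large polyp present?" (1 = yes) for ten videos.
reader_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
reader_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

The same function can be applied to two readings by one observer to obtain an intraobserver value.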

Study III: Polyps detected by CCE and colonoscopy need to be matched in order to compare detected polyps between both modalities, and to identify polyps that have been detected multiple times in CCE. An algorithm was developed to match polyps based on size, morphology and location. A subsequent analysis of these matched polyps showed that polyp size was overestimated in both colonoscopy and CCE in comparison to post-fixation measurements. In the smallest polyps (≤ 5 mm), an overestimation of 135 % by CCE was observed. Large polyps were not significantly overestimated by CCE. There were significantly more pedunculated polyps and fewer flat polyps in CCE. This is likely because the colon is not inflated during CCE as it is in colonoscopy. This study calls for caution when using the currently reported CCE polyp sizes for clinical decisions, especially in the smallest polyps.
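
The abstract gives no detail on how the matching algorithm combines size, morphology and location. The following is a minimal sketch of one possible approach, not the method used in the thesis: pairs are scored on those three attributes and accepted greedily. The segment list, the tolerances (`size_tol`, `max_segment_gap`), the penalty weights and the example polyps are all assumptions made for illustration, and the sketch covers only matching between modalities, not duplicate detection within CCE.

```python
from dataclasses import dataclass

# Colon segments in anatomical order, so neighbouring segments can be
# treated as "close" (an assumption made for this sketch).
SEGMENTS = ["cecum", "ascending", "transverse", "descending", "sigmoid", "rectum"]

@dataclass(frozen=True)
class Polyp:
    size_mm: float    # assumed to be positive
    morphology: str   # e.g. "pedunculated", "sessile", "flat"
    segment: str      # one of SEGMENTS

def match_score(a, b, size_tol=0.5, max_segment_gap=1):
    """Return a dissimilarity score, or None if the pair cannot match."""
    gap = abs(SEGMENTS.index(a.segment) - SEGMENTS.index(b.segment))
    if gap > max_segment_gap:
        return None
    rel_diff = abs(a.size_mm - b.size_mm) / max(a.size_mm, b.size_mm)
    if rel_diff > size_tol:
        return None
    # Morphology only adds a penalty, since it can differ between modalities.
    morph_penalty = 0.0 if a.morphology == b.morphology else 0.25
    return rel_diff + 0.5 * gap + morph_penalty

def match_polyps(cce, colonoscopy):
    """Greedy one-to-one matching: the lowest-score pairs are accepted first."""
    pairs = []
    for i, a in enumerate(cce):
        for j, b in enumerate(colonoscopy):
            score = match_score(a, b)
            if score is not None:
                pairs.append((score, i, j))
    pairs.sort()
    used_i, used_j, matches = set(), set(), []
    for _, i, j in pairs:
        if i not in used_i and j not in used_j:
            used_i.add(i)
            used_j.add(j)
            matches.append((i, j))
    return matches

# Hypothetical findings from one patient.
cce = [Polyp(12, "pedunculated", "sigmoid"),
       Polyp(4, "sessile", "ascending"),
       Polyp(8, "sessile", "rectum")]
colonoscopy = [Polyp(10, "sessile", "sigmoid"),
               Polyp(3, "sessile", "cecum")]
matches = match_polyps(cce, colonoscopy)  # list of (cce index, colonoscopy index)
```

A greedy strategy is simple to inspect but not guaranteed to be globally optimal; an assignment solver would be an alternative design.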

Study IV: Bowel cleansing classification in CCE appears to be a challenge, as observed in Study II. Therefore, an algorithm was trained to distinguish between dirty and clean images in order to assess bowel cleansing in CCE. This trained algorithm was compared to an algorithm that determined cleansing based on the color distribution per pixel. The pixel analysis was less sensitive for detecting poorer bowel cleansing quality, and classified the vast majority of videos as good. The trained algorithm classified almost half (47 %) of the videos in agreement with the averaged classification by human CCE readers (from Study II). An additional 41 % differed by only one class from the human average. The main challenge in this study is the absence of a gold standard, which complicates training an algorithm. This study shows that an algorithm is likely to be able to classify bowel cleansing quality accurately. An expert panel could be used to set the right thresholds for the different bowel cleansing classes. Adding capsule localization to the algorithm could improve bowel cleansing quality classification per segment, which could be presented as the percentage of clean images per segment to give a tangible assessment of cleansing quality.
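
The idea of reporting the percentage of clean images per segment can be sketched as follows. This assumes that each image already carries a segment label and a clean/dirty label from some classifier. The class names and cut-offs in `THRESHOLDS` are hypothetical placeholders, since the abstract leaves the real thresholds to an expert panel.

```python
from collections import defaultdict

# Hypothetical cut-offs (fraction of clean frames) and class names;
# the real thresholds would have to be set by an expert panel.
THRESHOLDS = [(0.90, "excellent"), (0.75, "good"), (0.50, "fair")]

def clean_fraction_per_segment(frames):
    """frames: iterable of (segment, is_clean) pairs, one per image."""
    totals = defaultdict(int)
    clean = defaultdict(int)
    for segment, is_clean in frames:
        totals[segment] += 1
        clean[segment] += bool(is_clean)
    return {seg: clean[seg] / totals[seg] for seg in totals}

def cleansing_class(fraction):
    """Map a clean-frame fraction to the first class whose cut-off it meets."""
    for cutoff, label in THRESHOLDS:
        if fraction >= cutoff:
            return label
    return "poor"

# Hypothetical per-image labels for two segments.
frames = ([("cecum", True)] * 9 + [("cecum", False)]
          + [("sigmoid", True)] * 2 + [("sigmoid", False)] * 2)
fractions = clean_fraction_per_segment(frames)
report = {seg: (round(100 * f), cleansing_class(f)) for seg, f in fractions.items()}
```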

Study V: The possibility of detecting colorectal polyps autonomously in CCE videos was studied by developing a convolutional neural network (CNN) that was trained on CCE images. The accuracy for detecting polyps in single CCE images was 98.0%, sensitivity 98.1% and specificity 96.3%. This is significantly better than all current state-of-the-art CNNs. The modified CNN needs to be trained continuously to perform CCE evaluations of complete videos. The sensitivity for polyps needs to approach 100% if the modified CNN is to be the primary reader in daily clinical practice, in order not to miss any neoplasia. One could also argue that as long as the CNN detects more polyps than a human reader, it has achieved an acceptable level of polyp detection for independent assessment of CCE videos.
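
For readers less familiar with the three metrics reported for Study V, the sketch below shows how they follow from the counts of a binary confusion matrix. The counts in the example are hypothetical and are not the thesis data; the abstract reports only the resulting percentages.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        # Fraction of polyp images that were flagged as polyp.
        "sensitivity": tp / (tp + fn),
        # Fraction of non-polyp images that were correctly left unflagged.
        "specificity": tn / (tn + fp),
        # Fraction of all images that were classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for a set of 300 labelled images.
metrics = diagnostic_metrics(tp=90, fp=20, tn=180, fn=10)
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so it always lies between the two; the figures reported above (98.0% between 96.3% and 98.1%) are consistent with that.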

Discussion and conclusions: This thesis shows that human observers are fallible and assess colonoscopy and CCE with a certain variability. The studies in this thesis report a large variability in polyp size estimation in colonoscopy and an acceptable variability in expert evaluations of CCE videos. Bowel cleansing classification in CCE has poor interobserver agreement, and might be improved by implementing artificial intelligence. Artificial intelligence also appears to be promising in polyp detection in CCE videos, but will need to be validated before clinical implementation is possible. It appears from both this thesis and current literature that the implementation of artificial intelligence can reduce variability and is likely to improve the quality of endoscopy. Even though artificial intelligence is promising, it is important to remember that machines are trained on human input and are thus sensitive to faulty input. As medical doctors it is therefore important to keep focusing on increasing the quality of performance and on developing quality standards, in order to optimize the quality of artificial intelligence in medicine.
Artificial intelligence requires continuous labour: adding more data to the database and improving the algorithm in line with current developments in artificial intelligence. Using primarily artificial intelligence instead of doctors to evaluate CCE videos will reduce reviewing time for medical staff, since only possibly relevant pathology will need to be assessed. However, it can also lead to a vulnerable dependence on artificial intelligence if doctors do not obtain enough experience with CCE to perform reliable capsule evaluations themselves. The most complicated issue for the assessment of CCE videos by artificial intelligence is the ethical and legal implications when relevant colorectal pathology is missed or wrong decisions are made based on an artificial intelligence assessment. Who is responsible for these mistakes: the engineers who built the algorithm, the doctors who implemented it, or a third party such as the hospital? Furthermore, it is important not to rely solely on machines for clinical assessments, since individual patient variables should be taken into account when making clinical decisions.
Determining the clinical consequences of CCE is not easy, since the sensitivity of CCE for polyps is likely to be higher than that of colonoscopy. Studies with repeat colonoscopy and comparisons to CT colonography report that 6-24% of polyps are overlooked in colonoscopy. This limited sensitivity of colonoscopy is important to take into consideration when determining guidelines for CCE. It is plausible that polyps detected in CCE will not be found by a subsequent therapeutic colonoscopy. One will have to decide whether a second colonoscopy needs to be performed in such cases. Since one of the arguments for the implementation of CCE is to reduce the number of colonoscopies, only a minority of CCE patients should need a colonoscopy. Especially when addressing screening participants, one needs to weigh the risk of harming a healthy individual by one or more colonoscopies against the possible health benefits of having (pre)malignant polyps detected and removed. Risk stratification after CCE will be based on the number and size of detected polyps, without taking pathology into account. Since a large proportion of the smallest polyps are hyperplastic and have no malignant potential at all, not removing ≤ 5 mm polyps might be acceptable in screening participants. The ESGE advises a colonoscopy if CCE detects ≥ 6 mm polyps, but our practice has been to perform a colonoscopy if ≥ 10 mm polyps were detected. With our current knowledge of the overestimation of polyp size by CCE in small polyps (Study III), one could say that a threshold of 10 mm might be acceptable, since the corresponding HP size is likely to be 4-5 mm smaller and might as such correspond to an HP polyp size ≤ 5 mm.
An even more exciting thought is whether small polyps should be followed with control CCE instead of being removed, in order to observe their natural development. Multiple studies highlight that the risk associated with small polyps is very low, as accepted in the remove and discard theory, and other studies show that only a small percentage (6 %) of surveilled small polyps develop into advanced adenomas. Does the risk of polypectomy outweigh the risk of cancer developing in small polyps? Or could it be ethically defensible to follow up on small polyps rather than removing all visible polyps? A follow-up study of these small polyps by CCE would be incredibly exciting, since it could reduce the number of colonoscopies as well as the associated risk of adverse events.
The studies in this thesis show a large variation in bowel cleansing assessment in CCE, which calls for improvement of the bowel cleansing scale and possibly the implementation of artificial intelligence to reduce the variation in these assessments. The risk of poor or unacceptable bowel cleansing is that polyps are overlooked; therefore, a specific threshold for unacceptable bowel cleansing should be determined in CCE, for example as defined by the CC-CLEAR classification. Ideally this classification would be implemented in a bowel cleansing algorithm that objectively assesses bowel cleansing quality.

Original language: English
Awarding Institution
  • University of Southern Denmark
  • Baatrup, Gunnar, Supervisor
  • Steele, Robert, Supervisor
  • Larsen, Morten Kobæk, Supervisor
Date of defence: 7. Apr 2022
Publication status: Published - 21. Feb 2022

Note re. dissertation

Print copy of the full thesis is restricted to reference use in the Library.

