Authors

Jan Cychnerski, Tomasz Dziubich, Adam Brzeski

The database

The dataset consists of thousands of annotated gastrointestinal endoscopy images from retrospective studies. All cases were selected and annotated by three specialists from GUMed. The data are intended to span a broad set of endoscopic diagnoses, using terminology in accordance with the Minimal Standard Terminology (MST 3.0) for digestive endoscopy, which gives 27 classes for colonoscopic examinations and 54 for upper endoscopy. The source recordings originate from 1271 patient examinations (555 and 712 respectively). A detailed summary is presented in the table below. It also includes several rows with zeros, because our goal is to show how closely the dataset complies with the full MST specification.

Where applicable, medical experts annotated the images with polygon-shaped masks. The number of images labeled this way is given in the “Precise” sub-column. Because some of the images come from videos recorded at about 30 fps, the dataset also includes frames that appear shortly before and after the source frame, are similar to it, and for which the source selection still fits the region of interest. These selections were not made by an expert, so they are reported in a separate sub-column (“Imprecise”).
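
The idea of extending an expert annotation to neighbouring video frames can be sketched as follows. This is only an illustration, not the procedure used to build the dataset: the feature representation of a frame, the distance function `mean_abs_diff`, the threshold `max_diff`, and the window `max_offset` are all assumptions introduced here.

```python
from typing import Callable, List, Sequence

# Hypothetical representation: each frame reduced to a small feature vector.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two equally sized feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def neighbours_like_source(
    frames: List[Frame],
    source_idx: int,
    max_diff: float = 0.1,   # assumed threshold, not taken from the dataset
    max_offset: int = 30,    # roughly one second of video at 30 fps
    distance: Callable[[Frame, Frame], float] = mean_abs_diff,
) -> List[int]:
    """Return indices of contiguous frames around source_idx that stay close to it."""
    source = frames[source_idx]
    selected: List[int] = []
    for step in (-1, 1):  # walk backwards first, then forwards
        idx = source_idx + step
        while 0 <= idx < len(frames) and abs(idx - source_idx) <= max_offset:
            if distance(frames[idx], source) > max_diff:
                break  # stop at the first frame that no longer resembles the source
            selected.append(idx)
            idx += step
    return sorted(selected)


# Toy example: frame 2 is the expert-annotated ("precise") source frame.
toy_frames = [[0.9], [0.52], [0.5], [0.48], [0.45], [0.0]]
imprecise = neighbours_like_source(toy_frames, source_idx=2)
print(imprecise)  # [1, 3, 4]
```

Frames selected this way would inherit the label and mask of the source frame and be counted as imprecise, since no expert reviewed them individually.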

The dataset is available for free for research purposes. To get access, contact the team: cvlab@eti.pg.edu.pl or jan.cychnerski@eti.pg.edu.pl

Read online

https://arxiv.org/abs/2201.08746

Abstract

The article presents a new multi-label comprehensive image dataset from flexible endoscopy, colonoscopy and capsule endoscopy, named ERS. The collection has been labeled according to the full medical specification of 'Minimum Standard Terminology 3.0' (MST 3.0), describing all possible findings in the gastrointestinal tract (104 possible labels), extended with an additional 19 labels useful in common machine learning applications. The dataset contains around 6000 precisely and 115,000 approximately labeled frames from endoscopy videos, 3600 precise and 22,600 approximate segmentation masks, and 1.23 million unlabeled frames from flexible and capsule endoscopy videos. The labeled data cover almost entirely the MST 3.0 standard. The data came from 1520 videos of 1135 patients.

Additionally, this paper proposes and describes four exemplary experiments on the gastrointestinal image classification task, performed using the created dataset. The obtained results indicate the high usefulness and flexibility of the dataset for training and testing machine learning algorithms in the field of endoscopic data analysis.

Data statistics

ID Name Images-Precise Images-Imprecise Exams-Precise Exams-Imprecise Masks-Precise Masks-Imprecise
--- TOTAL 6003 1348697 1108 1126 3606 22671
--- EMPTY 33 1233268 10 208 0 0
--- NON-EMPTY 5970 115429 1098 918 3606 22671
b blood 814 174 133 41 181 0
b01 blood 184 173 77 40 181 0
b02 no_blood 630 1 56 1 0 0
c colono 2199 37391 482 387 1635 14636
c01 angiodysplasia 61 597 14 13 61 597
c02 bleeding_of_unknown_origin 5 5 3 2 5 5
c03 colitis:ischemic 20 54 5 4 20 54
c05 colorectal_cancer 528 25345 92 82 270 2333
c08 crohns_disease:active 127 1306 19 16 127 1306
c10 crohns_disease:quiescent 18 151 11 9 18 151
c11 diverticulitis 1 6 1 1 1 6
c12 diverticulosis 83 325 29 16 83 325
c13 fistula 18 217 5 5 18 217
c14 foreign_body 1 4 1 1 1 4
c15 hemorrhoids 11 65 5 4 11 65
c16 ileitis 3 13 1 1 3 13
c17 lipoma 12 113 3 3 12 113
c19 melanosis 19 75 6 6 19 75
c20 parasites 22 179 2 2 22 179
c22 polyp 950 5395 247 185 629 5395
c23 polyposis_syndrome 16 220 7 4 16 220
c24 postoperative_appearance 10 18 6 5 10 18
c25 proctitis 9 173 3 3 9 173
c26 rectal_ulcer 22 510 8 7 22 510
c27 solitary_ulcer 12 211 7 5 12 211
c28 stricture:inflammatory 1 6 1 1 1 6
c29 stricture:malignant 15 131 9 6 15 131
c30 stricture:postoperative 2 2 1 1 2 2
c31 submucosal_tumor 2 8 1 1 2 8
c32 ulcerative_colitis:active 162 1746 33 28 162 1746
c34 ulcerative_colitis:quiescent 84 773 35 28 84 773
g gastro 1779 21680 591 473 1790 8035
g02 achalasia 9 25 3 3 9 25
g03 barretts_esophagus 20 119 8 7 20 119
g04 benign_stricture 4 25 1 1 4 25
g05 bleeding_of_unknown_origin 8 27 5 4 8 27
g06 coeliac_disease 7 26 2 2 7 26
g07 crohns_disease 6 37 1 1 6 37
g08 dieulafoy_lesion 9 11 4 3 9 11
g10 duodenal_bulb_deformity 5 15 3 3 5 15
g11 duodenal_cancer 19 37 5 3 19 37
g14 duodenal_polyp 27 99 10 8 27 99
g15 duodenal_postoperative_appearance 1 0 1 0 1 0
g18 duodenal_ulcer 106 436 44 34 106 436
g19 duodenal_ulcer_with_bleeding 26 87 12 10 26 87
g20 duodenopathy:erosive 21 146 8 8 21 146
g22 duodenopathy:hyperemic 5 33 4 3 5 33
g25 esophageal_caustic_injury 5 12 2 2 5 12
g26 esophageal_cancer 54 237 20 16 54 237
g27 esophageal_candidiasis 12 144 3 3 12 144
g28 esophageal_diverticulum 13 45 6 4 13 45
g29 esophageal_fistula 17 76 8 8 17 76
g30 esophageal_foreign_body 5 1 2 1 5 1
g31 esophageal_polyp 23 99 11 10 23 99
g32 esophageal_postoperative_apperance 1 0 1 0 1 0
g33 esophageal_stricture 23 127 9 8 23 127
g35 esophageal_submucosal_tumor 3 6 1 1 3 6
g36 esophageal_varices 170 466 66 55 170 466
g37 extrinsic_compression 7 32 1 1 7 32
g39 gastric_cancer 87 274 34 28 87 274
g40 gastric_diverticulum 1 2 1 1 1 2
g41 gastric_fistula 3 7 1 1 3 7
g42 gastric_foreign_body 19 65 10 8 19 65
g43 gastric_caustic_injury 2 1 2 1 2 1
g44 gastric_lymphoma 20 31 7 5 20 31
g45 gastric_polyp(s) 192 1210 72 54 192 1210
g46 gastric_postoperative_appearance 22 112 11 7 22 112
g47 gastric_retention 18 27 7 2 18 27
g50 gastric_ulcer 204 846 88 64 204 846
g51 gastric_ulcer_with_bleeding 20 94 7 5 20 94
g52 gastric_ulcer:anastomotic 17 24 10 7 17 24
g53 gastric_varices 82 476 24 19 82 476
g54 gastropathy:erosive 83 446 31 23 83 446
g55 gastropathy:hemorrhagic 13 46 3 3 13 46
g56 gastropathy:hyperemic 33 510 17 12 33 510
g57 gastropathy:hypertrophic 2 12 1 1 2 12
g59 gastropathy:portal_hypertensive 115 14385 46 47 115 631
g61 hiatus_hernia 13 45 7 7 13 45
g62 mallory:weiss_tear 16 43 8 6 16 43
g63 other_esophagitis 95 335 37 31 95 335
g65 post_sclerotherapy_appearance 19 71 9 7 19 71
g66 pyloric_stenosis 22 35 7 5 22 35
g67 reflux_esophagitis 37 100 17 8 37 100
g68 schatzki_ring 4 32 1 1 4 32
g69 scar 21 61 11 6 21 61
g70 submucosal_tumor 24 131 6 4 24 131
h healthy 1019 19464 67 110 0 0
h01 esophagus 33 1006 6 13 0 0
h02 stomach 42 1214 6 29 0 0
h03 duodenum 29 1154 6 17 0 0
h05 small-bowel 4 65 1 1 0 0
h06 upper 9 8268 2 56 0 0
h07 colon 902 11129 49 53 0 0
q quality 974 94470 97 224 0 0
q01 sharp 259 57530 3 172 0 0
q02 blur 428 36922 25 174 0 0
q03 bile 18 14 6 5 0 0
q04 food 1 0 1 0 0 0
q05 tooclose 82 50 23 15 0 0
q06 air 86 48 33 19 0 0
q07 defocus 51 25 27 18 0 0
q08 light 60 36 22 17 0 0
q09 motion 87 50 40 27 0 0
q10 stool 22 30 10 3 0 0
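
Since every data row in the table above is whitespace-separated and no label name contains a space, the statistics can be loaded with a few lines of code. The sketch below is a minimal illustration: the `Row` type and `parse_rows` helper are hypothetical names introduced here, not part of any dataset tooling. It parses the three summary rows and checks that TOTAL equals EMPTY plus NON-EMPTY in every column.

```python
from typing import List, NamedTuple

# The three summary rows, copied from the statistics table.
SUMMARY = """\
--- TOTAL 6003 1348697 1108 1126 3606 22671
--- EMPTY 33 1233268 10 208 0 0
--- NON-EMPTY 5970 115429 1098 918 3606 22671
"""


class Row(NamedTuple):
    label_id: str
    name: str
    counts: List[int]  # images, exams, masks: precise then imprecise for each


def parse_rows(text: str) -> List[Row]:
    """Parse whitespace-separated table rows, skipping lines that are not data rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row has an ID, a name, and exactly six integer counts.
        if len(parts) == 8 and all(p.isdigit() for p in parts[2:]):
            rows.append(Row(parts[0], parts[1], [int(p) for p in parts[2:]]))
    return rows


by_name = {row.name: row for row in parse_rows(SUMMARY)}
expected_total = [
    e + n for e, n in zip(by_name["EMPTY"].counts, by_name["NON-EMPTY"].counts)
]
totals_consistent = expected_total == by_name["TOTAL"].counts
print(totals_consistent)  # True
```

When loading the full table, index rows by ID rather than by name: some names, such as bleeding_of_unknown_origin (c02 and g05) and submucosal_tumor (c31 and g70), occur in more than one group.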