Jan Cychnerski, Tomasz Dziubich, Adam Brzeski
The dataset consists of thousands of annotated gastrointestinal endoscopy images from retrospective studies. All cases were selected and annotated by three specialists from GUMed. The data aims to span a wide range of endoscopic diagnoses, using terminology consistent with the Minimal Standard Terminology (MST 3.0) Digestive, which gives 27 classes for colonoscopic examination and 54 for upper endoscopy. The source recordings originated from 1271 patient examinations (555 and 712 respectively). The detailed summary is presented in the table below, which also includes several rows with zeros, because our goal is to demonstrate the degree of compliance of our dataset with the full MST specification.
Where applicable, medical experts annotated the images with polygon-shaped masks. The number of images labeled this way is given in the “Precise” sub-columns. Because some of the images come from videos recorded at about 30 fps, the dataset also includes frames that appear shortly before and after each source frame, are similar to it, and in which the selection still fits the region of interest. It should be emphasized, however, that these selections were not made by an expert, so they are counted in separate sub-columns (“Imprecise”).
The dataset is available for free for research purposes. To get access, contact the team: cvlab@eti.pg.edu.pl or jan.cychnerski@eti.pg.edu.pl
https://arxiv.org/abs/2201.08746
The article presents a new multi-label comprehensive image dataset from flexible endoscopy, colonoscopy and capsule endoscopy, named ERS. The collection has been labeled according to the full medical specification of 'Minimum Standard Terminology 3.0' (MST 3.0), describing all possible findings in the gastrointestinal tract (104 possible labels), extended with an additional 19 labels useful in common machine learning applications. The dataset contains around 6000 precisely and 115,000 approximately labeled frames from endoscopy videos, 3600 precise and 22,600 approximate segmentation masks, and 1.23 million unlabeled frames from flexible and capsule endoscopy videos. The labeled data cover almost entirely the MST 3.0 standard. The data came from 1520 videos of 1135 patients.
Additionally, the paper proposes and describes four exemplary experiments on the gastrointestinal image classification task, performed using the created dataset. The obtained results indicate the high usefulness and flexibility of the dataset for training and testing machine learning algorithms in the field of endoscopic data analysis.
ID | Name | Images - Precise | Images - Imprecise | Exams - Precise | Exams - Imprecise | Masks - Precise | Masks - Imprecise |
---|---|---|---|---|---|---|---|
--- | TOTAL | 6003 | 1348697 | 1108 | 1126 | 3606 | 22671 |
--- | EMPTY | 33 | 1233268 | 10 | 208 | 0 | 0 |
--- | NON-EMPTY | 5970 | 115429 | 1098 | 918 | 3606 | 22671 |
b | blood | 814 | 174 | 133 | 41 | 181 | 0 |
b01 | blood | 184 | 173 | 77 | 40 | 181 | 0 |
b02 | no_blood | 630 | 1 | 56 | 1 | 0 | 0 |
c | colono | 2199 | 37391 | 482 | 387 | 1635 | 14636 |
c01 | angiodysplasia | 61 | 597 | 14 | 13 | 61 | 597 |
c02 | bleeding_of_unknown_origin | 5 | 5 | 3 | 2 | 5 | 5 |
c03 | colitis:ischemic | 20 | 54 | 5 | 4 | 20 | 54 |
c05 | colorectal_cancer | 528 | 25345 | 92 | 82 | 270 | 2333 |
c08 | crohns_disease:active | 127 | 1306 | 19 | 16 | 127 | 1306 |
c10 | crohns_disease:quiescent | 18 | 151 | 11 | 9 | 18 | 151 |
c11 | diverticulitis | 1 | 6 | 1 | 1 | 1 | 6 |
c12 | diverticulosis | 83 | 325 | 29 | 16 | 83 | 325 |
c13 | fistula | 18 | 217 | 5 | 5 | 18 | 217 |
c14 | foreign_body | 1 | 4 | 1 | 1 | 1 | 4 |
c15 | hemorrhoids | 11 | 65 | 5 | 4 | 11 | 65 |
c16 | ileitis | 3 | 13 | 1 | 1 | 3 | 13 |
c17 | lipoma | 12 | 113 | 3 | 3 | 12 | 113 |
c19 | melanosis | 19 | 75 | 6 | 6 | 19 | 75 |
c20 | parasites | 22 | 179 | 2 | 2 | 22 | 179 |
c22 | polyp | 950 | 5395 | 247 | 185 | 629 | 5395 |
c23 | polyposis_syndrome | 16 | 220 | 7 | 4 | 16 | 220 |
c24 | postoperative_appearance | 10 | 18 | 6 | 5 | 10 | 18 |
c25 | proctitis | 9 | 173 | 3 | 3 | 9 | 173 |
c26 | rectal_ulcer | 22 | 510 | 8 | 7 | 22 | 510 |
c27 | solitary_ulcer | 12 | 211 | 7 | 5 | 12 | 211 |
c28 | stricture:inflammatory | 1 | 6 | 1 | 1 | 1 | 6 |
c29 | stricture:malignant | 15 | 131 | 9 | 6 | 15 | 131 |
c30 | stricture:postoperative | 2 | 2 | 1 | 1 | 2 | 2 |
c31 | submucosal_tumor | 2 | 8 | 1 | 1 | 2 | 8 |
c32 | ulcerative_colitis:active | 162 | 1746 | 33 | 28 | 162 | 1746 |
c34 | ulcerative_colitis:quiescent | 84 | 773 | 35 | 28 | 84 | 773 |
g | gastro | 1779 | 21680 | 591 | 473 | 1790 | 8035 |
g02 | achalasia | 9 | 25 | 3 | 3 | 9 | 25 |
g03 | barretts_esophagus | 20 | 119 | 8 | 7 | 20 | 119 |
g04 | benign_stricture | 4 | 25 | 1 | 1 | 4 | 25 |
g05 | bleeding_of_unknown_origin | 8 | 27 | 5 | 4 | 8 | 27 |
g06 | coeliac_disease | 7 | 26 | 2 | 2 | 7 | 26 |
g07 | crohns_disease | 6 | 37 | 1 | 1 | 6 | 37 |
g08 | dieulafoy_lesion | 9 | 11 | 4 | 3 | 9 | 11 |
g10 | duodenal_bulb_deformity | 5 | 15 | 3 | 3 | 5 | 15 |
g11 | duodenal_cancer | 19 | 37 | 5 | 3 | 19 | 37 |
g14 | duodenal_polyp | 27 | 99 | 10 | 8 | 27 | 99 |
g15 | duodenal_postoperative_appearance | 1 | 0 | 1 | 0 | 1 | 0 |
g18 | duodenal_ulcer | 106 | 436 | 44 | 34 | 106 | 436 |
g19 | duodenal_ulcer_with_bleeding | 26 | 87 | 12 | 10 | 26 | 87 |
g20 | duodenopathy:erosive | 21 | 146 | 8 | 8 | 21 | 146 |
g22 | duodenopathy:hyperemic | 5 | 33 | 4 | 3 | 5 | 33 |
g25 | esophageal_caustic_injury | 5 | 12 | 2 | 2 | 5 | 12 |
g26 | esophageal_cancer | 54 | 237 | 20 | 16 | 54 | 237 |
g27 | esophageal_candidiasis | 12 | 144 | 3 | 3 | 12 | 144 |
g28 | esophageal_diverticulum | 13 | 45 | 6 | 4 | 13 | 45 |
g29 | esophageal_fistula | 17 | 76 | 8 | 8 | 17 | 76 |
g30 | esophageal_foreign_body | 5 | 1 | 2 | 1 | 5 | 1 |
g31 | esophageal_polyp | 23 | 99 | 11 | 10 | 23 | 99 |
g32 | esophageal_postoperative_apperance | 1 | 0 | 1 | 0 | 1 | 0 |
g33 | esophageal_stricture | 23 | 127 | 9 | 8 | 23 | 127 |
g35 | esophageal_submucosal_tumor | 3 | 6 | 1 | 1 | 3 | 6 |
g36 | esophageal_varices | 170 | 466 | 66 | 55 | 170 | 466 |
g37 | extrinsic_compression | 7 | 32 | 1 | 1 | 7 | 32 |
g39 | gastric_cancer | 87 | 274 | 34 | 28 | 87 | 274 |
g40 | gastric_diverticulum | 1 | 2 | 1 | 1 | 1 | 2 |
g41 | gastric_fistula | 3 | 7 | 1 | 1 | 3 | 7 |
g42 | gastric_foreign_body | 19 | 65 | 10 | 8 | 19 | 65 |
g43 | gastric_caustic_injury | 2 | 1 | 2 | 1 | 2 | 1 |
g44 | gastric_lymphoma | 20 | 31 | 7 | 5 | 20 | 31 |
g45 | gastric_polyp(s) | 192 | 1210 | 72 | 54 | 192 | 1210 |
g46 | gastric_postoperative_appearance | 22 | 112 | 11 | 7 | 22 | 112 |
g47 | gastric_retention | 18 | 27 | 7 | 2 | 18 | 27 |
g50 | gastric_ulcer | 204 | 846 | 88 | 64 | 204 | 846 |
g51 | gastric_ulcer_with_bleeding | 20 | 94 | 7 | 5 | 20 | 94 |
g52 | gastric_ulcer:anastomotic | 17 | 24 | 10 | 7 | 17 | 24 |
g53 | gastric_varices | 82 | 476 | 24 | 19 | 82 | 476 |
g54 | gastropathy:erosive | 83 | 446 | 31 | 23 | 83 | 446 |
g55 | gastropathy:hemorrhagic | 13 | 46 | 3 | 3 | 13 | 46 |
g56 | gastropathy:hyperemic | 33 | 510 | 17 | 12 | 33 | 510 |
g57 | gastropathy:hypertrophic | 2 | 12 | 1 | 1 | 2 | 12 |
g59 | gastropathy:portal_hypertensive | 115 | 14385 | 46 | 47 | 115 | 631 |
g61 | hiatus_hernia | 13 | 45 | 7 | 7 | 13 | 45 |
g62 | mallory:weiss_tear | 16 | 43 | 8 | 6 | 16 | 43 |
g63 | other_esophagitis | 95 | 335 | 37 | 31 | 95 | 335 |
g65 | post_sclerotherapy_appearance | 19 | 71 | 9 | 7 | 19 | 71 |
g66 | pyloric_stenosis | 22 | 35 | 7 | 5 | 22 | 35 |
g67 | reflux_esophagitis | 37 | 100 | 17 | 8 | 37 | 100 |
g68 | schatzki_ring | 4 | 32 | 1 | 1 | 4 | 32 |
g69 | scar | 21 | 61 | 11 | 6 | 21 | 61 |
g70 | submucosal_tumor | 24 | 131 | 6 | 4 | 24 | 131 |
h | healthy | 1019 | 19464 | 67 | 110 | 0 | 0 |
h01 | esophagus | 33 | 1006 | 6 | 13 | 0 | 0 |
h02 | stomach | 42 | 1214 | 6 | 29 | 0 | 0 |
h03 | duodenum | 29 | 1154 | 6 | 17 | 0 | 0 |
h05 | small-bowel | 4 | 65 | 1 | 1 | 0 | 0 |
h06 | upper | 9 | 8268 | 2 | 56 | 0 | 0 |
h07 | colon | 902 | 11129 | 49 | 53 | 0 | 0 |
q | quality | 974 | 94470 | 97 | 224 | 0 | 0 |
q01 | sharp | 259 | 57530 | 3 | 172 | 0 | 0 |
q02 | blur | 428 | 36922 | 25 | 174 | 0 | 0 |
q03 | bile | 18 | 14 | 6 | 5 | 0 | 0 |
q04 | food | 1 | 0 | 1 | 0 | 0 | 0 |
q05 | tooclose | 82 | 50 | 23 | 15 | 0 | 0 |
q06 | air | 86 | 48 | 33 | 19 | 0 | 0 |
q07 | defocus | 51 | 25 | 27 | 18 | 0 | 0 |
q08 | light | 60 | 36 | 22 | 17 | 0 | 0 |
q09 | motion | 87 | 50 | 40 | 27 | 0 | 0 |
q10 | stool | 22 | 30 | 10 | 3 | 0 | 0 |
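
The summary rows of the table can be cross-checked mechanically: in every column, EMPTY plus NON-EMPTY should equal TOTAL. The following is a minimal sketch of such a check, assuming the table is available as pipe-delimited text in the layout shown above. The names `TABLE`, `split_row`, `parse_table` and `check_totals` are illustrative only and are not part of the dataset or any tooling distributed with it; only a few rows are copied here to keep the example short.

```python
from typing import Dict, List

# A few rows copied verbatim from the table above (assumption: the full
# table would be read from a file in the same pipe-delimited layout).
TABLE = """\
ID | Name | Images - Precise | Images - Imprecise | Exams - Precise | Exams - Imprecise | Masks - Precise | Masks - Imprecise |
---|---|---|---|---|---|---|---|
--- | TOTAL | 6003 | 1348697 | 1108 | 1126 | 3606 | 22671 |
--- | EMPTY | 33 | 1233268 | 10 | 208 | 0 | 0 |
--- | NON-EMPTY | 5970 | 115429 | 1098 | 918 | 3606 | 22671 |
c22 | polyp | 950 | 5395 | 247 | 185 | 629 | 5395 |
"""


def split_row(line: str) -> List[str]:
    """Split one table line into trimmed cells, ignoring the trailing pipe."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_table(text: str) -> Dict[str, Dict[str, int]]:
    """Return {row key: {column name: count}}.

    Rows are keyed by ID, because some names (e.g. "blood") occur twice.
    Summary rows have "---" as ID, so they are keyed by name instead.
    """
    rows = [split_row(line) for line in text.splitlines() if line.strip()]
    columns = rows[0][2:]  # skip the "ID" and "Name" header cells
    stats: Dict[str, Dict[str, int]] = {}
    for cells in rows[1:]:
        if all(set(cell) <= {"-"} for cell in cells):
            continue  # header separator row
        key = cells[1] if cells[0] == "---" else cells[0]
        stats[key] = {col: int(val) for col, val in zip(columns, cells[2:])}
    return stats


def check_totals(stats: Dict[str, Dict[str, int]]) -> List[str]:
    """List the columns in which EMPTY + NON-EMPTY differs from TOTAL."""
    return [
        col
        for col, total in stats["TOTAL"].items()
        if stats["EMPTY"][col] + stats["NON-EMPTY"][col] != total
    ]


if __name__ == "__main__":
    mismatches = check_totals(parse_table(TABLE))
    print("inconsistent columns:", mismatches)
```

The same `parse_table` output can be reused for other quick checks, for example listing the classes with very few precisely labeled images before choosing a training split.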