PDF submission of my bachelor thesis: an image-processing pipeline for extracting information from telemedicine images.

AlejandroCampayo/medical-image-pipeline

Medical Image Pipeline

This repository contains the results of my bachelor's thesis, carried out for AbiGlobalHealth, on extracting information from images sent by users of its telemedicine service.

IMPORTANT: All images used in this thesis are sourced from Google and public datasets to ensure GDPR compliance. No patient images are included.


Overview

The pipeline processes images and automatically routes them to specialized engines based on their type. It combines classification, captioning, segmentation, and OCR to handle a wide variety of image types, including clinical photos, medical documents, and screenshots.
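The routing idea can be sketched as a lookup from predicted class to backend. This is a minimal illustration, not the thesis code: the class labels come from the Global Classifier section below, while the handler functions, their return strings, and the `route` helper are hypothetical placeholders. This README does not describe a backend for the fluids class, so the sketch leaves it unrouted.

```python
from typing import Callable, Dict

# Hypothetical handler type: takes raw image bytes, returns extracted text.
Handler = Callable[[bytes], str]


def caption_image(image: bytes) -> str:
    # Placeholder for a captioning model (e.g., ViT-GPT2 or BLIP).
    return "caption"


def classify_document_subtype(image: bytes) -> str:
    # Placeholder for the secondary medical-document classifier.
    return "subtype"


def segment_then_ocr(image: bytes) -> str:
    # Placeholder for text segmentation followed by OCR.
    return "segmented-ocr"


def run_ocr(image: bytes) -> str:
    # Placeholder for direct OCR.
    return "ocr"


# Mapping from global-classifier label to backend (assumed structure).
ROUTES: Dict[str, Handler] = {
    "body-structure": caption_image,
    "face-features": caption_image,
    "medical documents": classify_document_subtype,
    "medicine packaging": segment_then_ocr,
    "paper documents": run_ocr,
    "screenshots": run_ocr,
}


def route(label: str, image: bytes) -> str:
    """Send an image to the backend registered for its predicted class."""
    handler = ROUTES.get(label)
    if handler is None:
        # No backend registered for this label (e.g., "fluids" in this sketch).
        return "unhandled"
    return handler(image)
```

Keeping the mapping in a dictionary means a new image class only needs one new entry rather than another branch in the control flow.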

Pipeline Architecture

Pipeline Overview

Components

1. Global Classifier

A fine-tuned ResNet50 classifier categorizes images into seven classes: body-structure, face-features, medical documents, fluids, medicine packaging, paper documents, and screenshots. The dataset was labeled using a custom-built annotation platform.

Dataset Annotation

Validation curves for different classifiers investigated in this step are shown below:

Validation Curves

Decision thresholds for the classifier were determined based on model confidence:

Decision Thresholds
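One common way to apply a confidence-based decision threshold is to accept the top class only when its softmax probability is high enough, and otherwise leave the image unclassified. The sketch below assumes that scheme. The threshold value of 0.8, the single global threshold, and the `decide` helper are illustrative assumptions, not the values or method chosen in the thesis.

```python
import math
from typing import List, Optional, Sequence

# The seven classes listed for the global classifier.
CLASSES = [
    "body-structure", "face-features", "medical documents", "fluids",
    "medicine packaging", "paper documents", "screenshots",
]


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits: Sequence[float], threshold: float = 0.8) -> Optional[str]:
    """Return the top class if its probability meets the threshold, else None.

    The default threshold is a hypothetical value chosen for illustration.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best] if probs[best] >= threshold else None
```

A `None` result could be sent to manual review or a fallback branch; how low-confidence images are actually handled is covered in the thesis PDF rather than here.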

Results for the classification step:

Classification Results

2. Branch-and-Process

Once classified, images are processed by specialized backends:

Body-structure & Face-features

Captioning models (e.g., ViT-GPT2, BLIP) generate text descriptions of visible signs.

Captioning Example

Medical Documents

A secondary classifier identifies subtypes of medical documents (e.g., radiographs, MRIs).

Medical Document Categories

Medicine Packaging

Text segmentation models (e.g., CRAFT, EAST) isolate regions of interest, which are processed by OCR engines (e.g., Tesseract, TrOCR) to extract text.

Segmentator Performance
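The segment-then-recognize flow can be sketched independently of any specific model. In the sketch below, `detect` stands in for a text segmentation model and `recognize` for an OCR engine. Both are hypothetical callables supplied by the caller, and the box format and reading-order sort are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]   # row-major grid of pixel values
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the rectangular region described by box out of the image."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


def extract_text(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[List[List[int]]], str],
) -> List[str]:
    """Detect text regions, then run recognition on each cropped region.

    Boxes are sorted top-to-bottom, then left-to-right, as a simple
    approximation of reading order.
    """
    boxes = sorted(detect(image), key=lambda b: (b[1], b[0]))
    return [recognize(crop(image, b)) for b in boxes]
```

Passing the detector and recognizer as arguments makes it easy to swap one segmentation model or OCR engine for another when comparing combinations.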

Paper Documents

Direct OCR extracts text from printed or handwritten documents.

Extract Written Information

Screenshots

OCR extracts on-screen text from digital screenshots.


Conclusion

This pipeline integrates classification, captioning, segmentation, and OCR to process the image types a typical telemedicine platform receives. It can caption clinical photos, digitize medical documents, and extract text from screenshots.

For more details, refer to the thesis PDF: Information Extraction from Telemedicine Consultation Images.
