Document Scanner

TL;DR

Uses the GPU to accelerate real-time document detection and extraction from a live video feed.

Document Scanner preview

Motivation

Existing mobile document scanner apps lacked real-time feedback, frustrating users who had to repeatedly attempt captures. I wanted to leverage GPU capabilities to provide instant visual feedback while capturing documents.

Objectives & Goals

Create a system that detects and visually outlines document edges in a live camera feed as each frame arrives, so that users can capture documents accurately without noticeable delay.

Solution & Implementation

I implemented edge detection (Canny) and line detection (Hough transform) as OpenGL shaders, so the per-pixel work runs in parallel on the GPU instead of sequentially on the CPU. The final system offered real-time visual feedback and automatic image rectification.
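To make the pipeline concrete, here is a minimal CPU reference sketch in Python of the two stages named above. It is not the project's shader code. The edge step is a simplified stand-in for Canny (a Sobel gradient-magnitude threshold, with no smoothing, non-maximum suppression, or hysteresis), and the function names, the threshold value, and the synthetic test image are all assumptions made for illustration. The per-pixel loops are the part a fragment shader would evaluate in parallel.

```python
import math

def sobel_edges(img, threshold):
    """Binary edge map from a 2D list of grayscale values.

    Simplified stand-in for Canny: thresholds the Sobel gradient
    magnitude only. Border pixels are left as non-edges.
    """
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal and vertical Sobel responses.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if math.hypot(gx, gy) >= threshold:
                edges[y][x] = 1
    return edges

def hough_lines(edges, n_theta=180):
    """Vote in (rho, theta) space and return the strongest line.

    Each edge pixel votes for every line rho = x*cos(theta) + y*sin(theta)
    passing through it. Returns (rho, theta in degrees, vote count).
    """
    thetas = [math.pi * t / n_theta for t in range(n_theta)]
    trig = [(math.cos(t), math.sin(t)) for t in thetas]
    votes = {}
    for y, row in enumerate(edges):
        for x, is_edge in enumerate(row):
            if not is_edge:
                continue
            for ti, (c, s) in enumerate(trig):
                key = (round(x * c + y * s), ti)
                votes[key] = votes.get(key, 0) + 1
    (rho, ti), count = max(votes.items(), key=lambda kv: kv[1])
    return rho, math.degrees(thetas[ti]), count

# Synthetic 16x16 frame (hypothetical input): dark left half, bright right half.
img = [[0] * 8 + [255] * 8 for _ in range(16)]
edges = sobel_edges(img, threshold=500)
rho, theta, count = hough_lines(edges)
print(rho, theta, count)
```

For this synthetic frame the strongest line should be close to vertical (theta near 0 degrees), at the boundary between the two halves. In a scanner, the four strongest roughly perpendicular lines would give the document outline, and their intersections would feed the rectification step.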

Results & Achievements

Moving the processing to the GPU made it fast enough for real-time use. I prototyped real-time edge and line detection, which showed that the approach is practical and gives a better user experience than CPU-based methods.

Learnings & Reflections

This project significantly expanded my knowledge of GPU programming, optimization techniques, and interactive computer vision.