🛠️ ragflow-fix-ocr-gpu-memory

This repository documents a tested fix for an OCR-related GPU memory issue in RAGFlow. It records the environment, configuration, and code changes that resolved the error, so that PDF parsing and indexing remain stable even for large medical textbooks.

✅ Background

While running RAGFlow with GPU acceleration, OCR parsing of large files (e.g., 177 MB scanned medical textbooks) could cause abnormal termination or memory errors. After troubleshooting, adjustments were made to system configuration, Docker runtime, and the RAGFlow OCR code (ocr.py). With these changes, the system can now reliably parse large PDFs without exceptions.
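The exact changes made to ocr.py are in the file itself and are not reproduced here. As a rough illustration of the kind of adjustment involved, the sketch below assumes the OCR models run through ONNX Runtime's CUDA execution provider and caps its memory arena. The helper name `cuda_providers` and the 4 GiB default are hypothetical, chosen only for this example.

```python
from typing import Any

GIB = 1024 ** 3


def cuda_providers(device_id: int = 0, mem_limit_gib: float = 4.0) -> list[Any]:
    """Build an ONNX Runtime provider list with a capped CUDA memory arena.

    Hypothetical helper: the name and default limit are assumptions,
    not taken from the repository's ocr.py.
    """
    if mem_limit_gib <= 0:
        raise ValueError("mem_limit_gib must be positive")
    cuda_options = {
        "device_id": device_id,
        # Upper bound on the CUDA memory arena, in bytes.
        "gpu_mem_limit": int(mem_limit_gib * GIB),
        # Grow the arena by the requested amount rather than in
        # power-of-two steps, which keeps peak usage closer to actual need.
        "arena_extend_strategy": "kSameAsRequested",
    }
    # List the CPU provider second so inference still works without CUDA.
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]
```

The returned list can be passed as the `providers` argument of `onnxruntime.InferenceSession`. A suitable limit depends on the GPU and on how much memory the other models on the same card need.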

🔬 Testing Environment

CPU: AMD Ryzen 9 5900X
Memory: DDR4, 64 GB
GPU: NVIDIA RTX 3090, 24 GB
File tested: 177 MB medical textbook (scanned PDF)
Image used: v0.20.5-slim
Launch method: docker-compose-gpu.yml
Model backend: Ollama
Nginx adjustment:
```
client_max_body_size 256M;
```
Raising this limit was critical for uploading large PDFs such as the 177 MB test file.
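To check in advance whether a file fits under a given `client_max_body_size`, a small helper like the following can be used. This is a minimal sketch: the function names are made up for this example, and only the `k`/`K` and `m`/`M` suffixes (1024-based) are handled.

```python
def parse_nginx_size(value: str) -> int:
    """Convert an Nginx size string such as '256M' or '512k' to bytes."""
    value = value.strip()
    if not value:
        raise ValueError("empty size string")
    units = {"k": 1024, "m": 1024 ** 2}
    suffix = value[-1].lower()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    # No suffix: the value is already in bytes.
    return int(value)


def upload_fits(file_size_bytes: int, limit: str) -> bool:
    """Return True if a file of the given size is within the limit.

    In Nginx, a limit of 0 disables the request body size check.
    """
    limit_bytes = parse_nginx_size(limit)
    return limit_bytes == 0 or file_size_bytes <= limit_bytes


if __name__ == "__main__":
    test_file = 177 * 1024 ** 2  # roughly the size of the test PDF
    print(upload_fits(test_file, "256M"))
    print(upload_fits(test_file, "128M"))
```

The upload request body is slightly larger than the file itself because of multipart form encoding, so it is worth leaving some headroom above the largest expected file.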

After these adjustments, the PDF was re-sliced, processed, and parsed without any anomalies or error messages.

📂 Repository Structure

ragflow-fix-ocr-gpu-memory/
│
├── ragflow-logs/                 # Logs captured during test runs
│   ├── ragflow_server.log        # Main server log
│   └── task_executor_*.log       # Task execution logs
│
├── system_image/                 # Screenshots of system setup & configurations
│   ├── explorer.png              # Windows file explorer showing file layout
│   ├── ollama-settings.png       # Ollama model configuration
│   ├── rag-flow.png              # RAGFlow runtime interface
│   └── slice_func.png            # Demonstration of PDF slicing function
│
├── .env_example                  # Example .env file for environment variables
├── .gitignore                    # Git ignore rules
├── docker-compose-gpu.yml        # Docker Compose file with GPU support
├── ocr.py                        # Modified OCR script to fix GPU memory usage
├── README.md                     # This documentation
├── service_conf.yaml.template    # Template for service configuration
└── 《内科学》（第10版）.pdf      # Sample large medical textbook (177 MB, for test)

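📸 Images & References

Directory structure: system_image/explorer.png
Ollama model configuration: system_image/ollama-settings.png
RAGFlow runtime environment: system_image/rag-flow.png
Demonstration of PDF slicing: system_image/slice_func.png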

🔗 Additional Resources

All supporting files, logs, and screenshots are included in this repository. You can review them to reproduce the results or verify the fix.
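One way to review the logs quickly is to scan them for error markers. The sketch below is only an illustration: the search patterns are assumptions about what a GPU memory failure might look like in a log, not strings taken from the actual files, and the function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Iterable

# Assumed markers of a failure; adjust to match the real log contents.
ERROR_PATTERN = re.compile(r"out of memory|CUDA error|Traceback", re.IGNORECASE)


def find_error_lines(lines: Iterable[str]) -> list[tuple[int, str]]:
    """Return (line_number, text) for every line matching ERROR_PATTERN."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ERROR_PATTERN.search(line)
    ]


def scan_log_dir(log_dir: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every *.log file in log_dir and collect matching lines per file."""
    results: dict[str, list[tuple[int, str]]] = {}
    for path in sorted(Path(log_dir).glob("*.log")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            hits = find_error_lines(handle)
        if hits:
            results[path.name] = hits
    return results


if __name__ == "__main__":
    for name, hits in scan_log_dir("ragflow-logs").items():
        for number, text in hits:
            print(f"{name}:{number}: {text}")
```

An empty result for the `ragflow-logs` directory would be consistent with the run completing without the memory error, although it does not prove it, since the patterns are guesses.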

👉 Repository link: https://github.com/loks666/ragflow-fix-ocr-gpu-memory

If you encounter further issues, feel free to open an issue or start a discussion. Collaboration is welcome to refine and extend the fix.
