-Uses Django admin interface to prototype and test business logic before building custom UI. This approach allows rapid iteration on data models and processing workflows.
-
-**Next**: AI-powered extraction to structured genealogy data.
+- **Date Standardization**: Multi-format Dutch/English date parsing ("15 maart 1654" → "1654-03-15")
+- **Genealogical ID Correction**: Systematic fixes for OCR errors in Roman numerals (IL→II, XIL→XII)
+- **Family Context Tracking**: Infers individual IDs from family group headers ("a. John" → "X.9.a")
+
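The three features listed above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the month table, the "day month year" date layout, the rule that an `L` directly after an `I` is an OCR misread, and the `X.9` family-header format are all assumptions inferred from the examples in the bullets. The sample names are made up.

```python
import re

# Assumed Dutch and English month names; spellings shared by both languages appear once.
MONTHS = {
    "januari": 1, "februari": 2, "maart": 3, "april": 4, "mei": 5, "juni": 6,
    "juli": 7, "augustus": 8, "september": 9, "oktober": 10, "november": 11,
    "december": 12, "january": 1, "february": 2, "march": 3, "may": 5,
    "june": 6, "july": 7, "august": 8, "october": 10,
}
DATE_RE = re.compile(r"^\s*(\d{1,2})\s+([A-Za-z]+)\s+(\d{4})\s*$")

def standardize_date(text):
    """Convert 'D month YYYY' to ISO 'YYYY-MM-DD'; return None if unrecognized."""
    m = DATE_RE.match(text)
    if not m:
        return None
    day, month_name, year = m.groups()
    month = MONTHS.get(month_name.lower())
    if month is None or not 1 <= int(day) <= 31:
        return None
    return f"{int(year):04d}-{month:02d}-{int(day):02d}"

def fix_roman_id(token):
    """Assumed rule: an 'L' directly after 'I' is a misread 'I' (IL -> II)."""
    return re.sub(r"(?<=I)L", "I", token)

# Hypothetical layouts: a family header starts with e.g. 'X.9',
# and a child entry starts with a lowercase letter and a period.
FAMILY_RE = re.compile(r"^([IVXLCDM]+\.\d+)\b")
CHILD_RE = re.compile(r"^([a-z])\.\s+(\S.*)$")

def assign_ids(lines):
    """Attach the most recent family header ID to each child entry."""
    current_family = None
    result = []
    for raw in lines:
        line = raw.strip()
        family = FAMILY_RE.match(line)
        if family:
            current_family = family.group(1)
            continue
        child = CHILD_RE.match(line)
        if child and current_family:
            result.append((f"{current_family}.{child.group(1)}", child.group(2)))
    return result

print(standardize_date("15 maart 1654"))
print(fix_roman_id("XIL"))
print(assign_ids(["X.9 Pieter Jansen", "a. John", "b. Maria"]))
```

Returning `None` for an unrecognized date, rather than guessing, keeps ambiguous OCR output available for manual review.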
+### Development Approach
+Uses Django admin interface to prototype and test business logic before building custom UI. This approach enables rapid iteration on data models and processing workflows while maintaining data quality through manual review capabilities.
+
+**Current Focus**: Optimizing OCR quality and refining neural network training data
+**Next Phase**: LLM integration for natural language queries and relationship inference

 ## Sample Data

@@ -67,15 +86,35 @@ python manage.py runserver

 ## Usage

-Upload documents via Django admin → automatic OCR processing → review extracted text and confidence scores.
+**Document Processing Workflow:**
+1. Upload documents via Django admin interface
+2. Automatic OCR processing with rotation detection and correction
+3. Intelligent text chunking with genealogical structure preservation
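
Step 3 does not say how structure is preserved during chunking. One plausible reading is that a family group is never split across chunks. The sketch below illustrates that idea only. The `chunk_by_family` name, the `X.9`-style header pattern, and the `max_chars` limit are hypothetical and not taken from the repository.

```python
import re

# Hypothetical family-header format, e.g. "X.9 ..."; the real pattern may differ.
FAMILY_HEADER = re.compile(r"^[IVXLCDM]+\.\d+\b")

def chunk_by_family(text, max_chars=2000):
    """Pack whole family blocks into chunks of at most max_chars where possible."""
    # First pass: cut the text into blocks, each starting at a family header.
    blocks, current = [], []
    for line in text.splitlines():
        if FAMILY_HEADER.match(line) and current:
            blocks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        blocks.append("\n".join(current))

    # Second pass: greedily merge blocks; a block is never split,
    # so a single oversized block becomes its own chunk.
    chunks, buf = [], ""
    for block in blocks:
        candidate = f"{buf}\n{block}" if buf else block
        if buf and len(candidate) > max_chars:
            chunks.append(buf)
            buf = block
        else:
            buf = candidate
    if buf:
        chunks.append(buf)
    return chunks

sample = "X.8 Jan\na. Piet\nX.9 Klaas\na. John\nb. Maria"
for chunk in chunk_by_family(sample, max_chars=20):
    print(repr(chunk))
```

Allowing an oversized block to exceed the limit trades a strict size bound for intact family context, which seems the safer default when chunks feed later extraction steps.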