Would love to get feedback from the community! What data quality metrics do you find most valuable in your work?
# Copy 3

### For r/MachineLearning

**Title**: [D] Dingo 1.9.0 released: Open-source data quality evaluation with enhanced hallucination detection
Just released **Dingo 1.9.0** with major upgrades for RAG-era data quality assessment.
### Key Updates:

**🔍 Enhanced Hallucination Detection**
Dingo 1.9.0 integrates two powerful hallucination detection approaches:
- **HHEM-2.1-Open local model** (recommended): runs locally without API costs
- **GPT-based cloud detection**: leverages OpenAI models for detailed analysis

Both evaluate LLM-generated answers against the provided context using consistency scoring (0.0-1.0 range, configurable thresholds).
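Threshold-based scoring of this shape can be sketched as follows. This is an illustrative stand-in, not Dingo's actual API: the `judge` function and its 0.5 default are invented here for clarity.

```python
from dataclasses import dataclass

@dataclass
class HallucinationVerdict:
    score: float  # consistency score in [0.0, 1.0]; higher = more grounded
    is_hallucination: bool

def judge(score: float, threshold: float = 0.5) -> HallucinationVerdict:
    """Map a consistency score to a verdict using a configurable threshold.

    Scores below the threshold flag the answer as inconsistent with the
    supplied context (i.e., a likely hallucination).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0.0, 1.0], got {score}")
    return HallucinationVerdict(score=score, is_hallucination=score < threshold)

# A grounded answer scores high; a fabricated one scores low.
print(judge(0.92).is_hallucination)                 # False
print(judge(0.31).is_hallucination)                 # True
print(judge(0.31, threshold=0.2).is_hallucination)  # False (looser threshold)
```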

**⚙️ Configuration System Overhaul**
Complete rebuild with modern DevOps practices:
- Hierarchical inheritance (project → user → system levels)
- Hot-reload capabilities for instant config changes
- Schema validation with clear error messages
- Template system for common scenarios
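Hierarchical inheritance of this kind can be pictured as a recursive dictionary merge, with project-level keys overriding user-level keys, which override system defaults. The keys below are invented for illustration and are not Dingo's real schema:

```python
from functools import reduce

def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Lowest to highest precedence: system -> user -> project.
system  = {"log_level": "warning", "eval": {"threshold": 0.5, "retries": 3}}
user    = {"log_level": "info"}
project = {"eval": {"threshold": 0.8}}

config = reduce(deep_merge, [system, user, project])
print(config)
# {'log_level': 'info', 'eval': {'threshold': 0.8, 'retries': 3}}
```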

**📚 DeepWiki Document Q&A**
Transform static documentation into interactive knowledge bases:
- Multi-language support (EN/CN/JP)
- Context-aware multi-turn conversations
- Visual document structure parsing
- Semantic navigation and cross-references

### Why It Matters:
Traditional hallucination detection relies on static rules. Our approach provides context-aware validation essential for production RAG systems, SFT data quality assessment, and real-time LLM output verification.
What hallucination detection approaches are you currently using? Interested in hearing about your RAG quality challenges.

---
### For r/OpenSource

**Title**: [Project] Dingo 1.9.0: Major update to our data quality evaluation toolkit

The community response has been incredible! **Dingo 1.9.0** delivers features you've been requesting.

### Project Stats:
- ⭐ 311 GitHub stars and growing
- 🍴 32 active development forks
- 📚 Comprehensive multi-language documentation
- 🔄 Full CI/CD pipeline with automated testing

### What's New:
**Hallucination Detection**: Integrated HHEM-2.1-Open model and GPT-based detection for comprehensive fact-checking against context.

**Config System Redesign**: Hierarchical inheritance, hot-reload, and template-based setup replacing the previous complex configuration approach.
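Hot-reload is often implemented by checking a config file's change signature on each read. The standalone sketch below is an approximation of that idea, not Dingo's implementation; the `HotReloadConfig` class and the JSON file layout are invented for illustration.

```python
import json
import os
import tempfile

class HotReloadConfig:
    """Re-read a JSON config file whenever its stat signature changes."""

    def __init__(self, path: str):
        self.path = path
        self._sig = None
        self._data = {}
        self._reload_if_changed()

    def _reload_if_changed(self) -> None:
        st = os.stat(self.path)
        sig = (st.st_mtime_ns, st.st_size)  # cheap change signature
        if sig != self._sig:
            with open(self.path) as f:
                self._data = json.load(f)
            self._sig = sig

    def get(self, key, default=None):
        self._reload_if_changed()  # poll on every read
        return self._data.get(key, default)

# Demo: edit the file on disk and observe the new value without restarting.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dingo.json")
    with open(path, "w") as f:
        json.dump({"threshold": 0.5}, f)
    cfg = HotReloadConfig(path)
    before = cfg.get("threshold")
    with open(path, "w") as f:
        json.dump({"threshold": 0.85}, f)  # different size -> new signature
    after = cfg.get("threshold")
    print(before, after)  # 0.5 0.85
```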

**DeepWiki Integration**: Interactive documentation system that transforms static docs into conversational AI assistants.

### Community Impact:
This release addresses community requests through extensive collaboration: issues resolved, PRs merged, and new contributors welcomed from around the world.

### Contributing Opportunities:
- **Core Development**: Python/ML implementation
- **Documentation**: Technical writing and tutorials

2. Check "good first issue" labels for beginner-friendly tasks
3. Join our community discussions

**License**: Apache 2.0 - fully open-source, no vendor lock-in

What data quality tools does your team currently use? Would love to hear about your experiences and challenges.

---
### For r/artificial

**Title**: Dingo 1.9.0: Addressing AI hallucination through enhanced detection

As AI systems become more prevalent, data quality and factual accuracy are paramount concerns. We're sharing our latest release, which addresses these challenges.
### The Challenge:
- LLM hallucinations in production systems
- RAG systems losing factual accuracy when combining sources
- Temporal inconsistency as information becomes outdated
- Quality control across different data modalities

### Our Solution:
**Dingo 1.9.0** provides comprehensive hallucination detection through two complementary approaches:

**Local HHEM-2.1-Open Integration**: Runs Vectara's hallucination evaluation model locally, providing fast, cost-effective fact-checking without API dependencies.

**Cloud-based GPT Detection**: Leverages advanced language models for detailed consistency analysis with comprehensive reasoning.
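Complementary detectors like these can sit behind a single interface, so callers can swap the local model for the cloud judge without changing evaluation code. The sketch below is illustrative only: the `Detector` protocol and both class names are invented here (the local scorer is a word-overlap stub standing in for a real HHEM call), not Dingo's API.

```python
from typing import Protocol

class Detector(Protocol):
    def consistency(self, context: str, answer: str) -> float:
        """Return a consistency score in [0.0, 1.0]."""
        ...

class LocalHHEMDetector:
    """Would wrap a locally loaded HHEM-2.1-Open model; stubbed here."""
    def consistency(self, context: str, answer: str) -> float:
        # Placeholder word-overlap heuristic standing in for the model call:
        answer_words = set(answer.lower().split())
        overlap = answer_words & set(context.lower().split())
        return len(overlap) / max(len(answer_words), 1)

class CloudGPTDetector:
    """Would call an OpenAI-style API; stubbed here."""
    def consistency(self, context: str, answer: str) -> float:
        raise NotImplementedError("requires API credentials")

def is_grounded(detector: Detector, context: str, answer: str,
                threshold: float = 0.5) -> bool:
    return detector.consistency(context, answer) >= threshold

ctx = "Paris is the capital of France."
print(is_grounded(LocalHHEMDetector(), ctx, "Paris is the capital of France."))
```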
**Smart Configuration Management**: Completely redesigned system enabling environment-aware inheritance, hot-reload capabilities, and template-based setups for rapid deployment.

**Interactive Documentation**: DeepWiki transforms static documentation into conversational AI assistants, improving team knowledge sharing and reducing information silos.

### Real-World Applications:
- **Production Monitoring**: Real-time quality control for customer-facing AI systems
- **Training Pipeline**: Pre-processing validation for SFT datasets
- **Enterprise Knowledge**: Quality assurance for internal AI applications
- **Research**: Systematic evaluation across different model architectures

### Community Adoption:
Growing adoption across organizations focused on AI safety and reliability, with particular interest from teams building production RAG systems and those requiring systematic data quality assessment.

**Try it**: Available on GitHub under the Apache 2.0 license