Query Regarding AI Model for Oracle Bone Script (甲骨文) Recognition and Differentiation from Mimicked Symbols #1215

### Search before asking

- [x] I have searched the HUB [issues](https://github.com/ultralytics/hub/issues) and [discussions](https://github.com/ultralytics/hub/discussions) and found no similar questions.


### Question

I’m working on an AI-based symbol recognition project focused on ancient Chinese Oracle Bone characters (甲骨文). The system should recognize and classify these symbols while distinguishing authentic Oracle Bone characters from mimicked or artificially generated (fake) ones that visually resemble the originals.

The current dataset contains 30 symbols in total: 15 authentic Oracle Bone characters and 15 mimicked (false) symbols intentionally designed to resemble the real ones. The objective is to train and evaluate an AI model that can learn the subtle differences in stroke structure, curvature, and spatial composition between true and false symbols.
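
To make the setup concrete, here is a minimal sketch of how the 30 symbols could be labelled so that one annotation serves both the recognition task (which symbol) and the authenticity task (real or mimic). The class name `SymbolLabel`, its fields, and the ordering of the indices are illustrative assumptions, not an existing annotation format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SymbolLabel:
    """Hypothetical per-symbol annotation; field names are illustrative."""
    symbol_id: int       # index of the symbol within the 30-symbol set
    is_authentic: bool   # True = authentic Oracle Bone character, False = mimic


def build_labels(n_authentic: int = 15, n_mimicked: int = 15) -> List[SymbolLabel]:
    """Assign ids 0..14 to authentic symbols and 15..29 to mimicked ones.

    The ordering is an assumption made for this sketch only.
    """
    labels = [SymbolLabel(i, True) for i in range(n_authentic)]
    labels += [SymbolLabel(n_authentic + j, False) for j in range(n_mimicked)]
    return labels


labels = build_labels()
n_classes = len({label.symbol_id for label in labels})      # recognition target
n_authentic = sum(label.is_authentic for label in labels)   # authenticity target
```

With this layout, `symbol_id` would be the target for a 30-way classifier and `is_authentic` the target for a binary authenticity head, so both tasks can be trained and evaluated from the same annotations.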

I would like to seek guidance or insights on the following aspects:

1. Which AI architecture or model type (e.g., CNNs, Vision Transformers, multimodal models) would be best suited to symbol recognition and authenticity differentiation in ancient scripts?
2. What are the best practices for dataset preparation and annotation when dealing with a limited number of symbols and the stylistic irregularities found in historical scripts?
3. How can the model be trained to identify authenticity features (such as original stroke weight, spacing, or engraving patterns) that distinguish true Oracle Bone symbols from mimicked ones?
4. Are there any publicly available datasets or references containing verified Oracle Bone characters that could help expand or validate my dataset?
5. What evaluation metrics or benchmarking methods would be most effective for comparing recognition accuracy and authenticity detection on such a small-scale symbolic dataset?
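
As an illustration of the evaluation question, the sketch below shows the kind of authenticity metrics that could be reported, assuming each prediction is a boolean where `True` means "authentic". The function name `authenticity_metrics` and the choice of metrics are assumptions for discussion, not an established benchmark for this task.

```python
from typing import Dict, Sequence


def authenticity_metrics(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Dict[str, float]:
    """Binary metrics for authentic (True) vs. mimicked (False) predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)            # authentic, predicted authentic
    tn = sum(1 for t, p in pairs if not t and not p)    # mimic, predicted mimic
    fp = sum(1 for t, p in pairs if not t and p)        # mimic accepted as authentic
    fn = sum(1 for t, p in pairs if t and not p)        # authentic rejected as mimic

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        # Mean of recall on each class, so neither class dominates the score.
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Toy example with made-up predictions, for illustration only.
metrics = authenticity_metrics(
    y_true=[True, True, False, False],
    y_pred=[True, False, False, False],
)
```

With so few symbols, a single held-out split gives very noisy numbers, so scores like these would probably need to be averaged over repeated or cross-validated splits.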

This project aims to enhance the accuracy of AI-based ancient script recognition and explore methods to prevent confusion between authentic historical symbols and synthetic or misleading reproductions.

Any technical guidance, research direction, or references related to model training or dataset expansion would be greatly appreciated.

Thank you for your time and support.

<img width="591" height="705" alt="Image" src="https://github.com/user-attachments/assets/751adc18-0882-4322-baf7-a55fca2164bf" />

### Additional

_No response_
