Replies: 3 comments
-
This is a good idea that's been on my mind for a while, but I've never found a proper way to do it. There are a number of issues:
As long as the principles below are met, I'm open to any ideas.
P.S. None of the anime is old enough to have entered the public domain yet.
-
Thanks again for your thoughtful response and for sharing your perspective so clearly. It's refreshing to see someone take such care with both the technical and ethical sides of a project.

I also had one more question, if you don't mind me asking: since trace.moe indexes such a large amount of anime content (over 100k hours), I was curious how you manage the collection and storage of that data. It seems like it would require a lot of bandwidth and storage, not to mention the challenge of obtaining the episodes themselves. If you're comfortable sharing, I'd love to learn how you approached that side of the project, whether it's automation, storage optimization, legal handling, or something else. I understand if some parts need to remain private, but any insight would be really appreciated.

Thanks again for being so open about your work and for making trace.moe such a valuable resource.
-
The entire collection is about 30TB now, which is pretty small compared to what other DataHoarders on Reddit are storing, even with backups. And there's only a limited amount of new anime produced every year, so it grows steadily by only about 1TB/year.

I have some scripts that help me detect duplicate entries (either by file name or stream hash), and from time to time I'll check whether I should keep or drop a file. They're just ad-hoc scripts/commands, nothing magical. Some other tools I find useful: ncdu, vifm, rclone, yt-dlp.

Like collecting stamps, I'd been collecting as a hobby long before I made this search engine. 100,000 files may seem like a lot, but over a span of 15 years it's less than 20 files/day. I also use AniList to check if anything is missing from my dataset, and use it to create regex patterns that move files automatically to the correct folder named with the AniList ID. So most of the time it runs unattended throughout a season.

It doesn't use that much bandwidth either (<1TB/month), because the video previews are very short and highly compressed. Hetzner and Cloudflare are also part of the Bandwidth Alliance, so that's basically free unlimited traffic.
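As a rough illustration of the duplicate-detection idea described above (this is not Soruly's actual script; the file extension, hash choice, and function names are assumptions for the sketch), one could group files by content hash and flag groups with more than one entry:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash file contents in 1MB chunks so large videos never load fully into memory."""
    h = hashlib.sha1()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group all .mkv files under root by content hash.

    Any group with more than one path is a duplicate candidate
    to review manually (keep one copy, drop the rest).
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for p in sorted(root.rglob("*.mkv")):
        groups[file_hash(p)].append(p)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

A real workflow would likely hash only the video stream (e.g. via ffmpeg) rather than the whole container, so remuxed copies with different metadata still match, but the grouping logic stays the same.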
-
Hi Soruly,
I’m really impressed by the work you’ve done on trace.moe — it’s an incredibly helpful tool and an inspiring open-source project.
I'm currently exploring how anime scene search engines work, and I'd like to experiment with building something similar for personal or educational use. I noticed that while the codebase is open source, the anime video dataset used for indexing isn't included (which is completely understandable).
I wanted to kindly ask:
• Is there any way to access the anime episode library you use, even in limited form (e.g. public domain anime or a subset)?
• Or alternatively, is there an API or service you provide (public or private) that allows querying or accessing the anime video data for indexing or testing?
I completely understand if this data can’t be shared due to copyright or bandwidth constraints, but I’d really appreciate any pointers or suggestions you’re willing to offer.
Thank you again for building trace.moe and making part of it open to the community.