
There is nearly 10TB of YouTube metadata available on archive.org: https://archive.org/details/youtube-metadata


These show as unavailable/with lock icons for me. Is there some process to download locked content from IA?


redd-archiver uses Postgres full-text search. For static search you could use lunr.js.
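
A rough sketch of what the Postgres side can look like; the table and column names (comments, body_tsv) and the DSN are assumptions for illustration, not redd-archiver's actual schema:

    # Minimal Postgres full-text search query via psycopg2 (hypothetical schema).
    import psycopg2

    conn = psycopg2.connect("dbname=reddarchiver")  # assumed DSN
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, ts_rank(body_tsv, query) AS rank
            FROM comments, plainto_tsquery('english', %s) AS query
            WHERE body_tsv @@ query
            ORDER BY rank DESC
            LIMIT 20
            """,
            ("self hosted archive",),
        )
        for row in cur.fetchall():
            print(row)

lunr.js, by contrast, builds its index ahead of time so it can be shipped as part of a fully static export.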


The torrent has data for the top 40,000 subs on Reddit. Thanks to Watchful1 splitting the data by subreddit, you can download only the subreddit you want from the torrent.


I am going to be honest: this looks really cool.

40,000 subs is a good number, and I hope coverage can be extended to even more subreddits.

Perhaps we can also migrate all or much of the data to Lemmy instances and finally get those instances up and running.

Thank you for creating this. It opens up a lot of interesting opportunities.


The data for 2025-12 has already been released; it is usually released every month and just needs to be split and reprocessed for 2025 by Watchful1. I will probably eventually add support for importing data from the monthly Arctic Shift dumps so that archives can be updated monthly (a rough import sketch follows the links below).

https://github.com/ArthurHeitmann/arctic_shift/releases

Arctic Shift https://academictorrents.com/browse.php?search=RaiderBDev

Watchful1 https://academictorrents.com/browse.php?search=Watchful1
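
If the monthly dumps are zstd-compressed NDJSON like the Pushshift-style dumps (an assumption here, and the filename below is hypothetical), an importer can stream them without unpacking to disk:

    # Stream a zstd-compressed NDJSON dump line by line.
    import io
    import json
    import zstandard as zstd

    def iter_dump(path):
        """Yield one JSON object per line of the dump."""
        with open(path, "rb") as fh:
            # Large dumps are often written with a big window; allow up to 2 GiB.
            dctx = zstd.ZstdDecompressor(max_window_size=2**31)
            reader = io.TextIOWrapper(dctx.stream_reader(fh), encoding="utf-8")
            for line in reader:
                if line.strip():
                    yield json.loads(line)

    for post in iter_dump("submissions_2025-12.zst"):  # hypothetical filename
        print(post.get("subreddit"), post.get("title"))
        break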


Is the data web-scraped? Is Reddit OK with that?


I included a metadata dump of every subreddit found in the torrent. It includes a status field that shows whether a subreddit is private, along with many more details. A small filtering example follows the links below.

data catalog readme: https://github.com/19-84/redd-archiver/blob/main/tools/READM...

reddit data: https://github.com/19-84/redd-archiver/blob/main/tools/subre...
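
For example, assuming the metadata dump is NDJSON and that status uses a value like "public" (both assumptions; the data catalog readme has the actual format and filename):

    # Keep only the subreddits whose status field marks them as public.
    import json

    with open("subreddit_metadata.ndjson", encoding="utf-8") as fh:  # hypothetical filename
        public = [
            rec for rec in (json.loads(line) for line in fh if line.strip())
            if rec.get("status") == "public"
        ]

    print(len(public), "public subreddits")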


I've created tooling for an instance registry and a team-based leaderboard. The API has functions to support this as well, so that we can collectively host archives in a decentralized and distributed manner.

registry readme: https://github.com/19-84/redd-archiver/blob/main/docs/REGIST...

register instances: https://github.com/19-84/redd-archiver/blob/main/.github/ISS...


Thank you for your comment. Some example dotfiles were not copied into my original repo; they have now been added.

https://github.com/19-84/redd-archiver/commit/0bb103952195ae...

The docs have been updated with the mkdir steps.

https://github.com/19-84/redd-archiver/commit/c3754ea3a0238f...


Cheers. I checked the updated steps.

This is still missing the step of creating the `output/.postgres-data` dir, without which docker compose refuses to start.

After creating that manually, going to http://localhost/ shows a 403 Forbidden page, which makes you believe that something might have gone wrong.

This is before running `reddarchiver-builder python reddarc.py` to generate the necessary DB from the input data.


I've updated the workflow and added a placeholder page that is served before any archives are created. Thanks again! https://github.com/19-84/redd-archiver/commit/0dfd505ca81cb2...


Thank you for your comment. I will support any platform that has a complete dataset available, and I will take submissions of complete datasets through GitHub issues. https://github.com/19-84/redd-archiver/blob/main/.github/ISS...


The API and MCP server are very powerful ;)


I have also published sub statistics and profiling for each platform. These can be used to help identify which subs to prioritize for archiving; a rough sketch of how follows the links below.

reddit: https://github.com/19-84/redd-archiver/blob/main/tools/subre...

voat: https://github.com/19-84/redd-archiver/blob/main/tools/subve...

ruqqus: https://github.com/19-84/redd-archiver/blob/main/tools/guild...
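
A rough illustration of using the stats for prioritization; the filename and the total_posts field are placeholders, and the real column names are in the files linked above:

    # Rank subs by activity to decide what to archive first.
    import json

    with open("subreddit_stats.ndjson", encoding="utf-8") as fh:  # hypothetical filename
        stats = [json.loads(line) for line in fh if line.strip()]

    for sub in sorted(stats, key=lambda s: s.get("total_posts", 0), reverse=True)[:25]:
        print(sub.get("name"), sub.get("total_posts"))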

