TTH hash support [tracked]


PetriW
AniDB Staff
Posts: 1522
Joined: Sat May 24, 2003 2:34 pm

Post by PetriW »

Skywalka wrote: So please, if you really consider adding this hash, please add all other possible hashes that might one day be necessary.
You forgot to say "and please predict what hashes we'll use in 10 years". ;)
nwa
AniDB Staff
Posts: 585
Joined: Sat Jun 07, 2003 10:51 am

Post by nwa »

hashes... sigh
I think it's a bother to add them to files, crc32 is enough for me and I can live through the hash fixers... but if I'd need to add 10 different hashes to anidb, that's overkill imo...

or you could make a special little pop-up window that has all the hashes that one could add but only display crc32 in the file detail page with a little link to the popup window with all the hashes.. this way, the fields won't be left empty and bad for the eye
:P
analogued
Posts: 54
Joined: Mon Jul 12, 2004 6:53 am

Post by analogued »

nwa wrote:hashes... sigh
I think it's a bother to add them to files, crc32 is enough for me and I can live through the hash fixers... but if I'd need to add 10 different hashes to anidb, that's overkill imo...

or you could make a special little pop-up window that has all the hashes that one could add but only display crc32 in the file detail page with a little link to the popup window with all the hashes.. this way, the fields won't be left empty and bad for the eye
:P
Well... the whole point of adding these new hashes (SHA-1 and TTH) is to generate links for other networks. Currently AniDB is useful for people using the eDonkey network... others have to look elsewhere for this type of information.
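
To make "generating links" concrete, here is a minimal sketch, assuming the commonly used ed2k link layout and magnet URI layout. The function names and sample values are hypothetical, and nothing here is taken from AniDB's own code.

```python
from urllib.parse import quote


def ed2k_link(name: str, size: int, ed2k_hash: str) -> str:
    """Build an eDonkey file link from a name, a size in bytes and a hex hash."""
    return f"ed2k://|file|{quote(name)}|{size}|{ed2k_hash.lower()}|/"


def magnet_link(name: str, size: int, sha1_b32: str = "", tth_b32: str = "") -> str:
    """Build a magnet URI carrying whichever Base32-encoded hashes are known."""
    parts = []
    if sha1_b32:
        parts.append(f"xt=urn:sha1:{sha1_b32.upper()}")
    if tth_b32:
        parts.append(f"xt=urn:tree:tiger:{tth_b32.upper()}")
    parts.append(f"xl={size}")          # exact length in bytes
    parts.append(f"dn={quote(name)}")   # display name
    return "magnet:?" + "&".join(parts)


# Placeholder values only; these are not real hashes of any file.
print(ed2k_link("Some Show - 01.avi", 183500800, "0123456789abcdef" * 2))
print(magnet_link("Some Show - 01.avi", 183500800, tth_b32="A" * 39))
```

Without a stored TTH or SHA-1 value, only the first kind of link can be produced, which is the limitation described above.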
ender
Posts: 7
Joined: Thu Jan 15, 2004 6:24 pm
Location: The Sunny Side of the Alps

Post by ender »

FYI, it took me 4 days to hash my 1.4TiB DC share on an Athlon 1GHz, with the hasher running on low priority. Note that all my disks are encrypted, so it'll be faster on a typical system.
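
As a rough cross-check of those figures, the average hashing rate can be worked out as below. This is only arithmetic on the numbers quoted in the post (1.4 TiB, 4 days); the variable names are illustrative.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2

share_bytes = 1.4 * TIB            # size of the share, from the post
elapsed_seconds = 4 * 24 * 60 * 60  # four days

throughput_mib_s = share_bytes / elapsed_seconds / MIB
print(f"average hashing rate: {throughput_mib_s:.2f} MiB/s")
```

That works out to roughly 4.2 MiB/s on average for a low-priority hasher reading from encrypted disks.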
Elias
Posts: 242
Joined: Tue Feb 17, 2004 4:55 pm

Post by Elias »

PetriW wrote:
Skywalka wrote: So please, if you really consider adding this hash, please add all other possible hashes that might one day be necessary.
You forgot to say "and please predict what hashes we'll use in 10 years". ;)
Predicting may be impossible, but it is possible to prepare AniDB so that another hash can be added easily (as long as its format is not too strange). Assuming new hashes are strings of letters or digits, a few extra fields could be added now in advance, like hash1, ..., hash5 (as varchars), plus an indicator for each one saying which hash it holds: what_hash1, ..., what_hash5 (a simple char or number, with default 0 = no hash, 1 = TTH, 2 = another new wonderful hash, ...). Then, when a new hash is needed, the only task is to add a new value to the list of possible hashes. A user who wants to add hashes can choose the hash type from the list and fill in the associated field with the hash value (repeating this a few times if there are more hashes).
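
The generic-slot idea can be sketched as follows, using an in-memory SQLite table. The table name `files`, the `fid` and `crc32` columns, the helper `add_hash` and the indicator value 2 are assumptions made for illustration; only the hash1..hash5 / what_hash1..what_hash5 layout and the values 0 and 1 come from the post.

```python
import sqlite3

# Indicator values from the post: 0 = no hash, 1 = TTH. Value 2 is a made-up example.
HASH_TYPES = {0: "none", 1: "TTH", 2: "SHA-1"}
SLOTS = range(1, 6)  # hash1 .. hash5

conn = sqlite3.connect(":memory:")
slot_columns = ", ".join(
    f"hash{i} VARCHAR(64), what_hash{i} INTEGER NOT NULL DEFAULT 0" for i in SLOTS
)
conn.execute(f"CREATE TABLE files (fid INTEGER PRIMARY KEY, crc32 CHAR(8), {slot_columns})")


def add_hash(conn, fid, hash_type, value):
    """Store value in the first free slot of file fid and return the slot number."""
    if hash_type not in HASH_TYPES or hash_type == 0:
        raise ValueError("unknown hash type")
    indicator_columns = ", ".join(f"what_hash{i}" for i in SLOTS)
    row = conn.execute(
        f"SELECT {indicator_columns} FROM files WHERE fid = ?", (fid,)
    ).fetchone()
    if row is None:
        raise KeyError(fid)
    for slot, indicator in zip(SLOTS, row):
        if indicator == 0:  # 0 marks an unused slot
            conn.execute(
                f"UPDATE files SET hash{slot} = ?, what_hash{slot} = ? WHERE fid = ?",
                (value, hash_type, fid),
            )
            return slot
    raise ValueError("all five hash slots are already in use")


conn.execute("INSERT INTO files (fid, crc32) VALUES (1, 'DEADBEEF')")
slot = add_hash(conn, 1, 1, "A" * 39)  # placeholder TTH value, not a real hash
```

Under this layout, supporting a new hash type later means adding one entry to `HASH_TYPES` and no schema change, at the cost of five mostly empty column pairs per file.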

But I wouldn't give this high priority. I use DC sometimes, but for searching I still use the anime name and episode number, not TTH. Searching by TTH would in most cases return 0 sources, whereas by name I get some sources, sometimes a little different from the original (if a file has been changed, it can be repaired afterwards by eMule).
Skywalka
Posts: 889
Joined: Tue Sep 16, 2003 7:57 pm

Post by Skywalka »

The problem is the hashing, not adding a new row to the table in the DB, Elias.

I cannot keep everything on harddisk forever, that is just too expensive. My collection is 1.3 TiB on a RAID system that is full. I started adding non-raided harddisks, but honestly I am not willing to shell out more money to have all my anime available almost instantly. I put them on DVD-R or DVD+R or DVD-RAM, and it takes roughly 45 minutes to copy the contents of a 1x DVD-R to the harddisk. Multiply that by 250 and you know how long it would take to hash all my files again. This would mean that in the future, with new clients, only new files could be shared there and easily identified, because nobody would be willing to get out the old files and hash them to add that information to AniDB. I mean, it might be possible that somebody gets a file request, takes out the DVD or CD and puts it in the share folder of the new share utility, and it might be possible that he runs AoM as well and that AoM will then add an automated creq to have the information added.
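
Spelled out, the estimate above looks like this. It counts only the copy time quoted in the post (45 minutes per disc, 250 discs) and ignores disc swapping and the hashing itself.

```python
minutes_per_disc = 45
discs = 250

total_minutes = minutes_per_disc * discs
total_hours = total_minutes / 60
total_days = total_hours / 24

print(f"{total_minutes} min = {total_hours} h = {total_days:.1f} days of continuous copying")
```

That is about 187.5 hours, or close to eight days of non-stop disc copying.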

But I think it more often will be that there is somebody who sees a file entry on AniDB, thinks he would like to get that file and then needs the hash already in anidb to be able to find the file on the network. Like it was mentioned earlier, there might be a lot of different files on the network, and he won't be able to identify the right file. He'd have to download a file and then hash it for the crc32 or sha checksum and if he has the right file, then his AoM would add the info to the DB.
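
The "download it, then hash it and compare" step can be sketched as below, assuming the listing stores CRC32 as eight hexadecimal digits. The function names `file_crc32` and `matches_listing` are hypothetical and are not part of AoM.

```python
import zlib


def file_crc32(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the CRC32 of a file as eight uppercase hex digits."""
    crc = 0
    with open(path, "rb") as fh:
        # Read in chunks so large video files do not have to fit in memory.
        while chunk := fh.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08X}"


def matches_listing(path: str, expected_crc32: str) -> bool:
    """Compare a downloaded file against the CRC32 shown in a file listing."""
    return file_crc32(path) == expected_crc32.strip().upper()
```

The catch is the one described above: this check only works after the whole file has been transferred, whereas a network hash such as TTH lets the client pick the right file before downloading.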

The probability that older files are automatically hashed is very low, and one purpose of AniDB is to lessen the amount of unnecessary data transfer (-> corrupted file transfer), which won't be possible if you can't find out whether a file is good or not before transferring it.

At the moment it would be no problem for me to get all the hashsums you can think of for all of my files. It might take a month or so to get them all, but then we'd have them. In the future I might have deleted files or only have them on removable media, and getting hashes then could be a bit tiresome/troublesome.

So if it isn't a space problem in the DB, if it does not make the DB slower and if PetriW is willing to add 35 hash algorithms, I'd say let's bloat AoM's hash function and get all possible hashes we can think of into AniDB ;-)

I know this won't happen, and therefore I'd rather suggest we come to a final decision on hashes, get all the hashes implemented that we think will _ever_ be necessary, and then close the case and bury the file in a cement block in the depths of the ocean.

Oh and while I am at it: now would also be a good time to add the automatic Video Codec and Audio codec information gathering support to AoM, although it is a lot easier to get that info without copying all the data to the harddisk. Would still be tiresome to get 250+ discs out of their folders so PetriW, implement it now! ;-) ;) ;)
exp
Site Admin
Posts: 2438
Joined: Tue Oct 01, 2002 9:42 pm
Location: Nowhere

Post by exp »

well,

adding 35 hashes would be a real pain.
it would increase the anidb database size considerably and it would also slow down AoM's hashing quite a bit.
I think we should only add hashes for which we have a real demand.

And even if we add a new hash type later and most old files don't have those hashes, so what?
with time more and more of even the old files will be on some anidb user's hdd somewhere.
and even if they aren't, well there are files without those hashes then, if someone really needs them he can still request them on irc/forum.

BYe!
EXP
rowaasr13
Posts: 415
Joined: Sat Sep 27, 2003 4:57 am

Post by rowaasr13 »

BTW, about sizes. I really hope that hashes are not stored as strings in AniDB, are they?
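
To put numbers on the size question, here is a small sketch comparing raw bytes with the usual text encodings for a 24-byte (192-bit) Tiger tree root. How AniDB actually stores its hashes is not stated in this thread, and the digest below is a stand-in value, not a real hash.

```python
import base64
import binascii

digest = bytes(range(24))  # stand-in for a 24-byte Tiger tree root

as_hex = binascii.hexlify(digest).decode()
as_b32 = base64.b32encode(digest).decode().rstrip("=")  # DC-style, without padding


def text_to_raw(b32_text: str) -> bytes:
    """Convert an unpadded Base32 hash string back to raw bytes for compact storage."""
    padded = b32_text + "=" * (-len(b32_text) % 8)
    return base64.b32decode(padded)


print(len(digest), "bytes raw")
print(len(as_b32), "characters as Base32")
print(len(as_hex), "characters as hex")
```

Stored as text, the same value takes 39 characters in Base32 or 48 in hex, against 24 bytes in a binary column.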
Skywalka
Posts: 889
Joined: Tue Sep 16, 2003 7:57 pm

Post by Skywalka »

Hehe, I expected that answer from EXP and I guess it's acceptable. Files that aren't shared won't need a new hash and those which are shared will eventually get hashed.

And it's not that important to have all files hashed. That will keep the size of the DB smaller.
ninjamask
Posts: 50
Joined: Mon Apr 12, 2004 10:47 pm
Location: Germany, Cologne

Post by ninjamask »

After reading all the opinions, I do think it's a good idea to add TTH hashes.

mata ne, ninja. ^^
[NOR]Nico
Posts: 1
Joined: Fri Dec 09, 2005 8:47 pm

TTH support in ANIDB

Post by [NOR]Nico »

As a DC++ user I know it would be greatly appreciated by many of AniDB's users if you added TTH listings to your file info service. The Direct Connect network has a large anime community with several thousand anime fans. I know that a large part of the DC anime community uses AniDB, and we constantly promote the AniDB site in our networks as a great source of info.

TTH is very useful because you can find the exact file you are looking for, and it makes it easy to filter out corrupted files. It would not be a big problem to add TTH support, and a LOT of people would benefit from it. I sure know I would use your site even more if you added this function.
Raptor
Posts: 155
Joined: Mon Nov 01, 2004 11:07 pm

Post by Raptor »

As a fellow DC++ user I completely agree. It's very annoying to download multiple versions of a file in the hope that one of them will be the good one...