ed2k hash function

Moderator: AniDB

MistaMuShu
Posts: 10
Joined: Sat May 28, 2005 2:50 am

Post by MistaMuShu »

After much searching around I've got some details of what I have to do in order to get an ed2k hash.

According to Wikipedia:
"The file (in question) is divided into 9.28 MB chunks and the hash is calculated for each one. The resulting hash table is hashed once again, and the final value is used as a part of the ed2k link."
What is the "resulting hash table" they speak of?

I might offload the task of hashing to an external program, but I'm wondering what people who have written clients used to generate this value. What libraries or modules are available? Any recommendations for specific command-line utilities?
PetriW
AniDB Staff
Posts: 1522
Joined: Sat May 24, 2003 2:34 pm

Post by PetriW »

BennieB and, to a lesser degree, I made the AoM one...

What the ed2k hash does is create an md4 hash for every 9728000-byte chunk of the file. If the file is smaller than 9728000 bytes, the md4 hash of the whole file is used as is. Otherwise the binary version of each chunk hash (the raw 16 bytes, not the hexadecimal text version) is concatenated in order, and the result is hashed one more time with md4.
Essentially, imagine each hash as a small string filled with the binary version of the hash. You join all those small strings end to end, then hash the resulting string with md4 to get the ed2k hash.

And finally, if a file's size is exactly a multiple of 9728000 bytes, an extra hash is appended to the list, created from no input at all. (Yes, md4 has a default start value, so you still get a hash.)
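
The steps described above can be sketched in Python roughly as follows. This is only an illustration of the description in this post, not the AoM code: the names `md4`, `ed2k_hash` and `CHUNK_SIZE` are made up for the example, and MD4 is written out in pure Python (following RFC 1320) because not every Python build ships it in `hashlib`. It takes the whole file as bytes and will be slow on real files; an actual client would stream the file and use a native MD4.

```python
import struct

CHUNK_SIZE = 9728000  # bytes per ed2k chunk, as given in the post


def _rol(x, n):
    """Rotate a 32-bit value left by n bits."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


# Per round: boolean function, additive constant, message word order, shifts.
_ROUNDS = (
    (lambda b, c, d: (b & c) | (~b & d), 0x00000000,
     (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), (3, 7, 11, 19)),
    (lambda b, c, d: (b & c) | (b & d) | (c & d), 0x5A827999,
     (0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15), (3, 5, 9, 13)),
    (lambda b, c, d: b ^ c ^ d, 0x6ED9EBA1,
     (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15), (3, 9, 11, 15)),
)


def md4(data: bytes) -> bytes:
    """Return the raw 16-byte MD4 digest of data."""
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    # Pad with 0x80, zeros up to 56 mod 64, then the bit length (little-endian).
    msg = data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
    msg += struct.pack("<Q", (len(data) * 8) & 0xFFFFFFFFFFFFFFFF)
    for off in range(0, len(msg), 64):
        x = struct.unpack("<16I", msg[off:off + 64])
        a, b, c, d = state
        for f, const, order, shifts in _ROUNDS:
            for i, k in enumerate(order):
                # One step; rotating the registers gives ABCD, DABC, CDAB, BCDA.
                a, b, c, d = d, _rol(a + f(b, c, d) + x[k] + const, shifts[i % 4]), b, c
        state = [(s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d))]
    return struct.pack("<4I", *state)


def ed2k_hash(data: bytes, chunk_size: int = CHUNK_SIZE) -> str:
    """ed2k hash of data, following the rules described in the post."""
    if len(data) < chunk_size:
        # Small file: the plain md4 of the file is the ed2k hash.
        return md4(data).hex()
    # Binary (not hex) md4 of every chunk, in file order.
    digests = [md4(data[i:i + chunk_size]) for i in range(0, len(data), chunk_size)]
    if len(data) % chunk_size == 0:
        # Exact multiple of the chunk size: append the md4 of empty input.
        digests.append(md4(b""))
    # Join the binary digests and hash that string once more.
    return md4(b"".join(digests)).hex()
```

The `chunk_size` parameter exists only so the chunking logic can be tried on tiny inputs, e.g. `ed2k_hash(b"abcd", chunk_size=2)` exercises the "exact multiple" case without needing a 9728000-byte file.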
fahrenheit
AniDB Staff
Posts: 438
Joined: Thu Apr 08, 2004 1:43 am
Location: Portugal

Post by fahrenheit »

you can get a good C implementation of the ed2k hash with ed2k_hash [ http://ed2k-tools.sourceforge.net/ed2k_hash.shtml ]. if you want, you can add sha1 and md5 hashes by merging that source with the source of 2hash [ http://crossrealm.com/2hash/ ]. they use the same format, so it's easy to make a single pass and get the ed2k, sha1 and md5 hashes of a file.

(you can get the crc32 sum also, but as i couldn't find any source that uses the same format as those 3, or make something compatible with the format of those 3, i make a second pass to get the crc32 sum of a file, which is not good :P)
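
for anyone who wants to see the single-pass idea without reading C, here is a rough sketch using only Python's standard library. it is not taken from ed2k_hash or 2hash; `multi_hash` and `block_size` are names invented for the example. each block is read once and fed to md5, sha1 and crc32 in the same loop, so no second pass is needed.

```python
import hashlib
import io
import zlib


def multi_hash(stream, block_size=1 << 20):
    """Read a binary stream once and return md5, sha1 and crc32 as hex strings."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        # Feed the same block to every hash before reading the next one.
        md5.update(block)
        sha1.update(block)
        crc = zlib.crc32(block, crc)  # running CRC carried between blocks
    return {
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "crc32": format(crc & 0xFFFFFFFF, "08x"),
    }


# Usage on an in-memory stream; a real file would be opened with open(path, "rb").
result = multi_hash(io.BytesIO(b"example data"), block_size=4)
```

an ed2k hash could be added to the same loop, as long as the reads are aligned so that each 9728000-byte chunk gets its own md4.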

hope that helps.
MistaMuShu
Posts: 10
Joined: Sat May 28, 2005 2:50 am

Post by MistaMuShu »

Thanks, those are helpful. I came across the first link a while back. I'm trying to stay focused on the core ideas before I mess around with MyList add and other niceties. Once I have a solid idea of how I handle notifications and caching info to SQLite, I'll work on sending info back to AniDB.

SQLite, GUI programming (wxWidgets), multi-threaded apps, and crypto are all new to me, so this is definitely a learning project. Thanks for all the help so far :wink:

I'd like to stay away from C as much as possible, or at least I don't want to know about the details even if I do use it. I know the crypto library I'm planning on using is implemented in C, but I feel much more comfortable higher up.
fahrenheit
AniDB Staff
Posts: 438
Joined: Thu Apr 08, 2004 1:43 am
Location: Portugal

Post by fahrenheit »

i'm also experimenting with multithreaded apps and gui (GTK+) but i'm running into some problems, most of them because i'm a crappy programmer :P

good luck with your experimentations :)
PetriW
AniDB Staff
Posts: 1522
Joined: Sat May 24, 2003 2:34 pm

Post by PetriW »

Multithreading isn't all that bloody if you think a little.
Sure, it's a royal pita to convert some poorly designed single-threaded stuff to multithreading, but you can get away with very little work if you just make sure stuff doesn't use global or shared resources.

A very good idea is to only modify the gui from your main thread.
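
One common way to follow that rule, sketched here in Python with no GUI toolkit involved, is to have worker threads push messages onto a queue and let the main thread apply them. The names `hash_worker` and `drain_updates` are hypothetical, and the `apply` callback stands in for whatever actually touches the widgets.

```python
import queue
import threading


def hash_worker(updates, names):
    """Background thread: never touches the GUI, only posts messages."""
    for name in names:
        updates.put(("hashed", name))
    updates.put(("done", None))


def drain_updates(updates, apply):
    """Run on the main thread; returns True once the worker has reported done."""
    while True:
        try:
            kind, payload = updates.get_nowait()
        except queue.Empty:
            return False  # nothing more for now, check again later
        if kind == "done":
            return True
        apply(payload)  # the only place where GUI state would be modified


updates = queue.Queue()
shown = []  # stand-in for a list widget
worker = threading.Thread(target=hash_worker, args=(updates, ["a.mkv", "b.mkv"]))
worker.start()
worker.join()  # a real GUI would not block here
finished = drain_updates(updates, shown.append)
```

In a real application `drain_updates` would be called from a timer or idle handler of the toolkit's event loop instead of after `join()`.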


Finally, a warning about SQLite: the multithreading support is buggy. Don't assume it can handle multiple queries running at the same time (e.g. two sqlite_step calls running at the same time might both return SQLITE_BUSY).
I solved that with a critical section, but it can be a pain if you don't have a wrapper around sqlite.
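
As a rough illustration of the "wrapper plus critical section" idea, here is a sketch using Python's standard `sqlite3` and `threading` modules rather than the C API mentioned above. `LockedDB` is a hypothetical name, and sharing one connection guarded by one lock is an assumption made for the example, not a description of the original client code.

```python
import sqlite3
import threading


class LockedDB:
    """Hypothetical wrapper: one shared connection, one query at a time."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False lets several threads use this connection;
        # the lock below is what actually keeps the queries from overlapping.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    def execute(self, sql, params=()):
        with self._lock:  # the critical section
            cur = self._conn.execute(sql, params)
            rows = cur.fetchall()
            self._conn.commit()
            return rows


db = LockedDB()
db.execute("CREATE TABLE files (name TEXT)")


def worker(n):
    for i in range(50):
        db.execute("INSERT INTO files VALUES (?)", (f"file-{n}-{i}",))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
count = db.execute("SELECT COUNT(*) FROM files")[0][0]
```

Because every query goes through `execute`, the rest of the program never has to remember to take the lock itself, which is the point of having the wrapper.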