Well, the actual structure is somewhat irrelevant; I generally have to manipulate things into special tables anyway to facilitate the comparisons/computations. Basically any of the data in the system can be used for comparison.
nstgc wrote:I do have to admit that the thought of a map of the anime database sounds pretty cool. How is it currently being done? I know you don't give out server information (not that I could read it if you did), but what about just the math? That's what I want anyway. I really don't want to put out effort just to find that none of my suggestions are taken.
I glanced at one of the publications and it seemed more like marketing and how this type of thing can help companies. This doesn't help at all. Which ones did you find useful?
For the A-to-A system I am working on, the first step is to create a table that lists all of the anime with similar votes and the types of votes. Each record has aid1, aid2, vote1, vote2, type1, and type2. There are 18,000,000 records (based on votes from August 2005) and it takes 30 minutes to run on my machine. Then it goes through this list one by one, adjusts the votes with weights (permanent votes have a higher weight than temporaries), and calculates sums on things like vote1^2, vote2^2, and vote1*vote2. When it gets to the end of a set of anime, it calculates the score (cos of the angle between the vectors / ratio of the vector lengths). This portion takes about 4.5 hours to run on my machine.
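To make the math concrete, here is a rough Python sketch of that second pass. It is not the actual code. The weight values, the "perm"/"temp" labels, the function name pair_scores, and the choice to take the length ratio as longer/shorter (so it is always >= 1) are all assumptions made for illustration; the only things taken from the description above are the record fields, the running sums, and the cos / length-ratio score.

```python
from itertools import groupby
from math import sqrt

# Hypothetical weights: the only stated fact is that permanent votes
# weigh more than temporary ones. The actual values are assumptions.
TYPE_WEIGHT = {"perm": 1.0, "temp": 0.5}


def pair_scores(records):
    """Yield (aid1, aid2, score) for each anime pair.

    `records` is an iterable of (aid1, aid2, vote1, vote2, type1, type2)
    tuples, already sorted by (aid1, aid2) so each pair is contiguous.
    """
    for (aid1, aid2), group in groupby(records, key=lambda r: (r[0], r[1])):
        s11 = s22 = s12 = 0.0  # running sums of v1^2, v2^2, v1*v2
        for _, _, vote1, vote2, type1, type2 in group:
            v1 = vote1 * TYPE_WEIGHT[type1]  # weight-adjusted votes
            v2 = vote2 * TYPE_WEIGHT[type2]
            s11 += v1 * v1
            s22 += v2 * v2
            s12 += v1 * v2
        if s11 == 0.0 or s22 == 0.0:
            continue  # no usable votes for this pair
        len1, len2 = sqrt(s11), sqrt(s22)
        cosine = s12 / (len1 * len2)
        # Assumption: ratio is longer / shorter, so it only ever lowers the score.
        ratio = max(len1, len2) / min(len1, len2)
        yield aid1, aid2, cosine / ratio


sample = [
    (1, 2, 8, 8, "perm", "perm"),
    (1, 2, 6, 6, "perm", "perm"),
    (1, 3, 10, 5, "perm", "perm"),
    (1, 3, 8, 4, "perm", "perm"),
]
for aid1, aid2, score in pair_scores(sample):
    print(aid1, aid2, round(score, 3))
```

Because only three running sums are kept per pair, the pass never needs more than one record in memory at a time, which matters with 18,000,000 rows.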
For the U-to-U system I do similar things with creating temporary tables to assemble the data (for complex operations this is actually faster in many cases), but the amount of data to go through is significantly less.
As for the publications, the earlier ones (in general) talk more about the recommendation system. Like I said, I haven't gone through all of them either, and many of them don't really have much to do with what we are talking about. The Amazon article was more useful (the link is in the first post). I found this set of articles because I was looking up a reference from the Amazon article, and then I saw the whole list of articles...
As for implementing your idea, I cannot say one way or another at this point. In order to do it, I would have to be convinced that the outcome is worth the effort. The U-to-U systems that I have worked on appear to have hit a limitation and something different is needed. You have a suggestion of something that is different, but so far I am not sure if it will lead to good results or not, so I am hesitant to put much work into it. Also, I started on this A-to-A initiative, so assuming exp is willing to do something like that, I will probably try this first (I know that is what I told you last time too, sorry). I think your idea is worth discussion, but atm I don't have a plan to implement it. On the other hand, I have not rejected it either. I don't want to discourage you, but I don't want you to have false expectations either.