Future of Anime Hint - Anime Referral

egg · Post by **egg** » Sun Jan 29, 2006 6:37 pm

visnu wrote:
Egg wrote:I had one user who PM’ed me and presented an algorithm for me to review (a very thorough 14-page PDF document). It has been over a decade since I have done math at this level (differential calculus), and I never dug into it deeply enough to figure out whether it would work or not.
Why not instead suggest that the user make a (pseudo) SQL implementation of the algorithm? Then you could try it out on your sample database.

The existing things are a mix of SQL and Perl. This person knows the mathematics, but may not be a programmer. Also, to make the work they would give me meaningful, I would have to reveal more of the database and code structure than I would be willing to. Also, I would like to understand what I implement. Also, at the time I was in the middle of implementing something else. I have not heard from this user in quite a while. Anyway, it's a good suggestion, but it just would not be practical. They did model it in MATLAB, but that is a far cry from what I would need.

egg · Post by **egg** » Sun Jan 29, 2006 6:50 pm

Andemon wrote:aid 1544: Elfen Lied

1 778 Mobile Police Patlabor (1989) (Kidou Keisatsu Patlabor (1989)) 919 10
...

Quite... I have to agree that it doesn't seem to work too well at the moment.

OK, look at my response to DonGato for an explanation of what it is measuring. Now, let's look at the votes:

Code:

 vote1 | vote2
-------+-------
   600 |   600
   700 |   800
  1000 |  1000
  1000 |   900
   900 |   900
   800 |  1000
   900 |  1000
  1000 |  1000
   900 |   900
   900 |   900

Those do appear to have similar votes. Is this a good enough reason to recommend one of the animes if you had voted highly for the other? If not, why not, and how should it be improved?

These should be thought of as "People who liked x also liked:"

BTW, I will probably do something similar with the Categories, so hopefully that will be a better measure of similarity.
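To put a number on "similar votes", here is a minimal sketch that scores the ten vote pairs from the table above. The two measures (mean absolute difference and Pearson correlation) are assumptions picked for illustration; the thread does not say which formula the referral system actually uses, and the function names are made up.

```python
from statistics import mean, pstdev

# (vote1, vote2) pairs copied from the table above.
pairs = [(600, 600), (700, 800), (1000, 1000), (1000, 900), (900, 900),
         (800, 1000), (900, 1000), (1000, 1000), (900, 900), (900, 900)]

def mean_abs_diff(pairs):
    """Average gap between the two votes, ignoring direction."""
    return mean([abs(a - b) for a, b in pairs])

def pearson(pairs):
    """Pearson correlation: do high and low votes fall in the same places?"""
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))

print(mean_abs_diff(pairs))        # 50
print(round(pearson(pairs), 2))
```

A small gap and a high correlation both say "these voters treated the two animes alike", which is exactly the "also liked" reading and says nothing about the animes' content.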

DonGato · Post by **DonGato** » Sun Jan 29, 2006 7:29 pm

I would recommend looking for a name that matches that then, like 'anime hint'. You're not looking for similar animes but for similarly voted animes.

egg · Post by **egg** » Sun Jan 29, 2006 8:02 pm

DonGato wrote:I would recommend looking for a name that matches that then, like 'anime hint'. You're not looking for similar animes but for similarly voted animes.

Yeah, it was not a good name, and I did not explain it clearly. I had talked about what it was previously in the thread, but that was a while ago. Anyway, I came up with the Anime Referral. I didn't want to use Hint because there is a lot more involved in a hint, comparing results from many of your own votes for instance. Also, I did not want to call it Recommendations because I don't want it to be confused with Anime-Planet's work. Anyway, hopefully that clarifies things a little. Let me know what you think and/or if you have another suggestion for a name.

DonGato · Post by **DonGato** » Sun Jan 29, 2006 8:46 pm

Anime Referral sounds ok but might need a little explanation.

nstgc · Post by **nstgc** » Wed Feb 01, 2006 2:52 pm

Why do they have to be similar? Why not dissimilar? If John says "black" when Jane says "white," then to find out what John says all you need to do is ask Jane and make that negative. If you are constantly disagreeing with someone then you want to stay away from what they say. That is one of the high points of my suggestion, the "new algorithm" suggestion.

I haven't really thought about this piece of work of mine for some time now; however, let me outline the process that it uses.

First you take all users and subtract your vote from their vote. You don't have to use everyone, but doing so should increase accuracy, regardless of their status in reference to you or how many animes they've seen. This may be hard on the server, so you may want to compromise accuracy for speed. I refer to each of these differences as "offsets". A negative offset shows that you think higher of things than that other user does.
You average these offsets for that user. This is a user's average offset. If they think something is "5" and they have an offset of "2" then you would probably expect yourself to give it a "3". This is the expected vote from a given user.
You then average these expected votes to get a vote that should have fairly high accuracy given a large enough population.
This is very simple and can be expressed on a single line. While accurate already, you can improve the accuracy by making an error estimate. This is done using the standard deviation of the offsets and the normal distribution function. I also think that some skew correction would be nice, but everyone seems to have about the same skew. As long as you are dealing with differences of votes between users, the user's skew only matters if they differ significantly. I would assume that "significantly" would be moving the by one. I don't think changing the kurtosis would have too great of an effect, and nor would the standard deviation.
I also made something so that users with more votes are treated differently, but I think it's unnecessary. All users are useful, but I can see how you may want to temper the lesser users. I logically came up with a weight that becomes linear, but I think that higher-vote (>50) users should all be treated the same. This is one of those cases where the scientist doesn't like his results.
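The unweighted steps above can be sketched as follows. This is only an illustration, not the code from the PDF: it takes an offset to be the other user's vote minus yours, which is the convention the worked example ("5" with an offset of "2" gives "3") implies, and the function names and sample votes are made up.

```python
from statistics import mean

def average_offset(mine, theirs):
    """Mean of (their vote - my vote) over the anime both users voted on."""
    shared = mine.keys() & theirs.keys()
    if not shared:
        return None  # no overlap, so this user tells us nothing
    return mean([theirs[a] - mine[a] for a in shared])

def predict_vote(mine, others, anime):
    """Average of each user's vote on `anime` minus that user's average offset."""
    expected = []
    for theirs in others:
        if anime not in theirs:
            continue
        offset = average_offset(mine, theirs)
        if offset is not None:
            expected.append(theirs[anime] - offset)
    return mean(expected) if expected else None

# Hypothetical votes on a 1-10 scale, keyed by anime id.
me = {"a": 8, "b": 6}
others = [
    {"a": 10, "b": 8, "c": 5},  # votes 2 higher than me, so expects 3 from me
    {"a": 7, "b": 5, "c": 9},   # votes 1 lower than me, so expects 10 from me
]
print(predict_vote(me, others, "c"))  # 6.5
```

The error estimate and the vote-count weighting described in the post are left out here because the thread does not give their formulas.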

If anyone is interested in this as a possible replacement please say so. While I will work on this, making legible documents is very difficult. Also I will not say where the PDF file is. If egg wants to say, then that's fine, but it is very hard to read. Even for me, after over a year it has become somewhat difficult to follow. Even for those that understand the math, it is hard to read. I understand it because I wrote it, but I certainly should rewrite it. The current one does have nice graphs and at least one example.

One nice thing about my method is that, unlike one that relies on weights, it can estimate how much you will like something. It's nice knowing that "a>b", but that leaves us with "how much more will I enjoy 'a' over 'b'?"

If I make any changes, it will most likely be to the error approximation and "usefulness" approximation. Also, I mentioned this after one of the examples, but I could also make an error approximation. It wouldn't be hard, but finding the percent error wouldn't do much good since I already take accuracy into account when applying the weights.
Another thing I have considered is figuring out if it would be useful to include animes that two users haven't seen. I don't know if this would improve accuracy or not, but I bet I could make it improve it a bit. The problem is that the difference it would make would be minimal, and the effort on my part as well as the added stress on the server may make it not worth the trouble. You would end up evaluating every single user and every single anime for every single user even if they have never heard of the anime (in the extreme case).

I've been mentioning weights (2 times) and I want to make it clear that this method does not use weights to find the answer, but they refine the answer.

I do think that the core method is very simple and easy to understand, but I understand that the rest can be difficult even for my fellow math majors. I just happen to be one of those people that don't follow the book and work well ahead of the class. I have thought of ways to simplify the weights, but that would decrease accuracy.

[edit] I forgot to mention that as the number of users approaches infinity, the error becomes zero. You can know for certain how much you will like something before seeing it.
I may not know any programming outside of Mathematica, but I do have a friend that may be able to help me. Still, I would prefer to stay away from any coding.

egg · Post by **egg** » Wed Feb 01, 2006 6:40 pm

nstgc wrote:

First you take all users and subtract your vote from their vote. You don't have to use everyone, but doing so should increase accuracy, regardless of their status in reference to you or how many animes they've seen. This may be hard on the server, so you may want to compromise accuracy for speed. I refer to each of these differences as "offsets". A negative offset shows that you think higher of things than that other user does.

You average these offsets for that user. This is a user's average offset. If they think something is "5" and they have an offset of "2" then you would probably expect yourself to give it a "3". This is the expected vote from a given user.

Let me take the following scenario:

Code:

User  v1  v2  v3
----- --- --- ---
Me    10  5.5 1
U1    10  5.5 1
U2    1   5.5 10

So if I compute the differences between me and U1 I get 0, 0, and 0 which is an average of 0.
Now, let me do the same for U2, I get -9, 0, and 9 which is an average of 0.

So this system would think that I vote the same as both U1 and U2... If you average the scores together, you lose valuable information; you need to work with the list of values. One way to deal with this is, rather than looking at the individual scores, to look at them as a vector. Cosine or Pearson's can measure the similarity between the vote vectors, the length of the vector is an indication of the scale, ...
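The scenario can be checked numerically. The sketch below uses the three vote rows from the table above and compares the averaged offset with cosine similarity and Pearson correlation; the helper names are made up, and Pearson is computed as the cosine of the mean-centered vectors.

```python
from math import sqrt
from statistics import mean

me = [10, 5.5, 1]
u1 = [10, 5.5, 1]
u2 = [1, 5.5, 10]

def mean_offset(mine, theirs):
    """Average signed difference (their vote - my vote)."""
    return mean([t - m for m, t in zip(mine, theirs)])

def cosine(x, y):
    """Cosine of the angle between two vote vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))

def pearson(x, y):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mx, my = mean(x), mean(y)
    return cosine([a - mx for a in x], [b - my for b in y])

for name, other in (("U1", u1), ("U2", u2)):
    print(name, mean_offset(me, other),
          round(cosine(me, other), 3), round(pearson(me, other), 3))
```

The mean offset is 0 for both users, while Pearson comes out at +1 for U1 and -1 for U2, so the vector measures do separate the two cases that the average hides.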

BTW, I updated the Referral System. I made some adjustments to limit the effect of bad scores playing too much of a role in the referrals (so if ten people all voted 1 on a set of animes, although they voted similarly, that does not necessarily mean that someone who actually likes one anime will like the other). Also, I included temporary votes.

Updated AniDB Anime Referral Sample

suppy · Post by **suppy** » Wed Feb 01, 2006 7:12 pm

egg wrote:
nstgc wrote:

First you take all users and subtract your vote from their vote. You don't have to use everyone, but doing so should increase accuracy, regardless of their status in reference to you or how many animes they've seen. This may be hard on the server, so you may want to compromise accuracy for speed. I refer to each of these differences as "offsets". A negative offset shows that you think higher of things than that other user does.

You average these offsets for that user. This is a user's average offset. If they think something is "5" and they have an offset of "2" then you would probably expect yourself to give it a "3". This is the expected vote from a given user.
Let me take the following scenario:
Code:
User  v1  v2  v3
----- --- --- ---
Me    10  5.5 1
U1    10  5.5 1
U2    1   5.5 10
So if I compute the differences between me and U1 I get 0, 0, and 0 which is an average of 0.
Now, let me do the same for U2, I get -9, 0, and 9 which is an average of 0.

So this system would think that I vote the same as both U1 and U2...

Whether you vote higher or lower than another user is imo irrelevant; whether you vote differently is what matters. Instead of subtracting their vote from your vote, you simply take a look at the absolute difference, abs(U1v1-U2v1). Then after that you follow nstgc's algorithm.
So for egg's example it would be 0 for U1, but 6 for U2 (if the mathematics in my head haven't failed me).

Edit: Afterthoughts: this would mean that nstgc's algorithm would require a bit more change, since if user U2 voted high for an anime, you would probably vote low for it (instead of voting >10) ... but that should be trivial to fix I believe.
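The head arithmetic holds up. A minimal check of the absolute-difference variant on egg's three votes (the function name is made up):

```python
from statistics import mean

me = [10, 5.5, 1]
u1 = [10, 5.5, 1]
u2 = [1, 5.5, 10]

def mean_abs_offset(mine, theirs):
    """Average unsigned gap between two users' votes."""
    return mean([abs(t - m) for m, t in zip(mine, theirs)])

print(mean_abs_offset(me, u1))  # 0.0
print(mean_abs_offset(me, u2))  # 6.0
```

The result is a distance, so it can rank users by how closely they agree with you, but it carries no direction and cannot simply be subtracted from someone's vote to predict yours.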

nstgc · Post by **nstgc** » Wed Feb 01, 2006 7:56 pm

<this is a reply to egg, not the guy, whose name I forget, who posted while I typed. In reference to him: taking the absolute value is a bad idea. Plugging it back into my formula would result in error out the ass.>

That's not quite right. What it means is that given a v4, you would vote [(v4_u1-O_u1)+(v4_u2-O_u2)]/2. This comes out to v4_u1 or v4_u2. So it is assumed that, given these three animes and two users, you will always vote identically to them. This is a problem, but that is why you need a large population. Your correlation coefficients work OK with small numbers, whereas what I have doesn't work well unless you have a much larger group. Animes per user becomes irrelevant, but number of users is still a big deal.
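Written out for egg's three-anime, two-user example, with O_u standing for user u's average offset (their vote minus yours) and \hat{v}_4 a label introduced here for the predicted vote, the formula reads:

```latex
\hat{v}_{4} = \frac{(v_{4,u_1} - O_{u_1}) + (v_{4,u_2} - O_{u_2})}{2},
\qquad
O_{u_1} = \frac{0 + 0 + 0}{3} = 0,
\qquad
O_{u_2} = \frac{-9 + 0 + 9}{3} = 0
\quad\Longrightarrow\quad
\hat{v}_{4} = \frac{v_{4,u_1} + v_{4,u_2}}{2}
```

Read this way, both offsets vanish and the prediction is the plain average of what U1 and U2 gave the fourth anime, which equals either user's vote only when the two happen to agree.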

If someone lies half the time and tells the truth half the time, do you assume they're lying when they say something, or that they are telling the truth? Neither, you ignore them. Remember, it's the offset that is flawed in this case and not the user's vote itself. The weight system dampens this type of user so that they don't contribute. I could tell you exactly how much it would dampen, but I have a lab report to write for my physics lab in an hour and a half.

Information is not lost. Sometimes U2 votes high and sometimes U2 votes low, but in the end he is overall neutral. In the previous system you look at instances of agreement; to me, information was lost there. You lose the sign. I thought about using a cubic function instead of a quadratic, which would emphasize such problems, but I decided against it. I can't remember why, but I had a good reason I'm sure.

If you remember, I also included a method of solving that problem. Assuming you still have the file, I spent pages two, three and four talking about the weighting function. Half a page was for the number-of-votes issue, since it was a simple formula, and the rest was on showing the percent accuracy. I not only made a weighting system, I made it so that it would mean something. It can tell you the error that can be expected from a given user. It does use a quadratic system. Actually, it uses a system similar to the correlation coefficient. I take the standard deviation of the offset, whereas you are taking the standard deviation of the users' votes and the covariance of the two together. This is inefficient and, as you could say, "only half the story".
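To see how the spread of the offsets separates egg's two users even though their average offsets match, here is a small sketch. The actual weighting function lives in the PDF and is not reproduced in this thread, so the `1 / (1 + sd)` weight below is purely a hypothetical stand-in.

```python
from statistics import mean, pstdev

me = [10, 5.5, 1]
users = {"U1": [10, 5.5, 1], "U2": [1, 5.5, 10]}

def offsets(mine, theirs):
    """Signed differences (their vote - my vote), one per shared anime."""
    return [t - m for m, t in zip(mine, theirs)]

def weight(mine, theirs):
    # Hypothetical weight: 1 for perfectly consistent offsets,
    # shrinking toward 0 as the offsets scatter more widely.
    return 1 / (1 + pstdev(offsets(mine, theirs)))

for name, votes in users.items():
    offs = offsets(me, votes)
    print(name, mean(offs), round(pstdev(offs), 2), round(weight(me, votes), 2))
```

U1's offsets have zero spread and keep full weight, while U2's offsets of -9, 0 and 9 have a standard deviation of about 7.35 and are damped heavily, which is the "ignore the user who is right half the time" behaviour described above.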

There is a simplistic version, which lacks weights, and one that includes them and is thus complex. The simplistic one does not work with small numbers, whereas the weights reduce accuracy in large groups. I'm not entirely sure if the weight function causes problems in large groups or not, but I think it does. This doesn't matter since when I say large, I mean absolutely huge. There aren't enough users to cause problems. Unless you really dig into the user database, there aren't enough. You need to look at at least 100 cases before the simplistic version can produce anything remotely similar to being accurate. I would discourage its use in groups under 300. That three hundred was, as I put it, "pulled out of my ass", meaning it sounds right, but the one hundred I'm confident is a good number. The weights make it useful at smaller levels.

I will admit I have not considered size very much when making this, but it is more logical than the correlation coefficients. It also produces real numbers that can be understood instead of a score.

I will look at this later today when I get home. If anyone is truly interested then please put your concerns together.

example: "a user can have completely different results yet still have 'a perfect score'"

I know it sounds stupid, but I am as good at understanding others as I am at conveying my thoughts to them and that helps.

I think once I clear up some misunderstandings like "the lost data," you'll see that the weighting system I'm using is like a more complex version of what you're using now, and that the core part, the double summation part, is by itself useful, but given the circumstances needs a little help. My weighting system as well as the core rating system both produce real numbers, as either a percentage chance for error of a given amount or an estimated vote.

The system I use to produce a weight is just as accurate as the correlation co__, if not a little more due to the different handling of data, but piggybacked with the core rating system (I think I'll be referring to it as that since I've never seen it elsewhere) the two together should work much better.

Actually, as much as I dislike the idea, you could take your weighting system and use it with my core rating system, and that should improve the accuracy of both as well. I haven't really looked into it, but as I said before, at its core, my weighting system uses the standard deviation as well. Mine gives a very non-linear result instead, but the error propagation should be the same. Again, it's the way data is handled. You also can't get any error estimates with your system alone. Still, one is in place and the other would be a pain in the ass to implement alongside my core rating system.

I can also make another formula that uses the animes themselves as starting points instead of the users, if that interests anyone.

fahrenheit · Post by **fahrenheit** » Wed Feb 01, 2006 8:08 pm

i've been thinking about this,

i would like to propose some things:

1) type search, if given aid is an ova, list ovas and series, (and vice-versa), but if aid is a movie only list movies (but also offer the possibility to compare everything).
2) give less results, 10~25 or so with the option of displaying more, remove animes that user has in list and marked as seen as per default (also with the option).
3) give higher rating to animes that have the same genre as the given anime (i think we now have the genres well weeded out for that to work okay).

but i don't know, the type of results given are related to what you are expecting to get, example, if you put "other people that liked X also liked Y" you can output almost any anime that it would fit what people are expecting to get.

Btw, good work on the system, you now get 18 results for Blame!, and although i don't see any relation between Blame! and DBZ, it fits the title; this was the main reason for the 3rd suggestion above.
As for TEXHNOLYZE, i also did see many of the animes listed, but again the high relevance of Girls Bravo - Season 2 eludes me.

Either way, egg, i really admire you; handling the hint system and this referral must be giving you some trouble, so even if i didn't say anything good, keep at it

Edit: forgot to mention this, what about a hybrid system, user recommendations and sys recs for a given anime, with user recs outranking sys; anime planet recommendations are pretty much like what i would suggest.

nstgc · Post by **nstgc** » Wed Feb 01, 2006 8:30 pm

I just thought of this, but I don't know how many people this system has to deal with. If it's too few then that would affect the accuracy of anything. From what I read in the wiki it does sound like you need to include more users.

Also, I guess I should put this up: the URL for the PDF file that I've been mentioning is http://nst_gc.tripod.com/Ani-Hints_Mk_III.pdf. The first four pages explain the system and the next ten are supporting pages. They contain examples and the scripts I used to make graphs and get numbers. I also have an Excel file that shows a system of 25 users and 10 animes. I don't know how to get that to people.

If you look at the first page it should be clear why my core rating system requires a sign. Also, it is very important that you not ignore "high" and "low". It's like if you're shooting at something. A whole bunch of "off by __ inches" does little good, but if you say "___ inches to the left" you can get somewhere.

Also, I'd like to say that me naming the central part of my system "core rating system" sounds really arrogant, but I can't think of any other name. It's the core of the system, and it is a rating system instead of a weighting system.

egg · Post by **egg** » Thu Feb 02, 2006 12:09 am

nstgc,

It is true that I picked an example that did not take into consideration the rest of the work, which may have filtered out that user. But given the information, you can expect that this user has the opposite tastes that you do, and even though the preferences are averaged, unless this user is filtered out, they would cancel out the other users' recommendations. I can think of other simple examples where I can present issues that I see with the beginning portion of your idea, but I will give you some time to think about it a little more.

The number of matching users is an issue. Although currently to use the hint you need to have voted for at least 30 anime, and it only compares you to people who have voted for at least 30 anime, the number of intersections is relatively small. So there may only be a few users that have voted on the same anime you did, AND then once you apply the rules some of those also get filtered out because they have differing voting habits. I think that is the main problem in a user vs. user recommendation system, frequently the same set of core users will be considered “similar” to a number of users and the hint results are frequently the same.
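The intersection problem can be illustrated with a small sketch. The 30-vote threshold is the one quoted above; everything else (the function name, the `min_shared` knob, and the sample data) is made up for illustration and is not how the hint is actually implemented.

```python
MIN_VOTES = 30  # threshold quoted in the post

def comparable_users(me, others, min_shared=1):
    """Return {user: shared anime ids} for users who pass the vote-count rule.

    `me` maps anime id -> vote; `others` maps user name -> such a dict.
    `min_shared` is a hypothetical knob, not something the post specifies.
    """
    if len(me) < MIN_VOTES:
        return {}
    result = {}
    for name, votes in others.items():
        if len(votes) < MIN_VOTES:
            continue  # too few votes to be compared at all
        shared = me.keys() & votes.keys()
        if len(shared) >= min_shared:
            result[name] = shared
    return result

# Made-up data: users with 30+ votes can still overlap very little.
me = {aid: 7 for aid in range(1, 41)}           # anime 1..40
others = {
    "a": {aid: 8 for aid in range(36, 76)},     # 40 votes, shares 36..40
    "b": {aid: 6 for aid in range(100, 140)},   # 40 votes, shares nothing
    "c": {aid: 9 for aid in range(1, 11)},      # only 10 votes, filtered out
}
print({u: len(s) for u, s in comparable_users(me, others).items()})  # {'a': 5}
```

Even before any similarity rule is applied, only one of the three users is usable here, and on just five shared animes.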

Variables to consider when determining how similar two users (or anime) are based on a series of votes:
1) Pattern, do they have high votes and low votes in the similar places?
2) Magnitude, do they vote about the same values?
3) Difference, does the first one vote lower or higher than the second?

The original method went basically by #2. The Pearson’s is #1, but I added a dummy value to have it follow #2 somewhat as well. Cosine is sort of a combination of #1 and #2. Your method appears to be #3.

Here are some more resources:
http://www.cs.umn.edu/research/GroupLen ... tions.html
I have only read a few of these and many do not apply directly, but it is interesting.

Guest · Post by **Guest** » Thu Feb 02, 2006 6:15 am

Before I reply, I would like to apologize for not having read the other posts. I mostly gave my sales pitch and replied to replies. As mentioned I had class.

I have the intention of viewing the references mentioned, but not at this moment. Either tomorrow or Sunday. Which ones are you drawing from?

If you combine the weighting function of mine with the rating system, then it's a 2+3. Also it shouldn't exhibit any of the problems mentioned in your first post. I explained this once before in reference to "can something made of imperfect parts ever be perfect." My answer is yes, if the error cancels out. When averaging, random error is always worked out given enough data. Systemic problems do remain. I feel that the weighting systems can significantly dampen anything that's left over. What's next is the number-of-users issue. I will think about it later. To date I've merely acknowledged it as a problem; I've never quantified it. From what you're saying, I'll either have to figure out how to get more out of the data or counteract the uncertainty.

Please remember that even the most wacked-out set of numbers, if they are real, is not detrimental. In this particular case, where data is scarce, you do emphasize those users that are the most helpful, but that doesn't make for good users and bad users.

Unless I see something really pretty in those publications, which is very possible, and/or can't figure out how to get around a user count problem (assuming such a problem is found to exist), I will stand fast with my year-old formula. Otherwise I will have to opt for the anime-to-anime approach.

I do have to admit that the thought of a map of the anime database sounds pretty cool. How is it currently being done? I know you don't give out server information (not that I could read it if you did), but what about just the math? That's what I want anyway. I really don't want to put out effort just to find that none of my suggestions are taken.

I glanced at one of the publications and it seemed more like marketing and how this type of thing can help companies. This doesn't help at all. Which ones did you find useful?

I will say one last thing before going to sleep at the keyboard. No matter what type of system you choose, the best will involve multiple approaches bundled together. That is why, in addition to the core rating system, I had a two-part weighting function. I think the combination of difference and standard deviation is very nice. One is direction and the other magnitude, you could say. In general, a difference can give you a "where" while a variance can give you a "how close". I do not believe patterns in the data are useful at all. The number of votes a user has is more useful than what I think you mean by pattern. Also I use a double summation across users and animes. Both are treated equally under the core rating system. I emphasize a particular order and method, but the formula itself takes no preference. I'm pretty sure I can, in less than ten minutes, rewrite the CRS so that it looks like it's an anime-to-anime system but actually does the same thing. The weighting system is a different story; that will take at least an hour.
Currently, I assume, you have an A-to-A system that uses similarity alone (as opposed to that and dissimilarity). While I think that an A-to-A system is better than a U-to-U system, I think that a mix would be better. While mine is a mix, it relies very heavily on U-to-U information. I think it would be better if it was a little more A-to-A than it currently is.

Regardless, a map of the DB sounds really cool, and if I can't get you guys to use my CRS then I would like to add a U-to-U element into the A-to-A system under development, along with a difference segment.

nstgc · Post by **nstgc** » Thu Feb 02, 2006 6:25 am

That was me by the way. Also, if I say something tomorrow that contradicts anything in the above, go with what I say later, for I am very tired. But that's just a disclaimer; don't think that I don't have 95% confidence in that message.

egg · Post by **egg** » Thu Feb 02, 2006 7:32 am

fahrenheit wrote:1) type search, if given aid is an ova, list ovas and series, (and vice-versa), but if aid is a movie only list movies (but also offer the possibility to compare everything).
2) give less results, 10~25 or so with the option of displaying more, remove animes that user has in list and marked as seen as per default (also with the option).
3) give higher rating to animes that have the same genre as the given anime (i think we now have the genres well weeded out for that to work okay).

Let me start off with a disclaimer. The results so far are a proposal for calculating weights to use for the anime hint; this does not list similar anime atm. Once I get a handle on this, there will be a separate list that will be based on things like the category, type and so forth. The number to show and whether there will be a filter will probably be up to exp (assuming he agrees to use it, I haven't specifically asked).

As far as the hint is concerned, it is trying to recommend anime that you will like, so if it knows an anime that you like and that people who like that also like another anime, should it be recommended? Given that, is it good enough to know that a number of people voted highly for both TEXHNOLYZE and Girls Bravo (as an example) to make a recommendation? On the one hand, some people will probably not be interested at all given the genre, but others may be genuinely interested. Adding the genres will put in preconceived notions and artificially limit the results; also, the user will have the ability to apply various filters (like genre) manually.

Another thing that will happen is that if they really do not have a correlation, then a few more people will vote and things will level out a bit more. This is one thing that always happens with new anime: they get higher than normal votes because the people who watch it first are the people who think they will like it. Generally they vote highly and the anime has a high rating. As more and more people watch it, more balanced voting occurs and generally the score drops.

So the next question is, when I do the similarity listings based on categories and type, should it also take votes into consideration? For instance, if there is a Futuristic Action Samurai Vampire OVA set at a School and a Futuristic Action Samurai Vampire OVA set at a Hospital, those are probably similar. But let's say that everyone that voted on the School one voted 10 and those same people voted 1 on the one set at the Hospital. Should they still be shown as similar, or should they get filtered out?

fahrenheit wrote:but i don't know, the type of results given are related to what you are expecting to get, example, if you put "other people that liked X also liked Y" you can output almost any anime that it would fit what people are expecting to get.

Hopefully once I get the algorithms fixed it will list the few animes that are meaningful and not just some random results.

fahrenheit wrote:Btw, good work on the system, you now get 18 results for Blame!, and although i don't see any relation between Blame! and DBZ, it fits the title; this was the main reason for the 3rd suggestion above.
As for TEXHNOLYZE, i also did see many of the animes listed, but again the high relevance of Girls Bravo - Season 2 eludes me.

Well, there is a problem with the algorithm atm: it is artificially inflating animes with a few high permanent votes and a lot of temporary votes. This is the case with Girls Bravo and some other things I checked. I need to go back and review the logic and then I will rerun things; I will probably have something new in a couple of days. When I do, Girls Bravo will certainly have a lower score and may even drop off the list.

fahrenheit wrote:Edit: forgot to mention this, what about a hybrid system, user recommendations and sys recs for a given anime, with user recs outranking sys; anime planet recommendations are pretty much like what i would suggest.

I have no intention of adding user recommendations to this. This is meant to be an automated system that hopefully will find similarities that people have missed. Anime-Planet already has a system set up for user recommendations, this is not meant to compete with that. Hopefully this will generate an objective viewpoint based on the mined data that will complement their system.