[Anime Hint] - Feature Requests

egg · Post by **egg** » Fri Aug 20, 2004 5:55 pm

exp has graciously allowed me to work on the anime hint. You can see how the hint currently works here: How does Anime Hint work?.

These are the ways the hint can be changed.
1) The filters. These do not change how the animes are found; they just restrict the list by excluding X or only showing Y. Filters that are under consideration are:
* Blacklist [DONE]
* Hentai and/or Genre Filter [DONE]
* Min Overall Vote
* Min Review
* Min # Overall Votes
* Min # Reviews
* Min # Votes by Like Users
* Complete in mylist, Incomplete in mylist, Not in mylist, Any [DONE]
* Watched, Partially watched, Unwatched, Any [DONE]
* My Permanent Votes, My Temporary Votes, No Votes, Any [DONE]
* Release Year?
* Related Animes?? (Prequel/Sequel/...)
* Type
* Number of Episodes [Suggested 09/22/04]

2) The User Logic. This is the logic that calculates which users have similar tastes... Considerations here are:
* Different Handling of My Temporary votes
* Adding support of other users' temporary votes
* Normalize votes before calculating matches [Suggested 08/23/04]
* Use Covariance to calculate weight [DENIED, After more testing it did not match that well]
* Use Pearsons ... to calculate weight [Suggested 09/07/04, ONTODO]
* Match users by genre instead of animes [Suggested 01/11/05]

3) The Anime Logic. This is the logic that calculates which animes you should like based on the users selected in the User Logic.
* Min Vote, consider other users votes >= x (currently 8 ) [DONE]
* Logic Changes (rec vote and avg. adj.) [DONE]
* # Votes by Like Users
* Overall Vote
* Overall Review
* Overall # Votes
* Overall # Reviews
* Temporary Votes
* Related Anime?
* Genres??
* Release Year???
* Type
* Normalize Votes before calculations??? [Suggested 08/23/04]
* Standardize Score (Make top score 1000) [DONE]
* Number of Episodes (Min and/or Max)??? [Suggested 01/09/05]

4) Other Settings
* Save Settings [in a cookie?] [Suggested 9/28]
* Clean up form, how do I keep functionality but make the interface cleaner? [DONE]
* Follow language preferences for mylist [Suggested 10/14]
* If another language is shown, also show 'Official' Title, i.e. Crest of the Stars (Seikai no Monshou)

If you have any questions/comment/requests please state them here, and we can discuss them. When you make a request please specify if it is for a Filter, User Logic or Anime Logic.

Note some of these changes are going to take a while, or may never happen... I will do what I can, but no promises.

Andemon · Post by **Andemon** » Mon Aug 23, 2004 5:00 pm

User Logic:

It's rather difficult to comment on the user logic when one doesn't know exactly how it works, but I've had some minor experience with similar recommendation systems, and there's one thing that seems a bit problematic to me about the current Anime Hint:

As I understand it, if I have exactly the same rating for an anime as one of my 'neighbors' (users with similar ratings), it adds 40 points to the weight. If it's one step away, only 5 points are added; two steps, -20 points, and so on. ...at least that's how it used to be.

...so what happens when two users like the same animes just as much, but one uses a harsher rating scale than the other? Consider these three users:

User 1:
Anime#1 (vote): 9
Anime#2: 8
Anime#3: 5
Anime#4: 7

User 2: (Exact same rating curve as User 1, but uses harsher rating scale)
Anime#1: 8
Anime#2: 7
Anime#3: 4
Anime#4: 6

User 3: (Completely different rating curve than user 1, but with a few overlaps)
Anime#1: 9
Anime#2: 5
Anime#3: 7
Anime#4: 7

User 1 asks for recommendations; the weights of User 2 and User 3 are calculated as:
User 2: 5 + 5 + 5 + 5 = 20
User 3: 40 + -40 + -20 + 40 = 20

Even though User 1's ratings are much closer to those of User 2 (only their rating scale differs) than to those of User 3, with the current system they would have the same weight. Do you see the problem?
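The calculation above can be reproduced with a short Python sketch. The point values for differences of 0, 1 and 2 are the ones described in this post; the values for larger differences are taken from the 'strict' table posted later in this thread, and the function names are made up for illustration.

```python
# 'Strict' lookup: absolute vote difference -> points.
# Differences of 4 or more all score -60 in the strict table.
STRICT = {0: 40, 1: 5, 2: -20, 3: -40}


def strict_points(diff):
    """Points for one anime that both users voted on."""
    return STRICT.get(diff, -60)


def user_weight(my_votes, their_votes):
    """Sum the points over every anime both users have voted on."""
    shared = my_votes.keys() & their_votes.keys()
    return sum(strict_points(abs(my_votes[a] - their_votes[a])) for a in shared)


user1 = {"Anime#1": 9, "Anime#2": 8, "Anime#3": 5, "Anime#4": 7}
user2 = {"Anime#1": 8, "Anime#2": 7, "Anime#3": 4, "Anime#4": 6}
user3 = {"Anime#1": 9, "Anime#2": 5, "Anime#3": 7, "Anime#4": 7}

print(user_weight(user1, user2))  # 20
print(user_weight(user1, user3))  # 20
```

Both neighbors end up with the same weight, which is the problem being described.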

Taking the difference between the rating curves into account would improve the accuracy of the recommender significantly -- as far as I know, most advanced recommenders do it. It's not too difficult to code, either; the easiest way would be to calculate the average of each user's ratings, create a modifier based on that, and modify all the ratings with it before comparing them. That would ensure that the votes are on the same 'scale'.

Andemon · Post by **Andemon** » Mon Aug 23, 2004 5:29 pm

Let's use those three users as an example.

First, we calculate the modifier required to set each user's average vote to 5.5:

User#1 has an average of 7.25, so the modifier would be 0.758 (7.25*0.758 = ~5.5).
User#2 has an average of 6.25, so the modifier would be 0.88.
User#3 has an average of 7, so the modifier would be 0.785.

...then we multiply each user's ratings by that modifier:

User#1's ratings become (rounded): 7, 6, 4, 5
User#2's ratings become (rounded): 7, 6, 4, 5
User#3's ratings become (rounded): 7, 4, 6, 6

Now, when user#1 asks for recommendations, the weights are calculated as:

User#2: 40 + 40 + 40 + 40 = 160 (perfect match)
User#3: 40 + -20 + -20 + 5 = 5 (slight match)

See? Quite simple method, but improves the accuracy a lot.
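A minimal Python sketch of this scaling idea, assuming a target average of 5.5 and the strict point values (40 / 5 / -20 / -40 / -60). The function names are hypothetical, and the exact rounded values depend on the rounding rule used, so treat this as an illustration rather than the hint's actual code.

```python
TARGET_MEAN = 5.5  # midpoint of the 1-10 vote scale, as in the example above
STRICT = {0: 40, 1: 5, 2: -20, 3: -40}  # larger differences score -60


def normalize(votes):
    """Multiply every vote by the modifier that moves the user's average to TARGET_MEAN."""
    modifier = TARGET_MEAN / (sum(votes) / len(votes))
    return [v * modifier for v in votes]


def strict_weight(mine, theirs):
    """Strict weight computed on rounded, normalized votes."""
    return sum(STRICT.get(abs(round(a) - round(b)), -60) for a, b in zip(mine, theirs))


user1 = [9, 8, 5, 7]
user2 = [8, 7, 4, 6]  # same curve as user1, harsher scale

n1, n2 = normalize(user1), normalize(user2)
largest_gap = max(abs(a - b) for a, b in zip(n1, n2))

print(largest_gap)            # well below 1, the gap before normalization
print(strict_weight(n1, n2))  # 160
```

After scaling, the two users who differ only in harshness become a perfect match under the strict table.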

egg · Post by **egg** » Tue Aug 24, 2004 4:50 am

So what you are suggesting is normalizing the votes before using them for calculations. It is an interesting idea... Anybody have any comments?

Post by **exp** » Tue Aug 24, 2004 7:44 am

hm,

in order for this to work as expected we'd need to filter certain users.
i.e. everyone with only a few votes and everyone who does not have enough diversity in his votes. (i.e. ppl who only vote for the good animes -> all their votes might have the same (or nearly the same) value.)
the first is done already in the current hint implementation IIRC. the second one would be a new filter.
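a rough python sketch of such a filter. both thresholds are made-up values for illustration (nothing here is taken from the real hint code), and the standard deviation is just one possible way to measure "diversity":

```python
from statistics import pstdev

MIN_VOTES = 30     # hypothetical threshold: minimum number of votes
MIN_SPREAD = 1.0   # hypothetical threshold: minimum standard deviation of the votes


def usable_for_normalization(votes, min_votes=MIN_VOTES, min_spread=MIN_SPREAD):
    """Keep only users with enough votes and enough variation between those votes."""
    if len(votes) < min_votes:
        return False
    return pstdev(votes) >= min_spread


only_tens = [10] * 40               # votes only for favourites, no diversity
varied = [4, 6, 7, 8, 9, 10] * 6    # 36 votes spread over the scale
few = [3, 9, 10]                    # diverse, but too few votes

print(usable_for_normalization(only_tens))  # False
print(usable_for_normalization(varied))     # True
print(usable_for_normalization(few))        # False
```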

and with the exact definitions given for the different votes it might also be problematic to use normalization at all.
bc a user might actually not have used a 8 or 9 vote simply bc he didn't like all those animes that well after all. normalizing the votes of such a user in a way that an anime he voted 6 on is considered like a vote of 8 or 9 might be just as bad or even worse compared to the problems normalization might solve.

all in all i am not sure if we should really implement this. it might actually make the anime hint results less accurate.

BYe!
EXP

Elias · Post by **Elias** » Tue Aug 24, 2004 9:53 am

1. Group votes into bad/neutral (<=5) and good (>=6), and calculate/use the modifier only for good votes (some users may use low votes often, some not, but changing good votes according to the number of low votes is not a good idea).
2. Modifier calculated for good votes = 8 - avg.of.good.votes.from.user
This can be calculated once a day for every user (the user's harshness modifier).
At this point, users with only one distinct good score in their votes (e.g. voting only 10) can also be filtered out.
3. Do not round the users' ratings (the modifier will rarely be high, so after rounding it would become useless). This means the score matrix must be extended to work with fractional numbers. That should be rather easy; the function for strict would be:
Score(diff)= -60 + if(diff<=4,20*(4-diff),0) + if (diff<=2, 5*(2-diff)*(2-diff),0)
4. Trying with this example:
User 1, modifier=8- ((9+8+7)/3)=8-8=0 (ideal user)
User 2, modifier=8- ((8+7+6)/3)=8-7=+1
User 3, modifier=8- ((9+7+7)/3)=8-7.666=+0.33

After using modifier scores will be:
U1: 9, 8, 5, 7
U2: 9, 8, 4, 7
U3: 9.33, 5, 7.33, 7.33
(low votes 4 and 5 are not modified!)

Weight of U2 will then be: +40 +40 +5 +40 = 125
Weight of U3 will then be: +27 -40 -27 +27 = -13
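The four steps above can be sketched in Python as follows. The Score(diff) formula and the target of 8 come from this post; the function names and the handling of users without any good votes are assumptions added for the sketch.

```python
def score(diff):
    """Continuous version of the strict table (40, 5, -20, -40, -60 at diff 0..4)."""
    points = -60.0
    if diff <= 4:
        points += 20 * (4 - diff)
    if diff <= 2:
        points += 5 * (2 - diff) ** 2
    return points


def harshness_modifier(votes, target=8, good_from=6):
    """8 minus the average of the user's good votes (0 if there are none: an assumption)."""
    good = [v for v in votes if v >= good_from]
    if not good:
        return 0.0
    return target - sum(good) / len(good)


def adjust(votes, good_from=6):
    """Shift only the good votes; bad/neutral votes stay as they are."""
    mod = harshness_modifier(votes, good_from=good_from)
    return [v + mod if v >= good_from else v for v in votes]


def weight(mine, theirs):
    """Sum of scores over the unrounded differences of the adjusted votes."""
    return sum(score(abs(a - b)) for a, b in zip(adjust(mine), adjust(theirs)))


u1, u2, u3 = [9, 8, 5, 7], [8, 7, 4, 6], [9, 5, 7, 7]
w2 = weight(u1, u2)
w3 = weight(u1, u3)
print(w2)            # 125.0
print(round(w3, 2))  # about -12.22
```

The value for U3 differs slightly from the -13 above because the individual terms are not rounded before they are added.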

Guest · Post by **Guest** » Tue Aug 24, 2004 2:50 pm

Hm. That seems like a workable solution, and definitely easier to implement than what I would've suggested. Bravo.

egg · Post by **egg** » Tue Aug 24, 2004 5:16 pm

If something like this was implemented, it would probably be done using modifier(s) that are created periodically in a script (assuming exp would allow it). If the values are not rounded, we would need to define how to use the rating scale based on fractional values...

exp wrote:in order for this to work as expected we'd need to filter certain users.
i.e. everyone with only a few votes and everyone who does not have enough diversity in his votes. (i.e. ppl who only vote for the good animes -> all their votes might have the same (or nearly the same) value.)
the first is done already in the current hint implementation IIRC. the second one would be a new filter.

As you say, the first one is already done, so I don't see an issue.

As for the second one, if a person ONLY votes for the animes they like, then that makes their relative scores somewhat useless. If a person votes all 10s, then these would end up normalized to 5.5 or 8 under the two implementations listed above. I don't see that as a problem: since there is no variation in their votes, their value for someone else's hint is somewhat limited.

exp wrote:and with the exact definitions given for the different votes it might also be problematic to use normalization at all.

That depends on how many people use the exact definitions. It has been pointed out before that people don't always vote according to the definitions. (Please don't make this another 1 & 10 vote thread...)

exp wrote:all in all i am not sure if we should really implement this. it might actually make the anime hint results less accurate.

On average this would probably mean more users would match, instead of fewer (the example used is an extreme case). Does this mean that these are "better" matches? I don't know. My concern is that it does not appear to be statistically sound (although it has been a long time since I took statistics). If it is not statistically sound, there must be a reason why, and that reason could lead to false results...

I think that more discussion on the concept is needed before exp and/or I are convinced it should be implemented; we can discuss the exact implementation later. It has some good points, but there are still some concerns. The examples given make a good argument, but there are probably examples where the match would be worse using these ideas.

Andemon · Post by **Andemon** » Wed Aug 25, 2004 4:21 am

egg wrote:
exp wrote:and with the exact definitions given for the different votes it might also be problematic to use normalization at all.
That depends on how many people use the exact definitions. It has been pointed out before that people don't always vote according to the definitions. (Please don't make this another 1 & 10 vote thread... )

Indeed.
I've read the definitions, but still tend to give higher votes than I probably should... -_-; ...not to mention that there are no doubt plenty of users who don't even know that the exact definitions exist.

egg wrote:
exp wrote:all in all i am not sure if we should really implement this. it might actually make the anime hint results less accurate.
On average this would probably mean more users would match, instead of fewer (the example used is an extreme case). Does this mean that these are "better" matches, I don't know. My concern is that it does not appear to be statistically sound (although it has been a long time since I took statistics). If it is not statistically sound, there must be a reason why, and that reason could lead to false results...

Well, I for one would be glad to get more matches. I currently have 88 ratings, but only match 14 users -- highest one has rating of 205, and the rest seem to have about ~105.

Many 'professional' recommenders use similar system; Movielens and Alexandria Digital Literature, for example; both of which seem highly accurate, as long as the user has enough votes already. There must be some reason why they use normalization, eh?

...just about the only major recommender that (probably) doesn't use it is the Amazon.com recommender, which has been often cited as being 'somewhat' inaccurate.

What I do know is, that the current Anime Hint doesn't work for me. My recommendations have only about 35% accuracy -- ie. only every third or so receives a vote of 8 or higher from me. The rest range from 4 to 7...

egg · Post by **egg** » Wed Aug 25, 2004 5:26 am

Andemon wrote:Well, I for one would be glad to get more matches. I currently have 88 ratings, but only match 14 users -- highest one has rating of 205, and the rest seem to have about ~105.

More does not necessarily mean better.

The best way to get more is to have more votes; then you will find more matches. You can get more users in the current system by lowering your minimum value to less than 100, but your hints will probably become less accurate. BTW, the hint only uses anime votes (not episode, group, ...). Are all 88 for animes? If so, the numbers sound off...

Andemon wrote:Many 'professional' recommenders use similar system; Movielens and Alexandria Digital Literature, for example; both of which seem highly accurate, as long as the user has enough votes already. There must be some reason why they use normalization, eh? ...just about the only major recommender that (probably) doesn't use it is the Amazon.com recommender, which has been often cited as being 'somewhat' inaccurate.

This is good input. Do you have any more detailed information about how they implemented their recommendations? Normalization is a fairly broad term and can mean a number of things... I would appreciate any information about how other sites recommendations work.

Andemon wrote:What I do know is, that the current Anime Hint doesn't work for me. My recommendations have only about 35% accuracy -- ie. only every third or so receives a vote of 8 or higher from me. The rest range from 4 to 7...

If it worked perfectly, I wouldn't be doing this.

When there are so many components that can be factored into the hint, it is difficult to say if the user matching has a flaw, or if time is better spent elsewhere. Unfortunately we cannot measure how good the matches are (it is difficult to spot check this when each user involved has at least 30 votes and generally many more...) unless we have an appropriate algorithm; and if we knew the algorithm we wouldn't need to measure it...

Please keep the discussion going, this is coming up with some good feedback.

Andemon · Post by **Andemon** » Wed Aug 25, 2004 2:50 pm

egg wrote:BTW, the hint only uses anime votes (not episode, group, ...). Are all 88 for animes? If so, the numbers sound off...

Low amount of matches isn't all that surprising. I've occasionally given low votes (...4, 5 and so...) to animes with average vote of 7.5 or more -- and with almost no existing votes in the low range. That's basically automatic -60 points towards most other users who have watched the anime. Doesn't mean that there's something wrong in the user logic in that regard; it's just that I simply have different tastes than large majority of the userbase.

egg wrote:
Andemon wrote:Many 'professional' recommenders use similar system; Movielens and Alexandria Digital Literature, for example; both of which seem highly accurate, as long as the user has enough votes already. There must be some reason why they use normalization, eh? ...just about the only major recommender that (probably) doesn't use it is the Amazon.com recommender, which has been often cited as being 'somewhat' inaccurate.
This is good input. Do you have any more detailed information about how they implemented their recommendations? Normalization is a fairly broad term and can mean a number of things... I would appreciate any information about how other sites recommendations work.

I actually did a school report on the subject last year, but none of the folks I contacted were willing to give out any details about the inner workings of their recommendation engines -- trade secrets, I guess.

Only got the basic details; for example, the AlexLit recommender requires forty ratings to work. First, it somehow evens out each user's ratings so that they are on the same scale. (Sorry, that's very vague, I know, but that's all I got... -_-; ) Then, the recommendation engine searches for the two hundred[1] closest matching 'neighbors' who have entered similar ratings, and then uses their ratings to compose a recommendations list.[2] The final recommendations are ordered by two factors -- by the predicted rating, and also by how accurate the recommender expects the rating to be (which, I think, is based on how much variation there is in the ratings taken into account, among other things).

[1] // Two hundred seems a rather high number of 'neighbors', but I can't argue with the results -- they seem to be spot on at least ~85% of the time, which seems highly impressive. Perhaps more *is* better?...

[2] // The ratings of the closest neighbors have far more 'weight' (ie. affect the recommendations more) than the ones further away. That's another feature that could be considered for Anime Hint, if it's not already implemented.

Anyway, as far as normalization goes, sites such as Amazon.com which have only five possible ratings can barely manage without implementing it in some form, but the more possible ratings and rating variation the recommender has, the more important normalization becomes.

egg wrote:
Andemon wrote:What I do know is, that the current Anime Hint doesn't work for me. My recommendations have only about 35% accuracy
If it worked perfectly, I wouldn't be doing this.

When there are so many components that can be factored into the hint, it is difficult to say if the user matching has a flaw, or if time is better spent elsewhere. Unfortunately we cannot measure how good the matches are (it is difficult to spot check this when each user involved has at least 30 votes and generally many more...) unless we have an appropriate algorithm; and if we knew the algorithm we wouldn't need to measure it...

That's the root of the problem, indeed.
The only way to truly gauge the accuracy of a recommender is from user feedback, and that's only available after the modifications are made. Bit problematic, that.

Recommenders are a dime a dozen on the internet, but accurate ones are rare.

Well, good luck. I hope you'll manage to improve it, but even if you don't, your efforts are appreciated.

Post by **PetriW** » Wed Aug 25, 2004 3:49 pm

We could always buy the amazon recommendation system, it's only 125000 usd.

egg · Post by **egg** » Wed Aug 25, 2004 4:30 pm

Andemon wrote:I actually did a school report on the subject last year, but none of the folks I contacted were willing to give out any details about the inner workings of their recommendation engines -- trade secrets, I guess.

Thanks, that information is useful, it doesn't give us the answer, but it gives us some hints. Anyway, it has made me start looking at some other options that I will try out and see how they work...

Andemon wrote:[2] // The ratings of the closest neighbors have far more 'weight' (ie. affect the recommendations more) than the ones further away. That's another feature that could be considered for Anime Hint, if it's not already implemented.

The score listed next to the user is the weight. So, in your case the user that is at 205 has about the same weight as two users at 100.

PetriW wrote:We could always buy the amazon recommendation system, it's only 125000 usd.

You're a lot of help.

egg · Post by **egg** » Sat Sep 04, 2004 7:29 am

Here is a revised "How does Anime Hint work?" document that has some of the changes I am working on. I would appreciate any comments. BTW, I will probably have a default hint page with a "simple" interface, and then an advanced page that has all of the new options.

BTW, The new logic does recommend things I voted highly on, in my tests it appears to work much better than the old logic.

ATM the anime hint page tries to get a list of possible anime recommendations by first comparing all YOUR anime votes with the anime votes of all other anidb users. This is used to create a weight (or score) for each user in relation to you.

Step #1, Find Similar Users and Assign a Relative Weight
First a list is made of all the animes with your votes. Temporary and permanent votes are treated the same (if somehow there are multiple votes on an anime the permanent is taken).

A list of all users that have at least 50 [permanent?] votes for animes, 500 animes in their mylist and have a permanent vote for any anime in your vote list is made. Temporary votes are not used. Note although both you and another user have voted for at least 30-50 animes, they may still show up in this list with only 1 anime in common.

A weight is created for each of the users in this list. The weight is calculated by taking the list of animes that you have in common with that user, calculating the difference between your permanent or temporary vote and their permanent vote, and assigning a value for that difference. These values are then added up to create the user's weight. The lookup tables used to determine the value are listed below and depend on whether you chose a ‘strict’ or ‘loose’ weight style (default is ‘strict’).

Here is the weight style lookup table.
strict:
0 => 40, 1 => 5, 2 => -20, 3 => -40, 4 => -60, 5 => -60, 6 => -60,
7 => -60, 8 => -60, 9 => -60,

loose:
0 => 10, 1 => 1, 2 => -1, 3 => -2, 4 => -4, 5 => -6, 6 => -8,
7 => -10, 8 => -12, 9 => -15

For instance if you voted 10 on a1, 9 on a2 and 7 on a3 and another user voted 9 on a1, 9 on a2 and 4 on a3, the weight on the strict style would be 5 + 40 – 40 = 5.

After the weights are calculated then all of the users with a score less than the min. weight (default is 100) you specify are filtered out.
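As an illustration, Step #1 can be sketched in Python like this. The two tables and the default min. weight of 100 are the ones listed above; the function names and the dictionary layout of the votes are assumptions made for this sketch and are not taken from the actual hint code.

```python
# Vote difference -> value, for the two weight styles listed above.
STRICT = {0: 40, 1: 5, 2: -20, 3: -40, 4: -60, 5: -60, 6: -60, 7: -60, 8: -60, 9: -60}
LOOSE = {0: 10, 1: 1, 2: -1, 3: -2, 4: -4, 5: -6, 6: -8, 7: -10, 8: -12, 9: -15}


def user_weight(my_votes, their_votes, table=STRICT):
    """Add up the table values for every anime both users have voted on."""
    shared = my_votes.keys() & their_votes.keys()
    return sum(table[abs(my_votes[a] - their_votes[a])] for a in shared)


def similar_users(my_votes, others, table=STRICT, min_weight=100):
    """Weight every other user and drop those below the min. weight."""
    weights = {name: user_weight(my_votes, votes, table) for name, votes in others.items()}
    return {name: w for name, w in weights.items() if w >= min_weight}


mine = {"a1": 10, "a2": 9, "a3": 7}
other = {"a1": 9, "a2": 9, "a3": 4}

print(user_weight(mine, other))            # 5  (strict: 5 + 40 - 40)
print(user_weight(mine, other, LOOSE))     # 9  (loose: 1 + 10 - 2)
print(similar_users(mine, {"u": other}))   # {}  (5 is below the default of 100)
```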

Step #2, Find Recommended Animes
Using the final list of users, now a list of animes is created.

The initial list is all animes the list of users voted on with a vote >= min. vote (default = 3, old logic was hard coded to 8) specified by the user. This is a change from before, the original idea, I believe, was just to get a list of animes that had high votes, and use the following logic to determine the relative ranking of these animes. The issue with this is it ignores the low votes of users that have similar tastes, so an anime with a few users that voted high and many users that vote low would still get recommended. The min. value remains so that people can use the old logic, if desired, and because many users believe that extreme votes should be filtered out.

Now a score is created for each anime by taking the sum of (user weight*((vote-rec. vote)/5)) for every user that voted for that anime, where rec. vote (default 8, old logic was hard coded to 5) is the recommended vote level. This makes the recommended vote level the point above which votes raise an anime's score and below which votes lower it.

For instance, if 3 users are voting on the same anime, u1 has a weight of 200 and a vote of 5, u2 has a weight of 100 and a vote of 9 and u3 has a weight of 100 and a vote of 10, the score is (200*((5-8)/5)) + (100*((9-8)/5)) + (100*((10-8)/5)) = 200*(-.6) + 100*.2 + 100*.4 = -120 + 20 + 40 = -60. Since the user that had the closest match with you voted fairly low on the anime, the overall score is low even though two other users recommended it. Lowering the rec. vote value diminishes the impact low votes have on the overall score.
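The same example as a small Python sketch. The formula and the default rec. vote of 8 are the ones described above; the function name and the (weight, vote) pair layout are only illustrative.

```python
def anime_score(votes, rec_vote=8):
    """votes: list of (user weight, vote) pairs for a single anime."""
    return sum(weight * ((vote - rec_vote) / 5) for weight, vote in votes)


votes = [(200, 5), (100, 9), (100, 10)]  # u1, u2, u3 from the example

default_score = anime_score(votes)               # rec. vote 8: -120 + 20 + 40
old_logic_score = anime_score(votes, rec_vote=5)  # rec. vote 5: 0 + 80 + 100

print(round(default_score, 6))    # -60.0
print(round(old_logic_score, 6))  # 180.0
```

With rec. vote lowered to 5, the low vote from the closest match no longer pulls the score down, which shows how much this setting changes the ranking.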

Now that the scores have been calculated, an averaging scheme is applied. This makes the scores less dependent on the number of users that have voted on an anime; otherwise an anime with a lot of moderate votes can outweigh an anime with fewer very high votes. The following formula is used to average the scores: animescore = score/(avg. adjustment + numberofvotes*avg. scale). Avg. adjustment (default 1) is used to balance out the average somewhat; this prevents an anime with a single 10 vote (although highly rated, if not many users have voted for it there must be a reason) from outweighing an anime with a few high votes. One way of thinking about this is as a minimum number of votes to consider, since animes with fewer than this number of votes will have very low scores. Avg. scale (default 0.25) is how much of an average you want: a value of 1 is a strict average and a value of 0 does not average.

Note you cannot assign 0 to both avg. adjustment and avg. scale at the same time. A strict average would set avg. adjustment to 0 and avg. scale to 1. To have no averaging set avg. adjustment to 1 and avg. scale to 0.
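A short Python sketch of the averaging formula, using the defaults given above (avg. adjustment 1, avg. scale 0.25). The function name and the sample scores are made up for illustration.

```python
def averaged_score(score, number_of_votes, avg_adjustment=1.0, avg_scale=0.25):
    """animescore = score / (avg. adjustment + numberofvotes * avg. scale)"""
    if avg_adjustment == 0 and avg_scale == 0:
        # Both settings at 0 would mean dividing by zero.
        raise ValueError("avg. adjustment and avg. scale cannot both be 0")
    return score / (avg_adjustment + number_of_votes * avg_scale)


print(averaged_score(300, 4))          # 150.0  defaults: 300 / (1 + 4*0.25)
print(averaged_score(300, 4, 0, 1))    # 75.0   strict average: 300 / 4
print(averaged_score(300, 4, 1, 0))    # 300.0  no averaging
```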

Step #3, Filter Animes
Now that a list of scores has been created, these are now filtered based on the criteria chosen.

The resulting list is shown ordered by score.

[Note for purists, step 3 actually happens before step 2 for performance reasons, but it is easier to explain it here.]

egg · Post by **egg** » Sat Sep 04, 2004 7:55 am

Does anyone have any suggestions on how to display the genre filters? I am thinking of two filters.

1) Only show animes with the genre(s) xxx.
2) Do not show animes with the genre(s) yyy.

There is a subtle difference between the two. For instance if you say you only want animes that have fantasy, you may still get things like these, which are hentai.

If I implement this like the advanced search, then I would have to have checkboxes for each genre for both filters. I don't think this is acceptable. This leaves a drop-down list, where you would only be able to select a single genre, or a multi-select list, where the user would have to hold down the Ctrl or Command key while clicking to select multiple genres.

Anybody have any other ideas or preferences?

BTW, if the user is able to select multiple genres, atm it will be an 'or' condition.
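To make the difference between the two filters concrete, here is a minimal Python sketch. The function and parameter names are hypothetical, and it assumes the 'or' condition mentioned above for multiple selected genres.

```python
def passes_genre_filters(anime_genres, only_show=(), do_not_show=()):
    """Filter 1 keeps animes with ANY selected genre; filter 2 drops animes with ANY excluded genre."""
    genres = set(anime_genres)
    if only_show and not genres & set(only_show):
        return False
    if genres & set(do_not_show):
        return False
    return True


anime = ["fantasy", "hentai"]

# Filter 1 alone still lets this anime through, because it has the fantasy genre.
print(passes_genre_filters(anime, only_show=["fantasy"]))                          # True
# Adding filter 2 removes it.
print(passes_genre_filters(anime, only_show=["fantasy"], do_not_show=["hentai"]))  # False
```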

Thx.