[FEEDBACK] new Genre System
Moderator: AniDB
-
- Posts: 49
- Joined: Sat Mar 26, 2005 1:29 pm
Very interesting topic. We had a similar discussion at some anime review site but nothing half as complex came to our mind. As a normal user of aniDB, submitting files but not anime so far (but willing to submit categories if that were helpful), I would like to add my opinion:
I think a complex system is not a problem per se. 200 list elements can be more confusing for certain users than a tree, and a tree can be more confusing for others. Some like navigating trees, and others like scrolling lists. The advantage of a tree model is that it can easily be extended - I like this concept.
The number of people supplying data will be small anyway, and those willing to invest the time will mostly not object to reading the rules and learning how to categorize an anime. Supplying a categorization as detailed as specified here would cost as much work as writing a decent review - but there are people out there writing decent reviews, and with time the categorization would improve. I'm all for a complex but usable system. As for different levels of a category applying to an anime, again I'm not against this, but only if the user has to select from a list of well-explained, mutually exclusive alternatives (such as radio buttons with mouse-over tooltips) - don't use numeric codes here, as they're too open to interpretation. Transparency should not be a problem as long as each possible alternative is self-explanatory; I'm willing to traverse a complex dialog as long as I understand each and every step.
What's important, though, would be to have a good explanation for each and every category value, including references to well-known examples (say, links to 1-2 animes with 1000+ users knowing them), and an excellent, user-friendly GUI with a guided dialog and mouse-over tooltips for the aforementioned explanations, and for the tree model perhaps DHTML navigation (CSS: visibility; would save tons of HTTP requests) to display/hide subcategory levels within the same HTML document. For that's what I'd expect to do as a user: traverse the options tree offered by aniDB and choose from it, instead of writing a doctoral thesis and coming up with all categories by myself. Only if I feel some category is obviously missing would I post a change request.
The most important aspect of the whole discussion for me is the similarity function, because that's something that doesn't work in the current version. Look at Silent Möbius, which is (correctly) categorized as "Action, Comedy, Drama, Fantasy, Horror, Magic, Romance, SciFi" (yes, Fantasy and SciFi, plus Comedy and Drama). When I click on the "similar" link the resulting set is Silent Möbius itself and nothing else. Apparently the similarity function does a set inclusion match, and no other anime has all the categories of Silent Möbius as a subset.
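To illustrate why an inclusion match comes back nearly empty for a heavily categorized title, here is a minimal sketch. It assumes the current "similar" link really does a subset test, which is only the guess made above; the catalog entries other than Silent Möbius and the function name are hypothetical.

```python
# Hypothetical catalog; only Silent Möbius's category list is taken from the post.
CATALOG = {
    "Silent Möbius": {"Action", "Comedy", "Drama", "Fantasy",
                      "Horror", "Magic", "Romance", "SciFi"},
    "Anime P": {"Action", "Drama", "SciFi"},
    "Anime Q": {"Action", "Fantasy", "Horror", "Magic"},
}


def similar_by_inclusion(title, catalog):
    """Return every title whose categories include ALL categories of `title`."""
    wanted = catalog[title]
    # `wanted <= cats` is Python's subset test on sets.
    return [name for name, cats in catalog.items() if wanted <= cats]


# With 8 categories to match, nothing but the title itself qualifies.
print(similar_by_inclusion("Silent Möbius", CATALOG))
# A title with few categories gets matches much more easily.
print(similar_by_inclusion("Anime P", CATALOG))
```

The more categories a title has, the fewer other titles can contain all of them, so the best-described anime get the worst results.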
The similarity function I've been suggesting on our anime site would be: set intersection and relative number of matches. So if Anime X has 4 categories A, B, C, and D, while Anime Y has 3 categories A, C, and E, the intersection would be A and C, and the similarity would be (2*2) / (3*4) = 4/12 = 33.3%. Being fully contained in a much larger set of categories gives you a lower similarity ranking than a relatively large number of matches based on relatively small sets; for example, Anime Z has categories A, C, and E plus 10 more categories, so its similarity with Y would only be (3*3) / (3*13) = 9/39 = 23.1%.
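As a minimal sketch, the formula can be written as the squared size of the intersection divided by the product of the two set sizes. The function name and the placeholder categories K0..K9 are assumptions made for illustration; exact fractions are used so the arithmetic can be checked by hand.

```python
from fractions import Fraction


def similarity(a, b):
    """|a ∩ b|^2 / (|a| * |b|) as an exact fraction between 0 and 1."""
    if not a or not b:
        return Fraction(0)  # assumption: an uncategorized anime matches nothing
    common = len(a & b)
    return Fraction(common * common, len(a) * len(b))


x = {"A", "B", "C", "D"}
y = {"A", "C", "E"}
# A, C, E plus 10 more categories (K0..K9 are hypothetical placeholders).
z = y | {f"K{i}" for i in range(10)}

print(similarity(x, y))  # 4/12, printed in lowest terms
print(similarity(y, z))  # 9/39, printed in lowest terms
print(similarity(x, z))  # 4/52, printed in lowest terms
```

The function is symmetric, and an anime compared with itself always scores exactly 1.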
This similarity function would be defined and provide a value between 0 and 1 for each pair of animes, so the "similar" link would lead to a page sorting all other animes by this value and present the top 100 results or so. (By the way, something like this can be done with the present categories already...)
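A rough sketch of such a "similar" page follows: score every other anime against the chosen one, sort by score, and cut off at a limit. The catalog contents, the function names and the tie-break by title are assumptions, not anything aniDB is known to do.

```python
def similarity(a, b):
    """|a ∩ b|^2 / (|a| * |b|), a float between 0 and 1."""
    if not a or not b:
        return 0.0
    common = len(a & b)
    return common * common / (len(a) * len(b))


def most_similar(title, catalog, limit=100):
    """Return up to `limit` (score, title) pairs, best match first."""
    base = catalog[title]
    scored = [(similarity(base, cats), name)
              for name, cats in catalog.items() if name != title]
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:limit]


# Hypothetical catalog built from the X/Y/Z example plus one unrelated title.
catalog = {
    "Anime X": {"A", "B", "C", "D"},
    "Anime Y": {"A", "C", "E"},
    "Anime Z": {"A", "C", "E"} | {f"K{i}" for i in range(10)},
    "Anime W": {"Q"},
}

for score, name in most_similar("Anime X", catalog):
    print(f"{name}: {score:.1%}")
```

Because the score needs only the category sets, this works on a flat category list just as well as on a tree.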
One more thing: If you're able to divide the category issue into subissues, try doing so. My method would be to find different questions addressed by the category values, such as: audience ("Who is the target group?") or setting ("When and where does the story take place?"). The more of these questions you can find and isolate, the more clearly laid out the resulting sub-trees will become. Everything that can't be assigned to one such question will likely remain as "other keywords". It is a good thing if you can find subsets that tend to contain mostly mutually exclusive values, as in 90+% of all cases the user will then select one of the offered alternatives and be satisfied with his/her choice, while selecting the correct 13 categories from 97 alternatives can be quite frustrating.
The problem with the tree structure is that in some cases categories aren't actually subcategories. There are stories with sexual intercourse that aren't hentai, and not even ecchi (Kimi ga Nozomu Eien, for example). I'm afraid that the tree structure might make certain reasonable combinations impossible, just like the strict alternative (radio buttons) of very unlikely combinations would do, such as Modern vs. Historical (what about a Time Travel story?). Maybe a set of subsets would be more flexible.
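One way to picture a "set of subsets" is to group values by the question they answer and treat exclusivity as a soft hint rather than a hard radio-button rule. This is only a sketch: the `Facet` class, the `check_selection` helper and most of the values are hypothetical, with only the two questions and the Modern/Historical/Time Travel case taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class Facet:
    question: str
    values: tuple
    usually_single: bool = True  # a hint for the GUI, not an enforced rule


FACETS = {
    "audience": Facet("Who is the target group?",
                      ("Children", "Teens", "Adults")),
    "setting": Facet("When and where does the story take place?",
                     ("Modern", "Historical", "Future")),
    "other keywords": Facet("Anything that fits no question",
                            ("Time Travel", "Samurai"), usually_single=False),
}


def check_selection(selection):
    """Return a list of warnings; an empty list means nothing looks unusual."""
    warnings = []
    for name, chosen in selection.items():
        facet = FACETS[name]
        for value in chosen:
            if value not in facet.values:
                warnings.append(f"{name}: unknown value {value!r}")
        if facet.usually_single and len(chosen) > 1:
            warnings.append(f"{name}: {len(chosen)} values chosen, usually one")
    return warnings


# A time travel story may legitimately be both Modern and Historical,
# so the combination is flagged for review instead of being forbidden.
time_travel_story = {"setting": ["Modern", "Historical"],
                     "other keywords": ["Time Travel"]}
print(check_selection(time_travel_story))
```

In most cases the user picks one value per question and sees no warning; the rare exception stays possible.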
nwa: I was referring to the fact that a lot of hentai doesn't show more than boobs and ass, while others show everything. ;D I have seen more than enough hentai to know all the tricks the Japanese have used over time.
Pelican: your argument doesn't make sense, since a martial arts anime with action would have the genre action added, thus making it appear anyway even without the tree system. It logically shouldn't return an anime with martial arts but no action unless the system is broken, thus once again making the tree system completely useless.
While I agree with the common sense tree idea (since the real tree would IMO be way better but seems rather impossible to actually do in SQL), it still brings the problem of all those exceptions, which could get the tree to suddenly look messy and confusing.
To keep things under control, maybe we could hide the root up to where the information gets real. For example, a samurai drama would appear as a tree with drama as one root and samurai as another, while in fact samurai is normally a part of action. If the subgenre martial arts is selected too, then it would be another root, since the action root doesn't exist for this anime.
As for the similar anime feature: everybody sees "similar" in a different way, often linked to something specific in an anime that pleased them.
A good way, though it could be a bit heavy on the system, would be to allow the user to select specific parts of the tree. Any anime that has all the selected tree parts would be considered similar for this search.
Example: I want samurai action with drama, so I select samurai and drama, and any anime that has those 2 things will appear. The ability to weed out some genres could be good too, like the ability to search for samurai drama without any action.
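The selection described here boils down to a filter with required and excluded categories. A minimal sketch, assuming each anime simply carries a flat set of category names; the titles and the `find_matching` name are hypothetical.

```python
def find_matching(catalog, required, excluded=()):
    """Titles that have every required category and none of the excluded ones."""
    required, excluded = set(required), set(excluded)
    return [name for name, cats in catalog.items()
            if required <= cats and not (excluded & cats)]


# Hypothetical catalog for illustration.
catalog = {
    "Title 1": {"Samurai", "Drama", "Action"},
    "Title 2": {"Samurai", "Drama"},
    "Title 3": {"Samurai", "Comedy"},
}

# Samurai with drama, action allowed.
print(find_matching(catalog, {"Samurai", "Drama"}))
# Samurai drama without any action.
print(find_matching(catalog, {"Samurai", "Drama"}, excluded={"Action"}))
```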
Let's note that this idea would also work with categories and subcategories, while it would possibly bug out with the common sense tree.
A % of similarities between the 2 trees could be a good way to find similar anime too, but it would most probably bug out with the common sense tree.
2 samurai anime, one of them without action, are really different even if only the action is missing. At the same time, 2 action samurai anime can vary quite a lot in the other stuff (comedy, romance...) and still be quite similar. Most likely way more than in the previous example.
The only way to correct this would be to allow subcategories to be adaptively linked to the correct category, as I had proposed. Because of this, strict radio buttons that force you to select genres just to get the subgenre you need are awful.
The last way around this would be to slightly change the names depending on which part of the tree they belong to. For example, samurai_drama would be a subcategory of drama while samurai_action would be a subcategory of action. What's a bit sad about this is that it makes searching for samurai as a whole impossible without using some trick.
-
- Posts: 49
- Joined: Sat Mar 26, 2005 1:29 pm
Raptor wrote: while I agree with the common sense tree idea (since the real tree would IMO be way better but seems rather impossible to actually do in SQL)
A tree is a set of nodes, each of which contains a pointer to its parent node and an (indexed) ID of the tree it belongs to. No problem at all to retrieve the complete tree for one ID with one SQL statement and connect the pointer structure in memory. It would be difficult to read distinct parts of the tree with high performance (because the structure information is so distributed) but I don't see this as a necessary requirement.
Raptor wrote: As for the similar anime feature: everybody sees "similar" in a different way, often linked to something specific in an anime that pleased them... Example: I want samurai action with drama, so I select samurai and drama, and any anime that has those 2 things will appear.
But that's not looking for a "similar" anime. "Similar" should provide reasonable results for those users who aren't able (or willing) to analyze why they like a certain anime. I could use a search engine in the same way as you can, but the target group for the "similar" feature is the users, not the programmers.
I think you are mixing SQL with classic languages like C or Java.
I never heard of SQL having pointers, except for indexes and the like.
The classic command in SQL is:
SELECT (whatever columns you want)
FROM (the tables you want)
WHERE (conditions the rows must match)
Pretty different from classic programming. So unless I'm wrong here, SQL doesn't have nodes with pointers as you are saying, and as EXP said, the genre names need to be different, which makes sense in the way I learned SQL, not with the pointer kind of tree that Java uses.
As for the similar feature, I agree that it wouldn't really be a similar feature but more like a "search for what I like". But I think this should be discussed after we know what kind of genre system EXP will choose.
-
- Posts: 49
- Joined: Sat Mar 26, 2005 1:29 pm
Raptor wrote: I think you are mixing SQL with classic languages like C or Java
Surely the HTML output isn't generated directly via SQL but by some third-generation algorithmic language that is surely capable of building trees with pointers. All this language (most likely Perl or PHP) needs as input is the set of nodes, which can be extracted from the database via the database API (Perl DBI?) with a WHERE clause on the indexed ID column.
Tree semantics for series "XYZ":
              Root "A"
             /        \
         Child        Child
           B            C
          /            / \
  Grandchild  Grandchild  Grandchild
      D           E           F

SERIES | NODE | PARENT | CONTENT
=======+======+========+========
XYZ    |    1 |      - | "A"
XYZ    |    2 |      1 | "B"
XYZ    |    3 |      1 | "C"
XYZ    |    4 |      2 | "D"
XYZ    |    5 |      3 | "E"
XYZ    |    6 |      3 | "F"
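Connecting the pointer structure in memory could look roughly like the sketch below. Python is used only for illustration (the post guesses Perl or PHP on the server side), the rows are the six from the table with `None` standing in for the "-" parent, and `build_tree` and `render` are hypothetical helper names.

```python
# (series, node, parent, content) rows as they might come back from one query
# with a WHERE clause on the series ID.
ROWS = [
    ("XYZ", 1, None, "A"),
    ("XYZ", 2, 1, "B"),
    ("XYZ", 3, 1, "C"),
    ("XYZ", 4, 2, "D"),
    ("XYZ", 5, 3, "E"),
    ("XYZ", 6, 3, "F"),
]


def build_tree(rows):
    """Link each node to its parent and return the root node."""
    nodes = {node_id: {"content": content, "children": []}
             for _, node_id, _, content in rows}
    root = None
    for _, node_id, parent_id, _ in rows:
        if parent_id is None:
            root = nodes[node_id]
        else:
            nodes[parent_id]["children"].append(nodes[node_id])
    return root


def render(node, depth=0):
    """Return the tree as indented lines, parents before their children."""
    lines = ["  " * depth + node["content"]]
    for child in node["children"]:
        lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render(build_tree(ROWS))))
```

The rows can arrive in any order, because all nodes are created before any of them is linked.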
Last edited by Devil Doll on Sat Apr 30, 2005 8:50 pm, edited 1 time in total.
Raptor wrote: I never heard of SQL having pointers, except for indexes and the like.
Simple: every row has its own unique id and also the id of its parent row (null for the root element). It's nothing really extraordinary. Some SQL engines also have built-in special instructions to make it even easier (like the Oracle tree extensions (CONNECT BY ... PRIOR)).
see also: http://www.anidb.net/forum/viewtopic.php?p=18245#18245
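For a concrete picture of the id/parent-id layout, here is a small sketch using Python's built-in `sqlite3` module. SQLite has no CONNECT BY, so a recursive common table expression (`WITH RECURSIVE`) is swapped in as a stand-in for that Oracle extension; the table name, column names and sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES category(id),"  # NULL for the root element
    " name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO category (id, parent_id, name) VALUES (?, ?, ?)",
    [(1, None, "A"), (2, 1, "B"), (3, 1, "C"),
     (4, 2, "D"), (5, 3, "E"), (6, 3, "F")],
)


def subtree(conn, root_id):
    """Return (name, depth) for the given node and all of its descendants."""
    sql = """
        WITH RECURSIVE sub(id, name, depth) AS (
            SELECT id, name, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, sub.depth + 1
            FROM category AS c JOIN sub ON c.parent_id = sub.id
        )
        SELECT name, depth FROM sub ORDER BY depth, id
    """
    return conn.execute(sql, (root_id,)).fetchall()


print(subtree(conn, 3))  # node "C" and everything below it
```

The "pointer" is just an ordinary column holding another row's id, so no special SQL feature is required to store the tree.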
I was planning to use only the categories which were actually added for an anime in a search query. That means an Action::Sports (or whatever :P) anime would not be found on a search for Action if only Action::Sports is added as a category.
However, if the anime has actually enough action content to be classified as Action too, then both Action and Action::Sports would be added for the anime and thus it would be found by a search for either Action or Action::Sports.
This approach makes sure that we do not have to look at the 'tree' at all during search runs. Which is a requirement due to performance reasons IMHO.
BYe!
EXP
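The behaviour described in this post can be sketched as a plain membership test on the categories that were explicitly added, with no knowledge of parent/child relations. The titles, data and function name are hypothetical; this is not aniDB's actual search code.

```python
# Categories exactly as entered for each anime; "::" is treated as part of
# the name and is never parsed during a search.
ANIME_CATEGORIES = {
    "Anime 1": {"Action::Sports"},            # only the subcategory was added
    "Anime 2": {"Action", "Action::Sports"},  # enough action to get both
}


def find_by_category(category, data):
    """Titles for which `category` was explicitly added."""
    return sorted(name for name, cats in data.items() if category in cats)


print(find_by_category("Action", ANIME_CATEGORIES))
print(find_by_category("Action::Sports", ANIME_CATEGORIES))
```

A search for Action misses "Anime 1" by design, which is the price paid for never walking the tree at query time.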
Genre::Hentai (without extensions) is missing.
::Students::Female
::Breasts::Big
::Teachers::Female
I would prefer rather longer last parts (if you omit the previous parts, those can have no meaning), like this:
::Students::Female students
::Breasts::Big breasts
::Teachers::Female teachers
About layout:
- categories should be sorted by: 1st part (content, audience, genre - maybe with a header before each group), then weight, and lastly name - right now it is a little messy and hard to read
- marking a category by italic/normal font is almost invisible; I would change it to normal/bold (or normal text/link (to the genre definition or similar titles))
- the stars will be graphics in the future, ne?
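The sort order suggested in the first point could be expressed as a three-part sort key. This is a sketch under assumptions: categories are stored with their full path, weights are small integers where higher means more relevant and is listed first, and the sample entries and weights are made up.

```python
GROUP_ORDER = ["Content", "Audience", "Genre"]  # order of the 1st path part


def sort_key(entry):
    """Sort by group, then weight (highest first), then full name."""
    path, weight = entry
    group = path.split("::")[0]
    return (GROUP_ORDER.index(group), -weight, path)


# Hypothetical (category path, weight) pairs for one anime.
entries = [
    ("Genre::Comedy", 3),
    ("Content::Teachers::Female teachers", 1),
    ("Audience::Adults", 2),
    ("Genre::Action", 3),
    ("Content::Students::Female students", 3),
]

for path, weight in sorted(entries, key=sort_key):
    print(f"{path} ({'*' * weight})")
```

Printing a header whenever the first path part changes would give the per-group headers mentioned in the same point.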
-
- AniDB Staff
- Posts: 335
- Joined: Tue Oct 01, 2002 11:13 pm
- Location: GOTT Head Office, Planet Aineias
http://www.ultima-chan.com/Anidb%20Cate ... 052005.htm
That's a list of the currently entered categories. Feel free to post any adjustments, additions, etc. Just remember this is a work in progress!
-
- AniDB Staff
- Posts: 438
- Joined: Thu Apr 08, 2004 1:43 am
- Location: Portugal