[FEEDBACK] new Genre System

Forum for discussing AniDB rules & standards. No small talk!

Moderator: AniDB

Do we have all needed genres listed?

Yes: 10 votes (38%)
No: 16 votes (62%)

Total votes: 26

Devil Doll
Posts: 49
Joined: Sat Mar 26, 2005 1:29 pm

Post by Devil Doll » Thu Apr 28, 2005 9:04 pm

Very interesting topic. We had a similar discussion on an anime review site, but nothing half as complex came to mind. As a normal user of AniDB, submitting files but not anime so far (but willing to submit categories if that would be helpful), I would like to add my opinion:

I think a complex system is not a problem per se. 200 list elements can be more confusing for certain users than a tree, and a tree can be more confusing for others. Some like navigating trees, and others like scrolling lists. The advantage of a tree model is that it can easily be extended - I like this concept.
The number of people supplying data will be small anyway, and those willing to invest the time will mostly not object to reading the rules and learning how to categorize an anime. Supplying a categorization as detailed as specified here would cost as much work as writing a decent review - but there are people out there writing decent reviews, and with time the categorization would improve. I'm all for a complex but usable system. As for different levels of a category applying to an anime, again I'm not against this, but only if the user has to select from a list of well-explained, mutually exclusive alternatives (such as radio buttons with mouse-over tooltips) - don't use numeric codes here as they're too open to interpretation. Transparency should not be a problem as long as each possible alternative is self-explanatory; I'm willing to traverse a complex dialog as long as I understand each and every step.

What's important though would be to have a good explanation for each and every category value, including references to well-known examples (say, links to 1-2 anime that 1000+ users know), and an excellent, user-friendly GUI with a guided dialog and mouse-over tooltips for the aforementioned explanations, and for the tree model perhaps DHTML navigation (CSS: visibility; would save tons of HTTP requests) to display/hide subcategory levels within the same HTML document. For that's what I'd expect to do as a user: traverse the options tree offered by AniDB and choose from it, instead of writing a doctoral thesis and coming up with all the categories by myself. Only if I feel some category is obviously missing would I post a change request.

The most important aspect of the whole discussion for me is the similarity function, because that's something that doesn't work in the current version. Look at Silent Möbius, which is (correctly) categorized as "Action, Comedy, Drama, Fantasy, Horror, Magic, Romance, SciFi" (yes, Fantasy and SciFi, plus Comedy and Drama). When I click on the "similar" link, the resulting set is Silent Möbius itself and nothing else. Apparently the similarity function does a set inclusion match, and no other anime has all the categories of Silent Möbius as a subset.
The similarity function I've been suggesting on our anime site would be: set intersection and relative number of matches. So if Anime X has 4 categories A, B, C, and D, while Anime Y has 3 categories A, C, and E, the intersection would be A and C, and the similarity would be (2*2) / (3*4) = 4/12 = 33.3%. Being a large superset of a small set of categories gives you a lower similarity ranking than a relatively large number of matches based on relatively small sets; for example, Anime Z has categories A, C, and E plus 10 more categories, so its similarity with Y would only be (3*3) / (3*13) = 9/39 = 23.1%.
This similarity function would be defined for each pair of anime and provide a value between 0 and 1, so the "similar" link would lead to a page that sorts all other anime by this value and presents the top 100 results or so. (By the way, something like this can be done with the present categories already...)
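
The similarity function described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `similarity`, `rank_similar` and `catalog` are hypothetical, the empty-set rule is an assumption the post does not cover, and the category sets are the Anime X / Anime Y example from the post, not real AniDB data.

```python
def similarity(a: set[str], b: set[str]) -> float:
    """Squared size of the intersection divided by the product of both set sizes."""
    if not a or not b:
        return 0.0  # assumption: an anime without categories is similar to nothing
    common = len(a & b)
    return (common * common) / (len(a) * len(b))


def rank_similar(title: str, catalog: dict[str, set[str]], limit: int = 100) -> list[tuple[str, float]]:
    """Sort all other titles by similarity to `title`, best match first."""
    own = catalog[title]
    scores = [(other, similarity(own, cats)) for other, cats in catalog.items() if other != title]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:limit]


# Hypothetical catalog using the example sets from the post.
catalog = {
    "Anime X": {"A", "B", "C", "D"},
    "Anime Y": {"A", "C", "E"},
    "Anime W": {"E", "F"},
}

print(round(similarity(catalog["Anime X"], catalog["Anime Y"]), 3))  # 0.333
print(rank_similar("Anime X", catalog)[0][0])                        # Anime Y
```

The value is always between 0 and 1, and it reaches 1 only when both anime carry exactly the same categories.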

One more thing: If you're able to divide the category issue into subissues, try doing so. My method would be to find different questions addressed by the category values, such as: audience ("Who is the target group?") or setting ("When and where does the story take place?"). The more of these questions you can find and isolate, the more clearly laid out the resulting sub-trees will become. Everything that can't be assigned to one such question will likely remain as "other keywords". It is a good thing if you can find subsets that tend to contain mostly mutually exclusive values, as in 90+% of all cases the user will then select one of the offered alternatives and be satisfied with his/her choice, while selecting the correct 13 categories from 97 alternatives can be quite frustrating.

The problem with the tree structure is that in some cases categories aren't actually subcategories. There are stories with sexual intercourse that aren't hentai, and not even ecchi (Kimi ga Nozomu Eien, for example). I'm afraid that the tree structure might make certain reasonable combinations impossible, just like the strict alternative (radio buttons) of very unlikely combinations would do, such as Modern vs. Historical (what about a Time Travel story?). Maybe a set of subsets would be more flexible.

Raptor
Posts: 155
Joined: Mon Nov 01, 2004 11:07 pm

Post by Raptor » Fri Apr 29, 2005 5:03 am

nwa: I was referring to the fact that many, many hentai don't show more than boobs and ass, while others show everything. ;D I have seen more than enough hentai to know all the tricks the Japanese have used over time :P

Pelican: your argument doesn't make sense, since a martial arts anime with action would have the action genre added, thus making it appear anyway even without the tree system. It logically shouldn't return an anime with martial arts but no action unless the system is broken, thus once again making the tree system completely useless.

While I agree with the common sense tree idea (since the real tree would IMO be way better but seems rather impossible to actually do in SQL), it still brings the problem of all those exceptions, which could make the tree suddenly look messy and confusing.

To keep things under control, maybe we could hide the root up to where the information gets real. For example, a samurai drama would appear as a tree with drama as one root and samurai as another, while in fact samurai is normally a part of action. If the subgenre martial arts is selected too, then it would be another root, since the action root doesn't exist for this anime.

As for the similar anime feature: everybody sees "similar" in a different way, often linked to something specific in an anime that pleased you.
A good way, though it could be a bit heavy on the system, would be to allow the user to select specific parts of the tree. Any anime that has all the selected tree parts would be considered similar for this search.

Example: I want samurai action with drama, so I select samurai and drama, and any anime that has those 2 things will appear. The ability to weed out some genres could be good too, like the ability to search for samurai drama without any action.
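
The include/exclude search described above could look roughly like the following Python sketch. The function `search`, the `catalog` dictionary, and the titles in it are made up for illustration; nothing here reflects how AniDB actually stores or queries genres.

```python
from collections.abc import Set


def search(catalog: dict[str, set[str]], required: Set[str], excluded: Set[str] = frozenset()) -> list[str]:
    """Return titles that carry every required category and none of the excluded ones."""
    return sorted(
        title
        for title, cats in catalog.items()
        if required <= cats and not (excluded & cats)
    )


# Made-up titles and categories, for illustration only.
catalog = {
    "Title 1": {"samurai", "drama", "action"},
    "Title 2": {"samurai", "drama"},
    "Title 3": {"samurai", "comedy"},
}

print(search(catalog, {"samurai", "drama"}))              # ['Title 1', 'Title 2']
print(search(catalog, {"samurai", "drama"}, {"action"}))  # ['Title 2']
```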


Let's note that this idea would also work with categories and subcategories, while it would possibly break with the common sense tree.

A % of similarities between the 2 trees could be a good way to find similar anime too, but it would most probably break with the common sense tree.
2 samurai anime, one of them without action, are really different even if only the action is missing. At the same time, 2 action samurai anime can vary quite a lot in the other stuff (comedy, romance...) and still be quite similar, most likely way more than in the previous example.

The only way to correct this would be to allow subcategories to be adaptively linked to the correct category, as I had proposed. Because of this, the strict radio button forcing you to select genres just to get the subgenre you need is awful.

The last way around this would be to slightly change the names depending on which part of the tree they belong to. For example, samurai_drama would be a subcategory of drama while samurai_action would be a subcategory of samurai. What's a bit sad about this is that it makes searching for samurai as a whole impossible without using some trick.

Devil Doll
Posts: 49
Joined: Sat Mar 26, 2005 1:29 pm

Post by Devil Doll » Fri Apr 29, 2005 2:35 pm

Raptor wrote: While I agree with the common sense tree idea (since the real tree would IMO be way better but seems rather impossible to actually do in SQL)
A tree is a set of nodes, each of which contains a pointer to its parent node and an (indexed) ID of the tree it belongs to. No problem at all to retrieve the complete tree for one ID with one SQL statement and connect the pointer structure in memory. It would be difficult to read distinct parts of the tree with high performance (because the structure information is so distributed) but I don't see this as a necessary requirement.
Raptor wrote: As for the similar anime feature: everybody sees "similar" in a different way, often linked to something specific in an anime that pleased you... Example: I want samurai action with drama, so I select samurai and drama, and any anime that has those 2 things will appear.
But that's not looking for a "similar" anime. "Similar" should provide reasonable results for those users who aren't able (or willing) to analyze why they like a certain anime. I could use a search engine in the same way as you can, but the target group for the "similar" feature is the users, not the programmers.

Raptor
Posts: 155
Joined: Mon Nov 01, 2004 11:07 pm

Post by Raptor » Sat Apr 30, 2005 7:35 pm

I think you are mixing SQL with classic languages like C or Java.

I never heard of SQL having pointers, except for indexes and the like.

The classic command in SQL is:

SELECT (whatever columns you want)
FROM (the tables you want)
WHERE (conditions the rows must match)

Pretty different from classic programming. So unless I'm wrong here, SQL doesn't have the nodes with pointers you are describing, and as EXP said, the genre names need to be different, which makes sense given the way I learned SQL, not the pointer kind of tree that Java uses.

As for the similar feature, I agree that it wouldn't really be a "similar" feature but more like a "search what I like". But I think this should be discussed after we know what kind of genre system EXP will choose.

Devil Doll
Posts: 49
Joined: Sat Mar 26, 2005 1:29 pm

Post by Devil Doll » Sat Apr 30, 2005 7:55 pm

Raptor wrote:I think you are mixing SQL with classic languages like C or Java
Surely the HTML output isn't generated directly via SQL but by some third-generation algorithmic language that is certainly capable of building trees with pointers. All this language (most likely Perl or PHP) needs as input is the set of nodes, which can be extracted from the database via the database API (Perl DBI?) by a WHERE clause on the indexed ID column.

Tree semantics for series "XYZ":

              Root "A"
             /        \
       Child         Child
         B             C
        /            /    \
  Grandchild   Grandchild Grandchild 
     D             E          F

Table representation:

SERIES | NODE | PARENT | CONTENT
=======+======+========+=========
 XYZ   |  1   |   -    | "A"
 XYZ   |  2   |   1    | "B"
 XYZ   |  3   |   1    | "C"
 XYZ   |  4   |   2    | "D"
 XYZ   |  5   |   3    | "E"
 XYZ   |  6   |   3    | "F"
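
As a rough sketch of the idea above (one query per series, then linking the nodes in memory), here is a small Python example using the standard `sqlite3` module. The table name `tree`, the index name, and the helper functions `load_tree` and `render` are assumptions made for illustration; the "-" parent of the root is stored as NULL. The thread itself suggests Perl or PHP, so Python is only a stand-in.

```python
import sqlite3

# Rows mirror the table from the post: (series, node, parent, content).
ROWS = [
    ("XYZ", 1, None, "A"),
    ("XYZ", 2, 1, "B"),
    ("XYZ", 3, 1, "C"),
    ("XYZ", 4, 2, "D"),
    ("XYZ", 5, 3, "E"),
    ("XYZ", 6, 3, "F"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (series TEXT, node INTEGER, parent INTEGER, content TEXT)")
conn.execute("CREATE INDEX tree_series ON tree (series)")  # the indexed ID column
conn.executemany("INSERT INTO tree VALUES (?, ?, ?, ?)", ROWS)


def load_tree(conn: sqlite3.Connection, series: str) -> dict:
    """Fetch all nodes of one series with a single query and link them in memory."""
    rows = conn.execute(
        "SELECT node, parent, content FROM tree WHERE series = ? ORDER BY node", (series,)
    ).fetchall()
    nodes = {node: {"content": content, "children": []} for node, _, content in rows}
    root = None
    for node, parent, _ in rows:
        if parent is None:
            root = nodes[node]
        else:
            nodes[parent]["children"].append(nodes[node])
    return root


def render(node: dict, depth: int = 0) -> list[str]:
    """Flatten the tree into indented lines, parents before children."""
    lines = ["  " * depth + node["content"]]
    for child in node["children"]:
        lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render(load_tree(conn, "XYZ"))))
```

The database is asked only once; all parent/child linking happens in the application, which matches the division of labour described in the post.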
Last edited by Devil Doll on Sat Apr 30, 2005 8:50 pm, edited 1 time in total.

Elias
Posts: 242
Joined: Tue Feb 17, 2004 4:55 pm

Post by Elias » Sat Apr 30, 2005 8:19 pm

Raptor wrote: I never heard of SQL having pointers, except for indexes and the like.
Simple: every row has its own unique id and also the id of its parent row (null for the root element). It's nothing really extraordinary. Some SQL engines also have built-in special instructions to make it even easier, like the Oracle tree extensions (CONNECT BY ... PRIOR).
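
Oracle's CONNECT BY is vendor-specific. For comparison, standard SQL offers recursive common table expressions (WITH RECURSIVE), which SQLite supports. The sketch below walks a subtree that way through Python's `sqlite3` module. The table `category`, its columns, and the function `subtree` are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [
        (1, None, "A"),  # root: parent_id is NULL
        (2, 1, "B"),
        (3, 1, "C"),
        (4, 2, "D"),
        (5, 3, "E"),
        (6, 3, "F"),
    ],
)

# Start at one node, then repeatedly join children onto the rows found so far.
SUBTREE_SQL = """
WITH RECURSIVE subtree(id, name, depth) AS (
    SELECT id, name, 0 FROM category WHERE id = ?
    UNION ALL
    SELECT c.id, c.name, s.depth + 1
    FROM category AS c
    JOIN subtree AS s ON c.parent_id = s.id
)
SELECT name, depth FROM subtree ORDER BY depth, id
"""


def subtree(conn: sqlite3.Connection, root_id: int) -> list[tuple[str, int]]:
    """Return (name, depth) for the given node and all of its descendants."""
    return conn.execute(SUBTREE_SQL, (root_id,)).fetchall()


print(subtree(conn, 3))  # [('C', 0), ('E', 1), ('F', 1)]
```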

Raptor
Posts: 155
Joined: Mon Nov 01, 2004 11:07 pm

Post by Raptor » Sat Apr 30, 2005 9:00 pm

I didn't know about Oracle having CONNECT BY, even though that's the one I was using. I'll look into it; it could be pretty useful in the future.

Thanks for the info.

As for efficiency, well, I learned SQL in a school environment, so I don't know the fine tricks.

PetriW
AniDB Staff
Posts: 1522
Joined: Sat May 24, 2003 2:34 pm

Post by PetriW » Sun May 01, 2005 6:06 am

Oracle is kinda lonely in having a CONNECT BY statement, and it also has its limits.

However, if you're gonna do searches on the data, trees can be quite expensive performance-wise, especially when deep.

Remember, it's not only about displaying the stuff.

exp
Site Admin
Posts: 2438
Joined: Tue Oct 01, 2002 9:42 pm
Location: Nowhere

Post by exp » Sun May 01, 2005 8:07 am

see also: http://www.anidb.net/forum/viewtopic.php?p=18245#18245

I was planning to use only the categories which were actually added for an anime in a search query. That means an Action::Sports (or whatever :P) anime would not be found on a search for Action if only Action::Sports is added as a category.
However, if the anime has actually enough action content to be classified as Action too, then both Action and Action::Sports would be added for the anime and thus it would be found by a search for either Action or Action::Sports.
This approach makes sure that we do not have to look at the 'tree' at all during search runs, which is a requirement for performance reasons IMHO.

BYe!
EXP
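
The search behaviour described in this post can be illustrated with a small Python sketch: only categories explicitly added to an anime are matched, and the hierarchy implied by the `::` names is never consulted. The dictionary `anime_categories`, the function `find_by_category`, and the titles are made up for illustration and say nothing about the actual AniDB implementation.

```python
# Categories explicitly added per anime (made-up titles for illustration).
anime_categories = {
    "Title 1": {"Action::Sports"},
    "Title 2": {"Action", "Action::Sports"},
    "Title 3": {"Action"},
}


def find_by_category(category: str) -> list[str]:
    """Exact match on the stored category strings; the tree is never consulted."""
    return sorted(title for title, cats in anime_categories.items() if category in cats)


print(find_by_category("Action"))          # ['Title 2', 'Title 3']
print(find_by_category("Action::Sports"))  # ['Title 1', 'Title 2']
```

"Title 1" is not found by a search for Action because only Action::Sports was added to it, which is exactly the trade-off the post accepts in exchange for cheap lookups.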

exp
Site Admin
Posts: 2438
Joined: Tue Oct 01, 2002 9:42 pm
Location: Nowhere

Post by exp » Sat May 07, 2005 8:46 pm

first version is now implemented.

BYe!
EXP

Ultima
AniDB Staff
Posts: 335
Joined: Tue Oct 01, 2002 11:13 pm
Location: GOTT Head Office, Planet Aineias

Post by Ultima » Sun May 08, 2005 12:23 am

I've added some categories to G-Taste if anyone wants to comment. I've suggested category sorting and the star images for ratings (kinda like with file quality stars) internally. And yes, for now only mods can add categories.

Elias
Posts: 242
Joined: Tue Feb 17, 2004 4:55 pm

Post by Elias » Sun May 08, 2005 7:21 am

Genre::Hentai (without extensions) is missing.

::Students::Female
::Breasts::Big
::Teachers::Female
I would prefer somewhat longer last parts (if you omit the previous parts, the short ones can lose their meaning), like this:
::Students::Female students
::Breasts::Big breasts
::Teachers::Female teachers

About layout:
- categories should be sorted by: 1st part (content, audience, genre; maybe there should be a header before each group), then weight, and lastly name. Right now it is a little messy and hard to read
- marking a category by italic/normal font is almost invisible; I would change it to normal/bold (or normal text/link (to the genre definition or similar titles))
- the stars will be graphics in the future, ne?
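
The sort order proposed in the first layout point (first part of the name, then weight, then name) could be expressed as a sort key like the one below. This is a sketch with several assumptions: the group order in `GROUP_ORDER`, sorting heavier weights first, and all sample entries are invented for the example.

```python
# Assumed display order of the first part; the post only lists the three groups.
GROUP_ORDER = {"Content": 0, "Audience": 1, "Genre": 2}


def sort_key(entry: tuple[str, int]) -> tuple[int, int, str]:
    """Order by first part of the name, then weight (highest first), then full name."""
    name, weight = entry
    group = name.split("::", 1)[0]
    return (GROUP_ORDER.get(group, len(GROUP_ORDER)), -weight, name.lower())


# Hypothetical (category, weight) pairs; names and weights are made up.
entries = [
    ("Genre::Comedy", 1),
    ("Content::Students::Female students", 2),
    ("Genre::Hentai", 3),
    ("Audience::Adult", 2),
    ("Content::Breasts::Big breasts", 3),
]

for name, weight in sorted(entries, key=sort_key):
    print(weight, name)
```

Unknown first parts sort after the three known groups, so a newly introduced group would not break the listing.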

DonGato
Posts: 1296
Joined: Sun Nov 17, 2002 9:08 pm
Location: The Pampas, The land of the Gaucho!

Post by DonGato » Sun May 08, 2005 8:33 am

A JavaScript local sort by any column should be provided for easier reading IMO.

Side comment, probably not useful and likely to be taken the wrong way, but I never expected it to be that confusing. -_-;

Ultima
AniDB Staff
Posts: 335
Joined: Tue Oct 01, 2002 11:13 pm
Location: GOTT Head Office, Planet Aineias

Post by Ultima » Sun May 08, 2005 5:49 pm

http://www.ultima-chan.com/Anidb%20Cate ... 052005.htm

That's a list of the currently entered categories. Feel free to post any adjustments, additions, etc. Just remember this is a work in progress! ;)

fahrenheit
AniDB Staff
Posts: 438
Joined: Thu Apr 08, 2004 1:43 am
Location: Portugal

Post by fahrenheit » Sun May 08, 2005 6:10 pm

DonGato wrote: A JavaScript local sort by any column should be provided for easier reading IMO.

Side comment, probably not useful and likely to be taken the wrong way, but I never expected it to be that confusing. -_-;
i agree with that
