[EDIT][TITLE] Pikachu's Summer Festival

old granted and denied CREQs

Moderator: AniDB

egg
Posts: 769
Joined: Tue Nov 11, 2003 7:17 am

Post by egg » Tue Jul 19, 2005 6:38 pm

I need help getting the Kanji and Romaji Names for Pikachu's Summer Festival. Here is the screencap.

Image

Andemon
Posts: 117
Joined: Thu Oct 14, 2004 4:12 pm

Post by Andemon » Tue Jul 19, 2005 7:02 pm

I haven't progressed far with my japanese studies, but that seems easy enough...

ピカチュウのなつまつり
pikachu(u) no natsumatsuri

^
|

Is that all you need?

Rar
AniDB Staff
Posts: 1471
Joined: Fri Mar 12, 2004 2:41 pm
Location: UK

Post by Rar » Tue Jul 19, 2005 7:25 pm

If we're being picky, I'd use caps for non-particles and break the noun phrase in the romanisation: Pikachuu no Natsu Matsuri - and you missed using pretty colours!
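
A minimal sketch of the capitalisation half of that suggestion, in Python. The particle list is an assumption (a short, incomplete set picked for illustration) and the function name is hypothetical. It does not decide where the spaces go, which is the hard part.

```python
# Assumption: a small, incomplete set of particles, enough for this title.
PARTICLES = {"no", "wa", "ga", "o", "ni", "de", "to", "e", "mo"}


def title_case_romaji(text: str) -> str:
    """Capitalise every space-separated word except the listed particles."""
    words = text.lower().split()
    return " ".join(
        w if w in PARTICLES else w[:1].upper() + w[1:]
        for w in words
    )
```

For the title here, `title_case_romaji("pikachuu no natsu matsuri")` gives `Pikachuu no Natsu Matsuri`. A plain lookup like this will get some titles wrong, since a string such as `to` can also be an ordinary noun.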

Rar

Rafal
Posts: 39
Joined: Fri Jan 07, 2005 2:46 am

Post by Rafal » Tue Jul 19, 2005 8:54 pm

I would write Pikachuu no Natsumatsuri. 'Natsumatsuri' is one single word in Japanese so I think it should also be written as one word.

Rar
AniDB Staff
Posts: 1471
Joined: Fri Mar 12, 2004 2:41 pm
Location: UK

Post by Rar » Tue Jul 19, 2005 9:16 pm

Define word, state how you decide that なつまつり is one, and give a general rule that can be used to insert spaces in all Japanese noun phrases. Go on, dare ya.

Rar

Rafal
Posts: 39
Joined: Fri Jan 07, 2005 2:46 am

Post by Rafal » Tue Jul 19, 2005 10:04 pm

Well, my rule is that I can find the word in the dictionary. :) Try looking up the 'word' "Summer Festival" in dictionary.com for instance, and you will have a hard time finding it because it's not one word in English. In German it would be "Sommerfest", which is one word, and can thus be found in the dictionary. Perhaps not a linguistically satisfying answer for you, but well, I'm not a linguist. ^^;

But now a question for you, what do you think about the 'noun phrases' in the following title: Natsuiro no Sunadokei. Why isn't it "Natsu Iro"? (summer color) And why isn't it "Suna Dokei"? (Sand + Clock = Hourglass)

Oh, and one more question, could you give me an example of a noun phrase in Japanese where spaces are definitely needed?

Rar
AniDB Staff
Posts: 1471
Joined: Fri Mar 12, 2004 2:41 pm
Location: UK

Post by Rar » Tue Jul 19, 2005 11:02 pm

Ehe, if you want a 'reason' why that h-game adaptation is spaced like that, it's because the fansubbers gave the title like that in the filename, no doubt. You'll notice the database is far from consistent on spacing issues.

As you brought up German, it's worth mentioning that it does pretty well on endless compounding: Donaudampfschiffahrtsgesellschaftskapitaenskajuetenschluesselloch, which I got from a website, or even just neunhundertneunundneunzigtausendneunhundertneunundneunzig.

As for which 'dictionary words' could do with a little breaking, I can't manage anything quite as good as the German offhand, but just whacking ちょう into an online dic and seeing what came up:
長距離電話会社 【ちょうきょりでんわがいしゃ】 (n) long-distance phone company
超低金利金融政策 【ちょうていきんりきんゆうせいさく】 (n) ultra-loose monetary policy
超電導磁気浮上式鉄道 【ちょうでんどうじきふじょうしきてつどう】 (n) superconducting maglev train
Or if we allow proper names, I quite liked this one:
朝鮮民主主義人民共和国 【ちょうせんみんしゅしゅぎじんみんきょうわこく】 (n) Democratic People's Republic of Korea (North Korea); DPRK; (P)
Include technical and scientific terms, and you can go on forever.

This isn't entirely idle banter, as anime is full of stuff that's clumsily translated into English as 'Great Dragon Iron Fist Fireball Attack' and equally silly phrases. Though not 'dictionary words', they're really no different to sensible noun phrases you would find in the dictionary.

Rar

Rafal
Posts: 39
Joined: Fri Jan 07, 2005 2:46 am

Post by Rafal » Tue Jul 19, 2005 11:59 pm

Ok, so there are indeed some entries in dictionaries which need some spacing if you're going to romanize them. Taking the first one of your examples, I would romanize that one as follows:

長距離電話会社 --> Chou-Kyori Denwagaisha = long-distance phone company

I'm pretty sure 'Denwagaisha' should be written as one single word here, otherwise it would have been 'Denwa Kaisha' instead. And Chou-Kyori and Denwagaisha need spacing between them because when you say it out loud in Japanese there's a small pause or something after the Kyori. At least you can definitely 'feel' that it's not one long word.

Hmm, it's kinda hard for me to explain why, it probably has something to do with 'pitch accents' or how the 'word' sounds and flows in Japanese, but I can't think of 'Natsumatsuri' as anything but one single word. 'Natsu Matsuri' just looks wrong to my eyes, just like 'Denwa Gaisha', or 'Natsu Iro' would. Perhaps I'll ask a Japanese linguist about this tomorrow, I've gotten kind of curious. ^^

egg
Posts: 769
Joined: Tue Nov 11, 2003 7:17 am

Post by egg » Wed Jul 20, 2005 12:05 am

Rar wrote:If we're being picky, I'd use caps for non-particles and break the noun phrase in the romanisation: Pikachuu no Natsu Matsuri - and you missed using pretty colours!
Well, I went with Rar's, except I used Pikachu, since that is the way it is spelled in English literature and it probably falls under the loanword clause (ironically). So I put in: 'Pikachu no Natsu Matsuri'. Anyone who thinks otherwise can try to convince Rar to change it.

Rar
AniDB Staff
Posts: 1471
Joined: Fri Mar 12, 2004 2:41 pm
Location: UK

Post by Rar » Wed Jul 20, 2005 4:01 am

egg wrote:except I used Pikachu, since that is the way it is spelled in English literature
I was wracking my brains then trying to remember if Pikachu was a character in Wuthering Heights or Great Expectations.
Er.. but seriously, isn't the name always given as ピカチュウ rather than in roman letters in Japanese? There's no need to drop the long vowel just because the American versions do. Most pokemon names are changed completely, and romaji fields should still have the Japanese transcription rather than the American translation.
Note this is different to this case, where a roman spelling is provided for a loanword/name, but lines can get blurry.


Back to Rafal briefly: it seems to me some elements of this discussion are reasonably constant across languages while others are less so. The Wikipedia rules for breaking words seem to boil down to minimal semantic units vs. rhythm of speech. Obviously if you take either too religiously you get silly results.
A bluebird is not always a blue bird, and as mentioned a 砂時計 is not as such 砂・時・計, though there is some sense in breaking at every character in many Japanese noun phrases, as most kanji do have a semantic involvement.
On the other hand, if you try to go purely on spoken usage you run into the problem that a language is anything but a set-in-stone standard. What might 'feel right' to you may be seen as outright wrong by other speakers of the language; the intonation and pauses you might think natural are by no means universal.
Anyway, dictionaries are quite a useful snapshot of how you might define 'words' at a moment in time in a certain dialect, and I suggested something broadly similar to what you did, it's nice arguing against my own proposition to find breaking cases.

As for your splitting of 長距離電話会社, I think you shouldn't have been a chicken and just done the shortest one. :D
Anyway, seems reasonable, but I don't like Prefix-Word much, though people are used to the Word-suffix format. And as this is just dog transcription, being able to parse it matters as much as some abstract 'correctness' - 'correct' is to write it in Japanese.

Anyway, I think the ultimate solution is some flashy titles handling with markup, but for the moment it's just a case of trying to give sensible options. You'll see, I hope, why there are no guidelines as yet: this is like the coastline-mapping problem, the closer you look the longer a job it gets.

Rar

rowaasr13
Posts: 415
Joined: Sat Sep 27, 2003 4:57 am

Post by rowaasr13 » Fri Jul 22, 2005 9:24 am

"Try to limit word to shortest possible combination of kanji with on-yomi without leaving one dangling kanji and treat each kanji with kun-yomi as separate word, unless there's consonant shift" should work, I think.

So "natsu iro" and "natsu matsuri" should be split, and "sunadokei" or "choukyori" should be not.
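
One way to read that rule as code, as a rough Python sketch. Everything it needs to know is supplied by hand: each unit carries its romaji reading, an "on" or "kun" label, and a flag for the consonant shift. The names `Unit` and `split_by_reading` are hypothetical, and the pairing of on-yomi kanji (pairs, with a leftover single folded into the pair before it) is an assumed interpretation of "shortest possible combination without leaving one dangling kanji".

```python
from typing import List, Tuple

# One unit per kanji (or per irregular multi-kanji reading):
# (romaji reading, "on" or "kun", True if its first consonant shifted).
# All labels are supplied by hand; nothing here looks readings up.
Unit = Tuple[str, str, bool]


def split_by_reading(units: List[Unit]) -> str:
    chunks: List[List[Unit]] = []
    i = 0
    while i < len(units):
        if units[i][1] == "kun":
            # Each kun-yomi unit starts out as its own word.
            chunks.append([units[i]])
            i += 1
            continue
        # Collect the run of consecutive units not labelled "kun"
        # (anything else is treated as on-yomi).
        j = i
        while j < len(units) and units[j][1] != "kun":
            j += 1
        run = units[i:j]
        # Group into pairs, folding a dangling single into the pair before it.
        groups = [run[k:k + 2] for k in range(0, len(run), 2)]
        if len(groups) > 1 and len(groups[-1]) == 1:
            groups[-2].extend(groups.pop())
        chunks.extend(groups)
        i = j

    words: List[str] = []
    for chunk in chunks:
        text = "".join(reading for reading, _, _ in chunk)
        if chunk[0][2] and words:
            # A consonant shift glues the chunk onto the previous word.
            words[-1] += text
        else:
            words.append(text)
    return " ".join(words)
```

With `natsu` and `matsuri` both labelled "kun" and unshifted, the result is `natsu matsuri`; with `suna` followed by a shifted `dokei` entered as a single unit, it is `sunadokei`; three unshifted "on" units `chou`, `kyo`, `ri` come out as `choukyori`.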

Rafal
Posts: 39
Joined: Fri Jan 07, 2005 2:46 am

Post by Rafal » Sun Jul 24, 2005 10:37 pm

All right, I did a little research, asked around and my findings were:
  • Like Rar said earlier it's hard to define what exactly a word is in Japanese and there's no absolute and definite way to find out. However, currently one common and somewhat accepted method is indeed the 'dictionary entry'. The entries which Rar found btw are not found in any of the 'great' J-J dictionaries (Koujien, Daijirin and Daijisen) and are thus not regarded as single words according to this method.
  • Most Japanese people think of "Natsumatsuri" as one word.
  • Both my own J-E dictionary and my two kanji dictionaries (the New Nelson and Kanji & Kana: A Handbook of the Japanese Writing System) write 夏祭り as one single word in romaji.

    The second book also has an interesting paragraph about romanization and how to define what a 'word' is in Japanese:

    "The only real problem in romanizing Japanese text, in which there are no spaces between words, is in deciding where one word ends and the next begins. There are no universal rules for this, but, as a basic principle, components which are perceived to be independent units are written separately: Hon o sagashite iru n desu. Hyphenation is used for various suffixes and other word units that one does not want to run together but does not want to write separately: Toukyou-to, Minato-ku, Endou-san. For readability, long compounds are broken up into smaller units: Nihon Shoki, kaigai ryokou, minshu shugi."
  • Google search result
    夏祭り "natsu matsuri" = 90 hits
    夏祭り natsumatsuri = 2880 hits
You can draw your own conclusions.
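
To put the two hit counts side by side, a small Python sketch. The figures are the ones quoted above; the variable names are made up for illustration, and the calculation says nothing about whether a sample this size is meaningful.

```python
# Hit counts as quoted in the post above.
hits = {"natsu matsuri": 90, "natsumatsuri": 2880}

total = sum(hits.values())
# Share of all hits taken by each spelling.
shares = {spelling: count / total for spelling, count in hits.items()}
# How many times more often the one-word spelling turned up.
ratio = hits["natsumatsuri"] / hits["natsu matsuri"]
```

Here `ratio` is 32.0, and the one-word spelling accounts for roughly 97% of the combined hits.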

Rar
AniDB Staff
Posts: 1471
Joined: Fri Mar 12, 2004 2:41 pm
Location: UK

Post by Rar » Mon Jul 25, 2005 10:50 pm

rowaasr13's suggestion is quite fun, well worth listing as an idea. Though I can think of some cases where dakuten are added but perhaps a space would still be warranted, noun phrases beginning 二人 for instance.

More interestin' stuffs from rafal, I'm sure you can find some much longer 'words' in the j-j dicts than my hasty internet check managed though, if you have a good poke. Really, I'd prefer to see what these textbooks say on the topic of prefixes, something I've seen very little on as opposed to the established word-suffix.

Finally, one point that needs making from experience with wikipedia, google fights that don't involve hundreds or at least tens of thousands of results are almost entirely worthless when trying to prove something. The internet is a very poor sample of language use, but is very good at propagating errors widely. Particularly trying to get an idea of how japanese use romaji from the web is a pointless exercise (read: however they damn well like, they read their own scripts so it's not like romaji correctness matters). And final point, google.nl and google.something_else will return different results, if you want japanese pages co.jp is probably slightly preferable.

Rar

Rafal
Posts: 39
Joined: Fri Jan 07, 2005 2:46 am

Post by Rafal » Wed Jul 27, 2005 5:07 pm

Rar wrote: More interestin' stuffs from rafal, I'm sure you can find some much longer 'words' in the j-j dicts than my hasty internet check managed though, if you have a good poke.
Perhaps, but I don't see why you can't use this as a general rule though. Long entries can always be looked at separately.

Btw, 長距離電話(ちょうきょりでんわ) does appear in the Daijirin and is romanized as "Choukyoridenwa" in my j-e romaji dic and as "Choukyori Denwa" in the New Nelson. I personally would say the second one looks better because of reasons I have already mentioned (so this would indeed be an exception). The hyphen I used earlier isn't really needed.
Rar wrote: Really, I'd prefer to see what these textbooks say on the topic of prefixes, something I've seen very little on as opposed to the established word-suffix.
I can't find much of it either, but I think it depends on the prefix and the word. For instance for the politeness prefix 御 ("o", or "go") hyphenation is usually preferred when the prefix is not considered an integral part of the word itself: O-genki desu ka?. And when the prefix has become a part of the word as in nouns like 'otaku', 'ojou-san' or 'gohan', the general rule is to write them without a hyphen.
Rar wrote: Finally, one point that needs making from experience with wikipedia, google fights that don't involve hundreds or at least tens of thousands of results are almost entirely worthless when trying to prove something. The internet is a very poor sample of language use, but is very good at propagating errors widely. Particularly trying to get an idea of how japanese use romaji from the web is a pointless exercise (read: however they damn well like, they read their own scripts so it's not like romaji correctness matters). And final point, google.nl and google.something_else will return different results, if you want japanese pages co.jp is probably slightly preferable.
I've used google.nl/.com/.co.jp and they all gave similar results, so I did think about that before posting those results (the default I use is .nl).

As for 'proving' anything with this, well there isn't really anything to prove as there are no universal rules for this. All you can do is look at what literature and linguists have to say about this, how other people do it and draw your own conclusions and create your own rules from there. I'm just seeing that in most (all?) Hepburn-based Japanese (text)books and romaji dictionaries 夏祭り is romanized as one word, so I'm more inclined to follow their example and write it as one single word as well.

Rafal
Posts: 39
Joined: Fri Jan 07, 2005 2:46 am

Post by Rafal » Fri Jul 29, 2005 2:32 am

Sorry for the double post, forgot to reply to rowaasr's post. ^^;
rowaasr13 wrote:"Try to limit word to shortest possible combination of kanji with on-yomi without leaving one dangling kanji and treat each kanji with kun-yomi as separate word, unless there's consonant shift" should work, I think.
This doesn't seem like a very good idea, for instance with this 'rule' you'd end up writing the Japanese word for 'wheelchair' 車椅子(くるまいす) as 'Kuruma Isu'. Or the word 仲間(なかま) as 'Naka Ma'. I think nobody with any knowledge of Japanese would want to romanize these words like that. ;)

I propose to just write everything that can be found in the dictionary as one word, unless the word is very long or hard to read (for instance the earlier mentioned 'Choukyori Denwa').
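
A rough Python sketch of that proposal, under stated assumptions: the "dictionary" is whatever set of romaji entries you pass in, "very long" is approximated by a character limit (`max_len`, an arbitrary number chosen here), and the input is already cut into its smallest parts. The function name is hypothetical.

```python
from typing import List, Set


def join_dictionary_words(parts: List[str], dictionary: Set[str], max_len: int = 12) -> str:
    """Greedily join adjacent parts that form a dictionary entry of at most max_len letters."""
    words: List[str] = []
    i = 0
    while i < len(parts):
        # Try the longest candidate starting at part i first.
        for j in range(len(parts), i, -1):
            candidate = "".join(parts[i:j])
            is_single_part = (j - i == 1)
            if is_single_part or (candidate in dictionary and len(candidate) <= max_len):
                words.append(candidate)
                i = j
                break
    return " ".join(words)
```

With a toy dictionary containing `natsumatsuri`, `choukyori` and `choukyoridenwa`, the parts `natsu`, `matsuri` come out as `natsumatsuri`, while `chou`, `kyori`, `denwa` come out as `choukyori denwa`, because the 14-letter entry is over the limit. Where exactly to set the limit is the judgment call the rule leaves open.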
