can't autocreq file with avdump (fid:31483)

Post by **Great Vovs** » Thu Jun 28, 2007 10:32 pm

For some reason I cannot autocreq one of my files: fid:31483.
AOM recognizes it as an AniDB file and avdump produces a dump, but I can't see any results in the AniDB database.
Here is the log:

Waiting for response from AniDB... (Ctrl-C to break)
E:\Anime Video Arhiv 2\Outlaw Star\Seihou_Bukyou_Outlaw_Star_-_23_-_(550A2D72).avi
 Hash:
  crc : 550a2d72
  ed2k: 59210dfa7c864dd56464f0dd241f1988
  md5 : 43ef1f86fc5e26a7d971d3e09a11c91e
  sha1: 587832dfaf537e52d48068a231157edfde1e4a76
  tth : d3f46c6b2jd34fqgog3pvyfmep2osjd7qgc7cta
 Duration: 00:24:00 (1440.29)
 Track #1: video
  lang: Unknown (1)
  codc: MP43 -> MS MP4x (18)
  reso: 320x240 -> 4:3
  fram: 29.97 fps
  rate: 305 kbps (304.73)
  dura: 00:24:00 (1440.25)
  size: 52.32 MB (54861086)
 Track #2: audio
  lang: Unknown (1)
  codc: 1 -> PCM (10)
  chan: 2 -> Stereo
  samp: 11025 Hz
  rate: 176 kbps (176.40)
  dura: 00:24:00 (1440.29)
  size: 30.29 MB (31758468)
 Sizes: (check sanity)
  disk: 94.47 MB (99057416)
  trac: 82.61 MB (86619554) [based on track size]
  bitr: 82.61 MB (86618502) [based on bitrate]
  tdif: 11.86 MB (12437862) 12.55%
  bdif: 11.86 MB (12438913) 12.55%
OK 0.31 - 250206
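
For anyone reading the "Sizes" block: a minimal sketch of the arithmetic, assuming `tdif` is simply the on-disk size minus the sum of the track sizes and that the percentage is taken relative to the on-disk size. That assumption reproduces the numbers in this log, but it is not taken from avdump's source, and `size_sanity` is a made-up name.

```python
def size_sanity(disk_bytes, track_bytes):
    """Return (sum of track sizes, bytes not accounted for, that gap as % of disk size)."""
    trac = sum(track_bytes)
    tdif = disk_bytes - trac
    return trac, tdif, 100.0 * tdif / disk_bytes

# Values copied from the log above: disk size, then video and audio track sizes.
trac, tdif, pct = size_sanity(99057416, [54861086, 31758468])
print(trac, tdif, f"{pct:.2f}%")
```

The byte counts match the `trac` and `tdif` lines exactly. The percentage comes out a hair above the 12.55% in the log, so avdump presumably truncates rather than rounds.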

I used avdump version 0.31.

Post by **epoximator** » Fri Jun 29, 2007 6:07 am

http://wiki.anidb.info/w/Avdump/Issues#Invalid_XML

Post by **Rar** » Fri Jun 29, 2007 9:46 am

Can't you just bloody escape it? It's not hard. Here is what is valid CDATA, so if you need to include arbitrary bytes (like you do to record <private/>, say), just make anything outside the range 0x20-0x80 an escape sequence of some kind. For instance, C-style for 74187:
<private>\xef\xbf\xbd\\\\\x01\x01#\xef\xbf\xbd\x01=Japanese</private>
Or URL style:
<private>%EF%BF%BD%5C%5C%01%01%23%EF%BF%BD%01%3DJapanese</private>
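
A minimal sketch of both styles in Python, to show how little code this takes. The names `c_escape` and `url_escape` are made up for illustration, and `raw` is the byte string implied by the two examples above. Two choices here are assumptions of the sketch, not something stated in the thread: only printable ASCII (0x20-0x7E) passes through, and `<`, `>` and `&` are escaped as well so the result is safe as XML text.

```python
from urllib.parse import quote

def c_escape(data: bytes) -> str:
    """C-style: printable ASCII passes through, backslash is doubled, the rest becomes \\xNN."""
    out = []
    for b in data:
        if b == 0x5C:                                   # backslash: double it so decoding is unambiguous
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E and chr(b) not in "<>&":
            out.append(chr(b))                          # printable ASCII that is harmless in XML text
        else:
            out.append(f"\\x{b:02x}")                   # everything else as a two-digit hex escape
    return "".join(out)

def url_escape(data: bytes) -> str:
    """URL-style: percent-encode everything except ASCII letters, digits and _.-~"""
    return quote(data, safe="")

# Bytes reconstructed from the two escaped examples above.
raw = b"\xef\xbf\xbd\\\\\x01\x01#\xef\xbf\xbd\x01=Japanese"
print("<private>" + c_escape(raw) + "</private>")
print("<private>" + url_escape(raw) + "</private>")
```

Both calls reproduce the two `<private>` lines above character for character.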

Looks like you've already mangled that one though, as \xef\xbf\xbd decodes to \ufffd (the Unicode replacement character) when read as UTF-8.
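
A quick check of that in Python, plus one way such a sequence can come about. The input byte `0x93` is purely hypothetical; what the file header originally contained is not known from the dump.

```python
mangled = b"\xef\xbf\xbd"

# The three bytes are the UTF-8 encoding of U+FFFD, the replacement character.
decoded = mangled.decode("utf-8")
print(hex(ord(decoded)))

# Hypothetical origin: a byte that is not valid UTF-8 gets decoded leniently,
# is replaced by U+FFFD, and is then written back out as UTF-8.
original = b"\x93"  # assumed for illustration only
lossy = original.decode("utf-8", errors="replace").encode("utf-8")
print(lossy == mangled)
```

Once that substitution has happened the original byte is gone, which is why escaping has to be applied to the raw bytes before any lenient decoding step.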

Rar

Post by **epoximator** » Fri Jun 29, 2007 10:48 am

no, i don't like uppercase, thank you very much

Post by **Rar** » Fri Jun 29, 2007 11:05 am

Tell me what flavour of c and strings you prefer, I'll write you a safebytes func myself...

Rar

Post by **epoximator** » Fri Jun 29, 2007 12:26 pm

as i've said before, i prefer to fix the issues in avdump instead of guaranteeing valid xml. the main objective of avdump is to prevent hundreds of thousands of manual creqs; 14 of 114095 dumps don't really matter in that sense. the 533 dumps considered "corrupt" and/or the 52 "incoherent" matter more.

it would of course be possible to have valid xml AND a way to detect and mark dumps like this, but why should i add code for that on both sides when i already get what i want? just because invalid xml is evil? i know we are hosting them, but they are not linked anywhere (except in wiki).

and why would i want '<private>%EF%BF%BD%5C%5C%01%01%23%EF%BF%BD%01%3DJapanese</private>' when .32 gives '<language>Japanese</language>' ? it has to be handled at one of the ends anyway

anyway, much of this has already been fixed, but .32 is stalled for other reasons.

a func like that could come in handy nevertheless. atm avdump uses char*, wchar_t*, string, wstring, wxString, Ztring, JString and so on, ehehe

Post by **Rar** » Fri Jun 29, 2007 5:56 pm

My understanding is that we wanted some metadata verbatim, in which case having a means of representing arbitrary bytes is wanted. Sure, parsing some actual information out of the data is the real goal, but it's easier to get there and check you're doing it right if you've got the strangely encoded strings in unscrewable form.
Keeping the KBs of junk people put in their headers is clearly not needed though...
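
To illustrate the "verbatim" point, a small round-trip sketch in Python: if the bytes are percent-escaped on the way in, the exact original bytes can be recovered on the other end, whatever encoding they were in. The helper names `to_safe` and `from_safe` are invented for this example and are not part of avdump.

```python
from urllib.parse import quote, unquote_to_bytes

def to_safe(data: bytes) -> str:
    """Percent-escape arbitrary bytes into plain ASCII text."""
    return quote(data, safe="")

def from_safe(text: str) -> bytes:
    """Reverse to_safe and return the original bytes."""
    return unquote_to_bytes(text)

# Every possible byte value survives the round trip unchanged.
every_byte = bytes(range(256))
print(from_safe(to_safe(every_byte)) == every_byte)

# Decoding the URL-style example from earlier in the thread.
recovered = from_safe("%EF%BF%BD%5C%5C%01%01%23%EF%BF%BD%01%3DJapanese")
print(recovered.endswith(b"=Japanese"))
```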

Rar