can't autocreq file with avdump (fid:31483)

Please report any sort of feature requests or bugs on the tracker instead of the forum! http://tracker.anidb.info

Moderator: AniDB

Locked
Great Vovs
Posts: 14
Joined: Fri Nov 18, 2005 12:24 pm
Location: Moscow, Russia

Post by Great Vovs » Thu Jun 28, 2007 10:32 pm

For some reason I cannot autocreq one of my files: fid:31483.
AOM recognizes it as an AniDB file and avdump produces a dump, but I can't see any results in the AniDB database.
Here is the log:

Code:

Waiting for response from AniDB... (Ctrl-C to break)
E:\Anime Video Arhiv 2\Outlaw Star\Seihou_Bukyou_Outlaw_Star_-_23_-_(550A2D72).avi
 Hash:
  crc : 550a2d72
  ed2k: 59210dfa7c864dd56464f0dd241f1988
  md5 : 43ef1f86fc5e26a7d971d3e09a11c91e
  sha1: 587832dfaf537e52d48068a231157edfde1e4a76
  tth : d3f46c6b2jd34fqgog3pvyfmep2osjd7qgc7cta
 Duration: 00:24:00 (1440.29)
 Track #1: video
  lang: Unknown (1)
  codc: MP43 -> MS MP4x (18)
  reso: 320x240 -> 4:3
  fram: 29.97 fps
  rate: 305 kbps (304.73)
  dura: 00:24:00 (1440.25)
  size: 52.32 MB (54861086)
 Track #2: audio
  lang: Unknown (1)
  codc: 1 -> PCM (10)
  chan: 2 -> Stereo
  samp: 11025 Hz
  rate: 176 kbps (176.40)
  dura: 00:24:00 (1440.29)
  size: 30.29 MB (31758468)
 Sizes: (check sanity)
  disk: 94.47 MB (99057416)
  trac: 82.61 MB (86619554) [based on track size]
  bitr: 82.61 MB (86618502) [based on bitrate]
  tdif: 11.86 MB (12437862) 12.55%
  bdif: 11.86 MB (12438913) 12.55%
OK 0.31 - 250206
I used avdump version 0.31.
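
The "Sizes" sanity check at the end of the log can be reproduced by hand. The following is a minimal Python sketch, assuming that `trac` is the sum of the two track sizes and that `tdif` is the size on disk minus that sum. The variable names are made up for illustration and this is not avdump's actual code.

```python
# Values copied from the avdump 0.31 log above (all in bytes).
disk_size = 99057416
video_size = 54861086
audio_size = 31758468

# Assumption: "trac" is simply the sum of the per-track sizes.
track_total = video_size + audio_size

# Assumption: "tdif" is the part of the file not accounted for by the tracks.
track_diff = disk_size - track_total
track_diff_pct = 100 * track_diff / disk_size

print(track_total)  # 86619554, same as the "trac" line
print(track_diff)   # 12437862, same as the "tdif" line
print(f"{track_diff_pct:.3f}%")
```

By this arithmetic a little over 12% of the file lies outside the two tracks.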

epoximator
AniDB Staff
Posts: 379
Joined: Sun Nov 07, 2004 11:05 am

Post by epoximator » Fri Jun 29, 2007 6:07 am


Rar
AniDB Staff
Posts: 1471
Joined: Fri Mar 12, 2004 2:41 pm
Location: UK

Post by Rar » Fri Jun 29, 2007 9:46 am

Can't you just bloody escape it? It's not hard: here is what is valid CDATA, so if you need to include arbitrary bytes (like you do to record <private/>, say), just make anything outside the range 0x20-0x80 an escape sequence of some kind. For instance, C-style for 74187:
<private>\xef\xbf\xbd\\\\\x01\x01#\xef\xbf\xbd\x01=Japanese</private>
Or URL style:
<private>%EF%BF%BD%5C%5C%01%01%23%EF%BF%BD%01%3DJapanese</private>

Looks like you've already mangled that one though, as \xef\xbf\xbd decodes to \ufffd (the replacement character) as UTF-8.

Rar
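
The two escaping styles described in the post above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the printable range is taken as 0x20-0x7E (slightly narrower than the range quoted in the post), and the sample bytes are reconstructed from the escaped strings shown in the post, not read from the actual dump.

```python
from urllib.parse import quote


def c_style_escape(raw: bytes) -> str:
    """Escape bytes as printable ASCII using \\xNN and \\\\ sequences."""
    out = []
    for b in raw:
        if b == 0x5C:                # a backslash is escaped as two backslashes
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:      # assumption: printable ASCII passes through
            out.append(chr(b))
        else:                        # everything else becomes \xNN (lowercase hex)
            out.append(f"\\x{b:02x}")
    return "".join(out)


def url_style_escape(raw: bytes) -> str:
    """Percent-encode every byte except unreserved ASCII characters."""
    return quote(raw, safe="")


# Bytes reconstructed from the escaped examples in the post.
raw = b"\xef\xbf\xbd\\\\\x01\x01#\xef\xbf\xbd\x01=Japanese"

print(c_style_escape(raw))
print(url_style_escape(raw))

# The first three bytes are the UTF-8 encoding of U+FFFD.
print(raw[:3].decode("utf-8") == "\ufffd")
```

Note that the C-style variant leaves characters such as `<` and `&` untouched, so they would still need XML escaping before being placed inside an element.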

Post by epoximator » Fri Jun 29, 2007 10:48 am

no, i don't like uppercase, thank you very much

Post by Rar » Fri Jun 29, 2007 11:05 am

Tell me what flavour of C and strings you prefer, and I'll write you a safebytes func myself...

Rar

Post by epoximator » Fri Jun 29, 2007 12:26 pm

as i've said before, i prefer to fix the issues in avdump instead of guaranteeing valid xml. the main objective of avdump is to prevent hundreds of thousands of manual creqs; 14 of 114095 dumps don't really matter in that sense. the 533 dumps considered "corrupt" and/or the 52 "incoherent" ones matter more.

it would of course be possible to have valid xml AND a way to detect and mark dumps like this, but why should i add code for that on both sides when i already get what i want? just because invalid xml is evil? i know we are hosting them, but they are not linked anywhere (except in wiki).

and why would i want '<private>%EF%BF%BD%5C%5C%01%01%23%EF%BF%BD%01%3DJapanese</private>' when .32 gives '<language>Japanese</language>' ? it has to be handled at one of the ends anyway

anyway, much of this has already been fixed, but .32 is stalled for other reasons.

a func like that could come in handy nevertheless. atm avdump uses char*, wchar_t*, string, wstring, wxString, Ztring, JString and so on, ehehe

Post by Rar » Fri Jun 29, 2007 5:56 pm

My understanding is that we wanted some metadata verbatim, in which case having a means of representing arbitrary bytes is wanted. Sure, parsing some actual information out of the data is the real goal, but it's easier to get there and check you're doing it right if you've got the strangely encoded strings in unscrewable form.
Keeping the KBs of junk people put in their headers is clearly not needed though...

Rar
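
The point about keeping strangely encoded strings in a reversible form can be illustrated by undoing a C-style escape and inspecting the recovered bytes. This is a rough Python sketch, assuming the only escapes in use are `\xNN` and a doubled backslash; `c_style_unescape` is a hypothetical helper, not part of avdump.

```python
import re

# Matches either \xNN (two hex digits) or an escaped backslash.
_ESCAPE = re.compile(rb"\\(?:x([0-9a-fA-F]{2})|(\\))")


def c_style_unescape(text: str) -> bytes:
    """Turn a C-style escaped ASCII string back into the original bytes."""
    def replace(match):
        if match.group(1) is not None:
            return bytes([int(match.group(1), 16)])  # \xNN -> single byte
        return b"\\"                                 # \\ -> single backslash
    return _ESCAPE.sub(replace, text.encode("ascii"))


# The escaped <private> payload quoted earlier in the thread.
escaped = r"\xef\xbf\xbd\\\\\x01\x01#\xef\xbf\xbd\x01=Japanese"
recovered = c_style_unescape(escaped)

# With the raw bytes back in hand, different interpretations can be tried.
print(recovered[:3].decode("utf-8") == "\ufffd")
print(recovered[-8:].decode("ascii"))
```

Because the escaping loses nothing, any later parser can be checked against the exact bytes that were in the file header.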
