codec identifiers in anidb

Posted: Tue Mar 29, 2005 10:07 pm
by Rar
Err... this was just a quick suggestion to change some of the existing codec identifiers in anidb to make more sense, but it has spiralled into a big overview of video encoding. I'm not an expert; if I've made mistakes, tell me. Blanks are where I've been lazy.

Video codec identifiers in anidb

1>unknown Used if the codec is not known (generally due to lazy file adders or people who don't know how to deal with non-avi containers), or if the codec is known but not in the list.
2>DivX UNK If you somehow know it's DivX, but not the version, this might be used. Seems difficult to believe you'd know it was DivX and not some other ASP codec, but not know the version.
9>MPG1 (vcd) From wikipedia: "VCD display resolution is 352x240 pixels (NTSC) or 352x288 pixels (PAL), approximately one quarter of full TV resolution. VCD video is in MPEG-1 format; audio is encoded as MPEG Layer 2 (MP2); video is stored at 1150 kilobits per second, audio at 224 kbit/s." Basically a specific limitation of the MPEG standard, which can be directly ripped to file.
10>MPG2 (svcd) From wikipedia: "SVCDs store digital video in MPEG-2 format at a resolution of 480x480 pixels (for NTSC) or 480x576 pixels (for PAL)... Video may be encoded at a variable bit rate, up to 2.6 megabits per second. Audio is stored in MPEG Layer 2 format, with a bit rate varying from 128 to 384 kilobits per second." Again, a specific limitation of the MPEG-2 standard.
11>MPG4 MPEG-4 is a collection of standards that covers lots of things beyond video. Most of the codecs in this list are based on MPEG-4, so this is used when the file uses one that's not in the list (e.g. lavc).
18>MS MP4x The non-standard implementation of MPEG-4 Part 2 by Microsoft, precursor to the current wmv format. It was reverse engineered to create DivX ;-)
12>ASF This is a Microsoft patented container for their wmv and wma streams, not a video codec.
13>MOV This is a Quicktime container format that can hold a wide variety of data, not a video codec.
14>RM (also ram) RM is a RealNetworks container, not a video codec. This is generally used for versions of RealVideo prior to 9.
20>RV9 From Doom9: "RealVideo 9 and 10 are video codecs developed by RealNetworks, with MMX/SSE2/SSE3 optimizations from Intel, and AltiVec (OS X) optimizations from RealNetworks. RV9 and 10 are bitstream compatible, RV 10 is an encoder side improvement to gain quality by letting the encoder spend more time making wise decisions.... RV9/10 is not based on wavelets, and not on fractals, and does not include inherent pre-filtering. Which technology is its foundation? Since it is not open source and not a standard, I can't say, but it's nothing magic. It was not optimized for low bitrates, but to avoid compression artifacts at any bitrate." Generally used by French, Québécois, and other fools.
15>Vivo From wikipedia: "The Vivo platform was a known player when streaming media was in its infancy and was deployed mainly on erotic sites during the mid 1990s... Vivo format is based upon H.263 codec. It uses inter-frame coding, but does not insert any key frames, except at the beginning of the clip, which effectively disables the possibility of seeking."
21>IV5 Perhaps meant to be Indeo 5, an early VfW codec. Part of a long series of codecs by Intel, primarily used for streaming.
16>DVD I presume this is some kind of MPEG-2 misunderstanding; there is no 'dvd' video codec.
19>WMV9 (also WMV3)

Some rework needed, perhaps:
'Unknown/Other' <- Unknown, ASF, MOV
'Legacy' <- MS MP4x, RM (also ram), IV5, Vivo (and h263, cinepak, etcetc)
'MPEG-1' <- MPG1 (vcd)
'MPEG-2' <- DVD, MPG2 (svcd)
'ASP Other' <- DivX UNK, MPG4 (and lavc, 3ivx, etcetc)
'RealVideo 9/10'
'Windows Media Video'
What to identify separately and what to combine is a fun issue, but hey. Future additions, depending on encoder usage, might be Dirac, VP7, Theora, etc.
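
For what it's worth, the regrouping proposed above could be written down as a simple lookup, sketched here in Python. The group names and members come from the list above. Putting RV9 under 'RealVideo 9/10' and WMV9 under 'Windows Media Video' is an assumption, since the list doesn't spell those out, and `regroup` is just a made-up helper name.

```python
# Proposed group -> existing anidb video codec entries, as listed above.
PROPOSED_GROUPS = {
    "Unknown/Other": ["unknown", "ASF", "MOV"],
    "Legacy": ["MS MP4x", "RM", "IV5", "Vivo"],
    "MPEG-1": ["MPG1"],
    "MPEG-2": ["DVD", "MPG2"],
    "ASP Other": ["DivX UNK", "MPG4"],
    "RealVideo 9/10": ["RV9"],        # assumption: not spelled out in the list
    "Windows Media Video": ["WMV9"],  # assumption: not spelled out in the list
}

# Invert the table so an old entry can be looked up directly.
OLD_TO_NEW = {
    old: group
    for group, olds in PROPOSED_GROUPS.items()
    for old in olds
}


def regroup(old_name):
    """Return the proposed group for an old entry.

    Entries the proposal does not mention (DivX3, XviD, ...) keep their name.
    """
    return OLD_TO_NEW.get(old_name, old_name)
```

Entries that the proposal leaves alone simply pass through unchanged, so the same function could be run over every file in the database.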

Audio codec identifiers in anidb

1>unknown If you don't know, don't care, or don't have the option.
2>AC3 Dolby Digital audio.
3>DivX Audio An early hack of Microsoft's WMA codec; the name would correctly include the ;-)
4>MP3 UNK Reasonably pointless.
5>MP3 Used for CBR mp3 audio. Specified in MPEG-1 Part 3 Layer 3.
6>MP3 VBR Used for VBR mp3 audio.
7>MSAudio Some other WMA type thing?
8>Ogg Vorbis The mighty ogg.
9>AAC Specified in MPEG-2 Part 7, and extended in MPEG-4 Part 3.
10>PCM Uncompressed digital audio.

Mostly fine, a few suggestions:
-Rename 'Ogg Vorbis' to 'Vorbis'
-Rename 'DivX Audio' to 'WMA [and DivX ;-) Audio]'
-Remove 'MP3 UNK' and move any files with it to 'Unknown'
-Rename 'MP3' to 'MP3 CBR'
-Add 'MP2'
Seems likely additions may be needed in the future depending on the uptake of other audio formats.

A chart of stuff

-MPEG-1 (1992) (wikipedia)
--MPEG-1 Part 2 (Video)
---SDL MPEG (SMPEG) (wikipedia)
---VCD (1993) (wikipedia)
--MPEG-1 Part 3 (Audio)
---MPEG-1 Part 3 Layer 2 (MP2)
---MPEG-1 Part 3 Layer 3 (MP3) (wikipedia)

-MPEG-2 (1994) (wikipedia)
--MPEG-2 Part 2 (Video)
---DVD (wikipedia)
---SVCD (wikipedia)
--MPEG-2 Part 3 (Audio)
--MPEG-2 Part 7 (Audio)
---AAC (wikipedia)

-MPEG-4 (1998) (wikipedia) (Doom9)
--MPEG-4 Part 2 (Video)
---Advanced Simple Profile (ASP)
----in ffmpeg (libavcodec ie lavc)
----many others
---Microsoft based non-standard implementations
----DivX ;-) (DivX3)
----Windows Media Video 7 (WMV1)
----Windows Media Video 8 (WMV2)
----Windows Media Video 9 (WMV3)
--MPEG-4 Part 3 (Audio)
--MPEG-4 Part 10 (also known as AVC or H.264) (May 2003)
---NeroDigital AVC
---many others

-RealNetworks (wikipedia)
--RealVideo
---In player version 4 and 5, by rv10.dll (Feb 1997)
---'G2' in player version 6 and 7, by rv20.dll (Apr 1999)
---In player version 8, by rv30.dll (May 2000)
---In player version 9 and 10, by rv40.dll (Apr 2002) (Doom9)
--RealAudio hodgepodge (wikipedia)

Posted: Wed Mar 30, 2005 6:00 pm
by Amour
> -Rename 'MP3' to 'MP3 CBR'

No need. I prefer 'MP3' alone.

And I vote for the deletion of 'DivX UNK'. If you don't know exactly what it is, then leave it unknown.

Posted: Thu Mar 31, 2005 4:20 pm
by nwa
what about DTS audio?

Posted: Thu Mar 31, 2005 10:38 pm
by Rar
nwa: Dunno. Should only add it if a reasonable number of releases use it. Wiki seems to suggest a bitrate of 1.5 megabits/second, which seems somewhat large for p2p.

Amour: The problem is that 'MP3' is a superset of 'MP3 VBR', and it's not very clear what that entry means unless you notice the other one. The main reason it's useful to differentiate at all is the dodgy avi handling of vbr audio.



Posted: Mon Jul 25, 2005 12:43 am
by Isochroma
I did an encoding of Wonderful Days... original audio stream was 768kbps DTS. No transcoding, just straight transmux into mkv.

As for filetypes, how about just using the FourCC for video? Such an approach makes much more sense than trying to define an arbitrary list of codecs. User-friendly names can be mapped to the FourCCs, or they can be listed together as one string.

Posted: Mon Jul 25, 2005 6:45 am
by egg
As was pointed out before [note this is only accessible to mods], the FourCC is not always reliable. Here are some excerpts:
wahaha wrote: This is somewhat related to Poll: XviD/DivX.

A single "divx"-4cc (as used in some anime-keep encodes) is a problem - for old files it could be DivX4, while it could just as well be DivX5. In such a case, set the codec to "DivX (unknown)", although it really is DivX5 for all newer encodes.

A suggestion for how to set the codec-value:
(Most single 4CCs in GSpot directly translate to the codec-entries in AniDB:)
divx -> DivX (Unknown)
div3 -> DivX3
div4 -> DivX4
dx50 -> DivX5
xvid -> XviD
wmv3 -> WMV9

However, if GSpot shows a second "xvid"-4cc after the slash, it should be set to xvid:
DX50/xvid -> XviD
divx/xvid -> XviD

If the second 4cc is "divx", set it to DivX5:
DX50/divx -> DivX5

Other values I've seen where I'm not sure what to set:
"div3/div4" -> I'd set it to DivX4 or DivX (Unknown)
"xvid/yv12" -> I'd set it to XviD
PetriW wrote: This is the current table used in AniDB O'Matic:
columns are: codec/compression in file, long name, short name, anidb id


    ('div3',      'DivX 3 Low-Motion',  'DivX3',    '3'),
    ('div3/div3', 'DivX 3 Low-Motion',  'DivX3',    '3'),
    ('div4/div3', 'DivX 3 Fast-Motion', 'DivX3',    '3'),
    ('div5',      'DivX 5.0',           'DivX5',    '7'),
    ('dx50',      'DivX 5.0',           'DivX5',    '7'),
    ('xvid/dx50', 'DivX 5.0',           'DivX5',    '7'),
    ('divx/dx50', 'DivX 5.0',           'DivX5',    '7'),
    ('divx',      'DivX 4 (OpenDivX)',  'DivX4',    '5'),
    ('divx/divx', 'DivX 4 (OpenDivX)',  'DivX4',    '5'),
    ('xvid',      'XviD',               'XviD',    '17'),
    ('xvid/xvid', 'XviD',               'XviD',    '17'),
    ('xvid/divx', 'DivX 4 (OpenDivX)',  'DivX4',    '5'),
    ('mp42',      'S-Mpeg 4 version 2', 'MS MP4x', '18'),
    ('mp43',      'S-Mpeg 4 version 3', 'MS MP4x', '18')
AOM has certain rules for how it determines what codec a file has. If a column is unknown it's assumed to be corrupt (anidb has several files like that) and the codec is predicted the same way GSpot does it, i.e. compression has priority, then codec, then known over unknown.
The table above was made by parsing over 15% of all avi files in AniDB, fyi.
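
A minimal sketch of how a table like the one quoted above might be used for lookups, in Python. The rows are copied from the quote. The `lookup` helper, and the choice to lower-case the input before matching, are assumptions for illustration. It does not attempt AOM's prediction rules for corrupt columns.

```python
# Rows copied from the table quoted above:
# (codec/compression in file, long name, short name, anidb id)
AOM_TABLE = [
    ("div3",      "DivX 3 Low-Motion",  "DivX3",   "3"),
    ("div3/div3", "DivX 3 Low-Motion",  "DivX3",   "3"),
    ("div4/div3", "DivX 3 Fast-Motion", "DivX3",   "3"),
    ("div5",      "DivX 5.0",           "DivX5",   "7"),
    ("dx50",      "DivX 5.0",           "DivX5",   "7"),
    ("xvid/dx50", "DivX 5.0",           "DivX5",   "7"),
    ("divx/dx50", "DivX 5.0",           "DivX5",   "7"),
    ("divx",      "DivX 4 (OpenDivX)",  "DivX4",   "5"),
    ("divx/divx", "DivX 4 (OpenDivX)",  "DivX4",   "5"),
    ("xvid",      "XviD",               "XviD",    "17"),
    ("xvid/xvid", "XviD",               "XviD",    "17"),
    ("xvid/divx", "DivX 4 (OpenDivX)",  "DivX4",   "5"),
    ("mp42",      "S-Mpeg 4 version 2", "MS MP4x", "18"),
    ("mp43",      "S-Mpeg 4 version 3", "MS MP4x", "18"),
]

# Index the rows by their first column for direct lookup.
BY_PAIR = {row[0]: row for row in AOM_TABLE}


def lookup(pair):
    """Return (long name, short name, anidb id) for a 4cc string or pair.

    Matching is case-insensitive (an assumption, since tools report mixed
    case). Returns None when the pair is not in the table.
    """
    row = BY_PAIR.get(pair.strip().lower())
    if row is None:
        return None
    _, long_name, short_name, anidb_id = row
    return long_name, short_name, int(anidb_id)
```

For example, `lookup("DIV4/DIV3")` gives `("DivX 3 Fast-Motion", "DivX3", 3)`, while a pair the table does not list, such as `"svq3"`, gives `None` and would have to be handled by hand.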

Posted: Tue Jul 26, 2005 8:06 am
by rowaasr13
PetriW's table is much more correct.

DIV4 has always been DivX 3 Fast-Motion. It is never DivX 4. And the DIVX 4cc is DivX 4. Actually, if the DB is ever switched to 4cc, just add divs field along with it and then, as already suggested, map pairs onto user-friendly names.

Posted: Wed Aug 17, 2005 5:05 pm
by dinoex
These are my stats of "encoder/vcodec" pairs.


count   encoder/vcodec (Anidb)
1403    xvid/XVID      (Xvid)
786     xvid/DX50      (Xvid or Divx5)
738     div3/DIV3      (Divx3)
538     divx/DX50      (Divx5)
394     xvid/DIVX      (Xvid or Divx4)
282     /XVID          (Xvid)
100     /DX50          (Xvid or Divx5)
87      divx/DIVX      (DivX4)
59      DX50/DX50      (Divx5)
48      div4/DIV3      (Divx4 or Divx3)
37      XVID/XVID      (Xvid)
32      DIVX/DX50      (Divx5)
30      /MPEG1         (mpg1)
28      WMV3/WMV3      (wmv9)
22      DIVX/DIVX      (Divx4)
20      wmv3/WMV3      (wmv9)
19      DIV3/DIV3      (Divx3)
17      yv12/DX50      (Divx5)
17      /MP43          (MS)
11      MP43/MP43      (MS)
9       div5/DIV5      (Divx5)
9       /WMV3          (wmv9)
5       /SVQ3          (unknown)
4       div5/DIV3      (Divx5 or Divx3)
3       yv12/XVID      (Xvid)
3       mp43/MP43      (ms)
3       divf/DIVX      (Divx4)
2       div6/DIV5      (Divx5)
2       div3/div3      (Divx3)
2       DIV4/DIV3      (Divx4 or Divx3)
2       /RV40          (RM)
2       /MP42          (MS)
2       /DIV3          (Divx3)
1       x264/H264      (H264/AVC)
1       mp42/MP42      (MS)
1       XVID/DX50      (Xvid or Divx5)
1       XVID/DIVX      (Xvid or Divx4)
1       MP42/MP42      (MS)
1       3ivx/3IV2      (unknown)
1       3iv2/3IV2      (unknown)
1       /avc1          (unknown)
1       /WMV1          (unknown)
1       /RV30          (rm)
1       (0)/DIVX       (Divx4)
What is the recommended mapping to anidb?

Back when I added "xvid/DX50" as "Divx5",
some mods asked me to check for "xvid" first,
which does not match the list posted by PetriW above.
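
Before settling on a mapping, it may help to fold together rows that differ only in letter case, so the real size of each pair is visible. A rough Python sketch, using a handful of rows copied from the listing above. Treating case as insignificant is an assumption, and `parse_row` and `tally` are made-up helper names.

```python
from collections import Counter

# A few rows copied from the listing above: count, pair, current anidb label.
SAMPLE = """\
1403    xvid/XVID      (Xvid)
738     div3/DIV3      (Divx3)
282     /XVID          (Xvid)
37      XVID/XVID      (Xvid)
19      DIV3/DIV3      (Divx3)
2       div3/div3      (Divx3)
"""


def parse_row(line):
    """Split one row into (count, first 4cc, second 4cc), lower-cased."""
    count, pair, _label = line.split(None, 2)
    first, _, second = pair.partition("/")
    return int(count), first.lower(), second.lower()


def tally(text):
    """Sum the counts of rows whose pair differs only in letter case."""
    totals = Counter()
    for line in text.splitlines():
        if line.strip():
            count, first, second = parse_row(line)
            totals[(first, second)] += count
    return totals


totals = tally(SAMPLE)
```

With the sample rows, the three `div3/DIV3` spellings collapse into a single entry of 759 files and the two `xvid/XVID` spellings into one of 1440, while the rows with an empty first field stay separate.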

Posted: Wed Aug 17, 2005 5:17 pm
by dinoex
Here is my list of audio codecs


count   codec           (Anidb)
4154    mp3             (mp3)
331     vorbis          (ogg/vorbis)
89      ffwmav2         (divx audio)
54      faad            (aac)
48      AC3             (ac3)
20      msadpcm         (pcm or msaudio)
13      ffwmav1         (divx audio)
10      pcm             (pcm)
9       wma9dmo         (msaudio)
3       racookwin       (unknown Real Audio) 
"msadpcm" is the only case of doubt here:
it is listed in anidb files as "unknown", "pcm" or "msaudio"?

How should these codecs be set in anidb?

Posted: Wed Aug 17, 2005 11:25 pm
by Skywalka
I think you should use "layer" instead of "part" when it comes to MPEG and its different parts ;-)
The name is derived from "MPEG-1 Audio Layer 3"

All articles on wikipedia use it and I never heard it differently.

Posted: Thu Aug 18, 2005 12:34 am
by Rar
Isochroma: Fortunately nuts do not make up a 'reasonable number of releases'.

dinoex: 4cc alone is a poor method of identifying codecs. Also, if you don't say what tool you're using, and why it gets confused between pcm and msaudio (which should be easy to distinguish) the data isn't of much use.

Skywalka: Try reading the post again. There's no point even correcting you.


Posted: Thu Aug 18, 2005 8:34 am
by dinoex
This data was collected using mplayer.

The tool identifies it as "msadpcm",

which can be read as "msaudio"
or as "PCM", depending on which you prefer to sort it under.

Not sure if there is a difference from pcm,
but there is a difference from the "msaudio" codec (wma9dmo).

For now there are not many tools that will run
on non-Windows systems.
But any info on how to detect further differences is welcome.


Posted: Wed Aug 31, 2005 12:57 am
by Isochroma
How about separating div6 from div5? I think the quality difference between these two versions would make it desirable that users could identify them...

AC3 has 2 channel and 6 channel

Posted: Fri Sep 16, 2005 3:39 am
by Bokoo
AC3 comes in 2-channel and 6-channel (5.1) variants.
Big difference, but there are not that many anime with 6-channel audio.