Subcode conversion

Hi everybody,

I’ve read the MMC specification and I have a question about subcode. I know it is possible to read the subcode in raw mode or in deinterleaved and corrected mode. The second form is produced by the CD-ROM hardware. My question is: if I start from the raw mode, how can I obtain the second form in software? Does someone have an algorithm or source code for this?

Thank you for your help!

Yannick

to get the deinterleaved form you’ll have to extract each channel
the way the MMC describes it being stored, you can take a peek at
my notes here http://www.yates2k.net/junk/cdnotes.txt

take each of the 96 bytes and extract the bit you need
from each to build up a channel byte, you should have 12 bytes
of each channel by the time you’ve finished. fun, and it will require lots of
shifts :) i haven’t seen any source code around myself but
i’ll be doing my own fairly soon so i’ll give you it when i finish
if you’re still interested.

yates.
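
The bit extraction described above can be sketched in portable C. This is only an illustration, assuming each of the 96 raw bytes carries P in bit 7 down to W in bit 0 and that the bits of a channel are collected MSB first; the names `extract_channel` and `CH_*` are made up for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Bit position of each channel inside one raw subcode byte (P is the MSB). */
enum { CH_P = 7, CH_Q = 6, CH_R = 5, CH_S = 4, CH_T = 3, CH_U = 2, CH_V = 1, CH_W = 0 };

/* Collect one channel from 96 raw interleaved bytes into 12 bytes.
   raw: 96 bytes as returned with sub-channel selection 001b
   bit: one of the CH_* values
   out: receives the 12 channel bytes, first raw byte in the MSB */
void extract_channel(const unsigned char raw[96], int bit, unsigned char out[12])
{
    assert(bit >= 0 && bit <= 7);
    memset(out, 0, 12);
    for (size_t i = 0; i < 96; ++i) {
        if ((raw[i] >> bit) & 1u)
            out[i / 8] |= (unsigned char)(0x80u >> (i % 8));
    }
}
```

Calling `extract_channel(raw, CH_Q, q)` fills `q` with the 12 Q bytes; repeating the call for the other seven channels gives the full 8 x 12 byte layout.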

Hi,

thanks for the info, but I don’t think simple bit swapping can do the deinterleaving/error correction of the subcode. I suspect a more complex algorithm such as Reed-Solomon. I suppose it is described in the Red Book, but I don’t have a copy.

Here is an example that I ripped myself. The first block shows good subcodes: the Read CD command was set to transfer P-W (100b). The second block shows a raw transfer of P-W (001b). You will see that sometimes the lines match exactly, and where they don’t match, each byte keeps its magnitude (a high byte seems to remain high).

In the MMC PDF there is a picture that shows how the packet is built, but it mentions that the algorithm is beyond the scope of the standard. This is why I suppose it is covered in the Red Book.

good subcodes:
0 0 0 0 0 0 0 40
0 0 40 0 37 d 36 28
9 6 35 3a 0 0 1 26
0 40 0 40 0 0 0 0
0 0 40 0 75 2e c 70
9 6 35 3a 0 0 1 27
40 40 40 0 40 0 0 40
40 40 40 0 4a 4e 5a 78
9 6 35 3a 0 0 1 68
0 0 0 40 40 0 0 40

raw subcodes:
0 0 0 0 0 0 0 40
0 0 46 0 3f 27 1a 3a
9 0 0 1b 0 35 1 27
0 40 0 40 0 0 0 0
0 0 46 0 7d 7 d 7a
9 0 0 23 0 35 1 28
40 40 40 0 40 0 0 40
40 40 46 0 42 64 5b 7a
9 0 0 2b 0 35 1 69
0 0 0 40 40 0 0 40

Yannick

Your ripped subcodes look like they are in interlaced format and seem to be corrupted. Did you read them from a damaged CD?

Your question isn’t very clear.

Are you saying you wish to be able to correct the corrupted raw interlaced subchannel data?

Normally you cannot do this; it depends on which and how many bytes are corrupted. You would need the lower-level Cross-Interleaved Reed-Solomon (CIRC) error detection/correction codes, and these are not included in the subchannel data you ripped. At the lower level, i.e. in the small frames, you need the following:

  1. C2 error pointers are the Error Detection codes and
  2. Q/P parity codes are the Error Correction codes.

The MMC Read CD command can retrieve C2 info on some drives, but there is no MMC command that returns the low-level Q/P parity data.

If you just wish to convert between interlaced and deinterlaced format, then it is simply bit swapping, which is what yates has given you.

Hi,

Let me rephrase. I use the SCSI command Read CD (BEh); refer to the MMC document. There is a field called sub-channel selection bits. This field can take the value 001b to transfer the subchannel in raw mode, or 100b to transfer the same subchannel deinterleaved and corrected. If I have a CD-ROM drive that only supports raw mode, how can I transform the returned subchannel data (96 bytes long) into the deinterleaved form? In theory I should obtain exactly what a transfer in 100b mode would give.

Thanks for your support!

Yannick
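
For readers following along, here is a rough sketch of how the Read CD (BEh) command block with the sub-channel selection bits might be filled in. It assumes the 12-byte CDB layout from the MMC document (LBA in bytes 2 to 5, transfer length in bytes 6 to 8, sub-channel selection in the low three bits of byte 10) and picks 0xF8 for byte 9 to request the full 2352-byte sector; that choice, and the name `build_read_cd_cdb`, are assumptions of this sketch. Sending the CDB to the drive (ASPI, SPTI, SG_IO and so on) is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sub-channel selection values mentioned in the post (CDB byte 10). */
enum { SUBCH_NONE = 0x00, SUBCH_RAW_PW = 0x01, SUBCH_RW_CORRECTED = 0x04 };

/* Fill a 12-byte READ CD (BEh) CDB asking for whole 2352-byte sectors
   plus the selected sub-channel data. */
void build_read_cd_cdb(uint8_t cdb[12], uint32_t lba, uint32_t sectors, uint8_t subch)
{
    assert(sectors <= 0xFFFFFFu);   /* transfer length is a 24-bit field */
    assert(subch <= 0x07u);         /* selection field is 3 bits wide */

    memset(cdb, 0, 12);
    cdb[0] = 0xBE;                  /* READ CD opcode */
    cdb[2] = (uint8_t)(lba >> 24);  /* starting LBA, big-endian */
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[6] = (uint8_t)(sectors >> 16);
    cdb[7] = (uint8_t)(sectors >> 8);
    cdb[8] = (uint8_t)sectors;
    cdb[9] = 0xF8;                  /* assumed: sync, headers, user data, EDC/ECC; no C2 */
    cdb[10] = subch;                /* 001b = raw P-W, 100b = R-W */
}
```

With either 001b or 100b selected, each sector in the transfer would then be 2352 + 96 = 2448 bytes, so the receive buffer has to be sized accordingly.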

You mean simply converting between interleave/deinterleave format?

Then the problem is that your CDROM drive is not very good at reading subchannel data. It is giving you corrupted subchannel data.

Or you are not issuing the Read CD command correctly to the CDROM drive. A simple test: clear the input buffer before issuing the command; if you get all zeroes back, the command wasn’t sent.
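
That check could look something like the following. It is just a sketch with made-up helper names; how the read itself is issued is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero the transfer buffer before the read so stale data from an earlier
   command cannot be mistaken for a fresh result. */
void prepare_buffer(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, 0, len);
}

/* Returns 1 if every byte is still zero after the read, 0 otherwise. */
int buffer_is_all_zero(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; ++i) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}
```

Call `prepare_buffer` first, issue the read, then test the buffer with `buffer_is_all_zero`; a result of 1 suggests nothing was transferred.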

E.g. a ripped subchannel data of a CDROM formatted CD of sector 10 should look like this (read with MMC command Read CD BEh with 001b subchannel option):

00400000000000400000000000000040
00000000000000400000000000000000
00000000000000000000004000000040
00000000000000000000000000000000
00000000000040000000004000000040
40000000004000000000400040000040

Software converted (just bit swapping) to deinterlaced format gives:

00000000000000000000000041010100
00110000021184290000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000

Notice that there should only be 12 bytes of Q subchannel data; everything else should be zeroes for a normal CDROM formatted CD.

You should try ripping using another CDROM drive and you’ll see what I mean.
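
The conversion shown in the sample above can be reproduced with a small routine that handles all eight channels at once, together with its inverse. This is a sketch, assuming the deinterleaved layout is 12 bytes of P followed by 12 bytes of Q and so on up to W, which is what the converted sample suggests; the function names are invented here.

```c
#include <assert.h>
#include <string.h>

/* 96 raw bytes (one bit of every channel per byte) -> 8 x 12 bytes, P first. */
void deinterleave_pw(const unsigned char raw[96], unsigned char out[96])
{
    assert(raw != out);                       /* in-place conversion is not supported */
    memset(out, 0, 96);
    for (int i = 0; i < 96; ++i) {
        for (int ch = 0; ch < 8; ++ch) {      /* ch 0 = P (bit 7) ... ch 7 = W (bit 0) */
            if (raw[i] & (0x80 >> ch))
                out[ch * 12 + i / 8] |= (unsigned char)(0x80 >> (i % 8));
        }
    }
}

/* Inverse of deinterleave_pw: 8 x 12 channel bytes -> 96 raw bytes. */
void interleave_pw(const unsigned char in[96], unsigned char raw[96])
{
    assert(in != raw);
    memset(raw, 0, 96);
    for (int i = 0; i < 96; ++i) {
        for (int ch = 0; ch < 8; ++ch) {
            if (in[ch * 12 + i / 8] & (0x80 >> (i % 8)))
                raw[i] |= (unsigned char)(0x80 >> ch);
        }
    }
}
```

Run on the 96 raw bytes listed above, `deinterleave_pw` leaves the P block at zero and puts 41 01 01 00 00 11 00 00 02 11 84 29 in bytes 12 to 23, matching the converted dump.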

ok here is the code,

//////////////////////////////////////////////////////////////////////

/////////////////////////////////////
// Create De-Interleaved SubChannel//
/// from RAW return Interleaved. //
//// //
/// [yates] - 11/DEC/03 //
/////////////////////////////////////

#define P_CHANNEL 0x80
#define Q_CHANNEL 0X40
#define R_CHANNEL 0X20
#define S_CHANNEL 0X10
#define T_CHANNEL 0X08
#define U_CHANNEL 0X04
#define V_CHANNEL 0X02
#define W_CHANNEL 0X01

void DeInterleave_Channel(byte channel, unsigned char *raw, unsigned char *buffer)
{

_asm
{
pushad
pushfd
//int 3

xor eax,eax
mov ebx, buffer
mov edi, raw

get_my_subs:
xor ecx,ecx
cmp eax,12
jae job_done // stop after 12 bytes (ja would write a 13th)

xor edx,edx
Extract_Byte:

mov cl, byte ptr [edi]
and cl, channel
ror cl,7
or dl,cl
inc edi

mov cl, byte ptr [edi]
and cl, channel
ror cl,0
or dl,cl
inc edi

mov cl, byte ptr [edi]
and cl, channel
ror cl,1
or dl,cl
inc edi

mov cl, byte ptr [edi]
and cl, channel
ror cl,2
or dl,cl
inc edi

mov cl, byte ptr [edi]
and cl, channel
ror cl,3
or dl,cl
inc edi

mov cl, byte ptr [edi]
and cl, channel
ror cl,4
or dl,cl
inc edi

mov cl, byte ptr [edi]
and cl, channel
ror cl,5
or dl,cl
inc edi

mov cl, byte ptr [edi]
and cl, channel
ror cl,6
or dl,cl
inc edi

Got_Byte:
mov byte ptr [ebx], dl
inc ebx
inc eax
jmp get_my_subs

job_done:
popfd
popad

//int 3
//mov ebx, buffer
};

};
//////////////////////////////////////////////////////////////////////

e.g.
UCHAR ucData[3300];
UCHAR qchannel[20];

ASPI.ReadSector2352(0x10,1,ucData);
DeInterleave_Channel(Q_CHANNEL,ucData+0x930,qchannel);
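
The `ucData+0x930` in the call above skips the 2352 (0x930) bytes of main-channel data to reach the 96 subcode bytes. For multi-sector reads the same arithmetic can be wrapped in a helper. A sketch, assuming every sector in the buffer is laid out as 2352 main bytes followed directly by 96 subcode bytes (no C2 data in between); `subcode_of` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

enum {
    MAIN_BYTES  = 2352,                    /* 0x930, full raw sector */
    SUB_BYTES   = 96,                      /* P-W subcode per sector */
    BLOCK_BYTES = MAIN_BYTES + SUB_BYTES   /* 2448 bytes per sector read */
};

/* Pointer to the 96 subcode bytes of sector 'index' inside a buffer that
   holds 'nblocks' sectors read back to back. */
const unsigned char *subcode_of(const unsigned char *buf, size_t nblocks, size_t index)
{
    assert(buf != NULL && index < nblocks);
    return buf + index * BLOCK_BYTES + MAIN_BYTES;
}
```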

I’ve tested it myself and it works, but your raw subs do look
a bit odd, perhaps they contain cdtext or something?

regards,
yates.

Hi,

they were extracted from a cd+g, this is why they contain a lot of data.

I will try your code and let you know if it works.

Yannick

I don’t think your code can help me. It works if I start from the extracted version (ripped with subchannel set to 100b). I’ve read the ECMA PDF and I think I need a more complex process to convert from raw mode (001b) to real subchannel data (100b). Probably the one in Annex C, but maybe I’m wrong.

Thanks for your help guys! Your fast answers are really appreciated!

Yannick

I had another look at the MMC specs. I know what you’re trying to talk about now.

Actually, we are all mixed up here. yates and I are referring to the plain interleaved/deinterleaved subchannel data formats.

But you are referring to MMC interleaved packed subchannel data format. This is not the same as plain interleaved subchannel data format - they don’t have the same meaning.

Yannick, if you look more carefully at MMC spec, you’ll notice that it says:

for option 100b, it may return:

a. deinterleaved & error corrected (MMC interleaved packed format),
b. RAW (plain interleaved format) or
c. padded with zeroes

depending on the ‘R-W supported and R-W de-interleaved and error corrected’ bits in the ‘CD capabilities and Mechanism’ status page.

It just means that it depends on the drive whether it supports returning MMC interleaved packed formatted subch data. You can tell by looking at the bit in ‘CD capabilities and Mechanism’ status page.

If it doesn’t then the option returns plain P-W interleaved format - same format as option 001b.
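
Put as code, the reading of the spec given above comes down to a small decision on the two capability bits. This is a sketch of that interpretation only, not something taken from the standard text; the struct, enum and function names are invented, and fetching the bits from the drive's capabilities page is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The two capability bits referred to above, already extracted from the page. */
typedef struct {
    bool rw_supported;
    bool rw_deinterleaved_corrected;
} rw_caps;

typedef enum {
    SUB_100B_ZERO_FILLED,       /* c. padded with zeroes */
    SUB_100B_RAW_INTERLEAVED,   /* b. same layout as option 001b */
    SUB_100B_PACKED_CORRECTED   /* a. deinterleaved and error corrected */
} sub_100b_format;

/* What option 100b would be expected to return on a given drive. */
sub_100b_format expected_100b_format(const rw_caps *caps)
{
    assert(caps != NULL);
    if (!caps->rw_supported)
        return SUB_100B_ZERO_FILLED;
    if (!caps->rw_deinterleaved_corrected)
        return SUB_100B_RAW_INTERLEAVED;
    return SUB_100B_PACKED_CORRECTED;
}
```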

I’ve worked with quite a few drives, and with the 001b option most of them just return plain P-W interleaved format. But some drives return data from a different sector with option 100b (not sure why); my Plextor drives do this.

My raw subs look normal - they are from a CDROM formatted CD, only P & Q are used, R-W are all zeroes. On the other hand CD+G uses all P-W data channels.

You said that options 001b and 100b sometimes give the same data. Given your explanation, how is that possible?

I suspect that in your case the 100b option is just returning plain interleaved subchannel data (i.e. raw subs - same format as 001b) but shifted at a different sector.
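
One way to check for such a shift is to decode the absolute address carried in the Q channel of each returned block and compare it with the sector that was requested. A minimal sketch, assuming a deinterleaved 12-byte Q block from the program area with ADR = 1 and BCD-coded minutes, seconds and frames in bytes 7 to 9; `q_absolute_lba` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Convert one BCD byte (e.g. 0x11) to its decimal value (11). */
static int bcd(unsigned char b)
{
    return (b >> 4) * 10 + (b & 0x0F);
}

/* Decode the absolute address of a deinterleaved 12-byte Q block.
   Returns 1 and stores the LBA when the block is a position block (ADR = 1);
   returns 0 for other ADR values (catalogue number, ISRC, ...). */
int q_absolute_lba(const unsigned char q[12], long *lba)
{
    assert(q != NULL && lba != NULL);
    if ((q[0] & 0x0F) != 1)
        return 0;
    /* 75 frames per second, and LBA 0 sits at 00:02:00 (150 frames). */
    *lba = ((long)bcd(q[7]) * 60 + bcd(q[8])) * 75 + bcd(q[9]) - 150;
    return 1;
}
```

Fed the Q bytes from the converted sample earlier in the thread (41 01 01 00 00 11 00 00 02 11 84 29), this gives LBA 11, which read this way would be one sector past the requested sector 10.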

the old version only worked for the Q channel so here’s the new
version that should do all channels, just in case anyone else
wanted it :)

/////////////////////////////////////
// Create De-Interleaved SubChannel//
/// from RAW return Interleaved. //
//// //
/// [yates] - 11/DEC/03 //
/////////////////////////////////////

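// revision b - 13/DEC/03

// Each table holds the channel bit mask followed by the right-rotate count
// for each of the 8 raw bytes that make up one output byte.
// byte / BYTE are assumed to be unsigned char typedefs (e.g. from <windows.h>).
byte P_CHANNEL[9]={0x80, 0, 1, 2, 3, 4, 5, 6, 7};
byte Q_CHANNEL[9]={0x40, 7, 0, 1, 2, 3, 4, 5, 6};
byte R_CHANNEL[9]={0x20, 6, 7, 0, 1, 2, 3, 4, 5};
byte S_CHANNEL[9]={0x10, 5, 6, 7, 0, 1, 2, 3, 4};
byte T_CHANNEL[9]={0x08, 4, 5, 6, 7, 0, 1, 2, 3};
byte U_CHANNEL[9]={0x04, 3, 4, 5, 6, 7, 0, 1, 2}; // last count was 1, which put two bits in the same place
byte V_CHANNEL[9]={0x02, 2, 3, 4, 5, 6, 7, 0, 1};
byte W_CHANNEL[9]={0x01, 1, 2, 3, 4, 5, 6, 7, 0};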

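// Rotate an 8-bit value right by 'count' bits. Plain C stand-in for the
// x86 ROR instruction, so it also builds outside 32-bit MSVC.
byte ror(byte src, byte count)
{
count &= 7;
return (byte)((src >> count) | (src << ((8 - count) & 7)));
}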

void DeInterleave_Channel(byte *channel, unsigned char *raw, unsigned char *buffer)
{

byte chanbyte = 0;
byte rorres = 0;
byte rawbyte;
byte sub = 0;

for(int x=1;x<13;x++)
{

for (int i=1;i<9;i++)
{
	rawbyte = raw[sub];
	rawbyte = rawbyte & (BYTE)channel[0];
	rorres = ror(rawbyte, (BYTE)channel[i]);
	chanbyte = chanbyte | rorres;
	sub++;

};

buffer[x]=chanbyte;
chanbyte = 0;
};

};
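
The rotate counts in the tables above appear to follow a single rule, so they can be computed instead of tabulated. A sketch, assuming the intent is that the bit taken from the i-th raw byte of a group (i = 0..7) lands in output bit 7 - i; `ror8`, `rotate_count` and `gather_byte` are names invented for this example.

```c
#include <assert.h>

/* Rotate an 8-bit value right by n bits. */
static unsigned char ror8(unsigned char v, unsigned n)
{
    n &= 7u;
    return (unsigned char)((v >> n) | (v << ((8u - n) & 7u)));
}

/* Rotate count that moves channel bit 'bit' (7 = P ... 0 = W) of the i-th
   raw byte to output bit 7 - i. */
unsigned rotate_count(unsigned bit, unsigned i)
{
    assert(bit < 8 && i < 8);
    return (bit + 1u + i) & 7u;
}

/* Build one channel byte from 8 consecutive raw bytes. */
unsigned char gather_byte(const unsigned char raw8[8], unsigned bit)
{
    unsigned char out = 0;
    for (unsigned i = 0; i < 8; ++i)
        out |= ror8((unsigned char)(raw8[i] & (1u << bit)), rotate_count(bit, i));
    return out;
}
```

For Q (bit 6) the rule gives 7, 0, 1, 2, 3, 4, 5, 6, and for U (bit 2) it gives 3, 4, 5, 6, 7, 0, 1, 2.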

Yates, I think your code is incorrect

instead of …

for(int x=1;x<13;x++)

shouldn’t it be …

for(int x=0;x<12;x++)

otherwise the result is offset by one byte.

Hi, all!

Coming back to the original posting: I think deinterleaving raw subchannel data is a common task for anyone trying to develop a CD+G grabber. Below is some sample code that I wrote to convert raw subcode data into the deinterleaved format. My tests show that the resulting data is a valid CD+G stream.

I have never seen the Red Book and the offsets were merely guessed. I also don’t know if it is necessary to apply Reed-Solomon error-correction before or after the deinterleaving… but it works and I am going to use this code in my cdg-grabber (http://karaoke-dx.sourceforge.net).

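#include <cstdio>

// Source offset, within the raw stream, of each of the 24 columns of a pack.
int offsets[] = { 0, 66, 125, 191, 100, 50, 150, 175,
                  8, 33, 58, 83, 108, 133, 158, 183,
                  16, 41, 25, 91, 116, 141, 166, 75 };

// Read a whole file into a new[]-allocated buffer; returns 0 on failure.
unsigned char* read_file(const char* name, long& len) {
    FILE* file = fopen(name, "rb");
    if ( !file ) {
        perror("fopen");
        return 0;
    }

    fseek(file, 0, SEEK_END);
    len = ftell(file);
    fseek(file, 0, SEEK_SET);

    unsigned char* buffer = new unsigned char[len];
    if ( fread(buffer, 1, len, file) != (size_t)len ) {
        perror("fread");
        fclose(file);
        delete[] buffer;
        return 0;
    }
    fclose(file);
    return buffer;
}

int main(int argc, char** argv) {
    if ( argc < 3 ) {
        fprintf(stderr, "usage: %s <raw input> <decoded output>\n", argv[0]);
        return 1;
    }

    long length;
    unsigned char* raw = read_file(argv[1], length);
    if ( !raw )
        return 1;

    // Value-initialised so bytes with no source data stay zero.
    unsigned char* decoded = new unsigned char[length]();
    unsigned int nsectors = length / 96;

    for ( unsigned int sector = 0; sector < nsectors; ++sector ) {
        for ( unsigned int pack = 0; pack < 4; ++pack ) {
            for ( unsigned int column = 0; column < 24; ++column ) {
                long src = (long)(sector * 96 + pack * 24) + offsets[column];
                // Offsets reach up to 191 bytes ahead, so the last packs of
                // the file would otherwise read past the end of the buffer.
                if ( src < length )
                    decoded[sector * 96 + pack * 24 + column] = raw[src];
            }
        }
    }

    FILE* file = fopen(argv[2], "w+b");
    if ( !file ) {
        perror("fopen");
        delete[] raw;
        delete[] decoded;
        return 1;
    }

    fwrite(decoded, 1, length, file);
    fclose(file);

    delete[] raw;
    delete[] decoded;
    return 0;
}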

The code is not optimized, but I hope it will be helpful.

Taras
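
As a side note on the guessed offsets: the posted table happens to be reproduced exactly by a simple rule. Exchange the column pairs (1, 18), (2, 5) and (3, 23), then add 24 bytes (one pack) for every unit of the swapped column number modulo 8. The sketch below only restates the table in that form; it is derived from the numbers in the post, not from the Red Book, and the function names are made up.

```c
#include <assert.h>

/* Column pairs that appear exchanged in the posted table. */
static int swap_column(int c)
{
    switch (c) {
    case 1:  return 18;
    case 18: return 1;
    case 2:  return 5;
    case 5:  return 2;
    case 3:  return 23;
    case 23: return 3;
    default: return c;
    }
}

/* Offset into the raw stream for one of the 24 columns of a pack:
   the swapped column, delayed by (column mod 8) packs of 24 bytes. */
int pack_offset(int column)
{
    assert(column >= 0 && column < 24);
    int s = swap_column(column);
    return s + 24 * (s % 8);
}
```

`pack_offset(c)` returns the same value as `offsets[c]` for every column, which at least suggests the guessed numbers are internally consistent.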

@Truman:

Why does your tool deinterleave subchannel data when packed mode (100b) is asked?

Supposedly you have to de-pack it so that it looks as if it had been extracted with 001b.

Thanks.

The reason is that this option can behave in three ways, as explained further up this thread. Drives may return raw interleaved sub data (same as 001b) instead of packed sub data, so in that case the user may still want to de-pack it.