Lemmings 2 File Formats

Started by GuyPerfect, March 30, 2013, 09:06:52 PM

ccexplore

Quote from: EricLang on 2014-01-06 17:00:59
I was just wondering if the compression in DOS Lemmings (original, Oh No! etc.) is basically the same as in Lemmings 2: The Tribes. I never really studied the algorithm, but just 'blindly' translated the C or VB code that I encountered.

I'm pretty sure no, they aren't the same, even beyond superficial differences like headers etc.  To be fair I have yet to read anything on the Lemmings 2 compression format in detail, but from what little I skimmed, it sounds like if nothing else, Lemmings 1 examines the data as bits when looking for redundancy, while Lemmings 2 does so in bytes.

namida

From a quick glance, I don't even see any minor resemblances...
My projects
2D Lemmings: NeoLemmix (engine) | Lemmings Plus Series (level packs) | Doomsday Lemmings (level pack)
3D Lemmings: Loap (engine) | L3DEdit (level / graphics editor) | L3DUtils (replay / etc utility) | Lemmings Plus 3D (level pack)
Non-Lemmings: Commander Keen: Galaxy Reimagined (a Commander Keen fangame)

EricLang

So I'm trying to do L2 objects now. In the style files we have palettes of length 128. But if my eyes do not deceive me - when staring at the bytes of the L2SS section - I see palette indices > 127.
What and where is the second part (the GUI part) of the palette? Or am I staring wrong? :)

EricLang

Pascal code for decompressing L2.
(Delphi added some new syntax, but the code should be readable by a programmer).

Code:
unit L2Decompress;

interface

uses
  System.Classes, System.SysUtils;

type
  TSigChars = array[0..3] of AnsiChar;

  TSignature = packed record
    case byte of
      0: ( Chars: TSigChars );
      1: ( Id: Cardinal );
  end;

  TL2Decompressor = class
  private
    const
      CHUNK_SIZE = 2048;        // maximum decompression size
      MegaByte   = 1024 * 1024; // sanity limit for the decompressed size
    type
      TBuffer = array[0..CHUNK_SIZE - 1] of Byte; // decompression buffer
      // ref entity for chunk decompression
      TRef = packed record
        Byte: Byte;
        Pair: array[0..1] of Byte;
      end;
      // one compressed chunk in compressed file
      TChunk = packed record
        LastChunkFlag  :  Byte;         // Last chunk = $FF; $00 otherwise
        RefCount       : UInt16;        // Number of refs
        Refs           : TArray<TRef>;  // 'dictionary' [RefCount]
        CompressedSize : UInt16;        // Number of bytes in compressed data
        CompressedData : TArray<Byte>;  // compressed data [CompressedSize]
        procedure LoadFromStream(S: TStream; out IsLast: Boolean);
        procedure DecompressChunk(out Buffer: TBuffer; out Size: Integer);
      end;
  public
    procedure Decompress(Src, Dst: TStream); overload;
  end;

implementation

procedure TL2Decompressor.TChunk.LoadFromStream(S: TStream; out IsLast: Boolean);
// load one chunk of compressed data from stream
var
  j, i: Integer;
begin
  // read the chunk flag
  S.ReadBuffer(LastChunkFlag, 1);
  if not (LastChunkFlag in [$FF, $00]) then raise Exception.Create('LastChunkFlag error');
  IsLast := LastChunkFlag = $FF;

  // read the refcount
  S.ReadBuffer(RefCount, 2);
  SetLength(Refs, RefCount);

  // and fill the refs byte by byte
  for i := 0 to RefCount - 1 do
    if S.Read(Refs[i].Byte, 1) <> 1 then raise Exception.Create('chunk ref read error');

  // and fill the byte-pairs
  for j := 0 to 1 do
    for i := 0 to RefCount - 1 do
      if S.Read(Refs[i].Pair[j], 1) <> 1 then raise Exception.Create('chunk byte pair read error');

  // read the compressed size
  S.ReadBuffer(CompressedSize, SizeOf(CompressedSize));
  if CompressedSize > CHUNK_SIZE then raise Exception.Create('datasize error');
  // and finally read the compressed data in to the buffer
  SetLength(CompressedData, CompressedSize);
  S.ReadBuffer(CompressedData[0], CompressedSize);
end;

procedure TL2Decompressor.TChunk.DecompressChunk(out Buffer: TBuffer; out Size: Integer);
// decompress one chunk of data
var
  i, j, d, s, Cnt: Integer;
begin
  Size := CompressedSize;
  if Size > CHUNK_SIZE then raise Exception.Create('chunk decompression error (chunksize)');
  // initialize the decompression buffer
  FillChar(Buffer, SizeOf(Buffer), 0);
  Move(CompressedData[0], Buffer[0], Size);

  // expand the refs in reverse order: every occurrence of Refs[i].Byte in the
  // buffer is replaced in place by the two bytes of Refs[i].Pair
  for i := RefCount - 1 downto 0 do begin
    // count occurrences of the ref byte to know how much the buffer will grow
    Cnt := 0;
    for j := Size - 1 downto 0 do
      if Buffer[j] = Refs[i].Byte then
        Inc(Cnt);
    if Size + Cnt > CHUNK_SIZE then raise Exception.Create('chunk decompression error (overflow)');

    // expand from the end of the buffer towards the start
    d := Size - 1 + Cnt;
    for s := Size - 1 downto 0 do begin
      if Buffer[s] = Refs[i].Byte then begin
        for j := 1 downto 0 do begin
          Buffer[d] := Refs[i].Pair[j];
          Dec(d);
        end;
      end
      else begin
        Buffer[d] := Buffer[s];
        Dec(d);
      end;
      if d < -1 then raise Exception.Create('chunk decompression error (underflow)');
    end;
    Inc(Size, Cnt);
  end;
end;

procedure TL2Decompressor.Decompress(Src, Dst: TStream);
// decompress all chunks from sourcestream (src) to destinationstream (dst)
var
  Chunk: TChunk;
  OutSize: Integer;
  TotalSize: Integer;
  Buffer: TBuffer;
  IsLast: Boolean;
  Sig: TSignature;
  DecompressedSize: UInt32;
begin
  // read signature
  Src.ReadBuffer(Sig, 4);
  if Sig.Chars <> 'GSCM' then raise Exception.Create('L2 decompression read signature error');

  // read size
  Src.ReadBuffer(DecompressedSize, 4);
  if DecompressedSize > MegaByte then raise Exception.Create('decompress error: decompressed size too large'); // don't know limit

  TotalSize := 0;
  IsLast := False;
  while not IsLast do begin
    Chunk.LoadFromStream(Src, IsLast); // read 1 compressed chunk
    OutSize := 0;
    Chunk.DecompressChunk(Buffer, OutSize); // decompress 1 chunk
    Inc(TotalSize, OutSize);
    Dst.WriteBuffer(Buffer[0], OutSize);
  end;

  if TotalSize <> DecompressedSize then raise Exception.Create('Decompress OutSize mismatch');
end;

end.

ccexplore

Quote from: EricLang on 2014-01-08 01:02:15
So I'm trying to do L2 objects now. In the style files we have palettes of length 128. But if my eyes do not deceive me - when staring at the bytes of the L2SS section - I see palette indices > 127.
What and where is the second part (the GUI part) of the palette? Or am I staring wrong? :)

Unfortunately it doesn't look like the second part of the palette is documented, unless I missed something on reading (quite possible).  I believe it's similar to how in Lemmings 1 the palette is split into two halves, with the lower 8 entries being effectively fixed across all graphics sets, while the upper 8 are the only ones that truly change from set to set.  In the L2 case it apparently means the fixed upper half is not stored in the style files, but rather in some unknown location in other files of the game (possibly L2.exe itself).

Your best bet may unfortunately be to just create a modified L2SS section that uses the 128-255 indices, take a screenshot of those graphics in DOSBox, and read the resulting bitmap to find out what those 128 palette entries should be.
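
For reference, a minimal C++ sketch of how a full 256-colour palette could be assembled once the upper half is recovered: the lower 128 entries come from the style file, the upper 128 from wherever the GUI palette turns out to live. The struct names, the helper itself and the 6-bit-to-8-bit scaling are illustrative assumptions, not part of the documented file format.

#include <array>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };
using HalfPalette = std::array<Rgb, 128>;   // 128 entries with 6-bit VGA components (0..63)

// Hypothetical helper: combine the style file's lower half with an externally
// recovered upper (GUI) half into one 256-entry palette, scaling the 6-bit
// VGA components up to 8 bits.
std::array<Rgb, 256> build_full_palette(const HalfPalette& style_lower,
                                        const HalfPalette& gui_upper)
{
    auto scale = [](std::uint8_t c) {                   // 0..63 -> 0..255
        return static_cast<std::uint8_t>(c * 255 / 63);
    };
    std::array<Rgb, 256> full{};
    for (int i = 0; i < 128; ++i) {
        full[i]       = { scale(style_lower[i].r), scale(style_lower[i].g), scale(style_lower[i].b) };
        full[i + 128] = { scale(gui_upper[i].r),   scale(gui_upper[i].g),   scale(gui_upper[i].b) };
    }
    return full;
}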

EricLang

Yep, the same system I thought of too. Only now the low and high palettes are each 128 entries in size.
I'll first scan through all Tribes files to check if there is some GUI palette hidden in there.

ccexplore

If/when you do manage to find out the values for the remainder of the palette, do post something here that allows the documentation to be completed.  Even a binary file attachment (with suitable explanation on how the values are stored) can suffice--I can take over converting that to something more human-readable if necessary.

EricLang

Currently exploring the files.

Candidate found, but I think this is just a picture...
The picture attachment goes wrong on the forum.
The first 128 * 3 bytes (RGB) of ARK.ANM certainly look like a palette.

Candidate found in LOAD.RKO at address 202.

ccexplore

I don't think there's a good way for you to be sure without ultimately testing it out with a specially created L2SS that uses all the upper palette entries.  There are far too many other graphics in this game that are not styles.  They could have their own palettes that nevertheless still only apply to the lower entries.  For example, I'm guessing ARK.ANM contains graphics for the animation of the ark pieces (on the screen that shows the map of all 12 tribes, and your progress as represented by ark pieces moving towards the center).  LOAD.RKO looks more promising, but could also potentially be a number of other things in the game.

I actually seem to recall now that there is somehow a way in one of DOSBox's more hidden UIs to make it display the current palette in use by the (emulated) graphics card--I'm 99% sure I saw it with my own eyes at some point.  If that's the case, then I guess there is a far simpler way to solve this issue.  I'll try to find out more about this later today. [edit: scratch that for now]

EricLang

The L2SS is a good idea, but I just started decoding the stuff, so that is a bit premature for me.

I ran a test on the files in the root of the Lemmings 2: The Tribes directory.
There are not many candidates there, so that is hopeful too.
Here are some pictures:
http://ericenzwaan.nl/eric/lemmings/Temp/candidates.zip
Say I found something in VSTYLE.DAT at position 27377, then the bitmap is called VSTYLE_DAT_27377.bmp.
None of the results is in a compressed file.

I just slid a 128 * 3 byte buffer through the files, doing 2 checks:
1) all values must have R < 64 and G < 64 and B < 64.
2) if 80% or more of the values are zero, the check fails too.
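
A minimal C++ sketch of such a scan, assuming the candidate palette data is stored as raw 6-bit VGA triplets; the function name is made up, and the output just mirrors the candidate list format below:

#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

// Slide a 128*3-byte window over a file and report every offset that could be
// a 128-entry VGA palette: all components < 64, and not 80% (or more) zeros.
void scan_for_palette(const std::string& path) {
    std::ifstream f(path, std::ios::binary);
    std::vector<std::uint8_t> data((std::istreambuf_iterator<char>(f)),
                                    std::istreambuf_iterator<char>());
    const std::size_t window = 128 * 3;
    for (std::size_t pos = 0; pos + window <= data.size(); ++pos) {
        std::size_t zeros = 0;
        bool ok = true;
        for (std::size_t i = 0; i < window; ++i) {
            if (data[pos + i] >= 64) { ok = false; break; }   // check 1: R, G, B < 64
            if (data[pos + i] == 0) ++zeros;
        }
        if (ok && zeros * 100 < window * 80)                  // check 2: less than 80% zero
            std::cout << "[candidate: ][" << path << "][" << pos << "]\n";
    }
}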

The end section of VSTYLE.DAT is the best candidate now.

[candidate: ][ARK.ANM][0]
[candidate: ][ARK.ANM][384]
[candidate: ][INSTALL.EXE][40890]
[candidate: ][L2.RKO][21643]
[candidate: ][LOAD.RKO][202]
[candidate: ][LOAD.RKO][586]
[candidate: ][LOAD.RKO][970]
[candidate: ][LOAD.RKO][1354]
[candidate: ][LOAD.RKO][1738]
[candidate: ][PROCESS.RKO][1096]
[candidate: ][PROCESS.RKO][21070]
[candidate: ][VGA.RKO][14229]
[candidate: ][VGA.RKO][14613]
[candidate: ][VSTYLE.DAT][27377]

ccexplore

Quote from: ccexplore on 2014-01-08 16:02:08
I actually seem to recall now that there is somehow a way in one of DOSBox's more hidden UIs to make it display the current palette in use by the (emulated) graphics card--I'm 99% sure I saw it with my own eyes at some point.  If that's the case, then I guess there is a far simpler way to solve this issue.  I'll try to find out more about this later today.

On second thought, I think it's far more likely that I misremembered, and what I remember seeing was probably from an emulator for something else like the Genesis.  Granted, I haven't updated my DOSBox from v0.73, so maybe there is such a feature in more current versions, but I won't get my hopes up.

kaimitai

Quote from: EricLang on January 08, 2014, 11:54:26 AM
Yep, the same system I thought of too. Only now the low and high palettes are each 128 entries in size.
I'll first scan through all Tribes files to check if there is some GUI palette hidden in there.

Since the topic is not locked, I will take my chances reviving an old thread.

I have injected a tile using palette entries 128-255 into the style files, and used it in one level of each style, which allowed me to extract the RGB values for the high palette of each. An interesting find is that palette entries 145, 146 and 147 cycle in all styles: red, red, yellow (at different rates, so you can use them to make some kind of monochrome animation). Palette entries 164 and up are black in all styles.
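
A minimal sketch of the idea behind such a test tile, ignoring the actual on-disk tile encoding (the 16x8 tile size is an assumption): every pixel gets a distinct index from 128 to 255, so a single screenshot reveals all 128 upper palette entries at once.

#include <array>
#include <cstdint>

// Hypothetical test pattern: 16x8 = 128 pixels, one for each upper palette index.
std::array<std::uint8_t, 16 * 8> make_upper_palette_test_tile() {
    std::array<std::uint8_t, 16 * 8> tile{};
    for (int i = 0; i < 128; ++i)
        tile[i] = static_cast<std::uint8_t>(128 + i);   // indices 128..255
    return tile;
}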

The attached file contains the styles and levels (they need to be used in combination) that I used, as well as screenshots of each palette, plus a text file with RGB values for each entry. Note that these colors are as rendered by DOSBox on my system. Those who are interested can load up the files on their own systems and compare. (I used the l2-fix executable to run the game.)




I also want to point out that the formula used in the original post for creating bitmaps from the 4-layer encoded graphics only works if the width of the image is a multiple of 4. Otherwise the 4 layers will have different sizes, the first being the biggest. For example, if the width is 43, layers 1, 2 and 3 will have width 11, while layer 4 has width 10 (at least this is how it seems to be).
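
As a small sketch, assuming layer i (0-based) holds every 4th column starting at column i (which matches the 43 -> 11, 11, 11, 10 example above), the per-layer width works out to:

// Per-layer width for a 4-layer image of total width w, assuming layer i
// (0-based) holds columns i, i+4, i+8, ...  For w = 43 this gives 11, 11, 11, 10.
int layer_width(int total_width, int layer /* 0..3 */) {
    return (total_width - layer + 3) / 4;   // ceil((total_width - layer) / 4)
}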

This is important for the run-length encoded sprite data (L2SS section) in the style files, which I have trouble decoding in some cases. For Medieval I have no problems, for Classic there is supposed to be one sprite which I cannot decode, and for Circus the 7th sprite cannot be decoded by my implementation.
EDIT: After loading in the extracted palettes and redrawing the sprites, they are all a little off, especially near the bottom. I think my sprite decoding algorithm is not 100%.

For those who can answer, I have 2 questions (which I will investigate myself if I have to, but if someone can tell me, it will save me some time):


  • Is it possible for the run-length encoding to spill over from one line to the next (meaning I need to check the x-index for each pixel copied/skipped and "manually" do a line shift)?
  • Can there be garbage data in this section if it is not used by any sprite animation? Say the animations only use sprites #0, #1 and #2 - is there then no use trying to decode sprite #3, even if the L2SS header tells me there are 4 sprites?

My implementation for decoding one layer is roughly as follows (I believe I have the correct layer width parameter, as I know the layer widths will vary for sprites where (width mod 4) != 0):


#include <cstdint>
#include <vector>

using byte = std::uint8_t;

// Decode one of the four layers of a run-length encoded L2SS sprite.
std::vector<byte> decode_sprite_layer(int layer_width, int layer_height,
                                      const std::vector<byte>& input) {
    std::vector<byte> result(layer_width * layer_height, 0); // initialize with all zeroes
    int x{ 0 }, y{ 0 };
    int stream_index{ 0 };

    // Handle one nibble: in copy mode, copy `count` bytes from the input stream
    // to the current output position; in skip mode, just move x forward.
    auto copy_or_skip = [&](int nibble) {
        bool copy_mode = (nibble >> 3) == 0;
        int count = nibble & 0b0111;
        if (copy_mode) {
            // copy input[stream_index] .. input[stream_index + count - 1] inclusive
            // to result, starting at result[y * layer_width + x]
            for (int i = 0; i < count; ++i)
                result.at(y * layer_width + x + i) = input.at(stream_index + i);
            stream_index += count;  // advance the stream only in copy mode
        }
        x += count;                 // advance x in both modes
    };

    while (true) {
        int high_nibble = (input.at(stream_index) & 0xf0) >> 4;
        int low_nibble  =  input.at(stream_index) & 0x0f;

        ++stream_index;

        // a high nibble of 0xf at the start of a line marks the end of the layer
        if (x == 0 && high_nibble == 0xf)
            return result;

        copy_or_skip(high_nibble);  // perform copy/skip for the high nibble
        copy_or_skip(low_nibble);   // same logic for the low nibble

        // a low nibble of 0 ends the current line
        if (low_nibble == 0) {
            x = 0;
            ++y;
        }
    }
}
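
Building on the decode_sprite_layer sketch above and the hypothetical layer_width helper from earlier, a hedged example of how the four layers might be decoded and re-interleaved into one indexed image (how the per-layer input streams are located inside the L2SS section is assumed to be handled elsewhere):

#include <array>

// Sketch: decode all four layers of one sprite and re-interleave them into a
// width x height indexed image.
std::vector<byte> decode_sprite(int width, int height,
                                const std::array<std::vector<byte>, 4>& layer_streams) {
    std::vector<byte> image(width * height, 0);
    for (int layer = 0; layer < 4; ++layer) {
        const int lw = layer_width(width, layer);   // per-layer width, see the sketch above
        std::vector<byte> plane = decode_sprite_layer(lw, height, layer_streams[layer]);
        for (int y = 0; y < height; ++y)
            for (int px = 0; px < lw; ++px)
                image[y * width + layer + 4 * px] = plane[y * lw + px];
    }
    return image;
}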
Life is like a sewer. What you get out of it, depends on what you put into it. --Tom Lehrer

kaimitai

The sprite decoding algorithm I described above is correct; I just had a bug in my concrete implementation that made it "almost" work. I have decoded all the sprites successfully, and to answer my own questions: the answers are no and no.

Incidentally, all the styles have at least one sprite associated with them - but the styles that only have one sprite have this image, in different palettes:

Life is like a sewer. What you get out of it, depends on what you put into it. --Tom Lehrer