
Author Topic: [bmx] LZMA Compression by Otus [ 1+ years ago ]  (Read 891 times)

Offline BlitzBot

« on: June 29, 2017, 12:28:42 AM »
Title : LZMA Compression
Author : Otus
Posted : 1+ years ago

Description : This LZMA compression module works like Pub.zlib but uses the public domain LZMA SDK (http://www.7-zip.org/sdk.html) instead of zlib, hopefully for a better compression ratio. See http://en.wikipedia.org/wiki/LZMA for information about the algorithm.

If you don't want to go through the hassle of installing, you can download a zipped module version here: http://jan.varho.org/blog/programming/blitz/lzma-module-version-1-01/

Installation:

1 - Save the code below as two files: the BlitzMax part as lzma.bmx and the C wrapper at the end as LzmaEnc.c

2 - Download LZMA SDK from http://www.7-zip.org/sdk.html

3 - Copy the "C" directory from the SDK next to lzma.bmx and rename it "lzmasdk"

4 - Build modules and docs.


Code :
Code: BlitzMax
'lzma.bmx
SuperStrict

Rem
bbdoc: Lzma compression
End Rem
Rem
Module Otus.Lzma

ModuleInfo "Version: 1.0"
ModuleInfo "Author: Igor Pavlov (7-zip.org)"
ModuleInfo "License: Public domain"
ModuleInfo "Credit: BlitzMax interface by Jan Varho"
ModuleInfo "History: 1.01 Release"
ModuleInfo "History: Fixed interface to exactly match zlib"
ModuleInfo "History: Removed redundant wrapper"
ModuleInfo "History: Upgraded SDK to 4.65"
End Rem
Import "LzmaEnc.c"
Import "lzmasdk/LzmaUtil/Lzma86Dec.c"
Import "lzmasdk/Alloc.c"
Import "lzmasdk/Bra86.c"
Import "lzmasdk/LzmaEnc.c"
Import "lzmasdk/LzmaDec.c"
Import "lzmasdk/LzFind.c"

Extern

Rem
bbdoc: Uncompress a block of data
End Rem
Function LzmaUncompress( dest:Byte Ptr, destLen:Int Var, src:Byte Ptr, srcLen:Int Var ) = "Lzma86_Decode"

Rem
bbdoc: Compress at the compression level given using a specified dictionary size
about:
Compression level should be in the range 1-9 with 9 the maximum compression.

Valid dictionary sizes are between 2^12 and 2^27 bytes. A power of two is recommended.
The default (used in LzmaCompress and LzmaCompress2) is 2^24 bytes (16 MB).
End Rem
Function LzmaCompress3( dest:Byte Ptr, destLen:Int Var, src:Byte Ptr, srcLen:Int, level:Int, dictSize:Int = LZMA_DICT_SIZE ) = "_LzmaCompress"

End Extern

' Dictionary size in bytes (16MB)
Const LZMA_DICT_SIZE:Int = $1000000

Rem
bbdoc: Compress a block of data at default compression level
End Rem
Function LzmaCompress( dest:Byte Ptr, destLen:Int Var, src:Byte Ptr, srcLen:Int )
        LzmaCompress3( dest, destLen, src, srcLen, 5, LZMA_DICT_SIZE )
End Function

Rem
bbdoc: Compress a block of data at the compression level given
about:
Compression level should be in the range 1-9 with 9 the maximum compression.
End Rem
Function LzmaCompress2( dest:Byte Ptr, destLen:Int Var, src:Byte Ptr, srcLen:Int, level:Int )
        LzmaCompress3( dest, destLen, src, srcLen, level, LZMA_DICT_SIZE )
End Function

// LzmaEnc.c
// Wrapper for the Encode function without filtering

#include "lzmasdk/LzmaUtil/Lzma86Enc.c"

void _LzmaCompress( Byte *dest, size_t *destLen, const Byte *src, size_t srcLen,
    int level, UInt32 dictSize )
{
        Lzma86_Encode( dest, destLen, src, srcLen, level, dictSize, SZ_FILTER_NO );
}


Comments :


Otus(Posted 1+ years ago)

 Simple test app.
Code: BlitzMax
'Tests that lzma module works.

SuperStrict

Framework BRL.StandardIO

Import "lzma.bmx"

Const DATA_BYTES% = 2560000


Print "Generating "+DATA_BYTES+" bytes of sequential data..."

Local rsize% = DATA_BYTES
Local raw:Byte[rsize]
Local rbuf:Byte Ptr = raw

For Local i% = 0 Until DATA_BYTES
        rbuf[i] = i
Next

Print "Done."


Print "Compressing data using default level..."

Local csize% = DATA_BYTES
Local comp:Byte[csize]

LzmaCompress comp, csize, raw, rsize

Print "Done: "+csize+" bytes."


Print "Compressing data using maximum compression level..."

csize = DATA_BYTES

LzmaCompress2 comp, csize, raw, rsize, 9

Print "Done: "+csize+" bytes."


Print "Uncompressing data..."

Local dsize% = DATA_BYTES
Local dec:Byte[dsize]

LzmaUncompress dec, dsize, comp, csize

Print "Done: "+dsize+" bytes."


Print "Verifying integrity..."

If dsize <> DATA_BYTES Then Print "Failed!" ; End

Local dbuf:Byte Ptr = dec

For Local i% = 0 Until dsize
        If Byte(dbuf[i] - i) Then Print "Failed!" ; End
Next

Print "Done."



xlsior(Posted 1+ years ago)

 nice!


plash(Posted 1+ years ago)

 Does LZMA compress faster/smaller than zip?


xlsior(Posted 1+ years ago)

 Smaller than zip, apparently. From the wiki link:

"The Lempel-Ziv-Markov chain-Algorithm (LZMA) is an algorithm used to perform data compression. It has been under development since 1998 and is used in the 7z format of the 7-Zip archiver. This algorithm uses a dictionary compression scheme somewhat similar to LZ77 and features a high compression ratio (generally higher than bzip2"

I know that the 7zip native format tends to be smaller than plain vanilla .zip on average, so this one should/could be too.


Otus(Posted 1+ years ago)

 Yes, smaller. LZMA is used extensively in, for example, live Linux distributions precisely because of its better compression ratio. It should also have relatively fast decompression, though I haven't profiled this implementation. I haven't tested this extensively yet, but it seems like image data compresses 10-20% better than with zlib. I can post benchmarks later.


ImaginaryHuman(Posted 1+ years ago)

 This looks cool and I'd like to use it but I'm getting compilation errors. I even tried reinstalling the sdk part with version 4-whatever.

Building test
Compiling:test.bmx
flat assembler  version 1.68  (1680888 kilobytes memory)
3 passes, 2335 bytes.
Linking:test.exe
C:/Users/Admin/Documents/DocumentsByMe/Other/BlitzMax1.36/mod/otus.mod/lzma.mod/lzma.release.win32.x86.a(LzmaEnc.c.release.win32.x86.o):LzmaEnc.c:(.text+0x3d): undefined reference to `LzmaEncProps_Init'
C:/Users/Admin/Documents/DocumentsByMe/Other/BlitzMax1.36/mod/otus.mod/lzma.mod/lzma.release.win32.x86.a(LzmaEnc.c.release.win32.x86.o):LzmaEnc.c:(.text+0x19c): undefined reference to `LzmaEncode'
Build Error: Failed to link C:/Users/Admin/Documents/DocumentsByMe/Other/BlitzMax1.36/mod/Otus.mod/Lzma.mod/test.exe
Process complete


ImaginaryHuman(Posted 1+ years ago)

 Hmm, actually your code above does work, but the code on your blog/website/google code gives the above errors.

I'm also not sure your module is compressing properly... in the test it compresses 2.5 megabytes down to 678 bytes!

Generating 2560000 bytes of sequential data...
Done.
Compressing data using default level...
Done: 678 bytes.

And if I put random numbers in the array rather than increasing integers, it gets even smaller, down to 14 bytes!

Generating 2560000 bytes of sequential data...
Done.
Compressing data using default level...
Done: 14 bytes.

This cannot be correct at all. Does anyone have this working properly?


Otus(Posted 1+ years ago)

 Hi,

I emailed you, but here's the answer for others: 678 bytes is correct. 14 bytes is an error, as incompressible data may need a larger target buffer (larger csize).

I don't have the linking problem on XP or Ubuntu.
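
To illustrate the "larger target buffer" advice, here is a minimal sketch in C, the language of the wrapper above. The helper name lzma_dest_capacity is hypothetical, and the headroom (one third of the input plus 128 bytes) is a rule-of-thumb assumption, not a bound stated in this thread or guaranteed by the LZMA SDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the LZMA SDK or the module above.
   Returns a destination capacity with headroom for input that does not
   compress. The margin (src_len / 3 + 128) is an assumed rule of thumb.
   Returns 0 if the result would not fit in size_t. */
static size_t lzma_dest_capacity(size_t src_len)
{
    size_t margin = src_len / 3 + 128;   /* cannot overflow on its own */
    if (src_len > SIZE_MAX - margin)
        return 0;                        /* capacity not representable */
    return src_len + margin;
}
```

In the test app this would correspond to declaring comp with that many bytes and setting csize to the same value before each call to LzmaCompress or LzmaCompress2, instead of reusing DATA_BYTES.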


ImaginaryHuman(Posted 1+ years ago)

 Hey, yes, thanks, I needed a bigger buffer because the compressed size was a bit bigger than the original, which is typical for random data.

Okay... so now here is another issue. I've been using the lzma compression (above) on a variety of files, and I find that LzmaUncompress sometimes reports that the length of the decompressed data is 1 byte longer than it should be. It doesn't happen with all files. I was trying to figure out why my decompressed data wasn't working properly and it was because there was an extra 0 byte on the end of it, due to the length reported by LzmaUncompress.

I don't know if that means it's a bug in LZMA or in your wrapper? All of the bytes in the decompressed file are the same as the original file, it's just the length can be slightly off. Thankfully it's only off by an additional byte, occasionally, which isn't too difficult to work around. It just means you have to include a list of the original file sizes as part of the app and use that to determine how much data is output from the decompression.

Any ideas what's going on there?

Also I am trying to reinstall the module and I am now getting the above errors for every version, even the version that is quoted above, version 1.01 zip, 1.02 zip, and direct from the 7z sdk. What's going on?
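
A possible alternative to shipping a separate list of sizes, sketched in C under an assumption: that the stream written by Lzma86_Encode begins with a 14-byte header (1 filter byte, 5 LZMA property bytes, then the uncompressed size as 8 little-endian bytes). That layout is a reading of the SDK's Lzma86 header and is not confirmed anywhere in this thread. The function name lzma86_stored_size is hypothetical. The sketch does not explain the extra byte; it only shows how the expected length could be read back and compared with what LzmaUncompress reports.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed Lzma86 header layout (not confirmed in this thread):
   byte 0      filter flag
   bytes 1-5   LZMA properties
   bytes 6-13  uncompressed size, little-endian 64-bit */
#define ASSUMED_SIZE_OFFSET 6
#define ASSUMED_HEADER_SIZE 14

/* Hypothetical helper: reads the stored uncompressed size.
   Returns 0 on success, -1 if the buffer is shorter than the header. */
static int lzma86_stored_size(const unsigned char *src, size_t src_len,
                              uint64_t *out_size)
{
    uint64_t size = 0;
    int i;

    if (src_len < ASSUMED_HEADER_SIZE)
        return -1;

    /* Assemble the 8 size bytes, least significant first. */
    for (i = 0; i < 8; i++)
        size |= (uint64_t)src[ASSUMED_SIZE_OFFSET + i] << (8 * i);

    *out_size = size;
    return 0;
}
```

If the assumption holds, the value read this way could be used to size the destination buffer and to trim the output to the original length. The SDK's own Lzma86_GetUnpackSize appears to serve the same purpose and would be the safer choice where it is available.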

 
