| Anonymous | Login | Signup for a new account | 11-10-2008 12:49 PST |
| Main | My View | View Issues | Change Log | Docs |
| Viewing Issue Simple Details [ Jump to Notes ] | [ View Advanced ] [ Issue History ] [ Print ] | |||||||||||
| ID | Category | Severity | Reproducibility | Date Submitted | Last Update | |||||||
| 0004944 | [BusyBox] New Features | feature | N/A | 09-13-08 00:54 | 10-08-08 23:26 | |||||||
| Reporter | Alain137 | View Status | public | |||||||||
| Assigned To | BusyBox | |||||||||||
| Priority | normal | Resolution | open | |||||||||
| Status | assigned | Product Version | 1.12.x | |||||||||
| Summary | 0004944: patch to support lzop compression/decompression | |||||||||||
| Description |
This is a patch to add the lzop compressor/decompressor to busybox. Lzop is a compressor that is more than twice as fast as gzip at its fastest setting, and is thus useful in any application where compression/decompression speed is more important than ratio. For instance, I use lzop with udcpast to speed up network transfers (gzip would actually make the transfer slower, as any time gain obtained due to having less data to transfer would be lost due to the time it takes to compress) Compression: > time lzop -c /usr/bin/mplayer >/dev/null lzop -c /usr/bin/mplayer > /dev/null 0,33s user 0,01s system 100% cpu 0,337 total > time gzip -1 -c /usr/bin/mplayer >/dev/null gzip -1 -c /usr/bin/mplayer > /dev/null 0,80s user 0,03s system 94% cpu 0,882 total Decompression: > time gzip -dc mplayer.gz >/dev/null gzip -dc mplayer.gz > /dev/null 0,27s user 0,01s system 97% cpu 0,288 total > time lzop -dc mplayer.lzo >/dev/null lzop -dc mplayer.lzo > /dev/null 0,10s user 0,01s system 100% cpu 0,112 total For building it, a liblzo2 static library must be present on the system (for example, liblzo2-dev on a Kubuntu system) |
|||||||||||
| Additional Information | ||||||||||||
| Attached Files |
|
|||||||||||
|
|
||||||||||||
Notes |
|
|
(0011714) vda 09-20-08 07:54 edited on: 09-20-08 07:56 |
The patch seems to contain only the glue needed to interface with LZO library. It does not contain LZO compression code itself. The "innermost" routines are lzo_compress() and lzo_decompress() but from there they just call into liblzo2 (lzo1x_1_15_compress() etc). And this glue code is big. I mean, it's VERY BIG. For comparison: look into bbunzip.c, unlzma glue code is less than one screensful: static char* make_new_name_unlzma(char *filename) { return make_new_name_generic(filename, "lzma"); } static USE_DESKTOP(long long) int unpack_unlzma(void) { return unpack_lzma_stream(STDIN_FILENO, STDOUT_FILENO); } int unlzma_main(int argc UNUSED_PARAM, char **argv) { getopt32(argv, "cf"); argv += optind; /* lzmacat? */ if (applet_name[4] == 'c') option_mask32 |= OPT_STDOUT; return bbunpack(argv, make_new_name_unlzma, unpack_unlzma); } lzop glue code should be just added to this file in a similar way. |
|
(0011824) Alain137 09-23-08 22:30 |
Well, part of the problem is that the LZO library (apparently?) only handles the actual compressed data, but not the header. So, header handling is left to the application (glue) |
|
(0011834) bernhardf 09-24-08 00:21 |
I would strongly suggest to make sure that we can use minilzo instead of the rather big "normal" lzo. For reference: http://repo.or.cz/w/buildroot.git?a=blob_plain;f=package/lzo_mini/lzo_mini.mk;hb=HEAD [^] |
|
(0011844) vda 09-24-08 03:54 |
I'm not against glue code itself. I am just saying it is way, way too big. I understand that you will need the code like lzo_compress() and lzo_decompress() functions to handle .lzo data format and interface it with LZO library, but they are not big. The rest of 100k+ code in this patch is not needed. Start by adding .lzo handling to bbunzip.c which calls not-yet-existing unpack_lzo_stream(), then implement that one by taking only those parts of lzo code that you need. I bet this way it will be much smaller. |
|
(0012284) Alain137 09-27-08 12:17 |
I've made a new version (see attachment), which is much smaller: 36899 bytes of source, rather than 178272. I admit, 36899 is still big (for just glue code), but unfortunately, the lzop library is rather barebones, and _only_ supplies (de)compression of one block, and not much else. Interpretation of headers, chaining of blocks, calculation of checksums need all to be managed by the application. |
|
(0012294) vda 09-27-08 17:37 |
Looked at the code. e_method() is dead code, it can never be called: +static void e_method(int m) +{ + fprintf(stderr,"%s: illegal method option -- %c\n", + applet_name, m & 255); + e_usage(); +} ... +int lzo_set_method(int level) +{ + int l = level & 0xff; + int m = 0; + + if (l < 1 || l > 9) + return -1; /* not a LZO method */ ... +} ... + case '1': + case '2': + case '3': + case '4': + case '5': + case '6': + case '7': + case '8': + case '9': + if (lzo_set_method(optc - '0')) + e_method(optc); +extern const char *opt_complementary; #include <libbb.h> does this already, no? +int lzop_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE; +int lzop_main(int argc UNUSED_PARAM, char **argv) +{ + int i; + char *options="cfvdt123456789CFVh"; + opt_complementary="h-"; + int r=getopt32(argv, options); + argv += optind; + /* lzopcat? */ + if (applet_name[4] == 'c') + option_mask32 |= (OPT_STDOUT | OPT_DECOMPRESS); + /* unlzop? */ + if (applet_name[0] == 'u') + option_mask32 |= OPT_DECOMPRESS; + + /* By default, use method 3 */ + lzo_set_method(3); ... Use tabs for indentation. Above is a mix of tabs/spaces, which is visible in the diff. |
|
(0012304) Alain137 09-28-08 02:12 |
Uploaded a new version with the issues fixed, and a couple of other simplifications |
|
(0012314) vda 09-28-08 05:09 |
busybox already have a functions analogous to: read_buf() - full_read(0,...) write_buf() - xwrite(1,...) error() - bb_error_msg_and_die() io_error() - bb_perror_msg_and_die() Why lzo_set_method() is returning a value? It's always 0, thus carries no useful information. + if (dst_len > MAX_BLOCK_SIZE) { + error("lzop file corrupted"); + ok = 0; + break; + } error() exits, why here you have code after call to error()? + case 'V': + bb_error_msg("version: v%s, %s", + LZOP_VERSION_STRING, LZOP_VERSION_DATE); + exit(0); + break; We do not need to support ALL options. Options like -V, --version are usually not implemented in busybox applets. + if (read_buf(b1, src_len) != (lzo_int) src_len) + read_error(); We have xread() for this in busybox. +static int exit_code = EXIT_OK; Unused variable. -LDLIBS := +LDLIBS := lzo2 You need to add this _conditionally_. Example: ifeq ($(CONFIG_SELINUX),y) LDLIBS += selinux sepol endif + lzo_uint32 f_adler32; + lzo_uint32 f_crc32; still using spaces for indent here |
|
(0012464) Alain137 09-29-08 11:00 |
Just a quick note to notify you of some new versions that I attached yesterday. Patch5 is the most recent and smallest. I took account your recent suggestions (LIBS, busybox methods, indentation ...). Moreover, I merged everything into a single C file, bringing down the size to 26790. Thx, Alain |
|
(0012474) bernhardf 09-29-08 11:04 |
Note that LOC of source is not really relevant as opposed to the output of $ ./scripts/bloat-o-meter busybox_unstripped.old busybox_unstripped where the .old is without your code. can you please paste this? TIA, |
|
(0012484) bernhardf 09-29-08 12:11 |
A few comments on the patch itself: - the code should live in the archival/ directory - preferably it would be under GPLv2 "or later" - includes The first include in lzop.c has to be libbb.h, then your <lzo/.*.h> stanza and then you can remove all includes from stdio.h down to sys/stat.h - Since we are concerned about size, you should check building against lzo_mini (see previous reply; did you verify that?) - everything in "/* options*/" should live in globals, but we can fix that later - calloc() should be xzalloc() - lzo_decompress() if (b2) {b2=NULL;free(b2)} This can't be right. - magic numbers: 0 == STDIN_FILENO, 1 == STDOUT_FILENO etc. Please use these for clarity. - check_magic() is only used once. Inline it into the single caller. - init_compress_header: h->flags = (F_OS & F_OS_MASK) | (F_CS & F_CS_MASK); - do_compress odd prototype, you forgot "void" ... |
|
(0012494) vda 09-29-08 15:42 |
Having entire separate lzop/* directory for just one *.c file doesn't make sense. "config LZOP" should default to "n" as every other applet does. All other applets share usage.h, lzop is not that special to have separate one. Other compressors have compressor/decompressor selectable *separately* - it's possible to select bunzip2 but not bzip2. lzop should be similar. #include <stdio.h> #include <string.h> #include <limits.h> #include <sys/types.h> #include <unistd.h> #include <stdlib.h> #include <sys/stat.h> This all is already included by libbb.h lzo_byte * const b1 = calloc(1, block_size); lzo_byte * const b2 = calloc(1, MAX_COMPRESSED_SIZE(block_size)); Since you do not check for allocation errors anyway, at least use xmalloc() - it will die with an error message intead of a segfault. if (wrk_mem) free(wrk_mem); free(NULL) is safe, you don't need if() here. if (b2) { b2 = NULL; free(b2); } Whats going on here?? Okay, I tried to build it, and due to -Wundef being used by busubox I got gobs of errors from lzo2 headers. Okay, disabled that. Then I've got these: archival/lzop.c:237: error: function declaration isn't a prototype archival/lzop.c: In function 'f_read8': archival/lzop.c:237: error: old-style function definition archival/lzop.c: At top level: archival/lzop.c:247: error: function declaration isn't a prototype archival/lzop.c: In function 'f_read16': archival/lzop.c:247: error: old-style function definition archival/lzop.c: At top level: archival/lzop.c:257: error: function declaration isn't a prototype archival/lzop.c: In function 'f_read32': archival/lzop.c:257: error: old-style function definition archival/lzop.c: At top level: archival/lzop.c:706: error: function declaration isn't a prototype archival/lzop.c: In function 'do_compress': archival/lzop.c:706: error: old-style function definition archival/lzop.c:707: error: unused variable 'ok' archival/lzop.c: In function 'lzop_main': archival/lzop.c:783: error: ISO C90 forbids mixed declarations and code archival/lzop.c:783: error: unused variable 'r' Ooookay, fixing it all up... see attached 5.patch |
|
(0012504) vda 09-29-08 15:45 |
bloatcheck: function old new delta do_compress - 2756 +2756 lzo1x_999_compress_internal - 2099 +2099 find_match - 1962 +1962 lzo1x_optimize - 1487 +1487 lzo1x_decompress_safe - 1246 +1246 do_decompress - 1184 +1184 lzo_crc32_table - 1024 +1024 lzo1x_decompress - 980 +980 code_match - 539 +539 lzo_adler32 - 389 +389 lzo_crc32 - 385 +385 lzo1x_1_compress - 243 +243 lzo1x_1_15_compress - 243 +243 lzop_main - 228 +228 len_of_coded_match - 222 +222 static.c - 216 +216 STORE_RUN - 203 +203 better_match - 174 +174 lzo1x_999_compress_level - 99 +99 packed_usage 24949 25044 +95 make_new_name_lzop - 84 +84 code_run - 65 +65 copy3 - 62 +62 lzo_set_method - 55 +55 f_write - 43 +43 f_read_num - 43 +43 lzo_check - 42 +42 add_bytes_to_chksum - 40 +40 ... ------------------------------------------------------------------------------ (add/remove: 61/0 grow/shrink: 6/0 up/down: 16807/0) Total: 16807 bytes I think lzo2 itself should be busyboxed. 16kb+ is twice the size of busybox's bzip2 implementation. |
|
(0012554) Alain137 10-01-08 12:54 |
Sorry for the slow answer, I'm on the road right now :) > if (b2) { > b2 = NULL; > free(b2); > } > > Whats going on here?? A simple typo... should be the other way round: first the free, then the NULL assignment. > if (wrk_mem) > free(wrk_mem); > > free(NULL) is safe, you don't need if() here. ... and that is probably the reason why the error above went unnoticed: rather than segfaulting, free just did a no-op, and all that happened was a memory leak. ... more when I'm back next week |
|
(0012754) vda 10-04-08 10:17 |
a.patch is an attempt to pull in parts of lzo *library* so that we don't need to link against it. Boy, the author (Markus F.X.J. Oberhumer) wants to be so compatible with gazillions of compilers and platforms, he sacrificed code readability for that. Tons of macros + #ifdef forest. I'm not sure I untangled it all without breaking something. But this is a start. I gave up on extracting lzo1x_999_compress_level() out of lzo library. It's hopelessly intertwined with the rest. (Grep for "lzo1x_999_compress_level" to see where I disabled it in this patch). As a result, levels 7...9 won't use it, producing "less compressed" archives (level 6). I gave it a very limited testing (seems to compress and uncompress at level 1, resulting archives uncompress ok with stock lzop. And vice versa). Couple of notes: * stock lzop doesn't seem to have "unlzop" and "lzopcat", do we want to invent those? * stock lzop does not delete source file, unlike g[un]zip and b[un]zip2. * crc table can (should) be generated (busybox already has this code elsewhere) - save 1k of text segment. Alain137, can you test this and reinstate lzo1x_999_compress_level() usage? |
|
(0012764) vda 10-04-08 10:19 |
make bloatcheck: function old new delta do_compress - 4120 +4120 do_decompress - 1167 +1167 lzo_crc32_table - 1024 +1024 lzo1x_decompress_safe - 1010 +1010 ... ------------------------------------------------------------------------------ (add/remove: 30/0 grow/shrink: 5/0 up/down: 8772/0) Total: 8772 bytes |
|
(0012784) Alain137 10-04-08 13:07 |
> a.patch is an attempt to pull in parts of lzo *library* so that we don't need to link against it. Ok, thanks for that. > Boy, the author (Markus F.X.J. Oberhumer) wants to be so compatible with > gazillions of compilers and platforms, he sacrificed code readability for > that. I already noticed that in the glue code... and I've got the impression that most of this is for supporting legacy 80's area 16 bit (or even 8 bit) homecomputer processors. I think we can do away with that all... At least that's what I did for the glue code... > Tons of macros + #ifdef forest. I'm not sure I untangled it all without > breaking something. But this is a start. Indeed :( > I gave up on extracting lzo1x_999_compress_level() out of lzo library. It's > hopelessly intertwined with the rest. (Grep for "lzo1x_999_compress_level" to > see where I disabled it in this patch). As a result, levels 7...9 won't use > it, producing "less compressed" archives (level 6). If needed, we can do without it. However, I'll have a closer look at lzo1x_999_compress_level when I'm back home, maybe I'll be able to do something there. Or if not, we can still keep the original for 999, but make it optional. > I gave it a very limited testing (seems to compress and uncompress at level > 1, resulting archives uncompress ok with stock lzop. And vice versa). The tests that I did was attempting to compress mplayer using busybox lzop and decompress the result using stock lzop, and then vice-versa. I used mplayer as a test object, as smaller files (such as /bin/ls) didn't catch all bugs. An even better test would be to piece together a file from one large executable, and bits of another file that's already compressed using gzip. This would be to test correct handling of "stored" blocks. I'll do some tests when I'll be back home. > Couple of notes: > * stock lzop doesn't seem to have "unlzop" and "lzopcat", do we want to > invent those? I introduced those to mirror gzip (gunzip, zcat) and lzma... > * stock lzop does not delete source file, unlike g[un]zip and b[un]zip2. Indeed, that was part of the original byzantine glue code. You could pass it a command-line argument to delete the original file. All this went out of the window when I trimmed down the glue code: now lzop behaves like all the other compressors, and unconditionnally removes original file after compressing it. There are other parts which went away too, such as optionally storing original file name in the compressed file, storing the mode and the modification time. Keeping such functionality would have made it impossible to use the generic bbunzip function. > * crc table can (should) be generated (busybox already has this code > elsewhere) - save 1k of text segment. Fully agreed. > Alain137, can you test this and reinstate lzo1x_999_compress_level() usage? I'll do as soon as I'm back home. |
|
(0013144) Alain137 10-08-08 17:23 |
I've just attached patch 7. Changes: - crc32 now uses crc32_filltable rather than wasting 1k of statically initialized variable - lzo1x_999 compression implemented - fixed problem with level (levels 789 were not distinguished amongst each other) Total "bloat" count is now 11740... of which 2935 at least are directly accountable to lzo1x_999 compression. So, this algorithm seems to be indeed larger than the other 2, this depsite some optimizations (removal of useless statistics in lzo1x_999_t structure, elimination of code duplication, usage of constants where appropriate) |
|
(0013164) Alain137 10-08-08 23:26 |
Patch8 brings bloat down to 11292 bytes. Moreover, it makes the higher compression levels (789) optional. Indeed, these high levels are actually slower than gzip -2 while having a worse ratio, so in 3 words: they are pointless. Level 6 on the other hand does make sense: it is more than twice as fast as gzip -1 at only a 20% worse ratio. Bloat without level 789 compression is 7928 bytes. |
| Copyright © 2000 - 2006 Mantis Group |