BusyBox Bug and Patch Tracking
BusyBox
  

Viewing Issue Simple Details Jump to Notes ] View Advanced ] Issue History ] Print ]
ID Category Severity Reproducibility Date Submitted Last Update
0004944 [BusyBox] New Features feature N/A 09-13-08 00:54 10-08-08 23:26
Reporter Alain137 View Status public  
Assigned To BusyBox
Priority normal Resolution open  
Status assigned   Product Version 1.12.x
Summary 0004944: patch to support lzop compression/decompression
Description This is a patch to add the lzop compressor/decompressor to busybox.
Lzop is a compressor that is more than twice as fast as gzip at its fastest setting, and is thus useful in any application where compression/decompression speed is more important than ratio. For instance, I use lzop with udcpast to speed up network transfers (gzip would actually make the transfer slower, as any time gain obtained due to having less data to transfer would be lost due to the time it takes to compress)

Compression:

> time lzop -c /usr/bin/mplayer >/dev/null
lzop -c /usr/bin/mplayer > /dev/null 0,33s user 0,01s system 100% cpu 0,337 total
> time gzip -1 -c /usr/bin/mplayer >/dev/null
gzip -1 -c /usr/bin/mplayer > /dev/null 0,80s user 0,03s system 94% cpu 0,882 total


Decompression:
> time gzip -dc mplayer.gz >/dev/null
gzip -dc mplayer.gz > /dev/null 0,27s user 0,01s system 97% cpu 0,288 total
> time lzop -dc mplayer.lzo >/dev/null
lzop -dc mplayer.lzo > /dev/null 0,10s user 0,01s system 100% cpu 0,112 total


For building it, a liblzo2 static library must be present on the system (for example, liblzo2-dev on a Kubuntu system)
Additional Information
Attached Files  lzop.diff [^] (178,272 bytes) 09-13-08 00:54
 busybox-1.12.0-lzop.patch [^] (36,899 bytes) 09-27-08 12:11
 busybox-1.12.0-lzop-3.patch [^] (35,088 bytes) 09-28-08 02:10
 busybox-1.12.0-lzop-4.patch [^] (31,989 bytes) 09-28-08 15:35
 busybox-1.12.0-lzop-5.patch [^] (26,790 bytes) 09-28-08 16:49
 5.patch [^] (24,831 bytes) 09-29-08 15:42
 a.patch [^] (70,550 bytes) 10-04-08 10:07
 busybox-1.12.0-lzop-7.patch [^] (90,280 bytes) 10-08-08 17:15
 busybox-1.12.0-lzop-8.patch [^] (88,379 bytes) 10-08-08 23:21

- Relationships

- Notes
(0011714)
vda
09-20-08 07:54
edited on: 09-20-08 07:56

The patch seems to contain only the glue needed to interface with LZO library. It does not contain LZO compression code itself. The "innermost" routines are lzo_compress() and lzo_decompress() but from there they just call into liblzo2 (lzo1x_1_15_compress() etc).

And this glue code is big. I mean, it's VERY BIG.

For comparison: look into bbunzip.c, unlzma glue code is less than one screensful:

static char* make_new_name_unlzma(char *filename)
{
        return make_new_name_generic(filename, "lzma");
}
static USE_DESKTOP(long long) int unpack_unlzma(void)
{
        return unpack_lzma_stream(STDIN_FILENO, STDOUT_FILENO);
}
int unlzma_main(int argc UNUSED_PARAM, char **argv)
{
        getopt32(argv, "cf");
        argv += optind;
        /* lzmacat? */
        if (applet_name[4] == 'c')
                option_mask32 |= OPT_STDOUT;
        return bbunpack(argv, make_new_name_unlzma, unpack_unlzma);
}

lzop glue code should be just added to this file in a similar way.

 
(0011824)
Alain137
09-23-08 22:30

Well, part of the problem is that the LZO library (apparently?) only handles the actual compressed data, but not the header. So, header handling is left to the application (glue)
 
(0011834)
bernhardf
09-24-08 00:21

I would strongly suggest to make sure that we can use minilzo instead of the rather big "normal" lzo.

For reference:
http://repo.or.cz/w/buildroot.git?a=blob_plain;f=package/lzo_mini/lzo_mini.mk;hb=HEAD [^]
 
(0011844)
vda
09-24-08 03:54

I'm not against glue code itself. I am just saying it is way, way too big.

I understand that you will need the code like lzo_compress() and lzo_decompress() functions to handle .lzo data format and interface it with LZO library, but they are not big. The rest of 100k+ code in this patch is not needed.

Start by adding .lzo handling to bbunzip.c which calls not-yet-existing unpack_lzo_stream(), then implement that one by taking only those parts of lzo code that you need. I bet this way it will be much smaller.
 
(0012284)
Alain137
09-27-08 12:17

I've made a new version (see attachment), which is much smaller: 36899 bytes of source, rather than 178272.

I admit, 36899 is still big (for just glue code), but unfortunately, the lzop library is rather barebones, and _only_ supplies (de)compression of one block, and not much else. Interpretation of headers, chaining of blocks, calculation of checksums need all to be managed by the application.
 
(0012294)
vda
09-27-08 17:37

Looked at the code.

e_method() is dead code, it can never be called:

+static void e_method(int m)
+{
+ fprintf(stderr,"%s: illegal method option -- %c\n",
+ applet_name, m & 255);
+ e_usage();
+}
...
+int lzo_set_method(int level)
+{
+ int l = level & 0xff;
+ int m = 0;
+
+ if (l < 1 || l > 9)
+ return -1; /* not a LZO method */
...
+}
...
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ if (lzo_set_method(optc - '0'))
+ e_method(optc);



+extern const char *opt_complementary;

#include <libbb.h> does this already, no?



+int lzop_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
+int lzop_main(int argc UNUSED_PARAM, char **argv)
+{
+ int i;
+ char *options="cfvdt123456789CFVh";
+ opt_complementary="h-";
+ int r=getopt32(argv, options);
+ argv += optind;
+ /* lzopcat? */
+ if (applet_name[4] == 'c')
+ option_mask32 |= (OPT_STDOUT | OPT_DECOMPRESS);
+ /* unlzop? */
+ if (applet_name[0] == 'u')
+ option_mask32 |= OPT_DECOMPRESS;
+
+ /* By default, use method 3 */
+ lzo_set_method(3);
...

Use tabs for indentation. Above is a mix of tabs/spaces, which is visible in the diff.
 
(0012304)
Alain137
09-28-08 02:12

Uploaded a new version with the issues fixed, and a couple of other simplifications
 
(0012314)
vda
09-28-08 05:09

busybox already have a functions analogous to:
read_buf() - full_read(0,...)
write_buf() - xwrite(1,...)
error() - bb_error_msg_and_die()
io_error() - bb_perror_msg_and_die()

Why lzo_set_method() is returning a value? It's always 0, thus carries no useful information.


+ if (dst_len > MAX_BLOCK_SIZE) {
+ error("lzop file corrupted");
+ ok = 0;
+ break;
+ }

error() exits, why here you have code after call to error()?


+ case 'V':
+ bb_error_msg("version: v%s, %s",
+ LZOP_VERSION_STRING, LZOP_VERSION_DATE);
+ exit(0);
+ break;

We do not need to support ALL options. Options like -V, --version are usually not implemented in busybox applets.


+ if (read_buf(b1, src_len) != (lzo_int) src_len)
+ read_error();

We have xread() for this in busybox.


+static int exit_code = EXIT_OK;

Unused variable.


-LDLIBS :=
+LDLIBS := lzo2

You need to add this _conditionally_. Example:
ifeq ($(CONFIG_SELINUX),y)
LDLIBS += selinux sepol
endif


+ lzo_uint32 f_adler32;
+ lzo_uint32 f_crc32;

still using spaces for indent here
 
(0012464)
Alain137
09-29-08 11:00

Just a quick note to notify you of some new versions that I attached yesterday.
Patch5 is the most recent and smallest. I took account your recent suggestions (LIBS, busybox methods, indentation ...). Moreover, I merged everything into a single C file, bringing down the size to 26790.

Thx,

Alain
 
(0012474)
bernhardf
09-29-08 11:04

Note that LOC of source is not really relevant as opposed to the output of
$ ./scripts/bloat-o-meter busybox_unstripped.old busybox_unstripped
where the .old is without your code.

can you please paste this?
TIA,
 
(0012484)
bernhardf
09-29-08 12:11

A few comments on the patch itself:
- the code should live in the archival/ directory
- preferably it would be under GPLv2 "or later"
- includes
  The first include in lzop.c has to be libbb.h, then your <lzo/.*.h> stanza
  and then you can remove all includes from stdio.h down to sys/stat.h
- Since we are concerned about size, you should check building against lzo_mini
  (see previous reply; did you verify that?)
- everything in "/* options*/" should live in globals, but we can fix that later
- calloc() should be xzalloc()
- lzo_decompress() if (b2) {b2=NULL;free(b2)} This can't be right.
- magic numbers: 0 == STDIN_FILENO, 1 == STDOUT_FILENO etc. Please use these for
  clarity.
- check_magic() is only used once. Inline it into the single caller.
- init_compress_header: h->flags = (F_OS & F_OS_MASK) | (F_CS & F_CS_MASK);
- do_compress odd prototype, you forgot "void"
...
 
(0012494)
vda
09-29-08 15:42

Having entire separate lzop/* directory for just one *.c file doesn't make sense.

"config LZOP" should default to "n" as every other applet does.

All other applets share usage.h, lzop is not that special to have separate one.

Other compressors have compressor/decompressor selectable *separately* - it's possible to select bunzip2 but not bzip2. lzop should be similar.

#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/stat.h>

This all is already included by libbb.h

        lzo_byte * const b1 = calloc(1, block_size);
        lzo_byte * const b2 = calloc(1, MAX_COMPRESSED_SIZE(block_size));

Since you do not check for allocation errors anyway,
at least use xmalloc() - it will die with an error message
intead of a segfault.

        if (wrk_mem)
                free(wrk_mem);

free(NULL) is safe, you don't need if() here.



                        if (b2) {
                                b2 = NULL;
                                free(b2);
                        }

Whats going on here??


Okay, I tried to build it, and due to -Wundef being used by busubox
I got gobs of errors from lzo2 headers. Okay, disabled that.
Then I've got these:

archival/lzop.c:237: error: function declaration isn't a prototype
archival/lzop.c: In function 'f_read8':
archival/lzop.c:237: error: old-style function definition
archival/lzop.c: At top level:
archival/lzop.c:247: error: function declaration isn't a prototype
archival/lzop.c: In function 'f_read16':
archival/lzop.c:247: error: old-style function definition
archival/lzop.c: At top level:
archival/lzop.c:257: error: function declaration isn't a prototype
archival/lzop.c: In function 'f_read32':
archival/lzop.c:257: error: old-style function definition
archival/lzop.c: At top level:
archival/lzop.c:706: error: function declaration isn't a prototype
archival/lzop.c: In function 'do_compress':
archival/lzop.c:706: error: old-style function definition
archival/lzop.c:707: error: unused variable 'ok'
archival/lzop.c: In function 'lzop_main':
archival/lzop.c:783: error: ISO C90 forbids mixed declarations and code
archival/lzop.c:783: error: unused variable 'r'


Ooookay, fixing it all up... see attached 5.patch
 
(0012504)
vda
09-29-08 15:45

bloatcheck:

function old new delta
do_compress - 2756 +2756
lzo1x_999_compress_internal - 2099 +2099
find_match - 1962 +1962
lzo1x_optimize - 1487 +1487
lzo1x_decompress_safe - 1246 +1246
do_decompress - 1184 +1184
lzo_crc32_table - 1024 +1024
lzo1x_decompress - 980 +980
code_match - 539 +539
lzo_adler32 - 389 +389
lzo_crc32 - 385 +385
lzo1x_1_compress - 243 +243
lzo1x_1_15_compress - 243 +243
lzop_main - 228 +228
len_of_coded_match - 222 +222
static.c - 216 +216
STORE_RUN - 203 +203
better_match - 174 +174
lzo1x_999_compress_level - 99 +99
packed_usage 24949 25044 +95
make_new_name_lzop - 84 +84
code_run - 65 +65
copy3 - 62 +62
lzo_set_method - 55 +55
f_write - 43 +43
f_read_num - 43 +43
lzo_check - 42 +42
add_bytes_to_chksum - 40 +40
...
------------------------------------------------------------------------------
(add/remove: 61/0 grow/shrink: 6/0 up/down: 16807/0) Total: 16807 bytes

I think lzo2 itself should be busyboxed. 16kb+ is twice the size of busybox's bzip2 implementation.
 
(0012554)
Alain137
10-01-08 12:54

Sorry for the slow answer, I'm on the road right now :)

> if (b2) {
> b2 = NULL;
> free(b2);
> }
>
> Whats going on here??

A simple typo... should be the other way round: first the free, then the NULL assignment.

> if (wrk_mem)
> free(wrk_mem);
>
> free(NULL) is safe, you don't need if() here.

... and that is probably the reason why the error above went unnoticed: rather than segfaulting, free just did a no-op, and all that happened was a memory leak.

... more when I'm back next week
 
(0012754)
vda
10-04-08 10:17

a.patch is an attempt to pull in parts of lzo *library* so that we don't need to link against it.

Boy, the author (Markus F.X.J. Oberhumer) wants to be so compatible with gazillions of compilers and platforms, he sacrificed code readability for that.
Tons of macros + #ifdef forest. I'm not sure I untangled it all without breaking something. But this is a start.

I gave up on extracting lzo1x_999_compress_level() out of lzo library. It's hopelessly intertwined with the rest. (Grep for "lzo1x_999_compress_level" to see where I disabled it in this patch). As a result, levels 7...9 won't use it, producing "less compressed" archives (level 6).

I gave it a very limited testing (seems to compress and uncompress at level 1, resulting archives uncompress ok with stock lzop. And vice versa).

Couple of notes:
* stock lzop doesn't seem to have "unlzop" and "lzopcat", do we want to invent those?
* stock lzop does not delete source file, unlike g[un]zip and b[un]zip2.
* crc table can (should) be generated (busybox already has this code elsewhere) - save 1k of text segment.

Alain137, can you test this and reinstate lzo1x_999_compress_level() usage?
 
(0012764)
vda
10-04-08 10:19

make bloatcheck:

function old new delta
do_compress - 4120 +4120
do_decompress - 1167 +1167
lzo_crc32_table - 1024 +1024
lzo1x_decompress_safe - 1010 +1010
...
------------------------------------------------------------------------------
(add/remove: 30/0 grow/shrink: 5/0 up/down: 8772/0) Total: 8772 bytes
 
(0012784)
Alain137
10-04-08 13:07

> a.patch is an attempt to pull in parts of lzo *library* so that we don't need to link against it.

Ok, thanks for that.

> Boy, the author (Markus F.X.J. Oberhumer) wants to be so compatible with
> gazillions of compilers and platforms, he sacrificed code readability for
> that.

I already noticed that in the glue code... and I've got the impression that most of this is for supporting legacy 80's area 16 bit (or even 8 bit) homecomputer processors. I think we can do away with that all... At least that's what I did for the glue code...

> Tons of macros + #ifdef forest. I'm not sure I untangled it all without
> breaking something. But this is a start.

Indeed :(

> I gave up on extracting lzo1x_999_compress_level() out of lzo library. It's
> hopelessly intertwined with the rest. (Grep for "lzo1x_999_compress_level" to > see where I disabled it in this patch). As a result, levels 7...9 won't use
> it, producing "less compressed" archives (level 6).

If needed, we can do without it. However, I'll have a closer look at lzo1x_999_compress_level when I'm back home, maybe I'll be able to do something there. Or if not, we can still keep the original for 999, but make it optional.

> I gave it a very limited testing (seems to compress and uncompress at level
> 1, resulting archives uncompress ok with stock lzop. And vice versa).

The tests that I did was attempting to compress mplayer using busybox lzop and decompress the result using stock lzop, and then vice-versa. I used mplayer as a test object, as smaller files (such as /bin/ls) didn't catch all bugs.
An even better test would be to piece together a file from one large executable, and bits of another file that's already compressed using gzip. This would be to test correct handling of "stored" blocks.

I'll do some tests when I'll be back home.

> Couple of notes:
> * stock lzop doesn't seem to have "unlzop" and "lzopcat", do we want to
> invent those?

I introduced those to mirror gzip (gunzip, zcat) and lzma...

> * stock lzop does not delete source file, unlike g[un]zip and b[un]zip2.

Indeed, that was part of the original byzantine glue code. You could pass it a command-line argument to delete the original file. All this went out of the window when I trimmed down the glue code: now lzop behaves like all the other compressors, and unconditionnally removes original file after compressing it.
There are other parts which went away too, such as optionally storing original file name in the compressed file, storing the mode and the modification time. Keeping such functionality would have made it impossible to use the generic bbunzip function.
 
> * crc table can (should) be generated (busybox already has this code
> elsewhere) - save 1k of text segment.

Fully agreed.

> Alain137, can you test this and reinstate lzo1x_999_compress_level() usage?

I'll do as soon as I'm back home.
 
(0013144)
Alain137
10-08-08 17:23

I've just attached patch 7.

Changes:

- crc32 now uses crc32_filltable rather than wasting 1k of statically initialized variable
- lzo1x_999 compression implemented
- fixed problem with level (levels 789 were not distinguished amongst each other)

Total "bloat" count is now 11740... of which 2935 at least are directly accountable to lzo1x_999 compression. So, this algorithm seems to be indeed larger than the other 2, this depsite some optimizations (removal of useless statistics in lzo1x_999_t structure, elimination of code duplication, usage of constants where appropriate)
 
(0013164)
Alain137
10-08-08 23:26

Patch8 brings bloat down to 11292 bytes.

Moreover, it makes the higher compression levels (789) optional. Indeed, these high levels are actually slower than gzip -2 while having a worse ratio, so in 3 words: they are pointless.
Level 6 on the other hand does make sense: it is more than twice as fast as gzip -1 at only a 20% worse ratio.

Bloat without level 789 compression is 7928 bytes.
 

- Issue History
Date Modified Username Field Change
09-13-08 00:54 Alain137 New Issue
09-13-08 00:54 Alain137 Status new => assigned
09-13-08 00:54 Alain137 Assigned To  => BusyBox
09-13-08 00:54 Alain137 File Added: lzop.diff
09-20-08 07:54 vda Note Added: 0011714
09-20-08 07:56 vda Note Edited: 0011714
09-23-08 22:30 Alain137 Note Added: 0011824
09-24-08 00:21 bernhardf Note Added: 0011834
09-24-08 03:54 vda Note Added: 0011844
09-27-08 12:11 Alain137 File Added: busybox-1.12.0-lzop.patch
09-27-08 12:17 Alain137 Note Added: 0012284
09-27-08 17:37 vda Note Added: 0012294
09-28-08 02:10 Alain137 File Added: busybox-1.12.0-lzop-3.patch
09-28-08 02:12 Alain137 Note Added: 0012304
09-28-08 05:09 vda Note Added: 0012314
09-28-08 15:35 Alain137 File Added: busybox-1.12.0-lzop-4.patch
09-28-08 16:49 Alain137 File Added: busybox-1.12.0-lzop-5.patch
09-29-08 11:00 Alain137 Note Added: 0012464
09-29-08 11:04 bernhardf Note Added: 0012474
09-29-08 12:11 bernhardf Note Added: 0012484
09-29-08 15:42 vda Note Added: 0012494
09-29-08 15:42 vda File Added: 5.patch
09-29-08 15:45 vda Note Added: 0012504
10-01-08 12:54 Alain137 Note Added: 0012554
10-04-08 10:07 vda File Added: a.patch
10-04-08 10:17 vda Note Added: 0012754
10-04-08 10:19 vda Note Added: 0012764
10-04-08 13:07 Alain137 Note Added: 0012784
10-08-08 17:15 Alain137 File Added: busybox-1.12.0-lzop-7.patch
10-08-08 17:23 Alain137 Note Added: 0013144
10-08-08 23:21 Alain137 File Added: busybox-1.12.0-lzop-8.patch
10-08-08 23:26 Alain137 Note Added: 0013164


Copyright © 2000 - 2006 Mantis Group
Powered by Mantis Bugtracker