linux-mm.kvack.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, dave.rodgman@arm.com,
	linux-mm@kvack.org, mark.rutland@arm.com, markus@oberhumer.com,
	minchan@kernel.org, mm-commits@vger.kernel.org,
	ngupta@vflare.org, sergey.senozhatsky.work@gmail.com,
	stable@vger.kernel.org, torvalds@linux-foundation.org, w@1wt.eu,
	yuchao0@huawei.com
Subject: [patch 4/5] lib/lzo: fix ambiguous encoding bug in lzo-rle
Date: Thu, 11 Jun 2020 17:34:54 -0700
Message-ID: <20200612003454.cUjqmKdwv%akpm@linux-foundation.org>
In-Reply-To: <20200611173002.24352ae77ca6d6d7e65e4b2a@linux-foundation.org>

From: Dave Rodgman <dave.rodgman@arm.com>
Subject: lib/lzo: fix ambiguous encoding bug in lzo-rle

In some rare cases, for input data over 32 KB, lzo-rle could encode two
different inputs to the same compressed representation, so that
decompression is then ambiguous (i.e.  data may be corrupted - although
zram is not affected because it operates over 4 KB pages).

This modifies the compressor without changing the decompressor or the
bitstream format, such that:

- there is no change to how data produced by the old compressor is
  decompressed

- an old decompressor will correctly decode data from the updated
  compressor

- performance and compression ratio are not affected

- we avoid introducing a new bitstream format

In testing over 12.8M real-world files totalling 903 GB, three files were
affected by this bug.  I also constructed 37M semi-random 64 KB files
totalling 2.27 TB, and saw no affected files.  Finally I tested over files
constructed to contain each of the ~1024 possible bad input sequences; for
all of these cases, updated lzo-rle worked correctly.

There is no significant impact to performance or compression ratio.

Link: http://lkml.kernel.org/r/20200507100203.29785-1-dave.rodgman@arm.com
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Dave Rodgman <dave.rodgman@arm.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
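Not part of the patch: a minimal sketch of why the documentation hunk masks
the distance with 0x803f while the compressor hunk masks m_off with 0x403f.
It assumes that m_off has already been reduced by 0x4000 when the new check
runs (that subtraction is not visible in the hunk context), so that
distance = m_off + 0x4000.  All helper names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: inside this branch the compressor holds distance - 0x4000. */
static unsigned int m_off_from_distance(unsigned int distance)
{
	/* Distances this instruction can encode, end-of-stream excluded. */
	assert(distance > 0x4000 && distance <= 0xbfff);
	return distance - 0x4000;
}

/* Condition as worded in the Documentation/lzo.txt hunk. */
static bool doc_condition(unsigned int distance, size_t length)
{
	return (distance & 0x803f) == 0x803f &&
	       length >= 261 && length <= 264;
}

/* Condition as written in the lzo1x_compress.c hunk (version test omitted). */
static bool code_condition(unsigned int m_off, size_t m_len)
{
	return (m_off & 0x403f) == 0x403f &&
	       m_len >= 261 && m_len <= 264;
}

/* Counts (distance, length) pairs on which the two forms disagree. */
static unsigned int count_disagreements(void)
{
	unsigned int n = 0;

	for (unsigned int d = 0x4001; d <= 0xbfff; d++)
		for (size_t len = 3; len <= 600; len++)
			if (doc_condition(d, len) !=
			    code_condition(m_off_from_distance(d), len))
				n++;
	return n;
}

/* Counts (distance, length) pairs that the new check would clamp. */
static unsigned int count_clamped(void)
{
	unsigned int n = 0;

	for (unsigned int d = 0x4001; d <= 0xbfff; d++)
		for (size_t len = 3; len <= 600; len++)
			if (code_condition(m_off_from_distance(d), len))
				n++;
	return n;
}
```

Under these assumptions the two forms agree over the whole range, and the
check matches 256 distances times 4 lengths, i.e. 1024 pairs, which lines up
with the ~1024 bad input sequences mentioned in the changelog.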

 Documentation/lzo.txt    |    8 ++++++--
 lib/lzo/lzo1x_compress.c |   13 +++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

--- a/Documentation/lzo.txt~lib-lzo-fix-ambiguous-encoding-bug-in-lzo-rle
+++ a/Documentation/lzo.txt
@@ -159,11 +159,15 @@ Byte sequences
            distance = 16384 + (H << 14) + D
            state = S (copy S literals after this block)
            End of stream is reached if distance == 16384
+           In version 1 only, to prevent ambiguity with the RLE case when
+           ((distance & 0x803f) == 0x803f) && (261 <= length <= 264), the
+           compressor must not emit block copies where distance and length
+           meet these conditions.
 
         In version 1 only, this instruction is also used to encode a run of
-        zeros if distance = 0xbfff, i.e. H = 1 and the D bits are all 1.
+           zeros if distance = 0xbfff, i.e. H = 1 and the D bits are all 1.
            In this case, it is followed by a fourth byte, X.
-           run length = ((X << 3) | (0 0 0 0 0 L L L)) + 4.
+           run length = ((X << 3) | (0 0 0 0 0 L L L)) + 4
 
       0 0 1 L L L L L  (32..63)
            Copy of small block within 16kB distance (preferably less than 34B)
--- a/lib/lzo/lzo1x_compress.c~lib-lzo-fix-ambiguous-encoding-bug-in-lzo-rle
+++ a/lib/lzo/lzo1x_compress.c
@@ -268,6 +268,19 @@ m_len_done:
 				*op++ = (M4_MARKER | ((m_off >> 11) & 8)
 						| (m_len - 2));
 			else {
+				if (unlikely(((m_off & 0x403f) == 0x403f)
+						&& (m_len >= 261)
+						&& (m_len <= 264))
+						&& likely(bitstream_version)) {
+					// Under lzo-rle, block copies
+					// for 261 <= length <= 264 and
+					// (distance & 0x803f) == 0x803f
+					// can result in ambiguous
+					// output. Adjust length
+					// to 260 to prevent ambiguity.
+					ip -= m_len - 260;
+					m_len = 260;
+				}
 				m_len -= M4_MAX_LEN;
 				*op++ = (M4_MARKER | ((m_off >> 11) & 8));
 				while (unlikely(m_len > 255)) {
_

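Not part of the patch: a rough byte-level sketch of how a block copy could be
mistaken for a zero run.  It rests on several assumptions that the hunks
above do not show: that the instruction byte is followed by the length
extension byte(s) and then an LE16 holding (D << 2) | S, that M4_MARKER is
0x10 and M4_MAX_LEN is 9, and that the decompressor recognises a zero run by
testing the two bytes directly after the instruction byte.  Both helpers are
hypothetical and only mimic the shape of the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Emit a long block copy.  m_off is assumed to be distance - 0x4000 and
 * state is the number of literals that follow (0..3).  Returns the number
 * of bytes written.
 */
static size_t emit_long_copy(uint8_t *op, unsigned int m_off,
			     size_t m_len, unsigned int state)
{
	uint8_t *start = op;

	assert(m_len > 9 && state <= 3);
	m_len -= 9;				/* assumed M4_MAX_LEN */
	*op++ = 0x10 | ((m_off >> 11) & 8);	/* assumed M4_MARKER, H bit, L = 0 */
	while (m_len > 255) {
		m_len -= 255;
		*op++ = 0;
	}
	*op++ = (uint8_t)m_len;
	*op++ = (uint8_t)((m_off << 2) | state);	/* low byte of LE16 */
	*op++ = (uint8_t)(m_off >> 6);			/* high byte of LE16 */
	return (size_t)(op - start);
}

/*
 * Assumed zero-run test: H = 1 in the instruction byte and all D bits set
 * in the LE16 read from the two bytes that follow it.
 */
static bool looks_like_zero_run(const uint8_t *ip)
{
	unsigned int next = ip[1] | ((unsigned int)ip[2] << 8);

	return (ip[0] & 0xf8) == 0x18 && (next & 0xfffc) == 0xfffc;
}
```

In this model, a copy of length 261 at distance 0xbfff followed by three
literals is emitted as 18 fc ff ff, which passes the zero-run test, whereas
the same copy clamped to length 260 is emitted as 18 fb ff ff and does not.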
