From: "Um, Taeil" <taeilum@amazon.com>
To: Dan Streetman <ddstreet@ieee.org>
Cc: Linux-MM <linux-mm@kvack.org>, Seth Jennings <sjenning@redhat.com>
Subject: Re: zswap: use PAGE_SIZE * 2 for compression dst buffer size when calling crypto compression API
Date: Fri, 21 Sep 2018 01:01:43 +0000	[thread overview]
Message-ID: <A5D25E6B-137C-4E09-9353-30B36C2B192E@amazon.com> (raw)
In-Reply-To: <CALZtONB-y=ePYMZjtRiyfCYbWJ=R-xaR2NHPafzYMohtKOUSYg@mail.gmail.com>

   > do you have a specific example of how this causes any actual problem?
Yes. I have an H/W accelerator that tries to finish the compression even when the compressed data size is greater than the source data size.
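
With an engine like that, the caller has to judge the result from the returned *dlen (the way zram does, as quoted below) instead of expecting the driver to stop early. A rough sketch of what I mean, not the proposed patch itself; tfm, src and dst are assumed to be set up as in zswap_frontswap_store(), and -ENOSPC here just means "fall back to storing the page uncompressed":

#include <linux/crypto.h>       /* crypto_comp_compress() */
#include <linux/errno.h>
#include <asm/page.h>           /* PAGE_SIZE */

static int compress_one_page(struct crypto_comp *tfm, const u8 *src, u8 *dst)
{
        unsigned int dlen = PAGE_SIZE * 2;      /* the real size of zswap_dstmem */
        int ret;

        ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
        if (ret)
                return ret;             /* genuine driver error */
        if (dlen >= PAGE_SIZE)
                return -ENOSPC;         /* did not compress; not worth storing */

        return dlen;                    /* usable compressed length */
}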

   > personally, i'd prefer reducing zswap_dstmem down to 1 page to save
   > memory, since there is no case where zswap would ever want to use a
   > compressed page larger than that.
I don't think "reducing zswap_dstmem down to 1 page" works with today's lzo and lz4 kernel implementations.
If I read the kernel's lzo and lz4 code correctly, they do not stop compressing when the compressed data size is greater than the source data size.
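
For what it's worth, the lzo and lz4 headers themselves size the worst case above the input, which is why a one-page destination buffer cannot hold every possible result from a compressor that always finishes. A small illustration using the existing bound helpers (assuming I am reading include/linux/lzo.h and include/linux/lz4.h correctly; this is not part of the patch):

#include <linux/lzo.h>          /* lzo1x_worst_compress() */
#include <linux/lz4.h>          /* LZ4_COMPRESSBOUND() */
#include <asm/page.h>           /* PAGE_SIZE */

/* Worst-case compressed sizes for one page of input; both exceed PAGE_SIZE. */
static inline size_t worst_case_dst_for_one_page(void)
{
        size_t lzo_worst = lzo1x_worst_compress(PAGE_SIZE);
        size_t lz4_worst = LZ4_COMPRESSBOUND(PAGE_SIZE);

        return lzo_worst > lz4_worst ? lzo_worst : lz4_worst;
}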

On 9/19/18, 8:47 AM, "Dan Streetman" <ddstreet@ieee.org> wrote:

    On Tue, Sep 18, 2018 at 7:48 PM Um, Taeil <taeilum@amazon.com> wrote:
    >
    > We can tell whether the compressed size is greater than PAGE_SIZE by looking at the returned *dlen value from crypto_comp_compress. This should be fairly easy.
    > This is actually what zram does today: zram looks at *dlen and does not take the compressed result if *dlen is greater than a certain size.
    > I think errors from crypto_comp_compress should be real errors.
    >
    > Today, in-kernel compression drivers such as lzo and lz4 do not stop just because the compressed result is larger than the source.
    > Also, some H/W accelerators would not have the option of stopping compression just because the result is larger than the source.
    
    do you have a specific example of how this causes any actual problem?
    
    personally, i'd prefer reducing zswap_dstmem down to 1 page to save
    memory, since there is no case where zswap would ever want to use a
    compressed page larger than that.
    
    >
    > Thank you,
    > Taeil
    >
    > On 9/18/18, 2:44 PM, "Dan Streetman" <ddstreet@ieee.org> wrote:
    >
    >     On Tue, Sep 18, 2018 at 2:52 PM Um, Taeil <taeilum@amazon.com> wrote:
    >     >
    >     > Problem statement:
    >     > "compressed data are not fully copied to destination buffer when compressed data size is greater than source data"
    >     >
    >     > Why:
    >     > The 5th argument of the crypto_comp_compress function is *dlen, which tells the compression driver how many bytes are allocated in the destination buffer (i.e., how much it is allowed to write).
    >     > This *dlen is especially important for H/W accelerator based compression drivers, because it is dangerous to let the H/W accelerator access memory beyond dst + *dlen.
    >     > Note that the buffer location would be passed as a physical address.
    >     > For the above reason, an H/W accelerator based compression driver needs to honor the *dlen value when it serves the crypto_comp_compress API.
    >
    >     and that's exactly what zswap wants to happen - any compressor (hw or
    >     sw) should fail with an error code (ENOSPC makes the most sense, but
    >     zswap doesn't actually care) if the compressed data size is larger
    >     than the provided data buffer.
    >
    >     > Today, we pass slen = PAGE_SIZE and *dlen = PAGE_SIZE to crypto_comp_compress in zswap.c.
    >     > If the compressed data size is greater than the source (uncompressed) data size, the H/W accelerator cannot copy (deliver) the entire compressed data.
    >
    >     If the "compressed" data is larger than 1 page, then there is no point
    >     in storing the page in zswap.
    >
    >     remember that zswap is different than zram; in zram, there's no other
    >     place to store the data.  However, with zswap, if compression fails or
    >     isn't good, we can just pass the uncompressed page down to the swap
    >     device.
    >
    >     >
    >     > Thank you,
    >     > Taeil
    >     >
    >     > On 9/18/18, 7:15 AM, "Dan Streetman" <ddstreet@ieee.org> wrote:
    >     >
    >     >     On Mon, Sep 17, 2018 at 7:10 PM Um, Taeil <taeilum@amazon.com> wrote:
    >     >     >
    >     >     > Currently, we allocate PAGE_SIZE * 2 for zswap_dstmem, which is used as the compression destination buffer.
    >     >     >
    >     >     > However, we pass only half of that size (PAGE_SIZE) to crypto_comp_compress.
    >     >     >
    >     >     > This might not be a problem for the existing CPU-based lzo and lz4 crypto compression driver implementations.
    >     >     >
    >     >     > However, this could be a problem for some H/W accelerated compression drivers, which honor the destination buffer size when they prepare H/W resources.
    >     >
    >     >     How exactly could it be a problem?
    >     >
    >     >     >
    >     >     > Actually, this patch is aligned with what zram is passing when it calls crypto_comp_compress.
    >     >     >
    >     >     > The following simple patch solves this problem. I tested it with the existing crypto/lzo.c and crypto/lz4.c compression drivers and it works fine.
    >     >     >
    >     >     > --- mm/zswap.c.orig       2018-09-14 14:36:37.984199232 -0700
    >     >     > +++ mm/zswap.c             2018-09-14 14:36:53.340189681 -0700
    >     >     > @@ -1001,7 +1001,7 @@ static int zswap_frontswap_store(unsigne
    >     >     >          struct zswap_entry *entry, *dupentry;
    >     >     >          struct crypto_comp *tfm;
    >     >     >          int ret;
    >     >     > -        unsigned int hlen, dlen = PAGE_SIZE;
    >     >     > +        unsigned int hlen, dlen = PAGE_SIZE * 2;
    >     >     >          unsigned long handle, value;
    >     >     >          char *buf;
    >     >     >          u8 *src, *dst;
    >     >     >
    >     >     > Thank you,
    >     >     >
    >     >     > Taeil
    >     >     >
    >     >     >
    >     >
    >     >
    >     >
    >
    >
    >
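
Coming back to Dan's point above that a compressor (hw or sw) should fail with an error code when the output does not fit: from the driver side I picture that roughly as below. This is only a sketch of the idea; example_compress() and engine_compress() are hypothetical names, not taken from any existing driver.

#include <linux/crypto.h>
#include <linux/errno.h>

/* Stand-in for the device's real submit path (hypothetical). */
static int engine_compress(const u8 *src, unsigned int slen,
                           u8 *dst, unsigned int dmax, unsigned int *out_len);

/* Hypothetical compress callback that refuses to emit more than *dlen bytes. */
static int example_compress(struct crypto_tfm *tfm, const u8 *src,
                            unsigned int slen, u8 *dst, unsigned int *dlen)
{
        unsigned int out_len;
        int err;

        err = engine_compress(src, slen, dst, *dlen, &out_len);
        if (err)
                return err;
        if (out_len > *dlen)
                return -ENOSPC;         /* output would not fit in dst */

        *dlen = out_len;                /* report the actual compressed size */
        return 0;
}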
    
    


Thread overview: 10+ messages
2018-09-17 23:10 Um, Taeil
2018-09-18 14:15 ` Dan Streetman
2018-09-18 18:52   ` Um, Taeil
2018-09-18 21:44     ` Dan Streetman
2018-09-18 23:48       ` Um, Taeil
2018-09-19 15:47         ` Dan Streetman
2018-09-21  1:01           ` Um, Taeil [this message]
2018-09-26  9:00             ` Dan Streetman
2018-09-26 16:10               ` Um, Taeil
2018-09-27 10:08                 ` Dan Streetman
