From: "Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"herbert@gondor.apana.org.au" <herbert@gondor.apana.org.au>,
"davem@davemloft.net" <davem@davemloft.net>,
"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linuxarm <linuxarm@huawei.com>,
"Luis Claudio R . Goncalves" <lgoncalv@redhat.com>,
Mahipal Challa <mahipalreddy2006@gmail.com>,
Seth Jennings <sjenning@redhat.com>,
Dan Streetman <ddstreet@ieee.org>,
Vitaly Wool <vitaly.wool@konsulko.com>,
"Wangzhou (B)" <wangzhou1@hisilicon.com>,
"Colin Ian King" <colin.king@canonical.com>
Subject: RE: [PATCH v4] mm/zswap: move to use crypto_acomp API for hardware acceleration
Date: Thu, 9 Jul 2020 01:32:38 +0000 [thread overview]
Message-ID: <B926444035E5E2439431908E3842AFD25610B7@DGGEMI525-MBS.china.huawei.com> (raw)
In-Reply-To: <20200708145934.4w3qk53mgavyyln7@linutronix.de>
> -----Original Message-----
> From: linux-crypto-owner@vger.kernel.org
> [mailto:linux-crypto-owner@vger.kernel.org] On Behalf Of Sebastian Andrzej
> Siewior
> Sent: Thursday, July 9, 2020 3:00 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> Cc: akpm@linux-foundation.org; herbert@gondor.apana.org.au;
> davem@davemloft.net; linux-crypto@vger.kernel.org; linux-mm@kvack.org;
> linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Luis Claudio
> R . Goncalves <lgoncalv@redhat.com>; Mahipal Challa
> <mahipalreddy2006@gmail.com>; Seth Jennings <sjenning@redhat.com>;
> Dan Streetman <ddstreet@ieee.org>; Vitaly Wool
> <vitaly.wool@konsulko.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> Colin Ian King <colin.king@canonical.com>
> Subject: Re: [PATCH v4] mm/zswap: move to use crypto_acomp API for
> hardware acceleration
>
> On 2020-07-08 00:52:10 [+1200], Barry Song wrote:
> …
> > @@ -127,9 +129,17 @@
> module_param_named(same_filled_pages_enabled,
> > zswap_same_filled_pages_enabled,
> > * data structures
> > **********************************/
> >
> > +struct crypto_acomp_ctx {
> > + struct crypto_acomp *acomp;
> > + struct acomp_req *req;
> > + struct crypto_wait wait;
> > + u8 *dstmem;
> > + struct mutex mutex;
> > +};
> …
> > @@ -561,8 +614,9 @@ static struct zswap_pool *zswap_pool_create(char
> *type, char *compressor)
> > pr_debug("using %s zpool\n", zpool_get_type(pool->zpool));
> >
> > strlcpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
> > - pool->tfm = alloc_percpu(struct crypto_comp *);
> > - if (!pool->tfm) {
> > +
> > + pool->acomp_ctx = alloc_percpu(struct crypto_acomp_ctx *);
>
> Can't you allocate the whole structure instead just a pointer to it? The
> structure looks just like bunch of pointers anyway. Less time for pointer
> chasing means more time for fun.
>
> > @@ -1074,12 +1138,32 @@ static int zswap_frontswap_store(unsigned
> type, pgoff_t offset,
> > }
> >
> > /* compress */
> > - dst = get_cpu_var(zswap_dstmem);
> > - tfm = *get_cpu_ptr(entry->pool->tfm);
> > - src = kmap_atomic(page);
> > - ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
> > - kunmap_atomic(src);
> > - put_cpu_ptr(entry->pool->tfm);
> > + acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
> > +
> > + mutex_lock(&acomp_ctx->mutex);
> > +
> > + src = kmap(page);
> > + dst = acomp_ctx->dstmem;
>
> that mutex is per-CPU, per-context. The dstmem pointer is per-CPU. So if I read
> this right, you can get preempted after crypto_wait_req() and another context
> in this CPU writes its data to the same dstmem and then…
>
> > + sg_init_one(&input, src, PAGE_SIZE);
> > + /* zswap_dstmem is of size (PAGE_SIZE * 2). Reflect same in sg_list */
> > + sg_init_one(&output, dst, PAGE_SIZE * 2);
> > + acomp_request_set_params(acomp_ctx->req, &input, &output,
> PAGE_SIZE, dlen);
> > + /*
> > + * it maybe looks a little bit silly that we send an asynchronous request,
> > + * then wait for its completion synchronously. This makes the process
> look
> > + * synchronous in fact.
> > + * Theoretically, acomp supports users send multiple acomp requests in
> one
> > + * acomp instance, then get those requests done simultaneously. but in
> this
> > + * case, frontswap actually does store and load page by page, there is no
> > + * existing method to send the second page before the first page is done
> > + * in one thread doing frontswap.
> > + * but in different threads running on different cpu, we have different
> > + * acomp instance, so multiple threads can do (de)compression in
> parallel.
> > + */
> > + ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req),
> &acomp_ctx->wait);
> > + dlen = acomp_ctx->req->dlen;
> > + kunmap(page);
> > +
> > if (ret) {
> > ret = -EINVAL;
> > goto put_dstmem;
>
> This looks using the same synchronous mechanism around an asynchronous
> interface. It works as a PoC.
>
> As far as I remember the crypto async interface, the incoming skbs were fed to
> the async interface and returned to the caller so the NIC could continue
> allocate new RX skbs and move on. Only if the queue of requests was getting
> to long the code started to throttle. Eventually the async crypto code
> completed the decryption operation in a different context and fed the
> decrypted packet(s) into the stack.
>
> From a quick view, you would have to return -EINPROGRESS here and have at
> the caller side something like that:
>
> iff --git a/mm/page_io.c b/mm/page_io.c
> index e8726f3e3820b..9d1baa46ec3ed 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -252,12 +252,15 @@ int swap_writepage(struct page *page, struct
> writeback_control *wbc)
> unlock_page(page);
> goto out;
> }
> - if (frontswap_store(page) == 0) {
> + ret = frontswap_store(page);
> + if (ret == 0) {
> set_page_writeback(page);
> unlock_page(page);
> end_page_writeback(page);
> goto out;
> }
> + if (ret = -EINPROGRESS)
> + goto out;
> ret = __swap_writepage(page, wbc, end_swap_bio_write);
> out:
> return ret;
>
Unfortunately, this is not true and things are not that simple.
We can't simply depend on -EINPROGRESS and go out.
We have to wait for the result of compression to decide if we should
do __swap_writepage(). As one page might be compressed into two
pages, in this case, we will still need to do _swap_writepage().
As I replied in the latest email, all of the async improvement to frontswap
needs very careful thinking and benchmark. It can only happen after
we build the base in this patch, fixing the broken connection between
zswap and those new zip drivers.
Thanks
Barry
next prev parent reply other threads:[~2020-07-09 1:32 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-07 12:52 Barry Song
2020-07-08 14:59 ` Sebastian Andrzej Siewior
2020-07-08 21:45 ` Song Bao Hua (Barry Song)
2020-07-09 7:17 ` Sebastian Andrzej Siewior
2020-07-09 12:14 ` Song Bao Hua (Barry Song)
2020-07-09 1:32 ` Song Bao Hua (Barry Song) [this message]
2020-07-09 7:39 ` Sebastian Andrzej Siewior
2020-07-09 7:55 ` Song Bao Hua (Barry Song)
2020-07-09 8:40 ` Sebastian Andrzej Siewior
2020-07-09 9:09 ` Song Bao Hua (Barry Song)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=B926444035E5E2439431908E3842AFD25610B7@DGGEMI525-MBS.china.huawei.com \
--to=song.bao.hua@hisilicon.com \
--cc=akpm@linux-foundation.org \
--cc=bigeasy@linutronix.de \
--cc=colin.king@canonical.com \
--cc=davem@davemloft.net \
--cc=ddstreet@ieee.org \
--cc=herbert@gondor.apana.org.au \
--cc=lgoncalv@redhat.com \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linuxarm@huawei.com \
--cc=mahipalreddy2006@gmail.com \
--cc=sjenning@redhat.com \
--cc=vitaly.wool@konsulko.com \
--cc=wangzhou1@hisilicon.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox