From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <1608894171-54174-1-git-send-email-tiantao6@hisilicon.com>
 <1608894171-54174-2-git-send-email-tiantao6@hisilicon.com>
From: Shakeel Butt
Date: Thu, 14 Jan 2021 13:09:05 -0800
Subject: Re: [RFC mm/zswap 1/2] mm/zswap: add the flag can_sleep_mapped
To: Vitaly Wool
Cc: Minchan Kim, Tian Tao, Seth Jennings, Dan Streetman, Andrew Morton,
 Barry Song, Linux-MM, Sergey Senozhatsky
Content-Type: text/plain; charset="UTF-8"

On Thu, Jan 14, 2021 at 11:53 AM Vitaly Wool wrote:
>
> On Thu, Jan 14, 2021 at 8:21 PM Shakeel Butt wrote:
> >
> > On Thu, Jan 14, 2021 at 11:05 AM Vitaly Wool wrote:
> > >
> > > On Thu, Jan 14, 2021 at 7:56 PM Minchan Kim wrote:
> > > >
> > > > On Thu, Jan 14, 2021 at 07:40:50PM +0100, Vitaly Wool wrote:
> > > > > On Thu, Jan 14, 2021 at 7:29 PM Minchan Kim wrote:
> > > > > >
> > > > > > On Fri, Dec 25, 2020 at 07:02:50PM +0800, Tian Tao wrote:
> > > > > > > add a flag to zpool, named "can_sleep_mapped", and have it set
> > > > > > > true for zbud/z3fold and false for zsmalloc. Then zswap can take
> > > > > > > the current path if the flag is true; if it is false, copy the
> > > > > > > data from src to a temporary buffer, unmap the handle, take the
> > > > > > > mutex, and process the buffer instead of src, to avoid a
> > > > > > > sleeping function being called from atomic context.
> > > > > > >
> > > > > > > Signed-off-by: Tian Tao
> > > > > > > ---
> > > > > > >  include/linux/zpool.h |  3 +++
> > > > > > >  mm/zpool.c            | 13 +++++++++++++
> > > > > > >  mm/zswap.c            | 50 +++++++++++++++++++++++++++++++++++++++++++++-----
> > > > > > >  3 files changed, 61 insertions(+), 5 deletions(-)
> > > > > > >
> > > > > > > diff --git a/include/linux/zpool.h b/include/linux/zpool.h
> > > > > > > index 51bf430..e899701 100644
> > > > > > > --- a/include/linux/zpool.h
> > > > > > > +++ b/include/linux/zpool.h
> > > > > > > @@ -73,6 +73,7 @@ u64 zpool_get_total_size(struct zpool *pool);
> > > > > > >   * @malloc: allocate mem from a pool.
> > > > > > >   * @free: free mem from a pool.
> > > > > > >   * @shrink: shrink the pool.
> > > > > > >   * @sleep_mapped: whether zpool driver can sleep during map.
> > > > > >
> > > > > > I don't think it's a good idea. It just breaks the zpool
> > > > > > abstraction in that it exposes internal implementation to the
> > > > > > user to avoid an issue zswap recently introduced. It also
> > > > > > conflicts with zpool_map_handle's semantics.
> > > > > >
> > > > > > Rather than introducing another break in zpool due to the new
> > > > > > zswap feature recently introduced, zswap could introduce
> > > > > > CONFIG_ZSWAP_HW_COMPRESSOR. Once it's configured, zsmalloc could
> > > > > > be disabled. And with CONFIG_ZSWAP_HW_COMPRESSOR disabled, zswap
> > > > > > doesn't need to make any bounce-buffer copy, so no existing
> > > > > > zsmalloc user will see a performance regression.
> > > > >
> > > > > I believe it won't help that much -- the new compressor API presumes
> > > > > that the caller may sleep during compression, and that will be an
> > > > > accident waiting to happen as long as we use it and keep the handle
> > > > > mapped in the zsmalloc case.
> > > > >
> > > > > Or maybe I interpreted you wrong and you are suggesting
> > > > > re-introducing calls to the old API under this #ifdef, is that the
> > > > > case?
> > > >
> > > > Yub.
> > > > zswap could abstract that part under #ifdef to keep the old behavior.
> > >
> > > We can reconsider this option when zsmalloc implements a reclaim
> > > callback. So far it's obviously too much of a mess for a reason so
> > > weak.
> >
> > Sorry, I don't understand the link between zsmalloc implementing a
> > shrink callback and this patch.
>
> There is none. There is a link between taking on all the burden of
> reviving zsmalloc for zswap at the cost of extra zswap complexity, and
> zsmalloc not being fully zswap compatible.
>
> The ultimate zswap goal is to cache the hottest swapped-out pages in a
> compressed format. zsmalloc doesn't implement a reclaim callback, and
> therefore zswap can *not* fulfill its goal, since old pages are there
> to stay and it can't accept new, hotter ones. So, let me make it clear:
> the zswap/zsmalloc combo is a legacy corner case.

This is the first time I am hearing that the zswap/zsmalloc combo is a
legacy configuration. We use zswap in production and intentionally size
the zswap pool so that we never have to go to the backing device, so the
absence of a reclaim callback is not an issue for us. Please let me know
if the zswap/zsmalloc combo is officially on its way to deprecation.

> > This patch is adding an overhead for all zswap+zsmalloc users
> > irrespective of the availability of hardware. If we want to add
> > support for new hardware, please add it without impacting current
> > users.
>
> No, it's not like that. The zswap+zsmalloc combination is currently
> already broken

By broken, do you mean the missing reclaim callback?

> and this patch implements a way to work around that.
> The users are already impacted and that is of course a problem.

Are you talking about rt users, or was there some other report?

> The workaround is not perfect but looks good enough for me.

I would like to see a page-fault perf experiment for the non-hardware
case.