Date: Fri, 4 Nov 2022 12:48:42 +0900
From: Sergey Senozhatsky
To: Minchan Kim
Cc: Sergey Senozhatsky, Andrew Morton, Nitin Gupta, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCHv4 4/9] zram: Introduce recompress sysfs knob
References: <20221018045533.2396670-1-senozhatsky@chromium.org> <20221018045533.2396670-5-senozhatsky@chromium.org>

On (22/11/03 10:00), Minchan Kim wrote:
[..]
> > Per-my understanding this threshold can change quite often,
> > depending on memory pressure and so on. So we may force
> > user-space to issue more syscalls, without any gain in
> > simplicity.
>
> Sorry, didn't understand your point. Let me clarify my idea.
> If we have separate knob for recompress threshold, we could
> work like this.
>
> # recompress any compressed pages which is greater than 888 bytes.
> echo 888 > /sys/block/zram0/recompress_threshold
>
> # try to compress any pages greater than threshold with following
> # algorithm.
>
> echo "type=lzo priority=1" > /sys/block/zram0/recompress_algo
> echo "type=zstd priority=2" > /sys/block/zram0/recompress_algo
> echo "type=deflate priority=3" > /sys/block/zram0/recompress_algo

OK. We can always add more sysfs knobs and make threshold a global
per-device value.

I think I prefer the approach where the threshold is part of the current
recompress context, not something derived from another context. That
is, when all values (page type, threshold, possibly algorithm index)
are submitted by user-space for this particular recompression

	echo "type=huge threshold=3000 ..." > recompress

If threshold is a global value that is applied to all recompress calls
then how does user-space say no-threshold? For instance, when it wants
to recompress only huge pages. It probably still needs to supply
something like threshold=0. So my personal preference for now - keep
threshold as a context-dependent value.

Another thing that I like about threshold= being context-dependent is
that then we don't need to protect recompression against concurrent
global threshold modifications with a lock and so on. It keeps things
simpler.

[..]
> > > Let's squeeze the comp algo index into meta area since we have
> > > some rooms for the bits. Then can we could remove the specific
> > > recomp two flags?
> >
> > What is meta area?
>
> zram->table[index].flags
>
> If we squeeze the algorithm index, we could work like this
> without ZRAM_RECOMP_SKIP.

We still need ZRAM_RECOMP_SKIP. Recompression may fail to compress
an object further: sometimes we get a recompressed object that is
larger than the original one, sometimes of the same size, sometimes of
a smaller size but still belonging to the same size class, which
doesn't save us any memory.
Without ZRAM_RECOMP_SKIP we will continue re-compressing objects that
are incompressible (in a way that saves us memory in zsmalloc) by any
of the ZRAM's algorithms.

> read_block_state
> 	zram_algo_idx(zram, index) > 0 ? 'r' : '.');
>
> zram_read_from_zpool
> 	if (zram_algo_idx(zram, idx) != 0)
> 		idx = 1;

As an idea, maybe we can store everything re-compression related
in a dedicated meta field? SKIP flag, algorithm ID, etc.

We don't have too many bits left in ->flags on 32-bit systems. We
currently probably need at least 3 bits - one for RECOMP_SKIP and at
least 2 for algorithm ID. 2 bits for algorithm ID puts us in a
situation where we can have only 00, 01, 10, 11 as IDs, that is, a
maximum of 3 recompression algorithms: 00 is the primary one and the
rest are alternative ones. Maximum 3 re-compression algorithms sounds
like a reasonable max value to me. Yeah, maybe we can use flags bits
for it.

[..]
> > > zram_bvec_read:
> > > 	algo_idx = zram_get_algo_idx(zram, index);
> > > 	zstrm = zcomp_stream_get(zram, algo_idx);
> > > 	zcomp_decompress(zstrm);
> > > 	zcomp_stream_put(zram, algo_idx);
> >
> > Hmm. This is something that should not be enabled by default.
>
> Exactly. I don't mean to enable by default, either.

OK.

> > N compressions per every stored page is very very CPU and
> > power intensive. We definitely want a way to have recompression
> > as a user-space event, which gives all sorts of flexibility and
> > extensibility. ZRAM doesn't (and should not) know about too many
> > things, so ZRAM can't make good decisions (and probably should not
> > try). User-space can make good decisions on the other hand.
> >
> > So recompression for us is not something that happens all the time,
> > unconditionally. It's something that happens sometimes, depending on
> > the situation on the host.
>
> Totally agree. I am not saying we should enable the feature by default
> but at least consider it for the future. I have something in mind to
> be useful later.

OK.

> > [..]
> > > > +static int zram_recompress(struct zram *zram, u32 index, struct page *page,
> > > > +			   int size_watermark)
> > > > +{
> > > > +	unsigned long handle_prev;
> > > > +	unsigned long handle_next;
> > > > +	unsigned int comp_len_next;
> > > > +	unsigned int comp_len_prev;
> > >
> > > How about orig_handle and new_handle with orig_comp_len and new_comp_len?
> >
> > No opinion. Can we have prev and next? :)
>
> prev and next gives the impression position something like list.
> orig and new gives the impression stale and fresh.
>
> We are doing latter here.

Yeah, like I said in internal email, this will make rebasing harder on
my side, because this breaks a patch from Alexey and then breaks a
higher order zspages patch series. It's a very old series and we already
have quite a few patches depending on it.
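
A minimal sketch of the bit budget described earlier in this mail (one
bit for RECOMP_SKIP plus two bits for the algorithm ID), together with
the rule that a recompression attempt only counts if the object moves
to a smaller size class. Everything here is an assumption made for
illustration: the bit positions, the recomp_* helper names and the fixed
16-byte size-class step are hypothetical and are not taken from the zram
or zsmalloc sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only. The real bit positions in
 * zram->table[index].flags are not specified in this thread.
 */
#define RECOMP_SKIP_BIT		0u	/* 1 bit: do not try this slot again */
#define RECOMP_ALGO_SHIFT	1u	/* 2 bits: algorithm ID */
#define RECOMP_ALGO_BITS	2u
#define RECOMP_MAX_ALGO_ID	((1u << RECOMP_ALGO_BITS) - 1u)	/* 3 */
#define RECOMP_ALGO_MASK	(RECOMP_MAX_ALGO_ID << RECOMP_ALGO_SHIFT)

/* Hypothetical size-class step; zsmalloc's real classes may differ. */
#define SIZE_CLASS_DELTA	16u

/* Store an algorithm ID (0 = primary, 1..3 = alternatives) in flags. */
static inline uint32_t recomp_set_algo(uint32_t flags, unsigned int id)
{
	assert(id <= RECOMP_MAX_ALGO_ID);
	return (flags & ~RECOMP_ALGO_MASK) | (id << RECOMP_ALGO_SHIFT);
}

/* Read the algorithm ID back out of flags. */
static inline unsigned int recomp_get_algo(uint32_t flags)
{
	return (flags & RECOMP_ALGO_MASK) >> RECOMP_ALGO_SHIFT;
}

static inline uint32_t recomp_set_skip(uint32_t flags)
{
	return flags | (1u << RECOMP_SKIP_BIT);
}

static inline bool recomp_test_skip(uint32_t flags)
{
	return (flags & (1u << RECOMP_SKIP_BIT)) != 0;
}

/* Map an object length to its (hypothetical) size class index. */
static inline unsigned int size_class(unsigned int len)
{
	return (len + SIZE_CLASS_DELTA - 1u) / SIZE_CLASS_DELTA;
}

/* A smaller object saves memory only if it lands in a smaller class. */
static inline bool recomp_saves_memory(unsigned int old_len,
				       unsigned int new_len)
{
	return size_class(new_len) < size_class(old_len);
}

/*
 * Record the outcome of a single recompression attempt: remember the
 * algorithm on success, otherwise mark the slot so it is skipped next
 * time. A fuller version would set the skip bit only after every
 * alternative algorithm has been tried.
 */
static inline uint32_t recomp_record(uint32_t flags, unsigned int algo_id,
				     unsigned int old_len,
				     unsigned int new_len)
{
	if (recomp_saves_memory(old_len, new_len))
		return recomp_set_algo(flags, algo_id);
	return recomp_set_skip(flags);
}
```

With this layout, recomp_record(0, 1, 3000, 2000) stores algorithm ID 1,
while recomp_record(0, 1, 3000, 2995) only sets the skip bit, because
both lengths fall into the same hypothetical size class.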