Date: Tue, 28 Jan 2025 09:59:55 +0900
From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Yosry Ahmed
Cc: Sergey Senozhatsky, Andrew Morton, Minchan Kim, Johannes Weiner,
 Nhat Pham, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 5/6] zsmalloc: introduce handle mapping API
References: <20250127080254.1302026-1-senozhatsky@chromium.org>
 <20250127080254.1302026-6-senozhatsky@chromium.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On (25/01/27 21:58), Yosry Ahmed wrote:
> On Mon, Jan 27, 2025 at 04:59:30PM +0900, Sergey Senozhatsky wrote:
> > Introduce new API to map/unmap zsmalloc handle/object. The key
> > difference is that this API does not impose atomicity restrictions
> > on its users, unlike zs_map_object() which returns with page-faults
> > and preemption disabled - handle mapping API does not need a per-CPU
> > vm-area because the users are required to provide an aux buffer for
> > objects that span several physical pages.
>
> I like the idea of supplying the buffer directly to zsmalloc, and zswap
> already has per-CPU buffers allocated. This will help remove the special
> case to handle not being able to sleep in zswap_decompress().

The interface is basically what we currently have, but the state is
moved out of zsmalloc's internal per-CPU vm-area.

> That being said, I am not a big fan of the new API for several reasons:
> - The interface seems complicated, why do we need struct
>   zs_handle_mapping? Can't the user just pass an extra parameter to
>   zs_map_object/zs_unmap_object() to supply the buffer, and the return
>   value is the pointer to the data within the buffer?

At least for now we need to save some state - e.g. the direction of the
map(), so that during unmap zsmalloc can determine whether it needs to
perform a copy-out or not.  It also needs that state in order to know
whether the buffer needs to be unmapped.

zsmalloc MAP has two cases:
a) the object spans several physically non-contiguous pages: copy the
   object into the aux buffer and return a (linear) pointer to that
   buffer
b) the object is contained within one physical page: kmap that page and
   return a (linear) pointer to that mapping, kunmap in
   zs_unmap_object()

> - This seems to require an additional buffer on the compress side. Right
>   now, zswap compresses the page into its own buffer, maps the handle,
>   and copies to it. Now the map operation will require an extra buffer.

Yes, for case (a) mentioned above.

> I guess in the WO case the buffer is not needed and we can just pass
> NULL?

Yes.

> Taking a step back, it actually seems to me that the mapping interface
> may not be the best, at least from a zswap perspective. In both cases,
> we map, copy from/to the handle, then unmap. The special casing here is
> essentially handling the copy direction. Zram looks fairly similar but I
> didn't look too closely.
>
> I wonder if the API should store/load instead. You either pass a buffer
> to be stored (equivalent to today's alloc + map + copy), or pass a
> buffer to load into (equivalent to today's map + copy). What we really
> need on the zswap side is zs_store() and zs_load(), not zs_map() with
> different mapping types and an optional buffer if we are going to
> eventually store. I guess that's part of a larger overhaul and we'd need
> to update other zpool allocators (or remove them, z3fold should be
> coming soon).

So I thought about it: load and store.

zs_obj_load()
{
        zspage->page kmap, etc.
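        # note: no kmap-only fast path here -- unlike case (b) above,
        # load always copies into the caller-provided buffer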
        memcpy buf page         # if direction is not WO
        unmap
}

zs_obj_store()
{
        zspage->page kmap, etc.
        memcpy page buf         # if direction is not RO
        unmap
}

load+store would not require zsmalloc to be preemptible internally, we
could just keep the existing atomic locks, and it would make things a
little simpler on the zram side (the slot-free notification is called
from an atomic section).

But, and it's a big but, it's (b) from the above.  I wasn't brave enough
to just drop the (b) optimization and replace it with a memcpy(),
especially when we work with relatively large objects (say size-class
3600 bytes and above).  This certainly would not make battery-powered
devices happier.  Maybe in zswap the page is only read once (is that
correct?), but in zram a page can be read multiple times (e.g. when zram
is used as a raw block device, or has a mounted fs on it), which means
multiple extra memcpy()-s.
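
FWIW, to make the load/store idea a bit more concrete, here is a
strawman of what the interface could look like (names, signatures and
the caller-side helpers are made up for the sake of discussion, this is
not the actual zsmalloc API):

/* copy the object at @handle into the caller-provided @buf */
int zs_obj_load(struct zs_pool *pool, unsigned long handle,
                void *buf, size_t len);

/* copy @len bytes from @buf into the object at @handle */
int zs_obj_store(struct zs_pool *pool, unsigned long handle,
                 const void *buf, size_t len);

and a read path would then be roughly:

        err = zs_obj_load(pool, handle, local_buf, obj_len);
        if (!err)
                decompress(local_buf, obj_len, page);

The catch is exactly the (b) case: objects that fit into a single
physical page would now always pay for an extra memcpy() into
local_buf, instead of being decompressed straight from the kmap'ed
page.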