From: Nhat Pham
Date: Fri, 12 Jul 2024 15:36:29 -0700
Subject: Re: [PATCH v2 5/6] mm: zswap: store incompressible page as-is
To: Takero Funaki
Cc: Johannes Weiner, Yosry Ahmed, Chengming Zhou, Jonathan Corbet,
    Andrew Morton, Domenico Cerasuolo, linux-mm@kvack.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20240706022523.1104080-1-flintglass@gmail.com>
    <20240706022523.1104080-6-flintglass@gmail.com>

On Sun, Jul 7, 2024 at 2:38 AM Takero Funaki wrote:
>
> On Sun, Jul 7, 2024 at 8:53 Nhat Pham wrote:
> >
> > I tried to propose something similar in the past.
> > Please read the following discussion:
> >
> > https://lore.kernel.org/all/CAJD7tka6XRyzYndRNEFZmi0Zj4DD2KnVzt=vMGhfF4iN2B4VKw@mail.gmail.com/
> >
> > But, the TLDR is Yosry was (rightly) concerned that with this
> > approach, memory reclaiming could end up increasing memory usage
> > rather than reducing (since we do not free up the page that fail to
> > zswap-out, and we need extra memory for the zswap metadata of that
> > page).
> >
> > So my vote on this patch would be NACK, until we get around this issue
> > somehow :)
>
> It seems the discussion on the thread mixed up memory allocation
> failure (system runs out of memory reserve) and incompressible pages
> (compression algorithm successfully compressed but the result is equal
> to or larger than PAGE_SIZE).
>
> zswap has been storing pages into dedicated pages 1:1 when compressed
> to near PAGE_SIZE. Using zsmalloc, current zswap stores pages
> compressed to between 3633 bytes (=hugeclass+1) to 4095 bytes
> (=PAGE_SIZE-1) into 1 page. This patch changes the range to 3633 to
> 4096 by treating PAGE_SIZE as a special case. I could not find a
> reason to reject only PAGE_SIZE while accepting PAGE_SIZE-1.
>

I'm not actually sure if this is true in practice. While yes,
zsmalloc has the capability to store near-PAGE_SIZE objects, this also
depends on the compression algorithm.

At Meta, we use zstd. What I have found is that a lot of the time, it
just flat out rejects the page if it's too poorly compressed. Without
this change, we will not have to suffer the memory overhead of the
zswap_entry structures for these rejected pages, whereas we will with
this change.

We might need to run some tracing to get a histogram of the
distribution of post-compression sizes.

> zswap wastes memory for metadata for all accepted pages but reduces IO

Key word: accepted.
The compression algorithm might already have some built-in logic to
reject poorly compressed pages, preventing the cases where the
overhead might be too high for the saving.

> amount and latency by compressed buffer memory. For pages between 3633
> to 4096 bytes, zswap reduces the latency only. This is still
> beneficial because the rare incompressible pages trigger urgent
> pageout IO and incur a head-of-line blocking on the subsequent pages.
> It also keeps LRU priority for pagein latency.
>
> In the worst case or with a malicious dataset, zswap will waste a
> significant amount of memory, but this patch does not affect nor
> resolve the scenario. For example, if a user allocates pages
> compressed to 3633 bytes, current zswap using zsmalloc cannot gain
> memory as the compression ratio, including zsmalloc overhead, becomes
> 1:1. This also applies to zbud. The compression ratio will be 1:1 as
> zbud cannot find buddies smaller than 463 bytes. zswap will be less
> efficient but still work in this situation since the max pool percent
> and background writeback ensure the pool size does not overwhelm
> usable memory.
>
> I suppose the current zswap has accepted the possible waste of memory,
> at least since the current zswap_compress() logic was implemented. If
> zswap had to ensure the compression ratio is better than 1:1, and only
> prefers reducing IO amount (not latency), there would have been a
> compression ratio threshold to reject pages not compressible to under
> 2048 bytes. I think accepting nearly incompressible pages is
> beneficial and changing the range to 4096 does not negatively affect
> the current behavior.

FWIW, I do agree with your approach (storing incompressible pages in
the zswap pool to maintain LRU ordering) - this is *essentially* what
I was trying to do too with the attempt I mentioned above.
I'll let Johannes and Yosry chime in as well, since they were the original folks who raised these concerns :) If they're happy then I'll revoke my NACK.
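
For the histogram of post-compression sizes mentioned above, a rough
userspace sketch of the bucketing might look like the following. This
is Python for illustration only, not kernel code. It assumes 4 KiB
pages and takes the 3633-byte figure from the quoted text; the function
names are hypothetical and the sample sizes are made up, since real
input would have to come from tracing.

```python
from collections import Counter

PAGE_SIZE = 4096       # assumption: 4 KiB pages
HUGE_CLASS_MIN = 3633  # "hugeclass+1" figure quoted in this thread


def classify(size):
    """Label one post-compression size (in bytes) by the range it falls in."""
    if size < HUGE_CLASS_MIN:
        return "compressible"      # smaller than the quoted 3633-byte cutoff
    if size < PAGE_SIZE:
        return "near-PAGE_SIZE"    # 3633..4095: occupies a full page anyway
    return "incompressible"        # >= PAGE_SIZE: no gain from compression


def histogram(sizes):
    """Count how many sizes land in each of the three ranges."""
    return Counter(classify(s) for s in sizes)


if __name__ == "__main__":
    # Made-up sample; real numbers would come from tracing the store path.
    sample = [512, 2048, 3632, 3633, 4095, 4096, 5000]
    for label, count in histogram(sample).items():
        print(f"{label:>16}: {count}")
```

If the "near-PAGE_SIZE" and "incompressible" buckets turn out to be
nearly empty with zstd, that would support the point that the
compressor already filters these pages out before zswap sees them.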