From: Joshua Hahn
To: Yosry Ahmed
Cc: Nhat Pham, akpm@linux-foundation.org, hannes@cmpxchg.org, cerasuolodomenico@gmail.com, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com, hughd@google.com, corbet@lwn.net, konrad.wilk@oracle.com, senozhatsky@chromium.org, rppt@kernel.org, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org, david@ixit.cz
Subject: Re: [PATCH 0/2] minimize swapping on zswap store failure
Date: Wed, 2 Apr 2025 13:06:49 -0700
Message-ID: <20250402200651.1224617-1-joshua.hahnjy@gmail.com>

On Mon, 16 Oct 2023 17:57:31 -0700 Yosry Ahmed wrote:

> On Mon, Oct 16, 2023 at 5:35 PM Nhat Pham wrote:
> I thought before about having a special list_head that allows us to
> use the lower bits of the pointers as markers, similar to the xarray.
> The markers can be used to place different objects on the same list.
> We can have a list that is a mixture of struct page and struct
> zswap_entry. I never pursued this idea, and I am sure someone will
> scream at me for suggesting it.
> Maybe there is a less convoluted way
> to keep the LRU ordering intact without allocating memory on the
> reclaim path.

Hi Yosry,

Apologies for reviving an old thread, but I wasn't sure whether opening an
entirely new thread was a better choice :-)

So I've implemented your idea, using the lower 2 bits of the list_head's prev
pointer (the last bit indicates whether the list_head belongs to a page or a
zswap_entry, and the second-to-last bit was repurposed for the second chance
algorithm).

Here is a very high-level overview of what I did in the patch:
- When a page fails to compress, I remove the page mapping and tag both the
  xarray entry (tag == set lowest bit to 1) and the page's list_head prev ptr,
  then store the page directly into the zswap LRU.
- In zswap_load, we take the entry out of the xarray and check if it's tagged.
  - If it is tagged, then instead of decompressing, we just copy the page's
    contents to the newly allocated page.
- (There are more details about how to teach vmscan / page_io / list iterators
  to handle this, but we can gloss over those for now.)

I have a working version, but have been holding off because I have only been
seeing regressions. I wasn't really sure where they were coming from, but after
going through some perf traces with Nhat, I found out that the regressions come
from the page faults caused by initially unmapping the page and then
re-allocating it for every load. This causes (1) more memcg flushing, and
(2) extra allocations ==> more pressure ==> more reclaim, even though we only
temporarily keep the extra page.

Just wanted to put this here in case you were still thinking about this idea.
What do you think? Ideally, there would be a way to keep the page around in
the zswap LRU without having to re-allocate a new page on a fault, but this
seems like a bigger task.
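
To make the tagging in the overview above a bit more concrete, here is a rough
userspace sketch of keeping two marker bits in the low bits of a list_head's
prev pointer. This is not the patch itself: the lru_* helpers and the LRU_TAG_*
constants are names made up for this illustration, struct list_head is a
stand-in for the kernel's, and the sketch assumes list nodes are at least
4-byte aligned so the low 2 bits of a pointer are always free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical tag bits, stored in the low 2 bits of ->prev. */
#define LRU_TAG_PAGE       0x1UL /* lowest bit: node is a page, not a zswap_entry */
#define LRU_TAG_REFERENCED 0x2UL /* second-lowest bit: second chance marker */
#define LRU_TAG_MASK       0x3UL

/* Return the real prev pointer with the tag bits masked off. */
static inline struct list_head *lru_prev(const struct list_head *node)
{
	return (struct list_head *)((uintptr_t)node->prev & ~(uintptr_t)LRU_TAG_MASK);
}

/* Return only the tag bits carried by this node. */
static inline unsigned long lru_tags(const struct list_head *node)
{
	return (unsigned long)((uintptr_t)node->prev & LRU_TAG_MASK);
}

/* Update the prev pointer while preserving the node's existing tags. */
static inline void lru_set_prev(struct list_head *node, struct list_head *prev)
{
	/* The scheme only works if the low bits of real pointers are zero. */
	assert(((uintptr_t)prev & LRU_TAG_MASK) == 0);
	node->prev = (struct list_head *)((uintptr_t)prev | lru_tags(node));
}

static inline void lru_set_tag(struct list_head *node, unsigned long tag)
{
	node->prev = (struct list_head *)((uintptr_t)node->prev | (tag & LRU_TAG_MASK));
}

static inline void lru_clear_tag(struct list_head *node, unsigned long tag)
{
	node->prev = (struct list_head *)((uintptr_t)node->prev & ~(uintptr_t)(tag & LRU_TAG_MASK));
}

static inline bool lru_is_page(const struct list_head *node)
{
	return lru_tags(node) & LRU_TAG_PAGE;
}

/* Initialize an empty, untagged list head. */
static inline void lru_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Append a node at the tail, recording its tags in its own prev pointer. */
static inline void lru_add_tail(struct list_head *node, struct list_head *head,
				unsigned long tags)
{
	struct list_head *last = lru_prev(head);

	node->next = head;
	node->prev = (struct list_head *)((uintptr_t)last | (tags & LRU_TAG_MASK));
	last->next = node;
	lru_set_prev(head, node);
}
```

The point of the sketch is that every reader of ->prev has to go through
lru_prev(), which is roughly why the list iterators need to be taught about
the tags; anything that dereferences the raw prev pointer of a tagged node
would follow a misaligned address.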

Ultimately the goal is to prevent an incompressible page from hogging the
compression algorithm on multiple reclaim attempts, but if we are spending
more time allocating new pages... maybe this isn't the correct approach :(

Please let me know if you have any thoughts on this :-)
Have a great day!
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)