Subject: Re: [PATCH 2/2] mm: Remember young bit for page migrations
From: Nadav Amit <nadav.amit@gmail.com>
Date: Wed, 3 Aug 2022 00:42:54 -0700
To: Peter Xu
Cc: LKML, linux-mm@kvack.org, Andrea Arcangeli, Andi Kleen, Andrew Morton,
 David Hildenbrand, Hugh Dickins, Huang Ying, "Kirill A. Shutemov",
 Vlastimil Babka
In-Reply-To: <20220803012159.36551-3-peterx@redhat.com>
References: <20220803012159.36551-1-peterx@redhat.com>
 <20220803012159.36551-3-peterx@redhat.com>

On Aug 2, 2022, at 6:21 PM, Peter Xu wrote:

> When page migration happens, we always ignore the young bit settings in the
> old pgtable and mark the page as old in the new page table, using either
> pte_mkold() or pmd_mkold().
>
> That's fine functionally, but it is not friendly to page reclaim, because
> the page being moved can be actively accessed during the procedure. Not to
> mention that hardware setting the young bit can bring quite some overhead
> on some systems, e.g. x86_64 needs a few hundred nanoseconds to set the bit.
>
> Actually we can easily remember the young bit configuration and recover the
> information after the page is migrated. To achieve it, define a new bit in
> the migration swap offset field to show whether the old pte has the young
> bit set or not.
> Then when removing/recovering the migration entry, we can recover the young
> bit even if the page changed.
>
> One thing to mention is that here we use max_swapfile_size() to detect how
> many swp offset bits we have, and we only enable this feature if we know
> the swp offset is big enough to store both the PFN value and the young bit.
> Otherwise the young bit is dropped like before.

I gave it some more thought and I am less confident that this is the best
solution. I am not sure that it is not, either, so I am raising an
alternative along with its pros and cons.

An alternative would be to propagate the access bit into the page (i.e.,
using folio_set_young()) and then set it back in the PTE later (i.e., based
on folio_test_young()); a rough sketch is at the end of this message. It
might even seem that, in general, it is better to always set the page's
access bit if folio_test_young().

This can be simpler and more performant. Setting the access bit would not
impact reclaim decisions (as the page is already considered young), would
not induce overhead when clearing the access bit (no TLB flush is needed, at
least on x86), and would save the time the CPU takes to set the access bit
if the page is ever accessed (on x86).

It may also improve the precision of the page-idle mechanism and the
interaction with it. IIUC, page-idle does not consider migration entries, so
the user would not get an indication that pages under migration are not
idle. When page-idle is reset, migrated pages might later be reinstated as
"accessed", giving the wrong indication that the pages are not idle when in
fact they are.

On the negative side, I am not sure whether, on other architectures that
might require a TLB flush to reset the access bit, the overhead of the
atomic operation that clears the access bit would not exceed what this
saves.

One more unrelated point: note that remove_migration_pte() would always set
a clean PTE even when the old one was dirty…
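
To make the alternative concrete, something along these lines is what I have
in mind. This is only an untested sketch: the exact call sites (presumably
wherever the old PTE is torn down and wherever the migration entry is turned
back into a present PTE) and the variable names (pteval, folio, pte) are
hand-waved:

	/* Where the old PTE is replaced by the migration entry: */
	if (pte_young(pteval))
		folio_set_young(folio);

	/* Where the migration entry is replaced by the new PTE: */
	if (folio_test_young(folio))
		pte = pte_mkyoung(pte);

The appeal is that nothing needs to be encoded in the swp offset, so this
would not depend on max_swapfile_size() leaving enough spare bits.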