From: Lance Yang
Date: Fri, 2 Feb 2024 19:23:13 +0800
Subject: Re: [PATCH 1/1] mm/khugepaged: skip copying lazyfree pages on collapse
To: Yang Shi
Cc: akpm@linux-foundation.org, mhocko@suse.com, zokeefe@google.com,
 david@redhat.com, songmuchun@bytedance.com, peterx@redhat.com,
 minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240201125226.28372-1-ioworker0@gmail.com>

On Fri, Feb 2, 2024 at 4:37 AM Yang Shi wrote:
>
> On Thu, Feb 1, 2024 at 4:53 AM Lance Yang wrote:
> >
> > The collapsing behavior of khugepaged with pages
> > marked using MADV_FREE might cause confusion
> > among users.
> >
> > For instance, allocate a 2MB chunk using mmap and
> > later release it by MADV_FREE. Khugepaged will not
> > collapse this chunk.
> > From the user's perspective,
> > it treats lazyfree pages as pte_none. However,
> > for some pages marked as lazyfree with MADV_FREE,
> > khugepaged might collapse this chunk and copy
> > these pages to a new huge page. This inconsistency
> > in behavior could be confusing for users.
> >
> > After a successful MADV_FREE operation, if there is
> > no subsequent write, the kernel can free the pages
> > at any time. Therefore, in my opinion, counting
> > lazyfree pages in max_pte_none seems reasonable.
> >
> > Perhaps treating MADV_FREE like MADV_DONTNEED, not
> > copying lazyfree pages when khugepaged collapses
> > huge pages in the background better aligns with
> > user expectations.
> >
> > Signed-off-by: Lance Yang
> > ---
> >  mm/khugepaged.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 2b219acb528e..6cbf46d42c6a 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -777,6 +777,7 @@ static int __collapse_huge_page_copy(pte_t *pte,
> >                                       pmd_t orig_pmd,
> >                                       struct vm_area_struct *vma,
> >                                       unsigned long address,
> > +                                     struct collapse_control *cc,
> >                                       spinlock_t *ptl,
> >                                       struct list_head *compound_pagelist)
> >  {
> > @@ -797,6 +798,13 @@ static int __collapse_huge_page_copy(pte_t *pte,
> >                         continue;
> >                 }
> >                 src_page = pte_page(pteval);
> > +
> > +               if (cc->is_khugepaged
> > +                       && !folio_test_swapbacked(page_folio(src_page))) {
> > +                       clear_user_highpage(page, _address);
> > +                       continue;
>
> If the page was written before khugepaged collapsed it, and khugepaged
> collapsed the page before memory reclaim kicked in, didn't this
> somehow cause data corruption?
>

Thanks a lot!

Yang, you're correct; indeed, there is a potential issue with data corruption.

I took a look at the check for lazyfree pages in smaps_pte_entry.
Here's the modification:

        if (cc->is_khugepaged && !PageSwapBacked(src_page)
                && !pte_dirty(pteval) && !PageDirty(src_page)) {
                clear_user_highpage(page, _address);
                continue;
        }

Could you please take a look?

Thanks,
Lance

> > +               }
> > +
> >                 if (copy_mc_user_highpage(page, src_page, _address, vma) > 0) {
> >                         result = SCAN_COPY_MC;
> >                         break;
> > @@ -1205,7 +1213,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
> >         anon_vma_unlock_write(vma->anon_vma);
> >
> >         result = __collapse_huge_page_copy(pte, hpage, pmd, _pmd,
> > -                                          vma, address, pte_ptl,
> > +                                          vma, address, cc, pte_ptl,
> >                                            &compound_pagelist);
> >         pte_unmap(pte);
> >         if (unlikely(result != SCAN_SUCCEED))
> > --
> > 2.33.1
> >
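
To make the difference between the two checks concrete, here is a minimal
userspace sketch. It is not kernel code: struct model_page, skip_copy_v1(),
skip_copy_v2() and run_examples() are hypothetical names, and the three
booleans merely stand in for PageSwapBacked(), PageDirty() and pte_dirty().
It assumes, as the discussion above suggests, that a write after MADV_FREE
sets the PTE dirty bit while the page stays non-swap-backed until reclaim
looks at it.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; the fields stand in for kernel page/PTE state. */
struct model_page {
	bool swapbacked;  /* models PageSwapBacked(); assumed cleared by MADV_FREE */
	bool page_dirty;  /* models PageDirty() */
	bool pte_dirty;   /* models pte_dirty() on the PTE mapping the page */
};

/* Check from the original patch: looks at the swap-backed flag only. */
bool skip_copy_v1(bool is_khugepaged, const struct model_page *p)
{
	return is_khugepaged && !p->swapbacked;
}

/* Revised check from this reply: additionally requires both dirty bits clear. */
bool skip_copy_v2(bool is_khugepaged, const struct model_page *p)
{
	return is_khugepaged && !p->swapbacked &&
	       !p->pte_dirty && !p->page_dirty;
}

/* Walks through the three cases discussed in the thread. */
void run_examples(void)
{
	/* Lazyfree page never written again: both checks zero-fill it. */
	struct model_page untouched = { .swapbacked = false };
	assert(skip_copy_v1(true, &untouched));
	assert(skip_copy_v2(true, &untouched));

	/* Written after MADV_FREE, before reclaim ran: v1 would drop the data. */
	struct model_page rewritten = { .swapbacked = false, .pte_dirty = true };
	assert(skip_copy_v1(true, &rewritten));   /* the corruption case */
	assert(!skip_copy_v2(true, &rewritten));  /* revised check copies it */

	/* Ordinary anonymous page: neither check skips the copy. */
	struct model_page normal = { .swapbacked = true, .pte_dirty = true };
	assert(!skip_copy_v1(true, &normal));
	assert(!skip_copy_v2(true, &normal));
}
```

Calling run_examples() from a test harness exercises all three cases. The
"rewritten" case is the one raised in the review: only the revised predicate
falls through to the copy path for it.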