From: Yang Shi
Date: Thu, 1 Feb 2024 12:37:45 -0800
Subject: Re: [PATCH 1/1] mm/khugepaged: skip copying lazyfree pages on collapse
To: Lance Yang
Cc: akpm@linux-foundation.org, mhocko@suse.com, zokeefe@google.com, david@redhat.com, songmuchun@bytedance.com, peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20240201125226.28372-1-ioworker0@gmail.com>

On Thu, Feb 1, 2024 at 4:53 AM Lance Yang wrote:
>
> The collapsing behavior of khugepaged with pages
> marked using MADV_FREE might cause confusion
> among users.
>
> For instance, allocate a 2MB chunk using mmap and
> later release it by MADV_FREE. Khugepaged will not
> collapse this chunk. From the user's perspective,
> it treats lazyfree pages as pte_none.
> However, for some pages marked as lazyfree with MADV_FREE,
> khugepaged might collapse this chunk and copy
> these pages to a new huge page. This inconsistency
> in behavior could be confusing for users.
>
> After a successful MADV_FREE operation, if there is
> no subsequent write, the kernel can free the pages
> at any time. Therefore, in my opinion, counting
> lazyfree pages in max_pte_none seems reasonable.
>
> Perhaps treating MADV_FREE like MADV_DONTNEED, not
> copying lazyfree pages when khugepaged collapses
> huge pages in the background better aligns with
> user expectations.
>
> Signed-off-by: Lance Yang
> ---
>  mm/khugepaged.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 2b219acb528e..6cbf46d42c6a 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -777,6 +777,7 @@ static int __collapse_huge_page_copy(pte_t *pte,
>                                       pmd_t orig_pmd,
>                                       struct vm_area_struct *vma,
>                                       unsigned long address,
> +                                     struct collapse_control *cc,
>                                       spinlock_t *ptl,
>                                       struct list_head *compound_pagelist)
>  {
> @@ -797,6 +798,13 @@ static int __collapse_huge_page_copy(pte_t *pte,
>                         continue;
>                 }
>                 src_page = pte_page(pteval);
> +
> +               if (cc->is_khugepaged
> +                   && !folio_test_swapbacked(page_folio(src_page))) {
> +                       clear_user_highpage(page, _address);
> +                       continue;

If the page was written to before khugepaged collapsed it, and
khugepaged collapsed it before memory reclaim kicked in, wouldn't
this cause data corruption?

> +               }
> +
>                 if (copy_mc_user_highpage(page, src_page, _address, vma) > 0) {
>                         result = SCAN_COPY_MC;
>                         break;
> @@ -1205,7 +1213,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>         anon_vma_unlock_write(vma->anon_vma);
>
>         result = __collapse_huge_page_copy(pte, hpage, pmd, _pmd,
> -                                          vma, address, pte_ptl,
> +                                          vma, address, cc, pte_ptl,
>                                            &compound_pagelist);
>         pte_unmap(pte);
>         if (unlikely(result != SCAN_SUCCEED))
> --
> 2.33.1
>
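
For reference, the user-visible guarantee behind the question above can be sketched from user space. madvise(2) documents that a write to a page after MADV_FREE cancels the lazy free for that page, so the written data must stay visible. The snippet below is only a minimal sketch of that contract: it does not exercise khugepaged or reclaim, and the names lazyfree_write_survives and CHUNK_SIZE are made up for illustration. The fallback value for MADV_FREE is an assumption for older libc headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE 8 /* assumption: Linux value, for libc headers that lack it */
#endif

#define CHUNK_SIZE (2UL * 1024 * 1024) /* hypothetical: one PMD-sized chunk on x86-64 */

/*
 * Returns 1 if a write made after MADV_FREE is still visible,
 * 0 if it was lost, and -1 if the check could not run
 * (mmap failed or MADV_FREE is not supported here).
 */
int lazyfree_write_survives(void)
{
	unsigned char *buf = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Fault in every page so there is something to mark lazyfree. */
	memset(buf, 0xaa, CHUNK_SIZE);

	if (madvise(buf, CHUNK_SIZE, MADV_FREE) != 0) {
		munmap(buf, CHUNK_SIZE);
		return -1;
	}

	/* Re-dirty the first page; from now on the kernel must keep its data. */
	volatile unsigned char *first = buf;
	*first = 0x55;

	int ok = (*first == 0x55);

	munmap(buf, CHUNK_SIZE);
	return ok;
}
```

A collapse that zero-fills a lazyfree page based only on the swapbacked flag would have to make sure it never hits a page in the re-dirtied state shown here; whether the patch guarantees that is exactly what is being asked.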