From: Yang Shi
Date: Fri, 2 Feb 2024 09:43:30 -0800
Subject: Re: [PATCH 1/1] mm/khugepaged: skip copying lazyfree pages on collapse
To: Lance Yang
Cc: akpm@linux-foundation.org, mhocko@suse.com, zokeefe@google.com, david@redhat.com, songmuchun@bytedance.com, peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240201125226.28372-1-ioworker0@gmail.com>

On Fri, Feb 2, 2024 at 3:23 AM Lance Yang wrote:
>
> On Fri, Feb 2, 2024 at 4:37 AM Yang Shi wrote:
> >
> > On Thu, Feb 1, 2024 at 4:53 AM Lance Yang wrote:
> > >
> > > The collapsing behavior of khugepaged with pages
> > > marked using MADV_FREE might cause confusion
> > > among users.
> > >
> > > For instance, allocate a 2MB chunk using mmap and
> > > later release it by MADV_FREE. Khugepaged will not
> > > collapse this chunk. From the user's perspective,
> > > it treats lazyfree pages as pte_none. However,
> > > for some pages marked as lazyfree with MADV_FREE,
> > > khugepaged might collapse this chunk and copy
> > > these pages to a new huge page. This inconsistency
> > > in behavior could be confusing for users.
> > >
> > > After a successful MADV_FREE operation, if there is
> > > no subsequent write, the kernel can free the pages
> > > at any time. Therefore, in my opinion, counting
> > > lazyfree pages in max_pte_none seems reasonable.
> > >
> > > Perhaps treating MADV_FREE like MADV_DONTNEED, not
> > > copying lazyfree pages when khugepaged collapses
> > > huge pages in the background better aligns with
> > > user expectations.
> > >
> > > Signed-off-by: Lance Yang
> > > ---
> > >  mm/khugepaged.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 2b219acb528e..6cbf46d42c6a 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -777,6 +777,7 @@ static int __collapse_huge_page_copy(pte_t *pte,
> > >                                      pmd_t orig_pmd,
> > >                                      struct vm_area_struct *vma,
> > >                                      unsigned long address,
> > > +                                    struct collapse_control *cc,
> > >                                      spinlock_t *ptl,
> > >                                      struct list_head *compound_pagelist)
> > >  {
> > > @@ -797,6 +798,13 @@ static int __collapse_huge_page_copy(pte_t *pte,
> > >                         continue;
> > >                 }
> > >                 src_page = pte_page(pteval);
> > > +
> > > +               if (cc->is_khugepaged
> > > +                   && !folio_test_swapbacked(page_folio(src_page))) {
> > > +                       clear_user_highpage(page, _address);
> > > +                       continue;
> >
> > If the page was written before khugepaged collapsed it, and khugepaged
> > collapsed the page before memory reclaim kicked in, didn't this
> > somehow cause data corruption?
>
> Thanks a lot! Yang, you're correct; indeed, there is
> a potential issue with data corruption.
>
> I took a look at the check for lazyfree pages in
> smaps_pte_entry.
>
> Here's the modification:
> if (cc->is_khugepaged && !PageSwapBacked(src_page)
>     && !pte_dirty(pteval) && !PageDirty(src_page)) {
>         clear_user_highpage(page, _address);
>         continue;
> }

This may be ok. But as I said in another reply, this may still incur
data corruption.

>
> Could you please take a look?
>
> Thanks,
> Lance
>
> > > +               }
> > > +
> > >                 if (copy_mc_user_highpage(page, src_page, _address, vma) > 0) {
> > >                         result = SCAN_COPY_MC;
> > >                         break;
> > > @@ -1205,7 +1213,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
> > >         anon_vma_unlock_write(vma->anon_vma);
> > >
> > >         result = __collapse_huge_page_copy(pte, hpage, pmd, _pmd,
> > > -                                          vma, address, pte_ptl,
> > > +                                          vma, address, cc, pte_ptl,
> > >                                            &compound_pagelist);
> > >         pte_unmap(pte);
> > >         if (unlikely(result != SCAN_SUCCEED))
> > > --
> > > 2.33.1
> > >
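
For illustration only (this is not part of the patch or of any message
in this thread), the scenario questioned above can be modeled in plain
userspace C. The sketch assumes that a lazyfree page is one whose
swap-backed flag is clear, and that a write after MADV_FREE dirties the
PTE again before reclaim has run. struct model_page, skip_v1, skip_v2,
model_collapse_copy and model_rewritten_lazyfree_page are hypothetical
names, not kernel APIs.

```c
/* Userspace model only: none of these types or helpers exist in the kernel. */
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16	/* tiny stand-in for a real page */

struct model_page {
	bool swapbacked;	/* assumed clear for a lazyfree page */
	bool pte_dirty;		/* assumed set by a write after MADV_FREE */
	bool page_dirty;	/* dirty state tracked on the page itself */
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Check as in the original patch: skip every non-swap-backed page. */
static bool skip_v1(const struct model_page *p)
{
	return !p->swapbacked;
}

/* Check as in the proposed modification: also require a clean page. */
static bool skip_v2(const struct model_page *p)
{
	return !p->swapbacked && !p->pte_dirty && !p->page_dirty;
}

/*
 * Fill dst the way the modeled copy loop would: zero it when the
 * predicate says "skip", otherwise copy the source data.
 * Returns true when dst was zeroed.
 */
static bool model_collapse_copy(unsigned char *dst,
				const struct model_page *src,
				bool (*skip)(const struct model_page *))
{
	if (skip(src)) {
		memset(dst, 0, MODEL_PAGE_SIZE);
		return true;
	}
	memcpy(dst, src->data, MODEL_PAGE_SIZE);
	return false;
}

/* A page that was MADV_FREE'd and then written again before reclaim. */
static struct model_page model_rewritten_lazyfree_page(void)
{
	struct model_page p = {
		.swapbacked = false,
		.pte_dirty = true,
		.page_dirty = false,
	};

	memset(p.data, 0xAB, sizeof(p.data));
	return p;
}
```

Passing the page from model_rewritten_lazyfree_page() through
model_collapse_copy() with skip_v1 zeroes the destination and loses the
newly written data, while skip_v2 copies it. The model covers only that
one scenario; it says nothing about the remaining corruption case
mentioned in the reply above.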