From: Lance Yang
Date: Thu, 1 Feb 2024 09:13:02 +0800
Subject: Re: [PATCH 1/1] mm/khugepaged: bypassing unnecessary scans with MMF_DISABLE_THP check
To: Yang Shi
Cc: akpm@linux-foundation.org, mhocko@suse.com, zokeefe@google.com, david@redhat.com, songmuchun@bytedance.com, peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240129054551.57728-1-ioworker0@gmail.com>

Hey Yang,

Thank you for the clarification. You're correct. If the daemon calls
prctl with MMF_DISABLE_THP before fork, the child mm won't be on
the hash list.

What I meant is that the daemon mm might already be on the hash list
before fork. Therefore, khugepaged might still scan the address space
for the daemon.
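For illustration, here is a minimal userspace sketch of the ordering discussed above: the parent sets the flag with prctl, then forks, and the child reports what it sees. It assumes a Linux system whose headers define PR_SET_THP_DISABLE and PR_GET_THP_DISABLE. The function name thp_disable_seen_by_child is made up for this example and is not part of the patch. prctl(2) documents that the setting is inherited across fork(2), so a return value of 1 is what one would expect; -1 only means the sketch could not run in the current environment.

```c
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: disable THP for the caller, fork, and report
 * what the child observes through PR_GET_THP_DISABLE.
 *
 * Returns 1 if the child sees the flag set, 0 if it does not,
 * and -1 on any error. The flag stays set in the caller afterwards.
 */
static int thp_disable_seen_by_child(void)
{
	pid_t pid;
	int status;

	/* Sets MMF_DISABLE_THP on the caller's mm. */
	if (prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0) != 0)
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: read the inherited setting, hand it back as exit code. */
		int v = prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0);

		_exit(v == 1 ? 1 : (v == 0 ? 0 : 2));
	}

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;

	switch (WEXITSTATUS(status)) {
	case 0:
		return 0;
	case 1:
		return 1;
	default:
		return -1;	/* child could not query the flag */
	}
}
```

Note that this only shows the flag being inherited by the child. It says nothing about whether the parent's mm was already queued for khugepaged before the prctl call, which is the case the patch is aimed at.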

Thanks,
Lance

On Thu, Feb 1, 2024 at 4:06 AM Yang Shi wrote:
>
> On Wed, Jan 31, 2024 at 1:30 AM Lance Yang wrote:
> >
> > Updating the change log.
> >
> > khugepaged scans the entire address space in the
> > background for each given mm, looking for
> > opportunities to merge sequences of basic pages
> > into huge pages. However, when an mm is inserted
> > to the mm_slots list, and the MMF_DISABLE_THP
> > flag is set later, this scanning process becomes
> > unnecessary for that mm and can be skipped to
> > avoid redundant operations, especially in scenarios
> > with a large address space.
> >
> > This commit introduces a check before each scanning
> > process to test the MMF_DISABLE_THP flag for the
> > given mm; if the flag is set, the scanning process is
> > bypassed, thereby improving the efficiency of khugepaged.
> >
> > This optimization is not a correctness issue but rather an
> > enhancement to save expensive checks on each VMA
> > when userspace cannot prctl itself before spawning
> > into the new process.
>
> If this is an optimization, you'd better show some real numbers to help justify.
>
> >
> > On some servers within our company, we deploy a
> > daemon responsible for monitoring and updating local
> > applications. Some applications prefer not to use THP,
> > so the daemon calls prctl to disable THP before fork/exec.
> > Conversely, for other applications, the daemon calls prctl
> > to enable THP before fork/exec.
>
> If your daemon calls prctl with MMF_DISABLE_THP before fork, then you
> end up having the child mm on the hash list in the first place, I
> think it should be a bug in khugepaged_fork() IIUC. khugepaged_fork()
> should check this flag and bail out if it is set. Did I miss
> something?
>
> >
> > Ideally, the daemon should invoke prctl after the fork,
> > but its current implementation follows the described
> > approach.
> > In the Go standard library, there is no direct
> > encapsulation of the fork system call; instead, fork and
> > execve are combined into one through syscall.ForkExec.
> >
> > Thanks,
> > Lance
> >
> > On Mon, Jan 29, 2024 at 1:46 PM Lance Yang wrote:
> > >
> > > khugepaged scans the entire address space in the
> > > background for each given mm, looking for
> > > opportunities to merge sequences of basic pages
> > > into huge pages. However, when an mm is inserted
> > > to the mm_slots list, and the MMF_DISABLE_THP flag
> > > is set later, this scanning process becomes
> > > unnecessary for that mm and can be skipped to avoid
> > > redundant operations, especially in scenarios with
> > > a large address space.
> > >
> > > This commit introduces a check before each scanning
> > > process to test the MMF_DISABLE_THP flag for the
> > > given mm; if the flag is set, the scanning process
> > > is bypassed, thereby improving the efficiency of
> > > khugepaged.
> > >
> > > Signed-off-by: Lance Yang
> > > ---
> > >  mm/khugepaged.c | 18 ++++++++++++------
> > >  1 file changed, 12 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 2b219acb528e..d6a700834edc 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -410,6 +410,12 @@ static inline int hpage_collapse_test_exit(struct mm_struct *mm)
> > >         return atomic_read(&mm->mm_users) == 0;
> > >  }
> > >
> > > +static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
> > > +{
> > > +       return hpage_collapse_test_exit(mm) ||
> > > +              test_bit(MMF_DISABLE_THP, &mm->flags);
> > > +}
> > > +
> > >  void __khugepaged_enter(struct mm_struct *mm)
> > >  {
> > >         struct khugepaged_mm_slot *mm_slot;
> > > @@ -1422,7 +1428,7 @@ static void collect_mm_slot(struct khugepaged_mm_slot *mm_slot)
> > >
> > >         lockdep_assert_held(&khugepaged_mm_lock);
> > >
> > > -       if (hpage_collapse_test_exit(mm)) {
> > > +       if (hpage_collapse_test_exit_or_disable(mm)) {
> > >                 /* free mm_slot */
> > >                 hash_del(&slot->hash);
> > >                 list_del(&slot->mm_node);
> > > @@ -2360,7 +2366,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                 goto breakouterloop_mmap_lock;
> > >
> > >         progress++;
> > > -       if (unlikely(hpage_collapse_test_exit(mm)))
> > > +       if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
> > >                 goto breakouterloop;
> > >
> > >         vma_iter_init(&vmi, mm, khugepaged_scan.address);
> > > @@ -2368,7 +2374,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                 unsigned long hstart, hend;
> > >
> > >                 cond_resched();
> > > -               if (unlikely(hpage_collapse_test_exit(mm))) {
> > > +               if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
> > >                         progress++;
> > >                         break;
> > >                 }
> > > @@ -2390,7 +2396,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                         bool mmap_locked = true;
> > >
> > >                         cond_resched();
> > > -                       if (unlikely(hpage_collapse_test_exit(mm)))
> > > +                       if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
> > >                                 goto breakouterloop;
> > >
> > >                         VM_BUG_ON(khugepaged_scan.address < hstart ||
> > > @@ -2408,7 +2414,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                                 fput(file);
> > >                                 if (*result == SCAN_PTE_MAPPED_HUGEPAGE) {
> > >                                         mmap_read_lock(mm);
> > > -                                       if (hpage_collapse_test_exit(mm))
> > > +                                       if (hpage_collapse_test_exit_or_disable(mm))
> > >                                                 goto breakouterloop;
> > >                                         *result = collapse_pte_mapped_thp(mm,
> > >                                                 khugepaged_scan.address, false);
> > > @@ -2450,7 +2456,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >          * Release the current mm_slot if this mm is about to die, or
> > >          * if we scanned all vmas of this mm.
> > >          */
> > > -       if (hpage_collapse_test_exit(mm) || !vma) {
> > > +       if (hpage_collapse_test_exit_or_disable(mm) || !vma) {
> > >                 /*
> > >                  * Make sure that if mm_users is reaching zero while
> > >                  * khugepaged runs here, khugepaged_exit will find
> > > --
> > > 2.33.1
> > >
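
For readers who want to experiment with the logic outside the kernel, the combined predicate from the quoted patch can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: struct mock_mm, MOCK_MMF_DISABLE_THP (including its bit value) and mock_scan are hypothetical stand-ins invented for this example. They are not kernel definitions, and the loop is a much simplified picture of khugepaged_scan_mm_slot().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two mm_struct fields the helper reads. */
struct mock_mm {
	int mm_users;		/* models atomic_read(&mm->mm_users) */
	unsigned long flags;	/* models mm->flags */
};

/* Illustrative bit index only; the real value lives in kernel headers. */
#define MOCK_MMF_DISABLE_THP 3

/* Models hpage_collapse_test_exit(): the mm has no users left. */
static bool mock_test_exit(const struct mock_mm *mm)
{
	return mm->mm_users == 0;
}

/* Models hpage_collapse_test_exit_or_disable() from the patch. */
static bool mock_test_exit_or_disable(const struct mock_mm *mm)
{
	return mock_test_exit(mm) ||
	       (mm->flags & (1UL << MOCK_MMF_DISABLE_THP)) != 0;
}

/*
 * Walks nr_vmas pretend VMAs, re-checking the predicate before each
 * one, and returns how many were actually visited.
 */
static int mock_scan(const struct mock_mm *mm, int nr_vmas)
{
	int visited = 0;

	for (int i = 0; i < nr_vmas; i++) {
		if (mock_test_exit_or_disable(mm))
			break;
		visited++;
	}
	return visited;
}
```

With the flag bit set in flags, mock_scan() stops before visiting any VMA, which mirrors the early bail-out the patch adds. With the flag clear and mm_users non-zero, every VMA is visited.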