From: Lance Yang
Date: Tue, 30 Jan 2024 10:37:26 +0800
Subject: Re: [PATCH 1/1] mm/khugepaged: bypassing unnecessary scans with MMF_DISABLE_THP check
To: "Zach O'Keefe"
Cc: Yang Shi, akpm@linux-foundation.org, mhocko@suse.com, david@redhat.com,
    songmuchun@bytedance.com, peterx@redhat.com, minchan@kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240129054551.57728-1-ioworker0@gmail.com>

Hey Zach,

Thanks for taking the time to review!

On Tue, Jan 30, 2024 at 3:04 AM Zach O'Keefe wrote:
[...]
> IIUC, there really isn't any correctness race. Claim is just that we

Yes, there is indeed no correctness race.

> can avoid a number of per-vma checks. AFAICT, any task w/
> MMF_DISABLE_THP set will always have each and every vma checked
> (albeit, with a very inexpensive ->vm_mm->flags check)
[...]

IMO, for any task with MMF_DISABLE_THP set, the per-VMA checks can be
skipped entirely with a single, very inexpensive ->mm->flags check,
which avoids redundant work, especially for tasks with a large
address space.

BR,
Lance

On Tue, Jan 30, 2024 at 3:04 AM Zach O'Keefe wrote:
>
> On Mon, Jan 29, 2024 at 10:53 AM Yang Shi wrote:
> >
> > On Sun, Jan 28, 2024 at 9:46 PM Lance Yang wrote:
> > >
> > > khugepaged scans the entire address space in the
> > > background for each given mm, looking for
> > > opportunities to merge sequences of basic pages
> > > into huge pages. However, when an mm is inserted
> > > to the mm_slots list, and the MMF_DISABLE_THP flag
> > > is set later, this scanning process becomes
> > > unnecessary for that mm and can be skipped to avoid
> > > redundant operations, especially in scenarios with
> > > a large address space.
> > >
> > > This commit introduces a check before each scanning
> > > process to test the MMF_DISABLE_THP flag for the
> > > given mm; if the flag is set, the scanning process
> > > is bypassed, thereby improving the efficiency of
> > > khugepaged.
> > >
> > > Signed-off-by: Lance Yang
> > > ---
> > >  mm/khugepaged.c | 18 ++++++++++++------
> > >  1 file changed, 12 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 2b219acb528e..d6a700834edc 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -410,6 +410,12 @@ static inline int hpage_collapse_test_exit(struct mm_struct *mm)
> > >         return atomic_read(&mm->mm_users) == 0;
> > >  }
> > >
> > > +static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
> > > +{
> > > +       return hpage_collapse_test_exit(mm) ||
> > > +              test_bit(MMF_DISABLE_THP, &mm->flags);
> > > +}
> > > +
> > >  void __khugepaged_enter(struct mm_struct *mm)
> > >  {
> > >         struct khugepaged_mm_slot *mm_slot;
> > > @@ -1422,7 +1428,7 @@ static void collect_mm_slot(struct khugepaged_mm_slot *mm_slot)
> > >
> > >         lockdep_assert_held(&khugepaged_mm_lock);
> > >
> > > -       if (hpage_collapse_test_exit(mm)) {
> > > +       if (hpage_collapse_test_exit_or_disable(mm)) {
> > >                 /* free mm_slot */
> > >                 hash_del(&slot->hash);
> > >                 list_del(&slot->mm_node);
> > > @@ -2360,7 +2366,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                 goto breakouterloop_mmap_lock;
> > >
> > >         progress++;
> > > -       if (unlikely(hpage_collapse_test_exit(mm)))
> > > +       if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
> > >                 goto breakouterloop;
> > >
> > >         vma_iter_init(&vmi, mm, khugepaged_scan.address);
> > > @@ -2368,7 +2374,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                 unsigned long hstart, hend;
> > >
> > >                 cond_resched();
> > > -               if (unlikely(hpage_collapse_test_exit(mm))) {
> > > +               if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
> >
> > The later thp_vma_allowable_order() does check whether MMF_DISABLE_THP
> > is set or not. And the hugepage_vma_revalidate() after re-acquiring
> > mmap_lock does the same check too. The checking in khugepaged should
> > be already serialized with prctl, which takes mmap_lock in write.
>
> IIUC, there really isn't any correctness race. Claim is just that we
> can avoid a number of per-vma checks. AFAICT, any task w/
> MMF_DISABLE_THP set will always have each and every vma checked
> (albeit, with a very inexpensive ->vm_mm->flags check)
>
> Thanks,
> Zach
>
> > >                         progress++;
> > >                         break;
> > >                 }
> > > @@ -2390,7 +2396,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                         bool mmap_locked = true;
> > >
> > >                         cond_resched();
> > > -                       if (unlikely(hpage_collapse_test_exit(mm)))
> > > +                       if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
> > >                                 goto breakouterloop;
> > >
> > >                         VM_BUG_ON(khugepaged_scan.address < hstart ||
> > > @@ -2408,7 +2414,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                                 fput(file);
> > >                                 if (*result == SCAN_PTE_MAPPED_HUGEPAGE) {
> > >                                         mmap_read_lock(mm);
> > > -                                       if (hpage_collapse_test_exit(mm))
> > > +                                       if (hpage_collapse_test_exit_or_disable(mm))
> > >                                                 goto breakouterloop;
> > >                                         *result = collapse_pte_mapped_thp(mm,
> > >                                                 khugepaged_scan.address, false);
> > > @@ -2450,7 +2456,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >          * Release the current mm_slot if this mm is about to die, or
> > >          * if we scanned all vmas of this mm.
> > >          */
> > > -       if (hpage_collapse_test_exit(mm) || !vma) {
> > > +       if (hpage_collapse_test_exit_or_disable(mm) || !vma) {
> > >                 /*
> > >                  * Make sure that if mm_users is reaching zero while
> > >                  * khugepaged runs here, khugepaged_exit will find
> > > --
> > > 2.33.1
> > >
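
To make the "avoid a number of per-vma checks" point concrete, here is a
minimal userspace sketch in C. It is a toy model under stated assumptions,
not kernel code: struct toy_mm, toy_test_exit(), toy_test_exit_or_disable()
and toy_scan() are hypothetical names that only mirror the shape of
hpage_collapse_test_exit() and the proposed
hpage_collapse_test_exit_or_disable(). The per-VMA branch merely stands in
for the later eligibility check that rejects each VMA when the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct mm_struct; hypothetical, not the kernel layout. */
struct toy_mm {
	int mm_users;      /* models mm->mm_users */
	bool disable_thp;  /* models MMF_DISABLE_THP in mm->flags */
	size_t nr_vmas;    /* number of VMAs in the address space */
};

/* Mirrors the existing helper: the mm is exiting once mm_users hits zero. */
static bool toy_test_exit(const struct toy_mm *mm)
{
	return mm->mm_users == 0;
}

/* Mirrors the proposed helper: exiting, or THP disabled for the whole mm. */
static bool toy_test_exit_or_disable(const struct toy_mm *mm)
{
	return toy_test_exit(mm) || mm->disable_thp;
}

/*
 * Walk the VMAs of one mm and return how many were visited.
 * With check_disable_up_front set, a single mm-level test ends the
 * scan before the walk starts; otherwise every VMA is visited and
 * rejected one by one when THP is disabled.
 */
static size_t toy_scan(const struct toy_mm *mm, bool check_disable_up_front)
{
	size_t visited = 0;

	assert(mm != NULL);

	if (check_disable_up_front ? toy_test_exit_or_disable(mm)
				   : toy_test_exit(mm))
		return 0;

	for (size_t i = 0; i < mm->nr_vmas; i++) {
		visited++;
		/* Stand-in for the per-VMA eligibility check. */
		if (mm->disable_thp)
			continue;
		/* A collapse attempt would go here in the real scanner. */
	}
	return visited;
}
```

In this model, an mm with 1000 VMAs and the flag set is visited 1000 times
when only the exit test is used, and 0 times with the combined test. The
outcome is the same either way (nothing is collapsed), which matches the
observation that there is no correctness race; only the amount of walking
differs.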