From mboxrd@z Thu Jan  1 00:00:00 1970
References: <20230712060144.3006358-1-fengwei.yin@intel.com> <20230712060144.3006358-3-fengwei.yin@intel.com>
In-Reply-To: <20230712060144.3006358-3-fengwei.yin@intel.com>
From: Yu Zhao <yuzhao@google.com>
Date: Wed, 12 Jul 2023 00:23:23 -0600
Subject: Re: [RFC PATCH v2 2/3] mm: handle large folio when large folio in VM_LOCKED VMA range
To: Yin Fengwei
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, willy@infradead.org, david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
On Wed, Jul 12, 2023 at 12:02 AM Yin Fengwei wrote:
>
> If a large folio is in the range of a VM_LOCKED VMA, it should be
> mlocked to avoid being picked by page reclaim, which may split
> the large folio and then mlock each page again.
>
> Mlock this kind of large folio to prevent it from being picked by
> page reclaim.
>
> For a large folio which crosses the boundary of a VM_LOCKED VMA,
> we'd better not mlock it.
> So if the system is under memory
> pressure, this kind of large folio will be split and the pages
> out of the VM_LOCKED VMA can be reclaimed.
>
> Signed-off-by: Yin Fengwei
> ---
>  mm/internal.h | 11 ++++++++---
>  mm/rmap.c     | 34 +++++++++++++++++++++++++++-------
>  2 files changed, 35 insertions(+), 10 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index c7dd15d8de3ef..776141de2797a 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -643,7 +643,8 @@ static inline void mlock_vma_folio(struct folio *folio,
>          * still be set while VM_SPECIAL bits are added: so ignore it then.
>          */
>         if (unlikely((vma->vm_flags & (VM_LOCKED|VM_SPECIAL)) == VM_LOCKED) &&
> -           (compound || !folio_test_large(folio)))
> +           (compound || !folio_test_large(folio) ||
> +           folio_in_range(folio, vma, vma->vm_start, vma->vm_end)))
>                 mlock_folio(folio);
>  }

This can be simplified:
1. remove the compound parameter
2. make the if

    if (unlikely((vma->vm_flags & (VM_LOCKED|VM_SPECIAL)) == VM_LOCKED) &&
        folio_within_vma())
            mlock_folio(folio);

> @@ -651,8 +652,12 @@ void munlock_folio(struct folio *folio);
>  static inline void munlock_vma_folio(struct folio *folio,
>                         struct vm_area_struct *vma, bool compound)

Remove the compound parameter here too.

> {
> -       if (unlikely(vma->vm_flags & VM_LOCKED) &&
> -           (compound || !folio_test_large(folio)))
> +       /*
> +        * To handle the case that a mlocked large folio is unmapped from VMA
> +        * piece by piece, allow munlock the large folio which is partially
> +        * mapped to VMA.
> +        */
> +       if (unlikely(vma->vm_flags & VM_LOCKED))
>                 munlock_folio(folio);
>  }
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 2668f5ea35342..455f415d8d9ca 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -803,6 +803,14 @@ struct folio_referenced_arg {
>         unsigned long vm_flags;
>         struct mem_cgroup *memcg;
>  };
> +
> +static inline bool should_restore_mlock(struct folio *folio,
> +               struct vm_area_struct *vma, bool pmd_mapped)
> +{
> +       return !folio_test_large(folio) ||
> +                       pmd_mapped || folio_within_vma(folio, vma);
> +}

This is just folio_within_vma() :)

>  /*
>   * arg: folio_referenced_arg will be passed
>   */
> @@ -816,13 +824,25 @@ static bool folio_referenced_one(struct folio *folio,
>         while (page_vma_mapped_walk(&pvmw)) {
>                 address = pvmw.address;
>
> -               if ((vma->vm_flags & VM_LOCKED) &&
> -                   (!folio_test_large(folio) || !pvmw.pte)) {
> -                       /* Restore the mlock which got missed */
> -                       mlock_vma_folio(folio, vma, !pvmw.pte);
> -                       page_vma_mapped_walk_done(&pvmw);
> -                       pra->vm_flags |= VM_LOCKED;
> -                       return false; /* To break the loop */
> +               if (vma->vm_flags & VM_LOCKED) {
> +                       if (should_restore_mlock(folio, vma, !pvmw.pte)) {
> +                               /* Restore the mlock which got missed */
> +                               mlock_vma_folio(folio, vma, !pvmw.pte);
> +                               page_vma_mapped_walk_done(&pvmw);
> +                               pra->vm_flags |= VM_LOCKED;
> +                               return false; /* To break the loop */
> +                       } else {

There is no need for "else", or just

    if (!folio_within_vma())
            goto dec_pra_mapcount;

> +                               /*
> +                                * For large folio cross VMA boundaries, it's
> +                                * expected to be picked by page reclaim. But
> +                                * should skip reference of pages which are in
> +                                * the range of VM_LOCKED vma. As page reclaim
> +                                * should just count the reference of pages out
> +                                * the range of VM_LOCKED vma.
> +                                */
> +                               pra->mapcount--;
> +                               continue;
> +                       }
> }