From: Frank van der Linden <fvdl@google.com>
Date: Wed, 19 Apr 2023 10:27:12 -0700
Subject: Re: [PATCH V7 0/2] mm: shmem: support POSIX_FADV_[WILL|DONT]NEED for shmem files
To: Pavan Kondeti
Cc: quic_charante@quicinc.com, akpm@linux-foundation.org, hughd@google.com, willy@infradead.org, markhemm@googlemail.com, rientjes@google.com, surenb@google.com, shakeelb@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20230419041926.GA99028@hu-pkondeti-hyd.qualcomm.com>
References: <20230418172942.740769-1-fvdl@google.com> <20230419041926.GA99028@hu-pkondeti-hyd.qualcomm.com>
On Tue, Apr 18, 2023 at 9:19 PM Pavan Kondeti wrote:
>
> On Tue, Apr 18, 2023 at 05:29:42PM +0000, Frank van der Linden wrote:
> > Below is a quick patch to allow FADVISE_DONTNEED for shmem to reclaim
> > mapped pages too. This would fit our usecase, and matches MADV_PAGEOUT
> > more closely.
> >
> > The patch series as posted skips mapped pages even if you remove
> > the folio_mapped() check, because page_referenced() in
> > shrink_page_list() will find page tables with the page mapped,
> > and ignore_references is not set when called from reclaim_pages().
> >
> > You can make this work in a similar fashion to MADV_PAGEOUT by
> > first unmapping a page, but only if the mapping belongs to
> > the caller. You just have to change the check to "is there
> > only one mapping and am I the owner". To do that, change a few
> > lines in try_to_unmap to allow for checking which mm the mapping
> > belongs to, and pass in current->mm (NULL still unmaps all mappings).
> >
> > I lightly tested this in a somewhat older codebase, so the diff
> > below isn't fully tested. But if there are no objections to
> > this approach, we could add it on top of your patchset after
> > better testing.
> >
> > - Frank
> >
> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> > index b87d01660412..4403cc2ccc4c 100644
> > --- a/include/linux/rmap.h
> > +++ b/include/linux/rmap.h
> > @@ -368,6 +368,8 @@ int folio_referenced(struct folio *, int is_locked,
> >
> >  void try_to_migrate(struct folio *folio, enum ttu_flags flags);
> >  void try_to_unmap(struct folio *, enum ttu_flags flags);
> > +void try_to_unmap_mm(struct mm_struct *mm, struct folio *folio,
> > +		enum ttu_flags flags);
> >
> >  int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> >  		unsigned long end, struct page **pages,
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 8632e02661ac..4d30e8f5afe2 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1443,6 +1443,11 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> >  	munlock_vma_folio(folio, vma, compound);
> >  }
> >
> > +struct unmap_arg {
> > +	enum ttu_flags flags;
> > +	struct mm_struct *mm;
> > +};
> > +
> >  /*
> >   * @arg: enum ttu_flags will be passed to this argument
> >   */
> > @@ -1455,7 +1460,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> >  	struct page *subpage;
> >  	bool anon_exclusive, ret = true;
> >  	struct mmu_notifier_range range;
> > -	enum ttu_flags flags = (enum ttu_flags)(long)arg;
> > +	struct unmap_arg *uap = (struct unmap_arg *)arg;
> > +	enum ttu_flags flags = uap->flags;
> > +
> > +	if (uap->mm && uap->mm != mm)
> > +		return true;
> >
> >  	/*
> >  	 * When racing against e.g. zap_pte_range() on another cpu,
> > @@ -1776,6 +1785,7 @@ static int folio_not_mapped(struct folio *folio)
> >
> >  /**
> >   * try_to_unmap - Try to remove all page table mappings to a folio.
> > + * @mm: mm to unmap from (NULL to unmap from all)
> >   * @folio: The folio to unmap.
> >   * @flags: action and flags
> >   *
> > @@ -1785,11 +1795,16 @@ static int folio_not_mapped(struct folio *folio)
> >   *
> >   * Context: Caller must hold the folio lock.
> >   */
> > -void try_to_unmap(struct folio *folio, enum ttu_flags flags)
> > +void try_to_unmap_mm(struct mm_struct *mm, struct folio *folio,
> > +		enum ttu_flags flags)
> >  {
> > +	struct unmap_arg ua = {
> > +		.flags = flags,
> > +		.mm = mm,
> > +	};
> >  	struct rmap_walk_control rwc = {
> >  		.rmap_one = try_to_unmap_one,
> > -		.arg = (void *)flags,
> > +		.arg = (void *)&ua,
> >  		.done = folio_not_mapped,
> >  		.anon_lock = folio_lock_anon_vma_read,
> >  	};
> > @@ -1800,6 +1815,11 @@ void try_to_unmap(struct folio *folio, enum ttu_flags flags)
> >  	rmap_walk(folio, &rwc);
> >  }
> >
> > +void try_to_unmap(struct folio *folio, enum ttu_flags flags)
> > +{
> > +	try_to_unmap_mm(NULL, folio, flags);
> > +}
> > +
> >  /*
> >   * @arg: enum ttu_flags will be passed to this argument.
> >   *
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 1af85259b6fc..b24af2fb3378 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2362,8 +2362,24 @@ static void shmem_isolate_pages_range(struct address_space *mapping, loff_t start
> >
> >  		if (!folio_try_get(folio))
> >  			continue;
> > -		if (folio_test_unevictable(folio) || folio_mapped(folio) ||
> > -		    folio_isolate_lru(folio)) {
> > +
> > +		if (folio_test_unevictable(folio)) {
> > +			folio_put(folio);
> > +			continue;
> > +		}
> > +
> > +		/*
> > +		 * If the folio is mapped once, try to unmap it from the
> > +		 * caller's page table. If it's still mapped afterwards,
> > +		 * it belongs to someone else, and we're not going to
> > +		 * change someone else's mapping.
> > +		 */
> > +		if (folio_mapcount(folio) == 1 && folio_trylock(folio)) {
> > +			try_to_unmap_mm(current->mm, folio, TTU_BATCH_FLUSH);
> > +			folio_unlock(folio);
> > +		}
>
> Can the rmap walk be done from an RCU read critical section, which does
> not allow explicit blocking?
>
> Thanks,
> Pavan

True, yes, rmap_walk may block, so the try_to_unmap calls should be
outside the loop. The easiest thing to do there is to add all mapped
pages to a separate list, walk that list outside of the rcu lock for
i_mapping, and add all pages that could be unmapped to the return list.

- Frank