From: Yu Zhao
Date: Thu, 13 Jul 2023 21:23:21 -0600
Subject: Re: [RFC PATCH] madvise: make madvise_cold_or_pageout_pte_range() support large folio
To: "Yin, Fengwei"
Cc: Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, willy@infradead.org, david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
References: <20230713150558.200545-1-fengwei.yin@intel.com> <8547495c-9051-faab-a47d-1962f2e0b1da@intel.com>
In-Reply-To: <8547495c-9051-faab-a47d-1962f2e0b1da@intel.com>

On Thu, Jul 13, 2023 at 9:10 PM Yin, Fengwei wrote:
>
> On 7/14/2023 10:08 AM, Yu Zhao wrote:
> > On Thu, Jul 13, 2023 at 9:06 AM Yin Fengwei wrote:
> >>
> >> Current madvise_cold_or_pageout_pte_range() has two problems for
> >> large folio support:
> >>   - Using folio_mapcount() with large folio prevent large folio from
> >>     picking up.
> >>   - If large folio is in the range requested, shouldn't split it
> >>     in madvise_cold_or_pageout_pte_range().
> >>
> >> Fix them by:
> >>   - Use folio_estimated_sharers() with large folio
> >>   - If large folio is in the range requested, don't split it. Leave
> >>     to page reclaim phase.
> >>
> >> For large folio cross boundaries of requested range, skip it if it's
> >> page cache. Try to split it if it's anonymous folio. If splitting
> >> fails, skip it.
> >
> > For now, we may not want to change the existing semantic (heuristic).
> > IOW, we may want to stick to the "only owner" condition:
> >
> > -  if (folio_mapcount(folio) != 1)
> > +  if (folio_entire_mapcount(folio) ||
> > +      (any_page_within_range_has_mapcount > 1))
> >
> > +Minchan Kim
> The folio_estimated_sharers() was discussed here:
> https://lore.kernel.org/linux-mm/20230118232219.27038-6-vishal.moola@gmail.com/
> https://lore.kernel.org/linux-mm/20230124012210.13963-2-vishal.moola@gmail.com/
>
> Yes. It's accurate to check each page of large folio. But it may be over killed in
> some cases (And I think madvise is one of the cases not necessary to be accurate.
> So folio_estimated_sharers() is enough. Correct me if I am wrong).

I see. Then it's possible this is also what the original commit wants
to do -- Minchan, could you clarify?

Regardless, I think we can have the following fix, potentially cc'ing stable:

-  if (folio_mapcount(folio) != 1)
+  if (folio_estimated_sharers(folio) != 1)

Sounds good?

> > Also there is an existing bug here: the later commit 07e8c82b5eff8
> > ("madvise: convert madvise_cold_or_pageout_pte_range() to use folios")
> > is incorrect for sure; the original commit 9c276cc65a58f ("mm:
> > introduce MADV_COLD") seems incorrect too.
> >
> > +Vishal Moola (Oracle)
> >
> > The "any_page_within_range_has_mapcount" test above seems to be the
> > only correct to meet condition claimed by the comments, before or
> > after the folio conversion, assuming here a THP page means the
> > compound page without PMD mappings (PMD-split). Otherwise the test is
> > always false (if it's also PMD mapped somewhere else).
> >
> > /*
> >  * Creating a THP page is expensive so split it only if we
> >  * are sure it's worth. Split it if we are only owner.
> >  */