From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yu Zhao
Date: Fri, 16 Aug 2024 15:16:26 -0600
Subject: Re: [PATCH] mm: remove migration for HugePage in isolate_single_pageblock()
To: David Hildenbrand
Cc: Zi Yan, Kefeng Wang, Andrew Morton, Oscar Salvador, linux-mm@kvack.org, Matthew Wilcox
In-Reply-To: <29d190d9-6b1a-409b-b3a1-90539ddbc091@redhat.com>
References: <20240816040625.650053-1-wangkefeng.wang@huawei.com> <29d190d9-6b1a-409b-b3a1-90539ddbc091@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
On Fri, Aug 16, 2024 at 2:12 PM David Hildenbrand wrote:
>
> On 16.08.24 17:06, Zi Yan wrote:
> > On 16 Aug 2024, at 7:30, Kefeng Wang wrote:
> >
> >> On 2024/8/16 18:11, David Hildenbrand wrote:
> >>> On 16.08.24 06:06, Kefeng Wang wrote:
> >>>> The gigantic page size may be larger than the memory block size, so
> >>>> memory offline always fails in this case after commit b2c9e2fbba32
> >>>> ("mm: make alloc_contig_range work at pageblock granularity"),
> >>>>
> >>>> offline_pages
> >>>>   start_isolate_page_range
> >>>>     start_isolate_page_range(isolate_before=true)
> >>>>       isolate [isolate_start, isolate_start + pageblock_nr_pages)
> >>>>     start_isolate_page_range(isolate_before=false)
> >>>>       isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
> >>>>       __alloc_contig_migrate_range
> >>>>         isolate_migratepages_range
> >>>>           isolate_migratepages_block
> >>>>             isolate_or_dissolve_huge_page
> >>>>               if (hstate_is_gigantic(h))
> >>>>                 return -ENOMEM;
> >>>>
> >>>> In fact, we don't need to migrate pages during page range isolation:
> >>>> for the memory offline path, there is do_migrate_range() to move the
> >>>> pages, and for contig allocation, there is another
> >>>> __alloc_contig_migrate_range() after isolation to migrate the pages.
> >>>> So fix the issue by skipping the __alloc_contig_migrate_range() in
> >>>> isolate_single_pageblock().
> >>>>
> >>>> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
> >>>> Signed-off-by: Kefeng Wang
> >>>> ---
> >>>>   mm/page_isolation.c | 28 +++-------------------------
> >>>>   1 file changed, 3 insertions(+), 25 deletions(-)
> >>>>
> >>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> >>>> index 39fb8c07aeb7..7e04047977cf 100644
> >>>> --- a/mm/page_isolation.c
> >>>> +++ b/mm/page_isolation.c
> >>>> @@ -403,30 +403,8 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
> >>>>           unsigned long head_pfn = page_to_pfn(head);
> >>>>           unsigned long nr_pages = compound_nr(head);
> >>>>
> >>>> -         if (head_pfn + nr_pages <= boundary_pfn) {
> >>>> -             pfn = head_pfn + nr_pages;
> >>>> -             continue;
> >>>> -         }
> >>>> -
> >>>> -#if defined CONFIG_COMPACTION || defined CONFIG_CMA
> >>>> -         if (PageHuge(page)) {
> >>>> -             int page_mt = get_pageblock_migratetype(page);
> >>>> -             struct compact_control cc = {
> >>>> -                 .nr_migratepages = 0,
> >>>> -                 .order = -1,
> >>>> -                 .zone = page_zone(pfn_to_page(head_pfn)),
> >>>> -                 .mode = MIGRATE_SYNC,
> >>>> -                 .ignore_skip_hint = true,
> >>>> -                 .no_set_skip_hint = true,
> >>>> -                 .gfp_mask = gfp_flags,
> >>>> -                 .alloc_contig = true,
> >>>> -             };
> >>>> -             INIT_LIST_HEAD(&cc.migratepages);
> >>>> -
> >>>> -             ret = __alloc_contig_migrate_range(&cc, head_pfn,
> >>>> -                     head_pfn + nr_pages, page_mt);
> >>>> -             if (ret)
> >>>> -                 goto failed;
> >>>
> >>> But won't this break alloc_contig_range() then? I would have expected
> >>> that you have to special-case here on the migration reason
> >>> (MEMORY_OFFLINE).
> >>>
> >>
> >> Yes, this is what I did in the RFC, only skipping migration for the
> >> offline path, but Zi Yan suggested removing the migration entirely [1].
> >>
> >> [1] https://lore.kernel.org/linux-mm/50FEEE33-49CA-48B5-B4C5-964F1BE25D43@nvidia.com/
> >>
> >>> I remember some dirty details when we're trying to allocate with a
> >>> single pageblock for alloc_contig_range().
> >
> > Most likely I was overthinking about the situation back then. I thought
>
> I'm more than happy if we can remove that code here :)
>
> > PageHuge, PageLRU, and __PageMovable all can be bigger than a pageblock,
> > but in reality only PageHuge can, and the gigantic PageHuge is freed as
> > order-0.
>
> Does that still hold with Yu's patches to allocate/free gigantic pages
> from CMA using compound pages that are on the list (and likely already
> in mm-unstable)?

Gigantic folios are now freed at pageblock granularity rather than
order-0, as Zi himself stated during the review :)

https://lore.kernel.org/linux-mm/29B680F7-E14D-4CD7-802B-5BBE1E1A3F92@nvidia.com/

> I did not look at the freeing path of that patchset. As
> the buddy doesn't understand anything larger than MAX_ORDER, I would
> assume that we are fine.

Correct.

> I assume the real issue is when we have a movable allocation (folio)
> that spans multiple pageblocks. For example, when MAX_ORDER is larger
> than a single pageblock, like it is on x86.
>
> Besides gigantic pages, I wonder if that can happen. Likely currently
> really only with hugetlb.
>
> > This means MIGRATE_ISOLATE pageblocks will get to the right
> > free list after __alloc_contig_migrate_range(), the one after
> > start_isolate_page_range().
> >
> > David, I know we do not have cross-pageblock PageLRU yet (wait until
> > someone adds PMD-level mTHP). But I am not sure about __PageMovable,
> > even if you and Johannes told me that __PageMovable has no compound
> > page.
>
> I think it's all order-0. Likely we should sanity check that somewhere
> (when setting a folio-page movable?).
>
> For example, the vmware balloon handles 2M pages differently than 4k
> pages. Only the latter is movable.
>
> > I wonder what are the use cases for __PageMovable. Is it possible for
> > a driver to mark its cross-pageblock page __PageMovable and provide
> > ->isolate_page and ->migratepage in its struct
> > address_space_operations? Or it is unsupported, so I should not need
> > to worry about it.
>
> I never tried. We should document and enforce/sanity check that it only
> works with order-0 for now.
>
> >
> >>>
> >>> Note that memory offlining always covers pageblocks larger than
> >>> MAX_ORDER chunks (which implies full pageblocks) but
> >>> alloc_contig_range() + CMA might only cover (parts of) single
> >>> pageblocks.
> >>>
> >>> Hoping Zi Yan can review :)
> >
> > At the moment, I think this is the right clean up.
>
> I think we want to have some way to catch when it changes. For example,
> can we warn if we find an LRU folio here that is larger than a single
> pageblock?
>
> Also, I think we have to document why it works with hugetlb gigantic
> folios / large CMA allocations somewhere (the order-0 stuff you note
> above). Maybe as part of this changelog.
>
> --
> Cheers,
>
> David / dhildenb
>