From: Baolin Wang
Date: Wed, 6 Mar 2024 16:35:26 +0800
Subject: Re: [PATCH 2/3] mm: hugetlb: make the hugetlb migration strategy consistent
To: Oscar Salvador
Cc: akpm@linux-foundation.org, muchun.song@linux.dev, david@redhat.com, linmiaohe@huawei.com, naoya.horiguchi@nec.com, mhocko@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Message-ID: <3eda72bd-25ad-4518-b38e-b63f75e5e94d@linux.alibaba.com>
References: <8d35b8ae-b8d8-4237-bfcf-ed63c0bb4223@linux.alibaba.com>

On 2024/2/28 16:41, Oscar Salvador wrote:
> On Wed, Feb 28, 2024 at 03:40:08PM +0800, Baolin Wang wrote:
>>
>>
>> On 2024/2/27 23:17, Oscar Salvador wrote:
>>> On Tue, Feb 27, 2024 at 09:52:26PM +0800, Baolin Wang wrote:
>>>
>>>> --- a/mm/hugetlb.c
>>>> +++ b/mm/hugetlb.c
>>>> @@ -2567,13 +2567,38 @@ static struct folio *alloc_surplus_hugetlb_folio(struct hstate *h,
>>>>  }
>>>>  static struct folio *alloc_migrate_hugetlb_folio(struct hstate *h, gfp_t gfp_mask,
>>>> -		int nid, nodemask_t *nmask)
>>>> +		int nid, nodemask_t *nmask, int reason)
>>>
>>> I still dislike taking the reason argument this far, and I'd rather have
>>> this as a boolean specifying whether we allow fallback on other nodes.
>>> That would mean parsing the reason in alloc_migration_target().
>>> If we don't add a new helper e.g: gfp_allow_fallback(), we can just do
>>> it right there and open-code it with e.g. a macro.
>>>
>>> Although doing it in an inline helper might help hiding these details.
>>>
>>> That's my take on this, but let's see what others have to say.
>>
>> Sure. I also expressed my preference for hiding these details within the
>> hugetlb core as much as possible.
>>
>> Muchun, what do you think? Thanks.
>
> JFTR: I'm talking about https://lore.kernel.org/linux-mm/ZdxXLTDZn8fD3pEn@localhost.localdomain/
> or maybe something cleaner which doesn't need a new helper (we could if
> we want though):
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index c1ee640d87b1..ddd794e861e6 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -73,6 +73,16 @@ struct resv_map {
>  #endif
>  };
>
> +#define MIGRATE_MEMORY_HOTPLUG	1UL << MR_MEMORY_HOTPLUG
> +#define MIGRATE_MEMORY_FAILURE	1UL << MR_MEMORY_FAILURE
> +#define MIGRATE_SYSCALL	1UL << MR_SYSCALL
> +#define MIGRATE_MBIND		1UL << MR_MEMPOLICY_MBIND
> +#define HTLB_ALLOW_FALLBACK	(MIGRATE_MEMORY_HOTPLUG| \
> +				 MIGRATE_MEMORY_FAILURE| \
> +				 MIGRATE_SYSCALL| \
> +				 MIGRATE_MBIND)
> +
> +
>  /*
>   * Region tracking -- allows tracking of reservations and instantiated pages
>   * across the pages in a mapping.
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ed1581b670d4..7e8d6b5885d6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2619,7 +2619,7 @@ struct folio *alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,
>
>  /* folio migration callback function */
>  struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
> -		nodemask_t *nmask, gfp_t gfp_mask)
> +		nodemask_t *nmask, gfp_t gfp_mask, bool allow_fallback)
>  {
>  	spin_lock_irq(&hugetlb_lock);
>  	if (available_huge_pages(h)) {
> @@ -2634,6 +2634,12 @@ struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
>  	}
>  	spin_unlock_irq(&hugetlb_lock);
>
> +	/*
> +	 * We cannot fallback to other nodes, as we could break the per-node pool
> +	 */
> +	if (!allow_fallback)
> +		gfp_mask |= GFP_THISNODE;
> +
>  	return alloc_migrate_hugetlb_folio(h, gfp_mask, preferred_nid, nmask);
>  }
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index cc9f2bcd73b4..c1f1d011629d 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2016,10 +2016,15 @@ struct folio *alloc_migration_target(struct folio *src, unsigned long private)
>
>  	if (folio_test_hugetlb(src)) {
>  		struct hstate *h = folio_hstate(src);
> +		bool allow_fallback = false;
> +
> +		if ((1UL << reason) & HTLB_ALLOW_FALLBACK)
> +			allow_fallback = true;

IMHO, callers should not need to be aware of this hugetlb logic either.

>
>  		gfp_mask = htlb_modify_alloc_mask(h, gfp_mask);
>  		return alloc_hugetlb_folio_nodemask(h, nid,
> -						mtc->nmask, gfp_mask);
> +						mtc->nmask, gfp_mask,
> +						allow_fallback);

'allow_fallback' can be confusing: it only controls fallback for a new
temporary hugetlb allocation, not for an available hugetlb allocation in
alloc_hugetlb_folio_nodemask().
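
To make the "hide it in hugetlb" preference concrete, below is a minimal
user-space sketch, not kernel code and not a proposed patch. It keeps the
bitmask check from the quoted diff but wraps it in one helper, so a caller
such as alloc_migration_target() would only pass the reason through. The
helper name htlb_allow_alloc_fallback() is hypothetical, and the enum is a
simplified stand-in for the kernel's enum migrate_reason (the values are
illustrative only).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified stand-in for the kernel's enum migrate_reason.
 * Assumption: the real enum has more entries and may be ordered differently.
 */
enum migrate_reason {
	MR_COMPACTION,
	MR_MEMORY_FAILURE,
	MR_MEMORY_HOTPLUG,
	MR_SYSCALL,
	MR_MEMPOLICY_MBIND,
	MR_NUMA_MISPLACED,
	MR_CONTIG_RANGE,
	MR_TYPES
};

/*
 * Reasons for which a new temporary hugetlb allocation may fall back to
 * other nodes (the same set as in the quoted diff). Each term is
 * parenthesized so the mask is safe to use inside larger expressions.
 */
#define HTLB_ALLOW_FALLBACK	((1UL << MR_MEMORY_HOTPLUG) | \
				 (1UL << MR_MEMORY_FAILURE) | \
				 (1UL << MR_SYSCALL) | \
				 (1UL << MR_MEMPOLICY_MBIND))

/*
 * Hypothetical helper: the only place that knows which migration reasons
 * permit fallback for the temporary allocation.
 */
static inline bool htlb_allow_alloc_fallback(int reason)
{
	/* Guard the shift below against out-of-range reasons. */
	assert(reason >= 0 && reason < MR_TYPES);
	return ((1UL << reason) & HTLB_ALLOW_FALLBACK) != 0;
}
```

With something along these lines, the name also states what is being
controlled (the new allocation), which may help with the naming concern
above.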