Date: Thu, 27 Feb 2025 18:24:09 +0800
From: Baoquan He
To: Vlastimil Babka
Cc: Michal Hocko, Gabriel Krisman Bertazi, akpm@linux-foundation.org, linux-mm@kvack.org, Mel Gorman
Subject: Re: [PATCH] Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone"
References: <20250226032258.234099-1-krisman@suse.de>

On 02/27/25 at 10:16am, Vlastimil Babka wrote:
> On 2/26/25 16:57, Baoquan He wrote:
> > On 02/26/25 at 01:01pm, Michal Hocko wrote:
> >>
> >> Why do you think anything needs to be adjusted?
> >
> > No, I don't think like that. But I am wondering what makes you get
> > the conclusion.
> >
> >>
> >> > I haven't thought of the whole zone fallback list to interleave nodes
> >> > which involves a lot of change.
> >> >
> >> > >
> >> > > Btw. has 96a5c186efff tried to fix any actual runtime problem? The
> >> > > changelog doesn't say much about that.
> >> >
> >> > No, no actual problem was observed on that.
> >>
> >> OK
> >>
> >> > I was just trying to make
> >> > clear the semantics because I was confused by its obscure value printing
> >> > of zone->lowmem_reserve[] in /proc/zoneinfo.
> >> >
> >> > I think we can merge this reverting firstly, then to investigate how to
> >> > better clarify it.
> >>
> >> What do you think needs to be clarified? How exactly is the original
> >> output confusing?
> >
> > When I did the change, I wrote the reason in commit log. I don't think
> > you care to read it from your talking. Let me explain again, in
> > setup_per_zone_lowmem_reserve(), each zone's protection value is
> > calculated based on its own node's zones. E.g below on node 0, its
> > Movable zone and Device zone are empty but still show influence on
> > Normal/DMA32/DMA zone, this is unreasonable from the protection value
> > calculating code and its showing.

Ah, I only saw your mail after I had finished replying to Michal. Thanks
for sharing such detailed reasoning; I agree with almost all of it.

>
> It's not unreasonable. A GFP_HIGHUSER_MOVABLE can use up to the Movable
> zone, so e.g. the dma32 zone should be protected from such an allocation, so
> it has space for GFP_DMA32 restricted allocations.

Yes, I didn't realize that when I made the change in commit 96a5c186efff,
sorry about that.

>
> If no Movable zone exists, but Normal zone does, the result is the
> protection will be the same for GFP_KERNEL allocations (that can use up to
> the Normal zone) and GFP_HIGHUSER_MOVABLE allocations. (i.e. the number of
> 22134 in your listing is the same for both indexes). That's fine.
> But setting the protection from Movable allocations to 0 as commit
> 96a5c186efff did was simply a bug, as that can directly lead to
> GFP_HIGHUSER_MOVABLE depleting ZONE_DMA32.

Yes, agreed. I think that's why Gabriel observed the regression.

>
> The only "unreasonable" part here is that we define and show protections
> from ZONE_DEVICE allocations. The usage of this zone is AFAIK completely
> separate from normal page allocation through zonelists, so we could exclude
> it, if anyone cared enough.

I don't think this is the only unreasonable part.
sysctl_lowmem_reserve_ratio is a knob provided for users to tune memory
management, but the underlying code that applies the ratio doesn't meet
expectations. Even after we revert my patch and things seem to work
well, the protection value is still not well managed; it just happens to
work. That's because the protection value is calculated as if for a
__GFP_THISNODE allocation, while ignoring fallback allocations. I think
you pointed that out well in the comment below.

>
> > If really as your colleague Gabriel said, the protection value of DMA zone
> > on node 0 will impact allocation when targeted zone is Movable zone, we
> > may need consider the protection value calculation across nodes. Because
> > the impact happens among different nodes. I only said we can do
> > investigation, I didn't say we need change or have to change.
>
> There might be a theoretical issue if e.g. Node 0 only contained DMA and
> DMA32 zones and nothing else, while the Normal zone is on Node 1, there
> would be no protection for DMA/DMA32 zones from Normal allocations, as
> setup_per_zone_lowmem_reserve() considers each node separately and thus
> would not take Normal zone size from Node 1 into account.
>
> Should we sum zone sizes across all nodes then? But then __GFP_THISNODE
> Normal allocations for node 0 would never succeed? Or we'd need a separate
> lowmem_reserve array for those?

Yeah, I had the same thought. We may need some adaptation here.

>
> I guess the issue doesn't happen in practice. In any case it's out of scope
> of the reverted commit and the revert.

It could happen on arm64, because arm64 only has ZONE_DMA by default and
its boundary is not fixed. I have seen arm64 systems where all memory is
in ZONE_DMA, so I guess it would be easier to find an arm64 system which
only has ZONE_DMA on node 0 and ZONE_NORMAL/MOVABLE on other nodes.

>
> > Node 0, zone DMA
> > ......
> >   pages free 2816
> > ......
> >   protection: (0, 1582, 23716, 23716, 23716)
> > Node 0, zone DMA32
> >   pages free 403269
> > ......
> >   protection: (0, 0, 22134, 22134, 22134)
> > Node 0, zone Normal
> >   pages free 5423879
> > ......
> >   protection: (0, 0, 0, 0, 0)
> > Node 0, zone Movable
> >   pages free 0
> > ......
> >   protection: (0, 0, 0, 0, 0)
> > Node 0, zone Device
> >   pages free 0
> > ......
> >   protection: (0, 0, 0, 0, 0)
> >
> >>
> >> --
> >> Michal Hocko
> >> SUSE Labs
> >>
> >
> >
>
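
For reference, here is a minimal Python sketch of the per-node calculation
discussed above. It is a simplified model and not the kernel code: the
function name lowmem_reserve(), the zero_for_empty_upper flag, the ratio
values and the managed page counts are all assumptions, and the page counts
were picked only so that the default mode reproduces the protection rows
quoted above.

```python
# Simplified model of a per-node lowmem_reserve calculation.
# Everything here is hypothetical illustration, not kernel code.

ZONES = ["DMA", "DMA32", "Normal", "Movable", "Device"]

# Assumed reserve ratios per zone; 0 means "no protection configured".
RATIO = [256, 256, 32, 0, 0]

# Hypothetical managed page counts for node 0, chosen to match the
# quoted /proc/zoneinfo protection rows. Movable and Device are empty.
MANAGED_NODE0 = [3840, 405000, 5666304, 0, 0]


def lowmem_reserve(managed, ratio, zero_for_empty_upper=False):
    """Return table[i][j]: pages of zone i held back from allocations
    whose highest usable zone is j, looking at one node only.

    zero_for_empty_upper=True models the reverted behaviour, where an
    empty upper zone gets a protection of 0.
    """
    n = len(managed)
    table = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        # No ratio set, or the zone itself is empty: nothing to protect.
        if ratio[i] == 0 or managed[i] == 0:
            continue
        upper_pages = 0
        for j in range(i + 1, n):
            # Accumulate the size of every zone above zone i, up to j.
            upper_pages += managed[j]
            if zero_for_empty_upper and managed[j] == 0:
                table[i][j] = 0
            else:
                table[i][j] = upper_pages // ratio[i]
    return table


if __name__ == "__main__":
    for name, row in zip(ZONES, lowmem_reserve(MANAGED_NODE0, RATIO)):
        print(f"{name:8s} protection: {tuple(row)}")
    print("with empty upper zones zeroed:")
    reverted = lowmem_reserve(MANAGED_NODE0, RATIO, zero_for_empty_upper=True)
    for name, row in zip(ZONES, reverted):
        print(f"{name:8s} protection: {tuple(row)}")
```

With zero_for_empty_upper=True the Movable and Device columns of the DMA and
DMA32 rows drop to 0, which is the case described above where
GFP_HIGHUSER_MOVABLE allocations are no longer held back from ZONE_DMA32.
Feeding the same function a node that has only DMA and DMA32 pages gives an
all-zero DMA32 row, since the model only ever looks at zones on its own node.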