Date: Tue, 8 Apr 2025 15:10:09 +0200
From: Oscar Salvador
To: Jinjiang Tu
Cc: david@redhat.com, akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, wangkefeng.wang@huawei.com
Subject: Re: [PATCH v3] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
In-Reply-To: <20250407124706.2688092-1-tujinjiang@huawei.com>

On Mon, Apr 07, 2025 at 08:47:06PM +0800, Jinjiang Tu wrote:
> In set_max_huge_pages(), min_count should mean the acquired persistent
> huge pages, but it contains surplus huge pages. It will lead to failing
> to free the free huge pages for a Node.
>
> Steps to reproduce:
> 1) create 5 hugetlb folios in Node0
> 2) run a program to use all the hugetlb folios
> 3) echo 0 > nr_hugepages for Node0 to free the hugetlb folios. Thus the 5
>    hugetlb folios in Node0 are accounted as surplus.
> 4) create 5 hugetlb folios in Node1
> 5) echo 0 > nr_hugepages for Node1 to free the hugetlb folios
>
> The result:
>         Node0    Node1
> Total   5        5
> Free    0        5
> Surp    5        5

I would put this after the explanation, as otherwise it is a bit hard to
follow.

> We couldn't subtract surplus_huge_pages from min_count, since free hugetlb
> folios may be surplus due to HVO. In __update_and_free_hugetlb_folio(),
> hugetlb_vmemmap_restore_folio() may fail, add the folio back to pool and
> treat it as surplus. If we directly subtract surplus_huge_pages from
> min_count, some free folios will be subtracted twice.
>
> To fix it, check if count is less than the num of free huge pages that
> could be destroyed (i.e., available_huge_pages(h)), and remove hugetlb
> folios if so.

But this is not true: you are no longer comparing against
available_huge_pages(h) as you did in v2.

I would go with something along these lines as changelog.

"In set_max_huge_pages(), min_count is computed taking into account also
 surplus huge pages, which might lead in some cases to not being able to
 free huge pages and end up accounting them as surplus instead.
 One way to solve it is to subtract surplus_huge_pages directly, but we
 cannot do it blindly because there might be surplus pages that are also
 free pages, which might happen when we fail to restore the vmemmap for
 HVO-optimized pages. So we could be subtracting the same page twice.

 In order to work around this, let us first compute the number of free
 persistent pages, and use that along with surplus pages to compute
 min_count."

And then put the PoC.

> Fixes: 9a30523066cd ("hugetlb: add per node hstate attributes")
> Signed-off-by: Jinjiang Tu

Acked-by: Oscar Salvador

-- 
Oscar Salvador
SUSE Labs
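
For readers following the arithmetic, here is a toy userspace model of the three ways of computing min_count discussed in this thread. It is a sketch under stated assumptions, not the kernel code and not the patch itself: struct pool, the helper names, and the free_surplus field are hypothetical (the real hstate does not expose how many free pages are surplus, so the model simply takes that number as an input), and the first formula is only my reading of "min_count is computed taking into account also surplus huge pages".

```c
#include <assert.h>

/* Toy snapshot of the pool counters named in the thread (not struct hstate). */
struct pool {
	long nr_huge_pages;      /* all pages in the pool, persistent + surplus */
	long free_huge_pages;    /* free pages, of either kind */
	long surplus_huge_pages; /* pages accounted as surplus */
	long resv_huge_pages;    /* reserved pages */
	long free_surplus;       /* HYPOTHETICAL input: free pages that are also surplus */
};

static long max_long(long a, long b)
{
	return a > b ? a : b;
}

/* Persistent pages are whatever is in the pool and not surplus. */
static long persistent(const struct pool *p)
{
	return p->nr_huge_pages - p->surplus_huge_pages;
}

/* Assumed "before" formula: surplus pages are included in the total. */
static long min_count_with_surplus(const struct pool *p, long count)
{
	return max_long(count,
			p->resv_huge_pages + p->nr_huge_pages - p->free_huge_pages);
}

/*
 * Blind subtraction of surplus_huge_pages: a page that is both free and
 * surplus gets removed once as surplus and once as free.
 */
static long min_count_blind(const struct pool *p, long count)
{
	return max_long(count,
			p->resv_huge_pages + persistent(p) - p->free_huge_pages);
}

/* Compute the free persistent pages first, then derive min_count from them. */
static long min_count_free_persistent(const struct pool *p, long count)
{
	assert(p->free_surplus <= p->free_huge_pages);
	long persistent_free = p->free_huge_pages - p->free_surplus;

	return max_long(count,
			p->resv_huge_pages + persistent(p) - persistent_free);
}
```

Taking the global counters as I read them from the PoC after step 4 (10 pages total, 5 free, 5 surplus, nothing reserved, none of the free pages surplus) and a target count of 0, the first formula yields 5, so none of the free pages on Node1 would be released, while the other two yield 0. For a pool of 6 pages where the single free page is also the single surplus page, the blind subtraction yields 4 even though all 5 persistent pages are in use; computing the free persistent pages first yields 5.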