From: Joshua Hahn
To: Wupeng Ma
Cc: akpm@linux-foundation.org, mike.kravetz@oracle.com, david@redhat.com, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [RFC PATCH] mm: hugetlb: Fix incorrect fallback for subpool
Date: Mon, 31 Mar 2025 14:23:41 -0700
Message-ID: <20250331212343.66780-1-joshua.hahnjy@gmail.com>
In-Reply-To: <20250325061634.2118202-1-mawupeng1@huawei.com>

On Tue, 25 Mar 2025 14:16:34 +0800 Wupeng Ma wrote:

> During our testing with hugetlb subpool enabled, we observe that
> hstate->resv_huge_pages may underflow into negative values. Root cause
> analysis reveals a race condition in subpool reservation fallback handling
> as follows:
>
> hugetlb_reserve_pages()
>     /* Attempt subpool reservation */
>     gbl_reserve = hugepage_subpool_get_pages(spool, chg);
>
>     /* Global reservation may fail after subpool allocation */
>     if (hugetlb_acct_memory(h, gbl_reserve) < 0)
>         goto out_put_pages;
>
> out_put_pages:
>     /* This incorrectly restores reservation to subpool */
>     hugepage_subpool_put_pages(spool, chg);
>
> When hugetlb_acct_memory() fails after subpool allocation, the current
> implementation over-commits subpool reservations by returning the full
> 'chg' value instead of the actual allocated 'gbl_reserve' amount. This
> discrepancy propagates to global reservations during subsequent releases,
> eventually causing resv_huge_pages underflow.
>
> This problem can be triggered easily with the following steps:
> 1. reserve hugepages for hugetlb allocation
> 2. mount hugetlbfs with min_size to enable hugetlb subpool
> 3. alloc hugepages with two tasks (make sure the second will fail due to
>    insufficient amount of hugepages)
> 4. wait for a few seconds and repeat step 3, which will make
>    hstate->resv_huge_pages go below zero.
>
> To fix this problem, return the correct amount of pages to the subpool
> during the fallback after hugepage_subpool_get_pages() is called.
>
> Fixes: 1c5ecae3a93f ("hugetlbfs: add minimum size accounting to subpools")
> Signed-off-by: Wupeng Ma

Hi Wupeng,

Thank you for the fix! This is a problem that we've also seen happen in
our fleet at Meta. I was able to recreate the issue that you mentioned --
to explicitly lay down the steps I used:

1. echo 1 > /proc/sys/vm/nr_hugepages
2. mkdir /mnt/hugetlb-pool
3. mount -t hugetlbfs -o min_size=2M none /mnt/hugetlb-pool
4. (./get_hugepage &) && (./get_hugepage &)
   # get_hugepage just opens a file in /mnt/hugetlb-pool and mmaps 2M into it.
5. sleep 3
6. (./get_hugepage &) && (./get_hugepage &)
7. cat /proc/meminfo | grep HugePages_Rsvd

... and (7) shows that HugePages_Rsvd has indeed underflowed to U64_MAX!

I've also verified that applying your fix and then re-running the
reproducer shows no underflow.

Reviewed-by: Joshua Hahn
Tested-by: Joshua Hahn

Sent using hkml (https://github.com/sjp38/hackermail)
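
The get_hugepage helper used in the steps above is not included in this message. For anyone who wants to rerun the reproducer, here is a minimal sketch of what such a helper could look like, going only by the description "opens a file in /mnt/hugetlb-pool and mmaps 2M into it". The function name map_hugetlb_file(), the HUGEPAGE_LEN macro, the open flags, and the use of MAP_SHARED are assumptions of this sketch, not taken from the original tool.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* One 2M huge page, matching the min_size=2M mount option (assumed size). */
#define HUGEPAGE_LEN (2UL * 1024 * 1024)

/*
 * Hypothetical helper: open (creating if needed) the file at `path` and
 * create a shared mapping of `len` bytes over it. The sketch never touches
 * the mapped memory; on a hugetlbfs mount, the expectation is that mmap()
 * alone exercises the reservation path being discussed.
 * Returns the mapping address, or NULL on failure.
 */
void *map_hugetlb_file(const char *path, size_t len)
{
	int fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0) {
		perror("open");
		return NULL;
	}

	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd); /* the mapping keeps its own reference to the file */
	if (addr == MAP_FAILED) {
		perror("mmap");
		return NULL;
	}
	return addr;
}
```

A main() for get_hugepage would then call map_hugetlb_file() with a path under /mnt/hugetlb-pool (the file name is up to you) and HUGEPAGE_LEN, and exit non-zero when it returns NULL, which is the expected outcome for the second of the two concurrent tasks.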