From: David Rientjes <rientjes@google.com>
To: Pasha Tatashin
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, mike.kravetz@oracle.com, muchun.song@linux.dev, souravpanda@google.com
Subject: Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees
Date: Wed, 12 Apr 2023 10:54:52 -0700 (PDT)
Message-ID: <63736432-5cef-f67c-c809-cc19b236a7f4@google.com>
In-Reply-To: <20230412152337.1203254-1-pasha.tatashin@soleen.com>
References: <20230412152337.1203254-1-pasha.tatashin@soleen.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
On Wed, 12 Apr 2023, Pasha Tatashin wrote:

> HugeTLB pages have a struct page optimization where the struct pages for
> tail pages are freed. However, when HugeTLB pages are destroyed, the
> memory for the struct pages (vmemmap) needs to be allocated again.
>
> Currently, the __GFP_NORETRY flag is used to allocate the memory for
> vmemmap, but given that this flag makes very little effort to actually
> reclaim memory, returning huge pages back to the system can be a
> problem. Let's use __GFP_RETRY_MAYFAIL instead. This flag also performs
> graceful reclaim without causing OOMs, but it may perform a few
> retries, and will fail only when there is genuinely little unused
> memory in the system.
>

Thanks Pasha, this definitely makes sense.  We want to free the hugetlb
page back to the system, so it would be a shame to have to strand it in
the hugetlb pool because we can't allocate the tail pages (we want to
free more memory than we're allocating).

> Signed-off-by: Pasha Tatashin
> Suggested-by: David Rientjes
> ---
>  mm/hugetlb_vmemmap.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index a559037cce00..c4226d2af7cc 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
> 	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
> 	 * When a HugeTLB page is freed to the buddy allocator, previously
> 	 * discarded vmemmap pages must be allocated and remapping.
> +	 *
> +	 * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> +	 * unused memory in the system.
> 	 */
> 	ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> -				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> +				  GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
> 	if (!ret) {
> 		ClearHPageVmemmapOptimized(head);
> 		static_branch_dec(&hugetlb_optimize_vmemmap_key);

The behavior of __GFP_RETRY_MAYFAIL is different for high-order
allocations (at least larger than PAGE_ALLOC_COSTLY_ORDER).  The order
that we're allocating here depends on the implementation of
alloc_vmemmap_page_list(), so it's likely best to move the gfp mask into
that function.