Date: Fri, 14 Apr 2023 17:47:28 -0700 (PDT)
From: David Rientjes
To: Michal Hocko
Cc: Pasha Tatashin, Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mike.kravetz@oracle.com, muchun.song@linux.dev, souravpanda@google.com
Subject: Re: [PATCH v2] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees
References: <20230412195939.1242462-1-pasha.tatashin@soleen.com> <20230412131302.cf42a7f4b710db8c18b7b676@linux-foundation.org>

On Thu, 13 Apr 2023, Michal Hocko wrote:

> [...]
> > > > This is a theoretical concern. Freeing a 1G page requires 16M of free
> > > > memory. A machine might need to be reconfigured from one task to
> > > > another, and release a large number of 1G pages back to the system if
> > > > allocating 16M fails, the release won't work.
> > >
> > > This is really an important "detail" changelog should mention. While I
> > > am not really against that change I would much rather see that as a
> > > result of a real world fix rather than a theoretical concern. Mostly
> > > because a real life scenario would allow us to test the
> > > __GFP_RETRY_MAYFAIL effectiveness. As that request might fail as well we
> > > just end up with a theoretical fix for a theoretical problem. Something
> > > that is easy to introduce but much harder to get rid of should we ever
> > > need to change __GFP_RETRY_MAYFAIL implementation for example.
> >
> > I will add this to changelog in v3. If __GFP_RETRY_MAYFAIL is
> > ineffective we will receive feedback once someone hits this problem.
>
> I do not remember anybody hitting this with the current __GFP_NORETRY.
> So arguably there is nothing to be fixed ATM.
>

I think we should still at least clear __GFP_NORETRY in this allocation:
to be able to free 1GB hugepages back to the system, we'd like the page
allocator to at least exercise its normal order-0 allocation logic rather
than exempting it from retrying reclaim by opting into __GFP_NORETRY.

I'd agree with the analysis in
https://lore.kernel.org/linux-mm/YCafit5ruRJ+SL8I@dhcp22.suse.cz/
that either a cleared __GFP_NORETRY or a __GFP_RETRY_MAYFAIL makes logical
sense.

We really *do* want to free these hugepages back to the system, and the
amount of memory freed will always be more than what is allocated for the
struct pages. The net result is more free memory.

If the allocation fails, we can't free 1GB back to the system on a
saturated node if our first reclaim attempt didn't allow these struct
pages to be allocated. Stranding 1GB in the hugetlb pool that no
userspace on the system can make use of at the time isn't very useful.
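
For reference, the 16M figure quoted above and the "net result is more
free memory" point can be reproduced with a rough userspace sketch. The
constants are assumptions that the thread does not state: 4 KiB base
pages and a 64-byte struct page, as is typical on x86-64. The helper
names are made up for illustration and are not kernel functions. The
result is an upper bound, since the vmemmap optimization keeps a small
number of vmemmap pages mapped and so slightly fewer pages have to be
allocated in practice.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the thread: 4 KiB base pages and a
 * 64-byte struct page (typical for x86-64). */
#define BASE_PAGE_SIZE   4096ULL
#define STRUCT_PAGE_SIZE 64ULL

/* Bytes of struct page metadata (vmemmap) that describe one huge page. */
static uint64_t vmemmap_bytes(uint64_t hugepage_bytes)
{
	/* A huge page is a whole number of base pages. */
	assert(hugepage_bytes % BASE_PAGE_SIZE == 0);
	return (hugepage_bytes / BASE_PAGE_SIZE) * STRUCT_PAGE_SIZE;
}

/* Upper bound on the order-0 pages needed to back that vmemmap. */
static uint64_t vmemmap_pages(uint64_t hugepage_bytes)
{
	return vmemmap_bytes(hugepage_bytes) / BASE_PAGE_SIZE;
}

/* Net memory returned to the system after paying for the vmemmap. */
static uint64_t net_freed_bytes(uint64_t hugepage_bytes)
{
	return hugepage_bytes - vmemmap_bytes(hugepage_bytes);
}
```

Under those assumptions, a 1 GiB page is described by 262144 struct
pages, i.e. 16 MiB of vmemmap spread over at most 4096 order-0
allocations, and freeing it nets roughly 1008 MiB back to the node.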