From: Hillf Danton
To: Mike Kravetz
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Vlastimil Babka
Subject: Re: [PATCH RESEND 0/8] hugetlb: add demote/split page functionality
Date: Thu, 26 Aug 2021 15:32:19 +0800
Message-Id: <20210826073219.2234-1-hdanton@sina.com>
In-Reply-To: <10d86c18-f0cf-395f-4209-17ac71b9fc03@oracle.com>
References: <20210816224953.157796-1-mike.kravetz@oracle.com> <20210816162749.22b921a61156a091f3e1d14d@linux-foundation.org> <20210816184611.07b97f4c26b83090f5d48fab@linux-foundation.org>

On Tue, 24 Aug 2021 15:08:46 -0700 Mike Kravetz wrote:
>On 8/16/21 6:46 PM, Andrew Morton wrote:
>> On Mon, 16 Aug 2021 17:46:58 -0700 Mike Kravetz wrote:
>>
>>>> It really is a ton of new code. I think we're owed much more detail
>>>> about the problem than the above. To be confident that all this
>>>> material is truly justified?
>>>
>>> The desired functionality for this specific use case is to simply
>>> convert a 1G hugetlb page to 512 2MB hugetlb pages. As mentioned
>>>
>>> "Converting larger to smaller hugetlb pages can be accomplished today by
>>> first freeing the larger page to the buddy allocator and then allocating
>>> the smaller pages.
>>> However, there are two issues with this approach:
>>> 1) This process can take quite some time, especially if allocation of
>>>    the smaller pages is not immediate and requires migration/compaction.
>>> 2) There is no guarantee that the total size of smaller pages allocated
>>>    will match the size of the larger page which was freed. This is
>>>    because the area freed by the larger page could quickly be
>>>    fragmented."
>>>
>>> These two issues have been experienced in practice.
>>
>> Well the first issue is quantifiable. What is "some time"? If it's
>> people trying to get a 5% speedup on a rare operation because hey,
>> bugging the kernel developers doesn't cost me anything then perhaps we
>> have better things to be doing.
>
>Well, I set up a test environment on a larger system to get some
>numbers. My 'load' on the system was filling the page cache with
>clean pages. The thought is that these pages could easily be reclaimed.
>
>When trying to get numbers I hit a hugetlb page allocation stall where
>__alloc_pages(__GFP_RETRY_MAYFAIL, order 9) would stall forever (or at
>least an hour). It was very much like the symptoms addressed here:
>https://lore.kernel.org/linux-mm/20190806014744.15446-1-mike.kravetz@oracle.com/
>
>This was on 5.14.0-rc6-next-20210820.
>
>I'll do some more digging as this appears to be some dark corner case of
>reclaim and/or compaction. The 'good news' is that I can reproduce
>this.

I am on vacation until 1 Sep, but will keep an ear out for anything that
sheds light on these corner cases.

Hillf
>
>> And the second problem would benefit from some words to help us
>> understand how much real-world hurt this causes, and how frequently.
>> And let's understand what the userspace workarounds look like, etc.
>
>The stall above was from doing a simple 'free 1GB page' followed by
>'allocate 512 MB pages' from userspace.
>
>Getting out another version of this series will be delayed, as I think
>we need to address or understand this issue first.
>-- 
>Mike Kravetz