From: David Hildenbrand
Organization: Red Hat GmbH
To: Mike Kravetz, Muchun Song, Oscar Salvador
Cc: Jonathan Corbet, Thomas Gleixner, mingo@redhat.com, bp@alien8.de,
 x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com,
 luto@kernel.org, Peter Zijlstra, viro@zeniv.linux.org.uk,
 Andrew Morton, paulmck@kernel.org, mchehab+huawei@kernel.org,
 pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com,
 anshuman.khandual@arm.com, jroedel@suse.de, Mina Almasry,
 David Rientjes, Matthew Wilcox, Michal Hocko,
 "Song Bao Hua (Barry Song)", HORIGUCHI NAOYA(堀口 直也),
 Xiongchun duan, linux-doc@vger.kernel.org, LKML,
 Linux Memory Management List, linux-fsdevel
Subject: Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the
 vmemmap pages associated with each HugeTLB page
Date: Mon, 1 Feb 2021 17:10:27 +0100
Message-ID: <41160c2e-817d-3ef2-0475-4db58827c1c3@redhat.com>
References: <20210117151053.24600-1-songmuchun@bytedance.com>
 <20210117151053.24600-6-songmuchun@bytedance.com>
 <20210126092942.GA10602@linux>
 <6fe52a7e-ebd8-f5ce-1fcd-5ed6896d3797@redhat.com>
 <20210126145819.GB16870@linux>
 <259b9669-0515-01a2-d714-617011f87194@redhat.com>
 <20210126153448.GA17455@linux>
 <9475b139-1b33-76c7-ef5c-d43d2ea1dba5@redhat.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.5.0

>> What's your opinion about this? Should we take this approach?
>
> I think trying to solve all the issues that could happen as the result of
> not being able to dissolve a hugetlb page has made this extremely complex.
> I know this is something we need to address/solve. We do not want to add
> more unexpected behavior in corner cases. However, I can not help but think
> about similar issues today. For example, if a huge page is in use in
> ZONE_MOVABLE or CMA there is no guarantee that it can be migrated today.

Yes, hugetlbfs is broken with alloc_contig_range() as e.g., used by CMA
and needs fixing. Then, similar problems as with hugetlbfs pages on
ZONE_MOVABLE apply.

hugetlbfs pages on ZONE_MOVABLE for memory unplug are, I think,
problematic only in corner cases:

1. Insufficient memory to allocate a destination page. Well, nothing
we can really do about that - just like trying to migrate any other
memory but running into -ENOMEM.

2. Trying to dissolve a free huge page but running into reservation
limits. I think we should at least try allocating a new free huge page
before failing. To be tackled in the future.

> Correct? We may need to allocate another huge page for the target of the
> migration, and there is no guarantee we can do that.
>

I agree that 1. is similar to "cannot migrate because OOM".

So thinking about it again, we don't actually seem to lose that much when

a) Rejecting migration of a huge page when unable to allocate the
vmemmap for our source page. Our system seems to be under quite some
memory pressure already. Migration could already fail because we
cannot allocate a migration target.

b) Refusing to dissolve a huge page when unable to allocate the
vmemmap. Dissolving can fail already. And, again, our system seems to be
under quite some memory pressure already.

c) Rejecting freeing huge pages when unable to allocate the vmemmap. I
guess the "only" surprise is that the user might now no longer get what
they asked for. This seems to be the "real change".

So maybe little actually speaks against allowing migration of such
huge pages and optimizing any huge page, apart from the rejected freeing
of huge pages surprising the user/admin.

I guess while our system is under memory pressure, CMA and ZONE_MOVABLE
are already no longer able to always keep their guarantees - until there
is no more memory pressure.

-- 
Thanks,

David / dhildenb
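
To make case c) concrete, here is a minimal userspace toy model, not kernel code. It assumes that a huge page's vmemmap must be re-allocated before the page can leave the pool, and that a failed allocation simply leaves the page in the pool. All names (ToyAllocator, ToyHugePage, restore_vmemmap, shrink_pool) are hypothetical, and the page counts (6 vmemmap pages, 512 base pages) are placeholder values chosen for illustration only.

```python
import errno


class ToyAllocator:
    """Hypothetical base-page allocator tracking only a free-page count."""

    def __init__(self, free_pages):
        self.free_pages = free_pages

    def alloc(self, n):
        """Take n base pages; return False if not enough are free."""
        if n > self.free_pages:
            return False
        self.free_pages -= n
        return True


class ToyHugePage:
    """Hypothetical huge page whose vmemmap pages were freed earlier."""

    def __init__(self, vmemmap_pages=6, base_pages=512):
        self.vmemmap_pages = vmemmap_pages  # placeholder value
        self.base_pages = base_pages        # placeholder value
        self.vmemmap_restored = False


def restore_vmemmap(page, allocator):
    """Re-allocate the vmemmap; return 0 on success or -ENOMEM."""
    if page.vmemmap_restored:
        return 0
    if not allocator.alloc(page.vmemmap_pages):
        return -errno.ENOMEM
    page.vmemmap_restored = True
    return 0


def shrink_pool(pool, target, allocator):
    """Free huge pages until the pool has `target` pages.

    Stops at the first vmemmap allocation failure, so the resulting
    pool size may be larger than what was asked for.
    """
    while len(pool) > target:
        page = pool[-1]
        if restore_vmemmap(page, allocator) != 0:
            break  # page stays in the pool; the request is only partly met
        pool.pop()
        # The huge page's base pages go back to the allocator.
        allocator.free_pages += page.base_pages
    return len(pool)
```

In this model, shrinking a four-page pool to two pages with an allocator that has no free pages leaves all four pages in the pool, which is the "user no longer gets what they asked for" surprise. With just enough free pages for one vmemmap, the first page is freed and its base pages make the remaining frees succeed.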