From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2F482C433C1 for ; Wed, 24 Mar 2021 13:40:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9342861A07 for ; Wed, 24 Mar 2021 13:40:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9342861A07 Authentication-Results: mail.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=suse.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C41AA6B02B6; Wed, 24 Mar 2021 09:40:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BF1CB6B02C6; Wed, 24 Mar 2021 09:40:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A92966B02CA; Wed, 24 Mar 2021 09:40:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0211.hostedemail.com [216.40.44.211]) by kanga.kvack.org (Postfix) with ESMTP id 8BEC76B02B6 for ; Wed, 24 Mar 2021 09:40:32 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id D36224DDD for ; Wed, 24 Mar 2021 13:40:31 +0000 (UTC) X-FDA: 77954877462.24.39EFF57 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf18.hostedemail.com (Postfix) with ESMTP id DBDC12000252 for ; Wed, 24 Mar 2021 13:40:30 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1616593230; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=fGJhpVQ2ygvTcqyB7BGZDKbHMPmYinPLpC19q8LrgoE=; b=erfT6L07rc/gL7ONI/CNEU/2gpu18MKemZwVqVZ8aMHTR1SwLPJ8LCZ987rM2gGnArIIoI TfOVNwKPKyCxZKAHB5e2yTx7LQjn1oUcKWKrq1TOY1Zt+X4pRV04dVjnDtml1bKP7x8znQ upQVS+S/zBvDiDGXFhYiV1HeQx74+6k= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 04E3FAD71; Wed, 24 Mar 2021 13:40:30 +0000 (UTC) Date: Wed, 24 Mar 2021 14:40:29 +0100 From: Michal Hocko To: David Hildenbrand Cc: Oscar Salvador , Andrew Morton , Anshuman Khandual , Vlastimil Babka , Pavel Tatashin , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v5 1/5] mm,memory_hotplug: Allocate memmap from the added memory range Message-ID: References: <20210319092635.6214-1-osalvador@suse.de> <20210319092635.6214-2-osalvador@suse.de> <20210324101259.GB16560@linux> <32bc6e31-0200-1e8c-895c-3f60ed072fc2@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <32bc6e31-0200-1e8c-895c-3f60ed072fc2@redhat.com> X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: DBDC12000252 X-Stat-Signature: cfq6mjzc4xm8xng9of5ghbiyuwycyib3 Received-SPF: none (suse.com>: No applicable sender policy available) receiver=imf18; identity=mailfrom; envelope-from=""; helo=mx2.suse.de; client-ip=195.135.220.15 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1616593230-427447 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed 24-03-21 14:13:31, David Hildenbrand wrote: > On 24.03.21 13:37, Michal Hocko wrote: > > On Wed 24-03-21 13:23:47, David Hildenbrand wrote: > > > On 24.03.21 13:10, Michal Hocko wrote: > > > > On Wed 24-03-21 13:03:29, Michal Hocko wrote: > > > > > On Wed 24-03-21 11:12:59, Oscar Salvador wrote: > > > > [...] > > > > > > > > an additional remark > > > > > > > > > > - online_pages()->move_pfn_range_to_zone(): Accounts for node/zone's spanned pages > > > > > > - online_pages()->zone->present_pages += nr_pages; > > > > > > > > I am pretty sure you shouldn't account vmmemmap pages to the target zone > > > > in some cases - e.g. vmemmap cannot be part of the movable zone, can it? > > > > So this would be yet another special casing. This patch has got it wrong > > > > unless I have missed some special casing. > > > > > > > > > > It's a bit unfortunate that we have to discuss the very basic design > > > decisions again. > > > > It would be great to have those basic design decisions layed out in the > > changelog. > > > > > @Oscar, maybe you can share the links where we discussed all this and add > > > some of it to the patch description. > > > > > > I think what we have right here is good enough for an initial version, from > > > where on we can improve things without having to modify calling code. > > > > I have to say I really dislike vmemmap proliferation into > > {on,off}lining. It just doesn't belong there from a layering POV. All > > this code should care about is to hand over pages to the allocator and > > make them visible. > > Well, someone has to initialize the vmemmap of the vmemmap pages ( which is > itself :) ), Yeah, and I would expect this to be done when the vmemmap space is reserved. This is at the hotadd time and we do not know the zone but that shouldn't really matter because their zone can be quite arbitrary kernel zone. As mentioned previously I do not think associating those with zone movable is a good idea as they are fundamentally not movable. It is likely that the zone doesn't really matter for these pages anyway and the only think we do care about is that they are not poisoned and there is at least something but again it would be much better to have a single place where all those details are done (including accounting) rather than trying to wrap head around different pfns when onlining pages and grow potential and suble bugs there. > and as the vemmap does not span complete sections things can > get very weird as we can only set whole sections online (there was more to > that, I think it's buried in previous discussions). Yes the section can only online as whole. This is an important "detail" and it would deserve some more clarification in the changelog as well. You have invested quite some energy into code consolidation and checks to make sure that hotplugged code doesn't have holes and this work bends those rules. vmemmap is effectivelly a hole in a memblock/section. I think we should re-evaluate some of those constrains rather than try to work them around somehow. -- Michal Hocko SUSE Labs