Date: Wed, 19 Jul 2023 10:06:01 +0200
From: Michal Hocko
To: Mike Rapoport
Cc: Ross Zwisler, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Andrew Morton, Matthew Wilcox, Mel Gorman, Vlastimil Babka,
	David Hildenbrand
Subject: Re: collision between ZONE_MOVABLE and memblock allocations
References: <20230718220106.GA3117638@google.com> <20230719075952.GH1901145@kernel.org>
In-Reply-To: <20230719075952.GH1901145@kernel.org>

On Wed 19-07-23 10:59:52, Mike Rapoport wrote:
> On Wed, Jul 19, 2023 at 08:14:48AM +0200, Michal Hocko wrote:
> > On Tue 18-07-23 16:01:06, Ross Zwisler wrote:
> > [...]
> > > I do think that we need to fix this collision between ZONE_MOVABLE and memmap
> > > allocations, because this issue essentially makes the movablecore= kernel
> > > command line parameter useless in many cases, as the ZONE_MOVABLE region it
> > > creates will often actually be unmovable.
> >
> > movablecore is kinda hack and I would be more inclined to get rid of it
> > rather than build more into it. Could you be more specific about your
> > use case?
> >
> > > Here are the options I currently see for resolution:
> > >
> > > 1. Change the way ZONE_MOVABLE memory is allocated so that it is allocated from
> > > the beginning of the NUMA node instead of the end. This should fix my use case,
> > > but again is prone to breakage in other configurations (# of NUMA nodes, other
> > > architectures) where ZONE_MOVABLE and memblock allocations might overlap. I
> > > think that this should be relatively straightforward and low risk, though.
> > >
> > > 2. Make the code which processes the movablecore= command line option aware of
> > > the memblock allocations, and have it choose a region for ZONE_MOVABLE which
> > > does not have these allocations. This might be done by checking for
> > > PageReserved() as we do with offlining memory, though that will take some boot
> > > time reordering, or we'll have to figure out the overlap in another way. This
> > > may also result in us having two ZONE_NORMAL zones for a given NUMA node, with
> > > a ZONE_MOVABLE section in between them. I'm not sure if this is allowed?
> >
> > Yes, this is no problem. Zones are allowed to be sparse.
>
> The current initialization order is roughly
>
> * very early initialization with some memblock allocations
> * determine zone locations and sizes
> * initialize memory map
>   - memblock_alloc(lots of memory)
> * lots of unrelated initializations that may allocate memory
> * release free pages from memblock to the buddy allocator
>
> With 2) we can make sure the memory map and early allocations won't be in
> the ZONE_MOVABLE, but we may still have reserved pages there.

Yes, this will always be fragile. If the specific placement of the movable
memory is not important and the only things that matter are the size and
NUMA locality, then an easier-to-maintain solution would be to simply
offline enough memory blocks very early in the userspace bring-up and
online them back as movable. If offlining fails, just try another memory
block. This doesn't require any kernel code change.
-- 
Michal Hocko
SUSE Labs
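
For illustration only, the offline-then-online-as-movable approach described
above might look roughly like the sketch below. It is not code from this
thread. It assumes the memory hotplug sysfs interface is available under
/sys/devices/system/memory (a block_size_bytes file holding a hex value, and
per-block memoryN/state files that accept "offline", "online" and
"online_movable"), and that each block directory carries a nodeN link when
NUMA is enabled. The function names, the mem_root parameter, and the choice
to walk blocks from the highest index down are all hypothetical.

```python
import os
import re


def read_block_size(mem_root):
    # block_size_bytes holds the memory block size as a hexadecimal string.
    with open(os.path.join(mem_root, "block_size_bytes")) as f:
        return int(f.read().strip(), 16)


def list_blocks(mem_root):
    # Memory block directories are named memory<N>. Highest index first is
    # an arbitrary choice made for this sketch.
    ids = []
    for name in os.listdir(mem_root):
        m = re.fullmatch(r"memory(\d+)", name)
        if m:
            ids.append(int(m.group(1)))
    return sorted(ids, reverse=True)


def set_state(mem_root, block, state):
    # Returns False when the write is rejected, e.g. the block holds
    # unmovable pages and cannot be offlined.
    path = os.path.join(mem_root, f"memory{block}", "state")
    try:
        with open(path, "w") as f:
            f.write(state)
        return True
    except OSError:
        return False


def make_movable(mem_root, wanted_bytes, node=None):
    # Offline blocks and online them back as movable until at least
    # wanted_bytes have been converted. Returns the converted block ids.
    block_size = read_block_size(mem_root)
    converted = []
    for block in list_blocks(mem_root):
        if len(converted) * block_size >= wanted_bytes:
            break
        if node is not None:
            # Assumption: memory<N>/node<M> exists for blocks on node M.
            link = os.path.join(mem_root, f"memory{block}", f"node{node}")
            if not os.path.exists(link):
                continue
        if not set_state(mem_root, block, "offline"):
            continue  # offlining failed, just try another block
        if set_state(mem_root, block, "online_movable"):
            converted.append(block)
        else:
            set_state(mem_root, block, "online")  # best-effort restore
    return converted
```

On a real system this would be run as root early in boot, for example
make_movable("/sys/devices/system/memory", 4 << 30, node=0), and the caller
should check whether the returned list covers the requested size.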