Date: Tue, 28 Mar 2023 09:25:35 +0300
From: Mike Rapoport
To: Michal Hocko
Cc: linux-mm@kvack.org, Andrew Morton, Dave Hansen, Peter Zijlstra,
 Rick Edgecombe, Song Liu, Thomas Gleixner, Vlastimil Babka,
 linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
References: <20230308094106.227365-1-rppt@kernel.org>
 <20230308094106.227365-2-rppt@kernel.org>

On Mon, Mar 27, 2023 at 03:43:27PM
+0200, Michal Hocko wrote:
> On Sat 25-03-23 09:38:12, Mike Rapoport wrote:
> > On Fri, Mar 24, 2023 at 09:37:31AM +0100, Michal Hocko wrote:
> > > On Wed 08-03-23 11:41:02, Mike Rapoport wrote:
> > > > From: "Mike Rapoport (IBM)"
> > > >
> > > > When set_memory or set_direct_map APIs used to change attribute or
> > > > permissions for chunks of several pages, the large PMD that maps these
> > > > pages in the direct map must be split. Fragmenting the direct map in such
> > > > manner causes TLB pressure and, eventually, performance degradation.
> > > >
> > > > To avoid excessive direct map fragmentation, add ability to allocate
> > > > "unmapped" pages with __GFP_UNMAPPED flag that will cause removal of the
> > > > allocated pages from the direct map and use a cache of the unmapped pages.
> > > >
> > > > This cache is replenished with higher order pages with preference for
> > > > PMD_SIZE pages when possible so that there will be fewer splits of large
> > > > pages in the direct map.
> > > >
> > > > The cache is implemented as a buddy allocator, so it can serve high order
> > > > allocations of unmapped pages.
> > >
> > > Why do we need a dedicated gfp flag for all this when a dedicated
> > > allocator is used anyway. What prevents users to call unmapped_pages_{alloc,free}?
> >
> > Using unmapped_pages_{alloc,free} adds complexity to the users which IMO
> > outweighs the cost of a dedicated gfp flag.
>
> Aren't those users rare and very special anyway?
>
> > For modules we'd have to make x86::module_{alloc,free}() take care of
> > mapping and unmapping the allocated pages in the modules virtual address
> > range. This also might become relevant for another architectures in future
> > and than we'll have several complex module_alloc()s.
>
> The module_alloc use is lacking any justification. More context would be
> more than useful. Also vmalloc support for the proposed __GFP_UNMAPPED
> likely needs more explanation as well.

Right now module_alloc() boils down to vmalloc() with the virtual range
limited to the modules area. The allocated chunk contains both code and
data. When CONFIG_STRICT_MODULE_RWX is set, parts of the memory allocated
with module_alloc() are remapped with different permissions both in the
vmalloc address space and in the direct map. The change of permissions for
small ranges causes splits of large pages in the direct map.

If we were to use unmapped_pages_alloc() in module_alloc(), we would have
to implement in module_alloc() the part of vmalloc() that reserves the
virtual addresses and maps the allocated memory there.

> > And for secretmem while using unmapped_pages_alloc() is easy, the free path
> > becomes really complex because actual page freeing for fd-based memory is
> > deeply buried in the page cache code.
>
> Why is that a problem? You already hook into the page freeing path and
> special case unmapped memory.

I didn't say there is a problem with unmapped_pages_alloc() in secretmem, I
said there is a problem with unmapped_pages_free(), hence the special case
for unmapped memory in the freeing path.

> > My gut feeling is that for PKS using a gfp flag would save a lot of hassle
> > as well.
>
> Well, my take on this is that this is not a generic page allocator
> functionality. It is clearly an allocator on top of the page allocator.
> In general gfp flags are scarce and convenience argument usually fires
> back later on in hard to predict ways. So I've learned to be careful
> here. I am not saying this is a no-go but right now I do not see any
> acutal advantage. The vmalloc usecase could be interesting in that
> regards but it is not really clear to me whether this is a good idea in
> the first place.

I don't see the usage of a gfp flag as a convenience argument; rather, it
feels to me that a gfp flag will cause less maintenance burden. Of course
this is subjective.
And although this is an allocator on top of the page allocator, it is still
very tightly coupled with the core page allocator. I still think that using
a migrate type for this would have been more elegant, but I realize that a
migrate type would have more impact on the allocation path.

> --
> Michal Hocko
> SUSE Labs

--
Sincerely yours,
Mike.
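
To make the direct map split cost mentioned above a bit more concrete, here
is a rough userspace sketch of the arithmetic. It is a simplified model and
not kernel code: the sketch_* names are hypothetical, the sizes assume
x86-64 with 4 KiB base pages and 2 MiB PMD mappings, and it assumes a large
mapping only has to be split when a permission change covers part of it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed x86-64 sizes: 4 KiB base pages, 2 MiB PMD-level large pages. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PTRS_PER_PMD 512UL
#define SKETCH_PMD_SIZE (SKETCH_PTRS_PER_PMD * SKETCH_PAGE_SIZE)

/*
 * Hypothetical helper, not a kernel API: count the 2 MiB direct map
 * mappings that must be split when permissions change on
 * [start, start + len). Only the first and the last large mapping
 * touched by the range can be partially covered; every mapping in
 * between is fully covered and can stay a single large mapping.
 * Overflow of start + len is not handled in this sketch.
 */
static size_t sketch_pmd_splits(unsigned long start, unsigned long len)
{
	unsigned long end = start + len;
	unsigned long first_pmd, last_pmd;

	assert(len > 0);
	assert(start % SKETCH_PAGE_SIZE == 0 && len % SKETCH_PAGE_SIZE == 0);

	first_pmd = start / SKETCH_PMD_SIZE;
	last_pmd = (end - 1) / SKETCH_PMD_SIZE;

	/* Range sits inside one large mapping: split unless it covers it all. */
	if (first_pmd == last_pmd)
		return (start % SKETCH_PMD_SIZE != 0 ||
			end % SKETCH_PMD_SIZE != 0) ? 1 : 0;

	/* Otherwise an unaligned head and an unaligned tail cost one split each. */
	return (start % SKETCH_PMD_SIZE != 0) + (end % SKETCH_PMD_SIZE != 0);
}

/* Each split replaces one large mapping with 512 base page mappings. */
static size_t sketch_ptes_after_split(size_t splits)
{
	return splits * SKETCH_PTRS_PER_PMD;
}
```

In this model, changing permissions on a single 4 KiB page is enough to
turn one 2 MiB mapping into 512 small ones, while a change that covers an
aligned 2 MiB chunk costs nothing. That is the reason for preferring
PMD_SIZE chunks when the cache of unmapped pages is replenished.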