Date: Mon, 27 Mar 2023 17:10:10 +0200
From: Michal Hocko
To: Vlastimil Babka
Cc: Mike Rapoport, linux-mm@kvack.org, Andrew Morton, Dave Hansen, Peter Zijlstra, Rick Edgecombe, Song Liu, Thomas Gleixner, linux-kernel@vger.kernel.org, x86@kernel.org, Mel Gorman
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
References: <20230308094106.227365-1-rppt@kernel.org> <20230308094106.227365-2-rppt@kernel.org>

On Mon 27-03-23 16:31:45, Vlastimil Babka wrote:
> On 3/27/23 15:43, Michal Hocko wrote:
> > On Sat 25-03-23 09:38:12, Mike Rapoport wrote:
> >> On Fri, Mar 24, 2023 at 09:37:31AM +0100, Michal Hocko wrote:
> >> > On Wed 08-03-23 11:41:02, Mike Rapoport wrote:
> >> > > From: "Mike Rapoport (IBM)"
> >> > >
> >> > > When set_memory or set_direct_map APIs used to change attribute or
> >> > > permissions for chunks of several pages, the large PMD that maps these
> >> > > pages in the direct map must be split. Fragmenting the direct map in such
> >> > > manner causes TLB pressure and, eventually, performance degradation.
> >> > >
> >> > > To avoid excessive direct map fragmentation, add ability to allocate
> >> > > "unmapped" pages with __GFP_UNMAPPED flag that will cause removal of the
> >> > > allocated pages from the direct map and use a cache of the unmapped pages.
> >> > >
> >> > > This cache is replenished with higher order pages with preference for
> >> > > PMD_SIZE pages when possible so that there will be fewer splits of large
> >> > > pages in the direct map.
> >> > >
> >> > > The cache is implemented as a buddy allocator, so it can serve high order
> >> > > allocations of unmapped pages.
> >> >
> >> > Why do we need a dedicated gfp flag for all this when a dedicated
> >> > allocator is used anyway. What prevents users to call unmapped_pages_{alloc,free}?
> >>
> >> Using unmapped_pages_{alloc,free} adds complexity to the users which IMO
> >> outweighs the cost of a dedicated gfp flag.
> >
> > Aren't those users rare and very special anyway?
>
> I think it's mostly about the freeing that can happen from a generic context
> not aware of the special allocation, so it's not about how rare it is, but
> how complex would be to determine exhaustively those contexts and do
> something in them.

Yes, I can see a challenge with put_page users but that is not really
related to the gfp flag as those are only relevant for the allocation
context.

> >> For modules we'd have to make x86::module_{alloc,free}() take care of
> >> mapping and unmapping the allocated pages in the modules virtual address
> >> range. This also might become relevant for another architectures in future
> >> and than we'll have several complex module_alloc()s.
> >
> > The module_alloc use is lacking any justification. More context would be
> > more than useful. Also vmalloc support for the proposed __GFP_UNMAPPED
> > likely needs more explanation as well.
> >
> >> And for secretmem while using unmapped_pages_alloc() is easy, the free path
> >> becomes really complex because actual page freeing for fd-based memory is
> >> deeply buried in the page cache code.
> >
> > Why is that a problem? You already hook into the page freeing path and
> > special case unmapped memory.
>
> But the proposal of unmapped_pages_free() would suggest this would no longer
> be the case?

I can see a check in the freeing path.

> But maybe we could, as a compromise, provide unmapped_pages_alloc() to get
> rid of the new __GFP flag, provide unmapped_pages_free() to annotate places
> that are known to free unmapped memory explicitly, but the generic page
> freeing would also keep the hook?

Honestly I do not see a different option if those pages are to be
reference counted. Unless they can use a destructor concept like hugetlb
pages. At least secret mem usecase cannot AFAICS.

-- 
Michal Hocko
SUSE Labs
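
The compromise discussed in this thread (an explicit unmapped_pages_alloc()
instead of a gfp flag, an explicit unmapped_pages_free() for callers that
know what they are freeing, and a hook kept in the generic freeing path for
reference-counted pages) can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not kernel code and not the
patch's implementation: struct toy_page, the "unmapped" marker, the toy_*
functions and the two counters are all hypothetical names invented for the
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for struct page; NOT the kernel structure. */
struct toy_page {
	int refcount;
	bool unmapped;	/* models a per-page "removed from direct map" marker */
};

static int unmapped_cache_returns;	/* pages returned to the unmapped cache */
static int regular_returns;		/* pages returned to the regular allocator */

/* Hypothetical explicit allocation entry point (no gfp flag involved). */
static struct toy_page *toy_unmapped_pages_alloc(void)
{
	struct toy_page *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->refcount = 1;
	p->unmapped = true;
	return p;
}

/* Ordinary allocation, for comparison. */
static struct toy_page *toy_alloc_page(void)
{
	struct toy_page *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->refcount = 1;
	return p;
}

/* Hypothetical explicit free for callers that know the page is unmapped. */
static void toy_unmapped_pages_free(struct toy_page *p)
{
	unmapped_cache_returns++;
	free(p);
}

/* Generic final free: keeps the hook that special-cases unmapped pages. */
static void toy_free_page(struct toy_page *p)
{
	if (p->unmapped) {
		toy_unmapped_pages_free(p);
		return;
	}
	regular_returns++;
	free(p);
}

/* Generic reference drop; the caller does not know how the page was allocated. */
static void toy_put_page(struct toy_page *p)
{
	if (--p->refcount == 0)
		toy_free_page(p);
}

/* Walks one unmapped and one ordinary page through the generic put path. */
static void toy_demo(void)
{
	struct toy_page *a = toy_unmapped_pages_alloc();
	struct toy_page *b = toy_alloc_page();

	assert(a && b);
	a->refcount++;		/* extra reference, e.g. held by a cache */
	toy_put_page(a);	/* not the last reference, nothing is freed */
	toy_put_page(a);	/* last reference: hook routes to the unmapped cache */
	toy_put_page(b);	/* ordinary page takes the regular path */
}
```

Calling toy_demo() from a main() leaves both counters at 1: the unmapped
page reaches the unmapped cache even though its last reference was dropped
from a context that knew nothing about how it was allocated, which is the
reason the generic hook has to stay if the pages are reference counted.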