From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 2 Apr 2025 11:30:10 -0700
From: Shakeel Butt
To: Matthew Wilcox
Cc: Michal Hocko, Dave Chinner, Yafang Shao, Harry Yoo, Kees Cook,
	joel.granados@kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Josef Bacik, linux-mm@kvack.org,
	Vlastimil Babka
Subject: Re: [PATCH] proc: Avoid costly high-order page allocations when reading proc files
References: <20250401073046.51121-1-laoar.shao@gmail.com> <3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Wed, Apr 02, 2025 at 06:24:10PM +0100, Matthew Wilcox wrote:
> On Wed, Apr 02, 2025 at 02:24:45PM +0200, Michal Hocko wrote:
> > On Wed 02-04-25 22:32:14, Dave Chinner wrote:
> > > > > > >+	/*
> > > > > > >+	 * Use vmalloc if the count is too large to avoid costly high-order page
> > > > > > >+	 * allocations.
> > > > > > >+	 */
> > > > > > >+	if (count < (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> > > > > > >+		kbuf = kvzalloc(count + 1, GFP_KERNEL);
> > > > > >
> > > > > > Why not move this check into kvmalloc family?
> > > > >
> > > > > Hmm should this check really be in kvmalloc family?
> > > >
> > > > Modifying the existing kvmalloc functions risks performance regressions.
> > > > Could we instead introduce a new variant like vkmalloc() (favoring
> > > > vmalloc over kmalloc) or kvmalloc_costless()?
> > >
> > > We should fix kvmalloc() instead of continuing to force
> > > subsystems to work around the limitations of kvmalloc().
> >
> > Agreed!
> >
> > > Have a look at xlog_kvmalloc() in XFS. It implements a basic
> > > fast-fail, no retry high order kmalloc before it falls back to
> > > vmalloc by turning off direct reclaim for the kmalloc() call.
> > > Hence if the there isn't a high-order page on the free lists ready
> > > to allocate, it falls back to vmalloc() immediately.
>
> ... but if vmalloc fails, it goes around again!  This is exactly why
> we don't want filesystems implementing workarounds for MM problems.
> What a mess.
>
> > 	if (size > PAGE_SIZE) {
> > 		flags |= __GFP_NOWARN;
> >
> > 		if (!(flags & __GFP_RETRY_MAYFAIL))
> > 			flags |= __GFP_NORETRY;
> > +		else
> > +			flags &= ~__GFP_DIRECT_RECLAIM;
>
> I think it might be better to do this:
>
> 	flags |= __GFP_NOWARN;
>
> 	if (!(flags & __GFP_RETRY_MAYFAIL))
> 		flags |= __GFP_NORETRY;
> +	else if (size > (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> +		flags &= ~__GFP_DIRECT_RECLAIM;

The above seems more appropriate than Michal's bigger hammer. In
addition, I think Vlastimil has a very good point about the kswapd
reclaim for such cases (the patch explicitly complains about kcompactd
CPU usage).

>
> I think it's entirely appropriate for a call to kvmalloc() to do
> direct reclaim if it's asking for, say, 16KiB and we don't have any of
> those available.  Better than exacerbating the fragmentation problem by
> allocating 4x4KiB pages, each from different groupings.
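
To make the difference between the two quoted hunks concrete, here is a
rough userspace sketch of the flag adjustment Matthew suggested. It is
not kernel code: the SK_* bit values are made-up stand-ins for the real
gfp bits, the 4 KiB page size is an assumption, and it assumes the
outer "size > PAGE_SIZE" check from Michal's hunk stays in place. Only
the value 3 for the costly order matches mainline's
PAGE_ALLOC_COSTLY_ORDER.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfp bits; NOT the kernel's real values. */
#define SK_DIRECT_RECLAIM	0x1u
#define SK_NOWARN		0x2u
#define SK_NORETRY		0x4u
#define SK_RETRY_MAYFAIL	0x8u

#define SK_PAGE_SIZE		4096u	/* assumption: 4 KiB pages */
#define SK_COSTLY_ORDER		3u	/* PAGE_ALLOC_COSTLY_ORDER in mainline */

/*
 * Sketch of how the flags for the kmalloc() attempt would be adjusted
 * under the quoted suggestion: direct reclaim is dropped only for
 * requests above the costly-order size, and only on the
 * RETRY_MAYFAIL branch, exactly as the quoted "else if" is written.
 */
unsigned int sk_kmalloc_flags(unsigned int flags, size_t size)
{
	if (size > SK_PAGE_SIZE) {
		flags |= SK_NOWARN;

		if (!(flags & SK_RETRY_MAYFAIL))
			flags |= SK_NORETRY;
		else if (size > ((size_t)SK_PAGE_SIZE << SK_COSTLY_ORDER))
			flags &= ~SK_DIRECT_RECLAIM;
	}
	return flags;
}

/* Usage example: the 16 KiB case from the quoted text keeps reclaim. */
void sk_self_check(void)
{
	unsigned int f = SK_DIRECT_RECLAIM | SK_RETRY_MAYFAIL;

	/* 16 KiB is below the 32 KiB costly threshold: reclaim stays on. */
	assert(sk_kmalloc_flags(f, 16384) & SK_DIRECT_RECLAIM);
	/* 64 KiB is above it: reclaim is cleared before trying kmalloc. */
	assert(!(sk_kmalloc_flags(f, 65536) & SK_DIRECT_RECLAIM));
}
```

With 4 KiB pages the threshold works out to 4096 << 3 = 32 KiB, so a
16 KiB request would still be allowed to enter direct reclaim, while
Michal's unconditional "else" would clear it for every multi-page
RETRY_MAYFAIL request.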