From: Shakeel Butt
To: Dave Chinner
Cc: Michal Hocko, Yafang Shao, Harry Yoo, Kees Cook, joel.granados@kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Josef Bacik, linux-mm@kvack.org, Vlastimil Babka
Subject: Re: [PATCH] proc: Avoid costly high-order page allocations when reading proc files
Date: Wed, 2 Apr 2025 22:05:57 -0700
Message-ID: <7gmvaxj5hpd7aal4xgcis7j7jicwxtlaqjatshrwrorit3jwn6@67j2mc6itkm6>
References: <20250401073046.51121-1-laoar.shao@gmail.com> <3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>

On Thu, Apr 03, 2025 at 12:22:31PM +1100, Dave Chinner wrote:
> On Wed, Apr 02, 2025 at 04:10:06PM -0700, Shakeel Butt wrote:
> > On Thu, Apr 03, 2025 at 08:16:56AM +1100, Dave Chinner wrote:
> > > On Wed, Apr 02, 2025 at 02:24:45PM +0200, Michal Hocko wrote:
> > > > On Wed 02-04-25 22:32:14, Dave Chinner wrote:
> > > > > Have a look at xlog_kvmalloc() in XFS. It implements a basic
> > > > > fast-fail, no retry high order kmalloc before it falls back to
> > > > > vmalloc by turning off direct reclaim for the kmalloc() call.
> > > > > Hence if there isn't a high-order page on the free lists ready
> > > > > to allocate, it falls back to vmalloc() immediately.
> > > > >
> > > > > For XFS, using xlog_kvmalloc() reduced the high-order per-allocation
> > > > > overhead by around 80% when compared to a standard kvmalloc()
> > > > > call. Numbers and profiles were documented in the commit message
> > > > > (reproduced in whole below)...
> > > >
> > > > Btw. it would be really great to have such concerns to be posted to the
> > > > linux-mm ML so that we are aware of that.
> > >
> > > I have brought it up in the past, along with all the other kvmalloc
> > > API problems that are mentioned in that commit message.
> > > Unfortunately, discussion focus always ended up on calling context
> > > and API flags (e.g. whether stuff like GFP_NOFS should be supported
> > > or not) not the fast-fail-then-no-fail behaviour we need.
> > >
> > > Yes, these discussions have resulted in API changes that support
> > > some new subset of gfp flags, but the performance issues have never
> > > been addressed...
> > >
> > > > kvmalloc currently doesn't support GFP_NOWAIT semantic but it does allow
> > > > to express - I prefer SLAB allocator over vmalloc.
> > >
> > > The conditional use of __GFP_NORETRY for the kmalloc call is broken
> > > if we try to use __GFP_NOFAIL with kvmalloc() - this causes the gfp
> > > mask to hold __GFP_NOFAIL | __GFP_NORETRY....
> > >
> > > We have a hard requirement for xlog_kvmalloc() to provide
> > > __GFP_NOFAIL semantics.
> > >
> > > IOWs, we need kvmalloc() to support kmalloc(GFP_NOWAIT) for
> > > performance with fallback to vmalloc(__GFP_NOFAIL) for
> > > correctness...
> >
> > Are you asking the above kvmalloc() semantics just for xfs or for all
> > the users of kvmalloc() api?
>
> I'm suggesting that fast-fail should be the default behaviour for
> everyone.
>
> If you look at __vmalloc() internals, you'll see that it turns off
> __GFP_NOFAIL for high order allocations because "reclaim is too
> costly and it's far cheaper to fall back to order-0 pages".
>
> That's pretty much exactly what we are doing with xlog_kvmalloc(),
> and what I'm suggesting that kvmalloc should be doing by default.
>
> i.e. If it's necessary for mm internal implementations to avoid
> high-order reclaim when there is a faster order-0 allocation
> fallback path available for performance reasons, then we should be
> using that same behaviour anywhere optimistic high-order allocation
> is used as an optimisation for those same performance reasons.
>

I am convinced, and I think Michal is on board with the above as well.
At least we should try it and see how it goes.

> The overall __GFP_NOFAIL requirement is something XFS needs, but it
> is most definitely not something that should be enabled by default.
> However, it needs to work with kvmalloc(), and it is not possible to
> do so right now.

Once kmalloc(GFP_NOWAIT) is the default in kvmalloc(), what remains to
be done to support kvmalloc(__GFP_NOFAIL)? (Yafang mentioned vmap_huge.)
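
For reference, the semantics being discussed (a fast-fail contiguous
attempt, then an order-0 fallback that may be no-fail) could be modelled
roughly as below. This is a userspace sketch only, not kernel code and
not the xlog_kvmalloc() implementation: every name in it (model_kvmalloc,
model_contig_nowait, model_order0, struct model_alloc) is hypothetical,
malloc() stands in for both allocators, and the high_order_free flag is
an assumed knob that simulates whether a high-order page is sitting on
the free lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Which path satisfied the request (illustrative only). */
enum alloc_path { PATH_NONE, PATH_CONTIG, PATH_ORDER0 };

struct model_alloc {
	void *ptr;
	enum alloc_path path;
};

/*
 * Hypothetical stand-in for kmalloc(GFP_NOWAIT): succeeds only when a
 * high-order block is already free, and never reclaims or retries.
 */
static void *model_contig_nowait(size_t size, bool high_order_free)
{
	return high_order_free ? malloc(size) : NULL;
}

/*
 * Hypothetical stand-in for vmalloc() built from order-0 pages. With
 * nofail set it keeps retrying until it succeeds, which is the
 * __GFP_NOFAIL-style behaviour the caller asked for.
 */
static void *model_order0(size_t size, bool nofail)
{
	void *p;

	do {
		p = malloc(size);
	} while (!p && nofail);

	return p;
}

/* Fast-fail contiguous attempt first, order-0 fallback second. */
static struct model_alloc model_kvmalloc(size_t size, bool nofail,
					 bool high_order_free)
{
	struct model_alloc a = { NULL, PATH_NONE };

	assert(size > 0);

	a.ptr = model_contig_nowait(size, high_order_free);
	if (a.ptr) {
		a.path = PATH_CONTIG;
		return a;
	}

	/* No reclaim on the contiguous path; go straight to the fallback. */
	a.ptr = model_order0(size, nofail);
	if (a.ptr)
		a.path = PATH_ORDER0;

	return a;
}
```

The point of the model is that the no-fail guarantee lives only in the
fallback path, so the first attempt never has to carry both no-retry and
no-fail at once. Calling model_kvmalloc(size, true, false) takes the
order-0 path without waiting on the contiguous attempt; the caller
releases the buffer with free().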