Date: Thu, 3 Apr 2025 12:22:31 +1100
From: Dave Chinner
To: Shakeel Butt
Cc: Michal Hocko, Yafang Shao, Harry Yoo, Kees Cook,
	joel.granados@kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Josef Bacik, linux-mm@kvack.org,
	Vlastimil Babka
Subject: Re: [PATCH] proc: Avoid costly high-order page allocations when
	reading proc files
References: <20250401073046.51121-1-laoar.shao@gmail.com>
	<3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>

On Wed, Apr 02, 2025 at 04:10:06PM -0700, Shakeel Butt wrote:
> On Thu, Apr 03, 2025 at 08:16:56AM +1100, Dave Chinner wrote:
> > On Wed, Apr 02, 2025 at 02:24:45PM +0200, Michal Hocko wrote:
> > > On Wed 02-04-25 22:32:14, Dave Chinner wrote:
> > > > Have a look at xlog_kvmalloc() in XFS. It implements a basic
> > > > fast-fail, no retry high order kmalloc before it falls back to
> > > > vmalloc by turning off direct reclaim for the kmalloc() call.
> > > > Hence if there isn't a high-order page on the free lists ready
> > > > to allocate, it falls back to vmalloc() immediately.
> > > >
> > > > For XFS, using xlog_kvmalloc() reduced the high-order per-allocation
> > > > overhead by around 80% when compared to a standard kvmalloc()
> > > > call. Numbers and profiles were documented in the commit message
> > > > (reproduced in whole below)...
> > >
> > > Btw. it would be really great to have such concerns to be posted to the
> > > linux-mm ML so that we are aware of that.
> >
> > I have brought it up in the past, along with all the other kvmalloc
> > API problems that are mentioned in that commit message.
> > Unfortunately, discussion focus always ended up on calling context
> > and API flags (e.g. whether stuff like GFP_NOFS should be supported
> > or not), not the fast-fail-then-no-fail behaviour we need.
> >
> > Yes, these discussions have resulted in API changes that support
> > some new subset of gfp flags, but the performance issues have never
> > been addressed...
> >
> > > kvmalloc currently doesn't support GFP_NOWAIT semantic but it does allow
> > > to express - I prefer SLAB allocator over vmalloc.
> >
> > The conditional use of __GFP_NORETRY for the kmalloc call is broken
> > if we try to use __GFP_NOFAIL with kvmalloc() - this causes the gfp
> > mask to hold __GFP_NOFAIL | __GFP_NORETRY....
> >
> > We have a hard requirement for xlog_kvmalloc() to provide
> > __GFP_NOFAIL semantics.
> >
> > IOWs, we need kvmalloc() to support kmalloc(GFP_NOWAIT) for
> > performance with fallback to vmalloc(__GFP_NOFAIL) for
> > correctness...
>
> Are you asking the above kvmalloc() semantics just for xfs or for all
> the users of kvmalloc() api?

I'm suggesting that fast-fail should be the default behaviour for
everyone.
If you look at __vmalloc() internals, you'll see that it turns off
__GFP_NOFAIL for high order allocations because "reclaim is too
costly and it's far cheaper to fall back to order-0 pages".

That's pretty much exactly what we are doing with xlog_kvmalloc(),
and what I'm suggesting that kvmalloc should be doing by default.

i.e. If it's necessary for mm internal implementations to avoid
high-order reclaim when there is a faster order-0 allocation
fallback path available for performance reasons, then we should be
using that same behaviour anywhere optimistic high-order
allocation is used as an optimisation for those same performance
reasons.

The overall __GFP_NOFAIL requirement is something XFS needs, but
it is most definitely not something that should be enabled by
default. However, it needs to work with kvmalloc(), and it is not
possible to do so right now.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
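
For readers following the thread, the fast-fail-then-fallback
pattern discussed above can be sketched as a small userspace model.
This is only an illustration, not the XFS or mm code: every name
(toy_pool, toy_alloc_contig_nowait, toy_alloc_paged,
toy_kvmalloc_fastfail) is hypothetical, malloc() stands in for both
allocators, and the "free list" is just a counter. The assumption
being modelled is that the contiguous attempt never reclaims or
retries, so it fails at once when no block is ready, and that the
order-0 fallback is retried until it succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only. All names here are hypothetical, not kernel APIs. */
struct toy_pool {
	size_t free_contig_blocks;	/* high-order blocks already on the free list */
};

/*
 * Stands in for a high-order kmalloc() with direct reclaim turned off:
 * it succeeds only if a contiguous block is already free and otherwise
 * fails immediately without doing any work to create one.
 */
static void *toy_alloc_contig_nowait(struct toy_pool *pool, size_t size)
{
	assert(pool != NULL);
	if (pool->free_contig_blocks == 0)
		return NULL;
	pool->free_contig_blocks--;
	return malloc(size);
}

/*
 * Stands in for vmalloc(): assumed to be built from order-0 pages, so
 * it does not depend on a contiguous block being available.
 */
static void *toy_alloc_paged(size_t size)
{
	return malloc(size);
}

/*
 * Fast-fail contiguous attempt first, paged fallback second. The loop
 * models the "must not fail" requirement by retrying until one of the
 * two paths returns memory.
 */
static void *toy_kvmalloc_fastfail(struct toy_pool *pool, size_t size,
				   bool *used_fallback)
{
	void *p;

	*used_fallback = false;
	do {
		p = toy_alloc_contig_nowait(pool, size);
		if (!p) {
			*used_fallback = true;
			p = toy_alloc_paged(size);
		}
	} while (!p);

	return p;
}
```

With a pool holding one free contiguous block, the first call is
served from the fast path and the second goes straight to the
fallback. Nothing in the model tries to manufacture a contiguous
block when none is ready, which is the cost the thread argues the
default path should avoid.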