Date: Thu, 3 Apr 2025 21:51:44 +0200
From: Michal Hocko
To: Dave Chinner, Andrew Morton
Cc: Shakeel Butt, Yafang Shao, Harry Yoo, Kees Cook, joel.granados@kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Josef Bacik, linux-mm@kvack.org, Vlastimil Babka
Subject: Re: [PATCH] mm: kvmalloc: make kmalloc fast path real fast path
References: <20250401073046.51121-1-laoar.shao@gmail.com> <3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>

Add Andrew

Also, Dave, do you want me to redirect xlog_cil_kvmalloc to kvmalloc, or
do you prefer to do that yourself?

On Thu 03-04-25 09:43:41, Michal Hocko wrote:
> There are users like xfs which need larger allocations with NOFAIL
> semantics. They are not using kvmalloc currently because the current
> implementation tries too hard to allocate through the kmalloc path
> which causes a lot of direct reclaim and compaction and that hurts
> performance a lot (see 8dc9384b7d75 ("xfs: reduce kvmalloc overhead for
> CIL shadow buffers") for more details).
>
> kvmalloc does support the __GFP_RETRY_MAYFAIL semantic to express that
> a kmalloc (physically contiguous) allocation is preferred and that we
> should be more aggressive to make it happen. There is currently no way
> to express that kmalloc should be very lightweight and, as has been
> argued [1], this mode should be the default to support kvmalloc(NOFAIL)
> with a lightweight kmalloc path, which is currently impossible to
> express as __GFP_NOFAIL cannot be combined with any other reclaim
> modifiers.
>
> This patch makes all kmalloc allocations GFP_NOWAIT unless
> __GFP_RETRY_MAYFAIL is provided to kvmalloc. This allows supporting both
> fail fast and retry hard on physically contiguous memory with vmalloc
> fallback.
>
> There is a potential downside that relatively small allocations (smaller
> than PAGE_ALLOC_COSTLY_ORDER) could fall back to vmalloc too easily and
> cause page block fragmentation. We cannot really rule that out but the
> xlog_cil_kvmalloc use doesn't seem to indicate that this is happening.
>
> [1] https://lore.kernel.org/all/Z-3i1wATGh6vI8x8@dread.disaster.area/T/#u
> Signed-off-by: Michal Hocko
> ---
>  mm/slub.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index b46f87662e71..2da40c2f6478 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4972,14 +4972,16 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
>  	 * We want to attempt a large physically contiguous block first because
>  	 * it is less likely to fragment multiple larger blocks and therefore
>  	 * contribute to a long term fragmentation less than vmalloc fallback.
> -	 * However make sure that larger requests are not too disruptive - no
> -	 * OOM killer and no allocation failure warnings as we have a fallback.
> +	 * However make sure that larger requests are not too disruptive - i.e.
> +	 * do not direct reclaim unless physically contiguous memory is preferred
> +	 * (__GFP_RETRY_MAYFAIL mode). We still kick in kswapd/kcompactd to start
> +	 * working in the background but the allocation itself does not reclaim.
>  	 */
>  	if (size > PAGE_SIZE) {
>  		flags |= __GFP_NOWARN;
>
>  		if (!(flags & __GFP_RETRY_MAYFAIL))
> -			flags |= __GFP_NORETRY;
> +			flags &= ~__GFP_DIRECT_RECLAIM;
>
>  		/* nofail semantic is implemented by the vmalloc fallback */
>  		flags &= ~__GFP_NOFAIL;
> --
> 2.49.0
>

-- 
Michal Hocko
SUSE Labs