Message-ID: <0f2091ba-0a43-4dd3-aa48-fe284530044a@suse.cz>
Date: Wed, 9 Apr 2025 11:11:37 +0200
Subject: Re: [PATCH] mm: kvmalloc: make kmalloc fast path real fast path
To: Michal Hocko, Dave Chinner, Andrew Morton
Cc: Shakeel Butt, Yafang Shao, Harry Yoo, Kees Cook, joel.granados@kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Josef Bacik, linux-mm@kvack.org
References: <20250401073046.51121-1-laoar.shao@gmail.com> <3315D21B-0772-4312-BCFB-402F408B0EF6@kernel.org>
From: Vlastimil Babka

On 4/9/25 9:35 AM, Michal Hocko wrote:
> On Thu 03-04-25 21:51:46, Michal Hocko wrote:
>> Add Andrew
>
> Andrew, do you want me to repost the patch or can you take it from this
> email thread?

I'll take it as it's now all in mm/slub.c.

>> Also, Dave do you want me to redirect xlog_cil_kvmalloc to kvmalloc or
>> do you prefer to do that yourself?
>>
>> On Thu 03-04-25 09:43:41, Michal Hocko wrote:
>>> There are users like xfs which need larger allocations with NOFAIL
>>> semantics. They are not using kvmalloc currently because the current
>>> implementation tries too hard to allocate through the kmalloc path,
>>> which causes a lot of direct reclaim and compaction, and that hurts
>>> performance a lot (see 8dc9384b7d75 ("xfs: reduce kvmalloc overhead for
>>> CIL shadow buffers") for more details).
>>>
>>> kvmalloc does support __GFP_RETRY_MAYFAIL semantics to express that
>>> a kmalloc (physically contiguous) allocation is preferred and we should
>>> be more aggressive to make it happen. There is currently no way to express
>>> that kmalloc should be very lightweight, and as it has been argued [1]
>>> this mode should be the default to support kvmalloc(NOFAIL) with a
>>> lightweight kmalloc path, which is currently impossible to express as
>>> __GFP_NOFAIL cannot be combined with any other reclaim modifiers.
>>>
>>> This patch makes all kmalloc allocations GFP_NOWAIT unless
>>> __GFP_RETRY_MAYFAIL is provided to kvmalloc. This makes it possible to
>>> support both fail-fast and retry-hard behavior on physically contiguous
>>> memory with a vmalloc fallback.
>>>
>>> There is a potential downside that relatively small allocations (smaller
>>> than PAGE_ALLOC_COSTLY_ORDER) could fall back to vmalloc too easily and
>>> cause page block fragmentation. We cannot really rule that out, but the
>>> xlog_cil_kvmalloc use doesn't indicate this to be happening.
>>>
>>> [1] https://lore.kernel.org/all/Z-3i1wATGh6vI8x8@dread.disaster.area/T/#u
>>> Signed-off-by: Michal Hocko
>>> ---
>>>  mm/slub.c | 8 +++++---
>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index b46f87662e71..2da40c2f6478 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -4972,14 +4972,16 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
>>>           * We want to attempt a large physically contiguous block first because
>>>           * it is less likely to fragment multiple larger blocks and therefore
>>>           * contribute to a long term fragmentation less than vmalloc fallback.
>>> -         * However make sure that larger requests are not too disruptive - no
>>> -         * OOM killer and no allocation failure warnings as we have a fallback.
>>> +         * However make sure that larger requests are not too disruptive - i.e.
>>> +         * do not direct reclaim unless physically continuous memory is preferred
>>> +         * (__GFP_RETRY_MAYFAIL mode). We still kick in kswapd/kcompactd to start
>>> +         * working in the background but the allocation itself.
>>>           */
>>>          if (size > PAGE_SIZE) {
>>>                  flags |= __GFP_NOWARN;
>>>
>>>                  if (!(flags & __GFP_RETRY_MAYFAIL))
>>> -                        flags |= __GFP_NORETRY;
>>> +                        flags &= ~__GFP_DIRECT_RECLAIM;
>>>
>>>                  /* nofail semantic is implemented by the vmalloc fallback */
>>>                  flags &= ~__GFP_NOFAIL;
>>> --
>>> 2.49.0
>>>
>>
>> --
>> Michal Hocko
>> SUSE Labs
>
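
For anyone skimming the thread, here is a rough user-space model of the flag adjustment in the quoted hunk. It is only a sketch: the MODEL_* bit values and names are made up for illustration and are not the kernel's, MODEL_GFP_KERNEL is reduced to its two reclaim bits, and a 4 KiB page size is assumed. Only the branch logic is meant to mirror the patch.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int gfp_t;

/* Hypothetical bit values for illustration; not the kernel's definitions. */
#define MODEL_GFP_DIRECT_RECLAIM 0x01u
#define MODEL_GFP_KSWAPD_RECLAIM 0x02u
#define MODEL_GFP_RETRY_MAYFAIL  0x04u
#define MODEL_GFP_NOFAIL         0x08u
#define MODEL_GFP_NOWARN         0x10u
/* Simplified stand-in for GFP_KERNEL: both reclaim bits, nothing else. */
#define MODEL_GFP_KERNEL (MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM)
#define MODEL_PAGE_SIZE  4096u /* assumed 4 KiB pages */

/* Mirrors the patched branch: flags used for the kmalloc attempt only. */
gfp_t kmalloc_gfp_adjust_model(gfp_t flags, size_t size)
{
    if (size > MODEL_PAGE_SIZE) {
        flags |= MODEL_GFP_NOWARN;

        /* No direct reclaim unless the caller asked to retry hard. */
        if (!(flags & MODEL_GFP_RETRY_MAYFAIL))
            flags &= ~MODEL_GFP_DIRECT_RECLAIM;

        /* NOFAIL is left to the vmalloc fallback. */
        flags &= ~MODEL_GFP_NOFAIL;
    }
    return flags;
}

/* A few cases showing what the kmalloc attempt would see. */
void model_examples(void)
{
    gfp_t f;

    /* Large default request: kswapd is still woken, no direct reclaim. */
    f = kmalloc_gfp_adjust_model(MODEL_GFP_KERNEL, 2 * MODEL_PAGE_SIZE);
    assert(!(f & MODEL_GFP_DIRECT_RECLAIM));
    assert(f & MODEL_GFP_KSWAPD_RECLAIM);

    /* Large RETRY_MAYFAIL request: direct reclaim is kept. */
    f = kmalloc_gfp_adjust_model(MODEL_GFP_KERNEL | MODEL_GFP_RETRY_MAYFAIL,
                                 2 * MODEL_PAGE_SIZE);
    assert(f & MODEL_GFP_DIRECT_RECLAIM);

    /* Large NOFAIL request: NOFAIL is stripped for the kmalloc attempt. */
    f = kmalloc_gfp_adjust_model(MODEL_GFP_KERNEL | MODEL_GFP_NOFAIL,
                                 2 * MODEL_PAGE_SIZE);
    assert(!(f & MODEL_GFP_NOFAIL));

    /* Requests of at most one page are left untouched. */
    f = kmalloc_gfp_adjust_model(MODEL_GFP_KERNEL | MODEL_GFP_NOFAIL, 512);
    assert(f == (MODEL_GFP_KERNEL | MODEL_GFP_NOFAIL));
}
```

Before the patch, the non-RETRY_MAYFAIL branch set __GFP_NORETRY instead, which still permits a direct reclaim attempt; in this model that is the only line that differs.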