Date: Fri, 17 May 2024 14:56:12 +0200
Subject: Re: [RFC] iomap: use huge zero folio in iomap_dio_zero
From: Hannes Reinecke
To: "Pankaj Raghav (Samsung)", Matthew Wilcox
Cc: david@fromorbit.com, djwong@kernel.org, hch@lst.de, Keith Busch,
 mcgrof@kernel.org, akpm@linux-foundation.org, brauner@kernel.org,
 chandan.babu@oracle.com, gost.dev@samsung.com, john.g.garry@oracle.com,
 linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-xfs@vger.kernel.org, p.raghav@samsung.com,
 ritesh.list@gmail.com, ziy@nvidia.com
References: <20240503095353.3798063-8-mcgrof@kernel.org>
 <20240507145811.52987-1-kernel@pankajraghav.com>
 <20240515155943.2uaa23nvddmgtkul@quentin>
 <20240516150206.d64eezbj3waieef5@quentin>

On 5/17/24 14:36, Hannes Reinecke wrote:
> On 5/16/24 17:02, Pankaj Raghav (Samsung) wrote:
>> On Wed, May 15, 2024 at 07:03:20PM +0100, Matthew Wilcox wrote:
>>> On Wed, May 15, 2024 at 03:59:43PM +0000, Pankaj Raghav (Samsung) wrote:
>>>>   static int __init iomap_init(void)
>>>>   {
>>>> +       void            *addr = kzalloc(16 * PAGE_SIZE, GFP_KERNEL);
>>>
>>> Don't use XFS coding style outside XFS.
>>>
>>> kzalloc() does not guarantee page alignment much less alignment to
>>> a folio.  It happens to work today, but that is an implementation
>>> artefact.
>>>
>>>> +
>>>> +       if (!addr)
>>>> +               return -ENOMEM;
>>>> +
>>>> +       zero_fsb_folio = virt_to_folio(addr);
>>>
>>> We also don't guarantee that calling kzalloc() gives you a virtual
>>> address that can be converted to a folio.  You need to allocate a folio
>>> to be sure that you get a folio.
>>>
>>> Of course, you don't actually need a folio.  You don't need any of the
>>> folio metadata and can just use raw pages.
>>>
>>>> +       /*
>>>> +        * The zero folio used is 64k.
>>>> +        */
>>>> +       WARN_ON_ONCE(len > (16 * PAGE_SIZE));
>>>
>>> PAGE_SIZE is not necessarily 4KiB.
>>>
>>>> +       bio = iomap_dio_alloc_bio(iter, dio, BIO_MAX_VECS,
>>>> +                                 REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
>>>
>>> The point was that we now only need one biovec, not MAX.
>>>
>>
>> Thanks for the comments. I think it all makes sense:
>>
>> diff --git a/fs/internal.h b/fs/internal.h
>> index 7ca738904e34..e152b77a77e4 100644
>> --- a/fs/internal.h
>> +++ b/fs/internal.h
>> @@ -35,6 +35,14 @@ static inline void bdev_cache_init(void)
>>   int __block_write_begin_int(struct folio *folio, loff_t pos,
>> unsigned len,
>>                  get_block_t *get_block, const struct iomap *iomap);
>> +/*
>> + * iomap/buffered-io.c
>> + */
>> +
>> +#define ZERO_FSB_SIZE (65536)
>> +#define ZERO_FSB_ORDER (get_order(ZERO_FSB_SIZE))
>> +extern struct page *zero_fs_block;
>> +
>>   /*
>>    * char_dev.c
>>    */
> But why?
> We already have a perfectly fine hugepage zero page in huge_memory.c.
> Shouldn't we rather export that one and use it?
> (Actually I have some patches for doing so...)
> We might allocate folios

Bah. Hit 'enter' too soon.
We might allocate a zero folio as a fallback if the huge zero page is not
available, but first we should try to use that.

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
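
For reference, the "PAGE_SIZE is not necessarily 4KiB" remark in the quoted
review can be checked with plain arithmetic. The following is only a userspace
sketch, not kernel code: order_for() is a hypothetical stand-in for the
kernel's get_order(), sixteen_pages() is a hypothetical helper mirroring the
original "16 * PAGE_SIZE" expression, and the page size is passed in as a
parameter instead of being a build-time constant.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed size of the zero buffer, as in the revised patch quoted above. */
#define ZERO_FSB_SIZE 65536UL

/*
 * Hypothetical userspace stand-in for the kernel's get_order(): the
 * smallest order such that (page_size << order) >= size.
 */
static unsigned int order_for(unsigned long size, unsigned long page_size)
{
	unsigned int order = 0;

	/* Page sizes are assumed to be non-zero powers of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	while ((page_size << order) < size)
		order++;
	return order;
}

/* Bytes covered by the original "16 * PAGE_SIZE" expression. */
static unsigned long sixteen_pages(unsigned long page_size)
{
	return 16 * page_size;
}
```

With 4KiB pages, order_for(ZERO_FSB_SIZE, 4096) is 4 and sixteen_pages(4096)
is 64KiB, so both forms agree. With 64KiB pages the fixed-size form needs only
order 0, while sixteen_pages(65536) grows to 1MiB, which is why a bound
written as a multiple of PAGE_SIZE does not mean "64k" everywhere.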