From: Baolin Wang <baolin.wang@linux.alibaba.com>
Date: Mon, 2 Sep 2024 16:36:08 +0800
Subject: Re: [PATCH] mm,tmpfs: consider end of file write in shmem_is_huge
To: Rik van Riel, Hugh Dickins
Cc: kernel-team@meta.com, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dave Chinner, "Darrick J. Wong", Vlastimil Babka
In-Reply-To: <20240829235415.57374fc3@imladris.surriel.com>
References: <20240829235415.57374fc3@imladris.surriel.com>
On 2024/8/30 11:54, Rik van Riel wrote:
> Take the end of a file write into consideration when deciding whether
> or not to use huge folios for tmpfs files when the tmpfs filesystem is
> mounted with huge=within_size.
> 
> This allows large writes that append to the end of a file to
> automatically use large folios.

Makes sense to me.

> Doing 4MB sequential writes without fallocate to a 16GB tmpfs file:
> - 4kB pages: 1560 MB/s
> - huge=within_size: 4720 MB/s
> - huge=always: 4720 MB/s
> 
> Signed-off-by: Rik van Riel
> ---
>  fs/xfs/scrub/xfile.c     |  6 +++---
>  fs/xfs/xfs_buf_mem.c     |  2 +-
>  include/linux/shmem_fs.h | 12 ++++++-----
>  mm/huge_memory.c         |  2 +-
>  mm/khugepaged.c          |  2 +-
>  mm/shmem.c               | 44 +++++++++++++++++++++-------------------
>  mm/userfaultfd.c         |  2 +-
>  7 files changed, 37 insertions(+), 33 deletions(-)
> 
> diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c
> index d848222f802b..e6e1c1fd23cb 100644

[snip]

> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index 1d06b1e5408a..846c1ea91f50 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -111,13 +111,15 @@ extern void shmem_truncate_range(struct inode *inode, loff_t start, loff_t end);
>  int shmem_unuse(unsigned int type);
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -extern bool shmem_is_huge(struct inode *inode, pgoff_t index, bool shmem_huge_force,
> -			  struct mm_struct *mm, unsigned long vm_flags);
> +extern bool shmem_is_huge(struct inode *inode, pgoff_t index, loff_t write_end,
> +			  bool shmem_huge_force, struct mm_struct *mm,
> +			  unsigned long vm_flags);
>  unsigned long shmem_allowable_huge_orders(struct inode *inode,
> 				struct vm_area_struct *vma, pgoff_t index,
> 				bool global_huge);
>  #else
> -static __always_inline bool shmem_is_huge(struct inode *inode, pgoff_t index, bool shmem_huge_force,
> +static __always_inline bool shmem_is_huge(struct inode *inode, pgoff_t index,
> +					  loff_t write_end, bool shmem_huge_force,
>  					  struct mm_struct *mm, unsigned long vm_flags)
>  {
>  	return false;
> @@ -150,8 +152,8 @@ enum sgp_type {
>  	SGP_FALLOC,	/* like SGP_WRITE, but make existing page Uptodate */
>  };
>  
> -int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop,
> -		enum sgp_type sgp);
> +int shmem_get_folio(struct inode *inode, pgoff_t index, loff_t write_end,
> +		struct folio **foliop, enum sgp_type sgp);
>  struct folio *shmem_read_folio_gfp(struct address_space *mapping,
> 		pgoff_t index, gfp_t gfp);
>  
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 67c86a5d64a6..8c09071e78cd 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -160,7 +160,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>  	 * own flags.
>  	 */
>  	if (!in_pf && shmem_file(vma->vm_file)) {
> -		bool global_huge = shmem_is_huge(file_inode(vma->vm_file), vma->vm_pgoff,
> +		bool global_huge = shmem_is_huge(file_inode(vma->vm_file), vma->vm_pgoff, 0,
> 						 !enforce_sysfs, vma->vm_mm, vm_flags);
>  
>  		if (!vma_is_anon_shmem(vma))
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index cdd1d8655a76..0ebabff10f97 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1866,7 +1866,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
>  		if (xa_is_value(folio) || !folio_test_uptodate(folio)) {
>  			xas_unlock_irq(&xas);
>  			/* swap in or instantiate fallocated page */
> -			if (shmem_get_folio(mapping->host, index,
> +			if (shmem_get_folio(mapping->host, index, 0,
>  					    &folio, SGP_NOALLOC)) {
>  				result = SCAN_FAIL;
>  				goto xa_unlocked;
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 5a77acf6ac6a..964c24fc480f 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -548,7 +548,7 @@ static bool shmem_confirm_swap(struct address_space *mapping,
>  
>  static int shmem_huge __read_mostly = SHMEM_HUGE_NEVER;
>  
> -static bool __shmem_is_huge(struct inode *inode, pgoff_t index,
> +static bool __shmem_is_huge(struct inode *inode, pgoff_t index, loff_t write_end,
> 			    bool shmem_huge_force, struct mm_struct *mm,
> 			    unsigned long vm_flags)
>  {
> @@ -568,7 +568,8 @@ static bool __shmem_is_huge(struct inode *inode, pgoff_t index,
>  		return true;
>  	case SHMEM_HUGE_WITHIN_SIZE:
>  		index = round_up(index + 1, HPAGE_PMD_NR);
> -		i_size = round_up(i_size_read(inode), PAGE_SIZE);
> +		i_size = max(write_end, i_size_read(inode));
> +		i_size = round_up(i_size, PAGE_SIZE);
>  		if (i_size >> PAGE_SHIFT >= index)
>  			return true;
>  		fallthrough;

shmem_is_huge() is no longer exported and has been renamed to
shmem_huge_global_enabled() by the series[1], so you need to rebase on
the latest mm-unstable branch.

[1] https://lore.kernel.org/all/cover.1721626645.git.baolin.wang@linux.alibaba.com/T/#md2580130f990af0b1428010bfb4cc789bb865136