From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 26 Jul 2022 14:09:25 +0100
From: Matthew Wilcox <willy@infradead.org>
To: Liu Zixian
Cc: hughd@google.com, akpm@linux-foundation.org, linux-mm@kvack.org,
	linfeilong@huawei.com
Subject: Re: [PATCH] shmem: support huge_fault to avoid pmd split
References: <20220726124315.1606-1-liuzixian4@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20220726124315.1606-1-liuzixian4@huawei.com>
On Tue, Jul 26, 2022 at 08:43:15PM +0800, Liu Zixian wrote:
> Transparent hugepages in tmpfs are useful for reducing TLB misses, but
> they will be split during a CoW memory fault.

That's intentional. Possibly misguided, but there's a tradeoff to be
made between memory consumption and using large pages.

> This will happen if we mprotect() and rewrite a code segment (which is
> a private file mapping) to hotpatch a running process.
>
> We can avoid the splitting by adding a huge_fault function.
>
> Signed-off-by: Liu Zixian
> ---
>  mm/shmem.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 46 insertions(+)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index a6f565308..12b2b5140 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2120,6 +2120,51 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf)
>  	return ret;
>  }
>
> +static vm_fault_t shmem_huge_fault(struct vm_fault *vmf, enum page_entry_size pe_size)
> +{
> +	vm_fault_t ret = VM_FAULT_FALLBACK;
> +	unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
> +	struct page *old_page, *new_page;
> +	int gfp_flags = GFP_HIGHUSER_MOVABLE | __GFP_COMP;
> +
> +	/* read or shared fault will not split huge pmd */
> +	if (!(vmf->flags & FAULT_FLAG_WRITE)
> +	    || (vmf->vma->vm_flags & VM_SHARED))
> +		return VM_FAULT_FALLBACK;
> +	if (pe_size != PE_SIZE_PMD)
> +		return VM_FAULT_FALLBACK;
> +
> +	if (pmd_none(*vmf->pmd)) {
> +		if (shmem_fault(vmf) & VM_FAULT_ERROR)
> +			goto out;
> +		if (!PageTransHuge(vmf->page))
> +			goto out;
> +		old_page = vmf->page;
> +	} else {
> +		old_page = pmd_page(*vmf->pmd);
> +		page_remove_rmap(old_page, vmf->vma, true);
> +		pmdp_huge_clear_flush(vmf->vma, haddr, vmf->pmd);
> +		add_mm_counter(vmf->vma->vm_mm, MM_SHMEMPAGES, -HPAGE_PMD_NR);
> +	}
> +
> +	new_page = &vma_alloc_folio(gfp_flags, HPAGE_PMD_ORDER,
> +			vmf->vma, haddr, true)->page;
> +	if (!new_page)
> +		goto out;
> +	prep_transhuge_page(new_page);

vma_alloc_folio() does the prep_transhuge_page() for you; see the
sketch at the end of this mail.

> +	copy_user_huge_page(new_page, old_page, haddr, vmf->vma, HPAGE_PMD_NR);
> +	__SetPageUptodate(new_page);
> +
> +	ret = do_set_pmd(vmf, new_page);
> +
> +out:
> +	if (vmf->page) {
> +		unlock_page(vmf->page);
> +		put_page(vmf->page);
> +	}
> +	return ret;
> +}
> +
>  unsigned long shmem_get_unmapped_area(struct file *file,
>  				      unsigned long uaddr, unsigned long len,
>  				      unsigned long pgoff, unsigned long flags)
> @@ -3884,6 +3929,7 @@ static const struct super_operations shmem_ops = {
>
>  static const struct vm_operations_struct shmem_vm_ops = {
>  	.fault = shmem_fault,
> +	.huge_fault = shmem_huge_fault,
>  	.map_pages = filemap_map_pages,
>  #ifdef CONFIG_NUMA
>  	.set_policy = shmem_set_policy,
> --
> 2.33.0
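
To be concrete (entirely untested), the allocation could look something
like this, keeping the names from your patch; note also that gfp flags
are normally declared gfp_t rather than int:

	gfp_t gfp_flags = GFP_HIGHUSER_MOVABLE | __GFP_COMP;
	struct folio *new_folio;

	/* vma_alloc_folio() already did the prep_transhuge_page() */
	new_folio = vma_alloc_folio(gfp_flags, HPAGE_PMD_ORDER, vmf->vma,
				    haddr, true);
	if (!new_folio)
		goto out;
	new_page = &new_folio->page;

That also avoids taking the address of ->page before the folio pointer
has been checked for NULL.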
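
And for anyone who wants to reproduce the scenario described in the
changelog, here is a minimal userspace sketch. It assumes /dev/shm is a
tmpfs mount with huge=always, and the file name is made up:

	#include <fcntl.h>
	#include <sys/mman.h>
	#include <unistd.h>

	#define MAP_LEN	(2UL << 20)	/* one PMD-sized page on x86-64 */

	int main(void)
	{
		int fd = open("/dev/shm/hotpatch-test", O_RDWR | O_CREAT, 0600);
		char *p;

		if (fd < 0 || ftruncate(fd, MAP_LEN) < 0)
			return 1;

		/* private mapping, like a process's code segment */
		p = mmap(NULL, MAP_LEN, PROT_READ | PROT_EXEC, MAP_PRIVATE,
			 fd, 0);
		if (p == MAP_FAILED)
			return 1;

		/* "hotpatch": make it writable and store into it */
		if (mprotect(p, MAP_LEN, PROT_READ | PROT_WRITE) < 0)
			return 1;
		p[0] = 0x90;	/* this write takes the CoW fault */

		munmap(p, MAP_LEN);
		close(fd);
		return 0;
	}

Without the patch, that final write splits the PMD; comparing
ShmemPmdMapped / AnonHugePages in /proc/<pid>/smaps before and after the
write should show whether the mapping stayed huge.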