From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 1 Feb 2023 15:50:23 +0000
From: Matthew Wilcox <willy@infradead.org>
To: Yin Fengwei
Cc: david@redhat.com, linux-mm@kvack.org, dave.hansen@intel.com,
	tim.c.chen@intel.com, ying.huang@intel.com
Subject: Re: [RFC PATCH v2 5/5] filemap: batched update mm counter,rmap when map file folio
References: <20230201081737.2330141-1-fengwei.yin@intel.com>
 <20230201081737.2330141-6-fengwei.yin@intel.com>
In-Reply-To: <20230201081737.2330141-6-fengwei.yin@intel.com>

On Wed, Feb 01, 2023 at 04:17:37PM +0800, Yin Fengwei wrote:
> Use do_set_pte_range() in filemap_map_folio_range(). Which
> batched updates mm counter, rmap.
> 
> With a self cooked will-it-scale.page_fault3 like app (change
> file write fault to read fault) got 15% performance gain.

I'd suggest that you create a will-it-scale.page_fault4.  Anton is
quite open to adding new variations of tests.

> Perf data collected before/after the change:
> 
>  18.73%--page_add_file_rmap
>          |
>           --11.60%--__mod_lruvec_page_state
>                     |
>                     |--7.40%--__mod_memcg_lruvec_state
>                     |          |
>                     |           --5.58%--cgroup_rstat_updated
>                     |
>                      --2.53%--__mod_lruvec_state
>                                |
>                                 --1.48%--__mod_node_page_state
> 
>   9.93%--page_add_file_rmap_range
>          |
>           --2.67%--__mod_lruvec_page_state
>                     |
>                     |--1.95%--__mod_memcg_lruvec_state
>                     |          |
>                     |           --1.57%--cgroup_rstat_updated
>                     |
>                      --0.61%--__mod_lruvec_state
>                                |
>                                 --0.54%--__mod_node_page_state
> 
> The running time of __mod_lruvec_page_state() is reduced a lot.

Nice.
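As a rough illustration of the workload described above (only a sketch
of the idea, not the actual app behind the 15% figure), a
page_fault3-style loop with the write faults changed to read faults
could look like this; once the pages are in the page cache, each first
touch of a page takes a file read fault and goes through the
fault-around path that filemap_map_folio_range() serves:

/*
 * Illustrative sketch only: a page_fault3-style loop that takes file
 * *read* faults instead of write faults.  Not the benchmark used for
 * the numbers quoted above.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define MEMSIZE	(128UL * 1024 * 1024)

int main(void)
{
	char path[] = "/tmp/pf_read_XXXXXX";
	int fd = mkstemp(path);
	long pagesz = sysconf(_SC_PAGESIZE);
	unsigned long sum = 0;

	if (fd < 0 || ftruncate(fd, MEMSIZE) < 0) {
		perror("setup");
		return 1;
	}
	unlink(path);

	for (int iter = 0; iter < 100; iter++) {
		char *p = mmap(NULL, MEMSIZE, PROT_READ, MAP_SHARED, fd, 0);

		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}
		/* Touch every page read-only: each miss is a file read fault. */
		for (unsigned long off = 0; off < MEMSIZE; off += pagesz)
			sum += p[off];
		munmap(p, MEMSIZE);
	}
	printf("%lu\n", sum);	/* keep the reads from being optimised out */
	return 0;
}

Running something like that under perf record -g before and after the
series should give call graphs comparable to the ones above.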
> +++ b/mm/filemap.c
> @@ -3364,11 +3364,22 @@ static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
>  	struct file *file = vma->vm_file;
>  	struct page *page = folio_page(folio, start);
>  	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
> -	unsigned int ref_count = 0, count = 0;
> +	unsigned int ref_count = 0, count = 0, nr_mapped = 0;
>  
>  	do {
> -		if (PageHWPoison(page))
> +		if (PageHWPoison(page)) {
> +			if (nr_mapped) {
> +				vmf->pte -= nr_mapped;
> +				do_set_pte_range(vmf, folio,
> +					start + count - nr_mapped,
> +					addr - nr_mapped * PAGE_SIZE,
> +					nr_mapped);
> +
> +			}
> +
> +			nr_mapped = 0;
>  			continue;
> +		}

Having subtracted nr_mapped from vmf->pte, we then need to add it again.
But this is all too complicated.  What if we don't update vmf->pte
each time around the loop?  ie something like this:

	do {
		if (PageHWPoison(page))
			goto map;
		if (mmap_miss > 0)
			mmap_miss--;
		/*
		 * If there're PTE markers, we'll leave them to be handled
		 * in the specific fault path, and it'll prohibit the
		 * fault-around logic.
		 */
		if (!pte_none(vmf->pte[count]))
			goto map;
		if (vmf->address == addr + count * PAGE_SIZE)
			ret = VM_FAULT_NOPAGE;
		continue;
map:
		if (count > 1) {
			do_set_pte_range(vmf, folio, start, addr, count - 1);
			folio_ref_add(folio, count - 1);
		}

		start += count;
		vmf->pte += count;
		addr += count * PAGE_SIZE;
		nr_pages -= count;
		count = 0;
	} while (page++, ++count < nr_pages);

	if (count > 0) {
		do_set_pte_range(vmf, folio, start, addr, count);
		folio_ref_add(folio, count);
	} else {
		/* Make sure the PTE points to the correct page table */
		vmf->pte--;
	}
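For anyone reading this without the rest of the series handy, the
batching that both versions of the loop rely on is roughly the sketch
below.  It is only a sketch of the idea: the helper name
page_add_file_rmap_range() comes from the profile earlier in this mail,
and its arguments, like everything else here, are assumptions rather
than the actual do_set_pte_range() from the patch.

	/*
	 * Hand-waved sketch, NOT the real patch: account the whole range
	 * once (one rmap update, one mm counter update, hence far fewer
	 * __mod_lruvec_page_state() calls), then fill in the PTEs.
	 * Prefault/old/dirty handling and arch details are omitted.
	 */
	static void do_set_pte_range_sketch(struct vm_fault *vmf,
			struct folio *folio, unsigned long start,
			unsigned long addr, unsigned int nr)
	{
		struct vm_area_struct *vma = vmf->vma;
		struct page *page = folio_page(folio, start);
		pte_t *pte = vmf->pte;
		unsigned int i;

		/* Assumed signature for the series' new range helper. */
		page_add_file_rmap_range(folio, page, nr, vma, false);
		/* One counter update instead of nr individual ones. */
		add_mm_counter(vma->vm_mm, mm_counter_file(page), nr);

		for (i = 0; i < nr; i++) {
			pte_t entry = mk_pte(page + i, vma->vm_page_prot);

			set_pte_at(vma->vm_mm, addr, pte + i, entry);
			update_mmu_cache(vma, addr, pte + i);
			addr += PAGE_SIZE;
		}
	}

The point is simply that the rmap and mm-counter updates happen once
per mapped range instead of once per page.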