Date: Thu, 9 Nov 2023 23:21:57 +0000
From: Matthew Wilcox
To: "zhangpeng (AS)"
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, lstoakes@gmail.com, hughd@google.com,
	david@redhat.com, fengwei.yin@intel.com, vbabka@suse.cz,
	peterz@infradead.org, mgorman@suse.de, mingo@redhat.com,
	riel@redhat.com, ying.huang@intel.com, hannes@cmpxchg.org,
	Nanyong Sun, Kefeng Wang, "Aneesh Kumar K.V"
Subject: Re: [Question]: major faults are still triggered after mlockall
	when numa balancing
References: <9e62fd9a-bee0-52bf-50a7-498fa17434ee@huawei.com>
In-Reply-To: <9e62fd9a-bee0-52bf-50a7-498fa17434ee@huawei.com>

I went spelunking to try to find out more about this issue, and I
discovered it's Aneesh's fault from 2017 ...

On Thu, Nov 09, 2023 at 09:47:24PM +0800, zhangpeng (AS) wrote:
> Hi everyone,
>
> There is a performance issue that has been bothering us recently.
> This problem can reproduce in the latest mainline version (Linux 6.6).
>
> We use mlockall(MCL_CURRENT | MCL_FUTURE) in the user mode process
> to avoid performance problems caused by major fault.
>
> There is a stage in numa fault which will set pte as 0 in do_numa_page() :
> ptep_modify_prot_start() will clear the vmf->pte, until
> ptep_modify_prot_commit() assign a value to the vmf->pte.
>
> For the data segment of the user-mode program, the global variable area
> is a private mapping. After the pagecache is loaded, the private
> anonymous page is generated after the COW is triggered. Mlockall can
> lock COW pages (anonymous pages), but the original file pages cannot
> be locked and may be reclaimed. If the global variable (private anon page)
> is accessed when vmf->pte is zero which is concurrently set by numa fault,
> a file page fault will be triggered.
>
> At this time, the original private file page may have been reclaimed.
> If the page cache is not available at this time, a major fault will be
> triggered and the file will be read, causing additional overhead.
>
> Our problem scenario is as follows:
>
> task 1                                  task 2
> ------                                  ------
> /* scan global variables */
> do_numa_page()
>   spin_lock(vmf->ptl)
>   ptep_modify_prot_start()
>   /* set vmf->pte as null */
>                                         /* Access global variables */
>                                         handle_pte_fault()
>                                           /* no pte lock */
>                                           do_pte_missing()
>                                             do_fault()
>                                               do_read_fault()
>   ptep_modify_prot_commit()
>   /* ptep update done */
>   pte_unmap_unlock(vmf->pte, vmf->ptl)
>                                                 do_fault_around()
>                                                 __do_fault()
>                                                   filemap_fault()
>                                                     /* page cache is not available
>                                                     and a major fault is triggered */
>                                                     do_sync_mmap_readahead()
>                                                     /* page_not_uptodate and goto
>                                                     out_retry. */
>
> Is there any way to avoid such a major fault?
>
> --
> Best Regards,
> Peng
>
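
For anyone who wants to watch for this from user space, here is a minimal
sketch. It is not taken from the report: the names global_counter,
major_faults() and measure_major_faults() are made up for illustration, and
it assumes a Linux system where getrusage() fills in ru_majflt. It locks
memory the same way the report describes, writes to a global in the data
segment, and returns how many major faults were taken in the meantime. It
only measures; it does not by itself provoke the race with do_numa_page(),
which presumably needs NUMA balancing scanning the range and the backing
file page reclaimed at the right moment.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Initialised global: lives in the privately mapped, file-backed data segment. */
static volatile long global_counter = 1;

/* Return the number of major faults this process has taken so far, or -1. */
static long major_faults(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return -1;
	return ru.ru_majflt;
}

/*
 * Lock current and future mappings, write to the global in a loop and
 * return the number of major faults taken during the loop (-1 on error).
 */
static long measure_major_faults(unsigned long iterations)
{
	long before, after;
	unsigned long i;
	int locked;

	locked = (mlockall(MCL_CURRENT | MCL_FUTURE) == 0);
	if (!locked)
		perror("mlockall");	/* e.g. RLIMIT_MEMLOCK too low; keep going */

	before = major_faults();
	for (i = 0; i < iterations; i++)
		global_counter++;	/* write access to the COWed data page */
	after = major_faults();

	if (locked)
		munlockall();		/* do not leave the caller's memory locked */

	if (before < 0 || after < 0)
		return -1;
	return after - before;
}
```

Calling measure_major_faults() from a long-running loop and logging any
non-zero return would be one way to confirm that major faults still occur
with the memory locked.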