Date: Sat, 30 Jul 2022 18:29:25 +0100
From: Matthew Wilcox <willy@infradead.org>
To: Xin Hao <xhao@linux.alibaba.com>
Cc: adobriyan@gmail.com, akpm@linux-foundation.org, keescook@chromium.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm: add last level page table numa info to /proc/pid/numa_pgtable
References: <20220730163528.48377-1-xhao@linux.alibaba.com>
In-Reply-To: <20220730163528.48377-1-xhao@linux.alibaba.com>
On Sun, Jul 31, 2022 at 12:35:28AM +0800, Xin Hao wrote:
> In many data center servers, the shared memory architecture is
> Non-Uniform Memory Access (NUMA).  Remote NUMA node data access
> often brings high latency, but what is easy to overlook is that
> remote NUMA access to the page tables themselves can also lead
> to performance degradation.
>
> So add a new interface in /proc.  This will help developers get
> more information about performance issues when they are caused by
> cross-NUMA page table placement.

Interesting.  The implementation seems rather more complex than
necessary though.

> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 2d04e3470d4c..a51befb47ea8 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1999,4 +1999,133 @@ const struct file_operations proc_pid_numa_maps_operations = {
>  	.release	= proc_map_release,
>  };
>
> +struct pgtable_numa_maps {
> +	unsigned long node[MAX_NUMNODES];
> +};
> +
> +struct pgtable_numa_private {
> +	struct proc_maps_private proc_maps;
> +	struct pgtable_numa_maps md;
> +};

struct pgtable_numa_private {
	struct proc_maps_private proc_maps;
	unsigned long node[MAX_NUMNODES];
};

> +static void gather_pgtable_stats(struct page *page, struct pgtable_numa_maps *md)
> +{
> +	md->node[page_to_nid(page)] += 1;
> +}
> +
> +static struct page *can_gather_pgtable_numa_stats(pmd_t pmd, struct vm_area_struct *vma,
> +						  unsigned long addr)
> +{
> +	struct page *page;
> +	int nid;
> +
> +	if (!pmd_present(pmd))
> +		return NULL;
> +
> +	if (pmd_huge(pmd))
> +		return NULL;
> +
> +	page = pmd_page(pmd);
> +	nid = page_to_nid(page);
> +	if (!node_isset(nid, node_states[N_MEMORY]))
> +		return NULL;
> +
> +	return page;
> +}
> +
> +static int gather_pgtable_numa_stats(pmd_t *pmd, unsigned long addr,
> +				     unsigned long end, struct mm_walk *walk)
> +{
> +	struct pgtable_numa_maps *md = walk->private;
> +	struct vm_area_struct *vma = walk->vma;
> +	struct page *page;
> +
> +	if (pmd_huge(*pmd)) {
> +		struct page *pmd_page;
> +
> +		pmd_page = virt_to_page(pmd);
> +		if (!pmd_page)
> +			return 0;
> +
> +		if (!node_isset(page_to_nid(pmd_page), node_states[N_MEMORY]))
> +			return 0;
> +
> +		gather_pgtable_stats(pmd_page, md);
> +		goto out;
> +	}
> +
> +	page = can_gather_pgtable_numa_stats(*pmd, vma, addr);
> +	if (!page)
> +		return 0;
> +
> +	gather_pgtable_stats(page, md);
> +
> +out:
> +	cond_resched();
> +	return 0;
> +}

static int gather_pgtable_numa_stats(pmd_t *pmd, unsigned long addr,
		unsigned long end, struct mm_walk *walk)
{
	struct pgtable_numa_private *priv = walk->private;
	struct vm_area_struct *vma = walk->vma;
	struct page *page;
	int nid;

	if (pmd_huge(*pmd)) {
		page = virt_to_page(pmd);
	} else {
		page = pmd_page(*pmd);
	}

	nid = page_to_nid(page);
	priv->node[nid]++;

	return 0;
}
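
For completeness, here is a rough sketch of how that simplified
callback could be driven from the seq_file side using the generic
pagewalk API (struct mm_walk_ops / walk_page_vma()).  This is not
from the patch; the names show_numa_pgtable and pgtable_numa_ops and
the output format are placeholders, roughly mirroring what
show_numa_map() already does for numa_maps:

/*
 * Sketch only -- placeholder names, patterned on show_numa_map() in
 * fs/proc/task_mmu.c, where <linux/pagewalk.h> and <linux/seq_file.h>
 * are already included.  mmap_lock is held by the m_start() side of
 * the proc_maps_private machinery.
 */
static const struct mm_walk_ops pgtable_numa_ops = {
	.pmd_entry	= gather_pgtable_numa_stats,
};

static int show_numa_pgtable(struct seq_file *m, void *v)
{
	struct vm_area_struct *vma = v;
	struct pgtable_numa_private *priv = m->private;
	int nid;

	memset(priv->node, 0, sizeof(priv->node));

	/* Bump priv->node[] once per last-level page table page. */
	walk_page_vma(vma, &pgtable_numa_ops, priv);

	seq_printf(m, "%08lx", vma->vm_start);
	for_each_node_state(nid, N_MEMORY) {
		if (priv->node[nid])
			seq_printf(m, " N%d=%lu", nid, priv->node[nid]);
	}
	seq_putc(m, '\n');

	return 0;
}

The VMA iteration, mmap_lock handling and m->private setup would all
come from the existing proc_maps_private machinery, the same way
numa_maps gets them today.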