From: Yang Shi
Date: Fri, 11 Mar 2022 11:08:32 -0800
Subject: Re: [PATCH] mm/khugepaged: sched to numa node when collapse huge page
To: Bibo Mao
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20220311090119.2412738-1-maobibo@loongson.cn>
In-Reply-To: <20220311090119.2412738-1-maobibo@loongson.cn>

On Fri, Mar 11, 2022 at 1:01 AM Bibo Mao wrote:
>
> collapse huge page is slow, specially when khugepaged daemon runs
> on different numa node with that of huge page. It suffers from
> huge page copying across nodes, also cache is not used for target
> node. With this patch, khugepaged daemon switches to the same numa
> node with huge page. It saves copying time and makes use of local
> cache better.
>
> Signed-off-by: Bibo Mao
> ---
>  mm/khugepaged.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 131492fd1148..460c285dc974 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -116,6 +116,7 @@ struct khugepaged_scan {
>  	struct list_head mm_head;
>  	struct mm_slot *mm_slot;
>  	unsigned long address;
> +	int node;
>  };
>
>  static struct khugepaged_scan khugepaged_scan = {
> @@ -1066,6 +1067,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	struct vm_area_struct *vma;
>  	struct mmu_notifier_range range;
>  	gfp_t gfp;
> +	const struct cpumask *cpumask;
>
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>
> @@ -1079,6 +1081,13 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	 * that. We will recheck the vma after taking it again in write mode.
>  	 */
>  	mmap_read_unlock(mm);
> +
> +	/* sched to specified node before huage page memory copy */
> +	cpumask = cpumask_of_node(node);
> +	if ((khugepaged_scan.node != node) && !cpumask_empty(cpumask)) {
> +		set_cpus_allowed_ptr(current, cpumask);
> +		khugepaged_scan.node = node;

What if khugepaged gets scheduled onto another node after this, but
khugepaged_scan.node still equals node? That seems possible to me,
IIUC.

TBH, I'm not quite sure whether migrating khugepaged is really worth it
for everyone. In the worst case the base pages have no obvious
locality, for example they may be spread across all nodes, so you
always get a cross-node memory copy.
And khugepaged may get slower if the CPUs on the target node are
contended. In addition, I saw that MIPS has its own
copy_user_highpage(); is that a contributing factor too?

> +	}
>  	new_page = khugepaged_alloc_page(hpage, gfp, node);
>  	if (!new_page) {
>  		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
> @@ -2380,6 +2389,7 @@ int start_stop_khugepaged(void)
>  		kthread_stop(khugepaged_thread);
>  		khugepaged_thread = NULL;
>  	}
> +	khugepaged_scan.node = NUMA_NO_NODE;
>  	set_recommended_min_free_kbytes();
>  fail:
>  	mutex_unlock(&khugepaged_mutex);
> --
> 2.31.1
>
>
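
To make the first question above concrete, here is a minimal userspace
model of the guard in the patch. It is only a sketch, not kernel code:
struct model_thread, external_move() and both pin_* helpers are
hypothetical names invented for illustration, and it assumes that
something can move the thread without updating the cached node, which
is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE (-1)	/* stands in for NUMA_NO_NODE in this model */

/* Hypothetical model of the thread's state; not a kernel structure. */
struct model_thread {
	int running_node;	/* node the thread is really placed on */
	int cached_node;	/* plays the role of khugepaged_scan.node */
};

/*
 * Mirrors the guard in the patch: re-pin only when the cached node
 * differs from the target. Returns true if a re-pin happened.
 */
static bool pin_if_cache_differs(struct model_thread *t, int target)
{
	assert(target >= 0);
	if (t->cached_node != target) {
		t->running_node = target;	/* models set_cpus_allowed_ptr() */
		t->cached_node = target;
		return true;
	}
	return false;
}

/*
 * Alternative for comparison: decide from the real placement instead
 * of the cached value. Returns true if a re-pin happened.
 */
static bool pin_if_placement_differs(struct model_thread *t, int target)
{
	assert(target >= 0);
	if (t->running_node != target) {
		t->running_node = target;
		t->cached_node = target;
		return true;
	}
	return false;
}

/*
 * Assumed event: the thread is moved behind the cache's back, so
 * cached_node is left untouched on purpose.
 */
static void external_move(struct model_thread *t, int node)
{
	t->running_node = node;
}
```

In this model, once external_move() has run, pin_if_cache_differs()
sees a matching cached node and skips the re-pin, so the copy would
still cross nodes. pin_if_placement_differs() catches that case
because it checks where the thread actually is.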