Date: Tue, 22 Sep 2020 08:51:53 -0400
From: Phil Auld
To: Huang Ying
Cc: Peter Zijlstra, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Andrew Morton, Ingo Molnar, Mel Gorman, Johannes Weiner,
    "Matthew Wilcox (Oracle)", Dave Hansen, Andi Kleen, Michal Hocko,
    David Rientjes
Subject: Re: [RFC -V2] autonuma: Migrate on fault among multiple bound nodes
Message-ID: <20200922125049.GA10420@lorien.usersys.redhat.com>
References: <20200922065401.376348-1-ying.huang@intel.com>
In-Reply-To: <20200922065401.376348-1-ying.huang@intel.com>

Hi,

On Tue, Sep 22, 2020 at 02:54:01PM +0800 Huang Ying wrote:
> Now, AutoNUMA can only optimize the page placement among the NUMA nodes if the
> default memory policy is used. Because the memory policy specified explicitly
> should take precedence. But this seems too strict in some situations. For
> example, on a system with 4 NUMA nodes, if the memory of an application is bound
> to the node 0 and 1, AutoNUMA can potentially migrate the pages between the node
> 0 and 1 to reduce cross-node accessing without breaking the explicit memory
> binding policy.
>
> So in this patch, if mbind(.mode=MPOL_BIND, .flags=MPOL_MF_LAZY) is used to bind
> the memory of the application to multiple nodes, and in the hint page fault
> handler both the faulting page node and the accessing node are in the policy
> nodemask, the page will be tried to be migrated to the accessing node to reduce
> the cross-node accessing.
>

Do you have any performance numbers that show the effects of this on a workload?

> [Peter Zijlstra: provided the simplified implementation method.]
>
> Questions:
>
> Sysctl knob kernel.numa_balancing can enable/disable AutoNUMA optimizing
> globally. But for the memory areas that are bound to multiple NUMA nodes, even
> if the AutoNUMA is enabled globally via the sysctl knob, we still need to enable
> AutoNUMA again with a special flag. Why not just optimize the page placement if
> possible as long as AutoNUMA is enabled globally? The interface would look
> simpler with that.

I agree.
I think it should try to do this if globally enabled.

>
> Signed-off-by: "Huang, Ying"
> Cc: Andrew Morton
> Cc: Ingo Molnar
> Cc: Mel Gorman
> Cc: Rik van Riel
> Cc: Johannes Weiner
> Cc: "Matthew Wilcox (Oracle)"
> Cc: Dave Hansen
> Cc: Andi Kleen
> Cc: Michal Hocko
> Cc: David Rientjes
> ---
>  mm/mempolicy.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index eddbe4e56c73..273969204732 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -2494,15 +2494,19 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
>  		break;
>  
>  	case MPOL_BIND:
> -
>  		/*
> -		 * allows binding to multiple nodes.
> -		 * use current page if in policy nodemask,
> -		 * else select nearest allowed node, if any.
> -		 * If no allowed nodes, use current [!misplaced].
> +		 * Allows binding to multiple nodes.  If both current and
> +		 * accessing nodes are in policy nodemask, migrate to
> +		 * accessing node to optimize page placement. Otherwise,
> +		 * use current page if in policy nodemask, else select
> +		 * nearest allowed node, if any.  If no allowed nodes, use
> +		 * current [!misplaced].
>  		 */
> -		if (node_isset(curnid, pol->v.nodes))
> +		if (node_isset(curnid, pol->v.nodes)) {
> +			if (node_isset(thisnid, pol->v.nodes))
> +				goto moron;

Nice label :)

>  			goto out;
> +		}
>  		z = first_zones_zonelist(
>  				node_zonelist(numa_node_id(), GFP_HIGHUSER),
>  				gfp_zone(GFP_HIGHUSER),
> @@ -2516,6 +2520,7 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
>  
>  	/* Migrate the page towards the node whose CPU is referencing it */
>  	if (pol->flags & MPOL_F_MORON) {
> +moron:
>  		polnid = thisnid;
>  
>  		if (!should_numa_migrate_memory(current, page, curnid, thiscpu))
> -- 
> 2.28.0
>

Cheers,
Phil

--
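
For reference, the MPOL_BIND decision that the quoted hunk proposes can be modelled in user space. This is only a minimal sketch, not the kernel code. It assumes MPOL_F_MORON is not set on the policy. The names nodemask_model_t, NO_NODE, node_in_mask() and bind_target_node() are hypothetical. The zonelist walk and should_numa_migrate_memory() are replaced by the plain parameters nearest_allowed and migrate_ok.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_NODE (-1)	/* hypothetical stand-in for "page is not misplaced" */

/* Hypothetical model of a nodemask: bit n set means node n is in the policy. */
typedef uint64_t nodemask_model_t;

static bool node_in_mask(int nid, nodemask_model_t mask)
{
	return nid >= 0 && nid < 64 && ((mask >> nid) & 1u);
}

/*
 * Return the node the page should migrate to, or NO_NODE to leave it alone.
 *
 * curnid          node that currently holds the page
 * thisnid         node of the CPU that took the hint fault
 * allowed         nodes named by the MPOL_BIND policy
 * nearest_allowed node the zonelist walk would pick (supplied by the caller,
 *                 since this model has no zonelists)
 * migrate_ok      stand-in for the should_numa_migrate_memory() verdict
 */
static int bind_target_node(int curnid, int thisnid, nodemask_model_t allowed,
			    int nearest_allowed, bool migrate_ok)
{
	int polnid;

	if (node_in_mask(curnid, allowed)) {
		/* Accessing node is outside the binding: keep the page. */
		if (!node_in_mask(thisnid, allowed))
			return NO_NODE;
		/* Both nodes are allowed: this is the "goto moron" path. */
		if (!migrate_ok)
			return NO_NODE;
		polnid = thisnid;
	} else {
		/* Page sits outside the binding: pull it to an allowed node. */
		polnid = nearest_allowed;
	}

	return curnid != polnid ? polnid : NO_NODE;
}
```

With nodes 0 and 1 bound (mask 0x3), a page on node 0 touched from node 1 gives bind_target_node(0, 1, 0x3, 0, true) == 1, while the same page touched from node 2 stays put and the call returns NO_NODE.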