From: Zhongkun He
Date: Tue, 23 Jul 2024 15:00:22 +0800
Subject: Re: [External] Re: [PATCH v1] mm/numa_balancing: Fix the memory thrashing problem in the single-threaded process
To: Anshuman Khandual
Cc: peterz@infradead.org, mgorman@suse.de, ying.huang@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, wuyun.abel@bytedance.com
References: <20240723053250.3263125-1-hezhongkun.hzk@bytedance.com>

On Tue, Jul 23, 2024 at 2:16 PM Anshuman Khandual wrote:
>
>
>
> On 7/23/24 11:02, Zhongkun He wrote:
> > I found a problem in my test machine that the memory of a process is
> > repeatedly migrated between two nodes and does not stop.
> >
> > 1.Test step and the machines.
> > ------------
> > VM machine: 4 numa nodes and 10GB per node.
> >
> > stress --vm 1 --vm-bytes 12g --vm-keep
> >
> > The info of numa stat:
> > while :;do cat memory.numa_stat | grep -w anon;sleep 5;done
> > anon N0=98304 N1=0 N2=10250747904 N3=2634334208
> > anon N0=98304 N1=0 N2=10250747904 N3=2634334208
> > anon N0=98304 N1=0 N2=9937256448 N3=2947825664
> > anon N0=98304 N1=0 N2=8863514624 N3=4021567488
> > anon N0=98304 N1=0 N2=7789772800 N3=5095309312
> > anon N0=98304 N1=0 N2=6716030976 N3=6169051136
> > anon N0=98304 N1=0 N2=5642289152 N3=7242792960
> > anon N0=98304 N1=0 N2=5105442816 N3=7779639296
> > anon N0=98304 N1=0 N2=5105442816 N3=7779639296
> > anon N0=98304 N1=0 N2=4837007360 N3=8048074752
> > anon N0=98304 N1=0 N2=3763265536 N3=9121816576
> > anon N0=98304 N1=0 N2=2689523712 N3=10195558400
> > anon N0=98304 N1=0 N2=2515148800 N3=10369933312
> > anon N0=98304 N1=0 N2=2515148800 N3=10369933312
> > anon N0=98304 N1=0 N2=2515148800 N3=10369933312
> > anon N0=98304 N1=0 N2=3320455168 N3=9564626944
> > anon N0=98304 N1=0 N2=4394196992 N3=8490885120
> > anon N0=98304 N1=0 N2=5105442816 N3=7779639296
> > anon N0=98304 N1=0 N2=6174195712 N3=6710886400
> > anon N0=98304 N1=0 N2=7247937536 N3=5637144576
> > anon N0=98304 N1=0 N2=8321679360 N3=4563402752
> > anon N0=98304 N1=0 N2=9395421184 N3=3489660928
> > anon N0=98304 N1=0 N2=10247872512 N3=2637209600
> > anon N0=98304 N1=0 N2=10247872512 N3=2637209600
> >
> > 2. Root cause:
> > Since commit 3e32158767b0 ("mm/mprotect.c: don't touch single threaded
> > PTEs which are on the right node")the PTE of local pages will not be
> > changed in change_pte_range() for single-threaded process, so no
> > page_faults information will be generated in do_numa_page().
> > If a single-threaded process has memory on another node, it will
> > unconditionally migrate all of it's local memory to that node,
> > even if the remote node has only one page.
> >
> > So, let's fix it. The memory of single-threaded process should follow
> > the cpu, not the numa faults info in order to avoid memory thrashing.
> >
> > After a long time of testing, there is no memory thrashing
> > from the beginning.
> >
> > while :;do cat memory.numa_stat | grep -w anon;sleep 5;done
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> > anon N0=2548117504 N1=10336903168 N2=139264 N3=0
> >
> > V1:
> > -- Add the test results (numa stats) from Ying's feedback
> >
> > Signed-off-by: Zhongkun He
> > Acked-by: "Huang, Ying"
> > ---
> >  kernel/sched/fair.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 24dda708b699..d7cbbda568fb 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -2898,6 +2898,12 @@ static void task_numa_placement(struct task_struct *p)
> >  			numa_group_count_active_nodes(ng);
> >  		spin_unlock_irq(group_lock);
> >  		max_nid = preferred_group_nid(p, max_nid);
> > +	} else if (atomic_read(&p->mm->mm_users) == 1) {
> > +		/*
> > +		 * The memory of a single-threaded process should
> > +		 * follow the CPU in order to avoid memory thrashing.
> > +		 */
> > +		max_nid = numa_node_id();
> >  	}
> >
> >  	if (max_faults) {
>
> This in fact makes sense for a single threaded process but just
> wondering could there be any other unwanted side effects ?

Hi Anshuman,

This fix only takes effect for a single-threaded process, because of the
check 'atomic_read(&p->mm->mm_users) == 1', so I don't think there
are any other side effects.
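
To make the scope of that check concrete, here is a minimal userspace
sketch of the placement decision. It is not kernel code: struct task_model,
max_fault_node() and pick_preferred_node() are hypothetical names made up
for illustration, and the NUMA-group branch is simplified to return the
max-fault node where the kernel calls preferred_group_nid(). It only
shows that tasks in a NUMA group and tasks sharing their mm keep following
the fault statistics, while a task with mm_users == 1 follows its CPU.

```c
#include <stddef.h>

/* Hypothetical userspace model of a task; not a kernel structure. */
struct task_model {
	int mm_users;		/* users of the address space */
	int in_numa_group;	/* non-zero if the task belongs to a NUMA group */
	int cpu_node;		/* node of the CPU the task is running on */
};

/* Return the node with the most recorded faults, or -1 if none were recorded. */
static int max_fault_node(const unsigned long *faults, size_t nr_nodes)
{
	unsigned long best_faults = 0;
	int best = -1;

	for (size_t nid = 0; nid < nr_nodes; nid++) {
		if (faults[nid] > best_faults) {
			best_faults = faults[nid];
			best = (int)nid;
		}
	}
	return best;
}

/*
 * Model of the decision with the patch applied. Returns the preferred
 * node, or -1 when no faults were recorded and placement is left alone.
 */
static int pick_preferred_node(const struct task_model *t,
			       const unsigned long *faults, size_t nr_nodes)
{
	int max_nid = max_fault_node(faults, nr_nodes);

	if (max_nid < 0)
		return -1;		/* no fault data: nothing to update */
	if (t->in_numa_group)
		return max_nid;		/* simplified: group placement unchanged */
	if (t->mm_users == 1)
		return t->cpu_node;	/* single-threaded: memory follows the CPU */
	return max_nid;			/* shared mm: follow the fault statistics */
}
```

With faults recorded as {0, 0, 1, 100} and the task running on node 2, the
model picks node 2 when mm_users is 1 and node 3 when the mm is shared or
the task is in a NUMA group, so only the single-threaded case changes.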