From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17942C3DA4A for ; Fri, 9 Aug 2024 07:56:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 92D006B0085; Fri, 9 Aug 2024 03:56:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8DB906B0089; Fri, 9 Aug 2024 03:56:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A3B06B0098; Fri, 9 Aug 2024 03:56:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 587136B0085 for ; Fri, 9 Aug 2024 03:56:03 -0400 (EDT) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 11B1AA09A9 for ; Fri, 9 Aug 2024 07:56:03 +0000 (UTC) X-FDA: 82431948606.27.EA1D1E7 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.15]) by imf21.hostedemail.com (Postfix) with ESMTP id 152711C000B for ; Fri, 9 Aug 2024 07:55:59 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=l6+qrJHw; spf=pass (imf21.hostedemail.com: domain of ying.huang@intel.com designates 198.175.65.15 as permitted sender) smtp.mailfrom=ying.huang@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723190127; a=rsa-sha256; cv=none; b=YlyH0KLiu4RVeRhqqTWUg5DClp0kh9pKAViu4vDDSJXuIzt1RkJScn6dKYkrkxNmjdrkQT i1sLjR+7gYwu8jZvqbRLSq6e/b6n2pj0zJqlp9JnLEYTgv6CSA5t1McuHGYPloM6coWhoI /Wch/70yLcOrG3LLZpE/xWOwJMtoRZc= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=l6+qrJHw; spf=pass (imf21.hostedemail.com: domain of ying.huang@intel.com designates 198.175.65.15 as permitted sender) smtp.mailfrom=ying.huang@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723190127; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=8eQz4MceN/+zN7IOraoMqn4OvKqZOP6U6ikAofvOCBQ=; b=rwe7C+kVzjPCrhy/KVbM7624M8Mhs7UjMF0CiY99A8azAfMCWQvueHMFKD9pgIOmFrOYfd suQrRz2ARU/ekLHEKpPGpUVhLz3yMBJsW2VPzpCIJkpULtlX5fZl5g2c3io2Dx1bs+CBnA 4bELHZFskwMfcWlhN6cywpGVzmG2GnQ= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1723190161; x=1754726161; h=from:to:cc:subject:in-reply-to:references:date: message-id:mime-version; bh=94lZVQDuVX0rmJEGUMmn2REy9MejZbsiPG+f8IK032c=; b=l6+qrJHwfx/JzK2AGx8TkTEfNMhyAbmjnkK5lg3bGkVOXEz8g8DakgzZ J3cxW+EOSEJQBsmqFQzFTKYcWDPRud2Dw8rQFu5Dk++R1lDZbRkuQOnHQ QqYCPxnmrsuQHpd8aP5qQm0u5DCsefsU6IIQ5qvHGV87LHRbgp5EbjVgI +WFgeLF7Fs0+gb89kx7tA6NEVHEqvUKdktFtmsgeJkselm+KgD+mxTFCM k0EefoMxAI0Apb4j+Y4ejtE7rLNM4EGtxgIBvw4dzKMmc8IxixEYFNdSc LmyOjPxmznrm/MS4j7uLRMAHqkOnaSajYZ5C1FkCIO8YCvwve3zNkOtIm Q==; X-CSE-ConnectionGUID: 6T+TeMlxTAOOMqxzggsjFg== X-CSE-MsgGUID: l3weZOp8TUuifYFA5r6/aA== X-IronPort-AV: E=McAfee;i="6700,10204,11158"; a="25120794" X-IronPort-AV: E=Sophos;i="6.09,275,1716274800"; d="scan'208";a="25120794" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by orvoesa107.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Aug 2024 00:55:58 -0700 X-CSE-ConnectionGUID: 7LbPVeFGRV2/faFvfItgpw== X-CSE-MsgGUID: ET6+4kdqS+uqbJmnhFPo1w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,275,1716274800"; d="scan'208";a="58204970" Received: from yhuang6-desk2.sh.intel.com (HELO yhuang6-desk2.ccr.corp.intel.com) ([10.238.208.55]) by orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Aug 2024 00:55:55 -0700 From: "Huang, Ying" To: Zi Yan Cc: Kefeng Wang , David Hildenbrand , Andrew Morton , Baolin Wang , , , Subject: Re: [PATCH 1/2] mm/numa: no task_numa_fault() call if page table is changed In-Reply-To: <03D403CE-5893-456D-AB4B-67C9E9F0F532@nvidia.com> (Zi Yan's message of "Thu, 08 Aug 2024 22:05:52 -0400") References: <20240807184730.1266736-1-ziy@nvidia.com> <956553dc-587c-4a43-9877-7e8844f27f95@linux.alibaba.com> <1881267a-723d-4ba0-96d0-d863ae9345a4@redhat.com> <09AC6DFA-E50A-478D-A608-6EF08D8137E9@nvidia.com> <052552f4-5a8d-4799-8f02-177585a1c8dd@redhat.com> <8890DD6A-126A-406D-8AB9-97CF5A1F4DA4@nvidia.com> <87cymizdvc.fsf@yhuang6-desk2.ccr.corp.intel.com> <03D403CE-5893-456D-AB4B-67C9E9F0F532@nvidia.com> Date: Fri, 09 Aug 2024 15:52:22 +0800 Message-ID: <874j7uyvyh.fsf@yhuang6-desk2.ccr.corp.intel.com> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain; charset=ascii X-Stat-Signature: x9pt813tc3yiobhqy1g1fsodi5jr3n7p X-Rspamd-Queue-Id: 152711C000B X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1723190159-895002 X-HE-Meta: U2FsdGVkX1+PHYzaRFOV+Hv8g+BkSDOxUIRc5HQgy3vFJRX4+mIzjAk6C8VK/1LG2KhuB+5GuDr8/lOSZm+KPytv+ybihmrTTPGQCMdDsVOifpPjyq9xHc5MBYtw5e0+wRtIoIwBZ/IPc6vxzVH93mZVgwdtJ+jHslsC7Q3oAHMydrnY6QSxUW2zBX9DQrQtPeNavc9KcdMksZje7dlaSMK7EP9V/KLpa/wJQhRsM+mTy3G6WFoU/QnaxgX/K1DswZxx3mnx/Cgm1EGiVdsIMOUixU4nwnV9rTRtHMag4dAuuqYMhdpLiZYnMEtycT0n5OYNd26X0l/pX+mJJ96b6kzPsVFwdcvg8VJbqVHb3DROOaEcu5B+bGD3uadaJGW8J6MjyPBiLjh059kNanu3iN7CkGHNKKhRHUm84q8qu16Fx8e30vaSrxv5whX8SsovrMvYVGkgT9r9SB0vYSfCbdwmpuHBUtnAc3djWOEl050j6IcYasetmHxIvR5n8DVsap4mKd2XmyfsEC/50XHQXZz2C0TM81HSX2UZ6IHo05nBkG61A9pHNfOwMQehjvlEVvOf2Xqz0WeH3LOd+gVkc2OU2ZOrv5ZqCbeL/pIEQpXOjpF22vJRbdDlJymyOdtEYoWATeBlrTEog6VaqmFQc78ge1i7PT7xx+ktL0zeLEfxLwJNXotcSzcYcTBo1yWjr3vUxm+XijzWVKHuoaDFxc1UYTV/pNZMBk6SOm+L3RLacvqx8GPMiB4VEgN/u5W4GunBMKZZphbOM8LBSpSgGJUKvfP7wNlSsnNPAVvLhurxsTemWVJ4Kgso4RL6DDRddu/Neb4lQMRRUU6Me6rgXtiDIB1tGSdoqORIimuhrBO7O29gmzmg9Dk/kyJPZEGGTtXUeiDPf6HhOuxIog83tEE116sIychLoy5+RPfdeDkYKraDtdvULOW1M2vScEeVjtuiMwOXbkpc/L4WB9Q lJtAi2dz 8alsdBP/SB2PccpU1UaNyxUiXpi7g23QUFH54/EdrfvHmGRXEgtjAyP1cofc2bBTX7G0t5u7Ggua1f49vAeII1NJVB8TwXPLzzBk2YBJIolLkm+yYQ8A4InYSxdn8cc8m40wyNmJ1bHPmHG1MAgSHNlPDDRxtnVlvT3fULRjiuUGfFNYwnnQJxjrFp9BsTP40y2whM5nw8c4wZj7rqie8NlQuyANm+8ygZokyF/txHliisEY= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Zi Yan writes: > On 8 Aug 2024, at 21:25, Huang, Ying wrote: > >> Kefeng Wang writes: >> >>> On 2024/8/8 22:21, Zi Yan wrote: >>>> On 8 Aug 2024, at 10:14, David Hildenbrand wrote: >>>> >>>>> On 08.08.24 16:13, Zi Yan wrote: >>>>>> On 8 Aug 2024, at 4:22, David Hildenbrand wrote: >>>>>> >>>>>>> On 08.08.24 05:19, Baolin Wang wrote: >>>>>>>> >>>>>>>> >>> ... >>>>>>> Agreed, maybe we should simply handle that right away and replace the "goto out;" users by "return 0;". >>>>>>> >>>>>>> Then, just copy the 3 LOC. >>>>>>> >>>>>>> For mm/memory.c that would be: >>>>>>> >>>>>>> diff --git a/mm/memory.c b/mm/memory.c >>>>>>> index 67496dc5064f..410ba50ca746 100644 >>>>>>> --- a/mm/memory.c >>>>>>> +++ b/mm/memory.c >>>>>>> @@ -5461,7 +5461,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) >>>>>>> if (unlikely(!pte_same(old_pte, vmf->orig_pte))) { >>>>>>> pte_unmap_unlock(vmf->pte, vmf->ptl); >>>>>>> - goto out; >>>>>>> + return 0; >>>>>>> } >>>>>>> pte = pte_modify(old_pte, vma->vm_page_prot); >>>>>>> @@ -5528,15 +5528,14 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) >>>>>>> vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, >>>>>>> vmf->address, &vmf->ptl); >>>>>>> if (unlikely(!vmf->pte)) >>>>>>> - goto out; >>>>>>> + return 0; >>>>>>> if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) { >>>>>>> pte_unmap_unlock(vmf->pte, vmf->ptl); >>>>>>> - goto out; >>>>>>> + return 0; >>>>>>> } >>>>>>> goto out_map; >>>>>>> } >>>>>>> -out: >>>>>>> if (nid != NUMA_NO_NODE) >>>>>>> task_numa_fault(last_cpupid, nid, nr_pages, flags); >>>>>>> return 0; >>> >>> Maybe drop this part too, >>> >>> diff --git a/mm/memory.c b/mm/memory.c >>> index 410ba50ca746..07343c1469e0 100644 >>> --- a/mm/memory.c >>> +++ b/mm/memory.c >>> @@ -5523,6 +5523,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) >>> if (!migrate_misplaced_folio(folio, vma, target_nid)) { >>> nid = target_nid; >>> flags |= TNF_MIGRATED; >>> + goto out; >>> } else { >>> flags |= TNF_MIGRATE_FAIL; >>> vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, >>> @@ -5533,12 +5534,8 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) >>> pte_unmap_unlock(vmf->pte, vmf->ptl); >>> return 0; >>> } >>> - goto out_map; >>> } >>> >>> - if (nid != NUMA_NO_NODE) >>> - task_numa_fault(last_cpupid, nid, nr_pages, flags); >>> - return 0; >>> out_map: >>> /* >>> * Make it present again, depending on how arch implements >> >> IMHO, migration success is normal path, while migration failure is error >> processing path. If so, it's better to use "goto" for error processing >> instead of normal path. >> >>> @@ -5551,6 +5548,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) >>> numa_rebuild_single_mapping(vmf, vma, vmf->address, >>> vmf->pte, >>> writable); >>> pte_unmap_unlock(vmf->pte, vmf->ptl); >>> +out: >>> if (nid != NUMA_NO_NODE) >>> task_numa_fault(last_cpupid, nid, nr_pages, flags); >>> return 0; >>> >>> > > How about calling task_numa_fault and return in the migration successful path? > (target_nid cannot be NUMA_NO_NODE, so if is not needed) > > diff --git a/mm/memory.c b/mm/memory.c > index 3441f60d54ef..abdb73a68b80 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -5526,7 +5526,8 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) > if (!migrate_misplaced_folio(folio, vma, target_nid)) { > nid = target_nid; > flags |= TNF_MIGRATED; > - goto out; > + task_numa_fault(last_cpupid, nid, nr_pages, flags); > + return 0; > } else { > flags |= TNF_MIGRATE_FAIL; > vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, > @@ -5550,7 +5551,6 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) > numa_rebuild_single_mapping(vmf, vma, vmf->address, vmf->pte, > writable); > pte_unmap_unlock(vmf->pte, vmf->ptl); > -out: > if (nid != NUMA_NO_NODE) > task_numa_fault(last_cpupid, nid, nr_pages, flags); > return 0; > This looks better for me, or change it further. if (migrate_misplaced_folio(folio, vma, target_nid)) goto out_map_pt; nid = target_nid; flags |= TNF_MIGRATED; task_numa_fault(last_cpupid, nid, nr_pages, flags); return 0; out_map_pt: flags |= TNF_MIGRATE_FAIL; vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, ... > > > Or move the make-present code inside migration failed branch? This one > does not duplicate code but others can jump into this branch. > > diff --git a/mm/memory.c b/mm/memory.c > index 3441f60d54ef..c9b4e7099815 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -5526,7 +5526,6 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) > if (!migrate_misplaced_folio(folio, vma, target_nid)) { > nid = target_nid; > flags |= TNF_MIGRATED; > - goto out; > } else { > flags |= TNF_MIGRATE_FAIL; > vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, > @@ -5537,20 +5536,20 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) > pte_unmap_unlock(vmf->pte, vmf->ptl); > return 0; > } > - } > out_map: > - /* > - * Make it present again, depending on how arch implements > - * non-accessible ptes, some can allow access by kernel mode. > - */ > - if (folio && folio_test_large(folio)) > - numa_rebuild_large_mapping(vmf, vma, folio, pte, ignore_writable, > - pte_write_upgrade); > - else > - numa_rebuild_single_mapping(vmf, vma, vmf->address, vmf->pte, > - writable); > - pte_unmap_unlock(vmf->pte, vmf->ptl); > -out: > + /* > + * Make it present again, depending on how arch implements > + * non-accessible ptes, some can allow access by kernel mode. > + */ > + if (folio && folio_test_large(folio)) > + numa_rebuild_large_mapping(vmf, vma, folio, pte, > + ignore_writable, pte_write_upgrade); > + else > + numa_rebuild_single_mapping(vmf, vma, vmf->address, > + vmf->pte, writable); > + pte_unmap_unlock(vmf->pte, vmf->ptl); > + } > + > if (nid != NUMA_NO_NODE) > task_numa_fault(last_cpupid, nid, nr_pages, flags); > return 0; > > > Of course, I will need to change mm/huge_memory.c as well. > -- Best Regards, Huang, Ying