Subject: Re: [PATCH v3 5/7] mm: Remember young/dirty bit for page migrations
From: Nadav Amit
Date: Mon, 15 Aug 2022 14:03:01 -0700
To: Peter Xu
Cc: "Huang, Ying", Linux MM, LKML, Minchan Kim, David Hildenbrand, Andrew Morton, Hugh Dickins, Vlastimil Babka, Andrea Arcangeli, Andi Kleen, "Kirill A. Shutemov", Dave Hansen
In-Reply-To: <5B21352C-2BE6-4070-BB6B-C1B7A5D4D225@gmail.com>
References: <20220809220100.20033-1-peterx@redhat.com> <20220809220100.20033-6-peterx@redhat.com> <87pmh6dwdr.fsf@yhuang6-desk2.ccr.corp.intel.com> <5B21352C-2BE6-4070-BB6B-C1B7A5D4D225@gmail.com>

On Aug 15, 2022, at 1:52 PM, Nadav Amit wrote:

> On Aug 15, 2022, at 12:18 PM, Peter Xu wrote:
>
>> On Fri, Aug 12, 2022 at 10:32:48AM +0800, Huang, Ying wrote:
>>> Peter Xu writes:
>>>
>>>> On Tue, Aug 09, 2022 at 06:00:58PM -0400, Peter Xu wrote:
>>>>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>>>>> index 27fb37d65476..699f821b8443 100644
>>>>> --- a/mm/migrate_device.c
>>>>> +++ b/mm/migrate_device.c
>>>>> @@ -221,6 +221,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>>>>>                          else
>>>>>                                  entry = make_readable_migration_entry(
>>>>>                                                          page_to_pfn(page));
>>>>> +                        if (pte_young(pte))
>>>>> +                                entry = make_migration_entry_young(entry);
>>>>> +                        if (pte_dirty(pte))
>>>>> +                                entry = make_migration_entry_dirty(entry);
>>>>>                          swp_pte = swp_entry_to_pte(entry);
>>>>>                          if (pte_present(pte)) {
>>>>>                                  if (pte_soft_dirty(pte))
>>>>
>>>> This change needs to be wrapped with pte_present() at least..
>>>>
>>>> I also just noticed that this change probably won't help anyway because:
>>>>
>>>>  (1) When ram->device, the pte will finally be replaced with a device
>>>>      private entry, and device private entry does not yet support A/D, it
>>>>      means A/D will be dropped again,
>>>>
>>>>  (2) When device->ram, we are missing information on either A/D bits, or
>>>>      even if device private entries start to support A/D, it's still not
>>>>      clear whether we should take device read/write into considerations
>>>>      too on the page A/D bits to be accurate.
>>>>
>>>> I think I'll probably keep the code there for completeness, but I think it
>>>> won't really help much until more things are done.
>>>
>>> It appears that there are more issues.  Between "pte = *ptep" and pte
>>> clear, CPU may set A/D bit in PTE, so we may need to update pte when
>>> clearing PTE.
>>
>> Agreed, I didn't see it a huge problem with current code, but it should be
>> better in that way.
>>
>>> And I don't find the TLB is flushed in some cases after PTE is cleared.
>>
>> I think it's okay to not flush tlb if pte not present.  But maybe you're
>> talking about something else?
>
> I think Huang refers to situation in which the PTE is cleared, still not
> flushed, and then A/D is being set by the hardware.
>
> At least on x86, the hardware is not supposed to do so. The only case I
> remember (and sometimes misremember) is with KNL erratum, which perhaps
> needs to be considered:
>
> https://lore.kernel.org/all/20160708001911.9A3FD2B6@viggo.jf.intel.com/

I keep misremembering this erratum. IIRC, the erratum says that the
accessed/dirty bits might be set, but that does not mean a write is possible
after the PTE is cleared (i.e., the accessed/dirty bits might be set on the
non-present PTE, but the access itself would fail). So it is not an issue
in this case: losing A/D would not affect correctness, since the access
should fail.

Dave Hansen hates it when I get confused about this one, but I have cc'ed
him in case he wants to confirm.

[ Having said all of that, in general the lack of regard for
  mm->tlb_flush_pending in such functions is always concerning. ]
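
To make the window Huang describes (between "pte = *ptep" and the PTE
clear) concrete, here is a minimal userspace sketch in C11. It is not kernel
code and not what the patch does: toy_pte_t, the TOY_PTE_* bit layout and all
toy_* helpers are hypothetical names invented for illustration, and the
hardware page walker is modeled by an ordinary function call. The atomic
exchange only plays roughly the role that ptep_get_and_clear() plays in the
kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only (not the x86 layout). */
#define TOY_PTE_PRESENT  (UINT64_C(1) << 0)
#define TOY_PTE_ACCESSED (UINT64_C(1) << 1)
#define TOY_PTE_DIRTY    (UINT64_C(1) << 2)

typedef _Atomic uint64_t toy_pte_t;

/*
 * Stand-in for a CPU write through the mapping: sets A/D atomically if the
 * PTE is present, otherwise the access fails and nothing is modified.
 */
int toy_hw_write_access(toy_pte_t *ptep)
{
	uint64_t old = atomic_load(ptep);

	do {
		if (!(old & TOY_PTE_PRESENT))
			return -1;	/* access faults, A/D untouched */
	} while (!atomic_compare_exchange_weak(ptep, &old,
			old | TOY_PTE_ACCESSED | TOY_PTE_DIRTY));
	return 0;
}

/* Racy pattern: snapshot first, clear later. */
uint64_t toy_snapshot_then_clear(toy_pte_t *ptep, int racing_write)
{
	uint64_t pte = atomic_load(ptep);	/* pte = *ptep */

	if (racing_write)
		toy_hw_write_access(ptep);	/* CPU sets A/D in the window */
	atomic_store(ptep, 0);			/* clear the PTE */
	return pte;				/* stale: may miss A/D */
}

/* Atomic pattern: the value returned is the value that was cleared. */
uint64_t toy_get_and_clear(toy_pte_t *ptep, int racing_write)
{
	if (racing_write)
		toy_hw_write_access(ptep);	/* any earlier write is visible */
	return atomic_exchange(ptep, 0);
}

void toy_demo(void)
{
	toy_pte_t racy = TOY_PTE_PRESENT;
	toy_pte_t safe = TOY_PTE_PRESENT;

	/* The snapshot misses the dirty bit set inside the window. */
	assert(!(toy_snapshot_then_clear(&racy, 1) & TOY_PTE_DIRTY));
	/* The exchange returns every bit set before the clear. */
	assert(toy_get_and_clear(&safe, 1) & TOY_PTE_DIRTY);
	/* After the clear, a further access fails and sets nothing. */
	assert(toy_hw_write_access(&safe) == -1);
	assert(atomic_load(&safe) == 0);
}
```

Calling toy_demo() from a main() exercises both patterns. The sketch says
nothing about TLB flushing or about the KNL erratum; it only models the
ordinary case where A/D is set on a still-present PTE.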