From: Jiaqi Yan
Date: Tue, 27 Aug 2024 17:42:21 -0700
Subject: Re: [PATCH v2 00/19] mm: Support huge pfnmaps
To: Peter Xu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Gavin Shan,
 Catalin Marinas, x86@kernel.org, Ingo Molnar, Andrew Morton,
 Paolo Bonzini, Dave Hansen, Thomas Gleixner, Alistair Popple,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 Sean Christopherson, Oscar Salvador, Jason Gunthorpe, Borislav Petkov,
 Zi Yan, Axel Rasmussen, David Hildenbrand, Yan Zhao, Will Deacon,
 Kefeng Wang, Alex Williamson
References: <20240826204353.2228736-1-peterx@redhat.com>

On Tue, Aug 27, 2024 at 3:57 PM Peter Xu wrote:
>
> On Tue, Aug 27, 2024 at 03:36:07PM -0700, Jiaqi Yan wrote:
> > Hi Peter,
>
> Hi, Jiaqi,
>
> > I am curious if there is any work needed for unmap_mapping_range? If a
> > driver hugely remap_pfn_range()ed at 1G granularity, can the driver
> > unmap at PAGE_SIZE granularity? For example, when handling a PFN is
>
> Yes it can, but it'll invoke the split_huge_pud() which default routes to
> removal of the whole pud right now (currently only covers either DAX
> mappings or huge pfnmaps; it won't for anonymous if it comes, for example).
>
> In that case it'll rely on the driver providing proper fault() /
> huge_fault() to refault things back with smaller sizes later when accessed
> again.

I see, so the driver needs to drive the recovery process, and code
needs to be in the driver.

But it seems to me the recovery process will be more or less the same
to different drivers? In that case does it make sense that
memory_failure do the common things for all drivers?

Instead of removing the whole pud, can driver or memory_failure do
something similar to non-struct-page-version of split_huge_page? So
driver doesn't need to re-fault good pages back?

>
> > poisoned in the 1G mapping, it would be great if the mapping can be
> > splitted to 2M mappings + 4k mappings, so only the single poisoned PFN
> > is lost. (Pretty much like the past proposal* to use HGM** to improve
> > hugetlb's memory failure handling).
>
> Note that we're only talking about MMIO mappings here, in which case the
> PFN doesn't even have a struct page, so the whole poison idea shouldn't
> apply, afaiu.

Yes, there won't be any struct page. Ankit proposed this patchset* for
handling poisoning. I wonder if someday the vfio-nvgrace-gpu-pci
driver adopts your change via new remap_pfn_range (install PMD/PUD
instead of PTE), and memory_failure_pfn still
unmap_mapping_range(pfn_space->mapping, pfn << PAGE_SHIFT, PAGE_SIZE,
0), can it somehow just work and no re-fault needed?

* https://lore.kernel.org/lkml/20231123003513.24292-2-ankita@nvidia.com/#t

>
> Thanks,
>
> --
> Peter Xu
>
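
To make the granularity question above concrete, here is a minimal userspace sketch of the arithmetic only. It is not kernel code and not what split_huge_pud() does today (per Peter's reply, the whole pud is removed). It assumes 4K base pages, 2M PMDs and 1G PUDs; struct split_plan, plan_split() and pages_lost_whole_pud() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: 4 KiB base pages, 2 MiB PMD mappings, 1 GiB PUD mappings. */
#define PAGE_SHIFT 12
#define PMD_SHIFT  21
#define PUD_SHIFT  30

#define PTES_PER_PMD (UINT64_C(1) << (PMD_SHIFT - PAGE_SHIFT)) /* 512 */
#define PMDS_PER_PUD (UINT64_C(1) << (PUD_SHIFT - PMD_SHIFT))  /* 512 */

/* Hypothetical summary of an ideal split; not a kernel structure. */
struct split_plan {
	uint64_t pmd_index;    /* 2M slot inside the 1G range holding the bad PFN */
	uint64_t pte_index;    /* 4K slot inside that 2M range */
	uint64_t intact_2m;    /* 2M mappings that could stay mapped */
	uint64_t intact_4k;    /* 4K mappings that could stay mapped */
	uint64_t unmap_offset; /* byte offset, i.e. pfn << PAGE_SHIFT */
};

/* Locate bad_pfn inside the 1G mapping that starts at pud_base_pfn. */
static struct split_plan plan_split(uint64_t pud_base_pfn, uint64_t bad_pfn)
{
	const uint64_t pfns_per_pud = PTES_PER_PMD * PMDS_PER_PUD;
	uint64_t rel;
	struct split_plan p;

	/* The base must be 1G aligned and the bad PFN must lie inside it. */
	assert(pud_base_pfn % pfns_per_pud == 0);
	assert(bad_pfn >= pud_base_pfn && bad_pfn < pud_base_pfn + pfns_per_pud);

	rel = bad_pfn - pud_base_pfn;
	p.pmd_index = rel / PTES_PER_PMD;
	p.pte_index = rel % PTES_PER_PMD;
	p.intact_2m = PMDS_PER_PUD - 1; /* every 2M slot except the affected one */
	p.intact_4k = PTES_PER_PMD - 1; /* every 4K page in that slot except one */
	p.unmap_offset = bad_pfn << PAGE_SHIFT;
	return p;
}

/* Number of 4K pages that lose their mapping if the whole pud is removed. */
static uint64_t pages_lost_whole_pud(void)
{
	return PTES_PER_PMD * PMDS_PER_PUD;
}
```

Under these assumptions an ideal split keeps 511 2M mappings plus 511 4K mappings and loses one 4K page, while removing the whole pud drops 262144 pages that then have to be faulted back through the driver's fault() / huge_fault().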