From: Jiaqi Yan
Date: Tue, 27 Aug 2024 17:46:22 -0700
Subject: Re: [PATCH v2 00/19] mm: Support huge pfnmaps
To: ankita@nvidia.com
Cc: Peter Xu, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Gavin Shan, Catalin Marinas, x86@kernel.org, Ingo Molnar,
    Andrew Morton, Paolo Bonzini, Dave Hansen, Thomas Gleixner,
    Alistair Popple, kvm@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, Sean Christopherson,
    Oscar Salvador, Jason Gunthorpe, Borislav Petkov, Zi Yan,
    Axel Rasmussen, David Hildenbrand, Yan Zhao, Will Deacon,
    Kefeng Wang, Alex Williamson
References: <20240826204353.2228736-1-peterx@redhat.com>

Adding Ankit in case he has opinions.

On Tue, Aug 27, 2024 at 5:42 PM Jiaqi Yan wrote:
>
> On Tue, Aug 27, 2024 at 3:57 PM Peter Xu wrote:
> >
> > On Tue, Aug 27, 2024 at 03:36:07PM -0700, Jiaqi Yan wrote:
> > > Hi Peter,
> >
> > Hi, Jiaqi,
> >
> > > I am curious if there is any work needed for unmap_mapping_range? If a
> > > driver hugely remap_pfn_range()ed at 1G granularity, can the driver
> > > unmap at PAGE_SIZE granularity?
> > > For example, when handling a PFN is
> >
> > Yes it can, but it'll invoke the split_huge_pud() which default routes to
> > removal of the whole pud right now (currently only covers either DAX
> > mappings or huge pfnmaps; it won't for anonymous if it comes, for example).
> >
> > In that case it'll rely on the driver providing proper fault() /
> > huge_fault() to refault things back with smaller sizes later when accessed
> > again.
>
> I see, so the driver needs to drive the recovery process, and code
> needs to be in the driver.
>
> But it seems to me the recovery process will be more or less the same
> to different drivers? In that case does it make sense that
> memory_failure do the common things for all drivers?
>
> Instead of removing the whole pud, can driver or memory_failure do
> something similar to non-struct-page-version of split_huge_page? So
> driver doesn't need to re-fault good pages back?
>
>
> >
> > > poisoned in the 1G mapping, it would be great if the mapping can be
> > > splitted to 2M mappings + 4k mappings, so only the single poisoned PFN
> > > is lost. (Pretty much like the past proposal* to use HGM** to improve
> > > hugetlb's memory failure handling).
> >
> > Note that we're only talking about MMIO mappings here, in which case the
> > PFN doesn't even have a struct page, so the whole poison idea shouldn't
> > apply, afaiu.
>
> Yes, there won't be any struct page. Ankit proposed this patchset* for
> handling poisoning. I wonder if someday the vfio-nvgrace-gpu-pci
> driver adopts your change via new remap_pfn_range (install PMD/PUD
> instead of PTE), and memory_failure_pfn still
> unmap_mapping_range(pfn_space->mapping, pfn << PAGE_SHIFT, PAGE_SIZE,
> 0), can it somehow just work and no re-fault needed?
>
> * https://lore.kernel.org/lkml/20231123003513.24292-2-ankita@nvidia.com/#t
>
>
> >
> > Thanks,
> >
> > --
> > Peter Xu
> >