From: Naoya Horiguchi <naoya.horiguchi@linux.dev>
To: ankita@nvidia.com
Cc: jgg@nvidia.com, alex.williamson@redhat.com,
akpm@linux-foundation.org, tony.luck@intel.com, bp@alien8.de,
naoya.horiguchi@nec.com, linmiaohe@huawei.com,
aniketa@nvidia.com, cjia@nvidia.com, kwankhede@nvidia.com,
targupta@nvidia.com, vsethi@nvidia.com, acurrid@nvidia.com,
anuaggarwal@nvidia.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-edac@vger.kernel.org,
kvm@vger.kernel.org
Subject: Re: [PATCH v1 4/4] vfio/nvgpu: register device memory for poison handling
Date: Tue, 26 Sep 2023 16:38:57 +0900
Message-ID: <20230926073857.GB1344149@ik1-406-35019.vs.sakura.ne.jp>
In-Reply-To: <20230920140210.12663-5-ankita@nvidia.com>
On Wed, Sep 20, 2023 at 07:32:10PM +0530, ankita@nvidia.com wrote:
> From: Ankit Agrawal <ankita@nvidia.com>
>
> The nvgrace-gpu-vfio-pci module [1] maps the device memory to the user VA
> (Qemu) using remap_pfn_range() without adding the memory to the kernel.
> The device memory pages are not backed by struct page. Patches 1-3
> implements the mechanism to handle ECC/poison on memory page without
> struct page and expose a registration function. This new mechanism is
> leveraged here.
>
> The module registers its memory region with the kernel MM for ECC handling
> using the register_pfn_address_space() registration API exposed by the
> kernel. It also defines a failure callback function pfn_memory_failure()
> to get the poisoned PFN from the MM.
>
> The module track poisoned PFN as a bitmap with a bit per PFN. The PFN is
> communicated by the kernel MM to the module through the failure function,
> which sets the appropriate bit in the bitmap.
>
> The module also defines a VMA fault ops for the module. It returns
> VM_FAULT_HWPOISON in case the bit for the PFN is set in the bitmap.
>
> [1] https://lore.kernel.org/all/20230915025415.6762-1-ankita@nvidia.com/
>
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
> ---
...
> @@ -406,6 +494,19 @@ nvgrace_gpu_vfio_pci_fetch_memory_property(struct pci_dev *pdev,
>
> nvdev->memlength = memlength;
>
> +#ifdef CONFIG_MEMORY_FAILURE
> + /*
> + * A bitmap is maintained to track the pages that are poisoned. Each
> + * page is represented by a bit. Allocation size in bytes is
> + * determined by shifting the device memory size by PAGE_SHIFT to
> + * determine the number of pages; and further shifted by 3 as each
> + * byte could track 8 pages.
> + */
> + nvdev->pfn_bitmap
> + = vzalloc((nvdev->memlength >> PAGE_SHIFT)/BITS_PER_TYPE(char));
> + if (!nvdev->pfn_bitmap)
> + ret = -ENOMEM;
> +#endif
> return ret;
> }
>
I assume that memory failure is a relatively rare event (otherwise the device
is simply broken and it's better to stop using it), so the bitmap will be
mostly zeros.
If the device memory is on the order of 100GB, the bitmap is about 3.2MB
(assuming 4KB pages). That may not be too large for modern systems, but would
another data structure with a smaller memory footprint, such as a hash table,
be more beneficial?
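To illustrate what I mean by a sparse structure, here is a minimal userspace sketch of a set of poisoned PFNs whose memory use scales with the number of poisoned pages rather than with the device size. Everything in it (struct poison_set and the poison_set_* functions) is hypothetical and only meant to show the idea. In the kernel, an existing facility such as an xarray or the <linux/hashtable.h> helpers would presumably play this role instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sparse set of poisoned PFNs: open addressing, linear probing. */
struct poison_set {
    uint64_t *slots; /* each slot holds pfn + 1; 0 marks an empty slot */
    size_t cap;      /* number of slots, always a power of two */
    size_t used;     /* number of PFNs stored */
};

/* Multiplicative hash of the PFN, reduced to a slot index. */
static size_t slot_of(uint64_t pfn, size_t cap)
{
    return (size_t)((pfn * 0x9E3779B97F4A7C15ULL) >> 32) & (cap - 1);
}

/* Store a PFN that is known to be absent; the caller guarantees a free slot. */
static void place(struct poison_set *s, uint64_t pfn)
{
    size_t i = slot_of(pfn, s->cap);

    while (s->slots[i] != 0)
        i = (i + 1) & (s->cap - 1);
    s->slots[i] = pfn + 1;
    s->used++;
}

/* cap must be a power of two, at least 2. Returns 0 on success, -1 on failure. */
int poison_set_init(struct poison_set *s, size_t cap)
{
    assert(cap >= 2 && (cap & (cap - 1)) == 0);
    s->slots = calloc(cap, sizeof(*s->slots));
    s->cap = cap;
    s->used = 0;
    return s->slots ? 0 : -1;
}

/* What a fault handler would consult before returning VM_FAULT_HWPOISON. */
bool poison_set_contains(const struct poison_set *s, uint64_t pfn)
{
    size_t i = slot_of(pfn, s->cap);

    while (s->slots[i] != 0) {
        if (s->slots[i] == pfn + 1)
            return true;
        i = (i + 1) & (s->cap - 1);
    }
    return false;
}

/* Double the table and reinsert every stored PFN. */
static int poison_set_grow(struct poison_set *s)
{
    struct poison_set bigger;
    size_t i;

    if (poison_set_init(&bigger, s->cap * 2))
        return -1;
    for (i = 0; i < s->cap; i++)
        if (s->slots[i] != 0)
            place(&bigger, s->slots[i] - 1);
    free(s->slots);
    *s = bigger;
    return 0;
}

/* What the failure callback would do with the reported PFN. */
int poison_set_add(struct poison_set *s, uint64_t pfn)
{
    assert(pfn != UINT64_MAX); /* pfn + 1 must not wrap to the empty marker */
    if (poison_set_contains(s, pfn))
        return 0;
    /* Keep the table at most half full so that probing always terminates. */
    if ((s->used + 1) * 2 > s->cap && poison_set_grow(s))
        return -1;
    place(s, pfn);
    return 0;
}
```

With a handful of poisoned pages this stays within a few hundred bytes, at the cost of an allocation that can fail when the table grows, which the fixed bitmap avoids.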
Thanks,
Naoya Horiguchi