* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Pranjal Shrivastava @ 2026-02-24 20:57 UTC
To: Leon Romanovsky
Cc: Ashish Mhetre, robin.murphy, joro, will, iommu, linux-kernel, linux-tegra, linux-mm

On Tue, Feb 24, 2026 at 02:32:21PM +0200, Leon Romanovsky wrote:
> On Tue, Feb 24, 2026 at 10:42:57AM +0000, Ashish Mhetre wrote:
> > When mapping scatter-gather entries that reference reserved
> > memory regions without struct page backing (e.g., bootloader created
> > carveouts), is_pci_p2pdma_page() dereferences the page pointer
> > returned by sg_page() without first verifying its validity.
>
> I believe this behavior started after commit 88df6ab2f34b
> ("mm: add folio_is_pci_p2pdma()"). Prior to that change, the
> is_zone_device_page(page) check would return false when given a
> non-existent page pointer.
>

Doesn't folio_is_pci_p2pdma() also check for zone device?
I see[1] that it does:

static inline bool folio_is_pci_p2pdma(const struct folio *folio)
{
	return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
		folio_is_zone_device(folio) &&
		folio->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA;
}

I believe the problem arises due to the page_folio() call in
folio_is_pci_p2pdma(page_folio(page)); within is_pci_p2pdma_page().
page_folio() assumes it has a valid struct page to work with. For these
carveouts, that isn't true.

Potentially something like the following would stop the crash:

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index e3c2ccf872a8..e47876021afa 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -197,7 +197,8 @@ static inline void folio_set_zone_device_data(struct folio *folio, void *data)

 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
-	return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
+	return IS_ENABLED(CONFIG_PCI_P2PDMA) && page &&
+		pfn_valid(page_to_pfn(page)) &&
 		folio_is_pci_p2pdma(page_folio(page));
 }

But my broader question is: why are we calling a page-based API like
is_pci_p2pdma_page() on non-struct-page memory in the first place?
Could we instead add a helper to verify if the sg_page() return value
is actually backed by a struct page? If it isn't, we should arguably
skip the P2PDMA logic entirely and fall back to a dma_map_phys style
path. Isn't handling these "pageless" physical ranges the primary reason
dma_map_phys exists?

+mm list

Thanks,
Praan

[1] https://elixir.bootlin.com/linux/v6.19.3/source/include/linux/memremap.h#L179

> If any fix is needed, the is_pci_p2pdma_page() must be changed and not iommu.
>
> Thanks
>
> >
> > This causes a kernel paging fault when CONFIG_PCI_P2PDMA is enabled
> > and dma_map_sg_attrs() is called for memory regions that have no
> > associated struct page:
> >
> >   Unable to handle kernel paging request at virtual address fffffc007d100000
> >   ...
> >   Call trace:
> >     iommu_dma_map_sg+0x118/0x414
> >     dma_map_sg_attrs+0x38/0x44
> >
> > Fix this by adding a pfn_valid() check before calling
> > is_pci_p2pdma_page(). If the page frame number is invalid, skip the
> > P2PDMA check entirely as such memory cannot be P2PDMA memory anyway.
> >
> > Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
> > ---
> >  drivers/iommu/dma-iommu.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 5dac64be61bb..5f45f33b23c2 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -1423,6 +1423,9 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> >  		size_t s_length = s->length;
> >  		size_t pad_len = (mask - iova_len + 1) & mask;
> >
> > +		if (!pfn_valid(page_to_pfn(sg_page(s))))
> > +			goto post_pci_p2pdma;
> > +
> >  		switch (pci_p2pdma_state(&p2pdma_state, dev, sg_page(s))) {
> >  		case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
> >  			/*
> > @@ -1449,6 +1452,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> >  			goto out_restore_sg;
> >  		}
> >
> > +post_pci_p2pdma:
> >  		sg_dma_address(s) = s_iova_off;
> >  		sg_dma_len(s) = s_length;
> >  		s->offset -= s_iova_off;
> > --
> > 2.25.1
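Editorial sketch (not part of the thread): the ordering argument behind the proposed is_pci_p2pdma_page() change can be illustrated with a small userspace model. Everything here is hypothetical: the `toy_` names, the 8-entry `toy_memmap` array and the PFN-based lookup are assumptions made for illustration and are not kernel APIs. The point it tries to show is only that the validity test has to come before any page metadata is read, so a carveout PFN with no backing entry is rejected instead of being dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the pgmap type; not the kernel's enum memory_type. */
enum toy_pgmap_type { TOY_PGMAP_NONE, TOY_PGMAP_PCI_P2PDMA };

/* Toy stand-in for per-page metadata; not the kernel's struct page. */
struct toy_page {
	bool zone_device;
	enum toy_pgmap_type pgmap_type;
};

/* Only PFNs 0..7 have metadata in this model. */
#define TOY_NR_PAGES 8UL
static struct toy_page toy_memmap[TOY_NR_PAGES];

/* Models the role of pfn_valid(): does this PFN have metadata at all? */
static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn < TOY_NR_PAGES;
}

/*
 * Models the guarded check: the validity test runs first, so the array
 * is never indexed with a PFN that has no entry.
 */
static bool toy_is_pci_p2pdma_pfn(unsigned long pfn)
{
	const struct toy_page *page;

	if (!toy_pfn_valid(pfn))
		return false;

	page = &toy_memmap[pfn];
	return page->zone_device &&
	       page->pgmap_type == TOY_PGMAP_PCI_P2PDMA;
}
```

In this model a PFN outside the array plays the role of the bootloader carveout: without the early return, the lookup would read memory that was never set up as page metadata, which is the analogue of the paging fault reported in the patch description.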
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Ashish Mhetre @ 2026-02-25 4:49 UTC
To: Pranjal Shrivastava, Leon Romanovsky
Cc: robin.murphy, joro, will, iommu, linux-kernel, linux-tegra, linux-mm

On 2/25/2026 2:27 AM, Pranjal Shrivastava wrote:
> On Tue, Feb 24, 2026 at 02:32:21PM +0200, Leon Romanovsky wrote:
>> On Tue, Feb 24, 2026 at 10:42:57AM +0000, Ashish Mhetre wrote:
>>> When mapping scatter-gather entries that reference reserved
>>> memory regions without struct page backing (e.g., bootloader created
>>> carveouts), is_pci_p2pdma_page() dereferences the page pointer
>>> returned by sg_page() without first verifying its validity.
>> I believe this behavior started after commit 88df6ab2f34b
>> ("mm: add folio_is_pci_p2pdma()"). Prior to that change, the
>> is_zone_device_page(page) check would return false when given a
>> non-existent page pointer.
>>

Thanks Leon for the review. This crash started after commit 30280eee2db1
("iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg").

> Doesn't folio_is_pci_p2pdma() also check for zone device?
> I see[1] that it does:
>
> static inline bool folio_is_pci_p2pdma(const struct folio *folio)
> {
> 	return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
> 		folio_is_zone_device(folio) &&
> 		folio->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA;
> }
>
> I believe the problem arises due to the page_folio() call in
> folio_is_pci_p2pdma(page_folio(page)); within is_pci_p2pdma_page().
> page_folio() assumes it has a valid struct page to work with. For these
> carveouts, that isn't true.
>
> Potentially something like the following would stop the crash:
>
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index e3c2ccf872a8..e47876021afa 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -197,7 +197,8 @@ static inline void folio_set_zone_device_data(struct folio *folio, void *data)
>
>  static inline bool is_pci_p2pdma_page(const struct page *page)
>  {
> -	return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
> +	return IS_ENABLED(CONFIG_PCI_P2PDMA) && page &&
> +		pfn_valid(page_to_pfn(page)) &&
> 		folio_is_pci_p2pdma(page_folio(page));
>  }
>

Yes, this will also fix the crash.

> But my broader question is: why are we calling a page-based API like
> is_pci_p2pdma_page() on non-struct-page memory in the first place?
> Could we instead add a helper to verify if the sg_page() return value
> is actually backed by a struct page? If it isn't, we should arguably
> skip the P2PDMA logic entirely and fall back to a dma_map_phys style
> path. Isn't handling these "pageless" physical ranges the primary reason
> dma_map_phys exists?

Thanks for the feedback, Pranjal.

To clarify: are you suggesting we handle non-page-backed mappings inside
iommu_dma_map_sg (within dma-iommu), or that callers should detect
non-page-backed memory and use dma_map_phys instead of dma_map_sg?

Former approach sounds better so that existing iommu_dma_map_sg callers
don't need changes, but I'd like to confirm your preference.

> +mm list
>
> Thanks,
> Praan
>
> [1] https://elixir.bootlin.com/linux/v6.19.3/source/include/linux/memremap.h#L179
>
>> If any fix is needed, the is_pci_p2pdma_page() must be changed and not iommu.
>>
>> Thanks
[...]
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Leon Romanovsky @ 2026-02-25 7:56 UTC
To: Ashish Mhetre
Cc: Pranjal Shrivastava, robin.murphy, joro, will, iommu, linux-kernel, linux-tegra, linux-mm

On Wed, Feb 25, 2026 at 10:19:41AM +0530, Ashish Mhetre wrote:
[...]
> Thanks for the feedback, Pranjal.
>
> To clarify: are you suggesting we handle non-page-backed mappings inside
> iommu_dma_map_sg (within dma-iommu), or that callers should detect
> non-page-backed memory and use dma_map_phys instead of dma_map_sg?

The latter one.

> Former approach sounds better so that existing iommu_dma_map_sg callers
> don't need changes, but I'd like to confirm your preference.

The bug is in callers which used wrong API, they need to be adapted.

Thanks
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Pranjal Shrivastava @ 2026-02-25 20:11 UTC
To: Leon Romanovsky
Cc: Ashish Mhetre, robin.murphy, joro, will, iommu, linux-kernel, linux-tegra, linux-mm

On Wed, Feb 25, 2026 at 09:56:09AM +0200, Leon Romanovsky wrote:
> On Wed, Feb 25, 2026 at 10:19:41AM +0530, Ashish Mhetre wrote:
[...]
> > To clarify: are you suggesting we handle non-page-backed mappings inside
> > iommu_dma_map_sg (within dma-iommu), or that callers should detect
> > non-page-backed memory and use dma_map_phys instead of dma_map_sg?
>
> The latter one.
>

Yup, I meant the latter.

> > Former approach sounds better so that existing iommu_dma_map_sg callers
> > don't need changes, but I'd like to confirm your preference.
>
> The bug is in callers which used wrong API, they need to be adapted.

Yes, the thing is, if the caller already knows that the region to be
mapped is NOT struct page-backed, then why does it use dma_map_sg
variants?

Thanks
Praan
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Leon Romanovsky @ 2026-02-26 7:58 UTC
To: Pranjal Shrivastava
Cc: Ashish Mhetre, robin.murphy, joro, will, iommu, linux-kernel, linux-tegra, linux-mm

On Wed, Feb 25, 2026 at 08:11:29PM +0000, Pranjal Shrivastava wrote:
> On Wed, Feb 25, 2026 at 09:56:09AM +0200, Leon Romanovsky wrote:
[...]
> > The bug is in callers which used wrong API, they need to be adapted.
>
> Yes, the thing is, if the caller already knows that the region to be
> mapped is NOT struct page-backed, then why does it use dma_map_sg
> variants?

Before dma_map_phys() was added, there was no reliable way to DMA-map
such memory, and using dma_map_sg() was a workaround that happened to
work. I'm not sure whether it worked by design or by accident, but the
correct approach now is to use dma_map_phys().

Thanks

>
> Thanks
> Praan
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Ashish Mhetre @ 2026-02-27 5:46 UTC
To: Leon Romanovsky, Pranjal Shrivastava
Cc: robin.murphy, joro, will, iommu, linux-kernel, linux-tegra, linux-mm

On 2/26/2026 1:28 PM, Leon Romanovsky wrote:
> On Wed, Feb 25, 2026 at 08:11:29PM +0000, Pranjal Shrivastava wrote:
[...]
>> Yes, the thing is, if the caller already knows that the region to be
>> mapped is NOT struct page-backed, then why does it use dma_map_sg
>> variants?
> Before dma_map_phys() was added, there was no reliable way to DMA-map
> such memory, and using dma_map_sg() was a workaround that happened to
> work. I'm not sure whether it worked by design or by accident, but the
> correct approach now is to use dma_map_phys().

Thanks Leon and Pranjal for the detailed feedback. I'll update our
callers to use dma_map_phys() for non-page-backed buffers.

One question: would it make sense to add a check in iommu_dma_map_sg to
fail gracefully when non-page-backed buffers are passed, instead of
crashing the kernel?

Thanks,
Ashish Mhetre

> Thanks
>
>> Thanks
>> Praan
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Robin Murphy @ 2026-02-27 14:05 UTC
To: Ashish Mhetre, Leon Romanovsky, Pranjal Shrivastava
Cc: joro, will, iommu, linux-kernel, linux-tegra, linux-mm

On 2026-02-27 5:46 am, Ashish Mhetre wrote:
[...]
> Thanks Leon and Pranjal for the detailed feedback. I'll update our
> callers to use dma_map_phys() for non-page-backed buffers.
>
> One question: would it make sense to add a check in iommu_dma_map_sg to
> fail gracefully when non-page-backed buffers are passed, instead of
> crashing the kernel?

No, it is the responsibility of drivers not to abuse kernel APIs
inappropriately. Checking for misuse adds overhead that penalises
correct users. dma_map_page/sg on non-page-backed memory has never been
valid, and it would only have been system-configuration-dependent luck
that it wasn't already blowing up before.

I guess dma-debug could add additional checks on these APIs similarly to
debug_dma_map_single(), but the fact that we've never even considered
checking for made-up bogus struct page pointers only goes to show just
how wrong a thing to do it is.

Thanks,
Robin.
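Editorial sketch (not part of the thread): the trade-off Robin describes, keeping validation out of the normal mapping path while allowing an opt-in debug layer to catch misuse, can be modelled in userspace roughly as follows. All `toy_` names, the `toy_debug_enabled` switch and the error counter are hypothetical and only loosely inspired by the dma-debug idea mentioned above; this is not how dma-debug is implemented.

```c
#include <stdbool.h>
#include <stddef.h>

/* Only PFNs 0..15 are "page-backed" in this toy model. */
#define TOY_NR_PAGES 16UL

/* Toy scatterlist entry; not the kernel's struct scatterlist. */
struct toy_sg_entry {
	unsigned long pfn;
	size_t len;
};

/* Stands in for a debug-only build or boot option (assumption). */
static bool toy_debug_enabled;
static unsigned long toy_debug_errors;

static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn < TOY_NR_PAGES;
}

/* Debug hook: a no-op unless debugging is on, then it counts bad entries. */
static void toy_debug_map_sg(const struct toy_sg_entry *sg, size_t nents)
{
	size_t i;

	if (!toy_debug_enabled)
		return;

	for (i = 0; i < nents; i++)
		if (!toy_pfn_valid(sg[i].pfn))
			toy_debug_errors++;
}

/* Mapping path: does no validation of its own, only calls the hook. */
static size_t toy_map_sg(const struct toy_sg_entry *sg, size_t nents)
{
	toy_debug_map_sg(sg, nents);
	return nents;
}
```

With the switch off, a list containing a non-page-backed entry passes through unchecked, which mirrors the position that correct callers should not pay for validation; with it on, the same call records one error per bogus entry.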
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state 2026-02-27 5:46 ` Ashish Mhetre 2026-02-27 14:05 ` Robin Murphy @ 2026-02-27 14:08 ` Pranjal Shrivastava 2026-02-27 14:13 ` Jason Gunthorpe 1 sibling, 1 reply; 11+ messages in thread From: Pranjal Shrivastava @ 2026-02-27 14:08 UTC (permalink / raw) To: Ashish Mhetre Cc: Leon Romanovsky, robin.murphy, joro, will, iommu, linux-kernel, linux-tegra, linux-mm, jgg, jgg On Fri, Feb 27, 2026 at 11:16:02AM +0530, Ashish Mhetre wrote: > > > On 2/26/2026 1:28 PM, Leon Romanovsky wrote: > > External email: Use caution opening links or attachments > > > > > > On Wed, Feb 25, 2026 at 08:11:29PM +0000, Pranjal Shrivastava wrote: > > > On Wed, Feb 25, 2026 at 09:56:09AM +0200, Leon Romanovsky wrote: > > > > On Wed, Feb 25, 2026 at 10:19:41AM +0530, Ashish Mhetre wrote: > > > > > > > > > > On 2/25/2026 2:27 AM, Pranjal Shrivastava wrote: > > > > > > External email: Use caution opening links or attachments > > > > > > > > > > > > > > > > > > On Tue, Feb 24, 2026 at 02:32:21PM +0200, Leon Romanovsky wrote: > > > > > > > On Tue, Feb 24, 2026 at 10:42:57AM +0000, Ashish Mhetre wrote: > > > > > > > > When mapping scatter-gather entries that reference reserved > > > > > > > > memory regions without struct page backing (e.g., bootloader created > > > > > > > > carveouts), is_pci_p2pdma_page() dereferences the page pointer > > > > > > > > returned by sg_page() without first verifying its validity. > > > > > > > I believe this behavior started after commit 88df6ab2f34b > > > > > > > ("mm: add folio_is_pci_p2pdma()"). Prior to that change, the > > > > > > > is_zone_device_page(page) check would return false when given a > > > > > > > non‑existent page pointer. > > > > > > > > > > > > Thanks Leon for the review. This crash started after commit 30280eee2db1 > > > > > ("iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg"). > > > > > > > > > > > Doesn't folio_is_pci_p2pdma() also check for zone device? 
> > > > > > I see[1] that it does: > > > > > > > > > > > > static inline bool folio_is_pci_p2pdma(const struct folio *folio) > > > > > > { > > > > > > return IS_ENABLED(CONFIG_PCI_P2PDMA) && > > > > > > folio_is_zone_device(folio) && > > > > > > folio->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA; > > > > > > } > > > > > > > > > > > > I believe the problem arises due to the page_folio() call in > > > > > > folio_is_pci_p2pdma(page_folio(page)); within is_pci_p2pdma_page(). > > > > > > page_folio() assumes it has a valid struct page to work with. For these > > > > > > carveouts, that isn't true. > > > > > > > > > > > > Potentially something like the following would stop the crash: > > > > > > > > > > > > diff --git a/include/linux/memremap.h b/include/linux/memremap.h > > > > > > index e3c2ccf872a8..e47876021afa 100644 > > > > > > --- a/include/linux/memremap.h > > > > > > +++ b/include/linux/memremap.h > > > > > > @@ -197,7 +197,8 @@ static inline void folio_set_zone_device_data(struct folio *folio, void *data) > > > > > > > > > > > > static inline bool is_pci_p2pdma_page(const struct page *page) > > > > > > { > > > > > > - return IS_ENABLED(CONFIG_PCI_P2PDMA) && > > > > > > + return IS_ENABLED(CONFIG_PCI_P2PDMA) && page && > > > > > > + pfn_valid(page_to_pfn(page)) && > > > > > > folio_is_pci_p2pdma(page_folio(page)); > > > > > > } > > > > > > > > > > > Yes, this will also fix the crash. > > > > > > > > > > > But my broader question is: why are we calling a page-based API like > > > > > > is_pci_p2pdma_page() on non-struct-page memory in the first place? > > > > > > Could we instead add a helper to verify if the sg_page() return value > > > > > > is actually backed by a struct page? If it isn't, we should arguably > > > > > > skip the P2PDMA logic entirely and fall back to a dma_map_phys style > > > > > > path. Isn't handling these "pageless" physical ranges the primary reason > > > > > > dma_map_phys exists? > > > > > Thanks for the feedback, Pranjal. 
> > > > >
> > > > > To clarify: are you suggesting we handle non-page-backed mappings inside
> > > > > iommu_dma_map_sg (within dma-iommu), or that callers should detect
> > > > > non-page-backed memory and use dma_map_phys instead of dma_map_sg?
> > > > The latter one.
> > > >
> > > Yup, I meant the latter.
> > >
> > > > > Former approach sounds better so that existing iommu_dma_map_sg callers
> > > > > don't need changes, but I'd like to confirm your preference.
> > > > The bug is in callers which used wrong API, they need to be adapted.
> > > Yes, the thing is, if the caller already knows that the region to be
> > > mapped is NOT struct page-backed, then why does it use dma_map_sg
> > > variants?
> > Before dma_map_phys() was added, there was no reliable way to DMA‑map
> > such memory, and using dma_map_sg() was a workaround that happened to
> > work. I'm not sure whether it worked by design or by accident, but the
> > correct approach now is to use dma_map_phys().

Ack.

>
> Thanks Leon and Pranjal for the detailed feedback. I'll update our callers
> to use dma_map_phys() for non-page-backed buffers.
>
> One question: would it make sense to add a check in iommu_dma_map_sg to
> fail gracefully when non-page-backed buffers are passed, instead of crashing
> the kernel?

In my opinion, the answer is no, since this is almost like the "should the
kernel protect developers from themselves" debate. We should be a little
dramatic to make sure the developer doesn't call the wrong API.

Sure, we could return a DMA_MAPPING_ERROR or something, but a silent
DMA_MAPPING_ERROR can be ignored by a lazy driver, resulting in a much
harder-to-debug scenario than a straightforward crash.

The question is, are we sure to use scatterlists to represent non-paged
memory?

If no, then why are we even calling the dma_map_sg* API? struct scatterlist
has a field "page_link" [1] which is literally the struct page pointer with
a few low bits representing something else.

If yes, then we could maybe encode some information (similar to SG_CHAIN)
representing whether the sg is backed by a struct page. Then, in the
dma_map_sg* APIs, we could fall back to the dma_map_phys API if it isn't
struct page-backed. (This would be quite some rework and not limited to
the DMA API alone.)

But as Leon pointed out, the use of sg for non-paged memory started as a
"work-around" since there was no equivalent API to dma_map_phys earlier.
Since that's the status quo, I'm leaning towards no.

But I think this gives us a nice opportunity to discuss if we really *need*
to have scatterlists to represent non-paged memory. I remember a similar
discussion during the tcp_devmem reviews [2].

Adding Jason for his thoughts as well.

Thanks,
Praan

[1] https://elixir.bootlin.com/linux/v6.19.3/source/include/linux/scatterlist.h#L12
[2] https://lore.kernel.org/netdev/20241115015912.GA559636@ziepe.ca/
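
Editorial sketch: the remark that page_link is a struct page pointer with a few low bits reused can be illustrated with a small userspace model. This is not kernel code. Every model_* and MODEL_* name is hypothetical, and only the two flag values are taken from SG_CHAIN and SG_END in include/linux/scatterlist.h. The point it tries to show is that the field carries a pointer plus chain/end markers and nothing that says whether the pointer refers to a real struct page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits kept in the low bits of page_link. The values mirror SG_CHAIN
 * and SG_END from include/linux/scatterlist.h; the names are hypothetical. */
#define MODEL_SG_CHAIN          0x01UL
#define MODEL_SG_END            0x02UL
#define MODEL_SG_PAGE_LINK_MASK (MODEL_SG_CHAIN | MODEL_SG_END)

/* Stand-in for struct page. Its natural alignment keeps the two low
 * address bits clear, which is what makes the pointer tagging possible. */
struct model_page {
	unsigned long flags;
};

/* Stand-in for struct scatterlist, reduced to the fields used here. */
struct model_scatterlist {
	uintptr_t page_link;
	unsigned int offset;
	unsigned int length;
};

/* Store a page pointer while preserving any flag bits already set. */
static void model_sg_assign_page(struct model_scatterlist *sg,
				 struct model_page *page)
{
	uintptr_t keep = sg->page_link & MODEL_SG_PAGE_LINK_MASK;

	/* The pointer must not collide with the flag bits. */
	assert(((uintptr_t)page & MODEL_SG_PAGE_LINK_MASK) == 0);
	sg->page_link = keep | (uintptr_t)page;
}

/* Recover the pointer by masking off the flag bits. Nothing here can
 * tell whether the stored value ever pointed at a real page. */
static struct model_page *model_sg_page(const struct model_scatterlist *sg)
{
	return (struct model_page *)
		(sg->page_link & ~(uintptr_t)MODEL_SG_PAGE_LINK_MASK);
}

/* Mark this entry as the last one and clear any chain marker. */
static void model_sg_mark_end(struct model_scatterlist *sg)
{
	sg->page_link |= MODEL_SG_END;
	sg->page_link &= ~(uintptr_t)MODEL_SG_CHAIN;
}

static bool model_sg_is_last(const struct model_scatterlist *sg)
{
	return (sg->page_link & MODEL_SG_END) != 0;
}
```

In this model, an extra "not page-backed" marker would need either a third low bit (which constrains pointer alignment further) or a separate field, which is one way to read the comment that such a change would not be limited to the DMA API.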
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Jason Gunthorpe @ 2026-02-27 14:13 UTC
To: Pranjal Shrivastava
Cc: Ashish Mhetre, Leon Romanovsky, robin.murphy, joro, will, iommu,
    linux-kernel, linux-tegra, linux-mm

On Fri, Feb 27, 2026 at 02:08:42PM +0000, Pranjal Shrivastava wrote:

> The question is, are we sure to use scatterlists to represent non-paged
> memory?

It is absolutely illegal, and a driver bug, to put non-struct page memory
into a scatterlist. It was never an acceptable "work around".

What driver is doing this??

If you want to improve robustness, add some pfn_valid()/etc. checks under
CONFIG_DMA_API_DEBUG and throw warnings for these mistakes.

Jason
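
Editorial sketch: the suggestion to keep validity checks behind a debug option, so that correct callers pay nothing in normal builds, can be modelled in userspace roughly as follows. This is an illustration only and not the dma-debug implementation. MODEL_DMA_API_DEBUG, model_memmap, model_pfn_valid() and model_debug_check_sg() are all hypothetical names, and the PFN ranges are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a debug Kconfig switch; set to 0 to compile
 * the check out entirely. */
#define MODEL_DMA_API_DEBUG 1

/* Half-open PFN range [start, end). */
struct model_pfn_range {
	unsigned long start;
	unsigned long end;
};

/* Made-up memory map: only these PFNs have struct page backing. */
static const struct model_pfn_range model_memmap[] = {
	{ 0x80000,  0xc0000 },
	{ 0x100000, 0x180000 },
};

/* Userspace model of a pfn_valid()-style lookup. */
static bool model_pfn_valid(unsigned long pfn)
{
	size_t n = sizeof(model_memmap) / sizeof(model_memmap[0]);

	for (size_t i = 0; i < n; i++)
		if (pfn >= model_memmap[i].start && pfn < model_memmap[i].end)
			return true;
	return false;
}

/* Debug-only check: warn about each entry whose PFN has no backing and
 * return how many such entries were found. With the switch off this
 * does no work and always returns 0. */
static unsigned int model_debug_check_sg(const unsigned long *pfns,
					 size_t nents)
{
	assert(pfns != NULL || nents == 0);
#if MODEL_DMA_API_DEBUG
	unsigned int bad = 0;

	for (size_t i = 0; i < nents; i++) {
		if (!model_pfn_valid(pfns[i])) {
			fprintf(stderr,
				"model: sg entry %zu has no struct page (pfn 0x%lx)\n",
				i, pfns[i]);
			bad++;
		}
	}
	return bad;
#else
	(void)pfns;
	(void)nents;
	return 0;
#endif
}
```

The design choice being illustrated is that the check reports the caller's mistake loudly in debug builds instead of silently routing around it in the mapping path.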
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Leon Romanovsky @ 2026-02-25 7:50 UTC
To: Pranjal Shrivastava
Cc: Ashish Mhetre, robin.murphy, joro, will, iommu, linux-kernel,
    linux-tegra, linux-mm, Christoph Hellwig, Matthew Wilcox

On Tue, Feb 24, 2026 at 08:57:56PM +0000, Pranjal Shrivastava wrote:
> On Tue, Feb 24, 2026 at 02:32:21PM +0200, Leon Romanovsky wrote:
> > On Tue, Feb 24, 2026 at 10:42:57AM +0000, Ashish Mhetre wrote:
> > > When mapping scatter-gather entries that reference reserved
> > > memory regions without struct page backing (e.g., bootloader created
> > > carveouts), is_pci_p2pdma_page() dereferences the page pointer
> > > returned by sg_page() without first verifying its validity.
> >
> > I believe this behavior started after commit 88df6ab2f34b
> > ("mm: add folio_is_pci_p2pdma()"). Prior to that change, the
> > is_zone_device_page(page) check would return false when given a
> > non‑existent page pointer.
> >
>
> Doesn't folio_is_pci_p2pdma() also check for zone device?
> I see[1] that it does:
>
> static inline bool folio_is_pci_p2pdma(const struct folio *folio)
> {
>         return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
>                 folio_is_zone_device(folio) &&
>                 folio->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA;
> }
>
> I believe the problem arises due to the page_folio() call in
> folio_is_pci_p2pdma(page_folio(page)); within is_pci_p2pdma_page().
> page_folio() assumes it has a valid struct page to work with. For these
> carveouts, that isn't true.

Yes, I came to the same conclusion; I was just explaining why it worked
before.

>
> Potentially something like the following would stop the crash:
>
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index e3c2ccf872a8..e47876021afa 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -197,7 +197,8 @@ static inline void folio_set_zone_device_data(struct folio *folio, void *data)
>
>  static inline bool is_pci_p2pdma_page(const struct page *page)
>  {
> -       return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
> +       return IS_ENABLED(CONFIG_PCI_P2PDMA) && page &&
> +               pfn_valid(page_to_pfn(page)) &&

pfn_valid() is a relatively expensive function [1] to invoke in the data path,
and is_pci_p2pdma_page() ends up being called in these execution flows.

[1] https://elixir.bootlin.com/linux/v6.19.3/source/include/linux/mmzone.h#L2167

>                 folio_is_pci_p2pdma(page_folio(page));
>  }
>
>
> But my broader question is: why are we calling a page-based API like
> is_pci_p2pdma_page() on non-struct-page memory in the first place?

+1

> Could we instead add a helper to verify if the sg_page() return value
> is actually backed by a struct page?

According to the SG design, callers should store only struct page pointers.
There is one known user that violates this requirement: dmabuf, which is
gradually being migrated away from this behavior [2].

[2] https://lore.kernel.org/all/0-v1-b5cab63049c0+191af-dmabuf_map_type_jgg@nvidia.com/

> If it isn't, we should arguably skip the P2PDMA logic entirely and fall
> back to a dma_map_phys style path. Isn't handling these "pageless" physical
> ranges the primary reason dma_map_phys exists?

Right. dma_map_sg() is indeed the wrong API to use for memory that is not
backed by struct page pointers.

Thanks

>
> +mm list
>
> Thanks,
> Praan
>
> [1] https://elixir.bootlin.com/linux/v6.19.3/source/include/linux/memremap.h#L179
>
>
> > If any fix is needed, the is_pci_p2pdma_page() must be changed and not iommu.
> > > > Thanks > > > > > > > > This causes a kernel paging fault when CONFIG_PCI_P2PDMA is enabled > > > and dma_map_sg_attrs() is called for memory regions that have no > > > associated struct page: > > > > > > Unable to handle kernel paging request at virtual address fffffc007d100000 > > > ... > > > Call trace: > > > iommu_dma_map_sg+0x118/0x414 > > > dma_map_sg_attrs+0x38/0x44 > > > > > > Fix this by adding a pfn_valid() check before calling > > > is_pci_p2pdma_page(). If the page frame number is invalid, skip the > > > P2PDMA check entirely as such memory cannot be P2PDMA memory anyway. > > > > > > Signed-off-by: Ashish Mhetre <amhetre@nvidia.com> > > > --- > > > drivers/iommu/dma-iommu.c | 4 ++++ > > > 1 file changed, 4 insertions(+) > > > > > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > > > index 5dac64be61bb..5f45f33b23c2 100644 > > > --- a/drivers/iommu/dma-iommu.c > > > +++ b/drivers/iommu/dma-iommu.c > > > @@ -1423,6 +1423,9 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, > > > size_t s_length = s->length; > > > size_t pad_len = (mask - iova_len + 1) & mask; > > > > > > + if (!pfn_valid(page_to_pfn(sg_page(s)))) > > > + goto post_pci_p2pdma; > > > + > > > switch (pci_p2pdma_state(&p2pdma_state, dev, sg_page(s))) { > > > case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: > > > /* > > > @@ -1449,6 +1452,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, > > > goto out_restore_sg; > > > } > > > > > > +post_pci_p2pdma: > > > sg_dma_address(s) = s_iova_off; > > > sg_dma_len(s) = s_length; > > > s->offset -= s_iova_off; > > > -- > > > 2.25.1 > > > > > > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state 2026-02-25 7:50 ` Leon Romanovsky @ 2026-02-25 20:15 ` Pranjal Shrivastava 0 siblings, 0 replies; 11+ messages in thread From: Pranjal Shrivastava @ 2026-02-25 20:15 UTC (permalink / raw) To: Leon Romanovsky Cc: Ashish Mhetre, robin.murphy, joro, will, iommu, linux-kernel, linux-tegra, linux-mm, Christoph Hellwig, Matthew Wilcox On Wed, Feb 25, 2026 at 09:50:00AM +0200, Leon Romanovsky wrote: > On Tue, Feb 24, 2026 at 08:57:56PM +0000, Pranjal Shrivastava wrote: > > On Tue, Feb 24, 2026 at 02:32:21PM +0200, Leon Romanovsky wrote: > > > On Tue, Feb 24, 2026 at 10:42:57AM +0000, Ashish Mhetre wrote: > > > > When mapping scatter-gather entries that reference reserved > > > > memory regions without struct page backing (e.g., bootloader created > > > > carveouts), is_pci_p2pdma_page() dereferences the page pointer > > > > returned by sg_page() without first verifying its validity. > > > > > > I believe this behavior started after commit 88df6ab2f34b > > > ("mm: add folio_is_pci_p2pdma()"). Prior to that change, the > > > is_zone_device_page(page) check would return false when given a > > > non‑existent page pointer. > > > > > > > Doesn't folio_is_pci_p2pdma() also check for zone device? > > I see[1] that it does: > > > > static inline bool folio_is_pci_p2pdma(const struct folio *folio) > > { > > return IS_ENABLED(CONFIG_PCI_P2PDMA) && > > folio_is_zone_device(folio) && > > folio->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA; > > } > > > > I believe the problem arises due to the page_folio() call in > > folio_is_pci_p2pdma(page_folio(page)); within is_pci_p2pdma_page(). > > page_folio() assumes it has a valid struct page to work with. For these > > carveouts, that isn't true. > > Yes, i came to the same conclusion, just explained why it worked before. > Ack. 

> >
> > Potentially something like the following would stop the crash:
> >
> > diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> > index e3c2ccf872a8..e47876021afa 100644
> > --- a/include/linux/memremap.h
> > +++ b/include/linux/memremap.h
> > @@ -197,7 +197,8 @@ static inline void folio_set_zone_device_data(struct folio *folio, void *data)
> >
> >  static inline bool is_pci_p2pdma_page(const struct page *page)
> >  {
> > -       return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
> > +       return IS_ENABLED(CONFIG_PCI_P2PDMA) && page &&
> > +               pfn_valid(page_to_pfn(page)) &&
>
> pfn_valid() is a relatively expensive function [1] to invoke in the data path,
> and is_pci_p2pdma_page() ends up being called in these execution flows.
>

Right, that makes sense. Ideally, the check shouldn't be in either place
(iommu_dma_map_sg() or is_pci_p2pdma_page()).

> [1] https://elixir.bootlin.com/linux/v6.19.3/source/include/linux/mmzone.h#L2167
>
> >                 folio_is_pci_p2pdma(page_folio(page));
> >  }
> >
> >
> > But my broader question is: why are we calling a page-based API like
> > is_pci_p2pdma_page() on non-struct-page memory in the first place?
>
> +1
>
> > Could we instead add a helper to verify if the sg_page() return value
> > is actually backed by a struct page?
>
> According to the SG design, callers should store only struct page pointers.
> There is one known user that violates this requirement: dmabuf, which is
> gradually being migrated away from this behavior [2].
>
> [2] https://lore.kernel.org/all/0-v1-b5cab63049c0+191af-dmabuf_map_type_jgg@nvidia.com/
>
> > If it isn't, we should arguably skip the P2PDMA logic entirely and fall
> > back to a dma_map_phys style path. Isn't handling these "pageless" physical
> > ranges the primary reason dma_map_phys exists?
>
> Right. dma_map_sg() is indeed the wrong API to use for memory that is not
> backed by struct page pointers.
>
> Thanks
>

[--->8---]

Thanks,
Praan
end of thread, other threads:[~2026-02-27 14:13 UTC | newest]
Thread overview: 11+ messages
[not found] <20260224104257.1641429-1-amhetre@nvidia.com>
[not found] ` <20260224123221.GM10607@unreal>
2026-02-24 20:57 ` [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state Pranjal Shrivastava
2026-02-25 4:49 ` Ashish Mhetre
2026-02-25 7:56 ` Leon Romanovsky
2026-02-25 20:11 ` Pranjal Shrivastava
2026-02-26 7:58 ` Leon Romanovsky
2026-02-27 5:46 ` Ashish Mhetre
2026-02-27 14:05 ` Robin Murphy
2026-02-27 14:08 ` Pranjal Shrivastava
2026-02-27 14:13 ` Jason Gunthorpe
2026-02-25 7:50 ` Leon Romanovsky
2026-02-25 20:15 ` Pranjal Shrivastava