* [bug report] bad error return in walk_hugetlb_range()
@ 2025-10-04 6:22 Dan Carpenter
2025-10-07 10:13 ` David Hildenbrand
0 siblings, 1 reply; 3+ messages in thread
From: Dan Carpenter @ 2025-10-04 6:22 UTC (permalink / raw)
To: intel-xe, linux-mm
This is really old code. I think it's a bug in hugetlb.
drivers/gpu/drm/xe/xe_gt_pagefault.c:353 pf_queue_work_func()
warn: passing positive error code 's32min-(-12),(-10)-(-1),1' to 'ERR_PTR'
mm/pagewalk.c

static int walk_hugetlb_range(unsigned long addr, unsigned long end,
			      struct mm_walk *walk)
{
	struct vm_area_struct *vma = walk->vma;
	struct hstate *h = hstate_vma(vma);
	unsigned long next;
	unsigned long hmask = huge_page_mask(h);
	unsigned long sz = huge_page_size(h);
	pte_t *pte;
	const struct mm_walk_ops *ops = walk->ops;
	int err = 0;

	hugetlb_vma_lock_read(vma);
	do {
		next = hugetlb_entry_end(h, addr, end);
		pte = hugetlb_walk(vma, addr & hmask, sz);
		if (pte)
			err = ops->hugetlb_entry(pte, hmask, addr, next, walk);

The ->hugetlb_entry() callback is implemented by two functions that return
true/false instead of error codes. Smatch thinks this value of 1 gets
propagated back to pf_queue_work_func() and results in an Oops.

The two problem functions are hwpoison_hugetlb_range() and
pagemap_hugetlb_range(), which returns PM_END_OF_BUFFER from
add_to_pagemap().

		else if (ops->pte_hole)
			err = ops->pte_hole(addr, next, -1, walk);
		if (err)
			break;
	} while (addr = next, addr != end);
	hugetlb_vma_unlock_read(vma);

	return err;
}
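For context on why Smatch flags a positive value here, the following is a minimal userspace sketch of the kernel's error-pointer convention. It is modeled on the helpers in include/linux/err.h but is not the kernel source, and caller_would_dereference() is a hypothetical helper added only for illustration. It shows that a pointer built from a positive value such as 1 does not satisfy IS_ERR(), so a caller would treat it as a valid pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of the kernel's error-pointer helpers; a sketch only. */
#define MAX_ERRNO 4095

/* Encode a (normally negative) errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* Decode the value stored in an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* Only the top MAX_ERRNO addresses count as encoded errors. */
static inline bool IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Hypothetical caller logic: anything that is neither NULL nor IS_ERR()
 * is assumed to be a real pointer and would be dereferenced.
 */
static bool caller_would_dereference(const void *p)
{
	return p != NULL && !IS_ERR(p);
}
```

With these definitions, IS_ERR(ERR_PTR(-12)) is true, while IS_ERR(ERR_PTR(1)) is false and caller_would_dereference(ERR_PTR(1)) is true, which is the situation the warning describes.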
regards,
dan carpenter
* Re: [bug report] bad error return in walk_hugetlb_range()
From: David Hildenbrand @ 2025-10-07 10:13 UTC (permalink / raw)
To: Dan Carpenter, intel-xe, linux-mm
On 04.10.25 08:22, Dan Carpenter wrote:
> This is really old code. I think it's a bug in hugetlb.
>
> drivers/gpu/drm/xe/xe_gt_pagefault.c:353 pf_queue_work_func()
> warn: passing positive error code 's32min-(-12),(-10)-(-1),1' to 'ERR_PTR'
>
> mm/pagewalk.c
>
> static int walk_hugetlb_range(unsigned long addr, unsigned long end,
> 			      struct mm_walk *walk)
> {
> 	struct vm_area_struct *vma = walk->vma;
> 	struct hstate *h = hstate_vma(vma);
> 	unsigned long next;
> 	unsigned long hmask = huge_page_mask(h);
> 	unsigned long sz = huge_page_size(h);
> 	pte_t *pte;
> 	const struct mm_walk_ops *ops = walk->ops;
> 	int err = 0;
>
> 	hugetlb_vma_lock_read(vma);
> 	do {
> 		next = hugetlb_entry_end(h, addr, end);
> 		pte = hugetlb_walk(vma, addr & hmask, sz);
> 		if (pte)
> 			err = ops->hugetlb_entry(pte, hmask, addr, next, walk);
>
> The ->hugetlb_entry() is implemented by two functions which return
> true/false instead of error codes. Smatch thinks this 1 value gets
> propagated back to pf_queue_work_func() and results in an Oops.
>
> The two problem functions are hwpoison_hugetlb_range() and
> pagemap_hugetlb_range() which returns PM_END_OF_BUFFER from
> add_to_pagemap().
hwpoison_hugetlb_range() seems to behave just like hwpoison_pte_range(),
returning "1" if check_hwpoisoned_entry() returned "1" -- if we found
the entry with the problematic PFN and can just abort.

Staring at kill_accessing_process() that ends up calling these
walk-functions, that seems to be correct. The value is converted to
0/-EHWPOISON, all good.

pagemap_hugetlb_range() can indeed return either 0 or PM_END_OF_BUFFER
obtained from add_to_pagemap().

But that's the same behavior as pagemap_pte_hole()/pagemap_pmd_range(),
so that's nothing hugetlb-specific.

pagemap_read() does the

	ret = walk_page_range(mm, start_vaddr, end, &pagemap_ops, &pm);

After the loop, it does

	if (!ret || ret == PM_END_OF_BUFFER)
		ret = copied;

So I don't immediately see anything wrong with that?
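The convention described above can be sketched as a toy model in plain C. Everything prefixed with toy_ or TOY_ is hypothetical and only mimics the shape of the kernel code; the sentinel value 1 and the errno value 133 are assumptions made for this sketch. The walker passes any nonzero callback return straight up, and each caller converts the positive value before returning it further.

```c
#include <stddef.h>

#define TOY_PM_END_OF_BUFFER 1	/* assumed sentinel value for this sketch */
#define TOY_EHWPOISON 133	/* stand-in errno value for this sketch */

typedef int (*toy_entry_fn)(unsigned long entry, void *priv);

/* Toy walker: stop at the first nonzero callback return and pass it up. */
static int toy_walk_range(const unsigned long *entries, size_t n,
			  toy_entry_fn fn, void *priv)
{
	int err = 0;

	for (size_t i = 0; i < n; i++) {
		err = fn(entries[i], priv);
		if (err)
			break;
	}
	return err;
}

struct toy_pagemap {
	size_t pos;	/* entries recorded so far */
	size_t len;	/* capacity of the output buffer */
};

/* Record one entry; return the positive sentinel once the buffer is full. */
static int toy_pagemap_entry(unsigned long entry, void *priv)
{
	struct toy_pagemap *pm = priv;

	(void)entry;
	pm->pos++;
	if (pm->pos >= pm->len)
		return TOY_PM_END_OF_BUFFER;
	return 0;
}

/* Caller in the style of pagemap_read(): the sentinel is not an error. */
static long toy_pagemap_read(const unsigned long *entries, size_t n,
			     size_t buf_len)
{
	struct toy_pagemap pm = { .pos = 0, .len = buf_len };
	int ret = toy_walk_range(entries, n, toy_pagemap_entry, &pm);

	if (!ret || ret == TOY_PM_END_OF_BUFFER)
		return (long)pm.pos;
	return ret;
}

struct toy_hwpoison {
	unsigned long bad_pfn;
};

/* Return 1 when the problematic entry is found, so the walk can abort. */
static int toy_hwpoison_entry(unsigned long entry, void *priv)
{
	const struct toy_hwpoison *hp = priv;

	return entry == hp->bad_pfn;
}

/* Caller in the style of kill_accessing_process(): 1 becomes a negative errno. */
static int toy_kill_accessing_process(const unsigned long *entries, size_t n,
				      unsigned long bad_pfn)
{
	struct toy_hwpoison hp = { .bad_pfn = bad_pfn };
	int ret = toy_walk_range(entries, n, toy_hwpoison_entry, &hp);

	return ret > 0 ? -TOY_EHWPOISON : 0;
}
```

In both toy callers the positive value never leaves the function: toy_pagemap_read() turns it into a count, and toy_kill_accessing_process() turns it into 0 or a negative errno.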
--
Cheers
David / dhildenb
* Re: [bug report] bad error return in walk_hugetlb_range()
From: Dan Carpenter @ 2025-10-07 11:41 UTC (permalink / raw)
To: David Hildenbrand; +Cc: intel-xe, linux-mm
On Tue, Oct 07, 2025 at 12:13:40PM +0200, David Hildenbrand wrote:
> On 04.10.25 08:22, Dan Carpenter wrote:
> > This is really old code. I think it's a bug in hugetlb.
> >
> > drivers/gpu/drm/xe/xe_gt_pagefault.c:353 pf_queue_work_func()
> > warn: passing positive error code 's32min-(-12),(-10)-(-1),1' to 'ERR_PTR'
> >
Thanks, David. Yeah. You're right. My apologies. I tracked down the
confusion and this warning is actually because Smatch thinks that
hmm_range_fault() propagates the positive returns from walk_page_range().
But actually walk_page_range() only returns positive with certain flags.
Someone explained this to me in June and I said I would silence the
warning but I forgot... Ugh... Sorry. :(
https://lore.kernel.org/all/aECCaCP3BGGGUUa0@stanley.mountain/
I have done it now, below.
regards,
dan carpenter
From fb706e39230f6f2bc6d68a18837171ea4c1fecc6 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter@linaro.org>
Date: Tue, 7 Oct 2025 14:37:51 +0300
Subject: [PATCH] db/kernel.delete_returns: hmm_range_fault() can't return 1
This is pretty tricky code to read. It doesn't return 1. This leads to
error pointer warnings.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
---
smatch_data/db/kernel.delete.return_states | 1 +
1 file changed, 1 insertion(+)
diff --git a/smatch_data/db/kernel.delete.return_states b/smatch_data/db/kernel.delete.return_states
index a1b3553a9f03..cfdf252e472c 100644
--- a/smatch_data/db/kernel.delete.return_states
+++ b/smatch_data/db/kernel.delete.return_states
@@ -30,3 +30,4 @@ ubi_find_or_add_av 0
xe_migrate_copy 0
scmi_get_or_create_handler 0
alloc_frame_masks 0
+hmm_range_fault 1
--
2.51.0