[bug report] mm: avoid leaving partial pfn mappings around in error case

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [bug report] mm: avoid leaving partial pfn mappings around in error case
@ 2024-09-15 10:08 Dan Carpenter
  2024-09-15 10:23 ` Linus Torvalds
  2024-09-15 12:01 ` Lorenzo Stoakes
  0 siblings, 2 replies; 10+ messages in thread
From: Dan Carpenter @ 2024-09-15 10:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm

Hi Linus,

Commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in
error case") from Sep 11, 2024 (linux-next), leads to the following
Smatch static checker warning:

	mm/memory.c:2709 remap_pfn_range_notrack()
	warn: sleeping in atomic context

mm/memory.c
    2696 int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
    2697                 unsigned long pfn, unsigned long size, pgprot_t prot)
    2698 {
    2699         int error = remap_pfn_range_internal(vma, addr, pfn, size, prot);
    2700 
    2701         if (!error)
    2702                 return 0;
    2703 
    2704         /*
    2705          * A partial pfn range mapping is dangerous: it does not
    2706          * maintain page reference counts, and callers may free
    2707          * pages due to the error. So zap it early.
    2708          */
--> 2709         zap_page_range_single(vma, addr, size, NULL);

The lru_add_drain() function at the start of zap_page_range_single() takes a
mutext.

    2710         return error;
    2711 }

It's the preempt_disable() in gru_fault() which is the issue.  The call tree
is:

gru_fault() <- disables preempt
-> remap_pfn_range()
   -> remap_pfn_range_notrack()

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 10:08 [bug report] mm: avoid leaving partial pfn mappings around in error case Dan Carpenter
@ 2024-09-15 10:23 ` Linus Torvalds
  2024-09-15 12:05   ` Dan Carpenter
  2024-09-15 12:01 ` Lorenzo Stoakes
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2024-09-15 10:23 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: linux-mm

On Sun, 15 Sept 2024 at 12:08, Dan Carpenter <dan.carpenter@linaro.org> wrote:
>
> The lru_add_drain() function at the start of zap_page_range_single() takes a
> mutex.

Yes, that shouldn't be problematic. But:

> It's the preempt_disable() in gru_fault() which is the issue.  The call tree
> is:
>
> gru_fault() <- disables preempt
> -> remap_pfn_range()
>    -> remap_pfn_range_notrack()

That code is very odd. It was invalid to call remap_pfn_range() with
preemption disabled even before, because it will allocate the page
tables that it fills in.

But presumably *that* never happened in practice, and so nobody
noticed how broken that code was before.

Now smatch seems to see a new problem, but I *think* it's because
smatch didn't notice the sleeping by p4d_alloc() / pud_alloc() /
pmd_alloc() because those allocations are all conditional (so smatch
doesn't see them as static violations).

Put another way: I do not believe this is a new issue, but perhaps a
"new to smatch" issue?

I do get the feeling that even despite the pmd_alloc() having that
conditional, smatch should have seen how it goes to __pmd_alloc() and
then to pmd_alloc_one() and pagetable_alloc_noprof() ->
alloc_pages_noprof().

Have you maybe disabled those warnings from before? Or is there some
other reason why smatch just considers that to be ok with preemption
disabled?

              Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 10:08 [bug report] mm: avoid leaving partial pfn mappings around in error case Dan Carpenter
  2024-09-15 10:23 ` Linus Torvalds
@ 2024-09-15 12:01 ` Lorenzo Stoakes
  2024-09-15 12:09   ` Dan Carpenter
  1 sibling, 1 reply; 10+ messages in thread
From: Lorenzo Stoakes @ 2024-09-15 12:01 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Linus Torvalds, linux-mm

On Sun, Sep 15, 2024 at 01:08:27PM GMT, Dan Carpenter wrote:
> Hi Linus,
>
> Commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in
> error case") from Sep 11, 2024 (linux-next), leads to the following
> Smatch static checker warning:
>
> 	mm/memory.c:2709 remap_pfn_range_notrack()
> 	warn: sleeping in atomic context
>
> mm/memory.c
>     2696 int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
>     2697                 unsigned long pfn, unsigned long size, pgprot_t prot)
>     2698 {
>     2699         int error = remap_pfn_range_internal(vma, addr, pfn, size, prot);
>     2700
>     2701         if (!error)
>     2702                 return 0;
>     2703
>     2704         /*
>     2705          * A partial pfn range mapping is dangerous: it does not
>     2706          * maintain page reference counts, and callers may free
>     2707          * pages due to the error. So zap it early.
>     2708          */
> --> 2709         zap_page_range_single(vma, addr, size, NULL);
>
> The lru_add_drain() function at the start of zap_page_range_single() takes a
> mutext.

Hm does it? I see a local lock, and some folio batch locking which are
local locks too?

Unless this is hugetlb, I see:

-> hugetlb_zap_begin()
   -> __hugetlb_zap_begin()
      -> hugetlb_vma_lock_write()
         -> down_write()
            -> might_sleep()

(Also __hugetlb_zap_begin() -> i_mmap_lock_write() -> down_write())

I see only spin locks in the page table allocation paths (unless I'm
missing something).

I may be missing something, however!

>
>     2710         return error;
>     2711 }
>
> It's the preempt_disable() in gru_fault() which is the issue.  The call tree
> is:
>
> gru_fault() <- disables preempt
> -> remap_pfn_range()
>    -> remap_pfn_range_notrack()
>
> regards,
> dan carpenter
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 10:23 ` Linus Torvalds
@ 2024-09-15 12:05   ` Dan Carpenter
  2024-09-15 13:14     ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Carpenter @ 2024-09-15 12:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm

On Sun, Sep 15, 2024 at 12:23:31PM +0200, Linus Torvalds wrote:
> On Sun, 15 Sept 2024 at 12:08, Dan Carpenter <dan.carpenter@linaro.org> wrote:
> >
> > The lru_add_drain() function at the start of zap_page_range_single() takes a
> > mutex.
> 
> Yes, that shouldn't be problematic. But:
> 
> > It's the preempt_disable() in gru_fault() which is the issue.  The call tree
> > is:
> >
> > gru_fault() <- disables preempt
> > -> remap_pfn_range()
> >    -> remap_pfn_range_notrack()
> 
> That code is very odd. It was invalid to call remap_pfn_range() with
> preemption disabled even before, because it will allocate the page
> tables that it fills in.
> 
> But presumably *that* never happened in practice, and so nobody
> noticed how broken that code was before.
> 
> Now smatch seems to see a new problem, but I *think* it's because
> smatch didn't notice the sleeping by p4d_alloc() / pud_alloc() /
> pmd_alloc() because those allocations are all conditional (so smatch
> doesn't see them as static violations).
> 
> Put another way: I do not believe this is a new issue, but perhaps a
> "new to smatch" issue?
> 

Yep.  You're right.  Smatch doesn't count allocations as sleeping when we pass a
variable to for the gfp flags and those functions do "get_zeroed_page(gfp)".
I've been intending for years to handle bitmasks better but I've never
implemented that code.

regards,
dan carpenter



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 12:01 ` Lorenzo Stoakes
@ 2024-09-15 12:09   ` Dan Carpenter
  2024-09-15 12:38     ` Lorenzo Stoakes
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Carpenter @ 2024-09-15 12:09 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: Linus Torvalds, linux-mm

On Sun, Sep 15, 2024 at 01:01:43PM +0100, Lorenzo Stoakes wrote:
> On Sun, Sep 15, 2024 at 01:08:27PM GMT, Dan Carpenter wrote:
> > Hi Linus,
> >
> > Commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in
> > error case") from Sep 11, 2024 (linux-next), leads to the following
> > Smatch static checker warning:
> >
> > 	mm/memory.c:2709 remap_pfn_range_notrack()
> > 	warn: sleeping in atomic context
> >
> > mm/memory.c
> >     2696 int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
> >     2697                 unsigned long pfn, unsigned long size, pgprot_t prot)
> >     2698 {
> >     2699         int error = remap_pfn_range_internal(vma, addr, pfn, size, prot);
> >     2700
> >     2701         if (!error)
> >     2702                 return 0;
> >     2703
> >     2704         /*
> >     2705          * A partial pfn range mapping is dangerous: it does not
> >     2706          * maintain page reference counts, and callers may free
> >     2707          * pages due to the error. So zap it early.
> >     2708          */
> > --> 2709         zap_page_range_single(vma, addr, size, NULL);
> >
> > The lru_add_drain() function at the start of zap_page_range_single() takes a
> > mutext.
> 
> Hm does it? I see a local lock, and some folio batch locking which are
> local locks too?

Ah...  No it doesn't.  It's the mmu_notifier_invalidate_range_start() which is
a might_sleep() function.  Sorry for the confusion.

regards,
dan carpenter



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 12:09   ` Dan Carpenter
@ 2024-09-15 12:38     ` Lorenzo Stoakes
  2024-09-15 13:14       ` Dan Carpenter
  0 siblings, 1 reply; 10+ messages in thread
From: Lorenzo Stoakes @ 2024-09-15 12:38 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Linus Torvalds, linux-mm, Dimitri Sivanich, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel

+ get_maintainers.pl people for drivers/misc/sgi-gru/grumain.c

On Sun, Sep 15, 2024 at 03:09:35PM GMT, Dan Carpenter wrote:
> On Sun, Sep 15, 2024 at 01:01:43PM +0100, Lorenzo Stoakes wrote:
> > On Sun, Sep 15, 2024 at 01:08:27PM GMT, Dan Carpenter wrote:
> > > Hi Linus,
> > >
> > > Commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in
> > > error case") from Sep 11, 2024 (linux-next), leads to the following
> > > Smatch static checker warning:
> > >
> > > 	mm/memory.c:2709 remap_pfn_range_notrack()
> > > 	warn: sleeping in atomic context
> > >
> > > mm/memory.c
> > >     2696 int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
> > >     2697                 unsigned long pfn, unsigned long size, pgprot_t prot)
> > >     2698 {
> > >     2699         int error = remap_pfn_range_internal(vma, addr, pfn, size, prot);
> > >     2700
> > >     2701         if (!error)
> > >     2702                 return 0;
> > >     2703
> > >     2704         /*
> > >     2705          * A partial pfn range mapping is dangerous: it does not
> > >     2706          * maintain page reference counts, and callers may free
> > >     2707          * pages due to the error. So zap it early.
> > >     2708          */
> > > --> 2709         zap_page_range_single(vma, addr, size, NULL);
> > >
> > > The lru_add_drain() function at the start of zap_page_range_single() takes a
> > > mutext.
> >
> > Hm does it? I see a local lock, and some folio batch locking which are
> > local locks too?
>
> Ah...  No it doesn't.  It's the mmu_notifier_invalidate_range_start() which is
> a might_sleep() function.  Sorry for the confusion.

OK so in conclusion it seems to be that Linus's commit introducing
zap_page_range_single() accidentally had smatch hit a might_sleep() via
mmu_notifier_invalidate_range_start(), but it should, in theory, have fired
due to page table allocations invoking the page allocator that might sleep,
but didn't, because smatch misses the below might_alloc() path...

-> prepare_alloc_pages()
   -> might_alloc()
      -> might_sleep_if(gfpflags_allow_blocking(gfp_mask))

...as a result of get_zeroed_page() tripping it up *breathes*. :)

(please correct me if I am wrong here).

The preempt_disable() is introduced in commit fe5bb6b00c3a9 ("sgi-gru: misc
GRU cleanup") from... 2009, but it fixed it from the far far more broken
'disable preemption before taking a mutex' situation that existed before.

So fix seems to me to not invoke remap_pfn_range() with preemption disabled
and a mutex held? gru_fault() maintainers added for input...

>
> regards,
> dan carpenter
>
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 12:05   ` Dan Carpenter
@ 2024-09-15 13:14     ` Linus Torvalds
  2024-09-18 21:08       ` Dimitri Sivanich
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2024-09-15 13:14 UTC (permalink / raw)
  To: Dan Carpenter, Dimitri Sivanich, Arnd Bergmann; +Cc: linux-mm

On Sun, 15 Sept 2024 at 14:05, Dan Carpenter <dan.carpenter@linaro.org> wrote:
>
> Yep.  You're right.  Smatch doesn't count allocations as sleeping when we pass a
> variable to for the gfp flags and those functions do "get_zeroed_page(gfp)".
> I've been intending for years to handle bitmasks better but I've never
> implemented that code.

That explains it. In this case, the 'variable' gfp values are
basically identical: GFP_PGTABLE_KERNEL and GFP_PGTABLE_USER depending
on which mm they get allocated to, and both are just variations of
GFP_KERNEL (with __GFP_ZERO and a __GFP_ACCOUNT for the user case), so
the exact details of th egfp flags are conditional, but they are
unconditionally sleeping allocations.

But yeah, if smatch doesn't follow that kind of conditional gfp flags,
it explains why smatch thought the odd gru_fault() code used to be ok,
even though it clearly isn't.

I'm not sure why gru_fault does that preempt-disable at all, since the
real locking seems to be that &gts->ts_ctxlock mutex, but it has done
it since its initial commit.

And it's been very much wrong since that initial commit, but I suspect
it never triggers in real life because the page tables probably are
already set up.

Probably more importantly, it doesn't trigger in real life because SGI
UV machines with that GRU hardware are probably hard to find.

Anyway, maybe somebody can explain why gru_fault() has that
preempt_disable() in it, but it is very wrong. It always has been, and
the recent change just made smatch notice a new case.

I don't actually see any reason for that preemption disable, and I
suspect it could just be removed. But maybe somebody else who knows
that odd driver can pipe up.. Adding Dimirti and Arnd just in case.

           Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 12:38     ` Lorenzo Stoakes
@ 2024-09-15 13:14       ` Dan Carpenter
  2024-09-15 13:26         ` Lorenzo Stoakes
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Carpenter @ 2024-09-15 13:14 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Linus Torvalds, linux-mm, Dimitri Sivanich, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel

On Sun, Sep 15, 2024 at 01:38:40PM +0100, Lorenzo Stoakes wrote:
> + get_maintainers.pl people for drivers/misc/sgi-gru/grumain.c
> 
> On Sun, Sep 15, 2024 at 03:09:35PM GMT, Dan Carpenter wrote:
> > On Sun, Sep 15, 2024 at 01:01:43PM +0100, Lorenzo Stoakes wrote:
> > > On Sun, Sep 15, 2024 at 01:08:27PM GMT, Dan Carpenter wrote:
> > > > Hi Linus,
> > > >
> > > > Commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in
> > > > error case") from Sep 11, 2024 (linux-next), leads to the following
> > > > Smatch static checker warning:
> > > >
> > > > 	mm/memory.c:2709 remap_pfn_range_notrack()
> > > > 	warn: sleeping in atomic context
> > > >
> > > > mm/memory.c
> > > >     2696 int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
> > > >     2697                 unsigned long pfn, unsigned long size, pgprot_t prot)
> > > >     2698 {
> > > >     2699         int error = remap_pfn_range_internal(vma, addr, pfn, size, prot);
> > > >     2700
> > > >     2701         if (!error)
> > > >     2702                 return 0;
> > > >     2703
> > > >     2704         /*
> > > >     2705          * A partial pfn range mapping is dangerous: it does not
> > > >     2706          * maintain page reference counts, and callers may free
> > > >     2707          * pages due to the error. So zap it early.
> > > >     2708          */
> > > > --> 2709         zap_page_range_single(vma, addr, size, NULL);
> > > >
> > > > The lru_add_drain() function at the start of zap_page_range_single() takes a
> > > > mutext.
> > >
> > > Hm does it? I see a local lock, and some folio batch locking which are
> > > local locks too?
> >
> > Ah...  No it doesn't.  It's the mmu_notifier_invalidate_range_start() which is
> > a might_sleep() function.  Sorry for the confusion.
> 
> OK so in conclusion it seems to be that Linus's commit introducing
> zap_page_range_single() accidentally had smatch hit a might_sleep() via
> mmu_notifier_invalidate_range_start(), but it should, in theory, have fired
> due to page table allocations invoking the page allocator that might sleep,
> but didn't, because smatch misses the below might_alloc() path...
> 
> -> prepare_alloc_pages()
>    -> might_alloc()
>       -> might_sleep_if(gfpflags_allow_blocking(gfp_mask))
> 
> ...as a result of get_zeroed_page() tripping it up *breathes*. :)
> 
> (please correct me if I am wrong here).

That's an accurate summary...

> 
> The preempt_disable() is introduced in commit fe5bb6b00c3a9 ("sgi-gru: misc
> GRU cleanup") from... 2009, but it fixed it from the far far more broken
> 'disable preemption before taking a mutex' situation that existed before.
> 
> So fix seems to me to not invoke remap_pfn_range() with preemption disabled
> and a mutex held? gru_fault() maintainers added for input...

Every time I get a response to this bug report I feel dumber.  How did I not
see that this was a bug in drivers/misc/sgi-gru/?  Here is another one from the
same driver:

drivers/misc/sgi-gru/grukservices.c:262 gru_get_cpu_resources() warn: sleeping in atomic context

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 13:14       ` Dan Carpenter
@ 2024-09-15 13:26         ` Lorenzo Stoakes
  0 siblings, 0 replies; 10+ messages in thread
From: Lorenzo Stoakes @ 2024-09-15 13:26 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Linus Torvalds, linux-mm, Dimitri Sivanich, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel

On Sun, Sep 15, 2024 at 04:14:21PM GMT, Dan Carpenter wrote:
> On Sun, Sep 15, 2024 at 01:38:40PM +0100, Lorenzo Stoakes wrote:
> > + get_maintainers.pl people for drivers/misc/sgi-gru/grumain.c
> >
> > On Sun, Sep 15, 2024 at 03:09:35PM GMT, Dan Carpenter wrote:
> > > On Sun, Sep 15, 2024 at 01:01:43PM +0100, Lorenzo Stoakes wrote:
> > > > On Sun, Sep 15, 2024 at 01:08:27PM GMT, Dan Carpenter wrote:
> > > > > Hi Linus,
> > > > >
> > > > > Commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in
> > > > > error case") from Sep 11, 2024 (linux-next), leads to the following
> > > > > Smatch static checker warning:
> > > > >
> > > > > 	mm/memory.c:2709 remap_pfn_range_notrack()
> > > > > 	warn: sleeping in atomic context
> > > > >
> > > > > mm/memory.c
> > > > >     2696 int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
> > > > >     2697                 unsigned long pfn, unsigned long size, pgprot_t prot)
> > > > >     2698 {
> > > > >     2699         int error = remap_pfn_range_internal(vma, addr, pfn, size, prot);
> > > > >     2700
> > > > >     2701         if (!error)
> > > > >     2702                 return 0;
> > > > >     2703
> > > > >     2704         /*
> > > > >     2705          * A partial pfn range mapping is dangerous: it does not
> > > > >     2706          * maintain page reference counts, and callers may free
> > > > >     2707          * pages due to the error. So zap it early.
> > > > >     2708          */
> > > > > --> 2709         zap_page_range_single(vma, addr, size, NULL);
> > > > >
> > > > > The lru_add_drain() function at the start of zap_page_range_single() takes a
> > > > > mutext.
> > > >
> > > > Hm does it? I see a local lock, and some folio batch locking which are
> > > > local locks too?
> > >
> > > Ah...  No it doesn't.  It's the mmu_notifier_invalidate_range_start() which is
> > > a might_sleep() function.  Sorry for the confusion.
> >
> > OK so in conclusion it seems to be that Linus's commit introducing
> > zap_page_range_single() accidentally had smatch hit a might_sleep() via
> > mmu_notifier_invalidate_range_start(), but it should, in theory, have fired
> > due to page table allocations invoking the page allocator that might sleep,
> > but didn't, because smatch misses the below might_alloc() path...
> >
> > -> prepare_alloc_pages()
> >    -> might_alloc()
> >       -> might_sleep_if(gfpflags_allow_blocking(gfp_mask))
> >
> > ...as a result of get_zeroed_page() tripping it up *breathes*. :)
> >
> > (please correct me if I am wrong here).
>
> That's an accurate summary...

Thanks!

>
> >
> > The preempt_disable() is introduced in commit fe5bb6b00c3a9 ("sgi-gru: misc
> > GRU cleanup") from... 2009, but it fixed it from the far far more broken
> > 'disable preemption before taking a mutex' situation that existed before.
> >
> > So fix seems to me to not invoke remap_pfn_range() with preemption disabled
> > and a mutex held? gru_fault() maintainers added for input...
>
> Every time I get a response to this bug report I feel dumber.  How did I not
> see that this was a bug in drivers/misc/sgi-gru/?  Here is another one from the
> same driver:
>
> drivers/misc/sgi-gru/grukservices.c:262 gru_get_cpu_resources() warn: sleeping in atomic context

Nothing to feel dumb about, this stuff is confounding by nature, if I had a
penny for every time I felt dumb doing kernel work I'd be very rich by now! ;)

>
> regards,
> dan carpenter

Cheers for report! This means we can now get this thing fixed...


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [bug report] mm: avoid leaving partial pfn mappings around in error case
  2024-09-15 13:14     ` Linus Torvalds
@ 2024-09-18 21:08       ` Dimitri Sivanich
  0 siblings, 0 replies; 10+ messages in thread
From: Dimitri Sivanich @ 2024-09-18 21:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dan Carpenter, Dimitri Sivanich, Arnd Bergmann, linux-mm

On Sun, Sep 15, 2024 at 03:14:06PM +0200, Linus Torvalds wrote:
> On Sun, 15 Sept 2024 at 14:05, Dan Carpenter <dan.carpenter@linaro.org> wrote:
> >
> > Yep.  You're right.  Smatch doesn't count allocations as sleeping when we pass a
> > variable to for the gfp flags and those functions do "get_zeroed_page(gfp)".
> > I've been intending for years to handle bitmasks better but I've never
> > implemented that code.
> 
> That explains it. In this case, the 'variable' gfp values are
> basically identical: GFP_PGTABLE_KERNEL and GFP_PGTABLE_USER depending
> on which mm they get allocated to, and both are just variations of
> GFP_KERNEL (with __GFP_ZERO and a __GFP_ACCOUNT for the user case), so
> the exact details of th egfp flags are conditional, but they are
> unconditionally sleeping allocations.
> 
> But yeah, if smatch doesn't follow that kind of conditional gfp flags,
> it explains why smatch thought the odd gru_fault() code used to be ok,
> even though it clearly isn't.
> 
> I'm not sure why gru_fault does that preempt-disable at all, since the
> real locking seems to be that &gts->ts_ctxlock mutex, but it has done
> it since its initial commit.
> 
> And it's been very much wrong since that initial commit, but I suspect
> it never triggers in real life because the page tables probably are
> already set up.
> 
> Probably more importantly, it doesn't trigger in real life because SGI
> UV machines with that GRU hardware are probably hard to find.
> 
> Anyway, maybe somebody can explain why gru_fault() has that
> preempt_disable() in it, but it is very wrong. It always has been, and
> the recent change just made smatch notice a new case.
> 
> I don't actually see any reason for that preemption disable, and I
> suspect it could just be removed. But maybe somebody else who knows
> that odd driver can pipe up.. Adding Dimirti and Arnd just in case.
>

Removing preemption disable from gru_fault, and the rest of the driver, seems
to be OK, and is probably the best way to fix this in light of the other
instance that Dan found in gru_get_cpu_resources.

I will submit a patch to do this.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-09-18 21:09 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-15 10:08 [bug report] mm: avoid leaving partial pfn mappings around in error case Dan Carpenter
2024-09-15 10:23 ` Linus Torvalds
2024-09-15 12:05   ` Dan Carpenter
2024-09-15 13:14     ` Linus Torvalds
2024-09-18 21:08       ` Dimitri Sivanich
2024-09-15 12:01 ` Lorenzo Stoakes
2024-09-15 12:09   ` Dan Carpenter
2024-09-15 12:38     ` Lorenzo Stoakes
2024-09-15 13:14       ` Dan Carpenter
2024-09-15 13:26         ` Lorenzo Stoakes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox