From: Vlastimil Babka <vbabka@suse.cz>
To: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Michal Hocko <mhocko@suse.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Alexander Potapenko <glider@google.com>
Subject: Re: [PATCH v8 2/5] mm,page_owner: Implement the tracking of the stacks count
Date: Tue, 13 Feb 2024 12:34:55 +0100 [thread overview]
Message-ID: <11cb2ac2-102f-4acd-aded-bbfd29f7269a@suse.cz> (raw)
In-Reply-To: <CANpmjNO8CHC6gSFVEOSzYsTAP-j5YvfbfzZMUwnGqSAC1Y4A8g@mail.gmail.com>
On 2/13/24 10:21, Marco Elver wrote:
> On Tue, 13 Feb 2024 at 10:16, Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 2/12/24 23:30, Oscar Salvador wrote:
>> > page_owner needs to increment a stack_record refcount when a new allocation
>> > occurs, and decrement it on a free operation.
>> > In order to do that, we need to have a way to get a stack_record from a
>> > handle.
>> > Implement __stack_depot_get_stack_record(), which does just that, and
>> > make it public so page_owner can use it.
>> >
>> > Also implement {inc,dec}_stack_record_count(), which increment or
>> > decrement the refcount on the respective allocation and free
>> > operations, via __set_page_owner() (alloc operation) and
>> > __reset_page_owner() (free operation).
>> >
>> > Traversing all stackdepot buckets comes with its own complexity,
>> > plus we would have to implement a way to mark only those stack_records
>> > that originated from page_owner, as those are the ones we are
>> > interested in.
>> >
>> > For that reason, page_owner maintains its own list of stack_records,
>> > because traversing that list is faster than traversing all buckets
>> > while still keeping the complexity low.
>> > inc_stack_record_count() is responsible for adding new stack_records
>> > to the stack_list list.
>> >
>> > Modifications on the list are protected via a spinlock with irqs
>> > disabled, since this code can also be reached from IRQ context.
>> >
>> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
>> > ---
>> >  include/linux/stackdepot.h |  9 +++++
>> >  lib/stackdepot.c           |  8 +++++
>> >  mm/page_owner.c            | 73 ++++++++++++++++++++++++++++++++++++++
>> >  3 files changed, 90 insertions(+)
>>
>> ...
>>
>>
>> > --- a/mm/page_owner.c
>> > +++ b/mm/page_owner.c
>> > @@ -36,6 +36,14 @@ struct page_owner {
>> >  	pid_t free_tgid;
>> >  };
>> >
>> > +struct stack {
>> > +	struct stack_record *stack_record;
>> > +	struct stack *next;
>> > +};
>> > +
>> > +static struct stack *stack_list;
>> > +static DEFINE_SPINLOCK(stack_list_lock);
>> > +
>> >  static bool page_owner_enabled __initdata;
>> >  DEFINE_STATIC_KEY_FALSE(page_owner_inited);
>> >
>> > @@ -61,6 +69,57 @@ static __init bool need_page_owner(void)
>> >  	return page_owner_enabled;
>> >  }
>> >
>> > +static void add_stack_record_to_list(struct stack_record *stack_record)
>> > +{
>> > +	unsigned long flags;
>> > +	struct stack *stack;
>> > +
>> > +	stack = kmalloc(sizeof(*stack), GFP_KERNEL);
>>
>> I doubt you can use GFP_KERNEL unconditionally? Think you need to pass down
>> the gfp flags from __set_page_owner() here?
>>
>> And what about the alloc failure case, this will just leave the stack record
>> unlinked forever? Can we somehow know which ones we failed to link, and try
>> next time? Probably easier by not recording the stack for the page at all in
>> that case, so the next attempt with the same stack looks like the very first
>> again and attempts the add to list.
>>
>> Still not happy that these extra tracking objects are needed, but I guess
>> the per-user stack depot instances I suggested would be a major change.
>>
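To be a bit more concrete about the gfp part, something like the below is
what I mean (untested, and I have simplified the list handling to a plain
prepend, since both branches of the if/else below end up doing that anyway):

/* Sketch only: the gfp_mask parameter would be passed down from __set_page_owner(). */
static void add_stack_record_to_list(struct stack_record *stack_record,
				     gfp_t gfp_mask)
{
	unsigned long flags;
	struct stack *stack;

	stack = kmalloc(sizeof(*stack), gfp_mask);
	if (!stack)
		return;

	stack->stack_record = stack_record;

	/* Prepend to stack_list; works also when the list is still empty. */
	spin_lock_irqsave(&stack_list_lock, flags);
	stack->next = stack_list;
	stack_list = stack;
	spin_unlock_irqrestore(&stack_list_lock, flags);
}

with __set_page_owner() passing its gfp_mask down here via
inc_stack_record_count().
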
>> > +	if (stack) {
>> > +		stack->stack_record = stack_record;
>> > +		stack->next = NULL;
>> > +
>> > +		spin_lock_irqsave(&stack_list_lock, flags);
>> > +		if (!stack_list) {
>> > +			stack_list = stack;
>> > +		} else {
>> > +			stack->next = stack_list;
>> > +			stack_list = stack;
>> > +		}
>> > +		spin_unlock_irqrestore(&stack_list_lock, flags);
>> > +	}
>> > +}
>> > +
>> > +static void inc_stack_record_count(depot_stack_handle_t handle)
>> > +{
>> > +	struct stack_record *stack_record = __stack_depot_get_stack_record(handle);
>> > +
>> > +	if (stack_record) {
>> > +		/*
>> > +		 * New stack_record's that do not use STACK_DEPOT_FLAG_GET start
>> > +		 * with REFCOUNT_SATURATED to catch spurious increments of their
>> > +		 * refcount.
>> > +		 * Since we do not use STACK_DEPOT_FLAG_{GET,PUT} API, let us
>> > +		 * set a refcount of 1 ourselves.
>> > +		 */
>> > +		if (refcount_read(&stack_record->count) == REFCOUNT_SATURATED) {
>> > +			refcount_set(&stack_record->count, 1);
>>
>> Isn't this racy? Shouldn't we use some atomic cmpxchg operation to change
>> from REFCOUNT_SATURATED to 1?
>
> If 2 threads race here, both will want to add it to the list as well
> and take the lock. So this could just be solved with double-checked
> locking:
>
> if (count == REFCOUNT_SATURATED) {
> 	spin_lock(...);

Yeah, probably stack_list_lock could be taken here already. But then the
kmalloc() of struct stack would also have to happen here, before taking the
lock - rough sketch at the end of this mail.

> 	if (count == REFCOUNT_SATURATED) {
> 		refcount_set(.., 1);
> 		.. add to list ...
> 	}
> 	spin_unlock(..);
> }
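
To make it concrete, below is roughly what I have in mind, combining the
double-checked locking above with passing the gfp flags down and doing the
kmalloc() outside of the lock. Untested sketch only; the gfp_mask parameter
and the bool return (so that __set_page_owner() could skip recording the
handle when we fail to start tracking the stack) are not in the current
patch:

/*
 * Sketch only: gfp_mask and the bool return are made up here.
 * Returns false if we could not start tracking this stack, so the caller
 * can skip recording the handle for this page and a later allocation with
 * the same stack retries from scratch.
 */
static bool inc_stack_record_count(depot_stack_handle_t handle, gfp_t gfp_mask)
{
	struct stack_record *stack_record = __stack_depot_get_stack_record(handle);
	struct stack *stack;
	unsigned long flags;

	if (!stack_record)
		return false;

	if (refcount_read(&stack_record->count) != REFCOUNT_SATURATED)
		goto inc;

	/* Allocate before taking stack_list_lock. */
	stack = kmalloc(sizeof(*stack), gfp_mask);
	if (!stack)
		return false;

	spin_lock_irqsave(&stack_list_lock, flags);
	if (refcount_read(&stack_record->count) == REFCOUNT_SATURATED) {
		/* We won the race: count this first allocation and link it. */
		refcount_set(&stack_record->count, 1);
		stack->stack_record = stack_record;
		stack->next = stack_list;
		stack_list = stack;
		spin_unlock_irqrestore(&stack_list_lock, flags);
		return true;
	}
	spin_unlock_irqrestore(&stack_list_lock, flags);
	/* Lost the race, someone else already linked this stack_record. */
	kfree(stack);
inc:
	refcount_inc(&stack_record->count);
	return true;
}

If the kmalloc() fails we return false without touching the refcount, so the
stack_record stays at REFCOUNT_SATURATED and the next allocation with the
same stack retries the whole thing, as discussed above.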