From: Marco Elver <elver@google.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Michal Hocko <mhocko@suse.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Alexander Potapenko <glider@google.com>
Subject: Re: [PATCH v8 2/5] mm,page_owner: Implement the tracking of the stacks count
Date: Tue, 13 Feb 2024 10:21:14 +0100
Message-ID: <CANpmjNO8CHC6gSFVEOSzYsTAP-j5YvfbfzZMUwnGqSAC1Y4A8g@mail.gmail.com>
In-Reply-To: <fc4f498b-fc35-4ba8-abf0-7664d6f1decb@suse.cz>
On Tue, 13 Feb 2024 at 10:16, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 2/12/24 23:30, Oscar Salvador wrote:
> > page_owner needs to increment a stack_record refcount when a new allocation
> > occurs, and decrement it on a free operation.
> > In order to do that, we need to have a way to get a stack_record from a
> > handle.
> > Implement __stack_depot_get_stack_record() which just does that, and make
> > it public so page_owner can use it.
> >
> > Also implement {inc,dec}_stack_record_count() which increments
> > or decrements on respective allocation and free operations, via
> > __reset_page_owner() (free operation) and __set_page_owner() (alloc
> > operation).
> >
> > Traversing all stackdepot buckets comes with its own complexity,
> > plus we would have to implement a way to mark only those stack_records
> > that were originated from page_owner, as those are the ones we are
> > interested in.
> > For that reason, page_owner maintains its own list of stack_records,
> > because traversing that list is faster than traversing all buckets
> > while keeping at the same time a low complexity.
> > inc_stack_record_count() is responsible for adding new stack_records
> > into the list stack_list.
> >
> > Modifications on the list are protected via a spinlock with irqs
> > disabled, since this code can also be reached from IRQ context.
> >
> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> > ---
> > include/linux/stackdepot.h | 9 +++++
> > lib/stackdepot.c | 8 +++++
> > mm/page_owner.c | 73 ++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 90 insertions(+)
>
> ...
>
>
> > --- a/mm/page_owner.c
> > +++ b/mm/page_owner.c
> > @@ -36,6 +36,14 @@ struct page_owner {
> >  	pid_t free_tgid;
> >  };
> >
> > +struct stack {
> > +	struct stack_record *stack_record;
> > +	struct stack *next;
> > +};
> > +
> > +static struct stack *stack_list;
> > +static DEFINE_SPINLOCK(stack_list_lock);
> > +
> >  static bool page_owner_enabled __initdata;
> >  DEFINE_STATIC_KEY_FALSE(page_owner_inited);
> >
> > @@ -61,6 +69,57 @@ static __init bool need_page_owner(void)
> >  	return page_owner_enabled;
> >  }
> >
> > +static void add_stack_record_to_list(struct stack_record *stack_record)
> > +{
> > +	unsigned long flags;
> > +	struct stack *stack;
> > +
> > +	stack = kmalloc(sizeof(*stack), GFP_KERNEL);
>
> I doubt you can use GFP_KERNEL unconditionally? Think you need to pass down
> the gfp flags from __set_page_owner() here?
> And what about the alloc failure case, this will just leave the stack record
> unlinked forever? Can we somehow know which ones we failed to link, and try
> next time? Probably easier by not recording the stack for the page at all in
> that case, so the next attempt with the same stack looks like the very first
> again and attempts the add to list.
> Still not happy that these extra tracking objects are needed, but I guess
> the per-users stack depot instances I suggested would be a major change.
>
> > +	if (stack) {
> > +		stack->stack_record = stack_record;
> > +		stack->next = NULL;
> > +
> > +		spin_lock_irqsave(&stack_list_lock, flags);
> > +		if (!stack_list) {
> > +			stack_list = stack;
> > +		} else {
> > +			stack->next = stack_list;
> > +			stack_list = stack;
> > +		}
> > +		spin_unlock_irqrestore(&stack_list_lock, flags);
> > +	}
> > +}
> > +
> > +static void inc_stack_record_count(depot_stack_handle_t handle)
> > +{
> > +	struct stack_record *stack_record = __stack_depot_get_stack_record(handle);
> > +
> > +	if (stack_record) {
> > +		/*
> > +		 * New stack_record's that do not use STACK_DEPOT_FLAG_GET start
> > +		 * with REFCOUNT_SATURATED to catch spurious increments of their
> > +		 * refcount.
> > +		 * Since we do not use STACK_DEPOT_FLAG_{GET,PUT} API, let us
> > +		 * set a refcount of 1 ourselves.
> > +		 */
> > +		if (refcount_read(&stack_record->count) == REFCOUNT_SATURATED) {
> > +			refcount_set(&stack_record->count, 1);
>
> Isn't this racy? Shouldn't we use some atomic cmpxchg operation to change
> from REFCOUNT_SATURATED to 1?

If 2 threads race here, both will want to add it to the list as well
and take the lock. So this could just be solved with double-checked
locking:

	if (count == REFCOUNT_SATURATED) {
		spin_lock(...);
		if (count == REFCOUNT_SATURATED) {
			refcount_set(.., 1);
			.. add to list ...
		}
		spin_unlock(..);
	}
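
For illustration only, the double-checked locking idea above can be
sketched as a small userspace analogue. This is not the page_owner code:
a pthread mutex stands in for the spinlock, C11 atomics stand in for
refcount_t, and the names struct record, COUNT_SATURATED, track_record()
and list_length() are hypothetical. The node is allocated before taking
the lock, which is one possible choice here; in the kernel the allocation
would still need suitable gfp flags, as raised earlier in the thread.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical sentinel standing in for REFCOUNT_SATURATED. */
#define COUNT_SATURATED 0xc0000000u

/* Hypothetical stand-in for a stack_record with its refcount. */
struct record {
	atomic_uint count;
};

struct node {
	struct record *rec;
	struct node *next;
};

static struct node *list_head;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns 1 if this caller initialized the count and linked the record,
 * 0 if the record was already tracked, -1 if the allocation failed.
 */
static int track_record(struct record *rec)
{
	struct node *n;
	int linked = 0;

	/* First check, without the lock: cheap exit for tracked records. */
	if (atomic_load(&rec->count) != COUNT_SATURATED)
		return 0;

	/* Allocate outside the critical section. */
	n = malloc(sizeof(*n));
	if (!n)
		return -1;
	n->rec = rec;

	pthread_mutex_lock(&list_lock);
	/* Second check, under the lock: only one racing caller gets in. */
	if (atomic_load(&rec->count) == COUNT_SATURATED) {
		atomic_store(&rec->count, 1);
		n->next = list_head;
		list_head = n;
		linked = 1;
	}
	pthread_mutex_unlock(&list_lock);

	/* The loser of the race drops its unused node. */
	if (!linked)
		free(n);
	return linked;
}

/* Helper for inspection: number of records currently on the list. */
static int list_length(void)
{
	int len = 0;

	pthread_mutex_lock(&list_lock);
	for (struct node *n = list_head; n; n = n->next)
		len++;
	pthread_mutex_unlock(&list_lock);
	return len;
}
```

Because the count is changed from the sentinel to 1 only while holding
the lock, two callers that both pass the first check cannot both link
the record: the second one sees the updated count and backs off.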