From: Marco Elver
Date: Tue, 13 Feb 2024 09:30:25 +0100
Subject: Re: [PATCH v8 2/5] mm,page_owner: Implement the tracking of the stacks count
To: Oscar Salvador
Cc: Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko, Vlastimil Babka, Andrey Konovalov, Alexander Potapenko
References: <20240212223029.30769-1-osalvador@suse.de> <20240212223029.30769-3-osalvador@suse.de>
In-Reply-To: <20240212223029.30769-3-osalvador@suse.de>

On Mon, 12 Feb 2024 at 23:29, Oscar Salvador wrote:
>
> page_owner needs to increment a stack_record refcount when a new allocation
> occurs, and decrement it on a free operation.
> In order to do that, we need to have a way to get a stack_record from a
> handle.
> Implement __stack_depot_get_stack_record() which just does that, and make
> it public so page_owner can use it.
>
> Also implement {inc,dec}_stack_record_count() which increments
> or decrements on respective allocation and free operations, via
> __reset_page_owner() (free operation) and __set_page_owner() (alloc
> operation).
>
> Traversing all stackdepot buckets comes with its own complexity,
> plus we would have to implement a way to mark only those stack_records
> that were originated from page_owner, as those are the ones we are
> interested in.
> For that reason, page_owner maintains its own list of stack_records,
> because traversing that list is faster than traversing all buckets
> while keeping at the same time a low complexity.
> inc_stack_record_count() is responsible of adding new stack_records
> into the list stack_list.
>
> Modifications on the list are protected via a spinlock with irqs
> disabled, since this code can also be reached from IRQ context.
>
> Signed-off-by: Oscar Salvador

For the code:

Reviewed-by: Marco Elver

But see minor comments below.

> ---
>  include/linux/stackdepot.h |  9 +++++
>  lib/stackdepot.c           |  8 +++++
>  mm/page_owner.c            | 73 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 90 insertions(+)
>
> diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
> index 90274860fd8e..f3c2162bf615 100644
> --- a/include/linux/stackdepot.h
> +++ b/include/linux/stackdepot.h
> @@ -175,6 +175,15 @@ depot_stack_handle_t stack_depot_save_flags(unsigned long *entries,
>  depot_stack_handle_t stack_depot_save(unsigned long *entries,
>  				      unsigned int nr_entries, gfp_t gfp_flags);
>
> +/**
> + * __stack_depot_get_stack_record - Get a pointer to a stack_record struct
> + * This function is only for internal purposes.

I think the body of the kernel doc needs to go after argument declarations.
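
For illustration, the usual kernel-doc layout is the one-line summary, then the @argument lines, then the longer description, then the Return: section. A sketch of the reordered comment is below. The typedef and the stub body are stand-ins added here only so the fragment compiles outside the kernel; they are assumptions, not the real stackdepot definitions.

```c
#include <stddef.h>

/* Stand-ins so the fragment compiles outside the kernel (assumptions). */
typedef unsigned int depot_stack_handle_t;
struct stack_record;

/**
 * __stack_depot_get_stack_record - Get a pointer to a stack_record struct
 * @handle: Stack depot handle
 *
 * This function is only for internal purposes.
 *
 * Return: Returns a pointer to a stack_record struct
 */
struct stack_record *__stack_depot_get_stack_record(depot_stack_handle_t handle);

/* Stub body, present only so the fragment links; the real lookup lives in lib/stackdepot.c. */
struct stack_record *__stack_depot_get_stack_record(depot_stack_handle_t handle)
{
	(void)handle;
	return NULL;
}
```

Only the order of the comment lines changes; the text of each line is taken from the patch as posted.
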
> + * @handle: Stack depot handle
> + *
> + * Return: Returns a pointer to a stack_record struct
> + */
> +struct stack_record *__stack_depot_get_stack_record(depot_stack_handle_t handle);
> +
>  /**
>   * stack_depot_fetch - Fetch a stack trace from stack depot
>   *
> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> index 6f9095374847..fdb09450a538 100644
> --- a/lib/stackdepot.c
> +++ b/lib/stackdepot.c
> @@ -685,6 +685,14 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
>  }
>  EXPORT_SYMBOL_GPL(stack_depot_save);
>
> +struct stack_record *__stack_depot_get_stack_record(depot_stack_handle_t handle)
> +{
> +	if (!handle)
> +		return NULL;
> +
> +	return depot_fetch_stack(handle);
> +}
> +
>  unsigned int stack_depot_fetch(depot_stack_handle_t handle,
>  			       unsigned long **entries)
>  {
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index 5634e5d890f8..7d1b3f75cef3 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -36,6 +36,14 @@ struct page_owner {
>  	pid_t free_tgid;
>  };
>
> +struct stack {
> +	struct stack_record *stack_record;
> +	struct stack *next;
> +};
> +
> +static struct stack *stack_list;
> +static DEFINE_SPINLOCK(stack_list_lock);
> +
>  static bool page_owner_enabled __initdata;
>  DEFINE_STATIC_KEY_FALSE(page_owner_inited);
>
> @@ -61,6 +69,57 @@ static __init bool need_page_owner(void)
>  	return page_owner_enabled;
>  }
>
> +static void add_stack_record_to_list(struct stack_record *stack_record)
> +{
> +	unsigned long flags;
> +	struct stack *stack;
> +
> +	stack = kmalloc(sizeof(*stack), GFP_KERNEL);
> +	if (stack) {

It's usually more elegant to write

	if (!stack)
		return;

if the rest of the function is conditional.
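
As a rough, untested sketch of the early-return shape, here is a user-space stand-in for add_stack_record_to_list(). The assumptions are loud ones: malloc() replaces kmalloc(), struct stack_record is left opaque, and the spinlock is reduced to a comment so the fragment builds on its own. The sketch also collapses the if/else around the head insertion, which behaves the same because stack_list is NULL when the list is empty.

```c
#include <stdlib.h>

/* User-space stand-in (assumption): the kernel type is left opaque here. */
struct stack_record;

struct stack {
	struct stack_record *stack_record;
	struct stack *next;
};

static struct stack *stack_list;

static void add_stack_record_to_list(struct stack_record *stack_record)
{
	struct stack *stack;

	/* malloc() stands in for kmalloc(sizeof(*stack), GFP_KERNEL). */
	stack = malloc(sizeof(*stack));
	if (!stack)
		return;

	stack->stack_record = stack_record;

	/*
	 * In the patch, the two assignments below run between
	 * spin_lock_irqsave(&stack_list_lock, flags) and
	 * spin_unlock_irqrestore(&stack_list_lock, flags).
	 */
	stack->next = stack_list;	/* NULL when the list is empty */
	stack_list = stack;
}
```

With the allocation failure handled up front, the rest of the function loses one level of indentation.
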
> +		stack->stack_record = stack_record;
> +		stack->next = NULL;
> +
> +		spin_lock_irqsave(&stack_list_lock, flags);
> +		if (!stack_list) {
> +			stack_list = stack;
> +		} else {
> +			stack->next = stack_list;
> +			stack_list = stack;
> +		}
> +		spin_unlock_irqrestore(&stack_list_lock, flags);
> +	}
> +}
> +
> +static void inc_stack_record_count(depot_stack_handle_t handle)
> +{
> +	struct stack_record *stack_record = __stack_depot_get_stack_record(handle);
> +
> +	if (stack_record) {
> +		/*
> +		 * New stack_record's that do not use STACK_DEPOT_FLAG_GET start
> +		 * with REFCOUNT_SATURATED to catch spurious increments of their
> +		 * refcount.
> +		 * Since we do not use STACK_DEPOT_FLAG_{GET,PUT} API, let us

I think I mentioned this in the other email: there is no
STACK_DEPOT_FLAG_PUT, only stack_depot_put().

> +		 * set a refcount of 1 ourselves.
> +		 */
> +		if (refcount_read(&stack_record->count) == REFCOUNT_SATURATED) {
> +			refcount_set(&stack_record->count, 1);
> +
> +			/* Add the new stack_record to our list */
> +			add_stack_record_to_list(stack_record);
> +		}
> +		refcount_inc(&stack_record->count);
> +	}
> +}
> +
> +static void dec_stack_record_count(depot_stack_handle_t handle)
> +{
> +	struct stack_record *stack_record = __stack_depot_get_stack_record(handle);
> +
> +	if (stack_record)
> +		refcount_dec(&stack_record->count);
> +}
> +
>  static __always_inline depot_stack_handle_t create_dummy_stack(void)
>  {
>  	unsigned long entries[4];
> @@ -140,6 +199,7 @@ void __reset_page_owner(struct page *page, unsigned short order)
>  	int i;
>  	struct page_ext *page_ext;
>  	depot_stack_handle_t handle;
> +	depot_stack_handle_t alloc_handle;
>  	struct page_owner *page_owner;
>  	u64 free_ts_nsec = local_clock();
>
> @@ -147,6 +207,9 @@ void __reset_page_owner(struct page *page, unsigned short order)
>  	if (unlikely(!page_ext))
>  		return;
>
> +	page_owner = get_page_owner(page_ext);
> +	alloc_handle = page_owner->handle;
> +
>  	handle = save_stack(GFP_NOWAIT | __GFP_NOWARN);
>  	for (i = 0; i < (1 << order); i++) {
>  		__clear_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
> @@ -158,6 +221,15 @@ void __reset_page_owner(struct page *page, unsigned short order)
>  		page_ext = page_ext_next(page_ext);
>  	}
>  	page_ext_put(page_ext);
> +	if (alloc_handle != early_handle)
> +		/*
> +		 * early_handle is being set as a handle for all those
> +		 * early allocated pages. See init_pages_in_zone().
> +		 * Since their refcount is not being incremented because
> +		 * the machinery is not ready yet, we cannot decrement
> +		 * their refcount either.
> +		 */
> +		dec_stack_record_count(alloc_handle);
>  }
>
>  static inline void __set_page_owner_handle(struct page_ext *page_ext,
> @@ -199,6 +271,7 @@ noinline void __set_page_owner(struct page *page, unsigned short order,
>  		return;
>  	__set_page_owner_handle(page_ext, handle, order, gfp_mask);
>  	page_ext_put(page_ext);
> +	inc_stack_record_count(handle);
>  }
>
>  void __set_page_owner_migrate_reason(struct page *page, int reason)
> --
> 2.43.0
>