linux-mm.kvack.org archive mirror
From: Barry Song <21cnbao@gmail.com>
To: David Hildenbrand <david@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	cerasuolodomenico@gmail.com,  chrisl@kernel.org,
	kasong@tencent.com, peterx@redhat.com, surenb@google.com,
	 v-songbaohua@oppo.com, willy@infradead.org,
	yosryahmed@google.com,  yuzhao@google.com
Subject: Re: [PATCH v3] mm: add per-order mTHP alloc_success and alloc_fail counters
Date: Fri, 5 Apr 2024 17:01:05 +1300
Message-ID: <CAGsJ_4yPJd25O314zBWzB9ZFdsTMHx29gePatiE=drtA4jyjXw@mail.gmail.com>
In-Reply-To: <CAGsJ_4zAUkN_cpVzdV-94zPqEAhh+gW9rYY2S8V8e24_OdjJaA@mail.gmail.com>

On Fri, Apr 5, 2024 at 3:57 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Fri, Apr 5, 2024 at 4:31 AM David Hildenbrand <david@redhat.com> wrote:
> >
> > On 04.04.24 09:21, Ryan Roberts wrote:
> > > On 03/04/2024 22:00, Barry Song wrote:
> > >> On Thu, Apr 4, 2024 at 12:48 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> > >>>
> > >>> On 03/04/2024 09:22, David Hildenbrand wrote:
> > >>>> On 03.04.24 05:55, Barry Song wrote:
> > >>>>> From: Barry Song <v-songbaohua@oppo.com>
> > >>>>>
> > >>>>> Profiling a system blindly with mTHP has become challenging due
> > >>>>> to the lack of visibility into its operations. Presenting the
> > > >>>>> success rate of mTHP allocations appears to be a pressing need.
> > >>>>>
> > >>>>> Recently, I've been experiencing significant difficulty debugging
> > >>>>> performance improvements and regressions without these figures.
> > >>>>> It's crucial for us to understand the true effectiveness of
> > >>>>> mTHP in real-world scenarios, especially in systems with
> > >>>>> fragmented memory.
> > >>>>>
> > >>>>> This patch sets up the framework for per-order mTHP counters,
> > >>>>> starting with the introduction of anon_alloc_success and
> > >>>>> anon_alloc_fail counters.  Incorporating additional counters
> > >>>>> should now be straightforward as well.
> > >>>>>
> > >>>>> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > >>>>> ---
> > >>>>>    -v3:
> > >>>>>    * save some memory as order-0 and order-1 can't be THP, Ryan;
> > >>>>>    * rename to anon_alloc as right now we only support anon to address
> > >>>>>      David's comment;
> > >>>>>    * drop a redundant "else", Ryan
> > >>>>>
> > >>>>>    include/linux/huge_mm.h | 18 ++++++++++++++
> > >>>>>    mm/huge_memory.c        | 54 +++++++++++++++++++++++++++++++++++++++++
> > >>>>>    mm/memory.c             |  2 ++
> > >>>>>    3 files changed, 74 insertions(+)
> > >>>>>
> > >>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > >>>>> index e896ca4760f6..5e9af6be9537 100644
> > >>>>> --- a/include/linux/huge_mm.h
> > >>>>> +++ b/include/linux/huge_mm.h
> > >>>>> @@ -70,6 +70,7 @@ extern struct kobj_attribute shmem_enabled_attr;
> > >>>>>     * (which is a limitation of the THP implementation).
> > >>>>>     */
> > >>>>>    #define THP_ORDERS_ALL_ANON    ((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0) | BIT(1)))
> > >>>>> +#define THP_MIN_ORDER        2
> > >>>>>      /*
> > >>>>>     * Mask of all large folio orders supported for file THP.
> > >>>>> @@ -264,6 +265,23 @@ unsigned long thp_vma_allowable_orders(struct
> > >>>>> vm_area_struct *vma,
> > >>>>>                          enforce_sysfs, orders);
> > >>>>>    }
> > >>>>>    +enum thp_event_item {
> > >>>>> +    THP_ANON_ALLOC_SUCCESS,
> > >>>>> +    THP_ANON_ALLOC_FAIL,
> > >>>>> +    NR_THP_EVENT_ITEMS
> > >>>>> +};
> > >>>>
> > > >>>> Maybe use a prefix that matches the enum name and is "obviously"
> > >>>> different to the ones in vm_event_item.h, like
> > >>>>
> > >>>> enum thp_event {
> > >>>>      THP_EVENT_ANON_ALLOC_SUCCESS,
> > >>>>      THP_EVENT_ANON_ALLOC_FAIL,
> > >>>>      __THP_EVENT_COUNT,
> > >>>> };
> > >>>
> > >>> FWIW, I'd personally replace "event" with "stat". For me "event" only ever
> > >>> increments, but "stat" can increment and decrement. An event is a type of stat.
> > >>>
> > >>> You are only adding events for now, but we have identified a need for inc/dec
> > >>> stats that will be added in future.
> > >>
> > >> What about the below?
> > >>
> > >> enum thp_stat {
>
> It seems we still need to use enum thp_stat_item rather than thp_stat.
> This follows
> enum zone_stat_item
> enum numa_stat_item
> enum node_stat_item
>
> And most importantly, the below looks much better
>
>        enum thp_stat_item {
>               THP_STAT_ANON_ALLOC,
>               THP_STAT_ANON_ALLOC_FALLBACK,
>               __THP_STAT_COUNT
>        };
>
>        struct thp_state {
>               unsigned long state[PMD_ORDER + 1][__THP_STAT_COUNT];
>        };
>
>        DECLARE_PER_CPU(struct thp_state, thp_states);
>
> than
>
>        enum thp_stat {
>               THP_STAT_ANON_ALLOC,
>               THP_STAT_ANON_ALLOC_FALLBACK,
>               __THP_STAT_COUNT
>        };
>
>        struct thp_state {
>               unsigned long state[PMD_ORDER + 1][__THP_STAT_COUNT];
>        };
>
> > >>     THP_EVENT_ANON_ALLOC,
> > >>     THP_EVENT_ANON_ALLOC_FALLBACK,
> > >>     THP_EVENT_SWPOUT,
> > >>     THP_EVENT_SWPOUT_FALLBACK,
> > >>     ...
> > >>     THP_NR_ANON_PAGES,
> > >>     THP_NR_FILE_PAGES,
> > >
> > > I find this ambiguous; Is it the number of THPs or the number of base pages?
> > >
> > > I think David made the point about incorporating the enum name into the labels
> > > too, so that there can be no namespace confusion. How about:
> > >
> > > <enum>_<type>_<name>
> > >
> > > So:
> > >
> > > THP_STAT_EV_ANON_ALLOC
> > > THP_STAT_EV_ANON_ALLOC_FALLBACK
> > > THP_STAT_EV_ANON_PARTIAL
> > > THP_STAT_EV_SWPOUT
> > > THP_STAT_EV_SWPOUT_FALLBACK
> > > ...
> > > THP_STAT_NR_ANON
> > > THP_STAT_NR_FILE
> > > ...
> > > __THP_STAT_COUNT
> >
> > I'd even drop the "EV". "NR_ANON" vs "ANON_ALLOC" etc. is expressive enough.
>
> ok.

Hi David, Ryan,

I've named everything as follows. Please let me know if you have any further
suggestions before I send the updated version :-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e896ca4760f6..cc13fa14aa32 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -264,6 +264,23 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
                                          enforce_sysfs, orders);
 }

+enum thp_stat_item {
+       THP_STAT_ANON_ALLOC,
+       THP_STAT_ANON_ALLOC_FALLBACK,
+       __THP_STAT_COUNT
+};
+
+struct thp_state {
+       unsigned long state[PMD_ORDER + 1][__THP_STAT_COUNT];
+};
+
+DECLARE_PER_CPU(struct thp_state, thp_states);
+
+static inline void count_thp_state(int order, enum thp_stat_item item)
+{
+       this_cpu_inc(thp_states.state[order][item]);
+}
+
 #define transparent_hugepage_use_zero_page()                           \
        (transparent_hugepage_flags &                                   \
         (1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG))
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9d4b2fbf6872..e704b4408181 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -526,6 +526,46 @@ static const struct kobj_type thpsize_ktype = {
        .sysfs_ops = &kobj_sysfs_ops,
 };

+DEFINE_PER_CPU(struct thp_state, thp_states) = {{{0}}};
+
+static unsigned long sum_thp_states(int order, enum thp_stat_item item)
+{
+       unsigned long sum = 0;
+       int cpu;
+
+       for_each_online_cpu(cpu) {
+               struct thp_state *this = &per_cpu(thp_states, cpu);
+
+               sum += this->state[order][item];
+       }
+
+       return sum;
+}
+
+#define THP_STATE_ATTR(_name, _index)                                  \
+static ssize_t _name##_show(struct kobject *kobj,                      \
+                       struct kobj_attribute *attr, char *buf)         \
+{                                                                      \
+       int order = to_thpsize(kobj)->order;                            \
+                                                                       \
+       return sysfs_emit(buf, "%lu\n", sum_thp_states(order, _index)); \
+}                                                                      \
+static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
+
+THP_STATE_ATTR(anon_alloc, THP_STAT_ANON_ALLOC);
+THP_STATE_ATTR(anon_alloc_fallback, THP_STAT_ANON_ALLOC_FALLBACK);
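
For anyone who wants to experiment with the counting scheme outside the
kernel, here is a minimal user-space sketch of it. It is only a model, not
the patch itself: a plain array stands in for DEFINE_PER_CPU, the caller
passes the CPU number explicitly instead of relying on this_cpu_inc(), and
NR_CPUS_MODEL, PMD_ORDER_MODEL and show_thp_state() are hypothetical names
chosen for illustration. Note that the model sums every slot, whereas the
patch above walks online CPUs only.

```c
#include <stdio.h>

/* Assumed values for illustration; the kernel gets these from the arch. */
#define NR_CPUS_MODEL   4	/* hypothetical number of CPUs */
#define PMD_ORDER_MODEL 9	/* e.g. 4K base pages with 2M PMD mappings */

enum thp_stat_item {
	THP_STAT_ANON_ALLOC,
	THP_STAT_ANON_ALLOC_FALLBACK,
	__THP_STAT_COUNT
};

/* One row of counters per order, one column per stat item. */
struct thp_state {
	unsigned long state[PMD_ORDER_MODEL + 1][__THP_STAT_COUNT];
};

/* Stand-in for the per-CPU variable: one private copy per modelled CPU. */
static struct thp_state thp_states[NR_CPUS_MODEL];

/* Stand-in for this_cpu_inc(): bump the named CPU's private counter. */
static void count_thp_state(int cpu, int order, enum thp_stat_item item)
{
	thp_states[cpu].state[order][item]++;
}

/* Readers fold all per-CPU copies into a single total. */
static unsigned long sum_thp_states(int order, enum thp_stat_item item)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += thp_states[cpu].state[order][item];

	return sum;
}

/* Rough analogue of the generated _show(): format the total into buf. */
static int show_thp_state(char *buf, size_t len, int order,
			  enum thp_stat_item item)
{
	return snprintf(buf, len, "%lu\n", sum_thp_states(order, item));
}
```

With this model, bumping THP_STAT_ANON_ALLOC for order 4 on two different
modelled CPUs and then calling sum_thp_states(4, THP_STAT_ANON_ALLOC)
returns 2, while every other order still reads 0. Writers never touch a
shared counter; the cost of aggregation is paid only on the read path.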

>
> >
> > --
> > Cheers,
> >
> > David / dhildenb

Thanks
Barry

