Re: [LSFMM] automating measuring memory fragmentation

linux-mm.kvack.org archive mirror

From: Yu Zhao <yuzhao@google.com>
To: Karim Manaouil <kmanaouil.dev@gmail.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>,
	David Bueso <dave@stgolabs.net>, Michal Hocko <mhocko@suse.com>,
	 Dan Williams <dan.j.williams@intel.com>,
	John Hubbard <jhubbard@nvidia.com>,
	 Daniel Gomez <da.gomez@samsung.com>,
	linux-mm <linux-mm@kvack.org>,
	 lsf-pc@lists.linux-foundation.org
Subject: Re: [LSFMM] automating measuring memory fragmentation
Date: Thu, 16 May 2024 15:36:57 -0600
Message-ID: <CAOUHufZ9MLiDDNtNbOdT1cNnJ7gAnC1HDbhcGVmm_HNLf++7YQ@mail.gmail.com>
In-Reply-To: <ZkZ7fwkBQ_pBEImO@localhost.localdomain>

On Thu, May 16, 2024 at 3:32 PM Karim Manaouil <kmanaouil.dev@gmail.com> wrote:
>
> On Thu, May 16, 2024 at 02:05:24PM -0600, Yu Zhao wrote:
> > For example, if we have two systems, one has lower fragmentation for
> > some orders but higher fragmentation for the rest, and the other is
> > the opposite. How would we be able to use a single measure to describe
> > this? IOW, I don't think a single measurement can describe all orders
> > in a comparable way, which would be the weakest requirement we would
> > have to impose.
>
> > As I (badly) explained earlier, a single value can't do that because
> > different orders are not on the same footing (so to speak), unless
> > we are only interested in one non-zero order. So we would need
> > fragmentation_index[NR_non_zero_orders].
>
> > No, for example, A can allocate 4 order-1 but 0 order-2, and B can
> > allocate 2 order-1 *or* 1 order-2, which one would you say is better
> > or worse? This, IMO, depends on which order you are trying to
> > allocate. Does it make sense?
>
> But higher order pages can always be broken down into lower order pages.
> However, the inverse is not always guaranteed (they may not be buddies,
> or compaction/reclaim isn't helpful).

Please read my example again, carefully.
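
To make the A/B example concrete, here is a minimal userspace sketch (not
kernel code). It assumes A holds four free order-1 blocks that cannot merge
and B holds a single free order-2 block. NR_ORDERS, allocatable(), sys_a and
sys_b are hypothetical names introduced only for this illustration.

```c
#include <assert.h>

/* Hypothetical toy setup: only orders 0..2 exist. */
#define NR_ORDERS 3

/*
 * Count how many order-`order` allocations the free lists could satisfy,
 * assuming larger free blocks may be split but smaller ones never merge.
 * nr_free[k] is the number of free blocks of order k.
 */
static unsigned long allocatable(const unsigned long nr_free[NR_ORDERS],
                                 int order)
{
        unsigned long count = 0;
        int k;

        assert(order >= 0 && order < NR_ORDERS);
        for (k = order; k < NR_ORDERS; k++)
                count += nr_free[k] << (k - order);
        return count;
}

/* System A (assumed): four free order-1 blocks, none of them buddies. */
static const unsigned long sys_a[NR_ORDERS] = { 0, 4, 0 };
/* System B (assumed): one free order-2 block. */
static const unsigned long sys_b[NR_ORDERS] = { 0, 0, 1 };
```

Under these assumptions allocatable(sys_a, 1) is 4 and allocatable(sys_a, 2)
is 0, while allocatable(sys_b, 1) is 2 and allocatable(sys_b, 2) is 1, so
which system looks better depends on the order being requested.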

> Obviously, I would rather have one order-4 page than two order-3 pages.
> You can always satisfy an allocation for an order n if a page with an
> order higher than n is available.
>
> One way to measure fragmentation is to compare how far we are from some
> perfect value. The perfect value represents the case when all the free
> memory is available as blocks of pageblock_order or MAX_PAGE_ORDER.
>
> I can do this as a one shot calculation, for example with
>
> static void estimate_numa_fragmentation(void)
> {
>         pg_data_t *pgdat;
>         struct zone *z;
>         unsigned long fragscore;
>         unsigned long bestscore;
>         unsigned long nr_free;
>         int order;
>
>         for_each_online_pgdat(pgdat) {
>                 nr_free = fragscore = 0;
>                 z = pgdat->node_zones;
>                 while (z < (pgdat->node_zones + pgdat->nr_zones)) {
>                         if (!populated_zone(z)) {
>                                 z++;
>                                 continue;
>                         }
>                         spin_lock_irq(&z->lock);
>                         for (order = 0; order < NR_PAGE_ORDERS; order++) {
>                                 nr_free += z->free_area[order].nr_free << order;
>                                 fragscore += z->free_area[order].nr_free << (order * 2);
>                         }
>                         spin_unlock_irq(&z->lock);
>                         z++;
>                         cond_resched();
>                 }
>                 bestscore = nr_free << MAX_PAGE_ORDER;
>                 /* Guard against a node with no free pages (bestscore == 0). */
>                 fragscore = bestscore ? ((bestscore - fragscore) * 100) / bestscore : 0;
>                 pr_info("fragscore on node %d: %lu\n", pgdat->node_id, fragscore);
>         }
> }
>
> But there must be a way to streamline the calculation and update the value
> with low overhead over time.
>
> Cheers
> Karim
> PhD Student
> Edinburgh University
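
The quoted calculation can be tried outside the kernel. What follows is a
userspace sketch under stated assumptions, not code from the thread:
TOY_MAX_ORDER, struct frag_state, frag_add(), frag_remove() and frag_score()
are hypothetical names, and keeping two running sums is just one possible way
to update the value incrementally, not a proposed kernel interface.

```c
#include <assert.h>

/* Hypothetical stand-in for MAX_PAGE_ORDER in this toy model. */
#define TOY_MAX_ORDER 2

/* Running sums mirroring nr_free and fragscore in the quoted function. */
struct frag_state {
        unsigned long nr_pages;   /* sum of nr_free[order] << order */
        unsigned long weighted;   /* sum of nr_free[order] << (order * 2) */
};

/* Account for one block of the given order entering the free lists. */
static void frag_add(struct frag_state *s, int order)
{
        assert(order >= 0 && order <= TOY_MAX_ORDER);
        s->nr_pages += 1UL << order;
        s->weighted += 1UL << (order * 2);
}

/* Account for one block of the given order leaving the free lists. */
static void frag_remove(struct frag_state *s, int order)
{
        assert(order >= 0 && order <= TOY_MAX_ORDER);
        s->nr_pages -= 1UL << order;
        s->weighted -= 1UL << (order * 2);
}

/* Score in percent: 0 when all free memory sits in max-order blocks. */
static unsigned long frag_score(const struct frag_state *s)
{
        unsigned long best = s->nr_pages << TOY_MAX_ORDER;

        if (!best)      /* no free pages: avoid dividing by zero */
                return 0;
        return ((best - s->weighted) * 100) / best;
}
```

In this toy model, a state built from four frag_add(&s, 1) calls scores 50,
and a state built from a single frag_add(&s, 2) call scores 0. Each update
touches only two counters, so the score can be recomputed on demand without
walking the free lists.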
