linux-mm.kvack.org archive mirror
From: Yosry Ahmed <yosryahmed@google.com>
To: Yu Zhao <yuzhao@google.com>
Cc: Erhard Furtner <erhard_f@mailbox.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 linuxppc-dev@lists.ozlabs.org,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Nhat Pham <nphamcs@gmail.com>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	 Sergey Senozhatsky <senozhatsky@chromium.org>,
	Minchan Kim <minchan@kernel.org>
Subject: Re: kswapd0: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel v6.5.9, 32bit ppc)
Date: Tue, 4 Jun 2024 11:01:39 -0700
Message-ID: <CAJD7tkZ+QY55GTzW9A7ZCm=rxAEfrW76cWXf8o5nwiKSXp8z=w@mail.gmail.com>
In-Reply-To: <CAOUHufb6zXr14Wm3T-4-OJh7iAq+vzDKwVYfHLhMMt96SpiZXg@mail.gmail.com>

On Tue, Jun 4, 2024 at 10:54 AM Yu Zhao <yuzhao@google.com> wrote:
>
> On Tue, Jun 4, 2024 at 11:34 AM Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > On Tue, Jun 4, 2024 at 10:19 AM Yu Zhao <yuzhao@google.com> wrote:
> > >
> > > On Tue, Jun 4, 2024 at 10:12 AM Yosry Ahmed <yosryahmed@google.com> wrote:
> > > >
> > > > On Tue, Jun 4, 2024 at 4:45 AM Erhard Furtner <erhard_f@mailbox.org> wrote:
> > > > >
> > > > > On Mon, 3 Jun 2024 16:24:02 -0700
> > > > > Yosry Ahmed <yosryahmed@google.com> wrote:
> > > > >
> > > > > > Thanks for bisecting. Taking a look at the thread, it seems like you
> > > > > > have a very limited area of memory to allocate kernel memory from. One
> > > > > > possible reason why that commit can cause an issue is because we will
> > > > > > have multiple instances of the zsmalloc slab caches 'zspage' and
> > > > > > 'zs_handle', which may contribute to fragmentation in slab memory.
> > > > > >
> > > > > > Do you have /proc/slabinfo from a good and a bad run by any chance?
> > > > > >
> > > > > > Also, could you check if the attached patch helps? It makes sure that
> > > > > > even when we use multiple zsmalloc zpools, we will use a single slab
> > > > > > cache of each type.
> > > > >
> > > > > Thanks for looking into this! I got you 'cat /proc/slabinfo' from a good HEAD, from a bad HEAD and from the bad HEAD + your patch applied.
> > > > >
> > > > > > Good was 6be3601517d90b728095d70c14f3a04b9adcb166, bad was b8cf32dc6e8c75b712cbf638e0fd210101c22f17, both taken from my bisect.log. I got the slabinfo shortly after boot and a 2nd time shortly before the OOM or the kswapd0: page allocation failure happens. I terminated the workload (stress-ng --vm 2 --vm-bytes 1930M --verify -v) manually shortly before the 2 GiB of RAM was exhausted and got the slabinfo then.
> > > > >
> > > > > The patch applied to git b8cf32dc6e8c75b712cbf638e0fd210101c22f17 unfortunately didn't make a difference, I got the kswapd0: page allocation failure nevertheless.
> > > >
> > > > Thanks for trying this out. The patch reduces the amount of wasted
> > > > memory due to the 'zs_handle' and 'zspage' caches by an order of
> > > > magnitude, but it was a small number to begin with (~250K).
> > > >
> > > > I cannot think of other reasons why having multiple zsmalloc pools
> > > > will end up using more memory in the 0.25GB zone that the kernel
> > > > allocations can be made from.
> > > >
> > > > The number of zpools can be made configurable or determined at runtime
> > > > by the size of the machine, but I don't want to do this without
> > > > understanding the problem here first. Adding other zswap and zsmalloc
> > > > folks in case they have any ideas.
> > >
> > > Hi Erhard,
> > >
> > > If it's not too much trouble, could you "grep nr_zspages /proc/vmstat"
> > > on kernels before and after the bad commit? It'd be great if you could
> > > run the grep command right before the OOM kills.
> > >
> > > The overall internal fragmentation of multiple zsmalloc pools might be
> > > higher than a single one. I suspect this might be the cause.
> >
> > I thought about the internal fragmentation of pools, but zsmalloc
> > should have access to highmem, and if I understand correctly the
> > problem here is that we are running out of space in the DMA zone when
> > making kernel allocations.
> >
> > Do you suspect zsmalloc is allocating memory from the DMA zone
> > initially, even though it has access to highmem?
>
> There was a lot of user memory in the DMA zone. So at some point the
> highmem zone was full and allocation fallback happened.
>
> The problem with zone fallback is that recent allocations go into
> lower zones, meaning they are further back on the LRU list. This
> applies to both user memory and zsmalloc memory -- the latter has a
> writeback LRU. On top of this, neither the zswap shrinker nor the
> zsmalloc shrinker (compaction) is zone aware. So page reclaim might
> have trouble hitting the right target zone.

I see what you mean. In this case, yeah I think the internal
fragmentation in the zsmalloc pools may be the reason behind the
problem.

How many CPUs does this machine have? I am wondering if 32 pools is
overkill for small machines; perhaps the number of pools should be
min(nr_cpus, 32)?

Alternatively, the number of pools should scale with the memory size
in some way, such that we only increase fragmentation when it's
tolerable.
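
To make the two options concrete, here is a rough userspace sketch, not
a patch. The helper names, the ZSWAP_MAX_ZPOOLS macro and the
256 MiB-per-pool ratio are all made up for illustration; only the cap of
32 comes from this thread. The first helper never uses more pools than
there are CPUs, the second grows the pool count with RAM size.

```c
#include <assert.h>

/* Upper bound on pools; 32 is the count discussed in this thread. */
#define ZSWAP_MAX_ZPOOLS 32u

/* Assumption for illustration only: one pool per 256 MiB of RAM. */
#define MIB_PER_ZPOOL 256ul

/* Hypothetical helper: at most one pool per CPU, capped at 32. */
static unsigned int zswap_nr_zpools_by_cpus(unsigned int nr_cpus)
{
	assert(nr_cpus > 0);
	return nr_cpus < ZSWAP_MAX_ZPOOLS ? nr_cpus : ZSWAP_MAX_ZPOOLS;
}

/* Hypothetical helper: scale with RAM size, clamped to [1, 32]. */
static unsigned int zswap_nr_zpools_by_mem(unsigned long ram_mib)
{
	unsigned long n = ram_mib / MIB_PER_ZPOOL;

	if (n < 1)
		return 1;
	if (n > ZSWAP_MAX_ZPOOLS)
		return ZSWAP_MAX_ZPOOLS;
	return (unsigned int)n;
}
```

With these made-up numbers a 2 GiB machine would get 8 pools from the
memory-based variant. The right ratio would have to come from
measurements.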

>
> We can't really tell how zspages are distributed across zones, but the
> overall number might be helpful. It'd be great if someone could make
> nr_zspages per zone :)
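
For sampling the overall number repeatedly while the workload runs, a
small userspace helper along these lines could stand in for the
"grep nr_zspages /proc/vmstat" suggested above. It is only a sketch: the
function names are invented, and it assumes the usual "name value"
line layout of /proc/vmstat.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up counter `name` in vmstat-style text ("name value" per line).
 * Returns the value, or -1 if the counter is missing or malformed.
 */
static long vmstat_counter(const char *text, const char *name)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Require a space after the name so prefixes do not match. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ') {
			long val;

			if (sscanf(line + len, "%ld", &val) == 1)
				return val;
			return -1;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read `path` (e.g. "/proc/vmstat") line by line and look up `name`. */
static long read_vmstat_counter(const char *path, const char *name)
{
	char line[256];
	long val = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		val = vmstat_counter(line, name);
		if (val >= 0)
			break;
	}
	fclose(f);
	return val;
}
```

Calling read_vmstat_counter("/proc/vmstat", "nr_zspages") shortly
before the OOM on both the good and the bad kernel would give the two
numbers to compare.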


