From: Michal Hocko <mhocko@suse.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yosry Ahmed <yosryahmed@google.com>,
Rik van Riel <riel@surriel.com>,
Balbir Singh <balbirs@nvidia.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
hakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
cgroups@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com,
Nhat Pham <nphamcs@gmail.com>
Subject: Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap
Date: Tue, 14 Jan 2025 17:46:37 +0100 [thread overview]
Message-ID: <Z4aU7dn_TKeeTmP_@tiehlicka> (raw)
In-Reply-To: <20250114160955.GA1115056@cmpxchg.org>
On Tue 14-01-25 11:09:55, Johannes Weiner wrote:
> Hi,
>
> On Mon, Dec 16, 2024 at 04:39:12PM +0100, Michal Hocko wrote:
> > On Thu 12-12-24 13:30:12, Johannes Weiner wrote:
[...]
> > > If we return -ENOMEM to an OOM victim in a fault, the fault handler
> > > will re-trigger OOM, which will find the existing OOM victim and do
> > > nothing, then restart the fault.
> >
> > IIRC the task will handle the pending SIGKILL if the #PF fails. If the
> > charge happens from the exit path then we rely on ENOMEM returned from
> > gup as a signal to back off. Do we have any caller that keeps retrying
> > on ENOMEM?
>
> We managed to extract a stack trace of the livelocked task:
>
> obj_cgroup_may_swap
> zswap_store
> swap_writepage
> shrink_folio_list
> shrink_lruvec
> shrink_node
> do_try_to_free_pages
> try_to_free_mem_cgroup_pages
OK, so this is the reclaim path and it fails due to reasons you mention
below. This will retry several times until it hits mem_cgroup_oom which
will bail in mem_cgroup_out_of_memory because of task_is_dying (returns
true) and retry the charge + reclaim (as the oom killer hasn't done
anything) with passed_oom = true this time and eventually got to nomem
path and returns ENOMEM. This should propaged -ENOMEM down the path
> charge_memcg
> mem_cgroup_swapin_charge_folio
> __read_swap_cache_async
> swapin_readahead
> do_swap_page
> handle_mm_fault
> do_user_addr_fault
> exc_page_fault
> asm_exc_page_fault
> __get_user
All the way here and return the failure to futex_cleanup which doesn't
retry __get_user on the failure AFAICS (exit_robust_list). But I might
be missing something, it's been quite some time since I've looked into
futex code.
> futex_cleanup
> fuxtex_exit_release
> do_exit
> do_group_exit
> get_signal
> arch_do_signal_or_restart
> exit_to_user_mode_prepare
> syscall_exit_to_user_mode
> do_syscall
> entry_SYSCALL_64
> syscall
>
> Both memory.max and memory.zswap.max are hit. I don't see how this
> could ever make forward progress - the futex fault will retry until it
> succeeds.
I must be missing something but I do not see the retry, could you point
me where this is happening please?
--
Michal Hocko
SUSE Labs
next prev parent reply other threads:[~2025-01-14 16:46 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-12 16:57 Rik van Riel
2024-12-12 17:06 ` Yosry Ahmed
2024-12-12 17:51 ` Shakeel Butt
2024-12-12 18:02 ` Rik van Riel
2024-12-12 18:18 ` Nhat Pham
2024-12-12 18:11 ` Nhat Pham
2024-12-12 18:30 ` Johannes Weiner
2024-12-12 21:35 ` Shakeel Butt
2024-12-12 21:41 ` Yosry Ahmed
2024-12-13 0:32 ` Roman Gushchin
2024-12-13 4:42 ` Johannes Weiner
2024-12-16 15:39 ` Michal Hocko
2025-01-14 16:09 ` Johannes Weiner
2025-01-14 16:46 ` Michal Hocko [this message]
2025-01-14 16:51 ` Rik van Riel
2025-01-14 17:00 ` Michal Hocko
2025-01-14 17:11 ` Rik van Riel
2025-01-14 18:13 ` Michal Hocko
2025-01-14 19:23 ` Johannes Weiner
2025-01-14 19:42 ` Michal Hocko
2025-01-15 17:35 ` Rik van Riel
2025-01-15 19:41 ` Michal Hocko
2025-01-14 16:54 ` Michal Hocko
2025-01-14 16:56 ` Rik van Riel
2025-01-14 16:56 ` Michal Hocko
2024-12-12 18:31 ` Roman Gushchin
2024-12-12 20:00 ` Rik van Riel
2024-12-13 0:49 ` Roman Gushchin
2024-12-13 2:54 ` Balbir Singh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z4aU7dn_TKeeTmP_@tiehlicka \
--to=mhocko@suse.com \
--cc=akpm@linux-foundation.org \
--cc=balbirs@nvidia.com \
--cc=cgroups@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=muchun.song@linux.dev \
--cc=nphamcs@gmail.com \
--cc=riel@surriel.com \
--cc=roman.gushchin@linux.dev \
--cc=shakeel.butt@linux.dev \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox