linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yosry Ahmed <yosryahmed@google.com>,
	Rik van Riel <riel@surriel.com>,
	Balbir Singh <balbirs@nvidia.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	hakeel Butt <shakeel.butt@linux.dev>,
	Muchun Song <muchun.song@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@meta.com,
	Nhat Pham <nphamcs@gmail.com>
Subject: Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap
Date: Tue, 14 Jan 2025 17:46:37 +0100	[thread overview]
Message-ID: <Z4aU7dn_TKeeTmP_@tiehlicka> (raw)
In-Reply-To: <20250114160955.GA1115056@cmpxchg.org>

On Tue 14-01-25 11:09:55, Johannes Weiner wrote:
> Hi,
> 
> On Mon, Dec 16, 2024 at 04:39:12PM +0100, Michal Hocko wrote:
> > On Thu 12-12-24 13:30:12, Johannes Weiner wrote:
[...]
> > > If we return -ENOMEM to an OOM victim in a fault, the fault handler
> > > will re-trigger OOM, which will find the existing OOM victim and do
> > > nothing, then restart the fault.
> > 
> > IIRC the task will handle the pending SIGKILL if the #PF fails. If the
> > charge happens from the exit path then we rely on ENOMEM returned from
> > gup as a signal to back off. Do we have any caller that keeps retrying
> > on ENOMEM?
> 
> We managed to extract a stack trace of the livelocked task:
> 
> obj_cgroup_may_swap
> zswap_store
> swap_writepage
> shrink_folio_list
> shrink_lruvec
> shrink_node
> do_try_to_free_pages
> try_to_free_mem_cgroup_pages

OK, so this is the reclaim path and it fails due to reasons you mention
below. This will retry several times until it hits mem_cgroup_oom which
will bail in mem_cgroup_out_of_memory because of task_is_dying (returns
true) and retry the charge + reclaim (as the oom killer hasn't done
anything) with passed_oom = true this time and eventually got to nomem
path and returns ENOMEM. This should propaged -ENOMEM down the path

> charge_memcg
> mem_cgroup_swapin_charge_folio
> __read_swap_cache_async
> swapin_readahead
> do_swap_page
> handle_mm_fault
> do_user_addr_fault
> exc_page_fault
> asm_exc_page_fault
> __get_user

All the way here and return the failure to futex_cleanup which doesn't
retry __get_user on the failure AFAICS (exit_robust_list). But I might
be missing something, it's been quite some time since I've looked into
futex code.

> futex_cleanup
> fuxtex_exit_release
> do_exit
> do_group_exit
> get_signal
> arch_do_signal_or_restart
> exit_to_user_mode_prepare
> syscall_exit_to_user_mode
> do_syscall
> entry_SYSCALL_64
> syscall
> 
> Both memory.max and memory.zswap.max are hit. I don't see how this
> could ever make forward progress - the futex fault will retry until it
> succeeds.

I must be missing something but I do not see the retry, could you point
me where this is happening please?

-- 
Michal Hocko
SUSE Labs


  reply	other threads:[~2025-01-14 16:46 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-12 16:57 Rik van Riel
2024-12-12 17:06 ` Yosry Ahmed
2024-12-12 17:51   ` Shakeel Butt
2024-12-12 18:02     ` Rik van Riel
2024-12-12 18:18       ` Nhat Pham
2024-12-12 18:11   ` Nhat Pham
2024-12-12 18:30   ` Johannes Weiner
2024-12-12 21:35     ` Shakeel Butt
2024-12-12 21:41       ` Yosry Ahmed
2024-12-13  0:32     ` Roman Gushchin
2024-12-13  4:42       ` Johannes Weiner
2024-12-16 15:39     ` Michal Hocko
2025-01-14 16:09       ` Johannes Weiner
2025-01-14 16:46         ` Michal Hocko [this message]
2025-01-14 16:51           ` Rik van Riel
2025-01-14 17:00             ` Michal Hocko
2025-01-14 17:11               ` Rik van Riel
2025-01-14 18:13                 ` Michal Hocko
2025-01-14 19:23                   ` Johannes Weiner
2025-01-14 19:42                     ` Michal Hocko
2025-01-15 17:35                       ` Rik van Riel
2025-01-15 19:41                         ` Michal Hocko
2025-01-14 16:54           ` Michal Hocko
2025-01-14 16:56             ` Rik van Riel
2025-01-14 16:56             ` Michal Hocko
2024-12-12 18:31 ` Roman Gushchin
2024-12-12 20:00   ` Rik van Riel
2024-12-13  0:49     ` Roman Gushchin
2024-12-13  2:54     ` Balbir Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z4aU7dn_TKeeTmP_@tiehlicka \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=balbirs@nvidia.com \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=muchun.song@linux.dev \
    --cc=nphamcs@gmail.com \
    --cc=riel@surriel.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox