Date: Mon, 17 Jan 2022 12:33:33 +0100
From: Michal Hocko
To: Joel Savitz
Cc: Andrew Morton, linux-kernel, Waiman Long, linux-mm@kvack.org, Nico Pache, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Darren Hart, Davidlohr Bueso, André Almeida
Subject: Re: [PATCH] mm/oom_kill: wake futex waiters before annihilating victim shared mutex
References: <20211207214902.772614-1-jsavitz@redhat.com> <20211207154759.3f3fe272349c77e0c4aca36f@linux-foundation.org>

I have only noticed your email now after replying to v3 so our emails
have crossed.

On Fri 14-01-22 09:39:55, Joel Savitz wrote:
> > What has happened to the oom victim and why it has never exited?
>
> What appears to happen is that the oom victim is sent SIGKILL by the
> process that triggers the oom while also being marked as an oom
> victim.
>
> As you mention in your patchset introducing the oom reaper in commit
> aac4536355496 ("mm, oom: introduce oom reaper"), the purpose of the
> oom reaper is to try and free more memory more quickly than it
> otherwise would have been by assuming anonymous or swapped out pages
> won't be needed in the exit path as the owner is already dying.
> However, this assumption is violated by the futex_cleanup() path,
> which needs access to userspace in fetch_robust_entry() when it is
> called in exit_robust_list(). Trace_printk()s in this failure path
> reveal an apparent race between the oom reaper thread reaping the
> victim's mm and the futex_cleanup() path. There may be other ways that
> this race manifests but we have been most consistently able to trace
> that one.

Please let's continue the discussion in the v3 email thread:
http://lkml.kernel.org/r/20220114180135.83308-1-npache@redhat.com
-- 
Michal Hocko
SUSE Labs