Date: Wed, 15 Jan 2020 12:27:09 -0800 (PST)
From: David Rientjes
To: Tetsuo Handa
Cc: Michal Hocko, Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, oom: dump stack of victim when reaping failed
In-Reply-To: <9a7cbbf0-4283-f932-e422-84b4fb42a055@I-love.SAKURA.ne.jp>
References: <20200115084336.GW19428@dhcp22.suse.cz> <9a7cbbf0-4283-f932-e422-84b4fb42a055@I-love.SAKURA.ne.jp>

On Wed, 15 Jan 2020, Tetsuo Handa wrote:

> >> When a process cannot be oom reaped, for whatever reason, currently the
> >> list of locks that are held is currently dumped to the kernel log.
> >>
> >> Much more interesting is the stack trace of the victim that cannot be
> >> reaped.  If the stack trace is dumped, we have the ability to find
> >> related occurrences in the same kernel code and hopefully solve the
> >> issue that is making it wedged.
> >>
> >> Dump the stack trace when a process fails to be oom reaped.
> >
> > Yes, this is really helpful.
>
> tsk would be a thread group leader, but the thread which got stuck is not
> always a thread group leader. Maybe dump all threads in that thread group
> without PF_EXITING (or something) ?
>

That's possible, yes.  I think it comes down to the classic problem of how
much info in the kernel log on oom kill is too much.  Stacks for all
threads that match the mm being reaped may be *very* verbose.  I'm
currently tracking a stall in oom reaping where the victim doesn't always
have a lock held so we don't know where it's at in the kernel; I'm hoping
that a stack for the thread group leader will at least shed some light on
it.
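
For illustration only, here is a minimal userspace model of the filter
suggested above: skip threads that have PF_EXITING set and cap how many
stacks get dumped, so the log stays bounded.  This is not kernel code and
not part of the patch.  struct model_thread, dump_stuck_threads() and the
max_dumps cap are hypothetical names made up for this sketch; only the
PF_EXITING flag comes from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same bit the kernel uses for PF_EXITING in include/linux/sched.h. */
#define PF_EXITING 0x00000004u

/* Hypothetical stand-in for task_struct: only what the filter needs. */
struct model_thread {
	int pid;
	unsigned int flags;
};

/*
 * Print one line for each thread that is not exiting, stopping after
 * max_dumps lines, and return how many lines were printed.  In the
 * kernel each line would be a full stack trace, which is why the cap
 * (an assumption of this sketch) matters for log verbosity.
 */
size_t dump_stuck_threads(const struct model_thread *threads, size_t n,
			  size_t max_dumps)
{
	size_t dumped = 0;

	assert(threads != NULL || n == 0);

	for (size_t i = 0; i < n && dumped < max_dumps; i++) {
		/* Exiting threads are already on their way out: skip them. */
		if (threads[i].flags & PF_EXITING)
			continue;
		printf("would dump stack of pid %d\n", threads[i].pid);
		dumped++;
	}
	return dumped;
}
```

With a cap of 1 and the group leader first in the array, this reduces to
dumping only the leader's stack (provided the leader is not itself
exiting); raising the cap trades log volume for a better chance of
catching the thread that is actually stuck.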