From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35930C433EF for ; Thu, 21 Jul 2022 11:37:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BB9058E0006; Thu, 21 Jul 2022 07:37:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B423F8E0001; Thu, 21 Jul 2022 07:37:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9E3128E0006; Thu, 21 Jul 2022 07:37:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 88AF28E0001 for ; Thu, 21 Jul 2022 07:37:33 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id BB7384087F for ; Thu, 21 Jul 2022 08:01:39 +0000 (UTC) X-FDA: 79710362718.10.728F10F Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by imf17.hostedemail.com (Postfix) with ESMTP id 0229A40096 for ; Thu, 21 Jul 2022 08:01:38 +0000 (UTC) Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 81F3338028; Thu, 21 Jul 2022 08:01:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1658390497; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=CajTUMjjesrNHxlvidIJpyhqdgaaflwvP2c+v2NTBeU=; b=oAl6S2F1eq9/Sk+vjH6umRMMuLNSCtrdzp8Q1+4xnPEmqKBHbI/6AsH5OqGdm08wR4f+F4 V3QuaCnD1Z63zuSUV7UW7RezCj0bAUf0Dl4WTJGc0NIiI/nVtLJyt4bXyOZyzxLtqJ8kUv 9+oO8pTCK7Du7NXJtONvHWh7hBzQYzU= Received: from suse.cz (unknown [10.100.201.86]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 36D522C16D; Thu, 21 Jul 2022 08:01:31 +0000 (UTC) Date: Thu, 21 Jul 2022 10:01:36 +0200 From: Michal Hocko To: Tetsuo Handa Cc: Johannes Weiner , Andrew Morton , linux-mm Subject: Re: [PATCH] mm: memcontrol: fix potential oom_lock recursion deadlock Message-ID: References: <000000000000471c2905e3c2c2c2@google.com> <20220714141813.yi5p4o2tiyvkao6b@quack3> <534fa596-0c29-0f1e-b292-53ad9c3dbbe3@I-love.SAKURA.ne.jp> <20220715013908.ayyimue5yhfwonho@google.com> <03304bf8-d153-698f-0376-9e9a0ec1048e@I-love.SAKURA.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1658390499; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CajTUMjjesrNHxlvidIJpyhqdgaaflwvP2c+v2NTBeU=; b=cUNmChjG38d5o4bRgPFMyPASSSSBxYJeksUAwNd17941qBd7D0mKLQ78KUn+yntL7KhKcc daAr1X9Ahk95hZ2PecZD3X55EBT/WP2diGVUjcdooEMtDtf+OklffXEeRKByKUjIgfC6Bv NjGhhuwyCpIGfc8+6PefCbp2wfznYJI= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1658390499; a=rsa-sha256; cv=none; b=6OjQWLELwwbPsZ4G9Xy739+or4L5skjHOoVl3j3PUff3U54zCgrz9E5rPcDWAQGhmpdlbw vj79pUm4pVugdwjtgHc0oSdpSadSg2nPARC7e/THO7ngQXD/McinhIV8wDi8WmgpZ1aeO9 +cT6f/JA9omLjfwuyv0LV+ZpSnWAo6c= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=suse.com header.s=susede1 header.b=oAl6S2F1; dmarc=pass (policy=quarantine) header.from=suse.com; spf=pass (imf17.hostedemail.com: domain of mhocko@suse.com designates 195.135.220.28 as permitted sender) smtp.mailfrom=mhocko@suse.com X-Rspamd-Queue-Id: 0229A40096 Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=suse.com header.s=susede1 header.b=oAl6S2F1; dmarc=pass (policy=quarantine) header.from=suse.com; spf=pass (imf17.hostedemail.com: domain of mhocko@suse.com designates 195.135.220.28 as permitted sender) smtp.mailfrom=mhocko@suse.com X-Rspamd-Server: rspam12 X-Rspam-User: X-Stat-Signature: u7c69quwdj6o7x8grrtfn4otm4bwgchx X-HE-Tag: 1658390498-320160 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu 21-07-22 08:49:57, Tetsuo Handa wrote: > syzbot is reporting fs_reclaim allocation with oom_lock held [1]. We > must make sure that such allocation won't hit __alloc_pages_may_oom() > path which will retry forever if oom_lock is already held. > > I choose GFP_ATOMIC than GFP_NOWAIT, for since global OOM situation will > likely be avoided by killing some process in memcg, and memory will be > released after printk(), trying a little hard will be acceptable. Nope, this is not a proper fix. You are making memory.stat more likely to fail. An uncoditional GFP_KERNEL allocation is certainly not good but is there any reason to not use GFP_NOIO instead? Or even better. In an ideal world we won't allocate from here at all. Can we pre-allocate that single page and re-use it for the oom report? --- diff --git a/mm/memcontrol.c b/mm/memcontrol.c index abec50f31fe6..13483cb278bb 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1460,14 +1460,12 @@ static inline unsigned long memcg_page_state_output(struct mem_cgroup *memcg, return memcg_page_state(memcg, item) * memcg_page_state_unit(item); } -static char *memory_stat_format(struct mem_cgroup *memcg) +void memory_stat_format(struct mem_cgroup *memcg, char *buf) { struct seq_buf s; int i; - seq_buf_init(&s, kmalloc(PAGE_SIZE, GFP_KERNEL), PAGE_SIZE); - if (!s.buffer) - return NULL; + seq_buf_init(&s, buf, PAGE_SIZE); /* * Provide statistics on the state of the memory subsystem as @@ -1533,8 +1531,6 @@ static char *memory_stat_format(struct mem_cgroup *memcg) /* The above should easily fit into one page */ WARN_ON_ONCE(seq_buf_has_overflowed(&s)); - - return s.buffer; } #define K(x) ((x) << (PAGE_SHIFT-10)) @@ -1563,6 +1559,12 @@ void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct * rcu_read_unlock(); } +/* + * preallocated buffer to collect memory stats for the oom situation. + * Usage protected by oom_lock + */ +char oombuf[PAGE_SIZE]; + /** * mem_cgroup_print_oom_meminfo: Print OOM memory information relevant to * memory controller. @@ -1570,7 +1572,7 @@ void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct * */ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg) { - char *buf; + lockdep_assert_held(&oom_lock); pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n", K((u64)page_counter_read(&memcg->memory)), @@ -1591,11 +1593,8 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg) pr_info("Memory cgroup stats for "); pr_cont_cgroup_path(memcg->css.cgroup); pr_cont(":"); - buf = memory_stat_format(memcg); - if (!buf) - return; - pr_info("%s", buf); - kfree(buf); + memory_stat_format(memcg, oombuf); + pr_info("%s", oombuf); } /* @@ -6335,11 +6334,11 @@ static int memory_events_local_show(struct seq_file *m, void *v) static int memory_stat_show(struct seq_file *m, void *v) { struct mem_cgroup *memcg = mem_cgroup_from_seq(m); - char *buf; + char *buf = kmalloc(PAGE_SIZE, GFP_KERNEL); - buf = memory_stat_format(memcg); if (!buf) return -ENOMEM; + memory_stat_format(memcg, buf); seq_puts(m, buf); kfree(buf); return 0; -- Michal Hocko SUSE Labs