Date: Thu, 17 Sep 2020 14:12:13 +0200
From: Michal Hocko
To: Vijay Balakrishna
Cc: Andrew Morton, "Kirill A. Shutemov", Oleg Nesterov, Song Liu,
 Andrea Arcangeli, Pavel Tatashin, Allen Pais,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [[PATCH]] mm: khugepaged: recalculate min_free_kbytes after
 memory hotplug as expected by khugepaged
Message-ID: <20200917121213.GC29887@dhcp22.suse.cz>
In-Reply-To: <32b73685-48f2-b6dd-f000-8ea52cfee70a@linux.microsoft.com>
References: <1599770859-14826-1-git-send-email-vijayb@linux.microsoft.com>
 <20200914143312.GU16999@dhcp22.suse.cz>
 <20200915081832.GA4649@dhcp22.suse.cz>
 <53dd1e2c-f07e-ee5b-51a1-0ef8adb53926@linux.microsoft.com>
 <20200916065306.GB18998@dhcp22.suse.cz>
 <32b73685-48f2-b6dd-f000-8ea52cfee70a@linux.microsoft.com>

On Wed 16-09-20 11:28:40, Vijay Balakrishna wrote:
[...]
> OOM splat below. I see we had kmem leak detection turned on here. We
> haven't run stress with kmem leak detection since uncovering low
> min_free_kbytes. During investigation we wanted to make sure there is no
> kmem leaks, we didn't find significant leaks detected.
>
> [330319.766059] systemd invoked oom-killer:
> gfp_mask=0x40cc0(GFP_KERNEL|__GFP_COMP), order=1, oom_score_adj=0
[...]
> [330319.861064] Mem-Info:
> [330319.863519] active_anon:60744 inactive_anon:109226 isolated_anon:0
>  active_file:6418 inactive_file:3869 isolated_file:2
>  unevictable:0 dirty:8 writeback:1 unstable:0
>  slab_reclaimable:34660 slab_unreclaimable:795718
>  mapped:1256 shmem:165765 pagetables:689 bounce:0
>  free:340962 free_pcp:4672 free_cma:0

The memory consumption is predominantly in slab (unreclaimable).
Only ~8% of the memory is on LRUs (anonymous + file). Slab (both
reclaimable and unreclaimable) is ~40%. So there is still a lot of
memory unaccounted for (direct users of the page allocator). This would
partially explain why the oom killer is not able to make progress and
eventually panics: it is the kernel itself which is blowing up the
memory consumption.

There is still ~1G of free memory but the problem is that this is a
GFP_KERNEL request which is not allowed to consume Movable memory. Zone
Normal is depleted (free:12652kB is below min:14344kB) and therefore it
cannot satisfy this request even when there are some order-1 pages
available.

> [330319.928124] Node 0 Normal free:12652kB min:14344kB low:19092kB
> high:23840kB active_anon:55340kB inactive_anon:60276kB active_file:60kB
> inactive_file:128kB unevictable:0kB writepending:4kB present:6220656kB
> managed:4750196kB mlocked:0kB kernel_stack:9568kB pagetables:2756kB
> bounce:0kB free_pcp:10056kB local_pcp:1376kB free_cma:0kB
[...]
> [330319.996879] Node 0 Normal: 3138*4kB (UME) 38*8kB (UM) 0*16kB 0*32kB
> 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 12856kB

I do not see the state of swap in the oom splat so I assume you have
swap disabled. If that is the case then the memory reclaim cannot really
do much for this request. There is almost no page cache to reclaim.

That being said I do not see how an increased min_free_kbytes could help
for this particular OOM situation. If there is really any relation it is
more of an unintended side effect.

[...]
> > > Extreme values can damage your system. Setting min_free_kbytes to an
> > > extremely low value prevents the system from reclaiming memory, which can
> > > result in system hangs and OOM-killing processes. However, setting
> > > min_free_kbytes too high (for example, to 5–10% of total system memory)
> > > causes the system to enter an out-of-memory state immediately, resulting in
> > > the system spending too much time reclaiming memory.
> >
> > The auto tuned value should never reach such a low value to cause
> > problems.
>
> The auto tuned value is incorrect post hotplug memory operation, in our use
> case memory hot add occurs very early during boot.

Define incorrect. What are the actual values? Have you tried to increase
the value manually after the hotplug?
-- 
Michal Hocko
SUSE Labs
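
For anybody who wants to recheck the arithmetic behind the numbers above, here is a minimal sketch in Python. The page counts and zone Normal figures are copied from the quoted splat. Two things are assumptions and not taken from the report: a 4kB base page size, and a total of 8GiB of RAM (the excerpt does not show the total, so that value is purely hypothetical and only used to illustrate the percentages). The watermark comparison is a simplification; the real check in the page allocator also takes lowmem reserves and allocation flags into account.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages (matches the smallest buddy size in the splat)

# Page counts copied from the Mem-Info block of the quoted OOM splat.
MEM_INFO_PAGES = {
    "active_anon": 60744,
    "inactive_anon": 109226,
    "active_file": 6418,
    "inactive_file": 3869,
    "slab_reclaimable": 34660,
    "slab_unreclaimable": 795718,
    "free": 340962,
}

# Assumption: total RAM is not shown in the excerpt; 8 GiB is a hypothetical
# value used only to turn the sizes into percentages.
ASSUMED_TOTAL_KB = 8 * 1024 * 1024


def kb(*fields):
    """Sum the named Mem-Info counters and convert pages to kB."""
    return sum(MEM_INFO_PAGES[f] for f in fields) * PAGE_KB


def share(part_kb, total_kb=ASSUMED_TOTAL_KB):
    """Percentage of total_kb taken by part_kb."""
    return 100.0 * part_kb / total_kb


lru_kb = kb("active_anon", "inactive_anon", "active_file", "inactive_file")
slab_kb = kb("slab_reclaimable", "slab_unreclaimable")
free_kb = kb("free")

# Zone Normal figures copied from the splat.
NORMAL_FREE_KB = 12652
NORMAL_MIN_KB = 14344
NORMAL_BUDDY = (
    "3138*4kB (UME) 38*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB "
    "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB"
)


def buddy_blocks(line):
    """Return {block_size_kb: count} parsed from a buddy free-list line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


blocks = buddy_blocks(NORMAL_BUDDY)
buddy_total_kb = sum(size * count for size, count in blocks.items())

# Simplified check: the zone's free memory is compared with the min
# watermark only; lowmem reserves and allocation flags are ignored here.
below_min = NORMAL_FREE_KB < NORMAL_MIN_KB

if __name__ == "__main__":
    print(f"LRU:  {lru_kb} kB ({share(lru_kb):.1f}% of assumed total)")
    print(f"slab: {slab_kb} kB ({share(slab_kb):.1f}% of assumed total)")
    print(f"free: {free_kb} kB")
    print(f"Normal buddy lists: {buddy_total_kb} kB, "
          f"order-1 blocks: {blocks.get(2 * PAGE_KB, 0)}")
    print(f"Normal below min watermark: {below_min}")
```

With these inputs zone Normal still holds a few order-1 (8kB) blocks while its free memory is below the min watermark, which is the situation described above for the order-1 GFP_KERNEL request.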