Subject: Re: Cannot allocate memory despite buff/cache non-zero
From: Paul Menzel
To: David Hildenbrand, Andrew Morton
Cc: linux-mm@kvack.org, it+linux-mm@molgen.mpg.de
Date: Thu, 10 Jun 2021 07:58:55 +0200
Message-ID: <2033f112-6704-100c-1eb9-7e1ad3187cce@molgen.mpg.de>

Dear David,


Thank you for your reply.


On 09.06.21 at 13:17, David Hildenbrand wrote:
> On 04.06.21 13:36, Paul Menzel wrote:
>> On a 1 TB RAM compute server with Linux 5.10.24 and memory
>> overcommitting disabled, we ran into a situation where processes like
>> SSH couldn’t allocate memory anymore.
>>
>>       $ more /proc/cmdline
>>       BOOT_IMAGE=/boot/bzImage-5.10.24.mx64.375 root=LABEL=root ro crashkernel=256M console=ttyS0,115200n8 console=tty0 init=/bin/systemd audit=0 random.trust_cpu=on
>>
>>       2021-06-03T22:00:28+02:00 godsavethequeen sshd[89163]: pam_systemd(sshd:session): Failed to create session: Unit session-25654.scope not found.
>>       2021-06-03T22:00:29+02:00 godsavethequeen sshd[89163]: error: do_exec_no_pty: fork: Cannot allocate memory
>>       2021-06-03T22:00:29+02:00 godsavethequeen sshd[89163]: pam_unix(sshd:session): session closed for user root
>>       2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: fork: Cannot allocate memory
>>       2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: ssh_msg_send: write: Broken pipe
>>       2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: send_rexec_state: ssh_msg_send failed
>>
>>       $ free -h
>>                     total        used        free      shared  buff/cache   available
>>       Mem:           1.0T        606G        2.6G        2.2M        395G        391G
>>       Swap:            0B          0B          0B
>>
>> Looking at this, I would have expected, that the pages(?) in buff/cache
>> would be moved/deleted to make memory available.
>>
>> Looking at `/proc/meminfo` (attached):
>>
>>       MemTotal:       1052411824 kB
>>       MemFree:           2709976 kB
>>       MemAvailable:    410847908 kB
>>       […]
>>       CommitLimit:    1052411824 kB
>>       Committed_AS:   1052455260 kB
>>       […]
>
> With memory overcommit disabled, each accountable mapping
> (mm/mmap.c:accountable_mapping()) will count towards Committed_AS. So
> you might still have plenty of free memory in the system reserved for
> these mappings, yet Linux won't allow for more accountable mappings.
> That's why you see the "Cannot allocate memory" messages. mmap() failed.

    Buffers:               3212 kB
    Cached:           411083788 kB
    SwapCached:               0 kB
    Active:           303175824 kB
    Inactive:         740080100 kB
    Active(anon):          1448 kB
    Inactive(anon):   632169724 kB
    Active(file):     303174376 kB
    Inactive(file):   107910376 kB

The documentation [1] describes *MemAvailable*, *Buffers*, and *Cached*:

>     MemAvailable
>               An estimate of how much memory is available for starting new
>               applications, without swapping. Calculated from MemFree,
>               SReclaimable, the size of the file LRU lists, and the low
>               watermarks in each zone.
>               The estimate takes into account that the system needs some
>               page cache to function well, and that not all reclaimable
>               slab will be reclaimable, due to items being in use. The
>               impact of those factors will vary from system to system.
>     Buffers
>               Relatively temporary storage for raw disk blocks
>               shouldn't get tremendously large (20MB or so)
>     Cached
>               in-memory cache for files read from the disk (the
>               pagecache).  Doesn't include SwapCached

So I would have assumed that the kernel evicts files from this
in-memory file cache.

>> Committed_AS is greater than the commit limit (total memory).
>>
>> Is such behavior expected?
>
> We're talking about 43436 kB that exceed the CommitLimit.
>
> The CommitLimit might change (grow/shrink) when
> a) The number of hugetlb pages changes
> b) Swap space is resized
>
> If CommitLimit did not change, Committed_AS should actually not exceed
> it. IIUC, it can only happen temporarily while trying creation of a new
> mapping. We increase Committed_AS unconditionally and decrease it again
> if we reject it.

I can’t say for sure, as the system was rebooted, but I thought the
value stayed the same.


Kind regards,

Paul


[1]: https://www.kernel.org/doc/html/latest/filesystems/proc.html#meminfo
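
As a cross-check of the numbers quoted in this thread, here is a minimal
sketch in Python that parses `/proc/meminfo`-style text and computes how
much commit headroom is left. The helper names `parse_meminfo` and
`commit_headroom_kb` are hypothetical, and the sample text contains only
values quoted above; it is an illustration of the arithmetic, not a
diagnostic tool.

```python
import re

# Values copied from the /proc/meminfo excerpts quoted in this thread.
SAMPLE = """\
MemTotal:       1052411824 kB
MemFree:           2709976 kB
MemAvailable:    410847908 kB
Cached:          411083788 kB
CommitLimit:    1052411824 kB
Committed_AS:   1052455260 kB
"""


def parse_meminfo(text):
    """Return {field: value} for lines of the form 'Name:  value [kB]'."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^(\S+):\s+(\d+)(?:\s+kB)?\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def commit_headroom_kb(fields):
    """kB still committable; negative means Committed_AS exceeds CommitLimit."""
    return fields["CommitLimit"] - fields["Committed_AS"]


if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    # On a live system one could instead use:
    #   info = parse_meminfo(open("/proc/meminfo").read())
    print("commit headroom:", commit_headroom_kb(info), "kB")
    print("MemAvailable:   ", info["MemAvailable"], "kB")
```

With the sample values the headroom is -43436 kB, which matches the
43436 kB mentioned above, while MemAvailable is still roughly 391 GiB.
That is consistent with the explanation in the quoted reply: with
overcommit disabled, new accountable mappings are refused based on
Committed_AS, regardless of how much page cache could be reclaimed.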