Date: Wed, 8 Jan 2020 10:25:01 +0100
From: Michal Hocko
To: Chris Murphy
Cc: linux-mm@kvack.org
Subject: Re: user space unresponsive, followup: lsf/mm congestion
Message-ID: <20200108092501.GO32178@dhcp22.suse.cz>
References: <20200107205824.GM32178@dhcp22.suse.cz>

On Tue 07-01-20 14:25:46, Chris Murphy wrote:
> On Tue, Jan 7, 2020 at 1:58 PM Michal Hocko wrote:
[...]
> > Btw.
> > from a quick look at the sysrq output there seems to be quite a lot
> > of tasks (more than 1k) running on the system. Only a handful of them
> > belong to the compilation. kswapd is busy and 13 processes are in
> > direct reclaim, all swapping out to the disk.
>
> There might be many dozens of tabs in Firefox with nothing loaded in
> them, trying to keep the testing more real world (a compile while
> browsing) rather than being too deferential to the compile. That does
> clutter the sysrq+t but it doesn't change the outcome of the central
> culprit which is the ninja compile, which by default does n+2 jobs
> where n is the number of virtual CPUs.

How much memory does the compile process eat?

> > From the above, my first guess would be that you are oversubscribing
> > the memory you have available. I would focus on who is consuming all
> > that memory.
>
> ninja - I have made the argument that it is in some sense sabotaging
> the system, and I think they're trying to do something a little
> smarter with their defaults; however, it's an unprivileged task acting
> as a kind of fork bomb that takes down the system.

Well, I am not sure the fork bomb analogy is appropriate. There are only
a dozen compile processes captured, so unless there are many more in
other phases, this is really negligible compared to the rest of the
workloads running on the system.

> It's a really
> eyebrow raising and remarkable experience. And it's common within the
> somewhat vertical use case of developers compiling things on their own
> systems. Many IDEs use a ton of resources, as much as they can get.
> It's not clear to me by what mechanism either the user or these
> processes are supposed to effectively negotiate for limited resources,
> other than resource restriction. But anyway, they aren't contrived or
> malicious examples.

If you know that the compilation process is too disruptive wrt.
memory/cpu consumption then you can use cgroups (memory and cpu
controllers) to throttle that consumption and protect the rest of the
system. The compilation process will take much more time of course, and
the explicit configuration is obviously less comfortable than
out-of-the-box auto configuration, but the kernel simply doesn't have
the information to prioritize resources.

I do agree that the oom detection could be improved to detect heavy
thrashing - be it on page cache or swapin/out - and kill something
rather than leave the system struggling in a highly unproductive state.
This is far from trivial because what is productive is not something
the kernel can tell easily, as it depends on the workload. As mentioned
elsewhere, userspace is likely much better suited to define that policy
and PSI seems to be a good indicator.

> A much more synthetic example is 'tail /dev/zero'
> which is much more quickly arrested by the kernel oom-killer, at
> least on recent kernels.

Yeah, same as any other memory leak, because the memory will simply run
out at some point and the OOM killer can detect that with good
confidence. It is the thrashing (working set not fitting into memory
and refaulting like crazy) that the kernel struggles (and loses) to
handle.
-- 
Michal Hocko
SUSE Labs
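
For illustration only, the two userspace knobs mentioned above (cgroup
throttling and PSI as a thrashing indicator) could be sketched roughly
as follows. This is not from the thread. It assumes cgroup v2 with the
memory and cpu controllers enabled for the parent group, and a kernel
that exposes /proc/pressure/memory. The function names, the "4G" and
two-CPU limits, and the 40% threshold are hypothetical values picked
for the example, not recommendations.

```python
from pathlib import Path
from typing import Dict


def throttle_build(cgroup: Path, pid: int, memory_high: str = "4G",
                   cpu_max: str = "200000 100000") -> None:
    """Create a cgroup v2 group, cap memory and CPU, and move `pid` into it.

    Assumption: `cgroup` lives under a cgroup v2 mount (typically
    /sys/fs/cgroup) and the caller is allowed to write there.
    """
    cgroup.mkdir(parents=True, exist_ok=True)
    # memory.high throttles the group and puts it under reclaim pressure
    # above this level; it does not invoke the OOM killer by itself.
    (cgroup / "memory.high").write_text(memory_high + "\n")
    # cpu.max is "<quota> <period>" in microseconds; 200000 per 100000
    # allows roughly two CPUs worth of run time.
    (cgroup / "cpu.max").write_text(cpu_max + "\n")
    # Writing a PID to cgroup.procs migrates that process into the group.
    (cgroup / "cgroup.procs").write_text(f"{pid}\n")


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse the contents of a PSI file such as /proc/pressure/memory.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result: Dict[str, Dict[str, float]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def looks_like_thrashing(psi: Dict[str, Dict[str, float]],
                         threshold: float = 40.0) -> bool:
    """Hypothetical policy: flag thrashing when the 10s 'full' average
    (share of time all non-idle tasks were stalled) reaches `threshold`."""
    return psi.get("full", {}).get("avg10", 0.0) >= threshold
```

A userspace monitor would call
parse_psi(Path("/proc/pressure/memory").read_text()) periodically and
decide what to do (pause the build, kill something) when
looks_like_thrashing() returns True. What counts as "unproductive" is
workload dependent, which is exactly why the threshold is a policy
choice left to userspace here.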