Date: Tue, 14 Jan 2020 10:46:17 +0100
From: Michal Hocko
To: Chris Murphy
Cc: linux-mm@kvack.org
Subject: Re: user space unresponsive, followup: lsf/mm congestion
Message-ID: <20200114094617.GJ19428@dhcp22.suse.cz>

On Fri 10-01-20 15:27:10, Chris Murphy wrote:
> On Fri, Jan 10, 2020 at 4:07 AM Michal Hocko wrote:
> >
> > So you have redirected the output (stdout) to a file. This is less
> > effective than using a file directly because the progy makes sure to
> > preallocate and mlock the output file data as well. Anyway, let's have a
> > look what you managed to gather
>
> I just read the source :P and see the usage. I'll do that properly if
> there's a next time. Should it be saved in /tmp to avoid disk writes
> or does it not matter?

The usage is described at the top of the C file. As this is an internal
tool of mine, I didn't bother to make it super easy ;)

> > It would interesting to see whether tuning vm_swappiness to 100 helps
> > but considering how large is the anonymous active list I would be very
> > skeptical.
>
> I can try it. Is it better to capture the same amount of time as
> before? Or the entire thing until it fails or is stuck for at least 30
> minutes?

The last data provided good insight, so following the same methodology
should be good.

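For anyone repeating the swappiness experiment mentioned above, here is a minimal sketch of reading and changing the setting before a capture run. It assumes the sysctl is exposed at the usual procfs path /proc/sys/vm/swappiness and that the script runs as root when writing; the helper names are hypothetical and are not part of the tool discussed in this thread.

```python
from pathlib import Path

# Usual procfs location of the swappiness sysctl (assumed to be mounted).
SWAPPINESS = Path("/proc/sys/vm/swappiness")


def read_swappiness(path: Path = SWAPPINESS) -> int:
    """Return the current swappiness value as an integer."""
    return int(path.read_text().strip())


def write_swappiness(value: int, path: Path = SWAPPINESS) -> int:
    """Write a new swappiness value and return the previous one.

    Writing to the real procfs file needs root privileges.
    """
    if value < 0:
        raise ValueError("swappiness must be non-negative")
    old = read_swappiness(path)
    path.write_text(f"{value}\n")
    return old


if __name__ == "__main__":
    previous = write_swappiness(100)
    print(f"swappiness changed from {previous} to {read_swappiness()}")
```

Keeping the previous value around makes it easy to restore the setting once the capture is finished, so that later runs are compared against the same baseline.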
> > So in the end it is really hard to see what the kernel should have done
> > better in this overcommitted case. Killing memory hogs would likely kill
> > an active workload which would lead to better desktop experience but I
> > can imagine setups which simply want to have work done albeit sloooowly.
>
> Right, so the kernel can't know and doesn't really want to know, user
> intention. It's really a policy question.
>
> But if the distribution wanted to have a policy of, the mouse pointer
> always works - i.e. the user should be able to kill this process, if
> they want, from within the GUI - that implies possibly a lot of work
> to carve out the necessary resources for that entire stack. I have no
> idea if that's possible with the current state of things.

Well, you always have the option of invoking the OOM killer with sysrq+f
and killing the memory hog that way. The more memory-demanding userspace
is, the more users have to think about how to partition memory as a
resource. We have tooling for that; it just has to be used.

> Anyway, I see it's a difficult problem, and I appreciate the
> explanations. I don't care about this particular example, my interest
> is making it better for everyone - I personally run into this only
> when I'm testing for it, but those who experience it, experience it
> often. And they're often developers. They have no idea in advance what
> the build resource requirements are, and those requirements change a
> lot as the compile happens. Difficult problem.

I can only encourage people to report those problems and we can see
where we get from there. The underlying problem might be different even
though the symptoms seem similar.

-- 
Michal Hocko
SUSE Labs