From: "Huang, Ying"
To: Michal Hocko
Cc: Feng Tang, Andrew Morton, Johannes Weiner, Matthew Wilcox, Mel Gorman,
    dave.hansen@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] mm: fix OOMs for binding workloads to movable zone only node
Date: Fri, 06 Nov 2020 12:32:44 +0800
In-Reply-To: <20201105120818.GC21348@dhcp22.suse.cz> (Michal Hocko's message of "Thu, 5 Nov 2020 13:08:18 +0100")
Message-ID: <87zh3vp0k3.fsf@yhuang-dev.intel.com>

Michal Hocko writes:

> On Thu 05-11-20 09:40:28, Feng Tang wrote:
>> On Wed, Nov 04, 2020 at 09:53:43AM +0100, Michal Hocko wrote:
>>
>> > > > As I've said in reply to your second patch, I think we can make the oom
>> > > > killer behavior more sensible in these misconfigured cases, but I do not
>> > > > think we want to break the cpuset isolation for such a configuration.
>> > >
>> > > Do you mean we skip the killing and just let the allocation fail? We've
>> > > checked the oom killer code first: when the oom happens, both the DRAM
>> > > node and the unmovable node have lots of free memory, and killing a
>> > > process won't improve the situation.
>> >
>> > We already skip the oom killer and fail lowmem allocation requests.
>> > This is similar in some sense. Another option would be to kill the
>> > allocating context, which potentially has fewer corner cases because
>> > some allocation failures might be unexpected.
>>
>> Yes, this can avoid the helpless oom killing that kills a good process (one
>> under no memory pressure at all).
>>
>> And I think the important thing is to judge whether this usage (binding a
>> docker-like workload to an unmovable node) is a valid case :)
>
> I am confused. Why would an unmovable node be a problem? Movable
> allocations can be satisfied from ZONE_NORMAL just fine. It is the other
> way around that is a problem.
>
>> Initially, I thought it invalid too, but later I came to think it still
>> makes some sense for these 2 cases:
>> * the user wants to bind his workload to one node (for most of the user
>>   space memory) to avoid cross-node traffic, and that node happens to
>>   be configured as unmovable
>
> See above.
>
>> * one small DRAM node + a big PMEM node, where a memory-latency-insensitive
>>   workload could be bound to the cheaper unmovable PMEM node
>
> Please elaborate some more. As long as you have movable and normal nodes
> then this should be possible with a good deal of care - most notably the
> movable:kernel memory ratio shouldn't be too big.
>
> Besides that, why does the PMEM node have to be MOVABLE only in the first
> place?

The performance of PMEM is much worse than that of DRAM. If we find
that some pages on PMEM are accessed frequently (hot), we may want to
move them to DRAM to optimize system performance. If unmovable pages
are allocated on PMEM and become hot, we may not be able to move those
pages to DRAM without rebooting the system. So we think we should make
the PMEM nodes MOVABLE only.

Best Regards,
Huang, Ying
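
To make the rationale above concrete, here is a minimal user-space sketch of
the kind of migration that only works for movable pages: it asks the kernel,
via the move_pages(2) syscall declared in libnuma's numaif.h, to relocate a
few of the calling process's pages onto a DRAM node. The DRAM node id, the
page count, and the notion that these particular pages are "hot" are
illustrative assumptions, not anything taken from the patch set; unmovable
kernel allocations have no equivalent interface, which is why letting them
land on a PMEM node could pin them there until reboot.

/*
 * Illustrative sketch (not from the patch set): migrate a few of this
 * process's pages to an assumed DRAM node with move_pages(2).  Only
 * movable (user space) pages can be migrated like this; unmovable
 * kernel allocations cannot, which is the reason for keeping PMEM
 * nodes MOVABLE only.
 */
#include <numaif.h>             /* move_pages(), MPOL_MF_MOVE */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NPAGES     4
#define DRAM_NODE  0            /* assumption: node 0 is the DRAM node */

int main(void)
{
        long psz = sysconf(_SC_PAGESIZE);
        char *buf = aligned_alloc(psz, NPAGES * psz); /* stand-in for "hot" pages */
        void *pages[NPAGES];
        int nodes[NPAGES], status[NPAGES];
        long ret;
        int i;

        if (!buf)
                return 1;

        for (i = 0; i < NPAGES; i++) {
                buf[i * psz] = 1;                 /* fault the page in */
                pages[i] = buf + i * psz;         /* page to migrate */
                nodes[i] = DRAM_NODE;             /* target node for this page */
        }

        /* pid 0 means the calling process; MPOL_MF_MOVE moves only pages
         * mapped exclusively by this process. */
        ret = move_pages(0, NPAGES, pages, nodes, status, MPOL_MF_MOVE);
        if (ret < 0) {
                perror("move_pages");
                return 1;
        }

        for (i = 0; i < NPAGES; i++)
                printf("page %d: node or -errno = %d\n", i, status[i]);

        free(buf);
        return 0;
}

Build with something like "gcc demo.c -lnuma"; after the call, status[] holds,
for each page, the node it now resides on or a negative errno explaining why
it could not be moved.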