Date: Wed, 8 Sep 2021 16:12:53 +0800
From: Feng Tang
To: Michal Hocko
Cc: Andrew Morton, David Rientjes, Mel Gorman, Vlastimil Babka, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_alloc: detect allocation forbidden by cpuset and bail out early
Message-ID: <20210908081253.GA37918@shbuild999.sh.intel.com>
References: <1631003150-96935-1-git-send-email-feng.tang@intel.com> <20210908015014.GA28091@shbuild999.sh.intel.com>

On Wed, Sep 08, 2021 at 09:06:24AM +0200, Michal Hocko wrote:
> On Wed 08-09-21 09:50:14, Feng Tang wrote:
> > On Tue, Sep 07, 2021 at 10:44:32AM +0200, Michal Hocko wrote:
> [...]
> > > While this is a good fix from the functionality POV I believe you can go
> > > a step further. Please add a detection to the cpuset code and complain
> > > to the kernel log if somebody tries to configure movable only cpuset.
> > > Once you have that in place you can easily create a static branch for
> > > cpuset_insane_setup() and have zero overhead for all reasonable
> > > configuration. There shouldn't be any reason to pay a single cpu cycle
> > > to check for something that almost nobody does.
> > >
> > > What do you think?
> >
> > I thought about the implementation, IIUC, the static_branch_enable() is
> > easy, it could be done when cpuset.mems is set with movable only nodes,
> > but disable() is much complexer,
>
> Do we care about disable at all? The point is to not have 99,999999%
> users pay overhead of the check which is irrelevant to them. Once
> somebody wants to use this "creative" setup then paying an extra check
> sounds perfectly sensible to me. If somebody cares enough then the
> disable logic could be implemented. But for now I believe we should be
> OK with only enable case.

Makes sense to me, thanks!

> > as we may need a global reference
> > counter to track the set/unset, and the unset could be the time when
> > freeing the cpuset data structure, also one cpuset.mems could be changed
> > runtime, and system could have multiple cpuset dirs (user space usage
> > could be creative or crazy :)).
> >
> > While checking cpuset code, I thought more about configuring cpuset with
> > movable only nodes, that we may still have normal usage: mallocing a big
> > trunk of memory and do some scientific calculation, or AI training. It
> > works with current code.
>
> It might work but it would be inherently subtle because a single
> non-movable allocation will throw the whole thing off the cliff.

Yes, this is a valid concern. Though I think that when there is a real
usage requirement for binding a cpuset to HBM (High Bandwidth Memory)
or PMEM, we may need to reconsider loosening the limit for GFP_HIGHUSER,
as the GFP_KERNEL type of allocation is permitted.

Thanks,
Feng

> I do not think we want to even pretend we support such a setup.
> --
> Michal Hocko
> SUSE Labs