Date: Mon, 27 Jan 2020 14:29:50 +0100
From: Michal Hocko
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Dan Williams,
	Greg Kroah-Hartman, "Rafael J. Wysocki", Andrew Morton,
	powerpc-utils-devel@googlegroups.com, util-linux@vger.kernel.org,
	Badari Pulavarty, Nathan Fontenot, Robert Jennings, Heiko Carstens,
	Karel Zak
Subject: Re: [PATCH RFC] drivers/base/memory.c: indicate all memory blocks as removable
Message-ID: <20200127132950.GH1183@dhcp22.suse.cz>
References: <20200124155336.17126-1-david@redhat.com>
In-Reply-To: <20200124155336.17126-1-david@redhat.com>

On Fri 24-01-20 16:53:36, David Hildenbrand wrote:
> We see multiple issues with the implementation/interface to compute
> whether a memory block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) and would like to simplify
> it (remove the implementation).
>
> 1. It runs basically lockless. While this might be good for performance,
>    we see possible races with memory offlining/unplug that will require
>    at least some sort of locking to fix.
>
> 2. Nowadays, more false positives are possible. No arch-specific checks
>    are performed that validate if memory offlining will not be denied
>    right away (and such check will require locking). For example, arm64
>    won't allow to offline any memory block that was added during boot -
>    which will imply a very high error rate. Other archs have other
>    constraints.
>
> 3. The interface is inherently racy. E.g., if a memory block is
>    detected to be removable (and was not a false positive at that time),
>    there is still no guarantee that offlining will actually succeed. So
>    any caller already has to deal with false positives.
>
> 4. It is unclear which performance benefit this interface actually
>    provides.
>    The introducing commit 5c755e9fd813 ("memory-hotplug: add
>    sysfs removable attribute for hotplug memory remove") mentioned
>     "A user-level agent must be able to identify which sections of
>      memory are likely to be removable before attempting the
>      potentially expensive operation."
>    However, no actual performance comparison was included.
>
> Known users:
> - lsmem: Will group memory blocks based on the "removable" property. [1]
> - chmem: Indirect user. It has a RANGE mode where one can specify
>          removable ranges identified via lsmem to be offlined. However, it
>          also has a "SIZE" mode, which allows a sysadmin to skip the manual
>          "identify removable blocks" step. [2]
> - powerpc-utils: Uses the "removable" attribute to skip some memory
>                  blocks right away when trying to find some to
>                  offline+remove. However, with ballooning enabled, it
>                  already skips this information completely (because it
>                  once resulted in many false negatives). Therefore, the
>                  implementation can deal with false positives properly
>                  already. [3]
>
> With CONFIG_MEMORY_HOTREMOVE, always indicating "removable" should not
> break any user space tool. We implement a very bad heuristic now. (in
> contrast: always returning "not removable" would at least affect
> powerpc-utils)
>
> Without CONFIG_MEMORY_HOTREMOVE we cannot offline anything, so report
> "not removable" as before.
>
> Original discussion can be found in [4] ("[PATCH RFC v1] mm:
> is_mem_section_removable() overhaul").
>
> Other users of is_mem_section_removable() will be removed next, so that
> we can remove is_mem_section_removable() completely.
>
> [1] http://man7.org/linux/man-pages/man1/lsmem.1.html
> [2] http://man7.org/linux/man-pages/man8/chmem.8.html
> [3] https://github.com/ibm-power-utilities/powerpc-utils
> [4] https://lkml.kernel.org/r/20200117105759.27905-1-david@redhat.com
>
> Suggested-by: Michal Hocko
> Cc: Dan Williams
> Cc: Greg Kroah-Hartman
> Cc: "Rafael J. Wysocki"
> Cc: Andrew Morton
> Cc: powerpc-utils-devel@googlegroups.com
> Cc: util-linux@vger.kernel.org
> Cc: Badari Pulavarty
> Cc: Nathan Fontenot
> Cc: Robert Jennings
> Cc: Heiko Carstens
> Cc: Karel Zak
> Signed-off-by: David Hildenbrand

Please add information provided by Nathan.

Acked-by: Michal Hocko

Minor nit below.

> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +	return sprintf(buf, "1\n");
> +#else
> +	return sprintf(buf, "0\n");
> +#endif

	int ret = IS_ENABLED(CONFIG_MEMORY_HOTREMOVE);

	return sprintf(buf, "%d\n", ret);

would be slightly nicer than explicit ifdefs.
-- 
Michal Hocko
SUSE Labs
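
For readers who want to try the IS_ENABLED() variant outside the kernel tree, here is a rough user-space sketch. It is not the kernel code: the DEMO_* macros are a simplified re-creation of the preprocessor trick (the real IS_ENABLED() in include/linux/kconfig.h also covers options built as modules), and demo_removable_show(), demo_disabled_show() and CONFIG_DEMO_DISABLED are hypothetical names made up for this illustration.

```c
#include <stdio.h>

/*
 * Simplified, user-space re-creation of the IS_ENABLED() idea (an
 * assumption for illustration, not the kernel's definition): the macro
 * expands to 1 if the option is defined as 1 and to 0 if it is undefined.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED3(arg1_or_junk) DEMO_TAKE_SECOND(arg1_or_junk 1, 0, 0)
#define DEMO_IS_DEFINED2(val) DEMO_IS_DEFINED3(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED2(x)
#define IS_ENABLED(option) DEMO_IS_DEFINED(option)

/* Stand-in for what the build configuration would normally provide. */
#define CONFIG_MEMORY_HOTREMOVE 1

/* Hypothetical show routine: one code path, no #ifdef in the body. */
int demo_removable_show(char *buf)
{
	int ret = IS_ENABLED(CONFIG_MEMORY_HOTREMOVE);

	return sprintf(buf, "%d\n", ret);
}

/* Same pattern with an option that is never defined (hypothetical name). */
int demo_disabled_show(char *buf)
{
	int ret = IS_ENABLED(CONFIG_DEMO_DISABLED);

	return sprintf(buf, "%d\n", ret);
}
```

Since the check is an ordinary C expression, both outcomes are parsed and type-checked by the compiler whichever way the option is set, which is the usual argument for preferring it to an #ifdef block.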