Date: Thu, 10 Aug 2023 14:14:17 +0100
From: Matthew Wilcox
To: Guo Hui
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, wangxiaohua@uniontech.com
Subject: Re: [PATCH] mm: sparse: shift operation instead of division operation for root index
References: <20230810103829.10007-1-guohui@uniontech.com>
In-Reply-To: <20230810103829.10007-1-guohui@uniontech.com>

On Thu, Aug 10, 2023 at 06:38:29PM +0800, Guo Hui wrote:
> In the function __nr_to_section,
> Use shift operation instead of division operation
> in order to improve the performance of memory management.
> There are no functional changes.
>
> Some performance data is as follows:
> Machine configuration: Hygon 128 cores, 256M memory
>
> Stream single core:
>          with patch    without patch    promote
> Copy     23376.7731    23907.1532       -1.27%
> Scale    12580.2913    11679.7852       +7.71%
> Add      11922.9562    11461.8669       +4.02%
> Triad    12549.2735    11491.9798       +9.20%

How stable are these numbers?  Because this patch makes no sense to me.

#define SECTION_NR_TO_ROOT(sec) ((sec) / SECTIONS_PER_ROOT)

with:

#ifdef CONFIG_SPARSEMEM_EXTREME
#define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))
#else
#define SECTIONS_PER_ROOT       1
#endif

sizeof(struct mem_section) is a constant power-of-two.  So if this
result is real, then GCC isn't able to turn a
divide-by-a-constant-power-of-two into a shift.  That seems _really_
unlikely to me.  And if that is what's going on, then that needs to be
fixed!  Can you examine some before-and-after assembly dumps to see if
that is what's going on?

> Signed-off-by: Guo Hui
> ---
>  include/linux/mmzone.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 5e50b78d58ea..8dde6fb56109 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1818,7 +1818,8 @@ struct mem_section {
>  #define SECTIONS_PER_ROOT       1
>  #endif
>
> -#define SECTION_NR_TO_ROOT(sec) ((sec) / SECTIONS_PER_ROOT)
> +#define SECTION_ROOT_SHIFT      (__builtin_popcount(SECTIONS_PER_ROOT - 1))
> +#define SECTION_NR_TO_ROOT(sec) ((sec) >> SECTION_ROOT_SHIFT)
>  #define NR_SECTION_ROOTS        DIV_ROUND_UP(NR_MEM_SECTIONS, SECTIONS_PER_ROOT)
>  #define SECTION_ROOT_MASK       (SECTIONS_PER_ROOT - 1)
>
> --
> 2.20.1
>
>
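
For anyone wanting to reproduce the before-and-after comparison outside the kernel tree, here is a minimal userspace sketch of the two forms of SECTION_NR_TO_ROOT. Everything prefixed demo_ or DEMO_ is a hypothetical stand-in and not a kernel definition: the page size of 4096 and the two-word struct are assumptions chosen only so that the divisor is a compile-time power of two, as it is in the thread. The sketch assumes GCC or Clang (for __builtin_popcountl) and a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; not taken from the kernel. */
#define DEMO_PAGE_SIZE 4096UL

/* Hypothetical stand-in for struct mem_section: two machine words,
 * so sizeof() is 8 or 16 depending on the ABI, a power of two either way. */
struct demo_mem_section {
	unsigned long word0;
	unsigned long word1;
};

#define DEMO_SECTIONS_PER_ROOT \
	(DEMO_PAGE_SIZE / sizeof(struct demo_mem_section))

/* The shift form is only valid if the divisor is a power of two. */
_Static_assert((DEMO_SECTIONS_PER_ROOT & (DEMO_SECTIONS_PER_ROOT - 1)) == 0,
	       "DEMO_SECTIONS_PER_ROOT must be a power of two");

/* Form used before the patch: plain division by a constant. */
unsigned long demo_root_by_divide(unsigned long sec)
{
	return sec / DEMO_SECTIONS_PER_ROOT;
}

/* Form proposed by the patch: for a power of two N,
 * popcount(N - 1) equals log2(N), so this shifts by log2(N). */
unsigned long demo_root_by_shift(unsigned long sec)
{
	return sec >> __builtin_popcountl(DEMO_SECTIONS_PER_ROOT - 1);
}

/* Returns 1 if both forms agree for every sec in [0, limit), else 0. */
int demo_forms_agree(unsigned long limit)
{
	for (unsigned long sec = 0; sec < limit; sec++) {
		if (demo_root_by_divide(sec) != demo_root_by_shift(sec))
			return 0;
	}
	return 1;
}
```

Compiling this file with something like `gcc -O2 -S` and comparing the bodies of demo_root_by_divide and demo_root_by_shift is one way to check whether the compiler already emits a shift for the division. Since sec is unsigned and the divisor is a constant power of two, the two functions would be expected to produce the same instruction, but the assembly dump from the actual kernel build is what settles the question.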