From: "Lance Yang"
Date: Mon, 21 Apr 2025 08:02:01 +0000
Subject: Re: [PATCH 1/1] mm/rmap: optimize MM-ID mapcount handling with union
To: "David Hildenbrand"
Cc: mingzhe.yang@ly.com, willy@infradead.org, ziy@nvidia.com, mhocko@suse.com, vbabka@suse.cz, surenb@google.com, linux-mm@kvack.org, jackmanb@google.com, hannes@cmpxchg.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, "Lance Yang"
Message-ID: <9e3e1f9d238c01bdeacb165501483ab666a766cd@linux.dev>
References: <20250420055159.55851-1-lance.yang@linux.dev> <2e501e48-8604-4813-b76a-d467cad67f53@redhat.com>

April 21, 2025 at 3:40 PM, "David Hildenbrand" wrote:

> > > Are we sure the compiler cannot optimize that itself?
> > >
> > > On x86-64 I get with gcc 14.2.1:
> > >
> > > ; folio->_mm_id_mapcount[0] = -1;
> > > 3f2f: 48 c7 42 60 ff ff ff ff movq $-0x1, 0x60(%rdx)
> > >
> > > Which should be a quadword (64bit) setting, so exactly what you want to achieve.
> >
> > > Yeah, the compiler should be as smart as we expect it to be.
> >
> > However, it seems that gcc 4.8.5 doesn't behave as expected
> > with the -O2 optimization level on the x86-64 test machine.
> >
> > struct folio_array {
> > 	int _mm_id_mapcount[2];
> > };
> >
> > void init_array(struct folio_array *f) {
> > 	f->_mm_id_mapcount[0] = -1;
> > 	f->_mm_id_mapcount[1] = -1;
> > }
> >
> > 0000000000000000 :
> >    0: c7 07 ff ff ff ff       movl $0xffffffff,(%rdi)
> >    6: c7 47 04 ff ff ff ff    movl $0xffffffff,0x4(%rdi)
> >    d: c3                      retq
> >
> > ---
> >
> > struct folio_union {
> > 	union {
> > 		int _mm_id_mapcount[2];
> > 		unsigned long _mm_id_mapcounts;
> > 	};
> > };
> >
> > void init_union(struct folio_union *f) {
> > 	f->_mm_id_mapcounts = -1UL;
> > }
> >
> > 0000000000000010 :
> >   10: 48 c7 07 ff ff ff ff    movq $0xffffffffffffffff,(%rdi)
> >   17: c3                      retq
> >
> > Hmm... I'm not sure if it's valuable for those compilers that
> > are not very new.
>
> Yeah, we shouldn't care about performance with rusty old compilers, especially if the gain would likely not even be measurable.

Ah, nice to know that ;)

> Note that even Linux requires 5.1 ever since 2021. GCC seems to implement this optimization starting with 7.1 (at least when playing with the compiler explorer).

Thanks for the details. Let’s just drop it - no measurable gain.

Thanks,
Lance

> --
> Cheers,
>
> David / dhildenb
>
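
For anyone who wants to reproduce the comparison quoted above, here is a minimal sketch, assuming an LP64 target where unsigned long is twice the size of int. The two structs are the test stand-ins from this thread, not the real struct folio. both_inits_match() and self_check() are hypothetical helpers added only for illustration. The sketch checks that the two-store form and the single-store union form leave the same bytes behind, so any difference between compilers is only in the generated instructions.

```c
#include <assert.h>
#include <string.h>

/* Test stand-ins from the thread; NOT the real struct folio layout. */
struct folio_array {
	int _mm_id_mapcount[2];
};

struct folio_union {
	union {
		int _mm_id_mapcount[2];
		unsigned long _mm_id_mapcounts;
	};
};

/* Assumption: LP64, so one unsigned long covers both int slots. */
_Static_assert(sizeof(unsigned long) == 2 * sizeof(int),
	       "sketch assumes unsigned long spans both mapcount slots");

/* Two 32-bit stores; newer compilers may merge them into one 64-bit store. */
void init_array(struct folio_array *f)
{
	f->_mm_id_mapcount[0] = -1;
	f->_mm_id_mapcount[1] = -1;
}

/* One explicit 64-bit store through the union member. */
void init_union(struct folio_union *f)
{
	f->_mm_id_mapcounts = -1UL;
}

/* Hypothetical helper: returns 1 if both initializers produce the same bytes. */
int both_inits_match(void)
{
	struct folio_array a;
	struct folio_union u;

	memset(&a, 0, sizeof(a));
	memset(&u, 0, sizeof(u));
	init_array(&a);
	init_union(&u);

	return memcmp(a._mm_id_mapcount, u._mm_id_mapcount,
		      sizeof(a._mm_id_mapcount)) == 0;
}

/* Hypothetical helper: aborts if the two forms ever disagree. */
void self_check(void)
{
	assert(both_inits_match());
}
```

Building this with gcc -O2 -c and running objdump -d on the object file should show whether a given compiler version emits one movq or two movl instructions for init_array().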