From: "Andy Lutomirski"
To: "Nadav Amit"
Cc: "Dave Hansen", "Kent Overstreet", "Mark Rutland", "Linux Kernel Mailing List", linux-fsdevel@vger.kernel.org, "linux-bcachefs@vger.kernel.org", "Kent Overstreet", "Andrew Morton", "Uladzislau Rezki", "hch@infradead.org", linux-mm, "Kees Cook", "the arch/x86 maintainers"
Subject: Re: [PATCH 07/32] mm: Bring back vmalloc_exec
Date: Tue, 20 Jun 2023 18:27:04 -0700
Message-Id: <1be708d5-638c-40ff-bd52-b6b88c93d132@app.fastmail.com>
References: <20230509165657.1735798-1-kent.overstreet@linux.dev> <20230509165657.1735798-8-kent.overstreet@linux.dev> <20230619104717.3jvy77y3quou46u3@moria.home.lan> <20230619191740.2qmlza3inwycljih@moria.home.lan> <5ef2246b-9fe5-4206-acf0-0ce1f4469e6c@app.fastmail.com> <20230620180839.oodfav5cz234pph7@moria.home.lan> <37d2378e-72de-e474-5e25-656b691384ba@intel.com>

On Tue, Jun 20, 2023, at 3:43 PM, Nadav Amit wrote:
>> On Jun 20, 2023, at 3:32 PM, Andy Lutomirski wrote:
>>
>>> // out needs to be zeroed first
>>> void unpack(struct uncompressed *out, const u64 *in, const struct bitblock *blocks, int nblocks)
>>> {
>>>     u64 *out_as_words = (u64*)out;
>>>     for (int i = 0; i < nblocks; i++) {
>>>         const struct bitblock *b;
>>>         out_as_words[b->target] |= (in[b->source] & b->mask) << b->shift;
>>>     }
>>> }
>>>
>>> void apply_offsets(struct uncompressed *out, const struct uncompressed *offsets)
>>> {
>>>     out->a += offsets->a;
>>>     out->b += offsets->b;
>>>     out->c += offsets->c;
>>>     out->d += offsets->d;
>>>     out->e += offsets->e;
>>>     out->f += offsets->f;
>>> }
>>>
>>> Which generates nice code: https://godbolt.org/z/3fEq37hf5
>>
>> Thinking about this a bit more, I think the only real performance issue with my code is that it does 12 read-xor-write operations in memory, which all depend on each other in horrible ways.
>
> If you compare the generated code, just notice that you forgot to
> initialize b in unpack() in this version.
>
> I presume you wanted it to say "b = &blocks[i]".

Indeed.  I also didn't notice that -Wall wasn't set.  Oops.
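
For reference, here is a minimal sketch of the quoted code with the fix Nadav suggested applied (b = &blocks[i]). The layouts of struct uncompressed and struct bitblock are assumptions made so the example compiles; the thread does not show the real definitions. The assert() on the shift is also an addition for illustration, not part of the original proposal.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* Hypothetical layout: six u64 fields, no padding (not shown in the thread). */
struct uncompressed {
    u64 a, b, c, d, e, f;
};

/* Hypothetical layout: one entry per bit range to move (not shown in the thread). */
struct bitblock {
    int target;   /* index of the u64 word in the output */
    int source;   /* index of the u64 word in the input */
    u64 mask;     /* bits to keep from the source word */
    int shift;    /* left shift applied after masking */
};

/* out needs to be zeroed first */
void unpack(struct uncompressed *out, const u64 *in,
            const struct bitblock *blocks, int nblocks)
{
    /* Treats the struct as an array of words; relies on the layout above. */
    u64 *out_as_words = (u64 *)out;

    for (int i = 0; i < nblocks; i++) {
        /* The initialization that was missing in the quoted version. */
        const struct bitblock *b = &blocks[i];

        /* Added sanity check: shifting a u64 by 64 or more is undefined. */
        assert(b->shift >= 0 && b->shift < 64);

        out_as_words[b->target] |= (in[b->source] & b->mask) << b->shift;
    }
}

void apply_offsets(struct uncompressed *out, const struct uncompressed *offsets)
{
    out->a += offsets->a;
    out->b += offsets->b;
    out->c += offsets->c;
    out->d += offsets->d;
    out->e += offsets->e;
    out->f += offsets->f;
}
```

With the quoted version, b is read before it is ever assigned, so the generated code is not meaningful to compare. Building with -Wall would typically flag the uninitialized use.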