Date: Thu, 15 Dec 2022 21:47:10 +0000
From: Matthew Wilcox
To: Nico Pache
Cc: Sidhartha Kumar, linux-kernel@vger.kernel.org, linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, akpm@linux-foundation.org, gerald.schaefer@linux.ibm.com, Waiman Long
Subject: Re: [RFC V2] mm: add the zero case to page[1].compound_nr in set_compound_order
References: <20221213234505.173468-1-npache@redhat.com>

On Thu, Dec 15, 2022 at 02:38:28PM -0700, Nico Pache wrote:
> To expand a little more on the analysis:
> I computed the latency/throughput between <+24> and <+27> using
> intel's manual (APPENDIX D):
>
> The bitmath solutions shows a total latency of 2.5 with a Throughput of 0.5.
> The branch solution show a total latency of 4 and throughput of 1.5.
>
> Given this is not a tight loop, and the next instruction is requiring
> the data computed, better (lower) latency is the more ideal situation.
>
> Just wanted to add that little piece :)

I appreciate how hard you're working on this, but it really is straining
at gnats ;-)  For a modern cpu, the most important thing is cache misses
and avoiding dirtying cachelines.  Cycle counting isn't that important
when an L3 cache miss takes 2000 (or more) cycles.
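
To put rough numbers on that, here is a minimal sketch in C. It assumes two
hypothetical ways of writing the zero case, one with a branch and one with
bit math; neither is copied from the patch under discussion, and all the
function names are made up for illustration. The last helper expresses the
latency difference as a percentage of one cache miss. With the latencies
quoted above (4 vs 2.5) and the 2000-cycle figure, the saving comes to about
0.075% of a single L3 miss.

```c
#include <assert.h>

/* Hypothetical branch variant: 0 pages for order 0, else 1 << order. */
static unsigned int nr_pages_branch(unsigned int order)
{
	return order ? 1U << order : 0;
}

/*
 * Hypothetical branchless variant: 1 << order has bit 0 set only when
 * order is 0, so masking off bit 0 yields 0 in exactly that case.
 */
static unsigned int nr_pages_bitmath(unsigned int order)
{
	return (1U << order) & ~1U;
}

/* Latency saved by the faster variant, as a percentage of one cache miss. */
static double saving_vs_miss_percent(double branch_latency,
				     double bitmath_latency,
				     double miss_cycles)
{
	return (branch_latency - bitmath_latency) / miss_cycles * 100.0;
}

/* Both variants must agree for every order that fits in 32 bits. */
static void check_variants_agree(void)
{
	for (unsigned int order = 0; order < 32; order++)
		assert(nr_pages_branch(order) == nr_pages_bitmath(order));
}
```

Calling saving_vs_miss_percent(4.0, 2.5, 2000.0) gives the percentage above.
Both variants return the same values, so the choice between them only matters
if nothing else on the path misses the cache.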