From: Nico Pache
Date: Wed, 14 Dec 2022 19:48:04 -0700
Subject: Re: [RFC V2] mm: add the zero case to page[1].compound_nr in set_compound_order
To: Matthew Wilcox, Sidhartha Kumar
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, akpm@linux-foundation.org, gerald.schaefer@linux.ibm.com
References: <20221213234505.173468-1-npache@redhat.com>

On Wed, Dec 14, 2022 at 10:04 AM Matthew Wilcox wrote:
>
> On Tue, Dec 13, 2022 at 04:45:05PM -0700, Nico Pache wrote:
> > Since commit 1378a5ee451a ("mm: store compound_nr as well as
> > compound_order") the page[1].compound_nr must be explicitly set to 0 if
> > calling set_compound_order(page, 0).
> >
> > This can lead to bugs if the caller of set_compound_order(page, 0) forgets
> > to explicitly set compound_nr=0. An example of this is commit ba9c1201beaa
> > ("mm/hugetlb: clear compound_nr before freeing gigantic pages")
> >
> > Collapse these calls into the set_compound_order by utilizing branchless
> > bitmaths [1].
> >
> > [1] https://graphics.stanford.edu/~seander/bithacks.html#ConditionalSetOrClearBitsWithoutBranching
> >
> > V2: slight changes to commit log and remove extra '//' in the comments
>
> We don't usually use // comments anywhere in the kernel other than
> the SPDX header.

Whoops!

> >  static inline void set_compound_order(struct page *page, unsigned int order)
> >  {
> > +       unsigned long shift = (1U << order);
>
> Shift is a funny name for this variable. order is the shift. this is 'nr'.

Good point! Waiman found an even better/cleaner solution that would
avoid needing an extra variable:

    page[1].compound_nr = (1U << order) & ~1U;

> >         page[1].compound_order = order;
> >  #ifdef CONFIG_64BIT
> > -       page[1].compound_nr = 1U << order;
> > +       // Branchless conditional:
> > +       // order > 0 --> compound_nr = shift
> > +       // order == 0 --> compound_nr = 0
> > +       page[1].compound_nr = shift ^ (-order ^ shift) & shift;
>
> Can the compiler see through this? Before, the compiler sees:
>
> page[1].compound_order = 0;
> page[1].compound_nr = 1U << 0;
> ...
> page[1].compound_nr = 0;
>
> and it can eliminate the first store.

This may be the case at the moment, but with:
https://lore.kernel.org/linux-mm/20221213212053.106058-1-sidhartha.kumar@oracle.com/
we will have a branch instead. Sidhartha tested it and found no
regression; the concern is that if THPs get implemented using this
callpath then we may end up seeing a slowdown.

After doing my analysis below, I don't think this is a concern for the
destroy case (at least on x86). In the destroy case, for both the branch
and the branchless approach, the compiler optimizes away the bitmath and
the branch and simply sets the variable to zero. In the prep case we see
the introduction of a test and cmovne instruction, implying a branch.
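As a sanity check on the three formulations discussed above (the bit trick from the RFC patch, an explicit branch, and Waiman's mask), here is a minimal userspace sketch, not kernel code. The names nr_bithack, nr_branch, nr_mask and check_all_orders are made up for illustration, and the loop assumes orders are limited to valid shift counts for unsigned int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the bit trick from the RFC patch. */
static inline unsigned int nr_bithack(unsigned int order)
{
	unsigned long shift = 1U << order;

	/* '&' binds tighter than '^'; the parentheses make that explicit. */
	return (unsigned int)(shift ^ ((-order ^ shift) & shift));
}

/* Hypothetical helper: the same result written as an explicit branch. */
static inline unsigned int nr_branch(unsigned int order)
{
	return order ? 1U << order : 0;
}

/* Hypothetical helper: Waiman's form, which clears bit 0 so order 0 yields 0. */
static inline unsigned int nr_mask(unsigned int order)
{
	return (1U << order) & ~1U;
}

/* Assert that all three forms agree for every valid shift count. */
static inline void check_all_orders(void)
{
	unsigned int bits = sizeof(unsigned int) * CHAR_BIT;
	unsigned int order;

	for (order = 0; order < bits; order++) {
		assert(nr_bithack(order) == nr_branch(order));
		assert(nr_mask(order) == nr_branch(order));
	}
}
```

Calling check_all_orders() from a small test program exercises every order. This only shows that the forms compute the same value; whether a given compiler emits a jump, a conditional move, or plain arithmetic for each one depends on the compiler and flags, so the disassembly remains the thing to check.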
> Now the compiler sees:
> unsigned long shift = (1U << 0);
> page[1].compound_order = order;
> page[1].compound_nr = shift ^ (0 ^ shift) & shift;
>
> Does it do the maths at compile-time, knowing that order is 0 at this
> callsite and deducing that it can just store a 0?
>
> I think it might, since shift is constant-1,
>
> page[1].compound_nr = 1 ^ (0 ^ 1) & 1;
> -> page[1].compound_nr = 1 ^ 1 & 1;
> -> page[1].compound_nr = 0 & 1;
> -> page[1].compound_nr = 0;
>
> But you should run it through the compiler and check the assembly
> output for __destroy_compound_gigantic_page().

Yep it does look like it gets optimized away for the destroy case:

Bitmath Case (destroy)
---------------------------------
Dump of assembler code for function __update_and_free_page:
   ...
   mov    %rsi,%rbp          //move 2nd arg (page) to rbp
   ...
   movb   $0x0,0x51(%rbp)    //page[1].compound_order = 0
   movl   $0x0,0x5c(%rbp)    //page[1].compound_nr = 0
   ...

Math for movl : 0x5c (92) - 64 (sizeof page[0]) = 28
pahole page: unsigned int compound_nr; /* 28 4 */

Bitmath Case (prep)
---------------------------------
In the case of prep_compound_gigantic_page the bitmath is being computed
   0xffffffff8134f17d <+13>: mov %rdi,%r12
   0xffffffff8134f180 <+16>: push %rbp
   0xffffffff8134f181 <+17>: mov $0x1,%ebp
   0xffffffff8134f186 <+22>: shl %cl,%ebp
   0xffffffff8134f188 <+24>: neg %ecx
   0xffffffff8134f18a <+26>: push %rbx
   0xffffffff8134f18b <+27>: and %ebp,%ecx
   0xffffffff8134f18d <+29>: mov %sil,0x51(%rdi)
   0xffffffff8134f191 <+33>: mov %ecx,0x5c(%rdi) //set page[1].compound_nr

Now to break down the approach with the branch:

Branch Case (destroy)
---------------------------------
No branch utilized to determine the following instructions.
   0xffffffff813507bc <+236>: movb $0x0,0x51(%rbp)
   0xffffffff813507c0 <+240>: movl $0x0,0x5c(%rbp)

Branch Case (prep)
---------------------------------
The branch is being computed with the introduction of a cmovne instruction.
   0xffffffff8134f15d <+13>: mov %rdi,%r12
   0xffffffff8134f160 <+16>: push %rbp
   0xffffffff8134f161 <+17>: mov $0x1,%ebp
   0xffffffff8134f166 <+22>: shl %cl,%ebp
   0xffffffff8134f168 <+24>: test %esi,%esi //test
   0xffffffff8134f16a <+26>: push %rbx
   0xffffffff8134f16b <+27>: cmovne %ebp,%ecx //branch evaluation
   0xffffffff8134f16e <+30>: mov %sil,0x51(%rdi)
   0xffffffff8134f172 <+34>: mov %ecx,0x5c(%rdi)

So it looks like in the destruction of compound pages we'll see no gain
or loss between the bitmath and branch approaches. However, in the prep
case we may see some performance loss once/if THP utilizes this path,
due to the branch and the loss of the CPU parallelization that the
bitmath approach can achieve.

Cheers,
-- Nico