From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2A86C001DE for ; Thu, 20 Jul 2023 00:03:07 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 65CF028009D; Wed, 19 Jul 2023 20:03:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 60D3D28004C; Wed, 19 Jul 2023 20:03:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4D77328009D; Wed, 19 Jul 2023 20:03:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 3F76D28004C for ; Wed, 19 Jul 2023 20:03:07 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 03E34140282 for ; Thu, 20 Jul 2023 00:03:06 +0000 (UTC) X-FDA: 81030040014.22.2F9CC08 Received: from mail-qt1-f179.google.com (mail-qt1-f179.google.com [209.85.160.179]) by imf23.hostedemail.com (Postfix) with ESMTP id 3C67314001A for ; Thu, 20 Jul 2023 00:03:04 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=lt14aI2q; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf23.hostedemail.com: domain of jthoughton@google.com designates 209.85.160.179 as permitted sender) smtp.mailfrom=jthoughton@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1689811385; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=6AIdRrjXrsWcvUmbKux+wDH4PpMCO/1EThAv+qoaH/I=; b=B6PrnE7Y4Jvqr4Wesb61rwXrUctRFa98jM/BsXtrt/cmRcSI3QeP5Q1rCXjSceWXtqH7cu kqHEWjU//his7+pOB9koaPjp1n3IhWxgWniAXvlXKwjoMlnCzJ3ynJEpztkMxu3XUBCm9J nKM6JoBLUF6VRO4INQfrilKnpD/D6vM= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=lt14aI2q; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf23.hostedemail.com: domain of jthoughton@google.com designates 209.85.160.179 as permitted sender) smtp.mailfrom=jthoughton@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1689811385; a=rsa-sha256; cv=none; b=oEwM5fpi94UTXQ+2Ab05g/0FinpBd1X8aspYPcndSKTn7l1H5ZYQzWf7zQ8cMNKk6cBz3e wdj1HQo4CrNEYX4DsT8z/pxqU+ZN5t6Fg3dw3Jy/aqbMnDHtUyB+ODua4Quc0b3eQoKoGG JUm9rAnYBPCevetXpxl2ZO6P+71wo5Q= Received: by mail-qt1-f179.google.com with SMTP id d75a77b69052e-40540a8a3bbso79501cf.1 for ; Wed, 19 Jul 2023 17:03:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1689811384; x=1690416184; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=6AIdRrjXrsWcvUmbKux+wDH4PpMCO/1EThAv+qoaH/I=; b=lt14aI2qGCqGr/TnNm8djd6ephL8HCBQbXQYmuI8+6oDCVJ+zkWhw5fljzm7GQ+vIZ rQG9Y1YVwH6J13YZ/odaWPVemp4+MNLKSfuplt64XqQgMbTWsf4lokKBVHy+gmRg9php ga4s8NiHFYWokg8+gPL9plRZYfbriDOj21s9GqHiio+Bim72N66G4YKi30i5Hcv6dh/Y X2hKCqRfdHPUizPy+S5ZwiTjDC0SqyeP3z5HTfvgocV9yRkVGXSav8HFtAJXbZtmyLTl bE/8f1BlqbmblAQnlGX2IBS5FSzx8ULwLSmzU+onJoq3jpwOwYizdPdAsh83+6pMT3Qw Cksw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689811384; x=1690416184; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6AIdRrjXrsWcvUmbKux+wDH4PpMCO/1EThAv+qoaH/I=; b=XMZqnA53MYFIoiFv174MaTe7mE0JeYRT+iprh8+OUqV/HlzHZzJlxrIimpgziseMEr 2MK8+I1mCqDZFX5ta/kT6zFgCElN0EM5P0QP0NTkncq8N0Q5EXn3AOpRgEUBAt7GIamg wR7swVJlf1n6W+OkBdYrGnCk4oPmkYxmv6JHqH8bUycxDaMDK/s8qjJho7XNXJ7ymwsG 1Y12s2TwF5wLWsza5XboWk6+lH0tJwpn4o+4mGi1w6ZpVLRuVlrJckHjVjs9mUvw47Zd 0ztbVHxn2ffivtjtGSeWxm1sIXJSAUHpQ72fF5Z8QT/PPFdsBuTjU1kw+2RzMWIrNqQl UCxg== X-Gm-Message-State: ABy/qLYfT+KUY6R/TtKVZFGfZZ16S85fVk6RPkKMby/Q/cjZIdHTpX/w Xhp9ireNEPyRQuCyTW8+jOhDuJXBERYeejje70tnOQ== X-Google-Smtp-Source: APBJJlEG3jZz9gqjPlQZ08HPYNLMUX5FmZrlaAtJV6SjjVGQ2ekjzuGennDxJOZV/STQZpE3vYABW4WMEcE8hiKZlTU= X-Received: by 2002:a05:622a:1344:b0:3f8:8c06:c53b with SMTP id w4-20020a05622a134400b003f88c06c53bmr118687qtk.0.1689811384194; Wed, 19 Jul 2023 17:03:04 -0700 (PDT) MIME-Version: 1.0 References: <20230718004942.113174-1-mike.kravetz@oracle.com> <20230718004942.113174-3-mike.kravetz@oracle.com> <20230718164646.GA10413@monkey> In-Reply-To: <20230718164646.GA10413@monkey> From: James Houghton Date: Wed, 19 Jul 2023 17:02:26 -0700 Message-ID: Subject: Re: [PATCH v2 2/2] hugetlb: optimize update_and_free_pages_bulk to avoid lock cycles To: Mike Kravetz Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Jiaqi Yan , Naoya Horiguchi , Muchun Song , Miaohe Lin , Axel Rasmussen , Michal Hocko , Andrew Morton Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Stat-Signature: 1be5bpg6xszu9jbzpzzgrfxattwgsg61 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 3C67314001A X-HE-Tag: 1689811384-69696 X-HE-Meta: U2FsdGVkX18QOrpYlYkIzAwcji+v7BkNQJQiVwz4YWcoIyOblNLw4XBYDgfBPY9e4aJpDYDcKPzw1Kyq8Ttev+slwdzVKIJgec55sci5HoEx6b20xvF4fJAw6WIicPdoTyUx8NtDSMxKbhPb94l9qN3o8VLEo6syXuK+iL0069bUQqr+8YjGQV+449ielTaYP7Rfm7QGevJzC9t6hYGFf/JQiaPsS9uCgOgl1b0G/zfw3XxgmMKCo99KB4rF+lWxsWHG6AOt3wS+qt0VTeq1cKUOc7CPpInGD5sKYt7WPbIr9Gz/4OaU/WsFGItQVMODV3jDgCZTbyDIzlvggspR7rDG/iOdMNIqatD46NY7PQK8KoSMQzR33QsL4UxQ9H+MAsGDoh8o0ggBnp3Mh9RpcnWeCMF9brb0CT0JKVJz8aHgD+L4BPqRZWpRkm1wmjyBDCc4hv8eSaN70nuel4A2y46klQy58vUogYQ3MKqRcULOc9Z13EqfH/Qi1X5DbrhK/KAP6kNBvmaXF3dpDDqYrj5IFpm9LcxFatNAHTMbBdfty+UovXvhNc484RLD+oxqn2QQtxUprr80R1nVxtjexEP96R14Xwp5uTuCe0G42w6mVaLeXwDkv8w4dyf+f+sBMOdVoraEJ5Q+eOUF1UXS1ODEoKmoFVqJBbIngHIqUbmH+hzQzrr1BRR8Ib8KFtXTe1hB/+hIp1jcpjEyvTf46zxF9yGtnu3Q+davZJgUEr9CVPWhGWDmYIkwRhdVfQPu26wunIVrag9HLGlp125GZcL8pfoAvSB6NY6NXtj0ur9Ztk3w18aC4+EnC99gEPCBs2x8ORb2vgEZUQ9nSmQqoCjJOYxjZZwi0dgduqh/3XMCBQTwyiPG6PMPoUuA154P/fglJ5Ee/KUOg3Th2jODJM6rC5ZcFMUG6i2arImw4FG7+uF4HRiGpPR2FEuM6f79lvl0Eer/Bfkhbx31cdx XxcFoSGB WhE0dQzG4v0uE2uBCBAPgR2/nsy/YWiFHecLToNRMVyGhLPZViT/DJT1GJOqzYEMwAbVfGAHZLZFGG+/wQTEmNyP/B5jpjlSjSqlhkXOoNS6xFo2P93YBbEZjgHSIQiAM3dyjBT5U/DjjfzhvvbsvJeg0Eht02j6Pt58Lshj/L7RCIlRIOVFzia9IxCge8Wd7DuGkA01ViiD5lAwNZ9vW28Ruo8210UGeIpXVwO1SSMsh9EA/9aw4Ja5yyjksqUVFCyvun4LWRBpoR4JrliSGEEbAK8a9SjZaDlVh+81m2eFAviI= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Jul 18, 2023 at 9:47=E2=80=AFAM Mike Kravetz wrote: > > On 07/18/23 09:31, James Houghton wrote: > > On Mon, Jul 17, 2023 at 5:50=E2=80=AFPM Mike Kravetz wrote: > > > + * destructor of all pages on list. > > > + */ > > > + if (clear_dtor) { > > > + spin_lock_irq(&hugetlb_lock); > > > + list_for_each_entry(page, list, lru) > > > + __clear_hugetlb_destructor(h, page_folio(page= )); > > > + spin_unlock_irq(&hugetlb_lock); > > > } > > > > I'm not too familiar with this code, but the above block seems weird > > to me. If we successfully allocated the vmemmap for *any* folio, we > > clear the hugetlb destructor for all the folios? I feel like we should > > only be clearing the hugetlb destructor for all folios if the vmemmap > > allocation succeeded for *all* folios. If the code is functionally > > correct as is, I'm a little bit confused why we need `clear_dtor`; it > > seems like this function doesn't really need it. (I could have some > > huge misunderstanding here.) > > > > Yes, it is a bit strange. > > I was thinking this has to also handle the case where hugetlb vmemmap > optimization is off system wide. In that case, clear_dtor would never > be set and there is no sense in ever walking the list and calling > __clear_hugetlb_destructor() would would be a NOOP in this case. Think > of the case where there are TBs of hugetlb pages. > > That is one of the reasons I made __clear_hugetlb_destructor() check > for the need to modify the destructor. The other reason is in the > dissolve_free_huge_page() code path where we allocate vmemmap. I > suppose, there could be an explicit call to __clear_hugetlb_destructor() > in dissolve_free_huge_page. But, I thought it might be better if > we just handled both cases here. > > My thinking is that the clear_dtor boolean would tell us if vmemmap was > restored for ANY hugetlb page. I am aware that just because vmemmap was > allocated for one page, does not mean that it was allocated for others. > However, in the common case where hugetlb vmemmap optimization is on > system wide, we would have allocated vmemmap for all pages on the list > and would need to clear the destructor for them all. > > So, clear_dtor is really just an optimization for the > hugetlb_free_vmemmap=3Doff case. Perhaps that is just over thinking and > not a useful miro-optimization. Ok I think I understand; I think the micro-optimization is fine to add. But I think there's still a bug here: If we have two vmemmap-optimized hugetlb pages and restoring the page structs for one of them fails, that page will end up with the incorrect dtor (add_hugetlb_folio will set it properly, but then we clear it afterwards because clear_dtor was set). What do you think?