From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4183C001B0 for ; Thu, 20 Jul 2023 00:51:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 07E7B28009F; Wed, 19 Jul 2023 20:51:09 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 02E9C28004C; Wed, 19 Jul 2023 20:51:08 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E11DC28009F; Wed, 19 Jul 2023 20:51:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id CE69B28004C for ; Wed, 19 Jul 2023 20:51:08 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 903CBC04F7 for ; Thu, 20 Jul 2023 00:51:08 +0000 (UTC) X-FDA: 81030161016.18.C0FA50B Received: from mail-qt1-f174.google.com (mail-qt1-f174.google.com [209.85.160.174]) by imf15.hostedemail.com (Postfix) with ESMTP id CEAFCA000A for ; Thu, 20 Jul 2023 00:51:06 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=xFeMX19H; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf15.hostedemail.com: domain of jthoughton@google.com designates 209.85.160.174 as permitted sender) smtp.mailfrom=jthoughton@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1689814266; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=0kdQ2CwwG9UjpnaQCd8BdaEAGWs9rZsSy7Glrcwdhwk=; b=nCSerzFA3GU+rmRlMvna489y34iDieca1ixWqZbvtU1Zm5YxW15ncV41E3Zcr1+F5YYJGt sUkBZ3wurjO5nf4ooAVhweqQEUCl9CC7N3f1jp9jmLcIet02JJ4lyUYTYk6YV51+2EjZ1P eKu13RfKWsLHLHw6aM46vfKe+xyUjiU= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=xFeMX19H; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf15.hostedemail.com: domain of jthoughton@google.com designates 209.85.160.174 as permitted sender) smtp.mailfrom=jthoughton@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1689814266; a=rsa-sha256; cv=none; b=HZGBMgqK//3JLkVzx3iL4r6d2hjYYUhb+tSGQmPeP1BiqL6svUEXhgjWyXyvWdNxDxcO+Y hps7F9lzQ+QOYWInB+qALGfPTG+rlDianaiC+uQUdEHV8dFb4ZAgHxFoNF7Xxr7k+91uH5 dOEFeXZMqA+m7WSlBb+sWyRYaO7HsoM= Received: by mail-qt1-f174.google.com with SMTP id d75a77b69052e-40540a8a3bbso90741cf.1 for ; Wed, 19 Jul 2023 17:51:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1689814266; x=1690419066; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=0kdQ2CwwG9UjpnaQCd8BdaEAGWs9rZsSy7Glrcwdhwk=; b=xFeMX19HpDkoRigR98WG2haHY3z5bbIEzqKOM2yJXNVMbRa4ODdpqVuBPVP67JvnRU qV74fH9n5I9UMhK6gBEghXrg+RpC51vxjml1xnmyKwDk46VLkzJFWPbF3ajUjmPShJiz SkmwlaRdGP6ikXvOBjCNEMleZXsdtA7OPk4J8uQPy8kPjTcWuuf+ikbdllwURzDSVnip JmSaWomq39qFh0l/mYmMNXDmPIZg1ptRuC4F3uaZDggeVQqBhggkwCxOifUxc8KL+z6l CrqqTlB5FhhzgFX4Wn9VA/pDc1m0omxgDnxwZD9tj5abyWmzXRT2tqeOGxwToXMC6p0/ 3QhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689814266; x=1690419066; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0kdQ2CwwG9UjpnaQCd8BdaEAGWs9rZsSy7Glrcwdhwk=; b=BxhTE0V3MZnTS6SE6e+u8oH53nNpWl8Qux5O+IIzesbx7W+gzuV8pcdgFsmADrVfcm WhNpA9gW3MQ/UQ1PJlRI/eLVlazzI8UbUwMJq5L0e/l8r57xk7fNuJJ0DVbPV4bLOm5L 9xP9GwI8dz4kVim1ycGqC395Wsk5MmvJLoTivRYb2DMHJThxiMZLFwGDos5A5AAJUXSg tJyivdDhkZGzap/w63Yt9rBTmtvkyhX/RNcLTQu5+VrbdWzGjWBMKkiFT8aCIFKxRIZt qD6vq1293ODAPcVQwBO8LYaXxiT7UEjmtmHS1MBh8j0MKWfmMK52WT6BjT6MuMAoVqLm oLdQ== X-Gm-Message-State: ABy/qLbiRAi/xEQQeYs2nBJl098LN95+a/XJVFDN1O8PRpatMqrBAj2O hCyTXPa5z6yRwAo6BARlNdut4AVvPkZEiaAv0Gryiw== X-Google-Smtp-Source: APBJJlEvOB5bNjVvvhWpTSflzrE9FfY4DePH6+LlQmP55VhhOWiri2AvfecL8FBHdaL3dJpNphIyWhnsPTljriX9XkM= X-Received: by 2002:a05:622a:1a93:b0:404:132d:1127 with SMTP id s19-20020a05622a1a9300b00404132d1127mr61896qtc.19.1689814265808; Wed, 19 Jul 2023 17:51:05 -0700 (PDT) MIME-Version: 1.0 References: <20230718004942.113174-1-mike.kravetz@oracle.com> <20230718004942.113174-3-mike.kravetz@oracle.com> <20230718164646.GA10413@monkey> <20230720001833.GE3240@monkey> In-Reply-To: <20230720001833.GE3240@monkey> From: James Houghton Date: Wed, 19 Jul 2023 17:50:28 -0700 Message-ID: Subject: Re: [PATCH v2 2/2] hugetlb: optimize update_and_free_pages_bulk to avoid lock cycles To: Mike Kravetz Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Jiaqi Yan , Naoya Horiguchi , Muchun Song , Miaohe Lin , Axel Rasmussen , Michal Hocko , Andrew Morton Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: CEAFCA000A X-Stat-Signature: mzb9ecxc9pnpyozjkzft8bbpospq77at X-HE-Tag: 1689814266-218856 X-HE-Meta: U2FsdGVkX1+LcRpIJiOfm1F6VC/0EsmOZrp5sqhk5bXtA16it+RjNsawpKQ382fDG8EUI95YbfVBLOKpddhYInukAZ13sJkJoIQOvXfxAur9szqDL2yDuUbfrbnvXPMw/sMeJaSG1f9m5V3iBFrzKdc0/fXPN9M2oHC50wvPL879VBVeTH+FDbdjPwwKa+y1y4/U7aqt6oApjYswYI3SA52U7sjdLd2aPRJxUGbmCxD0JvBTaukgGP0juVXW+6ftnpHgyKa08sz5un5xGhB7GBDCUVQdbiyebTU496agrm2vjJyF2BIypwLi3he5OPNswKgOKS4Erp/8XQ9A3Z0tQQ+IxFNs9xRN28vMNsjPZ8rRLRoWD1I6m8GRyib1aVtAescMmqhkPSkhaVh4WL2edxodTIyRiTKx907xkmYicLXUcUJkQAnkwPHTUhaeJIpbtHf6HCOvZpcSanSHUlDofvkOYXbZDKc5wwYczHP+qZWMiAXAWiVLABMeWe8FhBVKX2PsrGTUpKvFepTe0SrwNJIMsT9SoSyZMHlLUpTij9YbGBAP7gypdLjFGsAHsCRGzQ7zPph7oHwFIHoLP6lpWHPzDIMd0qa0bwihfKJzzl4lS0CZiQZUYqeoGru/rzL+uwVEaIMamsB5tiOyz4eUeomkWMrzCDmxEui3RtXGXm+69m8sOXmvWqcGSaorV+Z3uEqelwjDn7CwcADqPCLQLbIolwB4h7em0rjTDy2EEfA/YDXwhfpfwnZUjXeemDdukksKSbw628ibfCwpiQtj7BVI82gKmuoAbe06aLKIvRsw7C4R+O9+de+zrTBwZnf4geIbFZp+0ZoW/W/YID0KhDCtYwJP4hG98sQdLXMj86qkprMK+Olj3C7vjwiVVDVV4WZLv84LdfjU1LX0SF9BKiM/6x8qHQ/nWvKVgMlK4vOyC2Pt6mBZ+31+0ikZGXF9e2n3C267vNrWAAuI3i4 KHlp5s5u AtfLQn67Lq9iu6VOEkuwBhYQhqPe7EROb/lBr3W6PuLpGotfRsd3WJpUgTcOQxRNpqs/cFYiay33Q4RqI1IWtgDPTKFXkkhcpEqUyvoQza8D0iReZEFES+BMwuCu29V+/UdtG3mOylbo4Ort+yb5c1gquveiXXNFv09AgDDsmlEXc08/OUBU3QTXDyUoI7hjdhNSd8B3xB3GKiOkDqrS8tK2DZdNH+//N982bwx/0C8M6qsa5L+0PKKBr3Arax+be25YAlj3/KJgViyF7peARfkvwYK24TC19ch3DOS+DgZNTgnbhKOvRDwkEMUkla+Zz1vq8 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Jul 19, 2023 at 5:19=E2=80=AFPM Mike Kravetz wrote: > > On 07/19/23 17:02, James Houghton wrote: > > On Tue, Jul 18, 2023 at 9:47=E2=80=AFAM Mike Kravetz wrote: > > > > > > On 07/18/23 09:31, James Houghton wrote: > > > > On Mon, Jul 17, 2023 at 5:50=E2=80=AFPM Mike Kravetz wrote: > > > > > + * destructor of all pages on list. > > > > > + */ > > > > > + if (clear_dtor) { > > > > > + spin_lock_irq(&hugetlb_lock); > > > > > + list_for_each_entry(page, list, lru) > > > > > + __clear_hugetlb_destructor(h, page_folio(= page)); > > > > > + spin_unlock_irq(&hugetlb_lock); > > > > > } > > > > > > > > I'm not too familiar with this code, but the above block seems weir= d > > > > to me. If we successfully allocated the vmemmap for *any* folio, we > > > > clear the hugetlb destructor for all the folios? I feel like we sho= uld > > > > only be clearing the hugetlb destructor for all folios if the vmemm= ap > > > > allocation succeeded for *all* folios. If the code is functionally > > > > correct as is, I'm a little bit confused why we need `clear_dtor`; = it > > > > seems like this function doesn't really need it. (I could have some > > > > huge misunderstanding here.) > > > > > > > > > > Yes, it is a bit strange. > > > > > > I was thinking this has to also handle the case where hugetlb vmemmap > > > optimization is off system wide. In that case, clear_dtor would neve= r > > > be set and there is no sense in ever walking the list and calling > > > __clear_hugetlb_destructor() would would be a NOOP in this case. Thi= nk > > > of the case where there are TBs of hugetlb pages. > > > > > > That is one of the reasons I made __clear_hugetlb_destructor() check > > > for the need to modify the destructor. The other reason is in the > > > dissolve_free_huge_page() code path where we allocate vmemmap. I > > > suppose, there could be an explicit call to __clear_hugetlb_destructo= r() > > > in dissolve_free_huge_page. But, I thought it might be better if > > > we just handled both cases here. > > > > > > My thinking is that the clear_dtor boolean would tell us if vmemmap w= as > > > restored for ANY hugetlb page. I am aware that just because vmemmap = was > > > allocated for one page, does not mean that it was allocated for other= s. > > > However, in the common case where hugetlb vmemmap optimization is on > > > system wide, we would have allocated vmemmap for all pages on the lis= t > > > and would need to clear the destructor for them all. > > > > > > So, clear_dtor is really just an optimization for the > > > hugetlb_free_vmemmap=3Doff case. Perhaps that is just over thinking = and > > > not a useful miro-optimization. > > > > Ok I think I understand; I think the micro-optimization is fine to > > add. But I think there's still a bug here: > > > > If we have two vmemmap-optimized hugetlb pages and restoring the page > > structs for one of them fails, that page will end up with the > > incorrect dtor (add_hugetlb_folio will set it properly, but then we > > clear it afterwards because clear_dtor was set). > > > > What do you think? > > add_hugetlb_folio() will call enqueue_hugetlb_folio() which will move > the folio from the existing list we are processing to the hugetlb free > list. Therefore, the page for which we could not restore vmemmap is not > on the list for that 'if (clear_dtor)' block of code. Oh, I see. Thanks! Unless you think it's pretty obvious, perhaps a comment would be good to add here, to explain that folios are removed from 'list' if their vmemmap isn't restored. Unrelated nit: I think you mean to use folio_test_hugetlb_vmemmap_optimized instead of HPageVmemmapOptimized in this patch. Feel free to add: Acked-by: James Houghton > > -- > Mike Kravetz