From: James Houghton
Date: Thu, 12 Jan 2023 11:55:51 -0500
Subject: Re: [PATCH 21/46] hugetlb: use struct hugetlb_pte for walk_hugetlb_range
To: Peter Xu
Cc: Mike Kravetz, Muchun Song, David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry, "Zach O'Keefe", Manish Mishra, Naoya Horiguchi, "Dr . David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20230105101844.1893104-1-jthoughton@google.com> <20230105101844.1893104-22-jthoughton@google.com>

> The original approach was implemented in RFC v1, but the
> implementation was broken: the way refcount was handled was wrong; it
> was incremented once for each new page table mapping. (How?
> find_lock_page(), called once per hugetlb_no_page/UFFDIO_CONTINUE,
> would increment refcount and we wouldn't drop it, and in
> __unmap_hugepage_range(), the mmu_gather bits would decrement the
> refcount once per mapping.)
>
> At the time, I figured that handling mapcount AND refcount correctly
> in the original approach would be quite complex, so I switched to the
> new one.

Sorry I didn't make this clear... the following steps are how we could
correctly implement the original approach.

> 1. In places that already change the mapcount, check that we're
> installing the hstate-level PTE, not a high-granularity PTE. Adjust
> mapcount AND refcount appropriately.
> 2. In the HGM walking bits, report to the caller if we made the
> hstate-level PTE present. (hugetlb_[pmd,pte]_alloc is the source of
> truth.) Need to keep track of this until we figure out which page
> we're allocating PTEs for, then change mapcount/refcount appropriately.
> 3. In unmapping bits, change mmu_gather/tlb bits to drop refcount only
> once per hugepage. (This is probably the hardest of these three things
> to get right.)
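
To make the accounting in the three steps above concrete, here is a toy
userspace model in C. It is only a sketch of the intended invariant (one
mapcount and one refcount per hstate-level mapping, no matter how many
high-granularity PTEs sit under it), not kernel code: struct toy_hugepage,
struct toy_mapping, toy_install_pte() and toy_unmap() are all hypothetical
names made up for illustration, and none of the locking, mmu_gather or
page-table details are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: NOT the kernel's struct page or any hugetlb API. */
struct toy_hugepage {
	int refcount;	/* references held on the hugepage */
	int mapcount;	/* number of hstate-level mappings */
};

/* One mapping of one hugepage, possibly split into smaller PTEs. */
struct toy_mapping {
	bool hstate_pte_present;	/* hstate-level PTE made present? */
	int hgm_ptes;			/* high-granularity PTEs installed */
};

/*
 * Steps 1 and 2: installing a PTE changes mapcount/refcount only when
 * it is the one that makes the hstate-level PTE present. The return
 * value reports that fact back to the caller.
 */
static bool toy_install_pte(struct toy_hugepage *page, struct toy_mapping *m)
{
	bool made_present = !m->hstate_pte_present;

	m->hgm_ptes++;
	if (made_present) {
		m->hstate_pte_present = true;
		page->mapcount++;
		page->refcount++;
	}
	return made_present;
}

/*
 * Step 3: unmapping drops mapcount/refcount once per hugepage,
 * regardless of how many high-granularity PTEs were installed.
 */
static void toy_unmap(struct toy_hugepage *page, struct toy_mapping *m)
{
	if (!m->hstate_pte_present)
		return;

	assert(page->mapcount > 0 && page->refcount > 0);
	m->hgm_ptes = 0;
	m->hstate_pte_present = false;
	page->mapcount--;
	page->refcount--;
}
```

In this model, calling toy_install_pte() several times on the same mapping
bumps the counts only on the first call, and a single toy_unmap() returns
them to their starting values. That symmetry is the property the real
unmapping path would have to preserve.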