Date: Thu, 6 Jun 2024 17:49:30 -0400
From: Peter Xu
To: Matthew Wilcox
Cc: Khalid Aziz, Vishal Moola, Jane Chu, Muchun Song, linux-mm@kvack.org
Subject: Re: Unifying page table walkers

On Thu, Jun 06, 2024 at 07:29:22PM +0100, Matthew Wilcox wrote:
> The reason we have a separate hugetlb_entry from pmd_entry and pud_entry
> is that it has a different locking context. It is called with the
> hugetlb_vma_lock held for read (nb: this is not the same as the vma
> lock; see walk_hugetlb_range()). Why do we need this? Because of page
> table sharing.

Just to quickly comment on this one: I think it's more than the per-vma
lock.

Oscar is actually working together with me (we've had plenty of
discussions, but so far all off-list...), and the lock context is as
simple as this after the refactor of the hugetlb_entry() path:

https://github.com/leberus/linux/commit/88e56c1ecaf8c64ba9165aeba74335bdc15d1b56

hugetlb_entry() also exists because that's the only sane way to link to
the hugetlb API (it used to be huge_pte_offset() I believe, now
hugetlb_walk()), which always walks to a specific level of the hugetlb
pgtable without even telling the caller which level (hence the pte_t*
force-cast trick). pxd_entry() then can't apply if we don't know that
info. So it's probably not only about the locking.

Meanwhile, I have a very vague memory that the per-vma lock is also used
for something else, perhaps the fallocate() race against faults or
something. But maybe I misremember; I haven't read that part of the code
for quite some time, as our hugetlb refactoring work doesn't need that
knowledge: we simply keep all the behaviors. Maybe Muchun remembers.

Thanks,

--
Peter Xu