Date: Tue, 20 Jun 2023 12:03:12 -0400
From: Peter Xu
To: David Hildenbrand
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrea Arcangeli,
	Mike Rapoport, Matthew Wilcox, Vlastimil Babka, John Hubbard,
	"Kirill A . Shutemov", James Houghton, Andrew Morton,
	Lorenzo Stoakes, Hugh Dickins, Mike Kravetz, Jason Gunthorpe
Subject: Re: [PATCH v2 2/8] mm/hugetlb: Prepare hugetlb_follow_page_mask() for FOLL_PIN
References: <20230619231044.112894-1-peterx@redhat.com>
	<20230619231044.112894-3-peterx@redhat.com>

On Tue, Jun 20, 2023 at 05:22:02PM +0200, David Hildenbrand wrote:
> On 20.06.23 01:10, Peter Xu wrote:
> > follow_page() doesn't use FOLL_PIN, meanwhile hugetlb seems to not be the
> > target of FOLL_WRITE either.  However add the checks.
> >
> > Namely, either the need to CoW due to missing write bit, or proper CoR on
>
> s/CoR/unsharing/
>
> > !AnonExclusive pages over R/O pins to reject the follow page.  That brings
> > this function closer to follow_hugetlb_page().
> >
> > So we don't care before, and also for now.  But we'll care if we switch
> > over slow-gup to use hugetlb_follow_page_mask().  We'll also care when to
> > return -EMLINK properly, as that's the gup internal api to mean "we should
> > do CoR".  Not really needed for follow page path, though.
>
> "we should unshare".
>
> >
> > When at it, switching the try_grab_page() to use WARN_ON_ONCE(), to be
> > clear that it just should never fail.
> >
> > Reviewed-by: Mike Kravetz
> > Signed-off-by: Peter Xu
> > ---
> >  mm/hugetlb.c | 24 +++++++++++++++---------
> >  1 file changed, 15 insertions(+), 9 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index f75f5e78ff0b..9a6918c4250a 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -6463,13 +6463,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> >  	spinlock_t *ptl;
> >  	pte_t *pte, entry;
> > -	/*
> > -	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
> > -	 * follow_hugetlb_page().
> > -	 */
> > -	if (WARN_ON_ONCE(flags & FOLL_PIN))
> > -		return NULL;
> > -
> >  	hugetlb_vma_lock_read(vma);
> >  	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
> >  	if (!pte)
> > @@ -6478,8 +6471,21 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> >  	ptl = huge_pte_lock(h, mm, pte);
> >  	entry = huge_ptep_get(pte);
> >  	if (pte_present(entry)) {
> > -		page = pte_page(entry) +
> > -				((address & ~huge_page_mask(h)) >> PAGE_SHIFT);
> > +		page = pte_page(entry);
> > +
> > +		if (gup_must_unshare(vma, flags, page)) {
>
> All other callers (like follow_page_pte(), including
> __follow_hugetlb_must_fault())
>
> (a) check for write permissions first.
>
> (b) check for gup_must_unshare() only if !pte_write(entry)
>
> I'd vote to keep these checks as similar as possible to the other GUP code.

I'm pretty sure the order doesn't matter here, since one check is for read
and the other is for write.. but sure, I can switch the order.

>
> > +			/* Tell the caller to do Copy-On-Read */
>
> "Tell the caller to unshare".
>
> > +			page = ERR_PTR(-EMLINK);
> > +			goto out;
> > +		}
> > +
> > +		if ((flags & FOLL_WRITE) && !pte_write(entry)) {
> > +			page = NULL;
> > +			goto out;
> > +		}
>
>
> I'm confused about pte_write() vs. huge_pte_write(), and I don't know what's
> right or wrong here.

AFAICT, they should always be identical in code.  But yeah.. I should just
use the huge_ version.

Thanks,

-- 
Peter Xu