Date: Fri, 24 Nov 2023 10:34:07 -0500
From: Peter Xu
To: "Aneesh Kumar K.V"
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Mike Kravetz,
 "Kirill A . Shutemov", Lorenzo Stoakes, Axel Rasmussen, Matthew Wilcox,
 John Hubbard, Mike Rapoport, Hugh Dickins, David Hildenbrand,
 Andrea Arcangeli, Rik van Riel, James Houghton, Yang Shi,
 Jason Gunthorpe, Vlastimil Babka, Andrew Morton
Subject: Re: [PATCH RFC 04/12] mm: Introduce vma_pgtable_walk_{begin|end}()
References: <20231116012908.392077-1-peterx@redhat.com>
 <20231116012908.392077-5-peterx@redhat.com>
 <874jhb94u2.fsf@kernel.org>
In-Reply-To: <874jhb94u2.fsf@kernel.org>

On Fri, Nov 24, 2023 at 09:32:13AM +0530, Aneesh Kumar K.V wrote:
> Peter Xu writes:
>
> > Introduce per-vma begin()/end() helpers for pgtable walks.  This is a
> > preparation work to merge hugetlb pgtable walkers with generic mm.
> >
> > The helpers need to be called before and after a pgtable walk, will start
> > to be needed if the pgtable walker code supports hugetlb pages.  It's a
> > hook point for any type of VMA, but for now only hugetlb uses it to
> > stabilize the pgtable pages from getting away (due to possible pmd
> > unsharing).
> >
> > Signed-off-by: Peter Xu
> > ---
> >  include/linux/mm.h |  3 +++
> >  mm/memory.c        | 12 ++++++++++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 64cd1ee4aacc..349232dd20fb 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -4154,4 +4154,7 @@ static inline bool pfn_is_unaccepted_memory(unsigned long pfn)
> >  	return range_contains_unaccepted_memory(paddr, paddr + PAGE_SIZE);
> >  }
> >
> > +void vma_pgtable_walk_begin(struct vm_area_struct *vma);
> > +void vma_pgtable_walk_end(struct vm_area_struct *vma);
> > +
> >  #endif /* _LINUX_MM_H */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e27e2e5beb3f..3a6434b40d87 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -6180,3 +6180,15 @@ void ptlock_free(struct ptdesc *ptdesc)
> >  	kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
> >  }
> >  #endif
> > +
> > +void vma_pgtable_walk_begin(struct vm_area_struct *vma)
> > +{
> > +	if (is_vm_hugetlb_page(vma))
> > +		hugetlb_vma_lock_read(vma);
> > +}
> >
>
> That is required only if we support pmd sharing?

Correct.

Note that for this specific gup code path, we're not changing the lock
behavior, because we used to call hugetlb_vma_lock_read() the same way in
hugetlb_follow_page_mask(), and that was also unconditional.

It makes things even more complicated if we look at the recent private
mapping change that Rik introduced in bf4916922c.  I think it means we'll
also take that lock if the private lock is allocated, but I'm not really
sure whether that's necessary for all pgtable walks, as the hugetlb vma
lock is currently taken in almost all walk paths; only some special paths
take the i_mmap rwsem instead of the vma lock.

Per my current understanding, the private lock was only for avoiding a race
between truncate & zapping.  I had a feeling that maybe there's a better way
to do this rather than sticking different functions with the same lock (or,
lock api).

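To make the shape of the hook concrete, here is a minimal userspace sketch
of the begin()/end() pairing.  It is only a model, not kernel code: the
model_* names are hypothetical, the "lock" is a plain reader counter
standing in for the hugetlb vma lock, and the end() side is assumed to
simply mirror begin(), since its body is not part of the hunk quoted above.

```c
/*
 * Userspace model only.  None of the model_* names are kernel APIs; they
 * are hypothetical stand-ins used to illustrate the begin()/end() pairing.
 */
#include <assert.h>
#include <stdbool.h>

struct model_vma {
	bool is_hugetlb;	/* stands in for is_vm_hugetlb_page(vma) */
	int readers;		/* stands in for read holds on the hugetlb vma lock */
};

/* Mirrors the quoted begin(): only hugetlb VMAs take the lock. */
void model_walk_begin(struct model_vma *vma)
{
	if (vma->is_hugetlb)
		vma->readers++;
}

/* Assumption: end() is symmetric to begin(); its body is not quoted above. */
void model_walk_end(struct model_vma *vma)
{
	if (vma->is_hugetlb) {
		assert(vma->readers > 0);
		vma->readers--;
	}
}

/* Returns how many read holds were in place during the (empty) walk. */
int model_walk(struct model_vma *vma)
{
	int held;

	model_walk_begin(vma);
	held = vma->readers;	/* a real walker would visit page tables here */
	model_walk_end(vma);
	return held;
}
```

In this model a hugetlb VMA sees one read hold for the duration of the walk
and none afterwards, while any other VMA never touches the lock, which is
the "hook point for any type of VMA, only hugetlb uses it" behavior
described in the commit message.
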

In summary, the hugetlb vma lock is still complicated and may be subject to
further refactoring.  But all of that needs further investigation.
Hopefully this series can be seen as completely separate from that for now.

Thanks,

-- 
Peter Xu