Date: Thu, 27 May 2021 09:04:57 -0400
From: Peter Xu
To: Alistair Popple
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	nouveau@lists.freedesktop.org, bskeggs@redhat.com, rcampbell@nvidia.com,
	linux-doc@vger.kernel.org, jhubbard@nvidia.com, bsingharora@gmail.com,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	hch@infradead.org, jglisse@redhat.com, willy@infradead.org,
	jgg@nvidia.com, hughd@google.com, Christoph Hellwig
Subject: Re: [PATCH v9 07/10] mm: Device exclusive memory access
References: <20210524132725.12697-1-apopple@nvidia.com>
	<20210524132725.12697-8-apopple@nvidia.com>
	<37725705.JvxlXkkoz5@nvdebian>
In-Reply-To: <37725705.JvxlXkkoz5@nvdebian>

On Thu, May 27, 2021 at 01:35:39PM +1000, Alistair Popple wrote:
> > > + *
> > > + * @MMU_NOTIFY_EXCLUSIVE: to signal a device driver that the device will no
> > > + * longer have exclusive access to the page. May ignore the invalidation that's
> > > + * part of make_device_exclusive_range() if the owner field
> > > + * matches the value passed to make_device_exclusive_range().
> >
> > Perhaps s/matches/does not match/?
>
> No, "matches" is correct. The MMU_NOTIFY_EXCLUSIVE notifier is to notify a
> listener that a range is being invalidated for the purpose of making the range
> available for some device to have exclusive access to. Which does also mean a
> device getting the notification no longer has exclusive access if it already
> did.
>
> A unique type is needed because when creating the range a driver needs to form
> a mmu critical section (with mmu_interval_read_begin()/
> mmu_interval_read_end()) to ensure the entry remains valid long enough to
> program the device pte and hasn't been invalidated.
>
> However without a way of filtering any invalidations will result in a retry,
> but make_device_exclusive_range() needs to do an invalidation during
> installation of the entry. To avoid this causing infinite retries the driver
> ignores specific invalidation events that it knows don't apply, ie. the
> invalidations that are a result of that driver asking for device exclusive
> entries.

OK I think I get it now.. so the driver checks both EXCLUSIVE and owner, if all
match it skips the notify, otherwise it's treated like all the rest.  Thanks.

However then it's still confusing (as I raised it too in previous comment) that
we use CLEAR when re-installing the valid pte.  It's merely against what CLEAR
means.

How about sending EXCLUSIVE for both mark/restore?  Just that when restore we
notify with owner==NULL telling that no one is owning it anymore so driver needs
to drop the ownership.  I assume your driver patch does not need change too.
Would that be much cleaner than CLEAR?  I bet it also makes commenting the new
notify easier.

What do you think?

[...]
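
To make sure we're talking about the same filtering, here is how I read it, as
a small userspace model rather than real driver code.  All model_* names are
made up for illustration and are not kernel API; the two enum values only stand
in for MMU_NOTIFY_CLEAR and MMU_NOTIFY_EXCLUSIVE, and I'm assuming the driver
compares the range's owner against the owner it passed to
make_device_exclusive_range():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for two of the kernel's mmu notifier event types. */
enum model_event {
	MODEL_NOTIFY_CLEAR,
	MODEL_NOTIFY_EXCLUSIVE,
};

/* Hypothetical, reduced version of the range passed to an invalidate callback. */
struct model_range {
	enum model_event event;
	const void *owner;
};

/*
 * Returns true when the driver has to treat the notification as a real
 * invalidation (and so retry or drop its device mapping), false when it
 * can skip it because it is the driver's own exclusive request.
 */
static bool model_must_invalidate(const struct model_range *range,
				  const void *my_owner)
{
	assert(range != NULL);

	/* Skip only when both the event type and the owner match. */
	if (range->event == MODEL_NOTIFY_EXCLUSIVE && range->owner == my_owner)
		return false;

	/* Everything else is treated like any other invalidation. */
	return true;
}
```

With the same check, an EXCLUSIVE event carrying owner==NULL on restore would
not match any driver's owner, so in this model it falls through to the normal
invalidation path, which is why I'd guess the driver side stays unchanged.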
> > > +				vma->vm_mm, address, min(vma->vm_end,
> > > +				address + page_size(page)), args->owner);
> > > +	mmu_notifier_invalidate_range_start(&range);
> > > +
> > > +	while (page_vma_mapped_walk(&pvmw)) {
> > > +		/* Unexpected PMD-mapped THP? */
> > > +		VM_BUG_ON_PAGE(!pvmw.pte, page);
> > > +
> > > +		if (!pte_present(*pvmw.pte)) {
> > > +			ret = false;
> > > +			page_vma_mapped_walk_done(&pvmw);
> > > +			break;
> > > +		}
> > > +
> > > +		subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
> >
> > I see that all pages passed in should be done after FOLL_SPLIT_PMD, so is
> > this needed?  Or say, should subpage==page always be true?
>
> Not always, in the case of a thp there are small ptes which will get device
> exclusive entries.

FOLL_SPLIT_PMD will first split the huge thp into smaller pages, then do
follow_page_pte() on them (in follow_pmd_mask):

	if (flags & FOLL_SPLIT_PMD) {
		int ret;
		page = pmd_page(*pmd);
		if (is_huge_zero_page(page)) {
			spin_unlock(ptl);
			ret = 0;
			split_huge_pmd(vma, pmd, address);
			if (pmd_trans_unstable(pmd))
				ret = -EBUSY;
		} else {
			spin_unlock(ptl);
			split_huge_pmd(vma, pmd, address);
			ret = pte_alloc(mm, pmd) ? -ENOMEM : 0;
		}

		return ret ? ERR_PTR(ret) :
			follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
	}

So I thought all pages are small pages?

-- 
Peter Xu
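
P.S. For reference, the subpage arithmetic in question can be sketched in
userspace as below.  This is only a model under a loud assumption: a flat
memmap where the pfn is simply the index into one array (the real
page_to_pfn() depends on the memory model).  The model_* names are
hypothetical and not kernel API.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_PAGES 16

/* Hypothetical stand-in for struct page; contents do not matter here. */
struct model_page {
	int unused;
};

/* Assumed flat memmap: entry i describes pfn i. */
static struct model_page model_memmap[MODEL_NR_PAGES];

/* Model of page_to_pfn() for the flat memmap above. */
static unsigned long model_page_to_pfn(const struct model_page *page)
{
	return (unsigned long)(page - model_memmap);
}

/*
 * Model of: subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
 * "page - pfn(page)" lands on the memmap base, adding the pfn found in the
 * PTE then selects the page that this particular PTE actually maps.
 */
static struct model_page *model_subpage(struct model_page *page,
					unsigned long pte_pfn)
{
	assert(pte_pfn < MODEL_NR_PAGES);
	return page - model_page_to_pfn(page) + pte_pfn;
}
```

If the PTE always maps the very page that was passed in (pte_pfn equal to
page_to_pfn(page)), the result is page itself; it only differs when the PTE
maps another pfn, e.g. a tail page relative to a head page.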