linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Will Deacon <will@kernel.org>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Hugh Dickins <hughd@google.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Yang Shi <shy828301@gmail.com>,
	Wang Yugui <wangyugui@e16-tech.com>,
	Matthew Wilcox <willy@infradead.org>,
	Alistair Popple <apopple@nvidia.com>,
	Ralph Campbell <rcampbell@nvidia.com>, Zi Yan <ziy@nvidia.com>,
	Peter Xu <peterx@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 03/11] mm: page_vma_mapped_walk(): use pmd_read_atomic()
Date: Wed, 16 Jun 2021 11:27:43 +0100	[thread overview]
Message-ID: <20210616102742.GC22350@willie-the-truck> (raw)
In-Reply-To: <20210616004207.GU1096940@ziepe.ca>

On Tue, Jun 15, 2021 at 09:42:07PM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 15, 2021 at 10:46:39AM +0100, Will Deacon wrote:
> 
> > Then the compiler can allocate the same register for x and z, but will
> > issue an additional load for y. If a concurrent update takes place to the
> > pmd which transitions from Invalid -> Valid, then it will look as though
> > things went back in time, because z will be stale. We actually hit this
> > on arm64 in practice [1].
> 
> The fact you actually hit this in the real world just seem to confirm
> my thinking that the mm's lax use of the memory model is something
> that deserves addressing.
> 
> Honestly I'm not sure the fix to stick a READ_ONCE in the macros is
> very robust. I prefer the gup_fast pattern of:
> 
>   pmd_t pmd = READ_ONCE(*pmdp);
>   pte_offset_phys(&pmd, addr);
> 
> To correctly force the READ_ONCE under unlocked access and the
> consistently use the single read of the unstable data.
> 
> It seems more maintainable 'hey look at me, I have no locks!' and has
> fewer possibilities for obscure order related bugs to creep in.

Oh, no objection to cleaning this up. It was a "issuing msync(2) causes
data loss argh!" issue, so adding READ_ONCE() to all the macros was the
most straightforward way to solve the immediate problem.

Generally speaking, I think all accesses to live page-tables should be
using READ_ONCE(), as there's also hardware updates from the CPU table
walker to contend with. If that's done in the caller and the macros are
changed to operate on the loaded value, all the better (although this
probably doesn't work so well once you get into rmw operations such as
ptep_test_and_clear_young()).

Will


  reply	other threads:[~2021-06-16 10:27 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-10  6:31 [PATCH 00/11] mm: page_vma_mapped_walk() cleanup and THP fixes Hugh Dickins
2021-06-10  6:34 ` [PATCH 01/11] mm: page_vma_mapped_walk(): use page for pvmw->page Hugh Dickins
2021-06-10  8:12   ` Alistair Popple
2021-06-10  8:55   ` Kirill A. Shutemov
2021-06-10 14:14     ` Peter Xu
2021-06-10  6:36 ` [PATCH 02/11] mm: page_vma_mapped_walk(): settle PageHuge on entry Hugh Dickins
2021-06-10  8:57   ` Kirill A. Shutemov
2021-06-10 14:17   ` Peter Xu
2021-06-10  6:38 ` [PATCH 03/11] mm: page_vma_mapped_walk(): use pmd_read_atomic() Hugh Dickins
2021-06-10  9:06   ` Kirill A. Shutemov
2021-06-10 12:15     ` Jason Gunthorpe
     [not found]       ` <aebb6b96-153e-7d7-59da-f6bad4337aa7@google.com>
2021-06-11 15:36         ` Jason Gunthorpe
2021-06-11 19:05           ` Hugh Dickins
2021-06-11 19:42             ` Jason Gunthorpe
2021-06-15  9:46               ` Will Deacon
2021-06-16  0:42                 ` Jason Gunthorpe
2021-06-16 10:27                   ` Will Deacon [this message]
2021-06-11 19:33           ` Hugh Dickins
2021-06-10  6:40 ` [PATCH 04/11] mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd Hugh Dickins
2021-06-10  9:10   ` Kirill A. Shutemov
2021-06-10 14:31   ` Peter Xu
2021-06-10  6:42 ` [PATCH 05/11] mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block Hugh Dickins
2021-06-10  9:16   ` Kirill A. Shutemov
2021-06-10 14:48   ` Peter Xu
2021-06-10  6:44 ` [PATCH 06/11] mm: page_vma_mapped_walk(): crossing page table boundary Hugh Dickins
2021-06-10  9:32   ` Kirill A. Shutemov
2021-06-10  6:46 ` [PATCH 07/11] mm: page_vma_mapped_walk(): add a level of indentation Hugh Dickins
2021-06-10  9:34   ` Kirill A. Shutemov
2021-06-10  6:48 ` [PATCH 08/11] mm: page_vma_mapped_walk(): use goto instead of while (1) Hugh Dickins
2021-06-10  9:39   ` Kirill A. Shutemov
2021-06-10  6:50 ` [PATCH 09/11] mm: page_vma_mapped_walk(): get vma_address_end() earlier Hugh Dickins
2021-06-10  9:40   ` Kirill A. Shutemov
2021-06-10  6:52 ` [PATCH 10/11] mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes Hugh Dickins
2021-06-10  9:42   ` Kirill A. Shutemov
2021-06-10  6:54 ` [PATCH 11/11] mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk() Hugh Dickins
2021-06-10  9:43   ` Kirill A. Shutemov
2021-06-11 18:29     ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210616102742.GC22350@willie-the-truck \
    --to=will@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=hughd@google.com \
    --cc=jgg@ziepe.ca \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=peterx@redhat.com \
    --cc=rcampbell@nvidia.com \
    --cc=shy828301@gmail.com \
    --cc=wangyugui@e16-tech.com \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox