Date: Tue, 2 May 2023 16:03:19 +0100
From: Matthew Wilcox <willy@infradead.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@suse.com,
	josef@toxicpanda.com, jack@suse.cz, ldufour@linux.ibm.com,
	laurent.dufour@fr.ibm.com, michel@lespinasse.org,
	liam.howlett@oracle.com, jglisse@google.com, vbabka@suse.cz,
	minchan@google.com, dave@stgolabs.net, punit.agrawal@bytedance.com,
	lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH 1/3] mm: handle swap page faults under VMA lock if page is uncontended
References: <20230501175025.36233-1-surenb@google.com>

On Mon, May 01, 2023 at 10:04:56PM -0700, Suren Baghdasaryan wrote:
> On Mon, May 1, 2023 at 8:22 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Mon, May 01, 2023 at 07:30:13PM -0700, Suren Baghdasaryan wrote:
> > > On Mon, May 1, 2023 at 7:02 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > >
> > > > On Mon, May 01, 2023 at 10:50:23AM -0700, Suren Baghdasaryan wrote:
> > > > > +++ b/mm/memory.c
> > > > > @@ -3711,11 +3711,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> > > > >  	if (!pte_unmap_same(vmf))
> > > > >  		goto out;
> > > > >
> > > > > -	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
> > > > > -		ret = VM_FAULT_RETRY;
> > > > > -		goto out;
> > > > > -	}
> > > > > -
> > > > >  	entry = pte_to_swp_entry(vmf->orig_pte);
> > > > >  	if (unlikely(non_swap_entry(entry))) {
> > > > >  		if (is_migration_entry(entry)) {
> > > >
> > > > You're missing the necessary fallback in the (!folio) case.
> > > > swap_readpage() is synchronous and will sleep.
> > >
> > > True, but is it unsafe to do that under the VMA lock, and does it
> > > have to be done under mmap_lock?
> >
> > ... you were the one arguing that we didn't want to wait for I/O with
> > the VMA lock held?
>
> Well, that discussion was about waiting in folio_lock_or_retry() with
> the lock being held. I argued against it because currently we drop
> mmap_lock before waiting, so if we don't drop the VMA lock we would be
> changing the current behavior, which might introduce new regressions.
> In the case of swap_readpage() and swapin_readahead() we already wait
> with mmap_lock held, so waiting with the VMA lock held does not
> introduce new problems (unless there is a need to hold mmap_lock).
>
> That said, you are absolutely correct that this situation can be
> improved by dropping the lock in these cases too. I just didn't want
> to attack everything at once. I believe that once we agree on the
> approach implemented in
> https://lore.kernel.org/all/20230501175025.36233-3-surenb@google.com
> for dropping the VMA lock before waiting, these cases can be added
> more easily. Does that make sense?

OK, I looked at this path some more, and I think we're fine.
swap_readpage() is only called directly here for SWP_SYNCHRONOUS_IO,
which is only set for QUEUE_FLAG_SYNCHRONOUS devices: brd, zram and
nvdimms (both btt and pmem). So the answer is that we don't sleep in
this path, and there's no need to drop the lock.
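
For reference, the branch in question looks roughly like this (a
condensed paraphrase of do_swap_page() in mm/memory.c circa v6.3, not
a verbatim quote; error handling, memcg charging and workingset
bookkeeping elided, and exact signatures may differ in your tree):

	/* after the swapcache lookup misses; si is the
	 * swap_info_struct for entry, from get_swap_device() */
	if (!folio) {
		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
		    __swap_count(entry) == 1) {
			/* skip the swapcache: allocate a private folio
			 * and read it in synchronously; for brd/zram/
			 * nvdimm this completes without sleeping */
			folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
						vma, vmf->address, false);
			if (folio)
				swap_readpage(&folio->page, true, NULL);
		} else {
			/* readahead path: submits real I/O, and the
			 * fault may later block in folio_lock_or_retry() */
			page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
						vmf);
		}
	}

and SWP_SYNCHRONOUS_IO is only set at swapon time when the backing
device advertises QUEUE_FLAG_SYNCHRONOUS, roughly:

	/* mm/swapfile.c, in the swapon path (again paraphrased) */
	if (p->bdev && bdev_synchronous(p->bdev))
		p->flags |= SWP_SYNCHRONOUS_IO;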