linux-mm.kvack.org archive mirror
From: John Hubbard <jhubbard@nvidia.com>
To: Minchan Kim <minchan@kernel.org>, Yu Zhao <yuzhao@google.com>
Cc: Mauricio Faria de Oliveira <mfo@canonical.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-block@vger.kernel.org,
	Huang Ying <ying.huang@intel.com>,
	Miaohe Lin <linmiaohe@huawei.com>, Yang Shi <shy828301@gmail.com>
Subject: Re: [PATCH v2] mm: fix race between MADV_FREE reclaim and blkdev direct IO read
Date: Tue, 11 Jan 2022 11:29:36 -0800
Message-ID: <e75c8f37-782f-f4d4-b197-8fda18090b42@nvidia.com>
In-Reply-To: <Yd3SUXVy7MbyBzFw@google.com>

On 1/11/22 10:54, Minchan Kim wrote:
...
> Hi Yu,
> 
> I think you're correct. I think we don't want a memory barrier
> there in page_dup_rmap. Then, how about making gup_fast aware
> of FOLL_TOUCH?
> 
> FOLL_TOUCH means it's going to write something so the page

Actually, my understanding of FOLL_TOUCH is that it does *not* mean that
data will be written to the page. That is what FOLL_WRITE is for.
FOLL_TOUCH means: update the "accessed" metadata, without actually
writing to the memory that the page represents.
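
To make the distinction concrete, here is a minimal userspace sketch of the
semantics described above. It is a toy model, not kernel code: struct
toy_page, toy_follow(), and the TOY_FOLL_* names and values are hypothetical
stand-ins, chosen only for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's FOLL_* flags (illustrative values). */
#define TOY_FOLL_WRITE 0x01u   /* caller intends to write to the page */
#define TOY_FOLL_TOUCH 0x02u   /* caller wants "accessed" metadata updated */

/* Toy page: only the two bits of state discussed in this thread. */
struct toy_page {
	bool accessed;
	bool dirty;
};

/*
 * Assumed semantics, as described above: TOUCH alone updates only the
 * accessed state. The page is dirtied only when WRITE is also requested.
 */
void toy_follow(struct toy_page *page, unsigned int flags)
{
	if (flags & TOY_FOLL_TOUCH) {
		if (flags & TOY_FOLL_WRITE)
			page->dirty = true;
		page->accessed = true;
	}
}
```

In this model, calling toy_follow() with only TOY_FOLL_TOUCH leaves the page
clean, which is why treating "touch" as "will be written" would change the
meaning of the flag.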


> should be dirty. Currently, get_user_pages works like that.
> However, the problem is get_user_pages_fast, since it looks like
> lockless_pages_from_mm doesn't support FOLL_TOUCH.
> 
> So the idea is if param in internal_get_user_pages_fast
> includes FOLL_TOUCH, gup_{pmd,pte}_range try to make the
> page dirty under trylock_page (if the lock fails, it goes

Marking a page dirty solely because FOLL_TOUCH is specified would
be an API-level mistake. That's why it isn't "supported". Or at least,
that's how I'm reading things.

Hope that helps!

> slow path with __gup_longterm_unlocked and set_dirty_pages
> for them).
> 
> This approach would solve other cases that map userspace
> pages into kernel space and then write. Since the write
> didn't go through the process's page table, we will
> lose the dirty bit in the page table of the process, and
> it turns out to be the same problem. That's why I'd like to
> take this approach.
> 
> If it doesn't work, the other option to fix this specific
> case is: can't we make pages dirty in advance in the DIO read case?
> 
> When I look at the DIO code, it's already doing that in the async case.
> Couldn't we do the same thing for the other cases?
> I guess the worst case we would see is more page
> writeback, since pages become dirty unnecessarily.

Marking pages dirty after pinning them is a pre-existing area of
problems. See the long-running LWN articles about get_user_pages() [1].


[1] https://lwn.net/Kernel/Index/#Memory_management-get_user_pages

thanks,
-- 
John Hubbard
NVIDIA

