linux-mm.kvack.org archive mirror
From: Mark Hemment <markhemm@googlemail.com>
To: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 "Matthew Wilcox (Oracle)" <willy@infradead.org>,
	vbabka@suse.cz, rientjes@google.com, mhocko@suse.com,
	 Suren Baghdasaryan <surenb@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	Charan Teja Reddy <charante@codeaurora.org>
Subject: Re: [PATCH v3 RESEND] mm: shmem: implement POSIX_FADV_[WILL|DONT]NEED for shmem
Date: Wed, 12 Jan 2022 11:38:23 +0000
Message-ID: <CANe_+Uj+ccUSaCcU_+XixuM9eJkrh3M1TOCMB5D=8rpUxUM0JA@mail.gmail.com>
In-Reply-To: <c19b1c9e-6351-6e71-d472-5ccd39885984@quicinc.com>

On Mon, 10 Jan 2022 at 15:14, Charan Teja Kalla
<quic_charante@quicinc.com> wrote:
>
> Thanks again Mark for the review comments!!
>
> On 1/10/2022 6:06 PM, Mark Hemment wrote:
> > On Thu, 6 Jan 2022 at 17:06, Charan Teja Reddy
> > <quic_charante@quicinc.com> wrote:
> >>
> >> From: Charan Teja Reddy <charante@codeaurora.org>
> >>
> >> Currently fadvise(2) is supported only for files that are not
> >> associated with noop_backing_dev_info; thus for files like shmem,
> >> fadvise results in a NOP. But there is file_operations->fadvise(),
> >> which lets file systems provide their own fadvise
> >> implementation. Use this support to implement some of the POSIX_FADV_XXX
> >> functionality for shmem files.
> > ...
> >> +static void shmem_isolate_pages_range(struct address_space *mapping, loff_t start,
> >> +                               loff_t end, struct list_head *list)
> >> +{
> >> +       XA_STATE(xas, &mapping->i_pages, start);
> >> +       struct page *page;
> >> +
> >> +       rcu_read_lock();
> >> +       xas_for_each(&xas, page, end) {
> >> +               if (xas_retry(&xas, page))
> >> +                       continue;
> >> +               if (xa_is_value(page))
> >> +                       continue;
> >> +               if (!get_page_unless_zero(page))
> >> +                       continue;
> >> +               if (isolate_lru_page(page))
> >> +                       continue;
> >
> > Need to unwind the get_page on failure to isolate.
>
> Will be done.
>
> >
> > Should PageUnevictable() pages (SHM_LOCK) be skipped?
> > (That is, does SHM_LOCK override DONTNEED?)
>
>
> Should be skipped. Will be done.
>
> >
> > ...
> >> +static int shmem_fadvise_dontneed(struct address_space *mapping, loff_t start,
> >> +                               loff_t end)
> >> +{
> >> +       int ret;
> >> +       struct page *page;
> >> +       LIST_HEAD(list);
> >> +       struct writeback_control wbc = {
> >> +               .sync_mode = WB_SYNC_NONE,
> >> +               .nr_to_write = LONG_MAX,
> >> +               .range_start = 0,
> >> +               .range_end = LLONG_MAX,
> >> +               .for_reclaim = 1,
> >> +       };
> >> +
> >> +       if (!shmem_mapping(mapping))
> >> +               return -EINVAL;
> >> +
> >> +       if (!total_swap_pages)
> >> +               return 0;
> >> +
> >> +       lru_add_drain();
> >> +       shmem_isolate_pages_range(mapping, start, end, &list);
> >> +
> >> +       while (!list_empty(&list)) {
> >> +               page = lru_to_page(&list);
> >> +               list_del(&page->lru);
> >> +               if (page_mapped(page))
> >> +                       goto keep;
> >> +               if (!trylock_page(page))
> >> +                       goto keep;
> >> +               if (unlikely(PageTransHuge(page))) {
> >> +                       if (split_huge_page_to_list(page, &list))
> >> +                               goto keep;
> >> +               }
> >
> > I don't know the shmem code and the lifecycle of a shm-page, so
> > genuine questions;
> > When the try-lock succeeds, should there be a test for PageWriteback()
> > (page skipped if true)?  Also, does page->mapping need to be tested
> > for NULL to prevent races with deletion from the page-cache?
>
> I failed to envisage it. I should have considered both these conditions
> here. BTW, I am wondering why we shouldn't use the
> reclaim_pages(page_list) function here, with an extra set_page_dirty() on
> each page that is isolated? It just calls shrink_page_list(), where all
> these conditions are properly handled. What is your opinion here?

Should be possible to use reclaim_pages() (I haven't looked closely).
It might actually be good to use this function, as it will do some
congestion throttling.  However, it will always try to unmap
pages (note: your page_mapped() test is 'unstable' as it is done without
the page locked), so it might give behaviour you want to avoid.
Note: reclaim_pages() is already used for madvise(PAGEOUT).  The shmem
code would need to prepare the page(s) to help shrink_page_list() make
progress (see madvise.c:madvise_cold_or_pageout_pte_range()).

Taking a step back: is fadvise(DONTNEED) really needed/wanted?  Yes,
you gave a usecase (which I cut from this thread in my earlier reply),
but I'm not familiar enough with the various shmem uses to know if this
feature is needed.  Someone else will need to answer this.
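
For anyone weighing the usecase, here is a minimal userspace sketch of
the call the patch is about.  It is only an illustration under stated
assumptions: the helper names shmem_file_create() and
shmem_file_dontneed() are made up for this example, and it assumes a
Linux system whose libc exposes memfd_create().  Going by the patch
description quoted above, on a kernel without the patch the advice
should be accepted but have no effect on shmem pages.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an anonymous shmem-backed file of 'len'
 * bytes.  Returns a file descriptor, or -1 on failure.
 */
int shmem_file_create(off_t len)
{
	int fd = memfd_create("fadvise-demo", MFD_CLOEXEC);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, len) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: tell the kernel that [0, len) of 'fd' is not
 * expected to be accessed soon.  posix_fadvise() returns 0 on success
 * or an error number (it does not set errno).
 */
int shmem_file_dontneed(int fd, off_t len)
{
	return posix_fadvise(fd, 0, len, POSIX_FADV_DONTNEED);
}
```

A caller would do something like fd = shmem_file_create(1 << 20), fill
the file through write() or mmap(), and later call
shmem_file_dontneed(fd, 1 << 20) once the contents are cold.  Whether
the pages then actually go to swap depends on the kernel carrying a
shmem fadvise implementation such as the one proposed here.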

Cheers,
Mark

>
> >
> > ...
> >> +
> >> +               clear_page_dirty_for_io(page);
> >> +               SetPageReclaim(page);
> >> +               ret = shmem_writepage(page, &wbc);
> >> +               if (ret || PageWriteback(page)) {
> >> +                       if (ret)
> >> +                               unlock_page(page);
> >> +                       goto keep;
> >> +               }
> >> +
> >> +               if (!PageWriteback(page))
> >> +                       ClearPageReclaim(page);
> >> +
> >> +               /*
> >> +                * shmem_writepage() place the page in the swapcache.
> >> +                * Delete the page from the swapcache and release the
> >> +                * page.
> >> +                */
> >> +               __mod_node_page_state(page_pgdat(page),
> >> +                               NR_ISOLATED_ANON + page_is_file_lru(page), compound_nr(page));
> >> +               lock_page(page);
> >> +               delete_from_swap_cache(page);
> >> +               unlock_page(page);
> >> +               put_page(page);
> >> +               continue;
> >> +keep:
> >> +               putback_lru_page(page);
> >> +               __mod_node_page_state(page_pgdat(page),
> >> +                               NR_ISOLATED_ANON + page_is_file_lru(page), compound_nr(page));
> >> +       }
> >
> > The putback_lru_page() drops the last reference this code holds on
> > 'page'.  Is it safe to use 'page' after dropping this reference?
>
> True. Will correct it in the next revision.
>
> >
> > Cheers,
> > Mark
> >


