linux-mm.kvack.org archive mirror
From: Jan Stancek <jstancek@redhat.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-mm@kvack.org,
	kirill shutemov <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	ltp@lists.linux.it, mhocko@kernel.org,
	Rachel Sibley <rasibley@redhat.com>,
	hughd@google.com, n-horiguchi@ah.jp.nec.com, aarcange@redhat.com,
	aneesh kumar <aneesh.kumar@linux.vnet.ibm.com>,
	dave@stgolabs.net, prakash sangappa <prakash.sangappa@oracle.com>,
	colin king <colin.king@canonical.com>
Subject: Re: [bug] problems with migration of huge pages with v4.20-10214-ge1ef035d272e
Date: Thu, 3 Jan 2019 12:06:09 -0500 (EST)
Message-ID: <495081357.93179893.1546535169172.JavaMail.zimbra@redhat.com>
In-Reply-To: <1808265696.93134171.1546519652798.JavaMail.zimbra@redhat.com>



----- Original Message -----
<snip>

> > That commit does cause BUGs for migration and page poisoning of anon huge
> > pages.  The patch was trying to take care of i_mmap_rwsem locking outside
> > try_to_unmap infrastructure.  This is because try_to_unmap will take the
> > semaphore in read mode (for file mappings) and we really need it to be
> > taken in write mode.
> > 
> > The patch below continues to take the semaphore outside try_to_unmap for
> > the file mapping case.  For anon mappings, the locking is done as a special
> > case in try_to_unmap_one.  This is something I was trying to avoid as it
> > is harder to follow/understand.  Any suggestions on how to restructure this
> > or make it more clear are welcome.
> > 
> > Adding Andrew on Cc as he already sent the commit causing the BUGs
> > upstream.
> > 
> > From: Mike Kravetz <mike.kravetz@oracle.com>
> > 
> > hugetlbfs: fix migration and poisoning of anon huge pages
> > 
> > Expanded use of i_mmap_rwsem for pmd sharing synchronization incorrectly
> > used page_mapping() of anon huge pages to get to address_space
> > i_mmap_rwsem.  Since page_mapping() is NULL for pages of anon mappings,
> > an "unable to handle kernel NULL pointer" BUG would occur with stack
> > similar to:
> > 
> > RIP: 0010:down_write+0x1b/0x40
> > Call Trace:
> >  migrate_pages+0x81f/0xb90
> >  __ia32_compat_sys_migrate_pages+0x190/0x190
> >  do_move_pages_to_node.isra.53.part.54+0x2a/0x50
> >  kernel_move_pages+0x566/0x7b0
> >  __x64_sys_move_pages+0x24/0x30
> >  do_syscall_64+0x5b/0x180
> >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> > To fix, only use page_mapping() for non-anon or file pages.  For anon
> > pages wait until we find a vma in which the page is mapped and get the
> > address_space from vm_file.
> > 
> > Fixes: b43a99900559 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
> > synchronization")
> > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> Mike,
> 
> 1) with LTP move_pages12 (MAP_PRIVATE version of reproducer)
> Patch below fixes the panic for me.
> It didn't apply cleanly to latest master, but conflicts were easy to resolve.
> 
> 2) with MAP_SHARED version of reproducer
> It still hangs in user-space.
> v4.19 kernel appears to work fine so I've started a bisect.

My bisect with the MAP_SHARED version arrived at the same 2 commits:
  c86aa7bbfd55 hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
  b43a99900559 hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization

Maybe a deadlock between page lock and mapping->i_mmap_rwsem?

thread1:
  hugetlbfs_evict_inode
    i_mmap_lock_write(mapping);
    remove_inode_hugepages
      lock_page(page);

thread2:
  __unmap_and_move
    trylock_page(page) / lock_page(page)
      remove_migration_ptes
        rmap_walk_file
          i_mmap_lock_read(mapping);
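
The suspected ordering problem above can be illustrated in user space.
This is only a sketch under assumptions: a pthread rwlock stands in for
mapping->i_mmap_rwsem and a pthread mutex stands in for the page lock;
the names evict_path, migrate_path and count_blocked_paths are
hypothetical and nothing here is kernel code.  Each path takes its first
lock, waits until the other path holds its own first lock, and then only
*tries* the second lock, so the conflict is reported instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-ins; these are not the kernel objects. */
static pthread_rwlock_t i_mmap_rwsem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t hold_first;   /* both paths hold their first lock */
static pthread_barrier_t tried_second; /* both paths attempted the second */

/* Models thread1: i_mmap_lock_write(mapping), then lock_page(page). */
static void *evict_path(void *arg)
{
	int *would_block = arg;
	int rc;

	pthread_rwlock_wrlock(&i_mmap_rwsem);
	pthread_barrier_wait(&hold_first);
	/* trylock, so the sketch reports the conflict instead of hanging */
	rc = pthread_mutex_trylock(&page_lock);
	*would_block = (rc == EBUSY);
	pthread_barrier_wait(&tried_second);
	if (rc == 0)
		pthread_mutex_unlock(&page_lock);
	pthread_rwlock_unlock(&i_mmap_rwsem);
	return NULL;
}

/* Models thread2: lock_page(page), then i_mmap_lock_read(mapping). */
static void *migrate_path(void *arg)
{
	int *would_block = arg;
	int rc;

	pthread_mutex_lock(&page_lock);
	pthread_barrier_wait(&hold_first);
	/* a read lock cannot be granted while the writer holds the rwlock */
	rc = pthread_rwlock_tryrdlock(&i_mmap_rwsem);
	*would_block = (rc == EBUSY);
	pthread_barrier_wait(&tried_second);
	if (rc == 0)
		pthread_rwlock_unlock(&i_mmap_rwsem);
	pthread_mutex_unlock(&page_lock);
	return NULL;
}

/* Returns how many of the two paths found their second lock already held. */
int count_blocked_paths(void)
{
	pthread_t t1, t2;
	int blocked1 = 0, blocked2 = 0;
	int rc;

	rc = pthread_barrier_init(&hold_first, NULL, 2);
	assert(rc == 0);
	rc = pthread_barrier_init(&tried_second, NULL, 2);
	assert(rc == 0);
	rc = pthread_create(&t1, NULL, evict_path, &blocked1);
	assert(rc == 0);
	rc = pthread_create(&t2, NULL, migrate_path, &blocked2);
	assert(rc == 0);
	(void)rc;

	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	pthread_barrier_destroy(&hold_first);
	pthread_barrier_destroy(&tried_second);
	return blocked1 + blocked2;
}
```

Called from a small main() and built with -pthread, count_blocked_paths()
returns 2: each path finds its second lock held by the other.  With
blocking lock calls in place of the trylock variants, neither path could
make progress, which would match the kind of hang suspected here.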

Here's strace output:
<snip>
1196  11:27:16 mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = 0x7f646c400000
1197  11:27:16 set_robust_list(0x7f646d5b0e60, 24) = 0
1197  11:27:16 getppid()                = 1196
1197  11:27:16 move_pages(1196, 1024, [0x7f646c400000, 0x7f646c401000, 0x7f646c402000, 0x7f646c403000, 0x7f646c404000, 0x7f646c405000, 0x7f646c406000, 0x7f646c407000, 0x7f646c408000, 0x7f646c409000, 0x7f646c40a000, 0x7f646c40b000, 0x7f646c40c000, 0x7f646c40d000, 0x7f646c40e000, 0x7f646c40f000, 0x7f646c410000, 0x7f646c411000, 0x7f646c412000, 0x7f646c413000, 0x7f646c414000, 0x7f646c415000, 0x7f646c416000, 0x7f646c417000, 0x7f646c418000, 0x7f646c419000, 0x7f646c41a000, 0x7f646c41b000, 0x7f646c41c000, 0x7f646c41d000, 0x7f646c41e000, 0x7f646c41f000, ...], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...], [-ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, -ENOENT, ...], MPOL_MF_MOVE_ALL) = 0
1197  11:27:16 move_pages(1196, 1024, [0x7f646c400000, 0x7f646c401000, 0x7f646c402000, 0x7f646c403000, 0x7f646c404000, 0x7f646c405000, 0x7f646c406000, 0x7f646c407000, 0x7f646c408000, 0x7f646c409000, 0x7f646c40a000, 0x7f646c40b000, 0x7f646c40c000, 0x7f646c40d000, 0x7f646c40e000, 0x7f646c40f000, 0x7f646c410000, 0x7f646c411000, 0x7f646c412000, 0x7f646c413000, 0x7f646c414000, 0x7f646c415000, 0x7f646c416000, 0x7f646c417000, 0x7f646c418000, 0x7f646c419000, 0x7f646c41a000, 0x7f646c41b000, 0x7f646c41c000, 0x7f646c41d000, 0x7f646c41e000, 0x7f646c41f000, ...], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...], [1, -EACCES, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...], MPOL_MF_MOVE_ALL) = 0
1197  11:27:16 move_pages(1196, 1024, [0x7f646c400000, 0x7f646c401000, 0x7f646c402000, 0x7f646c403000, 0x7f646c404000, 0x7f646c405000, 0x7f646c406000, 0x7f646c407000, 0x7f646c408000, 0x7f646c409000, 0x7f646c40a000, 0x7f646c40b000, 0x7f646c40c000, 0x7f646c40d000, 0x7f646c40e000, 0x7f646c40f000, 0x7f646c410000, 0x7f646c411000, 0x7f646c412000, 0x7f646c413000, 0x7f646c414000, 0x7f646c415000, 0x7f646c416000, 0x7f646c417000, 0x7f646c418000, 0x7f646c419000, 0x7f646c41a000, 0x7f646c41b000, 0x7f646c41c000, 0x7f646c41d000, 0x7f646c41e000, 0x7f646c41f000, ...], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...],  <unfinished ...>
1196  11:27:16 munmap(0x7f646c400000, 4194304 <unfinished ...>
<hangs>

Regards,
Jan
