From: Miklos Szeredi <miklos@szeredi.hu>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: miklos@szeredi.hu, dave@linux.vnet.ibm.com,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, shenlinf@cn.ibm.com,
volobuev@us.ibm.com, mel@linux.vnet.ibm.com, dingc@cn.ibm.com,
lnxninja@us.ibm.com
Subject: Re: Deadlocks with transparent huge pages and userspace fs daemons
Date: Wed, 15 Dec 2010 15:54:45 +0100
Message-ID: <E1PSskf-00066t-US@pomaz-ex.szeredi.hu>
In-Reply-To: <20101215052450.GQ5638@random.random> (message from Andrea Arcangeli on Wed, 15 Dec 2010 06:24:50 +0100)

On Wed, 15 Dec 2010, Andrea Arcangeli wrote:
> Hello Miklos and everyone,
>
> On Tue, Dec 14, 2010 at 10:03:33PM +0100, Miklos Szeredi wrote:
> > This is all fine and dandy, but please let's not forget about the
> > other thing that Dave's test uncovered. Namely that page migration
> > triggered by transparent hugepages takes the page lock on arbitrary
> > filesystems. This is also deadlocky on fuse, but also not a good idea
> > for any filesystem where page reading time is not bounded (think NFS
> > with network down).
>
> In #33 I fixed the mmap_sem write issue which is more clear to me and
> it makes the code better.
>
> The page lock I don't have full picture on it. Notably there is no
> waiting on page lock on khugepaged and khugepaged can't use page
> migration (it's not migrating, it's collapsing).
>
> The page lock mentioned in migration context I don't see how can it be
> related to THP. There's not a _single_ lock_page in mm/huge_memory.c .
>
> If fuse has deadlock troubles in migration lock_page then I would
> guess THP has nothing to do with it memory compaction, and it can
> trigger already in upstream stable 2.6.36 when CONFIG_COMPACTION=y by
> just doing:
>
> echo 1024 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>
> or by simply insmodding a driver that tries a large
> alloc_pages(order).
>
> My understanding of Dave's trace is that THP makes it easier to
> reproduce, but this isn't really THP related, it can happen already
> upstream without my patchset applied, and it's just a pure coincidence
> that THP makes it more easy to reproduce.

Right, it's questionable whether any page migration should wait for
I/O, as that can introduce large delays and even a complete lockup of
an unrelated process (as in the case of an NFS server being offline).

The man page for move_pages() clearly defines I/O as an error
condition:

    -EBUSY The page is currently busy and cannot be moved. Try again
           later. This occurs if a page is undergoing I/O or another
           kernel subsystem is holding a reference to the page.

yet the actual code waits for I/O, both read and write. That might be
OK with some timeouts. Page migration is best effort anyway, so
waiting forever on I/O makes little sense.
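
To make the documented contract concrete, here is a minimal sketch of how a
caller could treat -EBUSY as "try again later" with a bounded number of
retries. It is an illustration under stated assumptions, not kernel code and
not a proposed fix: the names page_result, classify_status() and
move_page_best_effort() are hypothetical, the retry count and back-off are
arbitrary, and the raw syscall is used only to avoid a libnuma dependency.
As noted above, the kernel code at the time waited for I/O instead of
returning -EBUSY, so the retry path reflects the man page, not necessarily
the observed behaviour.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/mempolicy.h>   /* MPOL_MF_MOVE */
#include <sys/syscall.h>       /* SYS_move_pages */
#include <unistd.h>            /* syscall(), usleep() */

/* Hypothetical classification of one move_pages() status entry. */
enum page_result { PAGE_OK, PAGE_BUSY, PAGE_FAILED };

enum page_result classify_status(int status)
{
    if (status >= 0)
        return PAGE_OK;      /* status is the node the page resides on */
    if (status == -EBUSY)
        return PAGE_BUSY;    /* documented as transient: I/O or extra reference */
    return PAGE_FAILED;      /* any other negative errno, e.g. -EFAULT */
}

/*
 * Hypothetical best-effort wrapper: ask the kernel to move one page of the
 * calling process to target_node. While the page is reported busy, retry at
 * most max_tries times, then give up instead of waiting forever.
 */
enum page_result move_page_best_effort(void *addr, int target_node,
                                       int max_tries)
{
    for (int attempt = 0; attempt < max_tries; attempt++) {
        void *pages[1] = { addr };
        int nodes[1] = { target_node };
        int status[1] = { -EFAULT };

        /* pid 0 means the calling process; count is one page. */
        long rc = syscall(SYS_move_pages, 0, 1UL, pages, nodes, status,
                          MPOL_MF_MOVE);
        if (rc < 0)
            return PAGE_FAILED;   /* the whole call failed, see errno */

        enum page_result res = classify_status(status[0]);
        if (res != PAGE_BUSY)
            return res;

        usleep(1000);   /* arbitrary back-off before the next attempt */
    }
    return PAGE_BUSY;   /* no attempts left: caller should skip this page */
}
```

A caller would skip any page that comes back as PAGE_BUSY and carry on with
the rest, which is what "best effort" suggests.
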
> How to fix I'm not sure yet
> as I didn't look into it closely as I was focusing on rolling a THP
> specific update first, but at the moment it even sounds more like an
> issue with strict migration than memory compaction.

Yes, this is a page migration issue. But the fact is, THP will make
it more visible exactly because it can be used without any special
configuration.

Thanks,
Miklos