linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nishanth Aravamudan <nacc@us.ibm.com>
To: William Lee Irwin III <wli@holomorphy.com>
Cc: kenchen@google.com, linux-mm@kvack.org, agl@us.ibm.com, dwg@au1.ibm.com
Subject: Re: FADV_DONTNEED on hugetlbfs files broken
Date: Sat, 17 Mar 2007 12:37:29 -0700	[thread overview]
Message-ID: <20070317193729.GA11449@us.ibm.com> (raw)
In-Reply-To: <20070317061322.GI8915@holomorphy.com>

On 16.03.2007 [23:13:22 -0700], William Lee Irwin III wrote:
> On Fri, Mar 16, 2007 at 10:13:09PM -0700, Nishanth Aravamudan wrote:
> > git commit 6649a3863232eb2e2f15ea6c622bd8ceacf96d76 "[PATCH] hugetlb:
> > preserve hugetlb pte dirty state" fixed one bug and caused another (or,
> > at least, a regression): FADV_DONTNEED no longer works on hugetlbfs
> > files. git-bisect revealed this commit to be the cause. I'm still trying
> > to figure out what the solution is (but it is also the start of the
> > weekend :) Maybe it's not a bug, but it is a change in behavior, and I
> > don't think it was clear from the commit message.
> 
> Well, setting the pages always dirty like that will prevent things
> from dropping them because they think they still need to be written
> back. It is, however, legitimate and/or permissible to ignore
> fadvise() and/or madvise(); they are by definition only advisory. I
> think this is more of a "please add back FADV_DONTNEED support"
> affair.

Yes, that could be :) Sorry if my e-mail indicated I was asking
otherwise. I don't want Ken's commit to be reverted, as that would make
hugepages very nearly unusable on x86 and x86_64. But I had found a
functional change and wanted it to be documented. If hugepages can no
longer be dropped from the page cache, then we should make sure that is
clear (and expected/desired).

Now, even if I call fsync() on the file descriptor, I still don't get
the pages out of the page cache. It seems to me like fsync() would clear
the dirty state -- although perhaps with Ken's patch, writable hugetlbfs
pages will *always* be dirty? I'm still trying to figure out what ever
clears that dirty state (in hugetlbfs or anywhere else). Seems like
hugetlbfs truncates call cancel_dirty_page(), but the comment there
indicates it's only for truncates.

> Perhaps we should ask what ramfs, tmpfs, et al would do. Or, for that
> matter, if they suffer from the same issue as Ken Chen identified for
> hugetlbfs. Perhaps the issue is not hugetlb's dirty state, but
> drop_pagecache_sb() failing to check the bdi for BDI_CAP_NO_WRITEBACK.
> Or perhaps what safety guarantees drop_pagecache_sb() is supposed to
> have or lack.

A good point, and one I hadn't considered. I'm less concerned by the
drop_pagecache_sb() path (which is /proc/sys/vm/drop_caches, yes?),
although it appears that it and the FADV_DONTNEED code both end up
calling into invalidate_mapping_pages(). I'm still pretty new to this
part of the kernel code, and am trying to follow along as best I can.

In any case, if the problem were in drop_pagecache_sb(), it seems like
it wouldn't help the DONTNEED case, since that's a level above the call
to invalidate_mapping_pages().

I'll keep looking through the code and thinking, and if anyone has any
patches they'd like me to test, I'll be glad to.

Thanks,
Nish

-- 
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-03-17 19:37 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-17  5:13 Nishanth Aravamudan
2007-03-17  6:13 ` William Lee Irwin III
2007-03-17 19:37   ` Nishanth Aravamudan [this message]
2007-03-18  2:13     ` William Lee Irwin III
2007-03-18  7:43     ` Ken Chen
2007-03-18 17:27       ` Nishanth Aravamudan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070317193729.GA11449@us.ibm.com \
    --to=nacc@us.ibm.com \
    --cc=agl@us.ibm.com \
    --cc=dwg@au1.ibm.com \
    --cc=kenchen@google.com \
    --cc=linux-mm@kvack.org \
    --cc=wli@holomorphy.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox