From: Izik Eidus <ieidus@redhat.com>
To: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Rik van Riel <riel@redhat.com>, Chris Wright <chrisw@redhat.com>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 9/12] ksm: fix oom deadlock
Date: Tue, 04 Aug 2009 22:32:14 +0300 [thread overview]
Message-ID: <4A788CBE.7080100@redhat.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0908031317190.16754@sister.anvils>
Hugh Dickins wrote:
> There's a now-obvious deadlock in KSM's out-of-memory handling:
> imagine ksmd or KSM_RUN_UNMERGE handling, holding ksm_thread_mutex,
> trying to allocate a page to break KSM in an mm which becomes the
> OOM victim (quite likely in the unmerge case): it's killed and goes
> to exit, and hangs there waiting to acquire ksm_thread_mutex.
>
> Clearly we must not require ksm_thread_mutex in __ksm_exit, simple
> though that made everything else: perhaps use mmap_sem somehow?
> And part of the answer lies in the comments on unmerge_ksm_pages:
> __ksm_exit should also leave all the rmap_item removal to ksmd.
>
> But there's a fundamental problem, that KSM relies upon mmap_sem to
> guarantee the consistency of the mm it's dealing with, yet exit_mmap
> tears down an mm without taking mmap_sem. And bumping mm_users won't
> help at all, that just ensures that the pages the OOM killer assumes
> are on their way to being freed will not be freed.
>
> The best answer seems to be, to move the ksm_exit callout from just
> before exit_mmap, to the middle of exit_mmap: after the mm's pages
> have been freed (if the mmu_gather is flushed), but before its page
> tables and vma structures have been freed; and down_write,up_write
> mmap_sem there to serialize with KSM's own reliance on mmap_sem.
>
> But KSM then needs to be careful, whenever it downs mmap_sem, to
> check that the mm is not already exiting: there's a danger of using
> find_vma on a layout that's being torn apart, or writing into page
> tables which have been freed for reuse; and even do_anonymous_page
> and __do_fault need to check they're not being called by break_ksm
> to reinstate a pte after zap_pte_range has zapped that page table.
>
> Though it might be clearer to add an exiting flag, set while holding
> mmap_sem in __ksm_exit, that wouldn't cover the issue of reinstating
> a zapped pte. All we need is to check whether mm_users is 0 - but
> must remember that ksmd may detect that before __ksm_exit is reached.
> So, ksm_test_exit(mm) added to comment such checks on mm->mm_users.
>
> __ksm_exit now has to leave clearing up the rmap_items to ksmd,
> that needs ksm_thread_mutex; but shift the exiting mm just after the
> ksm_scan cursor so that it will soon be dealt with. __ksm_enter raise
> mm_count to hold the mm_struct, ksmd's exit processing (exactly like
> its processing when it finds all VM_MERGEABLEs unmapped) mmdrop it,
> similar procedure for KSM_RUN_UNMERGE (which has stopped ksmd).
>
> But also give __ksm_exit a fast path: when there's no complication
> (no rmap_items attached to mm and it's not at the ksm_scan cursor),
> it can safely do all the exiting work itself. This is not just an
> optimization: when ksmd is not running, the raised mm_count would
> otherwise leak mm_structs.
>
> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> ---
>
Acked-by: Izik Eidus <ieidus@redhat.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-08-04 18:58 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-08-03 12:08 [PATCH 0/12] ksm: stats, oom, doc, misc Hugh Dickins
2009-08-03 12:10 ` [PATCH 1/12] ksm: rename kernel_pages_allocated Hugh Dickins
2009-08-03 14:21 ` Izik Eidus
2009-08-03 16:48 ` Andrea Arcangeli
2009-08-03 12:11 ` [PATCH 2/12] ksm: move pages_sharing updates Hugh Dickins
2009-08-03 14:34 ` Izik Eidus
2009-08-03 16:53 ` Andrea Arcangeli
2009-08-03 17:34 ` Hugh Dickins
2009-08-03 12:11 ` [PATCH 3/12] ksm: pages_unshared and pages_volatile Hugh Dickins
2009-08-03 14:54 ` Izik Eidus
2009-08-04 21:49 ` Andrew Morton
2009-08-05 11:39 ` Hugh Dickins
2009-08-05 15:11 ` Andrea Arcangeli
2009-08-03 12:12 ` [PATCH 4/12] ksm: break cow once unshared Hugh Dickins
2009-08-03 16:00 ` Izik Eidus
2009-08-03 12:14 ` [PATCH 5/12] ksm: keep quiet while list empty Hugh Dickins
2009-08-03 16:55 ` Izik Eidus
2009-08-04 21:59 ` Andrew Morton
2009-08-05 11:54 ` Hugh Dickins
2009-08-03 12:15 ` [PATCH 6/12] ksm: five little cleanups Hugh Dickins
2009-08-04 12:41 ` Izik Eidus
2009-08-03 12:16 ` [PATCH 7/12] ksm: fix endless loop on oom Hugh Dickins
2009-08-04 12:55 ` Izik Eidus
2009-08-03 12:17 ` [PATCH 8/12] ksm: distribute remove_mm_from_lists Hugh Dickins
2009-08-04 13:03 ` Izik Eidus
2009-08-03 12:18 ` [PATCH 9/12] ksm: fix oom deadlock Hugh Dickins
2009-08-04 19:32 ` Izik Eidus [this message]
2009-08-25 14:58 ` Andrea Arcangeli
2009-08-25 15:22 ` [PATCH 13/12] ksm: fix munlock during exit_mmap deadlock Andrea Arcangeli
2009-08-25 17:49 ` Hugh Dickins
2009-08-25 18:10 ` Andrea Arcangeli
2009-08-25 18:58 ` Hugh Dickins
2009-08-25 19:45 ` Andrea Arcangeli
2009-08-26 16:18 ` Justin M. Forbes
2009-08-26 19:17 ` Hugh Dickins
2009-08-26 19:44 ` Andrea Arcangeli
2009-08-26 19:57 ` Hugh Dickins
2009-08-26 20:28 ` Andrea Arcangeli
2009-08-26 20:54 ` Izik Eidus
2009-08-26 21:14 ` Andrea Arcangeli
2009-08-26 21:49 ` Izik Eidus
2009-08-27 19:11 ` Hugh Dickins
2009-08-27 19:35 ` Izik Eidus
2009-08-26 22:00 ` David Rientjes
2009-08-26 20:29 ` Hugh Dickins
2009-08-25 17:35 ` [PATCH 9/12] ksm: fix oom deadlock Hugh Dickins
2009-08-25 17:47 ` Andrea Arcangeli
2009-08-03 12:19 ` [PATCH 10/12] ksm: sysfs and defaults Hugh Dickins
2009-08-04 19:34 ` Izik Eidus
2009-08-03 12:21 ` [PATCH 11/12] ksm: add some documentation Hugh Dickins
2009-08-04 19:35 ` Izik Eidus
2009-08-03 12:22 ` [PATCH 12/12] ksm: remove VM_MERGEABLE_FLAGS Hugh Dickins
2009-08-04 19:35 ` Izik Eidus
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A788CBE.7080100@redhat.com \
--to=ieidus@redhat.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=chrisw@redhat.com \
--cc=hugh.dickins@tiscali.co.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nickpiggin@yahoo.com.au \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox