Date: Tue, 15 Oct 2013 16:16:23 -0400
From: Naoya Horiguchi
Message-ID: <1381868183-6d50s9n5-mutt-n-horiguchi@ah.jp.nec.com>
In-Reply-To: <20131015194428.GI3479@redhat.com>
References: <20131015143407.GE3479@redhat.com>
 <20131015144827.C45DDE0090@blue.fi.intel.com>
 <20131015185510.GH3479@redhat.com>
 <1381865330-8nb86ucy-mutt-n-horiguchi@ah.jp.nec.com>
 <20131015194428.GI3479@redhat.com>
Subject: Re: mm: fix BUG in __split_huge_page_pmd
To: Andrea Arcangeli
Cc: Hugh Dickins, "Kirill A. Shutemov", Andrew Morton, David Rientjes,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org

On Tue, Oct 15, 2013 at 09:44:28PM +0200, Andrea Arcangeli wrote:
> On Tue, Oct 15, 2013 at 03:28:50PM -0400, Naoya Horiguchi wrote:
> > On Tue, Oct 15, 2013 at 08:55:10PM +0200, Andrea Arcangeli wrote:
> > > On Tue, Oct 15, 2013 at 10:53:10AM -0700, Hugh Dickins wrote:
> > > > I'm afraid Andrea's mail about concurrent madvises gives me far more
> > > > to think about than I have time for: seems to get into problems he
> > > > knows a lot about but I'm unfamiliar with. If this patch looks good
> > > > for now on its own, let's put it in; but no problem if you guys prefer
> > > > to wait for a fuller solution of more problems, we can ride with this
> > > > one internally for the moment.
> > >
> > > I'm very happy with the patch and I think it's a correct fix for the
> > > COW scenario which is deterministic so the looping makes a meaningful
> > > difference for it. If we wouldn't loop, part of the copied page
> > > wouldn't be zapped after the COW.
> >
> > I like this patch, too.
> >
> > If we have the loop in __split_huge_page_pmd as suggested in this patch,
> > can we assume that the pmd is stable after __split_huge_page_pmd returns?
> > If it's true, we can remove pmd_none_or_trans_huge_or_clear_bad check
> > in the callers side (zap_pmd_range and some other page table walking code.)
>
> We can assume it stable for the deterministic cases where the
> looping is useful and split_huge_page creates non-huge pmd that points to
> a regular pte.
>
> But we cannot remove pmd_none_or_trans_huge_or_clear_bad after it for
> the other non deterministic cases that I described in previous
> email. Looping still provides no guarantee that when the function
> returns, the pmd is not huge. So for safety we still need to handle
> the non deterministic case and just discard it through
> pmd_none_or_trans_huge_or_clear_bad.

OK, so this check is necessary.

But pmd_none_or_trans_huge_or_clear_bad doesn't clear the pmd when
pmd_trans_huge is true, so zap_pmd_range seems to do nothing for such an
irregular trans-huge pmd. It looks better to me for zap_pmd_range to
retry the loop on the same address instead of doing 'goto next'.

I ask because I have recently been studying the page table walker, and
some related code does retry in a similar situation.

Thanks,
Naoya Horiguchi
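
To make the difference between 'goto next' and retrying on the same address concrete, here is a minimal userspace sketch. It is not kernel code: the types and functions (fake_pmd, fake_split, zap_range) and the "refaults" counter that stands in for a racing page fault are all hypothetical and only model the behaviour discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a pmd entry; not the kernel's pmd_t. */
enum pmd_state { PMD_NONE, PMD_TABLE, PMD_HUGE };

struct fake_pmd {
    enum pmd_state state;
    int refaults;  /* times a simulated racing fault re-installs a huge pmd */
    bool zapped;   /* set once the range under this pmd has been zapped */
};

/*
 * Model of splitting a huge pmd. The split itself always produces a
 * regular page table, but a simulated concurrent fault may put a huge
 * pmd back before the caller looks again (the non deterministic case).
 */
void fake_split(struct fake_pmd *pmd)
{
    if (pmd->state != PMD_HUGE)
        return;
    pmd->state = PMD_TABLE;
    if (pmd->refaults > 0) {
        pmd->refaults--;
        pmd->state = PMD_HUGE;
    }
}

/*
 * Model of the zap loop. With retry == false, a pmd that is still huge
 * after the split is skipped ('goto next'). With retry == true, the
 * same entry is processed again. Returns the number of skipped entries.
 */
size_t zap_range(struct fake_pmd *pmds, size_t n, bool retry)
{
    size_t skipped = 0;

    assert(pmds != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct fake_pmd *pmd = &pmds[i];
again:
        if (pmd->state == PMD_HUGE)
            fake_split(pmd);
        if (pmd->state == PMD_NONE)
            continue;               /* nothing mapped here */
        if (pmd->state == PMD_HUGE) {
            if (retry)
                goto again;         /* retry on the same address */
            skipped++;              /* 'goto next': left untouched */
            continue;
        }
        pmd->state = PMD_NONE;      /* zap the regular page table */
        pmd->zapped = true;
    }
    return skipped;
}
```

In this model, calling zap_range() with retry set to false leaves a re-faulted huge pmd in place and counts it as skipped, while retry set to true keeps working on the same entry until the split sticks. Whether retrying is the right behaviour for the real zap_pmd_range is exactly the open question above; the sketch only shows what each choice does to the entry.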