From: Vlastimil Babka <vbabka@suse.cz>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Dave Hansen <dave.hansen@intel.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>,
David Rientjes <rientjes@google.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCHv5 5/7] mm: make compound_head() robust
Date: Mon, 14 Sep 2015 15:31:07 +0200 [thread overview]
Message-ID: <55F6CC1B.8020304@suse.cz> (raw)
In-Reply-To: <20150911133501.GA9129@node.dhcp.inet.fi>
On 09/11/2015 03:35 PM, Kirill A. Shutemov wrote:
>>> index 097c7a4bfbd9..330377f83ac7 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -1686,8 +1686,7 @@ static void __split_huge_page_refcount(struct page *page,
>>> (1L << PG_unevictable)));
>>> page_tail->flags |= (1L << PG_dirty);
>>>
>>> - /* clear PageTail before overwriting first_page */
>>> - smp_wmb();
>>> + clear_compound_head(page_tail);
>>
>> I would sleep better if this was done before setting all the page->flags above,
>> previously, PageTail was cleared by the first operation which is
>> "page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;"
>> I do realize that it doesn't use WRITE_ONCE, so it might have been theoretically
>> broken already, if it does matter.
>
> Right. Nothing enforces particular order. If we really need to have some
> ordering on PageTail() vs. page->flags let's define it, but so far I
> don't see a reason to change this part.
OK.
>>> diff --git a/mm/internal.h b/mm/internal.h
>>> index 36b23f1e2ca6..89e21a07080a 100644
>>> --- a/mm/internal.h
>>> +++ b/mm/internal.h
>>> @@ -61,9 +61,9 @@ static inline void __get_page_tail_foll(struct page *page,
>>> * speculative page access (like in
>>> * page_cache_get_speculative()) on tail pages.
>>> */
>>> - VM_BUG_ON_PAGE(atomic_read(&page->first_page->_count) <= 0, page);
>>> + VM_BUG_ON_PAGE(atomic_read(&compound_head(page)->_count) <= 0, page);
>>> if (get_page_head)
>>> - atomic_inc(&page->first_page->_count);
>>> + atomic_inc(&compound_head(page)->_count);
>>
>> Doing another compound_head() seems like overkill when this code already assumes
>> PageTail.
>
> "Overkill"? It's too strong wording for re-read hot cache line.
Hm good point.
>> All callers do it after if (PageTail()) which means they already did
>> READ_ONCE(page->compound_head) and here they do another one. Potentially with
>> different result in bit 0, which would be a subtle bug, that could be
>> interesting to catch with some VM_BUG_ON. I don't know if a direct plain
>> page->compound_head access here instead of compound_head() would also result in
>> a re-read, since the previous access did use READ_ONCE(). Maybe it would be best
>> to reorganize the code here and in the 3 call sites so that the READ_ONCE() used
>> to determine PageTail also obtains the compound head pointer.
>
> All we would possbily win by the change is few bytes in code. Additional
> READ_ONCE() only affect code generation. It doesn't introduce any cpu
> barriers. The cache line with compound_head is in L1 anyway.
>
> I don't see justification to change this part too. If you think we can
> gain something by reworking this code, feel free to propose patch on top.
OK, it's probably not worth:
add/remove: 0/0 grow/shrink: 1/3 up/down: 7/-66 (-59)
function old new delta
follow_trans_huge_pmd 516 523 +7
follow_page_pte 905 893 -12
follow_hugetlb_page 943 919 -24
__get_page_tail 440 410 -30
>>> diff --git a/mm/swap.c b/mm/swap.c
>>> index a3a0a2f1f7c3..faa9e1687dea 100644
>>> --- a/mm/swap.c
>>> +++ b/mm/swap.c
>>> @@ -200,7 +200,7 @@ out_put_single:
>>> __put_single_page(page);
>>> return;
>>> }
>>> - VM_BUG_ON_PAGE(page_head != page->first_page, page);
>>> + VM_BUG_ON_PAGE(page_head != compound_head(page), page);
>>> /*
>>> * We can release the refcount taken by
>>> * get_page_unless_zero() now that
>>> @@ -261,7 +261,7 @@ static void put_compound_page(struct page *page)
>>> * Case 3 is possible, as we may race with
>>> * __split_huge_page_refcount tearing down a THP page.
>>> */
>>> - page_head = compound_head_by_tail(page);
>>> + page_head = compound_head(page);
>>
>> This is also in a path after PageTail() is true.
>
> We can only save one branch here: other PageTail() is most likely in other
> compilation unit and compiler would not be able to eliminate additional
> load.
> Why bother?
Hmm, right. Bah.
add/remove: 0/0 grow/shrink: 1/0 up/down: 8/0 (8)
function old new delta
put_compound_page 500 508 +8
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-09-14 13:31 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-03 12:35 [PATCHv5 0/7] Fix compound_head() race Kirill A. Shutemov
2015-09-03 12:35 ` [PATCHv5 1/7] mm: drop page->slab_page Kirill A. Shutemov
2015-09-07 5:00 ` Alexander Duyck
2015-09-07 12:19 ` Kirill A. Shutemov
2015-09-03 12:35 ` [PATCHv5 2/7] slub: use page->rcu_head instead of page->lru plus cast Kirill A. Shutemov
2015-09-03 12:35 ` [PATCHv5 3/7] zsmalloc: use page->private instead of page->first_page Kirill A. Shutemov
2015-09-03 12:35 ` [PATCHv5 4/7] mm: pack compound_dtor and compound_order into one word in struct page Kirill A. Shutemov
2015-09-10 9:41 ` Vlastimil Babka
2015-09-03 12:35 ` [PATCHv5 5/7] mm: make compound_head() robust Kirill A. Shutemov
2015-09-10 10:54 ` Vlastimil Babka
2015-09-11 13:35 ` Kirill A. Shutemov
2015-09-14 13:31 ` Vlastimil Babka [this message]
2015-09-03 12:35 ` [PATCHv5 6/7] mm: use 'unsigned int' for page order Kirill A. Shutemov
2015-09-10 11:04 ` Vlastimil Babka
2015-09-03 12:35 ` [PATCHv5 7/7] mm: use 'unsigned int' for compound_dtor/compound_order on 64BIT Kirill A. Shutemov
2015-09-10 11:34 ` Vlastimil Babka
2015-09-04 13:43 ` [PATCHv5 0/7] Fix compound_head() race Andrea Arcangeli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55F6CC1B.8020304@suse.cz \
--to=vbabka@suse.cz \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=dave.hansen@intel.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
--cc=paulmck@linux.vnet.ibm.com \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox