From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH -V2 -mm 1/4] mm, clear_huge_page: Move order algorithm into a separate function
References: <20180524005851.4079-1-ying.huang@intel.com> <20180524005851.4079-2-ying.huang@intel.com>
From: Mike Kravetz
Message-ID: <4569310c-ae07-2353-8276-f9cba3011ea5@oracle.com>
Date: Thu, 24 May 2018 13:55:48 -0700
In-Reply-To: <20180524005851.4079-2-ying.huang@intel.com>
To: "Huang, Ying", Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andi Kleen, Jan Kara,
 Michal Hocko, Andrea Arcangeli, "Kirill A. Shutemov", Matthew Wilcox,
 Hugh Dickins, Minchan Kim, Shaohua Li, Christopher Lameter

On 05/23/2018 05:58 PM, Huang, Ying wrote:
> From: Huang Ying
>
> In commit c79b57e462b5d ("mm: hugetlb: clear target sub-page last when
> clearing huge page"), to keep the cache lines of the target subpage
> hot, the order in which clear_huge_page() clears the subpages of a
> huge page was changed: the subpage furthest from the target subpage is
> cleared first, and the target subpage itself is cleared last.  The
> same order algorithm can be applied to copying huge pages as well.  To
> avoid code duplication and reduce maintenance overhead, this patch
> moves the order algorithm out of clear_huge_page() into a separate
> function, process_huge_page(), so that it can be reused for copying
> huge pages.
>
> This turns the direct calls to clear_user_highpage() into indirect
> calls, but with proper inlining support in the compiler the indirect
> calls are optimized back into direct calls.  Our tests show no
> performance change with this patch.
>
> This patch is a code cleanup without functional change.
>
> Signed-off-by: "Huang, Ying"
> Suggested-by: Mike Kravetz

Thanks for doing this.  The extra level of indirection does make this a
bit more difficult to read.  However, I believe this is offset by the
reuse of the algorithm in the subsequent copy_huge_page support.
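To make the intended reuse concrete, here is a rough sketch of what the
copy side could look like on top of process_huge_page().  The
copy_subpage() callback, the copy_subpage_arg struct and the
addr_hint-based signature are only illustrative guesses of mine, not
taken from the follow-up patches; copy_user_gigantic_page() is the
existing helper visible at the bottom of the quoted diff:

struct copy_subpage_arg {
	struct page *dst;
	struct page *src;
	struct vm_area_struct *vma;
};

static void copy_subpage(unsigned long addr, int idx, void *arg)
{
	struct copy_subpage_arg *copy_arg = arg;

	/* One subpage at a time; addr and idx are supplied by process_huge_page() */
	copy_user_highpage(copy_arg->dst + idx, copy_arg->src + idx,
			   addr, copy_arg->vma);
}

void copy_user_huge_page(struct page *dst, struct page *src,
			 unsigned long addr_hint,
			 struct vm_area_struct *vma,
			 unsigned int pages_per_huge_page)
{
	unsigned long addr = addr_hint &
		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);
	struct copy_subpage_arg arg = {
		.dst = dst,
		.src = src,
		.vma = vma,
	};

	if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) {
		copy_user_gigantic_page(dst, src, addr, vma,
					pages_per_huge_page);
		return;
	}

	/* Same order as clearing: the target subpage is copied last */
	process_huge_page(addr_hint, pages_per_huge_page, copy_subpage, &arg);
}

The point being that the copy path only has to supply a per-subpage
callback; the ordering logic itself is not duplicated.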
Shutemov" > Cc: Matthew Wilcox > Cc: Hugh Dickins > Cc: Minchan Kim > Cc: Shaohua Li > Cc: Christopher Lameter > --- > mm/memory.c | 90 ++++++++++++++++++++++++++++++++++++++----------------------- > 1 file changed, 56 insertions(+), 34 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index 14578158ed20..b9f573a81bbd 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -4569,71 +4569,93 @@ EXPORT_SYMBOL(__might_fault); > #endif > > #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS) > -static void clear_gigantic_page(struct page *page, > - unsigned long addr, > - unsigned int pages_per_huge_page) > -{ > - int i; > - struct page *p = page; > - > - might_sleep(); > - for (i = 0; i < pages_per_huge_page; > - i++, p = mem_map_next(p, page, i)) { > - cond_resched(); > - clear_user_highpage(p, addr + i * PAGE_SIZE); > - } > -} > -void clear_huge_page(struct page *page, > - unsigned long addr_hint, unsigned int pages_per_huge_page) > +/* > + * Process all subpages of the specified huge page with the specified > + * operation. The target subpage will be processed last to keep its > + * cache lines hot. > + */ > +static inline void process_huge_page( > + unsigned long addr_hint, unsigned int pages_per_huge_page, > + void (*process_subpage)(unsigned long addr, int idx, void *arg), > + void *arg) There could be a bit more information in the comment about the function. But it is not a requirement, unless patch needs to be redone for some other reason. Reviewed-by: Mike Kravetz -- Mike Kravetz > { > int i, n, base, l; > unsigned long addr = addr_hint & > ~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1); > > - if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) { > - clear_gigantic_page(page, addr, pages_per_huge_page); > - return; > - } > - > - /* Clear sub-page to access last to keep its cache lines hot */ > + /* Process target subpage last to keep its cache lines hot */ > might_sleep(); > n = (addr_hint - addr) / PAGE_SIZE; > if (2 * n <= pages_per_huge_page) { > - /* If sub-page to access in first half of huge page */ > + /* If target subpage in first half of huge page */ > base = 0; > l = n; > - /* Clear sub-pages at the end of huge page */ > + /* Process subpages at the end of huge page */ > for (i = pages_per_huge_page - 1; i >= 2 * n; i--) { > cond_resched(); > - clear_user_highpage(page + i, addr + i * PAGE_SIZE); > + process_subpage(addr + i * PAGE_SIZE, i, arg); > } > } else { > - /* If sub-page to access in second half of huge page */ > + /* If target subpage in second half of huge page */ > base = pages_per_huge_page - 2 * (pages_per_huge_page - n); > l = pages_per_huge_page - n; > - /* Clear sub-pages at the begin of huge page */ > + /* Process subpages at the begin of huge page */ > for (i = 0; i < base; i++) { > cond_resched(); > - clear_user_highpage(page + i, addr + i * PAGE_SIZE); > + process_subpage(addr + i * PAGE_SIZE, i, arg); > } > } > /* > - * Clear remaining sub-pages in left-right-left-right pattern > - * towards the sub-page to access > + * Process remaining subpages in left-right-left-right pattern > + * towards the target subpage > */ > for (i = 0; i < l; i++) { > int left_idx = base + i; > int right_idx = base + 2 * l - 1 - i; > > cond_resched(); > - clear_user_highpage(page + left_idx, > - addr + left_idx * PAGE_SIZE); > + process_subpage(addr + left_idx * PAGE_SIZE, left_idx, arg); > cond_resched(); > - clear_user_highpage(page + right_idx, > - addr + right_idx * PAGE_SIZE); > + process_subpage(addr + right_idx * PAGE_SIZE, 
>
> +static void clear_gigantic_page(struct page *page,
> +				unsigned long addr,
> +				unsigned int pages_per_huge_page)
> +{
> +	int i;
> +	struct page *p = page;
> +
> +	might_sleep();
> +	for (i = 0; i < pages_per_huge_page;
> +	     i++, p = mem_map_next(p, page, i)) {
> +		cond_resched();
> +		clear_user_highpage(p, addr + i * PAGE_SIZE);
> +	}
> +}
> +
> +static void clear_subpage(unsigned long addr, int idx, void *arg)
> +{
> +	struct page *page = arg;
> +
> +	clear_user_highpage(page + idx, addr);
> +}
> +
> +void clear_huge_page(struct page *page,
> +		     unsigned long addr_hint, unsigned int pages_per_huge_page)
> +{
> +	unsigned long addr = addr_hint &
> +		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);
> +
> +	if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) {
> +		clear_gigantic_page(page, addr, pages_per_huge_page);
> +		return;
> +	}
> +
> +	process_huge_page(addr_hint, pages_per_huge_page, clear_subpage, page);
> +}
> +
>  static void copy_user_gigantic_page(struct page *dst, struct page *src,
>  				    unsigned long addr,
>  				    struct vm_area_struct *vma,
>
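Finally, to be concrete about my earlier remark on the function comment,
this is roughly the kind of expanded kernel-doc I had in mind for
process_huge_page().  The wording is only a suggestion of mine, not
something taken from the patch:

/*
 * process_huge_page - process all subpages of a huge page with an operation
 * @addr_hint: user virtual address the caller will touch first; the
 *             subpage containing it is the "target" subpage
 * @pages_per_huge_page: number of subpages in the huge page
 * @process_subpage: callback invoked once per subpage with the subpage's
 *                   user virtual address, its index in the huge page,
 *                   and @arg
 * @arg: opaque pointer passed through to @process_subpage
 *
 * Subpages are processed starting with the ones furthest from the
 * target subpage and converging on it in a left-right-left-right
 * pattern, so the target subpage is processed last and its cache lines
 * stay hot for the access that follows.
 */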