Date: Thu, 28 Apr 2022 10:51:21 -0400
From: Peter Xu
To: Zach O'Keefe
Cc: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox, Michal Hocko,
	Pasha Tatashin, SeongJae Park, Song Liu, Vlastimil Babka, Yang Shi,
	Zi Yan, linux-mm@kvack.org, Andrea Arcangeli, Andrew Morton,
	Arnd Bergmann, Axel Rasmussen, Chris Kennelly, Chris Zankel,
	Helge Deller, Hugh Dickins, Ivan Kokshaysky, "James E.J. Bottomley",
	Jens Axboe, "Kirill A. Shutemov", Matt Turner, Max Filippov,
	Miaohe Lin, Minchan Kim, Patrick Xia, Pavel Begunkov,
	Thomas Bogendoerfer, kernel test robot
Subject: Re: [PATCH v3 03/12] mm/khugepaged: make hugepage allocation context-specific
References: <20220426144412.742113-1-zokeefe@google.com>
	<20220426144412.742113-4-zokeefe@google.com>

On Wed, Apr 27, 2022 at 05:51:53PM -0700, Zach O'Keefe wrote:
> > > @@ -1110,13 +1115,14 @@ static void collapse_huge_page(struct mm_struct *mm,
> > >  	 */
> > >  	mmap_read_unlock(mm);
> > >
> > > +	node = khugepaged_find_target_node(cc);
> > >  	/* sched to specified node before huage page memory copy */
> > >  	if (task_node(current) != node) {
> > >  		cpumask = cpumask_of_node(node);
> > >  		if (!cpumask_empty(cpumask))
> > >  			set_cpus_allowed_ptr(current, cpumask);
> > >  	}
> > > -	new_page = khugepaged_alloc_page(hpage, gfp, node);
> > > +	new_page = cc->alloc_hpage(cc, gfp, node);
> >
> > AFAICT you removed all references of khugepaged_alloc_page() in this patch,
> > then you'd better drop the function for both NUMA and UMA.
> >
>
> Sorry, I'm not sure I follow. In khugepaged context, logic WRT
> khugepaged_alloc_page() is unchanged - it's just called indirectly
> through ->alloc_hpage().

Ah you're right, sorry for the confusion.

> > That said, I think it's actually better to keep them, as they do useful
> > things. For example, AFAICT this ->alloc_hpage() change can leak the hpage
> > allocated for UMA already, so that's why I think keeping them makes sense;
> > then iiuc the BUG_ON would trigger with UMA already.
> >
> > I saw that you've moved khugepaged_find_target_node() into this function,
> > which looks definitely good, but if we could keep khugepaged_alloc_page(),
> > then IMHO we could even move khugepaged_find_target_node() into that and
> > drop the "node" parameter of khugepaged_alloc_page().
> >
>
> I actually had done this, until commit 4272063db38c ("mm/khugepaged:
> sched to numa node when collapse huge page") which forced me to keep
> "node" visible in this function.

Right, I was looking at my local tree and that patch seems to have been added
to -next only very recently. I'm curious why it wasn't applied to file THPs
too if it is worthwhile, since if so that's also a suitable candidate to be
moved into the same hook. I've asked in that thread instead. Before that,
feel free to keep your code as is.

> > I'd even consider moving cc->gfp() all into it if possible, since the gfp
> > seems to always come with __GFP_THISNODE anyway.
> >
>
> I would have liked to do this, but the gfp flags are referenced again
> when calling mem_cgroup_charge(), so I couldn't quite get rid of them
> from here.

Yeah, maybe we could move mem_cgroup_charge() into the hook too? The code
below is duplicated between the file & anon paths, and IMHO it's already a
good candidate for a new helper anyway:

	/* Only allocate from the target node */
	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;

	new_page = khugepaged_alloc_page(hpage, gfp, node);
	if (!new_page) {
		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
		goto out;
	}

	if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) {
		result = SCAN_CGROUP_CHARGE_FAIL;
		goto out;
	}

	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);

If we want to generalize it, maybe we want to return the "result" instead of
the new page, so the new page can be passed in as a pointer. There's one
mmap_read_unlock(mm) in the anonymous code, but I think that can simply be
moved above the chunk.
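For illustration only, a rough and untested sketch of what such a helper
could look like (the name alloc_charge_hpage() is made up here; it assumes
the cc->alloc_hpage() hook and khugepaged_find_target_node() exactly as they
appear in this patch):

/*
 * Sketch only: allocate the huge page through the context hook and charge
 * it, returning a SCAN_* status.  The page is still handed back through
 * *hpage so the existing failure paths can keep cleaning it up as today.
 */
static int alloc_charge_hpage(struct mm_struct *mm, struct collapse_control *cc,
			      struct page **hpage)
{
	/* Only allocate from the target node */
	gfp_t gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
	int node = khugepaged_find_target_node(cc);

	*hpage = cc->alloc_hpage(cc, gfp, node);
	if (!*hpage)
		return SCAN_ALLOC_HUGE_PAGE_FAIL;

	if (unlikely(mem_cgroup_charge(page_folio(*hpage), mm, gfp)))
		return SCAN_CGROUP_CHARGE_FAIL;

	count_memcg_page_event(*hpage, THP_COLLAPSE_ALLOC);
	return SCAN_SUCCEED;
}

The callers would then just check the returned SCAN_* result and jump to
their existing "out" labels on failure.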
No strong opinion here, please feel free to think about what's the best
approach for landing this series.

[...]

> I could (?) make .gfp of type gfp_t and just update it on every
> khugepaged scan (in case it changed) and also remove the gfp parameter
> for ->alloc_hpage().

If that's the case I'd prefer you keep your code as-is; gfp() is perfectly
fine and it's light, so I'm afraid the caching could make things more
complicated than they need to be.

Thanks,

-- 
Peter Xu