Message-ID: <320c16a7-96b7-65ec-3d80-2eace0ddb290@suse.cz>
Date: Wed, 13 Sep 2023 11:33:52 +0200
Subject: Re: [PATCH 1/6] mm: page_alloc: remove pcppage migratetype caching
From: Vlastimil Babka
To: Johannes Weiner
Cc: Andrew Morton, Mel Gorman, Miaohe Lin, Kefeng Wang, Zi Yan,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20230912145028.GA3228@cmpxchg.org>
References: <20230911195023.247694-1-hannes@cmpxchg.org>
 <20230911195023.247694-2-hannes@cmpxchg.org>
 <20230912145028.GA3228@cmpxchg.org>
On 9/12/23 16:50, Johannes Weiner wrote:
> On Tue, Sep 12, 2023 at 03:47:45PM +0200, Vlastimil Babka wrote:
>> On 9/11/23 21:41, Johannes Weiner wrote:
>
>> > @@ -1577,7 +1556,6 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
>> >  			continue;
>> >  		del_page_from_free_list(page, zone, current_order);
>> >  		expand(zone, page, order, current_order, migratetype);
>> > -		set_pcppage_migratetype(page, migratetype);
>>
>> Hm interesting, just noticed that __rmqueue_fallback() never did this
>> AFAICS, sounds like a bug.
>
> I don't quite follow. Which part?
>
> Keep in mind that at this point __rmqueue_fallback() doesn't return a
> page. It just moves pages to the desired freelist, and then
> __rmqueue_smallest() gets called again. This changes in 5/6, but until
> now at least all of the above would apply to fallback pages.

Yep, missed that "doesn't return a page", thanks.

>> > @@ -2145,7 +2123,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
>> >  		 * pages are ordered properly.
>> >  		 */
>> >  		list_add_tail(&page->pcp_list, list);
>> > -		if (is_migrate_cma(get_pcppage_migratetype(page)))
>> > +		if (is_migrate_cma(get_pageblock_migratetype(page)))
>> >  			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
>> >  					      -(1 << order));
>>
>> This is potentially a source of overhead, I assume patch 6/6 might
>> change that.
>
> Yes, 6/6 removes it altogether.
>
> But the test results in this patch's changelog are from this patch in
> isolation, so it doesn't appear to be a concern even on its own.
>
>> > @@ -2457,7 +2423,7 @@ void free_unref_page_list(struct list_head *list)
>> >  		 * Free isolated pages directly to the allocator, see
>> >  		 * comment in free_unref_page.
>> >  		 */
>> > -		migratetype = get_pcppage_migratetype(page);
>> > +		migratetype = get_pfnblock_migratetype(page, pfn);
>> >  		if (unlikely(is_migrate_isolate(migratetype))) {
>> >  			list_del(&page->lru);
>> >  			free_one_page(page_zone(page), page, pfn, 0, migratetype, FPI_NONE);
>>
>> I think after this change we should move the isolated pages handling to
>> the second loop below, so that we wouldn't have to call
>> get_pfnblock_migratetype() twice per page. Dunno yet if some later patch
>> does that. It would need to unlock pcp when necessary.
>
> That sounds like a great idea. Something like the following?
>
> Lightly tested. If you're good with it, I'll beat some more on it and
> submit it as a follow-up.
>
> ---
>
> From 429d13322819ab38b3ba2fad6d1495997819ccc2 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner
> Date: Tue, 12 Sep 2023 10:16:10 -0400
> Subject: [PATCH] mm: page_alloc: optimize free_unref_page_list()
>
> Move direct freeing of isolated pages to the lock-breaking block in
> the second loop. This saves an unnecessary migratetype reassessment.
>
> Minor comment and local variable scoping cleanups.

Looks like batch_count and locked_zone could be moved to the loop scope
as well.

>
> Suggested-by: Vlastimil Babka
> Signed-off-by: Johannes Weiner

Reviewed-by: Vlastimil Babka

> ---
>  mm/page_alloc.c | 49 +++++++++++++++++++++----------------------
>  1 file changed, 21 insertions(+), 28 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e3f1c777feed..9cad31de1bf5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2408,48 +2408,41 @@ void free_unref_page_list(struct list_head *list)
>  	struct per_cpu_pages *pcp = NULL;
>  	struct zone *locked_zone = NULL;
>  	int batch_count = 0;
> -	int migratetype;
> -
> -	/* Prepare pages for freeing */
> -	list_for_each_entry_safe(page, next, list, lru) {
> -		unsigned long pfn = page_to_pfn(page);
>
> -		if (!free_pages_prepare(page, 0, FPI_NONE)) {
> +	list_for_each_entry_safe(page, next, list, lru)
> +		if (!free_pages_prepare(page, 0, FPI_NONE))
>  			list_del(&page->lru);
> -			continue;
> -		}
> -
> -		/*
> -		 * Free isolated pages directly to the allocator, see
> -		 * comment in free_unref_page.
> -		 */
> -		migratetype = get_pfnblock_migratetype(page, pfn);
> -		if (unlikely(is_migrate_isolate(migratetype))) {
> -			list_del(&page->lru);
> -			free_one_page(page_zone(page), page, pfn, 0, migratetype, FPI_NONE);
> -			continue;
> -		}
> -	}
>
>  	list_for_each_entry_safe(page, next, list, lru) {
>  		unsigned long pfn = page_to_pfn(page);
>  		struct zone *zone = page_zone(page);
> +		int migratetype;
>
>  		list_del(&page->lru);
>  		migratetype = get_pfnblock_migratetype(page, pfn);
>
>  		/*
> -		 * Either different zone requiring a different pcp lock or
> -		 * excessive lock hold times when freeing a large list of
> -		 * pages.
> +		 * Zone switch, batch complete, or non-pcp freeing?
> +		 * Drop the pcp lock and evaluate.
>  		 */
> -		if (zone != locked_zone || batch_count == SWAP_CLUSTER_MAX) {
> +		if (unlikely(zone != locked_zone ||
> +			     batch_count == SWAP_CLUSTER_MAX ||
> +			     is_migrate_isolate(migratetype))) {
>  			if (pcp) {
>  				pcp_spin_unlock(pcp);
>  				pcp_trylock_finish(UP_flags);
> +				locked_zone = NULL;
>  			}
>
> -			batch_count = 0;
> +			/*
> +			 * Free isolated pages directly to the
> +			 * allocator, see comment in free_unref_page.
> +			 */
> +			if (is_migrate_isolate(migratetype)) {
> +				free_one_page(zone, page, pfn, 0,
> +					      migratetype, FPI_NONE);
> +				continue;
> +			}
>
>  			/*
>  			 * trylock is necessary as pages may be getting freed
> @@ -2459,12 +2452,12 @@ void free_unref_page_list(struct list_head *list)
>  			pcp = pcp_spin_trylock(zone->per_cpu_pageset);
>  			if (unlikely(!pcp)) {
>  				pcp_trylock_finish(UP_flags);
> -				free_one_page(zone, page, pfn,
> -					      0, migratetype, FPI_NONE);
> -				locked_zone = NULL;
> +				free_one_page(zone, page, pfn, 0,
> +					      migratetype, FPI_NONE);
>  				continue;
>  			}
>  			locked_zone = zone;
> +			batch_count = 0;
>  		}
>
>  		/*