Date: Fri, 18 Nov 2022 15:30:57 +0100
Subject: Re: [PATCH 2/2] mm/page_alloc: Leave IRQs enabled for per-cpu page allocations
From: Vlastimil Babka
To: Mel Gorman, Andrew Morton
Cc: Hugh Dickins, Yu Zhao, Marcelo Tosatti, Michal Hocko, Marek Szyprowski, LKML, Linux-MM
References: <20221118101714.19590-1-mgorman@techsingularity.net> <20221118101714.19590-3-mgorman@techsingularity.net>
In-Reply-To: <20221118101714.19590-3-mgorman@techsingularity.net>

On 11/18/22 11:17, Mel Gorman wrote:
> The pcp_spin_lock_irqsave protecting the PCP lists is IRQ-safe as a task
> allocating from the PCP must not re-enter the allocator from IRQ context.
> In each instance where IRQ-reentrancy is possible, the lock is acquired
> using pcp_spin_trylock_irqsave() even though IRQs are disabled and
> re-entrancy is impossible.
>
> Demote the lock to pcp_spin_lock avoids an IRQ disable/enable in the common
> case at the cost of some IRQ allocations taking a slower path. If the PCP
> lists need to be refilled, the zone lock still needs to disable IRQs but
> that will only happen on PCP refill and drain. If an IRQ is raised when
> a PCP allocation is in progress, the trylock will fail and fallback to
> using the buddy lists directly. Note that this may not be a universal win
> if an interrupt-intensive workload also allocates heavily from interrupt
> context and contends heavily on the zone->lock as a result.
>
> [yuzhao@google.com: Reported lockdep issue on IO completion from softirq]
> [hughd@google.com: Fix list corruption, lock improvements, micro-optimsations]
> Signed-off-by: Mel Gorman

Reviewed-by: Vlastimil Babka

Some nits below:

> @@ -3516,10 +3485,10 @@ void free_unref_page(struct page *page, unsigned int order)
>   */
>  void free_unref_page_list(struct list_head *list)
>  {
> +	unsigned long __maybe_unused UP_flags;
>  	struct page *page, *next;
>  	struct per_cpu_pages *pcp = NULL;
>  	struct zone *locked_zone = NULL;
> -	unsigned long flags;
>  	int batch_count = 0;
>  	int migratetype;
>
> @@ -3550,11 +3519,26 @@ void free_unref_page_list(struct list_head *list)
>
>  		/* Different zone, different pcp lock. */
>  		if (zone != locked_zone) {
> -			if (pcp)
> -				pcp_spin_unlock_irqrestore(pcp, flags);
> +			if (pcp) {
> +				pcp_spin_unlock(pcp);
> +				pcp_trylock_finish(UP_flags);
> +			}
>
> +			/*
> +			 * trylock is necessary as pages may be getting freed
> +			 * from IRQ or SoftIRQ context after an IO completion.
> +			 */
> +			pcp_trylock_prepare(UP_flags);
> +			pcp = pcp_spin_trylock(zone->per_cpu_pageset);
> +			if (!pcp) {

Perhaps use unlikely() here?

> +				pcp_trylock_finish(UP_flags);
> +				free_one_page(zone, page, page_to_pfn(page),
> +					      0, migratetype, FPI_NONE);

Not critical for correctness, but the migratetype here might be stale, so we
should use get_pcppage_migratetype(page) instead.

> +				locked_zone = NULL;
> +				continue;
> +			}
>  			locked_zone = zone;
> -			pcp = pcp_spin_lock_irqsave(locked_zone->per_cpu_pageset, flags);
> +			batch_count = 0;
>  		}
>
>  		/*
> @@ -3569,18 +3553,23 @@ void free_unref_page_list(struct list_head *list)
>  		free_unref_page_commit(zone, pcp, page, migratetype, 0);
>
>  		/*
> -		 * Guard against excessive IRQ disabled times when we get
> -		 * a large list of pages to free.
> +		 * Guard against excessive lock hold times when freeing
> +		 * a large list of pages. Lock will be reacquired if
> +		 * necessary on the next iteration.
>  		 */
>  		if (++batch_count == SWAP_CLUSTER_MAX) {
> -			pcp_spin_unlock_irqrestore(pcp, flags);
> +			pcp_spin_unlock(pcp);
> +			pcp_trylock_finish(UP_flags);
>  			batch_count = 0;
> -			pcp = pcp_spin_lock_irqsave(locked_zone->per_cpu_pageset, flags);
> +			pcp = NULL;
> +			locked_zone = NULL;

AFAICS if this block was just "locked_zone = NULL;" then the existing code
would do the right thing.
Or, for simpler code, just do batch_count++ here and make the relocking
check:

	if (zone != locked_zone || batch_count == SWAP_CLUSTER_MAX)

>  		}
>  	}
>
> -	if (pcp)
> -		pcp_spin_unlock_irqrestore(pcp, flags);
> +	if (pcp) {
> +		pcp_spin_unlock(pcp);
> +		pcp_trylock_finish(UP_flags);
> +	}
>  }
>
>  /*
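
To make the combined relocking check concrete, here is a rough userspace model
of the loop shape it would give. This is only a sketch, not kernel code:
struct zone_model, BATCH_MAX, the model_*() helpers and free_list_model() are
all made up for illustration, C11 atomic_flag stands in for the pcp spinlock
and zone->lock, and locked_zone doubles as the "pcp is held" state that the
real function tracks separately.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SWAP_CLUSTER_MAX; the value is illustrative. */
#define BATCH_MAX 32

/* Userspace model of a zone; every name here is invented for the sketch. */
struct zone_model {
	atomic_flag pcp_lock;	/* models the pcp spinlock */
	atomic_flag zone_lock;	/* models zone->lock */
	int pcp_freed;		/* pages freed through the pcp fast path */
	int buddy_freed;	/* pages freed through the fallback path */
};

static void zone_model_init(struct zone_model *z)
{
	atomic_flag_clear(&z->pcp_lock);
	atomic_flag_clear(&z->zone_lock);
	z->pcp_freed = 0;
	z->buddy_freed = 0;
}

/* Returns true if the lock was taken, false if it was already held. */
static bool model_trylock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
}

static void model_lock(atomic_flag *lock)
{
	while (!model_trylock(lock))
		;	/* spin until the holder releases the lock */
}

static void model_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/*
 * Free n pages, where zone_of[i] is the index into zones[] of page i.
 * Returns how many times a pcp lock was acquired.
 */
static int free_list_model(struct zone_model *zones, const int *zone_of,
			   size_t n)
{
	struct zone_model *locked_zone = NULL;
	int batch_count = 0;
	int acquisitions = 0;

	for (size_t i = 0; i < n; i++) {
		struct zone_model *zone = &zones[zone_of[i]];

		/* One check covers both a zone change and a full batch. */
		if (zone != locked_zone || batch_count == BATCH_MAX) {
			if (locked_zone) {
				model_unlock(&locked_zone->pcp_lock);
				locked_zone = NULL;
			}
			batch_count = 0;

			if (!model_trylock(&zone->pcp_lock)) {
				/* Lock busy: free under the zone lock instead. */
				model_lock(&zone->zone_lock);
				zone->buddy_freed++;
				model_unlock(&zone->zone_lock);
				continue;
			}
			locked_zone = zone;
			acquisitions++;
		}

		assert(locked_zone == zone);
		zone->pcp_freed++;
		batch_count++;
	}

	if (locked_zone)
		model_unlock(&locked_zone->pcp_lock);

	return acquisitions;
}
```

In this model the batch-limit block disappears entirely: the lock is dropped
and retaken at the top of the next iteration, and a failed trylock sends only
that one page down the fallback path. Whether the same shape is a net
simplification in the real function is a judgment call for the patch author.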