To: Mel Gorman, Andrew Morton
Cc: Hillf Danton, Dave Hansen, Michal Hocko, LKML, Linux-MM
References: <20210525080119.5455-1-mgorman@techsingularity.net> <20210525080119.5455-3-mgorman@techsingularity.net>
From: Vlastimil Babka
Subject: Re: [PATCH 2/6] mm/page_alloc: Disassociate the pcp->high from pcp->batch
Message-ID: <10cb326c-b4ad-3a82-a38b-aba7d2192736@suse.cz>
Date: Wed, 26 May 2021 20:14:13 +0200
In-Reply-To: <20210525080119.5455-3-mgorman@techsingularity.net>

On 5/25/21 10:01 AM, Mel Gorman wrote:
> The pcp high watermark is based on the batch size but there is no
> relationship between them other than it is convenient to use early in
> boot.
>
> This patch takes the first step and bases pcp->high on the zone low
> watermark split across the number of CPUs local to a zone while the batch
> size remains the same to avoid increasing allocation latencies. The intent
> behind the default pcp->high is "set the number of PCP pages such that
> if they are all full that background reclaim is not started prematurely".
>
> Note that in this patch the pcp->high values are adjusted after memory
> hotplug events, min_free_kbytes adjustments and watermark scale factor
> adjustments but not CPU hotplug events which is handled later in the
> series.
>
> On a test KVM instance;
>
> Before grep -E "high:|batch" /proc/zoneinfo | tail -2
>               high:  378
>               batch: 63
>
> After grep -E "high:|batch" /proc/zoneinfo | tail -2
>               high:  649
>               batch: 63
>
> Signed-off-by: Mel Gorman

...

> @@ -6637,6 +6628,34 @@ static int zone_batchsize(struct zone *zone)
>  #endif
>  }
>
> +static int zone_highsize(struct zone *zone, int batch)
> +{
> +#ifdef CONFIG_MMU
> +	int high;
> +	int nr_local_cpus;
> +
> +	/*
> +	 * The high value of the pcp is based on the zone low watermark
> +	 * so that if they are full then background reclaim will not be
> +	 * started prematurely. The value is split across all online CPUs
> +	 * local to the zone. Note that early in boot that CPUs may not be
> +	 * online yet.
> +	 */
> +	nr_local_cpus = max(1U, cpumask_weight(cpumask_of_node(zone_to_nid(zone))));
> +	high = low_wmark_pages(zone) / nr_local_cpus;
> +
> +	/*
> +	 * Ensure high is at least batch*4. The multiple is based on the
> +	 * historical relationship between high and batch.
> +	 */
> +	high = max(high, batch << 2);
> +
> +	return high;
> +#else
> +	return 0;
> +#endif
> +}
> +
>  /*
>   * pcp->high and pcp->batch values are related and generally batch is lower
>   * than high. They are also related to pcp->count such that count is lower
> @@ -6698,11 +6717,10 @@ static void __zone_set_pageset_high_and_batch(struct zone *zone, unsigned long h
>   */
>  static void zone_set_pageset_high_and_batch(struct zone *zone)
>  {
> -	unsigned long new_high, new_batch;
> +	int new_high, new_batch;
>
> -	new_batch = zone_batchsize(zone);
> -	new_high = 6 * new_batch;
> -	new_batch = max(1UL, 1 * new_batch);
> +	new_batch = max(1, zone_batchsize(zone));
> +	new_high = zone_highsize(zone, new_batch);
>
>  	if (zone->pageset_high == new_high &&
>  	    zone->pageset_batch == new_batch)
> @@ -8170,6 +8188,12 @@ static void __setup_per_zone_wmarks(void)
>  		zone->_watermark[WMARK_LOW]  = min_wmark_pages(zone) + tmp;
>  		zone->_watermark[WMARK_HIGH] = min_wmark_pages(zone) + tmp * 2;
>
> +		/*
> +		 * The watermark size have changed so update the pcpu batch
> +		 * and high limits or the limits may be inappropriate.
> +		 */
> +		zone_set_pageset_high_and_batch(zone);

Hm, so this puts the call in the path of various watermark-related sysctl
handlers, but it's not protected by pcp_batch_high_lock. The zone lock won't
help against zone_pcp_update() from a hotplug handler. On the other hand, since
hotplug handlers also call __setup_per_zone_wmarks(), the zone_pcp_update()
calls there are now redundant and could be removed, no?

But later there will be a new sysctl in patch 6/6 using pcp_batch_high_lock,
so that one will not be protected against the watermark-related sysctl
handlers that reach here.

To solve all this, it seems the static lock in setup_per_zone_wmarks() could
become a top-level visible lock, and pcp high/batch updates could switch to that
one instead of their own pcp_batch_high_lock. And the zone_pcp_update() calls
from hotplug handlers could be removed.

> +
>  		spin_unlock_irqrestore(&zone->lock, flags);
>  	}
>
>