From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 19A69C41513 for ; Thu, 10 Aug 2023 07:26:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4BBF78E0002; Thu, 10 Aug 2023 03:26:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 46A038E0001; Thu, 10 Aug 2023 03:26:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 331438E0002; Thu, 10 Aug 2023 03:26:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 220F08E0001 for ; Thu, 10 Aug 2023 03:26:44 -0400 (EDT) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id BD1851404BA for ; Thu, 10 Aug 2023 07:26:43 +0000 (UTC) X-FDA: 81107362686.03.0833AA4 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by imf25.hostedemail.com (Postfix) with ESMTP id C1618A0005 for ; Thu, 10 Aug 2023 07:26:41 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linutronix.de header.s=2020 header.b=OuRRKRDv; dkim=pass header.d=linutronix.de header.s=2020e header.b=+TuJ5hUy; dmarc=pass (policy=none) header.from=linutronix.de; spf=pass (imf25.hostedemail.com: domain of bigeasy@linutronix.de designates 193.142.43.55 as permitted sender) smtp.mailfrom=bigeasy@linutronix.de ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1691652402; a=rsa-sha256; cv=none; b=sQPpjycpnftzH+fxZ7s9RF5/iCa2x+Me9CFNpTnSRmVLirCzoEh5eCriTOCAv2s4JSU5TX ppJduQDeTqWmC6VS4Vi4nrYN7teETuu6rn2PjJ8IXoK+kon4mvxlXwgZ2nsZQ2nhF9+5KB ElyTfd614wjY+94uE9ULYdG3ii/COEw= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=linutronix.de header.s=2020 header.b=OuRRKRDv; dkim=pass header.d=linutronix.de header.s=2020e header.b=+TuJ5hUy; dmarc=pass (policy=none) header.from=linutronix.de; spf=pass (imf25.hostedemail.com: domain of bigeasy@linutronix.de designates 193.142.43.55 as permitted sender) smtp.mailfrom=bigeasy@linutronix.de ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1691652402; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=A4O0xleWwbLEz5uNtQqAFP8rru9+g0EjKLnL+sKvBtM=; b=WwdxofTXqo5KwLqr9QEUbnyas734BxCeFJkxoeTHcmiNK7IR6VMbBPnWSfdu04TUP7BLGP 2UvQdX5p7MWkrhfwdRL7p1LQZVS4nyMlYELC+7uEiRLRcifrGnWnipUY38ert+BRuYQ0la WsFu7ivkdIdy7CbYAtRv3k01qZHnSRU= Date: Thu, 10 Aug 2023 09:26:37 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1691652399; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=A4O0xleWwbLEz5uNtQqAFP8rru9+g0EjKLnL+sKvBtM=; b=OuRRKRDvuB/9D8TwbLjCmSJBiNo71POOsbtBcqDpDDCis9ruu8BDyDKUxWwxHiBljBt4jA m+eEdgFTaXId1fVjQPXkSvT0nNNk729ni/KqRqMIg1/Ly9hePUk9OyWKybeO43+AmxLSV4 9eeVaKvH181VYnVQTU8F1gX6iclho9W7GisSfpuIDncU8ZVkLOqBNSfo4JVexvYrWBmcBG O9Es82SCSGwspTA9SMqFB1XW9hx4lg6nvQckrERkxN8AT/apS3qPU19nKVTApR7cMeEGbu tNtQZ0Mei+Gn1BeZL84iNSSZ/U0Z09MoZholgmdZGGYk1Cx7NgwApjiavRaj9A== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1691652399; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=A4O0xleWwbLEz5uNtQqAFP8rru9+g0EjKLnL+sKvBtM=; b=+TuJ5hUyGLZg4p2ro9OWMxk1hQ1m0fb2xPbXFa1/RUhadKwNh80TIA22pKVgxS1TtC6a5o G5Ww0un+LBtToRDQ== From: Sebastian Andrzej Siewior To: Tetsuo Handa Cc: Michal Hocko , Andrew Morton , Petr Mladek , linux-mm , LKML , "Luis Claudio R. Goncalves" , Boqun Feng , Ingo Molnar , John Ogness , Mel Gorman , Peter Zijlstra , Thomas Gleixner , Waiman Long , Will Deacon Subject: Re: [PATCH v2] mm/page_alloc: don't check zonelist_update_seq from atomic allocations Message-ID: <20230810072637.6Sc3UU3R@linutronix.de> References: <6cc13636-eda6-6a95-6564-db1c9ae76bb6@I-love.SAKURA.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable In-Reply-To: <6cc13636-eda6-6a95-6564-db1c9ae76bb6@I-love.SAKURA.ne.jp> X-Rspam-User: X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: C1618A0005 X-Stat-Signature: 36feeqf3e8i1uwcodcdmihu59isooo7a X-HE-Tag: 1691652401-78661 X-HE-Meta: U2FsdGVkX180ujTZ7W+neThwfRNCLY/oeXj4kU+X6YlJHUx84YXtuCEUyp0v43fx5A1RxweRyyxtbxBAhxwhKtwDgd3kiQjAJPYrTdY0LQa4Kp8kM/8H9Wp+zM9KLt5lnxOfZv955P6nA562PXfd97ydg/2oj3PTraxwlpIrkpM5LawPjyWzzQUImgivAI4o73xsmfWAXywP0LKKHg5oKOQUk4QXkMN46/puO2/AQww0EzgLMFuLRFSfApwuJs+KxFRk/XH5KW3D7xcE8UL9FwP+p8+iE0KbG8XL/QVOKqL747J629qkSPkQg9dgVxaMTHVuAgeZBRySPuva0bzUxY7mIqbsRLlZO2UYRInTR4+jNI/ysqttCrsuzuZcEjC5iqQFuvfBh/+y3o6orTbxS/xtq0031s4pdXIh4zTCSdqn4CS/j40BX+H/WTHEMW/kPvmqh5C/1e7xCfHTH5BjG8FWj6oOpdHQmoPi9DGT4NhrkK1MpHG3ncE8ArC+JfCK79bJf3sg9lkvI7D9bivJV0NZFBTQnZrSf77wIVtcuEjT5ICunbn6Bf/HGm0kaQziUXN7Ebd+7c/kmvAmchpoy+tWeJmioghbGaVriZ7mU8MlrdVjsOqC3NgS6+qVViTVYpOQ7tXZ1OlIGDuEaAWabN3kDQt2dyffnWhUzaTdnE6VHlpUUZ8F7xAbsHZ3LC5FAwyiqfe/Cxwqt/Q2qFzM8XpvKAzlXCkfIacYdMJoC1tJGtW+ufGxU/HVvBLWePOFRcnyrQ8nGLmnVpUzbpj82Tw6yyP74Js89QiGIkck/SXl2aVnrieXEgt0wsGK01bELcSyhUgQyKVJnaRWaLRktI/bdsb+XhtAyzXnXZLDEpxQ6p1u3uzeCxXQQ7ibozhPgMJnbd10IXSHBoB0s9ZFbvvgTtFNYNDQIcpnwQiLpJogac1/enLuwPhTKMz5UaGOAwD2CP2rO4NP20IPvQY io3/IkUl qCoNfb7mrYo2XkuGI9I2IqCeLcKVWPzNnt8jzQOjVmzaI5XEomxPEVIjnthjPv7ssnnEd8E/i6l2L1Qg+Cc9zHX0AejJpFhuHsZkeEB/yhBEbSdIYJ838ra3AbXPKM7RorQjZ5AaYE1sYUOUUJg0y5jB3tPZn/SmZOb2Zjlah5rqooxkR++mO9W7QgEG1HDyauxJEHHksBECmBPgYOSki08bkng== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2023-08-09 20:03:00 [+0900], Tetsuo Handa wrote: > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 7d3460c7a480..5557d9a2ff2c 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -3638,26 +3638,44 @@ EXPORT_SYMBOL_GPL(fs_reclaim_release); =E2=80=A6 > -static DEFINE_SEQLOCK(zonelist_update_seq); > +static unsigned int zonelist_update_seq; > =20 > static unsigned int zonelist_iter_begin(void) > { > if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE)) > - return read_seqbegin(&zonelist_update_seq); > + /* See comment above. */ > + return data_race(READ_ONCE(zonelist_update_seq)); This is open coded raw_read_seqcount() while it should have been raw_seqcount_begin(). > return 0; > } > =20 > -static unsigned int check_retry_zonelist(unsigned int seq) > +static unsigned int check_retry_zonelist(gfp_t gfp, unsigned int seq) > { > - if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE)) > - return read_seqretry(&zonelist_update_seq, seq); > + if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE) && (gfp & __GFP_DIRECT_RECLAIM)= ) { > + /* See comment above. */ > + unsigned int seq2 =3D data_race(READ_ONCE(zonelist_update_seq)); > =20 > - return seq; > + /* > + * "seq !=3D seq2" indicates that __build_all_zonelists() has > + * started or has finished rebuilding zonelists, hence retry. > + * "seq =3D=3D seq2 && (seq2 & 1)" indicates that > + * __build_all_zonelists() is still rebuilding zonelists > + * with context switching disabled, hence retry. > + * "seq =3D=3D seq2 && !(seq2 & 1)" indicates that > + * __build_all_zonelists() did not rebuild zonelists, hence > + * no retry. > + */ > + return unlikely(seq !=3D seq2 || (seq2 & 1)); open coded read_seqcount_retry(). > + } > + > + return 0; > } > =20 > /* Perform direct synchronous page reclaim */ > @@ -5136,22 +5154,17 @@ static void __build_all_zonelists(void *data) > int nid; > int __maybe_unused cpu; > pg_data_t *self =3D data; > + static DEFINE_SPINLOCK(lock); > unsigned long flags; > =20 > - /* > - * Explicitly disable this CPU's interrupts before taking seqlock > - * to prevent any IRQ handler from calling into the page allocator > - * (e.g. GFP_ATOMIC) that could hit zonelist_iter_begin and livelock. > - */ > - local_irq_save(flags); > - /* > - * Explicitly disable this CPU's synchronous printk() before taking > - * seqlock to prevent any printk() from trying to hold port->lock, for > - * tty_insert_flip_string_and_push_buffer() on other CPU might be > - * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held. > - */ > - printk_deferred_enter(); > - write_seqlock(&zonelist_update_seq); > +#ifdef CONFIG_PREEMPT_RT > + migrate_disable() > +#endif There is no justification/ explanation why migrate_disable() here is needed on PREEMPT_RT and I don't see one. There are two changes here: - The replacement of seqlock_t with something open coded=20 - Logic change when a retry is needed (the gfp mask is considered). I am not a big fan of open coding things especially when not needed and then there is this ifdef which is not needed as well. I don't comment on the logic change. Can we please put an end to this? > + /* Serialize increments of zonelist_update_seq. */ > + spin_lock_irqsave(&lock, flags); > + zonelist_update_seq++; > + /* Tell check_retry_zonelist() that we started rebuilding zonelists. */ > + smp_wmb(); > =20 > #ifdef CONFIG_NUMA > memset(node_load, 0, sizeof(node_load)); Sebastian