Date: Mon, 26 Jun 2023 09:56:52 +0200
Subject: Re: [PATCH v3 2/2] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().
To: Sebastian Andrzej Siewior, Michal Hocko
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 "Luis Claudio R. Goncalves", Andrew Morton, Boqun Feng, Ingo Molnar,
 John Ogness, Mel Gorman, Peter Zijlstra, Petr Mladek, Tetsuo Handa,
 Thomas Gleixner, Waiman Long, Will Deacon
References: <20230623171232.892937-1-bigeasy@linutronix.de>
 <20230623171232.892937-3-bigeasy@linutronix.de>
 <20230623201517.yw286Knb@linutronix.de>
From: David Hildenbrand
Organization: Red Hat
In-Reply-To: <20230623201517.yw286Knb@linutronix.de>

On 23.06.23 22:15, Sebastian Andrzej Siewior wrote:
> __build_all_zonelists() acquires zonelist_update_seq by first disabling
> interrupts via local_irq_save() and then acquiring the seqlock with
> write_seqlock(). This is troublesome and leads to problems on
> PREEMPT_RT. The problem is that the inner spinlock_t becomes a sleeping
> lock on PREEMPT_RT and must not be acquired with disabled interrupts.
>
> The API provides write_seqlock_irqsave() which does the right thing in
> one step.
> printk_deferred_enter() has to be invoked in non-migrate-able context to
> ensure that deferred printing is enabled and disabled on the same CPU.
> This is the case after zonelist_update_seq has been acquired.
>
> There was discussion on the first submission that the order should be:
> 	local_irq_disable();
> 	printk_deferred_enter();
> 	write_seqlock();
>
> to avoid pitfalls like having an unaccounted printk() coming from
> write_seqlock_irqsave() before printk_deferred_enter() is invoked. The
> only origin of such a printk() can be a lockdep splat because the
> lockdep annotation happens after the sequence count is incremented.
> This is exceptional and subject to change.
>
> It was also pointed out that PREEMPT_RT can be affected by the printk
> problem since its write_seqlock_irqsave() does not really disable
> interrupts. This isn't the case because PREEMPT_RT's printk
> implementation differs from the mainline implementation in two important
> aspects:
> - Printing happens in a dedicated thread and not during the
>   invocation of printk().
> - In emergency cases where synchronous printing is used, a different
>   driver is used which does not use tty_port::lock.
>
> Acquire zonelist_update_seq with write_seqlock_irqsave() and then defer
> printk output.
>
> Fixes: 1007843a91909 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock")
> Signed-off-by: Sebastian Andrzej Siewior
> Acked-by: Michal Hocko
> ---
> v2…v3
>    - Update comment as per Michal's suggestion.
>
> v1…v2:
>    - Improve commit description
>
>  mm/page_alloc.c | 15 ++++++---------
>  1 file changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 47421bedc12b7..440e9af67b48d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5808,19 +5808,17 @@ static void __build_all_zonelists(void *data)
>  	unsigned long flags;
>
>  	/*
> -	 * Explicitly disable this CPU's interrupts before taking seqlock
> -	 * to prevent any IRQ handler from calling into the page allocator
> -	 * (e.g. GFP_ATOMIC) that could hit zonelist_iter_begin and livelock.
> +	 * The zonelist_update_seq must be acquired with irqsave because the
> +	 * reader can be invoked from IRQ with GFP_ATOMIC.
>  	 */
> -	local_irq_save(flags);
> +	write_seqlock_irqsave(&zonelist_update_seq, flags);
>  	/*
> -	 * Explicitly disable this CPU's synchronous printk() before taking
> -	 * seqlock to prevent any printk() from trying to hold port->lock, for
> +	 * Also disable synchronous printk() to prevent any printk() from
> +	 * trying to hold port->lock, for
>  	 * tty_insert_flip_string_and_push_buffer() on other CPU might be
>  	 * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held.
>  	 */
>  	printk_deferred_enter();
> -	write_seqlock(&zonelist_update_seq);
>
>  #ifdef CONFIG_NUMA
>  	memset(node_load, 0, sizeof(node_load));
> @@ -5857,9 +5855,8 @@ static void __build_all_zonelists(void *data)
>  #endif
>  	}
>
> -	write_sequnlock(&zonelist_update_seq);
>  	printk_deferred_exit();
> -	local_irq_restore(flags);
> +	write_sequnlock_irqrestore(&zonelist_update_seq, flags);
>  }
>
>  static noinline void __init

Reviewed-by: David Hildenbrand

-- 
Cheers,

David / dhildenb