From: Toke Høiland-Jørgensen
To: Jesper Dangaard Brouer, Ilias Apalodimas, netdev@vger.kernel.org, Eric Dumazet, linux-mm@kvack.org, Mel Gorman
Cc: Jesper Dangaard Brouer, lorenzo@kernel.org, linyunsheng@huawei.com, bpf@vger.kernel.org, "David S. Miller", Jakub Kicinski, Paolo Abeni, Andrew Morton, willy@infradead.org
Subject: Re: [PATCH RFC net-next/mm V4 2/2] page_pool: Remove workqueue in new shutdown scheme
In-Reply-To: <168485357834.2849279.8073426325295894331.stgit@firesoul>
References: <168485351546.2849279.13771638045665633339.stgit@firesoul> <168485357834.2849279.8073426325295894331.stgit@firesoul>
Date: Tue, 23 May 2023 18:16:15 +0200
Message-ID: <87h6s3nhv4.fsf@toke.dk>

>  void page_pool_destroy(struct page_pool *pool)
>  {
> +	unsigned int flags;
> +	u32 release_cnt;
> +	u32 hold_cnt;
> +
>  	if (!pool)
>  		return;
>
> @@ -868,11 +894,45 @@ void page_pool_destroy(struct page_pool *pool)
>  	if (!page_pool_release(pool))
>  		return;
>
> -	pool->defer_start = jiffies;
> -	pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
> +	/* PP have pages inflight, thus cannot immediately release memory.
> +	 * Enter into shutdown phase, depending on remaining in-flight PP
> +	 * pages to trigger shutdown process (on concurrent CPUs) and last
> +	 * page will free pool instance.
> +	 *
> +	 * There exist two race conditions here, we need to take into
> +	 * account in the following code.
> +	 *
> +	 * 1. Before setting PP_FLAG_SHUTDOWN another CPU released the last
> +	 *    pages into the ptr_ring.  Thus, it missed triggering shutdown
> +	 *    process, which can then be stalled forever.
> +	 *
> +	 * 2. After setting PP_FLAG_SHUTDOWN another CPU released the last
> +	 *    page, which triggered shutdown process and freed pool
> +	 *    instance. Thus, its not safe to dereference *pool afterwards.
> +	 *
> +	 * Handling races by holding a fake in-flight count, via artificially
> +	 * bumping pages_state_hold_cnt, which assures pool isn't freed under
> +	 * us.  Use RCU Grace-Periods to guarantee concurrent CPUs will
> +	 * transition safely into the shutdown phase.
> +	 *
> +	 * After safely transition into this state the races are resolved.  For
> +	 * race(1) its safe to recheck and empty ptr_ring (it will not free
> +	 * pool). Race(2) cannot happen, and we can release fake in-flight count
> +	 * as last step.
> +	 */
> +	hold_cnt = READ_ONCE(pool->pages_state_hold_cnt) + 1;
> +	WRITE_ONCE(pool->pages_state_hold_cnt, hold_cnt);
> +	synchronize_rcu();
> +
> +	flags = READ_ONCE(pool->p.flags) | PP_FLAG_SHUTDOWN;
> +	WRITE_ONCE(pool->p.flags, flags);
> +	synchronize_rcu();

Hmm, synchronize_rcu() can be quite expensive; why do we need two of
them? Should be fine to just do one after those two writes, as long as
the order of those writes is correct (which WRITE_ONCE should ensure)?

Also, if we're adding this (blocking) operation in the teardown path we
risk adding latency to that path (network interface removal,
BPF_PROG_RUN syscall etc), so not sure if this actually ends up being an
improvement anymore, as opposed to just keeping the workqueue but
dropping the warning?

-Toke
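
For reference, the single-grace-period variant asked about above could look roughly like the following. This is only a userspace sketch of the shape of the suggestion, not the patch author's code and not a claim that one grace period is sufficient: `struct page_pool_model`, `enter_shutdown_single_gp()`, the `PP_FLAG_SHUTDOWN` bit value, and the stubbed `synchronize_rcu()` (which only counts calls) are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical bit value, chosen only for this sketch. */
#define PP_FLAG_SHUTDOWN (1u << 8)

/* Userspace stand-ins for the kernel macros (GCC/Clang __typeof__).
 * They keep the compiler from tearing, merging or reordering these
 * volatile accesses against each other; on their own they do not
 * order the stores as observed by other CPUs. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Stub: the real call blocks for an RCU grace period. Here it only
 * counts invocations so the cost difference is visible. */
static unsigned int grace_periods;
static void synchronize_rcu(void) { grace_periods++; }

/* Minimal model of the two fields touched in the quoted hunk. */
struct page_pool_model {
	uint32_t pages_state_hold_cnt;
	unsigned int flags;
};

/* Both writes first, in the same order as the patch (fake in-flight
 * count, then shutdown flag), followed by one grace period. */
static void enter_shutdown_single_gp(struct page_pool_model *pool)
{
	uint32_t hold_cnt = READ_ONCE(pool->pages_state_hold_cnt) + 1;
	WRITE_ONCE(pool->pages_state_hold_cnt, hold_cnt);

	unsigned int flags = READ_ONCE(pool->flags) | PP_FLAG_SHUTDOWN;
	WRITE_ONCE(pool->flags, flags);

	synchronize_rcu();
}
```

Compared with the quoted hunk, this blocks once instead of twice in the teardown path. Whether that is safe depends on what the concurrent release path reads and in which order, which this sketch does not model.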