From: Gur Stavi <gur.stavi@huawei.com>
Subject: Re: [PATCH net 2/2] page_pool: fix IOMMU crash when driver has already unbound
Date: Tue, 24 Sep 2024 09:45:59 +0300
Message-ID: <20240924064559.1681488-1-gur.stavi@huawei.com>
In-Reply-To: <2fb8d278-62e0-4a81-a537-8f601f61e81d@huawei.com>
References: <2fb8d278-62e0-4a81-a537-8f601f61e81d@huawei.com>

>>>> With all the caching in the network stack, some pages may be
>>>> held in the network stack without returning to the page_pool
>>>> soon enough, and with VF disable causing the driver to unbind,
>>>> the page_pool does not stop the driver from doing its
>>>> unbinding work. Instead, the page_pool uses a workqueue to
>>>> periodically check whether some pages have come back from the
>>>> network stack; if there are any, it does the DMA unmapping
>>>> related cleanup work.
>>>>
>>>> As mentioned in [1], attempting DMA unmaps after the driver
>>>> has already unbound may leak resources or at worst corrupt
>>>> memory. Fundamentally, the page pool code cannot allow DMA
>>>> mappings to outlive the driver they belong to.
>>>>
>>>> Currently it seems there are at least two cases where the page
>>>> is not released fast enough, causing the DMA unmapping to be
>>>> done after the driver has already unbound:
>>>> 1. ipv4 packet defragmentation timeout: this seems to cause
>>>>    delays of up to 30 secs:
>>>>
>>>> 2. skb_defer_free_flush(): this may cause infinite delay if
>>>>    there is no triggering of net_rx_action().
>>>>
>>>> In order not to do the DMA unmapping after the driver has
>>>> already unbound, and not to stall the unloading of the
>>>> networking driver, add the pool->items array to record all the
>>>> pages, including the ones handed over to the network stack, so
>>>> the page_pool can do the DMA unmapping for those pages when
>>>> page_pool_destroy() is called.
>>>
>>> So, I was thinking of a very similar idea. But what do you mean
>>> by "all"?
>>> The pages that are still in caches (slow or fast) of the pool
>>> will be unmapped during page_pool_destroy().
>>
>> Yes, it includes the ones in pool->alloc and pool->ring.
>
> It is worth mentioning that there is a semantic change here:
> Before this patch, there can be an almost unlimited number of
> inflight pages used by the driver and the network stack, as
> page_pool doesn't really track those pages. After this patch, as
> we use a fixed-size pool->items array to track the inflight pages,
> the number of inflight pages is limited by pool->items. Currently
> the size of the pool->items array is calculated as below in this
> patch:
>
> +#define PAGE_POOL_MIN_ITEM_CNT	512
> +	unsigned int item_cnt = (params->pool_size ? : 1024) +
> +				PP_ALLOC_CACHE_SIZE + PAGE_POOL_MIN_ITEM_CNT;
>
> Personally I would consider it an advantage to limit how many pages
> are used by the driver and the network stack; the problem seems to
> be how to decide the limit on the number of pages used by the
> network stack so that performance is not impacted.

In theory, with respect to the specific problem at hand, you only
need a limit on the number of mapped inflight pages. Once you reach
this limit you can unmap these old pages, forget about them, and
remember new ones.
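
To make that last point concrete, here is a minimal standalone C
sketch of the idea. This is not the page_pool patch itself:
pp_item_ring, pp_track_page and pp_unmap_page are made-up names, and
the path that drops an entry when a page actually returns to the pool
is left out for brevity. It only illustrates bounding the number of
*mapped* inflight pages: when the fixed-size array is full, the oldest
entry is unmapped and its slot reused, so the stack itself is never
blocked.

/*
 * Standalone sketch (NOT the actual page_pool code) of "unmap the
 * oldest tracked page when the array is full".
 */
#include <stdint.h>
#include <stdio.h>

#define ITEM_CNT 8			/* fixed tracking capacity */

struct page;				/* opaque stand-in for struct page */

struct pp_item_ring {
	struct page *items[ITEM_CNT];	/* tracked, still-mapped pages */
	unsigned int head;		/* next slot to (re)use */
	unsigned int used;		/* how many slots hold a page */
};

/* Placeholder for the real dma_unmap_page()-style cleanup. */
static void pp_unmap_page(struct page *page)
{
	printf("unmapping stale page %p\n", (void *)page);
}

/*
 * Record a newly mapped page.  If the ring is full, the oldest entry
 * is unmapped and forgotten first, so the number of mapped inflight
 * pages never exceeds ITEM_CNT while the driver/stack keeps going.
 */
static void pp_track_page(struct pp_item_ring *ring, struct page *page)
{
	if (ring->used == ITEM_CNT)
		pp_unmap_page(ring->items[ring->head]);	/* evict oldest */
	else
		ring->used++;

	ring->items[ring->head] = page;
	ring->head = (ring->head + 1) % ITEM_CNT;
}

/* On page_pool_destroy(): unmap whatever is still tracked. */
static void pp_untrack_all(struct pp_item_ring *ring)
{
	unsigned int oldest = (ring->head + ITEM_CNT - ring->used) % ITEM_CNT;
	unsigned int i;

	for (i = 0; i < ring->used; i++)
		pp_unmap_page(ring->items[(oldest + i) % ITEM_CNT]);
	ring->used = 0;
}

int main(void)
{
	struct pp_item_ring ring = { 0 };
	uintptr_t i;

	/* Track more pages than ITEM_CNT to force eviction of old ones. */
	for (i = 0; i < 12; i++)
		pp_track_page(&ring, (struct page *)(0x1000 + i));

	pp_untrack_all(&ring);	/* e.g. at page_pool_destroy() time */
	return 0;
}

With a scheme like this, page_pool_destroy() only ever has to walk
the fixed-size array, and neither the driver unload nor the network
stack is stalled waiting for pages that are still held elsewhere.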