Subject: Re: [PATCH net-next v9 03/13] mm: page_frag: use initial zero offset for page_frag_alloc_align()
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Alexander H Duyck
CC: Andrew Morton
Date: Tue, 2 Jul 2024 20:28:42 +0800
Message-ID: <01dc5b5a-bddf-bd2d-220c-478be6b62924@huawei.com>
In-Reply-To: <8e80507c685be94256e0e457ee97622c4487716c.camel@gmail.com>
References: <20240625135216.47007-1-linyunsheng@huawei.com> <20240625135216.47007-4-linyunsheng@huawei.com> <8e80507c685be94256e0e457ee97622c4487716c.camel@gmail.com>

On 2024/7/2 7:27, Alexander H Duyck wrote:
> On Tue, 2024-06-25 at 21:52 +0800, Yunsheng Lin wrote:
>> We are above to use page_frag_alloc_*() API to not just
> "about to use", not "above to use"

Ack.

>
>> allocate memory for skb->data, but also use them to do
>> the memory allocation for skb frag too. Currently the
>> implementation of page_frag in mm subsystem is running
>> the offset as a countdown rather than count-up value,
>> there may have several advantages to that as mentioned
>> in [1], but it may have some disadvantages, for example,
>> it may disable skb frag coaleasing and more correct cache
>> prefetching
>> ...
>> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
>> index 88f567ef0e29..da244851b8a4 100644
>> --- a/mm/page_frag_cache.c
>> +++ b/mm/page_frag_cache.c
>> @@ -72,10 +72,6 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>  		if (!page)
>>  			return NULL;
>>
>> -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>> -		/* if size can vary use size else just use PAGE_SIZE */
>> -		size = nc->size;
>> -#endif
>>  		/* Even if we own the page, we do not use atomic_set().
>>  		 * This would break get_page_unless_zero() users.
>>  		 */
>> @@ -84,11 +80,16 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>  		/* reset page count bias and offset to start of new frag */
>>  		nc->pfmemalloc = page_is_pfmemalloc(page);
>>  		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
>> -		nc->offset = size;
>> +		nc->offset = 0;
>>  	}
>>
>> -	offset = nc->offset - fragsz;
>> -	if (unlikely(offset < 0)) {
>> +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>> +	/* if size can vary use size else just use PAGE_SIZE */
>> +	size = nc->size;
>> +#endif
>> +
>> +	offset = __ALIGN_KERNEL_MASK(nc->offset, ~align_mask);
>> +	if (unlikely(offset + fragsz > size)) {
>
> The fragsz check below could be moved to here.
>
>>  		page = virt_to_page(nc->va);
>>
>>  		if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
>> @@ -99,17 +100,13 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>  			goto refill;
>>  		}
>>
>> -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>> -		/* if size can vary use size else just use PAGE_SIZE */
>> -		size = nc->size;
>> -#endif
>>  		/* OK, page count is 0, we can safely set it */
>>  		set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
>>
>>  		/* reset page count bias and offset to start of new frag */
>>  		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
>> -		offset = size - fragsz;
>> -		if (unlikely(offset < 0)) {
>> +		offset = 0;
>> +		if (unlikely(fragsz > PAGE_SIZE)) {
>
> Since we aren't taking advantage of the flag that is left after the
> subtraction we might just want to look at moving this piece up to just
> after the offset + fragsz check. That should prevent us from trying to
> refill if we have a request that is larger than a single page. In
> addition we could probably just drop the 3 PAGE_SIZE checks above as
> they would be redundant.

I am not sure I understand the 'drop the 3 PAGE_SIZE checks' part and
the 'redundant' part. Where are the '3 PAGE_SIZE checks', and why are
they redundant?

>
>>  			/*
>>  			 * The caller is trying to allocate a fragment
>>  			 * with fragsz > PAGE_SIZE but the cache isn't big
>> @@ -124,8 +121,7 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>  	}
>>
>>  	nc->pagecnt_bias--;
>> -	offset &= align_mask;
>> -	nc->offset = offset;
>> +	nc->offset = offset + fragsz;
>>
>>  	return nc->va + offset;
>>  }
>
>
> .
>
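
For readers following the thread, the count-up scheme in the quoted
hunks can be sketched in user-space C. This is a minimal illustration
under stated assumptions, not the kernel implementation: struct
frag_cache, frag_cache_init(), frag_alloc_align() and align_up() are
hypothetical names, the backing buffer is supplied by the caller, and
page refill and reference counting are left out entirely. It only shows
the offset starting at zero, being aligned upward before each
allocation, and advancing past the fragment afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct page_frag_cache. */
struct frag_cache {
	unsigned char *va;	/* base of the backing buffer */
	size_t size;		/* buffer size in bytes */
	size_t offset;		/* next free byte, counts up from 0 */
};

/* Round x up to a multiple of align; align must be a power of two. */
static size_t align_up(size_t x, size_t align)
{
	return (x + align - 1) & ~(align - 1);
}

static void frag_cache_init(struct frag_cache *nc, void *buf, size_t size)
{
	nc->va = buf;
	nc->size = size;
	nc->offset = 0;		/* count-up: start at the beginning */
}

/*
 * Carve fragsz bytes out of the cache at an aligned, increasing offset.
 * Returns NULL when the fragment does not fit. The kernel code would
 * try to reuse or refill the page at that point; this sketch just fails.
 */
static void *frag_alloc_align(struct frag_cache *nc, size_t fragsz,
			      size_t align)
{
	size_t offset;

	assert(align != 0 && (align & (align - 1)) == 0);

	/* Align first, then check that the fragment still fits. */
	offset = align_up(nc->offset, align);
	if (offset > nc->size || fragsz > nc->size - offset)
		return NULL;

	/* The next allocation starts right after this fragment. */
	nc->offset = offset + fragsz;
	return nc->va + offset;
}
```

With a 64-byte buffer, a 10-byte request followed by an 8-byte request
aligned to 16 lands the second fragment at offset 16, so consecutive
fragments sit at increasing addresses. A request larger than the whole
buffer fails the same fit check without any separate size test.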