From: Yunsheng Lin <linyunsheng@huawei.com>
Cc: Yunsheng Lin, Alexander Duyck, Andrew Morton
Subject: [PATCH net-next v3 10/13] mm: page_frag: introduce prepare/probe/commit API
Date: Wed, 8 May 2024 21:34:05 +0800
Message-ID: <20240508133408.54708-11-linyunsheng@huawei.com>
In-Reply-To: <20240508133408.54708-1-linyunsheng@huawei.com>
References: <20240508133408.54708-1-linyunsheng@huawei.com>

There are many use cases that need a minimum amount of memory in order
to make forward progress, but that can perform better if more memory is
available, or that need to probe the cache info so any available memory
can be used for frag coalescing reasons.

Currently the skb_page_frag_refill() API is used to solve the above use
cases, but the caller needs to know about the internal details and
access the data fields of 'struct page_frag' directly to meet the
requirements of those use cases, and its implementation is largely
similar to the one in the mm subsystem.

To unify those two page_frag implementations, introduce a prepare API
to ensure that the minimum memory is satisfied and to return how much
memory is actually available to the caller, and a probe API to report
the currently available memory to the caller without doing any cache
refilling. The caller then either calls the commit API to report how
much memory it actually used, or skips the commit entirely if it
decides not to use any memory.

As the next patch is about to replace 'struct page_frag' with
'struct page_frag_cache' in linux/sched.h, which is included by
asm-offsets.s, using virt_to_page() in an inline helper of
page_frag_cache.h causes a "'vmemmap' undeclared" compile error for
asm-offsets.s, so use a macro for the probe API to avoid that compile
error.
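As a rough usage sketch (the caller, its 'nc' cache, and the
fill_data() helper below are hypothetical; only the page_frag_* calls
are the ones introduced here), a caller needing at least 32 bytes but
able to use more would do:

	unsigned int fragsz = 32;	/* minimum needed */
	unsigned int used;
	void *va;

	/* prepare: refill the cache if fewer than fragsz bytes are
	 * left; on success fragsz is updated to all the available room
	 */
	va = page_frag_alloc_va_prepare(nc, &fragsz, GFP_KERNEL);
	if (!va)
		return -ENOMEM;

	used = fill_data(va, fragsz);	/* use up to fragsz bytes */
	if (used)
		page_frag_alloc_commit(nc, used);

The probe variant only reports what is already in the cache, without
doing any refilling:

	unsigned int offset;

	if (page_frag_alloc_probe(nc, &offset, &fragsz, &va)) {
		used = fill_data(va, fragsz);
		if (used)
			page_frag_alloc_commit(nc, used);
	}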
CC: Alexander Duyck
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
 include/linux/page_frag_cache.h |  86 ++++++++++++++++++++++++
 mm/page_frag_cache.c            | 113 ++++++++++++++++++++++++++++++++
 2 files changed, 199 insertions(+)

diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
index 88e91ee57b91..30893638155b 100644
--- a/include/linux/page_frag_cache.h
+++ b/include/linux/page_frag_cache.h
@@ -71,6 +71,21 @@ static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
 	return encoded_page_pfmemalloc(nc->encoded_va);
 }
 
+static inline unsigned int page_frag_cache_page_size(struct encoded_va *encoded_va)
+{
+#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
+	return PAGE_SIZE << encoded_page_order(encoded_va);
+#else
+	return PAGE_SIZE;
+#endif
+}
+
+static inline unsigned int __page_frag_cache_page_offset(struct encoded_va *encoded_va,
+							 unsigned int remaining)
+{
+	return page_frag_cache_page_size(encoded_va) - remaining;
+}
+
 void page_frag_cache_drain(struct page_frag_cache *nc);
 void __page_frag_cache_drain(struct page *page, unsigned int count);
 void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
@@ -85,12 +100,83 @@ static inline void *page_frag_alloc_va_align(struct page_frag_cache *nc,
 	return __page_frag_alloc_va_align(nc, fragsz, gfp_mask, -align);
 }
 
+static inline unsigned int page_frag_cache_page_offset(const struct page_frag_cache *nc)
+{
+	return __page_frag_cache_page_offset(nc->encoded_va, nc->remaining);
+}
+
 static inline void *page_frag_alloc_va(struct page_frag_cache *nc,
 				       unsigned int fragsz, gfp_t gfp_mask)
 {
 	return __page_frag_alloc_va_align(nc, fragsz, gfp_mask, ~0u);
 }
 
+void *page_frag_alloc_va_prepare(struct page_frag_cache *nc, unsigned int *fragsz,
+				 gfp_t gfp);
+
+static inline void *page_frag_alloc_va_prepare_align(struct page_frag_cache *nc,
+						     unsigned int *fragsz,
+						     gfp_t gfp,
+						     unsigned int align)
+{
+	WARN_ON_ONCE(!is_power_of_2(align) || align > PAGE_SIZE);
+	nc->remaining = nc->remaining & -align;
+	return page_frag_alloc_va_prepare(nc, fragsz, gfp);
+}
+
+struct page *page_frag_alloc_pg_prepare(struct page_frag_cache *nc,
+					unsigned int *offset,
+					unsigned int *fragsz, gfp_t gfp);
+
+struct page *page_frag_alloc_prepare(struct page_frag_cache *nc,
+				     unsigned int *offset,
+				     unsigned int *fragsz,
+				     void **va, gfp_t gfp);
+
+static inline struct encoded_va *__page_frag_alloc_probe(struct page_frag_cache *nc,
+							 unsigned int *offset,
+							 unsigned int *fragsz,
+							 void **va)
+{
+	struct encoded_va *encoded_va;
+
+	*fragsz = nc->remaining;
+	encoded_va = nc->encoded_va;
+	*offset = __page_frag_cache_page_offset(encoded_va, *fragsz);
+	*va = encoded_page_address(encoded_va) + *offset;
+
+	return encoded_va;
+}
+
+#define page_frag_alloc_probe(nc, offset, fragsz, va)			\
+({									\
+	struct encoded_va *__encoded_va;				\
+	struct page *__page = NULL;					\
+									\
+	if (likely((nc)->remaining))					\
+		__page = virt_to_page(__page_frag_alloc_probe(nc,	\
+							      offset,	\
+							      fragsz,	\
+							      va));	\
+									\
+	__page;								\
+})
+
+static inline void page_frag_alloc_commit(struct page_frag_cache *nc,
+					  unsigned int fragsz)
+{
+	VM_BUG_ON(fragsz > nc->remaining || !nc->pagecnt_bias);
+	nc->pagecnt_bias--;
+	nc->remaining -= fragsz;
+}
+
+static inline void page_frag_alloc_commit_noref(struct page_frag_cache *nc,
+						unsigned int fragsz)
+{
+	VM_BUG_ON(fragsz > nc->remaining);
+	nc->remaining -= fragsz;
+}
+
 void page_frag_free_va(void *addr);
 
 #endif
diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
index 4542d72e7b01..eb8bf59b26bb 100644
--- a/mm/page_frag_cache.c
+++ b/mm/page_frag_cache.c
@@ -60,6 +60,119 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
 	return NULL;
 }
 
+static struct page *page_frag_cache_refill(struct page_frag_cache *nc,
+					   gfp_t gfp_mask)
+{
+	struct encoded_va *encoded_va = nc->encoded_va;
+
+	if (likely(encoded_va)) {
+		struct page *page = virt_to_page(encoded_va);
+
+		if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
+			return __page_frag_cache_refill(nc, gfp_mask);
+
+		if (unlikely(encoded_page_pfmemalloc(encoded_va))) {
+			free_unref_page(page, compound_order(page));
+			return __page_frag_cache_refill(nc, gfp_mask);
+		}
+
+		/* OK, page count is 0, we can safely set it */
+		set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
+
+		/* reset page count bias and offset to start of new frag */
+		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
+		nc->remaining = page_frag_cache_page_size(encoded_va);
+
+		return page;
+	}
+
+	return __page_frag_cache_refill(nc, gfp_mask);
+}
+
+void *page_frag_alloc_va_prepare(struct page_frag_cache *nc,
+				 unsigned int *fragsz, gfp_t gfp)
+{
+	struct encoded_va *encoded_va;
+	unsigned int remaining;
+
+	remaining = nc->remaining;
+	if (unlikely(*fragsz > remaining)) {
+		if (WARN_ON_ONCE(*fragsz > PAGE_SIZE) ||
+		    !page_frag_cache_refill(nc, gfp))
+			return NULL;
+
+		remaining = nc->remaining;
+	}
+
+	encoded_va = nc->encoded_va;
+	*fragsz = remaining;
+	return encoded_page_address(encoded_va) +
+			__page_frag_cache_page_offset(encoded_va, remaining);
+}
+EXPORT_SYMBOL(page_frag_alloc_va_prepare);
+
+struct page *page_frag_alloc_pg_prepare(struct page_frag_cache *nc,
+					unsigned int *offset,
+					unsigned int *fragsz, gfp_t gfp)
+{
+	struct encoded_va *encoded_va;
+	unsigned int remaining;
+	struct page *page;
+
+	remaining = nc->remaining;
+	if (unlikely(*fragsz > remaining)) {
+		if (WARN_ON_ONCE(*fragsz > PAGE_SIZE)) {
+			*fragsz = 0;
+			return NULL;
+		}
+
+		page = page_frag_cache_refill(nc, gfp);
+		remaining = nc->remaining;
+		encoded_va = nc->encoded_va;
+	} else {
+		encoded_va = nc->encoded_va;
+		page = virt_to_page(encoded_va);
+	}
+
+	*offset = __page_frag_cache_page_offset(encoded_va, remaining);
+	*fragsz = remaining;
+
+	return page;
+}
+EXPORT_SYMBOL(page_frag_alloc_pg_prepare);
+
+struct page *page_frag_alloc_prepare(struct page_frag_cache *nc,
+				     unsigned int *offset,
+				     unsigned int *fragsz,
+				     void **va, gfp_t gfp)
+{
+	struct encoded_va *encoded_va;
+	unsigned int remaining;
+	struct page *page;
+
+	remaining = nc->remaining;
+	if (unlikely(*fragsz > remaining)) {
+		if (WARN_ON_ONCE(*fragsz > PAGE_SIZE)) {
+			*fragsz = 0;
+			return NULL;
+		}
+
+		page = page_frag_cache_refill(nc, gfp);
+		remaining = nc->remaining;
+		encoded_va = nc->encoded_va;
+	} else {
+		encoded_va = nc->encoded_va;
+		page = virt_to_page(encoded_va);
+	}
+
+	*offset = __page_frag_cache_page_offset(encoded_va, remaining);
+	*fragsz = remaining;
+	*va = encoded_page_address(encoded_va) + *offset;
+
+	return page;
+}
+EXPORT_SYMBOL(page_frag_alloc_prepare);
+
 void page_frag_cache_drain(struct page_frag_cache *nc)
 {
 	if (!nc->encoded_va)
-- 
2.33.0