Date: Fri, 10 May 2024 10:38:12 -0700 (PDT)
From: Mat Martineau <martineau@kernel.org>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    Alexander Duyck, Andrew Morton, linux-mm@kvack.org
Subject: Re: [PATCH net-next v3 10/13] mm: page_frag: introduce prepare/probe/commit API
In-Reply-To: <20240508133408.54708-11-linyunsheng@huawei.com>
References: <20240508133408.54708-1-linyunsheng@huawei.com>
 <20240508133408.54708-11-linyunsheng@huawei.com>

On Wed, 8 May 2024, Yunsheng Lin wrote:

> There are many use cases that need a minimum amount of memory in order
> to make forward progress, but that perform better when more memory is
> available, or that need to probe the cache info in order to use any
> available memory for frag coalescing.
>
> Currently the skb_page_frag_refill() API is used to handle the above
> use cases: the caller needs to know about the internal details and
> access the data field of 'struct page_frag' to meet those
> requirements, and its implementation is similar to the one in the mm
> subsystem.
>
> To unify those two page_frag implementations, introduce a prepare API
> to ensure a minimum amount of memory is satisfied and to return how
> much memory is actually available to the caller, and a probe API to
> report the currently available memory to the caller without doing any
> cache refilling. The caller then either calls the commit API to report
> how much memory it actually used, or skips the commit if it decides
> not to use any memory.
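A minimal caller sketch, to make the prepare/commit convention above
concrete. It assumes *fragsz carries the requested minimum on input and
the available size on output, and that the prepare call returns NULL on
failure; the example_fill() helper and its error handling are
hypothetical, not part of this patch:

    static int example_fill(struct page_frag_cache *nc, const void *src,
                            unsigned int min_len, unsigned int len)
    {
            /* minimum needed for forward progress (assumed in/out param) */
            unsigned int fragsz = min_len;
            void *va;

            va = page_frag_alloc_va_prepare(nc, &fragsz, GFP_KERNEL);
            if (!va)
                    return -ENOMEM;

            /* fragsz now holds the available space; use up to len of it */
            len = min(len, fragsz);
            memcpy(va, src, len);

            /* report the bytes actually consumed; skip if none were used */
            page_frag_alloc_commit(nc, len);

            return len;
    }
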
>
> As the next patch is about to replace 'struct page_frag' with
> 'struct page_frag_cache' in linux/sched.h, which is included by
> asm-offsets.s, using virt_to_page() in an inline helper of
> page_frag_cache.h causes a "'vmemmap' undeclared" compile error for
> asm-offsets.s. Use a macro for the probe API to avoid that compile
> error.
>
> CC: Alexander Duyck
> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> ---
>  include/linux/page_frag_cache.h |  86 ++++++++++++++++++++++++
>  mm/page_frag_cache.c            | 113 ++++++++++++++++++++++++++++++++
>  2 files changed, 199 insertions(+)
>
> diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
> index 88e91ee57b91..30893638155b 100644
> --- a/include/linux/page_frag_cache.h
> +++ b/include/linux/page_frag_cache.h
> @@ -71,6 +71,21 @@ static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
>  	return encoded_page_pfmemalloc(nc->encoded_va);
>  }
>
> +static inline unsigned int page_frag_cache_page_size(struct encoded_va *encoded_va)
> +{
> +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> +	return PAGE_SIZE << encoded_page_order(encoded_va);
> +#else
> +	return PAGE_SIZE;
> +#endif
> +}
> +
> +static inline unsigned int __page_frag_cache_page_offset(struct encoded_va *encoded_va,
> +							  unsigned int remaining)
> +{
> +	return page_frag_cache_page_size(encoded_va) - remaining;
> +}
> +
>  void page_frag_cache_drain(struct page_frag_cache *nc);
>  void __page_frag_cache_drain(struct page *page, unsigned int count);
>  void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
> @@ -85,12 +100,83 @@ static inline void *page_frag_alloc_va_align(struct page_frag_cache *nc,
>  	return __page_frag_alloc_va_align(nc, fragsz, gfp_mask, -align);
>  }
>
> +static inline unsigned int page_frag_cache_page_offset(const struct page_frag_cache *nc)
> +{
> +	return __page_frag_cache_page_offset(nc->encoded_va, nc->remaining);
> +}
> +
>  static inline void *page_frag_alloc_va(struct page_frag_cache *nc,
>  				       unsigned int fragsz, gfp_t gfp_mask)
>  {
>  	return __page_frag_alloc_va_align(nc, fragsz, gfp_mask, ~0u);
>  }
>
> +void *page_frag_alloc_va_prepare(struct page_frag_cache *nc, unsigned int *fragsz,
> +				 gfp_t gfp);
> +
> +static inline void *page_frag_alloc_va_prepare_align(struct page_frag_cache *nc,
> +						     unsigned int *fragsz,
> +						     gfp_t gfp,
> +						     unsigned int align)
> +{
> +	WARN_ON_ONCE(!is_power_of_2(align) || align > PAGE_SIZE);
> +	nc->remaining = nc->remaining & -align;
> +	return page_frag_alloc_va_prepare(nc, fragsz, gfp);
> +}
> +
> +struct page *page_frag_alloc_pg_prepare(struct page_frag_cache *nc,
> +					unsigned int *offset,
> +					unsigned int *fragsz, gfp_t gfp);
> +
> +struct page *page_frag_alloc_prepare(struct page_frag_cache *nc,
> +				     unsigned int *offset,
> +				     unsigned int *fragsz,
> +				     void **va, gfp_t gfp);
> +
> +static inline struct encoded_va *__page_frag_alloc_probe(struct page_frag_cache *nc,
> +							  unsigned int *offset,
> +							  unsigned int *fragsz,
> +							  void **va)
> +{
> +	struct encoded_va *encoded_va;
> +
> +	*fragsz = nc->remaining;
> +	encoded_va = nc->encoded_va;
> +	*offset = __page_frag_cache_page_offset(encoded_va, *fragsz);
> +	*va = encoded_page_address(encoded_va) + *offset;
> +
> +	return encoded_va;
> +}
> +
> +#define page_frag_alloc_probe(nc, offset, fragsz, va)			\
> +({									\
> +	struct encoded_va *__encoded_va;				\
> +	struct page *__page = NULL;					\
> +									\

Hi Yunsheng -

I made this suggestion for patch 13 (documentation), but want to
clarify my request here:

> +	if (likely((nc)->remaining))					\

I think it would be more useful to
change this line to

	if ((nc)->remaining >= *fragsz)

That way the caller can use this function to "probe" for a specific
amount of available space, rather than "nonzero" space. If the caller
wants to check for available space, they can set *fragsz = 1.

In other words, I think the functionality you described in the
documentation is better and the code should be changed to match!

- Mat

> +		__page = virt_to_page(__page_frag_alloc_probe(nc,	\
> +							      offset,	\
> +							      fragsz,	\
> +							      va));	\
> +									\
> +	__page;								\
> +})
> +
> +static inline void page_frag_alloc_commit(struct page_frag_cache *nc,
> +					  unsigned int fragsz)
> +{
> +	VM_BUG_ON(fragsz > nc->remaining || !nc->pagecnt_bias);
> +	nc->pagecnt_bias--;
> +	nc->remaining -= fragsz;
> +}
> +
> +static inline void page_frag_alloc_commit_noref(struct page_frag_cache *nc,
> +						unsigned int fragsz)
> +{
> +	VM_BUG_ON(fragsz > nc->remaining);
> +	nc->remaining -= fragsz;
> +}
> +
>  void page_frag_free_va(void *addr);
>
>  #endif
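
A sketch of a probe/commit caller under the suggested semantics above,
i.e. assuming *fragsz is the size needed on input and the space
available on output. It also assumes the caller already holds a
reference on the cached page (for example via an existing skb frag),
which is why the noref commit variant is used; example_probe() is
hypothetical, not part of the patch:

    static bool example_probe(struct page_frag_cache *nc, const void *src,
                              unsigned int need)
    {
            unsigned int offset, fragsz = need;
            struct page *page;
            void *va;

            /* succeeds only if at least 'need' bytes are cached; no refill */
            page = page_frag_alloc_probe(nc, &offset, &fragsz, &va);
            if (!page)
                    return false;

            memcpy(va, src, need);

            /* consume 'need' bytes without taking an extra page reference */
            page_frag_alloc_commit_noref(nc, need);
            return true;
    }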