From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61543C52D7C for ; Thu, 15 Aug 2024 15:26:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EBBC86B00CE; Thu, 15 Aug 2024 11:26:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E6C476B00CF; Thu, 15 Aug 2024 11:26:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D0CC96B00D0; Thu, 15 Aug 2024 11:26:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id B036B6B00CE for ; Thu, 15 Aug 2024 11:26:01 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 602A5C1677 for ; Thu, 15 Aug 2024 15:26:01 +0000 (UTC) X-FDA: 82454855322.19.D2A2126 Received: from mail-wm1-f51.google.com (mail-wm1-f51.google.com [209.85.128.51]) by imf17.hostedemail.com (Postfix) with ESMTP id 707F340020 for ; Thu, 15 Aug 2024 15:25:59 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=J31bW0Ji; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf17.hostedemail.com: domain of alexander.duyck@gmail.com designates 209.85.128.51 as permitted sender) smtp.mailfrom=alexander.duyck@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723735505; a=rsa-sha256; cv=none; b=gfb0EUkJupMGeIVIiShZyg6AGXdPzCzFx2KxFDL3Bq3KnlI8lPcAFtNGWeKg8OUWVW5x6Z hTuvydJVkXghdGbTDhe2Nvnia4Vcyiktv0sSXFGY+0G/gkp2xzevGpzdxbuioTNs4BFWau q69n41siDjQSomHIbCEg8aJzW4g1MNY= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=J31bW0Ji; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf17.hostedemail.com: domain of alexander.duyck@gmail.com designates 209.85.128.51 as permitted sender) smtp.mailfrom=alexander.duyck@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723735505; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=zGytb5LTYI4euY6Qlrzf7/tyRyLi82Zco7uVwIScyTY=; b=shtTSVKb5oatuG0rlLkhpPJZVD2FqmM5p/1L3skW0QskTabar576oShRQxr3uMa5vgc6hG brajxo5KLOF9AIAgy8pBw/OTzkXN3nylQdlDfD4FsKTprWihOvX9gONOBwXw0BpBdRIVyH nD15sffLnksOj46kZcMR67Qe7yhxD/g= Received: by mail-wm1-f51.google.com with SMTP id 5b1f17b1804b1-428243f928fso9943815e9.0 for ; Thu, 15 Aug 2024 08:25:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1723735558; x=1724340358; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=zGytb5LTYI4euY6Qlrzf7/tyRyLi82Zco7uVwIScyTY=; b=J31bW0Ji+4+YrwMgcr7SPvXS3mivwwWNtStAqLuak7U3oTdQbdWA9k4STKjPZuje2j kFGAkzMrntOS4Alfel2XhkTZYbD5SZZZQUrfWjYIsCZQQ1ERJwEGRbxKcEkARntENsBX e2Khq8AiiLLTxh8LO981sM2AV14jfsZOswewitWCuzpKQc5LY0/rAyWBjcYQd61X9p2h 8mIhlYXRMHT6ZjeSF602NtfiQmtGjsHzhnMX/HgwMX0Iu33iLpS5RsXft6L+SHgJAwZk Gz1rRGpsNo4Yr7EkBKDcqAJIkp31fjQKNN8r9NB4PuSlEb293e+q2m4MvZcIVIevV6VX kp5g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723735558; x=1724340358; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zGytb5LTYI4euY6Qlrzf7/tyRyLi82Zco7uVwIScyTY=; b=nXZwHV84U86nKP37GahpaP5HrokQdWBOYuZHNtrpMeCUgEZGGTqDP3d20RrxAFFMWf AJij3wboplTj+uFfcWnF+sZiODj3cZwruO+wwvVIsc9pWspE/BBmfEUhjw17kxlprV/X X7gssWC7H5jU84yhbeA1w/5PXZZlYyBgnMEN6bCvQUg4/kpWm2BWino41kdBgZlj2lkk glsljb/JLsPyd48itEToJfTwdPH5k5pB8mRCRUh3FF/ElOqFUcQpQIxHbDjLw1wZbIJv pTtUEupRvuWltLhKopnqo2VCGsCfWNX0DCw/RP7aYdnVlG1LSyrMZYlRQjGLSZthx7sy KUMQ== X-Forwarded-Encrypted: i=1; AJvYcCXqSTyMY0tAaZSWOUSdY0BpLbFtlHjAcpsXC6HxV+STibHp2hTHc8LLnh5QBXvpuyyshhOqBjZGjhuU9BZ8eEHfDVI= X-Gm-Message-State: AOJu0YzPDp07u9Vhi2wPCWJg6WCd6/3ig+HXM24/EbdHvP/JITx/vsn4 rS9MeYEXcB3VdcRIMxKIy0gtm2cxqPGUkQ33OURssSLw+LPLrIdRI6pvm121qJDvn+vvdwnxMyx kVwopfYxv6GClFGZRgI42fasYilE= X-Google-Smtp-Source: AGHT+IF+q+3Fjs27gzFiH+tEX4ugMizHrZJESz/KVjstuzcfFCh03nxywzC6Lzyp8oSUUoYp5HmYF3tulbgjvTqq2Y4= X-Received: by 2002:a05:600c:511e:b0:426:59d3:8cae with SMTP id 5b1f17b1804b1-429dd236521mr63021845e9.13.1723735557660; Thu, 15 Aug 2024 08:25:57 -0700 (PDT) MIME-Version: 1.0 References: <20240808123714.462740-1-linyunsheng@huawei.com> <20240808123714.462740-12-linyunsheng@huawei.com> <7f06fa30-fa7c-4cf2-bd8e-52ea1c78f8aa@huawei.com> In-Reply-To: <7f06fa30-fa7c-4cf2-bd8e-52ea1c78f8aa@huawei.com> From: Alexander Duyck Date: Thu, 15 Aug 2024 08:25:21 -0700 Message-ID: Subject: Re: [PATCH net-next v13 11/14] mm: page_frag: introduce prepare/probe/commit API To: Yunsheng Lin Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Morton , linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 707F340020 X-Rspam-User: X-Rspamd-Server: rspam05 X-Stat-Signature: mchw9dz6ifz7ahwtgf51dui93trhip19 X-HE-Tag: 1723735559-410542 X-HE-Meta: U2FsdGVkX1/E4urjq9yBeoNgDzLim/Rn8XQkQkLCskY3vbvLZYXYhrgKdDgL+ndbDsAA4PvEXk067EfROUBo1ZlS/d1+Da6uqOAnmknqbuoVkNEd0B839r7iqX3D1S2NhSQmggAO5+YniI5mmf253BUPNvhfCtVNynrp1liMAMVpjjFzM1Z9gHySmRe2MBaWMu+SeifPKGE6tm/KBziwurGyWt5E50ifzhtKUnTf6EA+ZrFa3+dk1B6vJzGvZRbs9vHEehVLN7ijx9o3xswVKNYSRPdkRBFDT/g4zCdjt+fUrKrOaSkaSG/nE790qhGtgeAUaivk71edJQ1WeUTZ75iGTuLIfl1vbkVgcFODJzkcYw9PyBmWClhq/oBjLpTnGk7GjtXLxCjmdWBSdITqEbIpElDRjLNRKDcSKK3VxA8JG9JNkZ0luabM76nNrP6Z271WjrlVd+EJIk63vaqWX6M6RrOqhyHD0i8BEn+X4wAvxqpe3ET4ufD4EVmD4WQEs62d4IDlaYmF9Y+38u3Sk95uwsunQrI98AlkA6lEuMnQgArBkWVLKQRJwF4ULDUzOdEBEH/xsYqq48aNBLD+DKl4ekNbH/kV+qDMGylgmQeCez66aW27xXT858HtshDov+3jnG11rFMQfJ+0HcbnIQtYW2vjZnE9GJDbwQJO5eor29hvz6CtQOgA/FtTLqCG6CqQAB9pYhxIJ/0TfIUm2dJ/uZTnqXdbyw/ZUvwsDVqu5GBuV2shXf2jfwUTNeAgkWrF+IoCs0iHgNuM7VIZL3cx7/RpYan6CCHHl1Rv4FBOyhjzNYNVh7MIX+HE4Sfe+TcNSpJ8aSn1KUY3syCZPyS0IbJ/OkUM5HK7CqEGwhaGTwg4Rlc3wjthLzEPb7R8j5BwlyqKK6pCU8nnKRojls/YjSeZnBo/nqAgkv39niGiDLLdhj98erkKoFUREXimeQoj3KvqPqVD2CxGFJZ Ze9dDshk JrH8Ouf3vPMUuzWvKxYjIpxi72/ZRrFOQNFkgwwBgIjKTIfo2ZCaEG4AaRrYh6IIQV7Ajg0lkh3g/1mgBAfN647MzYcpSMYa39Sq2AwGgQPQzK29fyvdf9e4+ldmdOEQKLVUI+Vaif5tw3xo8cc2vaBwYp0+9pPT+AC3vd/VEF0UxnqDqsjp9hHepKrlyRr3fq1rjtwGQ0h1gofVfGaDhKd99Xdo/XT+WKIB6zmnaUalwJCyeqZMjOLtlPDu1r7dl1Wl4QQxzDyA2Oi+fDDxMPNH5neB9uUvT/AbXAnjwnSM0FZyXdTp17ytXLg/Y5iboFGkoTPh1DUav79nO37vsGkLS3G50+NGXhG0RXVvWBxkKRuAEinEz50u4aP52LkSCzAsmh+0Poy4ovn3Z0Yy/BiLZl4qZwbU2u03fmFiSoWJ1ybnABNoAFp6H7voVfAOiDSA0chUFTqMzbCg= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Aug 14, 2024 at 8:05=E2=80=AFPM Yunsheng Lin wrote: > > On 2024/8/15 5:00, Alexander H Duyck wrote: ... > >> +static inline void page_frag_alloc_abort(struct page_frag_cache *nc, > >> + unsigned int fragsz) > >> +{ > >> + nc->pagecnt_bias++; > >> + nc->remaining +=3D fragsz; > >> +} > >> + > > > > This doesn't add up. Why would you need abort if you have commit? Isn't > > this more of a revert? I wouldn't think that would be valid as it is > > possible you took some sort of action that might have resulted in this > > memory already being shared. We shouldn't allow rewinding the offset > > pointer without knowing that there are no other entities sharing the > > page. > > This is used for __tun_build_skb() in drivers/net/tun.c as below, mainly > used to avoid performance penalty for XDP drop case: Yeah, I reviewed that patch. As I said there, rewinding the offset should be avoided unless you can verify you are the only owner of the page as you have no guarantees that somebody else didn't take an access to the page/data to send it off somewhere else. Once you expose the page to any other entity it should be written off or committed in your case and you should move on to the next block. > > >> +static struct page *__page_frag_cache_reload(struct page_frag_cache *= nc, > >> + gfp_t gfp_mask) > >> { > >> + struct page *page; > >> + > >> if (likely(nc->encoded_va)) { > >> - if (__page_frag_cache_reuse(nc->encoded_va, nc->pagecnt_b= ias)) > >> + page =3D __page_frag_cache_reuse(nc->encoded_va, nc->page= cnt_bias); > >> + if (page) > >> goto out; > >> } > >> > >> - if (unlikely(!__page_frag_cache_refill(nc, gfp_mask))) > >> - return false; > >> + page =3D __page_frag_cache_refill(nc, gfp_mask); > >> + if (unlikely(!page)) > >> + return NULL; > >> > >> out: > >> /* reset page count bias and remaining to start of new frag */ > >> nc->pagecnt_bias =3D PAGE_FRAG_CACHE_MAX_SIZE + 1; > >> nc->remaining =3D page_frag_cache_page_size(nc->encoded_va); > >> - return true; > >> + return page; > >> +} > >> + > > > > None of the functions above need to be returning page. > > Are you still suggesting to always use virt_to_page() even when it is > not really necessary? why not return the page here to avoid the > virt_to_page()? Yes. The likelihood of you needing to pass this out as a page should be low as most cases will just be you using the virtual address anyway. You are essentially trading off branching for not having to use virt_to_page. It is unnecessary optimization. > > >> +struct page *page_frag_alloc_pg(struct page_frag_cache *nc, > >> + unsigned int *offset, unsigned int fragsz= , > >> + gfp_t gfp) > >> +{ > >> + unsigned int remaining =3D nc->remaining; > >> + struct page *page; > >> + > >> + VM_BUG_ON(!fragsz); > >> + if (likely(remaining >=3D fragsz)) { > >> + unsigned long encoded_va =3D nc->encoded_va; > >> + > >> + *offset =3D page_frag_cache_page_size(encoded_va) - > >> + remaining; > >> + > >> + return virt_to_page((void *)encoded_va); > >> + } > >> + > >> + if (unlikely(fragsz > PAGE_SIZE)) > >> + return NULL; > >> + > >> + page =3D __page_frag_cache_reload(nc, gfp); > >> + if (unlikely(!page)) > >> + return NULL; > >> + > >> + *offset =3D 0; > >> + nc->remaining =3D remaining - fragsz; > >> + nc->pagecnt_bias--; > >> + > >> + return page; > >> } > >> +EXPORT_SYMBOL(page_frag_alloc_pg); > > > > Again, this isn't returning a page. It is essentially returning a > > bio_vec without calling it as such. You might as well pass the bio_vec > > pointer as an argument and just have it populate it directly. > > I really don't think your bio_vec suggestion make much sense for now as > the reason mentioned in below: > > "Through a quick look, there seems to be at least three structs which hav= e > similar values: struct bio_vec & struct skb_frag & struct page_frag. > > As your above agrument about using bio_vec, it seems it is ok to use any > one of them as each one of them seems to have almost all the values we > are using? > > Personally, my preference over them: 'struct page_frag' > 'struct skb_fra= g' > > 'struct bio_vec', as the naming of 'struct page_frag' seems to best mat= ch > the page_frag API, 'struct skb_frag' is the second preference because we > mostly need to fill skb frag anyway, and 'struct bio_vec' is the last > preference because it just happen to have almost all the values needed. That is why I said I would be okay with us passing page_frag in patch 12 after looking closer at the code. The fact is it should make the review of that patch set much easier if you essentially just pass the page_frag back out of the call. Then it could be used in exactly the same way it was before and should reduce the total number of lines of code that need to be changed. > Is there any specific reason other than the above "almost all the values = you > are using are exposed by that structure already " that you prefer bio_vec= ?" > > 1. https://lore.kernel.org/all/ca6be29e-ab53-4673-9624-90d41616a154@huawe= i.com/ My reason for preferring bio_vec is that of the 3 it is the most setup to be used as a local variable versus something stored in a struct such as page_frag or used for some specialty user case such as skb_frag_t. In addition it already has a set of helpers for converting it to a virtual address or copying data to and from it which would make it easier to get rid of a bunch of duplicate code.