From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12F98C3DA4A for ; Tue, 6 Aug 2024 00:52:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 990E36B0088; Mon, 5 Aug 2024 20:52:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 91A126B0089; Mon, 5 Aug 2024 20:52:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7936F6B008C; Mon, 5 Aug 2024 20:52:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 5B39B6B0088 for ; Mon, 5 Aug 2024 20:52:48 -0400 (EDT) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id CE3C041E55 for ; Tue, 6 Aug 2024 00:52:47 +0000 (UTC) X-FDA: 82419995574.13.A5065B3 Received: from mail-wm1-f44.google.com (mail-wm1-f44.google.com [209.85.128.44]) by imf18.hostedemail.com (Postfix) with ESMTP id E96861C0006 for ; Tue, 6 Aug 2024 00:52:45 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=XEhQYEaL; spf=pass (imf18.hostedemail.com: domain of alexander.duyck@gmail.com designates 209.85.128.44 as permitted sender) smtp.mailfrom=alexander.duyck@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722905517; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=vbMs5Emme3Wh6YV9L5YnP1YT4GqHd5pCxV5C0hflVMM=; b=aHuy1B13/W+59WyBpACVxSgznwQ2+oo1A82irgPfQ1jpbOiQpOqoj+PYI2OtyUPBzk20Bj nyNg9G9pz1oGt8Mvgg2MYuD0ROuj9sEdXQfgOhejEN9vZY5XsCkUofgzmtNMEWtHpuTSEj /CXFN2ECfMS1nHnhNmFJoR95AUvPnyM= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=XEhQYEaL; spf=pass (imf18.hostedemail.com: domain of alexander.duyck@gmail.com designates 209.85.128.44 as permitted sender) smtp.mailfrom=alexander.duyck@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722905517; a=rsa-sha256; cv=none; b=AMy0g/klD8TUV1ke4m7cIEzKcQw/XyF356+f9WmDdGH3IkYpRl379uHlfXBx6a+j/STi08 QHN6pg5Jkl09RaMF2vAGUCEEecnjIXmbuyAl+0ax1nQjF3RHNpifGLRU10JJhMUzFy3Elt r6SsDZyseFCnd4to5EVHxOTcVDFlWUY= Received: by mail-wm1-f44.google.com with SMTP id 5b1f17b1804b1-428243f928fso978935e9.0 for ; Mon, 05 Aug 2024 17:52:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1722905564; x=1723510364; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=vbMs5Emme3Wh6YV9L5YnP1YT4GqHd5pCxV5C0hflVMM=; b=XEhQYEaL+twSLX82ox71sZDtw/5ucZXFzJ6utcqzSai19HDn4r5u3dPMzo+Df7DzQh N7rC55dvMB5T8y43387nFSpf5YSr4TyKolVHl5Hyo3HU1YWfYZaO4RxiuOp8FGH4xZrw 7gtEkrM/WMDjuEhSwRQVQcf0ewBMPLR22yWrO2Y6nintzzoXrBX8ybdA+ruw3qMZkzW6 WkaIEKKCxELtjdeZg7396Ud7gDS5rwUoqfQZyHqosPUk+eG0cGy5kGuuNLWmbpa5cYqA LW217Wu5z5iPWzXWHvtL00IJdREEicytZmqCmXHpoXuNaCmMQvBE4fQTAcKDXFWr1LHj EBMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722905564; x=1723510364; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=vbMs5Emme3Wh6YV9L5YnP1YT4GqHd5pCxV5C0hflVMM=; b=UTIH8VJTlJb+4CGhyX3/Zs3G2XBpPK2r/kB8LqV5hw+ovmANjKvatPsKxhEh5IFZwd pG30ROJxrk195nGVXmpcwsqEVeaFbrFtyWS9Y3+xsErh0E91fICAUiOeJSRyOuQ+3Av7 xJBH9JkCNAMaKNXuevxEhnbTsmPgRfAwJVSqJxSWtc/Fu2hJbExCRQFWeygbMKeoYwrh 1xG0UXtyYPIXoXVrDCuS0OpuoSwR2jUqSctHnToTQ9sgOWq/t0bUKkg6MAXsiaqz49L6 nXb/7voNlnVNKMXYNm6p9tDf/TfCyUUu5Vhl+liS/CwcP9iP2BZQ+O8/cVnhvqkraWT1 y/RA== X-Forwarded-Encrypted: i=1; AJvYcCUcx1qq3lPBn3JoTyaQcItatjf0sekP6TTeasWySA4j4/66c9tbb7mVK29Ie2noW9anoN35UB4xd/3gzREP7VfTPpA= X-Gm-Message-State: AOJu0YxXULi/geuFXYSGoDPi4RFENTpl/q2jGeT3MLVSGvQOkKPcpTLt GIPBVHKhamiRe0+Zy2X1qWL2LWkDZ36aMaG0+8bPOe/wYcHIAkQk9rpiT4ohQrjZAltvGrp42GU lKNkgTchmkTLAgR9PU8cKZnWm/xY= X-Google-Smtp-Source: AGHT+IGqdqorrE2wlr/gRUXIf6uZerGtzKcZj5VR2LPXl0lkIPBaNlvbUHmQ57kNpAjkgdSFWmDjnDGMqxYFtJZpI1s= X-Received: by 2002:a5d:6042:0:b0:368:7f8f:ca68 with SMTP id ffacd0b85a97d-36bbc117fd0mr8920631f8f.30.1722905564079; Mon, 05 Aug 2024 17:52:44 -0700 (PDT) MIME-Version: 1.0 References: <20240731124505.2903877-1-linyunsheng@huawei.com> <20240731124505.2903877-5-linyunsheng@huawei.com> <22fda86c-d688-42e7-99e8-e2f8fcf1a5ba@huawei.com> <877efebe-f316-4192-aada-dd2657b74125@huawei.com> <2a29ce61-7136-4b9b-9940-504228b10cba@gmail.com> In-Reply-To: <2a29ce61-7136-4b9b-9940-504228b10cba@gmail.com> From: Alexander Duyck Date: Tue, 6 Aug 2024 06:22:08 +0530 Message-ID: Subject: Re: [PATCH net-next v12 04/14] mm: page_frag: add '_va' suffix to page_frag API To: Yunsheng Lin Cc: Yunsheng Lin , davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Subbaraya Sundeep , Jeroen de Borst , Praveen Kaligineedi , Shailend Chand , Eric Dumazet , Tony Nguyen , Przemek Kitszel , Sunil Goutham , Geetha sowjanya , hariprasad , Felix Fietkau , Sean Wang , Mark Lee , Lorenzo Bianconi , Matthias Brugger , AngeloGioacchino Del Regno , Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Chaitanya Kulkarni , "Michael S. Tsirkin" , Jason Wang , =?UTF-8?Q?Eugenio_P=C3=A9rez?= , Andrew Morton , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , David Howells , Marc Dionne , Chuck Lever , Jeff Layton , Neil Brown , Olga Kornievskaia , Dai Ngo , Tom Talpey , Trond Myklebust , Anna Schumaker , intel-wired-lan@lists.osuosl.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, linux-nvme@lists.infradead.org, kvm@vger.kernel.org, virtualization@lists.linux.dev, linux-mm@kvack.org, bpf@vger.kernel.org, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: 3tb7chqj9tcz45bz1um3z7pgb59dtsis X-Rspam-User: X-Rspamd-Queue-Id: E96861C0006 X-Rspamd-Server: rspam02 X-HE-Tag: 1722905565-980041 X-HE-Meta: U2FsdGVkX195mr0Yuq969XlwAHKLSf8Z2+tRMUR7W39uQ6sa2EhGg6wBbS4hMKTE1O8KyEr85sPBiGMuM/VzMkBrngOxo72jxOl4a+fP3DSqvvEE2dfooob1XHrEjfPHjZeVfMo6R2mJs1FeqjoxNII9lPdc554IPQtF0HtY6YSkGO67/FN2YF/s/TE5G29hVsGAt4UAj+v05EgcdUFJPhEW/MesbonUTrEOZ2bwAcsUHELv3m02R/t2pAEtn6EiJ8hIzrWsTT6X3Tda/2m2JuMwNRjGz38v0ylJcfZBnX2HbeDMTgTpJmS27CFtprgE1r9/pVIlUL/Q5HyAdc4zH04Q0rieAE6JZApzxJYE/sEvfB+lnJ+mzF0EOUmBUPl+KjZ7n8isNKK6oEKNRB9feVAH8oDQKJWs9EreGEBq/mOeCFsm1p1QS0HQ0EKNcbcl9YQWd+ZcRKoSdGhTYzUCD/vJn736Ou41dk2vx/m+7vK+TCSzYcLIeLeKGyxBXk845E5P0T2PqyypQbGsycWB+kzV2gmNAaPB2yf4svTZTNQ16B05eHlKdBJn95GDqS3SFdp4RRfVbvVqucr290s6gduTl3yE3cXUjtSHX5Flnm49vg0RetPkoQncGa4fJ6dXg2S9bAlK+5z71/QygkrxKBEy8/7pVCLzaeSb4SS1HBL0wj78Ag7ZihbAOuRRD4le1vM8MJebvX2gcBQ3qzt+mV+7pMu0WxYvqmcw4yKsE3S3jaRCPc7CDh45FB6ZmJvH6g0nTQ3a5j9afr7KBe8Wovv1kcxbAP8dv5Z2lDdoWM78J1SSIBURGopj+YLN9QGhRB8DLlMrKik0e05sXV1CxuVEzoeLVSMO/fR6EbypTitebgnSmTTAHEkSyWRQgd1U9a8he/rrjXlclebH+n6kjswJzhXOybPdVtneFeiYodSaAfgGp0ZF6Dhw80cpF57FWEAqvI0nmjnxELTnwV2 7EpvW+sP XImFxLPXuqkntQZd0GZQgp6NH7r1UU2YnEJTjBUQXc9Enx68fayR1S8udyiFMo6LpWsM9iMA78Yhg3uEzyzETdZz2iLu3rl1LgSnIgH3JxtaGX3B/yBNNtia8+zdeV2jFQewS1k3992/ts54lArtjG/xrskH12CJNFH54PrFwsj0JRYuJNWlPJaDaEDyIRXF6RN2K1eO3cJhEwUQuwxrjX8UWNvP/5Gc0aPvN6d9iD0FzCEIepx1/cNCQdRJT8NWwBeWakOoekLgnLdR0WhDtSjRJ3ySN8qBV0sjuGYabY5pSHQAKnutj0f1y9Q6Kup4G4FoPNZR7TB/koAwb3iMsbIfY0rW0TLvKDGCSg4UXrsQ9kuZp5bZ7oZBhwQtrkc0C+JwQChVTGTzJgXE63nQWodFRYCQLw/KVKzN9g0TEA0nIay+pSsLV7y3laH1NFY1ucEwnXU8b6LrPgTjEAv1+g0bL8CKilBisawyHG6wHzFvjec3HwXVuUk1MZ/Ier+KWn5kB83gsU7lWHa6l8uQfZONjTT8c1y8J2f1GlJFEDADovoTLjBZs3zZHXVqBwBL7huDo6uwFL1U46o2LGe+00pWINuePnjSHTFaGxFuHrdc2kKQ= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Sun, Aug 4, 2024 at 10:00=E2=80=AFAM Yunsheng Lin wrote: > > On 8/3/2024 1:00 AM, Alexander Duyck wrote: > > >> > >>> > >>> As far as your API extension and naming maybe you should look like > >>> something like bio_vec and borrow the naming from that since that is > >>> essentially what you are passing back and forth is essentially that > >>> instead of a page frag which is normally a virtual address. > >> > >> I thought about adding something like bio_vec before, but I am not sur= e > >> what you have in mind is somthing like I considered before? > >> Let's say that we reuse bio_vec like something below for the new APIs: > >> > >> struct bio_vec { > >> struct page *bv_page; > >> void *va; > >> unsigned int bv_len; > >> unsigned int bv_offset; > >> }; > > > > I wasn't suggesting changing the bio_vec. I was suggesting that be > > what you pass as a pointer reference instead of the offset. Basically > > your use case is mostly just for populating bio_vec style structures > > anyway. > > I wasn't trying/going to reuse/change bio_vec for page_frag, I was just > having a hard time coming with a good new name for it. > The best one I came up with is pfrag_vec, but I am not sure about the > 'vec' as the "vec" portion of the name would suggest, iovec structures > tend to come in arrays, mentioned in the below article: > https://lwn.net/Articles/625077/ > > Anther one is page_frag, which is currently in use. > > Or any better one in your mind? I was suggesting using bio_vec, not some new structure. The general idea is that almost all the values you are using are exposed by that structure already in the case of the page based calls you were adding, so it makes sense to use what is there rather than reinventing the wheel. > > > >> It seems we have the below options for the new API: > >> > >> option 1, it seems like a better option from API naming point of view,= but > >> it needs to return a bio_vec pointer to the caller, it seems we need t= o have > >> extra space for the pointer, I am not sure how we can avoid the memory= waste > >> for sk_page_frag() case in patch 12: > >> struct bio_vec *page_frag_alloc_bio(struct page_frag_cache *nc, > >> unsigned int fragsz, gfp_t gfp_ma= sk); > >> > >> option 2, it need both the caller and callee to have a its own local s= pace > >> for 'struct bio_vec ', I am not sure if passing the content instead of > >> the pointer of a struct through the function returning is the common p= attern > >> and if it has any performance impact yet: > >> struct bio_vec page_frag_alloc_bio(struct page_frag_cache *nc, > >> unsigned int fragsz, gfp_t gfp_mas= k); > >> > >> option 3, the caller passes the pointer of 'struct bio_vec ' to the ca= llee, > >> and page_frag_alloc_bio() fills in the data, I am not sure what is the= point > >> of indirect using 'struct bio_vec ' instead of passing 'va' & 'fragsz'= & > >> 'offset' through pointers directly: > >> bool page_frag_alloc_bio(struct page_frag_cache *nc, > >> unsigned int fragsz, gfp_t gfp_mask, struct = bio_vec *bio); > >> > >> If one of the above option is something in your mind? Yes, please be m= ore specific > >> about which one is the prefer option, and why it is the prefer option = than the one > >> introduced in this patchset? > >> > >> If no, please be more specific what that is in your mind? > > > > Option 3 is more or less what I had in mind. Basically you would > > return an int to indicate any errors and you would be populating a > > bio_vec during your allocation. In addition you would use the bio_vec > > Actually using this new bio_vec style structures does not seem to solve > the APIs naming issue this patch is trying to solve as my understanding, > as the new struct is only about passing one pointer or multi-pointers > from API naming perspective. It is part of the API naming, but not all > of it. I have no idea what you are talking about. The issue was you were splitting things page_frag_alloc_va and page_frag_alloc_pg. Now it would be page_frag_alloc and page_frag_alloc_bio or maybe page_frag_fill_bio which would better explain what you are doing with this function. > > as a tracker of the actual fragsz so when you commit you are > > committing with the fragsz as it was determined at the time of putting > > the bio_vec together so you can theoretically catch things like if the > > underlying offset had somehow changed from the time you setup the > > I think we might need a stronger argument than the above to use the new > *vec thing other than the above debugging feature. > > I looked throught the bio_vec related info, and come along somewhat not > really related, but really helpful "What=E2=80=99s all this get us" secti= on: > https://docs.kernel.org/block/biovecs.html > > So the question seems to be: what is this new struct for page_frag get > us? > > Generally, I am argeed with the new struct thing if it does bring us > something other than the above debugging feature. Otherwise we should > avoid introducing a new thing which is hard to argue about its existent. I don't want a new structure. I just want you to use the bio_vec for spots where you are needing to use a page because you are populating a bio_vec. > > allocation. It would fit well into your probe routines since they are > > all essentially passing the page, offset, and fragsz throughout the > > code. > > For the current probe routines, the 'va' need to be passed, do you > expect the 'va' to be passed by function return, double pointer, or > new the *_vec pointer? I would suggest doing so via the *_vec pointer. The problem as I see it is that the existing code is exposing too much of the internals and setting up the possibility for a system to get corrupted really easily. At least if you are doing this with a bio_vec you can verify that you have the correct page and offset before you move the offset up by the length which should have been provided by the API in the first place and not just guessed at based on what the fragsz was that you requested.