From: Alexander Duyck
Date: Mon, 8 Jan 2024 08:25:25 -0800
Subject: Re: [PATCH net-next 3/6] mm/page_alloc: use initial zero offset for page_frag_alloc_align()
To: Yunsheng Lin
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Morton, linux-mm@kvack.org
References: <20240103095650.25769-1-linyunsheng@huawei.com> <20240103095650.25769-4-linyunsheng@huawei.com> <74c9a3a1-5204-f79a-95ff-5c108ec6cf2a@huawei.com>
In-Reply-To: <74c9a3a1-5204-f79a-95ff-5c108ec6cf2a@huawei.com>

On Mon, Jan 8, 2024 at 12:59 AM Yunsheng Lin wrote:
>
> On 2024/1/5 23:42, Alexander H Duyck wrote:
> > On Wed, 2024-01-03 at 17:56 +0800, Yunsheng Lin wrote:
> >> The next patch is about to use page_frag_alloc_align() to
> >> replace vhost_net_page_frag_refill(), the main difference
> >> between those two frag page implementations is whether we
> >> use an initial zero offset or not.
> >>
> >> It seems more natural to use an initial zero offset, as it
> >> may enable more correct cache prefetching and skb frag
> >> coalescing in the networking, so change it to use initial
> >> zero offset.
> >>
> >> Signed-off-by: Yunsheng Lin
> >> CC: Alexander Duyck
> >
> > There are several advantages to running the offset as a countdown
> > rather than count-up value.
> >
> > 1. Specifically for the size of the chunks we are allocating doing it
> > from the bottom up doesn't add any value as we are jumping in large
> > enough amounts and are being used for DMA so being sequential doesn't
> > add any value.
>
> What is the expected size of the above chunks in your mind? I suppose
> that is like NAPI_HAS_SMALL_PAGE_FRAG to avoid excessive truesize
> underestimation?
>
> It seems there is no limit for min size of allocation for
> page_frag_alloc_align() now, and as the page_frag API seems to be only
> used in networking, should we enforce the min size of allocation at API
> level?

The primary use case for this is to allocate fragments to be used for
storing network data. We usually end up allocating a minimum of 1K in
most cases, as you end up having to reserve something like 512B for
headroom and the skb_shared_info in an skb. In addition, the slice
lengths become very hard to predict, as these are usually used for
network traffic, so the size can be as small as 60B for a packet or as
large as 9KB.

> >
> > 2. By starting at the end and working toward zero we can use built in
> > functionality of the CPU to only have to check and see if our result
> > would be signed rather than having to load two registers with the
> > values and then compare them which saves us a few cycles. In addition
> > it saves us from having to read both the size and the offset for every
> > page.
>
> I suppose the above is ok if we only use the page_frag_alloc*() API to
> allocate memory for skb->data, not for the frag in skb_shinfo(), as by
> starting at the end and working toward zero, it means we can not do skb
> coalescing.
>
> As page_frag_alloc*() is returning va now, I am assuming most users
> are using the API for skb->data, I guess it is ok to drop this patch for
> now.
>
> If we allow page_frag_alloc*() to return struct page, we might need this
> patch to enable coalescing.

I would argue this is not the interface for enabling coalescing. This
is one of the reasons why this is implemented the way it is. When you
are aligning fragments you aren't going to be able to coalesce the
frames anyway, as the alignment would push the fragments apart.
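
To make the two points above concrete (the sign-only check on a
countdown offset, and alignment pushing fragments apart), here is a
minimal user-space sketch in C. It is not the kernel implementation:
struct frag_cache, frag_alloc_countdown(), frag_alloc_countup() and
CACHE_SIZE are hypothetical names made up for illustration, offsets are
returned instead of addresses, and the refill path is left out.

```c
#include <assert.h>

#define CACHE_SIZE 4096 /* hypothetical size of the backing region */

/* Hypothetical model of a fragment cache; not the kernel's structure. */
struct frag_cache {
	int offset; /* next-allocation cursor */
	int size;   /* total bytes; only the count-up variant reads it */
};

/* Countdown: the cursor starts at the end and moves toward zero. */
static inline int frag_alloc_countdown(struct frag_cache *c, int size, int align)
{
	int offset;

	assert(align > 0 && (align & (align - 1)) == 0); /* power of two */

	offset = c->offset - size;
	if (offset < 0)         /* only the sign of the result is tested */
		return -1;      /* a real allocator would refill here */

	offset &= ~(align - 1); /* round down to the alignment */
	c->offset = offset;
	return offset;
}

/* Count-up: the cursor starts at zero and moves toward the end. */
static inline int frag_alloc_countup(struct frag_cache *c, int size, int align)
{
	int offset;

	assert(align > 0 && (align & (align - 1)) == 0); /* power of two */

	offset = (c->offset + align - 1) & ~(align - 1); /* round up */
	if (offset + size > c->size) /* needs both the offset and the size */
		return -1;

	c->offset = offset + size;
	return offset;
}
```

With a 4096-byte cache and two 100-byte requests aligned to 64 bytes,
the countdown variant hands out offsets 3968 and 3840, and the count-up
variant hands out 0 and 128. In both directions a 28-byte gap separates
the two fragments, so they are not contiguous.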