Subject: Re: [PATCH net-next 3/6] mm/page_alloc: use initial zero offset for page_frag_alloc_align()
From: Alexander H Duyck
To: Yunsheng Lin, davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Morton, linux-mm@kvack.org
Date: Fri, 05 Jan 2024 07:42:08 -0800
In-Reply-To: <20240103095650.25769-4-linyunsheng@huawei.com>
References: <20240103095650.25769-1-linyunsheng@huawei.com> <20240103095650.25769-4-linyunsheng@huawei.com>

On Wed, 2024-01-03 at 17:56 +0800, Yunsheng Lin wrote:
> The next patch is about to use page_frag_alloc_align() to
> replace vhost_net_page_frag_refill(), the main difference
> between those two frag page implementations is whether we
> use an initial zero offset
> or not.
>
> It seems more natural to use an initial zero offset, as it
> may enable more correct cache prefetching and skb frag
> coalescing in the networking, so change it to use initial
> zero offset.
>
> Signed-off-by: Yunsheng Lin
> CC: Alexander Duyck

There are several advantages to running the offset as a countdown
rather than a count-up value.

1. For the size of the chunks we are allocating, working from the
bottom up doesn't add any value: we are jumping in large enough
increments, and the buffers are being used for DMA, so being
sequential buys us nothing.

2. By starting at the end and working toward zero we can use the
built-in functionality of the CPU and only check whether the result
is negative, rather than loading two registers with the values and
comparing them, which saves us a few cycles. It also saves us from
having to read both the size and the offset for every page.

Again, this is another code cleanup at the cost of performance. I
realize many of the items you are removing would be considered
micro-optimizations, but when we are dealing with millions of packets
per second those optimizations add up.
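To make point 2 concrete, here is a minimal user-space sketch of the two schemes. It is not the kernel code: struct frag_cache and the countdown_*/countup_* helpers are hypothetical names made up for illustration, and alignment, refcounting and page refill are all left out. It only shows which fields each variant has to read in order to decide whether the page is exhausted.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical model of a fragment cache; not the kernel's structure. */
struct frag_cache {
	unsigned int size;	/* total bytes in the backing page(s) */
	int offset;		/* current position inside the page */
};

/* Count-down: the offset starts at the end and moves toward zero. */
static inline void countdown_init(struct frag_cache *c, unsigned int size)
{
	assert(size <= INT_MAX);
	c->size = size;
	c->offset = (int)size;
}

/*
 * Returns the offset of the new fragment, or -1 if the page is used up.
 * Only c->offset is read; exhaustion is detected from the sign of the
 * subtraction result, so c->size is never touched here.
 */
static inline int countdown_alloc(struct frag_cache *c, unsigned int fragsz)
{
	int offset = c->offset - (int)fragsz;

	if (offset < 0)
		return -1;
	c->offset = offset;
	return offset;
}

/* Count-up: the offset starts at zero and moves toward size. */
static inline void countup_init(struct frag_cache *c, unsigned int size)
{
	assert(size <= INT_MAX);
	c->size = size;
	c->offset = 0;
}

/*
 * Returns the offset of the new fragment, or -1 if the page is used up.
 * Both c->offset and c->size must be read and compared on every call.
 */
static inline int countup_alloc(struct frag_cache *c, unsigned int fragsz)
{
	unsigned int offset = (unsigned int)c->offset;

	if (offset + fragsz > c->size)
		return -1;
	c->offset = (int)(offset + fragsz);
	return (int)offset;
}
```

With a 4096-byte page and 2048-byte fragments, the countdown variant hands out offsets 2048 and then 0, while the count-up variant hands out 0 and then 2048; both refuse a third request. Whether the compiler actually turns the `offset < 0` test into a bare sign-flag branch depends on the architecture and optimization level, so treat the cycle savings as something to measure rather than assume.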