From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 059D2C4345F for ; Tue, 16 Apr 2024 16:08:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5A0E06B0089; Tue, 16 Apr 2024 12:08:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 551546B008A; Tue, 16 Apr 2024 12:08:56 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 43EFD6B008C; Tue, 16 Apr 2024 12:08:56 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 27CE46B0089 for ; Tue, 16 Apr 2024 12:08:56 -0400 (EDT) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 99C8D409F5 for ; Tue, 16 Apr 2024 16:08:55 +0000 (UTC) X-FDA: 82015878630.17.BDE1030 Received: from mail-wr1-f45.google.com (mail-wr1-f45.google.com [209.85.221.45]) by imf23.hostedemail.com (Postfix) with ESMTP id 9A27B140026 for ; Tue, 16 Apr 2024 16:08:53 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=fZspBn40; spf=pass (imf23.hostedemail.com: domain of alexander.duyck@gmail.com designates 209.85.221.45 as permitted sender) smtp.mailfrom=alexander.duyck@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1713283733; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=TfqeinlZEWtIXVEkZjds66VVZ8TL7yzBqLVW5+SS5fc=; b=EWBUxqP1AjLmLfLLlL+IhZx8VrgU4CUaBkoEJzmI4L9v3TZR/AP2n4JHng3jnYkML0t7Mk lA9YRWUlChTYejnOcZuVR7JOxC1rgD/QDUcSt6+LFig94yrcpw92h9jb0D+To6QK23X73n K1qrDA3kea4IU3w33lnYbXOHgME0cjQ= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1713283733; a=rsa-sha256; cv=none; b=Bc1wilC9bXISjFVP1+kRcz2KlmyNC1WBQUmmTvNIeSjkKpyvtQQJQp8WKCrhtbIIbV/3pH opPowi8Lmympd+COzsYARRmg7onCc2MWghD+Yn9GmTPNbSQ17ffFytlEqQMe+wVtLgApWl oy2TuWTxT/yW0Nu0Ub/Y8BGQqXcVVfw= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=fZspBn40; spf=pass (imf23.hostedemail.com: domain of alexander.duyck@gmail.com designates 209.85.221.45 as permitted sender) smtp.mailfrom=alexander.duyck@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-wr1-f45.google.com with SMTP id ffacd0b85a97d-347e635b1fcso1537329f8f.1 for ; Tue, 16 Apr 2024 09:08:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1713283732; x=1713888532; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=TfqeinlZEWtIXVEkZjds66VVZ8TL7yzBqLVW5+SS5fc=; b=fZspBn40xGCZ6Co2UL6jBDxjERRfBvJYCRIZSjp1k8dtDjpdKbVjCqOAzxZfj7XVqh DvcUS9PsCJAWr5qhfpPD/i165hc3oe0liFTPjeHcjjcO8+yS24s5mfiVUG8C2q6VQXDF fU2pNiM6f7xiiIhz/WhVr6foHgIUyyLbJL3dMAgvcPptO2ZFPvwJafP9yW3Dw2VH6utr wKCdTLvvFf8gd0JTfet7f5gJWcAl2e36g5oDlSWopjayGKmGXZEVkBPrUWOHi33L/DNo 3dY3DmJTbtBqbFdSMXSmu16Hn2t0zQRiyzENeWUEcxfsuG7XWwWzzxXyZDwSIrkh9+L+ 5nwA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1713283732; x=1713888532; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TfqeinlZEWtIXVEkZjds66VVZ8TL7yzBqLVW5+SS5fc=; b=qOZxHgJT+4JWaNXQf0g7zSeUDD/E3bcTPbR8RAqa1pVitGk0TSQK5ic4L+VDmE89w3 kZ/2BCnL+Xb4BEWVyayaZoBDFiXoIfnXiFbhmirUVTqQYXLBjQaoRQ5ihigOHkIJgf5u TBPdyK8rd3nzwqTRAsMUGIcjDyQNh4beVALecWvhcEJT5M7K7W9lvhajMK3Bqep8Rgfi 1xZcZXZKfClCy/3n/TPfnO4yqOElvM2B5S6tEBeLSjjIMRn9FN7vMtZGL4upTWHmkgQk PQlU6SH7ReYRAK00BdVrNe5SGsF1oE1MyKiHYBpgne8MKs2QdcgtbEeVUsu12fCQLsB3 p63w== X-Forwarded-Encrypted: i=1; AJvYcCWWjxj4hhN02yVAmRmp/tUFCDtOwc+dHRDCpWqPP2ft54NCZtuMqwl4h/4lU8p9y7vxioPdDGLihIqigeVuxXhc4qs= X-Gm-Message-State: AOJu0YwvEtSj3+i0V9N2dJ0SaUk3fQOgTo84V4qXN3ZU6HiDAh27UCb/ SQE8pLXMYDUZQZCG+zsJkC0LeebFq1aI4c427fTbVuXEKcnB0hnmTA3tFWDwjwBM+hXose6Moco sHZogOF6mMlG21wSA2/8bNUxMsrQ= X-Google-Smtp-Source: AGHT+IGF5abEcTvau2mHvo9SZL7EcchFBWSdkvIwcQqqBgNJB7O3i8rimyfDFs0cfpgylrUUTfGBxVDks5HFfnr/FOY= X-Received: by 2002:a5d:510f:0:b0:346:cd1c:dc73 with SMTP id s15-20020a5d510f000000b00346cd1cdc73mr11149953wrt.46.1713283731429; Tue, 16 Apr 2024 09:08:51 -0700 (PDT) MIME-Version: 1.0 References: <20240415131941.51153-1-linyunsheng@huawei.com> <20240415131941.51153-7-linyunsheng@huawei.com> In-Reply-To: <20240415131941.51153-7-linyunsheng@huawei.com> From: Alexander Duyck Date: Tue, 16 Apr 2024 09:08:14 -0700 Message-ID: Subject: Re: [PATCH net-next v2 06/15] mm: page_frag: change page_frag_alloc_* API to accept align param To: Yunsheng Lin Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Morton , Eric Dumazet , David Howells , Marc Dionne , linux-mm@kvack.org, linux-afs@lists.infradead.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 9A27B140026 X-Rspam-User: X-Stat-Signature: 7y7xr4ikqa9wbnuz7xgjmt4ujju187zf X-HE-Tag: 1713283733-584910 X-HE-Meta: U2FsdGVkX19/3Y8duwycyeJwtSWLw3ZqqQc40poNDwGCqHaWSroJwcQKotDw/Wli60w+9qPi+M6eK+xWs5B582tu7LQ6s+VfWA28FIl/lQsYKaZUBrig4q35KuhakSc0A+vIlqZorq+2ua3/WCzZTe2YVlMHgglaqXiaSEwIHCI+M20HREpV16PHKZGLeFymamEIyMKCQR7leCtpCbZjW0JKfQDtuENxhSXavLJGNxZPEYzSzLbQedi6zVkyTkokHirWYkea9Nylqa1ytyo/1RHmrb46TMdS1/BKVrOkOjzWIIPzBiy4UyJXORC4YU01I5Qfm53tJesSbfH2tOD77MWok/m0h1CnTVS/vGOvorVUb/YF0Gp+DvirhqNQONcdhXX0zQsBeqWeqKCzxkLF+1tY+0/skdn1hxnYgPXj1x2iO61tZKamGpt9KDvQCiZ8Z8HqvSFmERIku6jJyhlKbEDJ1uXjiML9khUslH021y4Ql4AMQPV6lJuiGxqgrJc0rMzdsigFwl4wEHgj9cOF7iFgzsrhFRsvzz+2xkvsHCM+D0oxHo43YZYc+2zHSXVX4bCK6w6n8ZOyoa2TqRbG6gLUE2IP5PYlG646fqKkL3oFgqCkVYoOzUh+xk7+ejOCQCY0ax4Di9a6D/qQcnBoGrsfDijPw9NMWTeovJak8lADizunQPcvPbCBA7mwL4QbLx8G4azat+miIRyk15AfyUPrn5nNAh1eQ8kYDCLPM10t3ittQIofPUB6TyKZqPXpnHXQVGTmb58QItKZVgmFvBDxJcWACJLLp21A6NABbG9bu0Egl4y/s34hWK3CkXXKtpbykeQ5aUMXmxoqZ+Xr73WGocS8Z5SYnr9NEUOaw36FHn1cr+gvX7uaM9mC2XEtq0HjmBJ81fvegAVrqwwuZBU42d1BgtoK9ED1O3ASKhGud+quWMbLqo3pgH90PWWQwXwUR6JQCkn51Yxb9UD lfos5BcC mvLENC15sCG5dCqQHcwOCmW6rLDtrNvGLAhhVn5OWTaLnkblg70CniE+LizC+HwdWO8JmwlpN4y9wKHqrHUk/pbGEnBcbQqkQ79Ituf1bUwfQIRPCukvoXiorszLtKFHB7t0IclbRiibJpGv7nOCq3Xmdh3vpRTj6AX8GIzc8s4HeHvYik0sysru/5LLHtNQcmwdslPGwzBw7nOFD0/VgLh1Zjv/N34WmJ52PqOme1lqRNpZbRs9r5AdbLFRUtpehFhJOl/u33tJcBK3T8dPO3atoe9WwYm9cpsG3LGQdn1mXfJUjunTQPmVM0FpDaANaBTuOTogZ7DpvJLq3XnIjC3sbwSOVIQaDfYAevQ167YYLoUdNj0kZMklEG0HWTjv8A3ThNtND8qnvA/59Np22sIUjL7DIccS7MHWtfIomJIVQ0beZmUGUjVDLEZz0j7lCPDUVGkK4nZDFY9o= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Apr 15, 2024 at 6:22=E2=80=AFAM Yunsheng Lin wrote: > > When page_frag_alloc_* API doesn't need data alignment, the > ALIGN() operation is unnecessary, so change page_frag_alloc_* > API to accept align param instead of align_mask param, and do > the ALIGN()'ing in the inline helper when needed. > > Signed-off-by: Yunsheng Lin The vast majority of callers are using this aligned one way or another. If anything with your recent changes we should probably be making sure to align the fragsz as well as the offset since most callers were only using the alignment of the fragsz in order to get their alignment. My main concern is that this change implies that most are using an unaligned setup when it is in fact quite the opposite. > --- > include/linux/page_frag_cache.h | 20 ++++++++++++-------- > include/linux/skbuff.h | 12 ++++++------ > mm/page_frag_cache.c | 9 ++++----- > net/core/skbuff.c | 12 +++++------- > net/rxrpc/txbuf.c | 5 +++-- > 5 files changed, 30 insertions(+), 28 deletions(-) > > diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_ca= che.h > index 04810d8d6a7d..cc0ede0912f3 100644 > --- a/include/linux/page_frag_cache.h > +++ b/include/linux/page_frag_cache.h > @@ -25,21 +25,25 @@ struct page_frag_cache { > > void page_frag_cache_drain(struct page_frag_cache *nc); > void __page_frag_cache_drain(struct page *page, unsigned int count); > -void *__page_frag_alloc_align(struct page_frag_cache *nc, unsigned int f= ragsz, > - gfp_t gfp_mask, unsigned int align_mask); > +void *page_frag_alloc(struct page_frag_cache *nc, unsigned int fragsz, > + gfp_t gfp_mask); > + > +static inline void *__page_frag_alloc_align(struct page_frag_cache *nc, > + unsigned int fragsz, gfp_t gf= p_mask, > + unsigned int align) > +{ > + nc->offset =3D ALIGN(nc->offset, align); > + > + return page_frag_alloc(nc, fragsz, gfp_mask); > +} > I would rather not have us breaking up the alignment into another function. It makes this much more difficult to work with. In addition you are adding offsets without actually adding to the pages which makes this seem exploitable. Basically just pass an alignment value of 32K and you are forcing a page eviction regardless. > static inline void *page_frag_alloc_align(struct page_frag_cache *nc, > unsigned int fragsz, gfp_t gfp_= mask, > unsigned int align) > { > WARN_ON_ONCE(!is_power_of_2(align)); > - return __page_frag_alloc_align(nc, fragsz, gfp_mask, -align); > -} > > -static inline void *page_frag_alloc(struct page_frag_cache *nc, > - unsigned int fragsz, gfp_t gfp_mask) > -{ > - return page_frag_alloc_align(nc, fragsz, gfp_mask, ~0u); > + return __page_frag_alloc_align(nc, fragsz, gfp_mask, align); > } > > void page_frag_free(void *addr); > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index f2dc1f735c79..43c704589deb 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -3268,7 +3268,7 @@ static inline void skb_queue_purge(struct sk_buff_h= ead *list) > unsigned int skb_rbtree_purge(struct rb_root *root); > void skb_errqueue_purge(struct sk_buff_head *list); > > -void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_= mask); > +void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align)= ; > > /** > * netdev_alloc_frag - allocate a page fragment > @@ -3279,14 +3279,14 @@ void *__netdev_alloc_frag_align(unsigned int frag= sz, unsigned int align_mask); > */ > static inline void *netdev_alloc_frag(unsigned int fragsz) > { > - return __netdev_alloc_frag_align(fragsz, ~0u); > + return __netdev_alloc_frag_align(fragsz, 1u); > } > > static inline void *netdev_alloc_frag_align(unsigned int fragsz, > unsigned int align) > { > WARN_ON_ONCE(!is_power_of_2(align)); > - return __netdev_alloc_frag_align(fragsz, -align); > + return __netdev_alloc_frag_align(fragsz, align); > } > > struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int = length, > @@ -3346,18 +3346,18 @@ static inline void skb_free_frag(void *addr) > page_frag_free(addr); > } > > -void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_ma= sk); > +void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align); > > static inline void *napi_alloc_frag(unsigned int fragsz) > { > - return __napi_alloc_frag_align(fragsz, ~0u); > + return __napi_alloc_frag_align(fragsz, 1u); > } > > static inline void *napi_alloc_frag_align(unsigned int fragsz, > unsigned int align) > { > WARN_ON_ONCE(!is_power_of_2(align)); > - return __napi_alloc_frag_align(fragsz, -align); > + return __napi_alloc_frag_align(fragsz, align); > } > > struct sk_buff *napi_alloc_skb(struct napi_struct *napi, unsigned int le= ngth); > diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c > index dc864ee09536..b4408187e1ab 100644 > --- a/mm/page_frag_cache.c > +++ b/mm/page_frag_cache.c > @@ -61,9 +61,8 @@ void __page_frag_cache_drain(struct page *page, unsigne= d int count) > } > EXPORT_SYMBOL(__page_frag_cache_drain); > > -void *__page_frag_alloc_align(struct page_frag_cache *nc, > - unsigned int fragsz, gfp_t gfp_mask, > - unsigned int align_mask) > +void *page_frag_alloc(struct page_frag_cache *nc, unsigned int fragsz, > + gfp_t gfp_mask) > { > unsigned int size, offset; > struct page *page; > @@ -92,7 +91,7 @@ void *__page_frag_alloc_align(struct page_frag_cache *n= c, > size =3D PAGE_SIZE; > #endif > > - offset =3D ALIGN(nc->offset, -align_mask); > + offset =3D nc->offset; > if (unlikely(offset + fragsz > size)) { > page =3D virt_to_page(nc->va); > > @@ -129,7 +128,7 @@ void *__page_frag_alloc_align(struct page_frag_cache = *nc, > > return nc->va + offset; > } > -EXPORT_SYMBOL(__page_frag_alloc_align); > +EXPORT_SYMBOL(page_frag_alloc); > > /* > * Frees a page fragment allocated out of either a compound or order 0 p= age. > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index ea052fa710d8..676e2d857f02 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -306,18 +306,17 @@ void napi_get_frags_check(struct napi_struct *napi) > local_bh_enable(); > } > > -void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_ma= sk) > +void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align) > { > struct napi_alloc_cache *nc =3D this_cpu_ptr(&napi_alloc_cache); > > fragsz =3D SKB_DATA_ALIGN(fragsz); > So this is a perfect example. This caller is aligning the size by SMP_CACHE_BYTES. This is the most typical case. Either this or L1_CACHE_BYTES. As such all requests should be aligned to at least that. I would prefer it if we didn't strip the alignment code out of our main allocating function. If anything, maybe we should make it more specific that the expectation is that fragsz is a multiple of the alignment. > - return __page_frag_alloc_align(&nc->page, fragsz, GFP_ATOMIC, > - align_mask); > + return __page_frag_alloc_align(&nc->page, fragsz, GFP_ATOMIC, ali= gn); > } > EXPORT_SYMBOL(__napi_alloc_frag_align); > > -void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_= mask) > +void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align) > { > void *data; > > @@ -325,15 +324,14 @@ void *__netdev_alloc_frag_align(unsigned int fragsz= , unsigned int align_mask) > if (in_hardirq() || irqs_disabled()) { > struct page_frag_cache *nc =3D this_cpu_ptr(&netdev_alloc= _cache); > > - data =3D __page_frag_alloc_align(nc, fragsz, GFP_ATOMIC, > - align_mask); > + data =3D __page_frag_alloc_align(nc, fragsz, GFP_ATOMIC, = align); > } else { > struct napi_alloc_cache *nc; > > local_bh_disable(); > nc =3D this_cpu_ptr(&napi_alloc_cache); > data =3D __page_frag_alloc_align(&nc->page, fragsz, GFP_A= TOMIC, > - align_mask); > + align); > local_bh_enable(); > } > return data; > diff --git a/net/rxrpc/txbuf.c b/net/rxrpc/txbuf.c > index e0679658d9de..eb640875bf07 100644 > --- a/net/rxrpc/txbuf.c > +++ b/net/rxrpc/txbuf.c > @@ -32,9 +32,10 @@ struct rxrpc_txbuf *rxrpc_alloc_data_txbuf(struct rxrp= c_call *call, size_t data_ > hoff =3D round_up(sizeof(*whdr), data_align) - sizeof(*wh= dr); > total =3D hoff + sizeof(*whdr) + data_size; > > + data_align =3D max_t(size_t, data_align, L1_CACHE_BYTES); > mutex_lock(&call->conn->tx_data_alloc_lock); > - buf =3D __page_frag_alloc_align(&call->conn->tx_data_alloc, total= , gfp, > - ~(data_align - 1) & ~(L1_CACHE_BYTE= S - 1)); > + buf =3D page_frag_alloc_align(&call->conn->tx_data_alloc, total, = gfp, > + data_align); > mutex_unlock(&call->conn->tx_data_alloc_lock); > if (!buf) { > kfree(txb); > -- > 2.33.0 >