From: Alexei Starovoitov
Date: Sun, 30 Mar 2025 14:30:15 -0700
Subject: Re: [GIT PULL] Introduce try_alloc_pages for 6.15
To: Linus Torvalds
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Andrew Morton, Peter Zijlstra, Vlastimil Babka, Sebastian Sewior, Steven Rostedt, Michal Hocko, Shakeel Butt, linux-mm, LKML, Johannes Weiner
References: <20250327145159.99799-1-alexei.starovoitov@gmail.com>

On Sun, Mar 30, 2025 at 1:42 PM Linus Torvalds wrote:
>
> On Thu, 27 Mar 2025 at 07:52, Alexei Starovoitov wrote:
> >
> > The pull includes work from Sebastian, Vlastimil and myself
> > with a lot of help from Michal and Shakeel.
> > This is a first step towards making kmalloc reentrant to get rid
> > of slab wrappers: bpf_mem_alloc, kretprobe's objpool, etc.
> > These patches make page allocator safe from any context.
>
> So I've pulled this too, since it looked generally fine.

Thanks!

> The one reaction I had is that when you basically change
>
>         spin_lock_irqsave(&zone->lock, flags);
>
> into
>
>         if (!spin_trylock_irqsave(&zone->lock, flags)) {
>                 if (unlikely(alloc_flags & ALLOC_TRYLOCK))
>                         return NULL;
>                 spin_lock_irqsave(&zone->lock, flags);
>         }
>
> we've seen bad cache behavior for this kind of pattern in other
> situations: if the "try" fails, the subsequent "do the lock for real"
> case now does the wrong thing, in that it will immediately try again
> even if it's almost certainly just going to fail - causing extra write
> cache accesses.
>
> So typically, in places that can see contention, it's better to either do
>
>  (a) trylock followed by a slowpath that takes the fact that it was
> locked into account and does a read-only loop until it sees otherwise
>
>      This is, for example, what the mutex code does with that
> __mutex_trylock() -> mutex_optimistic_spin() pattern, but our
> spinlocks end up doing similar things (ie "trylock" followed by
> "release irq and do the 'relax loop' thing).

Right, the __mutex_trylock(lock) -> mutex_optimistic_spin() pattern
is equivalent to 'pending' bit spinning in qspinlock.

> or
>
>  (b) do the trylock and lock separately, ie
>
>         if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
>                 if (!spin_trylock_irqsave(&zone->lock, flags))
>                         return NULL;
>         } else
>                 spin_lock_irqsave(&zone->lock, flags);
>
> so that you don't end up doing two cache accesses for ownership that
> can cause extra bouncing.

Ok, I will switch to the above.

> I'm not sure this matters at all in the allocation path - contention
> may simply not be enough of an issue, and the trylock is purely about
> "unlikely NMI worries", but I do worry that you might have made the
> normal case slower.

We actually did see zone->lock being contended in production.
Last time the culprit was inadequate per-cpu caching, and
these series in 6.11 fixed it:
https://lwn.net/Articles/947900/
I don't think we've seen it contended in the newer kernels.

Johannes, pls correct me if I'm wrong.

But to avoid finger-pointing, I'll switch to checking alloc_flags
first. It does seem a better trade-off to avoid the cache bouncing
caused by the 2nd cmpxchg. Though when I wrote it this way I convinced
myself and others that it's faster to do the trylock first to avoid
a branch misprediction.
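
For illustration only, here is a userspace sketch of options (a) and (b) from the quoted text, written with C11 atomics. Everything named demo_* or DEMO_* is a hypothetical stand-in; this is not the kernel's spinlock API, not the real ALLOC_TRYLOCK value, and not the actual patch. It assumes a simple test-and-set lock, which is much cruder than qspinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag, standing in for ALLOC_TRYLOCK. */
#define DEMO_ALLOC_TRYLOCK 0x1u

/* Hypothetical test-and-set lock, standing in for zone->lock. */
struct demo_lock {
	atomic_bool locked;
};

/*
 * One atomic read-modify-write. Even when it fails, it asks for the
 * cache line in exclusive state, which is what causes the bouncing.
 */
static bool demo_trylock(struct demo_lock *l)
{
	return !atomic_exchange_explicit(&l->locked, true,
					 memory_order_acquire);
}

/*
 * Option (a): wait with plain loads while the lock is held, and only
 * retry the read-modify-write once the lock looks free.
 */
static void demo_lock(struct demo_lock *l)
{
	do {
		while (atomic_load_explicit(&l->locked,
					    memory_order_relaxed))
			; /* read-only wait, no ownership request */
	} while (!demo_trylock(l));
}

static void demo_unlock(struct demo_lock *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

/*
 * Option (b): check the flag first, so each caller issues either the
 * trylock or the blocking lock, never a failed trylock followed
 * immediately by a second attempt.
 * Returns false only when the trylock path found the lock held.
 */
static bool demo_zone_lock(struct demo_lock *l, unsigned int alloc_flags)
{
	if (alloc_flags & DEMO_ALLOC_TRYLOCK)
		return demo_trylock(l);
	demo_lock(l);
	return true;
}
```

A caller that passes DEMO_ALLOC_TRYLOCK must handle a false return (the equivalent of "return NULL" in the quoted snippet); all other callers block until the lock is acquired.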