From: Suren Baghdasaryan
Date: Thu, 12 Dec 2024 06:17:44 -0800
Subject: Re: [PATCH 3/4] mm: replace rw_semaphore with atomic_t in vma_lock
To: Peter Zijlstra
Cc: Matthew Wilcox, akpm@linux-foundation.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com
In-Reply-To: <20241212091659.GU21636@noisy.programming.kicks-ass.net>
References: <20241111205506.3404479-1-surenb@google.com> <20241111205506.3404479-4-surenb@google.com> <20241210223850.GA2484@noisy.programming.kicks-ass.net> <20241211082541.GQ21636@noisy.programming.kicks-ass.net> <20241212091659.GU21636@noisy.programming.kicks-ass.net>

On Thu, Dec 12, 2024 at 1:17 AM Peter Zijlstra wrote:
>
> On Wed, Dec 11, 2024 at 07:01:16PM -0800, Suren Baghdasaryan wrote:
>
> > > > > I think your proposal should work. Let me try to code it and see if
> > > > > something breaks.
> >
> > Ok, I tried it out and things are a bit more complex:
> > 1. We should allow write-locking a detached VMA, IOW vma_start_write()
> > can be called when vm_refcnt is 0.
>
> This sounds dodgy, refcnt being zero basically means the object is dead
> and you shouldn't be touching it no more. Where does this happen and
> why?
>
> Notably, it being 0 means it is no longer in the mas tree and can't be
> found anymore.

It happens when a newly created vma that was not yet attached
(vma->vm_refcnt = 0) is write-locked before being added into the vma
tree. For example:

mmap()
  mmap_write_lock()
  vma = vm_area_alloc() // vma->vm_refcnt = 0 (detached)
  // vma attributes are initialized
  vma_start_write() // write 0x8000'0001 into vma->vm_refcnt
  mas_store_gfp()
  vma_mark_attached()
  mmap_write_unlock() // vma_end_write_all()

In this scenario, we write-lock the VMA before adding it into the tree
to prevent readers (pagefaults) from using it until we drop the
mmap_write_lock(). In your proposal, the first thing vma_start_write()
does is add(0x8000'0001) and that will trigger a warning.
For now, instead of add(0x8000'0001), I can play this game to avoid the
warning:

    if (refcount_inc_not_zero(&vma->vm_refcnt))
        refcount_add(0x80000000, &vma->vm_refcnt);
    else
        refcount_set(&vma->vm_refcnt, 0x80000001);

This refcount_set() works because a vma with vm_refcnt==0 could not be
found by readers. I'm not sure this will still work when we add
TYPESAFE_BY_RCU and introduce the possibility of vma reuse.

>
> > 2. Adding 0x80000000 saturates refcnt, so I have to use a lower bit
> > 0x40000000 to denote writers.
>
> I'm confused, what? We're talking about atomic_t, right?

I thought you suggested using refcount_t.
According to
https://elixir.bootlin.com/linux/v6.13-rc2/source/include/linux/refcount.h#L22
valid values would be [0..0x7fff_ffff] and 0x80000000 is outside of
that range. What am I missing?

>
> > 3. Currently vma_mark_attached() can be called on an already attached
> > VMA. With vma->detached being a separate attribute that works fine but
> > when we combine it with the vm_lock things break (extra attach would
> > leak into lock count). I'll see if I can catch all the cases when we
> > do this and clean them up (not call vma_mark_attached() when not
> > necessary).
>
> Right, I hadn't looked at that thing in detail, that sounds like it
> needs a wee cleanup like you suggest.

Yes, I'll embark on that today. Will see how much of a problem that is.
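
The vma_start_write() workaround and the range question above can be modeled in userspace. What follows is only a minimal sketch built on C11 atomics, not the kernel's refcount_t: every model_* name and MODEL_WRITER_BIT are hypothetical, and the assumption that an attached VMA with no readers holds a count of 1 is mine.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_WRITER_BIT 0x80000000u /* writer flag discussed in the thread */

struct model_vma {
    atomic_uint vm_refcnt; /* 0 means detached in this model */
};

/* Increment unless the count is zero, in the spirit of refcount_inc_not_zero(). */
static bool model_inc_not_zero(struct model_vma *vma)
{
    unsigned int old = atomic_load(&vma->vm_refcnt);

    do {
        if (old == 0)
            return false; /* detached: leave the count untouched */
    } while (!atomic_compare_exchange_weak(&vma->vm_refcnt, &old, old + 1));
    return true;
}

/* The two-branch write-lock from the message: add on a live count, set on zero. */
static void model_start_write(struct model_vma *vma)
{
    if (model_inc_not_zero(vma))
        atomic_fetch_add(&vma->vm_refcnt, MODEL_WRITER_BIT);
    else
        atomic_store(&vma->vm_refcnt, MODEL_WRITER_BIT + 1);
}

/* True when v lies in [0..0x7fff_ffff], the valid range cited in the message. */
static bool model_in_valid_range(unsigned int v)
{
    return v <= (unsigned int)INT_MAX;
}

/* Walks through both cases; call it from a test harness or from main(). */
static void model_demo(void)
{
    struct model_vma fresh, mapped;

    atomic_init(&fresh.vm_refcnt, 0);  /* newly allocated, detached */
    atomic_init(&mapped.vm_refcnt, 1); /* assumed: attached, no readers */

    model_start_write(&fresh);
    model_start_write(&mapped);

    assert(atomic_load(&fresh.vm_refcnt) == MODEL_WRITER_BIT + 1);
    assert(atomic_load(&mapped.vm_refcnt) == MODEL_WRITER_BIT + 2);
    /* The writer bit alone already falls outside the cited valid range. */
    assert(!model_in_valid_range(MODEL_WRITER_BIT));
}
```

The plain store in the else branch is safe in this model only because nothing else can reach a count of zero concurrently. That is the same property the message relies on, and the one that may stop holding once TYPESAFE_BY_RCU allows vma reuse.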