From: Suren Baghdasaryan
Date: Tue, 17 Jan 2023 13:21:47 -0800
Subject: Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it
To: Michal Hocko
Cc: akpm@linux-foundation.org, michel@lespinasse.org, jglisse@google.com,
 vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net,
 dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
 peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
 paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com,
 peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com,
 bigeasy@linutronix.de, kent.overstreet@linux.dev,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com,
 rientjes@google.com, axelrasmussen@google.com, joelaf@google.com,
 minchan@google.com, jannh@google.com, shakeelb@google.com,
 tatashin@google.com, edumazet@google.com, gthelen@google.com,
 gurua@google.com, arjunroy@google.com, soheil@google.com,
 hughlynch@google.com, leewalsh@google.com, posk@google.com,
 linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, x86@kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com
References: <20230109205336.3665937-1-surenb@google.com>
 <20230109205336.3665937-13-surenb@google.com>

On Tue, Jan 17, 2023 at 7:12 AM Michal Hocko wrote:
>
> On Tue 17-01-23 16:04:26, Michal Hocko wrote:
> > On Mon 09-01-23 12:53:07, Suren Baghdasaryan wrote:
> > > Introduce a per-VMA rw_semaphore to be used during page fault handling
> > > instead of mmap_lock. Because there are cases when multiple VMAs need
> > > to be exclusively locked during VMA tree modifications, instead of the
> > > usual lock/unlock pattern we mark a VMA as locked by taking per-VMA lock
> > > exclusively and setting vma->lock_seq to the current mm->lock_seq. When
> > > mmap_write_lock holder is done with all modifications and drops mmap_lock,
> > > it will increment mm->lock_seq, effectively unlocking all VMAs marked as
> > > locked.
> >
> > I have to say I was struggling a bit with the above and only understood
> > what you mean by reading the patch several times.
> > I would phrase it like
> > this (feel free to use if you consider this to be an improvement).
> >
> > Introduce a per-VMA rw_semaphore. The lock implementation relies on
> > per-vma and per-mm sequence counters to note exclusive locking:
> >         - read lock - (implemented by vma_read_trylock) requires the
> >           vma (vm_lock_seq) and mm (mm_lock_seq) sequence counters to
> >           differ. If they match then there must be a vma exclusive lock
> >           held somewhere.
> >         - read unlock - (implemented by vma_read_unlock) is a trivial
> >           vma->lock unlock.
> >         - write lock - (vma_write_lock) requires the mmap_lock to be
> >           held exclusively and the current mm counter is noted to the vma
> >           side. This will allow multiple vmas to be locked under a single
> >           mmap_lock write lock (e.g. during vma merging). The vma counter
> >           is modified under exclusive vma lock.
>
> Didn't realize one more thing.
>             Unlike standard write lock this implementation allows to be
>             called multiple times under a single mmap_lock. In a sense
>             it is more of mark_vma_potentially_modified than a lock.

In the RFC it was originally called vma_mark_locked(); renames were
discussed in the email thread ending here:
https://lore.kernel.org/all/621612d7-c537-3971-9520-a3dec7b43cb4@suse.cz/
If other names are preferable, I'm open to changing them.

>
> >         - write unlock - (vma_write_unlock_mm) is a batch release of all
> >           vma locks held. It doesn't pair with a specific
> >           vma_write_lock! It is done before exclusive mmap_lock is
> >           released by incrementing mm sequence counter (mm_lock_seq).
> >         - write downgrade - if the mmap_lock is downgraded to the read
> >           lock all vma write locks are released as well (effectively
> >           same as write unlock).
> --
> Michal Hocko
> SUSE Labs
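
For anyone following the thread, the semantics described in the quoted
text can be modelled in user space roughly as below. This is only a
single-threaded sketch and not the code from the patch: struct mm_model,
struct vma_model, the two *_model_init() helpers, the "readers" counter
standing in for the rw_semaphore, the boolean standing in for mmap_lock,
and the initial counter values are all assumptions made for illustration.
The function names are borrowed from the description above. A real
rw_semaphore would make the writer sleep until readers drain; the model
simply reports failure in that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel data structures. */
struct mm_model {
	bool mmap_write_locked;	/* stands in for mmap_lock held exclusively */
	int mm_lock_seq;	/* per-mm sequence counter */
};

struct vma_model {
	struct mm_model *mm;
	int vm_lock_seq;	/* per-vma sequence counter */
	int readers;		/* stands in for rw_semaphore read holders */
};

void mm_model_init(struct mm_model *mm)
{
	mm->mmap_write_locked = false;
	mm->mm_lock_seq = 0;
}

void vma_model_init(struct vma_model *vma, struct mm_model *mm)
{
	vma->mm = mm;
	vma->vm_lock_seq = -1;	/* assumed: differs from mm_lock_seq at start */
	vma->readers = 0;
}

void mmap_write_lock(struct mm_model *mm)
{
	mm->mmap_write_locked = true;
}

/* read lock: only succeeds while the two counters differ */
bool vma_read_trylock(struct vma_model *vma)
{
	if (vma->vm_lock_seq == vma->mm->mm_lock_seq)
		return false;	/* vma is marked as write-locked */
	vma->readers++;
	return true;
}

/* read unlock: trivial release of the per-vma lock */
void vma_read_unlock(struct vma_model *vma)
{
	assert(vma->readers > 0);
	vma->readers--;
}

/* write lock: needs mmap_lock held exclusively, may be called repeatedly */
bool vma_write_lock(struct vma_model *vma)
{
	struct mm_model *mm = vma->mm;

	if (!mm->mmap_write_locked)
		return false;
	if (vma->vm_lock_seq == mm->mm_lock_seq)
		return true;	/* already marked under this mmap_lock hold */
	if (vma->readers > 0)
		return false;	/* a real rw_semaphore would block here */
	vma->vm_lock_seq = mm->mm_lock_seq;	/* note mm counter on the vma side */
	return true;
}

/* write unlock: batch release of every vma marked under this hold */
void vma_write_unlock_mm(struct mm_model *mm)
{
	mm->mm_lock_seq++;
}

void mmap_write_unlock(struct mm_model *mm)
{
	vma_write_unlock_mm(mm);
	mm->mmap_write_locked = false;
}

/* write downgrade: releases all vma write locks, same as write unlock */
void mmap_write_downgrade(struct mm_model *mm)
{
	vma_write_unlock_mm(mm);
	mm->mmap_write_locked = false;	/* now held for read only */
}
```

In this model, marking two vmas under one mmap_write_lock() and then
calling mmap_write_unlock() once makes vma_read_trylock() succeed on both
again. That is the "batch release" behaviour and also why the write side
behaves more like a mark than a conventional paired lock.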