Date: Fri, 8 Nov 2024 10:26:03 +0200
From: "Kirill A. Shutemov"
To: Lorenzo Stoakes
Cc: Andrew Morton, Jonathan Corbet, "Liam R . Howlett", Vlastimil Babka,
 Jann Horn, Alice Ryhl, Boqun Feng, Matthew Wilcox, Mike Rapoport,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, Suren Baghdasaryan, Hillf Danton, Qi Zheng,
 SeongJae Park
Subject: Re: [PATCH] docs/mm: add VMA locks documentation
References: <20241107190137.58000-1-lorenzo.stoakes@oracle.com>
In-Reply-To: <20241107190137.58000-1-lorenzo.stoakes@oracle.com>

On Thu, Nov 07, 2024 at 07:01:37PM +0000, Lorenzo Stoakes wrote:
> +.. table:: Config-specific fields
> +
> +   ================================= ===================== ======================================== ===============
> +   Field                             Configuration option  Description                              Write lock
> +   ================================= ===================== ======================================== ===============
> +   :c:member:`!anon_name`            CONFIG_ANON_VMA_NAME  A field for storing a                    mmap write,
> +                                                           :c:struct:`!struct anon_vma_name`        VMA write.
> +                                                           object providing a name for anonymous
> +                                                           mappings, or :c:macro:`!NULL` if none
> +                                                           is set or the VMA is file-backed.
> +   :c:member:`!swap_readahead_info`  CONFIG_SWAP           Metadata used by the swap mechanism      mmap read.
> +                                                           to perform readahead.

It is not clear how writes to the field are serialized by a shared lock.
It is worth noting that the field is atomic.

> +   :c:member:`!vm_policy`            CONFIG_NUMA           :c:type:`!mempolicy` object which        mmap write,
> +                                                           describes the NUMA behaviour of the      VMA write.
> +                                                           VMA.
> +   :c:member:`!numab_state`          CONFIG_NUMA_BALANCING :c:type:`!vma_numab_state` object which  mmap read.
> +                                                           describes the current state of
> +                                                           NUMA balancing in relation to this VMA.
> +                                                           Updated under mmap read lock by
> +                                                           :c:func:`!task_numa_work`.

Again, a shared lock serializing writes makes zero sense. There is some
other mechanism in play. I believe there is some kind of scheduler logic
that excludes parallel updates for the same process, but I cannot say I
understand it.

> +   :c:member:`!vm_userfaultfd_ctx`   CONFIG_USERFAULTFD    Userfaultfd context wrapper object of    mmap write,
> +                                                           type :c:type:`!vm_userfaultfd_ctx`,      VMA write.
> +                                                           either of zero size if userfaultfd is
> +                                                           disabled, or containing a pointer
> +                                                           to an underlying
> +                                                           :c:type:`!userfaultfd_ctx` object which
> +                                                           describes userfaultfd metadata.
> +   ================================= ===================== ======================================== ===============

...

> +Lock ordering
> +-------------
> +
> +As we have multiple locks across the kernel which may or may not be taken at the
> +same time as explicit mm or VMA locks, we have to be wary of lock inversion, and
> +the **order** in which locks are acquired and released becomes very important.
> +
> +.. note:: Lock inversion occurs when two threads need to acquire multiple locks,
> +   but in doing so inadvertently cause a mutual deadlock.
> +
> +   For example, consider thread 1 which holds lock A and tries to acquire lock B,
> +   while thread 2 holds lock B and tries to acquire lock A.
> +
> +   Both threads are now deadlocked on each other. However, had they attempted to
> +   acquire locks in the same order, one would have waited for the other to
> +   complete its work and no deadlock would have occurred.
> +
> +The opening comment in `mm/rmap.c` describes in detail the required ordering of
> +locks within memory management code:
> +
> +.. code-block::
> +
> +  inode->i_rwsem (while writing or truncating, not reading or faulting)
> +    mm->mmap_lock
> +      mapping->invalidate_lock (in filemap_fault)
> +        folio_lock
> +          hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share, see hugetlbfs below)
> +            vma_start_write
> +              mapping->i_mmap_rwsem
> +                anon_vma->rwsem
> +                  mm->page_table_lock or pte_lock
> +                    swap_lock (in swap_duplicate, swap_info_get)
> +                      mmlist_lock (in mmput, drain_mmlist and others)
> +                      mapping->private_lock (in block_dirty_folio)
> +                          i_pages lock (widely used)
> +                            lruvec->lru_lock (in folio_lruvec_lock_irq)
> +                      inode->i_lock (in set_page_dirty's __mark_inode_dirty)
> +                      bdi.wb->list_lock (in set_page_dirty's __mark_inode_dirty)
> +                        sb_lock (within inode_lock in fs/fs-writeback.c)
> +                        i_pages lock (widely used, in set_page_dirty,
> +                                  in arch-dependent flush_dcache_mmap_lock,
> +                                  within bdi.wb->list_lock in __sync_single_inode)
> +
> +Please check the current state of this comment which may have changed since the
> +time of writing of this document.

I think we need one canonical place for this information. Maybe it is
worth moving it here from rmap.c? There is more lock ordering info in
filemap.c.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
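
For illustration only, the point of the quoted note (a fixed acquisition
order rules out the A/B versus B/A deadlock) can be sketched in userspace
C with pthreads. This is not kernel code: lock_a, lock_b, worker() and
run_ordered_workers() are names made up for this example, and the mutexes
merely stand in for "lock A" and "lock B" from the note.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical locks standing in for "lock A" and "lock B". */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter;

/*
 * Every thread takes A before B. No thread can hold B while waiting
 * for A, so the inversion described in the note cannot happen.
 */
static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		pthread_mutex_lock(&lock_a);
		pthread_mutex_lock(&lock_b);
		shared_counter++;
		pthread_mutex_unlock(&lock_b);
		pthread_mutex_unlock(&lock_a);
	}
	return NULL;
}

/* Run two workers to completion and return the final counter value. */
static int run_ordered_workers(void)
{
	pthread_t t1, t2;
	int rc;

	shared_counter = 0;
	rc = pthread_create(&t1, NULL, worker, NULL);
	assert(rc == 0);
	rc = pthread_create(&t2, NULL, worker, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return shared_counter;
}
```

If one of the two threads instead took lock_b first, each could end up
holding one lock while waiting for the other, which is exactly the
deadlock the note describes.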