From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75AD1C4829E for ; Mon, 12 Feb 2024 22:31:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0DF536B0095; Mon, 12 Feb 2024 17:31:12 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0907D6B00AF; Mon, 12 Feb 2024 17:31:12 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E73F16B00BD; Mon, 12 Feb 2024 17:31:11 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id D7C446B0095 for ; Mon, 12 Feb 2024 17:31:11 -0500 (EST) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id B166412044B for ; Mon, 12 Feb 2024 22:31:11 +0000 (UTC) X-FDA: 81784598742.04.79896EB Received: from mail-wm1-f43.google.com (mail-wm1-f43.google.com [209.85.128.43]) by imf30.hostedemail.com (Postfix) with ESMTP id E4BE58000E for ; Mon, 12 Feb 2024 22:31:09 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=K2iQI2bu; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf30.hostedemail.com: domain of lokeshgidra@google.com designates 209.85.128.43 as permitted sender) smtp.mailfrom=lokeshgidra@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1707777070; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=l9mO69PHSXpV4KszXXC4FO0K4MGQhMEHtnBuaX2b1wA=; b=L1GNFZM3GT5c6tAr2ED4nLY8d0zilf7AcsCCoz0aIZGtgB6DkYm3D8tjada41Mx5+nRVZl Y4E7hf2YWrh82Q4oSelLx2b1kWLNiLLFg4JJbSIJBsc+svAcM0lxyw8GqhAZPvUFtL2ao5 r3W5Pj7ji/3DyMxZOSlcHjcFrKoSyJY= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=K2iQI2bu; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf30.hostedemail.com: domain of lokeshgidra@google.com designates 209.85.128.43 as permitted sender) smtp.mailfrom=lokeshgidra@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1707777070; a=rsa-sha256; cv=none; b=XUxX887wgpZJR79In00sMMs6XuaJDWKig/yCoXUUR4Qne5+LLYvZ6aPpeiNRo354qokWju Z6pJor+l8+MYwffgYUpXwSnKv4napLvftQZA/OUUB/20NByzl6ojWEkLrqhAfyko/GeGpp 5Wm487IEJ3iaMRAokhsJeNiepFhaLkA= Received: by mail-wm1-f43.google.com with SMTP id 5b1f17b1804b1-410ca9e851bso11230695e9.3 for ; Mon, 12 Feb 2024 14:31:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1707777068; x=1708381868; darn=kvack.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=l9mO69PHSXpV4KszXXC4FO0K4MGQhMEHtnBuaX2b1wA=; b=K2iQI2buWg6n8e7L8jzX6DfMgDNPtgvwiqCYuVFsbKC1lFW8sWPUgvKXVEIpoSM6/B gKQgNHYoOC0WMJE5FOl1EvqmFLSQ7xMicrmn0sfuQtS0VZ+7kWpMOBwG5p1OAFxH/Fla O3sahz28eHt4IMQQ7UvYqBYNsND1N9WxtExPnh22KX96ibP67mlQz0Rc7W7xpCyx8Pml IgBLTPIAIXpuj5Coa+xvQwn00eckikUNer2kUU5rLgJfLGg2k3s4AuJkerI8Qyy2vfGB OWGmZlIHxcJ0IYnaEZaqoR6JhvODrjCqyuUQmRpwYWgwxINDZW4SE6VatIjIBkfc+Sy7 KvpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707777068; x=1708381868; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=l9mO69PHSXpV4KszXXC4FO0K4MGQhMEHtnBuaX2b1wA=; b=qVAOB57vk57CMV9ktNCtxzFb1Ra8pQ3tBiCILKyxgSM7GkZQ1QmK6nqYb1Vd+n9I1y bxwj2Vo3/07+HL6Q6/UBhiBUdNn/7C+OCpNAv3nFNC9o6UYqJX2Qh8RxK3hBeAghy9kU 4L8gWaKxxqaxKtow/QpkJgP8jGAcpb3GyHPImuLHewXAQSGdHBJiWW0b4woOflNIKp9i 6Eqjc/11CPmRSBOfhIrxPshWYXZzMhv1VWCLoSEkEcaccYHFms/OCH8eVLKutsGSvBYU CdE4GCL10MzE3vSbTM5Vx2FiE9jDOsX6E1BiRpzPMXZjnqLvLgc81UbMWTAt7cgpGwYv SjOg== X-Forwarded-Encrypted: i=1; AJvYcCUTB0RKE2/+QTbXIS+1/ff3WK00NeDLJKKA9YFOGwEj6xGpOCjnsuO6Lxmq8NFw8p0gC+m08rKoSl4Pqf6qsN3zZl4= X-Gm-Message-State: AOJu0YzBieeWIYjPvA7Lq6rbB8o7wlz07/F6q7AiOcM2+zqwJaXNmF6J kIsnDe9JCDJ8EiohFHV5rZ0f/QsHXeQOgUGNtsHUVmQ2oGreOG4kd8xLsKjKgJPeyF2XmQhE4Yu R+CM8VRtFZM8eiMip5KY80gb5yh7Bu3yP0x4T X-Google-Smtp-Source: AGHT+IFZvGus6qBaDKHVpEnYLxBVaPb/kTrQZBtuxYaA/mFQXqriiKZKEZmpVs+/oJ+VH2Y/q/TlTzvux7Yi7IHL9oc= X-Received: by 2002:adf:ed90:0:b0:33b:47d0:52ce with SMTP id c16-20020adfed90000000b0033b47d052cemr6385336wro.25.1707777068103; Mon, 12 Feb 2024 14:31:08 -0800 (PST) MIME-Version: 1.0 References: <20240208212204.2043140-1-lokeshgidra@google.com> <20240208212204.2043140-4-lokeshgidra@google.com> <20240209030654.lxh4krmxmiuszhab@revolver> <20240209190605.7gokzhg7afy7ibyf@revolver> <20240209193110.ltfdc6nolpoa2ccv@revolver> <20240212151959.vnpqzvpvztabxpiv@revolver> <20240212201134.fqys2zlixy4z565s@revolver> In-Reply-To: <20240212201134.fqys2zlixy4z565s@revolver> From: Lokesh Gidra Date: Mon, 12 Feb 2024 14:30:55 -0800 Message-ID: Subject: Re: [PATCH v4 3/3] userfaultfd: use per-vma locks in userfaultfd operations To: "Liam R. Howlett" , Lokesh Gidra , akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, selinux@vger.kernel.org, surenb@google.com, kernel-team@android.com, aarcange@redhat.com, peterx@redhat.com, david@redhat.com, axelrasmussen@google.com, bgeffon@google.com, willy@infradead.org, jannh@google.com, kaleshsingh@google.com, ngeoffray@google.com, timmurray@google.com, rppt@kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: E4BE58000E X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: yjwaxuka183dkmfhq4e6s8uy6u6dpcqh X-HE-Tag: 1707777069-100805 X-HE-Meta: U2FsdGVkX1/wOHNVxtwbtlFW3NDqmuq06e9OIitx7jl2/Huy11nlswPxbj6x8KfneNr5siRN/mPLJPOCliJZKdiDsCivHTZWZ3cCkF5YLDcxsJjtppRMKZ/6cQ/0nyN2drWAEQj3lxuoJAjq/zRmpc13IBbzxBJSy6C5Jaq6aZTCuIMDd8juMlM9SXxF01KZ/0D2b037/3+zj5CquVxWRGNPGKMcXVJI25oa79rlfTo6WOZkc9d8ZOoRXou58qb+rzJ0nZgwp2+1y1jn2dB08uxD5U91bZZE137QDONHECvkOM0LJ4e7LAmOomJvXqWbPCjVwbXeSIULWsT9MxRQsXTAW0ZNdxYa4h30eDIQfCninNOl5supZL/pPZ7oCppqnYz1z/aWc12MTEgmxTic2Irc491E398093LAWms5H8HEQDzqK8RnPekZDVW8BRFY6ivtaZmp3WUBrKsniD56nWFWjHzPRJ5n3GjynKsvs1fTPm3lxA/j8YbIff1KVNFijIg8WOY6irFq0ItLDDqZR9m7GY2rH6NREADeT3aKyqqkd+rENKnQTb1Cg5fiOUm9hkY8Hv+JILBx1yABqbxQtKPXOvMJpHs5a+bKn4weC8JB6QF7UXCAXhil2LiVUeDowugWLwSsz8yc++EyAO+3vvzblXcpjW3fUgrbhKrmyYO26mmxD5hDmrSzfpzOCXA7JA1mRDgObvl/rKPsjaQN6pXlkt4PD2TYo8CwUTCEgHcB74Yc9MHdhW3EHe7/HnkYSkqAfwQIX+Mr003rB2T+grvnOuuqzbwn9LpFhUebUYZSiirJ5JPg1KstXFnYPegdVyCK+a1tUBxsUVx7D0RqH6vTRo1qCPrkPeg7VHmTx8vGlejnkHbsA8LIA+VJsMCZSN7Xzj8ZJlXWEeZdkeoHtKdAxLIqt/J9/PYOlpt5vkhRajGlB0IpcOsHB9Nz7+sW24VGXY0wgwaGbfx01FE 662R7VA3 F1caoQ+YQyrFo/ul3aiXCEgYX4v0xmJW5uwhOrKR4PnWTvWYtPuBRLFSAzmpEWGvfIurml6iJbWhLapaRIzeSqb4xAGChJNREXVzYmfBUM6xsMvrkFOKnfQLTkOWf6vVRlPsYAVicgQ1V3SpQXEYb0nuQ4GRpLZKG6A4L9XwI7HhPJjg8usn0Y9dE67UesimxY+pDx8MY1IrbE2+vafxvth6/AW6C76dZYnSy61R4oPODhrMzo+K64SqmJAzbjHV2gycGfNlHx508sZQAK5hlvyd0OA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000777, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Feb 12, 2024 at 12:11=E2=80=AFPM Liam R. Howlett wrote: > > * Lokesh Gidra [240212 13:08]: > > On Mon, Feb 12, 2024 at 7:20=E2=80=AFAM Liam R. Howlett wrote: > ... > > > > > > > > > The current implementation has a deadlock problem: > ... > > > > On contention you will now abort vs block. > > > > Is it? On contention mmap_read_trylock() will fail and we do the whole > > operation using lock_mm_and_find_vmas() which blocks on mmap_lock. Am > > I missing something? > > You are right, I missed the taking of the lock in the function call. > > > > > > > > } > > > > return 0; > > > > } > > > > > > > > Of course this would need defining lock_mm_and_find_vmas() regardle= ss > > > > of CONFIG_PER_VMA_LOCK. I can also remove the prepare_anon conditio= n > > > > in lock_vma(). > > > > > > You are adding a lot of complexity for a relatively rare case, which = is > > > probably not worth optimising. > > > > ... > > > > > Agreed on reduced complexity. But as Suren pointed out in one of his > > replies that lock_vma_under_rcu() may fail due to seq overflow. That's > > why lock_vma() uses vma_lookup() followed by direct down_read() on > > vma-lock. > > I'd rather see another function that doesn't care about anon (I think > src is special that way?), and avoid splitting the locking across > functions as much as possible. > Fair point about not splitting locking across functions. > > > IMHO what we need here is exactly lock_mm_and_find_vmas() > > and the code can be further simplified as follows: > > > > err =3D lock_mm_and_find_vmas(...); > > if (!err) { > > down_read(dst_vma...); > > if (dst_vma !=3D src_vma) > > down_read(src_vma....); > > mmap_read_unlock(mm); > > } > > return err; > > If we exactly needed lock_mm_and_find_vmas(), there wouldn't be three > lock/unlock calls depending on the return code. > > The fact that lock_mm_and_find_vmas() returns with the mm locked or > unlocked depending on the return code is not reducing the complexity of > this code. > > You could use a widget that does something with dst, and a different > widget that does something with src (if they are different). The dst > widget can be used for the lock_vma(), and in the > lock_mm_and_find_vmas(), while the src one can be used in this and the > lock_mm_and_find_vmas(). Neither widget would touch the locks. This way > you can build your functions that have the locking and unlocking > co-located (except the obvious necessity of holding the mmap_read lock > for the !per-vma case). > I think I have managed to minimize the code duplication while not complicating locking/unlocking. I have added a find_vmas_mm_locked() handler which, as the name suggests, expects mmap_lock is held and finds the two vmas and ensures anon_vma for dst_vma is populated. I call this handler from lock_mm_and_find_vmas() and find_and_lock_vmas() in the fallback case. I have also introduced a handler for finding dst_vma and preparing its anon_vma, which is used in lock_vma() and find_vmas_mm_locked(). Sounds good? > I've also thought of how you can name the abstraction in the functions: > use a 'prepare() and complete()' to find/lock and unlock what you need. > Might be worth exploring? If we fail to 'prepare()' then we don't need > to 'complete()', which means there won't be mismatched locking hanging > around. Maybe it's too late to change to this sort of thing, but I > thought I'd mention it. > Nice suggestion! But after (fortunately) finding the function names that are self-explanatory, dropping them seems like going in the wrong direction. Please let me know if you think this is a missing piece. I am open to incorporating this. > Thanks, > Liam