From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2A5FEE49AA for ; Mon, 21 Aug 2023 01:13:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 64B786B0078; Sun, 20 Aug 2023 21:13:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5D4386B007B; Sun, 20 Aug 2023 21:13:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 44DCC8D0001; Sun, 20 Aug 2023 21:13:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 31D196B0078 for ; Sun, 20 Aug 2023 21:13:13 -0400 (EDT) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id E3D3880305 for ; Mon, 21 Aug 2023 01:13:12 +0000 (UTC) X-FDA: 81146338224.11.BBAD169 Received: from mail-ej1-f46.google.com (mail-ej1-f46.google.com [209.85.218.46]) by imf10.hostedemail.com (Postfix) with ESMTP id 1B183C0011 for ; Mon, 21 Aug 2023 01:13:09 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b="YAs6/WLa"; spf=pass (imf10.hostedemail.com: domain of mjguzik@gmail.com designates 209.85.218.46 as permitted sender) smtp.mailfrom=mjguzik@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1692580390; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=cd4uReI0FD9gttAYzEGkqMidc3xQd6d6Ip+eVtIwMWE=; b=bfP4mGFE/GMLHKXWsR3FNKTuf16gVu3fzQTmC+lYpzaGXNxIHQzD2LDbr/snhUsH0Vodn1 OVzwUoTtCZ3cI5lEVjNQP6a9enC3/J0jGNJkCilvBr+ZqKQwGkEjnN/LBtgVr0/CqdBW+1 1C0dbrDGxTiRmvDrWOxq+8vRBWojbEE= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1692580390; a=rsa-sha256; cv=none; b=niZT8QnufD7MYwsQAz+cxn2NCWANeXVOptrZQl2dYCtRmjBv5P42b2K64jK90SqwsIpuS6 KwvNkFEje6j1Jnc44Q2JnFar3WXQ5GQTt9p6ZoxSoO4K6mFQC7Uv/5FCePStrnyHQbXwWl 0y0TxVM6meU32pkx9fzecJbJc5mWWck= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b="YAs6/WLa"; spf=pass (imf10.hostedemail.com: domain of mjguzik@gmail.com designates 209.85.218.46 as permitted sender) smtp.mailfrom=mjguzik@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-ej1-f46.google.com with SMTP id a640c23a62f3a-98377c5d53eso352067666b.0 for ; Sun, 20 Aug 2023 18:13:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1692580388; x=1693185188; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=cd4uReI0FD9gttAYzEGkqMidc3xQd6d6Ip+eVtIwMWE=; b=YAs6/WLahY1fK7C+kFryxufJ7p1lXXG9emRHtgrHbqhz+6Uhqb31y8vBj3tUkeBrwS YWMUEZ9EVkgL/Npzh041mo8Ep6mfy4+hHR8ibQtlJbdM7/u3YAMp2F2PRIM/dCxkUgKD asRb56n/O1ncFgTGS5rTkGWvuTNsv3IW5LVw0vGM5KpGE77PSIWQleg3mZYt4F+5ndla EfH6x68P4ndfVwtAgMm2qzAHBHYStEEMO2H7hI4qdfBVTpqKrlVm7BXt/kerZP+IylhM Gg+v+8pvan1W1PC/4Cro+7Ht+sZdNdtQYW5eRGoONJHLn4sZrDgjD+QijWeLdtXro2HS 3WRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692580388; x=1693185188; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=cd4uReI0FD9gttAYzEGkqMidc3xQd6d6Ip+eVtIwMWE=; b=dvf38BYbXwW1gLL8FoFbjG7U3061NMitEGyLGd0ESmfUjYDXprqqC4I27XPUZlKyP8 EQo8zxby2hZaIRPE3SmD8/OoNj7lTdjwCws7gdCiV/hnh6BW6XlilQcuuL5XLAwndJiw ZHQt5cef/1LrgzkMweNlCrhW/urq/GL1jgF6hYWSEDV6giEZ0McphemI723K6g34LUjA eAW/S1G9HoRCbUo3Ic94lHXs6cDDZ27Pm8xXnGRA/HKPqRwBA7QxejAn0wJbfFccJYPv Ue9EgkwB/Jz9/EGkZZLNflSi0prvS6E29qp1qzTMMvkYyHwX9s7wHJVuFYm9o1sjiunP nUHA== X-Gm-Message-State: AOJu0YwVupy0N5ul5xcNkkeaB6PgRFEzlmpCUAUs3wpOYFRjp5LFasgd WKTTGxA4HjPHvOCLS4Ec/JQ= X-Google-Smtp-Source: AGHT+IEpYpor4uwMDHfscJ7+Vym89dWDF7RQQ5EHRPDBXLLu5UczDDC6HHUH4oQIABilPua/y0S+fw== X-Received: by 2002:a17:907:2bca:b0:99d:f778:b8e5 with SMTP id gv10-20020a1709072bca00b0099df778b8e5mr4044845ejc.28.1692580388274; Sun, 20 Aug 2023 18:13:08 -0700 (PDT) Received: from f (cst-prg-27-89.cust.vodafone.cz. [46.135.27.89]) by smtp.gmail.com with ESMTPSA id h4-20020a1709060f4400b00988f168811bsm5595393ejj.135.2023.08.20.18.13.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 20 Aug 2023 18:13:07 -0700 (PDT) Date: Mon, 21 Aug 2023 03:13:03 +0200 From: Mateusz Guzik To: Matthew Wilcox Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH] mm: remove unintentional voluntary preemption in get_mmap_lock_carefully Message-ID: <20230821011303.hoeqjbmjaxajh255@f> References: <20230820104303.2083444-1-mjguzik@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: X-Stat-Signature: s4kqd5h8oabekao93aep45s3fek7tfxh X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 1B183C0011 X-Rspam-User: X-HE-Tag: 1692580389-717906 X-HE-Meta: U2FsdGVkX1/dNs4x5GhAA1S2b/MyEBfScJoK56OMzwRFxGMScIRXZYrkerYH1BoZqgURyeTTD/oj2jz1aIGPa97YkYmSsEPlnbu7ObsDRT2BKKRB5XX1bXwlwTfRH0UU9vOS+3sqCkNnYnb1QlqdTYdsxnwttnfKmEzXLga5pAO8nwNUg2lp+ffJeWAE/6+VxBTlniK+SBcf+gTufOR9xGV8cTTDfm3pb69JNMiLtnpkS5MQpuK79iX2uhAAMC7ge5L4IdA2liGatqtGC0Irp/N2tsu5d0s1VWJQT4kNzZceAJNZfkNRVE52Xqljj6MyV6mbF5a6dzRiH8p9Qyi6R6G4BHPZQ+mqodhc8DevUCLx2HvEQY1PTgCfQUHhvHwZg1oA8Nx74Kn3+m5DR2mAKylyGMGkqlEvpMXi8lPW5mtCN3B2TGb7wUx8Q99K+JEAju3WlUXei5zHvJ5hzq/Hc26xmrZ7eIuyZB4XYtRo95aozaMF4NmKCJCpKT0jvMJCZoigdJRTnY8mHS+YaJEzHqx/YGsOhD2ssjKcaW6cPfIMN8oGCJONfRMETyyEibHOpDikhJtSZFv5e+NvRyqcwkzzoCDB0gIzkm6v65NxrpOD9sr2MtksQB3QFiQrDI3gMPzHyVRdC0OpHTjzDXfWFXqfohMdtSlivBnyfOJHiQVxOi2xWa6xK3QzWnvRClmToCG3L4cMUoREsgi9fM0iSXGhiytZ4pBKo4SO15n+n5/SvrDEkSqtlUTjZ5ZD1/SU+ccrEcWvRBlrJx8qg04siAJWHBYC/5fq2UtrwkVYVSM1UxNzCdnY3jER3JrpGQBQbMjzXYkBxHq5cAkVUR0ABON/TDpcy4CR85swUg0AO/3py86WqvOiKiIdou9CTnmQadBr2F/ZbianL2OUqZKOhuo2pvOvWDroN5uACgQO608JUlbtCRg5tkz0RbPTxsJ2WoQp2+sv8rtPq5Eir6R ZZsR0C5b XsqqT8/M8wNf9Qndn7plbrXZ0/A981iu/+h/K8wlCPMjtmNnG9hg444aBKIGzgNmKJgqg4g46Y19+hOT4AbrFClmVBDaowTCryVRqnyqRQmxP3YcR8dkPPKa3U2pyOQyq4pK+86ip1y4cB4bBdZScUv61H2JjlSkoxNRnRJVesxoJoDEokxFW4imdQQo+3ksnzouvcxuHpJAvVrQysaEbCfYLuV76TMrfUcq+pWV4RgkKQZusLxAV2ZCa/6JdDsEAiJfLVc2wrjnw7w+j4gnW59zPa1vYDLVAPaiXbEuw8IRAaqh9jw1YPE+IrRJ5bVmf27prqQLivsTgPGtno7Ikbach7Vvr8GEVJh5RBpTBnvR/Z6fE3o2Yz7ChBQeIYXsBWC3vd53JG2/t2IVlYk7jCHSTBn4/iNX3BFFeYVXb67UmcmOhYwtfRP484o3hhlSWXd1yE3DVtLEWKcxzBanDtnO2sHo2Nb/eLKI3h6eIOvbeHc7EceVqrNvaQw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000041, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Sun, Aug 20, 2023 at 07:12:16PM +0100, Matthew Wilcox wrote: > On Sun, Aug 20, 2023 at 12:43:03PM +0200, Mateusz Guzik wrote: > > Found by checking off-CPU time during kernel build (like so: > > "offcputime-bpfcc -Ku"), sample backtrace: > > finish_task_switch.isra.0 > > __schedule > > __cond_resched > > lock_mm_and_find_vma > > do_user_addr_fault > > exc_page_fault > > asm_exc_page_fault > > - sh (4502) > > Now I'm awake, this backtrace really surprises me. Do we not check > need_resched on entry? It seems terribly unlikely that need_resched > gets set between entry and getting to this point, so I guess we must > not. > > I suggest the version of the patch which puts might_sleep() before the > mmap_read_trylock() is the right one to apply. It's basically what > we've done forever, except that now we'll be rescheduling without the > mmap lock held, which just seems like an overall win. > I can't sleep and your response made me curious, is that really safe here? As I wrote in another email, the routine is concerned with a case of the kernel faulting on something it should not have. For a case like that I find rescheduling to another thread to be most concerning. That said I think I found a winner -- add need_resched() prior to trylock. This adds less work than you would have added with might_sleep (a func call), still respects the preemption point, dodges exception table checks in the common case and does not switch away if the there is anything fishy going on. Or just do that might_sleep. I'm really buggering off the subject now. ==== mm: remove unintentional voluntary preemption in get_mmap_lock_carefully Should the trylock succeed (and thus blocking was avoided), the routine wants to ensure blocking was still legal to do. However, might_sleep() used ends up calling __cond_resched() injecting a voluntary preemption point with the freshly acquired lock. __might_sleep() instead with the lock, but check for preemption prior to taking it. Found by checking off-CPU time during kernel build (like so: "offcputime-bpfcc -Ku"), sample backtrace: finish_task_switch.isra.0 __schedule __cond_resched lock_mm_and_find_vma do_user_addr_fault exc_page_fault asm_exc_page_fault - sh (4502) 10 Signed-off-by: Mateusz Guzik --- mm/memory.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 1ec1ef3418bf..6dac9dbb7b59 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5258,8 +5258,8 @@ EXPORT_SYMBOL_GPL(handle_mm_fault); static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs) { /* Even if this succeeds, make it clear we *might* have slept */ - if (likely(mmap_read_trylock(mm))) { - might_sleep(); + if (likely(!need_resched() && mmap_read_trylock(mm))) { + __might_sleep(__FILE__, __LINE__); return true; } -- 2.39.2