From: Mateusz Guzik <mjguzik@gmail.com>
Date: Sun, 20 Aug 2023 14:41:01 +0200
Subject: Re: [PATCH] mm: remove unintentional voluntary preemption in get_mmap_lock_carefully
To: Matthew Wilcox
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20230820104303.2083444-1-mjguzik@gmail.com>

On 8/20/23, Matthew Wilcox wrote:
> On Sun, Aug 20, 2023 at 12:43:03PM +0200, Mateusz Guzik wrote:
>> Should the trylock succeed (and thus blocking was avoided), the routine
>> wants to ensure blocking was still legal to do. However, the method
>> used ends up calling __cond_resched, injecting a voluntary preemption
>> point with the freshly acquired lock.
>>
>> One can hack around it using __might_sleep instead of mere might_sleep,
>> but since threads keep going off CPU here, I figured it is better to
>> accommodate it.
>
> Except now we search the exception tables every time we call it.
> The now-deleted comment (c2508ec5a58d) suggests this is slow:
>

I completely agree it is a rather unfortunate side effect.

> -	/*
> -	 * Kernel-mode access to the user address space should only occur
> -	 * on well-defined single instructions listed in the exception
> -	 * tables. But, an erroneous kernel fault occurring outside one of
> -	 * those areas which also holds mmap_lock might deadlock attempting
> -	 * to validate the fault against the address space.
> -	 *
> -	 * Only do the expensive exception table search when we might be at
> -	 * risk of a deadlock. This happens if we
> -	 * 1. Failed to acquire mmap_lock, and
> -	 * 2. The access did not originate in userspace.
> -	 */
>
> Now, this doesn't mean we're doing it on every page fault. We skip
> all of this if we're able to handle the fault under the VMA lock,
> so the effect is probably much smaller than it was. But I'm surprised
> not to see you send any data quantifying the effect of this change!
>

Going off CPU *after* taking the lock sounds like an obviously bad
thing to happen, and as such I did not think it warranted any
measurements.

My first patch looked like this:

diff --git a/mm/memory.c b/mm/memory.c
index 1ec1ef3418bf..8662fd69eae8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5259,7 +5259,9 @@ static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs
 {
 	/* Even if this succeeds, make it clear we *might* have slept */
 	if (likely(mmap_read_trylock(mm))) {
-		might_sleep();
+#if defined(CONFIG_DEBUG_ATOMIC_SLEEP)
+		__might_sleep(__FILE__, __LINE__);
+#endif
 		return true;
 	}

This keeps the assertion while dodging __cond_resched. But then I
figured someone may complain about scheduling latency which was not
there prior to the patch. Between the two not-so-great choices I went
with what I considered the safer one.
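
For reference, this is roughly why plain might_sleep() can go off CPU
while __might_sleep() cannot -- a simplified sketch of the macros, not
the literal include/linux/kernel.h definitions (the exact config
spelling has changed across releases):

/* Simplified sketch; the real macros carry more debug plumbing. */
#ifdef CONFIG_PREEMPT_VOLUNTARY
# define might_resched()	__cond_resched()	/* may schedule */
#else
# define might_resched()	do { } while (0)
#endif

#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
# define might_sleep() \
	do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)
#else
# define might_sleep()	do { might_resched(); } while (0)
#endif

So __might_sleep() alone keeps the debug check but drops the voluntary
preemption point, which is all I was after.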

However, now that I said it, I wonder if perhaps the search could be
circumvented on x86-64.

The idea would be to check whether SMAP got disabled (assuming the CPU
supports it) -- if it is disabled and the faulting address belongs to
userspace, assume the access is fine. This is less precise in that SMAP
can be disabled over a wider window than the intended user access, but
that would be fine as far as I'm concerned if the exception table
search is such a big deal.
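
Roughly what I have in mind, as a completely untested sketch -- the
helper name fault_likely_in_uaccess() is made up, and since
get_mmap_lock_carefully() is generic code it would need some arch hook,
but the ingredients (X86_FEATURE_SMAP, X86_EFLAGS_AC, user_mode(),
TASK_SIZE_MAX) all exist on the x86 side:

/*
 * Untested sketch, x86-64 only.  Would sit near arch/x86/mm/fault.c
 * where pt_regs and the cpufeature helpers are already in scope.
 */
static bool fault_likely_in_uaccess(struct pt_regs *regs,
				    unsigned long address)
{
	/* Without SMAP there is nothing to reason about. */
	if (!cpu_feature_enabled(X86_FEATURE_SMAP))
		return false;

	/* Faults coming straight from userspace are not the concern here. */
	if (user_mode(regs))
		return true;

	/*
	 * Legitimate kernel-mode user accesses run between stac/clac,
	 * i.e. with EFLAGS.AC set.  If AC is clear this was not a
	 * sanctioned user access, so fall back to the exception table
	 * search.
	 */
	if (!(regs->flags & X86_EFLAGS_AC))
		return false;

	/* Finally, the faulting address has to be a user address. */
	return address < TASK_SIZE_MAX;
}

As said, AC being set does not pinpoint the exact uaccess instruction,
so this trades some precision for skipping the table walk.

-- 
Mateusz Guzik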