From: Suren Baghdasaryan <surenb@google.com>
Date: Mon, 15 Apr 2024 08:58:28 -0700
Subject: Re: [PATCH] mm: Always sanity check anon_vma first for per-vma locks
To: Matthew Wilcox
Cc: Peter Xu, "Liam R. Howlett", LKML, linux-mm@kvack.org, Andrew Morton, Lokesh Gidra, Alistair Popple

On Fri, Apr 12, 2024 at 8:19 AM Matthew Wilcox wrote:
>
> On Fri, Apr 12, 2024 at 09:53:29AM -0500, Suren Baghdasaryan wrote:
> > Unless vmf_anon_prepare() already explains why vma->anon_vma poses a
> > problem for per-vma locks, we should have an explanation there. This
> > comment would serve that purpose IMO.
>
> I'll do you one better; here's some nice kernel-doc for
> vmf_anon_prepare():

I was looking at find_tcp_vma(), which seems to be the only other place
where lock_vma_under_rcu() is currently used. I think it's used there
only for file-backed pages, so I don't think your change affects that
use case, but this makes me think that we should have some kind of
warning for future users of lock_vma_under_rcu()... Maybe your addition
of mmap_assert_locked() inside __anon_vma_prepare() is enough. Please
don't forget to include that assertion in your final patch.

>
> commit f89a1cd17f13
> Author: Matthew Wilcox (Oracle)
> Date:   Fri Apr 12 10:41:02 2024 -0400
>
>     mm: Delay the check for a NULL anon_vma
>
>     Instead of checking the anon_vma early in the fault path where all page
>     faults pay the cost, delay it until we know we're going to need the
>     anon_vma to be filled in.  This will have a slight negative effect on the
>     first fault in an anonymous VMA, but it shortens every other page fault.
>     It also makes the code slightly cleaner as the anon and file backed
>     fault handling look more similar.
>
> Signed-off-by: Matthew Wilcox (Oracle)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index d8d2ed80b0bf..718f91f74a48 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1057,11 +1057,13 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>         gfp_t gfp;
>         struct folio *folio;
>         unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
> +       vm_fault_t ret;
>
>         if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
>                 return VM_FAULT_FALLBACK;
> -       if (unlikely(anon_vma_prepare(vma)))
> -               return VM_FAULT_OOM;
> +       ret = vmf_anon_prepare(vmf);
> +       if (ret)
> +               return ret;
>         khugepaged_enter_vma(vma, vma->vm_flags);
>
>         if (!(vmf->flags & FAULT_FLAG_WRITE) &&
> diff --git a/mm/memory.c b/mm/memory.c
> index 6e2fe960473d..46b509c3bbc1 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3213,6 +3213,21 @@ static inline vm_fault_t vmf_can_call_fault(const struct vm_fault *vmf)
>         return VM_FAULT_RETRY;
>  }
>
> +/**
> + * vmf_anon_prepare - Prepare to handle an anonymous fault.
> + * @vmf: The vm_fault descriptor passed from the fault handler.
> + *
> + * When preparing to insert an anonymous page into a VMA from a
> + * fault handler, call this function rather than anon_vma_prepare().
> + * If this vma does not already have an associated anon_vma and we are
> + * only protected by the per-VMA lock, the caller must retry with the
> + * mmap_lock held.  __anon_vma_prepare() will look at adjacent VMAs to
> + * determine if this VMA can share its anon_vma, and that's not safe to
> + * do with only the per-VMA lock held for this VMA.
> + *
> + * Return: 0 if fault handling can proceed.  Any other value should be
> + * returned to the caller.
> + */
>  vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
>  {
>         struct vm_area_struct *vma = vmf->vma;
> @@ -4437,8 +4452,9 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>         }
>
>         /* Allocate our own private page. */
> -       if (unlikely(anon_vma_prepare(vma)))
> -               goto oom;
> +       ret = vmf_anon_prepare(vmf);
> +       if (ret)
> +               return ret;
>         /* Returns NULL on OOM or ERR_PTR(-EAGAIN) if we must retry the fault */
>         folio = alloc_anon_folio(vmf);
>         if (IS_ERR(folio))
> @@ -5821,15 +5837,6 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
>         if (!vma_start_read(vma))
>                 goto inval;
>
> -       /*
> -        * find_mergeable_anon_vma uses adjacent vmas which are not locked.
> -        * This check must happen after vma_start_read(); otherwise, a
> -        * concurrent mremap() with MREMAP_DONTUNMAP could dissociate the VMA
> -        * from its anon_vma.
> -        */
> -       if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma))
> -               goto inval_end_read;
> -
>         /* Check since vm_start/vm_end might change before we lock the VMA */
>         if (unlikely(address < vma->vm_start || address >= vma->vm_end))
>                 goto inval_end_read;