References: <20240411171319.almhz23xulg4f7op@revolver>
From: Suren Baghdasaryan <surenb@google.com>
Date: Sat, 13 Apr 2024 14:41:55 -0700
Subject: Re: [PATCH] mm: Always sanity check anon_vma first for per-vma locks
To: Matthew Wilcox
Cc: Peter Xu, "Liam R. Howlett", LKML, linux-mm@kvack.org, Andrew Morton, Lokesh Gidra, Alistair Popple

On Fri, Apr 12, 2024 at 8:19 AM Matthew Wilcox wrote:
>
> On Fri, Apr 12, 2024 at 09:53:29AM -0500, Suren Baghdasaryan wrote:
> > Unless vmf_anon_prepare() already explains why vma->anon_vma poses a
> > problem for per-vma locks, we should have an explanation there. This
> > comment would serve that purpose IMO.
>
> I'll do you one better; here's some nice kernel-doc for
> vmf_anon_prepare():
>
> commit f89a1cd17f13
> Author: Matthew Wilcox (Oracle)
> Date:   Fri Apr 12 10:41:02 2024 -0400
>
>     mm: Delay the check for a NULL anon_vma
>
>     Instead of checking the anon_vma early in the fault path where all page
>     faults pay the cost, delay it until we know we're going to need the
>     anon_vma to be filled in.  This will have a slight negative effect on the
>     first fault in an anonymous VMA, but it shortens every other page fault.
>     It also makes the code slightly cleaner as the anon and file backed
>     fault handling look more similar.
>
>     Signed-off-by: Matthew Wilcox (Oracle)

The patch looks good to me, but I would like to run benchmark tests to
see how it affects different workloads (both the positive and the
negative effects). I'll try to do that in the coming week and will
report back if I find any considerable difference. For now:

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

We should also make sure this patch goes in together with the next one,
which adjusts the related code in uffd. It would be better to resend
them as one patchset to avoid confusion.
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index d8d2ed80b0bf..718f91f74a48 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1057,11 +1057,13 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>  	gfp_t gfp;
>  	struct folio *folio;
>  	unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
> +	vm_fault_t ret;
>
>  	if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
>  		return VM_FAULT_FALLBACK;
> -	if (unlikely(anon_vma_prepare(vma)))
> -		return VM_FAULT_OOM;
> +	ret = vmf_anon_prepare(vmf);
> +	if (ret)
> +		return ret;
>  	khugepaged_enter_vma(vma, vma->vm_flags);
>
>  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
> diff --git a/mm/memory.c b/mm/memory.c
> index 6e2fe960473d..46b509c3bbc1 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3213,6 +3213,21 @@ static inline vm_fault_t vmf_can_call_fault(const struct vm_fault *vmf)
>  	return VM_FAULT_RETRY;
>  }
>
> +/**
> + * vmf_anon_prepare - Prepare to handle an anonymous fault.
> + * @vmf: The vm_fault descriptor passed from the fault handler.
> + *
> + * When preparing to insert an anonymous page into a VMA from a
> + * fault handler, call this function rather than anon_vma_prepare().
> + * If this vma does not already have an associated anon_vma and we are
> + * only protected by the per-VMA lock, the caller must retry with the
> + * mmap_lock held.  __anon_vma_prepare() will look at adjacent VMAs to
> + * determine if this VMA can share its anon_vma, and that's not safe to
> + * do with only the per-VMA lock held for this VMA.
> + *
> + * Return: 0 if fault handling can proceed.  Any other value should be
> + * returned to the caller.
> + */
>  vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
> @@ -4437,8 +4452,9 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  	}
>
>  	/* Allocate our own private page. */
> -	if (unlikely(anon_vma_prepare(vma)))
> -		goto oom;
> +	ret = vmf_anon_prepare(vmf);
> +	if (ret)
> +		return ret;
>  	/* Returns NULL on OOM or ERR_PTR(-EAGAIN) if we must retry the fault */
>  	folio = alloc_anon_folio(vmf);
>  	if (IS_ERR(folio))
> @@ -5821,15 +5837,6 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
>  	if (!vma_start_read(vma))
>  		goto inval;
>
> -	/*
> -	 * find_mergeable_anon_vma uses adjacent vmas which are not locked.
> -	 * This check must happen after vma_start_read(); otherwise, a
> -	 * concurrent mremap() with MREMAP_DONTUNMAP could dissociate the VMA
> -	 * from its anon_vma.
> -	 */
> -	if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma))
> -		goto inval_end_read;
> -
>  	/* Check since vm_start/vm_end might change before we lock the VMA */
>  	if (unlikely(address < vma->vm_start || address >= vma->vm_end))
>  		goto inval_end_read;