From: Suren Baghdasaryan <surenb@google.com>
Date: Mon, 16 Jan 2023 20:52:45 -0800
Subject: Re: [PATCH 41/41] mm: replace rw_semaphore with atomic_t in vma_lock
To: Hillf Danton
Cc: vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net,
	peterz@infradead.org, hughd@google.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
In-Reply-To: <20230117031632.2321-1-hdanton@sina.com>
References: <20230109205336.3665937-42-surenb@google.com>
	<20230116140649.2012-1-hdanton@sina.com>
	<20230117031632.2321-1-hdanton@sina.com>

On Mon, Jan 16, 2023 at 7:16 PM Hillf Danton wrote:
>
> On Mon, 16 Jan 2023 15:08:48 -0800 Suren Baghdasaryan
> > On Mon, Jan 16, 2023 at 6:07 AM Hillf Danton wrote:
> > > On Mon, 9 Jan 2023 12:53:36 -0800 Suren Baghdasaryan
> > > > static inline bool vma_read_trylock(struct vm_area_struct *vma)
> > > > {
> > > > 	/* Check before locking. A race might cause false locked result. */
> > > > -	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> > > > +	if (vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> > > > 		return false;
> > >
> > > Add mb to pair with the above wmb like
> >
> > The wmb above is to ensure the ordering between updates of lock_seq
> > and vm_lock->count (lock_seq is updated first and vm_lock->count only
> > after that). The first access to vm_lock->count in this function is
> > atomic_inc_unless_negative() and it's an atomic RMW operation with a
> > return value. According to documentation such functions are fully
> > ordered, therefore I think we already have an implicit full memory
> > barrier between reads of lock_seq and vm_lock->count here. Am I wrong?
>
> No you are not.

I'm not wrong, or the other way around? Please expand a bit.

> Revisit it in case of full mb not ensured.
>
> #ifndef arch_atomic_inc_unless_negative
> static __always_inline bool
> arch_atomic_inc_unless_negative(atomic_t *v)
> {
> 	int c = arch_atomic_read(v);
>
> 	do {
> 		if (unlikely(c < 0))
> 			return false;
> 	} while (!arch_atomic_try_cmpxchg(v, &c, c + 1));
>
> 	return true;
> }
> #define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
> #endif

I think your point is that the full mb is not ensured here,
specifically if c < 0?

> > > int count = atomic_read(&vma->vm_lock->count);
> > > for (;;) {
> > > 	int new = count + 1;
> > >
> > > 	if (count < 0 || new < 0)
> > > 		return false;
> > >
> > > 	new = atomic_cmpxchg(&vma->vm_lock->count, count, new);
> > > 	if (new == count)
> > > 		break;
> > > 	count = new;
> > > 	cpu_relax();
> > > }
> > >
> > > (wake up waiting readers after taking the lock;
> > > but the write lock owner before this read trylock should be
> > > responsible for waking waiters up.)
> > >
> > > lock acquire for read.
> >
> > This schema might cause readers to wait, which is not an exact
>
> Could you specify a bit more on wait?

Oh, I misunderstood your intent with that for() loop. Indeed, if a
writer took the lock, the count will be negative and the loop
terminates with a failure. Yeah, I think that would work.

> > replacement for down_read_trylock(). The requirement to wake up
> > waiting readers also complicates things
>
> If the writer lock owner is preempted by a reader while releasing the
> lock,
>
> 	set count to zero
> 	<-- preempt
> 	wake up waiters
>
> then the lock is owned by the reader, but with read waiters.
>
> That is buggy if write waiter starvation is allowed in this patchset.

I don't quite understand your point here. Readers don't wait, so
there can't be "read waiters". Could you please expand with a race
diagram maybe?

> > and since we can always fall
> > back to mmap_lock, that complication is unnecessary IMHO. I could use
> > part of your suggestion like this:
> >
> > 	int new = count + 1;
> >
> > 	if (count < 0 || new < 0)
> > 		return false;
> >
> > 	new = atomic_cmpxchg(&vma->vm_lock->count, count, new);
> > 	if (new == count)
> > 		return false;
> >
> > Compared to doing atomic_inc_unless_negative() first, like I did
> > originally, this schema opens a bit wider window for the writer to
> > get in the middle and cause the reader to fail locking, but I don't
> > think it would result in any visible regression.
>
> It is definitely fine for the writer to acquire the lock while the
> reader is doing trylock, no?

Yes.
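
To make the ordering concern above concrete, here is a minimal,
illustrative sketch of the read-trylock being discussed. It is a
sketch, not the code from this series; it assumes the count
convention of the patchset (count == 0 unlocked, count > 0 held by
readers, count < 0 held by a writer) and the lock_seq/mm_lock_seq
fields introduced earlier in the series, and the helper name
vma_read_trylock_sketch is made up for illustration. Per
Documentation/atomic_t.txt, a conditional RMW such as
atomic_inc_unless_negative() is fully ordered only when it succeeds;
on failure (the c < 0 path quoted above) it degenerates to a plain
read with no implied barrier.

	/*
	 * Illustrative sketch only, not the patch. Assumes:
	 * count == 0 unlocked, > 0 readers, < 0 writer.
	 */
	static inline bool vma_read_trylock_sketch(struct vm_area_struct *vma)
	{
		/* Check before locking. A race may yield a false locked result. */
		if (vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
			return false;

		/*
		 * Fully ordered RMW when it succeeds; when it fails (a
		 * writer holds the lock) only a plain atomic read was
		 * done, so no barrier is implied on that path.
		 */
		if (!atomic_inc_unless_negative(&vma->vm_lock->count))
			return false;

		/*
		 * Re-check after the increment: a writer bumping
		 * mm_lock_seq invalidates this lock, so back out and let
		 * the caller fall back to mmap_lock.
		 */
		if (vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq)) {
			atomic_dec(&vma->vm_lock->count);
			return false;
		}
		return true;
	}

The failure paths return false without any ordering guarantee, which
is acceptable for a trylock whose caller falls back to mmap_lock, but
they are exactly the spots where an explicit smp_mb() would be needed
if the unordered failure mattered.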