From: Suren Baghdasaryan <surenb@google.com>
Date: Thu, 13 Jul 2023 20:40:46 -0700
Subject: Re: [PATCH v2 9/9] tcp: Use per-vma locking for receive zerocopy
To: "Matthew Wilcox (Oracle)"
Cc: linux-mm@kvack.org, Arjun Roy, Eric Dumazet, linux-fsdevel@vger.kernel.org, Punit Agrawal, "David S. Miller"
In-Reply-To: <20230711202047.3818697-10-willy@infradead.org>
References: <20230711202047.3818697-1-willy@infradead.org> <20230711202047.3818697-10-willy@infradead.org>

On Tue, Jul 11, 2023 at 1:21 PM Matthew Wilcox (Oracle) wrote:
>
> From: Arjun Roy
>
> Per-VMA locking allows us to lock a struct vm_area_struct without
> taking the process-wide mmap lock in read mode.
>
> Consider a process workload where the mmap lock is taken constantly in
> write mode. In this scenario, all zerocopy receives are periodically
> blocked during that period of time - though in principle, the memory
> ranges being used by TCP are not touched by the operations that need
> the mmap write lock. This results in performance degradation.
>
> Now consider another workload where the mmap lock is never taken in
> write mode, but there are many TCP connections using receive zerocopy
> that are concurrently receiving. These connections all take the mmap
> lock in read mode, but this does induce a lot of contention and atomic
> ops for this process-wide lock. This results in additional CPU
> overhead caused by contending on the cache line for this lock.
>
> However, with per-vma locking, both of these problems can be avoided.
>
> As a test, I ran an RPC-style request/response workload with 4KB
> payloads and receive zerocopy enabled, with 100 simultaneous TCP
> connections. I measured perf cycles within the
> find_tcp_vma/mmap_read_lock/mmap_read_unlock codepath, with and
> without per-vma locking enabled.
>
> When using process-wide mmap semaphore read locking, about 1% of
> measured perf cycles were within this path. With per-VMA locking, this
> value dropped to about 0.45%.
>
> Signed-off-by: Arjun Roy
> Reviewed-by: Eric Dumazet
> Signed-off-by: David S. Miller
> Signed-off-by: Matthew Wilcox (Oracle)

Seems to match the original version with less churn.

Reviewed-by: Suren Baghdasaryan

> ---
>  net/ipv4/tcp.c | 39 ++++++++++++++++++++++++++++++++-------
>  1 file changed, 32 insertions(+), 7 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 1542de3f66f7..7118ec6cf886 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2038,6 +2038,30 @@ static void tcp_zc_finalize_rx_tstamp(struct sock *sk,
>  	}
>  }
>

nit: Maybe add a comment that the mmap_locked value is undefined/meaningless
if the function returns NULL?
> +static struct vm_area_struct *find_tcp_vma(struct mm_struct *mm,
> +		unsigned long address, bool *mmap_locked)
> +{
> +	struct vm_area_struct *vma = lock_vma_under_rcu(mm, address);
> +
> +	if (vma) {
> +		if (vma->vm_ops != &tcp_vm_ops) {
> +			vma_end_read(vma);
> +			return NULL;
> +		}
> +		*mmap_locked = false;
> +		return vma;
> +	}
> +
> +	mmap_read_lock(mm);
> +	vma = vma_lookup(mm, address);
> +	if (!vma || vma->vm_ops != &tcp_vm_ops) {
> +		mmap_read_unlock(mm);
> +		return NULL;
> +	}
> +	*mmap_locked = true;
> +	return vma;
> +}
> +
>  #define TCP_ZEROCOPY_PAGE_BATCH_SIZE 32
>  static int tcp_zerocopy_receive(struct sock *sk,
>  				struct tcp_zerocopy_receive *zc,
> @@ -2055,6 +2079,7 @@ static int tcp_zerocopy_receive(struct sock *sk,
>  	u32 seq = tp->copied_seq;
>  	u32 total_bytes_to_map;
>  	int inq = tcp_inq(sk);
> +	bool mmap_locked;
>  	int ret;
>
>  	zc->copybuf_len = 0;
> @@ -2079,13 +2104,10 @@ static int tcp_zerocopy_receive(struct sock *sk,
>  		return 0;
>  	}
>
> -	mmap_read_lock(current->mm);
> -
> -	vma = vma_lookup(current->mm, address);
> -	if (!vma || vma->vm_ops != &tcp_vm_ops) {
> -		mmap_read_unlock(current->mm);
> +	vma = find_tcp_vma(current->mm, address, &mmap_locked);
> +	if (!vma)
>  		return -EINVAL;
> -	}
> +
>  	vma_len = min_t(unsigned long, zc->length, vma->vm_end - address);
>  	avail_len = min_t(u32, vma_len, inq);
>  	total_bytes_to_map = avail_len & ~(PAGE_SIZE - 1);
> @@ -2159,7 +2181,10 @@ static int tcp_zerocopy_receive(struct sock *sk,
>  						zc, total_bytes_to_map);
>  	}
>  out:
> -	mmap_read_unlock(current->mm);
> +	if (mmap_locked)
> +		mmap_read_unlock(current->mm);
> +	else
> +		vma_end_read(vma);
>  	/* Try to copy straggler data. */
>  	if (!ret)
>  		copylen = tcp_zc_handle_leftover(zc, sk, skb, &seq, copybuf_len, tss);
> --
> 2.39.2
>