From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD70BC25B79 for ; Wed, 22 May 2024 17:23:03 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 504DC6B0083; Wed, 22 May 2024 13:23:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4B50E6B0085; Wed, 22 May 2024 13:23:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 37C236B0088; Wed, 22 May 2024 13:23:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 1965D6B0083 for ; Wed, 22 May 2024 13:23:03 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 8A7241C45FC for ; Wed, 22 May 2024 17:23:02 +0000 (UTC) X-FDA: 82146702204.20.8AD643D Received: from mail-lf1-f42.google.com (mail-lf1-f42.google.com [209.85.167.42]) by imf28.hostedemail.com (Postfix) with ESMTP id AD328C000E for ; Wed, 22 May 2024 17:23:00 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=QsTpt0hX; spf=pass (imf28.hostedemail.com: domain of mjguzik@gmail.com designates 209.85.167.42 as permitted sender) smtp.mailfrom=mjguzik@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1716398580; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=+hKhh70jMeptWU7ZZ/ftWiaVt+5czwlS5uWtiZA1kFo=; b=nQKyVlBKsZ5jtY+O3S+xog1EKcF2CcJOl5JCeehPDz0KG8rnzzJ281EsgPNKEiZ34COsTc zpbAkqUVEi5Grt5M4N7z5YGb+K8rH9sIrthh+x9L0lOgYXz6NiYliwQqJrirJFGhj7l6Mq YP8dotQ8ZuRlXIJIH+6ah8Kgv9VvCwI= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1716398580; a=rsa-sha256; cv=none; b=gOWJvMW0kXkUSgGruvITJngmF150Y9sLkowLabwwQqMpyEVAyTpv1k0io7mNrBWaGm7tuV O+r3Qm6r6/I9mVqHiFfkuBWBrhlYpJwnn1GlP9X6JZ9QPfe2fS4Wm5SzG0A8t53W1Lo2qp sBVyy4Jahhq8uS42v/Um8NEynpCvHhc= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=QsTpt0hX; spf=pass (imf28.hostedemail.com: domain of mjguzik@gmail.com designates 209.85.167.42 as permitted sender) smtp.mailfrom=mjguzik@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-lf1-f42.google.com with SMTP id 2adb3069b0e04-51ff65b1e14so7804555e87.2 for ; Wed, 22 May 2024 10:23:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1716398579; x=1717003379; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=+hKhh70jMeptWU7ZZ/ftWiaVt+5czwlS5uWtiZA1kFo=; b=QsTpt0hXplG30xX75jAjuYTu6v+HGEJqK07tmK+FK+Ot4k0qBHIphvSstkcCDaB5xZ pF+p9SgPNSM5y2+MwwaViry6FUtBiJsgsqoQOuHtP8uEo+m90Qk+ONk0auyY/AIfc98x RQTAFB1kqRRXUctZ4RAgLRXygbLQb5tb9NAMJGjOXF9BY+14KyvSRf40uao+tTmtqsGN ODAunAtiqxXGhkhiQSiGmJwPt+g5E8UGYiBvLlrARuHMz7iSG2GRKiURMRiJunJUYYJO 8Y8u09LyRnDjT3v2/jL9WZoojnEAfDKxaAGDDaWdJjDkijyWSLKgrqQnDYUlpn5uhMap nt7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716398579; x=1717003379; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=+hKhh70jMeptWU7ZZ/ftWiaVt+5czwlS5uWtiZA1kFo=; b=mXk6eMYzQxhDaO5n9RUqsfuHYWHI4gLWwBh7JV0l+zBiwTByC+T6j0embeysJ3EB8a MmuNUtwvTcA/+c518IDcuP1pWCey8AOSG26Dlo74OjSmtLtWtga6wCwZ+BgNWj2yAQ0s 4irw/+kHlHYfkm+9ioCNvgNU0LJKW0vCwfwgNjSGbMRH+YxW7ciCuovWAVC8DJvKF+p+ KY0YvQxtrsiAfbZ8Vwfch6u64flL8v1Qb1t6uzdPLcwKlHE9qztwM6idqcjppO4ivFYw NFpori4yS/Sd3ueFrFOtp/XW5HylyWPrPa3Hi2t0bPzbKdJGTubrLWNgYSFHu6++IuBA eaFA== X-Forwarded-Encrypted: i=1; AJvYcCVgfAhtIyuAFVs6f32Iph4GCitbTuC4wfBMXCSGugLPH3j1hqaCuXj29iLNnqaaiZFv1zr6k5W17Tm9UtoOlZzLOFs= X-Gm-Message-State: AOJu0YxdLK66pmPT7EU//LPEBfy0gRsKuAtoiZ0e8azJq2cnLv+8Ycl3 OajQwzESoooVO47XCebSFtAs6H5Vjgwzabth1YDfuc8TUv+ZeVjf X-Google-Smtp-Source: AGHT+IGPJ3i5nUpc9c4LI4ItTOdGFOaAJybcYO4oMTgcCTCP3WcUbMHOGvS3axUGJdizHrNQUcn/Cw== X-Received: by 2002:a19:640a:0:b0:521:b1ca:9c99 with SMTP id 2adb3069b0e04-526be8d483cmr1706903e87.5.1716398578612; Wed, 22 May 2024 10:22:58 -0700 (PDT) Received: from f (cst-prg-19-178.cust.vodafone.cz. [46.135.19.178]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a5a17892471sm1815056766b.84.2024.05.22.10.22.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 May 2024 10:22:58 -0700 (PDT) Date: Wed, 22 May 2024 19:22:25 +0200 From: Mateusz Guzik To: "Liam R. Howlett" , akpm@linux-foundation.org, vbabka@suse.cz, lstoakes@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH v2] mm: batch unlink_file_vma calls in free_pgd_range Message-ID: References: <20240521234321.359501-1-mjguzik@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: AD328C000E X-Rspam-User: X-Stat-Signature: zi4kjimaxu3w11dyd86d75cbhki7pzuo X-HE-Tag: 1716398580-54680 X-HE-Meta: U2FsdGVkX1/Qv19LyDfR7SGfoZJMoRLt4vAFUMmRoTiqjSZ9+yR6iG2w0jn1PPKeImGqE7HuuSbpUdjUHcV1Zccn2e18SjLaO9i0VG1PyMOWLKUAZGweg7mMoxF0PHXybRAZM7GDZDl41cA/c69TkTIZ3Gu18YEYwu0s1umlQNdxdTnr5aCHD+1kP4w5nnyLcEevOUSFK2TUr6WgKYCGd2caksjduRkkoQWp6oV1oxyvRE/uNBGdxWFkPRCPYKaM5M2JF6w2HA1DK0tl9ERoKacQLhYmwSG4lvxZ0OzkuOAqbg0KwxDjqPq3+F84Uc8AVXAlyPPNe9KD9EAO4ZU7JqwfTFvAAuGSrmG8GMjE3CvJgPTZSYY+a7GPOpqK4Ef0Lu4YDs/7ySiaf9RmaRBE8txaLgLioG5oZmf1ubtnaEW8mFAsetvg3bCPdvwNKvDFyN/lIRWmS+wtD80IkyWJvlkL5MW61Q9CEs08p9sfCH33uVWH9wgejRi/9H1fUw5NOujb0nBFk6ThgjAQhgt1GjtJxRP7WbB30rX8Zb+vSUUoMyLSA9iBtnP/wO+mVL7VGQrz9jN/9LWndtyEM/7X1xQK9ez/n2JlWvERRHAbhWlHKIyjsGfdbrV7n7XktpvrQ3ODMwEpiSIdaFAtV59neXzypQLm43LHkjc13OSOa8DMM8xJxKjiyyALkPYWGDRb/xf9Rka5c59DdAQfbvx+kBNLdKFjdTIud1A26NvHZmjUogAmkcO0zM/tc48NTTrwxaOpkhE/KRA092GzGZjgl2+m7H6ow0gKRtVyNFrfo9/PADgr6VGxMacIfSG0WZjeGgTkxN8UvhUvwLK0HHC6MAiqs2b0beeZFrb8p/02yaJnMyZz5UnNePAIsC1K4XOurfimWCWjUqFxptxVujFqBhcXJPi8WCNPjkCY/bZpwmwpe6q0e86CMumJcFy3IFWaJ68OALyzR+AeQ6Qx5Di vtb5GUMv GKae5NqtrSUGxgvyZXNTEAHmxEPfQrbghdUi9OY/ppcYaLle9F9//tYvQDgnQwAldhWitqab/7R1u2+VprVLpcp9VCmjxSn9ek2eCxtUxCsQbXEVBEEGMk5PV+pvUoUX1/wMHTnFL0XYLnFORAZnQD/8z5fnwqsIOvedl33UZO5AhSAmEmebRBxLYuSj4MQnf/qf/wNft5/OAG9kX4hQIZqmeR3dfZgEk7oHKeeq41wQCRzyaGMyNiRL7jJNOGjdj5RxDRuVLs/QsVMEM3S7hN8+s5l5WNsM8UKeot7RahCekzSI8SvZKXSa9sjhj0FV+JSX4PnudSFm7/vfDRI2RCEJc64/Xaw2MxGVKvYI9R7T4i5nr8OutPnbOhWIcavTfCfgleUOf5Urwd5XQd9e6+4gFsPWsoHs7euh7ftEYKtifwR2pELXeHfHirkTgJOLDz0jPig3gcPOz7QhgWRbFa7ToYm1/caw5w7IgwxU3PUVc8YO7z7CmL+iCqfddhrxUJ3blual2ze2s3hk= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000001, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, May 22, 2024 at 11:19:45AM -0400, Liam R. Howlett wrote: > * Mateusz Guzik [240521 19:43]: > > Execs of dynamically linked binaries at 20-ish cores are bottlenecked on > > the i_mmap_rwsem semaphore, while the biggest singular contributor is > > free_pgd_range inducing the lock acquire back-to-back for all > > consecutive mappings of a given file. > > > > Tracing the count of said acquires while building the kernel shows: > > [1, 2) 799579 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > > [2, 3) 0 | | > > [3, 4) 3009 | | > > [4, 5) 3009 | | > > [5, 6) 326442 |@@@@@@@@@@@@@@@@@@@@@ | > > > > So in particular there were 326442 opportunities to coalesce 5 acquires > > into 1. > > > > Doing so increases execs per second by 4% (~50k to ~52k) when running > > the benchmark linked below. > > > > The lock remains the main bottleneck, I have not looked at other spots > > yet. > > Thanks. This change is compact and allows for a performance gain. It > looks good to me. > > I guess this would cause a regression on single mappings, probably > within the noise and probably not a real work load. Just something to > keep in mind to check if the bots yell about some contrived benchmark. > Trivial tidy ups can be done should someone be adamant there is a slowdown and it needs to be recouped, starting with inlining the new routines (apart from unlink_file_vma_batch_process). > On that note, kernel/fork.c uses this lock for each cloned vma right > now. If you saved the file pointer in your struct, it could be used > for bulk add as well. The only complication I see is the insert order > being inserted "just after mpnt", maybe a bulk add version of the struct > would need two lists of vmas - if the size of the struct is of concern, > I don't think it would be. > Looks like it would need a different spin on batching than the one implemented above. Maybe I'll get around to this some time early next month. > > @@ -131,6 +131,47 @@ void unlink_file_vma(struct vm_area_struct *vma) > > } > > } > > > > +void unlink_file_vma_batch_init(struct unlink_vma_file_batch *vb) > > +{ > > + vb->count = 0; > > +} > > + > > +static void unlink_file_vma_batch_process(struct unlink_vma_file_batch *vb) > > +{ > > + struct address_space *mapping; > > + int i; > > + > > + mapping = vb->vmas[0]->vm_file->f_mapping; > > + i_mmap_lock_write(mapping); > > + for (i = 0; i < vb->count; i++) { > > + VM_WARN_ON_ONCE(vb->vmas[i]->vm_file->f_mapping != mapping); > > + __remove_shared_vm_struct(vb->vmas[i], mapping); > > + } > > + i_mmap_unlock_write(mapping); > > + > > + unlink_file_vma_batch_init(vb); > > +} > > + > > +void unlink_file_vma_batch_add(struct unlink_vma_file_batch *vb, > > + struct vm_area_struct *vma) > > +{ > > + if (vma->vm_file == NULL) > > + return; > > + > > It might be worth a comment about count being always ahead of the last > vma in the array. On first glance, I was concerned about an off-by-one > here (and in the process function). But maybe it's just me, the > increment is pretty close to this statement - I had to think about > ARRAY_SIZE() here. > I think that's upgringing on different codebases. Idiomatic array iteration of n elements being "for (i = 0; i < n; i++)" to me makes the below assignment + counter bump pair obviously correct. That is to say some other arrangement would require me to do a double take. :) > > + if ((vb->count > 0 && vb->vmas[0]->vm_file != vma->vm_file) || > > + vb->count == ARRAY_SIZE(vb->vmas)) > > Since you are checking vm_file and only support a single vm_file in this > version, it might be worth saving it in your unlink_vma_file_batch > struct. It could also be used in the processing to reduce dereferencing > to f_mappings. > > I'm not sure if this is worth it with modern cpus, though. I'm just > thinking that this step is executed the most so any speedup here will > help you. > I had it originally but it imo uglified the code. > Feel free to add > > Reviewed-by: Liam R. Howlett > thanks