From: "Uladzislau Rezki (Sony)"
To: Andrew Morton
Cc: linux-mm@kvack.org, LKML, Baoquan He, Lorenzo Stoakes, Christoph Hellwig, Matthew Wilcox, Nicholas Piggin, Uladzislau Rezki, Oleksiy Avramchenko, Roman Gushchin
Subject: [PATCH v2 1/3] mm: vmalloc: Avoid calling __find_vmap_area() twice in __vunmap()
Date: Wed, 21 Dec 2022 18:44:52 +0100
Message-Id: <20221221174454.1085130-1-urezki@gmail.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Currently the __vunmap() path calls __find_vmap_area() twice: once on
entry to check that the area exists, and a second time inside
remove_vm_area(), which performs a new search for the VA.

To improve performance, split remove_vm_area() into two new parts:
 - find_unlink_vmap_area(), which searches for the VA and unlinks it
   from the tree;
 - __remove_vm_area(), which does the removal without searching.

With this split there is no functional change for remove_vm_area(),
whereas vm_remove_mappings(), where the second search happened, switches
to the __remove_vm_area() variant. That variant takes the already
detached VA as a parameter, so there is no need to find it again.

Performance-wise, I used test_vmalloc.sh with 32 threads doing
alloc/free on a 64-CPU x86_64 box:

perf without this patch:
-   31.41%     0.50%  vmalloc_test/10  [kernel.vmlinux]  [k] __vunmap
   - 30.92% __vunmap
      - 17.67% _raw_spin_lock
           native_queued_spin_lock_slowpath
      - 12.33% remove_vm_area
         - 11.79% free_vmap_area_noflush
            - 11.18% _raw_spin_lock
                 native_queued_spin_lock_slowpath
        0.76% free_unref_page

perf with this patch:
-   11.35%     0.13%  vmalloc_test/14  [kernel.vmlinux]  [k] __vunmap
   - 11.23% __vunmap
      - 8.28% find_unlink_vmap_area
         - 7.95% _raw_spin_lock
              7.44% native_queued_spin_lock_slowpath
      - 1.93% free_vmap_area_noflush
         - 0.56% _raw_spin_lock
              0.53% native_queued_spin_lock_slowpath
        0.60% __vunmap_range_noflush

In this test the share of CPU cycles attributed to __vunmap() drops by
about 20 percentage points (31.41% -> 11.35%).

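For illustration only, the idea of the split can be sketched in user
space. This is not the kernel code: a singly linked list stands in for
the vmap_area rb-tree, a pthread mutex for vmap_area_lock, and every
name below (struct area, find_area(), find_unlink_area(), remove_twice(),
remove_once(), lock_count) is hypothetical. The sketch only shows that
the "check, then search again" shape takes the lock twice, while the
"find and unlink in one critical section" shape takes it once.

```c
/* User-space sketch only; all names here are hypothetical stand-ins. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct area {
	unsigned long start;
	struct area *next;
};

static pthread_mutex_t area_lock = PTHREAD_MUTEX_INITIALIZER;
static struct area *area_list;
static unsigned long lock_count;	/* number of lock acquisitions */

static void lock_areas(void)
{
	pthread_mutex_lock(&area_lock);
	lock_count++;
}

static void unlock_areas(void)
{
	pthread_mutex_unlock(&area_lock);
}

/* Insert an area at the head of the list. */
static void insert_area(struct area *a)
{
	lock_areas();
	a->next = area_list;
	area_list = a;
	unlock_areas();
}

/* Lookup only: the entry stays on the list. */
static struct area *find_area(unsigned long start)
{
	struct area *a;

	lock_areas();
	for (a = area_list; a; a = a->next)
		if (a->start == start)
			break;
	unlock_areas();

	return a;
}

/* Lookup and unlink within a single critical section. */
static struct area *find_unlink_area(unsigned long start)
{
	struct area **pp, *a = NULL;

	lock_areas();
	for (pp = &area_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->start == start) {
			a = *pp;
			*pp = a->next;
			a->next = NULL;
			break;
		}
	}
	unlock_areas();

	return a;
}

/* Old shape: check that the entry exists, then search again to remove. */
static struct area *remove_twice(unsigned long start)
{
	if (!find_area(start))
		return NULL;
	return find_unlink_area(start);
}

/* New shape: one search; the detached entry is handed to the caller. */
static struct area *remove_once(unsigned long start)
{
	return find_unlink_area(start);
}
```

With two entries inserted, removing one through remove_twice() bumps
lock_count by two, and removing the other through remove_once() bumps
it by one. Under contention each extra acquisition is a potential trip
through the lock slowpath, which is where the perf profiles above spend
most of their time.
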
Reported-by: Roman Gushchin
Signed-off-by: Uladzislau Rezki (Sony)
---
 mm/vmalloc.c | 66 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 42 insertions(+), 24 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9e30f0b39203..28030d2441f1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1825,9 +1825,11 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 	unsigned long va_start = va->va_start;
 	unsigned long nr_lazy;
 
-	spin_lock(&vmap_area_lock);
-	unlink_va(va, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	if (!list_empty(&va->list)) {
+		spin_lock(&vmap_area_lock);
+		unlink_va(va, &vmap_area_root);
+		spin_unlock(&vmap_area_lock);
+	}
 
 	nr_lazy = atomic_long_add_return((va->va_end - va->va_start) >>
 				PAGE_SHIFT, &vmap_lazy_nr);
@@ -1871,6 +1873,19 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 	return va;
 }
 
+static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
+{
+	struct vmap_area *va;
+
+	spin_lock(&vmap_area_lock);
+	va = __find_vmap_area(addr, &vmap_area_root);
+	if (va)
+		unlink_va(va, &vmap_area_root);
+	spin_unlock(&vmap_area_lock);
+
+	return va;
+}
+
 /*** Per cpu kva allocator ***/
 
 /*
@@ -2591,6 +2606,20 @@ struct vm_struct *find_vm_area(const void *addr)
 	return va->vm;
 }
 
+static struct vm_struct *__remove_vm_area(struct vmap_area *va)
+{
+	struct vm_struct *vm;
+
+	if (!va || !va->vm)
+		return NULL;
+
+	vm = va->vm;
+	kasan_free_module_shadow(vm);
+	free_unmap_vmap_area(va);
+
+	return vm;
+}
+
 /**
  * remove_vm_area - find and remove a continuous kernel virtual area
  * @addr:	    base address
@@ -2607,22 +2636,8 @@ struct vm_struct *remove_vm_area(const void *addr)
 
 	might_sleep();
 
-	spin_lock(&vmap_area_lock);
-	va = __find_vmap_area((unsigned long)addr, &vmap_area_root);
-	if (va && va->vm) {
-		struct vm_struct *vm = va->vm;
-
-		va->vm = NULL;
-		spin_unlock(&vmap_area_lock);
-
-		kasan_free_module_shadow(vm);
-		free_unmap_vmap_area(va);
-
-		return vm;
-	}
-
-	spin_unlock(&vmap_area_lock);
-	return NULL;
+	va = find_unlink_vmap_area((unsigned long) addr);
+	return __remove_vm_area(va);
 }
 
 static inline void set_area_direct_map(const struct vm_struct *area,
@@ -2637,15 +2652,16 @@ static inline void set_area_direct_map(const struct vm_struct *area,
 }
 
 /* Handle removing and resetting vm mappings related to the vm_struct. */
-static void vm_remove_mappings(struct vm_struct *area, int deallocate_pages)
+static void vm_remove_mappings(struct vmap_area *va, int deallocate_pages)
 {
+	struct vm_struct *area = va->vm;
 	unsigned long start = ULONG_MAX, end = 0;
 	unsigned int page_order = vm_area_page_order(area);
 	int flush_reset = area->flags & VM_FLUSH_RESET_PERMS;
 	int flush_dmap = 0;
 	int i;
 
-	remove_vm_area(area->addr);
+	__remove_vm_area(va);
 
 	/* If this is not VM_FLUSH_RESET_PERMS memory, no need for the below. */
 	if (!flush_reset)
@@ -2690,6 +2706,7 @@ static void vm_remove_mappings(struct vm_struct *area, int deallocate_pages)
 static void __vunmap(const void *addr, int deallocate_pages)
 {
 	struct vm_struct *area;
+	struct vmap_area *va;
 
 	if (!addr)
 		return;
@@ -2698,19 +2715,20 @@ static void __vunmap(const void *addr, int deallocate_pages)
 			addr))
 		return;
 
-	area = find_vm_area(addr);
-	if (unlikely(!area)) {
+	va = find_unlink_vmap_area((unsigned long)addr);
+	if (unlikely(!va)) {
 		WARN(1, KERN_ERR "Trying to vfree() nonexistent vm area (%p)\n",
 				addr);
 		return;
 	}
 
+	area = va->vm;
 	debug_check_no_locks_freed(area->addr, get_vm_area_size(area));
 	debug_check_no_obj_freed(area->addr, get_vm_area_size(area));
 
 	kasan_poison_vmalloc(area->addr, get_vm_area_size(area));
 
-	vm_remove_mappings(area, deallocate_pages);
+	vm_remove_mappings(va, deallocate_pages);
 
 	if (deallocate_pages) {
 		int i;
-- 
2.30.2