Date: Wed, 24 May 2023 18:34:00 +0800
From: Baoquan He <bhe@redhat.com>
To: Thomas Gleixner
Cc: linux-mm@kvack.org, Andrew Morton, Christoph Hellwig, Uladzislau Rezki,
    Lorenzo Stoakes, Peter Zijlstra
Subject: Re: [patch 6/6] mm/vmalloc: Dont purge usable blocks unnecessarily
References: <20230523135902.517032811@linutronix.de>
    <20230523140002.852175941@linutronix.de>
In-Reply-To: <20230523140002.852175941@linutronix.de>
On 05/23/23 at 04:02pm, Thomas Gleixner wrote:
> Purging fragmented blocks is done unconditionally in several contexts:
>
> 1) From drain_vmap_area_work(), when the number of lazy to be freed
>    vmap_areas reached the threshold
>
> 2) Reclaiming vmalloc address space from pcpu_get_vm_areas()
>
> 3) _vm_unmap_aliases()
>
> #1 There is no reason to zap fragmented vmap blocks unconditionally, simply
>    because reclaiming all lazy areas drains at least
>
>       32MB * fls(num_online_cpus())
>
>    per invocation which is plenty.
>
> #2 Reclaiming when running out of space or due to memory pressure makes a
>    lot of sense
>
> #3 _vm_unmap_aliases() requires to touch everything because the caller has
>    no clue which vmap_area used a particular page last and the vmap_area
>    lost that information too.
>
>    Except for the vfree + VM_FLUSH_RESET_PERMS case, which removes the
>    vmap area first and then cares about the flush. That in turn requires
>    a full walk of _all_ vmap areas including the one which was just
>    added to the purge list.
>
>    But as this has to be flushed anyway this is an opportunity to combine
>    outstanding TLB flushes and do the housekeeping of purging freed areas,
>    but like #1 there is no real good reason to zap usable vmap blocks
>    unconditionally.
>
> Add a @force_purge argument to the relevant functions and if not true only
> purge fragmented blocks which have less than 1/4 of their capacity left.
>
> Signed-off-by: Thomas Gleixner
> ---
>  mm/vmalloc.c |   34 ++++++++++++++++++++++------------
>  1 file changed, 22 insertions(+), 12 deletions(-)
>
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -791,7 +791,7 @@ get_subtree_max_size(struct rb_node *nod
>  RB_DECLARE_CALLBACKS_MAX(static, free_vmap_area_rb_augment_cb,
>  	struct vmap_area, rb_node, unsigned long, subtree_max_size, va_size)
>
> -static void purge_vmap_area_lazy(void);
> +static void purge_vmap_area_lazy(bool force_purge);
>  static BLOCKING_NOTIFIER_HEAD(vmap_notify_list);
>  static void drain_vmap_area_work(struct work_struct *work);
>  static DECLARE_WORK(drain_vmap_work, drain_vmap_area_work);
> @@ -1649,7 +1649,7 @@ static struct vmap_area *alloc_vmap_area
>
>  overflow:
>  	if (!purged) {
> -		purge_vmap_area_lazy();
> +		purge_vmap_area_lazy(true);
>  		purged = 1;
>  		goto retry;
>  	}
> @@ -1717,7 +1717,7 @@ static atomic_long_t vmap_lazy_nr = ATOM
>  static DEFINE_MUTEX(vmap_purge_lock);
>
>  /* for per-CPU blocks */
> -static void purge_fragmented_blocks_allcpus(void);
> +static void purge_fragmented_blocks_allcpus(bool force_purge);
>
>  /*
>   * Purges all lazily-freed vmap areas.
> @@ -1787,10 +1787,10 @@ static bool __purge_vmap_area_lazy(unsig
>  /*
>   * Kick off a purge of the outstanding lazy areas.
>   */
> -static void purge_vmap_area_lazy(void)
> +static void purge_vmap_area_lazy(bool force_purge)
>  {
>  	mutex_lock(&vmap_purge_lock);
> -	purge_fragmented_blocks_allcpus();
> +	purge_fragmented_blocks_allcpus(force_purge);
>  	__purge_vmap_area_lazy(ULONG_MAX, 0);
>  	mutex_unlock(&vmap_purge_lock);
>  }
> @@ -1908,6 +1908,12 @@ static struct vmap_area *find_unlink_vma
>
>  #define VMAP_BLOCK_SIZE		(VMAP_BBMAP_BITS * PAGE_SIZE)
>
> +/*
> + * Purge threshold to prevent overeager purging of fragmented blocks for
> + * regular operations: Purge if vb->free is less than 1/4 of the capacity.
> + */
> +#define VMAP_PURGE_THRESHOLD	(VMAP_BBMAP_BITS / 4)
> +
>  #define VMAP_RAM		0x1 /* indicates vm_map_ram area*/
>  #define VMAP_BLOCK		0x2 /* mark out the vmap_block sub-type*/
>  #define VMAP_FLAGS_MASK		0x3
> @@ -2087,12 +2093,16 @@ static void free_vmap_block(struct vmap_
>  }
>
>  static bool purge_fragmented_block(struct vmap_block *vb, struct vmap_block_queue *vbq,
> -				   struct list_head *purge_list)
> +				   struct list_head *purge_list, bool force_purge)
>  {
>  	if (!(vb->free + vb->dirty == VMAP_BBMAP_BITS && vb->dirty != VMAP_BBMAP_BITS))
>  		return false;
>
> -	/* prevent further allocs after releasing lock */
> +	/* Don't overeagerly purge usable blocks unless requested */
> +	if (!(force_purge || vb->free < VMAP_PURGE_THRESHOLD))
> +		return false;
> +
> +	/* prevent further allocs after releasing lock */
>  	WRITE_ONCE(vb->free, 0);
>  	/* prevent purging it again */
>  	WRITE_ONCE(vb->dirty, VMAP_BBMAP_BITS);
> @@ -2115,7 +2125,7 @@ static void free_purged_blocks(struct li
>  	}
>  }
>
> -static void purge_fragmented_blocks(int cpu)
> +static void purge_fragmented_blocks(int cpu, bool force_purge)
>  {
>  	LIST_HEAD(purge);
>  	struct vmap_block *vb;
> @@ -2130,19 +2140,19 @@ static void purge_fragmented_blocks(int
>  			continue;
>
>  		spin_lock(&vb->lock);
> -		purge_fragmented_block(vb, vbq, &purge);
> +		purge_fragmented_block(vb, vbq, &purge, force_purge);
>  		spin_unlock(&vb->lock);
>  	}
>  	rcu_read_unlock();
>  	free_purged_blocks(&purge);
>  }
>
> -static void purge_fragmented_blocks_allcpus(void)
> +static void purge_fragmented_blocks_allcpus(bool force_purge)
>  {
>  	int cpu;
>
>  	for_each_possible_cpu(cpu)
> -		purge_fragmented_blocks(cpu);
> +		purge_fragmented_blocks(cpu, force_purge);
>  }
>
>  static void *vb_alloc(unsigned long size, gfp_t gfp_mask)
> @@ -4173,7 +4183,7 @@ struct vm_struct **pcpu_get_vm_areas(con
>  overflow:
>  	spin_unlock(&free_vmap_area_lock);
>  	if (!purged) {
> -		purge_vmap_area_lazy();
> +		purge_vmap_area_lazy(true);
>  		purged = true;
>
>  		/* Before "retry", check if we recover. */

Wondering why we bother adding 'force_purge' to purge_vmap_area_lazy() and
purge_fragmented_blocks_allcpus() when all their callers pass true. Can't we
just pass true to purge_fragmented_block() in purge_fragmented_blocks()?

alloc_vmap_area()
pcpu_get_vm_areas()
-->purge_vmap_area_lazy(true)
   -->purge_fragmented_blocks_allcpus(force_purge=true)
      -->purge_fragmented_block(force_purge=true)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 062f4a86b049..c812f8afa985 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2140,7 +2140,7 @@ static void purge_fragmented_blocks(int cpu, bool force_purge)
 			continue;
 
 		spin_lock(&vb->lock);
-		purge_fragmented_block(vb, vbq, &purge, force_purge);
+		purge_fragmented_block(vb, vbq, &purge, true);
 		spin_unlock(&vb->lock);
 	}
 	rcu_read_unlock();

And one place of change is missing; without it the patch fails to build:

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 062f4a86b049..0453bc66812e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2277,7 +2277,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
 		 * not purgeable, check whether there is dirty
 		 * space to be flushed.
 		 */
-		if (!purge_fragmented_block(vb, vbq, &purge_list) &&
+		if (!purge_fragmented_block(vb, vbq, &purge_list, false) &&
 		    vb->dirty_max && vb->dirty != VMAP_BBMAP_BITS) {
 			unsigned long va_start = vb->va->va_start;
 			unsigned long s, e;