From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C3D1C433EF for ; Tue, 14 Jun 2022 18:22:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B12E16B0071; Tue, 14 Jun 2022 14:22:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AC27C6B0072; Tue, 14 Jun 2022 14:22:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 98A646B0073; Tue, 14 Jun 2022 14:22:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 87F4F6B0071 for ; Tue, 14 Jun 2022 14:22:48 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 2B4C1210D5 for ; Tue, 14 Jun 2022 18:22:48 +0000 (UTC) X-FDA: 79577662416.28.16A6968 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf25.hostedemail.com (Postfix) with ESMTP id AC8FBA00A8 for ; Tue, 14 Jun 2022 18:22:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655230967; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0hxI4K0t8DRahKzprtzQSK9X+0OxW+lQYlpeiHgm6W0=; b=eSO/x+qEXU2i8l0z0j85nu+5b93Lk9aVGacgPuzJZVL8tsg8FWLgbFUhxPvzhr73u/5OtI C5UYJ9eL5HgnXpPlB3Kp9rTegdzOvt3tUcBinIx06AXY8l34cERiT9nfrRMOd+TcHP1Fwc MspbBgTcCODiVbcj9UrSYm+mQkagoxw= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-619-RHgqwUqWM7mrjlXUpNsy5w-1; Tue, 14 Jun 2022 14:22:41 -0400 X-MC-Unique: RHgqwUqWM7mrjlXUpNsy5w-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DB01A1C05158; Tue, 14 Jun 2022 18:22:40 +0000 (UTC) Received: from [10.22.33.116] (unknown [10.22.33.116]) by smtp.corp.redhat.com (Postfix) with ESMTP id A1DDF2026D64; Tue, 14 Jun 2022 18:22:40 +0000 (UTC) Message-ID: <325768cd-19bd-71ae-83d6-1ca5e84f7462@redhat.com> Date: Tue, 14 Jun 2022 14:22:40 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Subject: Re: [PATCH 3/3] mm/kmemleak: Prevent soft lockup in first object iteration loop of kmemleak_scan() Content-Language: en-US To: Catalin Marinas Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <20220612183301.981616-1-longman@redhat.com> <20220612183301.981616-4-longman@redhat.com> From: Waiman Long In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655230967; a=rsa-sha256; cv=none; b=Lo71IHjCvnea1O/FsHLSEz2/P0T5pF8Ss0w0FKYCddDYuOO8+oOp4KGUf7E+uzqbCVfS4q eBg6eUElHxGdwsF6nNzGMMqlSQqM6p8/si/QvGqwEfrhBNpApNzLlaaIt1nWc581EmQxPY wnSWJYf8QRn7NByAMOgXvhAWcqTDQ5s= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b="eSO/x+qE"; spf=none (imf25.hostedemail.com: domain of longman@redhat.com has no SPF policy when checking 170.10.129.124) smtp.mailfrom=longman@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655230967; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=0hxI4K0t8DRahKzprtzQSK9X+0OxW+lQYlpeiHgm6W0=; b=nj+jUTmQcgR1UrldsFj9VFaZkJVZr9vhDbqPMAU3LTHEwdrdkpcRL2Zyx5scvLLHEgMFqY 2A/Y1FLoKWGztaevN2R+VDTcHk/ZmtbOeOYhcE4q8A3VBPAncZaNxPmAb8DKjKuqdI7YjZ hMWD+pQq4R4prUnUISNQW8AGPQE5xLA= X-Stat-Signature: u9si7pp77drp9fk6j5ge8nhg4n9y11np X-Rspamd-Queue-Id: AC8FBA00A8 Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b="eSO/x+qE"; spf=none (imf25.hostedemail.com: domain of longman@redhat.com has no SPF policy when checking 170.10.129.124) smtp.mailfrom=longman@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Rspamd-Server: rspam02 X-HE-Tag: 1655230967-639106 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 6/14/22 13:27, Catalin Marinas wrote: > On Tue, Jun 14, 2022 at 06:15:14PM +0100, Catalin Marinas wrote: >> On Sun, Jun 12, 2022 at 02:33:01PM -0400, Waiman Long wrote: >>> @@ -1437,10 +1440,25 @@ static void kmemleak_scan(void) >>> #endif >>> /* reset the reference count (whiten the object) */ >>> object->count = 0; >>> - if (color_gray(object) && get_object(object)) >>> + if (color_gray(object) && get_object(object)) { >>> list_add_tail(&object->gray_list, &gray_list); >>> + gray_list_cnt++; >>> + object_pinned = true; >>> + } >>> I may have the mistaken belief that setting count to 0 will make most object gray. Apparently, that may not be the case here. >>> raw_spin_unlock_irq(&object->lock); >>> + >>> + /* >>> + * With object pinned by a positive reference count, it >>> + * won't go away and we can safely release the RCU read >>> + * lock and do a cond_resched() to avoid soft lockup every >>> + * 64k objects. >>> + */ >>> + if (object_pinned && !(gray_list_cnt & 0xffff)) { >>> + rcu_read_unlock(); >>> + cond_resched(); >>> + rcu_read_lock(); >>> + } >> I'm not sure this gains much. There should be very few gray objects >> initially (those passed to kmemleak_not_leak() for example). The >> majority should be white objects. >> >> If we drop the fine-grained object->lock, we could instead take >> kmemleak_lock outside the loop with a cond_resched_lock(&kmemleak_lock) >> within the loop. I think we can get away with not having an >> rcu_read_lock() at all for list traversal with the big lock outside the >> loop. > Actually this doesn't work is the current object in the iteration is > freed. Does list_for_each_rcu_safe() help? list_for_each_rcu_safe() helps if we are worrying about object being freed. However, it won't help if object->next is freed instead. How about something like: diff --git a/mm/kmemleak.c b/mm/kmemleak.c index 7dd64139a7c7..fd836e43cb16 100644 --- a/mm/kmemleak.c +++ b/mm/kmemleak.c @@ -1417,12 +1417,16 @@ static void kmemleak_scan(void)         struct zone *zone;         int __maybe_unused i;         int new_leaks = 0; +       int loop1_cnt = 0;         jiffies_last_scan = jiffies;         /* prepare the kmemleak_object's */         rcu_read_lock();         list_for_each_entry_rcu(object, &object_list, object_list) { +               bool obj_pinned = false; + +               loop1_cnt++;                 raw_spin_lock_irq(&object->lock);  #ifdef DEBUG                 /* @@ -1437,10 +1441,32 @@ static void kmemleak_scan(void)  #endif                 /* reset the reference count (whiten the object) */                 object->count = 0; -               if (color_gray(object) && get_object(object)) +               if (color_gray(object) && get_object(object)) {                         list_add_tail(&object->gray_list, &gray_list); +                       obj_pinned = true; +               }                 raw_spin_unlock_irq(&object->lock); + +               /* +                * Do a cond_resched() to avoid soft lockup every 64k objects. +                * Make sure a reference has been taken so that the object +                * won't go away without RCU read lock. +                */ +               if (loop1_cnt & 0xffff) { +                       if (!obj_pinned && !get_object(object)) { +                               /* Try the next object instead */ +                               loop1_cnt--; +                               continue; +                       } + +                       rcu_read_unlock(); +                       cond_resched(); +                       rcu_read_lock(); + +                       if (!obj_pinned) +                               put_object(object); +               }         }         rcu_read_unlock(); Cheers, Longman