From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.3 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E26DEC4320A for ; Wed, 1 Sep 2021 10:52:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 443D560243 for ; Wed, 1 Sep 2021 10:52:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 443D560243 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A95366B006C; Wed, 1 Sep 2021 06:52:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A44086B0071; Wed, 1 Sep 2021 06:52:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 90C306B0072; Wed, 1 Sep 2021 06:52:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0044.hostedemail.com [216.40.44.44]) by kanga.kvack.org (Postfix) with ESMTP id 83D706B006C for ; Wed, 1 Sep 2021 06:52:02 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 4AFF31802F639 for ; Wed, 1 Sep 2021 10:52:02 +0000 (UTC) X-FDA: 78538689684.06.4EFF60E Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by imf04.hostedemail.com (Postfix) with ESMTP id B5E0850000A1 for ; Wed, 1 Sep 2021 10:52:01 +0000 (UTC) Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 7DA1F202BE; Wed, 1 Sep 2021 10:52:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1630493520; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FY/3larYehmNS+MRr0SOgl2y9fcSkfnRp8Vu0FsWvSU=; b=IK2fkm3HZiRmmIk9/C5/dMb0MjKLbYe7/OBaxmjnyUoopTgLWZff+zAnE9z5CvJN/OazYT FGAhJVsoCRY0MNdaw4dtVMY1pmJjN6hE+/LwWQzWUJacoh265JdL3BiLdjfTkpHvvmfWxH hoaf32uIiZYZ5NrmpyN78DHowA1HrKw= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1630493520; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FY/3larYehmNS+MRr0SOgl2y9fcSkfnRp8Vu0FsWvSU=; b=PAKREOovTg72roI9nAbj6zXLDBg4z/DsbjUD8r+qKzwnYm8dZJ8FBFgRxDvaKveNK4qBHN EYwV9xt01v9UTsAA== Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap1.suse-dmz.suse.de (Postfix) with ESMTPS id 6513213A44; Wed, 1 Sep 2021 10:52:00 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap1.suse-dmz.suse.de with ESMTPSA id 5DO/F1BbL2FhLwAAGKfGzw (envelope-from ); Wed, 01 Sep 2021 10:52:00 +0000 Message-ID: <1585872a-1562-c74a-b686-e0051fa75cda@suse.cz> Date: Wed, 1 Sep 2021 12:52:00 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.0.3 Content-Language: en-US To: Liam Howlett , "linux-mm@kvack.org" Cc: Michel Lespinasse References: <20210831203959.lwofr24z63wjsgkc@revolver> From: Vlastimil Babka Subject: Re: Strange mmap_lock tracepoint In-Reply-To: <20210831203959.lwofr24z63wjsgkc@revolver> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=IK2fkm3H; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=PAKREOov; spf=pass (imf04.hostedemail.com: domain of vbabka@suse.cz designates 195.135.220.29 as permitted sender) smtp.mailfrom=vbabka@suse.cz; dmarc=none X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: B5E0850000A1 X-Stat-Signature: ih41u4pouuxuayanu68ew6o9q4kazk5o X-HE-Tag: 1630493521-496605 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 8/31/21 22:40, Liam Howlett wrote: > > Hello, > > I've been trying to trace the mmap_lock calls using tracepoints and > captured this behaviour which I cannot explain. This was with the maple > tree v2 patches running ebizzy with multiple threads in an x86_64 KVM > using 8 virtual CPUs. > > AFAICT, there are zero callers that use the mmap_lock directly besides a > prefetchw(&mm->mmap_lock); > > > ebizzy-803 [000] .... 5376.655366: lock_release: 00000000de8cf25e lock > ebizzy-803 [000] .... 5376.655366: lock_release: 00000000b5e38448 ptlock_ptr(page) > ebizzy-803 [000] .... 5376.655367: lock_release: 000000006b581afd &mm->mmap_lock > ebizzy-803 [000] .... 5376.655367: mmap_lock_released: mm=000000001de7b122 memcg_path= write=false > > ebizzy-803 [000] .... 5376.655369: mmap_lock_start_locking: mm=000000001de7b122 memcg_path= write=false > > ebizzy-803 [000] .... 5376.655369: mmap_lock_acquire_returned: mm=000000001de7b122 memcg_path= write=false success=false > > ebizzy-803 [000] .... 5376.655369: mmap_lock_start_locking: mm=000000001de7b122 memcg_path= write=false I was thinking khugepaged interference, but this seems to be a system-wide tracing (judging from -0) so you would see khugepaged too, right? In the other hand it seems strange to have a long list of just cpu 0 here. Are other CPU's missing or just the interleaving is imperfect because timestamps are not perfectly in sync between cpus, and in fact there was another CPU who took the lock? > ebizzy-803 [000] .... 5376.655369: lock_acquire: 000000006b581afd read &mm->mmap_lock > ebizzy-803 [000] d... 5376.655369: lock_acquire: 00000000984d533e &sem->wait_lock > ebizzy-803 [000] d... 5376.655370: lock_release: 00000000984d533e &sem->wait_lock > ebizzy-803 [000] d... 5376.655370: lock_acquire: 00000000cbb1a456 &rq->__lock > ebizzy-803 [000] d... 5376.655371: lock_acquire: 00000000b3d53a93 read rcu_read_lock > ebizzy-803 [000] d... 5376.655371: lock_release: 00000000b3d53a93 rcu_read_lock > ebizzy-803 [000] d... 5376.655371: lock_acquire: 00000000b3d53a93 read rcu_read_lock > ebizzy-803 [000] d... 5376.655371: lock_release: 00000000b3d53a93 rcu_read_lock > ebizzy-803 [000] d... 5376.655372: lock_acquire: 00000000b3d53a93 read rcu_read_lock > ebizzy-803 [000] d... 5376.655372: lock_release: 00000000b3d53a93 rcu_read_lock > ebizzy-803 [000] d... 5376.655372: lock_release: 00000000cbb1a456 &rq->__lock > -0 [000] d... 5376.655373: lock_acquire: 00000000cbb1a456 &rq->__lock > -0 [000] d... 5376.655373: lock_release: 00000000cbb1a456 &rq->__lock > -0 [000] d... 5376.655373: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount > -0 [000] d... 5376.655374: lock_release: 00000000bb7b405b tk_core.seq.seqcount > -0 [000] d... 5376.655374: lock_acquire: 000000007c3ac380 read jiffies_seq.seqcount > -0 [000] d... 5376.655374: lock_release: 000000007c3ac380 jiffies_seq.seqcount > -0 [000] d.h. 5376.655460: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount > -0 [000] d.h. 5376.655460: lock_release: 00000000bb7b405b tk_core.seq.seqcount > -0 [000] d.h. 5376.655460: lock_acquire: 00000000f7c85b8f hrtimer_bases.lock > -0 [000] d.h. 5376.655461: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount > -0 [000] d.h. 5376.655461: lock_release: 00000000bb7b405b tk_core.seq.seqcount > -0 [000] d.h. 5376.655461: lock_release: 00000000f7c85b8f hrtimer_bases.lock > -0 [000] d.h. 5376.655462: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount > -0 [000] d.h. 5376.655462: lock_release: 00000000bb7b405b tk_core.seq.seqcount > -0 [000] d.h. 5376.655462: lock_acquire: 00000000cbb1a456 &rq->__lock > -0 [000] d.h. 5376.655463: lock_release: 00000000cbb1a456 &rq->__lock > -0 [000] d.h. 5376.655463: lock_acquire: 00000000f7c85b8f hrtimer_bases.lock > -0 [000] d.h. 5376.655464: lock_release: 00000000f7c85b8f hrtimer_bases.lock > -0 [000] d.h. 5376.655464: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount > -0 [000] d.h. 5376.655464: lock_release: 00000000bb7b405b tk_core.seq.seqcount > -0 [000] d... 5376.655466: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount > -0 [000] d... 5376.655466: lock_release: 00000000bb7b405b tk_core.seq.seqcount > -0 [000] d... 5376.655467: lock_acquire: 000000007c3ac380 read jiffies_seq.seqcount > -0 [000] d... 5376.655467: lock_release: 000000007c3ac380 jiffies_seq.seqcount > -0 [000] d.h. 5376.655488: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount > -0 [000] d.h. 5376.655489: lock_release: 00000000bb7b405b tk_core.seq.seqcount > -0 [000] d.h. 5376.655489: lock_acquire: 00000000cbb1a456 &rq->__lock > -0 [000] dNh. 5376.655490: lock_release: 00000000cbb1a456 &rq->__lock > -0 [000] dN.. 5376.655490: lock_acquire: 00000000cbb1a456 &rq->__lock > -0 [000] d... 5376.655491: lock_release: 00000000cbb1a456 &rq->__lock > ebizzy-803 [000] d... 5376.655491: lock_acquire: 00000000cbb1a456 &rq->__lock > ebizzy-803 [000] d... 5376.655492: lock_release: 00000000cbb1a456 &rq->__lock > ebizzy-803 [000] .... 5376.655492: mmap_lock_acquire_returned: mm=000000001de7b122 memcg_path= write=false success=true > > The following tracepoints were enabled: > events/mmap_lock/enable > events/lock/enable > > My reading of the above trace is that the ebizzy thread dropped the lock > and immediately attempted to reacquire and detected it was in contention > so the thread added itself to the list and went to sleep only to wake up > and get the lock? Which path does this mmap_read_trylock() followed by immediate mmap_read_lock() anyway? I mean down_read() is implemented like this internally, but that wouldn't generate these mmap_lock_ events. > Doesn't that mean the mmap_lock is _not_ contended? The lack of > contention makes sense since there is no tracepoint of an attempt to > acquire the lock between the dropping of the lock and the same MM waking > up to get the lock. > > What is even stranger is if I change the value of _Q_PENDING_LOOPS from > (1 << 9) to (1 << 10), the benchmark performs better. > > From the above trace and the effects of the _Q_PENDING_LOOPS change, it > looks like the lock is somehow remaining in a state that causes a > failure to acquire the lock even when it is not held or contended? > > > Thanks, > Liam >