References:
 <20200528235238.74233-1-axelrasmussen@google.com>
 <20200528202435.65396221@oasis.local.home>
 <20200529080957.GI706495@hirez.programming.kicks-ass.net>
 <20200530000359.519f38720ab457435ecf7b6f@kernel.org>
In-Reply-To: <20200530000359.519f38720ab457435ecf7b6f@kernel.org>
From: Axel Rasmussen
Date: Fri, 29 May 2020 15:38:05 -0700
Subject: Re: [PATCH v2 0/7] Add histogram measuring mmap_lock contention latency
To: Masami Hiramatsu
Cc: Peter Zijlstra, Steven Rostedt, Andrew Morton, David Rientjes,
 Davidlohr Bueso, Ingo Molnar, Ingo Molnar, Jerome Glisse, Laurent Dufour,
 "Liam R . Howlett", Matthew Wilcox, Michel Lespinasse, Vlastimil Babka,
 Will Deacon, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, AKASHI Takahiro, Aleksa Sarai, Alexander Potapenko,
 Alexey Dobriyan, Al Viro, Andrei Vagin, Ard Biesheuvel, Brendan Higgins,
 chenqiwu, Christian Brauner, Christian Kellner, Corentin Labbe,
 Daniel Jordan, Dan Williams, David Gow, "David S. Miller",
 "Dmitry V. Levin", "Eric W. Biederman", Eugene Syromiatnikov, Jamie Liu,
 Jason Gunthorpe, John Garry, John Hubbard, Jonathan Adams, Junaid Shahid,
 Kees Cook, "Kirill A. Shutemov", Konstantin Khlebnikov,
 Krzysztof Kozlowski, Mark Rutland, Masahiro Yamada, Mathieu Desnoyers,
 Michal Hocko, Mikhail Zaslonko, Petr Mladek, Ralph Campbell, Randy Dunlap,
 Roman Gushchin, Shakeel Butt, Tal Gilboa, Thomas Gleixner,
 Uwe Kleine-König, Vincenzo Frascino, Yang Shi, Yu Zhao, Tom Zanussi

On Fri, May 29, 2020 at 8:04 AM Masami Hiramatsu wrote:
>
> On Fri, 29 May 2020 10:09:57 +0200
> Peter Zijlstra wrote:
>
> > On Thu, May 28, 2020 at 06:39:08PM -0700, Axel Rasmussen wrote:
> >
> > > The use case we have in mind for this is to enable this instrumentation
> > > widely in Google's production fleet. Internally, we have a userspace thing
> > > which scrapes these metrics and publishes them such that we can look at
> > > aggregate metrics across our fleet. The thinking is that mechanisms like
> > > lockdep or getting histograms with e.g. BPF attached to the tracepoint
> > > introduces too much overhead for this to be viable. (Although, granted, I
> > > don't have benchmarks to prove this - if there's skepticism, I can produce
> > > such a thing - or prove myself wrong and rethink my approach. :) )
> >
> > Whichever way around; I don't believe in special instrumentation like
> > this. We'll grow a thousand separate pieces of crap if we go this route.
> >
> > Next on, someone will come and instrument yet another lock, with yet
> > another 1000 lines of gunk.
> >
> > Why can't you kprobe the mmap_lock things and use ftrace histograms?
>
> +1.
> As far as I can see the series, if you want to make a histogram
> of the duration of acquiring locks, you might only need 7/7 (but this
> is a minimum subset.)
> I recommend you to introduce a set of tracepoints
> -- start-locking, locked, and released so that we can investigate
> which process is waiting for which one. Then you can use either bpf
> or ftrace to make a histogram easily.
>
> Thank you,
>
> --
> Masami Hiramatsu

The reasoning against using BPF or ftrace basically comes down to
overhead. My intuition is that BPF/ftrace are great for testing /
debugging on a small number of machines, but are less suitable for
leaving enabled in production across many servers. This may not be
generally true, but given how "hot" this lock is, I think it may be
something of a pathological case.

Consider maple trees and range locks: if we're running Linux on many
servers, with many different workloads, it's useful to see the impact
of these changes in production, and in aggregate, over a "long" period
of time, instead of just under test conditions on a small number of
machines.

I'll circle back next week with some benchmarks to confirm/deny my
intuition on this. If I can confirm the overhead of BPF / ftrace is low
enough, I'll pursue that route instead.

The point about special instrumentation is well taken. I completely
agree we don't want a file in /proc for each lock in the kernel. :)
I think there's some argument to be made that mmap_lock in particular
is "special", considering the amount of investment going into
optimizing it vs. other locks.
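
For concreteness, the tracepoint-based approach suggested above comes
down to pairing a "start-locking" event with the following "locked"
event for the same task and bucketing the difference. Below is a rough
userspace sketch of that pairing in Python. The event names, the
(pid, event, timestamp) record layout, and the power-of-two buckets are
all assumptions made for illustration; they are not taken from the
patch series or from any existing tracepoint.

```python
from collections import Counter


def latency_histogram(events):
    """Build a wait-time histogram from (pid, event_name, timestamp_ns) records.

    Each 'start_locking' is paired with the next 'locked' event from the
    same pid. Event names and record layout are hypothetical.
    """
    pending = {}         # pid -> timestamp of the outstanding start_locking
    buckets = Counter()  # bucket lower bound in ns -> number of samples
    for pid, name, ts_ns in events:
        if name == "start_locking":
            pending[pid] = ts_ns
        elif name == "locked" and pid in pending:
            wait = ts_ns - pending.pop(pid)
            # Lower bound is the largest power of two <= wait (0 if wait is 0).
            lower = 1 << (wait.bit_length() - 1) if wait > 0 else 0
            buckets[lower] += 1
        # 'released' is ignored here; it would matter for hold time instead.
    return dict(sorted(buckets.items()))


# Made-up sample trace for two tasks.
sample = [
    (101, "start_locking", 1_000),
    (102, "start_locking", 1_200),
    (101, "locked", 1_300),    # waited 300 ns
    (102, "locked", 6_200),    # waited 5000 ns
    (101, "released", 7_000),
    (101, "start_locking", 8_000),
    (101, "locked", 8_260),    # waited 260 ns
]
print(latency_histogram(sample))  # {256: 2, 4096: 1}
```

The sketch says nothing about the cost of collecting these events in
the first place, which is the open question the benchmarks are meant
to answer.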