From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 19 Sep 2023 10:09:09 -0400
From: Steven Rostedt
To: Jaewon Kim
Cc: yuzhao@google.com, tjmercier@google.com, kaleshsingh@google.com,
 akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org,
 linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
 linux-mm@kvack.org, jaewon31.kim@gmail.com
Subject: Re: [PATCH] vmscan: add trace events for lru_gen
Message-ID: <20230919095927.5a964094@gandalf.local.home>
In-Reply-To: <20230919025216.1878-1-jaewon31.kim@samsung.com>
References: <20230919025216.1878-1-jaewon31.kim@samsung.com>
X-Mailer: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

On Tue, 19 Sep 2023 11:52:16 +0900
Jaewon Kim wrote:

> /*
>  * Now redefine the EM() and EMe() macros to map the enums to the strings
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index d2123dd960d5..e8f9d0452e89 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -327,6 +327,55 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
>  		__print_symbolic(__entry->lru, LRU_NAMES))
>  );
>  
> +TRACE_EVENT(mm_vmscan_lru_gen_scan,
> +	TP_PROTO(int highest_zoneidx,
> +		 int order,
> +		 unsigned long nr_requested,
> +		 unsigned long nr_scanned,
> +		 unsigned long nr_skipped,
> +		 unsigned long nr_taken,
> +		 isolate_mode_t isolate_mode,
> +		 int lru),

This is a lot of parameter passing; can you consolidate it?
(See below, where you call this.)

> +
> +	TP_ARGS(highest_zoneidx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, isolate_mode, lru),
> +
> +	TP_STRUCT__entry(
> +		__field(int, highest_zoneidx)
> +		__field(int, order)
> +		__field(unsigned long, nr_requested)
> +		__field(unsigned long, nr_scanned)
> +		__field(unsigned long, nr_skipped)
> +		__field(unsigned long, nr_taken)
> +		__field(unsigned int, isolate_mode)
> +		__field(int, lru)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->highest_zoneidx = highest_zoneidx;
> +		__entry->order = order;
> +		__entry->nr_requested = nr_requested;
> +		__entry->nr_scanned = nr_scanned;
> +		__entry->nr_skipped = nr_skipped;
> +		__entry->nr_taken = nr_taken;
> +		__entry->isolate_mode = (__force unsigned int)isolate_mode;
> +		__entry->lru = lru;
> +	),
> +
> +	/*
> +	 * classzone is previous name of the highest_zoneidx.
> +	 * Reason not to change it is the ABI requirement of the tracepoint.
> +	 */
> +	TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu lru=%s",
> +		__entry->isolate_mode,
> +		__entry->highest_zoneidx,
> +		__entry->order,
> +		__entry->nr_requested,
> +		__entry->nr_scanned,
> +		__entry->nr_skipped,
> +		__entry->nr_taken,
> +		__print_symbolic(__entry->lru, LRU_GEN_NAMES))
> +);
> +
>  TRACE_EVENT(mm_vmscan_write_folio,
>  
>  	TP_PROTO(struct folio *folio),
> @@ -437,6 +486,53 @@ TRACE_EVENT(mm_vmscan_lru_shrink_active,
>  		show_reclaim_flags(__entry->reclaim_flags))
>  );
>  
> +TRACE_EVENT(mm_vmscan_lru_gen_evict,
> +
> +	TP_PROTO(int nid, unsigned long nr_reclaimed,
> +		 struct reclaim_stat *stat, int priority, int file),
> +
> +	TP_ARGS(nid, nr_reclaimed, stat, priority, file),
> +
> +	TP_STRUCT__entry(
> +		__field(int, nid)

On 64-bit architectures, this causes a 4-byte hole in the ring buffer
layout. Please keep 32-bit sized fields paired with other 32-bit sized
fields if possible. That is, move the above "int nid" down to where it
doesn't push a long field 4 bytes out of alignment.

> +		__field(unsigned long, nr_reclaimed)
> +		__field(unsigned long, nr_dirty)
> +		__field(unsigned long, nr_writeback)
> +		__field(unsigned long, nr_congested)
> +		__field(unsigned long, nr_immediate)
> +		__field(unsigned int, nr_activate0)
> +		__field(unsigned int, nr_activate1)
> +		__field(unsigned long, nr_ref_keep)
> +		__field(unsigned long, nr_unmap_fail)

		__field(int, nid) here!
> +		__field(int, priority)
> +		__field(int, reclaim_flags)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->nid = nid;
> +		__entry->nr_reclaimed = nr_reclaimed;
> +		__entry->nr_dirty = stat->nr_dirty;
> +		__entry->nr_writeback = stat->nr_writeback;
> +		__entry->nr_congested = stat->nr_congested;
> +		__entry->nr_immediate = stat->nr_immediate;
> +		__entry->nr_activate0 = stat->nr_activate[0];
> +		__entry->nr_activate1 = stat->nr_activate[1];
> +		__entry->nr_ref_keep = stat->nr_ref_keep;
> +		__entry->nr_unmap_fail = stat->nr_unmap_fail;
> +		__entry->priority = priority;
> +		__entry->reclaim_flags = trace_reclaim_flags(file);
> +	),
> +
> +	TP_printk("nid=%d nr_reclaimed=%ld nr_dirty=%ld nr_writeback=%ld nr_congested=%ld nr_immediate=%ld nr_activate_anon=%d nr_activate_file=%d nr_ref_keep=%ld nr_unmap_fail=%ld priority=%d flags=%s",
> +		__entry->nid, __entry->nr_reclaimed,
> +		__entry->nr_dirty, __entry->nr_writeback,
> +		__entry->nr_congested, __entry->nr_immediate,
> +		__entry->nr_activate0, __entry->nr_activate1,
> +		__entry->nr_ref_keep, __entry->nr_unmap_fail,
> +		__entry->priority,
> +		show_reclaim_flags(__entry->reclaim_flags))
> +);
> +
>  TRACE_EVENT(mm_vmscan_node_reclaim_begin,
>  
>  	TP_PROTO(int nid, int order, gfp_t gfp_flags),
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 6f13394b112e..cc10e3fb8fa2 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -5005,6 +5005,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
>  	int sorted = 0;
>  	int scanned = 0;
>  	int isolated = 0;
> +	int skipped = 0;
>  	int remaining = MAX_LRU_BATCH;
>  	struct lru_gen_folio *lrugen = &lruvec->lrugen;
>  	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
> @@ -5018,7 +5019,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
>  
>  	for (i = MAX_NR_ZONES; i > 0; i--) {
>  		LIST_HEAD(moved);
> -		int skipped = 0;
> +		int skipped_zone = 0;
>  		int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
>  		struct list_head *head = &lrugen->folios[gen][type][zone];
>  
> @@ -5040,16 +5041,17 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
>  			isolated += delta;
>  		} else {
>  			list_move(&folio->lru, &moved);
> -			skipped += delta;
> +			skipped_zone += delta;
>  		}
>  
> -		if (!--remaining || max(isolated, skipped) >= MIN_LRU_BATCH)
> +		if (!--remaining || max(isolated, skipped_zone) >= MIN_LRU_BATCH)
>  			break;
>  	}
>  
> -	if (skipped) {
> +	if (skipped_zone) {
>  		list_splice(&moved, head);
> -		__count_zid_vm_events(PGSCAN_SKIP, zone, skipped);
> +		__count_zid_vm_events(PGSCAN_SKIP, zone, skipped_zone);
> +		skipped += skipped_zone;
>  	}
>  
>  	if (!remaining || isolated >= MIN_LRU_BATCH)
> @@ -5065,6 +5067,10 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
>  	__count_memcg_events(memcg, PGREFILL, sorted);
>  	__count_vm_events(PGSCAN_ANON + type, isolated);
>  
> +	if (scanned)

BTW, you can make this branch conditional with the trace event logic, so
that it isn't tested when tracing is disabled. That is, remove the
"if (scanned)" test and use TRACE_EVENT_CONDITION() as I show below.

> +		trace_mm_vmscan_lru_gen_scan(sc->reclaim_idx, sc->order,
> +				MAX_LRU_BATCH, scanned, skipped, isolated,
> +				sc->may_unmap ? 0 : ISOLATE_UNMAPPED, type);

Why not pass the sc in to the trace event, and then do the assigning
there?

	// use CONDITION to test scanned
	TRACE_EVENT_CONDITION(mm_vmscan_lru_gen_scan,
		TP_PROTO(struct scan_control *sc,
			unsigned long nr_requested,
			unsigned long nr_scanned,
			unsigned long nr_skipped,
			unsigned long nr_taken,
			int lru),

		TP_ARGS(...)

		TP_CONDITION(nr_scanned)

		TP_fast_assign(
			__entry->highest_zoneidx = sc->reclaim_idx;
			__entry->order = sc->order;
			__entry->nr_requested = nr_requested;
			__entry->nr_scanned = nr_scanned;
			__entry->nr_skipped = nr_skipped;
			__entry->nr_taken = nr_taken;
			__entry->isolate_mode = (__force unsigned int)(sc->may_unmap ? 0 : ISOLATE_UNMAPPED);
			__entry->lru = lru;
		),

Lots of parameters can be expensive to pass, as it requires more copying.
-- 
Steve

> 
>  	/*
>  	 * There might not be eligible folios due to reclaim_idx. Check the
>  	 * remaining to prevent livelock if it's not making progress.
> @@ -5194,6 +5200,8 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
>  retry:
>  	reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);
>  	sc->nr_reclaimed += reclaimed;
> +	trace_mm_vmscan_lru_gen_evict(pgdat->node_id, reclaimed, &stat,
> +			sc->priority, type);
>  
>  	list_for_each_entry_safe_reverse(folio, next, &list, lru) {
>  		if (!folio_evictable(folio)) {