linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, Namhyung Kim <namhyung@kernel.org>,
	David Ahern <dsahern@gmail.com>, Jiri Olsa <jolsa@redhat.com>,
	Minchan Kim <minchan@kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	linux-mm@kvack.org
Subject: Re: [PATCH 1/5] tracing, mm: Record pfn instead of pointer to struct page
Date: Thu, 31 Aug 2017 13:38:17 +0200	[thread overview]
Message-ID: <39539ab2-c697-e286-dd6b-2c5eacd8bb90@suse.cz> (raw)
In-Reply-To: <897eb045-d63c-b9e3-c6e7-0f6b94536c0f@suse.cz>

Ping?

On 07/31/2017 09:43 AM, Vlastimil Babka wrote:
> On 04/14/2015 12:14 AM, Arnaldo Carvalho de Melo wrote:
>> From: Namhyung Kim <namhyung@kernel.org>
>>
>> The struct page is opaque for userspace tools, so it'd be better to save
>> pfn in order to identify page frames.
>>
>> The textual output of $debugfs/tracing/trace file remains unchanged and
>> only raw (binary) data format is changed - but thanks to libtraceevent,
>> userspace tools which deal with the raw data (like perf and trace-cmd)
>> can parse the format easily.
> 
> Hmm it seems trace-cmd doesn't work that well, at least on current
> x86_64 kernel where I noticed it:
> 
>  trace-cmd-22020 [003] 105219.542610: mm_page_alloc:        [FAILED TO PARSE] pfn=0x165cb4 order=0 gfp_flags=29491274 migratetype=1
> 
> I'm quite sure it's due to the "page=%p" part, which uses pfn_to_page().
> The events/kmem/mm_page_alloc/format file contains this for page:
> 
> REC->pfn != -1UL ? (((struct page *)vmemmap_base) + (REC->pfn)) : ((void *)0)
> 
> I think userspace can't know vmmemap_base nor the implied sizeof(struct
> page) for pointer arithmetic?
> 
> On older 4.4-based kernel:
> 
> REC->pfn != -1UL ? (((struct page *)(0xffffea0000000000UL)) + (REC->pfn)) : ((void *)0)
> 
> This also fails to parse, so it must be the struct page part?
> 
> I think the problem is, even if ve solve this with some more
> preprocessor trickery to make the format file contain only constant
> numbers, pfn_to_page() on e.g. sparse memory model without vmmemap is
> more complicated than simple arithmetic, and can't be exported in the
> format file.
> 
> I'm afraid that to support userspace parsing of the trace data, we will
> have to store both struct page and pfn... or perhaps give up on reporting
> the struct page pointer completely. Thoughts?
> 
>> So impact on the userspace will also be
>> minimal.
>>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> Based-on-patch-by: Joonsoo Kim <js1304@gmail.com>
>> Acked-by: Ingo Molnar <mingo@kernel.org>
>> Acked-by: Steven Rostedt <rostedt@goodmis.org>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: linux-mm@kvack.org
>> Link: http://lkml.kernel.org/r/1428298576-9785-3-git-send-email-namhyung@kernel.org
>> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>> ---
>>  include/trace/events/filemap.h |  8 ++++----
>>  include/trace/events/kmem.h    | 42 +++++++++++++++++++++---------------------
>>  include/trace/events/vmscan.h  |  8 ++++----
>>  3 files changed, 29 insertions(+), 29 deletions(-)
>>
>> diff --git a/include/trace/events/filemap.h b/include/trace/events/filemap.h
>> index 0421f49a20f7..42febb6bc1d5 100644
>> --- a/include/trace/events/filemap.h
>> +++ b/include/trace/events/filemap.h
>> @@ -18,14 +18,14 @@ DECLARE_EVENT_CLASS(mm_filemap_op_page_cache,
>>  	TP_ARGS(page),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(struct page *, page)
>> +		__field(unsigned long, pfn)
>>  		__field(unsigned long, i_ino)
>>  		__field(unsigned long, index)
>>  		__field(dev_t, s_dev)
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->page = page;
>> +		__entry->pfn = page_to_pfn(page);
>>  		__entry->i_ino = page->mapping->host->i_ino;
>>  		__entry->index = page->index;
>>  		if (page->mapping->host->i_sb)
>> @@ -37,8 +37,8 @@ DECLARE_EVENT_CLASS(mm_filemap_op_page_cache,
>>  	TP_printk("dev %d:%d ino %lx page=%p pfn=%lu ofs=%lu",
>>  		MAJOR(__entry->s_dev), MINOR(__entry->s_dev),
>>  		__entry->i_ino,
>> -		__entry->page,
>> -		page_to_pfn(__entry->page),
>> +		pfn_to_page(__entry->pfn),
>> +		__entry->pfn,
>>  		__entry->index << PAGE_SHIFT)
>>  );
>>  
>> diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
>> index 4ad10baecd4d..81ea59812117 100644
>> --- a/include/trace/events/kmem.h
>> +++ b/include/trace/events/kmem.h
>> @@ -154,18 +154,18 @@ TRACE_EVENT(mm_page_free,
>>  	TP_ARGS(page, order),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(	struct page *,	page		)
>> +		__field(	unsigned long,	pfn		)
>>  		__field(	unsigned int,	order		)
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->page		= page;
>> +		__entry->pfn		= page_to_pfn(page);
>>  		__entry->order		= order;
>>  	),
>>  
>>  	TP_printk("page=%p pfn=%lu order=%d",
>> -			__entry->page,
>> -			page_to_pfn(__entry->page),
>> +			pfn_to_page(__entry->pfn),
>> +			__entry->pfn,
>>  			__entry->order)
>>  );
>>  
>> @@ -176,18 +176,18 @@ TRACE_EVENT(mm_page_free_batched,
>>  	TP_ARGS(page, cold),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(	struct page *,	page		)
>> +		__field(	unsigned long,	pfn		)
>>  		__field(	int,		cold		)
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->page		= page;
>> +		__entry->pfn		= page_to_pfn(page);
>>  		__entry->cold		= cold;
>>  	),
>>  
>>  	TP_printk("page=%p pfn=%lu order=0 cold=%d",
>> -			__entry->page,
>> -			page_to_pfn(__entry->page),
>> +			pfn_to_page(__entry->pfn),
>> +			__entry->pfn,
>>  			__entry->cold)
>>  );
>>  
>> @@ -199,22 +199,22 @@ TRACE_EVENT(mm_page_alloc,
>>  	TP_ARGS(page, order, gfp_flags, migratetype),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(	struct page *,	page		)
>> +		__field(	unsigned long,	pfn		)
>>  		__field(	unsigned int,	order		)
>>  		__field(	gfp_t,		gfp_flags	)
>>  		__field(	int,		migratetype	)
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->page		= page;
>> +		__entry->pfn		= page ? page_to_pfn(page) : -1UL;
>>  		__entry->order		= order;
>>  		__entry->gfp_flags	= gfp_flags;
>>  		__entry->migratetype	= migratetype;
>>  	),
>>  
>>  	TP_printk("page=%p pfn=%lu order=%d migratetype=%d gfp_flags=%s",
>> -		__entry->page,
>> -		__entry->page ? page_to_pfn(__entry->page) : 0,
>> +		__entry->pfn != -1UL ? pfn_to_page(__entry->pfn) : NULL,
>> +		__entry->pfn != -1UL ? __entry->pfn : 0,
>>  		__entry->order,
>>  		__entry->migratetype,
>>  		show_gfp_flags(__entry->gfp_flags))
>> @@ -227,20 +227,20 @@ DECLARE_EVENT_CLASS(mm_page,
>>  	TP_ARGS(page, order, migratetype),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(	struct page *,	page		)
>> +		__field(	unsigned long,	pfn		)
>>  		__field(	unsigned int,	order		)
>>  		__field(	int,		migratetype	)
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->page		= page;
>> +		__entry->pfn		= page ? page_to_pfn(page) : -1UL;
>>  		__entry->order		= order;
>>  		__entry->migratetype	= migratetype;
>>  	),
>>  
>>  	TP_printk("page=%p pfn=%lu order=%u migratetype=%d percpu_refill=%d",
>> -		__entry->page,
>> -		__entry->page ? page_to_pfn(__entry->page) : 0,
>> +		__entry->pfn != -1UL ? pfn_to_page(__entry->pfn) : NULL,
>> +		__entry->pfn != -1UL ? __entry->pfn : 0,
>>  		__entry->order,
>>  		__entry->migratetype,
>>  		__entry->order == 0)
>> @@ -260,7 +260,7 @@ DEFINE_EVENT_PRINT(mm_page, mm_page_pcpu_drain,
>>  	TP_ARGS(page, order, migratetype),
>>  
>>  	TP_printk("page=%p pfn=%lu order=%d migratetype=%d",
>> -		__entry->page, page_to_pfn(__entry->page),
>> +		pfn_to_page(__entry->pfn), __entry->pfn,
>>  		__entry->order, __entry->migratetype)
>>  );
>>  
>> @@ -275,7 +275,7 @@ TRACE_EVENT(mm_page_alloc_extfrag,
>>  		alloc_migratetype, fallback_migratetype),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(	struct page *,	page			)
>> +		__field(	unsigned long,	pfn			)
>>  		__field(	int,		alloc_order		)
>>  		__field(	int,		fallback_order		)
>>  		__field(	int,		alloc_migratetype	)
>> @@ -284,7 +284,7 @@ TRACE_EVENT(mm_page_alloc_extfrag,
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->page			= page;
>> +		__entry->pfn			= page_to_pfn(page);
>>  		__entry->alloc_order		= alloc_order;
>>  		__entry->fallback_order		= fallback_order;
>>  		__entry->alloc_migratetype	= alloc_migratetype;
>> @@ -294,8 +294,8 @@ TRACE_EVENT(mm_page_alloc_extfrag,
>>  	),
>>  
>>  	TP_printk("page=%p pfn=%lu alloc_order=%d fallback_order=%d pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d change_ownership=%d",
>> -		__entry->page,
>> -		page_to_pfn(__entry->page),
>> +		pfn_to_page(__entry->pfn),
>> +		__entry->pfn,
>>  		__entry->alloc_order,
>>  		__entry->fallback_order,
>>  		pageblock_order,
>> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
>> index 69590b6ffc09..f66476b96264 100644
>> --- a/include/trace/events/vmscan.h
>> +++ b/include/trace/events/vmscan.h
>> @@ -336,18 +336,18 @@ TRACE_EVENT(mm_vmscan_writepage,
>>  	TP_ARGS(page, reclaim_flags),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(struct page *, page)
>> +		__field(unsigned long, pfn)
>>  		__field(int, reclaim_flags)
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->page = page;
>> +		__entry->pfn = page_to_pfn(page);
>>  		__entry->reclaim_flags = reclaim_flags;
>>  	),
>>  
>>  	TP_printk("page=%p pfn=%lu flags=%s",
>> -		__entry->page,
>> -		page_to_pfn(__entry->page),
>> +		pfn_to_page(__entry->pfn),
>> +		__entry->pfn,
>>  		show_reclaim_flags(__entry->reclaim_flags))
>>  );
>>  
>>
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-08-31 11:38 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-13 22:14 [GIT PULL 0/5] perf/core improvements and fixes Arnaldo Carvalho de Melo
2015-04-13 22:14 ` [PATCH 1/5] tracing, mm: Record pfn instead of pointer to struct page Arnaldo Carvalho de Melo
2017-07-31  7:43   ` Vlastimil Babka
2017-08-31 11:38     ` Vlastimil Babka [this message]
2017-08-31 13:43     ` Steven Rostedt
2017-08-31 14:31       ` Vlastimil Babka
2017-08-31 14:44         ` Steven Rostedt
2017-09-01  8:16           ` Vlastimil Babka
2017-09-01 11:15             ` Steven Rostedt
2015-04-13 22:14 ` [PATCH 2/5] perf kmem: Analyze page allocator events also Arnaldo Carvalho de Melo
2015-04-13 22:33 ` [GIT PULL 0/5] perf/core improvements and fixes Masami Hiramatsu
2015-04-13 23:09   ` Arnaldo Carvalho de Melo
2015-04-13 23:19     ` Arnaldo Carvalho de Melo
2015-04-14  7:04       ` Masami Hiramatsu
2015-04-14 12:17         ` Arnaldo Carvalho de Melo
2015-04-14 12:12       ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=39539ab2-c697-e286-dd6b-2c5eacd8bb90@suse.cz \
    --to=vbabka@suse.cz \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@kernel.org \
    --cc=dsahern@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox