* [PATCH] mm: alloc_contig: demote PFN busy message to debug level
@ 2016-12-02 9:57 Lucas Stach
2016-12-02 10:18 ` Vlastimil Babka
0 siblings, 1 reply; 6+ messages in thread
From: Lucas Stach @ 2016-12-02 9:57 UTC (permalink / raw)
To: linux-mm
Cc: Andrew Morton, Vlastimil Babka, Joonsoo Kim, kernel, patchwork-lst
There are a lot of reasons why a PFN might be busy and unable to be isolated
some of which can't really be avoided. This message is spamming the logs when
a lot of CMA allocations are happening, causing isolation to happen quite
frequently.
Demote the message to log level, as CMA will just retry the allocation, so
there is no need to have this message in the logs. If someone is interested
in the failing case, there is a tracepoint to track those failures properly.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2b3bf6767d54..b2cfb4074f90 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7398,7 +7398,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
/* Make sure the range is really isolated. */
if (test_pages_isolated(outer_start, end, false)) {
- pr_info("%s: [%lx, %lx) PFNs busy\n",
+ pr_debug("%s: [%lx, %lx) PFNs busy\n",
__func__, outer_start, end);
ret = -EBUSY;
goto done;
--
2.10.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm: alloc_contig: demote PFN busy message to debug level
2016-12-02 9:57 [PATCH] mm: alloc_contig: demote PFN busy message to debug level Lucas Stach
@ 2016-12-02 10:18 ` Vlastimil Babka
2016-12-02 10:41 ` Lucas Stach
0 siblings, 1 reply; 6+ messages in thread
From: Vlastimil Babka @ 2016-12-02 10:18 UTC (permalink / raw)
To: Lucas Stach, linux-mm
Cc: Andrew Morton, Joonsoo Kim, kernel, patchwork-lst, Michal Hocko,
Michal Nazarewicz, Marek Szyprowski, Robin H. Johnson
On 12/02/2016 10:57 AM, Lucas Stach wrote:
> There are a lot of reasons why a PFN might be busy and unable to be isolated
> some of which can't really be avoided. This message is spamming the logs when
> a lot of CMA allocations are happening, causing isolation to happen quite
> frequently.
Is this related to Robin's report [1] or you have an independent case of
lots of CMA allocations, and in which context are there?
> Demote the message to log level, as CMA will just retry the allocation, so
> there is no need to have this message in the logs. If someone is interested
> in the failing case, there is a tracepoint to track those failures properly.
I don't think we should just hide the issue like this, as getting high
volume reports from this is also very likely associated with high
overhead for the allocations. If it's the generic dma-cma context, like
in [1] where it attempts CMA for order-0 allocations, we should first do
something about that, before tweaking the logging.
[1] http://marc.info/?l=linux-mm&m=148053714627617&w=2
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> ---
> mm/page_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2b3bf6767d54..b2cfb4074f90 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7398,7 +7398,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>
> /* Make sure the range is really isolated. */
> if (test_pages_isolated(outer_start, end, false)) {
> - pr_info("%s: [%lx, %lx) PFNs busy\n",
> + pr_debug("%s: [%lx, %lx) PFNs busy\n",
> __func__, outer_start, end);
> ret = -EBUSY;
> goto done;
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm: alloc_contig: demote PFN busy message to debug level
2016-12-02 10:18 ` Vlastimil Babka
@ 2016-12-02 10:41 ` Lucas Stach
2016-12-02 10:48 ` Michal Hocko
0 siblings, 1 reply; 6+ messages in thread
From: Lucas Stach @ 2016-12-02 10:41 UTC (permalink / raw)
To: Vlastimil Babka
Cc: linux-mm, Michal Nazarewicz, Michal Hocko, Robin H. Johnson,
kernel, Andrew Morton, patchwork-lst, Joonsoo Kim,
Marek Szyprowski
Am Freitag, den 02.12.2016, 11:18 +0100 schrieb Vlastimil Babka:
> On 12/02/2016 10:57 AM, Lucas Stach wrote:
> > There are a lot of reasons why a PFN might be busy and unable to be isolated
> > some of which can't really be avoided. This message is spamming the logs when
> > a lot of CMA allocations are happening, causing isolation to happen quite
> > frequently.
>
> Is this related to Robin's report [1] or you have an independent case of
> lots of CMA allocations, and in which context are there?
>
No, I've seen this bug report, but this patch was sitting to be sent out
for a while now.
> > Demote the message to log level, as CMA will just retry the allocation, so
> > there is no need to have this message in the logs. If someone is interested
> > in the failing case, there is a tracepoint to track those failures properly.
>
> I don't think we should just hide the issue like this, as getting high
> volume reports from this is also very likely associated with high
> overhead for the allocations. If it's the generic dma-cma context, like
> in [1] where it attempts CMA for order-0 allocations, we should first do
> something about that, before tweaking the logging.
>
Etnaviv (the driver I maintain) currently does a stupid thing by
allocating and freeing lots of DMA buffers (higher-order) from different
threads. This is causing overhead at the CMA side, but really isn't
something to be handled at the CMA side, but rather Etnaviv must get
more clever about its CMA usage.
Still this message is really disturbing as page isolation failures can
be caused by lots of other reasons like temporarily pinned pages.
Regards,
Lucas
> [1] http://marc.info/?l=linux-mm&m=148053714627617&w=2
>
> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> > ---
> > mm/page_alloc.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 2b3bf6767d54..b2cfb4074f90 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7398,7 +7398,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
> >
> > /* Make sure the range is really isolated. */
> > if (test_pages_isolated(outer_start, end, false)) {
> > - pr_info("%s: [%lx, %lx) PFNs busy\n",
> > + pr_debug("%s: [%lx, %lx) PFNs busy\n",
> > __func__, outer_start, end);
> > ret = -EBUSY;
> > goto done;
> >
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm: alloc_contig: demote PFN busy message to debug level
2016-12-02 10:41 ` Lucas Stach
@ 2016-12-02 10:48 ` Michal Hocko
2016-12-02 10:57 ` Lucas Stach
0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2016-12-02 10:48 UTC (permalink / raw)
To: Lucas Stach
Cc: Vlastimil Babka, linux-mm, Michal Nazarewicz, Robin H. Johnson,
kernel, Andrew Morton, patchwork-lst, Joonsoo Kim,
Marek Szyprowski
On Fri 02-12-16 11:41:11, Lucas Stach wrote:
> Am Freitag, den 02.12.2016, 11:18 +0100 schrieb Vlastimil Babka:
> > On 12/02/2016 10:57 AM, Lucas Stach wrote:
> > > There are a lot of reasons why a PFN might be busy and unable to be isolated
> > > some of which can't really be avoided. This message is spamming the logs when
> > > a lot of CMA allocations are happening, causing isolation to happen quite
> > > frequently.
> >
> > Is this related to Robin's report [1] or you have an independent case of
> > lots of CMA allocations, and in which context are there?
> >
> No, I've seen this bug report, but this patch was sitting to be sent out
> for a while now.
>
> > > Demote the message to log level, as CMA will just retry the allocation, so
> > > there is no need to have this message in the logs. If someone is interested
> > > in the failing case, there is a tracepoint to track those failures properly.
> >
> > I don't think we should just hide the issue like this, as getting high
> > volume reports from this is also very likely associated with high
> > overhead for the allocations. If it's the generic dma-cma context, like
> > in [1] where it attempts CMA for order-0 allocations, we should first do
> > something about that, before tweaking the logging.
> >
> Etnaviv (the driver I maintain) currently does a stupid thing by
> allocating and freeing lots of DMA buffers (higher-order) from different
> threads. This is causing overhead at the CMA side, but really isn't
> something to be handled at the CMA side, but rather Etnaviv must get
> more clever about its CMA usage.
>
> Still this message is really disturbing as page isolation failures can
> be caused by lots of other reasons like temporarily pinned pages.
Hmm, then I think that what Robin has proposed [1] should be a generally
better solution because it both ratelimits and points to the user who is
triggering this path. I am still not really sure I understand why is the
message useful, to be honest so it very well might be better to just
remove it altogether. This is something for the CMA guys to answer
though.
[1] http://lkml.kernel.org/r/robbat2-20161130T195244-998539995Z@orbis-terrarum.net
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm: alloc_contig: demote PFN busy message to debug level
2016-12-02 10:48 ` Michal Hocko
@ 2016-12-02 10:57 ` Lucas Stach
2016-12-02 12:32 ` Michal Nazarewicz
0 siblings, 1 reply; 6+ messages in thread
From: Lucas Stach @ 2016-12-02 10:57 UTC (permalink / raw)
To: Michal Hocko
Cc: Vlastimil Babka, linux-mm, Michal Nazarewicz, Robin H. Johnson,
kernel, Andrew Morton, patchwork-lst, Joonsoo Kim,
Marek Szyprowski
Am Freitag, den 02.12.2016, 11:48 +0100 schrieb Michal Hocko:
> On Fri 02-12-16 11:41:11, Lucas Stach wrote:
> > Am Freitag, den 02.12.2016, 11:18 +0100 schrieb Vlastimil Babka:
> > > On 12/02/2016 10:57 AM, Lucas Stach wrote:
> > > > There are a lot of reasons why a PFN might be busy and unable to be isolated
> > > > some of which can't really be avoided. This message is spamming the logs when
> > > > a lot of CMA allocations are happening, causing isolation to happen quite
> > > > frequently.
> > >
> > > Is this related to Robin's report [1] or you have an independent case of
> > > lots of CMA allocations, and in which context are there?
> > >
> > No, I've seen this bug report, but this patch was sitting to be sent out
> > for a while now.
> >
> > > > Demote the message to log level, as CMA will just retry the allocation, so
> > > > there is no need to have this message in the logs. If someone is interested
> > > > in the failing case, there is a tracepoint to track those failures properly.
> > >
> > > I don't think we should just hide the issue like this, as getting high
> > > volume reports from this is also very likely associated with high
> > > overhead for the allocations. If it's the generic dma-cma context, like
> > > in [1] where it attempts CMA for order-0 allocations, we should first do
> > > something about that, before tweaking the logging.
> > >
> > Etnaviv (the driver I maintain) currently does a stupid thing by
> > allocating and freeing lots of DMA buffers (higher-order) from different
> > threads. This is causing overhead at the CMA side, but really isn't
> > something to be handled at the CMA side, but rather Etnaviv must get
> > more clever about its CMA usage.
> >
> > Still this message is really disturbing as page isolation failures can
> > be caused by lots of other reasons like temporarily pinned pages.
>
> Hmm, then I think that what Robin has proposed [1] should be a generally
> better solution because it both ratelimits and points to the user who is
> triggering this path.
Dumping a stacktrace at this point is only going to increase the noise
from this message, as it can be trigger under normal operating
conditions of CMA. If someone temporarily locked a previously movable
page with GUP or something alike, the stacktrace will point to the
victim rather than the offender, so I think the value of the stackstrace
is rather limited.
Regards,
Lucas
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm: alloc_contig: demote PFN busy message to debug level
2016-12-02 10:57 ` Lucas Stach
@ 2016-12-02 12:32 ` Michal Nazarewicz
0 siblings, 0 replies; 6+ messages in thread
From: Michal Nazarewicz @ 2016-12-02 12:32 UTC (permalink / raw)
To: Lucas Stach, Michal Hocko
Cc: Vlastimil Babka, linux-mm, Robin H. Johnson, kernel,
Andrew Morton, patchwork-lst, Joonsoo Kim, Marek Szyprowski
>>> Am Freitag, den 02.12.2016, 11:18 +0100 schrieb Vlastimil Babka:
>>>> I don't think we should just hide the issue like this, as getting high
>>>> volume reports from this is also very likely associated with high
>>>> overhead for the allocations. If it's the generic dma-cma context, like
>>>> in [1] where it attempts CMA for order-0 allocations, we should first do
>>>> something about that, before tweaking the logging.
That was also my concern. Ideally we would have a counter which
increments whenever isolation failure happens and some monitoring of
that counter but this is kernel so that’s just a pipe dream.
>> On Fri 02-12-16 11:41:11, Lucas Stach wrote:
>>> Still this message is really disturbing as page isolation failures can
>>> be caused by lots of other reasons like temporarily pinned pages.
Just so we’re on the same page, lots of allocations is not a *reason* of
isolation failures. It only surfaces it.
This is not to disagree about better having code that is smart about
allocating DMA buffers. This is true regardless.
> Am Freitag, den 02.12.2016, 11:48 +0100 schrieb Michal Hocko:
>> Hmm, then I think that what Robin has proposed [1] should be a generally
>> better solution because it both ratelimits and points to the user who is
>> triggering this path.
On Fri, Dec 02 2016, Lucas Stach wrote:
> Dumping a stacktrace at this point is only going to increase the noise
> from this message, as it can be trigger under normal operating
> conditions of CMA. If someone temporarily locked a previously movable
> page with GUP or something alike, the stacktrace will point to the
> victim rather than the offender, so I think the value of the stackstrace
> is rather limited.
I agree, which is why I suggested printing the stack only if
CONFIG_CMA_DEBUG is enabled.
--
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-12-02 12:32 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-02 9:57 [PATCH] mm: alloc_contig: demote PFN busy message to debug level Lucas Stach
2016-12-02 10:18 ` Vlastimil Babka
2016-12-02 10:41 ` Lucas Stach
2016-12-02 10:48 ` Michal Hocko
2016-12-02 10:57 ` Lucas Stach
2016-12-02 12:32 ` Michal Nazarewicz
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox