* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations [not found] <a66fb268-74fe-6f4e-a99f-3257b8a5ac3b@vyatta.att-mail.com> @ 2018-11-08 7:54 ` Vlastimil Babka 2018-11-08 8:30 ` Mike Manning 0 siblings, 1 reply; 9+ messages in thread From: Vlastimil Babka @ 2018-11-08 7:54 UTC (permalink / raw) To: mmanning, stable; +Cc: Greg Kroah-Hartman, Mel Gorman, linux-mm +CC linux-mm On 11/7/18 6:33 PM, Mike Manning wrote: > Hello, Please consider backporting to 4.14.y the following commit from > kernel-net-next by Vlastimil Babka [CC'ed]: > > d6a24df00638 ("mm, page_alloc: actually ignore mempolicies for high > priority allocations") It cherry-picks cleanly and builds fine. > > The reason for the request is that the commit 1d26c112959f > <http://stash.eng.vyatta.net:7990/projects/VC/repos/linux-vyatta/commits/1d26c112959f>A ("mm, > page_alloc:do not break __GFP_THISNODE by zonelist reset") that was > previously backported to 4.14.y broke some of our functionality after we > upgraded from an earlier 4.14 kernel without the fix. Well, that's very surprising! Could you be more specific about what exactly got broken? > The reason this is > happening is not clear, with this commit only found by bisect. > Fortunately the requested commit resolves the issue. I would like to understand the problem first, because I currently can't imagine how the first commit could break something and the second fix it. > Best Regards, > > Mike Manning > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-11-08 7:54 ` stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations Vlastimil Babka @ 2018-11-08 8:30 ` Mike Manning 2018-11-08 9:01 ` Michal Hocko 2018-11-08 9:32 ` Vlastimil Babka 0 siblings, 2 replies; 9+ messages in thread From: Mike Manning @ 2018-11-08 8:30 UTC (permalink / raw) To: Vlastimil Babka, stable; +Cc: Greg Kroah-Hartman, Mel Gorman, linux-mm On 08/11/2018 07:54, Vlastimil Babka wrote: > +CC linux-mm > > On 11/7/18 6:33 PM, Mike Manning wrote: >> Hello, Please consider backporting to 4.14.y the following commit from >> kernel-net-next by Vlastimil Babka [CC'ed]: >> >> d6a24df00638 ("mm, page_alloc: actually ignore mempolicies for high >> priority allocations") It cherry-picks cleanly and builds fine. >> >> The reason for the request is that the commit 1d26c112959f ("mm, >> page_alloc:do not break __GFP_THISNODE by zonelist reset") that was >> previously backported to 4.14.y broke some of our functionality after we >> upgraded from an earlier 4.14 kernel without the fix. > Well, that's very surprising! Could you be more specific about what > exactly got broken? Thank you for your reply. I agree, we were also very surprised when bisecting our updated 4.14 kernel, as this change is apparently completely unrelated to our application running in userspace. But the problem was 100% reproducible on a baremetal setup running automated performance multi-stream testing, so only seen under load. With the fix reverted from the 4.14 kernel, the problem went away, and this is with many repeated runs (the load test is part of a suite that is automatically run quite a few times every day, and this test was failing since the upgrade). > >> The reason this is >> happening is not clear, with this commit only found by bisect. >> Fortunately the requested commit resolves the issue. > I would like to understand the problem first, because I currently can't > imagine how the first commit could break something and the second fix it. I agree, but from an empirical point of view, 2 options present: 1) The original commit was not suitable for backport to 4.14 and should be reverted. 2) For the same reason that the original commit was suitable for backport to 4.14, the requested commit should also be backported. >> Best Regards, >> >> Mike Manning >> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-11-08 8:30 ` Mike Manning @ 2018-11-08 9:01 ` Michal Hocko 2018-11-08 9:06 ` Vlastimil Babka 2018-11-08 9:32 ` Vlastimil Babka 1 sibling, 1 reply; 9+ messages in thread From: Michal Hocko @ 2018-11-08 9:01 UTC (permalink / raw) To: Mike Manning Cc: Vlastimil Babka, stable, Greg Kroah-Hartman, Mel Gorman, linux-mm On Thu 08-11-18 08:30:40, Mike Manning wrote: [...] > 1) The original commit was not suitable for backport to 4.14 and should > be reverted. Yes, the original patch hasn't been marked for the stable tree and as such shouldn't have been backported. Even though it looks simple enough it is not really trivial. This is not the first time automagic selection has provent to be wrong. So can we stop this finally please? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-11-08 9:01 ` Michal Hocko @ 2018-11-08 9:06 ` Vlastimil Babka 2018-11-08 9:11 ` Michal Hocko 2018-11-08 9:41 ` Mike Manning 0 siblings, 2 replies; 9+ messages in thread From: Vlastimil Babka @ 2018-11-08 9:06 UTC (permalink / raw) To: Michal Hocko, Mike Manning Cc: stable, Greg Kroah-Hartman, Mel Gorman, linux-mm On 11/8/18 10:01 AM, Michal Hocko wrote: > On Thu 08-11-18 08:30:40, Mike Manning wrote: > [...] >> 1) The original commit was not suitable for backport to 4.14 and should >> be reverted. > > Yes, the original patch hasn't been marked for the stable tree and as > such shouldn't have been backported. Even though it looks simple enough > it is not really trivial. I think you confused the two patches. Original commit 1d26c112959f ("mm, page_alloc: do not break __GFP_THISNODE by zonelist reset") was marked for stable, especially pre-4.7 where SLAB could be potentially broken. Commit d6a24df00638 ("mm, page_alloc: actually ignore mempolicies for high priority allocations") was not marked stable and is being requested in this thread. But I'm reluctant to agree with this without properly understanding what went wrong. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-11-08 9:06 ` Vlastimil Babka @ 2018-11-08 9:11 ` Michal Hocko 2018-11-08 9:41 ` Mike Manning 1 sibling, 0 replies; 9+ messages in thread From: Michal Hocko @ 2018-11-08 9:11 UTC (permalink / raw) To: Vlastimil Babka Cc: Mike Manning, stable, Greg Kroah-Hartman, Mel Gorman, linux-mm On Thu 08-11-18 10:06:35, Vlastimil Babka wrote: > On 11/8/18 10:01 AM, Michal Hocko wrote: > > On Thu 08-11-18 08:30:40, Mike Manning wrote: > > [...] > >> 1) The original commit was not suitable for backport to 4.14 and should > >> be reverted. > > > > Yes, the original patch hasn't been marked for the stable tree and as > > such shouldn't have been backported. Even though it looks simple enough > > it is not really trivial. > > I think you confused the two patches. > > Original commit 1d26c112959f ("mm, page_alloc: do not break > __GFP_THISNODE by zonelist reset") was marked for stable, especially > pre-4.7 where SLAB could be potentially broken. You are right. My apology! > Commit d6a24df00638 ("mm, page_alloc: actually ignore mempolicies for > high priority allocations") was not marked stable and is being requested > in this thread. But I'm reluctant to agree with this without properly > understanding what went wrong. Agreed -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-11-08 9:06 ` Vlastimil Babka 2018-11-08 9:11 ` Michal Hocko @ 2018-11-08 9:41 ` Mike Manning 2018-12-28 10:49 ` Greg Kroah-Hartman 1 sibling, 1 reply; 9+ messages in thread From: Mike Manning @ 2018-11-08 9:41 UTC (permalink / raw) To: Vlastimil Babka, Michal Hocko Cc: stable, Greg Kroah-Hartman, Mel Gorman, linux-mm On 08/11/2018 09:06, Vlastimil Babka wrote: > On 11/8/18 10:01 AM, Michal Hocko wrote: >> On Thu 08-11-18 08:30:40, Mike Manning wrote: >> [...] >>> 1) The original commit was not suitable for backport to 4.14 and should >>> be reverted. >> Yes, the original patch hasn't been marked for the stable tree and as >> such shouldn't have been backported. Even though it looks simple enough >> it is not really trivial. > I think you confused the two patches. > > Original commit 1d26c112959f ("mm, page_alloc: do not break > __GFP_THISNODE by zonelist reset") was marked for stable, especially > pre-4.7 where SLAB could be potentially broken. > > Commit d6a24df00638 ("mm, page_alloc: actually ignore mempolicies for > high priority allocations") was not marked stable and is being requested > in this thread. But I'm reluctant to agree with this without properly > understanding what went wrong. Apologies, the original commit was not a backport, but is a fix in 4.14 for pre-4.7 kernels. All I can do from a user perspective is report the problem and the fortuitous follow-on commit that resolved the issue in our case. It has already taken quite some time to find that the problem was unexpectedly due to the kernel upgrade (this failure is a first, we have been running these tests for some years going back to the 4.1 kernel), then to go through the process of pinpointing the change that caused the issue in our case. Given that the problem is not manually reproducible, and given that it could take a very substantial period of time to understand how the change is impacting our scale & performance testing, it seems most expedient to backport the 1-line commit that resolves the issue. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-11-08 9:41 ` Mike Manning @ 2018-12-28 10:49 ` Greg Kroah-Hartman 0 siblings, 0 replies; 9+ messages in thread From: Greg Kroah-Hartman @ 2018-12-28 10:49 UTC (permalink / raw) To: Mike Manning; +Cc: Vlastimil Babka, Michal Hocko, stable, Mel Gorman, linux-mm On Thu, Nov 08, 2018 at 09:41:37AM +0000, Mike Manning wrote: > On 08/11/2018 09:06, Vlastimil Babka wrote: > > On 11/8/18 10:01 AM, Michal Hocko wrote: > >> On Thu 08-11-18 08:30:40, Mike Manning wrote: > >> [...] > >>> 1) The original commit was not suitable for backport to 4.14 and should > >>> be reverted. > >> Yes, the original patch hasn't been marked for the stable tree and as > >> such shouldn't have been backported. Even though it looks simple enough > >> it is not really trivial. > > I think you confused the two patches. > > > > Original commit 1d26c112959f ("mm, page_alloc: do not break > > __GFP_THISNODE by zonelist reset") was marked for stable, especially > > pre-4.7 where SLAB could be potentially broken. > > > > Commit d6a24df00638 ("mm, page_alloc: actually ignore mempolicies for > > high priority allocations") was not marked stable and is being requested > > in this thread. But I'm reluctant to agree with this without properly > > understanding what went wrong. > > Apologies, the original commit was not a backport, but is a fix in 4.14 > for pre-4.7 kernels. > > All I can do from a user perspective is report the problem and the > fortuitous follow-on commit that resolved the issue in our case. It has > already taken quite some time to find that the problem was unexpectedly > due to the kernel upgrade (this failure is a first, we have been running > these tests for some years going back to the 4.1 kernel), then to go > through the process of pinpointing the change that caused the issue in > our case. > > Given that the problem is not manually reproducible, and given that it > could take a very substantial period of time to understand how the > change is impacting our scale & performance testing, it seems most > expedient to backport the 1-line commit that resolves the issue. Ok, you are asking for this to be added, without really knowing _why_ it resolves the issue and Michal is asking to know _why_ before acking it, correct? So I'll hold off on merging this for now until you all come to some kind of resolution :) thanks, greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-11-08 8:30 ` Mike Manning 2018-11-08 9:01 ` Michal Hocko @ 2018-11-08 9:32 ` Vlastimil Babka 2018-11-08 10:01 ` Mike Manning 1 sibling, 1 reply; 9+ messages in thread From: Vlastimil Babka @ 2018-11-08 9:32 UTC (permalink / raw) To: mmanning, stable; +Cc: Greg Kroah-Hartman, Mel Gorman, linux-mm On 11/8/18 9:30 AM, Mike Manning wrote: > On 08/11/2018 07:54, Vlastimil Babka wrote: >> +CC linux-mm >> >> On 11/7/18 6:33 PM, Mike Manning wrote: >>> Hello, Please consider backporting to 4.14.y the following commit from >>> kernel-net-next by Vlastimil Babka [CC'ed]: >>> >>> d6a24df00638 ("mm, page_alloc: actually ignore mempolicies for high >>> priority allocations") It cherry-picks cleanly and builds fine. >>> >>> The reason for the request is that the commit 1d26c112959f ("mm, >>> page_alloc:do not break __GFP_THISNODE by zonelist reset") that was >>> previously backported to 4.14.y broke some of our functionality after we >>> upgraded from an earlier 4.14 kernel without the fix. >> Well, that's very surprising! Could you be more specific about what >> exactly got broken? > > Thank you for your reply. I agree, we were also very surprised when > bisecting our updated 4.14 kernel, as this change is apparently > completely unrelated to our application running in userspace. But the > problem was 100% reproducible on a baremetal setup running automated > performance multi-stream testing, so only seen under load. So what was the workload doing, and what were the symptoms, at least from a high level perspective? And was it a vanilla 4.14.y kernel, or with some additional patches, out of tree modules etc? > With the fix > reverted from the 4.14 kernel, the problem went away, and this is with > many repeated runs (the load test is part of a suite that is > automatically run quite a few times every day, and this test was failing > since the upgrade). > >> >>> The reason this is >>> happening is not clear, with this commit only found by bisect. >>> Fortunately the requested commit resolves the issue. >> I would like to understand the problem first, because I currently can't >> imagine how the first commit could break something and the second fix it. > > I agree, but from an empirical point of view, 2 options present: > > 1) The original commit was not suitable for backport to 4.14 and should > be reverted. > > 2) For the same reason that the original commit was suitable for > backport to 4.14, the requested commit should also be backported. I don't think that covers all possibilities. You didn't say what the observed problem was, so I can imagine it was either allocation failures, OOM's, or worse performance (probably related to network). The original commit should be a non-functional change for allocations that don't use __GFP_THISNODE. The zonelist reassignment that was removed by the patch might have changed order of zones to allocate from, but the set of available zones for the allocation should be unchanged, unless the zonelist generation code is broken (and then we should better find out). Otherwise I can only imagine some minor performance impact. The patch you are requesting is a functional change, with positive effects expected on all 3 potential problems I listed above. It should however make a difference only in the context of processes restricted by bind mempolicies... or potentially some out-of-tree modules. And again, it shouldn't be related to the original commit. >>> Best Regards, >>> >>> Mike Manning >>> > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-11-08 9:32 ` Vlastimil Babka @ 2018-11-08 10:01 ` Mike Manning 0 siblings, 0 replies; 9+ messages in thread From: Mike Manning @ 2018-11-08 10:01 UTC (permalink / raw) To: Vlastimil Babka, stable; +Cc: Greg Kroah-Hartman, Mel Gorman, linux-mm On 08/11/2018 09:32, Vlastimil Babka wrote: > On 11/8/18 9:30 AM, Mike Manning wrote: >> On 08/11/2018 07:54, Vlastimil Babka wrote: >>> +CC linux-mm >>> >>> On 11/7/18 6:33 PM, Mike Manning wrote: >>>> Hello, Please consider backporting to 4.14.y the following commit from >>>> kernel-net-next by Vlastimil Babka [CC'ed]: >>>> >>>> d6a24df00638 ("mm, page_alloc: actually ignore mempolicies for high >>>> priority allocations") It cherry-picks cleanly and builds fine. >>>> >>>> The reason for the request is that the commit 1d26c112959f ("mm, >>>> page_alloc:do not break __GFP_THISNODE by zonelist reset") that was >>>> previously backported to 4.14.y broke some of our functionality after we >>>> upgraded from an earlier 4.14 kernel without the fix. >>> Well, that's very surprising! Could you be more specific about what >>> exactly got broken? >> Thank you for your reply. I agree, we were also very surprised when >> bisecting our updated 4.14 kernel, as this change is apparently >> completely unrelated to our application running in userspace. But the >> problem was 100% reproducible on a baremetal setup running automated >> performance multi-stream testing, so only seen under load. > So what was the workload doing, and what were the symptoms, at least > from a high level perspective? And was it a vanilla 4.14.y kernel, or > with some additional patches, out of tree modules etc? We carry patches, but they are not at all in this area. The tests, which are in relation to traffic forwarding, are not portable to a stock linux image. >> With the fix >> reverted from the 4.14 kernel, the problem went away, and this is with >> many repeated runs (the load test is part of a suite that is >> automatically run quite a few times every day, and this test was failing >> since the upgrade). >> >>>> The reason this is >>>> happening is not clear, with this commit only found by bisect. >>>> Fortunately the requested commit resolves the issue. >>> I would like to understand the problem first, because I currently can't >>> imagine how the first commit could break something and the second fix it. >> I agree, but from an empirical point of view, 2 options present: >> >> 1) The original commit was not suitable for backport to 4.14 and should >> be reverted. >> >> 2) For the same reason that the original commit was suitable for >> backport to 4.14, the requested commit should also be backported. > I don't think that covers all possibilities. > > You didn't say what the observed problem was, so I can imagine it was > either allocation failures, OOM's, or worse performance (probably > related to network). No errors are reported in the journal such as OOM. > The original commit should be a non-functional change for allocations > that don't use __GFP_THISNODE. The zonelist reassignment that was > removed by the patch might have changed order of zones to allocate from, > but the set of available zones for the allocation should be unchanged, > unless the zonelist generation code is broken (and then we should better > find out). Otherwise I can only imagine some minor performance impact. The failure is with ND being tested with Spirent under load, so it causes a complete test failure, as it stops any subsequent performance testing for IPv6 traffic forwarding. > The patch you are requesting is a functional change, with positive > effects expected on all 3 potential problems I listed above. It should > however make a difference only in the context of processes restricted by > bind mempolicies... or potentially some out-of-tree modules. And again, > it shouldn't be related to the original commit. We do not have any processes explicitly restricted by bind mempolicies. >>>> Best Regards, >>>> >>>> Mike Manning >>>> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-12-28 10:49 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <a66fb268-74fe-6f4e-a99f-3257b8a5ac3b@vyatta.att-mail.com>
2018-11-08 7:54 ` stable request: mm, page_alloc: actually ignore mempolicies for high priority allocations Vlastimil Babka
2018-11-08 8:30 ` Mike Manning
2018-11-08 9:01 ` Michal Hocko
2018-11-08 9:06 ` Vlastimil Babka
2018-11-08 9:11 ` Michal Hocko
2018-11-08 9:41 ` Mike Manning
2018-12-28 10:49 ` Greg Kroah-Hartman
2018-11-08 9:32 ` Vlastimil Babka
2018-11-08 10:01 ` Mike Manning
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox