From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail6.bemta8.messagelabs.com (mail6.bemta8.messagelabs.com [216.82.243.55]) by kanga.kvack.org (Postfix) with ESMTP id 869496B006E for ; Thu, 10 Nov 2011 11:30:26 -0500 (EST) Received: by iaek3 with SMTP id k3so755422iae.14 for ; Thu, 10 Nov 2011 08:30:24 -0800 (PST) MIME-Version: 1.0 In-Reply-To: <20111110161331.GG3083@suse.de> References: <20111110100616.GD3083@suse.de> <20111110142202.GE3083@suse.de> <20111110161331.GG3083@suse.de> Date: Fri, 11 Nov 2011 01:30:24 +0900 Message-ID: Subject: Re: [PATCH] mm: Do not stall in synchronous compaction for THP allocations From: Minchan Kim Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Sender: owner-linux-mm@kvack.org List-ID: To: Mel Gorman Cc: Andrew Morton , Jan Kara , Andy Isaacson , Johannes Weiner , Andrea Arcangeli , linux-kernel@vger.kernel.org, linux-mm@kvack.org On Fri, Nov 11, 2011 at 1:13 AM, Mel Gorman wrote: > On Fri, Nov 11, 2011 at 12:12:01AM +0900, Minchan Kim wrote: >> Hi Mel, >> >> You should have Cced with me because __GFP_NORETRY is issued by me. >> > > I don't understand. __GFP_NORETRY has been around a long time ago > and has loads of users. Do you have particular concerns about the > behaviour of __GFP_NORETRY? It seems that I got misunderstood your point. I thought you mentioned this. http://www.spinics.net/lists/linux-mm/msg16371.html > >> On Thu, Nov 10, 2011 at 11:22 PM, Mel Gorman wrote: >> > On Thu, Nov 10, 2011 at 10:06:16AM +0000, Mel Gorman wrote: >> >> than stall. It was suggested that __GFP_NORETRY be used instead of >> >> __GFP_NO_KSWAPD. This would look less like a special case but would >> >> still cause compaction to run at least once with sync compaction. >> >> >> > >> > This comment is bogus - __GFP_NORETRY would have caught THP allocation= s >> > and would not call sync compaction. The issue was that it would also >> > have caught any hypothetical high-order GFP_THISNODE allocations that >> > end up calling compaction here >> >> In fact, the I support patch concept so I would like to give >> >> Acked-by: Minchan Kim >> But it is still doubt about code. >> >> __GFP_NORETRY: The VM implementation must not retry indefinitely >> >> What could people think if they look at above comment? >> At least, I can imagine two >> >> First, it is related on *latency*. > > I never read it to mean that. I always read it to mean "it must > not retry indefinitely". I expected it was still fine to try direct > reclaim which is not a latency-sensitive operation. > >> Second, "I can handle if VM fails allocation" >> > > Fully agreed. > >> I am biased toward latter. >> Then, __GFP_NO_KSWAPD is okay? It means "let's avoid sync compaction >> or long latency"? > > and do not wake kswapd to swap additional pages. > >> It's rather awkward name. Already someone started to use >> __GFP_NO_KSWAPD as such purpose. >> See mtd_kmalloc_up_to. He mentioned in comment of function as follows, >> >> =C2=A0* the system page size. This attempts to make sure it does not adv= ersely >> =C2=A0* impact system performance, so when allocating more than one page= , we >> =C2=A0* ask the memory allocator to avoid re-trying, swapping, writing b= ack >> =C2=A0* or performing I/O. >> > > I was not aware of this user although their reason for using it > seems fine. We are not breaking their expectations with this patch but > it is a slippery slope. > >> That thing was what I concerned. >> In future, new users of __GFP_NO_KSWAPD is coming and we can't prevent >> them under our sight. >> So I hope we can change the flag name or fix above code and comment >> out __GFP_NO_KSWAPD >> > > I can't think of a better name but we can at least improve the comment. > >> /* >> =C2=A0* __GFP_NO_KSWAPD is very VM internal flag so Please don't use it >> without allowing mm guys >> =C2=A0* >> #define __GFP_NO_KSWAPD xxxx >> >> > >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/* >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * High-order a= llocations do not necessarily loop after >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * direct recla= im and reclaim/compaction depends on >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * compaction b= eing called after reclaim so call directly if >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * necessary >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 */ >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0page =3D __allo= c_pages_direct_compact(gfp_mask, order, >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0zoneli= st, high_zoneidx, >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0nodema= sk, >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0alloc_= flags, preferred_zone, >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0migrat= etype, &did_some_progress, >> > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0sync_m= igration); >> > >> > __GFP_NORETRY is used in a bunch of places and while the most >> > of them are not high-order, some of them potentially are like in >> > sound/core/memalloc.c. Using __GFP_NO_KSWAPD as the flag allows >> > these callers to continue using sync compaction. =C2=A0It could be arg= ued >> >> Okay. If I was biased first, I have opposed this comment because they >> might think __GFP_NORETRY is very latency sensitive. > > As __GFP_NORETRY uses direct reclaim, I do not think users expect it > to be latency sensitive. If they were latency sensitive, they would > mask out __GFP_WAIT. Indeed. > >> So they wanted allocation is very fast without any writeback/retrial. >> In view point, __GFP_NORETRY isn't bad, I think. >> >> Having said that, I was biased latter, as I said earlier. >> >> > that they would prefer __GFP_NORETRY but the potential side-effects >> > should be taken should be taken into account and the comment updated >> >> Considering side-effect, your patch is okay. >> But I can't understand you mentioned "the comment updated if that >> happens" sentence. :( >> > > I meant the comment above sync_migration =3D !(gfp_mask & __GFP_NO_KSWAPD= ) should > be updated if it changes to __GFP_NORETRY. > > I updated the commentary a bit. Does this help? I am happy with your updated comment but there is a concern. We are adding additional flags to be considered for all page allocation cal= lers. If it isn't, there are too many flags in there. I really hope this flag makes VM internal so let other do not care about it= . Until now, users was happy without such flag and it's only huge page proble= m. --=20 Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org