Hi Alex,

could you also try to reverse the bit below (not the whole previous
patch, only the hunk quoted below) with "patch -p1 -R < thismail" on
top of your current aa.git tree, and see if you notice any regression
compared to the previous aa.git build that worked well?

This is part of the fix, but I'd need to be sure this really makes a
difference before sticking to it for long. I'm not concerned by keeping
it, but it adds dirt, and the closer THP allocations are to any other
high order allocation the better. So the less __GFP_NO_KSWAPD affects,
the better. The hint about not telling kswapd to insist in the
background for order 9 allocations with fallback (like THP) is the
maximum I consider clean, because there's khugepaged with its
alloc_sleep_millisecs that replaces the kswapd task for THP
allocations. So that is clean enough, but when __GFP_NO_KSWAPD starts
to make compaction behave slightly differently from a SLUB order 2
allocation I don't like it (especially because if you later enable SLUB
or some driver you may run into the same compaction issue again if the
below change is making a difference).

If things work fine even after you reverse the below, we can safely
undo this change and also feel safer for all other high order
allocations, so it'll make life easier. (Plus we don't want unnecessary
special changes; we need to be sure this makes a difference to keep it
for long.)

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2085,7 +2085,7 @@ rebalance:
 					sync_migration);
 	if (page)
 		goto got_pg;
-	sync_migration = true;
+	sync_migration = !(gfp_mask & __GFP_NO_KSWAPD);
 
 	/* Try direct reclaim and then allocating */
 	page = __alloc_pages_direct_reclaim(gfp_mask, order,
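
For anyone following along, the behavioural difference of the hunk above can be sketched in user space roughly as follows. This is only an illustration, not kernel code: sketch_gfp_t, SKETCH_GFP_NO_KSWAPD and its bit value, and the two function names are assumptions made up for the example (the real flag lives in include/linux/gfp.h and its value may differ between kernel versions). Each function returns the value that sync_migration would take after the first compaction attempt has failed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's gfp_t and __GFP_NO_KSWAPD.
 * The bit value is an assumption chosen for illustration only. */
typedef unsigned int sketch_gfp_t;
#define SKETCH_GFP_NO_KSWAPD 0x400000u

/* Behaviour with the hunk reversed (the "-" line): every allocation
 * switches to synchronous migration for the next compaction attempt,
 * regardless of the flags it was called with. */
bool second_pass_sync_before(sketch_gfp_t gfp_mask)
{
	(void)gfp_mask;	/* the mask is not consulted at all */
	return true;
}

/* Behaviour with the hunk applied (the "+" line): allocations that
 * carry the NO_KSWAPD hint (THP-like callers) stay asynchronous,
 * everything else still switches to synchronous migration. */
bool second_pass_sync_after(sketch_gfp_t gfp_mask)
{
	return !(gfp_mask & SKETCH_GFP_NO_KSWAPD);
}
```

So the two variants only differ for callers that set the flag, which is why reversing the hunk makes THP allocations go through the same compaction path as any other high order allocation.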
I tried to reformat the stick as UDF to check whether the stall was
filesystem-sensitive. Apparently it is. I managed to induce the freeze
on firefox while performing the same copy on the aa.git kernel. Then I
reformatted the stick as FAT32 and repeated the test, and it also
induced freezes, although they were a bit shorter and occurred late in
the copy progress. I have attached the traces in the bug report. All of
this is with the kernel before reversing the quoted patch.

On Tue, Mar 22, 2011 at 03:34:10PM -0500, Alex Villacís Lasso wrote:
> I have just tested aa.git as of today, with the USB stick formatted
> as FAT32. I could no longer reproduce the stall

Probably udf is not optimized enough but I wonder if maybe the
udf->vfat change helped more than the other patches. We need the other
patches anyway to provide responsive behavior including the one you
tested before aa.git so it's not very important if udf was the problem,
but it might have been.