On Fri, May 20, 2011 at 12:19 PM, Minchan Kim <minchan.kim@gmail.com> wrote:
> On Fri, May 20, 2011 at 12:01:12PM -0400, Andrew Lutomirski wrote:
>> On Fri, May 20, 2011 at 11:33 AM, Minchan Kim <minchan.kim@gmail.com> wrote:
>>
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index 8bfd450..a5c01e9 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -1430,7 +1430,10 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
>> >
>> >        /* Check if we should syncronously wait for writeback */
>> >        if (should_reclaim_stall(nr_taken, nr_reclaimed, priority, sc)) {
>> > +               unsigned long nr_active;
>> >                set_reclaim_mode(priority, sc, true);
>> > +               nr_active = clear_active_flags(&page_list, NULL);
>> > +               count_vm_events(PGDEACTIVATE, nr_active);
>> >                nr_reclaimed += shrink_page_list(&page_list, zone, sc);
>> >        }
>> >
>> > --
>>
>> I'm now running that patch *without* the pgdat_balanced fix or the
>> need_resched check.  The VM_BUG_ON doesn't happen but I still get
>
> Please forget need_resched.
> Instead of it, could you test shrink_slab patch with !pgdat_balanced?
>
> @@ -231,8 +231,11 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>       if (scanned == 0)
>               scanned = SWAP_CLUSTER_MAX;
>
> -       if (!down_read_trylock(&shrinker_rwsem))
> -               return 1;       /* Assume we'll be able to shrink next time */
> +       if (!down_read_trylock(&shrinker_rwsem)) {
> +               /* Assume we'll be able to shrink next time */
> +               ret = 1;
> +               goto out;
> +       }
>
>       list_for_each_entry(shrinker, &shrinker_list, list) {
>               unsigned long long delta;
> @@ -286,6 +289,8 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>               shrinker->nr += total_scan;
>       }
>       up_read(&shrinker_rwsem);
> +out:
> +       cond_resched();
>       return ret;
>  }
>
>> incorrect OOM kills.
>>
>> However, if I replace the check with:
>>
>>       if (false &&should_reclaim_stall(nr_taken, nr_reclaimed, priority, sc)) {
>>
>> then my system lags under bad memory pressure but recovers without
>> OOMs or oopses.
>
> I agree you can see OOM but oops? Did you see any oops?

No oops.  I've now reproduced the OOPS with both the if (false) change
and the clear_active_flags change.

Also, would this version be better?  I think your version overcounts
nr_scanned, but I'm not sure what effect that would have.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3f44b81..d1dabc9 100644
@@ -1426,8 +1437,13 @@ shrink_inactive_list(unsigned long nr_to_scan,
struct zone *zone,

 	/* Check if we should syncronously wait for writeback */
 	if (should_reclaim_stall(nr_taken, nr_reclaimed, priority, sc)) {
+		unsigned long nr_active, old_nr_scanned;
 		set_reclaim_mode(priority, sc, true);
+		nr_active = clear_active_flags(&page_list, NULL);
+		count_vm_events(PGDEACTIVATE, nr_active);
+		old_nr_scanned = sc->nr_scanned;
 		nr_reclaimed += shrink_page_list(&page_list, zone, sc);
+		sc->nr_scanned = old_nr_scanned;
 	}

 	local_irq_disable();

I just tested 2.6.38.6 with the attached patch.  It survived dirty_ram
and test_mempressure without any problems other than slowness, but
when I hit ctrl-c to stop test_mempressure, I got the attached oom.

--Andy