linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [MAILER-DAEMON@watson.ibm.com: Returned mail: see transcript for details]
       [not found] <20070903201645.GA11502@wotan.suse.de>
@ 2007-09-04  6:31 ` Balbir Singh
  2007-09-04 17:37   ` Nick Piggin
  0 siblings, 1 reply; 2+ messages in thread
From: Balbir Singh @ 2007-09-04  6:31 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Balbir Singh, Shailabh Nagar, Linux Memory Management List, Hugh Dickins

Nick Piggin wrote:
> Hi,
> 
> This mail to Shailabh bounced, but I noticed you're on the Signed-off-by
> trail too, so forwarding to you so you don't miss it. Thanks,
> 

Thanks, Nick. I am cc'ing Shailabh's new email id.

> Hi,
> 
> I can't convince myself that delay accounting for swapin is quite right
> at the moment (not having a test setup handy to run and check for myself).
> Maybe I'm not reading the swapin code very well...
> 
> lookup_swap_cache, and read_swap_cache_async should be non-blocking
> operations for the most part.
> 
> read_swap_cache_async might, when allocating the new page, go into reclaim
> and take a long time to come back. However is that any more a "swapin" delay
> than eg. when we sleep on mmap_sem when first taking the fault, or any other
> types of fault which require allocations? None of which we account for as
> swapin delay.
> 

Yes, for us it is. We measure the end to end delay of swapping in/out a page.

> But the most obvious delay, where we actually lock the page waiting for
> the swap IO to finish, does not seem to be accounted at all!
> 

Hmm.. Does lock_page() eventually call io_schedule() or io_schedule_timeout()?
I think it does -- via sync_page(). The way our accounting works is that we
account for block I/O in io_schedule*(). If we see the SWAPIN flag set, we
then account for that I/O as swap I/O.


> My proposed fix is to just move the swaping delay accounting to the
> point where the VM does actually wait, for the swapin.
> 
> I have no idea what uses swapin delay accounting, but it would be good to
> see if this makes a positive (or at least not negative) impact on those
> users...
> 
> Thanks,
> Nick
> 
> --
> Index: linux-2.6/mm/memory.c
> ===================================================================
> --- linux-2.6.orig/mm/memory.c
> +++ linux-2.6/mm/memory.c
> @@ -2158,7 +2158,6 @@ static int do_swap_page(struct mm_struct
>  		migration_entry_wait(mm, pmd, address);
>  		goto out;
>  	}
> -	delayacct_set_flag(DELAYACCT_PF_SWAPIN);

Let's start the accounting here.

>  	page = lookup_swap_cache(entry);
>  	if (!page) {
>  		grab_swap_token(); /* Contend for token _before_ read-in */
> @@ -2172,7 +2171,6 @@ static int do_swap_page(struct mm_struct
>  			page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
>  			if (likely(pte_same(*page_table, orig_pte)))
>  				ret = VM_FAULT_OOM;
> -			delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
>  			goto unlock;
>  		}
> 
> @@ -2181,9 +2179,10 @@ static int do_swap_page(struct mm_struct
>  		count_vm_event(PGMAJFAULT);
>  	}
> 
> -	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
>  	mark_page_accessed(page);
> +	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
>  	lock_page(page);
> +	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> 

I agree that we should end it after lock_page().

>  	/*
>  	 * Back out if somebody else already faulted in this pte.
> 
> 
> ----- End forwarded message -----


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [MAILER-DAEMON@watson.ibm.com: Returned mail: see transcript for details]
  2007-09-04  6:31 ` [MAILER-DAEMON@watson.ibm.com: Returned mail: see transcript for details] Balbir Singh
@ 2007-09-04 17:37   ` Nick Piggin
  0 siblings, 0 replies; 2+ messages in thread
From: Nick Piggin @ 2007-09-04 17:37 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Balbir Singh, Shailabh Nagar, Linux Memory Management List, Hugh Dickins

On Tue, Sep 04, 2007 at 07:31:14AM +0100, Balbir Singh wrote:
> Nick Piggin wrote:
> 
> > But the most obvious delay, where we actually lock the page waiting for
> > the swap IO to finish, does not seem to be accounted at all!
> > 
> 
> Hmm.. Does lock_page() eventually call io_schedule() or io_schedule_timeout()?
> I think it does -- via sync_page(). The way our accounting works is that we
> account for block I/O in io_schedule*(). If we see the SWAPIN flag set, we
> then account for that I/O as swap I/O.

Yes, lock_page will io_schedule() down the path you say.

> > My proposed fix is to just move the swaping delay accounting to the
> > point where the VM does actually wait, for the swapin.
> > 
> > I have no idea what uses swapin delay accounting, but it would be good to
> > see if this makes a positive (or at least not negative) impact on those
> > users...
> > 
> > Thanks,
> > Nick
> > 
> > --
> > Index: linux-2.6/mm/memory.c
> > ===================================================================
> > --- linux-2.6.orig/mm/memory.c
> > +++ linux-2.6/mm/memory.c
> > @@ -2158,7 +2158,6 @@ static int do_swap_page(struct mm_struct
> >  		migration_entry_wait(mm, pmd, address);
> >  		goto out;
> >  	}
> > -	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> 
> Let's start the accounting here.
 
If you start accounting here, then you can potentially also count other
stuff as swapin delay which I don't think you want to. I'm not sure if
there is any part of reclaim where we directly wait on page IO, but we
do wait on blkdev congestion which calls io_schedule. It seems funny to
account this as swapin delay.


> >  	page = lookup_swap_cache(entry);
> >  	if (!page) {
> >  		grab_swap_token(); /* Contend for token _before_ read-in */
> > @@ -2172,7 +2171,6 @@ static int do_swap_page(struct mm_struct
> >  			page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
> >  			if (likely(pte_same(*page_table, orig_pte)))
> >  				ret = VM_FAULT_OOM;
> > -			delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> >  			goto unlock;
> >  		}
> > 
> > @@ -2181,9 +2179,10 @@ static int do_swap_page(struct mm_struct
> >  		count_vm_event(PGMAJFAULT);
> >  	}
> > 
> > -	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> >  	mark_page_accessed(page);
> > +	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> >  	lock_page(page);
> > +	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> > 
> 
> I agree that we should end it after lock_page().

OK.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2007-09-04 17:37 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20070903201645.GA11502@wotan.suse.de>
2007-09-04  6:31 ` [MAILER-DAEMON@watson.ibm.com: Returned mail: see transcript for details] Balbir Singh
2007-09-04 17:37   ` Nick Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox