From: Amol Kumar Lad <amolk@ishoni.com>
To: Martin Maletinsky <maletinsky@scs.ch>
Cc: Joseph A Knapka <jknapka@earthlink.net>,
linux-mm@kvack.org, kernelnewbies@nl.linux.org
Subject: Re: Question on swapping
Date: 06 Dec 2002 20:29:50 -0500
Message-ID: <1039224592.4551.57.camel@amol.in.ishoni.com>
In-Reply-To: <3DF071C3.C3E1EC39@scs.ch>
On Fri, 2002-12-06 at 04:45, Martin Maletinsky wrote:
> Hello Joe,
>
> Thank you for your reply. I have an additional question.
>
> Joseph A Knapka wrote:
>
> > Martin Maletinsky wrote:
> > > Hello,
> > >
> > > I am looking at the swapping mechanism in Linux. I have read the
> > > relevant chapter 16 in 'Understanding the Linux Kernel' from
> > > Bovet&Cesati, and looked at the 2.2.18 kernel source code. I still
> > > have the following question:
> > >
> > > Function try_to_swap_out() [p. 481 in 'Understanding the Linux
> > > Kernel']:
> > > If the page in question already belongs to the swap cache, the
> > > function performs no data transfer to the swap space on the disk (but
> > > only marks the page as swapped out).
> > > The corresponding comment in the try_to_swap_out() function states
> > > 'Is the page already in the swap cache? If so, ..... - it is already
> > > up-to-date on disk'.
> > > Understanding the Linux Kernel states on p. 482 'If the page belongs
> > > to the swap cache .... no memory transfer is performed'.
> > > Now my question is, couldn't the page have been modified since it
> > > was added to the swap cache (and written to disk), and thus differ from
> > > the data in the swap space? In this case shouldn't the page be written
> > > to disk (again)?
> >
> > If the page is in the swap cache, it's *effectively* up to date on
> > disk, because the swap cache page is *the* authoritative image of the
> > page. If it's dirty it will get written out by page_launder() in short
> > order, because whoever dirtied it set the PG_dirty bit in the
> > page struct. That issue is unimportant to the process doing the
> > swap_out, though - all it cares about is that the page is going
> > to be taken care of by the cache machinery.
>
> Assume a page P that is marked as clean (i.e. PG_dirty bit not set), and
> is in the page cache. Additionally assume that P is mapped by a process
> A. Now let A perform a store operation into the page P, which will mark
> A's page table entry mapping P as dirty (i.e. set the dirty bit).
> Subsequently assume that try_to_swap_out() is called on A's page table
> entry that maps P. try_to_swap_out() will see that P is in the swap
> cache already,
>>>> The first time around, P would never be found in the swap cache. In fact,
try_to_swap_out() shall do the following:
a] The page is dirty (in the page table entry), so set PG_dirty in struct page
b] Allocate a swap entry and add this page to the swap cache
c] Release the page, and modify the page table entry to point to the
swap entry
Now we have:
a] The page table entry for P contains swap info
b] Page P is in the swap cache
c] PG_dirty _is_ set (in fact, for a page in the swap cache this is always
true)
Do remember that, besides being in the swap cache, P may also be part of the
inactive_dirty list.
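
The three swap-out steps and the resulting state can be sketched as a toy user-space model. This is only an illustration under stated assumptions: struct toy_page, struct toy_pte, toy_swap_out() and next_swap_slot are hypothetical names invented here, not kernel types or functions, and the reference counting is simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or names exist in the kernel. */
struct toy_page {
    bool pg_dirty;            /* stands in for PG_dirty in struct page */
    bool in_swap_cache;       /* stands in for PageSwapCache() */
    unsigned long swap_entry; /* swap slot backing the page, 0 = none */
    int refcount;             /* simplified reference count */
};

struct toy_pte {
    bool present;             /* true while the pte maps a page frame */
    bool dirty;               /* dirty bit in the page table entry */
    struct toy_page *page;    /* valid while present */
    unsigned long swap_entry; /* valid while not present, 0 = none */
};

static unsigned long next_swap_slot = 1; /* hypothetical slot allocator */

/* Unmap the page behind pte; returns true if the pte was changed. */
static bool toy_swap_out(struct toy_pte *pte)
{
    struct toy_page *page;

    if (!pte->present)
        return false;
    page = pte->page;

    /* a] carry the pte dirty bit over to the page itself */
    if (pte->dirty)
        page->pg_dirty = true;

    /* b] allocate a swap entry and add the page to the swap cache */
    if (!page->in_swap_cache) {
        page->swap_entry = next_swap_slot++;
        page->in_swap_cache = true;
        page->refcount++;     /* the swap cache now holds a reference */
    }

    /* c] point the pte at the swap entry and release the mapping */
    pte->present = false;
    pte->dirty = false;
    pte->page = NULL;
    pte->swap_entry = page->swap_entry;
    page->refcount--;         /* the process drops its reference */
    return true;
}
```

Note that no I/O happens in this sketch: the page stays in memory, dirty, and reachable only through the swap cache.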
The actual swapping to the backing store is done by the page scanner.
Assume it has decided to _really_ free P. It shall do the following:
1] As the page is dirty, call the page write-back function. This is the
first time the page finds its place in swap.
2] Send P back home, to the buddy allocator
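
A minimal sketch of the two page-scanner steps above, again as a toy user-space model rather than kernel code. The names toy_reclaim(), toy_swap and toy_writes, and the tiny page and swap sizes, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE  16 /* illustrative, not a real page size */
#define TOY_SWAP_SLOTS 4  /* illustrative swap area size */

/* Toy model only: none of these types or names exist in the kernel. */
struct toy_page {
    char data[TOY_PAGE_SIZE];
    bool pg_dirty;            /* stands in for PG_dirty */
    bool in_swap_cache;
    unsigned long swap_entry; /* index into toy_swap */
    bool is_free;             /* true once handed back to the allocator */
};

static char toy_swap[TOY_SWAP_SLOTS][TOY_PAGE_SIZE]; /* fake swap device */
static int toy_writes;                               /* write-backs done */

/* Free a swap cache page, writing it out first if it is dirty. */
static bool toy_reclaim(struct toy_page *page)
{
    if (!page->in_swap_cache || page->swap_entry >= TOY_SWAP_SLOTS)
        return false;

    /* 1] dirty page: write it to its swap slot before letting go */
    if (page->pg_dirty) {
        memcpy(toy_swap[page->swap_entry], page->data, TOY_PAGE_SIZE);
        page->pg_dirty = false;
        toy_writes++;
    }

    /* 2] drop it from the swap cache and return it to the allocator */
    page->in_swap_cache = false;
    page->is_free = true;
    return true;
}
```

The point of the sketch is the ordering: the write to swap happens here, at reclaim time, and only if the page is still marked dirty.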
If process A accesses the page again, the page fault handler shall do the
following:
1] Allocate a swap cache page
2] Read the page from swap
3] Modify the page table entry of A to point to this page
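
The swap-in path can be sketched the same way. This is a toy model under assumptions: toy_swap_in(), the toy_swap array and the struct layouts are hypothetical, and calloc() merely stands in for the page allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE  16 /* illustrative, not a real page size */
#define TOY_SWAP_SLOTS 4  /* illustrative swap area size */

/* Toy model only: none of these types or names exist in the kernel. */
struct toy_page {
    char data[TOY_PAGE_SIZE];
    bool in_swap_cache;
    unsigned long swap_entry; /* index into toy_swap */
};

struct toy_pte {
    bool present;
    struct toy_page *page;    /* valid while present */
    unsigned long swap_entry; /* valid while not present */
};

static char toy_swap[TOY_SWAP_SLOTS][TOY_PAGE_SIZE]; /* fake swap device */

/* Handle a fault on a swapped-out pte; returns the mapped page or NULL. */
static struct toy_page *toy_swap_in(struct toy_pte *pte)
{
    struct toy_page *page;

    if (pte->present)
        return pte->page;     /* nothing to do, already mapped */
    if (pte->swap_entry >= TOY_SWAP_SLOTS)
        return NULL;          /* not a valid slot in this model */

    /* 1] allocate a page and put it in the swap cache */
    page = calloc(1, sizeof(*page));
    if (page == NULL)
        return NULL;
    page->in_swap_cache = true;
    page->swap_entry = pte->swap_entry;

    /* 2] read the page contents back from swap */
    memcpy(page->data, toy_swap[pte->swap_entry], TOY_PAGE_SIZE);

    /* 3] make the page table entry point at the page again */
    pte->present = true;
    pte->page = page;
    return page;
}
```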
Just give all this a little thought and the VM shall reveal herself to
you :)
bye
Amol
> and thus drop the pte.
> This leads to a situation where P is in the swap cache, marked as clean
> (i.e. PG_dirty bit clear), while the disk image differs from the data
> that is in the main memory page frame.
> I would have expected try_to_swap_out() to check the page table entry's
> dirty bit, and to mark the page dirty. However, I can't see any such
> code in the function (I pasted the relevant lines of code from linux
> 2.2.18 below).
>
> static int try_to_swap_out(struct task_struct * tsk, struct vm_area_struct* vma,
>         unsigned long address, pte_t * page_table, int gfp_mask)
> {
>         pte_t pte;
>         unsigned long entry;
>         unsigned long page;
>         struct page * page_map;
>
>         pte = *page_table;
>         if (!pte_present(pte))
>                 return 0;
>         page = pte_page(pte);
>         if (MAP_NR(page) >= max_mapnr)
>                 return 0;
>         page_map = mem_map + MAP_NR(page);
>
>         if (pte_young(pte)) {
>                 /*
>                  * Transfer the "accessed" bit from the page
>                  * tables to the global page map.
>                  */
>                 set_pte(page_table, pte_mkold(pte));
>                 flush_tlb_page(vma, address);
>                 set_bit(PG_referenced, &page_map->flags);
>                 return 0;
>         }
>
>         if (PageReserved(page_map)
>             || PageLocked(page_map)
>             || ((gfp_mask & __GFP_DMA) && !PageDMA(page_map)))
>                 return 0;
>
>         /*
>          * Is the page already in the swap cache? If so, then
>          * we can just drop our reference to it without doing
>          * any IO - it's already up-to-date on disk.
>          *
>          * Return 0, as we didn't actually free any real
>          * memory, and we should just continue our scan.
>          */
>         if (PageSwapCache(page_map)) {
>                 entry = page_map->offset;
>                 swap_duplicate(entry);
>                 set_pte(page_table, __pte(entry));
> drop_pte:
>                 vma->vm_mm->rss--;
>                 flush_tlb_page(vma, address);
>                 __free_page(page_map);
>                 return 0;
>         }
> ............
>
>
> Thanks again, with best regards
> Martin
>
> --
> Supercomputing System AG email: maletinsky@scs.ch
> Martin Maletinsky phone: +41 (0)1 445 16 05
> Technoparkstrasse 1 fax: +41 (0)1 445 16 10
> CH-8005 Zurich
>
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/