From: Bharata B Rao <bharata@in.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: Suparna <bsuparna@in.ibm.com>, Andrew Morton <akpm@zip.com.au>,
"Eric W. Biederman" <ebiederm@xmission.com>,
marcelo@brutus.conectiva.com.br, linux-mm@kvack.org
Subject: Re: [PATCH]Fix: Init page count for all pages during higher order allocs
Date: Tue, 7 May 2002 13:04:46 +0530 [thread overview]
Message-ID: <20020507130446.F982@in.ibm.com> (raw)
In-Reply-To: <20020502142441.A1668@in.ibm.com>; from suparna@in.ibm.com on Thu, May 02, 2002 at 02:24:41PM +0530
On Thu, May 02, 2002 at 02:24:41PM +0530, Suparna Bhattacharya wrote:
> On Tue, Apr 30, 2002 at 12:47:24PM -0700, Andrew Morton wrote:
>
> > An alternative is to just set PG_inuse against _all_ pages
> > in rmqueue(), and clear PG_inuse against all pages in
> > __free_pages_ok(). Which seems cleaner, and would fix other
> > problems, I suspect.
>
> This works well for us. If no one minds the extra flag, and it
> is preferable to the option of initializing page count for
> higher order pages, we'll go ahead and do this.
>
Here is a patch against 2.4.18 kernel which uses PG_inuse page flag.
As has been discussed in this thread, this flag will be used to track all
pages (including partial higher order pages) allocated by the kernel.
Per Andi Kleen's suggestion it is preferable to use __set_bit/__clear_bit
in this case. However __clear_bit doesn't seem to be defined for all
architectures in 2.4, and hence we couldn't use it in arch
independent code as yet. A backport from 2.5 can be contemplated, but
even there definitions seem to be missing for some archs (e.g mips).
diff -urN -X dontdiff 2418-pure/include/linux/mm.h linux-2.4.18+flg/include/linux/mm.h
--- 2418-pure/include/linux/mm.h Fri Dec 21 23:12:03 2001
+++ linux-2.4.18+flg/include/linux/mm.h Tue May 7 12:46:53 2002
@@ -285,6 +285,9 @@
#define PG_arch_1 13
#define PG_reserved 14
#define PG_launder 15 /* written out by VM pressure.. */
+#define PG_inuse 16 /* set for all pages(including higher
+ order partial-pages) allocated
+ by the kernel. */
/* Make it prettier to test the above... */
#define UnlockPage(page) unlock_page(page)
@@ -301,6 +304,15 @@
#define SetPageChecked(page) set_bit(PG_checked, &(page)->flags)
#define PageLaunder(page) test_bit(PG_launder, &(page)->flags)
#define SetPageLaunder(page) set_bit(PG_launder, &(page)->flags)
+
+/*
+ * Using the non-atomic version __set_bit as per Andi Kleen's suggestion.
+ * Currently __clear_bit is not available on all architectures in 2.4.
+ */
+#define PageInuse(page) test_bit(PG_inuse, &(page)->flags)
+#define SetPageInuse(page) __set_bit(PG_inuse, &(page)->flags)
+/* Replace this by __clear_bit in 2.5 */
+#define ClearPageInuse(page) clear_bit(PG_inuse, &(page)->flags)
extern void FASTCALL(set_page_dirty(struct page *));
diff -urN -X dontdiff 2418-pure/mm/page_alloc.c linux-2.4.18+flg/mm/page_alloc.c
--- 2418-pure/mm/page_alloc.c Tue Feb 26 01:08:14 2002
+++ linux-2.4.18+flg/mm/page_alloc.c Thu May 2 15:16:05 2002
@@ -69,6 +69,14 @@
free_area_t *area;
struct page *base;
zone_t *zone;
+ unsigned int i;
+
+ i = 1UL << order;
+ page += i;
+ do {
+ page--;
+ ClearPageInuse(page);
+ } while (--i);
/* Yes, think what happens when other parts of the kernel take
* a reference to a page in order to pin it for io. -ben
@@ -181,7 +189,7 @@
static struct page * rmqueue(zone_t *zone, unsigned int order)
{
free_area_t * area = zone->free_area + order;
- unsigned int curr_order = order;
+ unsigned int i, curr_order = order;
struct list_head *head, *curr;
unsigned long flags;
struct page *page;
@@ -206,6 +214,13 @@
page = expand(zone, page, index, order, curr_order, area);
spin_unlock_irqrestore(&zone->lock, flags);
+ i = 1UL << order;
+ page += i;
+ do {
+ page--;
+ SetPageInuse(page);
+ } while (--i);
+
set_page_count(page, 1);
if (BAD_RANGE(zone,page))
BUG();
@@ -236,6 +251,7 @@
{
struct page * page = NULL;
int __freed = 0;
+ unsigned int i;
if (!(gfp_mask & __GFP_WAIT))
goto out;
@@ -264,9 +280,15 @@
if (tmp->index == order && memclass(tmp->zone, classzone)) {
list_del(entry);
current->nr_local_pages--;
- set_page_count(tmp, 1);
- page = tmp;
+ i = 1UL << order;
+ page = tmp + i;
+ do {
+ page--;
+ SetPageInuse(page);
+ } while (--i);
+
+ set_page_count(page, 1);
if (page->buffers)
BUG();
if (page->mapping)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
prev parent reply other threads:[~2002-05-07 7:34 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20020429202446.A2326@in.ibm.com>
2002-04-29 17:40 ` Eric W. Biederman
2002-04-30 5:31 ` Suparna Bhattacharya
2002-04-30 14:05 ` Eric W. Biederman
2002-04-30 15:08 ` Suparna Bhattacharya
2002-04-30 19:47 ` Andrew Morton
2002-05-02 8:54 ` Suparna Bhattacharya
2002-05-02 13:08 ` Hugh Dickins
2002-05-02 21:13 ` Daniel Phillips
2002-05-03 12:24 ` Suparna Bhattacharya
2002-05-03 13:46 ` Hugh Dickins
2002-05-07 10:11 ` Suparna Bhattacharya
2002-05-07 7:34 ` Bharata B Rao [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20020507130446.F982@in.ibm.com \
--to=bharata@in.ibm.com \
--cc=akpm@zip.com.au \
--cc=bsuparna@in.ibm.com \
--cc=ebiederm@xmission.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=marcelo@brutus.conectiva.com.br \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox