From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3])
	by kanga.kvack.org (Postfix) with ESMTP id 3335F6B0169
	for ; Thu, 28 Jul 2011 04:58:52 -0400 (EDT)
Received: by vws14 with SMTP id 14so1930998vws.9
	for ; Thu, 28 Jul 2011 01:58:49 -0700 (PDT)
MIME-Version: 1.0
Date: Thu, 28 Jul 2011 14:28:48 +0530
Message-ID:
Subject: Crashes on ARM platform when sparsemem enabled in linux-2.6.35.13 due to pfn_valid() and pfn_valid_within().
From: ck ck
Content-Type: multipart/alternative; boundary=bcaec547ca458b5b1b04a91d5f75
Sender: owner-linux-mm@kvack.org
List-ID:
To: Russell King , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: yad.naveen@gmail.com, linux-arm-kernel-request@lists.arm.linux.org.uk

--bcaec547ca458b5b1b04a91d5f75
Content-Type: text/plain; charset=ISO-8859-1

Hi,

On my ARM machine, the total kernel memory is not aligned to the section
size (1 << SECTION_SIZE_BITS).

I observe kernel crashes in the following 3 scenarios:

i)   When we do a "cat /proc/pagetypeinfo":
     This happens because the pfn_valid() macro is not able to detect
     invalid PFNs in the loop in
     vmstat.c: pagetypeinfo_showblockcount_print().

ii)  When we do "echo xxxx > /proc/sys/vm/min_free_kbytes":
     This happens because the pfn_valid() macro is not able to detect
     invalid PFNs in
     page_alloc.c: setup_zone_migrate_reserve().

iii) When I try to copy a really huge file from one directory to another:
     This happens because the CONFIG_HOLES_IN_ZONE config option is not
     set. So, the code in move_freepages() crashes.

I did find one patch somewhat related to this problem at:
https://patchwork.kernel.org/patch/793862/
However, this patch is not suitable for my version, linux-2.6.35.13,
because it uses the memblock_*() functionality and CONFIG_HAVE_MEMBLOCK
is not enabled for my platform.
Also, I am not sure whether this will solve point iii) above, as
pfn_valid_within() will continue to return 1 to the caller
unconditionally (CONFIG_HOLES_IN_ZONE is not set) when it should be
calling pfn_valid() instead.

I created a patch for this and would appreciate it if anyone on these
mailing lists could review it and tell me whether:

i)  There is a better way to do this, or
ii) There is already a more suitable patch for the configuration I
    mention above.

Patch:

--- linux-2.6.35.11.p29-FR/arch/arm/Kconfig	2011-07-27 10:27:02.243936001 +0530
+++ linux-2.6.35.11.p29-FR.new/arch/arm/Kconfig	2011-07-27 09:54:00.823935866 +0530
@@ -581,6 +581,16 @@ config OABI_COMPAT
 config ARCH_HAS_HOLES_MEMORYMODEL
 	bool
 
+config ARCH_HAS_PFN_VALID
+	bool
+	depends on SPARSEMEM
+	default y
+
+config HOLES_IN_ZONE
+	bool
+	depends on SPARSEMEM
+	default y
+
 # Discontigmem is deprecated
 config ARCH_DISCONTIGMEM_ENABLE
 	bool
--- linux-2.6.35.9/arch/arm/mm/init.c	2011-06-13 15:18:47.921796999 +0530
+++ linux-2.6.35.9.new/arch/arm/mm/init.c	2011-06-13 11:59:47.236796983 +0530
@@ -350,6 +350,27 @@ static void arm_memory_present(struct me
 {
 }
 #else
+#ifdef CONFIG_ARCH_HAS_PFN_VALID
+int arch_pfn_valid(unsigned long pfn)
+{
+	struct meminfo *mi = &meminfo;
+	unsigned int left = 0, right = mi->nr_banks;
+
+	do {
+		unsigned int mid = (right + left) / 2;
+		struct membank *bank = &mi->bank[mid];
+
+		if (pfn < bank_pfn_start(bank))
+			right = mid;
+		else if (pfn >= bank_pfn_end(bank))
+			left = mid + 1;
+		else
+			return 1;
+	} while (left < right);
+	return 0;
+}
+#endif
+
 static void arm_memory_present(struct meminfo *mi, int node)
 {
 	int i;
--- linux-2.6.35.9/include/linux/mmzone.h	2010-11-23 00:31:26.000000000 +0530
+++ linux-2.6.35.9.new/include/linux/mmzone.h	2011-06-13 12:32:37.182796701 +0530
@@ -1062,10 +1062,20 @@ static inline struct mem_section *__pfn_
 	return __nr_to_section(pfn_to_section_nr(pfn));
 }
 
+#ifdef CONFIG_ARCH_HAS_PFN_VALID
+int arch_pfn_valid(unsigned long);
+#endif
+
 static inline int pfn_valid(unsigned long pfn)
 {
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
+
+#ifdef CONFIG_ARCH_HAS_PFN_VALID
+	if (!arch_pfn_valid(pfn))
+		return 0;
+#endif
+
 	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
 }
 
@@ -1073,6 +1083,12 @@ static inline int pfn_present(unsigned l
 {
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
+
+#ifdef CONFIG_ARCH_HAS_PFN_VALID
+	if (!arch_pfn_valid(pfn))
+		return 0;
+#endif
+
 	return present_section(__nr_to_section(pfn_to_section_nr(pfn)));
 }
 
Thanks,
Kautuk.
--bcaec547ca458b5b1b04a91d5f75--
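The failure mode described in the post can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: the bank layout, the PAGE_SHIFT and SECTION_SIZE_BITS values, and the names section_pfn_valid(), bank_pfn_valid() and demo_checks() are all hypothetical. It contrasts a section-granular validity check (any PFN in a partly populated section looks valid) with a bank-granular binary search in the spirit of the arch_pfn_valid() helper proposed in the patch.

```c
#include <assert.h>

/* Hypothetical values for illustration; not taken from the post. */
#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 28	/* 256 MiB sections */
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)

/* A memory bank covering the half-open PFN range [start_pfn, end_pfn). */
struct bank {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Hypothetical layout, sorted and non-overlapping:
 *   bank 0: 384 MiB at 0x00000000 (fills section 0, half of section 1)
 *   bank 1: 128 MiB at 0x20000000 (half of section 2)
 */
static const struct bank banks[] = {
	{ 0x00000UL, 0x18000UL },
	{ 0x20000UL, 0x28000UL },
};
static const unsigned int nr_banks = sizeof(banks) / sizeof(banks[0]);

/* Section-granular check: valid if any bank overlaps the PFN's section. */
int section_pfn_valid(unsigned long pfn)
{
	unsigned long sec_start = (pfn >> PFN_SECTION_SHIFT) << PFN_SECTION_SHIFT;
	unsigned long sec_end = sec_start + (1UL << PFN_SECTION_SHIFT);

	for (unsigned int i = 0; i < nr_banks; i++)
		if (banks[i].start_pfn < sec_end && banks[i].end_pfn > sec_start)
			return 1;
	return 0;
}

/* Bank-granular check: binary search for a bank containing the PFN. */
int bank_pfn_valid(unsigned long pfn)
{
	unsigned int left = 0, right = nr_banks;

	/* A while loop (not do/while) also copes with nr_banks == 0. */
	while (left < right) {
		unsigned int mid = left + (right - left) / 2;

		if (pfn < banks[mid].start_pfn)
			right = mid;
		else if (pfn >= banks[mid].end_pfn)
			left = mid + 1;
		else
			return 1;
	}
	return 0;
}

/* PFN 0x1c000 lies in the unpopulated half of section 1. */
void demo_checks(void)
{
	assert(section_pfn_valid(0x17fffUL) && bank_pfn_valid(0x17fffUL));
	assert(section_pfn_valid(0x1c000UL) && !bank_pfn_valid(0x1c000UL));
	assert(section_pfn_valid(0x2c000UL) && !bank_pfn_valid(0x2c000UL));
	assert(!section_pfn_valid(0x30000UL) && !bank_pfn_valid(0x30000UL));
}
```

With this assumed layout, PFNs 0x1c000 and 0x2c000 pass the section-level check although no bank backs them, which is the kind of hole the three crash scenarios walk into; the bank-level search rejects both.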