From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-vk1-f199.google.com (mail-vk1-f199.google.com [209.85.221.199]) by kanga.kvack.org (Postfix) with ESMTP id A2ED76B0323 for ; Wed, 31 Oct 2018 02:43:40 -0400 (EDT) Received: by mail-vk1-f199.google.com with SMTP id s197so3400822vks.23 for ; Tue, 30 Oct 2018 23:43:40 -0700 (PDT) Received: from mail-sor-f41.google.com (mail-sor-f41.google.com. [209.85.220.41]) by mx.google.com with SMTPS id s65sor15622434vse.30.2018.10.30.23.43.39 for (Google Transport Security); Tue, 30 Oct 2018 23:43:39 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <20181031073149.55ddc085@mschwideX1> References: <1539621759-5967-1-git-send-email-schwidefsky@de.ibm.com> <1539621759-5967-4-git-send-email-schwidefsky@de.ibm.com> <20181031073149.55ddc085@mschwideX1> From: Li Wang Date: Wed, 31 Oct 2018 14:43:38 +0800 Message-ID: Subject: Re: [PATCH 3/3] s390/mm: fix mis-accounting of pgtable_bytes Content-Type: multipart/alternative; boundary="0000000000005263e70579809f2d" Sender: owner-linux-mm@kvack.org List-ID: To: Martin Schwidefsky Cc: Guenter Roeck , Janosch Frank , "Kirill A. Shutemov" , Heiko Carstens , linux-kernel , Linux-MM --0000000000005263e70579809f2d Content-Type: text/plain; charset="UTF-8" On Wed, Oct 31, 2018 at 2:31 PM, Martin Schwidefsky wrote: > On Wed, 31 Oct 2018 14:18:33 +0800 > Li Wang wrote: > > > On Tue, Oct 16, 2018 at 12:42 AM, Martin Schwidefsky < > schwidefsky@de.ibm.com > > > wrote: > > > > > In case a fork or a clone system fails in copy_process and the error > > > handling does the mmput() at the bad_fork_cleanup_mm label, the > > > following warning messages will appear on the console: > > > > > > BUG: non-zero pgtables_bytes on freeing mm: 16384 > > > > > > The reason for that is the tricks we play with mm_inc_nr_puds() and > > > mm_inc_nr_pmds() in init_new_context(). > > > > > > A normal 64-bit process has 3 levels of page table, the p4d level and > > > the pud level are folded. On process termination the free_pud_range() > > > function in mm/memory.c will subtract 16KB from pgtable_bytes with a > > > mm_dec_nr_puds() call, but there actually is not really a pud table. > > > > > > One issue with this is the fact that pgtable_bytes is usually off > > > by a few kilobytes, but the more severe problem is that for a failed > > > fork or clone the free_pgtables() function is not called. In this case > > > there is no mm_dec_nr_puds() or mm_dec_nr_pmds() that go together with > > > the mm_inc_nr_puds() and mm_inc_nr_pmds in init_new_context(). > > > The pgtable_bytes will be off by 16384 or 32768 bytes and we get the > > > BUG message. The message itself is purely cosmetic, but annoying. > > > > > > To fix this override the mm_pmd_folded, mm_pud_folded and mm_p4d_folded > > > function to check for the true size of the address space. > > > > > > > I can confirm that it works to the problem, the warning message is gone > > after applying this patch on s390x. And I also done ltp syscalls/cve test > > for the patch set on x86_64 arch, there has no new regression. > > > > Tested-by: Li Wang > > Thanks for testing. Unfortunately Heiko reported another issue yesterday > with the patch applied. This time the other way around: > > BUG: non-zero pgtables_bytes on freeing mm: -16384 > Okay, the problem is still triggered by LTP/cve-2017-17052.c? I tried this patch on my platform and it works! My test environment as: # lscpu Architecture: s390x CPU op-mode(s): 32-bit, 64-bit Byte Order: Big Endian CPU(s): 2 On-line CPU(s) list: 0,1 Thread(s) per core: 1 Core(s) per socket: 1 Socket(s) per book: 1 Book(s) per drawer: 1 Drawer(s): 2 Vendor ID: IBM/S390 Machine type: 2827 CPU dynamic MHz: 5504 CPU static MHz: 5504 BogoMIPS: 2913.00 Hypervisor vendor: vertical Virtualization type: full Dispatching mode: horizontal L1d cache: 96K L1i cache: 64K L2d cache: 1024K L2i cache: 1024K L3 cache: 49152K L4 cache: 393216K Flags: esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te sie > I am trying to understand how this can happen. For now I would like to > keep the patch on hold in case they need another change. > Sure. > > -- > blue skies, > Martin. > > "Reality continues to ruin my life." - Calvin. > > -- Regards, Li Wang --0000000000005263e70579809f2d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Wed, Oct 31, 2018 at 2:31 PM, Martin Schwidefsky <schwidefs= ky@de.ibm.com> wrote:
On Wed, 31 Oct 2018 14:18:33 +0800
Li Wang <liwang@r= edhat.com> wrote:

> On Tue, Oct 16, 2018 at 12:42 AM, Martin Schwidefsky <schwidefsky@de.ibm.com > > wrote:=C2=A0
>
> > In case a fork or a clone system fails in copy_process and the er= ror
> > handling does the mmput() at the bad_fork_cleanup_mm label, the > > following warning messages will appear on the console:
> >
> >=C2=A0 =C2=A0BUG: non-zero pgtables_bytes on freeing mm: 16384
> >
> > The reason for that is the tricks we play with mm_inc_nr_puds() a= nd
> > mm_inc_nr_pmds() in init_new_context().
> >
> > A normal 64-bit process has 3 levels of page table, the p4d level= and
> > the pud level are folded. On process termination the free_pud_ran= ge()
> > function in mm/memory.c will subtract 16KB from pgtable_bytes wit= h a
> > mm_dec_nr_puds() call, but there actually is not really a pud tab= le.
> >
> > One issue with this is the fact that pgtable_bytes is usually off=
> > by a few kilobytes, but the more severe problem is that for a fai= led
> > fork or clone the free_pgtables() function is not called. In this= case
> > there is no mm_dec_nr_puds() or mm_dec_nr_pmds() that go together= with
> > the mm_inc_nr_puds() and mm_inc_nr_pmds in init_new_context(). > > The pgtable_bytes will be off by 16384 or 32768 bytes and we get = the
> > BUG message. The message itself is purely cosmetic, but annoying.=
> >
> > To fix this override the mm_pmd_folded, mm_pud_folded and mm_p4d_= folded
> > function to check for the true size of the address space.
> >=C2=A0
>
> I can confirm that it works to the problem, the warning message is gon= e
> after applying this patch on s390x. And I also done ltp syscalls/cve t= est
> for the patch set on x86_64 arch, there has no new regression.
>
> Tested-by: Li Wang <liwang@redhat.com>

Thanks for testing. Unfortunately Heiko reported another issue = yesterday
with the patch applied. This time the other way around:

BUG: non-zero pgtables_bytes on freeing mm: -16384
Okay, the pro= blem is still triggered by=C2=A0LTP/cve-2017-17052.c?=C2=A0
=
I tried this patc= h on my platform and it works! My test environment as:

# lscpu
Architecture:=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 s390x
CPU op-= mode(s):=C2=A0 =C2=A0 =C2=A0 =C2=A0 32-bit, 64-bit
Byte Order:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Big Endian
CPU(s):=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 2
On-line CPU(s) lis= t:=C2=A0 =C2=A00,1
Thread(s) per core:=C2= =A0 =C2=A0 1
Core(s) per socket:=C2=A0 = =C2=A0 1
Socket(s) per book:=C2=A0 =C2=A0= 1
Book(s) per drawer:=C2=A0 =C2=A0 1
Drawer(s):=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A02
Vendor ID:=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0IBM/S390
Machine type:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 2827
CPU dynamic MHz:=C2=A0 =C2=A0 =C2=A0 =C2=A05504
CPU static MHz:=C2=A0 =C2=A0 =C2=A0 =C2=A0 5504
<= div class=3D"gmail_default">BogoMIPS:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 2913.00
Hypervisor vendor:=C2= =A0 =C2=A0 =C2=A0vertical
Virtualization = type:=C2=A0 =C2=A0full
Dispatching mode:= =C2=A0 =C2=A0 =C2=A0 horizontal
L1d cache= :=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A096K
L1i cache:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A064K
L2d cache:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A01024K
L2i cache:=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01024K
L3 cache:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 49152K
L4 cache:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 393216K
Flags:=C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0esan3 zarch stfle msa ldis= p eimm dfp edat etf3eh highgprs te sie

I am trying to understand how this can happen. For now I would like to
keep the patch on hold in case they need another change.

Sure.
=
--
blue skies,
=C2=A0 =C2=A0Martin.

"Reality continues to ruin my life." - Calvin.




--
Regards,
Li Wang
--0000000000005263e70579809f2d--