From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pl0-f72.google.com (mail-pl0-f72.google.com [209.85.160.72]) by kanga.kvack.org (Postfix) with ESMTP id D3CF56B0005 for ; Wed, 15 Aug 2018 23:46:50 -0400 (EDT) Received: by mail-pl0-f72.google.com with SMTP id f66-v6so1930350plb.10 for ; Wed, 15 Aug 2018 20:46:50 -0700 (PDT) Received: from mail-sor-f65.google.com (mail-sor-f65.google.com. [209.85.220.65]) by mx.google.com with SMTPS id w28-v6sor5050576pgc.289.2018.08.15.20.46.49 for (Google Transport Security); Wed, 15 Aug 2018 20:46:49 -0700 (PDT) Date: Thu, 16 Aug 2018 12:46:42 +0900 From: Minchan Kim Subject: Re: Re: [PATCH] zsmalloc: fix linking bug in init_zspage Message-ID: <20180816034642.GA62516@rodete-desktop-imager.corp.google.com> References: <20180810002817.2667-1-zhouxianrong@tom.com> <20180813060549.GB64836@rodete-desktop-imager.corp.google.com> <20180813105536.GA435@jagdpanzerIV> <20180814002416.GA34280@rodete-desktop-imager.corp.google.com> <1784999539.213028.1534378242020.JavaMail.root@rz-web2> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable In-Reply-To: <1784999539.213028.1534378242020.JavaMail.root@rz-web2> Sender: owner-linux-mm@kvack.org List-ID: To: zhouxianrong Cc: sergey.senozhatsky.work@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, ngupta@vflare.org, zhouxianrong@huawei.com, vbabka@suse.cz, yudongbin@hisilicon.com Hi zhouxianrong, Please could you be more sepcific what case can we encounter below BUG? (Please use plain text) What zs_class size did you this this problem? Could you say how that can happen? As I wrote in other reply, zsmalloc should never allocate last parital object when I look at source code so we need to understand what specific case we are missing if it's a real zsmalloc bug. Please explain how that can be happen with a real example. Thanks. On Thu, Aug 16, 2018 at 08:10:42AM +0800, zhouxianrong wrote: > Hi:

  I am sorry so later for replying this message due to something.
This is the backtrace edited by me we met.

[pid:3471,cp= u4,thread-3]------------[ cut here ]------------
[pid:3471,cpu4,thread= -3]kernel bug at ../../../../../../mm/zsmalloc.c:1455!
[pid:3471,cpu4,= thread-3]internal error: oops - bug: 0 [#1] preempt smp
[pid:3471,cpu4= ,thread-3]modules linked in:
[pid:3471,cpu4,thread-3]cpu: 4 pid: 3471 = comm: thread-3 tainted: g        w =       4.9.84 #1
[pid:3471,cpu4,thread-3]tgid:= 715 comm: proc-a
[pid:3471,cpu4,thread-3]task: ffffffcc83ba1d00 task.= stack: ffffffcad99b0000
[pid:3471,cpu4,thread-3]pc is at zs_map_object= +0x1e0/0x1f0
[pid:3471,cpu4,thread-3]lr is at zs_map_object+0x9c/0x1f0=
[pid:3471,cpu4,thread-3]pc : [] lr : [] pstate: 20000145
[pid:34= 71,cpu4,thread-3]sp : ffffffcad99b3530
[pid:3471,cpu4,thread-3]x29: ff= ffffcad99b3530 x28: ffffffcc97533c40
[pid:3471,cpu4,thread-3]x27: ffff= ffcc974dd720 x26: ffffffcad99b0000
[pid:3471,cpu4,thread-3]x25: 000000= 0001fa9f80 x24: 0000000000000002
[pid:3471,cpu4,thread-3]x23: ffffff89= c3a27000 x22: ffffff89c30e6000
[pid:3471,cpu4,thread-3]x21: ffffff89c3= 54f000 x20: ffffff89c3234720
[pid:3471,cpu4,thread-3]x19: 000000000000= 0f90 x18: 0000000000000008
[pid:3471,cpu4,thread-3]x17: 00000000bbb877= ff x16: 00000000ffdba560
[pid:3471,cpu4,thread-3]x15: ffffffcaeab13ff5= x14: 000000009e3779b1
[pid:3471,cpu4,thread-3]x13: 0000000000000ff4 x= 12: ffffffcaeab13fd9
[pid:3471,cpu4,thread-3]x11: ffffffcaeab13ffa x10= : ffffffcaeab13ff8
[pid:3471,cpu4,thread-3]x9 : ffffffca8cc201b8 x8 : = ffffffca8cc20190
[pid:3471,cpu4,thread-3]x7 : 000000000000008e x6 : 00= 0000000000009b
[pid:3471,cpu4,thread-3]x5 : 0000000000000000 x4 : 0000= 000000000001
[pid:3471,cpu4,thread-3]x3 : 00000042d42a9000 x2 : 000000= 00000009d0
[pid:3471,cpu4,thread-3]x1 : ffffffcc994ddbc0 x0 : 00000000= 00000000

[pid:3471,cpu4,thread-3] zs_map_object+0x1e0/0x1f0
[pid:3471,cpu4,thread-3] zs_zpool_map+0x44/0x54
[pid:3471,cpu4,thread= -3] zpool_map_handle+0x44/0x58
[pid:3471,cpu4,thread-3] zram_bvec_writ= e+0x22c/0x76c
[pid:3471,cpu4,thread-3] zram_bvec_rw+0x288/0x488
[= pid:3471,cpu4,thread-3] zram_rw_page+0x124/0x1a4
[pid:3471,cpu4,thread= -3] bdev_write_page+0x8c/0xd8
[pid:3471,cpu4,thread-3] __swap_writepag= e+0x1c0/0x3a8
[pid:3471,cpu4,thread-3] swap_writepage+0x3c/0x64
[= pid:3471,cpu4,thread-3] shrink_page_list+0x844/0xd84
[pid:3471,cpu4,th= read-3] reclaim_pages_from_list+0xf4/0x1bc
[pid:3471,cpu4,thread-3] re= claim_pte_range+0x208/0x2a0
[pid:3471,cpu4,thread-3] walk_pgd_range+0x= e8/0x238
[pid:3471,cpu4,thread-3] walk_page_range+0x7c/0x164
[pid= :3471,cpu4,thread-3] reclaim_write+0x208/0x608
[pid:3471,cpu4,thread-3= ] __vfs_write+0x50/0x88
[pid:3471,cpu4,thread-3] vfs_write+0xbc/0x2b0<= br />[pid:3471,cpu4,thread-3] sys_write+0x60/0xc4
[pid:3471,cpu4,threa= d-3] el0_svc_naked+0x34/0x38
[pid:3471,cpu4,thread-3]code: 17ffffdd d4= 210000 97ffff1f 97ffff83 (d4210000)
[pid:3471,cpu4,thread-3]---[ end t= race 652caafc4c4b6d06 ]---
Hi S=
ergey,
>=20
> On Mon, Aug 13, 2018 at 07:55:36PM +0900, Sergey Senozhatsky wrote:
> > On (08/13/18 15:05), Minchan Kim wrote:
> > > > From: zhouxianrong 
> > > >=20
> > > > The last partial object in last subpage of zspage should n=
ot be linked
> > > > in allocation list. Otherwise it could trigger BUG_ON expl=
icitly at
> > > > function zs_map_object. But it happened rarely.
> > >=20
> > > Could you be more specific? What case did you see the problem?
> > > Is it a real problem or one founded by review?
> > [..]
> > > > Signed-off-by: zhouxianrong 
> > > > ---
> > > >  mm/zsmalloc.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >=20
> > > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > > > index 8d87e973a4f5..24dd8da0aa59 100644
> > > > --- a/mm/zsmalloc.c
> > > > +++ b/mm/zsmalloc.c
> > > > @@ -1040,6 +1040,8 @@ static void init_zspage(struct size_=
class *class, struct zspage *zspage)
> > > >  			 * Reset OBJ_TAG_BITS bit to last link to tell
> > > >  			 * whether it's allocated object or not.
> > > >  			 */
> > > > +			if (off > PAGE_SIZE)
> > > > +				link -=3D class->size / sizeof(*link);
> > > >  			link->next =3D -1UL << OBJ_TAG_BITS;
> > > >  		}
> > > >  		kunmap_atomic(vaddr);
> >=20
> > Hmm. This can be a real issue. Unless I'm missing something.
> >=20
> > So... I might be wrong, but the way I see the bug report is:
> >=20
> > When we link objects during zspage init, we do the following:
> >=20
> > 	while ((off +=3D class->size) < PAGE_SIZE) {
> > 		link->next =3D freeobj++ << OBJ_TAG_BITS;
> > 		link +=3D class->size / sizeof(*link);
> > 	}
> >=20
> > Note that we increment the link first, link +=3D class->size / si=
zeof(*link),
> > and check for the offset only afterwards. So by the time we break ou=
t of
> > the while-loop the link *might* point to the partial object which st=
arts at
> > the last page of zspage, but *never* ends, because we don't have nex=
t_page
> > in current zspage. So that's why that object should not be linked in,
> > because it's not a valid allocates object - we simply don't have spa=
ce
> > for it anymore.
> >=20
> > zspage [      page 1     ][      page 2      ]
> >         ...............................link
> > 	                                   [..###]
> >=20
> > therefore the last object must be "link - 1" for such case=
s.
> >=20
> > I think, the following change can also do the trick:
> >=20
> > 	while ((off + class->size) < PAGE_SIZE) {
> > 		link->next =3D freeobj++ << OBJ_TAG_BITS;
> > 		link +=3D class->size / sizeof(*link);
> > 		off +=3D class->size;
> > 	}
> >=20
> > Once again, I might be wrong on this.
> > Any thoughts?
>=20
> If we want a refactoring, I'm not against but description said it tiggered
> BUG_ON on zs_map_object rarely. That means it should be stable material
> and need more description to understand. Please be more specific with
> some example. The reason I'm hesitating is zsmalloc moves ZS_FULL group
> when the zspage->inuse is equal to class->objs_per_zspage so I thou=
ght
> it shouldn't allocate last partial object.
>=20
> Thanks.
> 
=E9=9A=8F=E5=BF=83=E9= =82=AE-=E5=9C=A8=E5=BE=AE=E4=BF=A1=E9=87=8C=E6=94=B6=E5=8F=91=E9=82=AE=E4= =BB=B6=EF=BC=8C=E5=8F=8A=E6=97=B6=E7=9C=81=E7=94=B5=E5=8F=88=E5=AE=89=E5=BF= =83