From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E692EC433EF for ; Tue, 8 Mar 2022 15:16:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 107D98D0002; Tue, 8 Mar 2022 10:16:41 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 09B1E8D0001; Tue, 8 Mar 2022 10:16:41 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E995C8D0002; Tue, 8 Mar 2022 10:16:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0049.hostedemail.com [216.40.44.49]) by kanga.kvack.org (Postfix) with ESMTP id DA89A8D0001 for ; Tue, 8 Mar 2022 10:16:40 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 89F0A182A5EFC for ; Tue, 8 Mar 2022 15:16:40 +0000 (UTC) X-FDA: 79221570960.19.072ED51 Received: from forward501o.mail.yandex.net (forward501o.mail.yandex.net [37.140.190.203]) by imf10.hostedemail.com (Postfix) with ESMTP id 6A5F6C000F for ; Tue, 8 Mar 2022 15:16:39 +0000 (UTC) Received: from iva8-4654c25a7801.qloud-c.yandex.net (iva8-4654c25a7801.qloud-c.yandex.net [IPv6:2a02:6b8:c0c:7784:0:640:4654:c25a]) by forward501o.mail.yandex.net (Yandex) with ESMTP id F050E45C4249; Tue, 8 Mar 2022 18:16:36 +0300 (MSK) Received: from iva6-2d18925256a6.qloud-c.yandex.net (iva6-2d18925256a6.qloud-c.yandex.net [2a02:6b8:c0c:7594:0:640:2d18:9252]) by iva8-4654c25a7801.qloud-c.yandex.net (mxback/Yandex) with ESMTP id Fy0V8XOSv8-Gaeiulld; Tue, 08 Mar 2022 18:16:36 +0300 X-Yandex-Fwd: 2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=clicknet.pro; s=mail; t=1646752596; bh=y5bFm+wcpM3yETvsY5zoUoJutQ/V3dIzNkO8mxW8AHw=; h=In-Reply-To:References:Date:Cc:To:From:Subject:Message-ID; b=awVxqSfTrZmiXrynbLgUDBxSfHyJCq60UemiiJzClmTXEErj7LKhdDpU7h6jh7eI/ AYEj9BJnWawBLqTCNyUk2wa+0aF3EN31SdooQ3dnmUyjWELgUjcUL93OEDBFpnOvN/ jctWYzHFqaXWvWXxqvWaSoZtRrfbuCrIq5/9AM0U= Received: by iva6-2d18925256a6.qloud-c.yandex.net (smtp/Yandex) with ESMTPSA id QMBg6RdLRG-GZK0xEFh; Tue, 08 Mar 2022 18:16:35 +0300 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (Client certificate not present) Message-ID: <8255c6882aa7f78e6855cf4adda66afe0c1df809.camel@clicknet.pro> Subject: Re: [PATCH v3] mm: add ztree - new allocator for use via zpool API From: Ananda Badmaev To: Matthew Wilcox Cc: linux-mm@kvack.org, vitaly.wool@konsulko.com, vbabka@suse.cz, akpm@linux-foundation.org Date: Tue, 08 Mar 2022 18:16:35 +0300 In-Reply-To: References: <20220307142724.14519-1-a.badmaev@clicknet.pro> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.38.3-1 MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 6A5F6C000F X-Rspam-User: Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=clicknet.pro header.s=mail header.b=awVxqSfT; spf=pass (imf10.hostedemail.com: domain of a.badmaev@clicknet.pro designates 37.140.190.203 as permitted sender) smtp.mailfrom=a.badmaev@clicknet.pro; dmarc=none X-Stat-Signature: 5jtdb41ct8ptge4wk11ahgcpuxsr3fpn X-HE-Tag: 1646752599-582746 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: =D0=92 =D0=9F=D0=BD, 07/03/2022 =D0=B2 15:08 +0000, Matthew Wilcox =D0=BF= =D0=B8=D1=88=D0=B5=D1=82: > On Mon, Mar 07, 2022 at 05:27:24PM +0300, Ananda wrote: > > +/***************** > > + * Structures > > + *****************/ >=20 > You don't need this.=C2=A0 I can see they're structures. >=20 > > +/** > > + * struct ztree_block - block metadata > > + * Block consists of several (1/2/4/8) pages and contains fixed > > + * integer number of slots for allocating compressed pages. > > + * @block_node:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 links block into the relevant tree in > > the pool > > + * @slot_info:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 contains data about fre= e/occupied slots > > + * @compressed_data: pointer to the memory block > > + * @block_index:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 unique fo= r each ztree_block in the tree > > + * @free_slots:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 number of free slots in the bl= ock > > + * @coeff:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= to switch between blocks > > + * @under_reclaim:=C2=A0=C2=A0 if true shows that block is being evi= cted > > + */ >=20 > Earlier in the file you say this exposes no API and is to be used > only > through zpool.=C2=A0 So what's the point of marking this as kernel-doc? >=20 It will be removed in the next version. > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* 1 page blocks with 11 s= lots */ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0[1] =3D { PAGE_SIZE / (11 = * sizeof(long)) * sizeof(long), > > 0xB, 0 }, >=20 > Hm.=C2=A0 So 368 bytes on 64-bit, but 372 bytes on 32-bit?=C2=A0 Is tha= t > intentional?=C2=A0 Why 11? >=20 Yes, 'slot_size' and 'slots_per_block' values are chosen so that in general the range from 0 to PAGE_SIZE is split more or less evenly and the size of the block is as close as possible to the power of 2. Also 'slot_size' values are aligned to the size of long. > > +/* > > + * allocate new block and add it to corresponding block tree > > + */ > > +static struct ztree_block *alloc_block(struct ztree_pool *pool, > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0int block_type, gfp_t gfp) >=20 > You have some very strange indentation (throughout). >=20 I was trying to limit the length of lines. > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0block =3D kmem_cache_alloc= (pool->block_cache, > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0(gfp & ~(__GFP_HIGHMEM | > > __GFP_MOVABLE))); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (!block) > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0return NULL; > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0block->compressed_data =3D= (void *)__get_free_pages(gfp, > > tree_desc[block_type].order); >=20 > It makes no sense to mask out __GFP_HIGHMEM and __GFP_MOVABLE for the > call > to slab and then not mask them out here.=C2=A0 Either they shouldn't ev= er > be > passed in, in which case that could either be asserted or made true > in > your own code.=C2=A0 Or they can be passed in, and should always be > masked. > Or you genuinely want to be able to use highmem & movable memory for > these data blocks, in which case you're missing calls to kmap() and > memory notifiers to let you move the memory around. >=20 > This smacks of "I tried something, and slab warned, so I fixed the > warning" instead of thinking about what the warning meant. >=20 It seems that these flags should be masked out in alloc_block(). > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0spin_lock(&tree->lock); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* check if there are free= slots in the current and the > > last added blocks */ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (tree->current_block &&= tree->current_block->free_slots > > > 0) { > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0block =3D tree->current_block; > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0goto found; > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (tree->last_block && tr= ee->last_block->free_slots > 0) { > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0block =3D tree->last_block; > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0goto found; > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0spin_unlock(&tree->lock); > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* not found block with fr= ee slots try to allocate new > > empty block */ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0block =3D alloc_block(pool= , block_type, gfp); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0spin_lock(&tree->lock); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (block) { > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0tree->current_block =3D block; > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0goto found; > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} >=20 > Another place that looks like "I fixed the warning instead of > thinking > about it".=C2=A0 What if you have two threads that execute this path > concurrently?=C2=A0 Looks to me like you leak the memory allocated by t= he > first thread. >=20 Probably I should pass GFP_ATOMIC flag to alloc_block() and execute this entire section of code under single spinlock. Copy of previous mail message, but in text/plain.