From: Yosry Ahmed
Date: Mon, 17 Apr 2023 04:16:24 -0700
Subject: Re: [PATCHv4 0/4] zsmalloc: fine-grained fullness and new compaction algorithm
To: Sergey Senozhatsky
Cc: Yu Zhao, Minchan Kim, Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org
In-Reply-To: <20230417111243.GN25053@google.com>
References: <20230304034835.2082479-1-senozhatsky@chromium.org> <20230416151853.GK25053@google.com> <20230417024446.GL25053@google.com> <20230417035232.GM25053@google.com> <20230417111243.GN25053@google.com>

On Mon, Apr 17, 2023 at 4:12 AM Sergey Senozhatsky wrote:
>
> On (23/04/17 01:29), Yosry Ahmed wrote:
> > > @@ -2239,8 +2241,8 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> > >                 if (fg == ZS_INUSE_RATIO_0) {
> > >                         free_zspage(pool, class, src_zspage);
> > >                         pages_freed += class->pages_per_zspage;
> > > -                       src_zspage = NULL;
> > >                 }
> > > +               src_zspage = NULL;
> > >
> > >                 if (get_fullness_group(class, dst_zspage) == ZS_INUSE_RATIO_100
> > >                     || spin_is_contended(&pool->lock)) {
> >
> > For my own education, how can this result in the "next is NULL" debug
> > error Yu Zhao is seeing?
> >
> > IIUC if we do not set src_zspage to NULL properly after putback, then
> > we will attempt to putback again after the main loop in some cases.
> > This can result in a zspage being present more than once in the
> > per-class fullness list, right?
> >
> > I am not sure how this can lead to "next is NULL", which sounds like a
> > corrupted list_head, because the next ptr should never be NULL as far
> > as I can tell. I feel like I am missing something.
>
> That's a good question to which I don't have an answer. We can list_add()
> the same zspage twice, unlocking the pool after first list_add() so that
> another process (including another zs_compact()) can do something to that
> zspage. The answer is somewhere between these lines, I guess.

But the first list_add() is (in this case) the correct add, so we
expect other processes to be able to access the zspage after the first
list_add() anyway, right?

>
> I can see how, for example, another DEBUG_LIST check can be triggered:
> "list_add double add", because we basically can do
>
>         list_add(page, list)
>         list_add(page, list)
>
> I can also see how lockdep can be unhappy with us doing
>
>         write_unlock(&zspage->lock);
>         write_unlock(&zspage->lock);
>
> But I don't think I see how "next is NULL" happens (I haven't observed
> it).

Yeah I reached the same conclusion. Couldn't figure out how we can
reach the NULL scenario.