Date: Thu, 6 Feb 2025 16:22:27 +0900
From: Sergey Senozhatsky
To: Kairui Song
Cc: Sergey Senozhatsky, Andrew Morton, Minchan Kim, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Yosry Ahmed
Subject: Re: [PATCHv4 02/17] zram: do not use per-CPU compression streams
References: <20250131090658.3386285-1-senozhatsky@chromium.org>
    <20250131090658.3386285-3-senozhatsky@chromium.org>

On (25/02/06 14:55), Kairui Song wrote:
> > On (25/02/01 17:21), Kairui Song wrote:
> > > This seems will cause a huge regression of performance on multi core
> > > systems, this is especially significant as the number of concurrent
> > > tasks increases:
> > >
> > > Test build linux kernel using ZRAM as SWAP (1G memcg):
> > >
> > > Before:
> > > + /usr/bin/time make -s -j48
> > > 2495.77user 2604.77system 2:12.95elapsed 3836%CPU (0avgtext+0avgdata
> > > 863304maxresident)k
> > >
> > > After:
> > > + /usr/bin/time make -s -j48
> > > 2403.60user 6676.09system 3:38.22elapsed 4160%CPU (0avgtext+0avgdata
> > > 863276maxresident)k
> >
> > How many CPUs do you have? I assume, preemption gets into way which is
> > sort of expected, to be honest... Using per-CPU compression streams
> > disables preemption and uses CPU exclusively at a price of other tasks
> > not being able to run. I do tend to think that I made a mistake by
> > switching zram to per-CPU compression streams.
> >
> > What preemption model do you use and to what extent do you overload
> > your system?
> >
> > My tests don't show anything unusual (but I don't overload the system)
> >
> > CONFIG_PREEMPT
>
> I'm using CONFIG_PREEMPT_VOLUNTARY=y, and there are 96 logical CPUs
> (48c96t), make -j48 shouldn't be considered overload I think. make
> -j32 also showed an obvious slow down.

Hmm, there should be more than enough compression streams then, the
limit is num_online_cpus. That's strange. I wonder if that's zsmalloc
handle allocation ("remove two-staged handle allocation" in the series.)

[..]
> > Hmm it's just
> >
> > spin_lock()
> > list first entry
> > spin_unlock()
> >
> > Shouldn't be "a big spin lock", that's very odd. I'm not familiar with
> > perf lock contention, let me take a look.
>
> I can debug this a bit more to figure out why the contention is huge
> later

That will be appreciated, thank you.

> but my first thought is that, as Yosry also mentioned in
> another reply, making it preemptable doesn't necessarily mean the per
> CPU stream has to be gone.

Was going to reply to Yosry's email today/tomorrow, didn't have time
to look into, but will reply here.

So for spin-lock contention - yes, but that lock really should not
be so visible. Other than that we limit the number of compression
streams to the number of the CPUs and permit preemption, so it should
be the same as the "preemptible per-CPU" streams, roughly.
The difference, perhaps, is that we don't pre-allocate streams, but
allocate them only as needed. This has two sides: on one side, later
allocations can fail; on the other, we don't allocate streams that
we don't use, especially secondary streams (priority 1 and 2, which
are used for recompression).

I didn't know it was possible to use per-CPU data and still have
preemption enabled at the same time. So I'm not opposed to the idea of
keeping per-CPU streams and doing what the zswap folks did.
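
To make the "list under a lock, allocate on demand, cap at the number
of CPUs" idea concrete, here is a rough userspace sketch. It is not the
zram code: every name in it (struct stream_pool, pool_get, pool_put and
so on) is made up for illustration, a pthread mutex stands in for the
kernel spinlock, and a caller that hits the cap simply gets NULL here,
where a real implementation would presumably wait for a stream to be
released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names throughout; an illustration, not the zram code. */
struct stream {
	struct stream *next;	/* link in the idle list */
	void *buffer;		/* per-stream scratch buffer */
};

struct stream_pool {
	pthread_mutex_t lock;	/* stands in for the kernel spinlock */
	struct stream *idle;	/* list of idle streams */
	int allocated;		/* streams created so far */
	int max_streams;	/* cap, e.g. the number of online CPUs */
	size_t buf_size;	/* size of each stream's buffer */
};

static void pool_init(struct stream_pool *p, int max_streams, size_t buf_size)
{
	pthread_mutex_init(&p->lock, NULL);
	p->idle = NULL;
	p->allocated = 0;
	p->max_streams = max_streams;
	p->buf_size = buf_size;
}

/* Fast path: lock, take the first idle entry, unlock. */
static struct stream *pool_get(struct stream_pool *p)
{
	struct stream *s;

	pthread_mutex_lock(&p->lock);
	s = p->idle;
	if (s) {
		p->idle = s->next;
		pthread_mutex_unlock(&p->lock);
		return s;
	}
	if (p->allocated >= p->max_streams) {
		/* At the cap: this sketch gives up instead of waiting. */
		pthread_mutex_unlock(&p->lock);
		return NULL;
	}
	p->allocated++;		/* reserve a slot, allocate outside the lock */
	pthread_mutex_unlock(&p->lock);

	/* Slow path: on-demand allocation, which (unlike pre-allocation) can fail. */
	s = malloc(sizeof(*s));
	if (s)
		s->buffer = malloc(p->buf_size);
	if (!s || !s->buffer) {
		free(s);
		pthread_mutex_lock(&p->lock);
		p->allocated--;	/* give the reserved slot back */
		pthread_mutex_unlock(&p->lock);
		return NULL;
	}
	s->next = NULL;
	return s;
}

/* Return a stream to the idle list so that others can reuse it. */
static void pool_put(struct stream_pool *p, struct stream *s)
{
	assert(s != NULL);
	pthread_mutex_lock(&p->lock);
	s->next = p->idle;
	p->idle = s;
	pthread_mutex_unlock(&p->lock);
}
```

Under these assumptions the critical section on the fast path is only
a few pointer operations, and a stream that is never requested is
never allocated. The price is the failure path in pool_get().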