From: Ryan Roberts
Date: Wed, 19 Jun 2024 10:17:08 +0100
Subject: Re: [RFC PATCH v1 0/5] Alternative mTHP swap allocator improvements
To: "Huang, Ying", Andrew Morton
Cc: Chris Li, Kairui Song, Kalesh Singh, Barry Song, Hugh Dickins,
    David Hildenbrand, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240618232648.4090299-1-ryan.roberts@arm.com>
    <87tthp4cx9.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87tthp4cx9.fsf@yhuang6-desk2.ccr.corp.intel.com>

On 19/06/2024 08:19, Huang, Ying wrote:
> Hi, Ryan,
>
> Ryan Roberts writes:
>
>> Hi All,
>>
>> Chris has been doing great work at [1] to clean up my mess in the mTHP swap
>> entry allocator.
>
> I don't think the original behavior is something like a mess. It's just
> the first step in the correct direction. It's straightforward and
> obviously correct. Then, we can optimize it step by step with data to
> justify the increased complexity.

OK, perhaps I was over-egging it by calling it a "mess".
What you're describing was my initial opinion too, but I saw Andrew complaining
that we shouldn't be merging a feature if it doesn't work. This series fixes the
problem in a minimal way - if you ignore the last patch, which really is just a
performance optimization and could be dropped.

If we can ultimately get Chris's series to 0% fallback like this one, and
everyone is happy with the current state for v6.10, then agreed - let's
concentrate on Chris's series for v6.11.

Thanks,
Ryan

>
>> But Barry posted a test program and results at [2] showing that
>> even with Chris's changes, there are still some fallbacks (around 5% - 25% in
>> some cases). I was interested in why that might be and ended up putting this PoC
>> patch set together to try to get a better understanding. This series ends up
>> achieving 0% fallback, even with small folios ("-s") enabled. I haven't done
>> much testing beyond that (yet) but thought it was worth posting on the strength
>> of that result alone.
>>
>> At a high level this works in a similar way to Chris's series; it marks a
>> cluster as being for a particular order and if a new cluster cannot be allocated
>> then it scans through the existing non-full clusters. But it does it by scanning
>> through the clusters rather than assembling them into a list. Cluster flags are
>> used to mark clusters that have been scanned and are known not to have enough
>> contiguous space, so the efficiency should be similar in practice.
>>
>> Because it's not based around a linked list, there is less churn and I'm
>> wondering if this is perhaps easier to review and potentially even get into
>> v6.10-rcX to fix up what's already there, rather than having to wait until v6.11
>> for Chris's series? I know Chris has a larger roadmap of improvements, so at
>> best I see this as a tactical fix that will ultimately be superseded by Chris's
>> work.
>
> I don't think we need any mTHP swap entry allocation optimization to go
> into v6.10-rcX.
> There's no functionality or performance regression.
> Per my understanding, we merge optimization when it's ready.
>
> Hi, Andrew,
>
> Please correct me if you don't agree.
>
> [snip]
>
> --
> Best Regards,
> Huang, Ying
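
For readers following the thread, the scan-based scheme described in the quoted
cover letter (mark a cluster for one order, take a new cluster when the current
one is full, and only then scan the existing non-full clusters while skipping
those flagged as having no room) can be modelled roughly as below. This is a toy
Python sketch, not the kernel code from the series: the names Cluster, SwapArea,
find_run, no_space and current are all hypothetical, the 512-slot default is
only illustrative, and clearing the flag on free is an assumption the thread
does not spell out. Locking, per-CPU state and order-0 stealing are ignored.

```python
class Cluster:
    """One swap cluster: a fixed-size array of slots plus bookkeeping."""

    def __init__(self, slots):
        self.used = [False] * slots  # True = slot holds a swap entry
        self.order = None            # order this cluster is marked for; None = free
        self.no_space = False        # hypothetical flag: scanned, no aligned run left


def find_run(cluster, order):
    """Return the first naturally aligned free run of 2**order slots, or None."""
    n = 1 << order
    for start in range(0, len(cluster.used) - n + 1, n):
        if not any(cluster.used[start:start + n]):
            return start
    return None


class SwapArea:
    def __init__(self, nr_clusters, slots=512):
        self.clusters = [Cluster(slots) for _ in range(nr_clusters)]
        self.current = {}  # order -> index of the cluster currently being filled

    def _take(self, idx, order):
        """Claim a run in cluster idx, or flag the cluster if it has none."""
        cluster = self.clusters[idx]
        start = find_run(cluster, order)
        if start is None:
            cluster.no_space = True  # remember the failed scan
            return None
        n = 1 << order
        cluster.used[start:start + n] = [True] * n
        return idx, start

    def alloc(self, order):
        """Return (cluster index, slot offset), or None if the caller must fall back."""
        # 1. Keep filling the cluster already assigned to this order.
        idx = self.current.get(order)
        if idx is not None:
            got = self._take(idx, order)
            if got is not None:
                return got
        # 2. Otherwise claim a completely free cluster and mark it for this order.
        for idx, cluster in enumerate(self.clusters):
            if cluster.order is None:
                cluster.order = order
                self.current[order] = idx
                return self._take(idx, order)
        # 3. No free cluster: scan clusters of this order, skipping flagged ones.
        for idx, cluster in enumerate(self.clusters):
            if cluster.order == order and not cluster.no_space:
                got = self._take(idx, order)
                if got is not None:
                    self.current[order] = idx
                    return got
        return None

    def free(self, idx, start, order):
        """Release a run; assumed to clear the flag so the cluster is rescanned."""
        cluster = self.clusters[idx]
        n = 1 << order
        cluster.used[start:start + n] = [False] * n
        cluster.no_space = False
        if not any(cluster.used):
            # Fully empty again: return the cluster to the free pool.
            if self.current.get(cluster.order) == idx:
                del self.current[cluster.order]
            cluster.order = None
```

In this model a None return from alloc() corresponds to the fallback case that
Barry's test counts. The flag keeps step 3 from rescanning clusters that are
already known to be too fragmented, which is the property the cover letter
relies on for efficiency comparable to a list-based approach.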