From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 8 Jan 2025 15:14:06 +0100
From: Daniel Gomez
To: David Hildenbrand
CC: Ryan Roberts, Barry Song, Andrew Morton, Luis Chamberlain,
 Pankaj Raghav
Subject: Re: Swap Min Order
Message-ID: <20250108141406.3gen6dnlb3b4zga6@AALNPWDAGOMEZ1.aal.scsc.local>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <470be5fa-97d6-4045-a855-5332d3a46443@redhat.com>
References: <20250107094347.l37isnk3w2nmpx2i@AALNPWDAGOMEZ1.aal.scsc.local>
 <20250107122931.qpkn43yvs4kq3twi@AALNPWDAGOMEZ1.aal.scsc.local>
 <470be5fa-97d6-4045-a855-5332d3a46443@redhat.com>
Sender: owner-linux-mm@kvack.org

On Tue, Jan 07, 2025 at 05:41:23PM +0100, David Hildenbrand wrote:
> On 07.01.25 13:29, Daniel Gomez wrote:
> > On Tue, Jan 07, 2025 at 11:31:05AM +0100, David Hildenbrand wrote:
> > > On 07.01.25 10:43, Daniel Gomez wrote:
> > > > Hi,
> > >
> > > Hi,
> > >
> > > >
> > > > High-capacity SSDs require writes to be aligned with the drive's
> > > > indirection unit (IU), which is typically >4 KiB, to avoid RMW. To
> > > > support swap on these devices, we need to ensure that writes do not
> > > > cross IU boundaries. So, I think this may require increasing the
> > > > minimum allocation size for swap users.
> > >
> > > How would we handle swapout/swapin when we have smaller pages (just
> > > imagine someone does a mmap(4KiB))?
> >
> > Swapout would need to be aligned to the IU. An mmap of 4 KiB would
> > have to perform an IU-sized write, e.g. 16 KiB or 32 KiB, to avoid any
> > potential RMW penalty. So, I think aligning the mmap allocation to the
> > IU would guarantee a write of the required granularity and alignment.
>
> We must be prepared to handle any VMA layout with single-page VMAs,
> single-page holes, etc. :/ IMHO we should try to handle this
> transparently to the application.
>
> > But let's also look at your suggestion below with swapcache.
> >
> > Swapin can still be performed at LBA format levels (e.g. 4 KiB) without
> > the same write penalty implications, affecting only performance if
> > I/Os do not conform to these boundaries. So, reading at IU boundaries
> > is preferred to get optimal performance, not a 'requirement'.
> >
> > >
> > > Could this be something that gets abstracted/handled by the swap
> > > implementation? (i.e., multiple small folios get added to the
> > > swapcache but get written out / read in as a single unit?).
> >
> > Do you mean merging like in the block layer? I'm not entirely sure if
> > this could deterministically guarantee the I/O boundaries the same way
> > min-order large folio allocations do in the page cache. But I guess it
> > is worth exploring as an optimization.
>
> Maybe the swapcache could somehow abstract that? We currently have the
> swap slot allocator, which assigns slots to pages.
>
> Assuming we have a 16 KiB BS but a 4 KiB page, we might have various
> options to explore.
>
> For example, we could size swap slots 16 KiB, and assign even 4 KiB
> pages a single slot. This would waste swap space with small folios, a
> waste that would go away with large folios.

So, batching order-0 folios into bigger slots that match the FS BS (e.g.
16 KiB) to perform disk writes, right? Can we also assign different
orders to the same slot? And can we batch folios while keeping alignment
to the BS (IU)?

>
> If we stick to 4 KiB swap slots, maybe pageout() could be taught to
> effectively write back "everything" residing in the relevant swap slots
> that span a BS?
>
> I recall there was a discussion about atomic writes involving multiple
> pages, and how it is hard. Maybe with swapping it is "easier"?
> Absolutely no expert on that, unfortunately. Hoping Chris has some ideas.

I'm not sure about that discussion, but I guess the main concern for
atomic writes and swapping is the alignment, along with the questions I
raised above.

>
> > >
> > > I recall that we have been talking about a better swap abstraction
> > > for years :)
> >
> > Adding Chris Li to the cc list in case he has more input.
> >
> > >
> > > Might be a good topic for LSF/MM (might or might not be a better
> > > place than the MM alignment session).
> >
> > Both options work for me. LSF/MM is in 12 weeks, so having an earlier
> > session would be great.
>
> Both work for me.

Can we start by scheduling this topic for the next available MM session?
It would be great to get initial feedback, thoughts, and concerns while
we keep this thread going.

>
> --
> Cheers,
>
> David / dhildenb
>