From: Chris Mason
Date: Wed, 28 Feb 2024 09:01:02 -0500
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO
To: Amir Goldstein, Dave Chinner
Cc: Kent Overstreet, Pankaj Raghav, Jens Axboe, Chris Mason, Matthew Wilcox, Daniel Gomez, linux-mm, Luis Chamberlain, Johannes Weiner, linux-fsdevel@vger.kernel.org, lsf-pc@lists.linux-foundation.org, Linus Torvalds, Christoph Hellwig, Josef Bacik, Jan Kara

On 2/28/24 2:48 AM, Amir Goldstein wrote:
> On Wed, Feb 28, 2024 at 12:42 AM Dave Chinner via Lsf-pc
>> Essentially, we need to explicitly give POSIX the big finger and
>> state that there are no atomicity guarantees given for write() calls
>> of any size, nor are there any guarantees for data coherency for
>> any overlapping concurrent buffered IO operations.
>>
>
> I have disabled read vs. write atomicity (out-of-tree) to make xfs behave
> as the other fs ever since Jan has added the invalidate_lock and I believe
> that Meta kernel has done that way before.

Hmmm, you might be thinking of my patch to prevent kswapd from getting
stuck on XFS inode reclaim, but I don't think we've ever messed with
write concurrency.

I'm comfortable with the concurrency change in general, but it's not
somewhere I'd be excited about differing from upstream.

Total tangent, but we only carry two XFS patches right now that aren't
upstream. I dropped the inode reclaim patch; the problem stopped showing
up in our profiles, and the impacted workloads changed to rocksdb for
other reasons.

We flip XFS discards back to synchronous. Async discards without any
kind of metering saturate drives when we do bulk deletes, leading to
latency spikes on reads and writes. There's probably a class of flash
that can handle this, but we don't have it.

Unfortunately I also disable large folios on XFS. They are corrupting
xarrays on our v5.19 kernel, with large folios from multiple files
interleaved together in the same file. We'll try again with them on
v6.4 or maybe v6.8, but the repro needs thousands of machines making
NFS noises just to trigger one failure, and I won't be able to debug it
until I can make a more reasonable reproduction.

-chris