From: Yafang Shao
Date: Thu, 7 Nov 2024 13:55:52 +0800
Subject: Re: [PATCH] mm/readahead: Fix large folio support in async readahead
To: Matthew Wilcox
Cc: akpm@linux-foundation.org, linux-mm@kvack.org
References: <20241106092114.8408-1-laoar.shao@gmail.com>

On Thu, Nov 7, 2024 at 12:52 PM Matthew Wilcox wrote:
>
> On Wed, Nov 06, 2024 at 05:21:14PM +0800, Yafang Shao wrote:
> > When testing large folio support with XFS on our servers, we observed that
> > only a few large folios are mapped when reading large files via mmap.
> > After a thorough analysis, I identified it was caused by the
> > `/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this
> > parameter is set to 128KB. After I tune it to 2MB, the large folio can
> > work as expected. However, I believe the large folio behavior should not be
> > dependent on the value of read_ahead_kb. It would be more robust if the
> > kernel can automatically adapt to it.
> >
> > With `/sys/block/*/queue/read_ahead_kb` set to a non-2MB aligned size,
> > this issue can be verified with a simple test case, as shown below:
>
> I don't like to see these programs in commit messages.  If it's a
> valuable program, it should go into tools/testing.  If not, it shouldn't
> be in the commit message.

I only included it to make this change easy for others to verify.
I'm okay with dropping the program.

>
> > When large folio support is enabled and read_ahead_kb is set to a smaller
> > value, ra->size (4MB) may exceed the maximum allowed size (e.g., 128KB). To
> > address this, we need to add a conditional check for such cases. However,
> > this alone is insufficient, as users might set read_ahead_kb to a larger,
> > non-hugepage-aligned value (e.g., 4MB + 128KB). In these instances, it is
> > essential to explicitly align ra->size with the hugepage size.
>
> I wish you'd discussed this in the earlier thread instead of just
> smashing it into this patch.  Because your solution is wrong.
>
> > @@ -642,7 +644,7 @@ void page_cache_async_ra(struct readahead_control *ractl,
> >                         1UL << order);
> >         if (index == expected) {
> >                 ra->start += ra->size;
> > -               ra->size = get_next_ra_size(ra, max_pages);
> > +               ra->size = ALIGN(get_next_ra_size(ra, max_pages), 1 << order);
>
> Let's suppose that someone sets read_ahead_kb to 192kB.  If the previous
> readahead did 128kB, we now try to align that to 128kB, so we'll readahead
> 256kB which is larger than max.
> We were only intending to breach the 'max' for the MADV_HUGE case, not for
> all cases.

In the non-MADV_HUGE case, the order can only be 0, correct?

>
> Honestly, I don't know if we should try to defend a stupid sysadmin
> against the consequences of their misconfiguration like this.  I'd be in
> favour of getting rid of the configuration knob entirely (or just ignoring
> what the sysadmin set it to), but if we do that, we need to replace it
> with something that can automatically figure out what the correct setting
> for the readahead_max_kb should be (which is probably a function of the
> bandwidth, latency and seek time of the underlying device).
>
> But that's obviously not part of this patch.  I'd be in favour of just
> dropping this ALIGN and leaving just the first hunk of the patch.

I'm okay with removing this ALIGN, since we won't be setting a large
read_ahead_kb value in our production environment. ;)

-- 
Regards
Yafang
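
For anyone following the arithmetic in the 192kB example above, here is a
minimal userspace sketch, not kernel code. It assumes 4KiB pages. The names
align_up(), next_ra_pages() and aligned_ra_pages() are hypothetical:
align_up() stands in for the kernel's ALIGN(), and next_ra_pages() is a
deliberately simplified "double, capped at max" stand-in for
get_next_ra_size(), not a copy of it.

```c
#include <assert.h>

#define PAGE_KB 4UL /* assumption: 4 KiB pages */

/* Stand-in for the kernel's ALIGN(): round x up to a multiple of a.
 * a must be a non-zero power of two. */
static inline unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Hypothetical simplification of the window ramp-up: double the current
 * window, but never exceed max_pages. */
static inline unsigned long next_ra_pages(unsigned long cur,
					  unsigned long max_pages)
{
	unsigned long next = 2 * cur;

	return next > max_pages ? max_pages : next;
}

/* What the second hunk computes: the capped size, then rounded up to a
 * multiple of (1 << order) pages. Rounding up can exceed max_pages. */
static inline unsigned long aligned_ra_pages(unsigned long cur,
					     unsigned long max_pages,
					     unsigned int order)
{
	return align_up(next_ra_pages(cur, max_pages), 1UL << order);
}
```

With read_ahead_kb at 192kB (48 pages), a previous window of 128kB
(32 pages) and order 5 (32 pages, i.e. 128kB), next_ra_pages(32, 48)
returns 48 pages, and aligned_ra_pages(32, 48, 5) returns 64 pages, i.e.
256kB, which is above the configured maximum. With order 0 the alignment
is a no-op and the result stays at 48 pages.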