From: Yang Shi
Date: Fri, 18 Oct 2024 11:38:36 -0700
Subject: Re: [PATCH v2 2/2] mm: shmem: improve the tmpfs large folio read performance
To: Baolin Wang
Cc: akpm@linux-foundation.org, hughd@google.com, willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <2129a21a5b9f77d3bb7ddec152c009ce7c5653c4.1729218573.git.baolin.wang@linux.alibaba.com>

On Thu, Oct 17, 2024 at 8:00 PM Baolin Wang wrote:
>
> The tmpfs has already supported the PMD-sized large folios, but the tmpfs
> read operation still performs copying at the PAGE SIZE granularity, which
> is unreasonable. This patch changes to copy data at the folio granularity,
> which can improve the read performance, as well as changing to use folio
> related functions.
>
> Moreoever, if a large folio has a subpage that is hwpoisoned, it will still
> fallback to page granularity copying.

s/Moreoever/Moreover

>
> Use 'fio bs=64k' to read a 1G tmpfs file populated with 2M THPs, and I can
> see about 20% performance improvement, and no regression with bs=4k.
> Before the patch:
> READ: bw=10.0GiB/s
>
> After the patch:
> READ: bw=12.0GiB/s
>
> Signed-off-by: Baolin Wang

The patch looks fine to me.

Reviewed-by: Yang Shi

> ---
>  mm/shmem.c | 34 ++++++++++++++++++++++++----------
>  1 file changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 93642aa8d1aa..cbefd9801f6b 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3107,13 +3107,13 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>         int error = 0;
>         ssize_t retval = 0;
>
> -       offset = iocb->ki_pos & ~PAGE_MASK;
> -
>         for (;;) {
>                 struct folio *folio = NULL;
>                 struct page *page = NULL;
>                 unsigned long nr, ret;
>                 loff_t end_offset, i_size = i_size_read(inode);
> +               bool fallback_page_copy = false;
> +               size_t fsize;
>
>                 if (unlikely(iocb->ki_pos >= i_size))
>                         break;
> @@ -3134,6 +3134,10 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                                 error = -EIO;
>                                 break;
>                         }
> +
> +                       if (folio_test_large(folio) &&
> +                           folio_test_has_hwpoisoned(folio))
> +                               fallback_page_copy = true;
>                 }
>
>                 /*
> @@ -3147,7 +3151,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                         break;
>                 }
>                 end_offset = min_t(loff_t, i_size, iocb->ki_pos + to->count);
> -               nr = min_t(loff_t, end_offset - iocb->ki_pos, PAGE_SIZE - offset);
> +               if (folio && likely(!fallback_page_copy))
> +                       fsize = folio_size(folio);
> +               else
> +                       fsize = PAGE_SIZE;
> +               offset = iocb->ki_pos & (fsize - 1);
> +               nr = min_t(loff_t, end_offset - iocb->ki_pos, fsize - offset);
>
>                 if (folio) {
>                         /*
> @@ -3155,10 +3164,15 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                          * virtual addresses, take care about potential aliasing
>                          * before reading the page on the kernel side.
>                          */
> -                       if (mapping_writably_mapped(mapping))
> -                               flush_dcache_page(page);
> +                       if (mapping_writably_mapped(mapping)) {
> +                               if (likely(!fallback_page_copy))
> +                                       flush_dcache_folio(folio);
> +                               else
> +                                       flush_dcache_page(page);
> +                       }
> +
>                         /*
> -                        * Mark the page accessed if we read the beginning.
> +                        * Mark the folio accessed if we read the beginning.
>                          */
>                         if (!offset)
>                                 folio_mark_accessed(folio);
> @@ -3166,9 +3180,11 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                          * Ok, we have the page, and it's up-to-date, so
>                          * now we can copy it to user space...
>                          */
> -                       ret = copy_page_to_iter(page, offset, nr, to);
> +                       if (likely(!fallback_page_copy))
> +                               ret = copy_folio_to_iter(folio, offset, nr, to);
> +                       else
> +                               ret = copy_page_to_iter(page, offset, nr, to);
>                         folio_put(folio);
> -
>                 } else if (user_backed_iter(to)) {
>                         /*
>                          * Copy to user tends to be so well optimized, but
> @@ -3186,8 +3202,6 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                 }
>
>                 retval += ret;
> -               offset += ret;
> -               offset &= ~PAGE_MASK;
>                 iocb->ki_pos += ret;
>
>                 if (!iov_iter_count(to))
> --
> 2.39.3
>
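
For anyone following the offset/nr arithmetic in the hunks above, here is a
rough userspace sketch of the per-iteration chunk calculation. It is not
kernel code: the sketch_* names and the two size macros are made up for
illustration, a 4K base page and a 2M PMD-sized folio are assumed, and the
no-folio (hole) path of the real loop is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE ((size_t)4096)     /* assumption: 4K base page */
#define SKETCH_PMD_SIZE  ((size_t)2 << 20)  /* assumption: 2M PMD-sized folio */

/*
 * Copy granularity for one loop iteration: the whole folio, unless a large
 * folio has a hwpoisoned subpage, in which case fall back to a single page.
 */
static size_t sketch_copy_size(size_t folio_size, bool has_hwpoisoned)
{
	bool large = folio_size > SKETCH_PAGE_SIZE;

	if (large && has_hwpoisoned)
		return SKETCH_PAGE_SIZE;
	return folio_size;
}

/*
 * Bytes one iteration may copy, mirroring the patch:
 *   offset = pos & (fsize - 1);
 *   nr     = min(end - pos, fsize - offset);
 * The mask only works because fsize is a power of two.
 */
static size_t sketch_chunk_len(unsigned long long pos, unsigned long long end,
			       size_t fsize)
{
	size_t offset, room;

	assert(fsize != 0 && (fsize & (fsize - 1)) == 0);
	assert(pos < end);

	offset = (size_t)(pos & (fsize - 1));
	room = fsize - offset;
	return (end - pos) < room ? (size_t)(end - pos) : room;
}

/* Number of loop iterations needed to read the byte range [pos, end). */
static unsigned long sketch_iterations(unsigned long long pos,
				       unsigned long long end, size_t fsize)
{
	unsigned long n = 0;

	while (pos < end) {
		pos += sketch_chunk_len(pos, end, fsize);
		n++;
	}
	return n;
}
```

Under those assumptions, a 64K read starting at file offset 0 takes 16
iterations at page granularity and 1 iteration at 2M folio granularity; a
2M folio with a hwpoisoned subpage drops back to the 16-iteration case.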