From: Baolin Wang
To: Kefeng Wang, akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] mm: shmem: improve the tmpfs large folio read performance
Date: Thu, 17 Oct 2024 10:46:32 +0800
Message-ID: <254f1c38-b2ce-4c83-a13d-54e7c0271dcb@linux.alibaba.com>
In-Reply-To: <1cb50014-8678-40de-bca3-8a33555ec70e@huawei.com>
References: <1cb50014-8678-40de-bca3-8a33555ec70e@huawei.com>

On 2024/10/16 20:36, Kefeng Wang wrote:
>
>
> On 2024/10/16 18:09, Baolin Wang wrote:
>> tmpfs already supports PMD-sized large folios, but the tmpfs read
>> operation still copies data at PAGE_SIZE granularity, which is
>> unreasonable.
>> This patch copies data at folio granularity instead, which
>> improves read performance, and switches to the folio-related functions.
>>
>> Using 'fio bs=64k' to read a 1G tmpfs file populated with 2M THPs, I
>> see about a 20% performance improvement, and no regression with bs=4k.
>> Before the patch:
>> READ: bw=10.0GiB/s
>>
>> After the patch:
>> READ: bw=12.0GiB/s
>>
>> Signed-off-by: Baolin Wang
>> ---
>>   mm/shmem.c | 22 ++++++++++++----------
>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index edab02a26aac..7e79b6a96da0 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -3108,13 +3108,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>       ssize_t retval = 0;
>>       index = iocb->ki_pos >> PAGE_SHIFT;
>> -    offset = iocb->ki_pos & ~PAGE_MASK;
>>       for (;;) {
>>           struct folio *folio = NULL;
>> -        struct page *page = NULL;
>>           unsigned long nr, ret;
>>           loff_t end_offset, i_size = i_size_read(inode);
>> +        size_t fsize;
>>           if (unlikely(iocb->ki_pos >= i_size))
>>               break;
>> @@ -3128,8 +3127,9 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>           if (folio) {
>>               folio_unlock(folio);
>> -            page = folio_file_page(folio, index);
>> -            if (PageHWPoison(page)) {
>> +            if (folio_test_hwpoison(folio) ||
>> +                (folio_test_large(folio) &&
>> +                 folio_test_has_hwpoisoned(folio))) {
>>                   folio_put(folio);
>>                   error = -EIO;
>>                   break;
>> @@ -3147,7 +3147,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>               break;
>>           }
>>           end_offset = min_t(loff_t, i_size, iocb->ki_pos + to->count);
>> -        nr = min_t(loff_t, end_offset - iocb->ki_pos, PAGE_SIZE - offset);
>> +        if (folio)
>> +            fsize = folio_size(folio);
>> +        else
>> +            fsize = PAGE_SIZE;
>> +        offset = iocb->ki_pos & (fsize - 1);
>> +        nr = min_t(loff_t, end_offset - iocb->ki_pos, fsize - offset);
>>           if (folio) {
>>               /*
>> @@ -3156,7 +3161,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>                * before reading the page on the kernel side.
>>                */
>
> We'd better update all the comments from page to folio.

Ack.
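
For anyone following the arithmetic in the quoted hunk, here is a rough userspace sketch of how the per-iteration copy length is derived once the folio size replaces PAGE_SIZE. It is not kernel code: chunk_len() and count_copies() are made-up names for illustration, and the model assumes the folio size is a power of two, which is what makes the `& (fsize - 1)` mask valid.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model only; these helpers do not exist in the kernel.
 * chunk_len() mirrors the arithmetic from the quoted hunk:
 *   offset = pos & (fsize - 1);
 *   nr = min(end_offset - pos, fsize - offset);
 */
static uint64_t chunk_len(uint64_t pos, uint64_t end_offset, uint64_t fsize)
{
	uint64_t offset, room, left;

	/* The mask trick is only valid for power-of-two sizes. */
	assert(fsize != 0 && (fsize & (fsize - 1)) == 0);
	assert(pos < end_offset);

	offset = pos & (fsize - 1);	/* offset inside the current folio */
	room = fsize - offset;		/* bytes up to the folio boundary */
	left = end_offset - pos;	/* bytes the caller still wants */
	return left < room ? left : room;
}

/* Count how many copy steps a read of [pos, pos + count) needs. */
static unsigned int count_copies(uint64_t pos, uint64_t count, uint64_t fsize)
{
	uint64_t end_offset = pos + count;
	unsigned int steps = 0;

	while (pos < end_offset) {
		pos += chunk_len(pos, end_offset, fsize);
		steps++;
	}
	return steps;
}
```

In this model, count_copies(0, 65536, 4096) needs 16 steps, while count_copies(0, 65536, 2097152) needs one, so a bs=64k read inside a single 2M folio goes through the loop far fewer times. A read that straddles a folio boundary is still split at that boundary.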