From: Usama Arif <usamaarif642@gmail.com>
Date: Fri, 7 Jun 2024 11:40:39 +0100
Subject: Re: [PATCH v2 1/2] mm: clear pte for folios that are zero filled
To: Shakeel Butt, yosryahmed@google.com
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, willy@infradead.org, nphamcs@gmail.com, chengming.zhou@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
References: <20240604105950.1134192-1-usamaarif642@gmail.com> <20240604105950.1134192-2-usamaarif642@gmail.com>

On 05/06/2024 09:55, Shakeel Butt wrote:
> On Tue, Jun 04, 2024 at 11:58:24AM GMT, Usama Arif wrote:
> [...]
>>
>> +static bool is_folio_page_zero_filled(struct folio *folio, int i)
>> +{
>> +    unsigned long *data;
>> +    unsigned int pos, last_pos = PAGE_SIZE / sizeof(*data) - 1;
>> +    bool ret = false;
>> +
>> +    data = kmap_local_folio(folio, i * PAGE_SIZE);
>> +
>> +    if (data[last_pos])
>> +        goto out;
>> +
> Use memchr_inv() instead of the following.

I had done some benchmarking before sending v1, and this version is 35%
faster than using memchr_inv(). It's likely because this does a long
comparison, while memchr_inv() does a byte comparison using check_bytes8()
[1]. I will stick with the current version for my next revision. I have
added the kernel module I used for benchmarking below:

[308797.975269] Time taken for orig: 2850 ms
[308801.911439] Time taken for memchr_inv: 3936 ms

[1] https://elixir.bootlin.com/linux/v6.9.3/source/lib/string.c#L800

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/ktime.h>
#include <linux/string.h>

#define ITERATIONS 10000000

/* Word-at-a-time check: test the last word first, then scan the rest. */
static int is_page_zero_filled(void *ptr, unsigned long *value)
{
    unsigned long *page;
    unsigned long val;
    unsigned int pos, last_pos = PAGE_SIZE / sizeof(*page) - 1;

    page = (unsigned long *)ptr;
    val = page[0];

    if (page[last_pos] != 0)
        return 0;

    for (pos = 1; pos < last_pos; pos++) {
        if (page[pos] != 0)
            return 0;
    }

    *value = val;

    return 1;
}

/* Same check expressed with memchr_inv(), which returns NULL if every byte matches. */
static int is_page_zero_filled_memchr_inv(void *ptr, unsigned long *value)
{
    unsigned long *page;
    unsigned long val;
    unsigned long *ret;

    page = (unsigned long *)ptr;
    val = page[0];
    *value = val;

    ret = memchr_inv(ptr, 0, PAGE_SIZE);

    return ret == NULL ? 1 : 0;
}

static int __init zsmalloc_test_init(void)
{
    unsigned long *src;
    unsigned long value;
    ktime_t start_time, end_time;
    volatile int res = 0;
    unsigned long milliseconds;

    src = kmalloc(PAGE_SIZE, GFP_KERNEL);
    if (!src)
        return -ENOMEM;

    /* Benchmark the all-zero case, where both checks must scan the whole page. */
    for (unsigned int pos = 0; pos <= PAGE_SIZE / sizeof(*src) - 1; pos++) {
        src[pos] = 0x0;
    }

    start_time = ktime_get();
    for (int i = 0; i < ITERATIONS; i++)
        res = is_page_zero_filled(src, &value);
    end_time = ktime_get();
    milliseconds = ktime_ms_delta(end_time, start_time);
    // printk(KERN_INFO "Result: %d, Value: %lu\n", res, value);
    printk(KERN_INFO "Time taken for orig: %lu ms\n", milliseconds);

    start_time = ktime_get();
    for (int i = 0; i < ITERATIONS; i++)
        res = is_page_zero_filled_memchr_inv(src, &value);
    end_time = ktime_get();
    milliseconds = ktime_ms_delta(end_time, start_time);
    // printk(KERN_INFO "Result: %d, Value: %lu\n", res, value);
    printk(KERN_INFO "Time taken for memchr_inv: %lu ms\n", milliseconds);

    kfree(src);

    // Don't insmod so that you can re-run
    return -1;
}

module_init(zsmalloc_test_init);
MODULE_LICENSE("GPL");
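
For anyone who wants to try the word-versus-byte idea outside the kernel, here is a minimal userspace sketch. It is only an illustration under stated assumptions: SKETCH_PAGE_SIZE and both function names are hypothetical, the byte loop is a plain stand-in and not the kernel's memchr_inv(), and unlike the module above the word loop also checks word 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed page size for this sketch; the kernel code uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Word-at-a-time check. The last word is tested first as a cheap early
 * exit, then the remaining words (including word 0) are scanned.
 * The buffer must be suitably aligned for unsigned long (malloc/calloc
 * memory is).
 */
static bool page_zero_filled_words(const void *ptr)
{
    const unsigned long *page = ptr;
    size_t last_pos = SKETCH_PAGE_SIZE / sizeof(*page) - 1;
    size_t pos;

    if (page[last_pos])
        return false;

    for (pos = 0; pos < last_pos; pos++) {
        if (page[pos])
            return false;
    }

    return true;
}

/*
 * Byte-at-a-time check, for comparison only. This is a naive loop used
 * as a stand-in; it does not reproduce memchr_inv()'s implementation.
 */
static bool page_zero_filled_bytes(const void *ptr)
{
    const unsigned char *p = ptr;
    size_t i;

    for (i = 0; i < SKETCH_PAGE_SIZE; i++) {
        if (p[i])
            return false;
    }

    return true;
}
```

Both helpers should agree on any page-sized buffer; timing them in a loop, as the module above does with ktime_get(), would be the userspace analogue of the benchmark, though the numbers will depend on the compiler and optimization level.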