linux-mm.kvack.org archive mirror
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: David Hildenbrand <david@redhat.com>, Hugh Dickins <hughd@google.com>
Cc: "Patryk Kowalczyk" <patryk@kowalczyk.ws>,
	da.gomez@samsung.com, baohua@kernel.org,
	wangkefeng.wang@huawei.com, ioworker0@gmail.com,
	willy@infradead.org, ryan.roberts@arm.com,
	akpm@linux-foundation.org, eero.t.tamminen@intel.com,
	"Ville Syrjälä" <ville.syrjala@linux.intel.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: regression - mm: shmem: add large folio support for tmpfs affect GPU performance.
Date: Fri, 25 Jul 2025 10:38:25 +0800	[thread overview]
Message-ID: <da2ab844-98c9-4eb2-82ac-01d01bec30f3@linux.alibaba.com> (raw)
In-Reply-To: <a8ac7ec3-4cb3-4dd8-8d02-ede6905f322e@linux.alibaba.com>

Cc +mm maillist

On 2025/7/25 10:29, Baolin Wang wrote:
> 
> 
> On 2025/7/25 03:34, David Hildenbrand wrote:
>> On 24.07.25 21:19, Hugh Dickins wrote:
>>> On Thu, 24 Jul 2025, David Hildenbrand wrote:
>>>> On 24.07.25 20:34, Patryk Kowalczyk wrote:
>>>>>
>>>>> Recently, I have observed a significant drop in graphics performance
>>>>> on a Meteor Lake platform equipped with an Intel Core Ultra 155H CPU
>>>>> and an Xe 128EU GPU, using the i915 driver. Nearly every workload now
>>>>> runs slower, with memory-intensive tasks seeing performance reduced
>>>>> by several percent.
>>>>>
>>>>> This issue began with Linux kernel version 6.14.x. Using git bisect,
>>>>> I identified the problematic commit as
>>>>> acd7ccb284b86da1b2e3233a6826fe933844fc06, which relates to "large
>>>>> folio support for tmpfs" in the memory management subsystem.
>>>>>
>>>>> More information about this regression can be found at the following
>>>>> issue tracker:
>>>>> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14645
>>>>>
>>>>> Older bug for textures:
>>>>> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13845
>>>>
>>>> Reading the tracker, the valuable information is:
>>>>
>>>> "I tested all the options available via the sysfs interface for
>>>> shmem_enabled in transparent hugepage. The selectable settings are:
>>>> always, within_size, advise, [never] (the default), deny, and force.
>>>> Among these, only the force option restores GPU performance on kernels
>>>> later than version 6.13, and only for applications executed after the
>>>> change."
>>>>
>>>> So probably you are no longer getting PMD THPs, whereas previously
>>>> you would have gotten them.
>>>>
>>>> Do we know how the shmem memory is accessed? read/write or mmap? I
>>>> suspect it is accessed through write() first.
>>>>
>>>> I think we had a change in behavior regarding write(): we will now try
>>>> allocating a PMD THP only if the write() spans the complete PMD range.
>>>>
>>>> I recall that we had a similar report before ...
> 
> Yes, Ville reported the same issue before[1], and I provided a fix to 
> Ville off-list, but I have not received any feedback.
> 
> [1] https://lore.kernel.org/lkml/aBEP-6iFhIC87zmb@intel.com/
> 
>>> I haven't noticed a similar report, but that's my guess too: although
>>> I thought I checked for precisely this (knowing it to be a danger)
>>> during 6.14-rc, I did later find that I must have messed my test up;
>>> but still haven't found time to establish precisely what's going on and
>>> fix it (can't worry too much about releases between LTSs these days).
>>
>> At least scanning the code, write() would behave differently now.
>>
>> Now, I don't know which "hint" we should use to use PMD-sized THPs and 
>> ignore the write size.
>>
>> Maybe the last resort would be starting at PMD-size, but falling back 
>> to smaller orders if we fail to allocate one, hmmmm
> 
> I hope to correct the i915 driver's shmem allocation logic by 
> extending the write length passed to shmem, so that it can allocate 
> PMD-sized THPs. IIUC, some sample fix code is as follows (untested). 
> Patryk, could you help test it to see if this resolves your issue? Thanks.
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 1e659d2660f7..5dee740d1e70 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -591,7 +591,7 @@ struct page **drm_gem_get_pages(struct drm_gem_object *obj)
>          i = 0;
>          while (i < npages) {
>                  long nr;
> -               folio = shmem_read_folio_gfp(mapping, i,
> +               folio = shmem_read_folio_gfp(mapping, i, 0,
>                                  mapping_gfp_mask(mapping));
>                  if (IS_ERR(folio))
>                          goto fail;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index f263615f6ece..0edc75208b7a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -69,6 +69,7 @@ int shmem_sg_alloc_table(struct drm_i915_private *i915, struct sg_table *st,
>          struct scatterlist *sg;
>          unsigned long next_pfn = 0;     /* suppress gcc warning */
>          gfp_t noreclaim;
> +       size_t chunk;
>          int ret;
> 
>          if (overflows_type(size / PAGE_SIZE, page_count))
> @@ -94,6 +95,7 @@ int shmem_sg_alloc_table(struct drm_i915_private *i915, struct sg_table *st,
>          mapping_set_unevictable(mapping);
>          noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
>          noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
> +       chunk = mapping_max_folio_size(mapping);
> 
>          sg = st->sgl;
>          st->nents = 0;
> @@ -105,10 +107,13 @@ int shmem_sg_alloc_table(struct drm_i915_private *i915, struct sg_table *st,
>                          0,
>                  }, *s = shrink;
>                  gfp_t gfp = noreclaim;
> +               size_t bytes = (page_count - i) << PAGE_SHIFT;
> +               loff_t pos = i << PAGE_SHIFT;
> 
> +               bytes = min(chunk, bytes);
>                  do {
>                          cond_resched();
> -                       folio = shmem_read_folio_gfp(mapping, i, gfp);
> +                       folio = shmem_read_folio_gfp(mapping, i, pos + bytes, gfp);
>                          if (!IS_ERR(folio))
>                                  break;
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_backup.c b/drivers/gpu/drm/ttm/ttm_backup.c
> index 6f2e58be4f3e..0c90ae338afb 100644
> --- a/drivers/gpu/drm/ttm/ttm_backup.c
> +++ b/drivers/gpu/drm/ttm/ttm_backup.c
> @@ -100,7 +100,7 @@ ttm_backup_backup_page(struct file *backup, struct page *page,
>          struct folio *to_folio;
>          int ret;
> 
> -       to_folio = shmem_read_folio_gfp(mapping, idx, alloc_gfp);
> +       to_folio = shmem_read_folio_gfp(mapping, idx, 0, alloc_gfp);
>          if (IS_ERR(to_folio))
>                  return PTR_ERR(to_folio);
> 
> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index cbe46e0c8bce..9fb5f30552e4 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -156,12 +156,12 @@ enum sgp_type {
>   int shmem_get_folio(struct inode *inode, pgoff_t index, loff_t write_end,
>                  struct folio **foliop, enum sgp_type sgp);
>   struct folio *shmem_read_folio_gfp(struct address_space *mapping,
> -               pgoff_t index, gfp_t gfp);
> +               pgoff_t index, loff_t write_end, gfp_t gfp);
> 
>   static inline struct folio *shmem_read_folio(struct address_space *mapping,
>                  pgoff_t index)
>   {
> -       return shmem_read_folio_gfp(mapping, index, mapping_gfp_mask(mapping));
> +       return shmem_read_folio_gfp(mapping, index, 0, mapping_gfp_mask(mapping));
>   }
> 
>   static inline struct page *shmem_read_mapping_page(
> diff --git a/mm/shmem.c b/mm/shmem.c
> index c5eea697a96f..fcf233440c34 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -5920,6 +5920,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
>    * shmem_read_folio_gfp - read into page cache, using specified page allocation flags.
>    * @mapping:   the folio's address_space
>    * @index:     the folio index
> + * @write_end: end of a write if allocating a new folio
>    * @gfp:       the page allocator flags to use if allocating
>    *
>    * This behaves as a tmpfs "read_cache_page_gfp(mapping, index, gfp)",
> @@ -5932,14 +5933,14 @@ int shmem_zero_setup(struct vm_area_struct *vma)
>    * with the mapping_gfp_mask(), to avoid OOMing the machine unnecessarily.
>    */
>   struct folio *shmem_read_folio_gfp(struct address_space *mapping,
> -               pgoff_t index, gfp_t gfp)
> +               pgoff_t index, loff_t write_end, gfp_t gfp)
>   {
>   #ifdef CONFIG_SHMEM
>          struct inode *inode = mapping->host;
>          struct folio *folio;
>          int error;
> 
> -       error = shmem_get_folio_gfp(inode, index, 0, &folio, SGP_CACHE,
> +       error = shmem_get_folio_gfp(inode, index, write_end, &folio, SGP_CACHE,
>                                      gfp, NULL, NULL);
>          if (error)
>                  return ERR_PTR(error);
> @@ -5958,7 +5959,7 @@ EXPORT_SYMBOL_GPL(shmem_read_folio_gfp);
>   struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
>                                           pgoff_t index, gfp_t gfp)
>   {
> -       struct folio *folio = shmem_read_folio_gfp(mapping, index, gfp);
> +       struct folio *folio = shmem_read_folio_gfp(mapping, index, 0, gfp);
>          struct page *page;
> 
>          if (IS_ERR(folio))


