On 28-Nov-24 10:01 AM, Mateusz Guzik wrote:
> Willy mentioned the folio wait queue hash table could be grown, you
> can find it in mm/filemap.c:
>
> #define PAGE_WAIT_TABLE_BITS 8
> #define PAGE_WAIT_TABLE_SIZE (1 << PAGE_WAIT_TABLE_BITS)
> static wait_queue_head_t folio_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
>
> static wait_queue_head_t *folio_waitqueue(struct folio *folio)
> {
> 	return &folio_wait_table[hash_ptr(folio, PAGE_WAIT_TABLE_BITS)];
> }
>
> Can you collect off cpu time? offcputime-bpfcc -K > /tmp/out

The flamegraph for

"perf record --off-cpu -F 99 -a -g --all-kernel --kernel-callchains -- sleep 120"

is attached.

The off-cpu samples were collected for 120s, starting around the 45th
minute of an FIO benchmark run that lasts 1 hour. This run used a kernel
with your inode_lock fix but with no changes to PAGE_WAIT_TABLE_BITS.

Hopefully this captures a representative sample of the scalability issue
with the folio lock.

Regards,
Bharata.
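
For reference, here is a minimal userspace sketch of why PAGE_WAIT_TABLE_BITS matters: every folio hashes to one of (1 << bits) wait queue heads, so unrelated folios can end up sharing a wait queue. It assumes a 64-bit kernel, where hash_ptr() is a multiply by the 64-bit golden-ratio constant followed by a shift. The model_* names, the base address and the 64-byte stride between folio addresses are hypothetical choices for illustration, not kernel code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Multiplier used by the kernel's 64-bit hash; copied for this userspace model. */
#define MODEL_GOLDEN_RATIO_64 0x61C8864680B583EBull

/*
 * Userspace model of hash_ptr() on a 64-bit kernel (assumption):
 * multiply the address and keep the top 'bits' bits. bits must be 1..63.
 */
static unsigned int model_hash_ptr(uint64_t addr, unsigned int bits)
{
	return (unsigned int)((addr * MODEL_GOLDEN_RATIO_64) >> (64 - bits));
}

/*
 * Hash nr_folios hypothetical folio addresses (base, base + stride, ...)
 * into a table of (1 << bits) buckets and return the largest number of
 * folios that land in a single bucket. Returns 0 if allocation fails.
 */
static size_t model_max_bucket_load(uint64_t base, size_t nr_folios,
				    uint64_t stride, unsigned int bits)
{
	size_t nr_buckets = (size_t)1 << bits;
	size_t *load = calloc(nr_buckets, sizeof(*load));
	size_t max = 0;

	if (!load)
		return 0;

	for (size_t i = 0; i < nr_folios; i++) {
		size_t bucket = model_hash_ptr(base + i * stride, bits);

		if (++load[bucket] > max)
			max = load[bucket];
	}

	free(load);
	return max;
}
```

With 4096 hypothetical folios and bits = 8, at least one bucket must hold 16 or more folios, since there are only 256 buckets. Each 8-bit bucket is the union of sixteen 12-bit buckets, so raising bits to 12 can only keep the worst-case sharing the same or lower it. This models only how folios are spread across wait queues, not the contention seen in the flamegraph.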