Hi Martin,

On 2019/3/23 5:07, Martin Blumenstingl wrote:
> Hi Matthew,
>
> On Thu, Mar 21, 2019 at 10:44 PM Matthew Wilcox wrote:
>>
>> On Thu, Mar 21, 2019 at 09:17:34PM +0100, Martin Blumenstingl wrote:
>>> Hello,
>>>
>>> I am experiencing the following crash:
>>> ------------[ cut here ]------------
>>> kernel BUG at mm/slub.c:3950!
>>
>>         if (unlikely(!PageSlab(page))) {
>>                 BUG_ON(!PageCompound(page));
>>
>> You called kfree() on the address of a page which wasn't allocated by slab.
>>
>>> I have traced this crash to the kfree() in meson_nfc_read_buf().
>>> my observation is as follows:
>>> - meson_nfc_read_buf() is called 7 times without any crash, the
>>> kzalloc() call returns 0xe9e6c600 (virtual address) / 0x29e6c600
>>> (physical address)
>>> - the eight time meson_nfc_read_buf() is called kzalloc() call returns
>>> 0xee39a38b (virtual address) / 0x2e39a38b (physical address) and the
>>> final kfree() crashes
>>> - changing the size in the kzalloc() call from PER_INFO_BYTE (= 8) to
>>> PAGE_SIZE works around that crash
>>
>> I suspect you're doing something which corrupts memory. Overrunning
>> the end of your allocation or something similar. Have you tried KASAN
>> or even the various slab debugging (eg redzones)?
> KASAN is not available on 32-bit ARM. there was some progress last
> year [0] but it didn't make it into mainline. I tried to make the
> patches apply again and got it to compile (and my kernel is still
> booting) but I have no idea if it's still working. for anyone
> interested, my patches are here: [1] (I consider this a HACK because I
> don't know anything about the code which is being touched in the
> patches, I only made it compile)
>
> SLAB debugging (redzones) were a great hint, thank you very much for
> that Matthew! I enabled:
> CONFIG_SLUB_DEBUG=y
> CONFIG_SLUB_DEBUG_ON=y
> and with that I now get "BUG kmalloc-64 (Not tainted): Redzone
> overwritten" (a larger kernel log extract is attached).
>
> I'm starting to wonder if the NAND controller (hardware) writes more
> than 8 bytes.
> some context: the "info" buffer allocated in meson_nfc_read_buf is
> then passed to the NAND controller IP (after using dma_map_single).
>
> Liang, how does the NAND controller know that it only has to send
> PER_INFO_BYTE (= 8) bytes when called from meson_nfc_read_buf? all
> other callers of meson_nfc_dma_buffer_setup (which passes the info
> buffer to the hardware) are using (nand->ecc.steps * PER_INFO_BYTE)
> bytes?
>
NFC_CMD_N2M and CMDRWGEN are different commands. CMDRWGEN needs to set
the ECC page size (1KB or 512B) and the number of pages (2, 4, 8, ...),
so it uses PER_INFO_BYTE (= 8) bytes for each ECC page.
I have never used NFC_CMD_N2M to transfer data before, because it is
very inefficient. I ran an experiment with the attachment and found
no overwrite on my meson axg platform.

Martin, I would appreciate it very much if you would try the attachment
on your meson m8b platform.

>
> Regards
> Martin
>
>
> [0] https://lore.kernel.org/patchwork/cover/913212/
> [1] https://github.com/xdarklight/linux/tree/arm-kasan-hack-v5.1-rc1
>
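
For anyone following the thread, the open question (whether a writer that
produces PER_INFO_BYTE bytes per ECC step can overrun a buffer sized for a
single step) can be illustrated in user space. This is only a minimal
sketch, not driver code and not a statement about what the hardware does:
guarded_alloc(), guard_intact(), fake_device_write_info() and
info_write_fits() are hypothetical helpers, and REDZONE_BYTES and
REDZONE_PATTERN are arbitrary values chosen for the sketch. Only
PER_INFO_BYTE (= 8) comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PER_INFO_BYTE   8    /* value quoted in the thread */
#define REDZONE_BYTES   8    /* assumption: guard size for this sketch only */
#define REDZONE_PATTERN 0xcc /* assumption: arbitrary guard byte */

/* Hypothetical helper: `len` usable bytes followed by a guard area. */
static unsigned char *guarded_alloc(size_t len)
{
	unsigned char *buf = malloc(len + REDZONE_BYTES);

	if (!buf)
		return NULL;
	memset(buf, 0, len);
	memset(buf + len, REDZONE_PATTERN, REDZONE_BYTES);
	return buf;
}

/* Returns 1 if the guard area behind the `len` usable bytes is untouched. */
static int guard_intact(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < REDZONE_BYTES; i++)
		if (buf[len + i] != REDZONE_PATTERN)
			return 0;
	return 1;
}

/*
 * Stand-in for a writer that emits PER_INFO_BYTE bytes per ECC step.
 * The write is clamped to `capacity` so the simulation itself never
 * leaves its own allocation.
 */
static void fake_device_write_info(unsigned char *buf, size_t capacity,
				   unsigned int ecc_steps)
{
	size_t n = (size_t)ecc_steps * PER_INFO_BYTE;

	if (n > capacity)
		n = capacity;
	memset(buf, 0xff, n);
}

/* Returns 1 if a write of `ecc_steps` steps stays inside `alloc_len` bytes. */
static int info_write_fits(size_t alloc_len, unsigned int ecc_steps)
{
	unsigned char *buf = guarded_alloc(alloc_len);
	int ok;

	assert(buf != NULL);
	fake_device_write_info(buf, alloc_len + REDZONE_BYTES, ecc_steps);
	ok = guard_intact(buf, alloc_len);
	free(buf);
	return ok;
}
```

With these helpers, info_write_fits(PER_INFO_BYTE, 1) reports an intact
guard, while info_write_fits(PER_INFO_BYTE, 2) reports it overwritten.
That is the same kind of signal the SLUB redzone check gives; it does not
by itself show which writer is responsible on real hardware.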