Re: Kmap-related crashes and memory leaks on 32bit arch (5.15+)

From: Qu Wenruo <wqu@suse.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	David Sterba <dsterba@suse.cz>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: Kmap-related crashes and memory leaks on 32bit arch (5.15+)
Date: Fri, 5 Nov 2021 08:01:13 +0800
Message-ID: <f3d3dc5d-dcf8-76b7-f383-aed3c942ae49@suse.com>
In-Reply-To: <CAHk-=whGUxtcL8Z67y4A6_diSmtQdnOq1p_gyBAMzpKD9yk+gw@mail.gmail.com>



On 2021/11/5 07:48, Linus Torvalds wrote:
> On Thu, Nov 4, 2021 at 4:37 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Having looked at it once more, it still looks "ObviouslyCorrect(tm)"
>> to me, but I suspect I'm just being blind to some obvious bug.
> 
> Oh, I was just looking at the pattern of kmap/kunmap, but there does
> seem to be a questionable pattern outside of that:
> 
> This pattern looks odd:
> 
>          kaddr = kmap(cur_page);
>          write_compress_length(kaddr + offset_in_page(*cur_out),
>                                compressed_size);
>          ...
> 
> (and then whether you kunmap immediately, or you leave it kmap'ed and
> use it again at the end is a different issue)

That part is paired with the following code, which prevents the segment
header from crossing a page boundary:

	/*
	 * Check if we can fit the next segment header into the remaining space
	 * of the sector.
	 */
	sector_bytes_left = round_up(*cur_out, sectorsize) - *cur_out;
	if (sector_bytes_left >= LZO_LEN || sector_bytes_left == 0)
		goto out;

	/* The remaining size is not enough, pad it with zeros */
	memset(kaddr + offset_in_page(*cur_out), 0,
	       sector_bytes_left);
	*cur_out += sector_bytes_left;


So we always ensure that each segment header never crosses the page 
boundary.

This behavior is a little tricky but is part of the on-disk format for 
lzo compressed data.


BTW, I also thought that part could be suspicious, as it keeps the page
mapped (unlike all other call sites), so I tried the following diff,
but it made no difference to the leak:

diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index 65cb0766e62d..0648acc48310 100644
--- a/fs/btrfs/lzo.c
+++ b/fs/btrfs/lzo.c
@@ -151,6 +151,7 @@ static int copy_compressed_data_to_page(char *compressed_data,
  	kaddr = kmap(cur_page);
  	write_compress_length(kaddr + offset_in_page(*cur_out),
  			      compressed_size);
+	kunmap(cur_page);
  	*cur_out += LZO_LEN;

  	orig_out = *cur_out;
@@ -160,7 +161,6 @@ static int copy_compressed_data_to_page(char *compressed_data,
  		u32 copy_len = min_t(u32, sectorsize - *cur_out % sectorsize,
  				     orig_out + compressed_size - *cur_out);

-		kunmap(cur_page);
  		cur_page = out_pages[*cur_out / PAGE_SIZE];
  		/* Allocate a new page */
  		if (!cur_page) {
@@ -173,6 +173,7 @@ static int copy_compressed_data_to_page(char *compressed_data,

  		memcpy(kaddr + offset_in_page(*cur_out),
  		       compressed_data + *cur_out - orig_out, copy_len);
+		kunmap(cur_page);

  		*cur_out += copy_len;
  	}
@@ -186,12 +187,15 @@ static int copy_compressed_data_to_page(char *compressed_data,
  		goto out;

  	/* The remaining size is not enough, pad it with zeros */
+	cur_page = out_pages[*cur_out / PAGE_SIZE];
+	ASSERT(cur_page);
+	kmap(cur_page);
  	memset(kaddr + offset_in_page(*cur_out), 0,
  	       sector_bytes_left);
+	kunmap(cur_page);
  	*cur_out += sector_bytes_left;

  out:
-	kunmap(cur_page);
  	return 0;
  }

Thanks,
Qu
> 
> In particular, what if "offset_in_page(*cur_out)" is very close to the
> end of the page?
> 
> That write_compress_length() always writes out a word-sized length (ie
> LZO_LEN bytes), and the above pattern seems to have no model to handle
> the "oh, this 4-byte write crosses a page boundary"
> 
> The other writes in that function seem to do it properly, and you have
> 
>          u32 copy_len = min_t(u32, sectorsize - *cur_out % sectorsize,
>                               orig_out + compressed_size - *cur_out);
> 
> so doing the memcpy() of size 'copy_len' should never cross a page
> boundary as long as sectorsize is a power-of-2 smaller or equal to a
> page size. But those 4-byte length writes seem like they could be page
> crossers.
> 
> The same situation exists on the reading side, I think.
> 
> Maybe there's some reason why the read/write_compress_length() can
> never cross a page boundary, but it did strike me as odd.
> 
>               Linus
>

Thread overview: 7+ messages
2021-11-04 11:50 David Sterba
     [not found] ` <CAHk-=whYQvExYESEOJoSj4Jy7t+tSZgbCWuNpdwXYh+3zq2itw@mail.gmail.com>
2021-11-04 23:37   ` Linus Torvalds
2021-11-04 23:48     ` Linus Torvalds
2021-11-05  0:01       ` Qu Wenruo [this message]
2021-11-05 16:07         ` Ira Weiny
2021-11-05 19:50     ` David Sterba
2021-11-16 15:43       ` David Sterba
