[git pull] drm fixes


Dave Airlie

7 Jun 2010

3:09 a.m.

Hi Linus,

3 regressions fixes, one radeon loading on IGP, one i865 loading, one and an evergreen userspace interaction workaround.

It adds hwmon support for a temperature sensor on r600 cards; later PM patches were built on this and Alex had tested them together, so I didn't want to cherry-pick around it. Also it's useful to report the gpu temp to check if power management is helping cool it down.

Dave.

The following changes since commit e44a21b7268a022c7749f521c06214145bd161e4:
  Linus Torvalds (1):
        Linux 2.6.35-rc2

are available in the git repository at:

ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-fixes

Alex Deucher (5):
      drm/radeon/kms/evergreen: set accel_enabled
      drm/radeon/kms: add support for internal thermal sensors (v3)
      drm/radeon/kms/pm: Misc fixes
      drm/radeon/kms/pm: add mid profile
      drm/radeon/kms/combios: fix typo in voltage fix

Chris Wilson (4):
      drm/i915: Propagate error from drm_fb_helper_init().
      drm/i915: Propagate error from intel_fbdev_init().
      drm/nouveau: Propagate error from drm_fb_helper_init()
      drm/radeon: Propagate error from drm_fb_helper_init()

Dan Carpenter (2):
      drm/drm_crtc: return -EFAULT on copy_to_user errors
      drm/vmwgfx: return -EFAULT for copy_to_user errors

Dave Airlie (3):
      drm/i915: fix oops on single crtc devices.
      drm/radeon: fix PM on non-vram cards.
      drm/fb: use printk to print out the switching to text mode error.


Linus Torvalds

7 Jun

6 p.m.

On Mon, 7 Jun 2010, Dave Airlie wrote:

...

3 regressions fixes, one radeon loading on IGP, one i865 loading, one and an evergreen userspace interaction workaround.

This is:

26 files changed, 372 insertions(+), 66 deletions(-)

and there are apparently several reports of known problems (the problem with modesetting) that aren't even addressed.

See my -rc2 announcement. I absolutely do NOT want any new code. I want regression fixes, fixes for security issues, and fixes for oopses. Nothing else. I'm going to be hardassed about this, because quite frankly, if I'm not, people will just continue with the same-old, same-old, and send me random stuff without thinking hard about it.

So please. Just make me a tree that has regression fixes _only_. I'm not AT ALL interested in "it is useful to report the gpu temp". If it was so useful, and if it was ready before the merge window, it should have gone in then. That clearly wasn't the case, so it's not going in now either.

Linus

Al Viro

6:26 p.m.

On Mon, Jun 07, 2010 at 11:00:51AM -0700, Linus Torvalds wrote:

...

So please. Just make me a tree that has regression fixes _only_. I'm not AT ALL interested in "it is useful to report the gpu temp". If it was so useful, and if it was ready before the merge window, it should have gone in then. That clearly wasn't the case, so it's not going in now either.

Ho-hum... Speaking of which, what about leak fixes? There's a long-standing in-core inode leak in jffs2; basically, if you fail directory modification in symlink() et.al., you get a leaked inode and whinge at umount. Found after -rc1, had been there since all the way back (similar bug in creat() had been fixed in 2003, mkdir()/mknod()/symlink() were not). Fix sits in jffs2-fixes now...

I can simply pull jffs2-fixes into vfs for-next (I need it in there for ->evict_inode() series), but I'd obviously prefer to just rebase it after it gets into mainline.

Linus Torvalds

6:53 p.m.

On Mon, 7 Jun 2010, Al Viro wrote:

...

Ho-hum... Speaking of which, what about leak fixes? There's a long-standing in-core inode leak in jffs2; basically, if you fail directory modification in symlink() et.al., you get a leaked inode and whinge at umount. Found after -rc1, had been there since all the way back (similar bug in creat() had been fixed in 2003, mkdir()/mknod()/symlink() were not). Fix sits in jffs2-fixes now...

I think a leak that is trivial easily falls under "security issue" as a potential DoS issue.

On the other hand, if it's not trivially fixed (say it needs big re-organizing of some locking or refcounting or whatever), and it's a really slow leak of a pretty small data structure, and is not triggered by normal users (say, you need to mount a filesystem or it needs some very specific timing), I think it falls under "we haven't seen in the previous five years, we might as well make sure the fix is tested in the next merge window".

So I think it's a judgement call.

...

I can simply pull jffs2-fixes into vfs for-next (I need it in there for ->evict_inode() series), but I'd obviously prefer to just rebase it after it gets into mainline.

I seem to have a jffs2 pull request that I haven't yet processed, exactly because it wasn't clear. It's much bigger than I would have wished for, and it's not clear it's all regressions at all.

DavidW? It's

7 files changed, 107 insertions(+), 91 deletions(-)

and while that's in the size range that I didn't just reject it like the drm pull, I still do want to know if that's really just true major bugfixes and regressions. We already had a really bad -rc2 release due to a tiny and innocent-looking bugfix that turned out to be anything but. I do _not_ want to repeat that with -rc3, since I'll be gone.

Linus

Al Viro

7:08 p.m.

On Mon, Jun 07, 2010 at 11:53:28AM -0700, Linus Torvalds wrote:

...

On Mon, 7 Jun 2010, Al Viro wrote:

...
Ho-hum... Speaking of which, what about leak fixes? There's a long-standing in-core inode leak in jffs2; basically, if you fail directory modification in symlink() et.al., you get a leaked inode and whinge at umount. Found after -rc1, had been there since all the way back (similar bug in creat() had been fixed in 2003, mkdir()/mknod()/symlink() were not). Fix sits in jffs2-fixes now...

I think a leak that is trivial easily falls under "security issue" as a potential DoS issue.

On the other hand, if it's not trivially fixed (say it needs big re-organizing of some locking or refcounting or whatever), and it's a really slow leak of a pretty small data structure, and is not triggered by normal users (say, you need to mount a filesystem or it needs some very specific timing), I think it falls under "we haven't seen in the previous five years, we might as well make sure the fix is tested in the next merge window".

You need something like IO errors or device being full to trigger it. As for the fix, it's basically a matter of "set i_nlink to 0 and iput() instead of manual jffs2_clear_inode(); sure, you want to kill the on-disk inode, but you want in-core one gone too".

Basically, that's what all local filesystems are doing to clean up after such error and that's what jffs2 is doing for ->create().

As for the other stuff in that tree... There's a fix for nfsd/create race (rather narrow and not trivial to hit, but capable of fs corruption) and there's mtd stuff I've no fscking clue about.

If not for the mtd part I'd have simply pulled it into my tree. As it is... I still can do that (done that for current semi-private branch), but I'd prefer to avoid feeding mtd stuff through vfs tree, for all the obvious reasons.

David Woodhouse

7:32 p.m.

On Mon, 2010-06-07 at 11:53 -0700, Linus Torvalds wrote:

...

On Mon, 7 Jun 2010, Al Viro wrote:

...
Ho-hum... Speaking of which, what about leak fixes? There's a long-standing in-core inode leak in jffs2; basically, if you fail directory modification in symlink() et.al., you get a leaked inode and whinge at umount. Found after -rc1, had been there since all the way back (similar bug in creat() had been fixed in 2003, mkdir()/mknod()/symlink() were not). Fix sits in jffs2-fixes now...

I think a leak that is trivial easily falls under "security issue" as a potential DoS issue.

On the other hand, if it's not trivially fixed (say it needs big re-organizing of some locking or refcounting or whatever), and it's a really slow leak of a pretty small data structure, and is not triggered by normal users (say, you need to mount a filesystem or it needs some very specific timing), I think it falls under "we haven't seen in the previous five years, we might as well make sure the fix is tested in the next merge window".

So I think it's a judgement call.

The fix is fairly trivial. There's a "big" patch to fs/jffs2/dir.c which accounts for the bulk of my pull request, but if you look harder you'll see it's mostly just a bunch of removing 'return ret;' and adding 'goto fail;' so the error cleanup happens properly.
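As a rough illustration of the 'return ret;' versus 'goto fail;' shape described above, here is a minimal userspace sketch. It is not the actual fs/jffs2/dir.c change: struct toy_inode, toy_new_inode(), toy_release_inode(), toy_create_leaky(), toy_create_fixed() and the live_inodes counter are all made-up names, and the step_result argument simply stands in for a directory update that may fail.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an in-core inode; not the kernel's struct inode. */
struct toy_inode {
	int in_use;
};

/* Number of toy inodes currently allocated; nonzero after an error means a leak. */
static int live_inodes;

static struct toy_inode *toy_new_inode(void)
{
	struct toy_inode *inode = calloc(1, sizeof(*inode));

	if (inode) {
		inode->in_use = 1;
		live_inodes++;
	}
	return inode;
}

static void toy_release_inode(struct toy_inode *inode)
{
	assert(inode && inode->in_use);
	live_inodes--;
	free(inode);
}

/* Old shape: the early 'return ret;' skips cleanup, so a failed step leaks the inode. */
static int toy_create_leaky(int step_result, struct toy_inode **out)
{
	struct toy_inode *inode = toy_new_inode();
	int ret;

	if (!inode)
		return -1;
	ret = step_result;	/* stands in for a directory update that may fail */
	if (ret)
		return ret;	/* inode is never released on this path */
	*out = inode;
	return 0;
}

/* New shape: every failure jumps to one label that releases the inode. */
static int toy_create_fixed(int step_result, struct toy_inode **out)
{
	struct toy_inode *inode = toy_new_inode();
	int ret;

	if (!inode)
		return -1;
	ret = step_result;
	if (ret)
		goto fail;
	*out = inode;
	return 0;

fail:
	toy_release_inode(inode);
	return ret;
}
```

Passing a nonzero step_result to toy_create_leaky() leaves live_inodes at 1, while the same call to toy_create_fixed() leaves it at 0; on success both hand the inode to the caller through out.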

Al pointed out a second problem at the same time, fixed by commit e72e6497 in the tree I asked you to pull. That involved adding an unlock_new_inode() to the same error paths that the first patch used.

Between the two bugs, I figured it was worth pushing the fixes for 2.6.35.

The third jffs2 patch in that tree is a fix for ctime semantics which is a two-liner. Again not a regression but worth fixing, and -stable fodder.

Al also pointed out that I could use iget_failed(), but I figured that cleanup could wait for 2.6.36.

...

I seem to have a jffs2 pull request that I haven't yet processed, exactly because it wasn't clear. It's much bigger than I would have wished for, and it's not clear it's all regressions at all.

DavidW? It's
7 files changed, 107 insertions(+), 91 deletions(-)
and while that's in the size range that I didn't just reject it like the drm pull, I still do want to know if that's really just true major bugfixes and regressions.

The patches to r852 are fixing the fact that suspend/resume wasn't working. Not strictly a regression, as it's a new driver in 2.6.35 -- but I judged that it was best to fix it.

The Kconfig patch fixes a problem with the menu nesting, introduced with the SmartMedia support. That one is a regression.

I'll concede that I could probably have lived without the DocBook patch, and the patch to use memdup_user() in mtdchar.c, but they're so trivial that it seemed pointless rebasing the tree to exclude them once I concluded I should try my luck at getting the other stuff into -rc2.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

Al Viro

7:40 p.m.

On Mon, Jun 07, 2010 at 08:32:45PM +0100, David Woodhouse wrote:

...

The fix is fairly trivial. There's a "big" patch to fs/jffs2/dir.c which accounts for the bulk of my pull request, but if you look harder you'll see it's mostly just a bunch of removing 'return ret;' and adding 'goto fail;' so the error cleanup happens properly.

Al pointed out a second problem at the same time, fixed by commit e72e6497 in the tree I asked you to pull. That involved adding an unlock_new_inode() to the same error paths that the first patch used.

Between the two bugs, I figured it was worth pushing the fixes for 2.6.35.

The third jffs2 patch in that tree is a fix for ctime semantics which is a two-liner. Again not a regression but worth fixing, and -stable fodder.

Al also pointed out that I could use iget_failed(), but I figured that cleanup could wait for 2.6.36.

BTW, if you put jffs2 stuff into a separate queue, I can just pull it (and add iget_failed() conversion on top of that). Not a problem...

Linus Torvalds

9:17 p.m.

On Mon, 7 Jun 2010, David Woodhouse wrote:

...

The fix is fairly trivial. There's a "big" patch to fs/jffs2/dir.c which accounts for the bulk of my pull request, but if you look harder you'll see it's mostly just a bunch of removing 'return ret;' and adding 'goto fail;' so the error cleanup happens properly.

So that's the part I'm worried about.

I'm going to be hardnosed, but I'm _not_ going to be so hardnosed as to worry about some oneliner DocBook patch. It's not about being anal to quite that degree, that would be silly. But the dir.c change is what I end up worrying about.

It's not at all clear why it's good to change

jffs2_clear_inode(inode);

into

make_bad_inode(inode);
iput(inode);

and that changelog doesn't really explain it either ("fix leak"? Ok, I can see the iput() fixing the leak - but you also did that jffs2_clear_inode() change, and that has no explanation what-so-ever.

So is this a safe thing that definitely fixes a serious bug? I am left with no good way to judge.

Linus

Al Viro

9:33 p.m.

On Mon, Jun 07, 2010 at 02:17:23PM -0700, Linus Torvalds wrote:

...

jffs2_clear_inode(inode);

into

make_bad_inode(inode);
iput(inode);

and that changelog doesn't really explain it either ("fix leak"? Ok, I can see the iput() fixing the leak - but you also did that jffs2_clear_inode() change, and that has no explanation what-so-ever.

The final iput() calls ->clear_inode() (jffs2_clear_inode in case of jffs2) and the inode has just been created, with no other in-core references existing. Basically, that call was the only part of (required) iput() that _was_ done there ;-)
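To make the relationship between the final iput() and the ->clear_inode() method concrete, here is a toy userspace model. It is a sketch under loud assumptions, not the kernel's implementation: struct demo_inode, struct demo_ops, demo_new_inode(), demo_iput() and the two counters are invented for illustration, and the real iput() path does considerably more than this.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_inode;

/* Hypothetical per-filesystem method table with a single hook. */
struct demo_ops {
	void (*clear_inode)(struct demo_inode *inode);
};

/* Hypothetical in-core inode with a reference count. */
struct demo_inode {
	int refcount;
	int fs_private_cleared;
	const struct demo_ops *ops;
};

static int demo_live;		/* inodes allocated and not yet freed */
static int demo_clear_calls;	/* how often the filesystem hook ran */

/* Stands in for a filesystem's ->clear_inode: drops fs-private state only. */
static void demo_fs_clear_inode(struct demo_inode *inode)
{
	inode->fs_private_cleared = 1;
	demo_clear_calls++;
}

static const struct demo_ops demo_fs_ops = {
	.clear_inode = demo_fs_clear_inode,
};

static struct demo_inode *demo_new_inode(void)
{
	struct demo_inode *inode = calloc(1, sizeof(*inode));

	if (inode) {
		inode->refcount = 1;	/* the creator holds the only reference */
		inode->ops = &demo_fs_ops;
		demo_live++;
	}
	return inode;
}

/* Drop one reference; the final drop runs the hook and then frees the inode. */
static void demo_iput(struct demo_inode *inode)
{
	assert(inode->refcount > 0);
	if (--inode->refcount > 0)
		return;
	inode->ops->clear_inode(inode);
	demo_live--;
	free(inode);
}
```

In this model, calling the hook directly on a freshly created inode clears the filesystem-private state but leaves the in-core object allocated, whereas demo_iput() on the only reference does both.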

FWIW, what's happening around ->clear_inode()/->delete_inode()/->drop_inode() is a mess. This leak got found when I'd been looking through that crap; results of sanitizing are in #evict_inode (vfs-2.6.git). I'm going to shift that into for-next tomorrow, assuming it survives local beating. For now I've just pulled jffs2-fixes in it...

Linus Torvalds

9:38 p.m.

On Mon, 7 Jun 2010, Al Viro wrote:

...

On Mon, Jun 07, 2010 at 02:17:23PM -0700, Linus Torvalds wrote:

...
jffs2_clear_inode(inode);

into

make_bad_inode(inode);
iput(inode);

and that changelog doesn't really explain it either ("fix leak"? Ok, I can see the iput() fixing the leak - but you also did that jffs2_clear_inode() change, and that has no explanation what-so-ever.

The final iput() calls ->clear_inode() (jffs2_clear_inode in case of jffs2) and the inode has just been created, with no other in-core references existing. Basically, that call was the only part of (required) iput() that _was_ done there ;-)

FWIW, what's happening around ->clear_inode()/->delete_inode()/->drop_inode() is a mess. This leak got found when I'd been looking through that crap; results of sanitizing are in #evict_inode (vfs-2.6.git). I'm going to shift that into for-next tomorrow, assuming it survives local beating. For now I've just pulled jffs2-fixes in it...

Ok, a changelog like that would have been a good thing. Not that I usually care, but now that I'm in careful mode, I do end up looking at things like this, and a good changelog would have gone much further in convincing me that the "goto fail" changes really were just about fixing the leak, and that there wasn't some other change hidden in the same commit.

Linus

David Woodhouse

9:39 p.m.

On Mon, 2010-06-07 at 14:17 -0700, Linus Torvalds wrote:

...

and that changelog doesn't really explain it either ("fix leak"? Ok, I can see the iput() fixing the leak - but you also did that jffs2_clear_inode() change, and that has no explanation what-so-ever.

jffs2_clear_inode() is the file system's ->clear_inode method, so it gets called from the VFS when the inode is destroyed, after iput().

I suppose that ought to have been a clue, right from the very beginning, that we should never have been calling it directly on our error paths.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

Al Viro

8 Jun

12:30 a.m.

On Mon, Jun 07, 2010 at 10:39:28PM +0100, David Woodhouse wrote:

...

On Mon, 2010-06-07 at 14:17 -0700, Linus Torvalds wrote:

...
and that changelog doesn't really explain it either ("fix leak"? Ok, I can see the iput() fixing the leak - but you also did that jffs2_clear_inode() change, and that has no explanation what-so-ever.

jffs2_clear_inode() is the file system's ->clear_inode method, so it gets called from the VFS when the inode is destroyed, after iput().

I suppose that ought to have been a clue, right from the very beginning, that we should never have been calling it directly on our error paths.

Yep. The other place that directly called its ->clear_inode() also had been bogus, BTW - logfs had been playing rather sick games with special inodes and ended up open-coding just about everything on new_inode/iput paths for those. They needed that stuff evicted after all normal inodes, but before the second call of invalidate_inodes() would scream about surviving busy inodes. I.e. that should've been happening in ->put_super(); no need to deal with handcrafted inodes that would sit outside of inode list...

Dave Airlie

7 Jun

8:26 p.m.

On Tue, Jun 8, 2010 at 4:00 AM, Linus Torvalds torvalds@linux-foundation.org wrote:

...

On Mon, 7 Jun 2010, Dave Airlie wrote:

...
3 regressions fixes, one radeon loading on IGP, one i865 loading, one and an evergreen userspace interaction workaround.

This is:

26 files changed, 372 insertions(+), 66 deletions(-)

and there are apparently several reports of known problems (the problem with modesetting) that aren't even addressed.

Okay, I'm not sure which regression you are talking about. Do you want regression fixes early, like you always say, or do you want to wait until I have every reported regression fixed before I send a pull request?

I'll rebase the tree today (which means it will be tested less than what I originally sent you, inconsistent messages much?)..

...

So please. Just make me a tree that has regression fixes _only_. I'm not AT ALL interested in "it is useful to report the gpu temp". If it was so useful, and if it was ready before the merge window, it should have gone in then. That clearly wasn't the case, so it's not going in now either.

I really hope you do this, if I find one thing going into your tree after today that isn't a regression fix I'll send revert patches if that'll help.

Dave.

Dave Airlie

8:41 p.m.

On Tue, Jun 8, 2010 at 6:26 AM, Dave Airlie airlied@gmail.com wrote:

...

On Tue, Jun 8, 2010 at 4:00 AM, Linus Torvalds torvalds@linux-foundation.org wrote:

...
On Mon, 7 Jun 2010, Dave Airlie wrote:

...
3 regressions fixes, one radeon loading on IGP, one i865 loading, one and an evergreen userspace interaction workaround.

This is:

26 files changed, 372 insertions(+), 66 deletions(-)

and there are apparently several reports of known problems (the problem with modesetting) that aren't even addressed.

Okay, I'm not sure which regression you are talking about. Do you want regression fixes early, like you always say, or do you want to wait until I have every reported regression fixed before I send a pull request?

Oh the one where I said to the reporter, I've reproduced this, and will fix it tomorrow when I have proper time and access to my test machine?

I didn't think writing a fix in the 5 mins before I left the test machine and sending it to you was acceptable. Again, can you maintain some semblance of consistency across maintainers/releases?

Like I'm happy if you really enforce this no features idea, I'd be really happy if you did it every release since it makes it a lot easier to push back to submaintainers if you can point at Linus not pulling features from people. However, it's really hard to push back on submaintainers (and it's one of the reasons Eric talks to you directly after rc1) when you can be very inconsistent about what you pull and from whom. Like I can tell someone this isn't going in this release, then you'll pull something uglier from someone else in -rc3 and I end up looking stupid because I said there was no hope of getting anything in. Consistency is something that would make everyone's life easier, across all releases and all maintainers.

Dave.

Linus Torvalds

9:03 p.m.

On Tue, 8 Jun 2010, Dave Airlie wrote:

...

Oh the one where I said to the reporter, I've reproduced this, and will fix it tomorrow when I have proper time and access to my test machine?

I didn't think writing a fix in the 5 mins before I left the test machine and sending it to you was acceptable. Again, can you maintain some semblance of consistency across maintainers/releases?

No, no. I really didn't imply that you should hurry and not be careful. I was just unhappy about the mixing of non-regression fixes with the regression fixes.

...

Like I'm happy if you really enforce this no features idea, I'd be really happy if you did it every release since it makes it a lot easier to push back to submaintainers if you can point at Linus not pulling features from people.

I already had another person state their happiness with me pushing back, and while I had my reasons for doing it this particular release, I do know that I've been letting things slide wrt the merge window a bit too much.

So let's see how the 2.6.35 release cycle ends up looking when all is said and done. If pushing back harder ends up actually making things easier and the release cycle ends up working better as a result, I'm certainly very open to just being hardnosed in general.

I suspect it won't even be very painful if people just get used to it. And if it ends up really helping sub-maintainers ("I can't do that, because Linus wouldn't pull the result anyway"), then that would be a really good reason for me to be rather stricter about the rules.

Linus

Linus Torvalds

8:52 p.m.

On Tue, 8 Jun 2010, Dave Airlie wrote:

...

...
26 files changed, 372 insertions(+), 66 deletions(-)

and there are apparently several reports of known problems (the problem with modesetting) that aren't even addressed.

Okay, I'm not sure which regression you are talking about. Do you want regression fixes early, like you always say, or do you want to wait until I have every reported regression fixed before I send a pull request?

I absolutely do want regression fixes early. If they've been verified to fix something, I do want to merge them as soon as possible.

But EVEN MORE IMPORTANTLY - and the point of me saying no - I do _not_ want non-regression fixes mixed in. Which clearly was the case here. So "as soon as possible" does not mean that I'll take _other_ things just to get the fix. The regression fix should stand on its own - and be merged on its own.

So no, it's not an excuse to send me a tree that contains other crud, and then use ".. but but you wanted regression fixes" as the excuse for sending those other changes too.

The whole point of the merge window is to then _after_ it closes no longer merge new code.

...

I really hope you do this, if I find one thing going into your tree after today that isn't a regression fix I'll send revert patches if that'll help.

I don't like seeing revert patches unless they revert something that is buggy. So no, "revert because I shouldn't have sent that commit" is not a good policy. If I end up pulling something, it's my bad, and I won't revert it just because I made a mistake - it needs to somehow actually be involved in some real problem.

But I _am_ trying to be proactive about problems during this particular release cycle.

The point I'm trying to make is that I am going to be strict about the rules (the ones we've had for several years now, but people tend to be a bit loose about). I _should_ probably be stricter than I usually am even normally, but this time I am going to be very strict for -rc3 due to being away for part of the release cycle.

And no, don't worry - it's not just for you. I already rejected a microblaze pull for all the same reasons, and asked for some extended clarifications for a (much smaller) jffs2 pull despite the fact that we've generally not had lots of problems with jffs2.

Linus


dri-devel@lists.freedesktop.org
