On Sun, 31 Mar 2013, Michal Hocko wrote:
On Sat 30-03-13 18:26:53, Ilija Hadzic wrote:
This looks a bit like a hack and it doesn't look right, conceptually. If the call fails, it should restore things as if nothing has ever happened and overwriting old_mapping is not going to do the trick.
OK, I thought this is what the patch does as it falls back to &inode->i_data which is the default mapping for all inodes or it uses what used to be in device mapping.
I am obviously not familiar with the drm code but it feels a bit strange that the device mapping can be different than inode's resp. file's one
The reason for this is explained in commit message associated with 949c4a34.
In summary, the device's mapping is that of the inode associated with the first opener. Before 949c4a34, subsequent openers would have to come in through exactly the same inode that the first opener came in (otherwise the open call would fail). So if a user did something like: start X, remove /dev/dri/cardN file, mknod the same file again, the applications started after such an action would stop working. Also, using the GPU from chroot-ed environment was not possible if there was another opener from different root.
The 949c4a34, removed this restriction, but introduced a problem with VmWare GPU drivers, which fdb40a08. However, fdb40a08 introduced the bug that you have reported.
The problem that I have with your proposed fix is that if the first opener fails, it can set the device's mapping to that of the inode that was never used and never opened (and could even be removed later down the road).
and even more confusing that inode and file are saved separately.
I was trying to quickly get out the patch that was safe in terms of introducing new breakage. So the "conservative" thing to do (without having to think through all possible scenarios) was to restore each of the three pointers from their own temporary variable. Thinking about it, you are probably right that file descriptor's and inode's mapping pointer are equal when open call is entered so we could use one variable. However, you still need a separate variable to store the device's mapping pointer because that one can be different.
Attached is a v2 of the patch, for reference. I would appreciate if the original reporter or you tested it in lieu of your proposed patch and let me know if it fixes your issue.
-- Ilija