[PATCH] drm/radeon/kms/blit: fix blit copy for very large buffers - dri-devel - freedesktop.org experimental mailing list

1 Feb 2012

Evergreen and NI blit copy was broken if the buffer maps to a rectangle
whose one dimension is 16384 (max dimension allowed by these chips).
In the mainline kernel, the problem is exposed only when buffers are
very large (1G), but it's still a problem. The problem could be exposed
for smaller buffers if anyone modifies the algorithm for rectangle
construction in r600_blit_create_rect() (the reason why someone would
modify that algorithm is to tune the performance of buffer moves).
The root cause was in i2f() function which only operated on range between
0 and 16383. Fix this by extending the range of i2f() function to 0 to
32767.
While at it improve the function so that the range can be easily
extended in the future (if it becomes necessary), cleanup lines
over 80 characters, and replace in-line comments with one strategic
comment that explains the crux of the function.
Credits to michel@daenzer.net for pointing out the root cause of
the bug.
Signed-off-by: Ilija Hadzic ihadzic@research.bell-labs.com
---
 drivers/gpu/drm/radeon/r600_blit_kms.c |   35 ++++++++++++++++++++++---------
 1 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c
index d996f43..32dcc95 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev)
    radeon_ring_write(ring, sq_stack_resource_mgmt_2);
 }
+#define I2F_MAX_BITS 15
+#define I2F_MAX_INPUT  ((2 << I2F_MAX_BITS) - 1)
+#define I2F_SHIFT (24 - I2F_MAX_BITS)
+
+/*
+ * converts unsigned integer into 32-bit IEEE floating point representation;
+ * conversion is not universal and only works for the range from 0
+ * to 2^^I2F_MAX_BITS-1; currently we only use it with inputs between
+ * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough; if necessary
+ * I2F_MAX_BITS can be increased, but that will add to loop iterations
+ * and slow us down; conversion is done by shifting the input and counting
+ * down until the first 1 reaches bit position 23; the resulting counter
+ * and the shifted input are the exponent and the fraction; the sign is
+ * always zero
+ */
 static uint32_t i2f(uint32_t input)
 {
    u32 result, i, exponent, fraction;
-	if ((input & 0x3fff) == 0)
-		result = 0; /* 0 is a special case */
+	WARN_ON(input > I2F_MAX_INPUT);
+
+	if ((input & I2F_MAX_INPUT) == 0)
+		result = 0;
    else {
-		exponent = 140; /* exponent biased by 127; */
-		fraction = (input & 0x3fff) << 10; /* cheat and only
-						      handle numbers below 2^^15 */
-		for (i = 0; i < 14; i++) {
+		exponent = 126 + I2F_MAX_BITS;
+		fraction = (input & I2F_MAX_INPUT) << I2F_SHIFT;
+
+		for (i = 0; i < I2F_MAX_BITS; i++) {
    		if (fraction & 0x800000)
    			break;
    		else {
-				fraction = fraction << 1; /* keep
-							     shifting left until top bit = 1 */
+				fraction = fraction << 1;
    			exponent = exponent - 1;
    		}
    	}
-		result = exponent << 23 | (fraction & 0x7fffff); /* mask
-								    off top bit; assumed 1 */
+		result = exponent << 23 | (fraction & 0x7fffff);
    }
    return result;
 }
-- 
1.7.7