TTM and AGP conflicts

List overview All Threads
Download

newer

older

[PATCH 1/3] drm/radeon: GPU...

[patch 2/2] drm/radeon: double...

James Simmons

22 Dec 2011 22 Dec '11

9:56 p.m.

Hi!

I updated the openchrome tree and while testing on the AGP system discovered some interesting problems with the new TTM changes. The problems center around the ttm_tt_[un]populate which I modeled after the radeon and nouveau driver. First problem I noticed was on a AGP system that my ttm_tt_populate function would oops. Tracking it down I found the problem was the ttm_agp_tt_create calls ttm_tt_init instead of ttm_dma_tt_init so once my ttm_tt_populate function would attempt to touch the dma_address it would oops. The second issue is the assumption of the cast for struct ttm_tt in both the populate and unpopulate function. For the AGP case the proper case would be to ttm_agp_backend. At this point my assumption is the ttm_bo_move function has to be rewritten to handle the AGP case to avoid calling ttm_tt_bind and in all cases ttm_tt_bind needs to be avoided. Looking at the radeon and nouveau drivers I don't see that testing happening. Am I just missing something?

Show replies by date

Alex Deucher

22 Dec 22 Dec

10 p.m.

On Thu, Dec 22, 2011 at 4:56 PM, James Simmons jsimmons@infradead.org wrote:

...

Hi!

I updated the openchrome tree and while testing on the AGP system discovered some interesting problems with the new TTM changes. The problems center around the ttm_tt_[un]populate which I modeled after the radeon and nouveau driver. First problem I noticed was on a AGP system that my ttm_tt_populate function would oops. Tracking it down I found the problem was the ttm_agp_tt_create calls ttm_tt_init instead of ttm_dma_tt_init so once my ttm_tt_populate function would attempt to touch the dma_address it would oops. The second issue is the assumption of the cast for struct ttm_tt in both the populate and unpopulate function. For the AGP case the proper case would be to ttm_agp_backend. At this point my assumption is the ttm_bo_move function has to be rewritten to handle the AGP case to avoid calling ttm_tt_bind and in all cases ttm_tt_bind needs to be avoided. Looking at the radeon and nouveau drivers I don't see that testing happening. Am I just missing something?

Happens on AGP radeons as well: https://bugs.freedesktop.org/show_bug.cgi?id=43719

Alex

...

dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel

James Simmons

23 Dec 23 Dec

1:19 a.m.

...

...
Hi!

I updated the openchrome tree and while testing on the AGP system discovered some interesting problems with the new TTM changes. The problems center around the ttm_tt_[un]populate which I modeled after the radeon and nouveau driver. First problem I noticed was on a AGP system that my ttm_tt_populate function would oops. Tracking it down I found the problem was the ttm_agp_tt_create calls ttm_tt_init instead of ttm_dma_tt_init so once my ttm_tt_populate function would attempt to touch the dma_address it would oops. The second issue is the assumption of the cast for struct ttm_tt in both the populate and unpopulate function. For the AGP case the proper case would be to ttm_agp_backend. At this point my assumption is the ttm_bo_move function has to be rewritten to handle the AGP case to avoid calling ttm_tt_bind and in all cases ttm_tt_bind needs to be avoided. Looking at the radeon and nouveau drivers I don't see that testing happening. Am I just missing something?

Happens on AGP radeons as well: https://bugs.freedesktop.org/show_bug.cgi?id=43719

So I'm not crazy, so this needs to be fixed. Here is what my understanding of the TTM layer is. My impression is that struct ttm_bo_driver handles multiple domains, AGP, pcie etc and in each method you have to handle each specific domain you support. Also *move gives the impression of moving between these different domains. Where as for struct ttm_backend_func was more for allocating from a specific domain. Also I never saw a clear way to handle multiple backends. For example my AGP systems can also do pci dma as well.

Jerome Glisse

24 Dec 24 Dec

3:44 a.m.

On Thu, Dec 22, 2011 at 8:19 PM, James Simmons jsimmons@infradead.org wrote:

...

...
...
Hi!

I updated the openchrome tree and while testing on the AGP system discovered some interesting problems with the new TTM changes. The problems center around the ttm_tt_[un]populate which I modeled after the radeon and nouveau driver. First problem I noticed was on a AGP system that my ttm_tt_populate function would oops. Tracking it down I found the problem was the ttm_agp_tt_create calls ttm_tt_init instead of ttm_dma_tt_init so once my ttm_tt_populate function would attempt to touch the dma_address it would oops. The second issue is the assumption of the cast for struct ttm_tt in both the populate and unpopulate function. For the AGP case the proper case would be to ttm_agp_backend. At this point my assumption is the ttm_bo_move function has to be rewritten to handle the AGP case to avoid calling ttm_tt_bind and in all cases ttm_tt_bind needs to be avoided. Looking at the radeon and nouveau drivers I don't see that testing happening. Am I just missing something?

Happens on AGP radeons as well: https://bugs.freedesktop.org/show_bug.cgi?id=43719

So I'm not crazy, so this needs to be fixed. Here is what my understanding of the TTM layer is. My impression is that struct ttm_bo_driver handles multiple domains, AGP, pcie etc and in each method you have to handle each specific domain you support. Also *move gives the impression of moving between these different domains. Where as for struct ttm_backend_func was more for allocating from a specific domain. Also I never saw a clear way to handle multiple backends. For example my AGP systems can also do pci dma as well.

Yes this need to be fixed, i am likely guilty, i tested agp along the way but i must have miss something. If no one beat me to it i will get back to this on january.

Cheers, Jerome

Jerome Glisse

3 Jan 3 Jan

10:43 p.m.

On Fri, Dec 23, 2011 at 01:19:43AM +0000, James Simmons wrote:

...

...
...
Hi!

I updated the openchrome tree and while testing on the AGP system discovered some interesting problems with the new TTM changes. The problems center around the ttm_tt_[un]populate which I modeled after the radeon and nouveau driver. First problem I noticed was on a AGP system that my ttm_tt_populate function would oops. Tracking it down I found the problem was the ttm_agp_tt_create calls ttm_tt_init instead of ttm_dma_tt_init so once my ttm_tt_populate function would attempt to touch the dma_address it would oops. The second issue is the assumption of the cast for struct ttm_tt in both the populate and unpopulate function. For the AGP case the proper case would be to ttm_agp_backend. At this point my assumption is the ttm_bo_move function has to be rewritten to handle the AGP case to avoid calling ttm_tt_bind and in all cases ttm_tt_bind needs to be avoided. Looking at the radeon and nouveau drivers I don't see that testing happening. Am I just missing something?

Happens on AGP radeons as well: https://bugs.freedesktop.org/show_bug.cgi?id=43719

So I'm not crazy, so this needs to be fixed. Here is what my understanding of the TTM layer is. My impression is that struct ttm_bo_driver handles multiple domains, AGP, pcie etc and in each method you have to handle each specific domain you support. Also *move gives the impression of moving between these different domains. Where as for struct ttm_backend_func was more for allocating from a specific domain. Also I never saw a clear way to handle multiple backends. For example my AGP systems can also do pci dma as well.

...

dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel

Attached is patch to fix this, so sorry about that, i must have lost my agp change along the way when working on the patchset. This patch is not extensively tested, i will do more testing tomorrow on more gpu, might even found an nvidia agp i can try. Again sorry for breaking this.

Cheers, Jerome

Konrad Rzeszutek Wilk

10:59 p.m.

On Tue, Jan 03, 2012 at 05:43:40PM -0500, Jerome Glisse wrote:

...

On Fri, Dec 23, 2011 at 01:19:43AM +0000, James Simmons wrote:

...
...
...
Hi!

I updated the openchrome tree and while testing on the AGP system discovered some interesting problems with the new TTM changes. The problems center around the ttm_tt_[un]populate which I modeled after the radeon and nouveau driver. First problem I noticed was on a AGP system that my ttm_tt_populate function would oops. Tracking it down I found the problem was the ttm_agp_tt_create calls ttm_tt_init instead of ttm_dma_tt_init so once my ttm_tt_populate function would attempt to touch the dma_address it would oops. The second issue is the assumption of the cast for struct ttm_tt in both the populate and unpopulate function. For the AGP case the proper case would be to ttm_agp_backend. At this point my assumption is the ttm_bo_move function has to be rewritten to handle the AGP case to avoid calling ttm_tt_bind and in all cases ttm_tt_bind needs to be avoided. Looking at the radeon and nouveau drivers I don't see that testing happening. Am I just missing something?

Happens on AGP radeons as well: https://bugs.freedesktop.org/show_bug.cgi?id=43719

So I'm not crazy, so this needs to be fixed. Here is what my understanding of the TTM layer is. My impression is that struct ttm_bo_driver handles multiple domains, AGP, pcie etc and in each method you have to handle each specific domain you support. Also *move gives the impression of moving between these different domains. Where as for struct ttm_backend_func was more for allocating from a specific domain. Also I never saw a clear way to handle multiple backends. For example my AGP systems can also do pci dma as well.

...

dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel

Attached is patch to fix this, so sorry about that, i must have lost my agp change along the way when working on the patchset. This patch is not extensively tested, i will do more testing tomorrow on more gpu, might even found an nvidia agp i can try. Again sorry for breaking this.

Hey Jerome,

Was going to look at this week and see how it performs before (and after) the squash ttm bind+populate operation. Any thoughts of what benchmarks I should run?

Fyi, I've some NVidia AGP cards. We both live in Boston so could meet up.

And I could ask you about the patches I sent - not clear if you want to me squash the patch 2 and 3 together or just redo them diffrently.

...

Cheers, Jerome

...

...
From 99a6eee89a43bce155a93ab6d71ea041aac10680 Mon Sep 17 00:00:00 2001

From: Jerome Glisse jglisse@redhat.com Date: Tue, 3 Jan 2012 17:37:37 -0500 Subject: [PATCH] ttm: fix agp since ttm tt rework

ttm tt rework modified the way we allocate and populate the ttm_tt structure, the AGP side was missing some bit to properly work. Fix those and fix radeon and nouveau AGP support.

Tested on radeon only so far.

Signed-off-by: Jerome Glisse jglisse@redhat.com

drivers/gpu/drm/nouveau/nouveau_bo.c | 13 +++++++++++++ drivers/gpu/drm/radeon/radeon_ttm.c | 11 +++++++++++ drivers/gpu/drm/ttm/ttm_agp_backend.c | 17 +++++++++++++++++ drivers/gpu/drm/ttm/ttm_tt.c | 2 ++ include/drm/ttm/ttm_bo_driver.h | 2 ++ 5 files changed, 45 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index 8cf4a48..724b41a 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -1066,6 +1066,12 @@ nouveau_ttm_tt_populate(struct ttm_tt *ttm) dev_priv = nouveau_bdev(ttm->bdev); dev = dev_priv->dev;

+#if __OS_HAS_AGP
if (dev_priv->gart_info.type == NOUVEAU_GART_AGP) {
return ttm_agp_tt_populate(ttm);
}
+#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { return ttm_dma_populate((void *)ttm, dev->dev); @@ -1105,6 +1111,13 @@ nouveau_ttm_tt_unpopulate(struct ttm_tt *ttm) dev_priv = nouveau_bdev(ttm->bdev); dev = dev_priv->dev;

+#if __OS_HAS_AGP
if (dev_priv->gart_info.type == NOUVEAU_GART_AGP) {
ttm_agp_tt_unpopulate(ttm);
return;
}
+#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { ttm_dma_unpopulate((void *)ttm, dev->dev); diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index b0ebaf8..dc73d77 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -588,6 +588,11 @@ static int radeon_ttm_tt_populate(struct ttm_tt *ttm) return 0;

rdev = radeon_get_rdev(ttm->bdev); +#if __OS_HAS_AGP
if (rdev->flags & RADEON_IS_AGP) {
return ttm_agp_tt_populate(ttm);
}
+#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { @@ -624,6 +629,12 @@ static void radeon_ttm_tt_unpopulate(struct ttm_tt *ttm) unsigned i;

rdev = radeon_get_rdev(ttm->bdev); +#if __OS_HAS_AGP
if (rdev->flags & RADEON_IS_AGP) {
ttm_agp_tt_unpopulate(ttm);
return;
}
+#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c b/drivers/gpu/drm/ttm/ttm_agp_backend.c index 14ebd36..747c141 100644 --- a/drivers/gpu/drm/ttm/ttm_agp_backend.c +++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c @@ -31,6 +31,7 @@

#include "ttm/ttm_module.h" #include "ttm/ttm_bo_driver.h" +#include "ttm/ttm_page_alloc.h" #ifdef TTM_HAS_AGP #include "ttm/ttm_placement.h" #include <linux/agp_backend.h> @@ -97,6 +98,7 @@ static void ttm_agp_destroy(struct ttm_tt *ttm)

if (agp_be->mem) ttm_agp_unbind(ttm);

ttm_tt_fini(ttm); kfree(agp_be);

}

@@ -129,4 +131,19 @@ struct ttm_tt *ttm_agp_tt_create(struct ttm_bo_device *bdev, } EXPORT_SYMBOL(ttm_agp_tt_create);

+int ttm_agp_tt_populate(struct ttm_tt *ttm) +{
if (ttm->state != tt_unpopulated)
return 0;
return ttm_pool_populate(ttm);
+} +EXPORT_SYMBOL(ttm_agp_tt_populate);

+void ttm_agp_tt_unpopulate(struct ttm_tt *ttm) +{

ttm_pool_unpopulate(ttm);

+} +EXPORT_SYMBOL(ttm_agp_tt_unpopulate);

#endif diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index 58e1fa1..2f75d20 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -191,6 +191,7 @@ int ttm_tt_init(struct ttm_tt *ttm, struct ttm_bo_device *bdev, ttm->page_flags = page_flags; ttm->dummy_read_page = dummy_read_page; ttm->state = tt_unpopulated;

ttm->swap_storage = NULL;

ttm_tt_alloc_page_directory(ttm); if (!ttm->pages) {

@@ -222,6 +223,7 @@ int ttm_dma_tt_init(struct ttm_dma_tt *ttm_dma, struct ttm_bo_device *bdev, ttm->page_flags = page_flags; ttm->dummy_read_page = dummy_read_page; ttm->state = tt_unpopulated;

ttm->swap_storage = NULL;

INIT_LIST_HEAD(&ttm_dma->pages_list); ttm_dma_tt_alloc_page_directory(ttm_dma);

diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 2be8891..d43e892 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -1030,6 +1030,8 @@ extern struct ttm_tt *ttm_agp_tt_create(struct ttm_bo_device *bdev, struct agp_bridge_data *bridge, unsigned long size, uint32_t page_flags, struct page *dummy_read_page); +int ttm_agp_tt_populate(struct ttm_tt *ttm); +void ttm_agp_tt_unpopulate(struct ttm_tt *ttm); #endif

#endif

1.7.5.4

...

dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel

Konrad Rzeszutek Wilk

6 Jan 6 Jan

2:26 a.m.

On Tue, Jan 03, 2012 at 05:59:33PM -0500, Konrad Rzeszutek Wilk wrote:

...

On Tue, Jan 03, 2012 at 05:43:40PM -0500, Jerome Glisse wrote:

...
On Fri, Dec 23, 2011 at 01:19:43AM +0000, James Simmons wrote:

...
...
...
Hi!

I updated the openchrome tree and while testing on the AGP system discovered some interesting problems with the new TTM changes. The problems center around the ttm_tt_[un]populate which I modeled after the radeon and nouveau driver. First problem I noticed was on a AGP system that my ttm_tt_populate function would oops. Tracking it down I found the problem was the ttm_agp_tt_create calls ttm_tt_init instead of ttm_dma_tt_init so once my ttm_tt_populate function would attempt to touch the dma_address it would oops. The second issue is the assumption of the cast for struct ttm_tt in both the populate and unpopulate function. For the AGP case the proper case would be to ttm_agp_backend. At this point my assumption is the ttm_bo_move function has to be rewritten to handle the AGP case to avoid calling ttm_tt_bind and in all cases ttm_tt_bind needs to be avoided. Looking at the radeon and nouveau drivers I don't see that testing happening. Am I just missing something?

Happens on AGP radeons as well: https://bugs.freedesktop.org/show_bug.cgi?id=43719

So I'm not crazy, so this needs to be fixed. Here is what my understanding of the TTM layer is. My impression is that struct ttm_bo_driver handles multiple domains, AGP, pcie etc and in each method you have to handle each specific domain you support. Also *move gives the impression of moving between these different domains. Where as for struct ttm_backend_func was more for allocating from a specific domain. Also I never saw a clear way to handle multiple backends. For example my AGP systems can also do pci dma as well.

...

dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel

Attached is patch to fix this, so sorry about that, i must have lost my agp change along the way when working on the patchset. This patch is not extensively tested, i will do more testing tomorrow on more gpu, might even found an nvidia agp i can try. Again sorry for breaking this.

Hey Jerome,

Was going to look at this week and see how it performs before (and after) the squash ttm bind+populate operation. Any thoughts of what benchmarks I should run?

OK, Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com

I ran it on my radeon AGP cards and they performed nicely. I do need to run it with the nouveau AGP card tomorrow (barring any move_notify issues).

James Simmons

4 Jan 4 Jan

3:36 p.m.

...

Attached is patch to fix this, so sorry about that, i must have lost my agp change along the way when working on the patchset. This patch is not extensively tested, i will do more testing tomorrow on more gpu, might even found an nvidia agp i can try. Again sorry for breaking this.

Thanks for the fix up. I was wondering if this purposed change could be done instead. Its the same as your fix up except that you pass in the ttm_buffer_object for tt_create. The reason is I really like to have the ability to have more than one back end to grab a bunch pf pages from. Currently its AGP or something else. On some of my embedded devices the AGP space is not very large and can be exhausted very quickly. Those embedded devices have DMA engines I could use instead.

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index 8cf4a48..58ad508 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -349,10 +349,11 @@ nouveau_bo_wr32(struct nouveau_bo *nvbo, unsigned index, u32 val) }

static struct ttm_tt * -nouveau_ttm_tt_create(struct ttm_bo_device *bdev, +nouveau_ttm_tt_create(struct ttm_buffer_object *bo, unsigned long size, uint32_t page_flags, struct page *dummy_read_page) { + struct ttm_bo_device *bdev = bo->bdev; struct drm_nouveau_private *dev_priv = nouveau_bdev(bdev); struct drm_device *dev = dev_priv->dev;

@@ -1066,6 +1067,12 @@ nouveau_ttm_tt_populate(struct ttm_tt *ttm) dev_priv = nouveau_bdev(ttm->bdev); dev = dev_priv->dev;

+#if __OS_HAS_AGP + if (dev_priv->gart_info.type == NOUVEAU_GART_AGP) { + return ttm_agp_tt_populate(ttm); + } +#endif + #ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { return ttm_dma_populate((void *)ttm, dev->dev); @@ -1105,6 +1112,13 @@ nouveau_ttm_tt_unpopulate(struct ttm_tt *ttm) dev_priv = nouveau_bdev(ttm->bdev); dev = dev_priv->dev;

+#if __OS_HAS_AGP + if (dev_priv->gart_info.type == NOUVEAU_GART_AGP) { + ttm_agp_tt_unpopulate(ttm); + return; + } +#endif + #ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { ttm_dma_unpopulate((void *)ttm, dev->dev); diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index b0ebaf8..655f8e9 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -549,10 +549,12 @@ static struct ttm_backend_func radeon_backend_func = { .destroy = &radeon_ttm_backend_destroy, };

-struct ttm_tt *radeon_ttm_tt_create(struct ttm_bo_device *bdev, - unsigned long size, uint32_t page_flags, +struct ttm_tt *radeon_ttm_tt_create(struct ttm_buffer_object *bo, + unsigned long size, + uint32_t page_flags, struct page *dummy_read_page) { + struct ttm_bo_device *bdev = bo->bdev; struct radeon_device *rdev; struct radeon_ttm_tt *gtt;

@@ -588,6 +590,11 @@ static int radeon_ttm_tt_populate(struct ttm_tt *ttm) return 0;

rdev = radeon_get_rdev(ttm->bdev); +#if __OS_HAS_AGP + if (rdev->flags & RADEON_IS_AGP) { + return ttm_agp_tt_populate(ttm); + } +#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { @@ -624,6 +631,12 @@ static void radeon_ttm_tt_unpopulate(struct ttm_tt *ttm) unsigned i;

rdev = radeon_get_rdev(ttm->bdev); +#if __OS_HAS_AGP + if (rdev->flags & RADEON_IS_AGP) { + ttm_agp_tt_unpopulate(ttm); + return; + } +#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c b/drivers/gpu/drm/ttm/ttm_agp_backend.c index 14ebd36..747c141 100644 --- a/drivers/gpu/drm/ttm/ttm_agp_backend.c +++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c @@ -31,6 +31,7 @@

#include "ttm/ttm_module.h" #include "ttm/ttm_bo_driver.h" +#include "ttm/ttm_page_alloc.h" #ifdef TTM_HAS_AGP #include "ttm/ttm_placement.h" #include <linux/agp_backend.h> @@ -97,6 +98,7 @@ static void ttm_agp_destroy(struct ttm_tt *ttm)

if (agp_be->mem) ttm_agp_unbind(ttm); + ttm_tt_fini(ttm); kfree(agp_be); }

@@ -129,4 +131,19 @@ struct ttm_tt *ttm_agp_tt_create(struct ttm_bo_device *bdev, } EXPORT_SYMBOL(ttm_agp_tt_create);

+int ttm_agp_tt_populate(struct ttm_tt *ttm) +{ + if (ttm->state != tt_unpopulated) + return 0; + + return ttm_pool_populate(ttm); +} +EXPORT_SYMBOL(ttm_agp_tt_populate); + +void ttm_agp_tt_unpopulate(struct ttm_tt *ttm) +{ + ttm_pool_unpopulate(ttm); +} +EXPORT_SYMBOL(ttm_agp_tt_unpopulate); + #endif diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 2f0eab6..1adc149 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -338,7 +338,7 @@ static int ttm_bo_add_ttm(struct ttm_buffer_object *bo, bool zero_alloc) if (zero_alloc) page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; case ttm_bo_type_kernel: - bo->ttm = bdev->driver->ttm_tt_create(bdev, bo->num_pages << PAGE_SHIFT, + bo->ttm = bdev->driver->ttm_tt_create(bo, bo->num_pages << PAGE_SHIFT, page_flags, glob->dummy_read_page); if (unlikely(bo->ttm == NULL)) ret = -ENOMEM; diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index 58e1fa1..2f75d20 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -191,6 +191,7 @@ int ttm_tt_init(struct ttm_tt *ttm, struct ttm_bo_device *bdev, ttm->page_flags = page_flags; ttm->dummy_read_page = dummy_read_page; ttm->state = tt_unpopulated; + ttm->swap_storage = NULL;

ttm_tt_alloc_page_directory(ttm); if (!ttm->pages) { @@ -222,6 +223,7 @@ int ttm_dma_tt_init(struct ttm_dma_tt *ttm_dma, struct ttm_bo_device *bdev, ttm->page_flags = page_flags; ttm->dummy_read_page = dummy_read_page; ttm->state = tt_unpopulated; + ttm->swap_storage = NULL;

INIT_LIST_HEAD(&ttm_dma->pages_list); ttm_dma_tt_alloc_page_directory(ttm_dma); diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 2be8891..6b67c37 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -318,7 +318,7 @@ struct ttm_bo_driver { /** * ttm_tt_create * - * @bdev: pointer to a struct ttm_bo_device: + * @bo: pointer to a struct ttm_bo: * @size: Size of the data needed backing. * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags. * @dummy_read_page: See struct ttm_bo_device. @@ -328,7 +328,7 @@ struct ttm_bo_driver { * Returns: * NULL: Out of memory. */ - struct ttm_tt *(*ttm_tt_create)(struct ttm_bo_device *bdev, + struct ttm_tt *(*ttm_tt_create)(struct ttm_buffer_object *bo, unsigned long size, uint32_t page_flags, struct page *dummy_read_page); @@ -1030,6 +1030,8 @@ extern struct ttm_tt *ttm_agp_tt_create(struct ttm_bo_device *bdev, struct agp_bridge_data *bridge, unsigned long size, uint32_t page_flags, struct page *dummy_read_page); +int ttm_agp_tt_populate(struct ttm_tt *ttm); +void ttm_agp_tt_unpopulate(struct ttm_tt *ttm); #endif

#endif

Jerome Glisse

3:49 p.m.

On Wed, Jan 4, 2012 at 10:36 AM, James Simmons jsimmons@infradead.org wrote:

...

...
Attached is patch to fix this, so sorry about that, i must have lost my agp change along the way when working on the patchset. This patch is not extensively tested, i will do more testing tomorrow on more gpu, might even found an nvidia agp i can try. Again sorry for breaking this.

Thanks for the fix up. I was wondering if this purposed change could be done instead. Its the same as your fix up except that you pass in the ttm_buffer_object for tt_create. The reason is I really like to have the ability to have more than one back end to grab a bunch pf pages from. Currently its AGP or something else. On some of my embedded devices the AGP space is not very large and can be exhausted very quickly. Those embedded devices have DMA engines I could use instead.

This change violate the layer separation ttm wish to preserve see : http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg16443.html

You can achieve what you want by either adding a new domain so you would have system, vram, agp, pcidma and object can be bound to one and only one. Or you can hack your own agp ttm backend that could bind bo to agp or pci or both at the same time depending on what you want to achieve.

Cheers, Jerome

...

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index 8cf4a48..58ad508 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -349,10 +349,11 @@ nouveau_bo_wr32(struct nouveau_bo *nvbo, unsigned index, u32 val) }

static struct ttm_tt * -nouveau_ttm_tt_create(struct ttm_bo_device *bdev, +nouveau_ttm_tt_create(struct ttm_buffer_object *bo, unsigned long size, uint32_t page_flags, struct page *dummy_read_page) {

struct ttm_bo_device *bdev = bo->bdev;

struct drm_nouveau_private *dev_priv = nouveau_bdev(bdev); struct drm_device *dev = dev_priv->dev;

@@ -1066,6 +1067,12 @@ nouveau_ttm_tt_populate(struct ttm_tt *ttm) dev_priv = nouveau_bdev(ttm->bdev); dev = dev_priv->dev;

+#if __OS_HAS_AGP

if (dev_priv->gart_info.type == NOUVEAU_GART_AGP) {

return ttm_agp_tt_populate(ttm);

}

+#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { return ttm_dma_populate((void *)ttm, dev->dev); @@ -1105,6 +1112,13 @@ nouveau_ttm_tt_unpopulate(struct ttm_tt *ttm) dev_priv = nouveau_bdev(ttm->bdev); dev = dev_priv->dev;

+#if __OS_HAS_AGP

if (dev_priv->gart_info.type == NOUVEAU_GART_AGP) {

ttm_agp_tt_unpopulate(ttm);

return;

}

+#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { ttm_dma_unpopulate((void *)ttm, dev->dev); diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index b0ebaf8..655f8e9 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -549,10 +549,12 @@ static struct ttm_backend_func radeon_backend_func = { .destroy = &radeon_ttm_backend_destroy, };

-struct ttm_tt *radeon_ttm_tt_create(struct ttm_bo_device *bdev,

unsigned long size, uint32_t page_flags,

+struct ttm_tt *radeon_ttm_tt_create(struct ttm_buffer_object *bo,

unsigned long size,

uint32_t page_flags,

struct page *dummy_read_page) {

struct ttm_bo_device *bdev = bo->bdev;

struct radeon_device *rdev; struct radeon_ttm_tt *gtt;

@@ -588,6 +590,11 @@ static int radeon_ttm_tt_populate(struct ttm_tt *ttm) return 0;

rdev = radeon_get_rdev(ttm->bdev); +#if __OS_HAS_AGP

if (rdev->flags & RADEON_IS_AGP) {

return ttm_agp_tt_populate(ttm);

}

+#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { @@ -624,6 +631,12 @@ static void radeon_ttm_tt_unpopulate(struct ttm_tt *ttm) unsigned i;

rdev = radeon_get_rdev(ttm->bdev); +#if __OS_HAS_AGP

if (rdev->flags & RADEON_IS_AGP) {

ttm_agp_tt_unpopulate(ttm);

return;

}

+#endif

#ifdef CONFIG_SWIOTLB if (swiotlb_nr_tbl()) { diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c b/drivers/gpu/drm/ttm/ttm_agp_backend.c index 14ebd36..747c141 100644 --- a/drivers/gpu/drm/ttm/ttm_agp_backend.c +++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c @@ -31,6 +31,7 @@

#include "ttm/ttm_module.h" #include "ttm/ttm_bo_driver.h" +#include "ttm/ttm_page_alloc.h" #ifdef TTM_HAS_AGP #include "ttm/ttm_placement.h" #include <linux/agp_backend.h> @@ -97,6 +98,7 @@ static void ttm_agp_destroy(struct ttm_tt *ttm)

if (agp_be->mem) ttm_agp_unbind(ttm);

ttm_tt_fini(ttm);

kfree(agp_be); }

@@ -129,4 +131,19 @@ struct ttm_tt *ttm_agp_tt_create(struct ttm_bo_device *bdev, } EXPORT_SYMBOL(ttm_agp_tt_create);

+int ttm_agp_tt_populate(struct ttm_tt *ttm) +{

if (ttm->state != tt_unpopulated)

return 0;

return ttm_pool_populate(ttm);

+} +EXPORT_SYMBOL(ttm_agp_tt_populate);

+void ttm_agp_tt_unpopulate(struct ttm_tt *ttm) +{

ttm_pool_unpopulate(ttm);

+} +EXPORT_SYMBOL(ttm_agp_tt_unpopulate);

#endif diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 2f0eab6..1adc149 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -338,7 +338,7 @@ static int ttm_bo_add_ttm(struct ttm_buffer_object *bo, bool zero_alloc) if (zero_alloc) page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; case ttm_bo_type_kernel:

bo->ttm = bdev->driver->ttm_tt_create(bdev, bo->num_pages << PAGE_SHIFT,

bo->ttm = bdev->driver->ttm_tt_create(bo, bo->num_pages << PAGE_SHIFT,

page_flags, glob->dummy_read_page); if (unlikely(bo->ttm == NULL)) ret = -ENOMEM; diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index 58e1fa1..2f75d20 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -191,6 +191,7 @@ int ttm_tt_init(struct ttm_tt *ttm, struct ttm_bo_device *bdev, ttm->page_flags = page_flags; ttm->dummy_read_page = dummy_read_page; ttm->state = tt_unpopulated;

ttm->swap_storage = NULL;

ttm_tt_alloc_page_directory(ttm); if (!ttm->pages) { @@ -222,6 +223,7 @@ int ttm_dma_tt_init(struct ttm_dma_tt *ttm_dma, struct ttm_bo_device *bdev, ttm->page_flags = page_flags; ttm->dummy_read_page = dummy_read_page; ttm->state = tt_unpopulated;

ttm->swap_storage = NULL;

INIT_LIST_HEAD(&ttm_dma->pages_list); ttm_dma_tt_alloc_page_directory(ttm_dma); diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 2be8891..6b67c37 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -318,7 +318,7 @@ struct ttm_bo_driver { /** * ttm_tt_create *

* @bdev: pointer to a struct ttm_bo_device:

* @bo: pointer to a struct ttm_bo:

* @size: Size of the data needed backing. * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags. * @dummy_read_page: See struct ttm_bo_device. @@ -328,7 +328,7 @@ struct ttm_bo_driver { * Returns: * NULL: Out of memory. */

struct ttm_tt *(*ttm_tt_create)(struct ttm_bo_device *bdev,

struct ttm_tt *(*ttm_tt_create)(struct ttm_buffer_object *bo,

unsigned long size, uint32_t page_flags, struct page *dummy_read_page); @@ -1030,6 +1030,8 @@ extern struct ttm_tt *ttm_agp_tt_create(struct ttm_bo_device *bdev, struct agp_bridge_data *bridge, unsigned long size, uint32_t page_flags, struct page *dummy_read_page); +int ttm_agp_tt_populate(struct ttm_tt *ttm); +void ttm_agp_tt_unpopulate(struct ttm_tt *ttm); #endif

#endif

James Simmons

4:04 p.m.

...

...
...
Attached is patch to fix this, so sorry about that, i must have lost my agp change along the way when working on the patchset. This patch is not extensively tested, i will do more testing tomorrow on more gpu, might even found an nvidia agp i can try. Again sorry for breaking this.

Thanks for the fix up. I was wondering if this purposed change could be done instead. Its the same as your fix up except that you pass in the ttm_buffer_object for tt_create. The reason is I really like to have the ability to have more than one back end to grab a bunch pf pages from. Currently its AGP or something else. On some of my embedded devices the AGP space is not very large and can be exhausted very quickly. Those embedded devices have DMA engines I could use instead.

This change violate the layer separation ttm wish to preserve see : http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg16443.html

You can achieve what you want by either adding a new domain so you would have system, vram, agp, pcidma and object can be bound to one and only one. Or you can hack your own agp ttm backend that could bind bo to agp or pci or both at the same time depending on what you want to achieve.

The question is how does one know which domain you want in tt_create. Currenty drivers are using there dev_priv but if you have have more than one option available how does one choose? Would you be okay with passing in a domain flag?

Jerome Glisse

4:35 p.m.

On Wed, Jan 4, 2012 at 11:04 AM, James Simmons jsimmons@infradead.org wrote:

...

...
...
...
Attached is patch to fix this, so sorry about that, i must have lost my agp change along the way when working on the patchset. This patch is not extensively tested, i will do more testing tomorrow on more gpu, might even found an nvidia agp i can try. Again sorry for breaking this.

Thanks for the fix up. I was wondering if this purposed change could be done instead. Its the same as your fix up except that you pass in the ttm_buffer_object for tt_create. The reason is I really like to have the ability to have more than one back end to grab a bunch pf pages from. Currently its AGP or something else. On some of my embedded devices the AGP space is not very large and can be exhausted very quickly. Those embedded devices have DMA engines I could use instead.

This change violate the layer separation ttm wish to preserve see : http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg16443.html

You can achieve what you want by either adding a new domain so you would have system, vram, agp, pcidma and object can be bound to one and only one. Or you can hack your own agp ttm backend that could bind bo to agp or pci or both at the same time depending on what you want to achieve.

The question is how does one know which domain you want in tt_create. Currenty drivers are using there dev_priv but if you have have more than one option available how does one choose? Would you be okay with passing in a domain flag?

Well i agree that something would be usefull there so the driver know which bind/unbind function it should use. Thomas i would prefer passing the bo to the tt_create callback but a flag is the minimum we need.

Cheers, Jerome

James Simmons

6 Jan 6 Jan

3:51 p.m.

...

...
...
You can achieve what you want by either adding a new domain so you would have system, vram, agp, pcidma and object can be bound to one and only one. Or you can hack your own agp ttm backend that could bind bo to agp or pci or both at the same time depending on what you want to achieve.

The question is how does one know which domain you want in tt_create. Currenty drivers are using there dev_priv but if you have have more than one option available how does one choose? Would you be okay with passing in a domain flag?

Well i agree that something would be usefull there so the driver know which bind/unbind function it should use. Thomas i would prefer passing the bo to the tt_create callback but a flag is the minimum we need.

We can discuss this after the merge widow. Jerome your patch does fix a regression whereas my proposal is a enhancement.

Thomas Hellstrom

9 Jan 9 Jan

9:07 a.m.

On 01/06/2012 04:51 PM, James Simmons wrote:

...

...
...
...
You can achieve what you want by either adding a new domain so you would have system, vram, agp, pcidma and object can be bound to one and only one. Or you can hack your own agp ttm backend that could bind bo to agp or pci or both at the same time depending on what you want to achieve.

The question is how does one know which domain you want in tt_create. Currenty drivers are using there dev_priv but if you have have more than one option available how does one choose? Would you be okay with passing in a domain flag?

Well i agree that something would be usefull there so the driver know which bind/unbind function it should use. Thomas i would prefer passing the bo to the tt_create callback but a flag is the minimum we need.

We can discuss this after the merge widow. Jerome your patch does fix a regression whereas my proposal is a enhancement.

..Back from parental leave.

I'm not 100% sure I understand the problem correctly, but I assume the problem is that you receive a "bind" request for pages allocated out of the wrong DMA pool? Is that correct?

In that case we need to make both the following happen: 1) The backend should be required to supply a fallback that can bind anyway by allocating the correct page type and copy. 2) I'm OK with providing a hint to tt_create. IIRC we're passing bo::mem to tt_bind. Would it be ok to pass the same to tt_create?

/Thomas

Jerome Glisse

3:21 p.m.

On Mon, Jan 09, 2012 at 10:07:02AM +0100, Thomas Hellstrom wrote:

...

On 01/06/2012 04:51 PM, James Simmons wrote:

...
...
...
...
You can achieve what you want by either adding a new domain so you would have system, vram, agp, pcidma and object can be bound to one and only one. Or you can hack your own agp ttm backend that could bind bo to agp or pci or both at the same time depending on what you want to achieve.

The question is how does one know which domain you want in tt_create. Currenty drivers are using there dev_priv but if you have have more than one option available how does one choose? Would you be okay with passing in a domain flag?

Well i agree that something would be usefull there so the driver know which bind/unbind function it should use. Thomas i would prefer passing the bo to the tt_create callback but a flag is the minimum we need.

We can discuss this after the merge widow. Jerome your patch does fix a regression whereas my proposal is a enhancement.

..Back from parental leave.

I'm not 100% sure I understand the problem correctly, but I assume the problem is that you receive a "bind" request for pages allocated out of the wrong DMA pool? Is that correct?

In that case we need to make both the following happen:

The backend should be required to supply a fallback that can bind

anyway by allocating the correct page type and copy. 2) I'm OK with providing a hint to tt_create. IIRC we're passing bo::mem to tt_bind. Would it be ok to pass the same to tt_create?

/Thomas

Issue here is that different backend (AGP or PCI DMA or someother communication way btw GPU and system memory) needs to use different page allocator/backend. Thus tt_create need either to know about the bo or at least about bo:mem.

Cheers, Jerome

Konrad Rzeszutek Wilk

4:46 p.m.

On Mon, Jan 09, 2012 at 10:07:02AM +0100, Thomas Hellstrom wrote:

...

On 01/06/2012 04:51 PM, James Simmons wrote:

...
...
...
...
You can achieve what you want by either adding a new domain so you would have system, vram, agp, pcidma and object can be bound to one and only one. Or you can hack your own agp ttm backend that could bind bo to agp or pci or both at the same time depending on what you want to achieve.

The question is how does one know which domain you want in tt_create. Currenty drivers are using there dev_priv but if you have have more than one option available how does one choose? Would you be okay with passing in a domain flag?

Well i agree that something would be usefull there so the driver know which bind/unbind function it should use. Thomas i would prefer passing the bo to the tt_create callback but a flag is the minimum we need.

We can discuss this after the merge widow. Jerome your patch does fix a regression whereas my proposal is a enhancement.

..Back from parental leave.

Congratulations!

...

I'm not 100% sure I understand the problem correctly, but I assume the problem is that you receive a "bind" request for pages allocated out of the wrong DMA pool? Is that correct?

In that case we need to make both the following happen:

The backend should be required to supply a fallback that can bind

anyway by allocating the correct page type and copy. 2) I'm OK with providing a hint to tt_create. IIRC we're passing bo::mem to tt_bind. Would it be ok to pass the same to tt_create?

/Thomas

dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel

4859

Age (days ago)

4877

Last active (days ago)

dri-devel@lists.freedesktop.org

14 comments

5 participants

tags (0)

participants (5)

Alex Deucher
James Simmons
Jerome Glisse
Konrad Rzeszutek Wilk
Thomas Hellstrom