Hello,
For the past few months I have been working at Linaro on how to do Secure Data Path (SDP). I have tried and implemented several things, but I keep running into architecture issues, so I would like your help to solve the problem.
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
One typical use case for SDP is video playback, which involves these elements: decrypt -> video decoder -> transform -> display
The decrypt output contains encoded data that needs to be secured: only the hardware video decoder should be able to read it. The hardware decoder output (decoded frames) can only be read by the hardware transform, and only the hardware display can read the transform output. The video decoder and transform are v4l2 devices; the display is a drm/kms device.
To be able to configure the firewall, SDP needs to know when each device needs access to the memory (physical address and size) and in which direction (read or write). SDP also needs a way to transfer the information that a memory region is secure between the different frameworks and devices. Obviously I also want to limit the impact of SDP on userland and the kernel: for example, not changing how buffers are allocated or how graphs/pipelines are set up.
Looking at all those constraints, I have tried to use dma_buf: it is a cross-framework, cross-process way to share buffers and, with the dma_buf_map_attachment() and dma_buf_unmap_attachment() functions, it has an API that provides the information (device, memory, direction) needed to configure the firewall.
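For reference, the standard importer-side flow, which already carries all three pieces of information, looks like this (plain dma-buf API, nothing SDP-specific):

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/err.h>

/* Importer-side flow: at map time the dma-buf core knows the importing
 * device (attach->dev), the memory (the returned sg_table) and the DMA
 * direction - exactly the tuple the firewall needs. */
static int sdp_example_import(struct device *dev, struct dma_buf *buf)
{
        struct dma_buf_attachment *attach;
        struct sg_table *sgt;

        attach = dma_buf_attach(buf, dev);
        if (IS_ERR(attach))
                return PTR_ERR(attach);

        sgt = dma_buf_map_attachment(attach, DMA_FROM_DEVICE);
        if (IS_ERR(sgt)) {
                dma_buf_detach(buf, attach);
                return PTR_ERR(sgt);
        }

        /* program the device with the addresses in sgt ... */

        dma_buf_unmap_attachment(attach, sgt, DMA_FROM_DEVICE);
        dma_buf_detach(buf, attach);
        return 0;
}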
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions;
- overwrite the dma_buf exporter ops to get a kind of nested call which allows configuring the firewall without touching the exporter code.
Both solutions have architecture issues: the first adds "metadata" to the dma_buf structure plus calls into a specific SDP environment (OP-TEE + trusted application), and the second obviously breaks the dma_buf coding rules by overwriting the exporter ops.
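To make the first approach concrete, here is roughly what the hook looks like; the sdp_firewall_* helpers and the secure notion are hypothetical names, not an existing API. I show it as wrappers so the sketch stands alone, but in the hack the calls lived directly in dma_buf_{map/unmap}_attachment():

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/err.h>
#include <linux/scatterlist.h>

/* Hypothetical firewall helpers, backed by OP-TEE + a trusted application. */
int sdp_firewall_grant(struct device *dev, struct sg_table *sgt,
                       enum dma_data_direction dir);
void sdp_firewall_revoke(struct device *dev, struct sg_table *sgt);

/* Grant the device access to the secure buffer right after the core map. */
static struct sg_table *sdp_map_attachment(struct dma_buf_attachment *attach,
                                           enum dma_data_direction dir)
{
        struct sg_table *sgt = dma_buf_map_attachment(attach, dir);
        int ret;

        if (IS_ERR(sgt))
                return sgt;

        ret = sdp_firewall_grant(attach->dev, sgt, dir);
        if (ret) {
                dma_buf_unmap_attachment(attach, sgt, dir);
                return ERR_PTR(ret);
        }
        return sgt;
}

/* Revoke access before the buffer is unmapped from the device. */
static void sdp_unmap_attachment(struct dma_buf_attachment *attach,
                                 struct sg_table *sgt,
                                 enum dma_data_direction dir)
{
        sdp_firewall_revoke(attach->dev, sgt);
        dma_buf_unmap_attachment(attach, sgt, dir);
}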
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Using an "external" allocator (like ION) solves this problem, but I think in that case we will hit the same problem as with the "constraint aware allocator", which has never been accepted.
The goal of this RFC is to share/test ideas to solve this problem, which at least impacts v4l2 and drm/kms. Any suggestions/inputs are welcome!
Regards, Benjamin
On Tue, May 05, 2015 at 05:39:57PM +0200, Benjamin Gaignard wrote:
For the past few months I have been working at Linaro on how to do Secure Data Path (SDP). I have tried and implemented several things, but I keep running into architecture issues, so I would like your help to solve the problem.
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
One typical use case for SDP is video playback, which involves these elements: decrypt -> video decoder -> transform -> display
Sounds like a good enough reason not to implement it ever.
On Tuesday 05 May 2015 09:27:52 Christoph Hellwig wrote:
On Tue, May 05, 2015 at 05:39:57PM +0200, Benjamin Gaignard wrote:
For the past few months I have been working at Linaro on how to do Secure Data Path (SDP). I have tried and implemented several things, but I keep running into architecture issues, so I would like your help to solve the problem.
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
One typical use case for SDP is video playback, which involves these elements: decrypt -> video decoder -> transform -> display
Sounds like a good enough reason not to implement it ever.
The irony of it is to post an RFC on the day before http://www.defectivebydesign.org/dayagainstdrm/ :-)
On Wed, May 06, 2015 at 03:50:13AM +0300, Laurent Pinchart wrote:
On Tuesday 05 May 2015 09:27:52 Christoph Hellwig wrote:
On Tue, May 05, 2015 at 05:39:57PM +0200, Benjamin Gaignard wrote:
For the past few months I have been working at Linaro on how to do Secure Data Path (SDP). I have tried and implemented several things, but I keep running into architecture issues, so I would like your help to solve the problem.
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
One typical use case for SDP is video playback, which involves these elements: decrypt -> video decoder -> transform -> display
Sounds like a good enough reason not to implement it ever.
The irony of it is to post an RFC on the day before http://www.defectivebydesign.org/dayagainstdrm/ :-)
Just for the record: even though I disagree with the design & threat model for secure memory, I don't think we should outright refuse to merge patches. Assuming it comes with a sane design and no blob bits, I'd be very much willing to merge support for i915. Unfortunately Intel isn't willing to publish the specs for any of the content protection stuff, at least right now. -Daniel
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
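As a rough illustration of how small that check could be on the dma-buf side (the exporter-private struct and its secure flag are made-up names, not an actual patch):

#include <linux/dma-buf.h>
#include <linux/mm.h>

/* Hypothetical exporter-private bookkeeping, names for illustration only. */
struct secure_buffer {
        phys_addr_t paddr;
        bool secure;
};

/* Exporter mmap op: never hand the CPU a mapping of a secure buffer. */
static int secure_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
{
        struct secure_buffer *buf = dmabuf->priv;

        if (buf->secure)
                return -EPERM;  /* no CPU view of protected contents */

        return remap_pfn_range(vma, vma->vm_start, buf->paddr >> PAGE_SHIFT,
                               vma->vm_end - vma->vm_start,
                               vma->vm_page_prot);
}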
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Alan
On Tue, May 05, 2015 at 05:54:05PM +0100, One Thousand Gnomes wrote:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Iirc most of the dma api stuff gets unhappy when memory isn't struct page backed. In i915 we do use sg tables everywhere though (even for memory not backed by struct page, e.g. the "stolen" range the bios prereserves), but we fill those out manually.
A possible generic design I see is to have a secure memory allocator device which does nothing else but hand out dma-bufs. With that we can hide the platform-specific allocation methods in there (some need to allocate from carveouts, others just need to mark the pages specifically). Also dma-buf has explicit methods for cpu access, which are allowed to fail. And using the dma-buf attach tracking we can also reject dma to devices which cannot access the secure memory. Given all that I think going through the dma-buf interface but with a special-purpose allocator seems to fit.
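A sketch of what such an exporter could look like, against the dma_buf_ops of that time; the sdp_device_can_access_secure() platform hook is a hypothetical name:

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/device.h>
#include <linux/errno.h>

/* Hypothetical platform hook: does this device sit behind the firewall? */
bool sdp_device_can_access_secure(struct device *dev);

/* Reject attach for devices that cannot access protected memory. */
static int secure_attach(struct dma_buf *dmabuf, struct device *dev,
                         struct dma_buf_attachment *attach)
{
        if (!sdp_device_can_access_secure(dev))
                return -EACCES;
        return 0;
}

/* CPU access to secure memory is simply refused. */
static int secure_begin_cpu_access(struct dma_buf *dmabuf, size_t start,
                                   size_t len, enum dma_data_direction dir)
{
        return -EPERM;
}

static const struct dma_buf_ops secure_allocator_ops = {
        .attach           = secure_attach,
        .begin_cpu_access = secure_begin_cpu_access,
        /* plus the usual map_dma_buf/unmap_dma_buf/release/kmap/mmap ops */
};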
I'm not sure whether a special iommu is a good idea otoh: I'd expect that for most devices the driver would need to decide about which iommu to pick (or maybe keep track of some special flags for an extended dma_map interface). At least looking at gpu drivers using iommus would require special code, whereas fully hiding all this behind the dma-buf interface should fit in much better. -Daniel
On 05/06/15 10:35, Daniel Vetter wrote:
On Tue, May 05, 2015 at 05:54:05PM +0100, One Thousand Gnomes wrote:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Iirc most of the dma api stuff gets unhappy when memory isn't struct page backed. In i915 we do use sg tables everywhere though (even for memory not backed by struct page, e.g. the "stolen" range the bios prereserves), but we fill those out manually.
A possible generic design I see is to have a secure memory allocator device which does nothing else but hand out dma-bufs. With that we can hide the platform-specific allocation methods in there (some need to allocate from carveouts, others just need to mark the pages specifically). Also dma-buf has explicit methods for cpu access, which are allowed to fail.
BTW, v4l2 currently doesn't use those cpu access calls. It should, though, and I have patches for that. However, I haven't had time to clean them up and post them. I remember that I had problems with one or two drivers as well, but I can't remember if I solved those problems or not.
I would expect that in order to implement SDP you need to get the cpu access part sorted as well, so this should help.
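For reference, the importer-side bracketing looks roughly like this (signatures as in the kernels of that time); for SDP, the begin call failing is exactly how a secure exporter would say "no CPU access to this buffer":

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>

/* CPU access to a dma-buf must be bracketed; a secure exporter can
 * simply fail the begin call (e.g. with -EPERM). */
static int cpu_read_buffer(struct dma_buf *buf, size_t len)
{
        int ret;

        ret = dma_buf_begin_cpu_access(buf, 0, len, DMA_FROM_DEVICE);
        if (ret)
                return ret;

        /* kmap/vmap and touch the data here */

        dma_buf_end_cpu_access(buf, 0, len, DMA_FROM_DEVICE);
        return 0;
}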
My latest tree for this work is here:
http://git.linuxtv.org/cgit.cgi/hverkuil/media_tree.git/log/?h=vb2-prep5
I tried to rebase but that is a bit more involved than I have time for right now. If someone really wants this then let me know and I can rebase it for you.
And using the dma-buf attach tracking we can also reject dma to devices which cannot access the secure memory. Given all that I think going through the dma-buf interface but with a special-purpose allocator seems to fit.
I agree with that. I think we discussed, when dma-buf was designed, that it should be possible to use it to represent opaque memory.
Regards,
Hans
I'm not sure whether a special iommu is a good idea otoh: I'd expect that for most devices the driver would need to decide about which iommu to pick (or maybe keep track of some special flags for an extended dma_map interface). At least looking at gpu drivers using iommus would require special code, whereas fully hiding all this behind the dma-buf interface should fit in much better. -Daniel
On Wed, May 06, 2015 at 10:35:52AM +0200, Daniel Vetter wrote:
On Tue, May 05, 2015 at 05:54:05PM +0100, One Thousand Gnomes wrote:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Iirc most of the dma api stuff gets unhappy when memory isn't struct page backed. In i915 we do use sg tables everywhere though (even for memory not backed by struct page, e.g. the "stolen" range the bios prereserves), but we fill those out manually.
A possible generic design I see is to have a secure memory allocator device which does nothing else but hand out dma-bufs.
Are you suggesting a device file with a custom set of IOCTLs for this? Or some in-kernel API that would perform the secure allocations? I suspect the former would be better suited, because it gives applications the control over whether they need secure buffers or not. The latter would require custom extensions in every driver to make them allocate from a secure memory pool.
For my understanding, would the secure memory allocator be responsible for setting up the permissions to access the memory at attachment time?
With that we can
hide the platform-specific allocation methods in there (some need to allocate from carveouts, others just need to mark the pages specifically). Also dma-buf has explicit methods for cpu access, which are allowed to fail. And using the dma-buf attach tracking we can also reject dma to devices which cannot access the secure memory. Given all that I think going through the dma-buf interface but with a special-purpose allocator seems to fit.
That sounds like a flexible enough design to me. I think it's something that we could easily implement on Tegra. The memory controller on Tegra implements this using a special video-protect aperture (VPR) and memory clients can individually be allowed access to this aperture. That means VPR is a carveout that is typically set up by some secure firmware, and that in turn, as I understand it, would imply we won't have struct page pointers for the backing memory in this case either.
I suspect that it should work out fine to not require struct page backed memory for this infrastructure since by definition the CPU won't be allowed to access it anyway.
I'm not sure whether a special iommu is a good idea otoh: I'd expect that for most devices the driver would need to decide about which iommu to pick (or maybe keep track of some special flags for an extended dma_map interface). At least looking at gpu drivers using iommus would require special code, whereas fully hiding all this behind the dma-buf interface should fit in much better.
As I understand it, even though the VPR on Tegra is a carveout it still is subject to IOMMU translation. So if IOMMU translation is enabled for a device (say the display controller), then all accesses to memory will be translated, whether they are to VPR or non-protected memory. Again I think this should work out fine with a special secure allocator. If the SG tables are filled in properly drivers should be able to cope.
It's possible that existing IOMMU drivers would require modification to make this work, though. For example, the default_iommu_map_sg() function currently uses sg_page(), so that wouldn't be able to map secure buffers to I/O virtual addresses. That said, drivers could reimplement this on top of iommu_map(), though that may imply suboptimal performance on the mapping operation.
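A minimal sketch of that reimplementation, assuming the secure carveout is physically contiguous and its base/size come from firmware or DT (function name made up):

#include <linux/iommu.h>

/* Map a physically contiguous secure carveout region into a device's
 * IOVA space directly with iommu_map(), bypassing the sg_page()-based
 * default_iommu_map_sg() path. iommu_map() splits the range into the
 * page sizes the IOMMU supports. */
static int map_secure_carveout(struct iommu_domain *domain,
                               unsigned long iova,
                               phys_addr_t carveout_base, size_t size)
{
        return iommu_map(domain, iova, carveout_base, size,
                         IOMMU_READ | IOMMU_WRITE);
}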
Similarly some backing implementations of the DMA API rely on struct page pointers being present. But it seems more like you wouldn't want to use the DMA API at all if you want to use this kind of protected memory.
Thierry
On Wed, May 06, 2015 at 11:19:21AM +0200, Thierry Reding wrote:
On Wed, May 06, 2015 at 10:35:52AM +0200, Daniel Vetter wrote:
On Tue, May 05, 2015 at 05:54:05PM +0100, One Thousand Gnomes wrote:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Iirc most of the dma api stuff gets unhappy when memory isn't struct page backed. In i915 we do use sg tables everywhere though (even for memory not backed by struct page, e.g. the "stolen" range the bios prereserves), but we fill those out manually.
A possible generic design I see is to have a secure memory allocator device which does nothing else but hand out dma-bufs.
Are you suggesting a device file with a custom set of IOCTLs for this? Or some in-kernel API that would perform the secure allocations? I suspect the former would be better suited, because it gives applications the control over whether they need secure buffers or not. The latter would require custom extensions in every driver to make them allocate from a secure memory pool.
Yes the idea would be a special-purpose allocator thing like ion. Might even want that to be a syscall to do it properly.
For my understanding, would the secure memory allocator be responsible for setting up the permissions to access the memory at attachment time?
Well not permission checks, but hw capability checks. The allocator should have platform knowledge about which devices can access such secure memory (since some will definitely not be able to do that for fear of just plain sending it out to the world).
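On DT-based platforms that knowledge could be expressed as simply as a per-device property; a sketch, with the property name and hook name made up for illustration:

#include <linux/device.h>
#include <linux/of.h>

/* Hypothetical platform hook used by the secure allocator at attach time:
 * only devices explicitly marked in DT may see secure buffers. */
bool sdp_device_can_access_secure(struct device *dev)
{
        return dev->of_node &&
               of_property_read_bool(dev->of_node, "secure-memory-access");
}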
With that we can
hide the platform-specific allocation methods in there (some need to allocate from carveouts, others just need to mark the pages specifically). Also dma-buf has explicit methods for cpu access, which are allowed to fail. And using the dma-buf attach tracking we can also reject dma to devices which cannot access the secure memory. Given all that I think going through the dma-buf interface but with a special-purpose allocator seems to fit.
That sounds like a flexible enough design to me. I think it's something that we could easily implement on Tegra. The memory controller on Tegra implements this using a special video-protect aperture (VPR) and memory clients can individually be allowed access to this aperture. That means VPR is a carveout that is typically set up by some secure firmware, and that in turn, as I understand it, would imply we won't have struct page pointers for the backing memory in this case either.
I suspect that it should work out fine to not require struct page backed memory for this infrastructure since by definition the CPU won't be allowed to access it anyway.
At least in the cases I know (carveout on i915, apparently also on tegra) there's no way we can have struct page around. Which means we can't rely upon its presence in the generic parts.
I'm not sure whether a special iommu is a good idea otoh: I'd expect that for most devices the driver would need to decide about which iommu to pick (or maybe keep track of some special flags for an extended dma_map interface). At least looking at gpu drivers using iommus would require special code, whereas fully hiding all this behind the dma-buf interface should fit in much better.
As I understand it, even though the VPR on Tegra is a carveout it still is subject to IOMMU translation. So if IOMMU translation is enabled for a device (say the display controller), then all accesses to memory will be translated, whether they are to VPR or non-protected memory. Again I think this should work out fine with a special secure allocator. If the SG tables are filled in properly drivers should be able to cope.
It's possible that existing IOMMU drivers would require modification to make this work, though. For example, the default_iommu_map_sg() function currently uses sg_page(), so that wouldn't be able to map secure buffers to I/O virtual addresses. That said, drivers could reimplement this on top of iommu_map(), though that may imply suboptimal performance on the mapping operation.
Similarly some backing implementations of the DMA API rely on struct page pointers being present. But it seems more like you wouldn't want to use the DMA API at all if you want to use this kind of protected memory.
Hm yeah if you still need to go through the iommu even for the secure carveout then the lack of struct page is annoying. Otoh you have that problem no matter what.
Another issue is that at least for some consumer devices we need to set special bits to make sure the hw goes into secure mode (and makes sure nothing ever leaves that box). I think in some cases (video encode) that knowledge would even need to be accessible to userspace. An in-kernel dma_buf_is_secure_memory plus an ioctl might be needed for that. -Daniel
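Such a helper could be as trivial as checking which exporter the buffer came from; a sketch, assuming a dedicated secure-allocator exporter with its own dma_buf_ops:

#include <linux/dma-buf.h>

extern const struct dma_buf_ops secure_allocator_ops; /* the secure exporter */

/* Only buffers handed out by the secure allocator can be secure. */
static bool dma_buf_is_secure(struct dma_buf *dmabuf)
{
        return dmabuf->ops == &secure_allocator_ops;
}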
On Wed, May 06, 2015 at 03:15:32PM +0200, Daniel Vetter wrote:
On Wed, May 06, 2015 at 11:19:21AM +0200, Thierry Reding wrote:
On Wed, May 06, 2015 at 10:35:52AM +0200, Daniel Vetter wrote:
On Tue, May 05, 2015 at 05:54:05PM +0100, One Thousand Gnomes wrote:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Iirc most of the dma api stuff gets unhappy when memory isn't struct page backed. In i915 we do use sg tables everywhere though (even for memory not backed by struct page, e.g. the "stolen" range the bios prereserves), but we fill those out manually.
A possible generic design I see is to have a secure memory allocator device which does nothing else but hand out dma-bufs.
Are you suggesting a device file with a custom set of IOCTLs for this? Or some in-kernel API that would perform the secure allocations? I suspect the former would be better suited, because it gives applications the control over whether they need secure buffers or not. The latter would require custom extensions in every driver to make them allocate from a secure memory pool.
Yes the idea would be a special-purpose allocator thing like ion. Might even want that to be a syscall to do it properly.
Would you care to elaborate why a syscall would be more proper? Not that I'm objecting to it, just for my education.
For my understanding, would the secure memory allocator be responsible for setting up the permissions to access the memory at attachment time?
Well not permission checks, but hw capability checks. The allocator should have platform knowledge about which devices can access such secure memory (since some will definitely not be able to do that for fear of just plain sending it out to the world).
At least on Tegra there are controls to grant access to the VPR to a given device, so I'd expect some driver needing to frob some registers before the device can access a secure buffer.
Thierry
On Thu, May 07, 2015 at 03:22:20PM +0200, Thierry Reding wrote:
On Wed, May 06, 2015 at 03:15:32PM +0200, Daniel Vetter wrote:
Yes the idea would be a special-purpose allocator thing like ion. Might even want that to be a syscall to do it properly.
Would you care to elaborate why a syscall would be more proper? Not that I'm objecting to it, just for my education.
It seems to be the theme with someone proposing a global /dev node for a few system-wide ioctls, and then reviewers asking to make a proper syscall out of it. E.g. kdbus, but I have a vague memory of this happening a lot. -Daniel
On Thu, 7 May 2015 15:52:12 +0200 Daniel Vetter daniel@ffwll.ch wrote:
On Thu, May 07, 2015 at 03:22:20PM +0200, Thierry Reding wrote:
On Wed, May 06, 2015 at 03:15:32PM +0200, Daniel Vetter wrote:
Yes the idea would be a special-purpose allocator thing like ion. Might even want that to be a syscall to do it properly.
Would you care to elaborate why a syscall would be more proper? Not that I'm objecting to it, just for my education.
It seems to be the theme with someone proposing a global /dev node for a few system-wide ioctls, and then reviewers asking to make a proper syscall out of it. E.g. kdbus, but I have a vague memory of this happening a lot.
kdbus is not necessarily an advert for how to do anything 8)
If it can be user allocated then it really ought to be one or more device nodes IMHO, because you want the resource to be passable between users, you need a handle to it and you want it to go away nicely on last close. In the cases where the CPU is allowed to or expected to have write only access you also might want an mmap of it.
I guess the same kind of logic as with GEM (except preferably without the DoS security holes) applies as to why its useful to have handles to the DMA buffers.
Alan
On Thu, May 07, 2015 at 05:40:03PM +0100, One Thousand Gnomes wrote:
On Thu, 7 May 2015 15:52:12 +0200 Daniel Vetter daniel@ffwll.ch wrote:
On Thu, May 07, 2015 at 03:22:20PM +0200, Thierry Reding wrote:
On Wed, May 06, 2015 at 03:15:32PM +0200, Daniel Vetter wrote:
Yes the idea would be a special-purpose allocator thing like ion. Might even want that to be a syscall to do it properly.
Would you care to elaborate why a syscall would be more proper? Not that I'm objecting to it, just for my education.
It seems to be the theme with someone proposing a global /dev node for a few system-wide ioctls, and then reviewers asking to make a proper syscall out of it. E.g. kdbus, but I have a vague memory of this happening a lot.
kdbus is not necessarily an advert for how to do anything 8)
If it can be user allocated then it really ought to be one or more device nodes IMHO, because you want the resource to be passable between users, you need a handle to it and you want it to go away nicely on last close. In the cases where the CPU is allowed to or expected to have write only access you also might want an mmap of it.
dma-buf user handles are fds, which means anything allocated can be passed around nicely already. The question really is whether we'll have one ioctl on top of a special dev node or a syscall. I thought that in these cases where the dev node is only ever used to allocate the real thing, a syscall is the preferred way to go.
I guess the same kind of logic as with GEM (except preferably without the DoS security holes) applies as to why its useful to have handles to the DMA buffers.
We have handles (well file descriptors) to dma-bufs already, I'm a bit confused what you mean? -Daniel
Am 08.05.2015 um 10:37 schrieb Daniel Vetter:
dma-buf user handles are fds, which means anything allocated can be passed around nicely already. The question really is whether we'll have one ioctl on top of a special dev node or a syscall. I thought that in these cases where the dev node is only ever used to allocate the real thing, a syscall is the preferred way to go.
I'd generally prefer a /dev node instead of a syscall, as it can be dynamically provided (loaded as a module, etc.), which in turn offers finer control (e.g. for containers, etc.). One could easily replace it with one's own implementation (without patching the kernel directly and rebooting it).
Actually, I'm a bit unhappy with the syscall inflation, in fact I'm not even a big friend of ioctl's - I'd really prefer the Plan9 way :p
I guess the same kind of logic as with GEM (except preferably without the DoS security holes) applies as to why its useful to have handles to the DMA buffers.
We have handles (well file descriptors) to dma-bufs already, I'm a bit confused what you mean?
Just curious (as I'm pretty new to this area): how do GEM objects and dma-bufs relate to each other? Is it possible to directly exchange buffers between GPUs, VPUs, IPUs, FBs, etc.?
cu -- Enrico Weigelt, metux IT consult +49-151-27565287 MELAG Medizintechnik oHG Sitz Berlin Registergericht AG Charlottenburg HRA 21333 B
dma-buf user handles are fds, which means anything allocated can be passed around nicely already. The question really is whether we'll have one ioctl on top of a special dev node or a syscall. I thought that in these cases where the dev node is only ever used to allocate the real thing, a syscall is the preferred way to go.
So you'd go for
fd = dmabuf_alloc(blah..., O_whatever) ?
Whichever I guess.. really we want open("/dev/foo/parameters.....") but we missed that chance a long time ago.
The billion dollar question is how the resource is managed, who owns the object, who is charged for it, and how it containerises. We really ought to have a clear answer to that.
I guess the same kind of logic as with GEM (except preferably without the DoS security holes) applies as to why its useful to have handles to the DMA buffers.
We have handles (well file descriptors) to dma-bufs already, I'm a bit confused what you mean?
I was agreeing with your argument - with GEM as an example that it works for the CPU accessing case.
Alan
I think I now have an answer to my question.
I will come back in a couple of weeks with a generic dmabuf allocator. The feature set should be:
- allow a per-device specific allocator
- an ioctl for buffer allocation, exporting a dmabuf file descriptor on /dev/foo
- a generic API to call the buffer-securing module, which is platform specific
- an ioctl and kernel API to set/get the dmabuf secure status
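As a rough sketch of what the uapi for this could look like (all names and ioctl numbers are hypothetical, only there to illustrate the feature set above):

/* Hypothetical uapi sketch for a generic secure dmabuf allocator. */
#include <linux/ioctl.h>
#include <linux/types.h>

struct secalloc_alloc_data {
        __u64 size;     /* in: buffer size in bytes */
        __u32 flags;    /* in: e.g. allocate as secure from the start */
        __s32 fd;       /* out: dma-buf file descriptor */
};

struct secalloc_status_data {
        __s32 fd;       /* in: dma-buf file descriptor */
        __u32 secure;   /* in/out: secure status of that buffer */
};

#define SECALLOC_IOC_ALLOC       _IOWR('S', 0, struct secalloc_alloc_data)
#define SECALLOC_IOC_SET_SECURE  _IOW('S', 1, struct secalloc_status_data)
#define SECALLOC_IOC_GET_SECURE  _IOWR('S', 2, struct secalloc_status_data)

In-kernel, the matching API could then simply be a pair of set/get helpers taking the struct dma_buf pointer.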
Sumit had already done a draft of this kind of dmabuf allocator with his cenalloc [1]; I think I will start from that.
Benjamin
[1] http://git.linaro.org/people/sumit.semwal/linux-3.x.git on cenalloc_wip branch.
I agree that the best solution is to have a generic dmabuf allocator, but not only for secure use cases.
If we create a memory allocator dedicated to security, it means that userland will be responsible for using it or not depending on the context, which may change while the pipeline/graph is already running... Renegotiating buffer allocation "live" is very difficult and takes time.
To keep this simple to use, a memory allocator device is probably the best solution, but Sumit has already proposed this kind of solution with the "constraint aware" allocator without success. Will the secure data path requirements be enough to make this acceptable now?
2015-05-06 10:35 GMT+02:00 Daniel Vetter daniel@ffwll.ch:
On Tue, May 05, 2015 at 05:54:05PM +0100, One Thousand Gnomes wrote:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Iirc most of the dma api stuff gets unhappy when memory isn't struct page backed. In i915 we do use sg tables everywhere though (even for memory not backed by struct page, e.g. the "stolen" range the bios prereserves), but we fill those out manually.
A possible generic design I see is to have a secure memory allocator device which does nothing else but hand out dma-bufs. With that we can hide the platform-specific allocation methods in there (some need to allocate from carveouts, others just need to mark the pages specifically). Also dma-buf has explicit methods for cpu access, which are allowed to fail. And using the dma-buf attach tracking we can also reject dma to devices which cannot access the secure memory. Given all that I think going through the dma-buf interface but with a special-purpose allocator seems to fit.
I'm not sure whether a special iommu is a good idea otoh: I'd expect that for most devices the driver would need to decide about which iommu to pick (or maybe keep track of some special flags for an extended dma_map interface). At least looking at gpu drivers using iommus would require special code, whereas fully hiding all this behind the dma-buf interface should fit in much better.
-Daniel
Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
On Wed, May 6, 2015 at 4:35 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, May 05, 2015 at 05:54:05PM +0100, One Thousand Gnomes wrote:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Iirc most of the dma api stuff gets unhappy when memory isn't struct page backed. In i915 we do use sg tables everywhere though (even for memory not backed by struct page, e.g. the "stolen" range the bios prereserves), but we fill those out manually.
A possible generic design I see is to have a secure memory allocator device which does nothing else but hand out dma-bufs. With that we can hide the platform-specific allocation methods in there (some need to allocate from carveouts, others just need to mark the pages specifically). Also dma-buf has explicit methods for cpu access, which are allowed to fail. And using the dma-buf attach tracking we can also reject dma to devices which cannot access the secure memory. Given all that I think going through the dma-buf interface but with a special-purpose allocator seems to fit.
I'm not sure whether a special iommu is a good idea otoh: I'd expect that for most devices the driver would need to decide about which iommu to pick (or maybe keep track of some special flags for an extended dma_map interface). At least looking at gpu drivers using iommus would require special code, whereas fully hiding all this behind the dma-buf interface should fit in much better.
jfwiw, I'd fully expect devices to be dealing with a mix of secure and insecure buffers, so I'm also not really sure how the 'special iommu' plan would play out..
I think 'secure' allocator device sounds attractive from PoV of separating out platform nonsense.. not sure if it is exactly that easy, since importing device probably needs to set some special bits here and there..
BR, -R
-Daniel
Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
On Wed, May 06, 2015 at 07:29:56AM -0400, Rob Clark wrote:
On Wed, May 6, 2015 at 4:35 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, May 05, 2015 at 05:54:05PM +0100, One Thousand Gnomes wrote:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Iirc most of the dma api stuff gets unhappy when memory isn't struct page backed. In i915 we do use sg tables everywhere though (even for memory not backed by struct page, e.g. the "stolen" range the bios prereserves), but we fill those out manually.
A possible generic design I see is to have a secure memory allocator device which does nothing else but hand out dma-bufs. With that we can hide the platform-specific allocation methods in there (some need to allocate from carveouts, others just need to mark the pages specifically). Also dma-buf has explicit methods for cpu access, which are allowed to fail. And using the dma-buf attach tracking we can also reject dma to devices which cannot access the secure memory. Given all that I think going through the dma-buf interface but with a special-purpose allocator seems to fit.
I'm not sure whether a special iommu is a good idea otoh: I'd expect that for most devices the driver would need to decide about which iommu to pick (or maybe keep track of some special flags for an extended dma_map interface). At least looking at gpu drivers using iommus would require special code, whereas fully hiding all this behind the dma-buf interface should fit in much better.
jfwiw, I'd fully expect devices to be dealing with a mix of secure and insecure buffers, so I'm also not really sure how the 'special iommu' plan would play out..
I think 'secure' allocator device sounds attractive from PoV of separating out platform nonsense.. not sure if it is exactly that easy, since importing device probably needs to set some special bits here and there..
I would expect there to be a central entity handing out the secure buffers and that the entity would have the ability to grant access to the buffers to devices. So I'm thinking that the exporter would deal with this in the ->attach() operation. That's passed a struct device, so we should be able to retrieve the information necessary somehow.
Then again, maybe things will be more involved than that. I think the way this typically works in consumer devices is that you need to jump into secure firmware to get access granted. For example on Tegra most registers that control this are TrustZone-protected, so if you don't happen to be lucky enough to be running the kernel in secure mode you can't enable access to secure memory from the kernel.
As Alan mentioned before this is designed with the assumption that the user is not to be trusted, so the OEM must make sure that the chain of trust is upheld. In practice that means that consumer devices will run some secure firmware that can't be replaced and which boots the kernel in non-secure mode, thereby disallowing the kernel from touching these registers that could potentially expose protected content to untrusted consumers.
In practice I think the result is that secure firmware sets up secure memory (VPR on Tegra) and then lets the kernel know its geometry. The kernel is then free to manage the VPR as it sees fit. If the devices need to deal with both secure and non-secure memory, I'm not exactly sure how the above security model is supposed to hold. If the kernel is free to program any address into the device, then it would be easy to trick it into writing to some region that the CPU can access. I'm fairly sure that there's some mechanism to disallow that, but I don't know exactly how it's implemented.
Thierry
2015-05-05 18:54 GMT+02:00 One Thousand Gnomes gnomes@lxorguk.ukuu.org.uk:
First, what is Secure Data Path? SDP is a set of hardware features that guarantee that some memory regions can only be read and/or written by specific hardware IPs. You can think of it as a kind of memory firewall which grants/revokes access to memory per device. Firewall configuration must be done in a trusted environment: for the ARM architecture we plan to use OP-TEE + a trusted application to do that.
It's not just an ARM feature, so any basis for this in the core code should be generic, whether it's being enforced by ARM SDP, various Intel feature sets or even via a hypervisor.
I agree the core code should be generic; I was just mentioning OP-TEE to explain in which context I'm working.
I have tried two "hacky" approaches with dma_buf:
- add a secure field to the dma_buf structure and configure the firewall in the dma_buf_{map/unmap}_attachment() functions.
How is SDP not just another IOMMU. The only oddity here is that it happens to configure buffers the CPU can't touch and it has a control mechanism that is designed to cover big media corp type uses where the threat model is that the system owner is the enemy. Why does anything care about it being SDP, there are also generic cases this might be a useful optimisation (eg knowing the buffer isn't CPU touched so you can optimise cache flushing).
The IOMMU interface doesn't offer an API to manage buffer refcounting, so you won't know when you have to stop protecting the memory; with dma_buf you know the buffer is released when the destroy function is called. It is not only about preventing CPU access to the memory, but also about configuring the hardware devices and the firewall so that they are able to access the memory.
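That release point is where the firewall teardown fits naturally, for example in the exporter's release op (the firewall helper and the private struct are hypothetical names):

#include <linux/dma-buf.h>
#include <linux/slab.h>

/* Hypothetical firewall helper, backed by OP-TEE + a trusted application. */
void sdp_firewall_release_region(phys_addr_t paddr, size_t size);

struct secure_buffer {
        phys_addr_t paddr;
        size_t size;
};

/* Exporter release op: the last reference is gone, so the region can
 * stop being protected and go back to the normal pool. */
static void secure_release(struct dma_buf *dmabuf)
{
        struct secure_buffer *buf = dmabuf->priv;

        sdp_firewall_release_region(buf->paddr, buf->size);
        kfree(buf);
}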
The control mechanism is a device/platform detail as with any IOMMU. It doesn't matter who configures it or how, providing it happens.
We do presumably need some small core DMA changes - anyone trying to map such a buffer into CPU space needs to get a warning or error but what else ?
From the buffer allocation point of view I am also facing a problem: when v4l2 or drm/kms export buffers using dma_buf, they don't attach themselves to them and never call dma_buf_{map/unmap}_attachment(). This is not an issue in those frameworks since it is how dma_buf exporters are supposed to work.
Which could be addressed if need be.
So if "SDP" is just another IOMMU feature, just as stuff like IMR is on some x86 devices, and hypervisor enforced protection is on assorted platforms why do we need a special way to do it ? Is there anything actually needed beyond being able to tell the existing DMA code that this buffer won't be CPU touched and wiring it into the DMA operations for the platform ?
Alan