On Fri, 28 Feb 2020 at 18:18, Daniel Stone daniel@fooishbar.org wrote:
On Fri, 28 Feb 2020 at 03:38, Dave Airlie airlied@gmail.com wrote:
b) we probably need to take a large step back here.
Look at this from a sponsor's POV: why would I give X.org/fd.o sponsorship money that they are just handing straight to Google to pay for hosting credits? Google are profiting in some minor way from these hosting credits being bought by us, and I assume we aren't getting any sort of discount here. Having Google sponsor the credits costs Google substantially less than having any other company give us money to do it.
Last I looked, Google GCP / Amazon AWS / Azure were all pretty comparable in terms of what you get and what you pay. Obviously providers like Packet and Digital Ocean who offer bare-metal services are cheaper, but then you need to find someone who is going to properly administer the various machines, install decent monitoring, make sure that more storage is provisioned when we need it (which is basically all the time), make sure that the hardware is kept in decent shape (pretty sure one of the fd.o machines has had a drive in imminent-failure state for the last few months), etc.
Given the size of our service, that's a much better plan (IMO) than relying on someone who a) isn't an admin by trade, b) has a million other things to do, and c) hasn't wanted to do it for the past several years. But as long as those are the resources we have, we're paying the cloud tradeoff: more money in exchange for fewer problems.
Admin for gitlab and CI is a full-time role anyway. The system is definitely not self-sustaining without time still being put in by you and anholt. If we have $75k to burn on credits, and it were instead diverted to pay an admin to look after the real hardware plus gitlab/CI, would that not be a better use of the money? I didn't think we could afford $75k for an admin, but suddenly we can afford it for gitlab credits?
Yes, we could federate everything back out so everyone runs their own builds and executes those. Tinderbox did something really similar to that IIRC; not sure if Buildbot does as well. Probably rules out pre-merge testing, mind.
Why? Does gitlab not support that model? Having builds done in parallel on runners closer to the test runners seems like it should be a thing, and I'd guess artifact transfer would then cost less as a result.
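(For illustration only, assuming plain upstream GitLab CI features rather than anything fd.o-specific: a job can be pinned to a particular runner with tags, and a test job can pull artifacts directly from its build job with needs, so a build runner colocated with the test hardware keeps the artifact transfer on the local network. The tag names and scripts here are made up.)

# hypothetical .gitlab-ci.yml fragment
build-arm64:
  stage: build
  tags:
    - lab-local           # made-up tag for a runner sitting next to the test boxes
  script:
    - meson build/ && ninja -C build/
  artifacts:
    paths:
      - build/
    expire_in: 1 day

test-deqp-arm64:
  stage: test
  tags:
    - lab-local           # same lab, so the artifact download stays local
  needs: ["build-arm64"]  # fetch artifacts straight from that build job
  script:
    - ./run-deqp.sh build/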
The reason we hadn't worked everything out in advance of deploying is that Mesa has had 3993 MRs in the little over a year since moving, and a similar number in GStreamer, just taking the two biggest users. At the start it was 'maybe let's use MRs if you want to, but make sure everything still goes through the list', and now it's something different. Similarly, the CI architecture hasn't been 'designed' so much as grown out of people wanting to run dEQP and Piglit on their hardware pre-merge, in an open fashion that's actually accessible to people, and just doing it.
Again, if you want everything to be centrally designed/approved/monitored/controlled, that's a fine enough idea, and I'd be happy to support whoever ends up doing that for all of fd.o.
I don't think we have any choice but to have someone centrally controlling it. You can't have a system in place that lets CI users burn large sums of money without authorisation, and that is what we have now.
Dave.