Kao Data is sponsoring the Startup of the Year category at next week’s Science & Technology Awards. Here, Kao Data VP Adam Nethersole shares some valuable insights and advises AI startups to plan their computing journey carefully.
A typical AI startup will use considerable amounts of compute during its journey, so it’s important to think carefully about what you’re going to need en route and prepare properly before setting off on your high performance computing (HPC) and AI road trip.
There are three legs in an AI startup’s compute journey to consider:
- The Prototyping Stage (research and development, hardware testing, market research etc. to produce a prototype product) - which generally requires flexible computing as you try things out.
- The Production Stage (getting the product ready to be used by consumers and users) - when compute utilisation starts to ramp up very quickly.
- The Product Stage (once the product has been developed) - when compute often starts to level off before increasing again as you (hopefully) hit the big time.
Many startups opt to use a cloud environment such as Amazon Web Services (AWS), Microsoft Azure or Google Cloud Platform for their initial compute requirements, and this is an excellent way to begin. After all, in the prototyping stage, your compute usage tends to be stop-start or, as it is often called, ‘flashy’ and ‘burstable’. You use the cloud’s compute resources on demand, paying only for what you use - and that’s a huge benefit in the early stages of prototyping when you’d rather be spending your money on R&D, skilled personnel and expertise instead of expensive physical IT assets like servers and storage.
However, if you use the cloud provider’s own array of ‘built-in’ applications or start by deploying your applications within that cloud platform, your compute can quickly become ‘landlocked’ within that very cloud (unless you’re using open source technology). The very notion of clouds is that they are accessible and flexible, but it’s actually much harder than you think to switch your compute between one cloud and another.
Cloud providers bombard AI startups with free instances and opportunities to use their platforms in order to get them hooked. However, you have to be aware of the background costs that can spiral. You may not get charged for the data coming in, but you are charged for the data going out. And, while that’s not a problem on the first leg of your journey when you’re only trying out sample datasets (and, therefore, not using a lot of compute), it becomes an expensive problem later on.
By the time you’re into the second leg of your compute journey (The Production Stage), depending on what you are doing, you could be using massive datasets, so the amount of compute you use will increase sharply. As soon as you go into production and start properly using your servers, the 5-10% utilisation you saw every now and then during the Prototyping Stage jumps to a solid 40-60% all of the time.
Once you have reached more than 50% utilisation of your servers, and it’s a reliable base load, you have hit the inflection point at which you should consider either investing in your own hardware and infrastructure in your own office, so you’re not paying for the increased volumes of data now going out, or utilising a colocation environment like Kao Data – because, when you do, the cost savings are dramatic.
Sure, you’ll have to pay a marginal connectivity charge just as you would to send data across any network – but it’s nothing like the charges you’ll be paying the cloud for the same transfer of data. And there are many other benefits as well – performance and speed improvements if your high performance computing is clustered in one location rather than on virtualised servers in the cloud, the ability to fine-tune your hardware to fit your bespoke applications, improved security and your own dedicated hardware. You won’t find yourself playing the giant game of cloud Tetris whilst waiting for other ‘noisy neighbours’ in the cloud to finish their jobs before yours can run.
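To make the inflection point described above a little more concrete, here is a minimal sketch in Python of how you might compare pay-as-you-go cloud costs with a fixed monthly cost for dedicated hardware. Every figure in it is hypothetical (the hourly rate, the fixed monthly cost and the function names are assumptions for illustration, not real prices from any provider), and the numbers were deliberately chosen so that the break-even lands at 50% utilisation. Your own break-even will depend on your actual quotes.

```python
HOURS_PER_MONTH = 730  # approximate average number of hours in a month


def cloud_monthly_cost(utilisation, hourly_rate, egress_gb=0.0, egress_rate_per_gb=0.0):
    """On-demand cost: pay for the hours actually used, plus data leaving the cloud."""
    compute = utilisation * HOURS_PER_MONTH * hourly_rate
    return compute + egress_gb * egress_rate_per_gb


def break_even_utilisation(fixed_monthly_cost, hourly_rate):
    """Utilisation at which a fixed monthly cost equals on-demand compute (egress ignored)."""
    return fixed_monthly_cost / (HOURS_PER_MONTH * hourly_rate)


# Hypothetical figures, not real quotes from any provider.
HOURLY_RATE = 2.0      # assumed on-demand price per server-hour
FIXED_MONTHLY = 730.0  # assumed all-in monthly cost of an equivalent dedicated server

print(f"Break-even utilisation: {break_even_utilisation(FIXED_MONTHLY, HOURLY_RATE):.0%}")
for utilisation in (0.10, 0.50, 0.60):
    cloud = cloud_monthly_cost(utilisation, HOURLY_RATE)
    print(f"{utilisation:.0%} utilised: cloud {cloud:.2f} vs fixed {FIXED_MONTHLY:.2f}")
```

Note that the break-even calculation ignores data egress. Passing non-zero `egress_gb` and `egress_rate_per_gb` values to `cloud_monthly_cost` only pushes the cloud figure higher, which moves the real break-even to a lower utilisation.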
Don’t get me wrong, the cloud is a brilliant resource for standard forms of everyday compute such as email servers, ERP and project management software, accounting platforms, etc. And it can be a useful resource when you’re starting out due to its flexibility. But clouds are easy in and hard out - which is why I say you need to think carefully about your compute journey at the outset. It can be tricky trying to disentangle yourself, although there are now companies out there designed to help you do just that.
When you are at the appropriate compute scale, one of the great benefits of coming to a specialist data centre like Kao Data is that, as well as resilient, reliable power and technical expertise, we also have a Megaport connection, which provides a seamless link between the cloud and the data centre. That makes it the perfect computing environment for hybrid cloud/colo computing.
The ideal situation would be to have your steady ‘base load’ of compute, say 60-70% of what you’re doing, in a data centre like Kao Data, where you benefit from the economies of scale of being part of a large data centre - and the remaining 30-40% of your compute (the peaky, flashy load) in the cloud on a flexible on-demand tariff, so you only pay when you really need that extra burst of compute. This would ensure your servers are never maxed out, you can always do what you need to do, you retain a degree of flexibility and your computing base load is a lot cheaper.
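As a rough illustration of that base load and burst split, the Python sketch below takes a made-up demand profile and works out how much of it would be served by dedicated hardware of a given capacity and how much would spill over into the cloud. The demand figures, the capacity and the `split_load` helper are all hypothetical, chosen only to show the arithmetic.

```python
def split_load(demand, base_capacity):
    """Split each period's demand into a dedicated base part and a cloud burst part."""
    base = sum(min(d, base_capacity) for d in demand)
    burst = sum(max(d - base_capacity, 0) for d in demand)
    return base, burst


# Hypothetical demand in server-units per period, not measured data.
demand = [6, 6, 12, 12, 6, 6, 12, 12]

base, burst = split_load(demand, base_capacity=6)
total = base + burst
print(f"Base load on dedicated hardware: {base / total:.0%}")
print(f"Burst load in the cloud: {burst / total:.0%}")
```

With this particular profile, about two thirds of the work sits on the dedicated base and about one third bursts to the cloud. Trying different `base_capacity` values against your own usage history is one simple way to see where the split might sensibly fall.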
My advice is always: don’t be a busy fool. Consider your computing platform carefully and look at the lifecycle of your compute as a whole journey rather than day to day.