KIBI.ONE PRESENTS…
Setting up an Open Source Serverless LLM to Expose API Endpoints
In this tutorial you'll learn how to set up a version of your favorite open source LLM (e.g. Llama 2, Mistral) in the cloud and expose serverless endpoints to control the AI.
Kibi.One is a no-code SaaS incubation program focused on helping non-technical founders build software companies. Visit our homepage to learn more about our program.
How to go Serverless with Open Source AI Models Such as Llama 2?
In this tutorial I want to talk more about how we can interact with the API endpoints of open source LLMs, or large language models, without having to code.
In the last tutorial, linked to below, I showed you how to host your own version of an open source LLM like Llama 2 or Mistral. I showed you how, by renting hardware by the hour, you could avoid having to set your LLM up on your own computer locally. In this tutorial, I want to take this a step further and teach you how to interact with open source LLMs through their API endpoints without having to code, and without having to install anything locally. So let's jump in.
Setting up AI in the Cloud
Again, let's head over to RunPod. Here you can see that RunPod is a cloud provider that specializes in AI applications. After logging in, what I want you to do is click on "Serverless". From here, we can set up the API endpoints for the AI platform we want to interact with. Under "Quick Deploy", click on "View More" and you'll see a list of open source AI products we can launch. For example, we could launch Whisper, Llama 13B, Llama 7B, Mistral, OpenJourney, Stable Diffusion and some others. For this tutorial, we want a text-based AI, so let's select Llama 13B. Simply click on "Start". When we do that, we'll be asked to select a GPU. If we roll over each option, we'll see the specs of the pod that this serverless API will run on. I'll select the 48 GB option here and then click on "Deploy".
Viewing the AI API Endpoints
Once our serverless instance is set up, we'll see it here under the "Serverless" tab. You can click on "Edit" to change any details. However, for this tutorial what we want to do is find the API endpoints. So let's click on the title here where it says Llama 13B. You'll see the API endpoints where it says "runsync" here. Simply expand this section by clicking on the "more" icon, and you'll see our other endpoints.
We could, for example, use the "run" endpoint. It responds with a job ID, which we can later pass to a separate API endpoint called "status" to fetch the response for any given run. However, for queries which take under 15 seconds to complete, we can use the "runsync" option, which takes our prompt and returns the answer in the initial response, without having to check the status of the call through a separate API call. How you use these will really depend on your use case and individual needs, but for this tutorial, I'm just going to use runsync.
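If you do end up scripting this later rather than staying no-code, here's a minimal Python sketch of the asynchronous run-then-status pattern described above. The endpoint ID, API key, and exact response fields are placeholders, so confirm your own endpoint's schema in RunPod before relying on it.

```python
import time
import requests

# Hypothetical placeholders -- substitute your own endpoint ID and RunPod API key.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# 1. Submit the job with /run; the response contains only a job ID, not the answer.
job = requests.post(
    f"{BASE_URL}/run",
    headers=HEADERS,
    json={"input": {"prompt": "What's the fastest growing economy in the world?"}},
).json()
job_id = job["id"]

# 2. Poll /status/{id} until the job finishes, then read its output.
while True:
    status = requests.get(f"{BASE_URL}/status/{job_id}", headers=HEADERS).json()
    if status.get("status") in ("COMPLETED", "FAILED"):
        break
    time.sleep(2)  # brief pause between polls

print(status.get("output"))
```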
Testing Serverless AI API Endpoints
I'll grab the API endpoint here so we can test out this Llama 2 API. You can use whatever API testing tool you like; I'm using Insomnia. I just created a tutorial on how to use this API testing software if you want to follow along, and I'll link to that below. Also, I've posted step-by-step instructions over on Kibi.One regarding how to set up the API. On Kibi.One's website, I've included the API endpoint URLs and the exact JSON formatting you need to use in order to make this API call work. So I'll link to Kibi.One below, so you can get the step-by-step authentication, header and JSON instructions. In the description, just look for where it says "instructions".
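For readers who'd rather script the call than use a GUI tool, here's a rough Python equivalent of the request you would build in Insomnia. Treat it as a sketch: the `{"input": {"prompt": ...}}` body is the common RunPod worker format, but check the exact fields against the instructions on Kibi.One.

```python
import requests

# Hypothetical placeholders -- use your own endpoint ID and RunPod API key.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"

# /runsync blocks until the job completes and returns the answer in one response.
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={"input": {"prompt": "What's the fastest growing economy in the world?"}},
    timeout=60,
)
resp.raise_for_status()

# On a completed run, the model's answer is typically under the "output" key.
print(resp.json().get("output"))
```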
Also, while you're over on Kibi.One's website, be sure to check out our no-code SaaS development course. We teach people how to build software products by taking a no-code or low-code approach. We own a few platforms that have AI integrated into them, and in this course, you'll learn how to do the same.
Okay, so once you have the API set up within your API testing tool, it's time to click "Send". Let's enter a prompt here. I'll type in something like "what's the fastest growing economy in the world". When I do that, I just need to wait a few seconds for the response and then, once it's ready, it will show up over on the right here.
Using Llama 2 13B in the Cloud
So now we're using Llama 13B in the cloud. As you can see, without writing any code, we were able to set up a serverless instance of Llama 13B and then interact with it via our own custom API endpoint. You could set up a serverless version of any LLM taking this approach, but these quick-launch serverless options are your best bet if you're trying to take a no-code or low-code approach. I expect that RunPod will add the newest and most popular models, including Llama 70B, to this list of quick-start LLMs shortly. But as you can see, I'm getting really good responses using Llama 13B.
Now, the benefit of taking the serverless approach over the pod approach is that I'm only charged for request execution time. This means I don't have to worry about how many hours my pod is online. I don't need to worry about any infrastructure or GPU details at all. These serverless instances all work out of the box and are optimized for the LLM or open source AI software you're running.
Conclusion
So I hope you've enjoyed this tutorial on setting up open source AI projects in the cloud and interacting with them through API endpoints.
And remember, if you're looking to upskill in the area of SaaS development or no-code AI development, I encourage you to check us out over on Kibi.one. We have a no-code SaaS development course where we teach you how to build and monetize pretty much whatever you can dream up, including AI applications. A link to Kibi.one and a coupon code for $100 off can be found below.
Build SaaS Platforms Without Code
Kibi.One is a platform development incubator that helps non-technical founders and entrepreneurs build software companies without having to know how to code.
We're so sure of our program that if it doesn't generate a positive ROI for you, we'll buy your platform from you. Watch the video to the right to learn more.