Is Hosting Your Own LLM Cheaper than OpenAI?
A commonly asked question by startups.
OpenAI API Pricing:
Charges are calculated per token; 1,000 tokens is approximately 750 words.
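As a quick sanity check on that rule of thumb, a tiny helper like the one below can estimate token counts from word counts (a sketch only; the real ratio depends on the tokenizer and the text).

```python
# Rule of thumb from the pricing note above: 1,000 tokens is roughly 750 English words.
def estimate_tokens(word_count: int) -> int:
    """Approximate the number of tokens for a given English word count."""
    return round(word_count * 1000 / 750)

print(estimate_tokens(750))  # ~1000 tokens
print(estimate_tokens(600))  # ~800 tokens (the output length used below)
```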
Model-wise cost (per 1,000 tokens):
1. GPT-4: $0.03 for input, $0.06 for output
2. GPT-3.5 Turbo: $0.001 for input, $0.002 for output
Monthly cost of a typical AI app that uses these APIs, for example an email copywriting agent app.
For short-content app solutions:
Suppose the app writes marketing copy for users, taking around 100-150 words as input and producing about 600 words of output.
That means one email consumes roughly 1,000 tokens per user (about 200 input tokens and 800 output tokens).
1. GPT-4 Model
Total cost per user = $0.006 (input) + $0.048 (output) = $0.054 per email
If we receive 1,000 user requests per day to write copywriting emails, the average monthly cost would be approximately:
$0.054 × 1,000 requests/day × 30 days = $1,620
Therefore, with the GPT-4 model, 1,000 users a day for 30 days costs about $1,620 in total.
2. GPT-3.5 Turbo
Total cost per user = $0.0002 (input) + $0.0016 (output) = $0.0018 per email
$0.0018 × 1,000 requests/day × 30 days = $54
Therefore, with the GPT-3.5 Turbo model, the same 1,000 requests a day for 30 days costs about $54 in total. (The sketch below reproduces this arithmetic for both models.)
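Here is a minimal sketch of that arithmetic in Python. The per-1K-token prices are assumptions taken from the numbers in this post ($0.03/$0.06 for GPT-4, $0.001/$0.002 for GPT-3.5 Turbo); always check OpenAI's current pricing page before relying on them.

```python
# Assumed per-1K-token prices matching the figures used in this post.
PRICES_PER_1K = {
    "gpt-4":         {"input": 0.03,  "output": 0.06},
    "gpt-3.5-turbo": {"input": 0.001, "output": 0.002},
}

def cost_per_request(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single API call."""
    price = PRICES_PER_1K[model]
    return (input_tokens / 1000) * price["input"] + (output_tokens / 1000) * price["output"]

def monthly_cost(model: str, requests_per_day: int,
                 input_tokens: int = 200, output_tokens: int = 800,
                 days: int = 30) -> float:
    """Monthly bill for a fixed daily request volume."""
    return cost_per_request(model, input_tokens, output_tokens) * requests_per_day * days

print(round(cost_per_request("gpt-4", 200, 800), 4))   # 0.054
print(round(monthly_cost("gpt-4", 1000), 2))           # 1620.0
print(round(monthly_cost("gpt-3.5-turbo", 1000), 2))   # 54.0
```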
Host Your Own LLM Pricing:
Llama-2 7b on AWS
The choice of server type significantly influences the cost of hosting your own Large Language Model (LLM) on AWS, and different models have different server requirements. Running the Llama-2 7B (7-billion-parameter) model requires at least an EC2 g5.2xlarge instance, priced at around $850 per month.
Exposing the model through an API (using AWS API Gateway and AWS Lambda) adds a further cost, but at 1,000 requests per day this expense stays below $100 per month.
In summary, the estimated monthly cost for AWS hosting, including server and API usage, is approximately $1,000.
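A rough sketch of how that ~$1,000 figure comes together, assuming an on-demand g5.2xlarge at roughly $1.21/hour and a per-request overhead picked to match the "under $100 per month" API Gateway + Lambda estimate above; a real bill would also include storage, data transfer, and similar items.

```python
HOURS_PER_MONTH = 730  # average hours in a month for an always-on instance

def aws_monthly_cost(instance_hourly_usd: float,
                     requests_per_day: int,
                     per_request_overhead_usd: float = 0.003) -> float:
    """Estimate server cost plus a rough API Gateway + Lambda overhead."""
    server = instance_hourly_usd * HOURS_PER_MONTH
    api = per_request_overhead_usd * requests_per_day * 30
    return server + api

# g5.2xlarge on-demand is assumed at ~$1.21/hour here.
print(round(aws_monthly_cost(1.21, 1000)))  # ~973, i.e. roughly $1,000/month
```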
One little catch here:
Given OpenAI's token-based pricing, a rise in your daily requests to 2,000 roughly doubles your monthly bill: with GPT-4 that is about $3,240 per month ($0.054 × 2,000 × 30).
The AWS setup, however, can absorb this increased load without additional scaling, keeping the monthly cost stable at roughly $1,000.
For an application handling 2,000 requests per day, choosing the AWS setup is the prudent business decision.
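The scaling difference is easy to see in a small comparison sketch; the $0.054 per request and the fixed $1,000 per month are the figures carried over from above.

```python
GPT4_COST_PER_REQUEST = 0.054  # per-email cost computed earlier
AWS_FIXED_MONTHLY = 1000.0     # assumed g5.2xlarge + API overhead

def openai_monthly(requests_per_day: int) -> float:
    """OpenAI spend grows linearly with traffic."""
    return GPT4_COST_PER_REQUEST * requests_per_day * 30

for rpd in (250, 500, 1000, 2000):
    print(f"{rpd} req/day: GPT-4 ${openai_monthly(rpd):,.0f} vs AWS ${AWS_FIXED_MONTHLY:,.0f}")

# Break-even is around 1000 / (0.054 * 30) ~ 617 requests/day: below that the
# GPT-4 API is cheaper, above it the fixed server wins (assuming one instance
# can actually serve the load).
```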
Upgrading Your Custom Model:
Users complained about the subpar quality of the copywriting emails generated by Llama-2 7B, making it unsuitable for the intended use case. Further experimentation showed that Llama-2 13B significantly improves the output quality.
However, Llama-2 13B needs a more powerful server, pushing costs up to approximately $5,000 per month, well above what the GPT-4 API would cost for the same traffic (about $3,240 per month at 2,000 requests per day).
Conclusion:
Here are the key takeaways from today:
Experiment with various models to identify the ones that yield optimal results.
Determine the expected input and output text volumes for each model.
If the text volume is consistent and low, and security is not a primary concern, opting for OpenAI may be the preferable choice.
Otherwise, run a cost analysis for AWS to make an informed decision based on your specific requirements (a small sketch follows below).
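As an illustration of that last takeaway, a minimal cost-analysis helper might look like the following; every number plugged in is an assumption carried over from this post, so substitute your own.

```python
def compare(requests_per_day: int, api_cost_per_request: float,
            self_hosted_monthly: float) -> str:
    """Compare a pay-per-request API against a fixed self-hosted bill."""
    api_monthly = api_cost_per_request * requests_per_day * 30
    winner = "self-host" if self_hosted_monthly < api_monthly else "use the API"
    return f"API ${api_monthly:,.0f}/mo vs self-hosted ${self_hosted_monthly:,.0f}/mo -> {winner}"

print(compare(1000, 0.054, 1000))   # GPT-4 vs Llama-2 7B on g5.2xlarge -> self-host
print(compare(1000, 0.054, 5000))   # GPT-4 vs Llama-2 13B on a bigger server -> use the API
print(compare(1000, 0.0018, 1000))  # GPT-3.5 Turbo vs Llama-2 7B -> use the API
```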