This sounds interesting to me!
I will check it out in more detail. I saw 3.5-turbo mentioned, which is more expensive and, as I understand it, usually a weaker base model. If I see 3.5-turbo before I see 4o-mini, and don't see 4o-mini at all, I might wonder if things are out of date! I hope that's fair feedback for a quick reaction.
I have done a bunch of fine-tuning before, locally and with 4o-mini, and there's often a lot of time spent suboptimally on wrangling data so I'm interested in this category of product for sure, if it helps more than it costs me.
felix089 42 days ago [-]
You can fine-tune any OpenAI model that is available in your account, so both 3.5-turbo and 4o-mini work, but mini replaced 3.5 for most use cases. Where did you see 3.5 mentioned over mini?
JulianWasTaken 42 days ago [-]
It's in at least 3 places on the homepage -- in 2 screenshots and in the first main code block you show. This was the first thing I noticed too, so I think it's good feedback from OP.
farouqaldori 42 days ago [-]
Yes, it's outdated. Thanks for the feedback. 4o-mini is the new king for sure!
felix089 42 days ago [-]
Ah I see, yes agreed, we'll update that asap! Thanks
trentontri 42 days ago [-]
Why bury the pricing information under the documentation? The problem with these platforms is that it is unclear how much bandwidth/money your use case will require to actually train and run a successful LLM.
The world needs products like this that are local-first and open source. Enable me to train an open-source LLM on my M2 MacBook with a desktop app and then I'll consider giving you my money. App developers integrating LLMs need to be able to experiment and see the potential before storing everything in the cloud.
farouqaldori 42 days ago [-]
We are working on a dedicated pricing page with all relevant information. Pricing in docs is just temporary. With that being said, new users get free credits to try out the platform without spending anything.
We've built the platform primarily for companies that serve LLMs in production, so even if we allowed you to fine-tune on device, sooner or later you will find yourself in a position where you want to deploy the model.
We want to streamline this whole process, end-to-end.
With that being said, I do agree that we shouldn't store everything on the cloud, this is what we're doing about it:
1. Any data in FinetuneDB like evals, logs, datasets etc. can be exported or deleted.
2. Fine-tuned model weights for OS models can be downloaded.
3. Using our inference stack is not a requirement. Many users are happy with only the dataset manager (which is 100% free).
4. We are exploring options to integrate external databases and storage providers with FinetuneDB, allowing datasets to be stored off our servers.
farouqaldori 43 days ago [-]
Hey all, co-founder here happy to answer any questions!
krishnasangeeth 42 days ago [-]
Nice product overall. I had some feedback and questions.
Feedback:
1. Since you support multiple models, it would be cleaner and more correct to give the API a name that doesn't start with OpenAI.
2. An SDK for other languages like Python in `show code` would be nice.
3. It was a bit confusing to figure out how to fine-tune the model; it would be nice if it were explicitly available as a side pane.
Questions:
1. Can you speak a bit about your tech stack, if that's alright?
2. How do you currently scale inference when more requests come in?
farouqaldori 42 days ago [-]
Thank you so much!
1. Where exactly did you see this? There are internal FinetuneDB API keys, and external API keys like OpenAI. Though it's confusing, I agree!
2. Work in progress.
3. I agree, thanks for the feedback.
There are multiple components working together, so it's hard to define a single tech stack. When it comes to the web app, Remix is my framework of choice and I can highly recommend it.
FunkyFreddy 42 days ago [-]
Congrats on the launch, UI looks sleek! Is tracking logs available in the free plan?
felix089 42 days ago [-]
Thanks, and yes, tracking logs is included in the free plan!
namanyayg 43 days ago [-]
What benefits does this bring me vs just using OpenAI's official tools?
felix089 43 days ago [-]
Other co-founder here. We offer more specific features around iterating on your datasets and including domain experts in that workflow. I'd also argue that you don't necessarily want your datasets sitting with your foundation model provider like OpenAI, so you keep the option to test with, and potentially switch to, open-source models.
skerit 42 days ago [-]
Is it possible to fine-tune language models using plain text completions, or is it necessary to use datasets consisting of structured conversations?
felix089 42 days ago [-]
Yes, you can fine-tune using plain text completions. You don't need structured conversations unless you want conversational abilities. Plain text works great if you want the model to generate text in a specific style or domain. It all depends on what you're trying to achieve.
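For anyone unfamiliar with the two formats: a completions-style dataset is just prompt/completion pairs, while a conversational dataset is a list of role-tagged messages, typically stored as JSONL (one JSON object per line). A minimal sketch, assuming an OpenAI-style schema; FinetuneDB's exact field names may differ:

```python
import json

# Completions-style row: a plain prompt/completion pair.
completion_row = {
    "prompt": "Summarize: The quarterly revenue grew 12%...",
    "completion": "Revenue was up 12% for the quarter.",
}

# Chat-style row: a structured conversation with roles.
chat_row = {
    "messages": [
        {"role": "system", "content": "You are a financial summarizer."},
        {"role": "user", "content": "Summarize: The quarterly revenue grew 12%..."},
        {"role": "assistant", "content": "Revenue was up 12% for the quarter."},
    ]
}

# Datasets are stored as JSONL: one example object per line.
with open("completions_dataset.jsonl", "w") as f:
    f.write(json.dumps(completion_row) + "\n")
```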
skerit 42 days ago [-]
Nice.
And about the cost of finetuning: is there a difference in price when only training the model on completions?
felix089 42 days ago [-]
The cost depends on the number of tokens processed, so fine-tuning on completions costs the same per token as any other data.
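To make that concrete, billed training tokens are roughly dataset tokens times epochs. A back-of-the-envelope helper (the $2/1M training-token price is taken from the Llama 3.1 8B figure quoted elsewhere in this thread, as an assumption):

```python
def training_cost_usd(dataset_tokens: int, epochs: int, price_per_million_usd: float) -> float:
    """Billed tokens = dataset tokens processed once per epoch."""
    billed_tokens = dataset_tokens * epochs
    return billed_tokens / 1_000_000 * price_per_million_usd

# A 5M-token dataset trained for 3 epochs at $2 / 1M training tokens:
print(training_cost_usd(5_000_000, 3, 2.0))  # 30.0
```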
rmbyrro 42 days ago [-]
What's the cost of fine tuning and then serving a model, say Llama 3 8B or 70B? I couldn't find anything on the website...
felix089 42 days ago [-]
Hi, current pricing for Llama 3.1 8B for example is: Training Tokens: $2 / 1M, Input and Output Tokens: $0.30 / 1M. We'll update pricing on the website shortly to reflect this.
ilovefood 43 days ago [-]
Looks pretty cool, congrats so far! Do you allow downloading the fine tuned model for local inference?
felix089 43 days ago [-]
Thank you, and yes that is possible. Which model are you looking to fine-tune?
ilovefood 43 days ago [-]
If that's the case then I'll try the platform out :) I want to finetune Codestral or Qwen2.5-coder on a custom codebase. Thank you for the response! Are there docs or info about the compatibility of the downloaded models, i.e. will they work right away with llama.cpp?
farouqaldori 43 days ago [-]
We don't support Codestral or Qwen2.5-coder right out of the box for now, but depending on your use-case we certainly could add it.
We utilize LoRA for smaller models, and QLoRA (quantized LoRA) for 70B+ models to improve training speed, so when downloading model weights, what you get is the adapter weights & adapter_config.json. Should work with llama.cpp!
perhapsAnLLM 42 days ago [-]
Awesome work! Really clean UI - who are your competitors that offer a similar "end-to-end workflow" UI for LLMs? I'm typically in a Jupyter notebook for this type of thing but a neat and snappy web app could certainly help streamline some workflows.
farouqaldori 42 days ago [-]
Thanks for the feedback! More than happy to learn more about your workflow if you'd like to share (farouq@finetunedb.com)
felix089 42 days ago [-]
Happy to hear you like the UI, ease of use is key for us. Would love for you to give it a try, any feedback welcome!
inSenCite 42 days ago [-]
Wow this is really cool, congrats on the launch!
Does the platform also help speed up the labelling of semi-structured data? I have a use case where I need to take data in Word, PPT, and PDF, and label paragraphs/sections which could then be used to fine-tune a model.
felix089 42 days ago [-]
Thank you! We currently don't support direct labeling, but if you can extract the text, our platform helps you organize it for fine-tuning. What use case are you looking to train the model for?
inSenCite 42 days ago [-]
Ah, ok. the text extraction is manageable enough, I will carve out a smaller subset so I can give your platform a go. The use case is professional services contract creation and redlining based on reference documents.
felix089 42 days ago [-]
Okay thanks for sharing, I think that's the way to go with the subset. Feel free to reach out if you need anything, more than happy to take a closer look!
KaoruAoiShiho 43 days ago [-]
Am I able to upload a book and have it respond truthfully to the book in a way that's superior to NotebookLM or similar? Generally most long context solutions are very poor. Or does the data have to be in a specific format?
felix089 43 days ago [-]
To get the outcome you want, RAG (retrieval-augmented generation) would be the way to go, not fine-tuning. Fine-tuning doesn't make the model memorize specific content like a book; it teaches new behaviors or styles. RAG lets the model access and reference the book during inference. Our platform focuses on fine-tuning with structured datasets, so data needs to be in a specific format.
This is a very common topic, so I wrote a blog post that explains the difference between fine-tuning and RAG if you're interested: https://finetunedb.com/blog/fine-tuning-vs-rag
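To illustrate the RAG side of that distinction, a deliberately minimal sketch: split the book into chunks, score chunks against the question with a bag-of-words cosine similarity (a real system would use embeddings and a vector store), and stuff the best chunk into the prompt at inference time:

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(chunks: list, question: str, k: int = 1) -> list:
    q = tokenize(question)
    return sorted(chunks, key=lambda c: cosine(tokenize(c), q), reverse=True)[:k]

# Chunks of the "book" the model should answer from.
book_chunks = [
    "Chapter 1: The whale is first sighted off the coast of Nantucket.",
    "Chapter 2: The crew debates the meaning of the omen at dinner.",
]

# At inference time, retrieved context is stuffed into the prompt.
question = "Where was the whale first sighted?"
context = "\n".join(retrieve(book_chunks, question))
prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
```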
These days, I'd say the easiest and most effective approach is to put the whole book in the context of one of the longer context models.
felix089 43 days ago [-]
Agreed, for this use case probably the easiest way to go.
swyx 43 days ago [-]
(and most expensive)
felix089 42 days ago [-]
Agreed too
thomashop 37 days ago [-]
Prompt caching to the rescue?
KaoruAoiShiho 43 days ago [-]
Not really, for something like gemini the accuracy and performance is very poor.
farouqaldori 43 days ago [-]
The magic behind NotebookLM can't be replicated only with fine-tuning. It's all about the workflow, from the chunking strategy, to retrieval etc.
For a defined specific use-case it's certainly possible to beat their performance, but things get harder when you try to create a general solution.
To answer your question, the format of the data depends entirely on the use-case and how many examples you have. The more examples you have, the more flexible you can be.
hodanli 42 days ago [-]
I don't think it's a big deal, but you should either use your own image or give credit to the OpenAI presentation on YouTube.
fpgaminer 42 days ago [-]
I gave it a try, but when I tried to start a finetune of Llama 3.1 8B it just gave an error every time. I also encountered several server errors just navigating to different pages.
felix089 42 days ago [-]
Thanks for giving it a try, and sorry to hear you're having issues, could you please share the errors you received either here or via founders@finetunedb.com? Many users successfully fine-tuned models today, so it would be great to learn what the specific problem is. Thanks!
cl42 43 days ago [-]
Was looking for a solution like this for a few weeks, and started coding my own yesterday. Thank you for launching! Excited to give it a shot.
Question: when do you expect to release your Python SDK?
farouqaldori 43 days ago [-]
There hasn't been a significant demand for the Python SDK yet, so for now we suggest interacting with the API directly.
With that being said, feel free to email us with your use-case, I could build the SDK within a few days!
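For what it's worth, without an SDK the calls reduce to a few lines of HTTP anyway. The sketch below only assembles a request; the base URL, endpoint path, and payload shape are illustrative placeholders, not FinetuneDB's documented API (check their docs for the real schema):

```python
import json

API_BASE = "https://api.example.com/v1"  # placeholder base URL, not the real one

def build_log_upload(api_key: str, messages: list) -> tuple:
    """Assemble (url, headers, body) for a hypothetical 'upload chat log'
    call; pass the pieces to requests.post(url, headers=headers, data=body)."""
    url = f"{API_BASE}/logs"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"messages": messages})
    return url, headers, body

url, headers, body = build_log_upload("sk-placeholder", [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi! How can I help?"},
])
```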
rmbyrro 42 days ago [-]
If you currently have an SDK in any of the 5 major languages, or if your API is well documented in a structured way, it should be very easy to write an SDK in Python, Go, or anything LLMs know well.
cl42 43 days ago [-]
Main requirement is to programmatically send my chat logs. Not a big deal though, thanks!
farouqaldori 43 days ago [-]
Ah I see, got it. For now the API should work fine for that!
felix089 43 days ago [-]
Very happy to hear, please do reach out to us with any feedback or questions via founders@finetunedb.com
monkeydust 42 days ago [-]
If I wanted to tune a Llama model on, say, MermaidJS or PlantUML to improve performance beyond what they can do today, would this product be a good fit?
farouqaldori 42 days ago [-]
Yes for sure. But your mileage may vary, and it all depends on the quality of the dataset that you build up.
Happy to discuss this in detail, how do you measure performance?
EliBullockPapa 42 days ago [-]
Awesome but pricing seems a little high right now, and you're missing Gemini Flash, the very cheapest fine tunable model that I know of.
farouqaldori 42 days ago [-]
Which part of the pricing seems high, platform or token pricing? Both?
About Gemini Flash, we add new model providers entirely based on feedback. Gemini is next on the roadmap!
kouteiheika 42 days ago [-]
> Which part of the pricing seems high, platform or token pricing? Both?
You said that you do only LoRA finetuning and your pricing for Llama 3.1 8B is $2/1M tokens. To me this does seem high. I can do full finetuning (so not just a LoRA!) of Llama 3.1 8B for something like ~$0.2/M if I rent a 4090 on RunPod, and ~$0.1/M if I just get the cheapest 4090 I can find on the net.
farouqaldori 42 days ago [-]
That's true when looking solely at fine-tuning costs. In theory, you could fine-tune a model locally and only cover electricity expenses. However, we provide a complete end-to-end workflow that simplifies the entire process.
Once a model is fine-tuned, you can run inference on Llama 3.2 3B for as low as $0.12 per million tokens. This includes access to logging, evaluation, and continuous dataset improvement through collaboration, all without needing to set up GPUs or manage the surrounding infrastructure yourself.
Our primary goal is to provide the best dataset for your specific use case. If you decide to deploy elsewhere to reduce costs, you always have the option to download the model weights.
kouteiheika 41 days ago [-]
Sure, I'm just comparing the baseline costs of finetuning. Assuming you own the hardware and optimize the training I'm guessing you could easily get the costs significantly lower than $0.1/M tokens (considering I can get the $0.1/M right now using publicly rented GPUs, and whoever I'm renting the GPU from is still making money on me), and if you're only doing LoRA that cost would go down even further (don't have the numbers on hand because I never do LoRA finetuning, so I have no idea how much faster that is per token compared to full finetuning).
So your $2/M tokens for LoRA finetuning tells me that you either have a very (per dollar) inefficient finetuning pipeline (e.g. renting expensive GPUs from AWS) and need such a high price to make any money, or that you're charging ~20-30x more than it costs you. If it's the latter - fair enough, some people will pay a premium for all of the extra features! If it's the former - you might want to consider optimizing your pipeline to bring those costs down. (:
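For anyone wanting to redo that arithmetic: the $/M-token figure falls out of the GPU's hourly price divided by hourly token throughput. The throughput number below is an illustrative assumption, not a benchmark:

```python
def cost_per_million_tokens(gpu_usd_per_hour: float, tokens_per_second: float) -> float:
    """Training cost per 1M tokens for a rented GPU at a given throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000

# e.g. a rented 4090 at ~$0.35/hr pushing ~500 training tokens/sec
# lands near the ~$0.2/M figure quoted above:
print(round(cost_per_million_tokens(0.35, 500), 3))  # 0.194
```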
I_am_tiberius 43 days ago [-]
Looks nice. What is the price and what does it depend on?
felix089 43 days ago [-]
Thanks! We have a free tier with limited features. Our pro plan starts at €50 per seat per month and includes all features. Teams often collaborate with domain experts to create datasets, and for custom integrations we offer custom plans on request. More details here: https://docs.finetunedb.com/getting-started/pricing
Any specific features or use cases you're interested in?
martypitt 42 days ago [-]
Congrats on the launch - definitely interested.
Some minor feedback - I went to the website to look for pricing (scanned the header bar), and couldn't find it.
Didn't think to look in the docs, as it's almost always available from the homepage.
Appreciate you linking it here, but if I hadn't come from HN, I'd assume this is a "contact us for pricing" situation, which is a bit of a turnoff.
felix089 42 days ago [-]
Nice to hear, and agreed thanks for the feedback, our pricing page will be up shortly. If any other questions come up RE product, please reach out!
Aditya_Garg 42 days ago [-]
How do I get the 100 dollars in credit? Did a preliminary fine tune, but would like to rigorously test out your tech for my article.
felix089 42 days ago [-]
Thanks for giving it a try! The $100 is for the pro plan; by default you should have $10 in your account, but I'm happy to add more. Please email me your account email and I'll top it up: founders@finetunedb.com - Thanks!
jpnagel 42 days ago [-]
Looks nice, have been looking into fine-tuning for a while, will check it out!