Category: Company Updates

  • Here’s your 50% off vibe coding, forever!


We are happiest when our R&D leads to innovations that help users cut token costs by 50%. Here’s how we did it.

    Say hello to building your entire application at almost half the cost of Cursor, Windsurf or Antigravity. This is not a marketing exaggeration. It comes from two engineering breakthroughs we created inside VibeStudio after months of practical work with real developers and real-world constraints. The first breakthrough is something we call Hive Mode inside the VibeStudio Agentic IDE. The second is our VS Code Extension that carries the same intelligence directly into the most widely used editor on the planet.

    Hive Mode started as an internal experiment. We wanted to understand whether an IDE could become smart enough to pick the right model for each phase of a development workflow instead of blindly sending everything to one expensive frontier model. We observed that most coding tasks do not need the heaviest models. Many need a fast drafting model. Some need a refactoring model. Some need a model that understands your codebase intimately because it has already been running locally. Instead of putting the burden on developers to switch models manually, Hive Mode does this behind the scenes in a systematic and predictable manner. When the developer wants to override the model decisions, the IDE gives full control. When the developer wants to automate everything, the IDE handles it end to end. The result is a natural reduction in cost without a reduction in quality.
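As a rough sketch of what per-task model routing of this kind can look like: the routing table, model names, and task labels below are our own illustrative assumptions, not VibeStudio’s actual configuration.

```python
# Hypothetical sketch of per-task model routing in the spirit of Hive Mode.
# The TASK_ROUTES table and model names are illustrative assumptions only.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Route:
    model: str   # which model serves this phase of the workflow
    local: bool  # whether it can run on the developer's machine

# Cheap, task-specific models handle most phases; the expensive
# frontier model is reserved for the hardest requests.
TASK_ROUTES = {
    "draft":    Route(model="local-draft-7b", local=True),
    "refactor": Route(model="refactor-13b",   local=True),
    "explain":  Route(model="local-draft-7b", local=True),
    "hard":     Route(model="frontier-large", local=False),
}

def route(task: str, override: Optional[str] = None) -> Route:
    """Pick a model for a task; an explicit developer override always wins."""
    if override is not None:
        return Route(model=override, local=False)
    return TASK_ROUTES.get(task, TASK_ROUTES["hard"])
```

The key design point is the escape hatch: automation by default, but an explicit `override` always takes precedence, matching the "full control when you want it" behaviour described above.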

    Alongside this, we built the VS Code Extension so that the same capability is available without asking developers to shift environments. Developers stay inside VS Code, but enjoy the full model routing and cost savings of Hive Mode. Local models, remote models, drafting models, task-specific models, and full frontier models work together inside a single workflow. The orchestration is invisible, but the savings are not. It cuts cloud API usage heavily and takes you closer to on-prem and local-first engineering.

There is another innovation that significantly pushes the cost down. We call it ISD, Intelligent Speculative Decoding. This runs on our server infrastructure and accelerates responses while pushing efficiency even further. ISD works exceptionally well with the fine-tuned models we have been building and pruning in-house. The open-source community has already acknowledged this. The Hugging Face community and the r/LocalLlama community celebrated our pruned and fine-tuned models because they hit a rare sweet spot: aggressively pruned while staying at frontier-level quality. We use these models as drafting models in Hive Mode. They do the heavy lifting early in the workflow before handing off to the full models only when absolutely required. This alone cuts a massive portion of the cost.
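ISD’s internals aren’t described here, but the general technique it builds on, speculative decoding, can be sketched in a few lines: a small draft model proposes several tokens at once, and the large target model verifies them in a single pass, accepting the longest agreeing prefix. The "models" below are stand-in deterministic functions over integer tokens, purely for illustration.

```python
# Toy illustration of speculative decoding. The draft and target "models"
# are stand-in functions, not real LLMs; they follow the same next-token
# rule, so the draft's proposals are usually accepted.

def draft_propose(context, k):
    """Stand-in draft model: propose k tokens with a cheap deterministic rule."""
    out, cur = [], context[-1]
    for _ in range(k):
        cur = (cur + 1) % 100
        out.append(cur)
    return out

def target_next(context):
    """Stand-in target model: the expensive model's next token."""
    return (context[-1] + 1) % 100

def speculative_step(context, k=4):
    """One round: propose k tokens, verify, accept the agreeing prefix."""
    proposal = draft_propose(context, k)
    accepted = []
    for tok in proposal:
        if target_next(context + accepted) == tok:
            accepted.append(tok)
        else:
            break  # first disagreement ends the accepted prefix
    if not accepted:
        # Whole proposal rejected: fall back to one target-model token.
        accepted.append(target_next(context))
    return accepted
```

When the draft model agrees with the target often, as a well-pruned, well-fine-tuned draft model does, each expensive verification pass yields several tokens instead of one, which is where the speed and cost savings come from.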

For B2B teams, VibeStudio can be deployed on-premises within ten days. It is free for the first six months for unlimited users so that teams can evaluate without pressure. For freelancers, we have a 50%-discounted plan. The aim is simple: high-quality engineering should not be a luxury product. It should be accessible, fast, local when needed, and cost-efficient from day one.

    Say hi to info@vibestud.io

  • Honey we shrunk MiniMax M2!


We’ve become somewhat of a friendly neighborhood AI lab around Hugging Face and Ollama.

    150k+ downloads and counting.

Initially we were only fine-tuning SOTA models with our meagre compute, with the intention of familiarizing ourselves with the latest AI architectures and recipes. We did not anticipate that there would suddenly be demand for our models. We did what we could and gave back to the community from which we have benefited so much.

Then we picked up a REAP-pruned model by Cerebras. We were blown away by what they had done: they had gone far beyond the EAN paper they cited as their inspiration. We wanted to test the waters ourselves, and we didn’t want to add to the noise, so we chose a model no one had pruned yet.

We were given compute by our incubator ITEL (Immersive Technology & Entrepreneurship Labs), founded by Padma Shri Ashok Jhunjhunwala with Reema as CEO: an 8×H200 cluster from E2E, India’s most affordable AI data center. (E2E is publicly listed, and its stock price is worth a look.)

We improved upon the EAN and REAP pruning methods by adding our own research methodology, one that actually grew out of frugality: we had only one cluster and could not run parallel experiments, so every iteration had to count. We called it THRIFT. More technical details are in the whitepaper, which you can download here.
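To give a feel for the family of methods involved (THRIFT’s actual criteria are in the whitepaper): router-guided expert pruning for a mixture-of-experts layer can be sketched as scoring each expert by how much routing mass it receives on calibration data, then dropping the lowest-scoring experts. Everything below, including the numbers, is a generic illustration, not our method.

```python
# Generic sketch of expert pruning for a mixture-of-experts layer:
# score each expert by accumulated router weight on calibration tokens,
# then keep only the top-scoring fraction. Illustrative only; THRIFT's
# actual methodology is described in the whitepaper.
import numpy as np

def prune_experts(router_weights: np.ndarray, keep_fraction: float) -> np.ndarray:
    """router_weights: (num_tokens, num_experts) routing probabilities
    collected on calibration data. Returns sorted indices of kept experts."""
    scores = router_weights.sum(axis=0)            # total routed mass per expert
    num_keep = max(1, int(round(scores.size * keep_fraction)))
    keep = np.argsort(scores)[::-1][:num_keep]     # top experts by score
    return np.sort(keep)
```

A 55% pruned model in this framing would correspond to a `keep_fraction` of roughly 0.45, with the pruned checkpoint then fine-tuned to recover quality.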

    What did the community say?

When it comes to launching AI models, there are two places that serve as a litmus test of whether a model is innovative or slop: Hugging Face and r/LocalLlama. Thankfully, we started getting a whole lot of downloads, plus quants and fine-tunes made organically by the community, which in turn got thousands of downloads.

A researcher with the username “onil_gova”, who has absolutely no connection with us, posted on r/LocalLlama about what he was able to build with our THRIFT model running on a single MacBook Pro.

Furthermore, Cerebras themselves released REAP-pruned models of MiniMax M2. So we dug in deeper and released a 55% pruned model, which has now become the local state-of-the-art champion, to the extent that in an AMA with the CEO and developer evangelist of MiniMax M2, a member asked them:

Their developer evangelist went on to endorse VibeStudio’s THRIFT-pruned model for anyone unable to run the original, as it is just as good. With this we thought we’d hit the highest number of air guitars we would perform that day, but the best was yet to come.

This was followed by news media coverage:

    https://yourstory.com/ai-story/vibestudio-major-efficiency-breakthrough-open-source-llm-thrift-minimax-m2

    https://www.thehindubusinessline.com/info-tech/itel-backed-vibestudio-develops-technology-to-trim-size-of-llms-with-same-reasoning-levels/article70325028.ece

    https://cxotoday.com/press-release/vibestudio-announces-indias-first-enterprise-grade-pruned-ai-coding-model

The Office of the Principal Scientific Adviser to the Government of India posted about us, recognising our breakthrough at the national level.

Alright, that’s all good, but how does this factor into the VibeStudio growth story? We have a local-first thesis for our agentic suite, and making portable state-of-the-art LLMs is how we get there. Even when we must use an external API, our built-in inference engine uses a smaller local model for drafting, thereby saving heaps for our customers. And our enterprise customers who wanted air-gapped SOTA performance were served perfectly.

If you’d like to use VibeStudio as a freelancer, or with your team at your company, please drop us a line at info@vibestud.io