OctoML Launches OctoAI, a Self-Optimizing Compute Service for Artificial Intelligence

In 2019, OctoML launched with a primary focus on optimizing machine learning (ML) models. Since then, the company has added features that make it easier to deploy ML models and raised $132 million. Today, OctoML is launching the latest iteration of its service, which shifts the company’s emphasis from optimizing models to helping businesses either use existing open-source models and fine-tune them with their own data, or host their own custom models.

Introducing OctoAI

The new OctoML platform, dubbed OctoAI, is a self-optimizing compute service for AI. It helps businesses build ML-based applications and put them into production without having to worry about the underlying infrastructure. "The previous platform was focused on ML engineers and optimizing and packaging the models into containers that could be deployed across different sets of hardware," said OctoML co-founder and CEO Luis Ceze. "We learned a ton from that, but the next natural evolution is to have a fully managed compute service that abstracts all of that [ML infrastructure] away."

How OctoAI Works

With OctoAI, users simply decide what they want to prioritize (e.g., latency vs. cost) and the service will automatically choose the right hardware for them. The service will also automatically optimize these models (leading to additional cost savings and performance gains) and decide whether it’s best to run them on Nvidia GPUs or AWS’s Inferentia machines. This takes away a lot of the complexity of putting models into production, something that is still often a roadblock for many ML projects.
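As a rough illustration of the trade-off described above, the following Python sketch picks a hardware target from a stated priority. It is not OctoAI's API or its selection logic: the `HardwareOption` type, the `choose_hardware` function, and every latency and price figure are hypothetical placeholders invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareOption:
    """A hypothetical hardware target with placeholder characteristics."""
    name: str
    latency_ms: float     # placeholder value, not a measurement
    cost_per_hour: float  # placeholder value, not a real price


# Hypothetical candidates; the numbers are invented for illustration only.
CANDIDATES = [
    HardwareOption("nvidia-gpu", latency_ms=40.0, cost_per_hour=3.00),
    HardwareOption("aws-inferentia", latency_ms=65.0, cost_per_hour=1.20),
]


def choose_hardware(priority, candidates=CANDIDATES):
    """Return the candidate that best matches the user's stated priority."""
    if priority == "latency":
        # Lowest latency wins, regardless of price.
        return min(candidates, key=lambda c: c.latency_ms)
    if priority == "cost":
        # Lowest hourly cost wins, regardless of latency.
        return min(candidates, key=lambda c: c.cost_per_hour)
    raise ValueError(f"unknown priority: {priority!r}")


if __name__ == "__main__":
    for priority in ("latency", "cost"):
        print(priority, "->", choose_hardware(priority).name)
```

A managed service would presumably weigh far more signals than two static numbers, such as model architecture, batch size, and current availability. The sketch only shows the basic idea of mapping a stated priority to a hardware choice.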

Accelerated Models

OctoML offers accelerated versions of popular foundation models like Dolly 2, Whisper, FILM, FLAN-UL2, and Stable Diffusion out of the box. In the case of Stable Diffusion, OctoML was able to make the model run three times faster and cut the cost by a factor of five compared to running the vanilla model.

Focus on Compute Service

It’s worth noting that while OctoML will continue to work with existing customers who only want to use the service for optimizing their models, the company’s focus going forward will be on this new compute platform. "Most users will opt to let OctoAI manage all of this for them," said Ceze.

Impact on Businesses

The shift in focus from model optimization to a compute service has significant implications for businesses. Because the service abstracts away the underlying infrastructure, businesses can focus on building ML-based applications without worrying about the technical details. This could lead to increased adoption of AI and ML technologies across various industries.

Conclusion

OctoML’s new compute platform, OctoAI, is a significant step forward in making AI and ML more accessible to businesses. By providing a self-optimizing service that abstracts away the underlying infrastructure, OctoML has created a powerful tool for building ML-based applications. With its focus on generative AI and accelerated models, OctoML is well-positioned to become a leader in the AI compute space.

Benefits of OctoAI

  • Simplifies the process of putting models into production
  • Automates optimization and hardware selection
  • Offers accelerated versions of popular foundation models
  • Focuses on generative AI and accelerated models

What’s Next for OctoML?

OctoML will continue to work with existing customers who want to use the service for optimizing their models. However, the company’s focus going forward will be on its new compute platform, OctoAI.

About the Author

Frederic Lardinois is an Editor at TechCrunch, covering the intersection of technology and business. He has been covering the tech industry for over a decade and has written about topics ranging from artificial intelligence to e-commerce.
