Summary
Open recently unveiled a slew of enhancements to their API, including a real-time API, speech-to-speech systems, and chat completion API with audio responses and call actions. Noteworthy features also include Vision fine-tuning for images and text for improved Vision applications in areas like food delivery and web element identification. Additionally, collaborations with Google and Entropic have resulted in prompt caching improvements for longer API calls, model distillation for more efficient models, and a focus on generating structured outputs with proper masking to ensure accurate AI predictions in Json format.
Realtime API Improvements
Open announced significant improvements to their API, including realtime API, speech to speech systems, and chat completion API with audio responses and call actions.
Vision Fine Tuning
Open introduced Vision fine tuning for images and text, allowing for direct tuning of models to improve Vision applications with examples from food delivery and web element identification.
Prompt Caching
Google and Entropic announced improvements in prompt caching for API calls longer than 10,24 tokens, optimizing responses and benefiting from prompt caching features.
Model Distillation
Implementing model distillation, Open is working on efficient models by distilling outputs from larger models, enabling training with a reduced cost.
Structured Output
Discussing structured output, Open explained the process of generating structured outputs and the importance of masking to ensure AI predicts proper responses in Json format.
Get your own AI Agent Today
Thousands of businesses worldwide are using Chaindesk Generative
AI platform.
Don't get left behind - start building your
own custom AI chatbot now!