AI: Are we there yet?

A quick overview of the current state of AI

AI is all hype...

...unless you dig deeper! While 2023 was the year of demos, with a lot of attention and funding ($50bn+) directed towards AI's infrastructure layer, 2024 promises to be the year we see AI in production as the application layer catches up. Enterprise adoption has lagged as companies struggle to balance the big four: accuracy, latency, cost, and privacy. It should hence be no surprise that most experiments at the enterprise level are still internal and not consumer-facing.

Most of the airtime has hence gone to techniques and tooling that make LLMs more predictable as reasoning engines, such as:

Prompt Engineering: We have advanced far beyond simplistic "You are an expert" prompting. Techniques like Chain of Thought and ReAct, which have an LLM reason rather than merely respond, have been a great foundation to build on. Which technique to use, and when, is use-case specific, so be cautious of locking into a fixed framework. For example, while Chain of Density prompting may work well for qualitative summarization, Tree of Thought prompting may be much better at problem solving. The only way to know is by building and iterating.
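To make the Chain of Thought idea concrete, here is a minimal sketch of how such a prompt is assembled: a worked exemplar plus a "think step by step" cue prepended to the user's question. The exemplar, the question, and the (omitted) model call are all illustrative stand-ins.

```python
# One hand-written exemplar showing the reasoning style we want the model to imitate.
COT_EXEMPLAR = (
    "Q: A cafe sold 23 coffees in the morning and 18 in the afternoon. "
    "How many in total?\n"
    "A: Let's think step by step. Morning sales were 23. Afternoon sales "
    "were 18. 23 + 18 = 41. The answer is 41.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the worked exemplar and a step-by-step cue to the question."""
    return f"{COT_EXEMPLAR}\nQ: {question}\nA: Let's think step by step."

prompt = build_cot_prompt(
    "If a train covers 60 km in 45 minutes, what is its speed in km/h?"
)
# The prompt would then be sent to whichever LLM client you use.
print(prompt)
```

The entire technique lives in the prompt string, which is why it composes cleanly with the RAG and fine-tuning approaches discussed below.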

RAG: ChatGPT's chat interface meant that the Q&A chatbot ("Chat with your data") was the most obvious use case. But limited context windows and the "Lost in the Middle" problem with LLMs meant Retrieval-Augmented Generation (RAG) took centerstage when it came to building LLM applications. LangChain and LlamaIndex have emerged as the leaders, especially with the vibrant open-source community contributing wholeheartedly. Vector DBs such as Pinecone, Chroma, Weaviate etc (too many vector DBs to keep track of at this point) have attracted significant capital to plug into RAG-enabled workflows. Retrieval techniques have become a huge research arena by themselves, aiming for accuracy while trying to mitigate cost and latency. Throw in some proprietary data and it quickly becomes obvious why so much energy is being spent on getting this right.
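The core RAG loop can be sketched end to end in a few lines: embed the documents and the query, retrieve the most similar documents, and stuff them into the prompt as context. Everything here is a toy stand-in — the bag-of-words "embedding" and the example documents replace a real embedding model and vector DB purely for illustration.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline uses an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Refunds are available within 30 days of purchase.",
    "The office is closed on public holidays.",
    "Refunds are processed to the original payment method.",
]
context = retrieve("how do refunds work", docs)
# Retrieved context is injected into the prompt to ground the answer.
prompt = ("Answer using only this context:\n"
          + "\n".join(context)
          + "\nQ: how do refunds work")
```

Swapping the toy pieces for a real embedding model and a vector store is exactly what frameworks like LangChain and LlamaIndex package up.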

Advanced retrieval techniques such as query expansion, cross-encoder reranking, training embedding adapters etc continue to help automate workflows where 100% accuracy is not a hard threshold. It is worth qualifying, though, that RAG's biggest challenges still lie in refreshing stale data and maintaining course in agentic workflows - a solution to look out for in 2024.
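Of these, query expansion is the easiest to sketch: retrieve with several paraphrases of the user query and merge the results, so documents that use different wording still surface. The paraphrase table and keyword matcher below are hypothetical stand-ins — in practice an LLM generates the paraphrases and a vector store does the retrieval.

```python
def expand_query(query: str) -> list[str]:
    # Hand-written expansions for illustration only; a real system would
    # ask an LLM for paraphrases of the query.
    expansions = {"reset password": ["forgot password", "update sign-in details"]}
    return [query] + expansions.get(query, [])

def retrieve_one(query: str, docs: list[str]) -> list[str]:
    # Naive keyword match standing in for a vector search.
    terms = query.lower().split()
    return [d for d in docs if any(t in d.lower() for t in terms)]

def retrieve_expanded(query: str, docs: list[str]) -> list[str]:
    # Union of results across all paraphrases, deduplicated, order preserved.
    seen: set[str] = set()
    merged: list[str] = []
    for q in expand_query(query):
        for d in retrieve_one(q, docs):
            if d not in seen:
                seen.add(d)
                merged.append(d)
    return merged

docs = [
    "To reset your password, open Settings.",
    "If you forgot your sign-in details, use the recovery link.",
    "Billing questions go to support.",
]
```

Here the base query alone misses the second document, while the expanded search recovers it — the same effect that makes expansion worthwhile against real corpora.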

Fine-Tuning for Precision: Although OpenAI offers fine-tuning capabilities, fine-tuning open-source foundation models is, in my personal experience, the most efficient way to boost output reliability while maintaining privacy. The Hugging Face Open LLM Leaderboard is a good place to find a strong open-source model for your use case - Mixtral-8x7B-v0.1, for example, outperforms Llama 2 70B. Techniques such as LoRA are sufficient most of the time and can lead to significant gains in accuracy, cost, and latency. If accuracy is important enough to trade off latency and cost, think about combining the fine-tuned model with advanced prompt engineering and RAG techniques. The open-source model fine-tuned on our schema, for example, functions at ~95% accuracy at 10% of the cost of a tree-based approach, while running locally and retrieving 10x faster.
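Why is LoRA so cheap? Instead of updating a full weight matrix W, it trains two small matrices B (d x r) and A (r x d) with rank r much smaller than d, and adds their product as a low-rank update. The tiny dimensions and values below are made up purely to show the arithmetic; real adapters sit inside a transformer via a library such as Hugging Face's peft.

```python
def matmul(X, Y):
    # Plain-Python matrix multiply, enough for this toy illustration.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

d, r = 4, 1  # full dimension vs. adapter rank (r << d)
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base weights
B = [[0.5], [0.0], [0.0], [0.0]]   # d x r, trained
A = [[0.0, 1.0, 0.0, 0.0]]        # r x d, trained
delta = matmul(B, A)              # rank-r update, d x d
W_eff = [[w + dw for w, dw in zip(wr, dr)] for wr, dr in zip(W, delta)]
# Trainable parameters: 2*d*r = 8 instead of d*d = 16; at transformer
# scale (e.g. r=8 against d=4096) the savings are dramatic.
```

Because only B and A are trained and the base weights stay frozen, the adapter is small enough to store, swap, and serve per use case — which is what makes local, private fine-tunes practical.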

A Look Ahead: There's much to be excited about, and each of these deserves more than a mere mention. Workflow automation via multi-agent applications built on frameworks such as AutoGen is not too distant. The multimodal era is upon us: GPT-4V, Gemini Ultra, and Adept's Fuyu-Heavy are just a few names making waves with their groundbreaking demos, opening new avenues for creative storytelling. GPT-5 is speculatively in training - expect some fireworks there. I am particularly excited about the innovations coming out of the open-source community, especially the imminent launch of Llama 3 by Meta.

2024 stands as a year of immense potential. It's not just about what AI can do; it's about how we can harness it in meaningful, productive ways. I am beyond excited about the opportunities that come with it. Your thoughts?

#AI #MachineLearning #OpenSource #MultimodalModels #FutureOfAI #TechTrends2024
