Is Meta’s Code Llama a Game-Changer for AI-Driven Code Generation?

KEY TAKEAWAYS

Meta’s Code Llama lets software developers generate and explain code, helping them streamline day-to-day workflows and build next-generation applications.

The generative AI arms race has shown no signs of slowing down. Just weeks after introducing the open-source large language model (LLM) Llama 2, Meta announced the launch of Code Llama.

What is Code Llama?

Code Llama is a refined version of Llama 2, trained on a code-heavy dataset with 500 billion tokens of code and code-related data. It can generate code in multiple programming languages, including Python, Java, JavaScript, C#, and Bash.

As the announcement blog post notes, what sets Code Llama apart from Llama 2 is that it “is a code-specialized version of Llama 2 that was created by further training Llama 2 on its code-specific datasets, sampling more data from that same dataset for longer.”

The LLM is available for both research and commercial use and comes in three sizes: 7B, 13B, and 34B parameters.

What Can Code Llama Be Used For?

At a high level, Code Llama can be used not only to generate code but also to explain code in natural language. For example, a user can enter a prompt asking the model to write a function that outputs the Fibonacci sequence.
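To make the Fibonacci example concrete, here is a hand-written sketch of the kind of function such a prompt might produce. It is not actual Code Llama output, and the function name `fibonacci` and its parameter `n` are illustrative assumptions rather than anything taken from Meta's announcement.

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, starting from 0 and 1."""
    sequence = []
    a, b = 0, 1
    for _ in range(n):
        sequence.append(a)   # record the current number
        a, b = b, a + b      # advance to the next pair
    return sequence


print(fibonacci(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

A model asked to explain this code would be expected to describe the same steps in plain language: start with 0 and 1, then repeatedly add the last two numbers to get the next one.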

The ability of Code Llama to generate and explain code means that it can act as an educational tool for software developers, playing the role of a virtual copilot or coding assistant. This is particularly useful for newer developers who may need help identifying bugs and debugging code or seeing what existing code does.

However, Code Llama’s true utility lies in its ability to help create intelligent apps and websites. “Code Llama will be integral for next-generation intelligent apps that can understand natural language,” Adrien Treuille, director of product management and head of Streamlit at Snowflake, told Techopedia.

“A model like Code Llama can be used to power next-gen LLM-accelerated coding experiences, such as automatic code completion – where the model guesses what the engineer or analyst will type next – and copilot experiences – where the model is used in a chatbot to help engineers or analysts translate natural language business problems into code.”

In addition, Treuille highlights that Code Llama could also be embedded within business applications, giving them the ability to automatically generate and execute code snippets based on natural language prompts.

Let’s Talk Performance

So far, Code Llama has shown promise in terms of performance. Meta’s own research suggests that Code Llama achieves “state-of-the-art performance” among open models on multiple code benchmarks, achieving 53% on HumanEval and 55% on MBPP.

In addition, Code Llama not only outperforms Llama 2 on these benchmarks but also outperforms GPT-3.5 on both tests.
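Benchmarks such as HumanEval and MBPP score a model on functional correctness: each generated solution is run against unit tests, and the headline figure is the share of problems solved. The sketch below illustrates that idea only. The `pass_rate` helper and the three toy problems are hypothetical and are not part of either benchmark, and real harnesses run untrusted code in a sandbox rather than calling `exec` directly.

```python
def pass_rate(problems):
    """Return the fraction of problems whose candidate solution passes its tests.

    Each problem is a (candidate_source, test_source) pair of Python strings.
    """
    passed = 0
    for candidate_src, test_src in problems:
        namespace = {}
        try:
            exec(candidate_src, namespace)  # define the candidate function
            exec(test_src, namespace)       # run the assertions against it
            passed += 1
        except Exception:
            # Any error, including a failed assertion, counts as unsolved.
            pass
    return passed / len(problems)


# Hypothetical toy problems; the third candidate is deliberately buggy.
toy_problems = [
    ("def add(a, b):\n    return a + b", "assert add(2, 3) == 5"),
    ("def double(x):\n    return x + x", "assert double(4) == 8"),
    ("def is_even(n):\n    return n % 2 == 1", "assert is_even(2)"),
]

score = pass_rate(toy_problems)
print(f"{score:.1%}")  # 66.7%
```

Read this way, a score of 53% on HumanEval means the model's solution passed the tests for a little over half of the benchmark's problems.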

Similarly, an independent test conducted by Snowflake found that Code Llama outperforms Llama 2 models by 11-30% on text-to-SQL tasks. The study also found that it approaches GPT-4-level performance on these tasks, lagging behind by just 6 percentage points of accuracy.

While GPT-4 outperforms Code Llama on HumanEval out of the box, a study conducted by AI startup Phind found that fine-tuned versions of the Code Llama-34B and Code Llama-34B-Python models could outperform GPT-4 in this area.

In this exercise, researchers fine-tuned each model on a dataset of roughly 80,000 programming problems and solutions and found that the fine-tuned Code Llama-34B and Code Llama-34B-Python achieved 67.6% and 69.5% on HumanEval, respectively, compared to GPT-4’s 67%.

Deepening the Open Source Ecosystem

Above all, these studies indicate that the gap between proprietary and open-source LLMs is closing. With the right training data and fine-tuning, developers can use tools like Code Llama as a viable alternative to closed-source tools like GPT-4.

David Strauss, co-founder and CTO at WebOps provider Pantheon, told Techopedia:

“We’re seeing a standard – and encouraging – competitive landscape emerging. Leading implementations (GitHub Copilot, presumably built on OpenAI’s GPT) lean proprietary, while emerging entrants (Meta’s Llama) are pursuing a more standards-based, open strategy.”

The success of open-source tools is vital to democratizing AI development. If opaque black-box AI models are allowed to dominate the market, the advancement of the technology as a whole will stay siloed among a handful of gatekeeping providers.

Strauss added:

“This is good for this space as a whole because it means there isn’t a single player vulnerable to intellectual property uncertainty, nor can any player afford to stop improving. Engineers and companies can be more confident now when building AI tooling into their development process.”

Open-Source AI Can Compete

At this stage, Code Llama is a welcome improvement on Llama 2 in the world of code generation and unlocks some exciting new use cases for improving software development workflows.

Its early performance indicators show that open-source AI solutions are a force to be reckoned with and highlight that developers don’t have to rely on black-box LLMs to develop next-generation applications.

Tim Keary

Since January 2017 Tim Keary has been a freelance technology writer and reporter covering enterprise technology and cybersecurity.