Meta has released yet another sort-of-open machine learning model, this time tuned for generating software source code.
Code Llama is a family of large language models – hence the occasional capitalization “LLaMA” – based on the Llama 2 model released in July. It has been fine-tuned to generate and discuss source code in response to text prompts, rather than prose like its progenitor.
As with all cutting edge technology, Code Llama comes with risks
“Code Llama has the potential to be used as a productivity and educational tool to help programmers write more robust, well-documented software,” Meta claimed in an announcement Thursday.
If you ask Code Llama to write a function that produces the Fibonacci sequence, the model will generate both code and natural language explaining the source, Meta says. And the AI model can do so in Python, C++, Java, PHP, TypeScript (and JavaScript), C#, Bash, and other languages.
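For the curious, the sort of answer Meta describes might look like this minimal Python sketch – our illustration of the task, not actual model output:

```python
def fibonacci(n):
    """Return the first n numbers of the Fibonacci sequence."""
    sequence = []
    a, b = 0, 1
    for _ in range(n):
        sequence.append(a)   # collect the current term
        a, b = b, a + b      # advance the pair to the next term
    return sequence

print(fibonacci(8))  # [0, 1, 1, 2, 3, 5, 8, 13]
```

A code LLM would typically accompany such a snippet with a prose explanation of the loop and the variable swap.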
Users, however, are directed to address Code Llama in English as the model hasn’t been put through safety testing in other languages and might just say something awful if queried in an out-of-scope language.
“As with all cutting edge technology, Code Llama comes with risks,” Meta explains, noting that during its own red-team testing to solicit the creation of malicious code, Code Llama responded with safer answers than did ChatGPT (GPT-3.5 Turbo).
According to Meta, Code Llama outperforms open-source, code-specific LLMs and its own parent Llama 2 on two benchmarks – HumanEval and Mostly Basic Python Programming (MBPP) – and matches the performance of OpenAI’s ChatGPT.
Code Llama comes in three sizes – 7B, 13B and 34B parameters – and each variant was trained with 500B tokens of code and code-related data. One token is roughly four characters in English. The largest version of OpenAI’s Codex, when it was released, had 12B parameters.
The two smallest Code Llama models, Meta says, have been trained to fill in missing source code, which allows them to be used for code completion without further fine-tuning. The 34B version is said to provide the best results, but the smaller two respond faster, making them better suited to tasks like code completion, where latency is noticeable.
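Fill-in-the-middle models of this kind are typically prompted by wrapping the code before and after the gap in special markers, with the model generating the middle. A rough sketch of assembling such a prompt – the marker strings here are illustrative placeholders, not Code Llama's exact special tokens:

```python
def build_infill_prompt(prefix, suffix):
    # Interleave the surrounding code with prefix/suffix/middle markers;
    # the model is then asked to generate the text that belongs at <MID>.
    # These marker names are illustrative, not Code Llama's actual tokens.
    return f"<PRE>{prefix}<SUF>{suffix}<MID>"

prompt = build_infill_prompt(
    "def add(a, b):\n    ",   # code before the cursor
    "\n    return result",    # code after the cursor
)
```

An editor plugin would feed such a prompt to the model and splice the completion back in at the cursor, which is why low latency matters more than raw quality for this use case.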
There are also two variants: Code Llama – Python, and Code Llama – Instruct. The former comes from fine-tuning Code Llama with an extra 100B tokens of Python code. The latter has been fine-tuned to adhere to input and output patterns, making it better suited for code generation.
Reliability, anyone?
LLMs often provide incorrect answers to programming prompts, though they’re nonetheless used by many developers for recalling rote patterns and API parameters, or avoiding search queries and documentation checks.
One of the selling points of Code Llama is that it can handle input and output of code sequences that consist of up to 100,000 tokens. That is to say, you can prompt the model with many lines of code and you may get a verbose response.
“Aside from being a prerequisite for generating longer programs, having longer input sequences unlocks exciting new use cases for a code LLM,” Meta explained. “For example, users can provide the model with more context from their codebase to make the generations more relevant. It also helps in debugging scenarios in larger codebases, where staying on top of all code related to a concrete issue can be challenging for developers.”
Users can provide the model with more context from their codebase to make the generations more relevant
Code Llama joins a growing field of code-conversant models initially seeded by OpenAI’s Codex and GitHub’s associated litigation-encumbered Copilot (2021) programming suggestion service. Programming-positive models that followed include DeepMind’s AlphaCode (2022), OpenAI’s GPT-4 (2023), Amazon CodeWhisperer (2023), and Google’s Bard (2023), tuned in April to generate source code.
In addition, there have been various open source (or sort of open) LLMs like StarCoder and XGen, to name two.
Meta has released Code Llama under the same community license as Llama 2, citing the mega-corporation’s belief in “an open approach to AI” as the best way to develop tools that are innovative, safe, and responsible.
But as was widely noted with Llama 2, the community license is not an open source license. Meta’s “open approach” to AI is closed to competition – the license explicitly disallows using the software “to improve any other large language model.”
And while Meta’s community license permits commercial use of its various llamas, it draws the line at services with “greater than 700 million monthly active users.”
That rather select group of mega-services – YouTube, WeChat, TikTok, LinkedIn, Telegram, Snapchat, and Douyin, among social media platforms not already run by Meta, and presumably companies running operating system-based platforms like Apple, Google, and Microsoft – “must request a license from Meta, which Meta may grant to you in its sole discretion…” ®