Software developers typically rely on tools like IDEs to develop secure software (e.g., to detect bugs or to get suggestions for a secure way to accomplish a given task).
With the recent advances in AI and the availability of large bodies of software data on platforms like GitHub, it is natural to wonder whether these can be adapted to aid software development. With the emergence of the code language models from Meta and Microsoft and of GitHub's Copilot, it is clear that this direction is already being explored quite actively.
Such approaches have shown remarkable progress in both code completion and bug detection. That said, these language models treat code primarily as text, ignoring much of the semantic information encoded in the programming language and the domain. Ignoring the semantics of code results in insecure code suggestions that are dangerous at best and catastrophic at worst.
However, much of this code and domain semantics is available in alternate representations of programs, such as control flow graphs and, more recently, knowledge graphs that link code with natural language descriptions of its behavior.
In this talk, we will explore a vision for the future of code language models that integrate semantic awareness from knowledge graphs and software tools with AI-driven language models trained on code.