Beyond Copilot
I have a confession to make: I haven’t even bothered to try Copilot. Many of my colleagues claim they can’t live without it. It’s integrated into their VSCode, and I hear it’s great. Meanwhile, I’m using Neovim and plain ChatGPT with GPT-4 in the browser. I’m not accustomed to predictive text or code suggestions; I’d rather write the answer myself. At NorthCode, we specialize in test automation and platform engineering. My experience is primarily in the latter, which means I’m often writing small pieces of software that integrate various platform components into an internal development platform.
ChatGPT is excellent for generating boilerplate code, but when it comes to creating something entirely new, the hallucinations start sooner or later and its usefulness drops. Personally, I use LLMs for boilerplate, especially when experimenting with a new framework or language. To some extent, ChatGPT also acts as a decent pair programmer: I figure out the hard parts and ChatGPT handles the boring ones.
So, what other options are there for platform engineers besides Copilot? Previously, I discussed how a service mesh built with Istio lets you finely control which services can communicate with each other. Often, these traffic restrictions come from company policies; they might even be regulated, perhaps increasingly so under the EU's NIS2 regulation. As development progresses and developers are overwhelmed with code reviews, how can we ease their burden when they have to review diverse aspects of the software environment, such as:
Infrastructure
Application code
Application containers
Database interactions such as SQL statements
Security updates
Each of these has its own domain or programming language, increasing the cognitive load on developers. This can eventually slow down development cycles due to the sheer amount of information and new concepts to absorb.
At NorthCode, we are building internal development platforms for our customers. Our team's extensive expertise in software development gives us a deep understanding of what should be considered when building an IDP. Recently, we’ve started incorporating LLMs into our approach. We truly believe that using LLMs in the right places can significantly reduce the cognitive load on development teams, resulting in happier developers who achieve more with less effort.
Reviewing Authorization Policies of a Service Mesh
One of our customers is currently adopting Istio as the service mesh for their platform. Istio includes a feature called AuthorizationPolicy, which defines the ingress and egress traffic policies for individual services. For instance, an AuthorizationPolicy could state that no egress traffic is allowed from an application: not towards the Internet, nor to other internal services. You can think of an AuthorizationPolicy as a modern-day firewall, except that instead of plain, vendor-specific firewall statements, you write YAML, which the service mesh then translates into firewall rules.
Ideally, you have a company policy in place that looks something like the following. By the way, with the upcoming NIS2 regulation, such policies will soon be mandatory for some organizations.
Applications are not allowed to communicate freely with the Internet; instead, a specific whitelisting process is in place.
Backend applications can access their respective databases, but not others (a sketch of such a policy follows this list).
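As a concrete illustration of the second rule, here is a minimal sketch of an Istio AuthorizationPolicy that only lets one backend reach its database, and only on the database port. The namespace, labels, and service-account names are hypothetical, picked just for this example.

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: orders-db-ingress
  namespace: databases
spec:
  # Applies to the sidecar of the (hypothetical) orders database workload
  selector:
    matchLabels:
      app: orders-db
  action: ALLOW
  rules:
  - from:
    - source:
        # Only the orders backend's service account may connect...
        principals: ["cluster.local/ns/orders/sa/orders-backend"]
    to:
    - operation:
        # ...and only on the PostgreSQL port
        ports: ["5432"]

Once an ALLOW policy like this exists for a workload, any connection that matches no rule is rejected, so every other workload is locked out of this database.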
But how do you ensure that application developers adhere to these policies? Common issues with code reviews include:
It can take weeks(!) for developers to receive feedback on their changes. By then, the author has lost context and must switch back into it, which hurts cognitive performance.
The review process can slow down development, leading to merge conflicts that consume valuable time. Code that isn’t merged isn’t in production, and only code that is in production creates value for your customers.
LLMs excel at interpreting and following rules and statements, so let's leverage the capabilities of OpenAI GPT-4 with this somewhat provocative AuthorizationPolicy example that allows all traffic in and out:
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-all
  namespace: common-api
spec:
  action: ALLOW
  rules:
  - {} # a single empty rule matches every request, making this a true allow-all
The LLM is given the following system prompt:
You are tasked with reviewing AuthorizationPolicies of Istio. Here are the rules:
Policies must specify rules in detail. Do not allow 'all in' or 'all out' policies.
Port numbers must always be specified.
And with a bit of Python code, we can surface the LLM’s output directly in the pull request that changes an application’s AuthorizationPolicy:
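Something along these lines can run as part of CI. This is a minimal sketch, not our exact implementation: the file path, environment variable names, and GitHub API usage are assumptions made for illustration.

# A minimal sketch of the review step, assuming it runs in CI with
# OPENAI_API_KEY, GITHUB_TOKEN, GITHUB_REPOSITORY, and PR_NUMBER set in
# the environment; the names here are illustrative, not the exact ones
# from our implementation.
import os

import requests
from openai import OpenAI

SYSTEM_PROMPT = (
    "You are tasked with reviewing AuthorizationPolicies of Istio. "
    "Here are the rules:\n"
    "- Policies must specify rules in detail. "
    "Do not allow 'all in' or 'all out' policies.\n"
    "- Port numbers must always be specified."
)

client = OpenAI()  # picks up OPENAI_API_KEY from the environment


def review_policy(policy_yaml: str) -> str:
    """Ask GPT-4 to review a single AuthorizationPolicy manifest."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": policy_yaml},
        ],
    )
    return response.choices[0].message.content


def post_pr_comment(body: str) -> None:
    """Attach the review as a comment on the pull request via the GitHub API."""
    repo = os.environ["GITHUB_REPOSITORY"]  # e.g. "acme/platform"
    pr_number = os.environ["PR_NUMBER"]
    url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"
    headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
    response = requests.post(url, json={"body": body}, headers=headers, timeout=30)
    response.raise_for_status()


if __name__ == "__main__":
    with open("authorization-policy.yaml") as f:  # path is illustrative
        post_pr_comment(review_policy(f.read()))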
The feedback command /wrong is recorded into a simple SQLite database along with the prompt, result, and feedback. This way, whoever maintains the prompt can tune it to be more effective over time, while developers keep getting immediate feedback on their work.
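For completeness, here is a minimal sketch of what that recording could look like; the table schema and names are illustrative, not our actual ones.

# A sketch of storing /wrong feedback, assuming the bot calls
# record_feedback() whenever a developer reacts to a review.
import sqlite3
from datetime import datetime, timezone


def record_feedback(db_path: str, prompt: str, result: str, feedback: str) -> None:
    """Store one review round so the system prompt can be tuned later."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            """CREATE TABLE IF NOT EXISTS review_feedback (
                   created_at TEXT,
                   prompt TEXT,
                   result TEXT,
                   feedback TEXT
               )"""
        )
        conn.execute(
            "INSERT INTO review_feedback VALUES (?, ?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), prompt, result, feedback),
        )


# For example, called by the bot when a developer comments /wrong:
# record_feedback("feedback.db", SYSTEM_PROMPT, review_text, "/wrong")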
Even though LLMs are great for this, don’t rely solely on their feedback to block merges. LLMs can sometimes be inaccurate, and a human override should always be in place. LLMs can’t truly reason, and current AI technology isn’t yet ready to make decisions on behalf of humans.
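One practical way to honor that is to run the review as a non-blocking CI step, so the comment appears but never gates the merge. Here is a minimal sketch as a GitHub Actions step, assuming the script above is checked in as review_policy.py (a hypothetical name):

- name: LLM policy review
  run: python review_policy.py
  # Post the review comment, but never fail the build; humans keep the final say
  continue-on-error: true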
This is just the beginning. We’re also working on other projects, like automated security reporting based on vulnerability scanning. LLMs should be able to determine whether a specific vulnerability can be exploited in your environment. Stay tuned for more AI+Platform engineering content!