No one else has created and deployed a self-learning mathematical model that approximates semantic understanding of code. Semantically, computer programs are similar to natural language: different developers can express the same intent in different ways, just as two sentences can convey the same meaning with different structures and terminology. Our process produces a canonical description of the program’s behavior as a collection of facts, together with overlays of nuance and hints that allow us to reproduce the original program verbatim. Machine translation of computer programs is then a matter of recomposing the canonical form in the new language, applying the hints and nuances as needed, to produce the new result.
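As a greatly simplified illustration of what a canonical form can look like, the toy pass below alpha-renames identifiers so that two structurally identical functions, written by different developers with different vocabularies, reduce to the same canonical fingerprint. This is a sketch of the general idea only, not our engine; a real decomposition also normalizes control flow, data flow, and far more.

```python
import ast

class Canonicalize(ast.NodeTransformer):
    """Toy canonical pass: alpha-rename functions, arguments, and
    variables in order of first appearance."""
    def __init__(self):
        self.names = {}

    def _rename(self, name):
        # Same structural position -> same canonical name (v0, v1, ...).
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = self._rename(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._rename(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._rename(node.id)
        return node

def canonical_form(src):
    """Reduce source text to a name-independent structural fingerprint."""
    return ast.dump(Canonicalize().visit(ast.parse(src)))

# Two developers, same intent, different vocabulary:
a = "def total(items):\n    acc = 0\n    for x in items:\n        acc += x\n    return acc"
b = "def summe(werte):\n    s = 0\n    for w in werte:\n        s += w\n    return s"
assert canonical_form(a) == canonical_form(b)
```

Here the surviving "hint" is purely the choice of names; everything the program actually does lives in the canonical structure.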
In most cases, the programmatic flaws are found in the hints themselves. The developer used the wrong word, so to speak, and the resulting code was therefore defective. It is the examination and removal of those hints that corrects the resulting program. The canonical version of the program, in whichever language it is ultimately composed, is generally the most correct version: it is the composition of general industry best practices at one level, overlaid with the specifics of a given business process. It is similar to using a legal template and filling in the blanks, but with the added benefit of years of case studies that tell us where a particular template is and is not appropriate. Furthermore, to take the analogy to its conclusion, we operate at scales as small as a single word and as large as a library. We can decompose to Intent at any scale.
With all that said, we have a tremendous edge over the natural language processing space. Unlike natural-language input, we are always presented with reasonably well-defined thoughts in the computer programming space. For the most part, the programs we encounter are actively producing value for someone in some capacity, and they pass all correctness checks imposed by the coding language. Even machine code, where no source code is available, is a candidate for this process. This makes computer programs ideal candidates for our decomposition process: we know a priori that each program means something, and our process is tasked with determining what that something is. Fortunately, the domain of natural language, as it pertains to computer programming, reduces to a comparatively small set of equations, all of which we understand very well.
We absolutely do use compiler technology in our process. It is used in several phases, particularly the “decomposition” phase, where we determine canonically what the developer has expressed. A compiler must do the same thing in order to generate machine code; the difference is that the compiler, like most static analysis tools, is only concerned with satisfying certain correctness constraints. We take this quite a bit further.
Once we have the canonical form of the code, we can map that work to an extensive knowledge base of best practices and algorithms. We separate code into specific domains, e.g. database access, network protocols, parsing and formatting, and file operations. Each of these domains carries specific gotchas that we can deterministically detect. SQL injection attacks are a simple example, but more complex issues exist, such as maintaining system-wide coherency. Developers rarely follow best practices in this regard because they rarely, if ever, experience system failures in their development environments, and only the best developers plan around these issues.
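To make the SQL injection example concrete, here is a minimal, deliberately simplified detector of the kind a database-access domain check might include. It is illustrative only, not our production rule set: it flags `cursor.execute()` calls whose query is assembled by string concatenation, `%` formatting, or an f-string rather than passed with bound parameters.

```python
import ast

def flags_sql_injection(src):
    """Flag .execute() calls whose query argument is assembled with
    '+' concatenation, '%' formatting, or an f-string instead of
    being parameterized. A deliberately simplified heuristic."""
    risky = []
    for node in ast.walk(ast.parse(src)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "execute"
                and node.args):
            query = node.args[0]
            # BinOp covers both "+" concatenation and "%" formatting;
            # JoinedStr is an f-string.
            if isinstance(query, (ast.BinOp, ast.JoinedStr)):
                risky.append(node.lineno)
    return risky

unsafe = '''cur.execute("SELECT * FROM users WHERE name = '" + name + "'")'''
safe = '''cur.execute("SELECT * FROM users WHERE name = %s", (name,))'''
assert flags_sql_injection(unsafe) == [1]
assert flags_sql_injection(safe) == []
```

Because the check runs over structure rather than text, it fires regardless of how the developer happened to spell the concatenation.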
In terms of concrete languages supported, we have discrete engines and underlying Intent for many of the most popular scripting (e.g. PHP, ASP, Python, Perl) and compiled (e.g. C, C++, .Net, Ada, Java) languages. We have also successfully applied our work to raw machine code.
As to the general case, we handle both procedural and functional languages. For OOP, we can reconstruct the authors’ original object-oriented design or recompose the work into a new, likely more correct, inheritance graph. If required, and the conditions are amenable, we can recompose OOP code directly into procedural code. Technically, we can switch between procedural and functional code, within reason, in the same way that a compiler can convert procedural code to static single assignment form and back again to produce machine code. Ultimately, if the code “means” something and has a deterministic grammar, we can handle it.
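The procedural-to-functional correspondence can be shown by hand on a tiny example (this is a worked illustration of the mapping, not the engine itself): the mutable accumulator of a procedural loop maps directly onto a fold.

```python
from functools import reduce

# Procedural form: an explicit mutable accumulator updated in a loop.
def running_max_proc(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best

# The same Intent recomposed functionally: the accumulator pattern
# becomes a fold (reduce), with no mutation at all.
def running_max_func(xs):
    return reduce(lambda best, x: x if x > best else best, xs)

data = [3, 1, 4, 1, 5, 9, 2, 6]
assert running_max_proc(data) == running_max_func(data) == 9
```

Both versions express one fact, "the result is the greatest element seen so far"; the loop and the fold are just two surface renderings of it.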
Going further, we have actually looked beyond programming languages to natural language processing, particularly automated natural language translation and semantic interpretation. We are already using this in our work on parsing and programmatically regenerating sites. We have also been asked to look at an image processing problem. As these domains are not as well-formed as computer programming languages, we expect a higher reliance on deep learning tools.
Holonic utilizes cloud platforms extensively, particularly AWS. We also operate in colocation environments.
A large part of our platform manages DevOps tasks automatically, including spinning up machines and deploying code. All configuration is handled through Intent.
We generate the machine configurations in the same way that we generate the software that runs on them. We can interoperate with anything that supports an API.
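A minimal sketch of what generating configuration from a declarative spec can look like. The spec schema, service names, and field names here are invented purely for illustration; our actual Intent representation is internal and considerably richer.

```python
# Hypothetical, simplified declarative spec for one service.
# All names below are invented for this sketch.
intent = {
    "service": "billing-api",
    "runtime": "python3.11",
    "replicas": 2,
    "needs": ["postgres", "queue"],
}

def render_machine_configs(spec):
    """Render concrete per-machine configs from the declarative spec.
    Purely illustrative: the output schema is invented for the sketch."""
    return [
        {
            "host": f"{spec['service']}-{i}",
            "image": f"registry.internal/{spec['service']}:{spec['runtime']}",
            # Each declared dependency becomes an endpoint placeholder
            # to be resolved at deploy time.
            "env": {dep.upper() + "_URL": f"${{{dep}_endpoint}}"
                    for dep in spec["needs"]},
        }
        for i in range(spec["replicas"])
    ]

configs = render_machine_configs(intent)
assert len(configs) == 2
assert configs[0]["host"] == "billing-api-0"
```

The point of the sketch is the direction of flow: the declarative spec is the single source of truth, and every concrete machine config is derived from it rather than edited by hand.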