Architecture - MethodScript

MethodScript has grown into quite a large project from its humble beginnings as a simple alias plugin for Minecraft. Because of this, if you want to contribute to MethodScript, you may not even know where to begin! This document will hopefully get you pointed in the right direction, though there is no replacement for digging through the code yourself. It covers only the high level details and is not aimed at the typical user, though it does not go into anything too Java specific either. Also included are sections that cover the testing architecture and the build process.

==Core Architecture==

There are 5 main "components" to MethodScript, each of which is addressed separately below, and a final section speaks as to how they all integrate with each other.

===Core===

The core is what glues everything together. The core knows how to start up the program and set up all of the initial parameters that are needed to run. There are 2 cores in MethodScript: the "MethodScript core", which is the command line version of MethodScript, and the "CommandHelper core", which is the core that the Minecraft server starts with. The CommandHelper core registers the plugin with Bukkit and handles the builtin commands. When the plugin starts up, it starts in Bukkit specific code, which sets the abstraction layer type and hands control off to the more generic core. While there is currently no difference between a "MethodScript" and a "CommandHelper" executable, this is intended to change: the CommandHelper and Minecraft specific portions are intended to be moved to their own repository and embed MethodScript, while the MethodScript core is meant to be used either as a standalone general purpose programming language or as an embeddable programming language, primarily by CommandHelper at first, but perhaps by other projects in the future.
Unfortunately, this design distinction was not put in place from the beginning, so actually separating the two parts is non-trivial, and is currently a long term goal. However, new code tends to respect this distinction, so there are some differences between running in standalone mode and running as a Minecraft plugin.

===Compiler===

The compiler takes the source code it is given, then lexes, parses, and optimizes it. Lexing turns the raw string into tokens, parsing turns the token stream into a tree, and the optimizer takes out unneeded code paths and converts optimizable function calls into single values (such as turning "2 + 2" into 4). Then it passes the parse tree to the execution mechanism, which preprocesses the alias files and stores each alias's execution tree in memory. The main.ms file is also executed, and if it registers any events, those registrations are also stored in memory, to be executed when an applicable event occurs. Technically, this mechanism is part of the Compiler proper, but it is really a separate mechanism, and could easily be split off from the actual compilation procedure. If, for instance, the compiled tree were to be saved to disk in the future, this could easily be accomplished. In addition, because the Abstract Syntax Tree (AST) is separate at this point, much of the battle toward turning this into a full blown compiler, one that compiles to some other platform's native code base (for instance, LLVM), is already won.

====Lexing====

[http://en.wikipedia.org/wiki/Lexing Lexing] looks at each individual character in the source code and turns the characters into tokens. So, for instance, given the source code "1 + 1", it would lex the 5 characters into 3 separate tokens: a number, a plus symbol, and a number. At this point only a few compile errors can be caught, for instance an incomplete string, but from this point on it's much easier to gather meaning from the tokens.
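
As a rough illustration of the lexing step described above, the following Java sketch splits "1 + 1" into a number, a symbol, and a number, and reports an incomplete string as an error. It is not MethodScript's actual lexer: the class LexerSketch, the Token and Kind types, and the simplified rules (integers, single-quoted strings, and single-character symbols only) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified lexer; none of these names come from MethodScript itself.
final class LexerSketch {

    enum Kind { NUMBER, STRING, SYMBOL }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            return kind + "(" + text + ")";
        }
    }

    // Walks the source one character at a time and groups characters into tokens.
    static List<Token> lex(String source) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates tokens but produces none
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(Kind.NUMBER, source.substring(start, i)));
            } else if (c == '\'') {
                int end = source.indexOf('\'', i + 1);
                if (end < 0) {
                    // One of the few errors that can already be caught while lexing.
                    throw new IllegalArgumentException("Incomplete string starting at index " + i);
                }
                tokens.add(new Token(Kind.STRING, source.substring(i + 1, end)));
                i = end + 1;
            } else {
                // Everything else is treated as a one-character symbol in this sketch.
                tokens.add(new Token(Kind.SYMBOL, String.valueOf(c)));
                i++;
            }
        }
        return tokens;
    }
}
```

With this sketch, LexerSketch.lex("1 + 1") returns three tokens, NUMBER(1), SYMBOL(+), and NUMBER(1), while LexerSketch.lex("'abc") throws an IllegalArgumentException because the string is never closed.
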
====Compiling====

The compiler takes the tokens and turns them into a parse tree. The following code, for example, is converted to this parse tree:

msg(if(@variable, 'True text', 'False text'))

[[Image:ParseTree.png]]

You can see that it roughly corresponds with each token being its own node, with "(" denoting the beginning of a node's children, "," denoting a sibling, and ")" denoting the end of a node's children. This is more or less how the compiler actually works. In the first stage, things like symbols aren't fully parsed yet, and things like array access notation ([ ]) complicate things, so the tree looks a bit funny, but the optimization step turns "1 + 1" into "add(1, 1)" and "@var[1]" into "array_get(@var, 1)", which finishes up creating a full parse tree where everything is a function.

====Optimizing====

Optimization is the final compilation step. There is one step that is required to finish up the parse tree, which is sort of still a part of compiling, but is in the optimization stage nonetheless. The "[https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/functions/Compiler.java __autoconcat__]" function is automatically placed in the tree during compiling, which is how the compiler offloads infix parsing, as well as other complicated constructs like [ ], to other code. The __autoconcat__ function isn't a function per se, but it implements Function so that it can easily be integrated into the rest of the ecosystem. By the time optimization is done, ALL __autoconcat__ functions will have been converted to something else.

Optimization is a decent challenge, because you must ensure that any optimization you do has zero side effects on the code's behavior. One of the simplest ideas behind optimization, though, is to go ahead and run code that can be run at compile time, assuming it will ALWAYS have the same results and does not require any external inputs/outputs, including user input, dynamically linked functionality, or other environment settings. So, for instance, if you put "1 + 1" in code, we know that it will ALWAYS be 2, so we can go ahead and "run" that at compile time.
This prevents us from having to recalculate 1 + 1 each time the code is run. A good example of this being used in practice is when a function takes milliseconds, and several seconds or minutes are desired. Instead of putting the magic number 300000, a user might type 1000 * 60 * 5, which is more easily read as "five minutes". However, there should be no performance penalty for doing this, because 1000 * 60 * 5 is ALWAYS 300000, no matter what else the user types in. (Also, 1000 * 60 * @var is always 60000 * @var, so we can do some optimization even if there is some user input.)

You can also think of this as a ''code transformation'', which is the base functionality of optimization. We want to transform all code into more efficient versions, without the user having to know or care about these optimizations. For instance, take the following code:

if(@var1){
    if(@var2){
        msg('Both var1 and var2 are true')
    }
}

This is exactly equivalent to:

if(@var1 && @var2){
    msg('Both var1 and var2 are true')
}

The question at this point is "which is more efficient?" Only through profiling can we actually determine this, but constructs like this can be objectively measured and transformed into the more efficient version, without the coder ever having to worry about it. (It turns out the second one is more efficient.)

In MethodScript, each function is in charge of its own optimization. This makes it easier for core language features to be added and optimized, and it organizes the code a bit better. Many functions can be optimized in a similar way too, so there is a framework in place for handling much of the optimization generically (and in fact it ties into the documentation too). To see if an individual function supports optimizations, check whether it implements [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/Optimizable.java Optimizable], which will then tell you more about the optimization techniques it uses. Each function has the ability to transform itself, based on analysing its child nodes. Many functions cannot be optimized, because they inherently access inputs or outputs, and other functions can only be optimized if the input to them is ''fully static'', that is, there are no variables.

Variable tracking is not yet implemented, but once it is, it will allow for automatic detection of variables that are guaranteed to have a certain value at certain points in the code. For instance,

@var = 1
if(@var == 1){
    msg('Var is 1')
}

currently is not optimized, because it is usually unknown what the value of @var would be, but as you can see, at least at the point that the if statement is checked, it will in fact always be 1, so we could optimize this to

@var = 1
msg('Var is 1')
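
To make the compile-time evaluation idea from this section more concrete, here is a minimal Java sketch of constant folding over a tree in which every operation is a function call. It is only an illustration under stated assumptions, not the code behind Optimizable: the FoldSketch, Node, Const, Var, and Call types are hypothetical, and "multiply" is assumed as the function name behind "*" in the same way that "add" stands behind "+". Only these two functions are folded, because they depend on nothing but their arguments; any other call is left in place since it may read input or produce output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical parse tree model; these are not MethodScript's real classes.
final class FoldSketch {

    static abstract class Node { }

    static final class Const extends Node {
        final long value;
        Const(long value) { this.value = value; }
        @Override public String toString() { return Long.toString(value); }
    }

    static final class Var extends Node {
        final String name;
        Var(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    static final class Call extends Node {
        final String function;
        final List<Node> args;
        Call(String function, List<Node> args) {
            this.function = function;
            this.args = args;
        }
        Call(String function, Node... args) {
            this(function, Arrays.asList(args));
        }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder(function).append('(');
            for (int i = 0; i < args.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(args.get(i));
            }
            return sb.append(')').toString();
        }
    }

    // Returns an equivalent tree with constant arithmetic evaluated ahead of time.
    static Node fold(Node node) {
        if (!(node instanceof Call)) {
            return node; // constants and variables cannot be simplified further
        }
        Call call = (Call) node;
        List<Node> folded = new ArrayList<>();
        for (Node arg : call.args) {
            folded.add(fold(arg)); // optimize the children first
        }
        boolean isAdd = call.function.equals("add");
        boolean isMultiply = call.function.equals("multiply");
        if (!isAdd && !isMultiply) {
            // Other functions may have side effects, so only their arguments are folded.
            return new Call(call.function, folded);
        }
        long constant = isAdd ? 0 : 1; // identity element of the operation
        boolean sawConstant = false;
        List<Node> remaining = new ArrayList<>();
        for (Node arg : folded) {
            if (arg instanceof Const) {
                long value = ((Const) arg).value;
                constant = isAdd ? constant + value : constant * value;
                sawConstant = true;
            } else {
                remaining.add(arg);
            }
        }
        if (remaining.isEmpty()) {
            return new Const(constant); // fully static: the whole call becomes one value
        }
        if (sawConstant) {
            // Partially static: merge the constants and keep the variables.
            remaining.add(0, new Const(constant));
        }
        return new Call(call.function, remaining);
    }
}
```

Folding multiply(1000, 60, 5) with this sketch gives the single constant 300000, and folding multiply(1000, 60, @var) gives multiply(60000, @var), matching the two cases discussed earlier in this section. Merging the constants reorders the operands, which is safe here only because integer addition and multiplication are commutative and associative.
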

===Annotation Processor and meta programming===

MethodScript makes heavy use of annotations to provide functionality. Annotations are a [http://docs.oracle.com/javase/1.5.0/docs/guide/language/annotations.html Java feature] that provides a way to "meta program" in Java. An annotation is a "tag" that can be used to mark various methods, fields, classes, or other constructs in the Java language. This meta programming provides several advantages, the main one in MethodScript being the ability to maintain all information about a class in one place, instead of spreading the information around several different files. In general, when adding a new class, it is customary to copy paste another class, then modify it. Being able to do this in one place, instead of having to modify an existing list manually, follows a principle known as the [https://en.wikipedia.org/wiki/Open/closed_principle open/closed] principle, which is one of the key components of a [https://en.wikipedia.org/wiki/SOLID_%28object-oriented_design%29 SOLID] architecture. It also enables easier Dependency Injection, one of the other heavily followed design principles. In general, MethodScript uses annotations to mark events, functions, and other resources for addition to the API, which inherently allows for one-to-many relationships between code.

An additional feature that MethodScript includes is a [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/PureUtilities/ClassLoading/ClassDiscovery.java ClassDiscovery] utility class, which provides the means to dynamically discover the constructs that are tagged with the various annotations, as well as other methods for meta class discovery in Java sources that aren't aware of MethodScript.

===Abstraction Layer===

The abstraction layer handles all communication between MethodScript and Bukkit. It is the only place in the code that should directly reference Bukkit.
All methods of communication from MethodScript to Bukkit are defined as interfaces, which must be implemented once per server type, but are all that is required to add another server type. This allows for easier migration between Bukkit and other server mods, with minimal effort on the part of the programmer. The disadvantage is that code is harder to trace, but if you use the tools available to you in an IDE, this should not be a huge barrier, and the advantages far outweigh the problems.

===Functions===

For a function to exist, it must tag itself with @[https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/annotations/api.java api] and implement [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/functions/Function.java com.laytonsmith.core.functions.Function]. In most cases, it can extend [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/functions/AbstractFunction.java AbstractFunction], and most likely not have to override anything. Details about what each method expects are covered in source comments. The main method, exec, is worth discussing, however. It is passed a [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/constructs/Target.java Target], an [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/environments/Environment.java Environment], and an array of [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/constructs/Construct.java Construct]s. At this point, all the Constructs are guaranteed to be atomic values, and if preResolveVariables returns true (the default) they will not be [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/constructs/IVariable.java IVariable]s either.
This means that the function will only need to be able to deal with the primitive types: integer, double, string (and as a side effect, void also, though that will act like an empty string), null, and arrays. (Very special cases may have to deal with other data types, but those are primarily optimized out, and in any case can be handled like strings.) In most cases, the [https://github.com/EngineHub/commandhelper/tree/master/src/main/java/com/laytonsmith/core/Static.java Static] class provides methods for converting Constructs into Java primitives, automatically throwing exceptions should a value be uncastable to the given type. The code target indicates where in the code this function call occurs, and should be provided to any exception that is thrown; some functions can also use it in other ways. The Environment contains other information about the current execution environment, which can be freely used inside the function.

===Events===

==Testing Architecture==

You may have noticed that MethodScript has a large base of unit tests. I take automated testing very seriously; there is no way for me to scale up and maintain any semblance of quality without automating as much testing as possible. This is where the unit tests come in. Each time a new build occurs, all the unit tests are run, and failing tests are reported and very quickly fixed. If a unit test covers a use case, you can more or less bank on that particular use case working in the final product. This allows you to have much higher confidence in the product, despite most functionality not being manually tested before a release.

===JUnit===

JUnit is the test driver. Each test is generally supposed to test a small unit of code, though it tends to be easier to write integration tests, so there is a framework in place to simply run MethodScript and check the outputs to verify correctness.
There are also unit tests surrounding the documentation and other boilerplate tests to ensure basic consistency of code.

===Mockito===

Mockito is the mocking framework in place. It allows the creation of testing mocks, which replace parts of the code with simple stand-ins that don't do anything, but generally look like the code they're replacing.

==Build Architecture==

For the most part, because we use Maven, building MethodScript is as trivial as running mvn clean install, but it is nice to understand what actually happens when you do that, and what could cause it to go wrong.

===Git/Github===

MethodScript uses git as its version control system, and the code is hosted on GitHub. To get the source, you can use git clone https://github.com/sk89q/commandhelper.git

===Maven===

[http://maven.apache.org/ Maven] is a build tool, similar in many respects to [http://ant.apache.org/ Apache Ant] or [http://www.gnu.org/software/make/manual/make.html make], but with several advantages over these other tools, once you know how to use it. It is geared towards Java projects, which, along with its excellent dependency management system, is one reason it is appealing for many Bukkit plugins. The biggest advantage it has for CH is that a new dependency can be added to CH, and as long as it is in any public repo, there is zero extra configuration needed for you to build it. If you are curious for more details, the [http://en.wikipedia.org/wiki/Apache_Maven wikipedia article] has some good information on the subject. A resource that I have found helpful is the [http://maven.apache.org/ref/3.0.4/maven-model/maven.html maven model], which shows many of the possible elements in a pom, which can at first be confusing.

===Dependencies===

The main dependency of CommandHelper is (of course) Spigot, but if you look at its dependency tree, you see more than 30 different dependencies!
Not to worry: most of those are not actually included by CH. They are transitive dependencies, and with a few exceptions they are not strictly required at runtime, just at build time. For those exceptions, I mostly use a technique called shading. Shading allows you to literally copy another dependency (or parts of a dependency) into the final jar that is distributed. Doing this has both advantages and disadvantages. The main disadvantage is that your distributable gets bigger, and you may end up distributing code that users already have. To me, this is a non-issue: hard disks are huge and cheap, so even if you double the size of the jar, it won't make a dent in the remaining free space on a person's disk drive. The advantage is that you only need to distribute one single file instead of several, which greatly simplifies the distribution process. Embedding MethodScript into another jar may cause issues when shading this way, however, which is why 2 jars are created in the build process. The one ending in ''-full'' is the one that has the dependencies shaded into it.

===Common Failure Reasons===

Dependencies are downloaded from a variety of locations, but if none of them have the specified dependency, it may be that it is only installed locally on developer machines. If this is the case, you'll have to find the source for the dependency, then compile and install it manually, but usually I try to stay away from doing this, as it also makes my life harder.

Also, before you build a project for the first time, you may notice compile errors in your IDE. This is because the dependencies have not yet been downloaded. Try to build it; this should download the resources for you, which should then make the compile errors go away. This is known as priming the build.
===CI===

Azure DevOps is a Continuous Integration server, which automatically builds the project based on the code currently in the GitHub repository. This allows for quick detection of failures, which also usually leads to quick resolutions. It also has the benefit of providing a convenient place to download the newest development versions, without having to compile the code yourself. When the CI builds, if the build fails due to either compilation failures or unit test failures, an email is sent. (Actually, successful builds are emailed as well.) Commits to the GitHub repository trigger a new build, so these builds are the freshest you could possibly have, unless you're the developer.
