
Using PHP_CodeSniffer for nefarious purposes!


I recently had the task of working out how to turn our internal company PHP coding standards, which exist only as a set of JIRA pages, into something that could be integrated into the Subversion pre-commit checking phase, so that code not only has to be free of syntax errors but now also has to adhere to the coding standards. I remembered reading about PHP_CodeSniffer (PCS) and volunteered to "have a look" and see if I could implement one of the simpler standards, our variable naming convention, as a starter exercise.

Well, after the mandatory head-banging and spike in coffee consumption that accompanies learning new stuff, I became very impressed by the way that PCS actually does what it does. It builds upon the token_get_all() function and creates not just an array of tokens and their positions; it also figures out (I read the source for half an hour!) which bits of code are contained within other bits of code, in other words the context within which the current token resides. This is very useful, as it means that when, for example, a T_FUNCTION callback is being processed, you can know where the function body and the function signature are in the token stream. That's beside the point though; if you want to know more about PCS then visit the PHP_CodeSniffer PEAR site. If you have PEAR installed then it is but a mere incantation away. The sequence as given on the site:
pear install PHP_CodeSniffer
That didn't actually work for me; some message or other came up, but when I cut and pasted the suggested alternative command it worked and everything was just fine after that. This was a relief, because some of the most frustrating PHP vibes I've ever had have come from tackling PEAR applications on various platforms that just refuse to install cleanly.
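To give a flavour of the raw material PCS builds upon, here is a minimal sketch that calls token_get_all() directly on a small string of PHP. The sample source and the describeTokens() helper are made up for illustration and are not part of PCS; the nesting information PCS adds on top is not shown here.

```php
<?php
// Hypothetical sample source to tokenise; not taken from any real project.
$source = '<?php function add($a, $b) { return $a + $b; }';

/**
 * Return a list of [token name, text, line] triples for a PHP source string.
 * (describeTokens is an illustrative helper, not a PCS function.)
 */
function describeTokens(string $source): array
{
    $out = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Most tokens arrive as [token id, text, line number].
            [$id, $text, $line] = $token;
            $out[] = [token_name($id), $text, $line];
        } else {
            // Single-character tokens such as "{" or ";" are plain strings.
            $out[] = ['CHAR', $token, null];
        }
    }
    return $out;
}

// Print every token except whitespace, one per line.
foreach (describeTokens($source) as [$name, $text, $line]) {
    if ($name !== 'T_WHITESPACE') {
        printf("%-12s %s\n", $name, $text);
    }
}
```

The flat list is all that PHP itself gives you; working out which tokens sit inside which function or block is the extra work that PCS does for you.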

Tick tock tick tock...

A week passes... and I have managed to really "get into" PCS and how it works. I have even managed to write some really cool stuff with it, such as a check for one of our standards that says:
  1. all functions must have a single return
  2. the return value must be a variable
  3. the returned variable must begin with a data type and end with "Out", e.g. $intOut
By using the positions of the opening and closing braces and the very useful findNext functions, I count the number of "return" statements within the body of the function. None means no further checking is required. More than one raises an error and a subsequent commit rejection. Exactly one means checking that it is in fact the last statement before the closing brace and that the returned data is a variable that fits the acceptable pattern.
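The counting logic above can be sketched outside PCS using nothing but token_get_all(). This is not the actual sniff: the checkReturns() function, the RETURN_NAME_PATTERN constant and the list of data type prefixes in it are all assumptions made for illustration, and returns inside nested closures are simply counted against the enclosing function.

```php
<?php
// Assumed naming pattern: a type prefix, anything, then "Out" (e.g. $intOut).
// The prefix list is a guess for illustration, not the real company standard.
const RETURN_NAME_PATTERN = '/^\$(int|str|arr|bool|obj|flt)\w*Out$/';

/** Check the return rules for every function in $source; returns error strings. */
function checkReturns(string $source): array
{
    $tokens = token_get_all($source);
    $total  = count($tokens);
    $errors = [];

    for ($i = 0; $i < $total; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_FUNCTION) {
            continue;
        }
        // Walk to the opening brace of the body, picking up the name on the way.
        $name = '(anonymous)';
        $open = null;
        $seenParen = false;
        for ($j = $i + 1; $j < $total; $j++) {
            $t = $tokens[$j];
            if ($t === ';') { break; }                  // declaration without a body
            if ($t === '{') { $open = $j; break; }
            if ($t === '(') { $seenParen = true; }
            if (!$seenParen && is_array($t) && $t[0] === T_STRING) {
                $name = $t[1];
            }
        }
        if ($open === null) {
            continue;
        }
        // Find the matching closing brace and record every "return" in between.
        $depth = 0;
        $close = null;
        $returns = [];
        for ($j = $open; $j < $total; $j++) {
            $t = $tokens[$j];
            $opensBrace = $t === '{' || (is_array($t)
                && in_array($t[0], [T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES], true));
            if ($opensBrace) {
                $depth++;
            } elseif ($t === '}') {
                if (--$depth === 0) { $close = $j; break; }
            } elseif (is_array($t) && $t[0] === T_RETURN) {
                $returns[] = $j;
            }
        }
        if ($close === null || count($returns) === 0) {
            continue;                                   // no return: nothing to check
        }
        if (count($returns) > 1) {
            $errors[] = "{$name}: more than one return";
            continue;
        }
        // Everything significant between the return and the closing brace.
        $rest = [];
        for ($j = $returns[0] + 1; $j < $close; $j++) {
            $t = $tokens[$j];
            $skip = is_array($t)
                && in_array($t[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true);
            if (!$skip) {
                $rest[] = $t;
            }
        }
        // A valid ending is exactly: one variable, then ";".
        $isVar = count($rest) === 2 && is_array($rest[0])
            && $rest[0][0] === T_VARIABLE && $rest[1] === ';';
        if (!$isVar) {
            $errors[] = "{$name}: return must be the last statement and return a variable";
        } elseif (!preg_match(RETURN_NAME_PATTERN, $rest[0][1])) {
            $errors[] = "{$name}: returned variable {$rest[0][1]} does not fit the pattern";
        }
    }
    return $errors;
}
```

Calling checkReturns() on a file's contents gives back an empty array when every function passes. Inside a real sniff the brace positions come from PCS itself, so the manual depth counting here would not be needed.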

The "mad idea" bit here about using PCS...

Then I got to thinking... I have recently released my own programming "system" called FELT, and I have been looking for a way to reverse engineer existing PHP code and then translate it into FELT code. My mad plan would be to reverse engineer Drupal and then make it work as a Node.js site, for example. There would be more work, but converting it to JavaScript code would be easier if it could be automated.

I had initially planned on using one of the already available projects that parse PHP code or provide a reusable grammar description, but that's a lot of effort, and having spent enough time with PCS now I think it might just be able to pull it off... or it might not, but here's the gist of it.

I am going to see if I can use PCS and the data it provides to reverse engineer the code into FELT code.

Sounds good but...


I have a gut feeling that it won't actually be enough because, good though it is, it doesn't provide the full AST that I think I am going to need, in which case I may need other tools or have to roll my own.
