I recently had the task of turning our internal company PHP coding standards, which exist only as a set of JIRA pages, into something that could be integrated into the Subversion pre-commit checking phase, so that code not only has to be free of syntax errors but now also has to adhere to the coding standards. I remembered reading about PHP_CodeSniffer (PCS) and volunteered to "have a look" and see if I could implement one of the simpler standards, our variable naming convention, as a starter exercise.
Well, after the mandatory head-banging and spike in coffee consumption that accompanies learning new stuff, I became very impressed by the way that PCS actually does what it does. It builds upon the token_get_all() function and creates not just an array of tokens and their positions; it also figures out (I read the source for half an hour!) which bits of code are contained within other bits of code, in other words the context within which the current token resides. This is very useful, as it means that when, for example, a T_FUNCTION callback is being processed you can know where the function body and the function signature are in the token stream. That's beside the point though; if you want to know more about PCS then visit the PHP_CodeSniffer PEAR site. If you have PEAR installed then it is but a mere incantation away. The sequence as given on the site:

    pear install PHP_CodeSniffer

didn't actually work for me. Some message or other came up, but when I cut and pasted the suggested alternative command it worked, and everything was just fine after that. That was a relief, because some of the most frustrating PHP vibes I've ever had have come from tackling PEAR applications on various platforms that just refuse to install cleanly.
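To make the point about "context" a bit more concrete, here is a minimal sketch of the kind of bookkeeping PCS layers on top of token_get_all(). It is not PCS code: findFunctionBody() is a hypothetical helper of my own naming, and the sample source string is made up. It only shows how, starting from a T_FUNCTION token, you can locate the braces that enclose the function body.

```php
<?php
// Standalone sketch using only PHP's built-in tokenizer; this is not PCS code.

/**
 * Return [openIndex, closeIndex] of the body braces for the function whose
 * T_FUNCTION token sits at $functionIndex, or null if it has no body.
 */
function findFunctionBody(array $tokens, int $functionIndex): ?array
{
    $depth = 0;
    $open = null;
    for ($i = $functionIndex + 1, $n = count($tokens); $i < $n; $i++) {
        $token = $tokens[$i];
        // "{$var}" and "${var}" inside strings open a brace with an array token
        // but close it with a plain "}", so they must count as openers too.
        $isOpen = $token === '{'
            || (is_array($token)
                && in_array($token[0], [T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES], true));
        if ($isOpen) {
            if ($depth === 0) {
                $open = $i;
            }
            $depth++;
        } elseif ($token === '}') {
            $depth--;
            if ($depth === 0) {
                return [$open, $i];
            }
        } elseif ($token === ';' && $depth === 0) {
            return null; // e.g. an abstract or interface method: no body
        }
    }
    return null;
}

// Made-up sample source, purely for illustration.
$source = '<?php function add($a, $b) { return $a + $b; }';
$tokens = token_get_all($source);

foreach ($tokens as $index => $token) {
    if (is_array($token) && $token[0] === T_FUNCTION) {
        $body = findFunctionBody($tokens, $index);
        if ($body !== null) {
            printf(
                "%s at token %d, body spans tokens %d to %d\n",
                token_name($token[0]),
                $index,
                $body[0],
                $body[1]
            );
        }
    }
}
```

PCS does this sort of matching once, up front, for every scope-opening token, so a sniff never has to count braces itself. The sketch just recomputes it on demand to show the idea.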
Tick tock tick tock...
A week passes... and I have managed to really "get into" PCS and how it works, and I have even managed to write some really cool stuff with it, such as a check for one of our standards that says:
- all functions must have a single return
- the return value must be a variable
- the returned variable must begin with a data type and end with "Out", e.g. $intOut
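The three rules above can be sketched roughly as follows. This is not the sniff I wrote for work and it does not use the PCS API at all; it is a standalone approximation built directly on token_get_all(). The function name checkReturnStandard(), the list of type prefixes and the sample code are all assumptions made for illustration, and I have read "a single return" as "exactly one return statement".

```php
<?php
// Standalone sketch on top of token_get_all(); a real sniff would use PCS's token stack.

// Hypothetical type prefixes; the real list lives in our internal standard.
const TYPE_PREFIXES = ['int', 'str', 'arr', 'bool', 'flt', 'obj'];

/** Return a list of human-readable violations found in $source. */
function checkReturnStandard(string $source): array
{
    // Drop whitespace and comments so "next token" means "next meaningful token".
    $tokens = array_values(array_filter(
        token_get_all($source),
        fn($t) => !is_array($t)
            || !in_array($t[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)
    ));
    $pattern = '/^\$(?:' . implode('|', TYPE_PREFIXES) . ')\w*Out$/';
    $errors = [];

    foreach ($tokens as $i => $token) {
        if (!is_array($token) || $token[0] !== T_FUNCTION) {
            continue;
        }
        $next = $tokens[$i + 1] ?? null;
        $name = (is_array($next) && $next[0] === T_STRING) ? $next[1] : '(closure)';
        $depth = 0;
        $returns = 0;

        for ($j = $i + 1, $n = count($tokens); $j < $n; $j++) {
            $t = $tokens[$j];
            $isOpen = $t === '{' || (is_array($t)
                && in_array($t[0], [T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES], true));
            if ($isOpen) {
                $depth++;
            } elseif ($t === '}') {
                $depth--;
                if ($depth === 0) {
                    break; // end of this function's body
                }
            } elseif ($t === ';' && $depth === 0) {
                continue 2; // no body (abstract method, "use function", ...)
            } elseif (is_array($t) && $t[0] === T_RETURN) {
                $returns++;
                $value = $tokens[$j + 1] ?? null;
                $after = $tokens[$j + 2] ?? null;
                if (!is_array($value) || $value[0] !== T_VARIABLE || $after !== ';') {
                    $errors[] = "$name: return value must be a plain variable (line {$t[2]})";
                } elseif (!preg_match($pattern, $value[1])) {
                    $errors[] = "$name: {$value[1]} must start with a type prefix and end with \"Out\"";
                }
            }
        }
        if ($returns !== 1) {
            $errors[] = "$name: expected a single return, found $returns";
        }
    }
    return $errors;
}

// Made-up sample: good() follows the rules, bad() breaks all three.
$code = <<<'PHP'
<?php
function good($a) { $intOut = $a + 1; return $intOut; }
function bad($a) { if ($a) { return 1; } return $result; }
PHP;

print_r(checkReturnStandard($code));
```

One known shortcut in this sketch: a return inside a nested closure is counted against the enclosing function as well. With PCS you would avoid that by using the scope information it has already worked out for each token.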
The "mad idea" bit here about using PCS...
Then I got to thinking... I have recently released my own programming "system" called FELT, and I have been looking for a way to reverse engineer existing PHP code and then translate it into FELT code. My mad plan would be to reverse engineer Drupal and then make it work as a Node.js site, for example. There would be more work, but converting it to JavaScript code would be easier if it could be automated.

I had initially planned on using one of the already available projects that parse PHP code, or that provide a grammar description that can be re-used to do it, but that's a lot of effort. Having spent enough time with PCS now, I think it might just be able to pull it off... or it might not, but here's the gist of it.
I am going to see if I can use PCS and the data it provides to reverse engineer the code into FELT code.
Sounds good but...
I have a gut feeling that it won't actually be enough because, good though it is, it doesn't provide the full AST that I think I am going to need, in which case I may need other tools or have to roll my own.