An initial dump of ideas for a php generator
- use composer and symfony console tools for cli manipulation
- pluggable architecture
components needed - configuration, parsing, scanning? (helper for generating definitions), templating
> An initial dump of ideas for a php generator
> components needed - configuration parsing scanning? (helper for generating definitions) templating
Hmmm - nitpick - we need to call it a "scanner" for generating your json definitions files from source
a "parser" is something different in computer science ;)
I think your idea of a "template" and mine are different
This is how gtk/gen is designed to work currently - in very stereotypical lexer/parser style
Both the lexer (what is the format of my definitions, and how do I change it to an internal representation the parser understands) and possibly even the parser (how do I handle each object with its data) can be pluggable in this situation. The parser could even be only partially pluggable, but there would be a default implementation of it, with default template files.
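As a purely illustrative sketch of what that pluggability could look like - interface and class names like LexerInterface, ParserInterface and JsonLexer are made up here, not an agreed API:

```php
<?php
// Hypothetical sketch of the pluggable pieces described above.
// None of these names are final; this just shows the shape of the split.

interface LexerInterface
{
    // Turn a definitions file (json, GIR xml, ...) into internal token objects.
    public function tokenize(string $definitionsFile): array;
}

interface ParserInterface
{
    // Walk the token objects and emit source code via the template set.
    public function generate(array $tokens, string $templateDir): string;
}

// A default json lexer could be as simple as:
class JsonLexer implements LexerInterface
{
    public function tokenize(string $definitionsFile): array
    {
        $defs = json_decode(file_get_contents($definitionsFile), true);
        $tokens = [];
        foreach ($defs as $name => $data) {
            // "tokens" here are plain PHP objects describing each thing
            $tokens[] = (object) ['name' => $name, 'data' => $data];
        }
        return $tokens;
    }
}
```

Swapping in a GIR xml lexer would then only mean providing another LexerInterface implementation.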
as for configuration - the storage mechanism shouldn't matter as long as the actual configuration class's expectations are properly documented. An ini version, for example, would look like
config.ini
[description]
name=my_extension
author[]=June
author[]=John
contributor[]=Jay
contributor[]=Maria
version=1.0
deps[]=cairo
deps[]=zlib
[definition]
type=GIR
location=/opt/my_extension/defs
parser=default
[output]
create=true
templates=/opt/my_extension/tpl
location=/opt/my_extension/src
[build]
engine=zend
; maybe the extension doesn't support windows
no_windows=true
require_lib=zlib
generate_pecl=false
generate_composer=true
generate_pickle=true
use_cpp=false
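To show the "storage mechanism shouldn't matter" point in practice, here is a minimal sketch of a configuration class wrapping parse_ini_file - the class name and accessor are hypothetical, and any other backend (json, yaml) could hide behind the same interface:

```php
<?php
// Hypothetical configuration loader for an ini file like the one above.
// Section/option names follow the example; the class itself is illustrative.
class Configuration
{
    private array $data;

    public function __construct(string $file)
    {
        // true => keep the [section] structure;
        // repeated author[]= lines become a PHP array automatically
        $this->data = parse_ini_file($file, true);
    }

    public function get(string $section, string $option, $default = null)
    {
        return $this->data[$section][$option] ?? $default;
    }
}
```

Usage would then be backend-agnostic: `$conf->get('description', 'name')` regardless of whether an ini, json, or yaml loader filled the data.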
> Hmmm - nitpick - we need to call it a "scanner" for generating your json definitions files from source
> a "parser" is something different in computer science ;)
Then let's call it scan xD. I used the term parser because in doxygen's case there are lots of xml files that contain the whole structure of a library. So I wrote an xml parser? to extract all definitions and store them in an easy-to-read json format, with an established structure, to feed the code generator (by easy to read I also mean json -> php array with json_decode).
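A toy sketch of that scan step - xml in, established json out. The element and attribute names here are simplified stand-ins, not doxygen's real schema:

```php
<?php
// Illustrative "scan" step: extract definitions from an xml source
// (doxygen-style) and store them as easy-to-read, hand-editable json.
// The <class>/<method> element names are made up for this example.
function scan_to_json(string $xml): string
{
    $doc = new SimpleXMLElement($xml);
    $classes = [];

    foreach ($doc->class as $class) {
        $methods = [];
        foreach ($class->method as $method) {
            $methods[] = [
                'name'   => (string) $method['name'],
                'return' => (string) $method['return'],
            ];
        }
        $classes[(string) $class['name']] = ['methods' => $methods];
    }

    // Pretty-print so the json definitions stay readable in the repository.
    return json_encode($classes, JSON_PRETTY_PRINT);
}
```

The real scanner would of course map doxygen's (or GIR's) actual schema, but the output format stays the same established json either way.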
> I think your idea of a "template" and mine are different
> This is how gtk/gen is designed to work currently - in very stereotypical lexer/parser style. It reads in the definition file using the designated lexer class (say json or GIR xml) and creates "tokens" (actually PHP objects) filled with data describing each thing: a class has a name, maybe a comment; a method has a return value, arguments, etc.
Instead of lexer I call it the parser :D
The wxphp generator works the same, except that instead of creating PHP objects it creates an associative array for classes, constants, enumerations, etc... But we should definitely use PHP objects to traverse the definitions|tokens once loaded in memory.
Also, I established a format for the json definition files, so it doesn't matter if the original definitions are in GIR, doxygen, etc... So I called "parser" the process of extracting definitions from doxygen|GIR and storing them in an established json format that can be read by the code generator and distributed as part of your source repository (so the definitions directory is used to store those json files with an established format).
This way, anyone is able to work with/regenerate the source code without needing the original GIR or doxygen definition files, and without leaving the extension's directory.
> The resulting objects are fed into the parser - the parser says "I have a class, include the class template and pass it this data as variables to interpolate" "I have a method, include the method template and give it these variables to interpolate" - so as little logic is used in the templates as possible (maybe a foreach or some if/else)
Same on wxphp, but instead of "parser" I called it the generator xD (so it seems I'm using the wrong terms). In my sucky english dictionary I used "generator" as the word referring to the process of reading the json definitions and creating the c/c++ source code and config.m4/w32 of the extension.
so peg generate would do something like this:
<?php
$class_definitions = json_decode(file_get_contents("definitions/classes.json"));
$classes = new ClassSymbols($class_definitions); //Helper class to easily traverse the definitions
unset($class_definitions);

$generated_code = "";

foreach($classes as $class)
{
    $authors = get_authors(); //Can be used from header.php

    ob_start();
    include("templates/classes/header.php");
    $generated_code .= ob_get_clean(); //Capture the template output and discard the buffer

    //etc...
}

file_put_contents("src/header1.h", $generated_code);
Obviously more code is required, but it is just a sample so I can better transmit my spaghetti ideas.
> Both the lexer (what is the format of my definitions and how do I change it to an internal representation the parser understands) and possibly even the parser (how do I handle each object with its data) can be pluggable in this situation - the parser could possibly even only be partially pluggable, but there would be a default implementation of this, with default template files
Exactly :)
> as for configuration - the storage mechanism shouldn't matter as long as the actual configuration class's expectations are properly documented. An ini version, for example, would look like
Ahhh, nice ideas, the [build] section and deps options - we can surely make a great team xD
So you're basically using json as an intermediate "cached" format? That's... not a good format for it
the json parser is OK in PHP, but it's not as nice as other formats and escaping might be problematic (oy, the charset issues you could discover)
If you want to have an easy-to-use intermediate format (as you're using json now), you should use something more accessible and less prone to eating memory
Might I suggest using sqlite as the intermediate cached format instead, if you want that functionality? It would be a single file and easily query-able, which would make updating WAY faster (you could store hashes of the original defs files in any format and check whether they've changed since the last lexer pass with one query - tada, incremental lexing and parsing for free)
Another option would be to just write out PHP files with the definitions of the classes/methods embedded right in; then you could totally skip any kind of json_decode or anything else and just include the intermediate files to get the definitions
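The hash-per-defs-file idea could be sketched like this - the table and function names are made up, and this assumes the pdo_sqlite extension is available:

```php
<?php
// Sketch of the sqlite "cached definitions" idea: store a hash per source
// definitions file and skip re-lexing files whose hash is unchanged.
// Table/column names here are illustrative, not a fixed schema.
function defs_changed(PDO $db, string $file): bool
{
    $db->exec('CREATE TABLE IF NOT EXISTS def_files (path TEXT PRIMARY KEY, hash TEXT)');
    $hash = hash_file('sha256', $file);

    $stmt = $db->prepare('SELECT hash FROM def_files WHERE path = ?');
    $stmt->execute([$file]);
    $stored = $stmt->fetchColumn(); // false when the file was never seen

    if ($stored === $hash) {
        return false; // unchanged: the cached parse can be reused
    }

    $db->prepare('INSERT OR REPLACE INTO def_files (path, hash) VALUES (?, ?)')
       ->execute([$file, $hash]);
    return true; // new or modified: re-run the lexer on this file
}
```

One query per file answers "did anything change?", which is where the incremental lexing win would come from.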
I love sqlite; the problem with sqlite is that I can't use a text editor in case I want to manually modify or fix something in the definitions/cache.
Before using json, the original wxphp developer used serialize/unserialize, but if editing was required it was almost impossible.
A PHP format sounds good, but parsing a huge PHP file would surely take some time and memory too.
In my experience json hasn't been bad, and wxWidgets is a huge library; check https://github.com/wxphp/wxphp/blob/master/json/classes.json
Anyways, if we choose PHP I would be happy, everything would be PHP xD, even the template files :D It would just require a little bit more work at first.
An example of definitions/classes.php could be:
<?php
$class = new ClassSymbol("SomeClassName");

$class->AddMethod(
    "MethodName",
    [
        "static" => false,
        "const" => true,
        "virtual" => true,
        "pure_virtual" => false,
        "protected" => false,
        "return_type" => "const int*",
        "description" => "This method does this and that",
        "parameters" => new Parameters() //Will think on this later
    ]
);

Peg\Application::GetSymbolsTable()->AddClass($class);
//etc...
I actually have quite a bit of that already done in one fashion - see https://github.com/gtkforphp/generator/tree/master/lib/objects. We'll need to set up the proper hierarchy: module -> package (this is actually a namespace, but "namespace" is a reserved word - we'll have similar issues with "class") -> class -> method, but also namespace -> function, or just package -> function. Then both return values and parameters are subclasses of args, and we should probably do a "documentation" type with info to fill in docblocks, etc.
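A rough sketch of that hierarchy in PHP - names like Package (standing in for the reserved word "namespace") and ClassSymbol follow the description above but are only illustrative:

```php
<?php
// Illustrative symbol hierarchy: module -> package -> class -> method,
// with return values and parameters as subclasses of a common Arg type,
// and a Documentation object to fill in docblocks later.
class Documentation
{
    public function __construct(public string $text = '') {}
}

abstract class Symbol
{
    public function __construct(public string $name,
                                public ?Documentation $doc = null) {}
}

class Arg extends Symbol {}          // anything that carries a type
class Parameter extends Arg {}       // method/function parameter
class ReturnValue extends Arg {}     // return values are args too

class Method extends Symbol
{
    /** @var Parameter[] */
    public array $params = [];
}

class ClassSymbol extends Symbol     // "class" itself is reserved
{
    /** @var Method[] */
    public array $methods = [];
}

class Package extends Symbol         // stands in for "namespace"
{
    /** @var ClassSymbol[] */
    public array $classes = [];
}

class Module extends Symbol
{
    /** @var Package[] */
    public array $packages = [];
}
```

Functions could hang off Package directly, mirroring the namespace -> function case mentioned above.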
Also I didn't mention it much but we should make this with a pluggable front-end
So we can have peg-cli and peg-gtk and peg-web versions for people to play with ;)
If we are able to get a clear view on how we are going to collaborate and synchronize our ideas, plus reach a happy medium on how the project should move forward to reach our goals, then I don't see a problem with creating a github.com/peg organization and me removing the peg code and forking from github.com/peg instead :) In any case I could store my actual code locally on a fossil repo, so no issues with that.
The init command will not generate any code, just predefined template files, generator.conf and everything needed for a developer to start working on his binding.
On the other hand, once a developer modifies the template files to meet his needs (and writes plugins if needed) he can proceed to run "peg generate|output|produce" to generate the source code. By the way, I liked this definition of generate: in computer science, to produce (a program) by instructing a computer to follow given parameters with a skeleton program (which is what we are trying to accomplish).
I still think we can have both functionalities: a template-only approach + template/manual editing. I think of templates as patches; instead of patching the source code directly we create template patches (with changes that are easy to differentiate), so a re-generation of the source code takes care of fixing everything for you, and future updates are easier.
Do you have a draft or example on how the updater will work? I'm interested in understanding your ideas to implement a source updater.
You could take that maintenance approach if you think potential contributors could get upset working with template files instead of directly with the source. But I would also like the ability to work only from template files and plugins that modify generator behavior in order to maintain the php extension source.
I think you misunderstood me; as you mention, my idea is for the generator to work like the wxWidgets one, not as some dictator over the source which is required to compile your extension or use it. Once the source is generated you could upload all of it to a repository, so the generator isn't a requirement to build the extension, but a tool to facilitate the developer's life. Let's go again to the directory structure of the extension after you run peg init:
mkdir my_extension
cd my_extension
peg init
This would result in a directory tree as follows:
So that's everything you get when initializing a directory. Now if you proceed to run peg parse --source C:\some\doxygen|whatever, the definitions directory will be populated with json files as I said before, which are:
These files can also be safely stored in a source repository.
Now if you run peg generate|output|whatever, your source tree would look as follows:
Now you are ready to commit your changes to a github|whatever repo :D and people will not need the generator to compile your extension, but they will also have the ability to use the generator if they need to contribute.
Also the root directory of the extension with directories collapsed would look pretty clean:
I think I'm not getting your idea about configuration files, so I will write here the idea I have about the usage of a configuration file:
That's what I think about configuration, so maybe I'm misreading your idea.