1
Code Generation
  • Keith Alcock
  • TCS DevSIG
  • 6 April 2004
2
Introduction
  • Theory
    • Book by Jack Herrington
  • Implementation
    • Example from book
    • Extension of example
    • My projects
  • Discussion
    • Your experiences

3
Theory
  • Source
    • Code Generation in Action
      Jack Herrington
      Manning Publications, 2003
    • www.manning.com/herrington
      (Table of Contents, Chapter 1, Chapter 4)
      www.codegenerationinaction.com
      www.codegeneration.net
  • Definition
    • The technique of building and using programs to write other programs.
      • Passive generators maintain no responsibility for the code after generating it, like many “wizards” in IDEs.
      • Active generators maintain responsibility by allowing the generator to be run multiple times over the same output.

4
Scope
  • Implementation technique
    • Design pattern
    • Regular expression escaping
  • Development process
    • Extreme programming
    • Automatic generation of unit tests
    • WordNet preprocessor
  • Language type
    • Object orientation
    • Rules of Dutch morphology for PUMA
5
Types
  • Code munging
  • Inline-code expansion
  • Mixed-code generation
  • Partial-class generation
  • Tier or layer generation
  • Domain language development


6
Code munging
  • Description
    • Pick out important features of some input code and use them to create one or more output files.
  • Examples
    • Create documentation
    • Catalog strings for internationalization
    • Report on resource identifier usage
    • Analyze code and report compliance with company standards
    • Create indices of classes, methods, or functions
    • Find and catalog global variable declarations
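As a rough illustration of code munging, the sketch below scans a fragment of C source and builds an index of function definitions. The sample input, the `FUNCTION_RE` pattern, and the `index_functions` helper are assumptions made for this example, not material from the book; the pattern is deliberately naive.

```ruby
# Hypothetical input: a fragment of C source held in a string.
C_SOURCE = <<~C
  int g_count = 0;
  static int add(int a, int b) { return a + b; }
  void reset(void)
  {
    g_count = 0;
  }
C

# Naive pattern: optional "static", return type, name, parameter list, "{".
# It misses pointer return types and would misread lines such as
# "else if (x) {"; a real munger needs a tokenizer that also understands
# comments and macros.
FUNCTION_RE = /^[ \t]*(?:static\s+)?(\w[\w\s]*?)\s+(\w+)\s*\(([^)]*)\)\s*\{/

# Pick out the interesting features (name, return type, parameters).
def index_functions(source)
  source.scan(FUNCTION_RE).map do |return_type, name, params|
    { name: name, returns: return_type.strip, params: params.strip }
  end
end

# Emit the index as a simple report; the input code is left untouched.
index_functions(C_SOURCE).each do |fn|
  puts "#{fn[:name]}(#{fn[:params]}) -> #{fn[:returns]}"
end
```

The global variable on the first line is skipped because it has no parameter list; cataloguing globals would need a second pattern.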
8
Inline-code expansion
  • Description
    • Take source code containing special markup and create production code as output.
  • Examples
    • Embedding SQL in implementation files
    • Embedding performance-critical assembler sections
    • Embedding mathematical equations, which are then implemented by the generator
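A minimal sketch of inline-code expansion for the embedded SQL case. The `<SQL: ...>` markup and the `db_execute(db, ...)` call it expands to are both hypothetical, chosen only to show the shape of the transformation.

```ruby
# Hypothetical markup: <SQL: statement> embedded in C source.
# Limitation: a ">" inside the statement would end the match early.
MARKUP_RE = /<SQL:\s*(.*?)>/

# Quote text as a C string literal (backslashes and double quotes only).
def c_string(text)
  '"' + text.gsub(/["\\]/) { |ch| "\\" + ch } + '"'
end

# Replace each marked-up statement with production C code.
# db_execute is an assumed runtime function, not a real library call.
def expand_sql(source)
  source.gsub(MARKUP_RE) do
    statement = Regexp.last_match(1).strip
    "db_execute(db, #{c_string(statement)});"
  end
end

input = 'void load(void) { <SQL: SELECT name FROM users> }'
puts expand_sql(input)
# prints: void load(void) { db_execute(db, "SELECT name FROM users"); }
```

Unlike a mixed-code generator, the expander writes its result to a separate output file; the marked-up source remains the file that engineers edit.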
10
Mixed-code generation
  • Description
    • Read a source code file, modify it based on formatted comments, for example, and replace the file.
  • Examples
    • Building rudimentary get/set methods (accessors)
    • Building marshalling code for user interfaces or dialog boxes
    • Building redundant infrastructure code, such as C++ copy constructors or operator= methods
    • Converting from model (data) to presentation (controls)
    • Generating unit tests
    • Escaping regular expressions
    • Pre-compiling and obfuscating regular expressions
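The read, modify, and replace cycle can be sketched as follows. The `// print:` comment format and the `// print-begin` and `// print-end` region markers are assumptions invented for this example (the book's printf.rb may use a different convention), and every variable is assumed to be an int.

```ruby
# Hypothetical marker: a "// print: a, b" comment requests a printf for the
# named variables. The generator owns everything between "// print-begin"
# and "// print-end" and rewrites that region on every run.
BLOCK_RE = %r{^([ \t]*)// print: ([^\n]*)\n(?:[ \t]*// print-begin\n.*?// print-end\n)?}m

def regenerate(source)
  source.gsub(BLOCK_RE) do
    indent = Regexp.last_match(1)
    names = Regexp.last_match(2).split(",").map(&:strip)
    format = names.map { |n| "#{n}=%d" }.join(" ")
    args = names.join(", ")
    # Single quotes keep \n as the two characters backslash and n, which is
    # what the C compiler should see inside the format string.
    call = indent + 'printf("' + format + '\n", ' + args + ");\n"
    "#{indent}// print: #{args}\n" \
      "#{indent}// print-begin\n" \
      "#{call}" \
      "#{indent}// print-end\n"
  end
end

# Mixed-code generators write back over their input file.
def regenerate_file(path)
  File.write(path, regenerate(File.read(path)))
end

source = <<~C
  int main(void) {
    int x = 1, y = 2;
    // print: x, y
    return 0;
  }
C

once = regenerate(source)
puts once
puts "stable on rerun: #{regenerate(once) == once}"
```

Because the generated region is delimited, running the generator again replaces it rather than duplicating it, which is what makes this an active generator in the sense defined earlier.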

12
Partial-class generation
  • Description
    • Read an abstract definition file and use templates to build base class libraries.
  • Examples
    • Building data access classes that you can override to add business logic
    • Developing basic data marshalling for user interfaces
    • Creating RPC layers that can be overridden to alter behavior
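A small sketch of partial-class generation using ERB from the Ruby standard library. The `DEFINITION` hash stands in for an abstract definition file (which might equally be XML), and the generated C++ class layout is an assumption for illustration only.

```ruby
require "erb"

# Hypothetical abstract definition; in practice this might be parsed from XML.
DEFINITION = {
  name: "Customer",
  fields: [
    { name: "id",   type: "int" },
    { name: "name", type: "std::string" }
  ]
}

# Template for the generated base class. Hand-written business logic goes
# in a subclass, so regenerating this file never overwrites it.
TEMPLATE = <<~'CPP'
  // GENERATED FILE: do not edit. Put custom logic in a subclass.
  class <%= name %>Base {
  public:
    virtual ~<%= name %>Base() {}
  <%- fields.each do |f| -%>
    virtual <%= f[:type] %> get_<%= f[:name] %>() const { return <%= f[:name] %>_; }
    virtual void set_<%= f[:name] %>(const <%= f[:type] %>& value) { <%= f[:name] %>_ = value; }
  <%- end -%>
  protected:
  <%- fields.each do |f| -%>
    <%= f[:type] %> <%= f[:name] %>_;
  <%- end -%>
  };
CPP

def generate_base_class(definition)
  # trim_mode "-" lets <%- and -%> swallow the control lines' whitespace.
  ERB.new(TEMPLATE, trim_mode: "-").result_with_hash(definition)
end

puts generate_base_class(DEFINITION)
```

A hand-written `class Customer : public CustomerBase` would then override only the accessors that need business rules.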
14
Tier or layer generation
  • Description
    • Build a complete tier of an n-tier system using, for example, model-driven generation from a UML diagram.
  • Examples
    • The RPC layer of an application that exports a web services interface
    • The stub code in a variety of different languages for your RPC layer
    • The dialog boxes for a desktop application
    • The stored procedure layer for managing access to your database schema
    • Data export, import, or conversion layers
16
Domain language development
  • Description
    • Develop a language to describe the domain and convert it into executable code.
  • Examples
    • Mathematica
    • PUMA language
17
Workflow
18
Applications
  • N-Tier development
    • User interfaces
    • Business logic
    • Database access
  • Interfacing
    • Remote procedure access
    • Web services
    • DLL wrapper
    • External language wrapper
  • Data management
    • File formats
    • Firewall configuration
    • Lookup tables and functions
  • Documentation
  • Unit tests


19
Benefits
  • Engineers
    • Quality
    • Consistency
    • Single point of knowledge
    • More design time
    • Design decisions that stand out
  • Managers
    • Architectural consistency
    • Abstraction
    • High morale
    • Agile development
  • Code
    • Portability
    • Documentation
    • Performance
    • Obfuscation
20
Rules
  • Give the proper respect to hand-coding.
  • Handwrite the code first.
  • Control the source code.
  • Make a considered decision about the implementation language.
  • Integrate the generator into the development process.
  • Include warnings.
  • Make it friendly.
  • Include documentation.
  • Keep in mind that generation is a cultural issue.
  • Maintain the generator.
21
Skills
  • Programmer
    • Using text templates
    • Writing regular expressions
  • Tools
    • Parsing XML
    • File and directory handling
    • Command-line handling
22
Tools
  • Ruby
    • Language similar to Perl or Python
    • The book also mentions C, C++, C#, Java, and SQL
  • REXML
    • XML parser library, as are XMLParser and ruby-libxml
    • The book also mentions SAX and DOM
  • ERb
    • Template tool, as is eRuby
    • Comparable tools: HTML::Mason, JSP, ASP, PHP
  • Regular expressions
    • pcre.h (Perl-compatible regular expressions for C)
  • Version control
    • Perforce with Perl binding
  • Command line
    • Cygwin
  • Others
    • C pre-processor, M4 macros
23
Implementation
  • Example
    • Mixed-code generator
    • printf.rb generates printf statements for C code
    • Could translate, escape, encrypt, hash, etc.
  • Extension
    • quote.rb looks for regular expressions and escapes them
    • Apply it to printfquote.rb to generate the equivalent of printf.rb
    • Apply printfquote.rb to C code
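The escaping step can be illustrated with a short sketch. This is not the actual quote.rb; the `#regex NAME pattern` directive is a hypothetical marker invented here so that a pattern can be written once, unescaped, and turned into a valid C string literal.

```ruby
# Quote raw regex text as a C string literal by escaping backslashes and
# double quotes. Other characters, such as newlines, are not handled.
def c_literal(regex_text)
  '"' + regex_text.gsub(/[\\"]/) { |ch| "\\" + ch } + '"'
end

# Hypothetical directive: "#regex NAME pattern", with the pattern running
# to the end of the line.
DIRECTIVE_RE = /^#regex[ \t]+(\w+)[ \t]+(.+)$/

# Look for directives and replace each with an escaped C declaration.
def escape_regexes(source)
  source.gsub(DIRECTIVE_RE) do
    name = Regexp.last_match(1)
    pattern = Regexp.last_match(2)
    "static const char #{name}[] = #{c_literal(pattern)};"
  end
end

puts escape_regexes('#regex PHONE \d{3}-\d{4}')
# prints: static const char PHONE[] = "\\d{3}-\\d{4}";
```

Writing `\d{3}-\d{4}` instead of `"\\d{3}-\\d{4}"` keeps the pattern readable in the source and leaves the error-prone doubling of backslashes to the generator.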
24
isGenerator
  • Is a code-munger
  • Reverse engineers isa() functions from C
  • Extends the pattern to user defined attributes
  • Achieves code reuse without runtime overhead
  • Assures cross-platform compatibility
25
RegEx preprocessor
  • Is a code-munger, but could be converted to inline-code expander
  • Converts regex strings into static C data structures using (manual) reflection
  • Improves performance with pre-compilation
  • Involves no dynamic memory allocation
  • Obfuscates strings from spies
  • Obviates need to ship compiler component


26
WordNet preprocessor
  • Is also a code-munger
  • Converts database to static C data structures
  • Drastically improves performance
  • Reduces disk access to zero and memory allocation to minimum
  • Combines string pointers
  • Collects separate files into one
  • Compresses the data structures
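The idea of converting data to static C structures with shared string pointers can be sketched on a toy scale. The `RECORDS` data, the table names, and the struct layout below are invented for illustration and bear no relation to the real WordNet file format.

```ruby
# Hypothetical records standing in for database rows: [word, category].
RECORDS = [
  %w[cat animal],
  %w[dog animal],
  %w[oak plant]
]

def generate_tables(records)
  # Store each distinct string once so repeated values share one pointer.
  pool = records.flatten.uniq.sort
  index = pool.each_with_index.to_h

  lines = ["/* GENERATED FILE: edit the generator, not this file. */"]
  lines << "static const char *const strings[] = {"
  # String#inspect approximates C quoting for plain ASCII words only.
  pool.each { |s| lines << "  #{s.inspect}," }
  lines << "};"
  lines << "static const struct { int word; int category; } entries[] = {"
  records.each do |word, category|
    lines << "  { #{index[word]}, #{index[category]} },"
  end
  lines << "};"
  lines.join("\n") + "\n"
end

puts generate_tables(RECORDS)
```

The generated tables are compiled into the program, so lookups need neither disk access nor dynamic allocation at run time; "animal" appears once in `strings` even though two entries refer to it.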
27
Other projects
  • NESU – Nijmegen Experiment SetUp
    • Graphical programming environment for experiment generation with round-trip capability within Smalltalk
    • Configuration stored in Excel and converted to Smalltalk with VBA program
  • Document to XML converter
  • Background vs. foreground text calculations
  • SVG project
  • RTF formatting
  • Class definition generator for Smalltalk
  • RTF source code listing for Smalltalk
  • JavaScript generation of HTML on page load
  • ASP.NET programming



28
Discussion
  • Who is using it and what for?
  • Where else is generation prevalent?
    • Refactoring, generation of accessors
    • ASP, PHP, etc.
    • XML
  • What else could it do?
    • Convert from math notation to programming language


29
Conclusions
  • Code generation has a dramatic impact on development time and engineering productivity.
  • The application is amenable to change on a large scale.
  • The business rules are abstracted into files that are free of language or framework details that would hinder portability.
  • The code for the application is of consistently high quality across the code base.
  • This is a powerful tool you should have in your bag.