A seemingly simple concept, which most people would think of as one web page linked to another, quickly becomes very complex, which is why Jim could write a substantial book on the subject and still leave room for more. Classes carry a variety of UML stereotypes to define the components that combine to produce a web page, e.g. a server-side page builds a “client-side page”, which aggregates a number of “forms”. Stereotyped associations are also defined to support the different relationships between the components, e.g. one server-side page redirects to another server-side page, or a client-side page links to another client-side page (through hyperlinks, perhaps). Components contained within pages, on both the server and the client, are also modelled appropriately.
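To make the stereotype idea concrete, here is a minimal Python sketch of elements and associations that each carry a stereotype. It is only an illustration of the modelling idea, not the add-in's code: the `Element`, `Association` and `Model` names, the stereotype strings and the form name `form1` are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A UML class tagged with a stereotype, e.g. a server page or a form."""
    name: str
    stereotype: str


@dataclass
class Association:
    """A stereotyped relationship between two elements."""
    source: Element
    target: Element
    stereotype: str


@dataclass
class Model:
    associations: list = field(default_factory=list)

    def relate(self, source, stereotype, target):
        self.associations.append(Association(source, target, stereotype))

    def targets(self, source, stereotype):
        """Names of the elements that `source` reaches via `stereotype`."""
        return [
            a.target.name
            for a in self.associations
            if a.source is source and a.stereotype == stereotype
        ]


# Hypothetical example: a server page builds a client page, which
# aggregates a form; the server page also redirects to another one.
model = Model()
server = Element("1.asp", "server page")
client = Element("1.asp (client)", "client page")
form = Element("form1", "form")  # illustrative name only
other = Element("2.asp", "server page")

model.relate(server, "build", client)
model.relate(client, "aggregate", form)
model.relate(server, "redirect", other)
```

Keeping the stereotype as data on both elements and associations means a new kind of relationship needs no new class, only a new label.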
The example web project included with the download available on SourceForge contains a number of scenarios, e.g. file 1.asp includes the files include1.asp and include2.asp. Both included files contain server-side functions. 1.asp also contains a redirect call to 2.asp, an embedded class, some client-side functions (in JavaScript) and an embedded form. The 1.asp server-side page and its associated elements are shown in the diagram below.
This is produced by the following settings in the add-in.
The add-in is built to be extensible so that parsers for new script languages can be easily integrated into the framework, e.g. a ScriptParserBase base class has JavaScriptParser and VBScriptParser derived classes. Similarly, the ServerPage class, which models the UML server-side page, has derived classes ActiveServerPage (ASP), JavaServerPage (JSP) and DotNetActiveServerPage (ASP.NET) for each of the web application technologies. Note that only the ActiveServerPage class, for ASP pages, is fully supported; I’m hoping the open source community will develop the JSP and ASP.NET technologies :)
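The shape of that parser hierarchy can be sketched roughly as follows. This is a Python illustration, not the add-in's source: only the three class names come from the description above, while the `find_functions` method, the regular expressions and the `PARSERS` registry are assumptions made for the example.

```python
import re
from abc import ABC, abstractmethod


class ScriptParserBase(ABC):
    """Common interface; each derived parser handles one script language."""

    @abstractmethod
    def find_functions(self, source):
        """Return the names of the functions declared in `source`."""


class JavaScriptParser(ScriptParserBase):
    # Assumed pattern: classic `function name(` declarations only.
    _pattern = re.compile(r"\bfunction\s+([A-Za-z_$][\w$]*)\s*\(")

    def find_functions(self, source):
        return self._pattern.findall(source)


class VBScriptParser(ScriptParserBase):
    # Assumed pattern: `Function name` or `Sub name` at the start of a line.
    _pattern = re.compile(
        r"^\s*(?:(?:public|private)\s+)?(?:function|sub)\s+(\w+)",
        re.IGNORECASE | re.MULTILINE,
    )

    def find_functions(self, source):
        return self._pattern.findall(source)


# Hypothetical registry: supporting a new language means adding one entry.
PARSERS = {
    "javascript": JavaScriptParser(),
    "vbscript": VBScriptParser(),
}
```

Code that walks a page only needs to know about the base class, so a new script language would slot in without touching the callers.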
The add-in mostly uses regular expressions to parse the script source into its components. They range from very simple expressions such as #include\s+(?
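As an illustration of the regular-expression approach, here is a small Python sketch that picks out include directives and redirect calls from an ASP page. The two patterns are my own simplified assumptions written for this example, not the expressions the add-in uses, and the sample page text is made up to match the 1.asp scenario described earlier.

```python
import re

# Illustrative pattern only: matches <!-- #include file="..." --> and the
# virtual="..." variant. It is not the add-in's own expression.
INCLUDE_RE = re.compile(
    r'<!--\s*#include\s+(file|virtual)\s*=\s*"([^"]+)"\s*-->',
    re.IGNORECASE,
)

# Illustrative pattern only: Response.Redirect with a literal string target.
REDIRECT_RE = re.compile(r'Response\.Redirect\s*\(?\s*"([^"]+)"', re.IGNORECASE)


def find_includes(source):
    """Return (kind, path) pairs for each include directive found."""
    return [(kind.lower(), path) for kind, path in INCLUDE_RE.findall(source)]


def find_redirects(source):
    """Return the literal targets of Response.Redirect calls."""
    return REDIRECT_RE.findall(source)


# Made-up page text resembling the 1.asp example.
page = '''<!-- #include file="include1.asp" -->
<!-- #include file="include2.asp" -->
<% If Not loggedIn Then Response.Redirect "2.asp" %>'''
```

Each match would become an association in the model: an include relationship for the first pattern and a redirect for the second. Redirects built from variables rather than string literals would need more than a regular expression to resolve.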
To parse the client-side page, the Internet Explorer ActiveX control is used to load the HTML file produced when navigating to the web page. By using the IE control, the parts of the web page such as embedded forms and the controls within each form can be found and added to the UML components. I actually wrote a web application testing framework around this control; however, I’ve recently started to use the WatiN framework, which is more feature rich.
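The idea of walking the rendered page for forms and their controls can be sketched without the IE control. The Python example below swaps in the standard library's `html.parser`, so it only sees static HTML and runs no client script, which is a real difference from the approach above. The `FormCollector` class and `collect_forms` function are names invented for this sketch.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Collects each <form> and the controls nested inside it."""

    CONTROL_TAGS = {"input", "select", "textarea", "button"}

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form> element
        self._current = None  # the form currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "name": attrs.get("name"),
                "action": attrs.get("action"),
                "controls": [],
            }
            self.forms.append(self._current)
        elif tag in self.CONTROL_TAGS and self._current is not None:
            self._current["controls"].append((tag, attrs.get("name")))

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def collect_forms(html):
    """Return a list of forms found in `html`, each with its controls."""
    collector = FormCollector()
    collector.feed(html)
    collector.close()
    return collector.forms
```

For example, calling `collect_forms` on the HTML saved from a page returns one entry per form, which could then be added to the client-side page as an aggregated form element.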
Given that the main web application our team develops is written in ASP and has many hundreds of pages, the productivity boost from using the add-in is huge. Although we couldn’t possibly reverse engineer the whole web application in one go, being able to single out and reverse engineer a few web pages when adding new features or refactoring is a great help. It clearly shows the affected pages and where the new logic is best placed and, perhaps most importantly, helps to show the impact of any changes on existing functionality.