One of the largest challenges that developers and software project managers have a difficult time with is how to keep documentation in sync with the code elements. While there are a large amount of tools that allows for documentation to be created based on code comments (i.e. nDoc, JavaDoc, SandCastle, etc.), there are not a lot of tools that do the same for Databases. Now, there are tools that will document a database after the fact (or will create the database based on the documentation you're creating), many are very expensive or appear to be no longer support. One tool that I have started using recently for database documentation was recommended in the book The Productive Programmer by Neal Ford. The tool is called SchemaSpy.
For the sake of brevity, this post will cover how to generate database documentation for only Microsoft SqlServer 2000 and 2005 databases using SchemaSpy.
What is SchemaSpy?
SchemaSpy is a Java command-line application that will connect to your database and generate information about the tables and their relationship. SchemaSpy does does not address any database programming elements (i.e. Functions, Triggers, Stored Procedures, etc.); however, supplies a large amount of information on the tables and views themselves along with their relationships. It also evaluates your indexes and keys in order to build relational diagrams. Furthermore, it retrieves information from the database tables in order to give you a view of the number of records, the types of columns, and a large amount of other information for it. In addition to large amount of information, it also provides notification if there are any anomalies in the schema; such as implied relationships between columns (no foreign key constraints), tables without indexes, columns marked as both "null" and "must be unique", single column tables, and more.
What does SchemaSpy Produce?
Below are a few screen shots of the documentation that SchemaSpy creates for the Northwind database.
What do I need to start using SchemaSpy?
SchemaSpy requires 4 separate components to be installed in order to work properly:
- The SchemaSpy application itself (version 4.1.1 as of this post) - Download
- The Graphviz Diagram Engine - Download
- Java Runtime Environment (v1.5 or higher) - Download
- The Database Driver(s) required for a Java Application to connect to your database.
- MS SqlServer 2000 / 2005 - JTDS v1.2 - Download
Once the 4 components are downloaded and the JRE and Graphviz are installed, the next thing we have to do is install/configure the JTDS drivers (more on why JTDS instead of traditional JDBC in a moment). As of this post, SchemaSpy only works with the 1.2 version of the JTDS driver and requires the jtds-1.2.jar file to be in the C:\Program Files\Java\Shared\jtds directory. The driver jar file can be stored in other directories; however, an additional command parameter has to be passed to SchemaSpy if you do (see below). At this point, the only thing that's left is to execute SchemaSpy with the proper command line arguments.
After all of the above components are installed, you are able to execute SchemaSpy from a command prompt using the following (general) syntax:
java -jar SchemaSpy_4.1.1.jar -t databaseType -host serverName -port portNumber -db dbName -u userName -p password -s schemaName -o outputDirectory
Let's dissect these command parameters a bit:
- -t : This identifies the type of the database we're using. Since this post is focusing on MS SqlServer using the JTDS driver, the database type will be either mssql-jtds (for SqlServer 2000) or mssql05-jtds (for SqlServer 2005).
- -host : This is required by our type. It signifies what database server to connect to.
- -port : This is required by our type. This signifies what port to connect to the database. This will not always be 1433 (see below).
- -db : This tells SchemaSpy what database to diagram for us.
- -u : This tells SchemaSpy the database user name to login with.
- -p : This tells SchemaSpy the database user's password to login with.
- -s : This tells SchemaSpy what schema to read and diagram. The user credentials provided must have access to at least read this schema.
- -o : This tells SchemaSpy the output directory to generate the documentation.
- -dp : This tells SchemaSpy the path to locate the database driver file (not used in this example)
An example of a fully completed command:
java -jar SchemaSpy_4.1.1.jar -t mssql-jtds -host myDBServer -port 1433 -db Northwind -u myDbUser -p password -s dbo -o NorthwindDocumentation
If your projects use Sql Server's integration authentication option, you can use the -sso option. The -sso option allows you to remove both the username (-u) and password (-p) parameters and use the identity of the account used to called the Java Runtime Environment. Modifying the example above to use integrated authentication, we get the following:
java -jar SchemaSpy_4.1.1.jar -t mssql-jtds -host myDBServer -port 1433 -bb Northwind -sso -s dbo -o NorthwindDocumentation
Why JTDS instead of the JDBC drivers?
As I mentioned above, I recommend using the JTDS Java Database drivers instead of the traditional JDBC drivers that Microsoft offers off of their web site. While the fact that JTDS is a single driver as opposed to the 2 separate JDBC drivers is a plus, the main reason is the trouble I had getting the JDBC drivers to work with SqlServer 2005 databases. If I was working on documenting a database I'm creating for a project on my local system using SQL Server 2005 Express Edition, it would always give me an error stating that the "variant" data type is not supported. After a lot of research and trial and error, the only result I could only get this working by using the JTDS driver.
How to use SchemaSpy to Diagram a Named Instance
When installing Microsoft Sql Server, you have the option of installing it as a stand alone service or a named instance. In the event of a named instance, you would normally connect to it in your .Net code (or via Management Studio/Query Analyzer) as ServerName\InstanceName. For example, the default instance of a Sql Server 2005 Express Edition is localhost\SQLExpress. In order to connect to a named instance database with SchemaSpy, you do not need the Instance Name of the database (just the server name as if it wasn't an instance), and you have to use the database's TCP Dynamic Port instead of the typical 1433.
You can locate a named instance's dynamic port through the following steps:
- Open the Sql Server Configuration Manager application (either through the Programs Menu or a MMC Snap In)
- Expand the Sql Server 2005 Network Configuration element
- Click on the Protocols for "INSTANCE" where "INSTANCE" is the name of your named instance
- In the right pane, double click on TCP/IP
- In the TCP/IP Properties dialog that appears, click on the IP Addresses tab.
- Towards the bottom of the IP Addresses tab, you'll find the TCP Dynamic Ports value to use for SchemaSpy
The above steps were taken based on a local installation of SQL Server 2005 Express Edition. Your steps may vary depending upon configuration.
SchemaSpy is a tool that I have been using to help with some of the documentation pains that I have experienced on some of my personal and professional projects. Being able to add the command into a build script to automatically keep the documentation up to date is a huge time saver and is invaluable as people migrate onto the project(s). I hope you find it as useful as I have, and to assist further, I have added a simple step by step guide to setting up and begin using SchemaSpy below.
- Install the Java Runtime Environment version 1.5 or higher if it is not already installed.
- Install Graphviz Diagramming Engine.
- Download the SchemaSpy jar file.
- Download the JTDS v1.2 driver and place the jar file in the C:\Program Files\Java\Shared\JTDS directory.
- Obtain User Login Credentials with the proper permissions to read the schema for the database you will be documenting if integrated authentication is not available.
- Obtain the TCP Dynamic Port if the database you are accessing is a named instance.
- Open a command window and navigate to the location of the SchemaSpy jar file.
- Execute the SchemaSpy jar file using the proper command line parameters.