
Data Modeling Made Simple with PowerDesigner

If you’re not following George McGeachie at Metadata Junkie, you should be.  George has just completed a book with Steve Hoberman, “Data Modeling Made Simple with PowerDesigner”.  We had Steve in to do modeling training a few years ago and he was working mostly with ERwin at the time, so I’m glad to see he’s embraced PowerDesigner.  I can’t wait to get my copy.

Read more at Metadata Junkie:  The book is finished!


Events – Before Database Generate (More or less)

And we’re back.  Sorry for the delay.  Between the day job, the world’s worst sinus infection, and becoming a pee-wee soccer coach (which required actually learning the rules of soccer), it’s been a busy end of summer.

Part n of our m part series on events continues.  Today we’re going to talk about the Before Database Generate event.  I could take the easy road by pointing out that I’ve already covered this event in my post on adding seed data to a table, but we’ll take it a little further.

First, let’s talk about before database generate events (and their cousin after database generate events) in general.  What can we use them for?  Well, here’s the list I came up with.

  • To set the properties of an object that need to be modified at generation time, such as parameters that are dependent on the target.  Many of these things can be done in pre-generation transforms, and that is probably the right approach in most cases.  Putting them in a before database generate event has the advantage that the changes remain in the model and don’t just exist in the generated DDL.  So if it’s important to keep a model for posterity, or perhaps audit purposes, using the Before Database Generate event might be better.
  • To integrate data or information from other sources which may also be changing (e.g. seed data)
  • To automate versioning activities such as performing a model check-in when DDL is generated
  • To set object properties that are dependent on the current date

Now, just because you can do these things doesn’t necessarily mean you should.  This event is at the MODEL level.  This should give you an idea that Sybase really intended this as something you’d use for versioning, external tool interface, etc. rather than for manipulating individual objects (that’s what transforms are for) prior to generation.  You can still write code to go through your objects and take whatever action you want, but think long and hard about it.

I’ve covered external integration with my seed data demo and Sybase has some good material provided on interacting with the repository, so for our example I’m going to modify objects based on the date.

I’m going to configure my model so that for any date range partitioned table, I only have to specify the starting partition, and it will build out partitions until the current date.  I’ll be honest, this isn’t how I would choose to do this, in part because my sample is too simplistic to really be useful, in part because most databases now have interval partitions where you only have to tell them how long the date range for each partition is and they take care of the rest, and in part because if this is something I need to automate in a warehousing environment, it’s usually something I prefer to build into the ETL or operations routine anyway.  But it’s still a fun example, and it gave me an excuse to play with the GetPhysicalOptionValue method.

The idea here is that we’ll define a single partition at the beginning of the date range.  If I wanted to make this more flexible, I’d establish the interval for each partition (daily, weekly, monthly, quarterly, yearly) as an extended attribute, and the routine would take care of the rest using the options on the initial partition as a template.  If you take it that far, let me know.
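
If you do go that route, mapping that attribute to VBScript’s DateAdd interval codes is about the only extra machinery you’d need.  Here’s a sketch, assuming a text extended attribute called PartitionInterval on the table (that name is mine, not anything built in):

' map a hypothetical PartitionInterval extended attribute to a DateAdd interval code
function intervalCode(tbl)

   select case ucase(tbl.GetExtendedAttributeText("PartitionInterval"))
      case "DAILY"     : intervalCode = "d"
      case "WEEKLY"    : intervalCode = "ww"
      case "MONTHLY"   : intervalCode = "m"
      case "QUARTERLY" : intervalCode = "q"
      case "YEARLY"    : intervalCode = "yyyy"
      case else        : intervalCode = "m"   ' default to monthly
   end select

end function

The BuildPartitions routine below would then call dateadd(intervalCode(obj), 1, start_dt) instead of hard-coding the monthly “m”.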

We’ll start with a custom method on the table object.  Since partitions are fairly database specific, I should probably put this in the XDB file but I’m going to leave it in an XEM simply to make it easier to distribute.  This particular method is based on Oracle 11g.

Here’s the code for my method.  I’ll walk through it below.


sub %method%(obj)
   ' BuildPartitions method

   dim part_col_code
   dim part_col
   dim col
   dim part_value
   dim start_dt
   dim part_cnt

   dim myregexp
   dim mymatches
   dim mymatch

   ' is the table range partitioned at all?
   if not obj.GetExtendedAttributeText("tablepropertiestablepartitioningclausesrangeorcompositepartitioningclausepresence") then
      output "table " & obj.code & " is not range partitioned"
      exit sub
   end if

   ' now, assuming we have range partitions, is it a date? get the name of the column
   part_col_code = obj.GetPhysicalOptionValue("<table_partitioning_clauses>/partition by range/<column_list>/<column>:1")

   ' now find that column object
   for each col in obj.Columns
      if col.code = part_col_code then
         set part_col = col
      end if
   next

   ' and see if it is a date
   if part_col.datatype = "DATE" then

      ' retrieve the partition bound for the single partition we've defined
      part_value = obj.GetPhysicalOptionValue _
         ("<table_partitioning_clauses>/partition by range/<partition_list>/partition:1/values less than/<value_list>")

      ' now let's get the date out of that bound - grab the digits and "/" characters
      set myregexp = new regexp
      myregexp.ignorecase = true
      myregexp.global = true
      myregexp.pattern = "[\d/]+"

      set mymatches = myregexp.execute(part_value)

      for each mymatch in mymatches
         start_dt = cdate(mymatch.value)
      next

      output "Building partitions from " & start_dt & " to " & date & " on " & obj.Code

   else
      output "table " & obj.code & " is not range partitioned by date"
      exit sub
   end if

   ' now we know we have a table that is range partitioned by a date column.
   ' let's add monthly partitions from the start date to the current date + 1 month
   ' (remember, each partition is defined by its upper bound)

   ' set the partition index - partition 1 is already defined, so we'll start with 2
   part_cnt = 2

   ' the starter partition already covers values less than start_dt,
   ' so the first new bound is one month later
   start_dt = dateadd("m", 1, start_dt)

   do while start_dt <= dateadd("m", 1, date)

      obj.SetPhysicalOptionValue "<table_partitioning_clauses>/partition by range/<partition_list>/partition:" & _
         part_cnt & "/<partition_name>", "part" & year(start_dt) & Right("0" & month(start_dt), 2)
      obj.SetPhysicalOptionValue "<table_partitioning_clauses>/partition by range/<partition_list>/partition:" & _
         part_cnt & "/values less than/<value_list>", _
         "to_date('" & year(start_dt) & "/" & Right("0" & month(start_dt), 2) & "','YYYY/MM')"

      start_dt = dateadd("m", 1, start_dt)
      part_cnt = part_cnt + 1
   loop

   output part_cnt - 1 & " partitions defined on " & obj.Code

end sub
 

Let’s break it down.  Our first task is to determine whether the table is range partitioned and whether the partitioning column is a date.  To do this, we’ll check an extended attribute on the table, then inspect the physical options to get the partitioning column we’ve defined and determine whether it’s a date.

The first If block checks that extended attribute and terminates the subroutine if the table isn’t range partitioned.  We follow that immediately by using the GetPhysicalOptionValue method to retrieve the name of the partitioning column.  I don’t find the help file on this method terribly informative, so I’ll give you my Cliff’s Notes version of common errors.

  1. There are two similar methods, GetPhysicalOptionValue and GetPhysicalOptionValues.  The first retrieves the value you’ve set, the second retrieves the list of possible values, which for the partitioning column would be a list of all columns in the table.  A misplaced S will cause a lot of confusion (see the snippet after this list).
  2. When you enter the option path, enter it exactly as displayed in the tree.  If an element is surrounded by “<” and “>”, include them; if it isn’t, don’t.  You won’t get an error if you enter it wrong, you just won’t get a value.
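
Here’s the second pitfall in action (and keep the trailing S off GetPhysicalOptionValue unless you really want the candidate list).  In this snippet, tbl stands for whatever table object you already have in hand, such as the obj passed into a method:

' tbl = any range-partitioned table object you already have (e.g. the obj passed to a method)
dim part_col
part_col = tbl.GetPhysicalOptionValue("<table_partitioning_clauses>/partition by range/<column_list>/<column>:1")
output "Partitioning column: " & part_col

' the same path typed without the angle brackets fails silently - no error, just an empty result
output "Wrong path gives: [" & tbl.GetPhysicalOptionValue("table_partitioning_clauses/partition by range/column_list/column:1") & "]"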

The loop over the table’s Columns collection finds the column object referred to in the physical options.  If that column is not a date, the subroutine will exit.

The next GetPhysicalOptionValue call retrieves the “values less than” clause for the starter partition.

I’m counting on that value being expressed in some sort of standard date format.  VBScript is pretty good at automatic date conversions, but use your best judgement here.  The regular expression block extracts the date portion from the value; I’m extracting all the numbers and the “/” character.  If you want a good tutorial on regular expressions, you are NOT on the right blog.  I know enough to be dangerous and how to use Google.  Here are two sites I find handy:

Regular Expression Library

Regular Expression Tester
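
If you want to see what that pattern actually pulls out of a partition bound, here’s a quick test you can paste into the Edit/Run Script window; the sample bound string is just an example:

' quick test of the extraction pattern against a sample "values less than" expression
dim re, matches, m, bound
bound = "to_date('2011/01/01','YYYY/MM/DD')"

set re = new RegExp
re.Global = true
re.Pattern = "[\d/]+"    ' digits and "/" only - the format mask has no digits, so it isn't matched

set matches = re.Execute(bound)
for each m in matches
   output m.Value & " converts to " & cdate(m.Value)
next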

Now that we’ve completed all of that work, the Do While loop walks from our starting date to the current date (plus one month, since we’re specifying an upper bound) and adds partitions using the SetPhysicalOptionValue method.  The same warnings apply as for GetPhysicalOptionValue(s).  Notice that we need to increment our partition index as we build new partitions, and that we start with 2 so we don’t overwrite the starter partition.

So, a lot of work and still no BeforeDatabaseGenerate event (I guess maybe I should’ve titled this one “Getting and Setting Physical Options”).  Let’s add it now.

I’m going to assume you’re familiar with adding classes and events to an XEM file, we’ve covered that before. Remember it’s got to be at the MODEL level.  Here’s the code:


Function %BeforeDatabaseGenerate%(model, fct)

   Dim myObj
   Dim mySel

   Set mySel = model.CreateSelection
   mySel.AddObjects ActiveModel, cls_table, false, true

   For each myObj in mySel.Objects
      With myObj
         Output .Name
         .ExecuteCustomMethod("Events2.BuildPartitions")
      End With
   Next

   %BeforeDatabaseGenerate% = True

End Function

Nothing surprising here at this point (I hope).  As usual, we’re using a selection to find all our tables, and then executing our method for each.  You’ll notice that the input parameter for this event handler isn’t obj; since the event only applies to models, it’s been replaced with the more specific model in the function signature and the CreateSelection call.

So there you have it.  A few words of wisdom (or maybe just words) on what you can and should do with BeforeDatabaseGenerate events, and a brief sample.  Remember, this event is probably best reserved for integration with external files or sources  and version control rather than actual model manipulation.  Use transforms for that.  However, in a pinch, here’s another tool for the box.
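
For the external-integration flavor, the handler can be as small as appending a line to a log file every time DDL is generated.  A sketch (the log path is just a placeholder):

Function %BeforeDatabaseGenerate%(model, fct)

   dim fso, logFile

   ' append a one-line generation record to an external log
   set fso = CreateObject("Scripting.FileSystemObject")
   set logFile = fso.OpenTextFile("C:\temp\generation.log", 8, true)   ' 8 = ForAppending, create if missing
   logFile.WriteLine Now & vbTab & model.Name & vbTab & model.Code
   logFile.Close

   %BeforeDatabaseGenerate% = True

End Function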

Best of luck.

Transformation – Creating a Reference Table

I’m going to ignore events for a post due to popular demand, or at least one polite request, and talk briefly about transformations.  This isn’t intended to be a detailed discussion of all the wonderful things transforms can do, but a quick demo.  A few weeks ago I posted a simple demonstration that automated creating a reference table.  The code I used for that post originally started as a transform, so I’m going to demonstrate how I create lookup tables when generating a PDM from a CDM.

First, the basics.  In order to use transforms, you need two things: an actual transformation and a transformation profile.  Transformations fall into two categories, pre-generation and post-generation.  The documentation has a lot of information on both, but the important thing to remember is that pre-generation transformations are defined on the source model and modify it before generation starts, while post-generation transformations are defined on the target model and modify it after it’s created.
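
To make the distinction concrete, here’s a toy post-generation transformation.  It would be attached to the Table metaclass in the target PDM’s extension, it assumes the same %Transformation%(obj, trfm) signature you’ll see in the pre-generation example below, and it just enforces a made-up naming convention on the generated tables:

Sub %Transformation%(obj, trfm)
   ' post-generation example: obj is a table in the freshly generated PDM
   ' tack a "_T" suffix onto the code if it isn't there already
   if right(obj.Code, 2) <> "_T" then
      obj.Code = obj.Code & "_T"
   end if
End Sub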

So, for our example, we’ll create a pre-generation transform that will create a generic lookup table.  Our code will be a little different from the event code earlier because we’re working with a CDM instead of a PDM (table becomes entity, etc.).  We’re going to key our transform off the attribute’s domain.  So instead of cluttering up your conceptual data model with a lot of lookup tables, we’ll simply define those attributes as being in the “Lookup” domain.

First step: add a transformation to the entity attribute metaclass and add some code.  I won’t elaborate on the mechanics since we’ve gone through this before.  I’ll call my transform GenerateReferenceTable.

Now the code.  Note that I can define the CDM datatype as “Variable Character” and let PowerDesigner take care of transforming that into the appropriate datatype for the target RDBMS.

Sub %Transformation%(obj, trfm)
   ' Implement your transformation on <obj> here

   dim domainObj
   dim domainName
   dim newAttr
   dim newRefTable
   dim baseTable

   dim newRef

   set domainObj = obj.Domain

   if domainObj is nothing then
      exit sub
   end if

   if domainObj.Name = "Lookup" then

      domainName = obj.name
      output domainName

      ' Step 1 - Create a standard Reference Table

      ' Step 1.5 - Check to see if the table already exists (hey it happens)
      ' create it if you need it, otherwise - reuse it

      set newRefTable = getEntity(domainName)

      if newRefTable is nothing then

         set newRefTable = activeModel.Entities.CreateNew()

         with newRefTable
            .Name = domainName
            .SetNameAndCode .Name, "", true
            .Comment = obj.Comment
         End With

         ' Step 2 - Perhaps some columns would be nice?

         set newAttr = newRefTable.Attributes.CreateNew()
         with newAttr
            .Name = domainName & " Key"
            .SetNameAndCode .Name, "", true
            .DataType = "Integer"
            .PrimaryIdentifier = true
         End With

         set newAttr = newRefTable.Attributes.CreateNew()
         with newAttr
            .Name = domainName & " Code"
            .SetNameAndCode .Name, "", true
            .DataType = "Variable Characters(40)"
            .PrimaryIdentifier = false
            .Mandatory = false
         End With

         set newAttr = newRefTable.Attributes.CreateNew()
         with newAttr
            .Name = domainName & " Name"
            .SetNameAndCode .Name, "", true
            .DataType = "Variable Characters(50)"
            .PrimaryIdentifier = false
            .Mandatory = false
         End With

         set newAttr = newRefTable.Attributes.CreateNew()
         with newAttr
            .Name = domainName & " Description"
            .SetNameAndCode .Name, "", true
            .DataType = "Variable Characters(100)"
            .PrimaryIdentifier = false
            .Mandatory = false
         End With

      end if

      ' Step 3 - Create a relationship between the new lookup table and the table containing the original column

      set baseTable = obj.Parent

      set newRef = activeModel.Relationships.CreateNew()

      with newRef
         .object1 = newRefTable
         .object2 = baseTable
         'Set the optionality using the property from the original column
         .Entity2ToEntity1RoleMandatory = obj.Mandatory
      end with

      ' Step 4 - Remove the original column from the base table

      obj.Delete

      ' Step 5 - Have a beer
   end if
End Sub
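
One note on the code above: it calls a getEntity function that isn’t listed in this post.  It’s the CDM counterpart of the getTable routine in the validation-event post further down the page; a minimal version, assuming the cls_Entity class constant, would look something like this.

' minimal getEntity helper - the CDM twin of getTable below
' returns the existing entity with the given name, or nothing if there isn't one
function getEntity(entityName)

   Dim mySel
   Dim myObj

   set getEntity = nothing

   Set mySel = ActiveModel.CreateSelection
   mySel.AddObjects ActiveModel, cls_Entity, false, true

   For each myObj in mySel.Objects
      if myObj.Name = entityName then
         set getEntity = myObj
         exit function
      end if
   Next

end function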

Now we have a transform, we need a profile to include it in.  There are only a few options to choose from.  When the model type is Physical Data Model, you can select a Family and Sub-Family which correspond to the target RDBMS and sub-family in the target model.   This tells PowerDesigner which transformations to allow/deny (show/hide) when you’re actually generating a data model.  I’m using Oracle here (mainly to make it easy to find the right box in the picture below), although you can leave them blank and have them apply universally.  I’ve named my profile toPDM.

Once you’ve got a transformation profile, you need to tell it which transformations are in the profile.  We’ve only created one, so this should be pretty easy.  Click on the Add Transformations box and check the box next to GenerateReferenceTable.  Click OK.  Your profile should look like the sample below.

To use it, you just need to create an entity and assign an attribute in it to the Lookup domain.  Each attribute tagged this way should result in a reference table and a foreign key back to the original table.  If multiple attributes have the same name, they should point to the same lookup table.  This script will NOT add the additional tables to the diagram, so you’ll need to add them manually and choose complete links to finish it up.

The only thing left is to generate the PDM.  You won’t see the transformations by default, but you’re just a few clicks away.  On the PDM Generation Options dialog, go to the Detail tab and check the “Enable Transformations” box.

Once that’s done, you’ll have additional tabs for pre-generation and post-generation transformations.  Go to the pre-generation tab and check the box next to our toPDM profile.  Notice that if you have multiple transformations in a single profile, they can be toggled on and off independently.

Now, set the remainder of your generation options as you would normally (that’s a topic for another day) and choose OK.  Next post we’ll be back on events – assuming no more polite requests arrive.

Events – Can Create

So now we come to the speculative fiction part of the events series.  CanCreate events are not something I’ve used, so we’re making this up as we go.  I’ve been playing around with them and doing some thinking on where and how they might be applicable.  canCreate events are a bit different from the others: they prevent you from creating an object or assigning a stereotype instead of automating some task.  As I read through the examples, it seemed that these might be a bit more applicable in model types other than data models.

I’m generally not one to modify a tool to restrict someone from using an entire metaclass.  Most of the metaclasses in an RDBMS are there for a reason, and to take away a potentially useful tool seems like a bad idea to me.  Perhaps your company or DBAs have decided that “thou shalt not use views” or “thou shalt not use triggers” and if so, then restricting them with a canCreate trigger might be acceptable.  But not recommended.

So, thinking through the way we use models, I came up with a couple of cases for using the canCreate event.

1) Restricting the use of a metaclass (“thou shalt not use triggers”)

2) Ensuring that pre-conditions or model configuration is completed before allowing the modeler to create objects

3) Restricting the use of a stereotype

I’m going to do one example that demonstrates the first two cases, then we’ll handle stereotypes (poorly) separately.

Let me stress again that removing a metaclass entirely is NOT something I would typically recommend.  However, at least on one of my projects, I can actually think of a place where I might use it.  We currently have a team of modelers working on a Teradata implementation.  In this environment we’re building a set of base tables which will store data for a variety of user groups in one database.  Access to these tables will be managed through a set of view layers.  Each group will have its own view database.  Any physical structures such as indexes must be in the shared table database.  Each database has its own model, so by flagging the model as either a view or table database, I can use a canCreate trigger to ensure that the view databases don’t contain any tables.  We’ll create an extended attribute at the model level and make sure that the modeler has set the database type before they create any tables.  I think this targeted, conditional use of canCreate triggers, which requires you to make a few “declarations” at the model level when you start, is probably the right way to go and could potentially be useful.

So, let’s walk through an example.

First, I’m going to create an extended attribute on the model metaclass.  I’ll call it DatabaseType and give it three potential values: “View”, “Table” and “Not Set” (which will be the default).  Using a default makes sure someone actually goes in and sets the value.  Then, we’ll put our canCreate event on the table metaclass.

Notice that for the canCreate event the function has a parent parameter rather than the obj parameter that points to a specific object.  For objects such as tables or views, parent will be the model.  For columns, indexes, etc., it will be the table.

Here’s the code:

Function %CanCreate%(parent)

   if parent.GetExtendedAttributeText("DatabaseType") = "View" then
      %CanCreate% = False
   elseif parent.GetExtendedAttributeText("DatabaseType") = "Not Set" then
      %CanCreate% = False
   else
      %CanCreate% = True
   end if

End Function

This will prevent anyone from creating a table without setting the DatabaseType attribute at the model level to “Table”.  I have several extended model files that rely on a set of model-level attributes used by initialization events and other scripts to automate common tasks.  Using canCreate triggers, I could ensure that those model-level variables are set before we begin.
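
A sketch of that more general pre-condition check, with placeholder attribute names (they’re mine, not anything built in):

Function %CanCreate%(parent)

   ' hypothetical list of model-level extended attributes that must be set
   ' before any tables can be created
   dim required, attr
   required = Array("DatabaseType", "NamingStandard", "TargetEnvironment")

   %CanCreate% = True

   for each attr in required
      if parent.GetExtendedAttributeText(attr) = "Not Set" _
         or parent.GetExtendedAttributeText(attr) = "" then
         output "Set the model attribute " & attr & " before creating tables"
         %CanCreate% = False
      end if
   next

End Function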

Next case: restricting the use of a stereotype.  I thought long and hard about a good example for this and honestly I could not come up with one.  I’m sure there’s one out there, please send ’em if you’ve got ’em, but I just couldn’t come up with it.  I’m going to use a bad example, just so we can illustrate the point.

So, for the sake of argument, here goes.  I’ve decided to classify my tables as associative, attributive or independent.  In addition, I’m going to assign all of my columns political parties.  Naturally, an independent table can’t contain Republican or Democratic columns (Independent or Green columns are fine).  Did I mention this is a bad example?  For this example, I’m going to use a stereotype to indicate an independent table.  There are some rules I could use to determine this programmatically, but that’s probably another post.

So, here’s what we end up with.

As you can see, we’ve got two stereotypes, each with a canCreate event.  The code is the same in both triggers.

Function %CanCreate%(parent)
   ' Implement your creation validation rule on <parent> here
   ' and return True in case of success

   if parent.stereotype = "Independent" then
      %CanCreate% = False
   else
      %CanCreate% = True
   end if

End Function

Here the parent parameter points to the enclosing table.  We evaluate its stereotype and return True or False, where False indicates you cannot use the stereotype and True indicates you can.

Notice that our table marked as Independent only allows columns to be assigned to the Green or Independent party.

So, another event down.  If you have a good use for the canCreate event in a data model, please do comment here or e-mail me.  I’d really love a better example.

Event – Validation Automation

The long awaited (or at least long delayed) part 2.5!

So, today we’re going to talk about automation and a bit of code I’ve been using for a while to make my life easier.  If you’re like me (lazy) and like to get as much done as possible with the fewest mouse clicks or keystrokes, you’ve probably thought, isn’t there some way I can do more with fewer clicks?  Well, validation events might be just the trick you’re looking for.  Sound good?  Okay, here’s my example.

There are a few things I find myself doing a lot.  First on the list is creating lookup tables.  I do this constantly.  You probably do too, and since most lookup tables take on a pretty consistent structure, why not automate it?

So, here’s what I’m going to do.  I’m going to create a validation event on the column metaclass that automatically creates a lookup table for a given column when I end the column name with “+L”.  This way, whenever I want a new reference table on a column, I simply add a switch to the column name and the tool takes care of the rest.  I’m going to reuse some simplified code I’ve already deployed elsewhere to keep this manageable, but here goes.

Step 1:  I need some code to create the reference table.  As usual, I’ll create this as a method, which allows me to call it from inside the event handler, or from a menu on the column object (if you didn’t see this coming, you haven’t read any of my earlier posts).

Step 2:  Create the event handler.  This will simply look for the +L suffix and call the appropriate method, and then remove the switch from the end (I can’t think of a reason I’d actually want a column ending in +L).

So, here’s the code.  It actually consists of two parts: a global getTable function, which I’ll use to make sure I reuse an existing lookup table if I’ve already created one, and the method itself.  Remember from my earlier discussions that event handlers should be written to ensure they work correctly if they execute multiple times.  First, the getTable global function.  I’m using a selection here to check all the existing tables.  Again, if you aren’t using selections, read my earlier posts.

function getTable(tableName)

   Dim mySel
   Dim myObj
   Set mySel = ActiveModel.CreateSelection

   mySel.AddObjects ActiveModel, cls_table, false, true

   For each myObj in mySel.Objects

      if myObj.Name = tableName then
         set getTable = myObj
         exit function
      else
         set getTable = nothing
      end if  

   Next

   set getTable = nothing

end function

Pretty straightforward, now the method.  Again, this is a simplified version of some existing code.  I’m not going to spend a lot of time walking through it (lazy, remember?).  First, we’re going to check for an existing table by searching on name using the method above.  If we don’t find it, we’ll create it and add a few columns.  Then we’ll create a relationship from our reference table back to the original table.  Finally, we’ll remove the original column.

Sub %Method%(obj)

   dim domainObj
   dim domainName
   dim newAttr
   dim newRefTable
   dim baseTable

   dim newRef

   domainName = obj.name
   output domainName

   ' Step 1 - Create the standard Reference Table

   ' Step 1.5 - Check to see if the table already exists (hey it happens)
   ' create it if you need it, otherwise - reuse it

   set newRefTable = getTable(domainName)

   if newRefTable is nothing then

      set newRefTable = activeModel.Tables.CreateNew()

      with newRefTable
         .Name = domainName
         .SetNameAndCode .Name, "", true
              .Comment        = obj.Comment
      End With  

      ' Step 2 - Perhaps some columns would be nice?

      set newAttr = newRefTable.Columns.CreateNew()
      with newAttr
         .Name = domainName & " Key"
         .SetNameAndCode .Name, "", true
         .DataType = "Integer"
         .Primary = true
      End With  

      set newAttr = newRefTable.Columns.CreateNew()
      with newAttr
         .Name = domainName & " Code"
         .SetNameAndCode .Name, "", true
         .DataType = "Variable Characters(4)"
         .Primary = false
         .Mandatory = false
      End With  

      set newAttr = newRefTable.Columns.CreateNew()
      with newAttr
         .Name = domainName & " Description"
         .SetNameAndCode .Name, "", true
         .DataType = "Variable Characters(100)"
         .Primary = false
         .Mandatory = false
      End With  

   end if

   ' Step 3 - Create a relationship between the new lookup table and the table 
   ' containing the original column

   set baseTable = obj.Parent

   set newRef = activeModel.References.CreateNew()

   with newRef
      .object2 = newRefTable
      .object1 = baseTable
      'Set the optionality using the property from the original column
      '.Mandatory = true
   end with

   ' Step 4 - Remove the original column from the base table

   obj.Delete

   ' Step 5 - Have a beer (suggestions welcome)

End Sub

Finally, we need our event.  Since we’ve done all the hard work already, the event is simple.  We’ll check for the +L suffix (trimming trailing spaces and ignoring case), then call our method.

Function %Validate%(obj, ByRef message)
   ' Implement your object validation rule on <parent> here
   ' and return True in case of success, False otherwise with a message

   obj.Name = trim(obj.Name)

   if ucase(right(obj.Name,2)) = "+L" then

      output "Lookup table"
      with obj
         .Name = trim(left(.Name, len(.Name) - 2))
         .SetNameAndCode .Name, "", false

         .ExecuteCustomMethod("EventAutomation.ConvertToLookup")
      end with

   end if

   %Validate% = True

End Function

So there you go.  A few lines of VBScript and I don’t have to create a generic reference table by hand anymore (lazy).

Incidentally, I use almost this exact same method (not event) in a transform to automatically create reference tables when I move from a conceptual to a physical model.  This lets me keep my conceptual models a little less “cluttered” and still get my reference tables in the PDM.  If you’re not working in a conceptual model, this is obviously not an option, but it’s something worth considering.

That’s all for tonight.   Give it a try and let me know what you think.  Good?   Bad?  Any better ideas for automation?