Database management: Conceptual models for database design (UML model)

The UML model builds upon the ER and EER models we reviewed previously. The Unified Modeling Language (UML) is an object-oriented system modeling notation that covers behavioral modeling, processes, and application architecture in addition to the data requirements. There are four big terms to know before pushing forward: class, object, variable, and method. A class is simply an entity type like people, an object is a single entity like Bob, a variable is an attribute like height, and a method is an operation on the object, such as getHeight or calculateHeight, that returns or computes an attribute value.
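To make the four terms concrete, here is a minimal Python sketch. The class name Person, the height_cm variable, and the value 180 are made up for illustration; only the mapping of class, object, variable, and method comes from the text above.

```python
class Person:
    """A class: the counterpart of an entity type."""

    def __init__(self, name, height_cm):
        # Variables: the counterpart of attributes.
        self.name = name
        self.height_cm = height_cm

    def get_height(self):
        """A method: a way of getting at a variable's value."""
        return self.height_cm


# An object: one concrete entity that belongs to the class.
bob = Person("Bob", 180)
print(bob.get_height())
```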

A class is drawn as a rectangle with three sections, as shown in the figure below. The top section holds the name of the class, or “entity type”. The middle section lists the variables, or “attributes”, each followed by “:” and its value type (integer, string, boolean, and so on). The last section lists the methods that work on those variables. Methods include both get and set methods: a get method retrieves a value and a set method assigns a value to a variable. Unlike EER, UML has no notation for variables with “unique” or “key” values, meaning they can’t be depicted. This is because UML assumes every object has a unique, immutable object identifier, so no other variable is needed to tell objects apart.
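The object identifier idea can be seen in most object-oriented languages. The sketch below uses Python's built-in identity check as a rough analogue of a UML object identifier; the Supplier class and the name "Acme" are hypothetical.

```python
class Supplier:
    def __init__(self, name):
        self.name = name


# Two objects with exactly the same variable values...
a = Supplier("Acme")
b = Supplier("Acme")

same_values = a.name == b.name   # the variable values match
same_object = a is b             # ...but they are still two distinct objects

print(same_values, same_object)
```

No key variable is needed to tell a and b apart, because the runtime already tracks their identity.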

Composite variables can simply be broken down into their parts. For example, name can be listed as first name, middle initial, and last name in the middle section. For multi-valued variables like email (where one person can have more than one address), you can add a multiplicity such as [0..4] to specify that one can have from 0 up to 4 email addresses. If there should be no upper limit, use [*] instead. Any derived variable, like age derived from date of birth, is preceded by a forward slash “/”.
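Here is one way these three kinds of variables could look in code, as a sketch only. The class layout, the MAX_EMAILS constant, and the age_on method name are assumptions made for illustration; the [0..4] limit and the derived age come from the text above.

```python
from dataclasses import dataclass, field
from datetime import date

MAX_EMAILS = 4  # upper bound taken from the multiplicity email [0..4]


@dataclass
class Person:
    # Composite variable "name", broken down into its parts.
    first_name: str
    middle_initial: str
    last_name: str
    date_of_birth: date
    # Multi-valued variable: email [0..4].
    emails: list = field(default_factory=list)

    def add_email(self, email):
        if len(self.emails) >= MAX_EMAILS:
            raise ValueError("email [0..4]: at most 4 addresses allowed")
        self.emails.append(email)

    def age_on(self, today):
        """Derived variable /age: computed from date_of_birth, never stored."""
        dob = self.date_of_birth
        had_birthday = (today.month, today.day) >= (dob.month, dob.day)
        return today.year - dob.year - (0 if had_birthday else 1)
```

The age is computed on request rather than stored, which is the point of marking it as derived.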

In UML, you have -, +, and # as access modifiers to specify who can access a variable or method. The minus symbol means private, the plus symbol means public, and the hash sign means protected. Minus or private means that the variable or method can only be accessed by the class itself, so only supplier can access supplier number directly, and any other class has to go through a get method for the supplier number. Plus or public means it can be accessed by any other class. Hash or protected means the variable or method can be accessed by both the class and its subclasses, so a protected variable in artist can also be accessed from a singer subclass. A good rule of thumb is to make variables private and methods public.
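Python does not enforce access modifiers the way UML describes them, so the following is only an approximation based on Python naming conventions: a double leading underscore (name mangling) stands in for “-”, a single leading underscore for “#”, and a plain name for “+”. The variable names artist_number and stage_name are hypothetical.

```python
class Artist:
    def __init__(self, artist_number, stage_name):
        self.__artist_number = artist_number  # "-" private (name-mangled)
        self._stage_name = stage_name         # "#" protected (by convention)

    def get_artist_number(self):              # "+" public method
        return self.__artist_number


class Singer(Artist):
    def introduce(self):
        # A subclass may use the protected variable.
        return "I am " + self._stage_name

    def leak_number(self):
        # Inside Singer this name is mangled to _Singer__artist_number,
        # which does not exist, so the call raises AttributeError.
        return self.__artist_number
```

This follows the rule of thumb above: the number is private and is reached through a public get method.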

Instead of calling it a relationship, UML calls it an association, and an association is characterized by its multiplicities. Multiplicities are pretty much the equivalent of ER model cardinalities and indicate the minimum and maximum number of participations of a class in the association. Instead of using N, UML uses * to denote an unbounded maximum. 0..N becomes * (short for 0..*), 0..1 stays the same, 1..N becomes 1..*, and 1..1 becomes simply 1.
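The four conversions can be written as a small helper. The function name er_to_uml and its (minimum, maximum) signature are made up for this sketch; the mapping itself is the one listed above.

```python
def er_to_uml(minimum, maximum):
    """Convert an ER (min, max) cardinality into a UML multiplicity string.

    Pass "N" as the maximum to mean "no upper limit".
    """
    lower = str(minimum)
    upper = "*" if maximum == "N" else str(maximum)
    if lower == "0" and upper == "*":
        return "*"        # 0..N is written as just *
    if lower == upper:
        return upper      # 1..1 is written as just 1
    return lower + ".." + upper


for cardinality in [(0, "N"), (0, 1), (1, "N"), (1, 1)]:
    print(cardinality, "->", er_to_uml(*cardinality))
```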

An association can be unidirectional or bidirectional. A unidirectional association has an arrow pointing from one class to another, stating that the class at the tail of the arrow can pull data from the class it points to, but the other way around is not possible. A bidirectional association has no arrow and is just a line, stating that each class can pull data from the other.
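In code, the direction of an association usually comes down to which side holds a reference to the other. The classes below (OrderLine, Product, Department, Employee) are hypothetical examples, not taken from the original figures.

```python
class Product:
    def __init__(self, name):
        self.name = name  # holds no reference back to any order line


class OrderLine:
    def __init__(self, product):
        # Unidirectional: OrderLine -> Product only.
        self.product = product


class Employee:
    def __init__(self, name):
        self.name = name
        self.department = None  # reference back to the department


class Department:
    def __init__(self, name):
        self.name = name
        self.employees = []

    def add(self, employee):
        # Bidirectional: both ends are set so each can reach the other.
        self.employees.append(employee)
        employee.department = self
```

An OrderLine can reach its Product, but a Product has no way to find its order lines, while a Department and its Employee objects can each reach the other.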

UML also has qualified associations, which basically add a small extra rectangle containing a key attribute or variable (the qualifier) that is used as an index key to navigate from the qualified class to the target class.

A team can have 0 to N players, but here we see that the team “position” (its key attribute/variable/index) is the one being connected, so the multiplicity becomes a minimum of 0 and a maximum of 1. A team position like defender number 2 or bench warmer number 1 is filled by at most 1 player and may be left empty, and if there is a player, they always hold exactly 1 team position.
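A qualified association behaves much like a dictionary lookup: the qualifier is the key and the target object is the value. The sketch below assumes players are represented by plain strings and that the method names assign and player_at are free to choose; the team name and player name are invented.

```python
class Team:
    def __init__(self, name):
        self.name = name
        # Qualifier (position) -> player. Each position maps to 0..1 players.
        self._player_by_position = {}

    def assign(self, position, player):
        if position in self._player_by_position:
            raise ValueError(position + " is already filled")
        if player in self._player_by_position.values():
            raise ValueError(player + " already holds a position")
        self._player_by_position[position] = player

    def player_at(self, position):
        """Return the player at this position, or None if it is empty."""
        return self._player_by_position.get(position)


team = Team("Lions")
team.assign("defender 2", "Bob")
print(team.player_at("defender 2"))      # the one player in that position
print(team.player_at("bench warmer 1"))  # empty position
```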

Specialization also exists in the UML model but is depicted differently. An arrow points from each subclass to the class above it to show that the subclass belongs to that superclass, and the characteristics of the specialization are written within curly brackets “{}”. In this example it is partial and overlapping, meaning an artist can be both a painter and a singer, but an artist doesn’t need to be either of the subclasses below.
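Class inheritance is the closest thing to specialization in most programming languages. As a rough sketch, Python's multiple inheritance can stand in for the overlap case; the PainterSinger class and the names "Ann" and "Bob" are invented for illustration.

```python
class Artist:
    def __init__(self, name):
        self.name = name


class Painter(Artist):
    def paint(self):
        return self.name + " paints"


class Singer(Artist):
    def sing(self):
        return self.name + " sings"


class PainterSinger(Painter, Singer):
    """Overlap: one artist who is both a painter and a singer."""


plain = Artist("Ann")        # partial: an artist in neither subclass
both = PainterSinger("Bob")  # overlap: member of both subclasses
print(both.paint(), "and", both.sing())
```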

Aggregation also exists in UML to represent a composite-to-part relationship, just like in EER. There are two types of aggregation in UML: shared and composite. When the part object can belong to multiple composite objects simultaneously, it is a shared aggregation and is depicted with a hollow diamond. In a composite aggregation, the part object can only belong to one composite, and it is depicted with a dark filled-in diamond.

For the image above with the white diamond, let’s treat the left side as a company and the right side as a business consultant. This is a shared aggregation because a consultant can work for multiple companies and a company can have multiple consultants. Then at the bottom, we have a composite aggregation represented by a black diamond. On the left, let’s say we have a gas company and on the right we have an account. The account, with its account number, is tied to exactly one gas company. When the gas company is removed from the database, all connected account objects disappear with it, while in the shared case the consultants remain even if the company on the left disappears.
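The difference in deletion behavior can be illustrated with a tiny in-memory store. The Database class, its method names, and all the sample names and account numbers are hypothetical; this is not how a real DBMS implements the rule, only a sketch of the behavior described above.

```python
class Database:
    """Hypothetical in-memory store used only to show the deletion rules."""

    def __init__(self):
        self.consultants = set()
        self.works_for = set()    # (consultant, company) pairs: shared
        self.account_owner = {}   # account number -> its one gas company

    def remove_company(self, company):
        # Shared aggregation: only the links go away, consultants stay.
        self.works_for = {pair for pair in self.works_for
                          if pair[1] != company}

    def remove_gas_company(self, gas_company):
        # Composite aggregation: the owned accounts disappear as well.
        self.account_owner = {number: owner
                              for number, owner in self.account_owner.items()
                              if owner != gas_company}


db = Database()
db.consultants = {"Ann", "Raj"}
db.works_for = {("Ann", "Acme"), ("Ann", "Globex"), ("Raj", "Acme")}
db.account_owner = {"A-1": "GasCo", "A-2": "GasCo", "B-1": "OtherGas"}

db.remove_company("Acme")
db.remove_gas_company("GasCo")
print(db.consultants, db.works_for, db.account_owner)
```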

The most distinctive add-on for UML is the Object Constraint Language (OCL), which lets you define constraints in a declarative way. You can add an expression like (context Office inv: self.workers->size() >= 100), which specifies that an office should have at least 100 workers. It’s a powerful language that lets you add numerous semantics to a conceptual data model and lets you fill in the gaps left by the EER model.
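OCL itself is declarative and is not executed like program code, but the office rule of at least 100 workers can be mirrored as a simple check to show what the invariant means. The Office class, the MIN_WORKERS constant, and the invariant_holds method are assumptions made for this sketch.

```python
class Office:
    MIN_WORKERS = 100  # an office should have at least 100 workers

    def __init__(self, workers):
        self.workers = list(workers)

    def invariant_holds(self):
        """True when the number of workers satisfies the invariant."""
        return len(self.workers) >= self.MIN_WORKERS


big_office = Office("worker %d" % i for i in range(100))
small_office = Office("worker %d" % i for i in range(99))
print(big_office.invariant_holds(), small_office.invariant_holds())
```

The check says what must be true of every office, not how to make it true, which is the declarative flavor that OCL has.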