Data Model Governance
Introduction
PHENOM™ Portal is a collaborative online ecosystem for data model management. This Knowledgebase provides documentation that describes some of the key concepts foundational to this approach, their importance, and how PHENOM uses them to serve you.
In addition, this document will explain the maturity of each of the features and when you, the user, can expect to see them integrated into the portal.
Furthermore, this will capture some guidance for how the data model might be managed by an organization.
Why?
PHENOM aspires to deliver high-quality, always valid™ data models. That is, if you export a model from PHENOM, you can be certain that it will conform to the corresponding standard. However, conformance is only a measure of adherence to the standard, not an actual measure of model quality. As a result, it is possible to have a perfectly conformant model that is entirely useless.
With a clear understanding of these processes and capabilities, you will be able to use PHENOM to its full extent to deliver the best product and process for your organization.
About Proprietary Data
Classes of Proprietary Data
There are two main classes of proprietary data associated with PHENOM:
- that which is created by Skayl LLC (the creator of PHENOM)
- that which is created by users
Skayl Proprietary
Skayl claims the implementation of PHENOM as its own, proprietary intellectual property. However, Skayl makes no claim on the data created with PHENOM. This is very much like the application Microsoft Word. Microsoft does not tell you how Word was created, but they do not claim any ownership of the documents you create with their tool. PHENOM acts in a similar manner.
PHENOM allows users access to all information stored in the data models. Beyond the implementation of PHENOM itself, the only thing that is maintained as proprietary data is the traceability of all model content. This information is an ongoing log of all operations on the data models and it is a significant amount of metadata (data about data) used to keep models glued together over time. Skayl has developed special algorithms that operate on this data for large scale use of data models as they change over time.
Domain Specific Data Models (DSDMs)
Although Skayl holds the copyright to the Air System DSDM (also referred to as: Skayl DSDM, uasModel, and 4586 DSDM), this model is distributed for broad use under Government Purpose Rights. This is intended to protect the government (and taxpayers) from being required to purchase the same thing many times.
PHENOM exports the data model content in standard, open formats. These formats include The Open Group's Future Airborne Capability Environment (FACE) XML Metadata Interchange (XMI) format (versions 2.1 and 3.0) and a UML XMI v1.2 file.
PHENOM has the ability to capture additional information. At this time, this information is limited to:
- Descriptions on View Attributes
- On-the-wire representation of messages
These two types of information are not supported in the standard formats, and the FACE XMI format does not allow any extensions to the data model because extensions can limit interoperability. In these cases, Skayl makes it possible for users to export this information in virtually any format they wish through a user-defined template.
User Proprietary Data
PHENOM has many different types of users. While some users could be maintaining the DSDM for the benefit of all users, others may privately be developing proprietary or competitive products. By default, all user-generated content is private until it is shared with its parent. Permission and ownership of the data are transferred when it is shared with the parent. As long as data remains within a specific project, this should not cause any concern to an organization. However, as soon as the Project Admin shares the content with the DSDM, the ownership of the information is transferred to Skayl, LLC so that it can be included in the baseline DSDM with the appropriate Government Purpose Rights.
Does Proprietary Data Hurt Data Models?
Not really.
Data Models are intended to be used for documentation of interfaces. Documentation is meant to be shared. As long as the data model is shared with the people who need to see it, then it serves its purpose.
Contributing modeled content back to the larger community has the potential to save a lot of redundant work, and it adds efficiency to the process. The only real challenge arises when duplicate semantics are created: while some information is held in a private branch, equivalent content may be created independently in the public model. While this may have long-term effects on interoperability, these structures are traceable, and there is a path to merge them with the shared data model should the need ever arise.
Technical Concepts
Version Control
Version Control is the process of tracking the evolution and history of a product. In its simplest form, version control ensures that the distinct instances of a product are clearly and unambiguously identified. This allows specific versions of the product to be reliably and accurately referenced.
Version Control can be handled in many ways as long as the primary goal (faithful reproduction of the model given a specific label identifier) is accomplished.
Due to the way PHENOM stores its information, version control is greatly simplified.
How PHENOM Supports Version Control
Tag-based Version Control
With tag-based version control, users are able to mark the current state of their model with a label indicating a desired version of the model. This will allow them to pull that version of the model in the future.
Post-Hoc Version Control – Available 2Q2019
Post-hoc version control will allow users to create a label for their model at any point in the past. While it may be beneficial to create labels as you model, this feature will allow retroactive creation of these labels.
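Both labeling styles can be pictured as named positions in an ordered log of model operations. The sketch below is illustrative only: `ModelHistory`, its methods, and the operation strings are hypothetical and are not PHENOM's actual storage format or API.

```python
from dataclasses import dataclass, field


@dataclass
class ModelHistory:
    """Hypothetical append-only log of model operations with named labels."""
    operations: list = field(default_factory=list)
    labels: dict = field(default_factory=dict)  # label -> number of operations included

    def record(self, operation: str) -> None:
        self.operations.append(operation)

    def tag(self, label: str) -> None:
        """Tag-based: label the current state of the model."""
        self.labels[label] = len(self.operations)

    def tag_at(self, label: str, position: int) -> None:
        """Post-hoc: label the state as it was after `position` operations."""
        if not 0 <= position <= len(self.operations):
            raise ValueError("position is outside the recorded history")
        self.labels[label] = position

    def checkout(self, label: str) -> list:
        """Return the operations that reproduce the labeled version."""
        return self.operations[: self.labels[label]]


history = ModelHistory()
history.record("add Entity Aircraft")
history.record("add Attribute Aircraft.altitude")
history.tag("v1.0")                      # label created while modeling
history.record("rename Attribute altitude -> height")
history.tag_at("v0.1", 1)                # label created after the fact
```

Because a label is only a pointer into the log, creating one retroactively costs no more than creating one at the time of the change.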
Change Management
Change management is related to the idea of version control, but it is more focused on keeping the product consistent and cohesive as multiple users simultaneously make changes. When individuals make changes in different, unrelated parts of the product, change management is very tractable. In this case, changes can exist in their respective areas of the product and they may never interact with each other.
In other cases, users may be changing related things (or even the same thing). In this case, it is not possible for the system to automatically determine which user change is the “correct” change. While modern software version control and change management systems have gotten very good at deconflicting these cases, many cases still require a human to manually merge the two changes.
Most modern systems are not entirely project aware. When two components change, there may be a downstream effect that breaks the entire system. Even if the changes appeared unrelated, their combined impact can cause the merged product to break. There are many tools that help mitigate this problem, but this is another challenge related to change management.
Trusted Peer Environment
In an open development environment, the workflow may be relatively simple. When the individual contributors can be trusted (in both their skill and their character), change management can be much simpler. In this situation, contributors are viewed as peers. Each contribution is accepted as an incremental improvement to the core project and verification is a part of the test and release process.
There are many implementations of this system, but this type of approach is often found within a software development team that works together for a single organization. Not all teams are quite as cohesive, or as trusted.
Untrusted Peer Environment
In some projects, such as open source projects, there may be an “approver” who moderates the incorporation of new changes. In these types of projects, there are many dynamics that may not (or, ideally, do not) occur in a corporate environment. Many open source projects are volunteer driven. This means that all changes are made on a best effort basis. One developer may make a significant contribution, but what if that change negatively impacts another piece of the software? It may take weeks (or months) for that other area of the software to get fixed since the developers working on that portion of the software may not be available.
Developers may even have a disagreement about the best way to implement a feature. Unchecked, this could cause conflict within a team. When the process is not managed, it is also possible for bad actors to introduce defects into the code base.
In yet other cases, there may be a need for multiple approvers (often called a change control board (CCB)) to manage a set of changes. Like an approver, a CCB typically represents a trusted set of individuals to whom the authority to approve changes has been granted.
While all CCB members may share the same level of vote (i.e. one vote per person), they may not all represent the technical aspects of the product. It is possible that some members of the CCB are appointed to maintain the overarching vision of the product to ensure that changes do not cause unnecessary deviation.
In CCBs, it is important to have clear acceptance criteria, clear voting guidelines, and a clear understanding of the role of each member.
Change Management of Digital Formats
In the change management of digital products, a simplified description of the change management process is that every byte of data is compared with the corresponding byte in the previous version. The presence (or absence) of each byte is annotated in a format that, when applied as a transformation, converts the previous version to the latest version.
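A minimal sketch of this idea, using Python's standard `difflib.SequenceMatcher` to compare two byte strings, follows. The delta format (`"copy"` and `"insert"` tuples) and the function names are assumptions made for illustration; real version control systems use their own, more compact encodings.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as a series of copies from `old` plus inserted bytes."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))          # bytes present in both versions
        elif tag in ("replace", "insert"):
            delta.append(("insert", new[j1:j2]))    # bytes only in the new version
        # "delete": bytes only in the old version are simply not copied
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Transform the previous version into the latest version."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"entity Aircraft { altitude }"
new = b"entity Aircraft { altitude, speed }"
delta = make_delta(old, new)
restored = apply_delta(old, delta)
```

Only the inserted bytes are stored in the delta; everything else is a reference back into the previous version.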
Changes can be difficult to detect. When data is added or removed, it is necessary for the change tracker to determine where the original content continues. In cases where there may be repeated patterns, the annotation may be unexpectedly complex.
There may also be times when multiple independent contributors change the same aspects of the product. How does the system determine whose changes are accepted? In most cases, these changes cannot be automatically resolved. The contributors are notified of such conflict when the products are merged together and their input is requested as needed. In these cases, the deconfliction process is manual and it is up to the contributors to decide which version of the product is correct. This often requires out-of-band communication to resolve the conflict.
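The distinction between changes that merge automatically and changes that need a human can be sketched as a three-way comparison against the common ancestor. This is a simplified illustration over key/value content, not a description of any particular tool; the function name is hypothetical, and `None` is assumed to mean "absent" rather than being a legal value.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict):
    """Merge two sets of changes made from the same base.

    Returns (merged, conflicts). A key is a conflict when both sides
    changed it, and changed it differently.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:            # both sides agree (or neither changed it)
            value = o
        elif o == b:          # only "theirs" changed it
            value = t
        elif t == b:          # only "ours" changed it
            value = o
        else:                 # both changed it differently: needs a human
            conflicts.append(key)
            continue
        if value is not None:
            merged[key] = value
    return merged, conflicts


base = {"a": 1, "b": 2, "c": 3}
ours = {"a": 10, "b": 2, "c": 4}
theirs = {"a": 1, "b": 20, "c": 5}
merged, conflicts = three_way_merge(base, ours, theirs)
```

Here `a` and `b` merge cleanly because each was changed by only one contributor, while `c` is reported as a conflict for the contributors to resolve.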
Change Management of Data Models
Traditional change management tools are insufficient for some highly connected forms of data such as data models. As previously described, these tools focus on the textual differences between subsequent versions of the same data. This presumes that the order of all the data is a relevant metric for determining a difference. In the case of data models, it is entirely possible to have two completely equivalent models that are not stored in a byte-wise congruent format.
If order does not matter for data models, what determines equivalence? While the order of the data is not entirely relevant, the content of the data is and, just as importantly, so are the connections between the data. Thus, for data models, the relevant meaning (beyond the data itself) is captured in the structure of the content. As such, existing tools and processes (which focus on data – primarily the order of the data) are not effective for maintaining changes to data models.
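To make the point concrete, the sketch below compares two serializations of the same small model. The dictionary layout, the field names, and the assumption that every node carries a stable identifier are hypothetical and chosen only for illustration.

```python
def canonical(model: dict):
    """Reduce a model to its content and connections, ignoring storage order."""
    nodes = frozenset((n["id"], n["type"]) for n in model["nodes"])
    edges = frozenset((e["source"], e["rel"], e["target"]) for e in model["edges"])
    return nodes, edges


model_a = {
    "nodes": [{"id": "e1", "type": "Entity"}, {"id": "e2", "type": "Entity"}],
    "edges": [{"source": "e1", "rel": "composes", "target": "e2"}],
}
# Same content and connections, but the nodes were written in a different order.
model_b = {
    "nodes": [{"id": "e2", "type": "Entity"}, {"id": "e1", "type": "Entity"}],
    "edges": [{"source": "e1", "rel": "composes", "target": "e2"}],
}

same_as_stored = model_a == model_b                         # order-sensitive comparison
same_structure = canonical(model_a) == canonical(model_b)   # order-insensitive comparison
```

An order-sensitive comparison reports a difference where none exists, while the structural comparison recognizes the two models as equivalent.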
It is possible to apply existing tools to highly connected data models. The UCS Working Group successfully applied Subversion (SVN) version control software to manage their data model. While it was possible to leverage this software, it was accompanied by a 12-step check-in/check-out procedure that was required to maintain system consistency. This is not meant to be a condemnation of the UCS Working Group; it merely highlights that, while a tool may be used, it is not necessarily the best tool for the job. Furthermore, Subversion is still based on the ordering of content and primarily functions by tracking changes in highly ordered text files. While it was, in fact, able to help the system manage changes, it required the content in the text files to be maintained in a proper order.
Change Management of UCS/FACE Data Models
UCS and FACE data models are highly connected and highly structured. Their construction follows a set of rules which must be followed to yield a valid model. These rules come from two sources: the metamodel, which governs the nodes and the relationships between them, and the Object Constraint Language (OCL), which governs the data in the model.
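The two kinds of rules can be illustrated with a toy validator. Everything in this sketch is hypothetical: the node types, the `realizes` relationship, the allowed-edge table, and the "exactly one" constraint are stand-ins for illustration and are not taken from the UCS or FACE metamodels or their OCL.

```python
# Hypothetical metamodel rule: which node types a relationship may connect.
ALLOWED_EDGES = {
    ("LogicalEntity", "realizes", "ConceptualEntity"),
    ("PlatformEntity", "realizes", "LogicalEntity"),
}


def validate(nodes: dict, edges: list) -> list:
    """nodes maps id -> type; edges are (source, rel, target) triples."""
    errors = []
    # Structural (metamodel-style) check on every relationship.
    for source, rel, target in edges:
        if source not in nodes or target not in nodes:
            errors.append(f"dangling edge: {source} -{rel}-> {target}")
        elif (nodes[source], rel, nodes[target]) not in ALLOWED_EDGES:
            errors.append(f"metamodel violation: {source} -{rel}-> {target}")
    # Constraint (OCL-style) check on the content of the model:
    # every logical entity must realize exactly one other element.
    for node_id, node_type in nodes.items():
        if node_type == "LogicalEntity":
            count = sum(1 for s, r, _ in edges if s == node_id and r == "realizes")
            if count != 1:
                errors.append(f"constraint violation: {node_id} realizes {count} elements")
    return errors


nodes = {"c.Position": "ConceptualEntity", "l.Position": "LogicalEntity"}
valid_errors = validate(nodes, [("l.Position", "realizes", "c.Position")])
missing_errors = validate(nodes, [])
reversed_errors = validate(nodes, [("c.Position", "realizes", "l.Position")])
```

A model is valid only when both checks pass, which is why a change that looks harmless in isolation can still invalidate the model as a whole.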
These data models simultaneously exist in (at least) three different levels of abstraction. Each level of the model represents a different set of properties, all of which are, in some way, related. This means that when data at one level of the model changes, it may (or may not) affect data in other levels of the model.
When a highly connected model like this is changed, it is possible that other nodes are impacted. As with basic change management, it is possible for two contributors to modify the same (or related) parts of the model. In the case of a highly linked model, it is necessary to calculate the possible impact of one change elsewhere in the model. That is, since data is characterized by its relationships just as much as its contents, it is also necessary to manage changes to the relationships.
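One way to picture this impact calculation is as a traversal of the model's relationships, starting from the changed node. The sketch below is an assumption-laden illustration: the `dependents` map, the node names, and the rule that two changes interact when their impact sets overlap are all hypothetical.

```python
from collections import deque


def impacted(dependents: dict, changed: str) -> set:
    """Return the changed node plus every node reachable through its dependents."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def changes_interact(dependents: dict, first: str, second: str) -> bool:
    """Two changes may conflict when their impact sets overlap."""
    return bool(impacted(dependents, first) & impacted(dependents, second))


# Hypothetical model: each node lists the nodes that depend on it.
dependents = {
    "conceptual.Position": ["logical.Position"],
    "logical.Position": ["platform.Position"],
    "platform.Position": ["view.NavReport"],
    "conceptual.Velocity": ["logical.Velocity"],
}

position_impact = impacted(dependents, "logical.Position")
related = changes_interact(dependents, "conceptual.Position", "platform.Position")
unrelated = changes_interact(dependents, "conceptual.Velocity", "platform.Position")
```

Two contributors editing `conceptual.Position` and `platform.Position` touch different nodes, yet the traversal shows that their changes are related and should be reviewed together.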
Change management systems tend to make no assumptions about the products they contain. For the most part, these products are just large collections of binary data. While this makes for an excellent, general purpose tool, imagine the kind of efficiencies that could be realized if the system were programmed with some knowledge of the type of data it contained.
Change Management & PHENOM
Taking the aforementioned concept of change management a step further, PHENOM applies the rules of construction (metamodel) and validity (OCL) to data models.
How PHENOM Supports Change Management
Branch-based Change Management
Each user gets their own model branch to which all of their changes are applied. Some users may have a branch they share with other users. In this case, users may choose their branch or another branch they have access to. If working in someone else’s branch, it is still necessary for the users to perform a little bit of manual deconfliction.
For the time being, this means that user changes are relegated to the branch. However, as a result of the way PHENOM stores these changes, this work will be able to be accessed as additional capabilities come online.
Basic Approver & Merge Support
This feature implements a statically configured approval process for a given customer. It will allow a “change approver” to be assigned. If a change is approved, other branches in the project will be notified and given the option to accept it.
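The flow described above (a single assigned approver, notification of the other branches, and opt-in acceptance) can be sketched as follows. The `Project` class, its fields, and the user names are hypothetical; this is not PHENOM's implementation or API.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    approver: str                                  # statically configured change approver
    branches: dict                                 # branch name -> change ids applied
    pending: dict = field(default_factory=dict)    # branch name -> offered change ids

    def approve(self, reviewer: str, change_id: str, source_branch: str) -> list:
        """Approve a change and offer it to every other branch in the project."""
        if reviewer != self.approver:
            raise PermissionError(f"{reviewer} is not the change approver")
        notified = [b for b in self.branches if b != source_branch]
        for branch in notified:
            self.pending.setdefault(branch, []).append(change_id)
        return notified

    def accept(self, branch: str, change_id: str) -> None:
        """A notified branch chooses to take the approved change."""
        self.pending[branch].remove(change_id)
        self.branches[branch].append(change_id)


project = Project(approver="dana", branches={"alice": ["c1"], "bob": [], "carol": []})
notified = project.approve("dana", "c1", source_branch="alice")
project.accept("bob", "c1")
```

In this sketch, approval does not force the change into any branch; each notified branch decides for itself whether to accept it.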
Custom Workflows – Available 4Q2019
Organizations may have different ways they want the change management process to be implemented. Custom workflows will allow users to define a change management process specific to their organization.
Roles in a Change Management Process
Contributor
A contributor is, in effect, a user of the system. These are the members of the team making technical changes and adding information/structure to the system. It is possible for a contributor to also be a member of the Change Control Board, but that user should abstain from voting in these cases.
Approver
An approver is a party entrusted to evaluate a change proposal and either approve or disapprove the change. Should the change be approved, the approver may or may not be the person who implements it. Also, a disapproved change may not be entirely rejected. It is possible for the approver to return the change and request additional information.
It is possible that approvers have varying levels of knowledge. Approvers may be technically astute and know myriad details about the content. Approvers may also be experts in a subject matter and merely approve the correctness of the change. It is also possible for approvers to know more about the desired/allowed structure of the model and have little technical knowledge whatsoever.
Change Control Board (CCB)
A CCB is a committee of approvers who are all tasked with the responsibility of approving or disapproving a change. Typically, some sort of majority vote is required for a CCB to accept a change.
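A simple voting rule consistent with these roles can be sketched as below. The strict-majority threshold and the function name are assumptions; an organization may choose a different rule, such as a supermajority or a quorum requirement.

```python
def ccb_decision(votes: dict, contributor: str = None) -> bool:
    """Return True when a strict majority of counted votes approve the change.

    votes maps member name -> True (approve) or False (disapprove).
    If the contributor of the change sits on the board, their vote is
    excluded, reflecting the guidance that they should abstain.
    """
    counted = {member: vote for member, vote in votes.items() if member != contributor}
    approvals = sum(counted.values())
    return approvals * 2 > len(counted)


votes = {"ann": True, "bob": True, "cy": False}
accepted = ccb_decision(votes)                               # 2 of 3 approve
accepted_without_ann = ccb_decision(votes, contributor="ann")  # 1 of 2 approve
```

With all three members voting the change passes, but once the contributor's own vote is set aside the remaining board is split and the change is not accepted.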