This article expands on what I wrote several years ago in Each database table requires its own class.
I started experimenting with PHP, my first language with OO capabilities, way back in 2002. This was after programming for several decades with other languages, primarily COBOL and then UNIFACE, so I was already used to a range of good practices such as KISS, DRY, structured programming, coupling and cohesion. My research told me that OOP was built on these same foundations, but was supposed to be better because it included new and revolutionary concepts called Encapsulation, Inheritance and Polymorphism. I read the PHP manual on how to use these concepts in my code, then I began to build a new version of my development framework in this new language with these new concepts. If OOP was supposed to be better, then my aim was to produce a framework which enabled me to build applications in a more cost-effective manner than what I had achieved with similar frameworks which I had written in my previous languages. As I write nothing but database applications, what are now known as enterprise applications, I am used to building a standard family of forms to maintain the contents of a database table, so I used as a benchmark the amount of time it would take to build a set of these forms for a new database table. These are the levels of productivity which I achieved:
Those levels of productivity are a combination of the capabilities of the language and the features provided by my development framework. Each language provided new features which made development easier and quicker, and my framework made use of these features to provide a common set of utilities for each of the applications that were developed. The fact that my PHP framework was far more productive than either of my previous versions led me to believe that the benefits of OO were "as advertised" and not just a bunch of hype and that my implementation of OO was on the right track. I began to publish articles on my personal website explaining my approach so that others could benefit from my experience, so imagine my surprise when people started telling me that I was doing it wrong and that "real OO programmers don't do it that way". Most of the arguments were along the lines of "Your way is different from mine, and as my way is the right way your way must be wrong." Instead of concentrating on writing cost-effective software these people appeared to be concentrating on following an arbitrary set of rules which I did not know existed. To them it appeared to be more important to follow a set of rules in a dogmatic fashion and produce software which is 100% "pure" (whatever that is) whereas my approach is totally pragmatic as I put results ahead of any arbitrary rules.
When someone tells me "you should be doing it this way" my immediate response is to ask "Why?" They need to prove that their way is better than mine by identifying all those areas where my approach produces problems and theirs does not. No such proof has ever been provided. All they can do is claim that their method is better, but that is always subjective, it is nothing but an opinion. They cannot offer any objective proof, something which can be measured scientifically. When they say "your method is wrong" I counter with "how can it be wrong if it works?" When they say "your code is difficult to read and maintain" I counter with "how can it be if I have successfully been maintaining and enhancing it for over a decade?"
A classic example of where my critics say that I am breaking a golden rule is in my approach of having a separate class for each database table. I have never seen a rule written anywhere which says I cannot do this, and I have certainly never seen a list of problems which arise from disobeying this rule. Not only have I never encountered any problems with my approach, I have actually avoided the problems which are inherent in their approach and have even provided facilities and cost savings which their approach cannot.
As far as I am concerned my practice of having a separate class for each table is in line with the following definition:
A class is a blueprint, or prototype, that defines the variables and the methods common to all objects (entities) of a certain kind.
If you look at the CREATE TABLE script for a table, is this not a blueprint? Is not each row within the table a working instance of that blueprint? Is it not reasonable therefore to put the table's blueprint into a class so that you can create instances of that class to manipulate the instances (rows) within that table?
I write nothing but database applications for the enterprise, which is why they are sometimes called enterprise applications. This type of application is made up of a number of user transactions (also known as "units of work" or "use cases") where each transaction has a User Interface (UI) at the front end, one or more tables in a relational database at the back end, and some software in the middle which transports data between the two ends. There can be thousands of transactions and hundreds of tables. With my previous languages I gradually progressed from using the 1-Tier Architecture through 2-Tier and eventually 3-Tier with its separate Presentation, Business and Data Access layers. With the UNIFACE language it was standard practice to create a separate component in the business layer for each database table, so I saw no reason why I should not continue with this practice and create a separate class for each database table so that each could then become an object in the business layer.
I have been told on more than one occasion that the "correct" and "approved" way of applying the principles of OOP is to start with Object Oriented Design (OOD) with its is-a and has-a relationships, compositions, aggregations and associations, then design all objects using these principles. The database is left till last and accessed through an Object-Relational Mapper (ORM). As I did not know that these rules existed, let alone were obligatory, I did what had proved successful in the past and started with my database, which I designed using the rules of Data Normalisation, then structured my software around the database as was taught in Jackson Structured Programming. This approach produced very good results in my PHP code, so I could see nothing wrong with it.
OO programmers often say that OO Design is incompatible with Database Design, which is precisely why I don't waste my time with it. Once I have designed a properly normalised database I don't have to design any classes to communicate with that database as I can automatically generate a class file for each table. When I want to create a user transaction to do something with that table I use my Data Dictionary to link that table with one of my Transaction Patterns, press a button, and that transaction is generated for me. This provides me with a basic transaction which I can run immediately, but I can then add custom code to the table class in order to augment the basic behaviour.
When I eventually read up on OOD and its IS-A and HAS-A relationships I decided that my implementation was a valid interpretation of those concepts, and as it was simpler than that being proposed by others, required less code, and offered more flexibility, it did not need changing. You should remember that when writing a database application you are writing software that interfaces with an object in a database and NOT an object in the real world, and this simple observation leads to the following:
This information can be obtained from the database schema and made available to the table class in the form of a <table>.dict.inc file.
The fact that this results in a single array of field specifications and that there is also a single array of field data should make it easy to write a single standard routine which uses those two arrays to validate that a field's data matches its specifications.
Because I don't use OOD I never end up with that problem called Object-relational Impedance Mismatch which then requires that abomination of a solution called an Object-Relational Mapper (ORM).
When writing database transactions the most common pattern is to produce a family of forms as shown in figure 1:
Figure 1 - A typical Family of Forms
In my early COBOL days this family would be merged into a single large component (also known as a user transaction). There are still some programmers today who regard all six of the above transactions to be a single use case, so they build a single controller for that use case. The problem with this outdated approach is that each of those six parts has a different screen and different behaviour. Each part is called a "mode", so each time it is called you have to identify which mode - either LIST, SEARCH, INSERT, UPDATE, DELETE or ENQUIRE - is actually required so that the controller exhibits the correct behaviour. This has several problems:
In order to avoid these problems I decided to break down the single large multi-mode component into a series of single-mode components, as discussed in Component Design - Large and Complex vs. Small and Simple. I therefore have a separate user transaction for each mode - LIST, SEARCH, INSERT, UPDATE, DELETE and ENQUIRE. Each of these references the same table class in the business layer. This arrangement has the following advantages:
All the HTML output is produced by transforming XML documents using XSL stylesheets. Although I started with having to create a separate XSL stylesheet for each screen in order to identify where each piece of data needs to be displayed, and with what control, I managed to reduce this to a small group of reusable XSL stylesheets which obtain the data requirements from within the XML document itself. This data is extracted from a small screen structure file. This means that I can have one stylesheet for the LIST view and a separate DETAIL view for the others in this family.
Each of these six transactions does something with a particular database table, and the processing is split between a Controller in the presentation layer and a table/model class in the business layer. Each controller references a single View object - one for HTML, one for PDF and another for CSV output. Because each controller exhibits a pattern of behaviour which could be used with a number of different database tables, this has enabled me to produce a library of Transaction Patterns which encapsulate that behaviour. In order to produce a particular user transaction all I need do is to combine a particular transaction pattern with a particular database table. I have been able to automate this so that the framework can do it for me at the press of a button.
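The idea of combining one reusable controller with any table class can be sketched as follows. This is only a minimal illustration under stated assumptions, not the framework's actual code: the class names `Product` and `Customer`, the methods `getData()` and `getTableName()`, and the function `listController()` are all hypothetical.

```php
<?php
// Hypothetical sketch: one reusable "LIST" controller working with any table class.

abstract class AbstractTable
{
    protected $tableName = '';   // set by each concrete class
    protected $rows = [];        // stand-in for data read from the database

    // Standard methods that every controller can call on every table object.
    public function getData(): array
    {
        return $this->rows;
    }

    public function getTableName(): string
    {
        return $this->tableName;
    }
}

class Product extends AbstractTable
{
    public function __construct()
    {
        $this->tableName = 'product';
        $this->rows = [['product_id' => 1, 'name' => 'Widget']];
    }
}

class Customer extends AbstractTable
{
    public function __construct()
    {
        $this->tableName = 'customer';
        $this->rows = [['customer_id' => 7, 'name' => 'Smith']];
    }
}

// The controller knows the pattern of behaviour, not the table.
// The table class name is supplied from outside (here, as a parameter).
function listController(string $tableClass): array
{
    $model = new $tableClass();
    return ['table' => $model->getTableName(), 'rows' => $model->getData()];
}

$productList  = listController('Product');
$customerList = listController('Customer');
```

Because the controller only calls methods which every table class shares, the same function serves both tables without modification.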
If a database application is one that contains a number of user transactions which operate on a number of database tables, and I have been able to automate both the creation of the table classes and the user transactions, then surely the ability to create working code with less effort is a good thing, so why do my critics keep telling me that my methodology is wrong? Are they right in creating expensive software which conforms to a host of arbitrary rules, or am I right in creating cost-effective software which does what my paying customers want?
My methodology may not be the same as that used by most developers, but surely it is the results which are more important? There was a study in 1996 in which the productivity of two teams was compared to find out why one team was twice as productive as the other. The study broke down the code which was written into various categories - business logic, glue code, user interface code, database code, etc. If one considers all these categories, only the business logic code had any real value to the organisation. It turned out that Team A was spending more time writing the code that added value, while team B was spending more time gluing things together. With my approach I can create a table in the database, import it into my Data Dictionary, then generate both the class files and user transactions for the family of forms shown in figure 1 at the press of a button and be able to run those transactions within 5 minutes, all without having to write a single line of code - no PHP, no HTML, no SQL. This means that the developers have to spend far less time in writing boilerplate code, which leaves them with far more time to spend on the code which has actual value to the organisation - the business logic.
That's because they don't understand how databases work, and how to write code which interacts with a database. If they were taught not to do it this way then their teacher was a moron. If they were taught that this way causes problems, then where is this list published?
That is because when writing a database application I have learned that it is the database which is the most important and the software is a mere implementation detail. Using two different design methodologies - one for the software and another for the database - would be a recipe for disaster. I always design the database first using the rules of normalisation, then I skip OOD completely and force my software structure to follow my database structure. In that way I avoid the problem of Object-relational Impedance Mismatch which then requires that abomination of a solution called an Object-Relational Mapper (ORM).
Writing database applications requires a knowledge of how databases work, yet few OO developers understand this. They start with OO theory and then try to apply that theory in different scenarios, then complain when difficulties arise. I have seen such excuses as:
The advantage I have is that I worked with a variety of database systems - hierarchical, network and relational - for several decades using non-OO languages, so I knew how to design databases and write database applications. When I switched to an OO-capable language all I had to do was learn how to leverage the new concepts - encapsulation, inheritance and polymorphism - in order to write programs with more reusability.
Then you don't understand how to use inheritance. All code which is common to all database tables is defined once in the abstract table class and then inherited by every concrete table class. Every piece of code which can be shared is defined once within the abstract class. Every piece of code which is specific to a particular table is defined within that table's class. If code is defined once and shared multiple times then where exactly is the duplication?
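A minimal sketch of that split between shared and table-specific code might look like the following. The names used here (`insertRecord()`, the `preInsert()` hook, and the `Order` and `Country` classes) are assumptions made for illustration, and the database write is replaced by a stand-in.

```php
<?php
// Hypothetical sketch: shared code lives once in the abstract class,
// table-specific code lives in the concrete class.

abstract class AbstractTable
{
    protected $tableName = '';

    // Common behaviour, written once and inherited by every table class.
    public function insertRecord(array $row): array
    {
        $row = $this->preInsert($row);          // table-specific hook
        $row['_saved_to'] = $this->tableName;   // stand-in for the real database write
        return $row;
    }

    // Default hook does nothing; concrete classes override it when needed.
    protected function preInsert(array $row): array
    {
        return $row;
    }
}

class Order extends AbstractTable
{
    protected $tableName = 'order';

    // Only the rule that is specific to this table is written here.
    protected function preInsert(array $row): array
    {
        if (!isset($row['status'])) {
            $row['status'] = 'NEW';
        }
        return $row;
    }
}

class Country extends AbstractTable
{
    protected $tableName = 'country';   // no custom code needed at all
}

$order   = (new Order())->insertRecord(['order_id' => 1]);
$country = (new Country())->insertRecord(['country_id' => 'GB']);
```

The `Country` class contains nothing but its table name, yet it supports the same insert operation as `Order`, because that operation exists only once in the abstract class.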
Why? Object composition is only needed by people who don't understand inheritance and therefore overuse it by creating deep inheritance hierarchies. Where you only ever inherit from an abstract class there are no complex hierarchies, and therefore none of the problems that such hierarchies produce.
This is discussed further in What is/is not considered to be good OO programming.
This complaint was explained in the following ways:
Abstract concepts are classes, their instances are objects. IMO The table 'cars' is not an abstract concept but an object in the world.
Classes are supposed to represent abstract concepts. The concept of a table is abstract. A given SQL table is not, it's an object in the world.
If the concept of a table is abstract, that is why I have an abstract table class. If a given SQL table is an object in the real world, that's why I have a concrete table class for each table which can be instantiated into an object. Each concrete table class inherits sharable code which is contained within the abstract class, on top of which it provides everything needed to work with a particular table. I have not seen any definitions of "abstract", "concrete", "class" and "instance" which invalidate what I have done, so how can my implementation be wrong?
As far as I am concerned Object Oriented Programming (OOP) requires nothing more than writing programs around objects, thus taking advantage of encapsulation, inheritance and polymorphism to increase code reuse and decrease code maintenance. This is discussed further in What is Object Oriented Programming?.
Nobody in their right minds would create a separate subclass for each row in a table. Just as a Person table can hold many rows, one for each person, my Person object is capable of holding many rows, one for each person. I do the same for every table in my databases - one table, one class, where each instance of that class can hold as many rows as is necessary. This follows the Table module pattern which Martin Fowler wrote about in his book Patterns of Enterprise Application Architecture. He also has Class Table Inheritance and Concrete Table Inheritance, but as these talk about hierarchies of tables, which I do not have, I do not use them. I do not have table hierarchies in my software as there are no such table hierarchies in the database. The use of relationships may indicate that one table is related to another, but this in no way forces me to go through one table in a relationship in order to get to the other. Each table can be treated as an independent object - in fact it has to be for insert, update and delete operations - but for read operations it is possible to combine data from several tables by using JOIN clauses in the SELECT query.
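The notion of a single object holding any number of rows can be sketched like this. It is a simplified illustration only: the methods `load()`, `rowCount()` and `column()` are hypothetical, and the data is supplied directly rather than being read from a database.

```php
<?php
// Hypothetical sketch of a Table Module style object: one instance, many rows.

class Person
{
    // Indexed array of rows; each row is an associative array keyed by column name.
    private $fieldarray = [];

    public function load(array $rows)
    {
        $this->fieldarray = array_values($rows);
    }

    public function rowCount(): int
    {
        return count($this->fieldarray);
    }

    // Return a single column from every row held by this object.
    public function column(string $name): array
    {
        return array_column($this->fieldarray, $name);
    }
}

$person = new Person();
$person->load([
    ['person_id' => 1, 'last_name' => 'Jones'],
    ['person_id' => 2, 'last_name' => 'Patel'],
    ['person_id' => 3, 'last_name' => 'Garcia'],
]);

$count = $person->rowCount();
$names = $person->column('last_name');
```

There is one class for the table and one object in memory, regardless of how many rows that object happens to contain.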
How can I create an instance of a class without having a class to start with? I cannot create an instance of an abstract class and then supply it with the information it needs as the rules of OO explicitly prohibit instantiating an abstract class into an object. And you have the nerve to tell me that I don't understand OO!
As for losing the potential benefits of low maintenance, you are talking out of the wrong end of your alimentary canal. All the common code is contained within a single abstract table class, thus following the DRY principle, and each of my 350 concrete table classes contains only that code which is specific to that table.
How so? I write nothing but database applications, so deliberately obscuring from my software the fact that it is communicating with a database would be counter productive. I have gone so far as to implement the 3-Tier Architecture which takes all database code out of the business layer and puts it into a separate data access layer, but I will go no further. The fact that the business layer is then dependent on the data access layer is not a problem, it is precisely how this architecture is supposed to work, and it does not create problems, it solves them. It provides me with the ability to switch from one DBMS to another simply by changing a single line of code in my configuration file, and that is all the flexibility I need. My business layer is therefore not dependent on a particular DBMS, just the idea of a non-specific DBMS which can be identified later.
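Switching DBMS through a single configuration value could be sketched along these lines. The `DataAccess` interface, the two class names, the `$config` layout and the `getDataAccess()` function are all assumptions for illustration; the real configuration file and data access classes may look quite different.

```php
<?php
// Hypothetical sketch: one data access class per DBMS, chosen by a single config value.

interface DataAccess
{
    public function quoteIdentifier(string $name): string;
}

class MysqlAccess implements DataAccess
{
    public function quoteIdentifier(string $name): string
    {
        return '`' . $name . '`';   // MySQL quotes identifiers with backticks
    }
}

class PostgresAccess implements DataAccess
{
    public function quoteIdentifier(string $name): string
    {
        return '"' . $name . '"';   // PostgreSQL quotes identifiers with double quotes
    }
}

// The one line that would change when switching DBMS (assumed config format).
$config = ['dbms' => 'postgres'];

function getDataAccess(array $config): DataAccess
{
    $map = ['mysql' => 'MysqlAccess', 'postgres' => 'PostgresAccess'];
    if (!isset($map[$config['dbms']])) {
        throw new InvalidArgumentException('Unknown DBMS: ' . $config['dbms']);
    }
    $class = $map[$config['dbms']];
    return new $class();
}

// The business layer asks for "a DBMS" without knowing which one it will get.
$dao    = getDataAccess($config);
$quoted = $dao->quoteIdentifier('person');
```

The calling code depends only on the interface, so it is tied to the idea of a DBMS rather than to any particular one.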
Why not? Each table is a separate entity in the database, with its own set of columns and its own business rules, so why shouldn't I create a separate class to encapsulate this information? It has to go somewhere, so where would you put it? Besides, creating a new table class is very easy with my framework:
Once I have created the class for that table I can then use the generate PHP script facility in my Data Dictionary to create as many user transactions as is necessary to deal with that table. Each transaction can be run immediately from the framework. These will perform the basic operations after which the developer can add in as many customisations as is necessary.
No I don't. If I need to change a table's structure I deal with it in three simple steps:
I do NOT have to do any of the following:
You may have difficulties with your implementation, but remember that my implementation is totally different, which it had to be in order to eliminate those difficulties.
Every newbie programmer is taught that "design patterns are good", so they do the stupid thing and try to implement as many design patterns as possible. They have their favourite patterns and cannot understand why everyone else does not use the same ones. There are even arguments as to how each pattern should be used. Take a look at the criticisms against my implementation of the MVC pattern as an example.
I do not pick patterns in advance and then attempt to write code which implements them. Instead I write code that works, and after I have got it working I refactor it as necessary to ensure that the structure and logic are as sound as possible. If the code then appears to match a particular pattern then that is pure coincidence - it is more by accident than design (pun intended!)
This topic is discussed further in You are using the wrong design patterns and You don't understand Design Patterns.
Too many people seem to think that code which does not follow their definition of OOP is not "proper" OOP at all, therefore it must be procedural. In this context the term "procedural" is used as an insult. Some of these criticisms are explained in In the world of OOP am I Hero or Heretic?. My full response is contained in What is the difference between Procedural and OO programming?
Some people seem to think that just because my abstract table class is bigger than what they are used to that it is automatically bad. They seem to think that there is a rule which says that a class cannot have more than N methods, and each method should not have more than N lines of code (where N is a different number depending on who you talk to). As far as I am concerned this rule was only invented to cater for those people who are so intellectually challenged they cannot count to more than 10 without taking their shoes and socks off. I am obeying the rule of encapsulation which states quite clearly that when you have identified an entity you create a class for that entity which contains ALL the properties and ALL the methods that the entity requires. Note that as I use the 3-Tier Architecture each table class contains only business logic - all data access logic and presentation logic are in separate components. I am following the Single Responsibility Principle (SRP), so how can I be wrong?
Besides, this class cannot be used to create a God Object as it is an abstract class and cannot be instantiated into an object. This class is inherited by each concrete table class, of which I now have over 350, so each table class has its own object. There is no such thing as a single object which handles all database tables.
The definition of a God Object also states that it contains a majority of the program's overall functionality, and in my book "majority" means "greater than 50%", and after counting all the lines of code in my framework I can report that it only contains 17%. That is measurable proof, so your opinion is not worth the toilet paper on which it is written. This also invalidates the claim that I have multiple God classes.
This topic is discussed in more detail in You have created a monster "god" class and A class with multiple methods has multiple responsibilities.
People look at an example of one of my concrete table classes which starts off by containing nothing more than a constructor and because it is so small they surmise that it must be anemic. Perhaps they don't notice the use of the word "extends" which allows it to incorporate code from a huge abstract class. An anemic domain model is supposed to be one which contains data but no methods to process any business rules, but if you opened your eyes and looked closely enough you would see that all the business rules, which include data validation, for each database table are performed within the class for that table.
This topic is discussed in more detail in You have created an anemic domain model.
Can anyone explain to me how some people looking at my table classes consider that my abstract table class is a God Object which does too much while my concrete table classes are all anemic as they do too little? Surely these two accusations are mutually exclusive?
The full complaint, as explained in OOP for Heretics, was as follows:
If you have one class per database table you are relegating each class to being no more than a simple transport mechanism for moving data between the database and the user interface. It is supposed to be more complicated than that.
Who says? Surely if I make something more complicated than it need be I would be violating the KISS principle? You must be a member of the Let's-make-it-more-complicated-than-it-really-is-just-to-prove-how-clever-we-are brigade. The alternative idea, that of having a class which is responsible for a group of tables, is something I could never dream up. I have worked with databases for several decades, and just as the DBMS itself treats each table as a separate entity with its own set of business rules, so I follow the principles of OOP and encapsulate all those rules in a separate class. This simple approach has enabled me to identify and take advantage of a wide range of benefits which are explained below.
That might have been true if my framework could only generate the small family of forms as shown in Figure 1 as these only support the basic CRUD operations. However, in order to deal with more complex situations I have created a library of over 40 Transaction Patterns, and using these patterns I have written a complex ERP application which now has 14 subsystems, over 350 database tables, 700 relationships, and over 2,800 user transactions. That surely qualifies as "more than simple". In my decades of experience I can safely say that every complex transaction starts off as a simple transaction to which complexity is then added. The user transactions that are generated by my framework can cover a wide range of scenarios, and can be run immediately after being generated to show the basic functionality. Extra complexity can be added simply by inserting custom code into the relevant customisable methods which already exist in any table class.
If you cannot write a complex enterprise application using your implementation of OOP then perhaps it is your implementation which makes it more difficult than it should be.
This is discussed further in Table Oriented Programming (TOP).
Then your implementation was obviously faulty. Mine was not. Perhaps you need to rethink your ideas on how OOP should be implemented and concentrate, as I do, on producing results instead of following arbitrary rules.
Many people seem to be annoyed at my heretical approach to OOP, but I don't care. It has enabled me to avoid a lot of the coding that other approaches seem to require, and to provide a framework that takes care of a lot of the standard functionality that is needed in a database application. Thus I can achieve more with less, and isn't that supposed to be the benefit of using OOP in the first place? Below is a list of the areas in which I can save time:
I do not use two incompatible design methodologies in my applications, so my software structure is always in sync with my database structure, which I design using the rules of data normalisation. This avoids the problem known as Object-relational Impedance Mismatch which then means that I do not have to work around that problem by using that abomination of a solution called an Object-Relational Mapper (ORM). Prevention is always better than cure.
The only objects that my business layer has to deal with are objects in the database, which are tables. These do not have hierarchies in the database so why should I have hierarchies in my software? The only inheritance I need is to inherit all the standard code from a single abstract table class.
As far as I am concerned all the necessary design patterns have been built into my framework. I started off by using the 3-Tier Architecture, but because I ended up by splitting the presentation layer into two separate components a colleague pointed out that this was also an implementation of the MVC design pattern. This resulted in a four-part structure which is shown in Aren't the MVC and 3-Tier architectures the same thing? The four components are as follows:
Every concrete table class inherits vast amounts of standard code from my abstract table class, so each table class need only contain the specifics for that particular table. Each class file can be generated by the framework in two simple steps:
If a table's structure ever changes all that needs to be done is to repeat the import and export process which will cause the structure file to be recreated. The class file will not be overwritten as it may have been modified to include code in customisable methods.
Every method required by a Controller to communicate with the Model is defined within my abstract table class which is inherited by every concrete table class. Because it is inherited it does not have to be written.
Because all the data, both incoming and outgoing, is held in an array of variables called $fieldarray, which is defined in the abstract table class, I don't have to spend time in defining a separate variable for each column.
I do not need to define a separate method for each user transaction (also known as "use case" or "unit of work") as each transaction has its own entry on the MNU_TASK table. At run-time the user selects which task he wants to run from a menu of options, and that selection activates the relevant Controller which calls methods on the relevant Model. While the code within a Controller is standard, the code within a Model's customisable methods can be set to whatever the developer requires.
The validation requirements for each column in a table are defined in the $fieldspec array which is made available in the <table>.dict.inc file which is exported from the Data Dictionary. All user input comes in as an associative array, such as $_POST, where the column values are keyed by the column name. The abstract table class then uses standard code to verify that each of the values in the data array matches that column's specifications in the specifications array.
This topic is discussed further in How NOT to validate data.
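A single standard routine which checks a data array against a specifications array could be sketched as follows. This is an assumption-laden illustration: the function `validateRow()`, the spec keys `type`, `size` and `required`, and the error messages are all hypothetical, and the real contents of a `<table>.dict.inc` file may differ.

```php
<?php
// Hypothetical sketch: validate an array of field data against an array of field specs.

function validateRow(array $fieldarray, array $fieldspec): array
{
    $errors = [];
    foreach ($fieldspec as $field => $spec) {
        $value = isset($fieldarray[$field]) ? (string) $fieldarray[$field] : '';

        if ($value === '') {
            if (!empty($spec['required'])) {
                $errors[$field] = 'This field is required';
            }
            continue;   // nothing more to check on an empty value
        }

        if ($spec['type'] === 'integer' && filter_var($value, FILTER_VALIDATE_INT) === false) {
            $errors[$field] = 'Must be an integer';
        } elseif ($spec['type'] === 'string' && isset($spec['size']) && strlen($value) > $spec['size']) {
            $errors[$field] = 'Too long (maximum ' . $spec['size'] . ' characters)';
        }
    }
    return $errors;
}

// Assumed layout of a specifications array, keyed by column name.
$fieldspec = [
    'person_id' => ['type' => 'integer', 'required' => true],
    'last_name' => ['type' => 'string', 'size' => 10, 'required' => true],
    'nickname'  => ['type' => 'string', 'size' => 5],
];

// Stand-in for $_POST: column values keyed by column name.
$input = ['person_id' => 'abc', 'last_name' => 'Jones', 'nickname' => 'Jonesy'];

$errors = validateRow($input, $fieldspec);
```

The routine contains no table or column names of its own, so the same code can validate any table whose specifications are supplied in this form.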
Anyone who has written SQL queries for any length of time will tell you that they all follow a standard pattern with the only differences being the table and column names. Anything which is standard and common to all database tables is therefore defined within the abstract table class, and this is merged with the specifics of a particular concrete table class at run-time in order to provide the information required to build a particular query. This information is then sent to the Data Access Object (DAO) for processing.
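The observation that queries differ only in their table and column names can be illustrated with a short sketch. The functions `buildInsert()` and `buildSelect()` are hypothetical and are not taken from the framework; they produce SQL text with named placeholders in the style accepted by PDO, and they assume that the table and column names come from the database schema rather than from user input.

```php
<?php
// Hypothetical sketch: the query shape is standard; only table and column names vary.
// Table and column names are assumed to be trusted values taken from the schema.

function buildInsert(string $table, array $row): array
{
    $columns      = array_keys($row);
    $placeholders = array_map(function ($c) { return ':' . $c; }, $columns);

    $sql = 'INSERT INTO ' . $table
         . ' (' . implode(', ', $columns) . ')'
         . ' VALUES (' . implode(', ', $placeholders) . ')';

    return [$sql, $row];   // SQL text plus the values to bind
}

function buildSelect(string $table, array $where): array
{
    $sql = 'SELECT * FROM ' . $table;
    if ($where) {
        $conditions = array_map(
            function ($c) { return $c . ' = :' . $c; },
            array_keys($where)
        );
        $sql .= ' WHERE ' . implode(' AND ', $conditions);
    }
    return [$sql, $where];
}

list($insertSql) = buildInsert('person', ['person_id' => 1, 'last_name' => 'Jones']);
list($selectSql) = buildSelect('person', ['last_name' => 'Jones']);
```

Neither function knows anything about the `person` table; the same two functions would serve every table in the database.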
Having written thousands of user transactions over several decades I have noticed that these transactions can be broken down into patterns consisting of structure, behaviour and content. The structures can be provided by a collection of reusable XSL stylesheets, the behaviour can be provided by a collection of reusable Controllers, and the content can be provided by any one of the table classes. I have used this knowledge to create a library of Transaction Patterns which combine structure and behaviour for each user transaction, and I can use the Generate PHP Script feature of my Data Dictionary to create a working user transaction at the touch of a button by joining a particular Transaction Pattern with a particular table. This arrangement provides me with the following levels of reusability:
The only "difficulty" with this approach is deciding which Transaction Pattern to use in the first place, but as the framework download contains lots of samples this should become easier with experience.
I have seen such a thing proposed more than once, such as in Decoupling models from the database: Data Access Object pattern in PHP, and I am always surprised, even shocked, that so-called "professional" programmers can come up with such convoluted and complicated solutions. In my mind that is the total opposite of what should actually happen. In my methodology I *DO NOT* have a separate DAO for each table, I only have a separate DAO for each DBMS (MySQL, PostgreSQL, Oracle and SQL Server) where each can handle any table that exists. If you understand SQL you should realise that there are only four operations that can be performed on a database table - create, read, update and delete - so why would I duplicate those operations for each table when I can have a single object to handle any table?
For each user transaction the associated web page will follow the screen structure used by that particular Transaction Pattern, and all that is required is a small screen structure script to identify which column from each table goes where on that screen. This script is initially generated from the Data Dictionary, but can be amended afterwards. This file is fed into a View Object at run-time which writes it out to an XML document, along with all the data from the table class(es), which is then transformed into HTML using the designated XSL stylesheet.
All the following areas in a web page are automatically supplied by and handled by the framework:
If you have to write such code yourself then you know what a burden it can be. Now imagine not having to write such code.
Because I can save time by NOT doing a lot of useless things I can then spend that time in doing useful things which add value to the application.
When my critics tell me that I am breaking one of their precious rules they don't understand that breaking that rule has no adverse effect on my code. They cannot say "because you are doing that you cannot do this", so following that rule would not solve any problem for me. All it would do is force me to write additional code to achieve the same result, only differently, and if that effort is not rewarded with measurable benefits then in my universe that effort is a waste of time. I run a business where producing cost-effective software is the name of the game, so I prefer to spend my time in writing software that pleases my paying customers. I do not waste time in trying to write code that pleases the paradigm police with its purity as their definition of purity does not result in cost-effective software.
Not only does my approach NOT cause any problems, it actually opens up a lot of advantages which are completely closed to other methods.
Not only does my approach NOT break any documented rule, I would actually say that the idea of having a single class which is responsible for more than one database table is violating the Single Responsibility Principle. What is your answer to THAT?