What is Object Oriented Programming (OOP)?

By Tony Marston

3rd December 2006
Amended 20th April 2017

Introduction
What OOP is NOT
What is an Object Oriented language?
What OOP is
- Optional Extras
The difference between OOP and non-OOP
Practical Examples
- Encapsulation
- Inheritance
- Polymorphism
Popular misconceptions
- What Encapsulation is not
- What Polymorphism is not
- What Inheritance is not
- OOP requires a totally different thought process
What types of object should I create?
How many objects should I create?
What structure should I use?
How much reusability should I have?
Conclusion
References
Amendment History

Introduction

Quite often I see a question in a newsgroup or forum along the lines of: What is this thing called 'OOP'? What is so special about it? Why should I use it? How do I use it? The person asking this type of question usually has experience of non-OO programming and wants to know the benefits of making the switch. Unfortunately most of the replies I have seen have been long on words but short on substance, full of airy-fairy, wishy-washy, meaningless phrases which are absolutely no use at all to man or beast.

Having created thousands of programs using non-OO languages, and another 500+ using the OO features of PHP, I feel more than qualified to add my own contribution to the melting pot. According to some OO 'purists' I am not qualified at all as I was not taught to do things 'their' way and I refuse to follow 'their' methods. My response to that accusation is that there is no such thing as 'only one true way' with OOP just as there is no such thing as 'only one true way' with religion. People tell me that my methods are wrong, but they are making a classic mistake. My methods cannot be wrong for the simple reason that they work, and anybody with more than two brain cells to rub together will tell you that something that works cannot be wrong just as something that does not work cannot be right. My methods are not wrong, they are simply different, and sometimes it is a willingness to adopt a different approach that separates the code monkeys from the engineers.

One reason why some people give totally useless answers is that it was what they were taught, and they do not have the intelligence to look beyond what they were taught. Another reason is that some of the explanations about OO are rather vague and can be interpreted in several ways, and if something is open to interpretation it is also open to a great deal of mis-interpretation. If you do not believe that there is widespread confusion as to what OO is and is not then take a look at Nobody Agrees On What OO Is. Even some of the basic terminology can mean different things to different people, as explained in Abstraction, Encapsulation, and Information Hiding. If these people cannot agree on the basic concepts of OOP, then how can they possibly agree on how those concepts may be implemented?


What OOP is NOT

As a first step I shall debunk some of the answers that I have seen. In compiling the following list I picked out those descriptions which are not actually unique to OOP as those features which already exist in non-OO languages cannot be used to differentiate between the two.

OOP is about modeling the 'real world'

OOP is a programming paradigm that uses abstraction to create models based on the real world. It provides for better modeling of the real world by providing a much needed improvement in domain analysis and then integration with system design.

Rubbish. OOP is no better at modeling the real world than any other method. Every computer program which seeks to replace a manual process is based on a conceptual software model of that process, and if the model is wrong then the software will also be wrong. The conceptual model is created as an analyst's view of the real world, and the computer software is based solely on this conceptual model. OOP does not provide the ability to model objects which could not be modelled in previous paradigms, it simply provides the ability to produce different types of models where both the data and the operations which act upon that data can be defined (encapsulated) in the same unit (class). OOP does not guarantee that the model will be better, just that the implementation of that model will be different. You should also consider the fact that it would be totally impractical to model the whole of the real world as it is simply too vast and too complicated. It is only ever necessary to model those parts which are actually relevant to your current application, and it is the exercise of deciding what is and is not relevant which decides if your abstraction is correct.

Bear in mind that unless you are developing software which directly manipulates a real-world object, such as process control, robotics, avionics or missile guidance systems, then some of the properties and methods which apply to that real-world object may be completely irrelevant in your software representation. If, for example, you are developing an enterprise application such as Sales Order Processing which deals with entities such as Products, Customers and Orders, you are only manipulating the information about those entities and not the actual entities themselves. In pre-computer days this information was held on paper documents, but nowadays it is held in a database in the form of tables, columns and relationships. An object in the real world may have many properties and methods, but in the software representation it may only need a small subset. For example, an organisation may sell many different products with each having different properties, but all that the software may require to maintain is an identity, a description and a price. A real person may have operations such as stand, sit, walk, and run, but these operations would never be needed in an enterprise application. Regardless of the operations that can be performed on a real-world object, with a database table the only operations that can be performed are Create, Read, Update and Delete (CRUD). Following the process called data normalisation the information for an entity may need to be split across several tables, each with its own columns, constraints and relationships. Each object in the database is a separate table, so I see no reason why I should not have a separate class in my software to deal with each object in my database. Some people advocate having a group of database tables being handled by a single class, but this is not how databases work. It is not necessary to go through one table to get to another as each table is an independent object with its own properties. 
So, each independent object in the database should have its own independent class in the software.
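The idea of one class per database table can be sketched roughly as follows. This is only an illustration: the class name ProductTable, its method names and the column names are hypothetical, and an in-memory array stands in for a real table so that the example runs without a database.

```php
<?php
// Hypothetical sketch: one class for one database table, offering only
// the Create, Read, Update and Delete operations. The $rows array is an
// assumed stand-in for the real 'product' table.
class ProductTable
{
    public $rows = [];   // keyed by product_id

    public function insert(array $row)
    {
        $this->rows[$row['product_id']] = $row;
    }

    public function read($id)
    {
        return $this->rows[$id] ?? null;
    }

    public function update($id, array $changes)
    {
        if (isset($this->rows[$id])) {
            $this->rows[$id] = array_merge($this->rows[$id], $changes);
        }
    }

    public function delete($id)
    {
        unset($this->rows[$id]);
    }
}

$product = new ProductTable();
$product->insert(['product_id' => 'P1', 'description' => 'Widget', 'price' => 9.99]);
$product->update('P1', ['price' => 10.49]);
$row = $product->read('P1');
```

Only an identity, a description and a price are held, which mirrors the point that the software needs a small subset of what the real-world product possesses.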

The article Don't try to model the real world, it doesn't exist puts forward an interesting viewpoint.

The term "abstraction" is also open to interpretation, and therefore mis-interpretation, as discussed in Understand what "abstraction" really means. This is why some people's abstractions look more like the work of Picasso when what is required should look like the work of Michelangelo.

That is why it is possible to create software that does A, B and C but it is useless to the customer as it does not also do X, Y and Z. The real world may contain X, Y and Z but the analyst did not include it in his model either because he did not spot it or because the customer failed to mention it in his Specification Of Requirements (SOR). I know because I have encountered both situations in my long career.

Not everyone agrees that direct real-world mapping is facilitated by OOP, or is even a worthy goal; Bertrand Meyer argues in Object-Oriented Software Construction that a program is not a model of the world but a model of a model of some part of the world; "Reality is a cousin twice removed".

OOP is about code re-use

The power of object-oriented systems lies in their promise of code reuse which will increase productivity, reduce costs and improve software quality.

Rubbish. This implies that code re-use is possible in OOP and not possible in non-OOP. Using OOP does not guarantee that more reusable code will be available as reusability depends on how the code is written, not the language in which it was written. It is possible to produce libraries of reusable modules in any non-OO language (I know, because I was doing just that with COBOL in 1985) just as it is possible to produce volumes of non-reusable code in any OO language.

It does not depend on the capabilities of the language as it is possible to have the same block of code duplicated in 100 places in any language. It is also possible, in any language, to put that block of code into a reusable module and call that module from those 100 places.
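As a minimal sketch of reuse without any OO features, a block of code can be placed in a plain function and called from wherever it is needed. The function name formatPrice and its default currency are assumptions made purely for illustration.

```php
<?php
// Hypothetical helper: one shared definition instead of the same
// formatting logic being pasted into every script that needs it.
function formatPrice(float $amount, string $currency = 'GBP'): string
{
    return $currency . ' ' . number_format($amount, 2, '.', ',');
}

// Two of the many places which could call the same module.
$invoiceLine = formatPrice(1234.5);
$quoteLine   = formatPrice(99, 'USD');
```

No class is involved, yet the code exists in one place and is reused everywhere.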

The one thing that OO languages have which procedural languages do not is inheritance. This allows code to be defined on one class (the superclass) and then inherited by any number of other subclasses. The subclass then combines everything in the superclass with whatever is defined within itself. Unfortunately too many programmers totally misuse this feature and create complex multi-level class hierarchies which become so messy that they abandon the idea of inheritance in favour of Object Composition.

One of the early promises of OOP that I heard many years ago was that it would be possible for a software vendor to produce a library of pre-written classes, and for other developers to use these "off the shelf" classes instead of creating their own custom versions and thus "re-inventing the wheel". This dream never materialised, which just goes to prove that OOP promises much but delivers little.

OOP is about modularity

The source code for an object can be written and maintained independently of the source code for other objects. Once created, an object can be easily passed around inside the system.

Rubbish. The concept of modular programming has existed in non-OO languages for many years, so this argument cannot be used to explain why OO is supposed to be better than non-OO. Just as it is possible in any language to hold the source code for an entire application in a single file, it is just as possible, in any language, to break that source code into smaller modules so that the source code for each module can be maintained and compiled independently of all other modules.

Besides, any software which consists of multiple classes is automatically "modular" as each class can be considered to be a self-contained "module". The critical factor is how well each module or class is designed.

OOP is about plugability

If a particular object turns out to be problematic, you can simply remove it from your application and plug in a different object as its replacement. This is analogous to fixing mechanical problems in the real world. If a bolt breaks, you replace it, not the entire machine.

Rubbish. This is the same as modularity where the source code for any individual module can be modified, recompiled and inserted into the application without having to touch any of the other modules.

OOP is about implementation hiding

By interacting only with an object's methods, the details of its internal implementation remain hidden from the outside world.

In the first place implementation hiding was never one of the aims of OOP, it is merely a by-product of encapsulation. The outside world can see the method names which can be used on an object, but not the code which exists behind those method names.

In the second place implementation hiding is not unique to OOP, nor did it suddenly appear because of OOP. In any language, whether it is object oriented, procedural, functional, or whatever, when you write a function or procedure (and remember that a class method is nothing more than a procedural function within a class) all you are exposing to the outside world, including programmers who write code which calls that function, is the function's signature, as in:

$return = functionName(arg1, arg2, ..., argX);           // procedural
$return = $object->functionName(arg1, arg2, ..., argX);  // object oriented

Here you are identifying three things:

- the name of the function;
- the arguments which are passed in;
- the value which is returned.

The only thing which is not exposed is the code that is executed when the function is called. You know what the function does but not how it does it. You know what data goes in and what data comes out, but not what code is executed in the middle. In other words how that function is implemented, the actual code which is executed, is hidden from view. The documentation which comes with the function library should describe what the function does so that the programmer can decide if that function is the right one to call, and how to call it, but the actual code behind the function name is still hidden. The documentation may provide a listing of the source code, and the actual source code may be provided for the programmer to view and possibly modify, but as far as the function's signature goes the implementation is effectively hidden. This means that at any time you could install a new version of that function with a modified implementation and, provided that the function's signature did not change, you would not have to change any code which calls that function. This means that the implementation of a function could change at any time but the calling program would not know that it had changed. The implementation is "hidden", so how can the calling program possibly know that it has changed?
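The point about swapping an implementation behind an unchanged signature can be sketched with an ordinary function. The function sumTo is hypothetical, and the "version 1" body is shown only as a comment for comparison.

```php
<?php
// Hypothetical function: callers see only the signature
// sumTo(int $n): int, never the body.
function sumTo(int $n): int
{
    // Version 1 looped from 1 to $n:
    //     $total = 0;
    //     for ($i = 1; $i <= $n; $i++) { $total += $i; }
    //     return $total;
    // Version 2 uses the closed-form formula instead. The signature
    // is unchanged, so no calling code has to be touched.
    return intdiv($n * ($n + 1), 2);
}

$return = sumTo(100);   // the call is identical for both versions
```

The calling line cannot tell which version is installed, and this is achieved without a class in sight.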

OOP is about information hiding

A lot of people take the misunderstanding regarding implementation hiding and then go one step further, saying that because the data is part of the implementation the data must be hidden as well. This is how they justify the use of the visibility options of public, private and protected, which in turn necessitate the use of getters (accessors) and setters (mutators). These people do not realise that there is a fundamental difference between "implementation" and "information": the implementation is the code behind a method, whereas the information is the data held within the object.

The act of encapsulation is supposed to put an entity's methods and data inside a capsule, but nowhere does it say that the walls of the capsule should be opaque. Nowhere does it say that the data inside the capsule should be hidden from view. Nowhere does it say that an object's data cannot be accessed directly without the use of a separate API. It is only the code behind the object's API which is hidden from view, not the data contained within the object itself.

It is possible to access an object's data in two ways:

While some programmers say that the use of getters and setters to access an object's data should be mandatory, there are others who have a different opinion, as shown in Why getter and setter methods are evil and Getters/Setters. Evil. Period.
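A short sketch may help here, on the assumption that the two ways in question are direct access to a public property and indirect access through getter and setter methods. The Customer class and its members are hypothetical.

```php
<?php
// Hypothetical class showing both styles side by side.
class Customer
{
    public $name;   // reachable directly from outside the object

    public function setName($name)
    {
        $this->name = $name;
    }

    public function getName()
    {
        return $this->name;
    }
}

$customer = new Customer();

// 1. Direct access to a public property
$customer->name = 'Smith';
$direct = $customer->name;

// 2. Indirect access through a setter and a getter
$customer->setName('Jones');
$viaGetter = $customer->getName();
```

In both cases the data still lives inside the capsule; the only difference is whether a separate API stands between the caller and that data.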

OOP is about the passing of messages

Message passing is the process by which an object sends data to another object or asks the other object to invoke a method.

Rubbish. The way that an object's method is invoked in an OO language is identical to the way in which a function or procedure in a non-OO language is invoked. If the language supports both non-OO functions and object methods (as PHP does) the method of invocation is called "calling", not "message passing". In fact in some languages it is necessary to specify the word "call" when invoking a subroutine.

non-OO: $result = function(arg1, arg2, ...)
OO:     $result = $object->function(arg1, arg2, ...)

The result of each invocation is exactly the same - the caller is suspended while control is passed to the callee, and control is not returned to the caller until the callee has finished.
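The following sketch records the order of events around a function call and a method call. The names doubleIt and Doubler are hypothetical, and the $log array exists only to make the sequence visible.

```php
<?php
$log = [];

// Hypothetical non-OO function
function doubleIt(int $n, array &$log): int
{
    $log[] = 'function running';
    return $n * 2;
}

// Hypothetical class with a method of the same name
class Doubler
{
    public function doubleIt(int $n, array &$log): int
    {
        $log[] = 'method running';
        return $n * 2;
    }
}

$log[] = 'before function call';
$a = doubleIt(21, $log);
$log[] = 'after function call';

$object = new Doubler();
$log[] = 'before method call';
$b = $object->doubleIt(21, $log);
$log[] = 'after method call';
```

In both cases the "after" entry cannot be written until the callee has returned, which is the behaviour of a call and not of a message dropped into a queue.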

I have worked with messaging software in the past and I can tell you quite categorically that they are completely different. A true messaging system has the following characteristics:

A common example of an asynchronous message system is email. You send an email, and it goes into the recipient's queue. While you are waiting for a reply you can do other things, but every now and then you check your inbox for a reply.

Activating a method on an object is exactly the same as calling a function, and works as follows:

As you can see the mechanics of activating a method in an object are exactly the same as calling a non-OO function and nothing like sending a message in a messaging system.

OOP is about separation of responsibilities

Each object can be viewed as an independent little machine with a distinct role or responsibility.

Rubbish. It depends entirely on how the module was written, and not the language in which it was written. It is possible to write independent modules in a procedural language such as COBOL, just as it is possible to write non-independent modules in an OO language.

The problem with "separation of responsibilities" is that different people have a different interpretation as to what it actually means. To some people the database operations such as SELECT, INSERT, UPDATE and DELETE require their own objects whereas others (like myself) put them all together in a single data access object (DAO). Some programmers may have a separate DAO for each table in the database while others (like myself) may have a single DAO which can deal with any and all database tables for a specific DBMS. Before you can separate any responsibilities you must first identify what those responsibilities are, and this is a design decision which is totally separate from the language in which the design is ultimately implemented.
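As a rough illustration of a single DAO which can deal with any table, the sketch below builds parameterised SQL from a table name and an array of values. It is not the author's actual class: GenericDao and its method names are hypothetical, table and column names are assumed to come from trusted code, and executing the statement against a real DBMS (for example through PDO) is deliberately left out.

```php
<?php
// Hypothetical DAO: one class builds SQL for whichever table it is given.
// Each method returns the statement text plus its list of parameter values.
class GenericDao
{
    public function buildInsert(string $table, array $row): array
    {
        $columns      = array_keys($row);
        $placeholders = array_fill(0, count($columns), '?');
        $sql = 'INSERT INTO ' . $table
             . ' (' . implode(', ', $columns) . ')'
             . ' VALUES (' . implode(', ', $placeholders) . ')';
        return [$sql, array_values($row)];
    }

    public function buildDelete(string $table, array $key): array
    {
        $conditions = [];
        foreach (array_keys($key) as $column) {
            $conditions[] = $column . ' = ?';
        }
        $sql = 'DELETE FROM ' . $table . ' WHERE ' . implode(' AND ', $conditions);
        return [$sql, array_values($key)];
    }
}

$dao = new GenericDao();
[$sql1, $params1] = $dao->buildInsert('product', ['product_id' => 'P1', 'price' => 9.99]);
[$sql2, $params2] = $dao->buildDelete('customer', ['customer_id' => 42]);
```

The same object serves both the product and the customer table, so no per-table DAO is required.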

The Single Responsibility Principle (SRP) was first defined by Robert C. Martin who said the following:

How do you separate concerns? You separate behaviors that change at different times for different reasons. Things that change together you keep together. Things that change apart you keep apart.

GUIs change at a very different rate, and for very different reasons, than business rules. Database schemas change for very different reasons, and at very different rates than business rules. Keeping these concerns (GUI, business rules, database) separate is good design.

Creating separate components for the GUI, business rules and database access is not restricted to OOP. This is in fact a description of the 3-Tier Architecture which I first encountered in a non-OO language called UNIFACE.

In all my many years of experience the only project that I have ever been involved in which failed to be implemented due to "technical difficulties" was one where the system architects were OO "experts" who knew everything there was to know (or so they thought) about this "separation of responsibilities". They designed a system around design patterns which had a different module for each responsibility, and this resulted in a design with at least ten layers of code between the UI and the database. This made the creation of new components far more complicated and convoluted than it need be, and it made testing and debugging an absolute nightmare. The result was far too expensive for the client, both in time and money, so he pulled the plug on the whole project and cut his losses. A pair of components which took 10 days to build using these "new fangled" OO techniques took me less than an hour to build using my "old fashioned" non-OO methods. So much for the superiority of OO.

Besides, any software which consists of multiple classes/modules automatically has "separation of concerns" as each class/module can be considered to be "concerned with" or "responsible for" a particular entity. The critical factor is how well each class/module deals with the requirements of its entity.

OOP is easier to learn

OOP is easier to learn for those new to computer programming than previous approaches, and its approach is often simpler to develop and to maintain, lending itself to more direct analysis, coding, and understanding of complex situations and procedures than other programming methods.

Rubbish. This is just marketing hype. Every new language/tool/paradigm is supposed to be better than everything else, but it rarely is. It is not what you use but how you use it that counts, and I have personally witnessed cases where an "old" language, when used by competent programmers, regularly outperformed a "new" language which was advertised as being more productive by several orders of magnitude.

A person's ability to learn something is often limited by the quality of the teachers or teaching materials, and I'm afraid that too much of what is being taught is too complicated, too inefficient, and more likely to lead to project failures than successes. Too often the teachers insist that there is "only one way" to do OOP, and that is where I most strongly disagree. I have successfully migrated to OOP by ignoring all these so-called "experts" and drawing on my years of experience with non-OO languages.

Someone once told me that OOP is not as simple as taking a procedural function and wrapping it in a class. I disagree. It *IS* that simple. The only "trick" is placing related functions in the same class (this is called encapsulation), then adjusting them to deal with the state which can be maintained within an object. The really clever thing that you can do with classes is to extend a parent or abstract class into a number of subclasses through inheritance. You can also have a function/method which is available in objects which are instantiated from different classes, which gives you polymorphism. I have seen too many examples where classes have been created with the wrong mix of functions, either related functions not being in the same class, or classes containing functions which are not actually related. I have seen inheritance overused so much that the resulting class hierarchy is really difficult to maintain and enhance. I have seen programmers use every OO feature or construct which is available in the language for no better reason than to impress other programmers with their ability to write obscure code, the theory being that the more obscure it is the more OO it is. They seem to think that if it is too simple then you are not doing it right. These people have obviously not heard of the KISS principle.
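To show how small the step from procedural functions to a class can be, here is a sketch with a pair of related functions and the same pair wrapped in a class. All names (basketAdd, basketTotal, Basket) are hypothetical.

```php
<?php
// Procedural version: the caller carries the state between calls.
function basketAdd(array $items, string $item, int $qty): array
{
    $items[$item] = ($items[$item] ?? 0) + $qty;
    return $items;
}

function basketTotal(array $items): int
{
    return array_sum($items);
}

// OO version: the same two functions placed in one class and adjusted
// so that the state is maintained inside the object.
class Basket
{
    public $items = [];

    public function add(string $item, int $qty)
    {
        $this->items[$item] = ($this->items[$item] ?? 0) + $qty;
    }

    public function total(): int
    {
        return array_sum($this->items);
    }
}

$items = basketAdd([], 'apple', 2);
$items = basketAdd($items, 'pear', 3);
$proceduralTotal = basketTotal($items);

$basket = new Basket();
$basket->add('apple', 2);
$basket->add('pear', 3);
$objectTotal = $basket->total();
```

The logic inside each function is untouched; the only adjustment is that $items moves from a parameter to a property.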

OOP is about actors and actions

Object Oriented Programming is a mode of software development that modularizes and decomposes code authorship into the definition of actors and actions.

This is so vague it is meaningless, and therefore of absolutely no use at all.

OOP is all about late binding

'Late' refers to the fact that the binding decisions (which binary to load, which function to call) are deferred as long as possible, often until just before the function is called, rather than having the binding decisions made at compile time (early).

Rubbish. Whether such binding takes place early or late does not separate OOP from non-OOP. It is possible to have a non-OO language which offers late binding, but that does not magically turn it into OO. Conversely, a language which supports classes, encapsulation, inheritance and polymorphism does not suddenly cease to be OO simply because it only offers early binding.
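Late binding without a single class can be sketched using PHP's variable functions, where the function to be called is chosen from a string at run time. The function names exportCsv, exportTsv and export are hypothetical.

```php
<?php
// Two plain functions, no classes involved.
function exportCsv(array $row): string
{
    return implode(',', $row);
}

function exportTsv(array $row): string
{
    return implode("\t", $row);
}

// Which function runs is decided only at run time, from a string.
function export(string $format, array $row): string
{
    $functionName = 'export' . ucfirst($format);
    return $functionName($row);
}

$csv = export('csv', ['a', 'b']);
$tsv = export('tsv', ['a', 'b']);
```

The binding is as late as it can be, yet nothing here is object oriented.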


As you can see, the above descriptions are either too vague or not specific to OOP, so they cannot be used as distinguishing features.


What is an Object Oriented language?

A computer language can be said to be Object Oriented if it provides support for the following:

Class: A class is a blueprint, or prototype, that defines the variables (data) and the methods (operations) common to all objects of a certain kind.
Object: An instance of a class. A class must be instantiated into an object before it can be used in the software. More than one instance of the same class can be in existence at any one time.
Encapsulation: The act of placing data and the operations that perform on that data in the same class. The class then becomes the 'capsule' or container for the data and operations. This binds together the data and functions that manipulate the data.

Note that this requires ALL the properties and ALL the methods to be placed in the SAME class. Breaking a single class into smaller classes so that the count of methods in any one class does not exceed an arbitrary number is therefore a bad idea as it violates encapsulation and makes the system harder to read and understand. Putting all methods which are related into the same class leads to high cohesion whereas putting related methods into separate classes leads to low cohesion.

Note that data may include meta-data (type, size, etc) as well as entity data.

Please also refer to What Encapsulation is not.
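A minimal sketch of class, object and encapsulation, using a hypothetical Account class:

```php
<?php
// Hypothetical class: the data and the operations which act on that
// data are placed in the same capsule.
class Account
{
    public $balance = 0;

    public function deposit(int $amount)
    {
        $this->balance += $amount;
    }

    public function withdraw(int $amount)
    {
        $this->balance -= $amount;
    }
}

// Two objects (instances) of the same class, each with its own state.
$first  = new Account();
$second = new Account();
$first->deposit(100);
$first->withdraw(30);
$second->deposit(5);
```

The class is the blueprint, and the two objects exist at the same time without affecting each other.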

Inheritance: The reuse of base classes (superclasses) to form derived classes (subclasses). Methods and properties defined in the superclass are automatically shared by any subclass. A subclass may override any of the methods in the superclass, or may introduce new methods of its own.

Note that I am referring to implementation inheritance (which uses the "extends" keyword) and not interface inheritance (which uses the "implements" keyword).

Please also refer to What Inheritance is not.
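Implementation inheritance with "extends" might look like the sketch below. The class names BaseTable and Product, and their methods, are hypothetical.

```php
<?php
// Hypothetical superclass holding code which every subclass can share.
class BaseTable
{
    public $tableName = 'unknown';

    public function describe(): string
    {
        return 'table ' . $this->tableName;
    }

    public function validate(array $row): array
    {
        return [];   // no errors by default
    }
}

// Hypothetical subclass: inherits describe(), overrides validate()
// and introduces one new method of its own.
class Product extends BaseTable
{
    public $tableName = 'product';

    public function validate(array $row): array
    {
        $errors = [];
        if (($row['price'] ?? 0) <= 0) {
            $errors[] = 'price must be positive';
        }
        return $errors;
    }

    public function isFree(array $row): bool
    {
        return ($row['price'] ?? 0) == 0;
    }
}

$product     = new Product();
$description = $product->describe();               // inherited method
$errors      = $product->validate(['price' => 0]); // overridden method
```

The describe() method is written once in the superclass yet is available in every subclass.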

Polymorphism: Same interface, different implementation. The ability to substitute one class for another. This means that different classes may contain the same method names, but the result which is returned by each method will be different as the code behind each method (the implementation) is different in each class.

Note that this does NOT require the use of the keywords "interface" and "implements" as these are totally optional in PHP. All that is required is that different classes implement the same method name with the same signature.

Please also refer to What Polymorphism is not.
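Polymorphism without the "interface" and "implements" keywords can be sketched as follows. The classes Customer and Invoice, the method getLabel and the function show are all hypothetical.

```php
<?php
// Two unrelated classes which happen to share a method signature.
class Customer
{
    public function getLabel(): string
    {
        return 'Customer record';
    }
}

class Invoice
{
    public function getLabel(): string
    {
        return 'Invoice record';
    }
}

// The caller neither knows nor cares which class it has been given.
function show($object): string
{
    return strtoupper($object->getLabel());
}

$labels = [];
foreach ([new Customer(), new Invoice()] as $object) {
    $labels[] = show($object);
}
```

The same call produces a different result for each class because the code behind the method name is different.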

A class defines (encapsulates) both the properties (data) of an entity and the methods (functions or operations) which may act upon those properties. Neither properties nor methods which can be applied to that entity should exist outside of that class definition.


What OOP is

When I came to learn OOP in late 2001 and early 2002 the resources which were available on the internet were very small in number and far less complicated. All I had to go on was a description of what made a language object oriented in Object Oriented Programming from October 2001 which stated something similar to the following:

Object Oriented Programming is programming which is oriented around objects, thus taking advantage of Encapsulation, Polymorphism, and Inheritance to increase code reuse and decrease code maintenance.

To do OO programming you need an OO language, and a language can only be said to be object oriented if it supports encapsulation (classes and objects), inheritance and polymorphism. It may support other features, but encapsulation, inheritance and polymorphism are the bare minimum. That is not just my personal opinion, it is also the opinion of the man who invented the term. In addition, Bjarne Stroustrup (who designed and implemented the C++ programming language), provides this broad definition of the term "Object Oriented" in section 3 of his paper called Why C++ is not just an Object Oriented Programming Language:

A language or technique is object-oriented if and only if it directly supports:
  1. Abstraction - providing some form of classes and objects.
  2. Inheritance - providing the ability to build new abstractions out of existing ones.
  3. Runtime polymorphism - providing some form of runtime binding.

This is the current definition found in ISO/IEC 2382:2015:

2122503
object-oriented
pertaining to a technique or a programming language that supports objects, classes, and inheritance

Note 1 to entry: Some authorities list the following requirements for object-oriented programming: information hiding or encapsulation, data abstraction, message passing, polymorphism, dynamic binding, and inheritance.

The fact that some authorities use a slightly different list of requirements in their definition of OO just proves that there is no single definition which satisfies everybody. Provided that I use terms which are included in the ISO/IEC definition - such as encapsulation, inheritance and polymorphism - then I feel justified in saying that my definition cannot be regarded as "wrong". On the contrary, anyone who says that OO requires features or concepts which are NOT in the above list has no justification in doing so and can be ignored with impunity.

OO theory is constantly being expanded to include more and more concepts, and these concepts are becoming more and more complicated. As languages are modified to include these add-on concepts newcomers to these languages become convinced that it is these add-ons which define what OO is. I totally disagree. OOP does not require the use of any of these optional extras, so it is wrong to say that a program is not OO simply because it does not use them. It would be like saying that a car is not a car unless it has climate control and satnav. Those are optional extras, not the distinguishing features, and not having them does not make your car not a car. It would also be incorrect to say that a car is a car because it has wheels. Having wheels does not make something a car - a pram has wheels, but that does not make it a car, so having wheels is not a distinguishing feature.

That is why I say that such things as "modularity", "reusability" and "messaging" are not features which distinguish an OO language from a non-OO language for the simple reason that they already existed in some non-OO languages. That is why I say that using some of these later additions, these optional extras, to OO languages does not make your code "more" OO. If you write programs which are oriented around objects then your code is object oriented, and you are an object oriented programmer. It's as simple as that. But it is how you implement the concepts of encapsulation, inheritance and polymorphism to achieve strong cohesion, loose coupling and the elimination of redundancies which really matters. It is the ability to produce code which has increased reusability and decreased maintenance when compared to previous paradigms which determines if your implementation is effective or not. If I can write effective software quicker and cheaper with a procedural language than you can with your OO language, then I'm afraid that it is you the programmer who has failed. It is not a failure in the language because I personally have made the move from non-OO languages to an OO-capable language and can produce identical software faster and cheaper than I could before. The failure is in the way that modern programmers are being taught to write OO code. There is too much emphasis on following academic principles instead of getting the job done, which is supposed to be the production of cost-effective software for the benefit of the paying customer.

I highlighted the phrase "when compared to previous paradigms" deliberately. You cannot say that something is better than all its alternatives unless you actually have some of those alternatives available for comparison. Today's programmers who have never written software in a non-OO language have nothing to compare against, so how do they know that what they are doing is better? I have several decades' worth of experience in writing database applications in non-OO languages, so I am more able than most to make that comparison. I develop nothing but database applications for use by the enterprise, and I judge the effectiveness of a language by how productive I can be in that language. For each table in the database I usually have to create a family of forms in order to view and maintain the contents of that table, and as I have moved from one language to another the amount of time taken to build that family of forms has decreased.

Note that I also created my own frameworks in each of those languages, so this would also have contributed to my levels of productivity.

I created my OO framework in PHP 4 because PHP 4 had what was necessary according to the available definition at that time to write OO programs. This OO framework enabled me to create applications at a faster rate than my earlier frameworks in other languages simply because it had higher levels of reusability, which therefore increased both its speed of development and its maintainability. I have therefore used the concepts of OOP and achieved the objectives of OOP, so what justification do these young upstarts have for telling me that my implementation is wrong?

Optional Extras

Among these "optional extras" which have nothing to do with a language being OO or not are:

There is an additional list of "optional extras" at A minimalist approach to Object Oriented Programming with PHP.

As these are optional extras I am merely exercising the option to not use them. Some people say that my minimalist approach to OOP, the fact that I use nothing but encapsulation, inheritance and polymorphism, means that I am not a "proper" OO programmer. Yet with my approach I can still achieve high reusability and low maintenance, so why am I wrong?


The difference between OOP and non-OOP

A better way of trying to explain the differences between non-OO and OO programming is to use actual examples.

They are defined differently

A function is defined as a self-contained block of code. Each function name "fName" must be unique within the application.

function fName ($arg1, $arg2) 
// function description
{
    ....
    
    return $result;
    
} // fName

A class method is defined within the boundaries of a class definition. Each class name "cName" must be unique within the application. Each class may contain any number of functions (also known as "methods"), and the function name "fName" must be unique within the class but need not be unique within the application. In fact, the ability for different classes to share common function/method names is a requirement of polymorphism.

class cName
{
    function fName ($arg1, $arg2) 
    // function description
    {
        ....
        
        return $result;
        
    } // fName
    
} // cName

In his article The Object-Oriented Thought Process the author Matt Weisfeld states the following:

Difference Between OO and Procedural
This is the key difference between OO and procedural programming. In OO design, the attributes and behavior are contained within a single object, whereas in procedural, or structured design, the attributes and behavior are normally separated.

They are accessed differently

It is important to note that neither a function nor a class can be accessed until the function/class definition has been loaded.

Calling a function is very straightforward:

$result = fName($arg1, $arg2);

Calling a class method is not so straightforward. First it is necessary to create an instance of the class (an object), then to access the function (method) name through the object. The object name must be unique within the application.

$object = new cName; 
$result = $object->fName($arg1, $arg2);

They have different numbers of working copies

A function does not have to be instantiated before it can be accessed, therefore only one copy (or instance) is said to exist at any one time.

A class method can only be accessed after it has been instantiated into an object (unless it has been defined as a static method, see below), and it is possible to create multiple instances (objects) of the same class with different object names.

$object1 = new cName;
$object2 = new cName;
$object3 = new cName; 

Although it is possible to access a static method without first creating an object, this is no better than accessing a non-class function. As it is not actually using an object it cannot be considered part of object oriented programming.
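
As a minimal sketch of that distinction, the following compares a plain function, a static method and an instance method. The names toFahrenheit, Temperature and asFahrenheit are hypothetical and exist only for this illustration. The static call needs no object and cannot use $this, which is why it behaves like an ordinary function with a class-name prefix.

```php
<?php
// Hypothetical names, for illustration only.

// A plain function: no class, no object.
function toFahrenheit(float $celsius): float
{
    return $celsius * 9 / 5 + 32;
}

class Temperature
{
    public $celsius = 0;

    // Static method: callable without an object, so it cannot use $this.
    public static function toFahrenheit(float $celsius): float
    {
        return $celsius * 9 / 5 + 32;
    }

    // Instance method: works on the state held inside the object.
    public function asFahrenheit(): float
    {
        return $this->celsius * 9 / 5 + 32;
    }
}

$a = toFahrenheit(100);               // plain function call
$b = Temperature::toFahrenheit(100);  // static call, no object created

$object = new Temperature;
$object->celsius = 100;
$c = $object->asFahrenheit();         // method call through an object
```

All three calls produce the same number, but only the last one uses data that lives in an object.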

They have different numbers of entry points

A function has only a single point of entry, and that is the function name itself.

An object has multiple points of entry, one for each method name.

They have different methods of maintaining state

A function by default does not have state, by which I mean that each time that it is called it is treated as a fresh invocation and not a continuation of any previous invocation.

An object does have state, by which I mean that each time an object's method is called it acts upon the object's state as it was after the previous method call.

It is possible for both a function and a class method to use local variables, and they both operate in the same way. This means that the local variables do not exist outside the scope of the function or class method, and any values placed in them do not persist between invocations.

It is possible for a function to remember values between different invocations by declaring a variable as static, as in the following example:

// named nextCount because count() is already a built-in PHP function
function nextCount () {
    static $count = 0;
    $count++;
    return $count;
}

Each time this function is called it will return a value that is one greater than the previous call. Without the keyword static it would always return the value '1'.
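
The effect of the static keyword can be sketched with two near-identical functions. The names nextCount and nextCountWithoutStatic are hypothetical; a user-defined function cannot be called count in the global namespace because PHP already provides a built-in count().

```php
<?php
// Hypothetical function names, for illustration only.

function nextCount()
{
    static $count = 0;   // initialised once, then kept between calls
    $count++;
    return $count;
}

function nextCountWithoutStatic()
{
    $count = 0;          // re-initialised on every call
    $count++;
    return $count;
}

$first  = nextCount();
$second = nextCount();
$third  = nextCount();

$x = nextCountWithoutStatic();
$y = nextCountWithoutStatic();
```

The first function returns 1, 2 and 3 on successive calls, while the second returns 1 every time.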

Class variables which need to persist outside of a function (method) are declared at class level, as follows:

class calculator
{
    // define class properties (member variables)
    var $value;
    
    // define class methods
    function setValue ($value) 
    {
        $this->value = $value;
        
        return;
        
    } // setValue
    
    function getValue () 
    // function description
    {
        return $this->value;
        
    } // getValue
    
    function add ($value) 
    // function description
    {
        $this->value = $this->value + $value;
        
        return $this->value;
        
    } // add
    
    function subtract ($value) 
    // function description
    {
        $this->value = $this->value - $value;
        
        return $this->value;
        
    } // subtract
    
} // calculator

Note that all class/object variables are referenced with the prefix $this-> as in $this->varname. Any variable which is referenced without this keyword, as in $varname, is treated as a local variable.

Note also that each instance of the class (object) maintains its own set of variables, so the contents of one object are totally independent of the contents of another object, even if it is from the same class.
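
The independence of object state can be sketched as follows. This is a cut-down copy of the calculator class so that the example is self-contained; the default value of 0 and the variable names $calc1 and $calc2 are assumptions added for the illustration.

```php
<?php
// Cut-down copy of the calculator class, with an assumed default of 0.
class calculator
{
    var $value = 0;

    function setValue($value)
    {
        $this->value = $value;
    }

    function add($value)
    {
        $this->value = $this->value + $value;
        return $this->value;
    }

    function subtract($value)
    {
        $this->value = $this->value - $value;
        return $this->value;
    }
}

// Two objects from the same class, each with its own copy of $value.
$calc1 = new calculator;
$calc2 = new calculator;

$calc1->setValue(10);
$calc2->setValue(100);

$calc1->add(5);                    // continues from 10
$result1 = $calc1->subtract(3);    // continues from the previous call
$result2 = $calc2->add(1);         // unaffected by anything done to $calc1
```

Each method call carries on from the state left by the previous call on the same object, and nothing done to one object leaks into the other.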


Practical Examples

Here are some practical examples which demonstrate Encapsulation, Inheritance and Polymorphism.

Encapsulation

Encapsulation: The act of placing data and the operations that perform on that data in the same class. The class then becomes the 'capsule' or container for the data and operations. This binds together the data and functions that manipulate the data.

Note that this requires ALL the properties and ALL the methods to be placed in the SAME class. Breaking a single class into smaller classes so that the count of methods in any one class does not exceed an arbitrary number is therefore a bad idea as it violates encapsulation and makes the system harder to read and understand. Putting all methods which are related into the same class leads to high cohesion whereas putting related methods into separate classes leads to low cohesion.

Note that data may include meta-data (type, size, etc) as well as entity data.

Please also refer to What Encapsulation is not.

You may also like to read the following:

Every application deals with a number of different entities or "things", such as "customer" "product" and "invoice", so it is common practice to create a different class for each of these entities. At runtime the software will create one or more objects from each class definition, and when it wants to do something with one of these entities it will do so by calling the relevant method on the relevant object.

The data held within each object at runtime cannot remain in memory for ever, so it is written out to a persistent data store (a database) with a separate table for each entity. There are only four basic operations which can be performed on a database table (Create, Read, Update, Delete) so I shall start by creating a method for each one.

class entity1
{
    // class properties
    var $dbname;           // database name
    var $errors = array(); // array of error messages, indexed by field name
    var $fieldarray;       // associative array of name=value pairs
    var $fieldlist;        // array of field (column) names
    var $fieldspec;        // array of field specifications
    var $numrows;          // number of database rows affected
    var $primary_key;      // array of field names which make up the primary key
    var $tablename;        // table name
    
    // class methods
    function __construct ()
    // constructor
    {
        $this->tablename   = 'entity1';
        $this->dbname      = 'foobar';
        
        $this->fieldlist   = array('column1', 'column2', 'column3', 'column4');
        $this->primary_key = array('column1');
        
    } // __construct
    
    // class methods
    function getData ($where)
    // read data from the database which satisfies the selection criteria in $where
    {
        ....
        
        return $this->fieldarray;
        
    } // getData
    
    function insertRecord ($fieldarray)
    // create a database record using the contents of $fieldarray
    {
        ....
        
        return $this->fieldarray;
        
    } // insertRecord
    
    function updateRecord ($fieldarray)
    // update a database record using the contents of $fieldarray
    {
        ....
        
        return $this->fieldarray;
        
    } // updateRecord
    
    function deleteRecord ($fieldarray)
    // delete a database record identified in $fieldarray
    {
        ....
        
        return $this->fieldarray;
        
    } // deleteRecord
    
} // entity1

Please note the following:

Each of these classes therefore acts as a 'capsule' which contains both the data for an entity and the operations which can be performed upon that data. This is 'encapsulation'.
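
To make the 'capsule' idea concrete, here is a minimal runnable sketch in which a PHP array stands in for the database table. Every method body, the $rows property and the use of a key value instead of a $where string in getData() are assumptions made for the illustration; they are not the implementation elided above.

```php
<?php
// Sketch only: a PHP array stands in for the database table, and every
// method body here is an assumption, not the author's implementation.
class entity1
{
    var $tablename   = 'entity1';
    var $primary_key = array('column1');
    var $fieldarray  = array();   // data from the most recent operation
    var $numrows     = 0;         // rows affected by the most recent operation
    var $rows        = array();   // hypothetical in-memory "table"

    function getData($key)
    // simplified: reads by primary key value instead of a WHERE string
    {
        $this->fieldarray = isset($this->rows[$key]) ? $this->rows[$key] : array();
        $this->numrows    = empty($this->fieldarray) ? 0 : 1;
        return $this->fieldarray;
    }

    function insertRecord($fieldarray)
    {
        $this->rows[$fieldarray['column1']] = $fieldarray;
        $this->numrows    = 1;
        $this->fieldarray = $fieldarray;
        return $this->fieldarray;
    }

    function updateRecord($fieldarray)
    {
        $key = $fieldarray['column1'];
        if (isset($this->rows[$key])) {
            $this->rows[$key] = array_merge($this->rows[$key], $fieldarray);
            $this->numrows    = 1;
        } else {
            $this->numrows    = 0;
        }
        $this->fieldarray = $fieldarray;
        return $this->fieldarray;
    }

    function deleteRecord($fieldarray)
    {
        $key = $fieldarray['column1'];
        $this->numrows = isset($this->rows[$key]) ? 1 : 0;
        unset($this->rows[$key]);
        $this->fieldarray = $fieldarray;
        return $this->fieldarray;
    }
}

$object = new entity1;
$object->insertRecord(array('column1' => 'A1', 'column2' => 'first'));
$object->updateRecord(array('column1' => 'A1', 'column2' => 'changed'));
$row = $object->getData('A1');

$object->deleteRecord(array('column1' => 'A1'));
$gone = $object->getData('A1');
```

The calling code only ever touches the object's methods; both the data and the operations on it live inside the one class.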

Inheritance

Inheritance: The reuse of base classes (superclasses) to form derived classes (subclasses). Methods and properties defined in the superclass are automatically shared by any subclass. A subclass may override any of the methods in the superclass, or may introduce new methods of its own.

Note that I am referring to implementation inheritance (which uses the "extends" keyword) and not interface inheritance (which uses the "implements" keyword).

Please also refer to What Inheritance is not.

After writing and testing a class to deal with 'entity1' I copied it and made it work for 'entity2'. I then compared the two classes to see what code was common and could be shared, and what code was unique and could not be shared. I then transferred all the common code into a separate class known as a 'superclass'.

Firstly, to create the superclass, I changed the class name and the constructor to the following:

abstract class Default_Table
{
    // class properties
    var $dbname;           // database name
    var $errors = array(); // array of error messages, indexed by field name
    var $fieldarray;       // associative array of name=value pairs
    var $fieldlist;        // array of field (column) names
    var $fieldspec;        // array of field specifications
    var $numrows;          // number of database rows affected
    var $primary_key;      // array of field names which make up the primary key
    var $tablename;        // table name
    
    // class methods
    function __construct ()
    // constructor
    {
        $this->tablename   = 'unknown';
        $this->dbname      = 'unknown';
        
        $this->fieldlist   = array();
        $this->primary_key = array();
        
    } // __construct
    
    function getData ($where)
    {
        ....
    }
    function insertRecord ($fieldarray)
    {
        ....
    }
    function updateRecord ($fieldarray)
    {
        ....
    }
    function deleteRecord ($fieldarray)
    {
        ....
    }
} // Default_Table

Note here that it is defining an unknown table with an unknown number of fields/columns, so it cannot be used as a genuine object. This is reinforced by the use of the word "abstract" in front of the class name which will prevent the "new" keyword from being used. It is only when the details of a specific database table are combined with this abstract definition through the mechanism known as inheritance that a "concrete" class is made available for instantiation.
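
A minimal sketch of what the "abstract" keyword does at runtime, using cut-down versions of the classes so that the example is self-contained. In PHP 7 and later an attempt to instantiate an abstract class throws an Error, which is caught here purely for demonstration; the $message and $concrete variables are hypothetical.

```php
<?php
// Cut-down, hypothetical versions of the classes, for illustration only.
abstract class Default_Table
{
    var $tablename = 'unknown';
}

class entity1 extends Default_Table
{
    function __construct()
    {
        $this->tablename = 'entity1';
    }
}

try {
    $object  = new Default_Table;    // not allowed: the class is abstract
    $message = 'instantiated';
} catch (Error $e) {
    $message = $e->getMessage();     // PHP reports why "new" was refused
}

$concrete = new entity1;             // allowed: a concrete subclass
```

Only the concrete class, which supplies the details of a specific table, can be turned into an object.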

Secondly, I altered each table class to remove the common methods and properties, and included the keyword extends to force inheritance from the abstract superclass.

include 'default.class.inc';
class entity1 extends Default_Table
{
    function __construct ()
    // constructor
    {
        $this->tablename   = 'entity1';
        $this->dbname      = 'foobar';
        
        $this->fieldlist   = array('column1', 'column2', 'column3', 'column4');
        $this->primary_key = array('column1');
        
    } // __construct
    
} // entity1

When a subclass is instantiated into an object that object will combine all the properties and methods of the superclass as well as those of the subclass. If anything has been defined in both the superclass and the subclass, then the definition from the subclass will take precedence.
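
The precedence rule can be sketched with two subclasses, one which inherits everything and one which overrides a method. The methods getTableName() and describe(), and the class entity2 as written here, are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical cut-down classes, for illustration only.
abstract class Default_Table
{
    var $tablename = 'unknown';

    function getTableName()
    {
        return $this->tablename;
    }

    function describe()
    {
        return 'generic table';
    }
}

class entity1 extends Default_Table
{
    function __construct()
    {
        $this->tablename = 'entity1';
    }
    // no methods of its own: everything else is inherited
}

class entity2 extends Default_Table
{
    function __construct()
    {
        $this->tablename = 'entity2';
    }

    // defined in both places, so this version takes precedence
    function describe()
    {
        return 'customised table';
    }
}

$one = new entity1;
$two = new entity2;

$name1 = $one->getTableName();   // inherited method, subclass data
$desc1 = $one->describe();       // inherited implementation
$desc2 = $two->describe();       // overridden implementation
```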

In my current development environment the superclass contains several thousand lines of code, but there is only one copy of this code which is inherited by several hundred table classes. Inheritance is therefore a powerful mechanism for making one copy of common code accessible to many objects instead of having multiple copies of that common code.

Polymorphism

Polymorphism: Same interface, different implementation. The ability to substitute one class for another. This means that different classes may contain the same method names, but the result which is returned by each method will be different as the code behind each method (the implementation) is different in each class.

Note that this does NOT require the use of the keywords "interface" and "implements" as these are totally optional in PHP. All that is required is that different classes implement the same method name with the same signature.

Please also refer to What Polymorphism is not.

Polymorphism can only be employed where the same method names exist in several classes. The code within the method may be inherited from a parent class, or it may be totally different. This means that the same method can be used on different objects, but the results will be different.

For example, take a series of classes called 'Customer', 'Product' and 'Invoice'. One practice I have seen which makes polymorphism impossible is to incorporate the entity name into the method name, as in:

  1. getCustomer(), insertCustomer(), updateCustomer(), deleteCustomer()
  2. getProduct(), insertProduct(), updateProduct(), deleteProduct()
  3. getInvoice(), insertInvoice(), updateInvoice(), deleteInvoice()

The problem with this approach is that the object (the controller in MVC) which communicates with each table object (the model in MVC) needs to know the method name before it can open up that channel of communication. If each model has a unique set of method names then it must have a unique set of controllers to communicate with it.

My approach is to use a standard set of method names for standard operations, as in:

  1. getData(), insertRecord(), updateRecord(), deleteRecord()

This is made easier as these methods are defined in the superclass and made available to each subclass through inheritance.

The advantage of this is that I can have one standard controller for each standard function, and this controller can work with any table class in the system. This is far better than having a separate set of controllers for each table class.

Here is some example code from one of my controllers:

....
include "$table.class.inc";
$object = new $table;
$data = $object->getData($where);
....

The contents of $table and $where are made available at runtime.

The significant point is that the name of the class (database table) is not hard-coded into the controller, it is passed as an argument at runtime. Only the method names are hard-coded, but as these method names exist within every table class by being inherited from the superclass they will always work. So, if the class name is 'Customer' the controller will obtain data from the 'Customer' table, if it is 'Product' it will obtain data from the 'Product' table, and so on.
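
A self-contained sketch of that idea follows. The function listController() is a hypothetical stand-in for a controller script, the include statement is omitted because the classes are defined inline, and getData() simply returns the query it would have run rather than touching a real database.

```php
<?php
// Hypothetical cut-down classes, for illustration only.
abstract class Default_Table
{
    var $tablename = 'unknown';

    function getData($where)
    {
        // Stand-in for a real query: report what would have been read.
        return "SELECT * FROM {$this->tablename} WHERE $where";
    }
}

class Customer extends Default_Table
{
    var $tablename = 'customer';
}

class Product extends Default_Table
{
    var $tablename = 'product';
}

// Hypothetical controller: the class name arrives at runtime as a string,
// only the method name is hard-coded.
function listController($table, $where)
{
    $object = new $table;
    return $object->getData($where);
}

$customerQuery = listController('Customer', 'id=1');
$productQuery  = listController('Product', 'id=2');
```

The same controller code produces a different result for each class name it is given, without any change to the controller itself.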


Popular misconceptions

I am often berated by my critics, of which there are more than a few, for not understanding what OOP really means. This simply boils down to the fact that they have extended the basic principles of OOP to include their personal interpretations, and they have developed rules which govern how these interpretations should be implemented.

Let me make it quite clear that I do not care for these personal interpretations, and I certainly do not care for these rules.

What Encapsulation is not

I disagree with all the following statements:

  1. Encapsulation and Abstraction mean the same thing

    This is a popular misconception, but the fact that one leads to the other does not mean that they are the same thing.

    Another way of putting it is that the process of abstraction identifies individual components within an application, while encapsulation is the act of creating a class for each component.

    The differences between encapsulation and abstraction are discussed more in Mistaking Encapsulation for Abstraction by Kevin Buchanan.

  2. Encapsulation is about implementation hiding

    This is a meaningless statement as "implementation hiding" is not restricted to encapsulation, and neither is it restricted to OO languages. This was not the aim of encapsulation, nor is it a unique or distinguishing feature of encapsulation. It is a universal property of all languages and all paradigms, as described in OOP is about information hiding. In every procedural language you can create a function with a specific signature (the API), and this signature only exposes what it does, not how it does it (the implementation).

  3. Encapsulation is about information (data) hiding

    Just because "information" and "implementation" have similar sounds, it does not follow that they also have similar meanings. It is also untrue to say "data is part of the implementation, therefore implementation hiding automatically means data hiding". This is wrong for the simple reason that implementation is code while information is data. Data is not code, so information is not implementation.

    If you think that I am the only person who thinks that encapsulation has nothing to do with information hiding then take a look at the following:

    The article Abstraction, Encapsulation, and Information Hiding shows how different authors have provided their own interpretations of encapsulation which have caused the true meaning to become corrupted beyond recognition:

    To enclose in or as if in a capsule.

    -- Mish, 1988

    The concept of encapsulation as used in an object-oriented context is not essentially different from its dictionary definition. It still refers to building a capsule, in the case a conceptual barrier, around some collection of things.

    -- Wirfs-Brock et al, 1990

    But then the idea of "data hiding" crept in:

    It is a simple, yet reasonable effective, system-building tool. It allows suppliers to present cleanly specified interfaces around the services they provide. A consumer has full visibility to the procedures offered by an object, and no visibility to its data. From a consumer's point of view, and object is a seamless capsule that offers a number of services, with no visibility as to how these services are implemented ... The technical term for this is encapsulation.

    -- Cox, 1986

    Encapsulation, or equivalently information hiding, refers to the practice of including within an object everything it needs, and furthermore doing this in such a way that no other object need ever be aware of this internal structure.

    -- Graham, 1991

    We say that the changeable, hidden information becomes the secret of the module; also, according to a widely used jargon, we say that such information is encapsulated within the implementation.

    -- Ghezzi et al, 1991

    Data hiding is sometimes called encapsulation because the data and its code are put together in a package or 'capsule'.

    -- Smith, 1991

    Encapsulation is used as a generic term for techniques which realize data abstraction. Encapsulation therefore implies the provision of mechanisms to support both modularity and information hiding. There is therefore a one to one correspondence in this case between the technique of encapsulation and the principle of data abstraction.

    -- Blair et al, 1991

    Encapsulation (also information hiding) consists of separating the external aspects of an object which are accessible to other objects, from the internal implementation details of the object, which are hidden from other objects.

    -- Rumbaugh et al, 1991

    Encapsulation -- also known as information hiding -- prevents clients from seeing its inside view, where the behavior of the abstraction is implemented.

    -- Booch, 1991

  4. You must use getters and setters to access your data

    As a consequence of the "Encapsulation means information/data hiding" rule some programmers insist that it also means that the visibility of each piece of data should be changed from "public" to "private". This means that instead of

    $foo = $object->foo;
    $object->foo = 'bar';
    
    I should use:
    $foo = $object->getFoo();
    $object->setFoo('bar');
    

    This is because the 'get' and 'set' methods (also known as 'accessors' and 'mutators') provide the opportunity to execute some additional code when the data is being retrieved or inserted.

    Considering that I do not even acknowledge that the rule exists in the first place, why should I be bound by additional consequential rules?

    It should also be noted by experienced programmers that there are not just two methods of retrieving data from and inserting data into an object. There is actually a third - on the method signature itself. Consider the following statement:

    $output = $object->doSomething($input);
    

    $input is data going in, and $output is data coming out. Neither of these variables can have their visibility downgraded from "public" for the simple reason that they are part of the method signature and therefore cannot be hidden. This means that I can put data into and get data out of an object without having to reference a class variable, public or not, hidden or not.

    There is also no rule that says each class property must be a scalar value, so I can use an array if I want to. This can be especially useful when dealing with data associated with relational databases as they deal with rows and columns, and arrays are the perfect mechanism as they can deal with any number of columns from any number of rows. It is possible for the SQL SELECT statement to exclude some of the table's columns, or it may include columns from other tables via a JOIN. This method is much more flexible because I don't have to code the names of any columns in any getters and setters. The Controller object injects the entire HTTP request into the Model object as a single array, and the View object retrieves all the data from the Model as a single array.

  5. You must validate your data in the setter methods

    There is a golden rule in programming that you must never trust any data which is supplied by a user - it should always be validated or filtered before it gets written to the database. While I agree with this rule (now there's a surprise!) I do not like the follow-on rule which states "As you must be using setter methods it follows that you must validate the user input within the setter method".

    I don't use setters therefore I cannot validate within setters. That does not mean that the data goes unvalidated, it just means that I perform my validation in a different manner. All data goes into the object as an array, and all data gets passed to the data access object (DAO) as an array so that it can be written to the database. But it only gets to the DAO after it has been validated, and that is done by passing the entire array through a validation object. If there are any validation errors then the whole array gets thrown back to the user with a suitable error message and never gets as far as the database.

  6. You must not have more than N methods in a class

    Some programmers say that if you have more than N methods in a class (where N is a completely arbitrary number) then your class is too big and unmanageable. They say that such a class must surely be breaking the Single Responsibility Principle. They say that you should break that class down into smaller subclasses as they are easier for the programmer to get his brain around.

    Firstly, encapsulation requires that all the data and all the operations that can be performed on that data are placed in the same class. If you use multiple classes then you are breaking encapsulation.

    Secondly, if a class requires 100 methods then it requires 100 methods. If a programmer cannot deal with 100 methods in a single class then how can he possibly deal with all those methods spread across multiple classes with the additional complexity of extra code which would then be required to pass control to another class just to deal with another operation, or a different facet of the same operation?

    Thirdly, you can take the idea of "lots of small methods" too far and end up with an unmaintainable mess. As a prime example I was recently forced to use a certain email library, and when I downloaded it I found it had over 140 classes, each in its own file, spread across 24 directories and subdirectories. When I came to step through it with my debugger in order to track down what I thought was a minor problem I spent over 30 minutes stepping through line after line of code which didn't actually do anything useful. All it was doing was instantiating object after object and jumping from one object method to another. Most of these methods contained just a single line of code, and very little of this code was actually associated with constructing and sending the email. If that is your idea of "best practice", and you are teaching this idea to others, then all I can say is "God help us!"
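
Points 4 and 5 above describe passing data into an object as a single array and validating that array with a separate object instead of inside setters. The following is a minimal sketch of that arrangement. The class names Validator and Model, the layout of $fieldspec, the error messages and the $saved property (a stand-in for the data access object) are all assumptions made for the illustration, not the author's framework code.

```php
<?php
// All names below are hypothetical; this is not the author's framework code.

// Validation object: checks a whole array against a field specification.
class Validator
{
    function validate($fieldarray, $fieldspec)
    {
        $errors = array();
        foreach ($fieldspec as $field => $spec) {
            $value = isset($fieldarray[$field]) ? $fieldarray[$field] : '';
            if (!empty($spec['required']) && $value === '') {
                $errors[$field] = 'This field is required';
            } elseif (isset($spec['size']) && strlen($value) > $spec['size']) {
                $errors[$field] = 'Value is too long';
            }
        }
        return $errors;
    }
}

// Model: data goes in as one array, with no per-column getters or setters.
class Model
{
    var $errors    = array();
    var $fieldspec = array(
        'name'  => array('required' => true, 'size' => 10),
        'price' => array('size' => 8),
    );
    var $saved     = array();   // stand-in for rows written by the DAO

    function insertRecord($fieldarray)
    {
        $validator    = new Validator;
        $this->errors = $validator->validate($fieldarray, $this->fieldspec);
        if (empty($this->errors)) {
            $this->saved[] = $fieldarray;   // only valid data gets this far
        }
        return $fieldarray;
    }
}

$model = new Model;

// Stand-ins for the contents of $_POST on two separate requests.
$model->insertRecord(array('name' => 'Widget', 'price' => '9.99'));
$goodErrors = $model->errors;

$model->insertRecord(array('name' => '', 'price' => '123456789.00'));
$badErrors = $model->errors;
```

The first array passes validation and is stored; the second is rejected with one message per offending field and never reaches the stand-in for the database.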

What Polymorphism is not

The definition of polymorphism should be easily understood by everyone, yet there are some people who even manage to screw this up. I recently answered a question in the 'Dynamic form generation' thread in the comp.lang.php newsgroup in which a comedian called Jerry Stuckle said:

You've got a *partial* definition. Polymorphism is only applicable when the two classes have a parent/child hierarchy, and the child class has a method of the same name (and in some languages, the same parameter list) as the parent. When there is no parent/child relationship (as in the case of two different database tables), there is no polymorphism.

This was my reply:

The definition of polymorphism does NOT state that the classes have to exist in a parent/child hierarchy, only that they have the same method signature. Having said that, it is usually the case that the two classes ARE related. You obviously haven't used an abstract table class which is inherited by every concrete table class. All my concrete table classes have instant access to all the methods and properties which are defined just once in the abstract class.

To which Jerry responded:

I've used them much longer than you've even known they existed. When you have a class derived from the abstract class, there is a parent/child relationship. But there is no such relationship between two classes derived from the same one. Once again you show you have no knowledge of OO. Polymorphism cannot exist without inheritance - which requires a parent/child hierarchy.

This was my reply:

Yes it can. Polymorphism simply requires that the same interface exist in more than one class. That may come from inheritance, or it may not. It *IS* possible to define the same interface more than once without inheritance. Each one of the 350 concrete table classes in my application is derived from the same abstract table class, so each one of those 350 classes is a sibling of the other, and "sibling" implies a relationship.

To which Jerry responded:

So there is a sibling relationship? Once again you prove how you don't understand the concept of polymorphism.

Later on he said:

You can have the same interface in more than one object WITHOUT inheritance, but that is not polymorphism.

So, according to Jerry Stuckle, a self-proclaimed "expert", polymorphism is restricted by the following rules:

If multiple classes have the "same interface, different implementation" then the conditions for polymorphism exist whether you like it or not. If you have invented additional rules then you are wrong for inventing those rules. I am most definitely *NOT* wrong for refusing to follow those additional rules.
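
Under the "same interface, different implementation" definition used in this article, the situation can be sketched with two classes which share no parent and implement no interface. The classes Circle and Invoice, the method describe() and the function describeAll() are hypothetical names chosen for the illustration.

```php
<?php
// Two hypothetical classes with no common parent and no "implements".
class Circle
{
    var $radius;

    function __construct($radius)
    {
        $this->radius = $radius;
    }

    function describe()
    {
        return 'circle of radius ' . $this->radius;
    }
}

class Invoice
{
    var $number;

    function __construct($number)
    {
        $this->number = $number;
    }

    function describe()
    {
        return 'invoice number ' . $this->number;
    }
}

// The caller knows only the method name, not which class it is dealing with.
function describeAll(array $objects)
{
    $lines = array();
    foreach ($objects as $object) {
        $lines[] = $object->describe();   // same call, different implementation
    }
    return $lines;
}

$lines = describeAll(array(new Circle(2), new Invoice(1001)));
```

The calling code works with either object because both respond to the same method name, even though neither class is related to the other.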

What Inheritance is not

Inheritance is a technique which allows you to define methods and their implementations in one class (superclass), then to reuse those methods in another class (subclass) simply by using the extends keyword. The subclass then becomes a combination of what is in the superclass plus whatever is defined within itself. The article Pragmatic OOP by Ricki Sickenger says the following:

OOP is supposed to be a practical way to organize a program into hierarchies of objects where similar objects can inherit behavior from each other and override that behavior when necessary.

A problem with inheritance is that it can be used incorrectly. The above article contains the following:

A Car and a Train and a Truck can all inherit behavior from a Vehicle object, adding their subtle differences. A Firetruck can inherit from the Truck object, and so on. Wait.. and so on? The thing about inheritance is that is so easy to create massive trees of objects. But what OO-bigots won't tell you is that these trees will mess you up big time if you let them grow too deep, or grow for the wrong reasons.

One problem encountered with the overuse of inheritance is when the superclass contains a method which does not apply in a subclass. This problem led to the creation of the Liskov Substitution Principle. It also led to the idea of Favour Composition over Inheritance.

In Object Composition vs. Inheritance I found the following description:

Most designers overuse inheritance, resulting in large inheritance hierarchies that can become hard to deal with. Object composition is a different method of reusing functionality. Objects are composed to achieve more complex functionality. The disadvantage of object composition is that the behavior of the system may be harder to understand just by looking at the source code. A system using object composition may be very dynamic in nature so it may require running the system to get a deeper understanding of how the different objects cooperate.
[....]
However, inheritance is still necessary. You cannot always get all the necessary functionality by assembling existing components.

Interestingly enough the same article also contains this:

The disadvantage of class inheritance is that the subclass becomes dependent on the parent class implementation. This makes it harder to reuse the subclass, especially if part of the inherited implementation is no longer desirable. ... One way around this problem is to only inherit from abstract classes.

The way to avoid problems with inheritance is therefore to avoid deep hierarchies, and to inherit only from abstract classes wherever possible. As I only develop database applications my software never interacts with objects in the "real world", just objects in a database. Every object in a database "is-a" table, so I created an abstract table class to contain the methods that could be applied to any database table, and inherit from this class to create a separate concrete table class for each table in my database. My abstract table class is quite large as there is a lot of processing which could be done on each table, and as I have over 350 concrete classes this results in a large amount of code which is reused through inheritance.

OOP requires a totally different thought process

There are a surprising number of people who hold the opinion that object oriented programming is totally different from procedural programming, and that it requires a totally different way of thinking and a different way of writing code. There are even those who say that if you do not utilise OO concepts in the "proper" way then even though you may be using objects you are nothing more than a procedural programmer, where the term "procedural" is used as an insult. Take a look at the following articles:

The article Pragmatic OOP by Ricki Sickenger contains the following:

I have met programmers who believe that anywhere there is a conditional statement in OO code, there is cause to subclass, "because that is the OO way!". And they will defend it against any pragmatic reasoning. So anywhere you see an if/then/else or a switch statement, you should find a way to break the logic into separate objects to avoid the logic. The dogma here is that conditional statements complicate things and are not strictly OO, so they must be minimized and preferable erased.

In Are You Still Debugging? the author Yegor Bugayenko says the following:

Code is procedural when it is all about how the goal should be achieved instead of what the goal is.
A method is procedural if the name is centered around a verb, but OO if it is centered around a noun.

In this post a person called Fasda said the following:

The key you need to focus is THE WAY YOU THINK A SOLUTION.
In procedural you thinks solutions as writing a recipe, step by step. First do that, then that, and continue with...
In OO you think solutions as "people asking favors to others" -> Objects and Messages only. Try to use some pure OO language to stop thinking on if, while, for and all those keywords so common in procedural.

The article Getters/Setters. Evil. Period. contains the following quote from David West:

Step one in the transformation of a successful procedural developer into a successful object developer is a lobotomy.

I do not share any of these opinions.

In his article All evidence points to OOP being bullshit John Barker says the following:

Procedural programming languages are designed around the idea of enumerating the steps required to complete a task. OOP languages are the same in that they are imperative - they are still essentially about giving the computer a sequence of commands to execute. What OOP introduces are abstractions that attempt to improve code sharing and security. In many ways it is still essentially procedural code.

In his paper Encapsulation as a First Principle of Object-Oriented Design (PDF) the author Scott L. Bain wrote the following:

Object Orientation (OO) addresses as its primary concern those things which influence the rate of success for the developer or team of developers: how easy is it to understand and implement a design, how extensible (and understandable) an existing code set is, how much pain one has to go through to find and fix a bug, add a new feature, change an existing feature, and so forth. Beyond simple "buzzword compliance", most end users and stakeholders are not concerned with whether or not a system is designed in an OO language or using good OO techniques. They are concerned with the end result of the process - it is the development team that enjoys the direct benefits that come from using OO.

This should not surprise us, since OO is rooted in those best-practice principles that arose from the wise dons of procedural programming. The three pillars of "good code", namely strong cohesion, loose coupling and the elimination of redundancies, were not discovered by the inventors of OO, but were rather inherited by them (no pun intended).

Cohesion is the degree to which the responsibilities of a single module/component form a meaningful unit. High cohesion is considered to be better than low cohesion.

Coupling is the degree of interaction between two modules. Whenever you have one module calling another you have coupling. Loose coupling is considered to be better than tight coupling.

Elimination of redundancies is aimed at removing code that you do not need and is now called the YAGNI principle.

Other best practices which evolved in procedural languages, but which are still relevant in the OO world, are the KISS and DRY principles.

This tells me several things:

This also tells me that OOP can be adequately supported in procedural languages (such as PHP and COBOL) which have had the necessary syntax added in to enable encapsulation, inheritance and polymorphism without replacing the original syntax with something which is more OO-like. For example, if a procedural language allows a statement such as $result = uppercase($string) it would seem to be overkill to replace it with $result = $string->uppercase(). The result is exactly the same, but a lot of effort has been expended just to do it differently.
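
To make the comparison concrete, here is a small sketch of both styles. PHP's built-in strtoupper() stands in for the uppercase() function mentioned above, and the StringObject wrapper class is purely hypothetical: PHP has no such class, and it exists here only to show what the method-call form would need.

```php
<?php
// Procedural style: the built-in strtoupper() plays the role of uppercase().
$string = 'hello';
$result1 = strtoupper($string);

// OO style: a hypothetical wrapper class (not part of PHP) which offers
// the same operation as a method on an object.
final class StringObject
{
    private string $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }

    public function uppercase(): string
    {
        return strtoupper($this->value);
    }
}

$result2 = (new StringObject($string))->uppercase();
```

Both $result1 and $result2 hold the same value; the second form simply needs an extra class to get there.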

When some people say that OO programming is completely different from procedural programming they are making a fundamental mistake. OO programming is exactly the same as procedural programming except for the addition of encapsulation, inheritance and polymorphism. The only difference between these two paradigms is that one supports encapsulation, inheritance and polymorphism while the other does not. Just as it is possible to produce spaghetti code (unstructured branching using GOTO) in a procedural language it is also possible to produce ravioli code (too many small classes) or lasagne code (too many layers) in an OO language. Using the features that the language provides will not guarantee "good" code; it is how you make use of those features which is the deciding factor.

OO theory is constantly being expanded to include more and more concepts, and these concepts are becoming more and more complicated. As languages are modified to include these add-on concepts newcomers to these languages become convinced that it is these add-ons which define what OO is. I totally disagree. OOP does not require the use of any of these optional extras, so it is wrong to say that a program is not OO simply because it does not use them. It would be like saying that a car is not a car unless it has climate control and satnav. Those are optional extras, not the distinguishing features, and not having them does not make your car not a car. It would also be incorrect to say that a car is a car because it has wheels. Having wheels does not make something a car - a pram has wheels, but that does not make it a car, so having wheels is not a distinguishing feature.

As has already been stated in What OOP is, you need to be using a programming language that supports encapsulation, inheritance and polymorphism, and you need to use these features in a way that creates more reusable code. The more reusable code you have the less you have to write, which in turn means the less code you need to read and maintain.


What types of object should I create?

In his article How to write testable code the author identifies three distinct categories of object:

  1. Value objects - an immutable object whose responsibility is mainly holding state but may have some behavior. Examples of Value Objects might be Color, Temperature, Price and Size.
  2. Entities - an object whose job is to hold state and associated behavior. Examples of this might be Account, Product or User.
  3. Services - an object which performs an operation. It encapsulates an activity but has no encapsulated state (that is, it is stateless). Examples of Services could include a parser, an authenticator, a validator or a transformer (such as transforming raw data into XML or HTML).

The PHP language does not have value objects so I shall ignore them.

It would be advisable to avoid the temptation to create Anemic Domain Models which contain data but no processing. This goes against the whole idea of OO which is to create objects which contain both data and processing.
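
The difference can be sketched as follows. Both classes are hypothetical examples, with amounts held as whole numbers to keep the arithmetic simple: the first is anemic because it holds data but no processing, while the second keeps the rules alongside the data they protect.

```php
<?php
// Anemic: data only. Every rule about the balance must live somewhere else.
class AnemicAccount
{
    public int $balance = 0;
}

// Data and processing together: the object enforces its own rules.
class Account
{
    private int $balance = 0;

    public function deposit(int $amount): void
    {
        if ($amount <= 0) {
            throw new InvalidArgumentException('Deposit must be positive');
        }
        $this->balance += $amount;
    }

    public function withdraw(int $amount): void
    {
        if ($amount > $this->balance) {
            throw new DomainException('Insufficient funds');
        }
        $this->balance -= $amount;
    }

    public function getBalance(): int
    {
        return $this->balance;
    }
}
```

With the anemic version any piece of code can set the balance to any value, so the rule about insufficient funds has to be repeated wherever the balance is changed.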

My framework contains the following objects:

  1. Model - this is an entity. One of these is created for each entity in the application and holds all the business rules for that entity.
  2. View - this is a service, a reusable component which is provided by the framework.
  3. Controller - this is a service, a reusable component which is provided by the framework.
  4. Data Access Object - this is a service, a reusable component which is provided by the framework.

Note here that all application/domain knowledge is confined to the Models (the Business layer). There is absolutely no application knowledge in any of the services which are built into the framework. This means that the services (Controllers, Views and DAOs) are application-agnostic while the Models are framework-agnostic.

Every application, however small or large, will be composed of a number of user transactions (sometimes known as Use Cases), each of which performs a specific unit of work for the user. In my framework each user transaction has its own Component script which does nothing but identify which combination of Model, View and Controller is required to carry out the relevant processing.
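
The idea of a component script which merely names the parts to be combined can be sketched like this. It is an illustration under assumed names only (ProductModel, ListView, ListController and the $component array are all hypothetical) and is not the framework's actual code.

```php
<?php
// Model: application-specific, holds the knowledge about one entity.
class ProductModel
{
    public function getData(): array
    {
        return [
            ['id' => 1, 'name' => 'Widget'],
            ['id' => 2, 'name' => 'Gadget'],
        ];
    }
}

// View: a reusable service which knows nothing about any particular Model.
class ListView
{
    public function render(array $rows): string
    {
        $lines = array_map(fn(array $row) => implode(' | ', $row), $rows);
        return implode("\n", $lines);
    }
}

// Controller: a reusable service which works with whatever Model and View it is given.
class ListController
{
    public function run(object $model, object $view): string
    {
        return $view->render($model->getData());
    }
}

// Component script for one user transaction: it only identifies the parts.
$component = [
    'model'      => ProductModel::class,
    'view'       => ListView::class,
    'controller' => ListController::class,
];

// Generic dispatch: nothing below this line is specific to products.
$modelClass      = $component['model'];
$viewClass       = $component['view'];
$controllerClass = $component['controller'];

$controller = new $controllerClass();
$output = $controller->run(new $modelClass(), new $viewClass());
echo $output, "\n";
```

In this sketch a second user transaction which lists a different entity would need a new Model and a new $component array, but no new View or Controller.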


How many objects should I create?

This is a tricky question, but there are two extremes which you should avoid:

It is generally accepted that you should break your application down into areas of different logic where each area has a single responsibility, but what exactly is a "responsibility"? The confusion over the idea that "responsibility" should be treated as "reason for change" is discussed in I don't love the single responsibility principle, in which Marco Cecconi says the following:

The purpose of classes is to organize code as to minimize complexity. Therefore, classes should be:
  1. small enough to lower coupling, but
  2. large enough to maximize cohesion.
By default, choose to group by functionality.

He also points out that an over-enthusiastic implementation of SRP can result in large numbers of anemic micro-classes that do little and complicate the organisation of the code base.

In my own architecture, which is shown in Figure 1, I have the following component numbers:


What structure should I use?

In the early days of computing monolithic systems were the most common. These systems have the following characteristics:

A software system is called "monolithic" if it has a monolithic architecture, in which functionally distinguishable aspects (for example data input and output, data processing, error handling, and the user interface) are all interwoven, rather than containing architecturally separate components.

This also became known as the Single-Tier Architecture when the idea of splitting the code into layers or tiers gradually became popular. After using monolithic structures with the COBOL language I switched to UNIFACE which provided the following:

I liked the 3-Tier Architecture so much that I used it when I rebuilt my development framework in PHP.

OO aficionados should note that the 3-Tier Architecture conforms to the Single Responsibility Principle which was written by Robert C. Martin (Uncle Bob). Although in his original article he used the vague term "a class should only have one reason to change", in later articles he gave more usable descriptions:

This is the reason we do not put SQL in JSPs. This is the reason we do not generate HTML in the modules that compute results. This is the reason that business rules should not know the database schema. This is the reason we separate concerns.

In Test Induced Design Damage? he wrote the following:

How do you separate concerns? You separate behaviors that change at different times for different reasons. Things that change together you keep together. Things that change apart you keep apart.

GUIs change at a very different rate, and for very different reasons, than business rules. Database schemas change for very different reasons, and at very different rates than business rules. Keeping these concerns (GUI, business rules, database) separate is good design.

Martin Fowler also describes this separation into three layers in his article PresentationDomainDataLayering where he refers to the "Business" layer as the "Domain" layer. In his article AnemicDomainModel he says the following:

It's also worth emphasizing that putting behavior into the domain objects should not contradict the solid approach of using layering to separate domain logic from such things as persistence and presentation responsibilities. The logic that should be in a domain object is domain logic - validations, calculations, business rules - whatever you like to call it.
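
The separation described in these quotes can be sketched in a few lines. The names are hypothetical and an in-memory array stands in for a real database, so this is an illustration of the layering rather than a working data access layer.

```php
<?php
// Data access layer: the only place which knows how the data is stored.
class OrderDao
{
    private array $rows = [
        ['order_id' => 1, 'net' => 100, 'tax_rate' => 20],
        ['order_id' => 2, 'net' => 50,  'tax_rate' => 0],
    ];

    public function fetchAll(): array
    {
        return $this->rows;
    }
}

// Business layer: the only place which knows the rules (here, how tax is added).
class OrderRules
{
    private OrderDao $dao;

    public function __construct(OrderDao $dao)
    {
        $this->dao = $dao;
    }

    public function ordersWithTotals(): array
    {
        $result = [];
        foreach ($this->dao->fetchAll() as $row) {
            $row['gross'] = $row['net'] + intdiv($row['net'] * $row['tax_rate'], 100);
            $result[] = $row;
        }
        return $result;
    }
}

// Presentation layer: the only place which knows about HTML.
function renderOrders(array $orders): string
{
    $html = "<ul>\n";
    foreach ($orders as $order) {
        $html .= sprintf("<li>Order %d: %d</li>\n", $order['order_id'], $order['gross']);
    }
    return $html . "</ul>\n";
}

$html = renderOrders((new OrderRules(new OrderDao()))->ordersWithTotals());
```

A change to the HTML touches only renderOrders(), a change to the tax rule touches only OrderRules, and a change to the storage touches only OrderDao.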

Note that each of these three layers is not restricted to having a single component. You may find it convenient to split the program logic into more specialised components. For example, in my Presentation layer I built a separate component to create all HTML output using XML and XSL transformations, and if you treat the Business layer as being the same as the Model, this also produced an implementation of the Model-View-Controller design pattern, as shown in Figure 1:

Figure 1 - The MVC and 3-Tier architectures combined


How much reusability should I have?

The purpose of OOP is to use encapsulation, inheritance and polymorphism to produce code with a greater degree of reusability. Code which can be written once and reused multiple times does not have to be rewritten multiple times, and code which you don't have to write takes zero time to write. Taking less time to achieve a given objective results in lower costs, which means that the developer can become more cost-effective and more productive. The more reusability you have the less time you have to spend in maintenance, as a module which is reused 100 times only has to be tested once, not 100 times.

If you are not sure how much reusability you have in your framework then ask yourself the following questions:

The volume of reusability in my framework is detailed in Levels of Reusability. If you cannot achieve similar levels of reusability in your framework then stop wasting your time in telling me that my methods are wrong. If my methods produce 10 times the reusability that you can, then this surely indicates that my methods are 10 times better than yours.


Conclusion

Many people use different words to describe what OOP is supposed to mean, but the problem with words is that they are slippery. As Humpty Dumpty proclaimed in Lewis Carroll's Through the Looking Glass:

When I use a word, it means just what I choose it to mean -- neither more nor less.

If you take the words used by the originators of OOP and apply different meanings to those words, then others take your words and apply different meanings to them, then you can end up with something which is nothing like the original, as immortalised in that children's game called Chinese Whispers.

There are only three features which really differentiate an Object Oriented language from a non-OO language, and these are Encapsulation, Inheritance and Polymorphism. Everything else is either bullshit or hype. Object Oriented Programming is therefore the use of these features in a programming language. High reusability and low maintenance costs cannot be guaranteed - that depends entirely on how these features are implemented.

Some people accuse me of having a view of OOP which is too simplistic, but instead of saying that my view is "more simple than it need be" surely it can also mean that their view is "more complex than it need be"? As a long-time follower of the KISS principle, which has a more modern variant in Do The Simplest Thing That Could Possibly Work, I know which view I prefer, and I also know which view is easier to teach to others.


References


© Tony Marston
3rd December 2006

http://www.tonymarston.net
http://www.radicore.org

Amendment history:

20 Apr 2017   Added What types of object should I create?
              Added How many objects should I create?
              Added What structure should I use?
              Added How much reusability should I have?
10 Mar 2017   Added What Inheritance is not
              Added OOP requires a totally different thought process.
29 Apr 2012   Added Popular misconceptions.
10 Apr 2012   Added Optional Extras.
17 Jun 2010   Amended What OOP is to include a reference to a definition provided by Bjarne Stroustrup.
