Sunday, April 15, 2018

UKOUG Northern Technology Summit 2018

The UKOUG has run something called the Northern Server Day for several years. Northern because they were held in a northern part of England (but south of Scotland) and Server because the focus was the database server. Over the last couple of years the day has had several streams, covering Database, High Availability and Engineered Systems. So primarily a day for DBAs and their ilk.

This year the event has expanded to let in the developers. Yay!

The Northern Technology Summit 2018 is effectively a mini-conference: in total there are five streams - Database, RAC Cloud Infrastructure & Availability, Systems, APEX and Development. But for registration it counts as a SIG. So it's free for UKOUG members to attend. What astonishingly good value!1 And it doesn't affect your entitlement to attend the annual conference in December.

The Development stream

The Development stream covers a broad range of topics. Application development in 2018 is a tangled hedge with new technologies like Cloud, AI and NoSQL expanding the ecosystem but not displacing the traditional database and practices. The Development stream presents a mix of sessions from the new-fangled and Old Skool ends of the spectrum.

  • The New Frontier: Mobile and AI Powered Conversational Apps. Oracle are doing interesting work with AI and Grant Ronald is king of the chatbots. This is an opportunity to find out what modern day apps can do.
  • The New Frontier: Mobile and AI Powered Conversational Apps. Oracle are doing interesting work with AI and Grant Ronald is king of the chatbots. This is an opportunity to find out what modern day apps can do.
  • Building a Real-Time Streaming Platform with Oracle, Apache Kafka, and KSQL No single technology is a good fit for all enterprise problems. Robin Moffat of Confluent will explain how we can use Apache Kafka to handle event-based data processing.
  • Modernising Oracle Forms Applications with Oracle Jet Oracle Forms was< - still is - a highly-productive tool for building OLTP front-ends. There are countless organisations still running Forms applications. But perhaps the UX looks a little jaded in 2018. So here's Mark Waite from Griffiths Waite to show how we can use Oracle's JET JavaScript library to write new UIs without having to re-code the whole Forms application.
  • 18(ish) Things Developers Will Love about Oracle Database 18c Oracle's jump to year-based release numbers doesn't make live easier for presenters: ten things about 10c was hard enough. But game for a challenge, Oracle's Chris Saxon attempts to squeeze as many new features as possible into his talk.
  • Modernize Your Development Experience With Oracle Cloud Cloud isn't just something for the sysadmins, there's a cloud for developers too. Sai Janakiram Penumuru from DXC Technology will explain how Oracle Developer Cloud might revolutionise your development practices.
  • Designing for Deployment As Joel Spolsky says, shipping is a feature. But it's a feature which is hard to retrofit. In this talk I will discuss some design principles which make it easier to build, deploy and ship database applications.

Everything else

So I hope the Development stream offers a day of varied and useful ideas. There are things you might be able to use right now or in the next couple of months, and things which might shape what you'll be doing next year. But it doesn't matter if not everything floats your boat. The cool thing about the day is that delegates can attend any of the streams. 2 .

So you can listen to Nigel Bayliss talking about Optimisation in the Database Stream, Vipul Sharma: talking about DevOps in the Availability stream, Anthony talking about Kubernetes in the Systems stream and John Scott talking about using Docker with Oracle in the Apex stream. There are sessions on infrastructure as code, upgrading Oracle 12cR1 to 12cR2, GDPR (the new EU data protection law), the Apex Interactive grid, Apache Impala, and Cloud, lots of Cloud. Oh my!

The full agenda is here (pdf).

Register now

So if you're working with Oracle technology and you want to attend this is what you need to know:
  • Date: 16th May 2018
  • Location: Park Plaza Hotel, Leeds
  • Cost: Free for UKOUG members. There is a fee for non-members but frankly you might as well buy bronze membership package and get a whole year's work of access to UKOUG events (including the annual conference). It's a bargain.

1. The exact number of SIG passes depends on the membership package you have
2. The registration process requires you to pick a stream but that is just for administrative purposes. It's not a lock-in.

Labels: , , ,

Sunday, December 31, 2017

Data Access Layer vs Table APIs

One of the underlying benefits of PL/SQL APIs is the enabling of data governance. Table owners can shield their tables behind a layer of PL/SQL. Other users have no access to the tables directly but only through stored procedures. This confers many benefits:
  • Calling programs code against a programmatic interface. This frees the table owner to change the table's structure whenever it's necessary without affecting its consumers.
  • Likewise the calling programs get access to the data they need without having to know the details of the table structure, such as technical keys.
  • The table owner can use code to enforce complicated business rules when data is changed.
  • The table owner can enforce sophisticated data access policies (especially for applications using Standard Edition without DBMS_RLS).
So naturally the question arises, is this the same as Table APIs?

Table APIs used to be a popular approach to encapsulating tables. The typical Table API comprised two packages per table; one package provided methods for inserting, updating and deleting records, and the other package provided query methods. The big attraction of Table APIs was that they could be 100% generated from the data dictionary - both Oracle Designer and Steven Feuerstein's QNXO library provided TAPI generators. And they felt like good practice because, y'know, access to the tables was shielded by a PL/SQL layer.

But there are several problems with Table APIs.

The first is that they entrench row-by-agonising-row processing. Table APIs have their roots in early versions of Oracle so the DML methods only worked with a single record. Even after Oracle 8 introduced PL/SQL collection types TAPI code in the wild tended to be RBAR: there seems to something in the brain of the average programmer which predisposes them to prefer loops executing procedural code rather than set operations.

The second is that they prevent SQL joins. Individual records have to be selected from one table to provide keys for looking up records in a second table. Quite often this leads to loops within loops. So-called PL/SQL joins prevent the optimizer from choosing good access paths when handling larger amounts of data.

The third issue is that it is pretty hard to generate methods for all conceivable access paths. Consequently the generated packages had a few standard access paths (primary key, indexed columns) and provided an dynamic SQL method which accepted a free text WHERE clause. Besides opening the package to SQL injection this also broke the Law of Demeter: in order to pass a dynamic WHERE clause the calling program needed to know the structure of the underlying table, which defeats the whole objective of encapsulation.

Which leads on to the fourth, more philosophical problem with Table APIs: there is minimal abstraction. Each package is generated so it fits very closely to the structure of the Table. If the table structure changes we have to regenerate the TAPI packages: the fact that this can be done automatically is scant recompense for the tight coupling between the Table and the API.

So although Table APIs could be mistaken for good practice in actuality they provide no real benefit. The interface is 1:1 with the table structure so it has no advantage over granting privileges on the table. Combined with the impact of RBAR processing and PL/SQL joins on performance and the net effect of Table APIs is disastrous.

We cannot generate good Data Access APIs: we need to write them. This is because the APIs should be built around business functions rather than tables. The API packages granted to other users should comprise procedures for executing transactions. A Unit Of Work is likely to touch more than one table. These have to be written by domain experts who understand the data model and the business rules.

Part of the Designing PL/SQL Programs series

Labels: , ,

Friday, December 29, 2017

On hitting 100K on StackOverflow

100,000 is just another number. It's one more than 99,999. And yet, and yet. We live in a decimal cultural. We love to see those zeroes roll up. Order of magnitude baby! It's the excitement of being a child, going on a journey in the family car when the odometer reads 99994. knowing you'll see 100000. Of course everybody got distracted by the journey and next time you look at the dial it reads 100002.

Earlier this year my StackOverflow reputation passed 100,000. Like the car journey I missed the actual moment. My rep had been 99,986 when I last checked the previous evening and 100,011 the next day. Hey ho.

Reputation is a big deal on StackOverflow because it is the prime measure of contribution. As a Q&A site (not a forum - that confuses a lot of people) it needs content, it needs good questions and good answers. Reputation points are the reward for good posts. In this context good is determined democratically: people vote up good questions and good answers, and - crucially - vote down poor questions and answers. Votes are the main way of gaining reputation points: +5 for an upvoted question, +10 for an upvoted answer and +15 for an accepted answer. (There are other ways of gaining - and losing - rep) but posting is the main one.
"Reputation is a rough measurement of how much the community trusts you; it is earned by convincing your peers that you know what you’re talking about." Meta Stack Exchange FAQ

So is reputation just a way of keeping score? Nope: it is gamification but there is more to it than that. Reputation means points and what do points make? Prizes Privileges. StackOverflow is largely a self-policing community. There are full-on (elected) moderators but most moderation is actually carried out by regular SO users with sufficient rep. Somebody has asked an unclear question: once you have 50 rep you can post a comment asking for clarification. Got a user who doesn't know how to turn off the CAPSLOCK key? With 2000 rep you can just edit their post and apply sentence case. And so on.

Hmmm, so StackOverflow rewards its keenest contributors by allowing them to do chores around the site. Yes and it works. One of the big problems with forums is other users. Not griefers as such but there are a lot of low-level irritations: users who don't know how to search the site, or how to format their posts, or just generally fail to understand etiquette. Granting increasing moderation privileges at reputation milestones allows committed users to smooth away soem of those irritations.

But still, getting to 100,000 took eight years and almost 3000 answers. Was it worth it? Of course. It's nice to give back to the community. We are here to help: upvotes and accepted answers provide a nice feedback that we've succeeded. Downvotes also provide a necessary corrective (even if it is annoying when some rando dings you on an answer from five years back without leaving comment). And while there are no prizes, when you get to 100,000 you do get swag. A big box of swag:

Here is the box with a standard reference pear so you can see just how big it is.

Inside there is - a pen ....

Some stickers ....

A StackOverflow T-shirt (I have negotiated with my better half to keep this one) ...

And an over-sized coffee mug...

One more thing. There are also badges. Badges are nudges to encourage desirable behaviour such as editing posts, voting in moderator elections, reviewing posts, offering bounties, being awesome. Because let's face it, badges are cool. More badges = more flair. And who doesn't want more flair? Got flair? Heck yeah!

profile for APC at Stack Overflow, Q&A for professional and enthusiast programmers

Labels: ,

Wednesday, May 31, 2017

Avoiding Coincidental Cohesion

Given that Coincidental Cohesion is bad for our code base so obviously we want to avoid writing utilities packages. Fortunately it is mostly quite easy to do so. It requires vigilance on our part. Utilities packages are rarely planned. More often we are writing a piece of business functionality when we find ourselves in need of some low level functionality. It doesn't fit in the application package we're working on, perhaps we suspect that it might be more generally useful, so we need somewhere to put it.

The important thing is to recognise and resist the temptation of the Utilities package. The name itself (and similarly vague synonyms like helper or utils) should be a red flag. When we find ourselves about to type create or replace package utilities we need to stop and think: what would be a better name for this package? Consider whether there are related functions we might end up needing? Suppose we're about to write a function to convert a date into Unix epoch string. It doesn't take much imagine to think we might need a similar function to convert a Unix timestamp into a date. We don't need to write that function now but let's start a package dedicated to Time functions instead of a miscellaneous utils package.

Looking closely at the programs which comprise the DBMS_UTILITY package it is obviously unfair to describe them as a random selection. In fact that there seven or eight groups of related procedures.

DB Info

  • DBLINK_ARRAY Table Type
  • DB_VERSION Procedure
  • PORT_STRING Function
Runtime Messages
Object Management
  • COMMA_TO_TABLE Procedures
  • COMPILE_SCHEMA Procedure
  • INVALIDATE Procedure
  • TABLE_TO_COMMA Procedures
  • VALIDATE Procedure
Object Info (Object Management?)
  • LNAME_ARRAY Table Type
  • NAME_ARRAY Table Type
  • NUMBER_ARRAY Table Type
  • UNCL_ARRAY Table Type
  • CANONICALIZE Procedure
  • GET_DEPENDENCY Procedure
  • NAME_RESOLVE Procedure
  • NAME_TOKENIZE Procedure
Session Info
SQL Manipulation
  • EXPAND_SQL_TEXT Procedure
  • GET_SQL_HASH Function
Statistics (deprecated))
  • ANALYZE_SCHEMA Procedure
  • GET_CPU_TIME Function
  • GET_TIME Function
  • GET_HASH_VALUE Function
  • IS_BIT_SET Function

We can see an alternative PL/SQL code suite, with several highly cohesive packages. But there will be some procedures which are genuinely unrelated to anything else. The four procedures in the Unclassified section above are examples. But writing a miscellaneous utils package for these programs is still wrong. There are better options.

  1. Find a home. It's worth considering whether we already have a package which would fit the new function. Perhaps WAIT_ON_PENDING_DML() should have gone in DBMS_TRANSACTION; perhaps IS_BIT_SET() properly belongs in UTL_RAW.
  2. A package of their own. Why not? It may seem extravagant to have a package with a single procedure but consider DBMS_DG with its lone procedure INITIATE_FS_FAILOVER(). The package delivers the usual architectural benefits plus it provides a natural home for related procedures we might discover a need for in the future.
  3. Standalone procedure. Again, why not? We are so conditioned to think of a PL/SQL program as a package that we forget it can be just a Procedure or Function. Some programs are suited to standalone implementation.

So avoiding the Utilities package requires vigilance. Code reviews can help here. Preventing the Utilities package becoming entrenched is crucial: once we have a number of packages dependent on a Utilities package it is pretty hard to get rid of it. And once it becomes a fixture in the code base developers will consider it more acceptable to add procedures to it.

Part of the Designing PL/SQL Programs series

Labels: , , ,

Utilities - the Coincidental Cohesion anti-pattern

One way to understand the importance of cohesion is to examine an example of a non-cohesive package, one exhibiting a random level of cohesion. The poster child for Coincidental Cohesion is the utility or helper package. Most applications will have one or more of these, and Oracle's PL/SQL library is no exception. DBMS_UTILITY has 37 distinct procedures and functions (i.e. not counting overloaded signatures) in 11gR2 and 38 in 12cR1 (and R2). Does DBMS_UTILITY deliver any of the benefits the PL/SQL Reference says packages deliver?

Easier Application Design?

One of the characteristics of utilities packages is that they aren't designed in advance. They are the place where functionality ends up because there is no apparently better place for it. Utilities occur when we are working on some other piece of application code; we discover a gap in the available functionality such as hashing a string. When this happens we generally need the functionality now: there's little benefit to deferring the implementation until later. So we write a GET_HASH_VALUE() function,x stick it in our utilities package and proceed with the task at hand.

The benefit of this approach is we keep our focus on the main job, delivering business functionality. The problem is, we never go back and re-evaluate the utilities. Indeed, now there is business functionality which depends on them: refactoring utilities introduces risk. Thus the size of the utilities package slowing increases, one tactical implementation at a time.

Hidden Implementation Details?

Another characteristic of utility functions is that they tend not to share concrete implementations. Often a utilities package beyond a certain size will have groups of procedures with related functionality. It seems probable that DBMS_UTILITY.ANALYZE_DATABASE(), DBMS_UTILITY.ANALYZE_PART_OBJECT() and DBMS_UTILITY.ANALYZE_SCHEMA() share some code. So there are benefits to co-locating them in the same package. But it is unlikely that CANONICALIZE() , CREATE_ALTER_TYPE_ERROR_TABLE() and GET_CPU_TIME() have much code in common.

Added Functionality?

Utility functions are rarely part of a specific business process. They are usually called on a one-off basis rather than being chained together. So there is no state to be maintained across different function calls.

Better Performance?

For the same reason there is no performance benefit from a utilities package. Quite the opposite. When there is no relationship between the functions we cannot make predictions about usage. We are not likely to call EXPAND_SQL_TEXT() right after calling PORT_STRING(). So there is no benefit in loading the former into memory when we call the latter. In fact the performance of EXPAND_SQL_TEXT() is impaired because we have to load the whole DBMS_UTILITY package into the shared pool, plus it uses up a larger chunk of memory until it gets aged out. Although to be fair, in these days of abundant RAM, some unused code in the library cache need not be our greatest concern. But whichever way we bounce it, it's not a boon.


Privileges on utility packages is a neutral concern. Often utilities won't be used outside the owning schema. In cases where we do need to make them more widely available we're probably granting access on some procedures that the grantee will never use.


From an architectural perspective, modularity is the prime benefit of cohesion. A well-designed library should be frictionless and painless to navigate. The problem with random assemblages like DBMS_UTILITY is that it's not obvious what functions it may contain. Sometimes we write a piece of code we didn't need to.

The costs of utility packages

Perhaps your PL/SQL code base has a procedure like this:
create or replace procedure run_ddl
  ( p_stmt in varchar2)
  pragma autonomous_transaction;
  v_cursor number := dbms_sql.open_cursor;
  n pls_integer;
  dbms_sql.parse(v_cursor, p_stmt, dbms_sql.native);
  n := dbms_sql.execute(v_cursor);
  when others then
    if dbms_sql.is_open(v_cursor) then
    end if;
end run_ddl;

It is a nice piece of code for executing DDL statements. The autonomous_transaction pragma prevents the execution of arbitrary DML statements (by throwing ORA-06519), so it's quite safe. The only problem is, it re-implements DBMS_UTILITY.EXEC_DDL_STATEMENT().

Code duplication like this is a common side effect of utility packages. Discovery is hard because their program units are clumped together accidentally. Nobody sets out to deliberately re-write DBMS_UTILITY.EXEC_DDL_STATEMENT(), it happens because not enough people know to look in that package before they start coding a helper function. Redundant code is a nasty cost of Coincidental Cohesion. Besides the initial wasted effort of writing an unnecessary program there are the incurred costs of maintaining it, testing it, the risk of introducing bugs or security holes. Plus each additional duplicated program makes our code base a little harder to navigate.

Fortunately there are tactics for avoiding or dealing with this. Find out more.

Part of the Designing PL/SQL Programs series

Labels: , , ,

Monday, December 05, 2016

UKOUG Tech 2016 - Super Sunday

UKOUG 2016 is underway. This year I'm staying at the Jury's Inn hotel, one of a clutch of hotels within a stone's throw of the ICC and all the action of Brindley Place. Proximity is the greatest luxury. My room is on the thirteenth floor, so I have a great view across Birmingham; a view, which in the words of Telly Savalas "almost takes your breath away".

Although the conference proper - with keynotes, exhibition hall and so on - opens today, Monday, the pre-conference Super Sunday has already delivered some cracking talks. For the second year on the trot we have had a stream devoted to database development, which is great for Old Skool developers like me.

Fighting Bad PL/SQL, Phillip Salvisberg

The first talk in the stream discussed various metrics for assessing the the quality of PL/SQL code: McCabe Cyclic Complexity, Halstead Volume, Maintainability Index. Cyclic Complexity evaluates the number of paths through a piece of code; the more paths the harder it is to understand what the code does under any given circumstance. The volume approach assesses information density (the number of distinct words/total number of words); a higher number means more concepts, and so more to understand. The Maintainability Index takes both measures and throws it some extra calculations based on LoC and comments.

All these measures are interesting, and often insights but none are wholly satisfactory. Phillip showed how easier it is to game the MI by putting all the code of a function on a single line: the idea that such a layout makes our code more maintainable is laughable. More worryingly, none of these measures evaluate what the code actually does. The presented example of better PL/SQL (according to the MI measure) replaced several lines of PL/SQL into a single REGEXP_LIKE call. Regular expressions are notorious for getting complicated and hard to maintain. Also there are performance considerations. Metrics won't replace wise human judgement just yet. In the end I agree with Phillip that the most useful metric remains WTFs per minute.

REST enabling Oracle tables with Oracle REST Data Services, Jeff Smith

It was standing room only for That Jeff Smith, who coped well with jetlag and sleep deprivation. ORDS is the new name for the APEX listener, a misleading name because it is used for more than just APEX calls, and APEX doesn't need it. ORDS is a Java application which brokers JSON calls between a web client and the database: going one way it converts JSON payload into SQL statements, going the other way it converts result sets into JSON messages. Apparently Oracle is going to REST enable the entire database - Jeff showed us the set of REST commands for managing DataGuard. ORDS is the backbone of Oracle Cloud.

Most of the talk centred on Oracle's capabilities for auto-enabling REST access to tables (and PL/SQL with the next release of ORDS). This is quite impressive and certainly I can see the appeal of standing up a REST web service to the database without all the tedious pfaffing in Hibernate or whatever Java framework is in place. However I think auto-enabling is the wrong approach. REST calls are stateless and cannot be assembled to form transactions; basically each one auto-commits. It's Table APIs all over again. TAPI 2.0, if you will. It's a recipe for bad applications.

But I definitely like this vision of the future: an MVC implementation with JavaScript clients (V) passing JSON payloads to ORDS (C) with PL/SQL APIs doing all the business logic (M). The nineties revival starts here.

Meet your match: advanced row pattern matching, Stew Ashton

Stew's talk was one of those ones which are hard to pull off: Oracle 12c's MATCH RECOGNIZE clause is a topic more suited to an article with a database on hand so we can work through the examples. Stew succeeded in making it work as a talk because he's a good speaker with a nice style and a knack for lucid explanation. He made a very good case for the importance of understanding this arcane new syntax.

MATCH RECOGNIZE is lifted from event processing. It allows us to define arbitrary sets of data which we can iterate over in a SELECT statement. This allows us to solve several classes of problems relating to bin filtering, positive and negative sequencing, and hierarchical summaries. The most impressive example showed how to code an inequality (i.e. range) join that performs as well as an equality join. I will certainly be downloading this presentation and learning the syntax when I get back home.

If only Stew had done a talk on the MODEL clause several years ago.

SQL for change history with Temporal Validity and Flash Back Data Archive, Chris Saxon

Chris Saxon tackled the tricky concept of time travel in the database, as a mechanism for handling change. The first type of change is change in transactional data. For instance, when a customer moves house we need to retain a record of their former address as well as their new one. We've all implemented history like this, with START_DATE and END_DATE columns. The snag has always been how to formulate the query to establish which record applies at a given point in time. Oracle 12C solves this with Temporal Validity, a syntax for defining a PERIOD using those start and end dates. Then we can query the history using a simple AS OF PERIOD clause. It doesn't solve all the problems in this area (primary keys remain tricky) but at least the queries are solved.

The other type of change is change in metadata: when was a particular change applied? what are all the states of a record over the last year? etc. These are familiar auditing requirements, which are usually addressed through triggers and journalling tables. That approach carries an ongoing burden of maintenance and is too easy to get wrong. Oracle has had a built-in solution for several years now, Flashback Data Archive. Not enough people use it, probably because in 11g it was called Total Recall and a chargeable extra. In 12C Flashback Data Archive is free; shorn of the data optimization (which requires the Advanced Compression package) it is available in Standard Edition not just Enterprise. And it's been back-ported to The syntax is simple: to get a historical version of the data we simply use AS OF TIMESTAMP. No separate query for a journalling table, no more nasty triggers to maintain... I honestly don't know why everybody isn't using it.

So that was Super Sunday. Roll on Not-So-Mundane Monday.

Labels: , ,

Thursday, November 24, 2016

UKOUG Conference 2016 coming up fast

The weather has turned cold, the lights are twinkling in windows and Starbucks is selling pumpkin lattes. Yes, it's starting to look a lot like Christmas. But first there's the wonder-filled advent calendar that is the UKOUG Annual Conference in Birmingham, UK. So many doors to choose from!

The Conference is the premier event for Oracle users in the UK (and beyond). This year has another cracker of an agenda: check it out.

The session I'm anticipating most is Monday's double header with Bryn Llewellyn and Toon Koopelaar's A Real-World Comparison of the NoPLSQL & Thick Database Paradigms. Will they come down on the side of implementing business logic in stored procedures or won't they? It'll be tense :) But it will definitely be insightful and elegantly argued.

Oracle's bailiwick has expanded vastly over the years, and it's become increasingly hard to cover everything. Even so, it's fair to say in recent years older technologies such as Forms have been neglected in favour in favour of shinier baubles. Not this year. There's a good representation of Forms sessions this year, including a talk from Michael Ferrante, the Forms Product Manager. These sessions are all scheduled for the Wednesday, in a day targeted at database developers. If you're an Old Skool developer, especially if you're a Forms developer, and your boss will allow you only one day at the conference, then Wednesday is the day to pick.

Hope to see you there

Labels: , , ,

Tuesday, April 19, 2016

The importance of cohesion

"Come on, come on, let's stick together" - Bryan Ferry

There's more to PL/SQL programs than packages, but most of our code will live in packages. The PL/SQL Reference offers the following benefits of organising our code into packages:

Modularity - we encapsulate logically related components into an easy to understand structure.

Easier Application Design - we can start with the interface in the package specification and code the implementation later.

Hidden Implementation Details - the package body is private so we can prevent application users having direct access to certain functionality.

Added Functionality - we can share the state of Package public variables and cursors for the life of a session.

Better Performance - Oracle Database loads the whole package into memory the first time you invoke a package subprogram, which makes subsequent invocations of any other subprogram quicker. Also packages prevent cascading dependencies and unnecessary recompilation.

Grants - we can grant permission on a single package instead of a whole bunch of objects.

However, we can only realise these benefits if the packaged components belong together: in other words, if our package is cohesive.  

The ever reliable Wikipedia defines cohesion like this: "the degree to which the elements of a module belong together"; in other words how it's a measure of the strength of the relationship between components. It's common to think of cohesion as a binary state - either a package is cohesive or it isn't - but actually it's a spectrum. (Perhaps computer science should use  "cohesiveness" which is more expressi but cohesion it is.)


Cohesion owes its origin as a Comp Sci term to Stevens, Myers, and Constantine.  Back in the Seventies they used the terms "module" and "processing elements", but we're discussing PL/SQL so let's use Package and Procedure instead. They defined seven levels of cohesion, with each level being better - more usefully cohesive - than its predecessor.


The package comprises an arbitrary selection of procedures and functions which are not related in any way. This obviously seems like a daft thing to do, but most packages with "Utility" in their name fall into this category.


The package contains procedures which all belong to the same logical class of functions. For instance, we might have a package to collect all the procedures which act as endpoints for REST Data Services.


The package consists of procedures which are executed at the same system event. So we might have a package of procedures executed when a user logs on - authentication, auditing, session initialisation - and similar package for tidying up when the user logs off. Other than the triggering event the packaged functions are unrelated to each other.


The package consists of procedures which are executed as part of the same business event. For instance, in an auction application there are a set of actions to follow whenever a bid is made: compare to asking price, evaluate against existing maximum bid, update lot's status, update bidder's history, send an email to the bidder, send an email to the user who's been outbid, etc.


The package contains procedures which share common inputs or outputs. For example a payroll package may have procedures to calculate base salary, overtime, sick pay, commission, bonuses and produce the overall remuneration for an employee.


The package comprises procedures which are executed as a chain, so that the output of one procedure becomes the input for another procedure. A classic example of this is an ETL package with procedures for loading data into a staging area, validating and transforming the data, and then loading records into the target table(s).


The package comprises procedures which are focused on a single task. Not only are all the procedures strongly related to each other but they are fitted to user roles too. So procedures for power users are in a separate package from procedures for normal users. The Oracle built-in packages for Advanced Queuing are a good model of Functional cohesion.

How cohesive is cohesive enough?

The grades of cohesion, with Coincidental as the worst and Functional as the best, are guidelines. Not every package needs to have Functional cohesion. In a software architecture we will have modules at different levels. The higher modules will tend to be composed of calls to lower level modules. The low level modules are the concrete implementations and they should aspire to Sequential or Functional cohesion.

The higher level modules can be organised to other levels. For instance we might want to build packages around user roles - Sales, Production, HR, IT - because Procedural cohesion makes it easier for the UI teams to develop screens, especially if they need to skin them for various different technologies (desktop, web, mobile). Likewise we wouldn't want to have Temporally cohesive packages with concrete code for managing user logon or logoff. But there is a value in organising a package which bundles up all the low level calls into a single abstract call for use in schema level AFTER LOGON triggers.    

Cohesion is not an easily evaluated condition. We need cohesion with a purpose, a reason to stick those procedures together. It's not enough to say "this package is cohesive". We must take into consideration how cohesive the package needs to be: how will it be used? what is its relationships with the other packages?

Applying design principles such as Single Responsibility, Common Reuse, Common Closure and Interface Segregation can help us to build cohesive packages. Getting the balance right requires an understanding of the purpose of the package and its place within the overall software architecture.  

Part of the Designing PL/SQL Programs series

Labels: , , ,