I’m currently working on a project that was clearly written by people who were very familiar with databases and not so comfortable with Java coding. The database is beautiful – well laid out and fully normalized. The code… well, it looks like they copied some examples from the early chapters of books on servlets and JDBC.
It’s not surprising that the developers avoided manipulating objects in Java, and did most of the work via stored procedures written in PL/SQL where they were more comfortable. And they did a lot in the stored procedures: not only did the procedures do data manipulation (converting NULL to “”, converting time formats), but also pagination (useful if there would be a lot of data returned), and even auditing (easy to do when you know a procedure will be called every time data is modified).
I also was reminded of a place where I worked, where the database team fought hard to get more work done in stored procedures. I’m still not sure if they truly thought it would make the system better or if it was largely a power grab to gain more control of the application.
It’s an interesting question: what role should stored procedures have in a modern application?
Here are some pros and cons of using stored procedures:
Pro:
- Speed is, not surprisingly, the big one. Having the procedure compiled into the database makes it lickety-split in accessing data.
- Less data needs to be shuttled between the database and application. So, less bandwidth used, and therefore faster processing.
- The application can change (somewhat) without taking down the system. Stored procedure changes don’t require .war files and planned outages. Just feed a script to SQL*Plus and the changes take place immediately.
- Stored Procedures can hide the complexity of the database underneath it.
Con:
- Stored Procedures are not database-independent. One you start using one, you are forced to stay with that vendor from then on.
- Writing the procedures requires knowledge of the stored procedure language. Then again, Oracle does allow procedures to be written in Java.
- If you have code in both the application server and the database, you have a second code base to manage.
- Debugging stored procedures is harder – Java tools are very mature, and debugging across the seam between Java and DB is near impossible.
Maybe this whole argument is now moot with the rise of Object Relational Mapping. O/R Mapping assumes that a Java object reflects the contents of a row in the database. Which means no interference from a stored procedure. Here’s an interesting quote from Gavin King (author of the O/R persistence service Hibernate):
"Stored procedures are essentially a non-relational view of a relational database. They're a procedure oriented, a call oriented view of a relational database "
On the other hand, if you think of a stored procedure as a “service”, as in “service oriented architecture”, maybe it’s not such a bad idea. See this article for more on the view that stored procedures are a better way to go.
So what’s the proper place for stored procedures?
Comments
You have to hate productivity and have even more of a hate of rolling products out the door in order to justify the use of stored procedures. ORM is without a doubt the way to go. You can fire your entire staff of DBAs (at least those who write stored procedures) and replace them with application servers (which cost much less than DBAs) that will handle the performance loss of using ORM instead of compiled stored procedures. Even a Windows application server complains less and hiccups less frequently than a DBA.
Sorry for the rant. I'm debugging an application that uses stored procs, and you're #1 when "stored procedures waste time" is Googled!
I think this whole debate has gone too far and it's still on. Most experts are biased. Maybe an authentic research company or a body should carry out a comprehensive research and give us a scientific conclusion. Each of the vendors (Oracle,Microsoft,IBM,Sybase etc) can also do the same.
This is a good, quick study on the pros and cons. I have written
both application server stored procedures and database stored
procedures. There are two things I would like to point out that are of
importance.
Quite often, DB SPs are written poorly, with poor
exception handling that leaves the calling method guessing as to why
there is an error. By handling exceptions well within the stored
procedure, troubleshooting can be made many time easier. Along with
the input parameters required for the SP functionality, I typically
have an argument that simply returns success or failure (a code), and
then a string parameter that is descriptive as to what the problem was.
Regarding multiple code bases, it is true that you are now
locating logic in multiple places. Sometimes, though, stored
procedures are called from multiple application servers which each have
their own code base (completely seperate projects). In this case it is
surely better to keep the code in a single location (the DB) and only
have to modify it once when underlying DB objects change or the
functionality of the SP changes.
wrt David@tokyo's comment about dbs being boring... I LOVE em!
Disclaimer: I'm someone with a background mainly in application programming and generally find databases boring to deal with. I'd rather write a rdbms than use one.
I've spent a lot of time with a particular web-application recently which relies very heavily (way too heavily IMO) on stored procedures.
Whenever a bug in the system arises, 80% of the time it's a buggy stored procedure (not the fault of SPs - just illustrating the degree to which they are used in this system). A great deal of business logic is evidently present on the database server.
One stated pro of stored procedures is performance. I'm not going to argue that in some circumstances they can give a performance boost. Wasting time running between the app and database servers is clearly to be avoided. Thus data intensive processing seems to be the prime candidate for implementation in stored procedures. But it's not as simple as saying "sp's are faster (especially for data intensive processing) so let's use them lots".
What I see when I look at the load average on our DB server is this:
load averages: 0.30, 0.20, 0.14
And over on the App server we have this:
load averages: 0.05, 0.04, 0.07
With just a limited amount of business logic implemented in the application server, it mainly just shovels requests and
responses between the web server and the database server.
Currently the system has just a few users on it, but in future there will be something like hundreds of users. When that day comes (well before that day comes in fact), the database is just going to choke under the increased load.
Let me recall the worst case of sp abuse that I can identify. When the system was "designed", rather than keeping running totals of a certain set of values cached at the app server, and only updating the database when there was a change in the underlying figures, the system was designed to call an sp which would recalculate the totals right from the start, everytime a user made a request to see the latest data. At the technical review, I raised this as an issue, but the allknowing consultant rejected the concern saying that "it's fine, the database will be very fast".
Eventually, the system was released and the users found themselves waiting for up to 40 seconds for a their page to be refreshed. Their existing system (to be retired) can display a similar set of data within 1 to 2 seconds, at the most.
Here's the thing: The designer could rightly have used the justification that "we'll use a stored procedure because this is data intensive processing". As our frustrated end-users discovered, it certainly was!
So I'd add an additional rule of thumb regarding the use of stored procedures:
Stored procedures may be fast, but not calling a stored procedure at all is potentially magnitudes of times faster. If you can cache results and avoid recalculation, you don't even need to go to the database in the first place.
This experience illustrates to me very clearly that you don't want a "specialist" in a given technology designing your system.
I am in the camp that subscribes to the idea that the best tool for the job should prevail. If you are working on a project that will become open source, then yes... you may want to keep database transactions within the application code domain. But in my experience, there won't be a lot of switching of backend databases after initial release (vendor licensing, client's requests, etc.)
The efficiency gained by stored procedures is too great to pass up when performing heavy DB data processing. Create statements for the stored procedures can be managed in the same code repository as the application code itself making code management a breeze.
It's nice not having to worry about your backend persistance layer, but if you plan to provide a robust, efficient and "snappy" solution to your client you should become familiar with ALL aspects of the solution including the persistance layer.
Post new comment