What's wrong with Hibernate and JPA
For relational data persistence, Hibernate is certainly the most popular solution on the market. It has been around for years and it’s probably been used in thousands of projects. With its latest version Hibernate even conforms to Sun’s Java Persistence API (JPA) specification. So with Hibernate offering everything, why look for something else?
Well, we think that Hibernate and JPA are simply not as perfect as they seem. To explain why, here are a few things we think are wrong with Hibernate.
Data model definition: The trouble with the metadata
In order to work every relational data persistence solution needs to understand the underlying data model. Hibernate offers two ways of defining the data model: XML mapping files and annotations. Annotations have only recently been introduced to simplify object relational mapping and there are plenty of reasons why annotations are clearly the superior solution over XML mapping. Hence we will not consider XML mapping any further and concentrate on annotations. However, everything said here concerning annotations applies to XML mapping in a similar fashion.
The metadata you provide with annotations is required by Hibernate to know where and how to persist objects in the database. However, this information may be useful not only for Hibernate but also for your application logic, so you may want to access it from your code instead of redundantly providing the same information again. Good examples are the maximum length of a text field or whether or not a field is mandatory. On a user interface, for instance, you may need this information to display a form input control or to validate input.
With annotations you can access this information from your code, but a particular annotation property cannot be directly referenced from Java code in contrast to a Java object’s property. Here’s an example:
This is how to access metadata using annotations:
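import java.lang.reflect.Method;

// The getter is looked up by name; a misspelled name only fails at runtime (NoSuchMethodException)
Method getter = Employee.class.getDeclaredMethod("getFirstname");
javax.persistence.Column col = getter.getAnnotation(javax.persistence.Column.class);
int length = col.length();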
This code achieves the same using Empire-db's object model schema definition:
int length = mydb.EMPLOYEES.FIRSTNAME.getSize();
(Note that EMPLOYEES and FIRSTNAME are both public final member fields that we write in all upper-case letters, but could as well be made accessible through getters – it’s your code and your decision!)
Apart from the lack of compile-time safety and the code complexity on the client side, there are further reasons why annotations are problematic. The metadata provided with the persistence annotations is often not sufficient. Additional annotations, as with Hibernate Validator, are required, or you may want to define and assign custom annotations, making your mapping and access code even less readable and manageable. And programmatic changes to your data model at runtime are impossible altogether. With a Java object model for metadata, in contrast, all this is easy to achieve. So why use annotations if you can do better without?
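The contrast can be sketched in plain Java. The following is a minimal, self-contained illustration and not Empire-db's actual API: the MaxLength annotation and the Person, Column and EmployeesTable types are hypothetical stand-ins, assumed here only to show the difference between string-based reflection and a plain object model, including a change to the metadata at runtime.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class MetadataAccessSketch {

    // Hypothetical annotation standing in for a persistence annotation such as @Column
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface MaxLength {
        int value();
    }

    // Annotation style: the metadata is attached to the getter
    static class Person {
        private String firstname;

        @MaxLength(40)
        public String getFirstname() { return firstname; }
    }

    // Object model style: a hypothetical column descriptor whose size can be changed
    static final class Column {
        private final String name;
        private int size;

        Column(String name, int size) { this.name = name; this.size = size; }
        String getName() { return name; }
        int getSize() { return size; }
        void setSize(int size) { this.size = size; }
    }

    // Hypothetical table descriptor exposing its columns as public final fields
    static final class EmployeesTable {
        final Column FIRSTNAME = new Column("FIRSTNAME", 40);
    }

    static final EmployeesTable EMPLOYEES = new EmployeesTable();

    // The getter name is a string literal, so a typo is only detected at runtime
    static int lengthViaAnnotation() {
        try {
            Method getter = Person.class.getDeclaredMethod("getFirstname");
            return getter.getAnnotation(MaxLength.class).value();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("Getter not found", e);
        }
    }

    // Every name in this expression is checked by the compiler
    static int lengthViaObjectModel() {
        return EMPLOYEES.FIRSTNAME.getSize();
    }

    public static void main(String[] args) {
        System.out.println(lengthViaAnnotation());
        System.out.println(lengthViaObjectModel());
        // A runtime change of the metadata, which annotations cannot offer
        EMPLOYEES.FIRSTNAME.setSize(80);
        System.out.println(lengthViaObjectModel());
    }
}
```

Renaming FIRSTNAME in this sketch breaks the build at every place that uses it, whereas renaming getFirstname leaves the reflective lookup compiling and failing only when it is executed.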
We think that annotations should not be used to provide information likely to be required for application logic. Rather they should be used to provide code specific information for compiler optimization or code documentation. @Deprecated and @SuppressWarnings are good examples for acceptable annotations. Even though annotations integrate much better with Java code than XML mapping files, they're far from being as flexible as normal interfaces and classes. Annotations are new and cool now, but the more extensively they are being used, the more they are polluting your source code. Don't let annotation hell follow XML configuration hell.
Data object definition: The getter setter mania
Besides the metadata we also need somewhere to store and access our data. For Hibernate and JPA this is a JavaBean or POJO that is equipped with a member field as well as a getter and a setter method for each of the columns of the corresponding table. For large data models this means lots of lines of code. Hibernate tools may be used to automatically generate all this code using reverse engineering. However for large and mature projects you may run into the problem that once you have manually changed bean or mapping code – and you want to keep that change – automatic tools are problematic. So very often all this code including the metadata is maintained by hand. Even worse, since these objects are usually used as data transfer objects in order to fill business objects, you’ll find endless lines of code where property values are copied from one Java object to another. So what’s the point in having all these getters and setters in the first place?
With Empire-db's dynamic beans you only have one generic getter and one generic setter for each entity, and both have already been implemented. The number of classes may stay the same, as we also recommend creating an individual data object class for each database entity, although this is not necessary when using a generic DBRecord object. We recommend this for two reasons. The first is again type-safety, since you want your internal code to rely on certain entities. The second is that, as your project grows, you will likely need to override existing methods and implement new ones there. But even so, due to the absence of all those member fields and their corresponding getter and setter methods, you will end up with considerably less code to maintain. You may still add special getters and setters for individual columns if necessary or convenient.
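To make the idea concrete, here is a minimal sketch of a record with one generic getter and setter. It is written under our own assumptions and does not reproduce Empire-db's DBRecord: the Column, Record and EmployeeRecord classes and their method signatures are hypothetical and only illustrate the principle of column objects replacing per-field accessors.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GenericRecordSketch {

    // Hypothetical typed column descriptor
    static final class Column<T> {
        final String name;
        final Class<T> type;

        Column(String name, Class<T> type) {
            this.name = name;
            this.type = type;
        }
    }

    // Hypothetical generic record: one getter and one setter serve all columns
    static class Record {
        private final Map<Column<?>, Object> values = new LinkedHashMap<>();

        public <T> void setValue(Column<T> column, T value) {
            values.put(column, value);
        }

        public <T> T getValue(Column<T> column) {
            // cast() returns null for columns that have not been set
            return column.type.cast(values.get(column));
        }
    }

    // Column definitions shared by all employee records
    static final Column<String> FIRSTNAME = new Column<>("FIRSTNAME", String.class);
    static final Column<String> LASTNAME = new Column<>("LASTNAME", String.class);
    static final Column<Boolean> RETIRED = new Column<>("RETIRED", Boolean.class);

    // Entity-specific class: no member fields, only the methods that add value
    static class EmployeeRecord extends Record {
        String getFullName() {
            return getValue(LASTNAME) + ", " + getValue(FIRSTNAME);
        }
    }

    public static void main(String[] args) {
        EmployeeRecord employee = new EmployeeRecord();
        employee.setValue(FIRSTNAME, "Anna");
        employee.setValue(LASTNAME, "Foo");
        employee.setValue(RETIRED, false);
        System.out.println(employee.getFullName());
    }
}
```

In this sketch the entity class contains only the convenience method it actually needs, and passing a value of the wrong type to setValue is rejected by the compiler.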
Dynamic queries: The select dilemma
What we expect from a relational data persistence solution is good support for dynamic query generation, so let's look at how dynamic queries are handled. Again Hibernate offers two options: HQL and the criteria API. HQL is a language of its own which you have to learn first. It's basically an SQL dialect with Java code mapping extensions. In the end, however, it is provided or assembled as unsafe string literals, and the problem comes when trying to build complex statements with conditionally added constraints and joins. We think that beyond a certain level of complexity such query-building code simply becomes unmaintainable. The criteria API is better in this respect but, as we found, too limited in its capabilities.
But there is another issue: a common programming task is that you need a special view of your data that contains only a few columns collected or even calculated from one or more tables. The result may be displayed to a user or used for other processing purposes.
For this, Hibernate HQL lets you define the select clause, ideally giving it a special "result" bean that holds exactly the data you need, even transformed with SQL functions such as string concatenation or numeric calculations. Strangely, we found that in many projects this feature is rarely used. Instead people were working with the full entity beans, which meant that far more attributes than necessary were loaded from the database. To resolve entity relations, Hibernate either uses joins (eager mode), loading all referenced entities as well, or, with lazy loading enabled, performs additional queries, one for each unique referenced object, sometimes just to access one simple attribute. So instead of one bean object per row holding, say, 5 attributes, 5 objects per row holding more than 50 attributes between them are loaded. It is fairly obvious that this is not something you'd call the perfect solution.
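As an illustration of the string assembly we mean, here is a minimal sketch in plain Java that builds an HQL statement with optional constraints and an optional join. No Hibernate classes are involved; the Employee entity, its property names and the buildQuery helper are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class HqlStringBuildingSketch {

    // Assembles an HQL string; every entity and property name is an unchecked string literal
    static String buildQuery(String lastnamePrefix, Boolean retired, boolean withDepartment) {
        StringBuilder hql = new StringBuilder("select e from Employee e");
        if (withDepartment) {
            hql.append(" join e.department d");
        }

        // Collect only the constraints that were actually requested
        List<String> conditions = new ArrayList<>();
        if (lastnamePrefix != null) {
            conditions.add("upper(e.lastname) like upper(:lastname)");
        }
        if (retired != null) {
            conditions.add("e.retired = :retired");
        }
        if (!conditions.isEmpty()) {
            hql.append(" where ").append(String.join(" and ", conditions));
        }

        hql.append(" order by e.lastname, e.firstname");
        return hql.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(null, null, false));
        System.out.println(buildQuery("Foo%", Boolean.FALSE, true));
    }
}
```

Even in this small case the code has to keep track of keywords, blanks and the separators between conditions, and a misspelled property name such as e.lastnme would compile without complaint and only fail when the query is executed.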
Now the question here is: if people fail to take the right approach, is it the people's or the tool's fault?
In order to find out, I asked myself: Would I, if I were using Hibernate, always use result beans for my read-only queries or would I work with the full entity beans?
From a performance perspective I would definitely use the result beans. But from a code quality perspective maybe I would not. After all an expression like
filter.append("select new EmployeeResult(employee.employeeId, employee.firstname || ', ' || employee.lastname, employee.gender, employee.dateOfBirth, department.name) ");
isn't exactly going to improve my code quality. Besides having to avoid spelling mistakes, I must also make sure that the number and types of my result bean's constructor parameters match this code. Errors will only be detected at runtime, making every mistake a tedious experience. And after all, I must admit, I'd sometimes probably be just too lazy to type in all those property names, each with its entity prefix.
So in the end I'd probably be using both approaches depending on the size of the entities and the nature of the query. However, I can understand why people shy away from this efficient and sensible feature and rather pay the price with high memory consumption and poor performance.
With Empire-db it's so much easier. Here you would definitely transform and select the columns as you need them in your result set, since the cost of doing so is so much lower. First you browse all columns for your selection, even including transformation functions, comfortably using your IDE's code completion. Then you can store them either manually or automatically in a JavaBean using either the constructor or setters. And most important: you can usually do all this completely string-free with 100% compile-time safety. Here's an example:
To build the following SQL (Oracle syntax):
SELECT t2.EMPLOYEE_ID, t2.LASTNAME || ', ' || t2.FIRSTNAME AS NAME, t1.NAME AS DEPARTMENT
FROM (DEPARTMENTS t1 INNER JOIN EMPLOYEES t2 ON t2.DEPARTMENT_ID = t1.DEPARTMENT_ID)
WHERE upper(t2.LASTNAME) LIKE upper('Foo%') AND t2.RETIRED=0
ORDER BY t2.LASTNAME, t2.FIRSTNAME
With Empire-db you write:
SampleDB db = getDatabase();
// Declare shortcuts (not necessary but convenient)
SampleDB.Employees EMP = db.EMPLOYEES;
SampleDB.Departments DEP = db.DEPARTMENTS;
// Create a command object
DBCommand cmd = db.createCommand();
// Select columns
cmd.select(EMP.EMPLOYEE_ID);
cmd.select(EMP.LASTNAME.append(", ").append(EMP.FIRSTNAME).as("NAME"));
cmd.select(DEP.NAME.as("DEPARTMENT"));
// Join tables
cmd.join(DEP.DEPARTMENT_ID, EMP.DEPARTMENT_ID);
// Set constraints
cmd.where(EMP.LASTNAME.likeUpper("Foo%"));
cmd.where(EMP.RETIRED.is(false));
// Set order
cmd.orderBy(EMP.LASTNAME);
cmd.orderBy(EMP.FIRSTNAME);
Our conclusion
Hibernate is in fact one of the most advanced traditional ORM solutions available. The problem with ORM, however, is that it’s designed primarily to work with full entities. Relational databases on the other hand offer powerful capabilities for combining, filtering and transforming entities and their attributes. In order to retain this flexibility Hibernate offers various features to bridge that gap. But in order to utilize these features properly you’ve got to make lots of decisions: XML or annotations, HQL or criteria API, lazy or eager fetching and so on. What Hibernate does internally, and especially when it actually performs a database operation, is not always transparent (if you set logging to debug level, you can grasp some of its complexity). Hibernate probably does a good job if used and configured properly, but it requires a lot of caution and it’s a long way to get there. Especially if you’re not yet familiar with Hibernate, the learning curve is steep.
The one thing that Hibernate lacks most is support for compile-time safety. With both HQL and criteria API you need to provide property names or even entire SQL fragments as string literals – making each data model change a risk – which can only be addressed through extensive and expensive testing.
Empire-db, in contrast, addresses compile-time safety by offering a type-safe API based on a Java object model database definition. Whenever you change this model description your Java compiler will tell you exactly which lines of code are affected by that change. This dramatically improves code quality and thus reduces the amount and expense of testing. As a side effect your coding productivity increases, as your IDE will allow you to browse all tables, columns and even SQL functions when building dynamic queries.
Empire-db is not an ORM solution as you know it. Its focus is clearly on modelling the way relational databases work in Java and not vice versa. Empire-db is passive and does not interfere with your connection and transaction handling, which makes it easy to integrate and means it requires zero configuration. It does not automatically resolve object references, but since you select the data exactly as you need it, there is rarely demand for this. Still, if you need it, you may simply add a getter with a few simple lines of code. In this case, we believe, less is sometimes more.
We recommend that if you are not very familiar with SQL and all you need is to store away and reload your POJOs, Hibernate or another JPA implementation is probably the better choice. But if you want to get the most out of SQL and you want to keep full control over which statements are executed and when, with all the additional benefits of metadata access and compile-time safety, then you really should give Empire-db a go.
Note: If you feel that any of the criticism we made about Hibernate is without reason please let us know. E-mail: