Saturday, January 26, 2013

@SequenceGenerator revealed

The following trio of annotations is popular in JPA applications:
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "customer_seq")
@SequenceGenerator(name = "customer_seq", sequenceName = "CUSTOMER_SEQ")
These annotations tell the JPA provider that the field is a primary key and that its value should be generated automatically in a specific way. In this example we use a sequence-based ID generator. The corresponding database sequence is created as follows (I used the H2 database and Hibernate 4.1.9 as my JPA provider; other JPA implementations may behave slightly differently):
create sequence CUSTOMER_SEQ start with 1 increment by 50
The sequence starts at 1, and calling:
@PersistenceContext(unitName = "my-unit")
private EntityManager em;
...
em.createNativeQuery("SELECT NEXT VALUE FOR CUSTOMER_SEQ").getResultList();
will increment the sequence value by 50.
The first question is: why do the values 1 and 50 appear in the sequence definition? The answer is simple: they are the defaults of the @SequenceGenerator annotation (initialValue = 1, allocationSize = 50). We can change them easily:
@SequenceGenerator(name = "customer_seq", sequenceName = "CUSTOMER_SEQ", initialValue = 777, 
allocationSize = 100)
But why on earth?! Can't we simply increment the sequence one by one? The answer is: "Yes, we can". But it is not the optimal way. The JPA provider must set the value of the @Id annotated field, so it asks the database for the next sequence value. Every database hit is potentially an expensive operation, and the 'allocationSize' parameter is used to reduce the number of those hits.
Let's persist 30 instances of Customer with two different @SequenceGenerator settings:
1). allocationSize = 1
2). allocationSize = 10
In the first case we have 30 invocations of:
Hibernate: 
     call next value for CUSTOMER_SEQ
In the second case there are only 3 such invocations.

The mechanism works as follows:
1). The JPA provider needs a value for the ID, so it hits the database.
2). Depending on the 'allocationSize' value, the JPA provider now holds a pool of sequence values. They can be used without hitting the database again.
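The two steps above can be modelled in a few lines of plain Java. This is only a sketch under simplifying assumptions: FakeSequence and PooledIdGenerator are hypothetical names invented for illustration, not Hibernate classes, and the optimizer that Hibernate really uses may differ in detail. The model assumes that each sequence call returns the first value of a block of 'allocationSize' IDs.

```java
// Simplified model of sequence pooling; not Hibernate's actual implementation.
class SequencePoolDemo {

    /** Stands in for the database sequence; every call counts as one database hit. */
    static class FakeSequence {
        private long next;
        private final int incrementBy;
        int hits = 0;

        FakeSequence(long startWith, int incrementBy) {
            this.next = startWith;
            this.incrementBy = incrementBy;
        }

        long nextValue() {
            hits++;
            long value = next;
            next += incrementBy;
            return value;
        }
    }

    /** Hands out IDs from a local pool and calls the sequence only when the pool is empty. */
    static class PooledIdGenerator {
        private final FakeSequence sequence;
        private final int allocationSize;
        private long nextId;
        private int remaining = 0;

        PooledIdGenerator(FakeSequence sequence, int allocationSize) {
            this.sequence = sequence;
            this.allocationSize = allocationSize;
        }

        synchronized long generate() {
            if (remaining == 0) {
                // Pool exhausted: reserve a new block with one database hit.
                nextId = sequence.nextValue();
                remaining = allocationSize;
            }
            remaining--;
            return nextId++;
        }
    }

    /** Generates IDs for the given number of entities and reports the sequence calls used. */
    static int databaseHits(int entities, int allocationSize) {
        FakeSequence sequence = new FakeSequence(1, allocationSize);
        PooledIdGenerator generator = new PooledIdGenerator(sequence, allocationSize);
        for (int i = 0; i < entities; i++) {
            generator.generate();
        }
        return sequence.hits;
    }

    public static void main(String[] args) {
        System.out.println("allocationSize = 1:  " + databaseHits(30, 1) + " sequence calls");
        System.out.println("allocationSize = 10: " + databaseHits(30, 10) + " sequence calls");
    }
}
```

Under these assumptions the model needs 30 sequence calls for allocationSize = 1 and 3 for allocationSize = 10, which matches the Hibernate log counts reported above.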
Is it safe? Does it guarantee the uniqueness of the primary keys? The answer is: YES.

The pool of sequence values is held at the EntityManagerFactory (SessionFactory) level. Let's assume that we have two database transactions. Each of them is associated with a different EntityManager (Session), but both EntityManagers come from the same EntityManagerFactory. The following code, placed in a stateless session bean called CustomerDao, reproduces the situation:
        
 @PersistenceContext(unitName = "my-unit")
 private EntityManager em;

 @Resource
 private SessionContext ctx;
 
 @Resource
 private TransactionSynchronizationRegistry txRegistry;

 @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
 public void persist1() {
  log.info("[persist1] start");
  log.info(em.unwrap(Session.class).getSessionFactory());
  log.info(txRegistry.getTransactionKey());
  
  for (int i = 0; i < 2; i++) {
   em.persist(new Customer());
  }
  
  CustomerDao thisDao = ctx.getBusinessObject(CustomerDao.class);
  thisDao.persist2();
  
  List<Customer> customers = em.createQuery("from Customer c order by c.id", Customer.class).getResultList();
  for (Customer c : customers) {
   log.info(c.getId());
  }
  
  log.info("[persist1] end");
 }
 
 @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
 public void persist2() {
  log.info("[persist2] start");
  log.info(em.unwrap(Session.class).getSessionFactory());
  log.info(txRegistry.getTransactionKey());

  for (int i = 0; i < 2; i++) {
   em.persist(new Customer());
  }

  log.info("[persist2] end");
 }

The output:
[persist1] start
org.hibernate.internal.SessionFactoryImpl@343a20a7
0:ffffc0a80186:-7c39d3fb:5103f425:8

[persist2] start
org.hibernate.internal.SessionFactoryImpl@343a20a7
0:ffffc0a80186:-7c39d3fb:5103f425:d
16:20:20,942 INFO  [control.CustomerDao] (http-localhost-127.0.0.1-8080-1)
[persist2] end
 1
 2
 3
 4
[persist1] end  
Although two different transactions are used, Hibernate assigns consecutive IDs without any gap, because both EntityManagers draw from the same pool.
Now let's try to use two separate EntityManagerFactories. I have defined a second persistence unit in persistence.xml (the same configuration as the first one, but with a different name) and injected EntityManagers for both units:
 @PersistenceContext(unitName = "my-unit")
 private EntityManager em;
 
 @PersistenceContext(unitName = "my-second-unit")
 private EntityManager em2;
 
 @Resource
 private SessionContext ctx;
 
 @Resource
 private TransactionSynchronizationRegistry txRegistry;
 
 @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
 public void persist1() {
  log.info("[persist1] start");
  log.info(em.unwrap(Session.class).getSessionFactory());
  log.info(txRegistry.getTransactionKey());
  
  for (int i = 0; i < 2; i++) {
   em.persist(new Customer());
  }
  
  CustomerDao thisDao = ctx.getBusinessObject(CustomerDao.class);
  thisDao.persist2();
  
  List<Customer> customers = em.createQuery("from Customer c order by c.id", Customer.class).getResultList();
  for (Customer c : customers) {
   log.info(c.getId());
  }
  
  log.info("[persist1] end");
 }
 
 @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
 public void persist2() {
  log.info("[persist2] start");
  log.info(em2.unwrap(Session.class).getSessionFactory());
  log.info(txRegistry.getTransactionKey());

  for (int i = 0; i < 2; i++) {
   em2.persist(new Customer());
  }

  log.info("[persist2] end");
 }
The output:
[persist1] start
org.hibernate.internal.SessionFactoryImpl@2a0ed6a2
0:ffffc0a80186:-7c39d3fb:5103f425:37

[persist2] start
org.hibernate.internal.SessionFactoryImpl@363b8734
0:ffffc0a80186:-7c39d3fb:5103f425:3c
[persist2] end
 1
 2
 11
 12
[persist1] end
Now the values have a gap. The same happens when two different applications talk to the same database, or when the application server crashes or restarts and the EntityManagerFactory is instantiated again. Even though Hibernate keeps a pool of sequence values, there is no chance that the values will be duplicated, because each EntityManagerFactory reserves its pool with its own call to the database sequence.
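Both experiments can be replayed with a small plain-Java model. Again, this is a sketch under assumptions rather than Hibernate code: FakeSequence and IdPool are hypothetical names, one IdPool stands in for the pool of one EntityManagerFactory, and allocationSize = 10 is assumed because it is consistent with the output above.

```java
// Simplified model: one IdPool per EntityManagerFactory, one shared database sequence.
class SharedSequenceDemo {

    /** Stands in for the database sequence (start with 1, increment by allocationSize). */
    static class FakeSequence {
        private long next = 1;
        private final int incrementBy;

        FakeSequence(int incrementBy) {
            this.incrementBy = incrementBy;
        }

        synchronized long nextValue() {
            long value = next;
            next += incrementBy;
            return value;
        }
    }

    /** Pool of IDs reserved by a single sequence call. */
    static class IdPool {
        private final FakeSequence sequence;
        private final int allocationSize;
        private long nextId;
        private int remaining = 0;

        IdPool(FakeSequence sequence, int allocationSize) {
            this.sequence = sequence;
            this.allocationSize = allocationSize;
        }

        synchronized long next() {
            if (remaining == 0) {
                // Reserve a fresh block; no other pool can receive the same block.
                nextId = sequence.nextValue();
                remaining = allocationSize;
            }
            remaining--;
            return nextId++;
        }
    }

    /** Two transactions whose EntityManagers share one factory, hence one pool. */
    static long[] sharedFactory() {
        FakeSequence sequence = new FakeSequence(10);
        IdPool pool = new IdPool(sequence, 10);
        return new long[] { pool.next(), pool.next(), pool.next(), pool.next() };
    }

    /** Two factories, each with its own pool, on the same database sequence. */
    static long[] separateFactories() {
        FakeSequence sequence = new FakeSequence(10);
        IdPool first = new IdPool(sequence, 10);
        IdPool second = new IdPool(sequence, 10);
        return new long[] { first.next(), first.next(), second.next(), second.next() };
    }

    public static void main(String[] args) {
        System.out.println("shared factory:     " + java.util.Arrays.toString(sharedFactory()));
        System.out.println("separate factories: " + java.util.Arrays.toString(separateFactories()));
    }
}
```

In this model the shared pool yields 1, 2, 3, 4, while the two separate pools yield 1, 2, 11, 12. The unused values 3 to 10 stay reserved for the first pool, which is where the gap comes from, and no value can be handed out twice because the sequence never returns the same block start again.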
