Facing a process/integration performance issue? The solution might be in your head: just put your “memory” to work.

It has become increasingly common to see companies struggle to improve the performance of their integration systems. They approach the problem in many different ways and, even when they seem to have run out of clear options, they keep trying. They have probably already tried the most important alternatives, such as server configuration, network and database tuning, and even re-architecting and redesigning the entire system. But the results remain unsatisfactory.

In this article, I am going to describe a very promising solution for improving the performance of an integration system without going into too much technical detail. Let’s get started with a relaxing story about a busy restaurant in a big city. Later on we will get into the technical concepts behind this analogy.

Imagine a 5-floor building where the 1st and 5th floors belong to a restaurant. The kitchen and main lobby of the restaurant are on the 1st floor, but the storage room where the ingredients are kept is on the 5th floor. During working hours, this restaurant serves more than 1 million a la carte orders per month. Every time a customer places an order on the 1st floor, the kitchen helpers fetch all the necessary ingredients from the storage room on the 5th floor so that the meals can be prepared and served on the 1st floor. This makes delivery take much longer than necessary; some customers even left the restaurant without having their lunch, and a few of them took the time to file a complaint. A very simple but brilliant idea was proposed by the new chef, who used his intelligence and his “memory”. He suggested the following: let’s keep the most recently used ingredients in the 3 free rooms beside the kitchen on the 1st floor of the building.

The idea was implemented and food is now delivered in a much shorter time. When demand is higher, the restaurant simply uses one more available room on the 1st floor, and the ingredients are rearranged across the rooms so that they are easy to find. With this new way of working, the restaurant performed much better and received far fewer complaints.

The following picture shows the detailed flow of the restaurant.

  1. Customers place the order in the restaurant.
  2. Cooks look ingredients up in the storage rooms on the 1st floor.
  3. If the necessary ingredients are not stored in the rooms on the 1st floor, kitchen helpers go up to the storage room on the 5th floor and bring the ingredients to the cooks on the 1st floor. They take just the amount needed to prepare the food that the customer ordered.
  4. Cooks make the food.
  5. Kitchen helpers move the rest of the most recently used ingredients from storage on the 5th floor to the storage rooms on the 1st floor. This avoids wasting time on the next food preparation.
  6. Food is delivered to the customer.

What, then, is the conclusion of this story? The rooms on the same floor as the kitchen and main lobby brought much better results. They increased the performance and value of the business by scaling the restaurant’s processes to cope with the increased number of customers.

This is also the concept behind in-memory and caching tools like TIBCO ActiveSpaces. Using an in-memory data grid avoids frequent extraction of data from databases, minimizing the overhead and optimizing processing time.

The relation between our restaurant story and an application layer/process that uses an in-memory data grid is explained in the following table. The middle column shows the relation between the first and third columns.

Restaurant Story                                    Relation between Restaurant Story and IT Context   IT Context
Restaurant/Main lobby                               Receive the order/request                          Process/Interface
Storage, 5th floor                                  Used to store contents                             Database
Storage, 1st floor, 3 rooms expandable to 9 rooms   Used to store contents                             In-memory data grid
Client/Customer                                     Make the request                                   Client program
Ingredient                                          Contents are manipulated                           Data
Food/Lunch                                          Expected product/Results                           Results/Response

 

The following picture shows the flow in the IT context.

  1. The client application creates a request through the interface exposed by the application/integration servers.
  2. The application looks the data up in the in-memory data grid.
  3. If the necessary data is not stored in memory yet, the application looks it up in the database.
  4. The data is processed.
  5. A process copies the most recently used data to the in-memory data grid. This avoids wasting time on the next data processing.
  6. The results/response are returned to the client.

Now that we understand the principle, let’s continue with a more technical discussion. The following definition was taken from a TIBCO Solution Brief.

“TIBCO ActiveSpaces® is a distributed in-memory data grid that provides very fast data access and update. While its performance will vary depending on multiple factors, it is not uncommon for ActiveSpaces to be 100x faster than corresponding database implementations. For this and other reasons, ActiveSpaces is part of the TIBCO Fast Data solution for lifting the burden of big data, reducing reliance on costly transactional systems, and building highly scalable, fault-tolerant applications.”

So, to put it in my own words: TIBCO ActiveSpaces is a cluster of computers that work together, sharing their resources (CPU and memory), to scale and handle large data volumes in a distributed manner. Nodes (rooms) can be added or dropped dynamically to scale the processing power, with no disruption of the service when scaling up or down.
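
To give a feeling for how data can be spread over rooms that come and go, here is a small sketch of consistent hashing, one common general technique for this kind of distribution. It is shown purely as an illustration and is not meant as a description of ActiveSpaces internals; the `Ring` class, the room names and the key names are all hypothetical.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy consistent-hash ring; each node gets several virtual points."""

    def __init__(self, points_per_node=100):
        self.points_per_node = points_per_node
        self._positions = []   # sorted ring positions
        self._owner = {}       # ring position -> node name

    def add_node(self, node):
        for i in range(self.points_per_node):
            pos = _hash(f"{node}#{i}")
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        """Return the node owning `key`: the first point clockwise from its hash."""
        idx = bisect.bisect(self._positions, _hash(key)) % len(self._positions)
        return self._owner[self._positions[idx]]


ring = Ring()
for room in ("room-1", "room-2", "room-3"):
    ring.add_node(room)

keys = [f"ingredient-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("room-4")  # demand is higher: open one more room
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all of them to room-4")
```

When the fourth room is opened, only part of the ingredients has to be rearranged, and every ingredient that moves goes to the new room. The other rooms keep serving what they already hold, which is why the restaurant does not have to close while it scales.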

Given this definition, it is easy to see that processing done directly in memory, with one or more computers pooling their resources and without the overhead of accessing remote back-end storage such as a database or files, will perform much better. Roughly speaking, any application that uses external information from databases or file systems might be improved by using TIBCO ActiveSpaces.

Having explained the main concept and goal of TIBCO ActiveSpaces, some important questions come up, which I will try to answer as simply as possible without going into detail.

  • What are the infrastructure requirements to be able to support TIBCO ActiveSpaces?
  • Does my current application need to be redesigned?
  • What kind of maintenance and monitoring do I have to implement?
  • What professional skills are required?

1. Infrastructure

The infrastructure depends on the non-functional requirements, and it is not possible to come up with an ideal infrastructure set-up without first answering questions about the following:

  • current and forecasted load,
  • expected performance and availability,
  • level of backup,
  • level of security,
  • type of persistence,
  • scalability,
  • maintainability,
  • network set-up,
  • monitoring

Roughly speaking, you need as much CPU processing power and free memory as your application demands.

Furthermore, TIBCO ActiveSpaces must be installed on every computer that is to join the data grid cluster. Computers can take part in the cluster as either passive or active members. A passive member doesn’t share its resources (CPU, memory), so its resource requirements are much lower.

It may be desirable to persist the data held in TIBCO ActiveSpaces, so the type of persistence is something to think about. Data can be persisted to disk or to a database. It is also important to bear in mind that, although TIBCO ActiveSpaces can deal with terabytes of data in memory, the legacy database is not meant to retire and rest in peace.
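
As a rough illustration of what persisting to disk means for a cache, the sketch below writes every update through to a JSON file and reloads it on start-up. It does not show how ActiveSpaces implements persistence; the `PersistentCache` class, the file name and the sample data are hypothetical and only demonstrate the general idea.

```python
import json
import os
import tempfile


class PersistentCache:
    """Dictionary-backed cache that writes every update through to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        # Reload previously persisted entries, if any, at start-up.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                self.data = json.load(fh)

    def put(self, key, value):
        self.data[key] = value
        self._flush()

    def get(self, key, default=None):
        return self.data.get(key, default)

    def _flush(self):
        # Write to a temporary file first, then swap it in, so a crash
        # mid-write does not leave a half-written file behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(self.data, fh)
        os.replace(tmp, self.path)


path = os.path.join(tempfile.mkdtemp(), "cache.json")
cache = PersistentCache(path)
cache.put("tomato", 120)

# Simulate a restart: a new instance recovers the data from disk.
restarted = PersistentCache(path)
print(restarted.get("tomato"))  # prints: 120
```

Writing on every update is the simplest option and also the slowest one. Whether to persist on every write or less often is one of the trade-offs hidden behind the “type of persistence” question.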

The infrastructure to run TIBCO ActiveSpaces must be well equipped with sufficient CPUs, memory and network.

2. Redesign of the Implementation

For any IT solution, the architecture to be chosen depends on the company’s needs and current situation.

In my experience, TIBCO ActiveSpaces is mostly used by applications as a fast cache, either for both reading and writing data or for reading only. These caches can hold the most varied types of data, such as lookup data, process correlation keys, cross-reference data, routing tables, inventory, real-time offers, data to be transformed and so on.

In addition, TIBCO ActiveSpaces can also handle data in transit during rush hours, when the number of transactions in a given period is high.

TIBCO ActiveSpaces works as a decoupled component and can easily be plugged into your current infrastructure, although some changes to your application are required for that. Most of the time these changes are really small: you connect your application to the TIBCO ActiveSpaces cluster and make it a member. This can be done directly within your application, with practically no code added in TIBCO Business Works and without having to write code in Java, C++, .Net or TIBCO Business Events.

To begin with, no changes are required to the interface contracts with external applications.

It is very important to normalize the data, since it is no longer in a relational database. Doing the right work at the beginning will avoid concerns later.

Don’t worry about duplicating the data, but always keep a single version of the truth.

3. Monitoring

Monitoring is always an important subject, to be taken into consideration from the very beginning of your system development life cycle (SDLC), when you define and design your solution. This is called Monitoring Driven Development (MDD). TIBCO ActiveSpaces already comes with a monitoring tool called ASMM (ActiveSpaces Monitoring and Management). TIBCO Hawk is another monitoring solution provided by TIBCO. Although there are many monitoring solutions on the market, most of them work on a technical level. Therefore, I strongly suggest also introducing monitoring of the business functions. This makes the business value of the solution transparent to your key stakeholders, including customers and internal decision makers.
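
To make the difference between technical and business monitoring concrete, here is a small sketch that tracks both side by side. It has nothing to do with the ASMM or TIBCO Hawk interfaces; the `Metrics` class, its field names and the 200 ms target are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    hits: int = 0            # technical: requests served from memory
    misses: int = 0          # technical: requests that went to the database
    orders_total: int = 0    # business: orders handled
    orders_on_time: int = 0  # business: orders answered within the target

    def record(self, served_from_memory, elapsed_ms, target_ms=200):
        if served_from_memory:
            self.hits += 1
        else:
            self.misses += 1
        self.orders_total += 1
        if elapsed_ms <= target_ms:
            self.orders_on_time += 1

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def on_time_ratio(self):
        if not self.orders_total:
            return 0.0
        return self.orders_on_time / self.orders_total


m = Metrics()
m.record(served_from_memory=False, elapsed_ms=450)  # first order: slow path
m.record(served_from_memory=True, elapsed_ms=30)
m.record(served_from_memory=True, elapsed_ms=25)
m.record(served_from_memory=True, elapsed_ms=40)
print(m.hit_ratio(), m.on_time_ratio())  # prints: 0.75 0.75
```

The hit ratio tells the technical team whether the cache is doing its job. The share of orders answered on time is the number a business stakeholder can relate to.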

4. What skills are needed?

The person doing the work should know exactly what they are doing. Although working with TIBCO ActiveSpaces seems easy, and the concepts are not that difficult, it must be used carefully. Extensive knowledge of integration, covering architecture, design and basic programming (in one of the languages or tools mentioned earlier in this article), is recommended. This will make life easier and more comfortable.

Conclusion

The in-memory data grid was explained using the analogy of the chef’s decision to optimize the restaurant’s processes. The relevant issues in applying an in-memory data grid were discussed, such as the required infrastructure, implementation, redesign of applications, monitoring and the expertise needed.
With this better understanding of the concepts, you can make better-informed decisions and reap the benefits of a powerful in-memory data grid.