Optimizing NiFi with Concurrent Tasks

Ben Yaakobi
Apr 23, 2020

Optimizing any application can be a tough mission. It depends on a variety of factors: memory, CPU, the application’s own resource consumption, and many others.

In NiFi, various optimizations can be made. In this article, I will review how to optimize NiFi and the difficulties behind it.

As you may know, there’s an option in NiFi to adjust the Maximum Timer Driven Thread Count as we’ve already seen in my previous article.
But this time, we’ll review how to adjust it correctly.

So, what if we have a rather simple task that simply copies data from a Kafka topic to another Kafka topic and filters it according to timestamp and a certain field in the message?

Well, currently, our Maximum Timer Driven Thread Count is set to 25. My flow is the following task, duplicated 6 times, handling a lot of messages.

Maximum Timer Driven Thread Count set to 25

As you can see, this flow is pretty busy: it processes messages pretty slowly, and the lag of the destination Kafka topic behind the source Kafka topic is pretty high.
However, a quick look at the CPU stats shows that the CPU is not that busy.

So basically we can configure more concurrent tasks. Let’s increase the Maximum Timer Driven Thread Count to 100 (and the Concurrent Tasks of the processors accordingly).

Maximum Timer Driven Thread Count set to 100

We can also take a look at the CPU consumption and see that it has increased.

So, if we increase the Maximum Timer Driven Thread Count even further, we should get even better results, right? Well, let’s try it:

Maximum Timer Driven Thread Count set to 600

The performance is pretty much the same. So why didn’t it have any effect? Well, it’s a matter of I/O and CPU limits. NiFi is I/O intensive, so when we push more work through NiFi, the disk can only serve so many reads and writes at once. The other limit is the CPU. Even though we configured so many threads, the CPU has far fewer cores than threads, so it performs a lot of context switches, which can reduce performance.

So a thread count of 100 is pretty much optimal here, but we wouldn’t use that many threads unless the flow were this low on resource usage. Actually, each processor’s documentation includes a “System Resource Considerations” section. What does it mean? Well, let’s take a look at SplitJson as an example.
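To make the plateau concrete, here is a toy model of it. The per-thread rate and the I/O ceiling are made-up numbers, not measurements from my cluster, and the model ignores context-switch overhead entirely; it only shows why adding threads stops helping once a shared limit is reached.

```python
def estimated_throughput(threads: int, per_thread_rate: int, io_ceiling: int) -> int:
    """Toy model: throughput grows with the thread count until a shared
    limit (for example disk I/O) caps it."""
    return min(threads * per_thread_rate, io_ceiling)


# Hypothetical numbers, chosen only for illustration.
PER_THREAD_RATE = 100  # messages per second a single thread could handle
IO_CEILING = 8_000     # messages per second the disks can sustain

results = {
    threads: estimated_throughput(threads, PER_THREAD_RATE, IO_CEILING)
    for threads in (25, 100, 600)
}

for threads, rate in results.items():
    print(f"{threads:>3} threads -> {rate} msg/s")
```

In this sketch, going from 25 to 100 threads helps, while going from 100 to 600 changes nothing, because the ceiling was already reached.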

“The entirety of the FlowFile’s content…is read into memory…”. System Resource Considerations of SplitJson

Its only consideration is memory: it loads the whole FlowFile into memory. So if we were to configure it with 20 concurrent tasks and a queue full of 500 MB FlowFiles, we would hit the memory limit and NiFi would throw exceptions all over the place.

So adjusting the Maximum Timer Driven Thread Count is not just a matter of the system resources, but also a matter of which flows and processors are running in our environment. Of course, every flow should be made as light on resources as possible. For example, instead of SplitJson, we can use SplitRecord, which doesn’t load the whole FlowFile into memory (in general, as a convention, it’s better to work with records).
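The arithmetic behind that example is easy to check. The sketch below assumes every concurrent task holds exactly one whole FlowFile in memory at the same time; the 8 GB heap is a hypothetical value, and parsed JSON usually takes more room than the raw bytes, so treat the result as a lower bound.

```python
MB = 1024 * 1024
GB = 1024 * MB


def min_content_in_memory(concurrent_tasks: int, flowfile_bytes: int) -> int:
    """Bytes of FlowFile content held in memory if every concurrent task
    loads one whole FlowFile at the same moment."""
    return concurrent_tasks * flowfile_bytes


needed = min_content_in_memory(concurrent_tasks=20, flowfile_bytes=500 * MB)
heap = 8 * GB  # hypothetical JVM heap size, not taken from my environment
fits = needed <= heap

print(f"content in memory: {needed / GB:.2f} GB")
print(f"fits in an {heap // GB} GB heap: {fits}")
```

With these numbers the content alone is close to 10 GB, which does not fit in the assumed heap.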

Another factor we need to take into consideration is the Concurrent Tasks of each processor. How do we decide how many Concurrent Tasks each of the processors will have? Well, as before, we need to review its System Resource Considerations and decide accordingly. We also need to remember that all processors share the same thread pool: if the Maximum Timer Driven Thread Count is set to 10 and one processor is set to 9 Concurrent Tasks, that processor can take 9 of the 10 threads, leaving a single thread for all the other processors, which will have to wait.
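The shared pool can be pictured with a small greedy model. This is not how NiFi's scheduler actually hands out threads; it is only a sketch of the worst case, and the processor names are hypothetical.

```python
def allocate_threads(pool_size: int, requested: dict[str, int]) -> dict[str, int]:
    """Greedy toy model: give each processor what it asks for, in order,
    until the shared pool runs out of threads."""
    remaining = pool_size
    granted = {}
    for name, wanted in requested.items():
        granted[name] = min(wanted, remaining)
        remaining -= granted[name]
    return granted


# Maximum Timer Driven Thread Count of 10; processor "A" asks for 9 tasks.
granted = allocate_threads(10, {"A": 9, "B": 4, "C": 4})

for name, threads in granted.items():
    print(f"{name}: {threads} thread(s)")
```

In this model "A" gets its 9 threads, "B" gets the one thread that is left, and "C" gets nothing until a thread is released.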

So we need to take into consideration the Maximum Timer Driven Thread Count, as well as all the other flows and the processors in the current flow. Of course, the more FlowFiles pile up in front of a processor, the more Concurrent Tasks it needs. For example, the Decode processor in my flow above has the most Concurrent Tasks in the flow, since it is the bottleneck.

In conclusion, a variety of factors need to be taken into consideration: the system resources, the number of processors, the type of processors, the resource consumption of the processors, the size of the transferred FlowFiles, and many more! It requires a lot of fine-tuning until you reach a fully optimized environment.
