About Apache NiFi & The FlowFile
When people hear that Apache NiFi is I/O intensive, they usually assume that every processor a FlowFile passes through reads its content from disk and writes it back. Actually, it’s more complicated than that.
A FlowFile in NiFi is more than just a file on the disk. A FlowFile is made of two parts, its attributes and its content, and NiFi persists them in two separate stores: the FlowFile Repository and the Content Repository.
The FlowFile Repository
The FlowFile Repository only holds the FlowFile’s metadata (mainly its attributes and which queue currently holds it). Not so complicated, is it? Well, it is. FlowFiles are held in a HashMap in memory, which makes them much faster to transfer and work with. So how come the JVM memory is not exceeded? Well, NiFi has a special way of dealing with this: swapping. Not the OS swapping, but a mechanism of its own. When the number of FlowFiles in a queue exceeds the configuration property “nifi.queue.swap.threshold”, NiFi writes the lowest-priority FlowFiles in that queue to a swap file on disk, in batches of 10,000. The queue is then in charge of determining when to swap those FlowFiles back into memory.
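To make the idea concrete, here is a minimal sketch of threshold-based swapping, assuming a queue of FlowFile ids kept roughly in priority order. None of these class or method names are NiFi’s real API; the threshold value mirrors the property’s documented default and the batch size is the one mentioned above:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of threshold-based swapping; not NiFi's actual implementation.
class SwappableQueue {
    private static final int SWAP_THRESHOLD = 20_000;  // nifi.queue.swap.threshold default
    private static final int SWAP_BATCH_SIZE = 10_000; // FlowFiles written per swap file

    // FlowFile ids, highest priority at the head, lowest at the tail.
    private final Deque<String> inMemory = new ArrayDeque<>();

    void enqueue(String flowFileId) {
        inMemory.addLast(flowFileId);
        if (inMemory.size() > SWAP_THRESHOLD) {
            swapOutLowestPriority();
        }
    }

    private void swapOutLowestPriority() {
        // Move the lowest-priority FlowFiles out of the JVM heap in one batch.
        for (int i = 0; i < SWAP_BATCH_SIZE && !inMemory.isEmpty(); i++) {
            writeToSwapFile(inMemory.pollLast());
        }
    }

    private void writeToSwapFile(String flowFileId) {
        // Placeholder: serialize the FlowFile's metadata to a swap file on disk.
    }
}
```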
Not only that, but every change to a FlowFile is recorded in a Write-Ahead Log first, so the repository can recover its state after a system failure or some other surprise.
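The core idea of a write-ahead log fits in a few lines: persist the change first, update the in-memory state second, so that replaying the log after a crash rebuilds the map. This is only the concept, with hypothetical names, not NiFi’s WAL implementation:

```java
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Concept-only write-ahead log: the edit hits the log before it hits memory.
class WriteAheadSketch {
    private final Map<String, String> attributes = new HashMap<>();
    private final FileWriter log;

    WriteAheadSketch(String logPath) throws IOException {
        this.log = new FileWriter(logPath, true); // append-only log file
    }

    void updateAttribute(String key, String value) throws IOException {
        log.write("UPDATE " + key + "=" + value + "\n");
        log.flush(); // a production WAL would also fsync here
        attributes.put(key, value); // only now touch the in-memory state
    }
}
```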
Additionally, the FlowFile holds a reference to its content in the Content Repository. But what does that mean? Well, let’s move on to the next part to find out.
The Content Repository
The Content Repository is quite an amazing thing.
There can be several content repositories, each one referred to as a “container”. Each container is divided into partition folders called “sections”, and each section holds FlowFiles’ content. The content of several different FlowFiles can live together in a single file, all in the name of higher throughput. To keep track of it all, NiFi uses two different Java objects: the Resource Claim and the Content Claim.
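On disk, that layout might look something like this (the names here are made up for illustration):

```
content_repository/        <- a container
├── 0/                     <- a section
│   ├── 1517426-42         <- one file holding the content of several FlowFiles
│   └── 1517427-87
├── 1/
│   └── 1517430-11
└── ...
```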
The Resource Claim holds a reference to the file (the one built from several FlowFiles’ content). The reference consists of the filename, the section, and the container the file sits in.
The Content Claim holds a reference to a Resource Claim, the offset at which its content starts, and the content’s length (to know where it ends). A FlowFile holds a Content Claim reference, and that is how it knows where its content is.
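A simplified sketch of those two objects could look like this; the fields approximate the idea, not NiFi’s real classes:

```java
// Simplified sketch of the two claim objects; not NiFi's real API.
class ResourceClaim {
    final String container; // which content repository
    final String section;   // partition folder inside the container
    final String filename;  // the file that holds several FlowFiles' content

    ResourceClaim(String container, String section, String filename) {
        this.container = container;
        this.section = section;
        this.filename = filename;
    }
}

class ContentClaim {
    final ResourceClaim resourceClaim; // the shared file
    final long offset;                 // where this FlowFile's content starts
    final long length;                 // how many bytes belong to it

    ContentClaim(ResourceClaim resourceClaim, long offset, long length) {
        this.resourceClaim = resourceClaim;
        this.offset = offset;
        this.length = length;
    }
}
```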
But if multiple writes go into the same resource claim, how can we be sure an existing content claim’s bytes never change underneath us? We can, because NiFi uses the Copy-On-Write paradigm, making the content immutable. Content is only read when it’s needed, and when a modification is made, the result is written into a new content claim. That way, if the modification fails, we can fall back to the old content. So whenever a FlowFile’s content is modified, the modified version becomes a new content claim, and once the write is done, the FlowFile’s reference is switched to the new claim. It also means that even when we duplicate a FlowFile, we don’t have to duplicate its content.
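Here is a toy illustration of that pointer swap, again with made-up names. The point is that a modification never mutates the old claim, and a clone shares the claim instead of copying bytes:

```java
import java.util.HashMap;
import java.util.Map;

// Toy copy-on-write demo; none of these names are NiFi's real API.
class CopyOnWriteDemo {
    record ContentClaim(long id, byte[] bytes) {}

    private final Map<String, ContentClaim> claimByFlowFile = new HashMap<>();
    private long nextClaimId = 0;

    // A modification writes a brand new claim, then swaps the reference.
    // The old claim is untouched, so a failed write can fall back to it.
    void modifyContent(String flowFileId, byte[] newContent) {
        ContentClaim newClaim = new ContentClaim(nextClaimId++, newContent);
        claimByFlowFile.put(flowFileId, newClaim);
    }

    // Duplicating a FlowFile only shares the claim reference; no bytes move.
    void cloneFlowFile(String sourceId, String cloneId) {
        claimByFlowFile.put(cloneId, claimByFlowFile.get(sourceId));
    }
}
```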
So this shared-file layout and the Copy-On-Write paradigm seem flawless, right? Well, not quite.
Say you create a flow in which you generate a lot of small files and keep merging them until the final file reaches a certain size. Shouldn’t be a problem, right?
It depends. The maximum size appendable to a resource claim is, by default, 1 MB. So what happens if I generate lots and lots of files much smaller than 1 MB and keep merging them until I reach a certain size? I end up with lots of resource claims containing content claims that are no longer referenced, while other content claims in the same resource claims are still referenced. Since a resource claim’s file can only be reclaimed once none of its content claims are referenced, the result is a file system flood: the content repository takes much more space than the queued FlowFiles. In my case, the queued FlowFiles took 30GB while the content repository consumed 105GB of my disk space. Pretty bad, right? Well, as discussed in NIFI-3376, this was addressed by exposing the property “nifi.content.claim.max.appendable.size”. Decreasing it lowers the chance of a file system flood, but it also slows down the I/O throughput. Decreasing “nifi.content.claim.max.flow.files” (which controls how many content claims one resource claim can contain) would also work, since it lowers the chance that long-gone content claims and still-referenced content claims end up packed together.
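Both knobs live in nifi.properties. The values below are only an example of tightening them for a small-file-heavy flow, not recommended defaults; measure your own throughput before and after:

```
# nifi.properties - example values only, tune for your own workload
nifi.content.claim.max.appendable.size=50 KB
nifi.content.claim.max.flow.files=10
```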
So what have we learned from all of this? The repositories’ mechanism is pretty awesome, and the more we know, the better we can tune our clusters!