Apache NiFi User Guide

Deon
4 min read · Dec 2, 2017

Target Audience:
Readers who do not yet know how to use Apache NiFi or its terminology.
It is assumed that you have already read through the NiFi overview.

Goal:
A quick guide to using NiFi

Agenda:
Terminology
NiFi architecture
NiFi features
How to operate?
How to debug?
How to design?
What’s more?
What are learning steps?

Terminology
Example NiFi Web UI

These NiFi terms will come up again and again.
NiFi calls its unit of data a FlowFile. The concept is similar to a parcel in mail delivery: the parcel is routed by the label on top of it, not by unpacking it to find out how to handle it. In other words, NiFi stays data agnostic while routing.
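
To make the parcel analogy concrete, here is a minimal Python sketch. It is not NiFi code: the `FlowFile` class, the `route` function, and the handler names are made up for illustration. Only the attribute names `filename` and `mime.type` are borrowed from NiFi. The point is that the routing decision looks at the attributes (the label) and never opens the content.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a NiFi FlowFile: attributes (the label) plus opaque content."""
    attributes: dict = field(default_factory=dict)
    content: bytes = b""


def route(flowfile: FlowFile) -> str:
    """Pick a destination from the attributes only; the content is never read."""
    if flowfile.attributes.get("mime.type") == "application/json":
        return "json-handler"      # hypothetical destination name
    return "default-handler"       # hypothetical destination name


parcel = FlowFile(
    attributes={"filename": "order.json", "mime.type": "application/json"},
    content=b'{"id": 1}',
)
print(route(parcel))  # json-handler
```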

NiFi Architecture:
The Flow Controller represents the currently loaded data flow.
The other three repositories (FlowFile, Content, and Provenance) act as NiFi's storage for FlowFile state, content, and history.

Data flow example :

NiFi Features:
Data provenance: records the history of each data package (FlowFile).
You can even download it or view it online.

How to operate:
In the NiFi web canvas (the only way to access NiFi other than REST requests), drag in some processors, then define their relationships and configuration.

Connection:
It holds FlowFiles between processors and handles prioritization and back pressure.
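
What a connection does can be pictured with a small Python sketch. This is a toy model, not how NiFi implements it: the `Connection` class and its method names are invented here, and the rule "lowest `priority` attribute goes first" is an assumption chosen for the example. The queue refuses new FlowFiles once it reaches its threshold, which is the essence of back pressure.

```python
import heapq
import itertools


class Connection:
    """Toy queue between two processors with a priority order and a back-pressure limit."""

    def __init__(self, back_pressure_threshold: int):
        self.back_pressure_threshold = back_pressure_threshold
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps first-in-first-out order

    def is_full(self) -> bool:
        return len(self._heap) >= self.back_pressure_threshold

    def offer(self, flowfile: dict) -> bool:
        """Queue a FlowFile; return False when back pressure tells the upstream to wait."""
        if self.is_full():
            return False
        priority = int(flowfile.get("priority", 100))  # assumed default priority
        heapq.heappush(self._heap, (priority, next(self._order), flowfile))
        return True

    def poll(self):
        """Hand the highest-priority FlowFile downstream, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


conn = Connection(back_pressure_threshold=2)
accepted = [
    conn.offer({"name": "a", "priority": 5}),
    conn.offer({"name": "b", "priority": 1}),
    conn.offer({"name": "c", "priority": 3}),  # rejected: the queue is already full
]
first = conn.poll()
print(accepted, first["name"])  # [True, True, False] b
```

Once the downstream processor has polled a FlowFile, the queue drops below the threshold and the upstream processor can offer again.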

So a simple dataflow example:
GenerateFlowFile: generates FlowFiles so the data flow has something to work on.
UpdateAttribute: adds attributes (tags) to a FlowFile (package).
LogAttribute: writes all attributes to the log.
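
The three processors above can be imitated in a few lines of Python to show how a FlowFile moves through them. This is only a sketch of the idea: the function names mirror the processor names, but the dictionaries, the `owner` attribute, and the log format are assumptions for the example and not what NiFi actually writes.

```python
import uuid


def generate_flowfile(content: bytes = b"hello") -> dict:
    """Like GenerateFlowFile: create a new FlowFile to start the flow."""
    return {"attributes": {"uuid": str(uuid.uuid4())}, "content": content}


def update_attribute(flowfile: dict, **new_attributes: str) -> dict:
    """Like UpdateAttribute: add or overwrite attributes, leaving the content untouched."""
    flowfile["attributes"].update(new_attributes)
    return flowfile


def log_attribute(flowfile: dict) -> list:
    """Like LogAttribute: emit every attribute as one log line."""
    lines = [f"{key} = {value}" for key, value in sorted(flowfile["attributes"].items())]
    for line in lines:
        print(line)
    return lines


flowfile = generate_flowfile()
flowfile = update_attribute(flowfile, owner="demo")  # "owner" is a made-up attribute
logged = log_attribute(flowfile)
```

In NiFi itself you would build the same chain by dragging the three processors onto the canvas and connecting them in this order.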

Controller Service: a reusable resource shared across the whole NiFi instance, such as a connection pool.
Expression Language: the most important skill in NiFi data flow design.
It enables dynamic processing based on the attributes of a FlowFile.
Templates: deploy a data flow to other NiFi instances. A template stores a data flow in XML format. (NiFi itself also continually saves the flow state as a checkpoint.)
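
To give a feel for the Expression Language mentioned above, here is a toy Python evaluator. It only handles plain `${attribute}` references and turns unknown names into an empty string, both simplifications chosen for this sketch. The real Expression Language also supports functions such as `${filename:toUpper()}`, which this example does not attempt. The `evaluate` function and the sample attributes are invented for illustration.

```python
import re

# Matches ${name}, where name is made of letters, digits, dots, dashes, or underscores.
_REFERENCE = re.compile(r"\$\{([A-Za-z0-9_.\-]+)\}")


def evaluate(expression: str, attributes: dict) -> str:
    """Replace each ${name} with the attribute value; unknown names become empty."""
    return _REFERENCE.sub(lambda match: attributes.get(match.group(1), ""), expression)


attributes = {"filename": "report.csv", "owner": "demo"}
print(evaluate("/archive/${owner}/${filename}", attributes))  # /archive/demo/report.csv
```

The same property value therefore produces a different result for every FlowFile, which is what makes attribute-driven processing dynamic.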

How to debug:

NiFi data flow design skills:
NiFi itself resembles flow-based programming, although at a much higher and more abstract level. Some basic programming skills are still very helpful, such as good naming, reusable…

Comment:
If the challenge of learning HBase rates 8 out of 10,
then learning NiFi is at most 4 out of 10.
So understanding the concepts of Apache NiFi matters much more than memorizing many operations.
