Adding Processors

Processors are the basic building blocks of a data flow. Each processor performs its own function and contributes to the final output of the flow.

The data flow in this example retrieves files from one directory using the GetFile processor and stores them in another directory using the PutFile processor.
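As a rough analogy in plain Python (this is not NiFi code), the flow amounts to moving files from one directory to another. The function name `move_files` and the directory names are hypothetical, and the sketch assumes the source file is removed once it has been picked up, which is how GetFile behaves unless its Keep Source File property is enabled.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def move_files(input_dir: Path, output_dir: Path) -> List[str]:
    """Move every regular file from input_dir to output_dir.

    Returns the names of the files that were moved, in sorted order.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(input_dir.iterdir()):
        if path.is_file():  # subdirectories are left alone in this sketch
            shutil.move(str(path), str(output_dir / path.name))
            moved.append(path.name)
    return moved


if __name__ == "__main__":
    # Hypothetical directories created only for this demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "input"
        target = Path(tmp) / "output"
        source.mkdir()
        (source / "report.txt").write_text("hello")
        print(move_files(source, target))
```

In NiFi the two halves of this function are separate processors joined by a connection, which is what lets each side be configured and scheduled on its own.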

Let's start creating the data flow by adding a Processor to the canvas. To do this, drag the Processor icon from the top left of the screen to the center of the canvas (the area with a graph-paper background) and drop it there. This opens a dialog in which we can select the processor we want to add.

GetFile

GetFile is used to fetch files of a specific format from a particular directory. Other options give finer control over how files are picked up.

  • GetFile Settings

The following are the different settings of the GetFile processor:

  • Name

In the Name setting, the user can give the processor any name, for example one that follows the project's conventions or simply describes what the processor does.

  • Enabled

The user can enable or disable the processor using this setting.

  • Penalty Duration

When processing of a FlowFile fails, this setting specifies how long that FlowFile is penalized, that is, held back before it is processed again.

  • Yield Duration

This setting specifies how long the processor yields. During this period, the processor is not scheduled to run.

  • Bulletin Level

This setting sets the lowest log level at which the processor generates bulletins.

  • Automatically Terminate Relationships

This setting lists all relationships available for the processor. By checking the box next to a relationship, the user tells the processor to terminate FlowFiles routed to that relationship instead of sending them further downstream.
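The difference between these settings can be easier to see in a toy model. The sketch below is plain Python, not NiFi code: `ToyProcessor` and all of its methods are hypothetical names, time is passed in as a number of seconds, and the default durations are chosen only for illustration. It assumes that a penalty holds back a single FlowFile, that a yield pauses the whole processor, and that an auto-terminated relationship drops the FlowFile instead of queuing it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToyProcessor:
    """Hypothetical model of a processor's Settings tab; not a NiFi class."""

    penalty_seconds: float = 30.0  # Penalty Duration
    yield_seconds: float = 1.0     # Yield Duration
    auto_terminate: Set[str] = field(default_factory=set)
    queues: Dict[str, List[str]] = field(default_factory=dict)
    penalized_until: Dict[str, float] = field(default_factory=dict)
    yielded_until: float = 0.0

    def penalize(self, flowfile: str, now: float) -> None:
        """Hold one FlowFile back until its penalty expires."""
        self.penalized_until[flowfile] = now + self.penalty_seconds

    def is_available(self, flowfile: str, now: float) -> bool:
        """A FlowFile can be processed once its penalty has expired."""
        return now >= self.penalized_until.get(flowfile, 0.0)

    def yield_now(self, now: float) -> None:
        """Pause the whole processor for the yield duration."""
        self.yielded_until = now + self.yield_seconds

    def can_run(self, now: float) -> bool:
        """The processor may be scheduled again once the yield has elapsed."""
        return now >= self.yielded_until

    def route(self, flowfile: str, relationship: str) -> str:
        """Queue a FlowFile on a relationship, or drop it if auto-terminated."""
        if relationship in self.auto_terminate:
            return "terminated"
        self.queues.setdefault(relationship, []).append(flowfile)
        return "queued"
```

For example, `ToyProcessor(auto_terminate={"failure"})` queues a FlowFile routed to `success` and drops one routed to `failure`, while other FlowFiles stay unaffected by a penalty placed on a single FlowFile.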

  • GetFile Scheduling

The GetFile processor offers the following scheduling options:

  • Schedule Strategy

You can run the processor at regular intervals by selecting the Timer driven option, or according to a specific CRON expression by selecting the CRON driven option.

  • Concurrent Tasks

This option is used to specify the number of concurrently running tasks for this processor.

  • Execution

Using this option, the user can specify whether the processor runs on all nodes or only on the primary node.

  • Run Schedule

For the Timer driven strategy, this is the interval between runs. For the CRON driven strategy, it is the CRON expression.
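To make the two kinds of Run Schedule value concrete, here is a small plain-Python sketch (not NiFi code). All function names are hypothetical. It assumes a timer-driven value looks like `5 sec` and supports only a handful of units, it ignores how long each run takes, and it assumes CRON-driven values use the Quartz-style layout with six fields plus an optional year, such as `0 0 13 * * ?` for 1:00 PM every day. The CRON check only counts fields; it does not validate them.

```python
import re
from typing import List

# Units accepted by this sketch only; NiFi itself accepts more spellings.
_UNIT_SECONDS = {"ms": 0.001, "sec": 1.0, "min": 60.0, "hr": 3600.0}


def parse_run_schedule(text: str) -> float:
    """Convert a timer-driven Run Schedule such as '5 sec' to seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*(ms|sec|min|hr)\s*", text)
    if match is None:
        raise ValueError(f"unsupported run schedule: {text!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


def timer_driven_runs(start: float, run_schedule: str, count: int) -> List[float]:
    """Return the first `count` start times, spaced by the run schedule."""
    interval = parse_run_schedule(run_schedule)
    return [start + i * interval for i in range(count)]


def has_cron_field_count(expression: str) -> bool:
    """Check only that a CRON string has six or seven space-separated fields."""
    return len(expression.split()) in (6, 7)
```

With these helpers, `timer_driven_runs(0.0, "10 sec", 3)` yields start times ten seconds apart, and `has_cron_field_count("0 0 13 * * ?")` accepts the six-field expression.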

  • GetFile Properties

GetFile offers several properties, ranging from mandatory ones such as Input Directory and File Filter to optional ones such as Path Filter and Maximum File Size. The user can control the file retrieval process using these properties.
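The effect of File Filter and Maximum File Size can be sketched in plain Python (this is not NiFi code). The function name `select_files` and its parameters are hypothetical. The sketch assumes the filter is a regular expression that must match the whole file name, and that the default pattern skips names beginning with a dot.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Optional


def select_files(
    input_dir: Path,
    file_filter: str = r"[^\.].*",
    max_size_bytes: Optional[int] = None,
) -> List[str]:
    """Return the names of files in input_dir that pass both checks."""
    pattern = re.compile(file_filter)
    selected = []
    for path in sorted(input_dir.iterdir()):
        if not path.is_file():
            continue  # this sketch does not recurse into subdirectories
        if not pattern.fullmatch(path.name):
            continue  # name rejected by the File Filter pattern
        if max_size_bytes is not None and path.stat().st_size > max_size_bytes:
            continue  # larger than the Maximum File Size
        selected.append(path.name)
    return selected


if __name__ == "__main__":
    # Hypothetical files created only for this demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "data.csv").write_text("a,b\n")
        (folder / "notes.txt").write_text("skip me")
        print(select_files(folder, file_filter=r".*\.csv"))
```

Leaving `max_size_bytes` as `None` mirrors an optional property that has not been set, so no size limit is applied.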

  • GetFile Comments

This section is used to specify any information about the processor.

PutFile

The PutFile processor saves files from the data flow to a specific location.

  • PutFile Settings

The PutFile processor has the following settings:

  • Name

In the Name setting, the user can give the processor any name, for example one that follows the project's conventions or simply describes what the processor does.

  • Enabled

The user can enable or disable the processor using this setting.

  • Penalty Duration

When processing of a FlowFile fails, this setting specifies how long that FlowFile is penalized, that is, held back before it is processed again.

  • Yield Duration

This setting specifies how long the processor yields. During this period, the processor is not scheduled to run.

  • Bulletin Level

This setting sets the lowest log level at which the processor generates bulletins.

  • Automatically Terminate Relationships

This setting lists all relationships available for the processor. By checking the box next to a relationship, the user tells the processor to terminate FlowFiles routed to that relationship instead of sending them further downstream.

  • PutFile Scheduling

The PutFile processor offers the following scheduling options:

  • Schedule Strategy

You can run the processor at regular intervals by selecting the Timer driven option, or according to a specific CRON expression by selecting the CRON driven option.

  • Concurrent Tasks

This option is used to specify the number of concurrently running tasks for this processor.

  • Execution

Using this option, the user can specify whether the processor runs on all nodes or only on the primary node.

  • Run Schedule

For the Timer driven strategy, this is the interval between runs. For the CRON driven strategy, it is the CRON expression.

  • PutFile Properties

The PutFile processor provides properties such as Directory, which specifies the output directory for transferred files, along with other properties that control how the files are written.
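One of those other properties is the Conflict Resolution Strategy, which decides what happens when a file with the same name already exists in the output directory. The plain-Python sketch below (not NiFi code) illustrates the idea. The function name `put_file` and its parameters are hypothetical, and the sketch assumes three strategies, `fail`, `ignore` and `replace`, with the result reported as the name of the relationship the FlowFile would be routed to. It also assumes missing directories are created.

```python
import tempfile
from pathlib import Path


def put_file(content: bytes, filename: str, directory: Path,
             conflict: str = "fail") -> str:
    """Write content to directory/filename and return 'success' or 'failure'."""
    if conflict not in ("fail", "ignore", "replace"):
        raise ValueError(f"unknown conflict strategy: {conflict!r}")
    directory.mkdir(parents=True, exist_ok=True)  # create missing directories
    target = directory / filename
    if target.exists():
        if conflict == "fail":
            return "failure"   # existing file is kept, transfer is rejected
        if conflict == "ignore":
            return "success"   # existing file is kept, nothing is written
        # conflict == "replace": fall through and overwrite the file
    target.write_bytes(content)
    return "success"


if __name__ == "__main__":
    # Hypothetical output directory created only for this demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "output"
        print(put_file(b"first", "report.txt", out))
        print(put_file(b"second", "report.txt", out, conflict="replace"))
```

Returning a relationship name rather than raising an error keeps the sketch close to how a flow reacts: the downstream connection, not the processor, decides what happens to a failed transfer.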

  • PutFile Comments

This section is used to specify any information about the processor.

© 2021 by Bigbox, Inc. All rights reserved. Last modified: December 27, 2021