JVM heap size for the TaskManagers, which are the parallel workers of the system. On YARN setups, this value is automatically configured to the size of the TaskManager's YARN container, minus a certain tolerance value.

The number of parallel operator or user function instances that a single TaskManager can run. If this value is larger than 1, a single TaskManager takes multiple instances of a function or operator. That way, the TaskManager can utilize multiple CPU cores, but at the same time, the available memory is divided between the different operator or function instances. This value is typically proportional to the number of physical CPU cores that the TaskManager's machine has (e.g., equal to the number of cores, or half the number of cores).

The state backend to be used to store and checkpoint state.

The default directory used for storing the data files and metadata of checkpoints in a Flink-supported filesystem. The storage path must be accessible from all participating processes/nodes.

The savepoint location is used by the state backends that write savepoints to file systems (MemoryStateBackend, FsStateBackend, RocksDBStateBackend).

Defines the high-availability mode used for cluster execution. To enable high availability, set this mode to "ZOOKEEPER" or specify the FQN of a factory class.

File system path (URI) where Flink persists metadata in high-availability setups.

Turns on SSL for internal network communication. Optionally, specific components may override this through their own settings (rpc, data transport, REST, etc).

Turns on SSL for external communication via the REST endpoints.

Note: The following keys are deprecated, and it is recommended to configure the Hadoop path with the environment variable HADOOP_CONF_DIR instead. These parameters configure the default HDFS used by Flink. Setups that do not specify an HDFS configuration have to specify the full path to HDFS files (hdfs://address:port/path/to/files). Files will also be written with default HDFS parameters (block size, replication factor).

fs.hdfs.hadoopconf: The absolute path to the Hadoop File System's (HDFS) configuration directory (OPTIONAL VALUE). Specifying this value allows programs to reference HDFS files using short URIs (hdfs:///path/to/files, without including the address and port of the NameNode in the file URI). Without this option, HDFS files can be accessed, but require fully qualified URIs like hdfs://address:port/path/to/files. This option also causes file writers to pick up the HDFS's default values for block sizes and replication factors.
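
To make the difference between the short form (hdfs:///path/to/files) and the fully qualified form (hdfs://address:port/path/to/files) concrete, here is a minimal Python sketch that fills in a NameNode address when a URI lacks one. It is only an illustration of the two URI shapes: Flink and Hadoop resolve short URIs from the Hadoop configuration, not through a helper like this. The function name `qualify_hdfs_uri`, the host `namenode.example.com`, and the port `8020` are hypothetical placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def qualify_hdfs_uri(uri: str, namenode_host: str, namenode_port: int) -> str:
    """Return a fully qualified hdfs:// URI (hypothetical helper).

    A short URI such as hdfs:///path/to/files has an empty authority,
    so the given NameNode host and port are inserted. A URI that already
    names an address and port is returned unchanged.
    """
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"not an HDFS URI: {uri!r}")
    if parts.netloc:
        # Already fully qualified: hdfs://address:port/path/to/files
        return uri
    authority = f"{namenode_host}:{namenode_port}"
    return urlunsplit(("hdfs", authority, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Placeholder NameNode address, assumed for illustration only.
    print(qualify_hdfs_uri("hdfs:///path/to/files", "namenode.example.com", 8020))
```

With an HDFS configuration in place, programs can use the short form directly; without one, every path has to be written in the fully qualified form.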