Package org.apache.hadoop.tools
Enum Class DistCpOptionSwitch
All Implemented Interfaces:
Serializable, Comparable<DistCpOptionSwitch>, Constable
Enumeration mapping configuration keys to DistCp command-line options.
Nested Class Summary
Nested classes/interfaces inherited from class java.lang.Enum
Enum.EnumDesc<E extends Enum<E>>
Enum Constant Summary
Enum Constants
Enum Constant    Description
ATOMIC_COMMIT    Copy all the source files and commit them atomically to the target. This is typically useful in cases where there is a process polling for availability of a file/dir.
BANDWIDTH    Specify bandwidth per map in MB; accepts bandwidth as a fraction.
BLOCKING    Whether DistCp execution should be blocking.
COPY_BUFFER_SIZE    Configurable copy buffer size.
COPY_STRATEGY    Copy strategy to use.
DELETE_MISSING    Deletes files in the target that are missing from the source.
DIRECT_WRITE    Write directly to the final location, avoiding the creation and rename of temporary files.
FILTERS    Path containing a list of strings which, when found in the path of a file to be copied, exclude that file from the copy job.
IGNORE_FAILURES    Ignores any failures during the copy and continues with the rest.
LOG_PATH    Log path where DistCp output logs are written.
MAX_MAPS    Max number of maps to use during copy.
NUM_LISTSTATUS_THREADS    Number of threads for building the source file listing (before the map-reduce phase, max one listStatus per thread at a time).
OVERWRITE    Overwrite target files unconditionally.
PRESERVE_STATUS    Preserves status of file/path in the target.
SKIP_CRC    Skip CRC checks between source and target when determining what files need to be copied.
SOURCE_FILE_LISTING    Source file listing can be provided to DistCp in a file.
SYNC_FOLDERS    Update target location by copying only files that are missing in the target.
TRACK_MISSING    Track files in the target that are missing from the source. This allows other applications to complete the synchronization, possibly with object-store-specific delete algorithms.
VERBOSE_LOG    Log additional info (path, size) in the SKIP/COPY log.
WORK_PATH    Work path to be used only in conjunction with atomic commit.
Field Summary
Fields
Method Summary
Modifier and Type    Method    Description
static void    addToConf(Configuration conf, DistCpOptionSwitch option)    Helper function to set an option in the Hadoop configuration object.
static void    addToConf(Configuration conf, DistCpOptionSwitch option, String value)    Helper function to add an option to the Hadoop configuration object.
String    getConfigLabel()    Get the configuration label for the option.
org.apache.commons.cli.Option    getOption()    Get the CLI Option corresponding to the DistCp option.
String    getSwitch()    Get the switch symbol.
String    toString()
static DistCpOptionSwitch    valueOf(String name)    Returns the enum constant of this class with the specified name.
static DistCpOptionSwitch[]    values()    Returns an array containing the constants of this enum class, in the order they are declared.
Enum Constant Details
IGNORE_FAILURES
Ignores any failures during the copy and continues with the rest. Logs failures in a file.
PRESERVE_STATUS
Preserves status of file/path in the target. The default behavior with -p is to preserve replication, block size, user, group, permission, checksum type and timestamps on the target file. Note that when preserving checksum type, block size is also preserved. If any of the optional switches among rbugpcaxt are present, then only the corresponding file attributes are preserved.
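The switch handling described for PRESERVE_STATUS can be sketched roughly as follows. This is a stand-alone illustration and not Hadoop code: the `Attr` enum and the `parsePreserve` helper are hypothetical names, and the letter-to-attribute mapping is an assumption inferred from the attribute list in the description, with `a` and `x` assumed to stand for ACLs and extended attributes.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the file attributes named in the PRESERVE_STATUS description.
// The symbol assigned to each attribute is an assumption, not taken from this page.
enum Attr {
    REPLICATION('r'), BLOCK_SIZE('b'), USER('u'), GROUP('g'), PERMISSION('p'),
    CHECKSUM_TYPE('c'), ACL('a'), XATTR('x'), TIMESTAMPS('t');

    final char symbol;

    Attr(char symbol) {
        this.symbol = symbol;
    }
}

class PreserveFlags {
    // Attributes preserved by a bare -p, as listed in the description.
    static final Set<Attr> DEFAULTS = EnumSet.of(
        Attr.REPLICATION, Attr.BLOCK_SIZE, Attr.USER, Attr.GROUP,
        Attr.PERMISSION, Attr.CHECKSUM_TYPE, Attr.TIMESTAMPS);

    // Parses the letters that follow -p, for example "ug" from -pug.
    static Set<Attr> parsePreserve(String letters) {
        if (letters.isEmpty()) {
            return EnumSet.copyOf(DEFAULTS);
        }
        EnumSet<Attr> result = EnumSet.noneOf(Attr.class);
        for (char c : letters.toCharArray()) {
            Attr match = null;
            for (Attr a : Attr.values()) {
                if (a.symbol == c) {
                    match = a;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("Unknown preserve flag: " + c);
            }
            result.add(match);
        }
        // The description notes that preserving checksum type also preserves block size.
        if (result.contains(Attr.CHECKSUM_TYPE)) {
            result.add(Attr.BLOCK_SIZE);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parsePreserve(""));
        System.out.println(parsePreserve("ug"));
        System.out.println(parsePreserve("c"));
    }
}
```

Using an `EnumSet` keeps the result independent of the order in which the letters were given.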
SYNC_FOLDERS
Update target location by copying only files that are missing in the target. This can be used to periodically sync two folders across source and target. Typically used with DELETE_MISSING. Incompatible with ATOMIC_COMMIT.
DELETE_MISSING
Deletes files in the target that are missing from the source. This allows the target to be in sync with the source contents. Typically used in conjunction with SYNC_FOLDERS. Incompatible with ATOMIC_COMMIT.
TRACK_MISSING
Track files in the target that are missing from the source. This allows other applications to complete the synchronization, possibly with object-store-specific delete algorithms. Typically used in conjunction with SYNC_FOLDERS. Incompatible with ATOMIC_COMMIT.
NUM_LISTSTATUS_THREADS
Number of threads for building the source file listing (before the map-reduce phase, max one listStatus per thread at a time).
MAX_MAPS
Max number of maps to use during copy. DistCp will split work as equally as possible among these maps.
SOURCE_FILE_LISTING
Source file listing can be provided to DistCp in a file. This allows DistCp to copy an arbitrary list of files from the source to the target.
ATOMIC_COMMIT
Copy all the source files and commit them atomically to the target. This is typically useful in cases where there is a process polling for availability of a file/dir. This option is incompatible with SYNC_FOLDERS and DELETE_MISSING.
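The incompatibilities stated for ATOMIC_COMMIT, SYNC_FOLDERS, DELETE_MISSING and TRACK_MISSING could be checked along the following lines. This is a minimal stand-alone sketch: `SwitchName` and `SwitchValidator` are hypothetical names, and nothing here claims to reproduce how DistCp itself validates its options.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in holding only the constants involved in the rule, plus one unrelated switch.
enum SwitchName { ATOMIC_COMMIT, SYNC_FOLDERS, DELETE_MISSING, TRACK_MISSING, OVERWRITE }

class SwitchValidator {
    // Switches that the constant descriptions mark as incompatible with ATOMIC_COMMIT.
    static final Set<SwitchName> CONFLICTS_WITH_ATOMIC = EnumSet.of(
        SwitchName.SYNC_FOLDERS, SwitchName.DELETE_MISSING, SwitchName.TRACK_MISSING);

    // Throws if the selection combines ATOMIC_COMMIT with a conflicting switch.
    static void validate(Set<SwitchName> selected) {
        if (!selected.contains(SwitchName.ATOMIC_COMMIT)) {
            return;
        }
        for (SwitchName other : CONFLICTS_WITH_ATOMIC) {
            if (selected.contains(other)) {
                throw new IllegalArgumentException(
                    "ATOMIC_COMMIT cannot be combined with " + other);
            }
        }
    }

    // Convenience wrapper that reports validity instead of throwing.
    static boolean isValid(Set<SwitchName> selected) {
        try {
            validate(selected);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(EnumSet.of(SwitchName.ATOMIC_COMMIT, SwitchName.OVERWRITE)));
        System.out.println(isValid(EnumSet.of(SwitchName.ATOMIC_COMMIT, SwitchName.SYNC_FOLDERS)));
    }
}
```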
WORK_PATH
Work path to be used only in conjunction with atomic commit (ATOMIC_COMMIT).
LOG_PATH
Log path where DistCp output logs are written.
VERBOSE_LOG
Log additional info (path, size) in the SKIP/COPY log.
COPY_STRATEGY
Copy strategy to use. This could be dynamic, uniform size, etc. DistCp uses an appropriate input format based on this.
SKIP_CRC
Skip CRC checks between source and target when determining what files need to be copied.
OVERWRITE
Overwrite target files unconditionally.
APPEND
DIFF
RDIFF
BLOCKING
Whether DistCp execution should be blocking.
FILE_LIMIT
SIZE_LIMIT
BLOCKS_PER_CHUNK
COPY_BUFFER_SIZE
Configurable copy buffer size.
BANDWIDTH
Specify bandwidth per map in MB; accepts bandwidth as a fraction.
FILTERS
Path containing a list of strings which, when found in the path of a file to be copied, exclude that file from the copy job.
DIRECT_WRITE
Write directly to the final location, avoiding the creation and rename of temporary files. This is typically useful in cases where the target filesystem implementation does not support atomic rename operations, such as with the S3AFileSystem, which translates file renames to potentially very expensive copy-then-delete operations.
USE_ITERATOR
UPDATE_ROOT
Field Details
PRESERVE_STATUS_DEFAULT
Method Details
values
Returns an array containing the constants of this enum class, in the order they are declared.
Returns:
an array containing the constants of this enum class, in the order they are declared
valueOf
Returns the enum constant of this class with the specified name. The string must match exactly an identifier used to declare an enum constant in this class. (Extraneous whitespace characters are not permitted.)
Parameters:
name - the name of the enum constant to be returned.
Returns:
the enum constant with the specified name
Throws:
IllegalArgumentException - if this enum class has no constant with the specified name
NullPointerException - if the argument is null
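The exact-match rule for valueOf can be illustrated with a small stand-in enum. `DemoSwitch` and the `lookup` helper are hypothetical names used only for this sketch; the same behavior applies to any Java enum, including DistCpOptionSwitch.

```java
// Hypothetical stand-in enum; the real class is org.apache.hadoop.tools.DistCpOptionSwitch.
enum DemoSwitch { IGNORE_FAILURES, MAX_MAPS, OVERWRITE }

class ValueOfDemo {
    // Returns the matching constant, or null when the name does not match exactly.
    // A null argument is not caught here, so the NullPointerException propagates.
    static DemoSwitch lookup(String name) {
        try {
            return DemoSwitch.valueOf(name);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("MAX_MAPS"));   // exact identifier
        System.out.println(lookup("max_maps"));   // wrong case, no match
        System.out.println(lookup(" MAX_MAPS"));  // leading whitespace, no match
        // values() lists the constants in declaration order.
        for (DemoSwitch s : DemoSwitch.values()) {
            System.out.println(s.ordinal() + " " + s);
        }
    }
}
```

Callers that accept user input may want to trim and upper-case the string before calling valueOf, since the method itself performs no normalization.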
getConfigLabel
Get the configuration label for the option.
Returns:
configuration label name
getOption
public org.apache.commons.cli.Option getOption()
Get the CLI Option corresponding to the DistCp option.
Returns:
option
getSwitch
Get the switch symbol.
Returns:
switch symbol char
toString
Overrides:
toString in class Enum<DistCpOptionSwitch>
addToConf
Helper function to add an option to the Hadoop configuration object.
Parameters:
conf - Configuration object to include the option
option - Option to add
value - Value
addToConf
Helper function to set an option in the Hadoop configuration object.
Parameters:
conf - Configuration object to include the option
option - Option to add
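The pattern behind getConfigLabel, getSwitch and the two addToConf overloads can be sketched without Hadoop on the classpath. Everything in this block is a hypothetical miniature: `MiniSwitch` stands in for the enum, a plain `Map` stands in for the Configuration object, the key strings and switch letters are illustrative placeholders, and the two-argument form is assumed to record the option as "true".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of the pattern: each constant pairs a configuration key with a CLI switch.
// Key strings and switch letters are placeholders, not values confirmed by this page.
enum MiniSwitch {
    IGNORE_FAILURES("demo.ignore.failures", "i"),
    MAX_MAPS("demo.max.maps", "m");

    private final String configLabel;
    private final String switchSymbol;

    MiniSwitch(String configLabel, String switchSymbol) {
        this.configLabel = configLabel;
        this.switchSymbol = switchSymbol;
    }

    String getConfigLabel() {
        return configLabel;
    }

    String getSwitch() {
        return switchSymbol;
    }

    // Two-argument form: assumed here to mark a flag-style option as enabled.
    static void addToConf(Map<String, String> conf, MiniSwitch option) {
        conf.put(option.getConfigLabel(), "true");
    }

    // Three-argument form: stores an explicit value under the option's key.
    static void addToConf(Map<String, String> conf, MiniSwitch option, String value) {
        conf.put(option.getConfigLabel(), value);
    }
}

class MiniConfDemo {
    // Builds a small configuration using both overloads.
    static Map<String, String> build() {
        Map<String, String> conf = new HashMap<>();
        MiniSwitch.addToConf(conf, MiniSwitch.IGNORE_FAILURES);
        MiniSwitch.addToConf(conf, MiniSwitch.MAX_MAPS, "20");
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = build();
        for (MiniSwitch s : MiniSwitch.values()) {
            System.out.println("-" + s.getSwitch() + " -> "
                + s.getConfigLabel() + " = " + conf.get(s.getConfigLabel()));
        }
    }
}
```

Keeping the key and the switch on the same constant means a command-line parser and the job configuration cannot drift apart, which is presumably the motivation for the enumeration.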