Class RemoteIterators
This aims to make it straightforward to use lambda-expressions to transform the results of an iterator, without losing the statistics in the process, and to chain the operations together.
The closeable operation will be passed through RemoteIterators which wrap other RemoteIterators. This is to support any iterator which can be closed to release held connections, file handles etc. Unless client code is written to assume that RemoteIterator instances may be closed, this is not likely to be broadly used. It is added to make it possible to adopt this feature in a managed way.
One notable feature is that the foreach(RemoteIterator, ConsumerRaisingIOE) method will log at debug any IOStatistics provided by the iterator, if such statistics are provided. No attempt is made to retrieve or log the statistics if the log is not set to debug, so this is a zero-cost feature unless the logger org.apache.hadoop.util.functional.RemoteIterators is at DEBUG.
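The "zero cost unless at debug" behaviour can be sketched as a level check that guards the retrieval of the statistics. This is a minimal sketch, not the Hadoop implementation: it uses java.util.logging as a stand-in logger (with FINE standing in for debug), and the class GuardedStatsLogging, its methods, and the statistic string are all hypothetical.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

final class GuardedStatsLogging {
  // Stand-in logger for this sketch; FINE plays the role of debug.
  static final Logger LOG = Logger.getLogger("GuardedStatsLogging");

  /** Retrieve and log the statistics only when the logger is at FINE. */
  static void logStatsAtDebug(Supplier<String> statistics) {
    if (LOG.isLoggable(Level.FINE)) {
      // The supplier is only invoked inside the guard.
      LOG.fine("iterator statistics: " + statistics.get());
    }
  }

  /** Set the level, log once, and report how often the statistics were retrieved. */
  static int retrievalsAt(Level level) {
    int[] retrievals = {0};
    LOG.setLevel(level);
    logStatsAtDebug(() -> {
      retrievals[0]++;            // counts each retrieval
      return "files_listed=3";    // hypothetical statistic
    });
    return retrievals[0];
  }
}
```

With the logger at INFO the supplier is never invoked, so nothing is retrieved or formatted. With the logger at FINE it is invoked once.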
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | org.apache.hadoop.util.functional.RemoteIterators.WrappingRemoteIterator<S,T> | Wrapper of another remote iterator; IOStatistics and Closeable methods are passed down if implemented.
-
Method Summary
Modifier and Type | Method | Description
static <T> void | cleanupRemoteIterator(org.apache.hadoop.fs.RemoteIterator<T> source) | Clean up after an iteration.
static <S> org.apache.hadoop.fs.RemoteIterator<S> | closingRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator, Closeable toClose) | This adds an extra close operation alongside the passthrough to any Closeable.close() method supported by the source iterator.
static <S> org.apache.hadoop.fs.RemoteIterator<S> | filteringRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator, org.apache.hadoop.util.functional.FunctionRaisingIOE<? super S, Boolean> filter) | Create a RemoteIterator from a RemoteIterator and a filter function which returns true for every element to be passed through.
static <T> long | foreach(org.apache.hadoop.fs.RemoteIterator<T> source, org.apache.hadoop.util.functional.ConsumerRaisingIOE<? super T> consumer) | Apply an operation to all values of a RemoteIterator.
static <S> org.apache.hadoop.fs.RemoteIterator<S> | haltableRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator, org.apache.hadoop.util.functional.CallableRaisingIOE<Boolean> continueWork) | Wrap an iterator with one which adds a continuation probe.
static <S,T> org.apache.hadoop.fs.RemoteIterator<T> | mappingRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator, org.apache.hadoop.util.functional.FunctionRaisingIOE<? super S, T> mapper) | Create an iterator from an iterator and a transformation function.
static org.apache.hadoop.fs.RemoteIterator<Long> | rangeExcludingIterator(long start, long excludedFinish) | A remote iterator which simply counts up, stopping once the value reaches excludedFinish.
static <T> org.apache.hadoop.fs.RemoteIterator<T> | remoteIteratorFromArray(T[] array) | Create a remote iterator from an array.
static <T> org.apache.hadoop.fs.RemoteIterator<T> | remoteIteratorFromIterable(Iterable<T> iterable) | Create a remote iterator from a java.util.Iterable, e.g. a list or other collection.
static <T> org.apache.hadoop.fs.RemoteIterator<T> | remoteIteratorFromIterator(Iterator<T> iterator) | Create a remote iterator from a java.util.Iterator.
static <T> org.apache.hadoop.fs.RemoteIterator<T> | remoteIteratorFromSingleton(T singleton) | Create an iterator from a singleton.
static <T> T[] | toArray(org.apache.hadoop.fs.RemoteIterator<T> source, T[] a) | Build an array from a RemoteIterator.
static <T> List<T> | toList(org.apache.hadoop.fs.RemoteIterator<T> source) | Build a list from a RemoteIterator.
static <S,T> org.apache.hadoop.fs.RemoteIterator<T> | typeCastingRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator) | Create a RemoteIterator from a RemoteIterator, casting the type in the process.
-
Method Details
-
remoteIteratorFromSingleton
public static <T> org.apache.hadoop.fs.RemoteIterator<T> remoteIteratorFromSingleton(@Nullable T singleton)
Create an iterator from a singleton.
Type Parameters:
T - type
Parameters:
singleton - instance
Returns:
a remote iterator
-
remoteIteratorFromIterator
public static <T> org.apache.hadoop.fs.RemoteIterator<T> remoteIteratorFromIterator(Iterator<T> iterator)
Create a remote iterator from a java.util.Iterator.
Type Parameters:
T - type
Parameters:
iterator - iterator.
Returns:
a remote iterator
-
remoteIteratorFromIterable
public static <T> org.apache.hadoop.fs.RemoteIterator<T> remoteIteratorFromIterable(Iterable<T> iterable)
Create a remote iterator from a java.util.Iterable, e.g. a list or other collection.
Type Parameters:
T - type
Parameters:
iterable - iterable.
Returns:
a remote iterator
-
remoteIteratorFromArray
public static <T> org.apache.hadoop.fs.RemoteIterator<T> remoteIteratorFromArray(T[] array)
Create a remote iterator from an array.
Type Parameters:
T - type
Parameters:
array - array.
Returns:
a remote iterator
-
mappingRemoteIterator
public static <S,T> org.apache.hadoop.fs.RemoteIterator<T> mappingRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator, org.apache.hadoop.util.functional.FunctionRaisingIOE<? super S, T> mapper)
Create an iterator from an iterator and a transformation function.
Type Parameters:
S - source type
T - result type
Parameters:
iterator - source
mapper - transformation
Returns:
a remote iterator
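A mapping wrapper of this kind can be sketched as follows. So that it compiles without Hadoop on the classpath, the sketch declares its own stand-ins: RemoteIter for org.apache.hadoop.fs.RemoteIterator and IOFunction for FunctionRaisingIOE. Those names, the fromIterator helper, and the MappingSketch class are hypothetical. The sketch omits the IOStatistics and Closeable passthrough.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class MappingSketch {
  /** Stand-in for org.apache.hadoop.fs.RemoteIterator. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Stand-in for FunctionRaisingIOE: a function which may raise an IOException. */
  interface IOFunction<S, T> {
    T apply(S source) throws IOException;
  }

  /** Adapt a java.util.Iterator so the sketch has a source to wrap. */
  static <T> RemoteIter<T> fromIterator(Iterator<T> it) {
    return new RemoteIter<T>() {
      public boolean hasNext() { return it.hasNext(); }
      public T next() { return it.next(); }
    };
  }

  /** Wrap the source; each element is transformed lazily in next(). */
  static <S, T> RemoteIter<T> mapping(RemoteIter<S> source,
      IOFunction<? super S, T> mapper) {
    return new RemoteIter<T>() {
      public boolean hasNext() throws IOException { return source.hasNext(); }
      public T next() throws IOException { return mapper.apply(source.next()); }
    };
  }

  /** Map three strings to their lengths and collect the results. */
  static List<Integer> demo() {
    List<Integer> lengths = new ArrayList<>();
    try {
      RemoteIter<Integer> it = mapping(
          fromIterator(List.of("a", "bb", "ccc").iterator()), String::length);
      while (it.hasNext()) {
        lengths.add(it.next());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return lengths;
  }
}
```

Because the mapper is applied inside next(), wrappers of this shape can be stacked without materializing intermediate collections.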
-
typeCastingRemoteIterator
public static <S,T> org.apache.hadoop.fs.RemoteIterator<T> typeCastingRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator)
Create a RemoteIterator from a RemoteIterator, casting the type in the process. This is to help with filesystem API calls where overloading causes confusion (e.g. listStatusIterator()).
Type Parameters:
S - source type
T - result type
Parameters:
iterator - source
Returns:
a remote iterator
-
filteringRemoteIterator
public static <S> org.apache.hadoop.fs.RemoteIterator<S> filteringRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator, org.apache.hadoop.util.functional.FunctionRaisingIOE<? super S, Boolean> filter)
Create a RemoteIterator from a RemoteIterator and a filter function which returns true for every element to be passed through.
Elements are filtered in the hasNext() method; if hasNext() is not called, the filtering is done on demand in the next() call.
Type Parameters:
S - type
Parameters:
iterator - source
filter - filter
Returns:
a remote iterator
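The interplay between hasNext() and next() described above can be sketched with a one-element lookahead. This is a sketch under assumptions, not the Hadoop source: RemoteIter and IOPredicate are hypothetical stand-ins for RemoteIterator and FunctionRaisingIOE<? super S, Boolean>, and the Filtering class and fromIterator helper are invented for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class FilteringSketch {
  /** Stand-in for org.apache.hadoop.fs.RemoteIterator. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Stand-in for a filter function which may raise an IOException. */
  interface IOPredicate<T> {
    boolean test(T value) throws IOException;
  }

  static <T> RemoteIter<T> fromIterator(Iterator<T> it) {
    return new RemoteIter<T>() {
      public boolean hasNext() { return it.hasNext(); }
      public T next() { return it.next(); }
    };
  }

  static final class Filtering<S> implements RemoteIter<S> {
    private final RemoteIter<S> source;
    private final IOPredicate<? super S> filter;
    private S pending;           // accepted element waiting for next()
    private boolean hasPending;  // separate flag, so null elements work

    Filtering(RemoteIter<S> source, IOPredicate<? super S> filter) {
      this.source = source;
      this.filter = filter;
    }

    /** Advance the source until an element passes the filter. */
    private boolean fetch() throws IOException {
      while (!hasPending && source.hasNext()) {
        S candidate = source.next();
        if (filter.test(candidate)) {
          pending = candidate;
          hasPending = true;
        }
      }
      return hasPending;
    }

    public boolean hasNext() throws IOException {
      return fetch();
    }

    public S next() throws IOException {
      if (!fetch()) {            // on-demand filtering if hasNext() was skipped
        throw new NoSuchElementException();
      }
      S result = pending;
      pending = null;
      hasPending = false;
      return result;
    }
  }

  /** Keep the even numbers; the first next() is called without hasNext(). */
  static List<Integer> demo() {
    List<Integer> evens = new ArrayList<>();
    try {
      RemoteIter<Integer> it = new Filtering<Integer>(
          fromIterator(List.of(1, 2, 3, 4, 5).iterator()), n -> n % 2 == 0);
      evens.add(it.next());
      while (it.hasNext()) {
        evens.add(it.next());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return evens;
  }
}
```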
-
closingRemoteIterator
public static <S> org.apache.hadoop.fs.RemoteIterator<S> closingRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator, Closeable toClose)
This adds an extra close operation alongside the passthrough to any Closeable.close() method supported by the source iterator.
Type Parameters:
S - source type.
Parameters:
iterator - source
toClose - extra object to close.
Returns:
a new iterator
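The "extra close alongside the passthrough" idea can be sketched as a wrapper whose close() first closes the source, if the source is Closeable, and then closes the extra object. The ordering, the idempotency flag, and all names here (RemoteIter, Closing, ClosingSketch) are assumptions of this sketch rather than confirmed details of the Hadoop class.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

final class ClosingSketch {
  /** Stand-in for org.apache.hadoop.fs.RemoteIterator. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  static final class Closing<S> implements RemoteIter<S>, Closeable {
    private final RemoteIter<S> source;
    private final Closeable toClose;
    private boolean closed;

    Closing(RemoteIter<S> source, Closeable toClose) {
      this.source = source;
      this.toClose = toClose;
    }

    public boolean hasNext() throws IOException { return source.hasNext(); }

    public S next() throws IOException { return source.next(); }

    @Override
    public void close() throws IOException {
      if (closed) {
        return;                  // only close once
      }
      closed = true;
      try {
        if (source instanceof Closeable) {
          ((Closeable) source).close();   // passthrough to the source
        }
      } finally {
        toClose.close();         // the extra close always runs
      }
    }
  }

  /** Record the order in which the source and the extra object are closed. */
  static List<String> demo() {
    List<String> events = new ArrayList<>();
    final class EmptySource implements RemoteIter<String>, Closeable {
      public boolean hasNext() { return false; }
      public String next() { throw new NoSuchElementException(); }
      public void close() { events.add("source closed"); }
    }
    try (Closing<String> it = new Closing<String>(
        new EmptySource(), () -> events.add("extra closed"))) {
      while (it.hasNext()) {
        it.next();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return events;
  }
}
```

Using try-with-resources on the wrapper releases both the source and the extra resource, even if iteration fails partway through.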
-
haltableRemoteIterator
public static <S> org.apache.hadoop.fs.RemoteIterator<S> haltableRemoteIterator(org.apache.hadoop.fs.RemoteIterator<S> iterator, org.apache.hadoop.util.functional.CallableRaisingIOE<Boolean> continueWork)
Wrap an iterator with one which adds a continuation probe. This allows work to exit fast without complicated breakout logic.
Type Parameters:
S - source type.
Parameters:
iterator - source
continueWork - predicate which will trigger a fast halt if it returns false.
Returns:
a new iterator
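A continuation probe can be sketched as a wrapper whose hasNext() consults the probe before asking the source. RemoteIter and IOCallable are hypothetical stand-ins for RemoteIterator and CallableRaisingIOE<Boolean>. The endless counter source and the halt flag are invented for the demonstration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicBoolean;

final class HaltableSketch {
  /** Stand-in for org.apache.hadoop.fs.RemoteIterator. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Stand-in for CallableRaisingIOE: a callable which may raise an IOException. */
  interface IOCallable<T> {
    T apply() throws IOException;
  }

  /** Iteration ends as soon as the probe returns false. */
  static <S> RemoteIter<S> haltable(RemoteIter<S> source,
      IOCallable<Boolean> continueWork) {
    return new RemoteIter<S>() {
      public boolean hasNext() throws IOException {
        return continueWork.apply() && source.hasNext();
      }
      public S next() throws IOException {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        return source.next();
      }
    };
  }

  /** Consume an endless counter until the flag is cleared after the value 2. */
  static List<Long> demo() {
    List<Long> seen = new ArrayList<>();
    AtomicBoolean keepGoing = new AtomicBoolean(true);
    RemoteIter<Long> counter = new RemoteIter<Long>() {
      private long value = 0;
      public boolean hasNext() { return true; }   // never ends by itself
      public Long next() { return value++; }
    };
    try {
      RemoteIter<Long> it = haltable(counter, keepGoing::get);
      while (it.hasNext()) {
        long value = it.next();
        seen.add(value);
        if (value == 2) {
          keepGoing.set(false);   // e.g. another task has failed
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return seen;
  }
}
```

The consuming loop stays an ordinary while loop; the decision to stop lives in the probe, which other threads can flip.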
-
rangeExcludingIterator
public static org.apache.hadoop.fs.RemoteIterator<Long> rangeExcludingIterator(long start, long excludedFinish)
A remote iterator which simply counts up, stopping once the value reaches excludedFinish. This is primarily for tests or when submitting work into a TaskPool. Equivalent to for (long l = start; l < excludedFinish; l++) yield l;
Parameters:
start - start value
excludedFinish - excluded finish
Returns:
an iterator which returns longs from [start, excludedFinish)
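The half-open range semantics can be sketched as below: the iterator yields start, start + 1, and so on, and never yields excludedFinish itself. RemoteIter is a hypothetical stand-in for org.apache.hadoop.fs.RemoteIterator, and RangeSketch and its methods are illustrative names only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

final class RangeSketch {
  /** Stand-in for org.apache.hadoop.fs.RemoteIterator. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Count up from start (inclusive) to excludedFinish (exclusive). */
  static RemoteIter<Long> rangeExcluding(long start, long excludedFinish) {
    return new RemoteIter<Long>() {
      private long current = start;
      public boolean hasNext() { return current < excludedFinish; }
      public Long next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        return current++;
      }
    };
  }

  /** Collect every value of the range into a list. */
  static List<Long> demo(long start, long excludedFinish) {
    List<Long> values = new ArrayList<>();
    try {
      RemoteIter<Long> it = rangeExcluding(start, excludedFinish);
      while (it.hasNext()) {
        values.add(it.next());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return values;
  }
}
```

In this sketch, demo(2, 5) collects 2, 3 and 4, and a range whose start equals excludedFinish is empty.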
-
toList
public static <T> List<T> toList(org.apache.hadoop.fs.RemoteIterator<T> source) throws IOException
Build a list from a RemoteIterator.
Type Parameters:
T - type
Parameters:
source - source iterator
Returns:
a list of the values.
Throws:
IOException - if the source RemoteIterator raises it.
-
toArray
public static <T> T[] toArray(org.apache.hadoop.fs.RemoteIterator<T> source, T[] a) throws IOException
Build an array from a RemoteIterator.
Type Parameters:
T - type
Parameters:
source - source iterator
a - destination array; if too small a new array of the same type is created
Returns:
an array of the values.
Throws:
IOException - if the source RemoteIterator raises it.
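The destination-array contract resembles that of java.util.List.toArray(T[]). The sketch below assumes the conversion simply drains the iterator into a list and delegates to that method; that delegation is an assumption of the sketch, not a confirmed detail of the Hadoop code. RemoteIter and the helper names are hypothetical stand-ins.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ToArraySketch {
  /** Stand-in for org.apache.hadoop.fs.RemoteIterator. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  static <T> RemoteIter<T> fromIterator(Iterator<T> it) {
    return new RemoteIter<T>() {
      public boolean hasNext() { return it.hasNext(); }
      public T next() { return it.next(); }
    };
  }

  /** Drain the iterator into a list. */
  static <T> List<T> toList(RemoteIter<T> source) throws IOException {
    List<T> list = new ArrayList<>();
    while (source.hasNext()) {
      list.add(source.next());
    }
    return list;
  }

  /** Reuse the destination if it is big enough, else allocate a new array. */
  static <T> T[] toArray(RemoteIter<T> source, T[] a) throws IOException {
    return toList(source).toArray(a);
  }

  /** Convert a two-element iterator using the supplied destination array. */
  static String[] demo(String[] destination) {
    try {
      return toArray(fromIterator(List.of("x", "y").iterator()), destination);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Passing new String[0] is a common way to supply only the element type; a destination of sufficient length is filled in place and returned.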
-
foreach
public static <T> long foreach(org.apache.hadoop.fs.RemoteIterator<T> source, org.apache.hadoop.util.functional.ConsumerRaisingIOE<? super T> consumer) throws IOException
Apply an operation to all values of a RemoteIterator. If the iterator is an IOStatisticsSource returning a non-null set of statistics, and this class's log is set to DEBUG, then the statistics of the operation are evaluated and logged at debug.
The number of entries processed is returned, as it is useful to know this, especially during tests or when reporting values to users.
This does not close the iterator afterwards.
Type Parameters:
T - type of source
Parameters:
source - iterator source
consumer - consumer of the values.
Returns:
the number of elements processed
Throws:
IOException - if the source RemoteIterator or the consumer raise one.
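The counting behaviour can be sketched as a plain loop which applies the consumer to each value and returns how many values it saw. RemoteIter and IOConsumer are hypothetical stand-ins for RemoteIterator and ConsumerRaisingIOE. The statistics logging is left out, and, as in the description above, the sketch does not close the iterator.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.List;

final class ForeachSketch {
  /** Stand-in for org.apache.hadoop.fs.RemoteIterator. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Stand-in for ConsumerRaisingIOE: a consumer which may raise an IOException. */
  interface IOConsumer<T> {
    void accept(T value) throws IOException;
  }

  /** Apply the consumer to every value and return the number processed. */
  static <T> long foreach(RemoteIter<T> source, IOConsumer<? super T> consumer)
      throws IOException {
    long count = 0;
    while (source.hasNext()) {
      count++;
      consumer.accept(source.next());
    }
    return count;   // the caller remains responsible for any close()
  }

  /** Feed three names into the sink and return the count. */
  static long demo(List<String> sink) {
    Iterator<String> names = List.of("a", "b", "c").iterator();
    RemoteIter<String> source = new RemoteIter<String>() {
      public boolean hasNext() { return names.hasNext(); }
      public String next() { return names.next(); }
    };
    try {
      return foreach(source, sink::add);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```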
-
cleanupRemoteIterator
public static <T> void cleanupRemoteIterator(org.apache.hadoop.fs.RemoteIterator<T> source)
Clean up after an iteration. If the log is at debug, calculate and log the IOStatistics. If the iterator is Closeable, cast it and then clean it up.
Type Parameters:
T - type of source
Parameters:
source - iterator source
-