- java.lang.Object
-
- net.automatalib.common.util.string.AbstractPrintable
-
- net.automatalib.word.Word<I>
-
- Type Parameters:
I
- symbol type
- All Implemented Interfaces:
Iterable<I>, ArrayWritable<I>, Printable
public abstract class Word<I> extends AbstractPrintable implements ArrayWritable<I>, Iterable<I>
A word is an ordered sequence of symbols. Words are generally immutable, i.e., a single Word object will never change (unless symbol objects are modified, which is however highly discouraged).
This class provides the following static methods for creating words in the most common scenarios:
- epsilon() returns the empty word of length 0
- fromLetter(Object) turns a single letter into a word of length 1
- fromSymbols(Object...) creates a word from an array of symbols
- fromArray(Object[], int, int) creates a word from a subrange of a symbols array
- fromList(List) creates a word from a List of symbols
Modification operations like append(Object) or concat(Word...) create new objects; repeatedly invoking these operations on the respective returned objects is therefore highly inefficient. If words need to be dynamically created, a WordBuilder should be used.
This is an abstract base class for word representations. Implementing classes only need to implement
- getSymbol(int)
- length()
However, for the sake of efficiency it is highly encouraged to override the other methods as well, providing specialized realizations.
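The "two abstract methods, everything else derived" design can be illustrated without the library. The following is only a sketch: MiniWord, StringWord and MiniWordDemo are hypothetical stand-ins written for this example, not AutomataLib classes, and the derived methods merely mirror the names used in the method summary (getSymbol(int) and length() being the two abstract ones).

```java
import java.util.AbstractList;
import java.util.List;

class MiniWordDemo {
    public static void main(String[] args) {
        MiniWord<Character> w = new StringWord("abc");
        System.out.println(w.asList());      // [a, b, c]
        System.out.println(w.lastSymbol());  // c
    }
}

/** Hypothetical stand-in for a word-like base class (not AutomataLib code). */
abstract class MiniWord<I> {
    // The only two methods a subclass has to provide.
    abstract int length();
    abstract I getSymbol(int index);

    // Everything below is derived from the two abstract methods.
    boolean isEmpty() { return length() == 0; }
    I firstSymbol() { return getSymbol(0); }
    I lastSymbol() { return getSymbol(length() - 1); }

    /** Read-only list view; AbstractList rejects modifications by default. */
    List<I> asList() {
        return new AbstractList<I>() {
            @Override public I get(int index) { return getSymbol(index); }
            @Override public int size() { return length(); }
        };
    }
}

/** Minimal subclass that exposes the characters of a String as symbols. */
final class StringWord extends MiniWord<Character> {
    private final String s;
    StringWord(String s) { this.s = s; }
    @Override int length() { return s.length(); }
    @Override Character getSymbol(int index) { return s.charAt(index); }
}
```

A subclass that knows its internal representation could override the derived methods with faster versions, which is the point of the efficiency remark above.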
-
-
Constructor Summary
Constructor | Description
Word() |
-
Method Summary
Modifier and Type | Method | Description
Word<I> | append(I symbol) | Appends a symbol to this word and returns the result as a new word.
IntSeq | asIntSeq(ToIntFunction<I> indexFunction) |
List<I> | asList() | Retrieves a List view on the contents of this word.
static <I> Comparator<Word<I>> | canonicalComparator(Comparator<? super I> symComparator) |
Word<I> | canonicalNext(Alphabet<I> sigma) | Retrieves the next word after this in canonical order.
static <I> Collector<I,?,Word<I>> | collector() |
Word<I> | concat(Word<? extends I>... words) | Concatenates this word with several other words and returns the result as a new word.
protected Word<I> | concatInternal(Word<? extends I>... words) | Realizes the concatenation of this word with several other words.
static <I> Word<I> | epsilon() | Retrieves the empty word.
boolean | equals(@Nullable Object other) |
I | firstSymbol() | Retrieves the first symbol of this word.
Word<I> | flatten() | Retrieves a "flattened" version of this word, i.e., without any hierarchical structure attached.
static <I> Word<I> | fromArray(I[] symbols, int offset, int length) | Creates a word from a subrange of an array of symbols.
static Word<Character> | fromCharSequence(CharSequence cs) |
static <I> Word<I> | fromLetter(I letter) | Constructs a word from a single letter.
static <I> Word<I> | fromList(List<? extends I> symbolList) | Creates a word from a list of symbols.
static Word<Character> | fromString(String str) |
static <I> Word<I> | fromSymbols(I... symbols) | Creates a word from an array of symbols.
static <I> Word<I> | fromWords(Collection<? extends Word<? extends I>> words) |
static <I> Word<I> | fromWords(Word<? extends I>... words) |
abstract I | getSymbol(int index) | Return symbol that is at the specified position.
int | hashCode() |
boolean | isEmpty() | Checks if this word is empty, i.e., contains no symbols.
boolean | isPrefixOf(Word<?> other) | Checks if this word is a prefix of another word.
boolean | isSuffixOf(Word<?> other) | Checks if this word is a suffix of another word.
Iterator<I> | iterator() |
I | lastSymbol() | Retrieves the last symbol of this word.
abstract int | length() | Retrieves the length of this word.
Word<I> | longestCommonPrefix(Word<?> other) | Determines the longest common prefix of this word and another word.
Word<I> | longestCommonSuffix(Word<?> other) | Determines the longest common suffix of this word and another word.
Word<I> | prefix(int prefixLen) | Retrieves a prefix of the given length.
List<Word<I>> | prefixes(boolean longestFirst) | Retrieves the list of all prefixes of this word.
Word<I> | prepend(I symbol) | Prepends a symbol to this word and returns the result as a new word.
void | print(Appendable a) | Outputs the current object.
int | size() | Returns the size of this container.
Spliterator<I> | spliterator() |
Stream<I> | stream() |
Word<I> | subWord(int fromIndex) | Retrieves the subword of this word starting at the given index and extending until the end of this word.
Word<I> | subWord(int fromIndex, int toIndex) | Retrieves a word representing the specified subrange of this word.
protected Word<I> | subWordInternal(int fromIndex, int toIndex) | Internal subword operation implementation.
Word<I> | suffix(int suffixLen) | Retrieves a suffix of the given length.
List<Word<I>> | suffixes(boolean longestFirst) | Retrieves the list of all suffixes of this word.
int[] | toIntArray(ToIntFunction<? super I> toInt) | Transforms this word into an array of integers, using the specified function for translating an individual symbol to an integer.
<T> Word<T> | transform(Function<? super I,? extends T> transformer) | Transforms a word symbol-by-symbol, using the specified transformation function.
Word<I> | trimmed() |
static <I> Word<I> | upcast(Word<? extends I> word) | Performs an upcast of the generic type parameter of the word.
void | writeToArray(int offset, @Nullable Object[] array, int tgtOffset, int length) | Writes the contents of this container to an array.
-
Methods inherited from class net.automatalib.common.util.string.AbstractPrintable
toString
-
-
-
-
Method Detail
-
canonicalComparator
public static <I> Comparator<Word<I>> canonicalComparator(Comparator<? super I> symComparator)
-
fromSymbols
@SafeVarargs public static <I> Word<I> fromSymbols(I... symbols)
Creates a word from an array of symbols.
- Type Parameters:
I - symbol type
- Parameters:
symbols - the symbol array
- Returns:
- a word containing the symbols in the specified array
-
epsilon
public static <I> Word<I> epsilon()
Retrieves the empty word.
- Type Parameters:
I - symbol type
- Returns:
- the empty word.
- See Also:
Collections.emptyList()
-
fromLetter
public static <I> Word<I> fromLetter(I letter)
Constructs a word from a single letter.
- Type Parameters:
I - symbol type
- Parameters:
letter - the letter
- Returns:
- a word consisting of only this letter
-
fromArray
public static <I> Word<I> fromArray(I[] symbols, int offset, int length)
Creates a word from a subrange of an array of symbols. Note that to ensure immutability, internally a copy of the array is made.
- Type Parameters:
I - symbol type
- Parameters:
symbols - the symbols array
offset - the starting index in the array
length - the length of the resulting word (from the starting index on)
- Returns:
- the word consisting of the symbols in the range
-
fromList
public static <I> Word<I> fromList(List<? extends I> symbolList)
Creates a word from a list of symbols.
- Type Parameters:
I - symbol type
- Parameters:
symbolList - the list of symbols
- Returns:
- the resulting word
-
fromCharSequence
public static Word<Character> fromCharSequence(CharSequence cs)
-
fromWords
@SafeVarargs public static <I> Word<I> fromWords(Word<? extends I>... words)
-
fromWords
public static <I> Word<I> fromWords(Collection<? extends Word<? extends I>> words)
-
length
public abstract int length()
Retrieves the length of this word.
- Returns:
- the length of this word.
-
upcast
public static <I> Word<I> upcast(Word<? extends I> word)
Performs an upcast of the generic type parameter of the word. Since words are immutable, the type parameter <I> is covariant (even though it is not possible to express this in Java), making this a safe operation.
- Type Parameters:
I - symbol type
- Parameters:
word - the word to upcast
- Returns:
- the upcasted word (reference identical to word)
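The covariance argument can be sketched with plain JDK lists. The helper below is a hypothetical analogue written for illustration, not the library's implementation: it applies the same reasoning to an immutable List, where an unchecked cast is harmless because nothing can be written through the returned reference.

```java
import java.util.List;

class UpcastSketch {
    /** Hypothetical analogue of upcast for immutable lists. */
    @SuppressWarnings("unchecked")
    static <T> List<T> upcast(List<? extends T> immutable) {
        // Safe only because the list cannot be modified through any reference.
        return (List<T>) immutable;
    }

    public static void main(String[] args) {
        List<Integer> ints = List.of(1, 2, 3);   // immutable list
        List<Number> nums = upcast(ints);        // no copy is made
        System.out.println((Object) nums == ints); // true
        System.out.println(nums.get(0));           // 1
    }
}
```

With a mutable list the same cast would allow a Double to be added to what is really a list of Integer, which is why the immutability of words matters for the safety claim.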
-
spliterator
public Spliterator<I> spliterator()
- Specified by:
spliterator in interface Iterable<I>
-
print
public void print(Appendable a) throws IOException
Description copied from interface: Printable
Outputs the current object.
- Specified by:
print in interface Printable
- Parameters:
a - the appendable.
- Throws:
IOException - if an error occurs during appending.
-
isEmpty
public boolean isEmpty()
Checks if this word is empty, i.e., contains no symbols.
- Returns:
true if this word is empty, false otherwise.
-
subWord
public final Word<I> subWord(int fromIndex)
Retrieves the subword of this word starting at the given index and extending until the end of this word. Calling this method is equivalent to calling w.subWord(fromIndex, w.length())
- Parameters:
fromIndex - the first index, inclusive
- Returns:
- the word representing the specified subrange
-
subWord
public final Word<I> subWord(int fromIndex, int toIndex)
Retrieves a word representing the specified subrange of this word. As words are immutable, this function can usually be realized quite efficiently (implementing classes should take care of this).
- Parameters:
fromIndex - the first index, inclusive.
toIndex - the last index, exclusive.
- Returns:
- the word representing the specified subrange.
-
subWordInternal
protected Word<I> subWordInternal(int fromIndex, int toIndex)
Internal subword operation implementation. In contrast to subWord(int, int), no range checks need to be performed. As this method is flagged as protected, implementations may rely on the specified indices being valid.
- Parameters:
fromIndex - the first index, inclusive (guaranteed to be valid)
toIndex - the last index, exclusive (guaranteed to be valid)
- Returns:
- the word representing the specified subrange
-
writeToArray
public void writeToArray(int offset, @Nullable Object[] array, int tgtOffset, int length)
Description copied from interface: ArrayWritable
Writes the contents of this container to an array. The behavior of calling this method should be equivalent to System.arraycopy(this.toArray(), offset, array, tgtOffset, length);.
- Specified by:
writeToArray in interface ArrayWritable<I>
- Parameters:
offset - how many elements of this container to skip.
array - the array in which to store the elements.
tgtOffset - the starting offset in the target array.
length - the maximum number of elements to copy.
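The parameter roles are easiest to see in a small example. The sketch below is a hypothetical list-backed analogue (the class and its static writeToArray helper are invented for illustration and are not part of the library); it simply spells out the System.arraycopy equivalence stated in the description.

```java
import java.util.Arrays;
import java.util.List;

class WriteToArraySketch {
    /** Hypothetical list-backed analogue of writeToArray. */
    static <T> void writeToArray(List<T> container, int offset,
                                 Object[] array, int tgtOffset, int length) {
        // Reference behavior: copy via a temporary array of all elements.
        System.arraycopy(container.toArray(), offset, array, tgtOffset, length);
    }

    public static void main(String[] args) {
        List<String> symbols = List.of("a", "b", "c", "d");
        Object[] target = new Object[5];
        // Skip the first symbol, then copy two symbols into target[2] and target[3].
        writeToArray(symbols, 1, target, 2, 2);
        System.out.println(Arrays.toString(target)); // [null, null, b, c, null]
    }
}
```

A real implementation would normally avoid the temporary array and copy directly from its internal storage; the equivalence only fixes the observable result.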
-
getSymbol
public abstract I getSymbol(int index)
Returns the symbol at the specified position.
- Parameters:
index - the position
- Returns:
- the symbol at position index.
- Throws:
IndexOutOfBoundsException - if there is no symbol with this index.
-
size
public final int size()
Description copied from interface: ArrayWritable
Returns the size of this container.
- Specified by:
size in interface ArrayWritable<I>
- Returns:
- the size of this container
-
asList
public List<I> asList()
Retrieves a List view on the contents of this word.
- Returns:
- an unmodifiable list of the contained symbols.
-
asIntSeq
public IntSeq asIntSeq(ToIntFunction<I> indexFunction)
Retrieves an IntSeq view on the contents of this word for a given indexing function (e.g. an Alphabet).
- Parameters:
indexFunction - the mapping from symbols to indices
- Returns:
- an IntSeq view of the contained symbols.
-
prefixes
public List<Word<I>> prefixes(boolean longestFirst)
Retrieves the list of all prefixes of this word. In the default implementation, the prefixes are lazily instantiated upon the respective calls of List.get(int) or Word.Iterator.next().
- Parameters:
longestFirst - whether to start with the longest prefix (otherwise, the first prefix in the list will be the shortest).
- Returns:
- a (non-materialized) list containing all prefixes
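A non-materialized list of this kind can be sketched with java.util.AbstractList. The code below is an illustration under assumptions, not the library's implementation: it works on a plain List of symbols instead of a Word, and it assumes that the empty prefix and the full word are both included, giving n + 1 entries for a word of length n.

```java
import java.util.AbstractList;
import java.util.List;

class LazyPrefixes {
    /** Hypothetical sketch: each prefix is only created when it is accessed. */
    static <T> List<List<T>> prefixes(List<T> word, boolean longestFirst) {
        int n = word.size();
        return new AbstractList<List<T>>() {
            @Override public List<T> get(int index) {
                if (index < 0 || index > n) {
                    throw new IndexOutOfBoundsException("index: " + index);
                }
                int len = longestFirst ? n - index : index;
                return word.subList(0, len); // view created on access
            }
            @Override public int size() {
                return n + 1; // assumed: prefix lengths 0..n
            }
        };
    }

    public static void main(String[] args) {
        List<String> w = List.of("a", "b", "c");
        System.out.println(prefixes(w, false));       // [[], [a], [a, b], [a, b, c]]
        System.out.println(prefixes(w, true).get(0)); // [a, b, c]
    }
}
```

Because nothing is stored, the list itself costs constant memory regardless of word length; iterating over it still creates one prefix object per step.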
-
suffixes
public List<Word<I>> suffixes(boolean longestFirst)
Retrieves the list of all suffixes of this word. In the default implementation, the suffixes are lazily instantiated upon the respective calls of List.get(int) or Word.Iterator.next().
- Parameters:
longestFirst - whether to start with the longest suffix (otherwise, the first suffix in the list will be the shortest).
- Returns:
- a (non-materialized) list containing all suffixes
-
canonicalNext
public Word<I> canonicalNext(Alphabet<I> sigma)
Retrieves the next word after this in canonical order. Figuratively speaking, if there are k alphabet symbols, one can think of a word of length n as an n-digit radix-k representation of a number. The next word in canonical order is the representation for the number represented by this word plus one.
- Parameters:
sigma - the alphabet
- Returns:
- the next word in canonical order
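The radix-k picture can be made concrete on arrays of symbol indices. The sketch below is one plausible reading, not the library's code: it assumes that the last position is the least significant digit and that a word whose digits all overflow is followed by the first word of length n + 1. Neither assumption is stated in the description above, so the actual digit order may differ.

```java
import java.util.Arrays;

class CanonicalNextSketch {
    /**
     * Hypothetical sketch: word[i] is the alphabet index (0..k-1) of the
     * i-th symbol. Assumes the LAST position is least significant.
     */
    static int[] next(int[] word, int k) {
        int[] result = word.clone();
        for (int i = result.length - 1; i >= 0; i--) {
            if (result[i] + 1 < k) {
                result[i]++;   // no carry needed, done
                return result;
            }
            result[i] = 0;     // digit overflows, carry to the next position
        }
        // Assumed overflow rule: continue with the first word of length n + 1.
        return new int[word.length + 1];
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(next(new int[] {0, 1}, 2))); // [1, 0]
        System.out.println(Arrays.toString(next(new int[] {1, 1}, 2))); // [0, 0, 0]
        System.out.println(Arrays.toString(next(new int[] {}, 2)));     // [0]
    }
}
```

Starting from the empty word and calling next repeatedly enumerates all words over the alphabet, shorter words first.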
-
lastSymbol
public I lastSymbol()
Retrieves the last symbol of this word.
- Returns:
- the last symbol of this word.
-
firstSymbol
public I firstSymbol()
Retrieves the first symbol of this word.
- Returns:
- the first symbol of this word
-
append
public Word<I> append(I symbol)
Appends a symbol to this word and returns the result as a new word.
- Parameters:
symbol - the symbol to append
- Returns:
- the word plus the given symbol
-
prepend
public Word<I> prepend(I symbol)
Prepends a symbol to this word and returns the result as a new word.
- Parameters:
symbol - the symbol to prepend
- Returns:
- the given symbol plus this word.
-
concat
@SafeVarargs public final Word<I> concat(Word<? extends I>... words)
Concatenates this word with several other words and returns the result as a new word.
Note that this method cannot be overridden. Implementing classes need to override the concatInternal(Word...) method instead.
- Parameters:
words - the words to concatenate with this word
- Returns:
- the result of the concatenation
- See Also:
concatInternal(Word...)
-
concatInternal
protected Word<I> concatInternal(Word<? extends I>... words)
Realizes the concatenation of this word with several other words.
- Parameters:
words - the words to concatenate
- Returns:
- the result of the concatenation
-
isPrefixOf
public boolean isPrefixOf(Word<?> other)
Checks if this word is a prefix of another word.
- Parameters:
other - the other word
- Returns:
true if this word is a prefix of the other word, false otherwise.
-
longestCommonPrefix
public Word<I> longestCommonPrefix(Word<?> other)
Determines the longest common prefix of this word and another word.
- Parameters:
other - the other word
- Returns:
- the longest common prefix of this word and the other word
-
prefix
public final Word<I> prefix(int prefixLen)
Retrieves a prefix of the given length. If prefixLen is negative, then a prefix consisting of all but the last -prefixLen symbols is returned.
- Parameters:
prefixLen - the length of the prefix (may be negative, see above).
- Returns:
- the prefix of the given length.
-
isSuffixOf
public boolean isSuffixOf(Word<?> other)
Checks if this word is a suffix of another word.
- Parameters:
other - the other word
- Returns:
true if this word is a suffix of the other word, false otherwise.
-
longestCommonSuffix
public Word<I> longestCommonSuffix(Word<?> other)
Determines the longest common suffix of this word and another word.
- Parameters:
other - the other word
- Returns:
- the longest common suffix
-
suffix
public final Word<I> suffix(int suffixLen)
Retrieves a suffix of the given length. If suffixLen is negative, then a suffix consisting of all but the first -suffixLen symbols is returned.
- Parameters:
suffixLen - the length of the suffix (may be negative, see above).
- Returns:
- the suffix of the given length.
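The negative-length convention shared by prefix(int) and suffix(int) can be sketched on plain lists. The helpers below are hypothetical stand-ins written for this example, not library code; they only show how a negative argument is turned into an effective length.

```java
import java.util.List;

class NegativeLengthSketch {
    /** Hypothetical analogue of prefix(int) on a list of symbols. */
    static <T> List<T> prefix(List<T> word, int prefixLen) {
        // Negative length: drop the last -prefixLen symbols.
        int len = prefixLen < 0 ? word.size() + prefixLen : prefixLen;
        return word.subList(0, len);
    }

    /** Hypothetical analogue of suffix(int) on a list of symbols. */
    static <T> List<T> suffix(List<T> word, int suffixLen) {
        // Negative length: drop the first -suffixLen symbols.
        int len = suffixLen < 0 ? word.size() + suffixLen : suffixLen;
        return word.subList(word.size() - len, word.size());
    }

    public static void main(String[] args) {
        List<String> w = List.of("a", "b", "c", "d");
        System.out.println(prefix(w, 1));  // [a]
        System.out.println(prefix(w, -1)); // [a, b, c]
        System.out.println(suffix(w, 1));  // [d]
        System.out.println(suffix(w, -1)); // [b, c, d]
    }
}
```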
-
flatten
public Word<I> flatten()
Retrieves a "flattened" version of this word, i.e., without any hierarchical structure attached. This can be helpful if Word is subclassed to allow representing, e.g., a concatenation dynamically, but due to performance concerns not too many levels of indirection should be introduced.
- Returns:
- a flattened version of this word.
-
toIntArray
public int[] toIntArray(ToIntFunction<? super I> toInt)
Transforms this word into an array of integers, using the specified function for translating an individual symbol to an integer.
- Parameters:
toInt - the function for translating symbols to integers
- Returns:
- an integer-array representation of the word, according to the specified translation function
-
transform
public <T> Word<T> transform(Function<? super I,? extends T> transformer)
Transforms a word symbol-by-symbol, using the specified transformation function.
- Type Parameters:
T - the target type
- Parameters:
transformer - the transformation function
- Returns:
- the transformed word
-
-