Class Automaton

java.lang.Object
org.apache.lucene.util.automaton.Automaton
All Implemented Interfaces:
Accountable, TransitionAccessor

public class Automaton extends Object implements Accountable, TransitionAccessor
Represents an automaton and all its states and transitions. States are integers and must be created using createState(). Mark a state as an accept state using setAccept(int, boolean). Add transitions using addTransition(int, int, int). Each state must have all of its transitions added at once; if this is too restrictive then use Automaton.Builder instead. State 0 is always the initial state. Once a state is finished, either because you've starting adding transitions to another state or you call finishState(), then that states transitions are sorted (first by min, then max, then dest) and reduced (transitions with adjacent labels going to the same dest are combined).
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static class 
    Records new states and transitions and then Automaton.Builder.finish() creates the Automaton.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private int
    Current state we are adding transitions to; the caller must add all transitions for this state before moving onto another state.
    private final Sorter
    Sorts transitions by dest, ascending, then min label ascending, then max label ascending
    private boolean
    True if no state has two transitions leaving with the same label.
    private final BitSet
     
    private final Sorter
    Sorts transitions by min label, ascending, then max label ascending, then dest ascending
    private int
    Where we next write to the int[] states; this increments by 2 for each added state because we pack a pointer to the transitions array and a count of how many transitions leave the state.
    private int
    Where we next write to in int[] transitions; this increments by 3 for each added transition because we pack min, max, dest in sequence.
    private int[]
    Index in the transitions array, where this states leaving transitions are stored, or -1 if this state has not added any transitions yet, followed by number of transitions.
    private int[]
    Holds toState, min, max for each transition.

    Fields inherited from interface org.apache.lucene.util.Accountable

    NULL_ACCOUNTABLE
  • Constructor Summary

    Constructors
    Constructor
    Description
    Sole constructor; creates an automaton with no states.
    Automaton(int numStates, int numTransitions)
    Constructor which creates an automaton with enough space for the given number of states and transitions.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    addEpsilon(int source, int dest)
    Add a [virtual] epsilon transition between source and dest.
    void
    addTransition(int source, int dest, int label)
    Add a new transition with min = max = label.
    void
    addTransition(int source, int dest, int min, int max)
    Add a new transition with the specified source, dest, min, max.
    (package private) static void
     
    void
    copy(Automaton other)
    Copies over all states/transitions from other.
    int
    Create a new state.
    private void
    Freezes the last state, sorting and reducing the transitions.
    void
    Finishes the current state; call this once you are done adding transitions for a state.
    (package private) BitSet
    Returns accept states.
    void
    Iterate to the next transition after the provided one
    int
    How many states this automaton has.
    int
    How many transitions this automaton has.
    int
    getNumTransitions(int state)
    How many transitions this state has.
    Sugar to get all transitions for all states.
    int[]
    Returns sorted array of all interval start points.
    void
    getTransition(int state, int index, Transition t)
    Fill the provided Transition with the index'th transition leaving the specified state.
    private void
     
    private void
     
    int
    initTransition(int state, Transition t)
    Initialize the provided Transition to iterate through all transitions leaving the specified state.
    boolean
    isAccept(int state)
    Returns true if this state is an accept state.
    boolean
    Returns true if this automaton is deterministic (for ever state there is only one transition for each label).
    private int
    next(int state, int fromTransitionIndex, int label, Transition transition)
    Looks for the next transition that matches the provided label, assuming determinism.
    int
    next(Transition transition, int label)
    Looks for the next transition that matches the provided label, assuming determinism.
    long
    Return the memory usage of this object in bytes.
    void
    setAccept(int state, boolean accept)
    Set or clear this state as an accept state.
    int
    step(int state, int label)
    Performs lookup in transitions, assuming determinism.
    Returns the dot (graphviz) representation of this automaton.
    private boolean
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.lucene.util.Accountable

    getChildResources
  • Field Details

    • nextState

      private int nextState
      Where we next write to the int[] states; this increments by 2 for each added state because we pack a pointer to the transitions array and a count of how many transitions leave the state.
    • nextTransition

      private int nextTransition
      Where we next write to in int[] transitions; this increments by 3 for each added transition because we pack min, max, dest in sequence.
    • curState

      private int curState
      Current state we are adding transitions to; the caller must add all transitions for this state before moving onto another state.
    • states

      private int[] states
      Index in the transitions array, where this states leaving transitions are stored, or -1 if this state has not added any transitions yet, followed by number of transitions.
    • isAccept

      private final BitSet isAccept
    • transitions

      private int[] transitions
      Holds toState, min, max for each transition.
    • deterministic

      private boolean deterministic
      True if no state has two transitions leaving with the same label.
    • destMinMaxSorter

      private final Sorter destMinMaxSorter
      Sorts transitions by dest, ascending, then min label ascending, then max label ascending
    • minMaxDestSorter

      private final Sorter minMaxDestSorter
      Sorts transitions by min label, ascending, then max label ascending, then dest ascending
  • Constructor Details

    • Automaton

      public Automaton()
      Sole constructor; creates an automaton with no states.
    • Automaton

      public Automaton(int numStates, int numTransitions)
      Constructor which creates an automaton with enough space for the given number of states and transitions.
      Parameters:
      numStates - Number of states.
      numTransitions - Number of transitions.
  • Method Details

    • createState

      public int createState()
      Create a new state.
    • setAccept

      public void setAccept(int state, boolean accept)
      Set or clear this state as an accept state.
    • getSortedTransitions

      public Transition[][] getSortedTransitions()
      Sugar to get all transitions for all states. This is object-heavy; it's better to iterate state by state instead.
    • getAcceptStates

      BitSet getAcceptStates()
      Returns accept states. If the bit is set then that state is an accept state.
    • isAccept

      public boolean isAccept(int state)
      Returns true if this state is an accept state.
    • addTransition

      public void addTransition(int source, int dest, int label)
      Add a new transition with min = max = label.
    • addTransition

      public void addTransition(int source, int dest, int min, int max)
      Add a new transition with the specified source, dest, min, max.
    • addEpsilon

      public void addEpsilon(int source, int dest)
      Add a [virtual] epsilon transition between source and dest. Dest state must already have all transitions added because this method simply copies those same transitions over to source.
    • copy

      public void copy(Automaton other)
      Copies over all states/transitions from other. The states numbers are sequentially assigned (appended).
    • finishCurrentState

      private void finishCurrentState()
      Freezes the last state, sorting and reducing the transitions.
    • isDeterministic

      public boolean isDeterministic()
      Returns true if this automaton is deterministic (for ever state there is only one transition for each label).
    • finishState

      public void finishState()
      Finishes the current state; call this once you are done adding transitions for a state. This is automatically called if you start adding transitions to a new source state, but for the last state you add you need to this method yourself.
    • getNumStates

      public int getNumStates()
      How many states this automaton has.
    • getNumTransitions

      public int getNumTransitions()
      How many transitions this automaton has.
    • getNumTransitions

      public int getNumTransitions(int state)
      Description copied from interface: TransitionAccessor
      How many transitions this state has.
      Specified by:
      getNumTransitions in interface TransitionAccessor
    • growStates

      private void growStates()
    • growTransitions

      private void growTransitions()
    • initTransition

      public int initTransition(int state, Transition t)
      Description copied from interface: TransitionAccessor
      Initialize the provided Transition to iterate through all transitions leaving the specified state. You must call TransitionAccessor.getNextTransition(org.apache.lucene.util.automaton.Transition) to get each transition. Returns the number of transitions leaving this state.
      Specified by:
      initTransition in interface TransitionAccessor
    • getNextTransition

      public void getNextTransition(Transition t)
      Description copied from interface: TransitionAccessor
      Iterate to the next transition after the provided one
      Specified by:
      getNextTransition in interface TransitionAccessor
    • transitionSorted

      private boolean transitionSorted(Transition t)
    • getTransition

      public void getTransition(int state, int index, Transition t)
      Description copied from interface: TransitionAccessor
      Fill the provided Transition with the index'th transition leaving the specified state.
      Specified by:
      getTransition in interface TransitionAccessor
    • appendCharString

      static void appendCharString(int c, StringBuilder b)
    • toDot

      public String toDot()
      Returns the dot (graphviz) representation of this automaton. This is extremely useful for visualizing the automaton.
    • getStartPoints

      public int[] getStartPoints()
      Returns sorted array of all interval start points.
    • step

      public int step(int state, int label)
      Performs lookup in transitions, assuming determinism.
      Parameters:
      state - starting state
      label - codepoint to look up
      Returns:
      destination state, -1 if no matching outgoing transition
    • next

      public int next(Transition transition, int label)
      Looks for the next transition that matches the provided label, assuming determinism.

      This method is similar to step(int, int) but is used more efficiently when iterating over multiple transitions from the same source state. It keeps the latest reached transition index in transition.transitionUpto so the next call to this method can continue from there instead of restarting from the first transition.

      Parameters:
      transition - The transition to start the lookup from (inclusive, using its Transition.source and Transition.transitionUpto). It is updated with the matched transition; or with Transition.dest = -1 if no match.
      label - The codepoint to look up.
      Returns:
      The destination state; or -1 if no matching outgoing transition.
    • next

      private int next(int state, int fromTransitionIndex, int label, Transition transition)
      Looks for the next transition that matches the provided label, assuming determinism.
      Parameters:
      state - The source state.
      fromTransitionIndex - The transition index to start the lookup from (inclusive); negative interpreted as 0.
      label - The codepoint to look up.
      transition - The output transition to update with the matching transition; or null for no update.
      Returns:
      The destination state; or -1 if no matching outgoing transition.
    • ramBytesUsed

      public long ramBytesUsed()
      Description copied from interface: Accountable
      Return the memory usage of this object in bytes. Negative values are illegal.
      Specified by:
      ramBytesUsed in interface Accountable