org.jasen.core.token
Class SimpleWordTokenizer

java.lang.Object
  extended by org.jasen.core.token.SimpleWordTokenizer

public class SimpleWordTokenizer
extends Object

Parses text that has already been semi-formatted.

This class is used to prepare the linguistic analysis engine.
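
The summaries on this page suggest a call order: construct the tokenizer over a File or InputStream, call tokenize(), then read the result with getTokens(). The sketch below illustrates that sequence only. So that it runs without the library on the classpath, it uses a hypothetical stand-in class, StandInWordTokenizer, that mirrors the documented signatures. Its whitespace-splitting rule and UTF-8 decoding are assumptions made for illustration and may not match what SimpleWordTokenizer actually does. The names TokenizerUsageSketch and tokensOf are also invented for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class TokenizerUsageSketch {

    // Hypothetical stand-in that mirrors the documented signatures.
    // It is NOT the library's implementation.
    static class StandInWordTokenizer {
        private final InputStream in;
        private String[] tokens = new String[0];

        StandInWordTokenizer(InputStream in) {
            this.in = in;
        }

        // Assumption: tokens are separated by runs of whitespace.
        public void tokenize() throws IOException {
            List<String> found = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    for (String word : line.trim().split("\\s+")) {
                        if (!word.isEmpty()) {
                            found.add(word);
                        }
                    }
                }
            }
            tokens = found.toArray(new String[0]);
        }

        public String[] getTokens() {
            return tokens;
        }
    }

    // Construct, tokenize, then fetch the tokens, in that order.
    static String[] tokensOf(String text) {
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        StandInWordTokenizer tokenizer = new StandInWordTokenizer(in);
        try {
            tokenizer.tokenize(); // declared to throw IOException, as on this page
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokenizer.getTokens();
    }

    public static void main(String[] args) {
        for (String token : tokensOf("free offer\nclick  here")) {
            System.out.println(token);
        }
    }
}
```

With the real class, the same three steps would presumably apply, with SimpleWordTokenizer in place of the stand-in. The File constructor additionally declares FileNotFoundException, so callers using it need to handle that as well.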

Author:
Jason Polites

Constructor Summary
SimpleWordTokenizer(File file)
           
SimpleWordTokenizer(InputStream in)
           
 
Method Summary
 String[] getTokens()
          Gets the tokens returned from the tokenization process
 void tokenize()
          Tokenizes (splits) the text
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleWordTokenizer

public SimpleWordTokenizer(File file)
                    throws FileNotFoundException

SimpleWordTokenizer

public SimpleWordTokenizer(InputStream in)
Method Detail

tokenize

public void tokenize()
              throws IOException
Tokenizes (splits) the text

Throws:
IOException

getTokens

public String[] getTokens()
Gets the tokens returned from the tokenization process

Returns: