This tokenizer is implemented as a state machine. The characters treated as whitespace, separators, quotes, and end-of-line are all exposed as properties, so callers can control how text is tokenized.
Procedure Name | Type | Description |
(Declarations) | Declarations | Declarations and private variables for the CTokenizer class. |
ConvertCase | Property | Get the current type of case conversion (the example below uses tokLower). |
EOLChars | Property | Get the characters currently treated as end of line characters. |
QuoteChars | Property | Get the characters currently treated as quotes. |
SeparatorChars | Property | Get the characters currently treated as separators. |
Text | Property | Get the text to tokenize. |
WhiteSpaceChars | Property | Get the characters currently treated as white space. |
Class_Initialize | Initialize | Initialize the class. |
GetNextToken | Method | Get the next token in the string identified by the Text property. |
AppendToken | Private | Append the current character to the token, performing any necessary case conversion. |
CharType | Private | Determine the type of the current character. |
HandleEOL | Private | Handle an end of line character. |
HandleQuote | Private | Handle a quote character. |
HandleSeparator | Private | Handle a separator character. |
HandleToken | Private | Handle a token character. |
HandleWhiteSpace | Private | Handle a whitespace character. |
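The private procedures listed above suggest a loop that classifies each character and hands it to a matching handler. The following is a minimal sketch of that general idea, not the CTokenizer source: every name in it (SketchState, SketchTokenize, SketchDemo) is hypothetical, it folds end-of-line characters into the whitespace set, and it omits separators and case conversion.

```vb
Option Explicit

' Hypothetical sketch only; none of these names come from CTokenizer.
Private Enum SketchState
  ssIdle       ' between tokens
  ssInToken    ' accumulating an unquoted token
  ssInQuote    ' accumulating text inside quotes
End Enum

' Split strText into tokens. Characters in strWhiteSpace end a token and are
' discarded; characters in strQuotes open a quoted run that ends at the same
' quote character.
Public Function SketchTokenize(ByVal strText As String, _
    ByVal strWhiteSpace As String, _
    ByVal strQuotes As String) As Collection

  Dim colTokens As Collection
  Dim eState As SketchState
  Dim strTok As String
  Dim strChar As String
  Dim strOpenQuote As String
  Dim lngPos As Long

  Set colTokens = New Collection
  eState = ssIdle

  For lngPos = 1 To Len(strText)
    strChar = Mid$(strText, lngPos, 1)

    Select Case eState
      Case ssInQuote
        If strChar = strOpenQuote Then
          ' Closing quote: emit the quoted text (possibly empty) as one token.
          colTokens.Add strTok
          strTok = ""
          eState = ssIdle
        Else
          strTok = strTok & strChar
        End If

      Case Else ' ssIdle or ssInToken
        If InStr(1, strQuotes, strChar, vbBinaryCompare) > 0 Then
          ' Opening quote: finish any pending token, then switch state.
          If eState = ssInToken Then colTokens.Add strTok
          strTok = ""
          strOpenQuote = strChar
          eState = ssInQuote
        ElseIf InStr(1, strWhiteSpace, strChar, vbBinaryCompare) > 0 Then
          ' Whitespace ends the pending token and is itself discarded.
          If eState = ssInToken Then colTokens.Add strTok
          strTok = ""
          eState = ssIdle
        Else
          ' Any other character extends the current token.
          strTok = strTok & strChar
          eState = ssInToken
        End If
    End Select
  Next lngPos

  ' Flush whatever is pending at end of input (an unterminated quote is kept).
  If eState <> ssIdle Then colTokens.Add strTok

  Set SketchTokenize = colTokens
End Function

Public Sub SketchDemo()
  Dim vntTok As Variant
  ' Prints one token per line: A, whop, boppa, lu, mop, A, whop, bam, boom
  For Each vntTok In SketchTokenize("A-whop boppa lu-mop," & vbCrLf & _
      "A whop bam boom", " ,.:-" & vbCrLf, """")
    Debug.Print vntTok
  Next vntTok
End Sub
```

Passing the character sets as arguments mirrors the idea of configurable character classes. The real class exposes them as properties and has separate handlers for separators and end-of-line characters, which this sketch does not attempt to reproduce.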
' Example of CTokenizer
'
' To try this example, do the following:
' 1. Create a new form
' 2. Add a command button named 'cmdTest'
' 3. Paste all the code from this example to the new form's module.
' 4. Run the form

Private Const mcstrText As String = "A-whop boppa lu-mop," & vbCrLf & "A whop bam boom"

Private Sub cmdTest_Click()
  ' Example for the CTokenizer class
  ' This example breaks a string up into its pieces and constructs a new string, separated by "."'s
  ' to print in the Immediate window

  Dim tokenizer As CTokenizer
  Dim strBops As String
  Dim strTok As String

  Set tokenizer = New CTokenizer
  With tokenizer
    .Text = mcstrText
    .WhiteSpaceChars = " ,.:-"
    .ConvertCase = tokLower
    .EOLChars = vbCrLf

    Do While .GetNextToken(strTok, " ", False)
      strBops = strBops & "." & strTok
    Loop
  End With

  Debug.Print strBops
End Sub
The source code in Total Visual SourceBook includes modules and classes for Microsoft Access, Visual Basic 6 (VB6), and Visual Basic for Applications (VBA) developers. The code is professionally written, tested, documented, and royalty-free, and can be added directly to your applications.
Supports Access/Office 2016, 2013, 2010 and 2007, and Visual Basic 6.0!