This article explain the techniques to read and evaluate the result of an arithmetic expression string. The tokenizer, expression validator (both syntax validation and data-type validation) and expression evaluator are implemented in C# and Silverlight and source code is available for download below.

Expression Evaluation Demo in Silverlight

The techniques of expression-evaluation are illustrated in the following demo. The source code of the demo can be downloaded below. The input expression accepts variable names, numbers and operators.

This demo doesn't support unary operators and functions, as of now.

Expression Evaluation Process

Evaluating a free-text expression string is a three-step process:

- Split the input expression string into tokens (tokenization).
- Convert the expression from infix to postfix notation (or RPN - Reverse Polish Notation).
- Evaluate the postfix expression to find result.

All these steps are illustrated with a live demo (below) in Silverlight. The user can enter a free text expression with numbers, variables and operators and see the evaluation process step by step.

Data Structures for this Article

An expression is ordered collection of tokens. A token is a string of characters that represents a symbol of the programming language in context. The symbol can be a number constant, text constant, arithmetic operators, variable names, function names, etc.

For illustration purposes, let us consider three types of tokens, namely number constants, variables and arithmetic operators. First, we create an interface *ITokenObject* and then create classes for each type of token, namely *Constant*, *Variable* and *Operator*.

public interface ITokenObject { } public class Constant : ITokenObject { public string Value { get; set; } } public class Variable : ITokenObject { public string Name { get; set; } public string Value { get; set; } } public class Operator : ITokenObject { public OperatorSymbol Symbol { get; set; } public string SymbolText { get; set; } public int OperandCount { get; set; } public int PrecedenceLevel { get; set; } public OperatorAssociativity Associativity { get; set; } }

The above classes act as data-structures for each token type. The *OperatorSymbol* enum indicates the operator symbol; we support these arithmetic operators: add (+), subtract (-), multiply (*), divide (/) and modulus (%). The open and close parenthesis are also considered as operators; the parenthesis are used to control the precedence of operators. The *OperatorAssociativity* enum indicates the associativity of the corresponding operator. The associativity controls how operands are chosen for operators in the absence of parenthesis for controlling precedence. For more information, see http://en.wikipedia.org/wiki/Operator_associativity.

The following *Token* class is the data-structure for an instance of a token in expressions. The *Expression* class contains a single property, the list of tokens.

public class Token { public TokenType Type { get; set; } public ITokenObject TokenObject { get; set; } public int Index { get; set; } } public class Expression { public List<Token> Tokens { get; set; } }

The *TokenType* enum defines three enum members, namely *Constant*, *Variable* and *Operator*.

Tokenization

The process of building token objects from input expression string is called **tokenization** or **lexical analysis**. For more information on tokens and tokenization, see http://en.wikipedia.org/wiki/Tokenizing.

The following is the tokenizer class for data-structures defined above.

public class Tokenizer { public static Expression Split(string expressionString) { List<Token> tokens = new List<Token>(); //read the input-expression letter-by-letter and build tokens string alphaNumericString = string.Empty; for (int index = 0; index < expressionString.Length; index++) { char currentChar = expressionString[index]; if (AllOperators.Find("" + currentChar) != null) //if operator { if (alphaNumericString.Length > 0) { tokens.Add(Token.Resolve(alphaNumericString)); alphaNumericString = string.Empty; //reset token string } tokens.Add(ExpressionUtility.CreateOperatorToken(currentChar)); } else if (Char.IsLetterOrDigit(currentChar) || currentChar == '.') { //if alphabet or digit or dot alphaNumericString += currentChar; } } //check if any token at last if (alphaNumericString.Length > 0) { tokens.Add(Token.Resolve(alphaNumericString)); } return (new Expression() { Tokens = tokens }); } }

The *AllOperators* class contains instances of *Operator* class for all supported operators. The *Resolve()* method of *Token* class creates instances of corresponding token class. For example, it creates a *Variable* class for "a", *Constant* class for "52" and *Operator* class for "+", "-", etc.

Infix, Postfix and Prefix Notations

Infix, postfix and prefix notations are three different ways of writing an expression.

- In infix notation, the operators are written in-between their operands. Infix notation needs extra information (operator precedence, associativity and parenthesis) to control the order of evaluation.
*Example: (3 + 5 ) * 7* - In postfix notation, the operators are written after their operands. The operators act on operands immediately to the left of them.
*Example: 3 5 + 7 ** - In prefix notation, the operators are written before their operands. The operators act on operands to their right.
*Example: * 7 + 3 5*

Postfix and prefix notations do not need any extra information like parenthesis or rules of operator precedence and associativity. Evaluating an infix expression (with its complex rules) will be a daunting task, unless it is converted to postfix. Evaluating an expression in postfix notation is trivially easy while using stacks.

Converting Infix to Postfix

The **shunting yard algorithm**, invented by Edsger Dijkstra is a stack-based algorithm used to convert an expression from infix notation to postfix notation. It maintains a stack to hold operators that are not yet added to postfix notation. The following is a summary of this algorithm:

- Read a token (left to right).
- If token is a variable or constant, push it onto the stack.
- If token is an operator,
- If open parenthesis, push it onto stack.
- If close parenthesis, pop all tokens (until but not including a left parenthesis) from stack and add them to output.
- If current token and top token in stack are arithmetic operators, then push the later onto output if its precedence is greater than the former.

For full explanation of shunting yard algorithm, see http://en.wikipedia.org/wiki/Shunting-yard_algorithm. The following function demonstrates the shunting yard algorithm for data-structures defined above.

public static bool InfixToPostfix(Expression inputExpression, out Expression postfixExpression) { List<Token> postfixTokens = new List<Token>(); Stack<Token> postfixStack = new Stack<Token>(); //process all tokens in input-expression, one by one foreach (Token token in inputExpression.Tokens) { if (token.Type == TokenType.Constant) //handle constants { postfixTokens.Add(token); } else if (token.Type == TokenType.Variable) //handle variables { postfixTokens.Add(token); } else if (token.Type == TokenType.Operator) //handle operators { if (ExpressionUtility.IsOpenParenthesis(token)) //handle open-parenthesis { postfixStack.Push(token); } else if (ExpressionUtility.IsCloseParenthesis(token)) //handle close-parenthesis { //pop all operators off the stack onto the output (until left parenthesis) while (true) { if (postfixStack.Count == 0) { postfixExpression = null; //error: mismatched parenthesis return (false); } Token top = postfixStack.Pop(); if (ExpressionUtility.IsOpenParenthesis(top)) break; else postfixTokens.Add(top); } } else //handle arithmetic operators { Operator currentOperator = AllOperators.Find(token.LinearToken); if (postfixStack.Count > 0) { Token top = postfixStack.Peek(); if (ExpressionUtility.IsArithmeticOperator(top)) { //'>' operator implies less precedence Operator stackOperator = AllOperators.Find(top.LinearToken); if ((currentOperator.Associativity == OperatorAssociativity.LeftToRight && currentOperator.PrecedenceLevel >= stackOperator.PrecedenceLevel) || (currentOperator.Associativity == OperatorAssociativity.RightToLeft && currentOperator.PrecedenceLevel > stackOperator.PrecedenceLevel)) { postfixStack.Pop(); postfixTokens.Add(top); } } } postfixStack.Push(token); //push operator to stack } } } //after reading all tokens, pop entire stack to output while (postfixStack.Count > 0) { Token top = postfixStack.Pop(); if (ExpressionUtility.IsOpenParenthesis(top) || ExpressionUtility.IsCloseParenthesis(top)) { postfixExpression = null; //error: mismatched parenthesis return (false); } else { postfixTokens.Add(top); } } postfixExpression = new Expression() { Tokens = postfixTokens }; return (true); }