Tokens in C++

A token is the smallest individual unit of a program, and every program is written as a sequence of tokens. C++ uses the following kinds of tokens: keywords, identifiers, literals (constants), punctuators, and operators.

All C++ programs are written using these tokens, white space, and the syntax (grammar) of the language. Most C++ tokens are the same as those of C, apart from a few additions and minor changes.

Note: C++ recognizes some tokens, such as parentheses, without any surrounding white space. For example, in return (0); the white space before the opening parenthesis may be used or omitted.

Keywords:

C++ has a set of reserved words, called keywords, that have a predefined meaning to the compiler and must not be used as ordinary identifier names. The original C++ (developed by Stroustrup) contains the following keywords:

asm	float	signed
auto	for	sizeof
break	friend	static
case	goto	struct
catch	if	switch
char	inline	template
class	int	this
const	long	throw
continue	new	try
default	operator	typedef
delete	private	union
do	protected	unsigned
double	public	virtual
else	register	void
enum	return	volatile
extern	short	while

Identifiers that contain a double underscore ( _ _ ) are reserved for use by implementations and standard libraries, and so they should be avoided.

Identifiers:

Identifiers are the fundamental building blocks of a program and are used to give names to variables, functions, arrays, objects, classes, etc.

The rules for the formation of an identifier are given below:

An identifier can consist of letters, digits, and/or underscores.
An identifier must begin with a letter of the alphabet or an underscore ( _ ); it cannot begin with a digit.
C++ is case sensitive, i.e., upper case and lower case letters are considered different from each other. It may be noted that TOTAL and total are two different identifier names.
All the characters are significant.
Reserved words cannot be used as names of identifiers/variables.

Examples of acceptable identifiers are:

num, sum, average, total_salary, big, SIZE, Value

Examples of unacceptable identifiers are:

Ma rks (blank not allowed)
B, pay (special character ',' used)
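The rules above can be sketched in a few declarations. This is only an illustration: the function name identifierDemo and the values are assumptions, and the rejected forms are left as comments because they would not compile.

```cpp
// identifierDemo is a hypothetical name used only for this illustration.
int identifierDemo() {
    int total = 10;         // lower-case identifier
    int TOTAL = 20;         // a different identifier: C++ is case sensitive
    int total_salary = 5;   // underscores are allowed inside a name

    // int 2nd = 0;         // rejected: begins with a digit
    // int Ma rks = 0;      // rejected: blank not allowed
    // int class = 0;       // rejected: class is a reserved word

    return total + TOTAL + total_salary;
}
```

Because total and TOTAL are separate variables, the function returns the sum of three distinct values.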

Literals or Constants:

These are data items that never change their value during the execution of the program. The following types of literals are available in C++.

Integer constants:

Integer constants are whole numbers without any fractional part. An integer constant may be preceded by a plus (+) or minus (-) sign, but it never contains a decimal point or commas. C++ allows three types of integer constants.

Decimal Integer Constants: An integer constant consisting of a sequence of digits is taken to be a decimal integer constant unless it begins with 0 (digit zero). For instance, 1024, 3315, +59, -87 are decimal integer constants.
Octal Integer Constants: These consist of a sequence of digits starting with 0 (zero). For example, decimal integer 14 is written as 016 as an octal integer (as 14₁₀ = 16₈).
Hexadecimal Integer Constants: These are preceded by 0x or 0X. For example, decimal integer 14 is written as 0XE as a hexadecimal integer (as 14₁₀ = E₁₆).
The suffixes l or L and u or U force an integer constant to be represented as long and unsigned respectively.
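As a small sketch of the three forms and the suffixes, the constants below all use the value 14 from the examples above. The variable names are assumptions chosen for illustration.

```cpp
#include <type_traits>

// The same value, fourteen, written in the three integer-constant forms.
constexpr int decimalForm = 14;   // decimal: does not begin with 0
constexpr int octalForm = 016;    // octal: begins with 0
constexpr int hexForm = 0XE;      // hexadecimal: begins with 0x or 0X

static_assert(decimalForm == octalForm, "016 is 14 in decimal");
static_assert(decimalForm == hexForm, "0XE is 14 in decimal");

// Suffixes select the type of the constant.
static_assert(std::is_same<decltype(14L), long>::value, "L gives long");
static_assert(std::is_same<decltype(14U), unsigned int>::value, "U gives unsigned int");
```

The static_assert checks are evaluated at compile time, so the file compiles only if the three spellings really denote the same value.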

Character Constants:

A character constant is a single character enclosed in single quotation marks. Its data type is char, which is the data type for characters in C++. The value of a character constant is the numeric value of the character in the computer's character set. For example, the value of 'A' is 65, which is the ASCII value of A, and the value of 'c' is 99, which is the ASCII value of c.

C++ allows certain nongraphic characters in character constants. Nongraphic characters are characters that cannot be typed directly from the keyboard, for example backspace, tab, and carriage return. They are represented by escape sequences, and each escape sequence represents a single character. The following table lists the common escape sequences:

Escape Sequence	Nongraphic Character
\a	Audible Bell (beep)
\b	Backspace
\f	Formfeed
\n	Newline or Linefeed
\r	Carriage Return
\t	Horizontal Tab
\v	Vertical Tab
\\	Backslash
\'	Single Quote
\"	Double Quote
\?	Question Mark
\ooo	Octal Number (ooo represents one to three octal digits)
\xhh	Hexadecimal Number (hh represents one or more hexadecimal digits)
\0	Null
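The sketch below shows a few character constants and escape sequences together with their numeric values. It assumes an ASCII-compatible character set, and the names are illustrative only.

```cpp
// Assumes an ASCII-compatible execution character set.
constexpr char upperA = 'A';     // numeric value 65
constexpr char lowerC = 'c';     // numeric value 99
constexpr char newline = '\n';   // escape sequence: one character, value 10
constexpr char octalA = '\101';  // octal 101 is 65, the same character as 'A'
constexpr char hexA = '\x41';    // hexadecimal 41 is 65, again 'A'
constexpr char nul = '\0';       // the null character, value 0

// charValue is a hypothetical helper that exposes the numeric value of a char.
constexpr int charValue(char c) { return static_cast<int>(c); }
```

Each escape sequence occupies a single char, even though it is written with several characters in the source.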

Floating Constants or Real Constants:

These have fractional parts. These may be written in either fractional form or exponent form. The following rules are followed for constructing real constants in the fractional form:

A floating constant in the fractional form contains a decimal point and should have at least one digit before and after it. (C++ also accepts forms such as 5. and .5, but they are harder to read.)
It may either have plus (+) or minus (-) sign.
When no sign is present it is assumed to be positive.
Commas and blanks are not permitted in it.

For example, 15.9, -17.8, -0.0057

A floating constant in exponent form has two parts: a mantissa and an exponent. The mantissa is either an integer or a real constant; it is followed by the letter E or e and then the exponent, which must be an integer. For example, 2E03 (2 × 10³) and 1.23E07 (1.23 × 10⁷).
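As a brief sketch, the constants below use the values from the examples above in both forms. The variable names are assumptions.

```cpp
// Fractional form: digits on both sides of the decimal point.
constexpr double fractional = -17.8;

// Exponent form: mantissa, then E or e, then an integer exponent.
constexpr double exponentA = 2E03;     // 2 x 10^3
constexpr double exponentB = 1.23E07;  // 1.23 x 10^7
```

A floating constant without a suffix has type double.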

String Literals:

A sequence of characters enclosed within double quotes is called a string literal. The compiler automatically appends the special character '\0' to every string literal to mark the end of the string, so the size of the string is one more than the number of characters written. For example, "GKSCIENTIST" is stored as "GKSCIENTIST\0" in memory and its size is 12 characters.
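The extra character can be observed by comparing the stored size with the visible length. This is a minimal sketch; storedSize and visibleLength are names invented for the illustration.

```cpp
#include <cstddef>
#include <cstring>

// sizeof counts the terminating '\0' that the compiler appends.
constexpr std::size_t storedSize = sizeof("GKSCIENTIST");

// std::strlen counts characters up to, but not including, the '\0'.
inline std::size_t visibleLength() { return std::strlen("GKSCIENTIST"); }
```

Here storedSize is 12, while visibleLength() returns 11.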

Punctuators:

The following characters are used as punctuators or separators in C++.

Brackets [ ]	These are used for enclosing subscripts in the case of single and multidimensional arrays.
Parentheses ( )	These are used for function calls and function parameters. They also group expressions and enclose the conditions of conditional statements.
Braces { }	These are used to group one or more executable statements into a block (compound statement).
Comma ,	It is used as a separator in a function argument list.
Semicolon ;	It is used as a statement terminator. Every executable statement is terminated by a semicolon.
Colon :	It indicates a labeled statement or conditional operator.
Asterisk *	It is used in pointer declaration or as a multiplication operator.
Ellipsis …	These are used in the formal parameter lists of a function declaration (prototype) to have a variable number of parameters (arguments).
Equal to sign =	It is used as an assignment operator.
Pound sign #	It introduces a pre-processor directive.
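The short sketch below uses each punctuator from the table at least once. The helper names sumAll and punctuatorDemo are hypothetical, and the variable-argument function is only one possible way to show the ellipsis.

```cpp
#include <cstdarg>   // '#' introduces a preprocessor directive

// Ellipsis: sumAll accepts a variable number of int arguments after count.
int sumAll(int count, ...) {
    std::va_list args;
    va_start(args, count);
    int total = 0;                       // '=' assigns; ';' ends the statement
    for (int i = 0; i < count; ++i) {    // parentheses enclose the loop header
        total += va_arg(args, int);      // braces group the loop body
    }
    va_end(args);
    return total;
}

int punctuatorDemo() {
    int values[3] = {4, 5, 6};           // brackets enclose the array size
    int *first = values;                 // asterisk declares a pointer
    int product = *first * values[1];    // dereference, then multiply: 4 * 5
    // Colon as part of the conditional operator; commas separate arguments.
    return (product > 10) ? sumAll(3, values[0], values[1], values[2]) : 0;
}
```

With the values shown, product is 20, so punctuatorDemo returns the sum 4 + 5 + 6.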

Operators:

An operator is a symbol that specifies an operation to be performed. The data items on which an operator acts are called operands. Some operators require a single operand, while others require two. The order in which the operations in an expression are performed is known as the order of precedence. C++ includes many operators.

Arithmetic Operators:

An arithmetic operator performs an arithmetic (numeric) operation: +, -, *, /, or %. Each of these operations requires two operands, so these operators are called binary operators. The following table shows the arithmetic operators.

Symbol	Meaning	Example
-	Subtraction	x - y
+	Addition	x + y
*	Multiplication	x * y
/	Division	x / y
%	Modulus or Remainder	x % y

For example, let x and y be two integer variables having the values 8 and 5 respectively. The following table gives the results of the different operations:

Expression	Result
x - y	3
x + y	13
x * y	40
x / y	1
x % y	3

Remember the following points while using the arithmetic operators:

The division of an integer by another integer always gives an integer result. For example, 13/3 is 4 (the fractional part is discarded).
If one or both of the operands in a division operation are floating point values, the result is a floating point number. For example, 29/2.0 is 14.5.
The modulus or remainder operator gives the remainder of an integer division. For example, 33 % 7 is 5. This operator cannot be used on floating point numbers.
The remainder operator requires that both operands be integers and the second operand be nonzero.
The division operator requires that the second operand be nonzero, though the operands need not be integers.
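The table and the points above can be checked with a small sketch. The struct ArithmeticResults and the function names are assumptions made for the illustration, and the caller is expected to pass a nonzero y.

```cpp
// Holds the result of each arithmetic operator applied to x and y.
struct ArithmeticResults {
    int difference;
    int sum;
    int product;
    int quotient;
    int remainder;
};

// y must be nonzero, because / and % both require a nonzero second operand.
ArithmeticResults arithmeticDemo(int x, int y) {
    return ArithmeticResults{x - y, x + y, x * y, x / y, x % y};
}

// One floating point operand makes the whole division floating point.
double mixedDivision() { return 29 / 2.0; }
```

Calling arithmeticDemo(8, 5) reproduces the results in the table, and mixedDivision() returns 14.5.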

Relational Operators:

The relational operators are used to test the relation between two values. All relational operators are binary operators and therefore require two operands. A relational expression yields a value of type bool: false (which converts to zero) when the relation is false and true (which converts to one) when it is true. The following table shows the relational operators.

Symbol	Meaning	Example
>	Greater than	x > y
>=	Greater than or equal to	x >= y
<	Less than	x < y
<=	Less than or equal to	x <= y
==	Equal to	x == y
!=	Not equal to	x != y

For example, let the two variables x and y have initial values of 15 and 20 respectively. The following table illustrates the usage of the relational operators.

Expression	Result
x > 12	True
x + y >= 38	False
x < y	True
x <= y	True
x + 5 == y	True
x != 12	True
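The same comparisons can be written as compile-time constants. This is a minimal sketch using the values 15 and 20 from the example; the constant names are assumptions.

```cpp
constexpr int x = 15;
constexpr int y = 20;

// Each relational expression yields a bool.
constexpr bool greaterThan12 = x > 12;        // true
constexpr bool sumAtLeast38 = x + y >= 38;    // false: 35 is less than 38
constexpr bool lessThan = x < y;              // true
constexpr bool lessOrEqual = x <= y;          // true
constexpr bool equalAfterAdd = x + 5 == y;    // true: 20 equals 20
constexpr bool notEqual12 = x != 12;          // true
```

Note that + has higher precedence than the relational operators, so x + y >= 38 compares the sum with 38.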

Logical Operators:

The logical operators are used to combine or negate relational expressions. The following table shows the logical operators.

Symbol	Meaning	Explanation
||	OR	It combines two or more logical expressions and evaluates to true if any one of the conditions is true.
&&	AND	It combines two or more logical expressions and evaluates to true if all the conditions are true.
!	NOT	It is a unary operator as it takes only one operand. It reverses the logical value of the operand.
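A range check is a common place where the three logical operators appear. The functions below are hypothetical helpers written for this illustration.

```cpp
// AND: true only if both conditions are true.
constexpr bool inRange(int value, int low, int high) {
    return value >= low && value <= high;
}

// OR: true if either condition is true.
constexpr bool outOfRange(int value, int low, int high) {
    return value < low || value > high;
}

// NOT: reverses the logical value of its single operand.
constexpr bool notInRange(int value, int low, int high) {
    return !inRange(value, low, high);
}
```

For any arguments, outOfRange and notInRange give the same answer, since negating the AND of two conditions is the OR of their negations.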

Unary Operators:

Operators that act on one operand are referred to as unary operators.

The unary + operator precedes its operand. The operand of the unary + operator must have arithmetic or pointer type, and the result is the value of the operand. For example,

If a = 6 then +a means 6.
If a = 0 then +a means 0.
If a = -3 then +a means -3.

The unary - operator precedes its operand. The operand of the unary - operator must have an arithmetic type, and the result is the negation of its operand's value. For example,

If a = 6 then -a means -6.
If a = 0 then -a means 0.
If a = -3 then -a means 3.
This operator reverses the sign of the operand’s value.
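The examples above can be reproduced with two one-line functions. The names unaryPlus and unaryMinus are assumptions used only here.

```cpp
// Unary + yields the value of its operand unchanged.
constexpr int unaryPlus(int a) { return +a; }

// Unary - yields the operand's value with the sign reversed.
constexpr int unaryMinus(int a) { return -a; }
```

For instance, unaryPlus(-3) is -3, while unaryMinus(-3) is 3.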

Assignment operator:

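The operator "=" is used for assignment: it takes the value of the right-hand side (called the rvalue) and copies it into the left-hand side (called the lvalue). The assignment operator can be overloaded, but an overloaded assignment operator is not inherited in the usual way, because every derived class declares its own assignment operator, which hides the one in the base class.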

In addition to the standard assignment operator shown above, C++ also supports compound assignment operators. C++ also provides two special operators, '++' and '--', for incrementing and decrementing the value of a variable by 1.

For Example-

a = a + 1; is the same as ++a; or a++;

a = a - 1; is the same as --a; or a--;
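Used as complete statements, the three forms have the same effect on the variable, as the first function in this sketch shows. The other two functions illustrate a point the text does not cover: inside a larger expression the prefix form yields the new value and the postfix form yields the old one. All names are assumptions.

```cpp
// Three equivalent ways to add 1 when each is a statement on its own.
int incrementDemo() {
    int a = 5;
    a = a + 1;   // a is 6
    ++a;         // a is 7
    a++;         // a is 8
    return a;
}

// Prefix: a is incremented first, then its new value is used.
int prefixValue() {
    int a = 5;
    int b = ++a;
    return b;
}

// Postfix: the old value of a is used, then a is incremented.
int postfixValue() {
    int a = 5;
    int b = a++;
    return b;
}
```

The decrement operator behaves the same way with subtraction.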

Compound Assignment Operators:

Symbol	Example	Equivalent to
+=	A += 4	A = A + 4
-=	A -= 4	A = A - 4
%=	A %= 4	A = A % 4
/=	A /= 4	A = A / 4
*=	A *= 4	A = A * 4
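A short sketch that applies each compound assignment operator in turn, starting from an assumed value of 10; the function name compoundDemo is illustrative.

```cpp
// Applies each compound assignment operator to A in sequence.
int compoundDemo() {
    int A = 10;
    A += 4;   // A = A + 4, so A is 14
    A -= 4;   // A = A - 4, so A is 10
    A *= 4;   // A = A * 4, so A is 40
    A /= 4;   // A = A / 4, so A is 10
    A %= 4;   // A = A % 4, so A is 2
    return A;
}
```

Note that each operator is written without a space between the symbol and the equals sign.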

Conditional Operator:

C++ offers a conditional operator (?:) that yields a value depending upon a condition. This operator is a ternary operator, i.e., it requires three operands. The format of the conditional operator is: conditional_expression ? expression1 : expression2

If the value of conditional_expression is true then expression1 is evaluated; otherwise, expression2 is evaluated. The expression that is evaluated provides the value of the whole conditional expression.
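A typical use is choosing between two values. The function largerOf below is a hypothetical example, not part of any library.

```cpp
// Returns a when a > b is true, otherwise returns b.
constexpr int largerOf(int a, int b) {
    return (a > b) ? a : b;
}
```

Only one of the two expressions after the question mark is evaluated for any given call.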

Comma Operator:

It is used to string together several expressions. The group of expressions separated by commas (,) is evaluated left-to-right in sequence and the result of the right-most expression becomes the value of the total comma-separated expression. For example-

b = (a = 3, a + 1);
This first assigns a the value 3 and then assigns b the value a + 1, i.e., 4. The parentheses are necessary because the comma operator has lower precedence than the assignment operator.
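The statement above can be wrapped in a function to confirm the result. The name commaDemo is an assumption.

```cpp
// Evaluates a = 3 first, then a + 1; the right-most value is assigned to b.
int commaDemo() {
    int a = 0;
    int b = (a = 3, a + 1);
    return b;
}
```

Without the parentheses, the assignment to b would bind more tightly than the comma, so b would receive 3 instead of 4.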
