How to compare strings
You compare strings to answer one of two questions: "Are these two strings equal?" or "In what order should these strings be placed when sorting them?"
Those two questions are complicated by factors that affect string comparisons:
- You can choose an ordinal or linguistic comparison.
- You can choose if case matters.
- You can choose culture specific comparisons.
- Linguistic comparisons are culture and platform dependent.
The C# examples in this article run in the Try.NET inline code runner and playground. Select the Run button to run an example in an interactive window. Once you execute the code, you can modify it and run the modified code by selecting Run again. The modified code either runs in the interactive window or, if compilation fails, the interactive window displays all C# compiler error messages.
When you compare strings, you define an order among them. Comparisons are used to sort a sequence of strings. Once the sequence is in a known order, it is easier to search, both for software and for humans. Other comparisons may check if strings are the same. These sameness checks are similar to equality, but some differences, such as case differences, may be ignored.
Default ordinal comparisons
The most common operations, String.CompareTo and String.Equals or String.Equality use an ordinal comparison, a case-sensitive comparison, and use the current culture. The results are shown in the following example.
string root = @"C:\users";
string root2 = @"C:\Users";
bool result = root.Equals(root2);
int comparison = root.CompareTo(root2);
Console.WriteLine($"Ordinal comparison: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");
if (comparison < 0)
Console.WriteLine($"<{root}> is less than <{root2}>");
else if (comparison > 0)
Console.WriteLine($"<{root}> is greater than <{root2}>");
else
Console.WriteLine($"<{root}> and <{root2}> are equivalent in order");
result = root.Equals(root2, StringComparison.Ordinal);
Console.WriteLine($"Ordinal comparison: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");
Console.WriteLine($"Using == says that <{root}> and <{root2}> are {(root == root2 ? "equal" : "not equal")}");
Ordinal comparisons do not take linguistic rules into account when comparing strings. They will compare the strings character by character. Case-sensitive comparisons use capitalization in their comparisons. The most important point about these default comparison methods is that because they use the current culture, the results depend on the locale and language settings of the machine where they run. These comparisons are unsuitable for comparisons where order should be consistent across machines or locations.
Case-insensitive ordinal comparisons
The String.Equals method enables you to specify a StringComparison value ofStringComparison.OrdinalIgnoreCase to specify a case-insensitive comparison. There is also a staticCompare method that includes a boolean argument to specify case-insensitive comparisons. These are shown in the following code:
string root = @"C:\users";
string root2 = @"C:\Users";
bool result = root.Equals(root2, StringComparison.OrdinalIgnoreCase);
bool areEqual = String.Equals(root, root2, StringComparison.OrdinalIgnoreCase);
int comparison = String.Compare(root, root2, ignoreCase: true);
Console.WriteLine($"Ordinal ignore case: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");
Console.WriteLine($"Ordinal static ignore case: <{root}> and <{root2}> are {(areEqual ? "equal." : "not equal.")}");
if (comparison < 0)
Console.WriteLine($"<{root}> is less than <{root2}>");
else if (comparison > 0)
Console.WriteLine($"<{root}> is greater than <{root2}>");
else
Console.WriteLine($"<{root}> and <{root2}> are equivalent in order");
Linguistic comparisons
Strings can also be ordered using linguistic rules for the current culture. This is sometimes referred to as "word sort order." When you perform a linguistic comparison, some nonalphanumeric Unicode characters might have special weights assigned. For example, the hyphen "-" may have a very small weight assigned to it so that "co-op" and "coop" appear next to each other in sort order. In addition, some Unicode characters may be equivalent to a sequence of alphanumeric characters. The following example uses the phrase "They dance in the street." in German with the "ss" and 'ß'. Linguistically (in Windows), "ss" is equal to the German Essetz: 'ß' character in both "en-US" and "de-DE" cultures.
string first = "Sie tanzen auf der Straße.";
string second = "Sie tanzen auf der Strasse.";
Console.WriteLine($"First sentence is <{first}>");
Console.WriteLine($"Second sentence is <{second}>");
bool equal = String.Equals(first, second, StringComparison.InvariantCulture);
Console.WriteLine($"The two strings {(equal == true ? "are" : "are not")} equal.");
showComparison(first, second);
string word = "coop";
string words = "co-op";
string other = "cop";
showComparison(word, words);
showComparison(word, other);
showComparison(words, other);
void showComparison(string one, string two)
{
int compareLinguistic = String.Compare(one, two, StringComparison.InvariantCulture);
int compareOrdinal = String.Compare(one, two, StringComparison.Ordinal);
if (compareLinguistic < 0)
Console.WriteLine($"<{one}> is less than <{two}> using invariant culture");
else if (compareLinguistic > 0)
Console.WriteLine($"<{one}> is greater than <{two}> using invariant culture");
else
Console.WriteLine($"<{one}> and <{two}> are equivalent in order using invariant culture");
if (compareOrdinal < 0)
Console.WriteLine($"<{one}> is less than <{two}> using ordinal comparison");
else if (compareOrdinal > 0)
Console.WriteLine($"<{one}> is greater than <{two}> using ordinal comparison");
else
Console.WriteLine($"<{one}> and <{two}> are equivalent in order using ordinal comparison");
}
This sample demonstrates the operating system dependent nature of linguistic comparisons. The host for the interactive window is a Linux host. The linguistic and ordinal comparisons produce the same results. If you ran this same sample on a Windows host, you would see the following output:
<coop> is less than <co-op> using invariant culture
<coop> is greater than <co-op> using ordinal comparison
<coop> is less than <cop> using invariant culture
<coop> is less than <cop> using ordinal comparison
<co-op> is less than <cop> using invariant culture
<co-op> is less than <cop> using ordinal comparison
On Windows, the sort order of "cop", "coop", and "co-op" change when you change from a linguistic comparison to an ordinal comparison. The two German sentences also compare differently using the different comparison types.
Comparisons using specific cultures
This sample stores CultureInfo for the current culture. The original culture can be set and retrieved on the current thread object. The comparisons are performed using the CurrentCulture value to ensure a culture-specific comparison.
The culture used affects linguistic comparisons. The following example shows the results of comparing the two German sentences using the "en-US" culture and the "de-DE" culture:
string first = "Sie tanzen auf der Straße.";
string second = "Sie tanzen auf der Strasse.";
Console.WriteLine($"First sentence is <{first}>");
Console.WriteLine($"Second sentence is <{second}>");
var en = new System.Globalization.CultureInfo("en-US");
// For culture-sensitive comparisons, use the String.Compare
// overload that takes a StringComparison value.
int i = String.Compare(first, second, en, System.Globalization.CompareOptions.IgnoreNonSpace);
Console.WriteLine($"Comparing in {en.Name} returns {i}.");
var de = new System.Globalization.CultureInfo("de-DE");
i = String.Compare(first, second, de, System.Globalization.CompareOptions.IgnoreNonSpace);
Console.WriteLine($"Comparing in {de.Name} returns {i}.");
bool b = String.Equals(first, second, StringComparison.CurrentCulture);
Console.WriteLine($"The two strings {(b == true ? "are" : "are not")} equal.");
string word = "coop";
string words = "co-op";
string other = "cop";
showComparison(word, words, en);
showComparison(word, other, en);
showComparison(words, other, en);
void showComparison(string one, string two, System.Globalization.CultureInfo culture)
{
int compareLinguistic = String.Compare(one, two, en, System.Globalization.CompareOptions.IgnoreNonSpace);
int compareOrdinal = String.Compare(one, two, StringComparison.Ordinal);
if (compareLinguistic < 0)
Console.WriteLine($"<{one}> is less than <{two}> using en-US culture");
else if (compareLinguistic > 0)
Console.WriteLine($"<{one}> is greater than <{two}> using en-US culture");
else
Console.WriteLine($"<{one}> and <{two}> are equivalent in order using en-US culture");
if (compareOrdinal < 0)
Console.WriteLine($"<{one}> is less than <{two}> using ordinal comparison");
else if (compareOrdinal > 0)
Console.WriteLine($"<{one}> is greater than <{two}> using ordinal comparison");
else
Console.WriteLine($"<{one}> and <{two}> are equivalent in order using ordinal comparison");
}
Culture-sensitive comparisons are typically used to compare and sort strings input by users with other strings input by users. The characters and sorting conventions of these strings might vary depending on the locale of the user's computer. Even strings that contain identical characters might sort differently depending on the culture of the current thread. In addition, try this sample code locally on a Windows machine, and you will the following results:
<coop> is less than <co-op> using en-US culture
<coop> is greater than <co-op> using ordinal comparison
<coop> is less than <cop> using en-US culture
<coop> is less than <cop> using ordinal comparison
<co-op> is less than <cop> using en-US culture
<co-op> is less than <cop> using ordinal comparison
Linguistic comparisons are dependent on the current culture, and are OS dependent. You must take that into account when you work with string comparisons.
Linguistic sorting and searching strings in arrays
The following examples show how to sort and search for strings in an array using a linguistic comparison dependent on the current culture. You use the static Array methods that take a System.StringComparer parameter.
This example shows how to sort an array of strings using the current culture:
string[] lines = new string[]
{
@"c:\public\textfile.txt",
@"c:\public\textFile.TXT",
@"c:\public\Text.txt",
@"c:\public\testfile2.txt"
};
Console.WriteLine("Non-sorted order:");
foreach (string s in lines)
{
Console.WriteLine($" {s}");
}
Console.WriteLine("\n\rSorted order:");
// Specify Ordinal to demonstrate the different behavior.
Array.Sort(lines, StringComparer.CurrentCulture);
foreach (string s in lines)
{
Console.WriteLine($" {s}");
}
Once the array is sorted, you can search for entries using a binary search. A binary search starts in the middle of the collection to determine which half of the collection would contain the sought string. Each subsequent comparison subdivides the remaining part of the collection in half. The array is sorted using StringComparer.CurrentCulture. The local function ShowWhere
displays information about where the string was found. If the string was not found, the returned value indicates where it would be if it were found.
string[] lines = new string[]
{
@"c:\public\textfile.txt",
@"c:\public\textFile.TXT",
@"c:\public\Text.txt",
@"c:\public\testfile2.txt"
};
Array.Sort(lines, StringComparer.CurrentCulture);
string searchString = @"c:\public\TEXTFILE.TXT";
Console.WriteLine($"Binary search for <{searchString}>");
int result = Array.BinarySearch(lines, searchString, StringComparer.CurrentCulture);
ShowWhere<string>(lines, result);
Console.WriteLine($"{(result > 0 ? "Found" : "Did not find")} {searchString}");
void ShowWhere<T>(T[] array, int index)
{
if (index < 0)
{
index = ~index;
Console.Write("Not found. Sorts between: ");
if (index == 0)
Console.Write("beginning of sequence and ");
else
Console.Write($"{array[index - 1]} and ");
if (index == array.Length)
Console.WriteLine("end of sequence.");
else
Console.WriteLine($"{array[index]}.");
}
else
{
Console.WriteLine($"Found at index {index}.");
}
}
Ordinal sorting and searching in collections
The following code uses the System.Collections.Generic.List<T> collection class to store strings. The strings are sorted using the List<T>.Sort method. This method needs a delegate that compares and orders two strings. The String.CompareTo method provides that comparison function. Run the sample and observe the order. This sort operation uses an ordinal case sensitive sort. You would use the static String.Compare methods to specify different comparison rules.
List<string> lines = new List<string>
{
@"c:\public\textfile.txt",
@"c:\public\textFile.TXT",
@"c:\public\Text.txt",
@"c:\public\testfile2.txt"
};
Console.WriteLine("Non-sorted order:");
foreach (string s in lines)
{
Console.WriteLine($" {s}");
}
Console.WriteLine("\n\rSorted order:");
lines.Sort((left, right) => left.CompareTo(right));
foreach (string s in lines)
{
Console.WriteLine($" {s}");
}
Once sorted, the list of strings can be searched using a binary search. The following sample shows how to search the sorted listed using the same comparison function. The local function ShowWhere
shows where the sought text is or would be:
List<string> lines = new List<string>
{
@"c:\public\textfile.txt",
@"c:\public\textFile.TXT",
@"c:\public\Text.txt",
@"c:\public\testfile2.txt"
};
lines.Sort((left, right) => left.CompareTo(right));
string searchString = @"c:\public\TEXTFILE.TXT";
Console.WriteLine($"Binary search for <{searchString}>");
int result = lines.BinarySearch(searchString);
ShowWhere<string>(lines, result);
Console.WriteLine($"{(result > 0 ? "Found" : "Did not find")} {searchString}");
void ShowWhere<T>(IList<T> collection, int index)
{
if (index < 0)
{
index = ~index;
Console.Write("Not found. Sorts between: ");
if (index == 0)
Console.Write("beginning of sequence and ");
else
Console.Write($"{collection[index - 1]} and ");
if (index == collection.Count)
Console.WriteLine("end of sequence.");
else
Console.WriteLine($"{collection[index]}.");
}
else
{
Console.WriteLine($"Found at index {index}.");
}
}
Always make sure to use the same type of comparison for sorting and searching. Using different comparison types for sorting and searching produces unexpected results.
Collection classes such as System.Collections.Hashtable, System.Collections.Generic.Dictionary<TKey,TValue>, and System.Collections.Generic.List<T> have constructors that take a System.StringComparer parameter when the type of the elements or keys is string
. In general, you should use these constructors whenever possible, and specify either StringComparer.Ordinal or StringComparer.OrdinalIgnoreCase.
Reference equality and string interning
None of the samples have used ReferenceEquals. This method determines if two strings are the same object. This can lead to inconsistent results in string comparisons. The following example demonstrates the string interning feature of C#. When a program declares two or more identical string variables, the compiler stores them all in the same location. By calling the ReferenceEquals method, you can see that the two strings actually refer to the same object in memory. Use the String.Copymethod to avoid interning. After the copy has been made, the two strings have different storage locations, even though they have the same value. Run the following sample to show that strings a
and b
are interned meaning they share the same storage. The strings a
and c
are not.
string a = "The computer ate my source code.";
string b = "The computer ate my source code.";
if (String.ReferenceEquals(a, b))
Console.WriteLine("a and b are interned.");
else
Console.WriteLine("a and b are not interned.");
string c = String.Copy(a);
if (String.ReferenceEquals(a, c))
Console.WriteLine("a and c are interned.");
else
Console.WriteLine("a and c are not interned.");
Note
When you test for equality of strings, you should use the methods that explicitly specify what kind of comparison you intend to perform. Your code is much more maintainable and readable. Use the overloads of the methods of the System.String and System.Array classes that take a StringComparison enumeration parameter. You specify which type of comparison to perform. Avoid using the ==
and !=
operators when you test for equality. The String.CompareToinstance methods always perform an ordinal case-sensitive comparison. They are primarily suited for ordering strings alphabetically.