sql length - sql position - sql substring

SQL Substring — The Lint From The Fluff

Relational databases store information in tables — in columns that describe aspects of an item and rows which tie together the columns. The SQL SUBSTRING() function — and related, support functions — is used to extract portions of a text string. This blog entry explores some common use cases; our  hub of SQL tutorials covers other topics.

The syntax of the SQL SUBSTRING() function is as follows; all three parameters are required.

The Most Basic Usage (And Errors)

Extracting a substring from a constant text string with hard-coded arguments is the basic use case:

SUBSTRING( ‘hippopotomonstrosesquipedaliophobia’, 1, 5 )
hippo

Supplying a negative number as the starting position refers to “distance” from the string’s end:

Check your local database documentation to verify that negative starting positions are supported.

SUBSTRING( ‘hippopotomonstrosesquipedaliophobia’, -13, 5 )
pedal

The most common usage errors are supplying invalid or unexpected arguments.

Using zero as a starting point will result in SUBSTRING() returning an empty (zero-length) string, because unlike many other programming languages, string positions in SQL begin at 1. (Check your local database documentation to ensure that a NULL is not returned instead.) Sanity-checking the starting value will be necessary for writing robust software.

Providing a length argument that’s longer than the actual length of the text string will not raise any error conditions. Instead, the substring extraction ends at the word’s end:

SUBSTRING( ‘hippopotomonstrosesquipedaliophobia’, 30, 123456789 )
phobia

Processing Email Addresses With LENGTH() and POSITION()

Fixed values for the starting position and length make sense for scenarios where a string pattern is known, like U.S. Social Security Numbers or phone numbers (both ‘###-###-####’), but more complex problems require more adaptive solutions. For example, extracting user and domain name information from email addresses — parsing ‘user@domain.ext’ when the length of both sides are unknown demands additional string processing functions.

LENGTH() returns an integer measure of the number of characters in a text string.

POSITION() returns an integer location of the first occurrence of the specified string in another string.

Setting up an emails table with an addr column — SQL code provided at the end of this blog — to supply our program, we extract the user and domain portions dynamically, without any fixed values, by finding splitting them by the character position found by POSITION(‘@’ IN addr).

Additionally, our code saves the user and domain portions into the SQL variables @user and @domain and then puts the parts back together with CONCAT( @user, ‘@’, @domain ).

addressusernamedomainconcatenated
user@example.comuserexample.comuser@example.com

Check your local database SQL implementation; syntax chosen varies, some use CHARINDEX() and LEN() functions instead.

 

 

 

 

 

 

 

 

Conclusion

Covered here has been the several SQL string-manipulation functions — SUBSTRING(), LENGTH(), and POSITION() — by working through some common use cases, ending with a robust method for parsing email addresses. A reminder that different databases use other functions and syntaxes to achieve similar utility.

Start Learning

To learn more about SQL, check out other SQL blog posts and enroll in our SQL Nanodegree program.

SQL Sample Table Code

This blog entry contains no additional SQL code.