String Data Types in SQL Server: VARCHAR vs. CHAR

Cloud Computing

Your database is made of data; data is defined by data types.  Data types are so fundamental that they’re often overlooked, but here are a few very common questions, and their answers.

Can I just use VARCHAR for everything?

The short answer is: No.

Sure, VARCHAR is very flexible and will accept most kinds of data. But using VARCHAR for everything robs your database of critical functionality, data consistency, and performance.

Let’s take the example of storing date data in a VARCHAR column.  We’ve instantly lost functionality, because we can’t easily add, subtract, or compare our date data. If we use one of the date data types (e.g., DATETIME, SMALLDATE, etc.) then we have a host of system functions like DATEADD and DATEPART.

Another problem with storing date data in a VARCHAR is that we have no built-in format control.  Our system will now readily accept all of the following as “valid” date data:

  • February 1, 2010
  • eFbruary 1, 2010
  • Feb 1 2010
  • Star date 002.1.2010.304
  • 2-1-2010
  • 2-1-2010 3:03pm
  • 02012010
  • 020110
  • February 31, 2010

We want to unify and enforce the date format in the database, so we can easily seek out date errors (like February 31) and compare date, for example:  SELECT columns FROM table1 WHERE myDate > ‘1/1/2010’.

We run into the same issues no matter what non-string type we try to store in VARCHAR.

What’s the difference between CHAR and VARCHAR?

The short answer is: VARCHAR is variable length, while CHAR is fixed length.

CHAR is a fixed length string data type, so any remaining space in the field is padded with blanks. CHAR takes up 1 byte per character. So, a CHAR(100) field (or variable) takes up 100 bytes on disk, regardless of the string it holds.

VARCHAR is a variable length string data type, so it holds only the characters you assign to it. VARCHAR takes up 1 byte per character, + 2 bytes to hold length information.  For example, if you set a VARCHAR(100) data type = ‘Jen’, then it would take up 3 bytes (for J, E, and N) plus 2 bytes, or 5 bytes in all.

You can see how the use of VARCHAR in most cases is preferred, to save space.  Let’s take a look at a script; it declares a CHAR and a VARCHAR variable, sets each equal to the string ‘SQL’, and then SELECTs each variable to display what is actually stored:

DECLARE @myChar CHAR(100) , @myVarchar VARCHAR(100)
SET @myChar = 'SQL'
SET @myVarchar = 'SQL'
SELECT '[BEGIN]' + @myChar + '[END]' AS CHAR_Data
SELECT '[BEGIN]' + @myVarchar + '[END]' AS VARCHAR_Data

Here is the result when you run the script:

Char_Data

[BEGIN]SQL[END]

Varchar_Data

[BEGIN]SQL[END]

This demo just makes a nice visual of all the extra padding (and space wasted) CHAR uses.

When should I use CHAR instead of VARCHAR?

The short answer is: Almost never.

VARCHAR only costs two “extra” bytes, when compared to CHAR.  It’s only in rare cases where using CHAR will actually save you space and effort. Examples of those cases:

  • CHAR(2) for state abbreviation.  If your business rules say that the State column will ALWAYS be two characters long, then use CHAR(2).
  • Fixed length product codes (e.g., CHAR(8) for product codes like ‘004-3228’). Just like the state; if you have a field, like product code, that is ALWAYS (now and forevermore) a set length, then CHAR is preferred.
  • A single letter string. For example, we should use CHAR(1) for a middle initial column.

Your developers, users, and your database will be much happier if you use proper data types.

Further Reading