Tags

, , ,

When taking existing data to put into a data warehouse one hopes that it is nicely formatted. This is seldom the case and recently I was faced with the problem that single field contained a rather dirty list. First I cleaned the list using REPLACE to replace the variety of delimiters with comma. Then I used this excellent T-SQL function to extract the items from the list.

FN_STRING_HELPER_SPLIT(
   @text_to_split
 , @delimiter
 , @index_of_list_item_to_return
)
          kgalli

Here is an example of it in use,

select
        list
      , FN_STRING_HELPER_SPLIT(list,',',1)
       ,FN_STRING_HELPER_SPLIT(list,',',2)

Things to note,

  • This is a scalar function, it will only return a single item from the list
  • You have to choose how many columns you extract
  • If you ask for a non-existent item it seems to return the last non-null item rather than null.
  • This does not create additional records, there is no transposition or un-pivoting of the extracted fields. This can be done manually by creating a view that that unions two queries that select item 1 and item 2 into a single column called item.
  • Extraction and un-pivoting can be done in a single operation using the APPLY function and table-valued functions but this code  is conceptually complex.
About these ads