When taking existing data to put into a data warehouse one hopes that it is nicely formatted. This is seldom the case and recently I was faced with the problem that single field contained a rather dirty list. First I cleaned the list using REPLACE to replace the variety of delimiters with comma. Then I used this excellent T-SQL function to extract the items from the list.
FN_STRING_HELPER_SPLIT( @text_to_split , @delimiter , @index_of_list_item_to_return ) kgalli
Here is an example of it in use,
select list , FN_STRING_HELPER_SPLIT(list,',',1) ,FN_STRING_HELPER_SPLIT(list,',',2)
Things to note,
- This is a scalar function, it will only return a single item from the list
- You have to choose how many columns you extract
- If you ask for a non-existent item it seems to return the last non-null item rather than null.
- This does not create additional records, there is no transposition or un-pivoting of the extracted fields. This can be done manually by creating a view that that unions two queries that select item 1 and item 2 into a single column called item.
- Extraction and un-pivoting can be done in a single operation using the APPLY function and table-valued functions but this code is conceptually complex.