Bulk insert operation into Parent and Child table in SQL

Solution for Bulk insert operation into Parent and Child table in SQL
is Given Below:

For our current project(.Net Core), we require a migration of old data where the data is in an excel sheet in plain format. We need to import the data and then create those data into our database. There are multiple tables involved in this migration. Suppose the tables are:

table Parent(
  Id UNIQUEIDENTIFIER PRIMARY KEY,
  Name varchar(120),
  value varchar(150)
)

table Child1(
  Id UNIQUEIDENTIFIER PRIMARY KEY,
  childvalue1 varchar(120),
  childvalue2 varchar(150),
  P_ID UNIQUEIDENTIFIER foreign_key # has an index
)

table Child2(
  Id UNIQUEIDENTIFIER PRIMARY KEY,
  child2value1 varchar(200),
  P_ID UNIQUEIDENTIFIER foreign_key # has an index
)

The relationship is: Parent has 1-to-many Child1,Child2.

The parent table has a reference to table A.

The format the data are present in excel sheet is:
Name - value - childvalue1 - childvalue2 - child2value1

There are thousands of rows in this format.

What I want to do is:

  • copy the excel data into a table
  • create a stored procedure to insert into the parent and child tables

My concern here is:

  • How can I map the parent id to the children while uploading the data row by row?
  • while inserting the entries Id also has to be generated and the id is of type GUID

For bulk insert from an Excel file to a database if you want to have a performant application follow this approach.

  1. First of all, create a temp table corresponding to your excel file columns and one more column with the name of InsertionKey.
  2. Insert all your input data from excel with the same InsertionKey( generate a new GUID) and all your excel rows will have the same key. ( use Entity framework BulkInsertAsync or SqlBulkCopy for bulk insert. you also can check my GitHub repo for benchmarking BulkInsert in .Net)
  3. after data are inserted, send the GUID as a parameter to a store procedure

Now, you can easily process all your inserted data by fetching them from the temp table with the key.

Example:

your excel file columns:

Item1,Red,25

Item2,Black,32

Item3,Orange,44

Generate a new Guid before bulk insert

var insertionKey = Guid.NewGuid();

now bulk insert into DB:

Item1,Red,25,insertionKey

Item2,Black,32,insertionKey

Item3,Orange,44,insertionKey

now call the store procedure with the Insertionkey

YourStorProcedure(Insertionkey)

Inside the StorProcedure

select *  into #temp from temptable where insertionKey = insertionKeyParameter

here you can read inserted rows and after that process them as you want.

Here a sample script that gives a concept of how you can do it. It relies on the values being unique, that is, a given name/value only has one meaning

First you need to load your excel data into the Stage table.

Then you can use T-SQL to farm these values out to three table. By joining back on values you can retrieve the parent id’s

Note there’s no looping here, it’s a set based solution.

DECLARE @Parent TABLE (
  Id UNIQUEIDENTIFIER PRIMARY KEY DEFAULT newsequentialid() NOT NULL,
  Name varchar(120),
  value varchar(150)
)

DECLARE @Child1 TABLE(
  Id UNIQUEIDENTIFIER PRIMARY KEY DEFAULT newsequentialid() NOT NULL,
  childvalue1 varchar(120),
  childvalue2 varchar(150),
  P_ID UNIQUEIDENTIFIER
)

DECLARE @Child2 TABLE (
  Id UNIQUEIDENTIFIER PRIMARY KEY DEFAULT newsequentialid() NOT NULL,
  child2value1 varchar(200),
  P_ID UNIQUEIDENTIFIER
)


-- This is the stage table that excel data is loaded into
DECLARE @Stage  TABLE (
    COL1 VARCHAR(100),
    COL2 VARCHAR(100),
    COL3 VARCHAR(100),
    COL4 VARCHAR(100),
    COL5 VARCHAR(100)
    )

-- Load sample data
INSERT INTO @Stage(COL1,COL2,COL3,COL4,COL5)
VALUES 
('Name1','value1','childvalue1','childvalue2','child2value1'),
('Name1','value1','childvalue3','childvalue4','child4value1'),
('Name1','value1','childvalue3','childvalue4','child4value2'),
('Name1','value1','childvalue3','childvalue4','child4value3'),
('Name2','value2','childvalue8','childvalue5','child5value1'),
('Name2','value2','childvalue8','childvalue6','child6value1'),
('Name2','value2','childvalue8','childvalue6','child6value2')

-- Save top level, generating id's
INSERT INTO @Parent(Name,Value)
SELECT DISTINCT COL1,COL2 FROM @Stage

-- save children, looking up id's based on actual values
INSERT INTO @Child1(P_ID,childvalue1,childvalue2)
SELECT DISTINCT P.Id, S.COL3,S.COL4 
FROM @Stage S
INNER JOIN @Parent P
-- relies on these values being unique
ON S.COL1 = P.Name
AND S.COL2 = P.value

-- Check results    
SELECT * FROM @Parent
SELECT * FROM @Child1