Tuesday, March 27, 2012

Foreach loop runs out of memory

Hi

I have a for each loop which steps through an ado recordset (approx. 5,000 rows), this passes two variables to an SQL statement which populates a second recordset (normally 8 to 10 rows). I use the second recordset in a dataflow task which was a simple Script which returns approximately 30 rows for inclusion in my destination table. The package runs for a while OK, although the loop appears to execute slowly, then I get the below message constantly repeated in the debug window.

[DTS.Pipeline] Information: The buffer manager detected that the system was low on virtual memory, but was unable to swap out any buffers. 174 buffers were considered and 174 were locked. Either not enough memory is available to the pipeline because not enough is installed, other processes are using it, or too many buffers are locked.

I have 2gb of virtual memory on my machine, and the recordsets are relatively small. Have I missed a seting some where?My guess is your problem is where you're creating 5,000 8-10 row recordsets. Instead of creating those recordsets from a SQL statement and consuming them in a Data Flow script, can you not execute the SQL statement in an OLE DB Source in your Data Flow?

Maybe you simplified your scenario for the sake of explanation, but if you can do the above, I wonder if you can't find a way to move that 5,000 iteration loop into the same SQL Statement as a set-based operation.

|||

Thanks for the reply, the 5000 row record set has two fields one integer the other nchar(5). The second recordset gets re-used in the loop so this will only have 10 rows max at a time and has 4 fields. Unfortunately I cannot change the Loop to SQL, I do not have any knowledge of set based operations, is this similar to a loop?

Regards

ADG

|||What SP level are you running? I looks like there was some memory related issues in pre SP versions of the products.|||

How is the second recordset being reused? Do you mean that you are repopulated the same variable with a new recordset?

Also, are you explicitly closing the recrodset you are using?

|||

Thanks for the reply.

Inside my Foreach loop container I have an Execute SQL task which uses the two variables from the Recordset being scrolled through, and populates my second recordset. Do I need to explicitly close the recordset inside the loop? If so how should the recordset be closed, is this a Script task?

|||

ADG wrote:

Thanks for the reply.

Inside my Foreach loop container I have an Execute SQL task which uses the two variables from the Recordset being scrolled through, and populates my second recordset. Do I need to explicitly close the recordset inside the loop? If so how should the recordset be closed, is this a Script task?

The outer recordset (A) probably isn't the issue - it's the second one (B) that I am talking about. You are using the second one in a data flow, right? Are you using a source script component to read recordset B? If so, call the recordset.Close() method (if you are using ADODB) or the DataTable.Dispose() method if you are using ADO.NET) after the loop to output the rows completes.

|||

Many thanks,

I will change the script component and give it a try. I am new to SQL Server / SSIS so it looks like II have made another beginners error.

|||

I have tried to close the recordset but the close method does not appear to be valid in the main code,( I tried Me.Variables.MyRecordSet2.close). Do I need to override one of the Class methods? The code I used is modified from one of Jamie Thomsons blogs and is as below:

Imports System

Imports System.Data

Imports System.Math

Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper

Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain

Inherits UserComponent

Public Overrides Sub CreateNewOutputRows()

'

Dim olead As New Data.OleDb.OleDbDataAdapter'Define an ADO.Net data adapter

Dim dt As New Data.DataTable'Define an ADO.Net DataTable

Dim row As System.Data.DataRow'Define an ADO.Net DataRow

Dim lastSI, SI As Int32

Dim strU(50) As String

Dim strU2(50) As String

Dim dblFactor(50) As Double

Dim x, y, j, k As Int32

Dim isFirst As Boolean

Dim strLComb, strComb As String

olead.Fill(dt, Me.Variables.MyRecordSet2)'Populate our DataTable from the adapter

lastSI = 0

strLComb = " "

SI = 0

x = 0

isFirst = True

For Each row In dt.Rows'Iterate over the rows in the table

strComb = row("SI").ToString & row("U").ToString

If strComb <> strLComb Then

If isFirst Then

isFirst = False

Else

For y = 1 To (x - 1)

For k = 2 To x

With OutputBuffer

.AddRow()

.Calculated = True

.Factor = dblFactor(k) / dblFactor(y)

.SI = SI

.U = strU2(y)

.U2 = strU2(k)

End With

Next

Next

End If

SI = CType(row("SI"), Integer)

strLComb = row("SI").ToString & row("U").ToString

x = 1

strU(x) = row("U").ToString

strU2(x) = row("U2").ToString

dblFactor(x) = CType(row("Factor"), Double)

Else

x = x + 1

strU(x) = row("U").ToString

strU2(x) = row("U2").ToString

dblFactor(x) = CType(row("Factor"), Double)

End If

Next

For y = 1 To (x - 1)

For k = 2 To x

With OutputBuffer

.AddRow()

.Calculated = True

.Factor = dblFactor(k) / dblFactor(y)

.SI = SI

.U = strU2(y)

.U2 = strU2(k)

End With

Next

Next

dt.Dispose()

End Sub

End Class

|||Thanks for showing us your code. I don't see why you need the recordsets at all. It looks like you could just as easily execute the query that creates the recordset in an OLE DB Source component and use this code in a custom transformation component. Instead of the rows coming into this code via Recordset, they come in via the component's input. This is what I suggested on 4/2.
|||Since you are using ADO.NET, you need to call the Dispose() method. However, Jay is correct in saying that the operation you are trying to perform would be much more efficient if it was not done in a script component.|||

Thanks again,

Can you point me in the right direction with the custom transformation. I have created the OLE DB which will return between 2 and 10 rows, I need to store this in an array to process. I assume that each row will pass into the script via Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer) . How do I catch all the rows then perform the output?

|||

ADG wrote:

Thanks again,

Can you point me in the right direction with the custom transformation. I have created the OLE DB which will return between 2 and 10 rows, I need to store this in an array to process. I assume that each row will pass into the script via Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer) . How do I catch all the rows then perform the output?

Take a look at this topic in books online: "Creating an Asynchronous Transformation with the Script Component"

Since your are not simply passing each row through, you'll need to use an asynchronous transform.

|||If you have your rows coming from an OLE DB Source, then you would change your script source to an asynchronous script transformation and do your processing on the incoming rows instead of shredding the recordset. Biggest difference is that you no longer have a FOR loop to iterate the rows. Instead ProcessInputRows gets called multiple times. Therefore some of your variables needed to be made global. Also, the rows you output after seeing all the input rows had to be moved to another method that gets called when the input rows are finished. I rearranged your code to demonstrate. Hopefully I didn't change the logic any.

Code Snippet


Dim lastSI As Int32 = 0
Dim SI As Int32 = 0
Dim strU(50) As String
Dim strU2(50) As String
Dim dblFactor(50) As Double
Dim x As Int32 = 0
Dim isFirst As Boolean = True
Dim strLComb As String = " "

Public Overrides Sub FinishOutputs()
Dim y, k As Int32
For y = 1 To (x - 1)
For k = 2 To x
With OutputBuffer
.AddRow()
.Calculated = True
.Factor = dblFactor(k) / dblFactor(y)
.SI = SI
.U = strU2(y)
.U2 = strU2(k)
End With
Next
Next
End Sub

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
'
' Add your code here
'
Dim strComb As String
Dim y, k As Int32

strComb = Row.SI & Row.U

If strComb <> strLComb Then

If isFirst Then
isFirst = False
Else

For y = 1 To (x - 1)
For k = 2 To x
With OutputBuffer
.AddRow()
.Calculated = True
.Factor = dblFactor(k) / dblFactor(y)
.SI = SI
.U = strU2(y)
.U2 = strU2(k)
End With
Next
Next
End If

SI = CType(Row.SI, Integer)
strLComb = Row.SI & Row.U

x = 1
strU(x) = Row.U
strU2(x) = Row.U2
dblFactor(x) = CType(Row.Factor, Double)

Else
x = x + 1
strU(x) = Row.U
strU2(x) = Row.U2
dblFactor(x) = CType(Row.Factor, Double)
End If

End Sub


|||

Many thanks for the post. I have incorporated some of the above into my final package

However, I have gone back to using a for each loop to generate a OLE DB source, which feeds an asynchronous script transformation . The above works but does not create all the records that I needed. The process of adding rows to the table in turn creates more possible row combinations. My memory issue was eventually resolved by installing SP2. My package now works, but is slow. Below is an exapmle of my data problem ( data for two SKU's as in the original source, and as in my final table )

SI U1 U2 Factor 1 CT BG 192 1 CT EA 24 1 CT KG 10.9 1 CT M3 0.001 1 CT SM 143 1 PL CT 49 1 PL HT 1400 2 CT BG 300 2 CT EA 30 2 CT KG 2.028 2 CT M3 0.001 2 CT SM 125 2 PL CT 96 2 PL HT 1400

Becomes

SI U1 U2 Factor Calculated 1 BG BG 1.00000000 1 1 BG CT 0.00520833 1 1 BG EA 0.12500000 1 1 BG KG 0.05677083 1 1 BG M3 0.00000521 1 1 BG PL 0.00010629 1 1 BG SM 0.74479167 1 1 CT BG 192.00000000 0 1 CT CT 1.00000000 1 1 CT EA 24.00000000 0 1 CT KG 10.90000000 0 1 CT M3 0.00100000 0 1 CT PL 0.02040816 1 1 CT SM 143.00000000 0 1 EA BG 8.00000000 1 1 EA CT 0.04166667 1 1 EA EA 1.00000000 1 1 EA KG 0.45416667 1 1 EA M3 0.00004167 1 1 EA PL 0.00085034 1 1 EA SM 5.95833333 1 1 HT HT 1.00000000 1 1 HT PL 0.00071429 1 1 KG BG 17.61467890 1 1 KG CT 0.09174312 1 1 KG EA 2.20183486 1 1 KG KG 1.00000000 1 1 KG M3 0.00009174 1 1 KG PL 0.00187231 1 1 KG SM 13.11926606 1 1 M3 BG 192000.00000000 1 1 M3 CT 1000.00000000 1 1 M3 EA 24000.00000000 1 1 M3 KG 10900.00000000 1 1 M3 M3 1.00000000 1 1 M3 PL 20.40816327 1 1 M3 SM 143000.00000000 1 1 PL BG 9408.00000000 1 1 PL CT 49.00000000 0 1 PL EA 1176.00000000 1 1 PL HT 1400.00000000 0 1 PL KG 534.10000000 1 1 PL M3 0.04900000 1 1 PL PL 1.00000000 1 1 PL SM 7007.00000000 1 1 SM BG 1.34265734 1 1 SM CT 0.00699301 1 1 SM EA 0.16783217 1 1 SM KG 0.07622378 1 1 SM M3 0.00000699 1 1 SM PL 0.00014271 1 1 SM SM 1.00000000 1 2 BG BG 1.00000000 1 2 BG CT 0.00333333 1 2 BG EA 0.10000000 1 2 BG KG 0.00676000 1 2 BG M3 0.00000333 1 2 BG PL 0.00003472 1 2 BG SM 0.41666667 1 2 CT BG 300.00000000 0 2 CT CT 1.00000000 1 2 CT EA 30.00000000 0 2 CT KG 2.02800000 0 2 CT M3 0.00100000 0 2 CT PL 0.01041667 1 2 CT SM 125.00000000 0 2 EA BG 10.00000000 1 2 EA CT 0.03333333 1 2 EA EA 1.00000000 1 2 EA KG 0.06760000 1 2 EA M3 0.00003333 1 2 EA PL 0.00034722 1 2 EA SM 4.16666667 1 2 HT HT 1.00000000 1 2 HT PL 0.00071429 1 2 KG BG 147.92899408 1 2 KG CT 0.49309665 1 2 KG EA 14.79289941 1 2 KG KG 1.00000000 1 2 KG M3 0.00049310 1 2 KG PL 0.00513642 1 2 KG SM 61.63708087 1 2 M3 BG 300000.00000000 1 2 M3 CT 1000.00000000 1 2 M3 EA 30000.00000000 1 2 M3 KG 2028.00000000 1 2 M3 M3 1.00000000 1 2 M3 PL 10.41666667 1 2 M3 SM 125000.00000000 1 2 PL BG 28800.00000000 1 2 PL CT 96.00000000 0 2 PL EA 2880.00000000 1 2 PL HT 1400.00000000 0 2 PL KG 194.68800000 1 2 PL M3 0.09600000 1 2 PL PL 1.00000000 1 2 PL SM 12000.00000000 1 2 SM BG 2.40000000 1 2 SM CT 0.00800000 1 2 SM EA 0.24000000 1 2 SM KG 0.01622400 1 2 SM M3 0.00000800 1 2 SM PL 0.00008333 1 2 SM SM 1.00000000 1

The project takes about 30 minutes to run with 21,500 input rows and has 135,900 final rows. I guess if this process could be done with SQL it would be far more efficient, but I cannot see how this can be done when each row added can gerate more possible combinations. If anyone out there can see away of performing the above transformation with pure SQL I would be grateful for your ideas.

No comments:

Post a Comment