Hi,
I have a big table and want to make a plausibility check of its data.
The problem is that my query stops if there is an unexpected data type in one of the rows. But that is exactly what I want to filter out of the table with this query, so I can save the result as a new, correct table.
How can I write my SQL query so that if an error occurs, the query just resumes and the offending row simply isn't shown in my final query result?
I haven't found anything on this topic in my books or on the net ;-(.
Thx in advance.
You should provide additional info:
- what does the table look like?
- what is the unexpected data that breaks your code?
- which SQL Server version are you working with?
- is it a pure T-SQL approach or a mixed ADO.NET / SQL approach?
- can you post a short version of the T-SQL code here?|||- what does the table look like?
The table has 30 columns with different data in them. The data types are all nvarchar(50) after the flat-file txt import, but the values include dates, text, ints and floats. I will assign the correct data types to the columns later, because I have 80 such tables, and first I have to clean the wrong lines out of them and merge them.
- what is the unexpected data that breaks your code?
Very many possible things: a date where an int is expected, a float where an int is expected, and so on. The txt files I got have many data errors and shifted lines in them, and I have to correct them now.
- which SQL Server version are you working with?
SQL Server 2005
- is it a pure T-SQL approach or a mixed ADO.NET / SQL approach?
I don't know, but I don't think it's ADO.NET.
- can you post a short version of the T-SQL code here?
[Code]
SELECT TOP (100) PERCENT _Ti, _Pl, _Ve, [_Datum_Time], _Ere, _Sg, _Sm
FROM dbo.d6020900_ges
WHERE (_Ti < 1 OR _Ti > 1000)
   OR (_Pl < 0 OR _Pl > 15)
   OR (_Ve < 0 OR _Ve > 10000)
   OR (_Ere < -1 OR _Ere > 10)
   OR (_Sg < N'0' OR _Sg > N'356')
   OR (_Sm < N'0' OR _Sm > N'255')
ORDER BY _Ti, _Pl, _Ve, [_Datum_Time]
[/Code]
|||There are different approaches, depending on which operation you have to do...
First of all, don't expect this to be a simple task... I don't think you will be able to build a query like the one you posted: there are a lot of implicit conversions that simply cannot work in your case, and there's no way to tell SQL to "skip" a row when a conversion fails... you have to build a little "engine" that analyzes the data column by column.
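A minimal sketch of that kind of column-by-column check, using ISDATE and ISNUMERIC on your nvarchar columns (I'm only guessing which check applies to which column, so adjust it to your real layout):
[Code]
-- return the rows where a value cannot be converted to the expected type
SELECT *
FROM dbo.d6020900_ges
WHERE ISNUMERIC(_Ti) = 0            -- _Ti expected to be numeric
   OR ISNUMERIC(_Pl) = 0
   OR ISDATE([_Datum_Time]) = 0     -- [_Datum_Time] expected to be a date
[/Code]
Keep in mind that ISNUMERIC is quite permissive (it accepts things like '1e5' or currency symbols), so for a strict integer check a pattern test such as _Ti LIKE '%[^0-9]%' is often safer.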
If you have to correct the data in a table, this happens only once, so you can launch a long-running query, maybe using a cursor and evaluating on a per-row basis...
If this is the case you may also want to use a CLR stored proc: you can fetch your data into a DataSet and do standard manipulation/conversion in your favorite language (C#, VB.NET). I think the use of a CLR SP is fully justified by the improved flexibility in analyzing data; after all, string manipulation with the framework is a lot more powerful than in T-SQL.
If you have to repeatedly query the data to build a report or something like that... well, this is a nightmare. I strongly suggest you convert your data into a table that has the required data types (integers, datetime and so on); otherwise you could build a bunch of conversion functions and/or a view made up of those functions (or computed columns) to access the data. But, as you have probably already thought, that is not a good solution performance-wise: it is better to apply those functions only once to migrate your data into a brand new ad-hoc table.
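As an example of such a conversion function, here is a rough sketch (the name dbo.TryToInt is made up; on SQL Server 2005 there is no built-in "try convert", so the validation has to be done by hand):
[Code]
-- returns NULL instead of raising an error when the value is not a valid int
CREATE FUNCTION dbo.TryToInt (@s nvarchar(50))
RETURNS int
AS
BEGIN
    DECLARE @digits nvarchar(50);
    -- allow an optional leading minus sign, then require digits only
    SET @digits = CASE WHEN @s LIKE '-%' THEN SUBSTRING(@s, 2, 50) ELSE @s END;
    IF LEN(@digits) BETWEEN 1 AND 9 AND @digits NOT LIKE '%[^0-9]%'
        RETURN CAST(@s AS int);
    RETURN NULL;
END
[/Code]
You could then write SELECT dbo.TryToInt(_Ti) AS _Ti, ... in a view or in the migration query, with a similar function for each target data type.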
Anyway let me know if you need further info.|||Thx so far.
My biggest problem is the large number of wrong lines in my table, so converting in SQL fails, and importing the original txt flat file into a table with the correct data types fails as well.
Is there a way to import only the rows that are correct, without stopping at the wrong ones?|||You need to process your data row by row. This can be done
- using a cursor if you want to operate with a pure T-SQL solution
- using an external application (or a CLR stored proc) if you want more power/flexibility
I don't know any other way
You could also add a flag bit to each row that marks the row as dirty, so you can filter it out later... but again this requires that you process the table line by line.
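For example (IsDirty is just a name I made up):
[Code]
-- add a flag column that defaults to "clean"
ALTER TABLE dbo.d6020900_ges ADD IsDirty bit NOT NULL DEFAULT 0;
[/Code]
The flag then gets set inside the row-by-row loop, and your plausibility query becomes a simple WHERE IsDirty = 0.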
In T-SQL the loop may look something like this (I'm checking only two example columns here; extend the FETCH list and the IF to all 30 columns):
[Code]
DECLARE @date_col nvarchar(50), @int_col nvarchar(50);
DECLARE cur CURSOR FOR
    SELECT [_Datum_Time], _Ti FROM dbo.d6020900_ges;
OPEN cur;
FETCH NEXT FROM cur INTO @date_col, @int_col;
WHILE @@FETCH_STATUS = 0
BEGIN
    IF ISDATE(@date_col) = 0 OR ISNUMERIC(@int_col) = 0
    BEGIN
        -- do something with the dirty row: here I just set the flag from above
        UPDATE dbo.d6020900_ges SET IsDirty = 1 WHERE CURRENT OF cur;
    END
    FETCH NEXT FROM cur INTO @date_col, @int_col;
END
CLOSE cur;
DEALLOCATE cur;
[/Code]
Just a question: why don't you avoid inserting the bad rows when you import the txt file?|||Hm, what do you mean by avoiding the bad rows? The bad rows are already in my base txt files, and the files are too large to check and delete every single bad line by hand.
I don't know how to tell the import wizard that bad rows should not be imported. In my attempts the import stops as soon as a bad row is detected, and I have to correct the line in the txt file by hand, delete the table in the db and start the import again.
Is there a better way?|||> Hm, what do you mean by avoiding the bad rows?
You have an application that imports the text file into the system... correct? Well, in that case the validation must be performed by that application: the data is imported only after the validation has occurred, so you won't have any problems in the db.
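If you keep your current process (bulk-loading everything into the all-nvarchar table first), that validation step can also be a single T-SQL statement that moves only the clean rows into a properly typed table. A sketch, assuming a made-up target table dbo.d6020900_clean with real data types and showing only two columns:
[Code]
-- copy only the rows that pass the type checks into the typed table;
-- the CASE guards make sure the CAST is never attempted on a bad value
INSERT INTO dbo.d6020900_clean (_Ti, [_Datum_Time])
SELECT CASE WHEN _Ti NOT LIKE '%[^0-9]%' AND LEN(_Ti) BETWEEN 1 AND 9
            THEN CAST(_Ti AS int) END,
       CASE WHEN ISDATE([_Datum_Time]) = 1
            THEN CAST([_Datum_Time] AS datetime) END
FROM dbo.d6020900_ges
WHERE (_Ti NOT LIKE '%[^0-9]%' AND LEN(_Ti) BETWEEN 1 AND 9)
  AND ISDATE([_Datum_Time]) = 1;
[/Code]
The rows that fail the checks simply stay behind in the nvarchar staging table, where you can inspect or fix them.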
> Is there a better way?
I will be explicit, I hope you won't be offended:
1. Who is the mad person that thought up such a procedure? How is it possible that your "file producer" is not able to write a decent file with no errors? What would I do? Completely reject any file that contains errors.
2. Who designed the table with nvarchar data only? I hope you agree that is a really stupid thing... it's like trying to program an application with no data types, only pointers and bytes... that's prehistory.
|||O.K., thx, but the base txt files I receive are simply corrupt. I have already filtered a lot with VBA in Excel, but VBA and Excel are too slow to filter everything across my mass of files. The base txt files are exactly as I got them; I can't get new or better ones, I must live with them.|||
> I must live with them.
everyone has their own misfortunes ...