Monday, February 20, 2012

newbie: processing records sequentially

Hi to all,
I'm a newbie to T-SQL but not to programming. What I want to do is
process every record in a table, and depending on the values in the
record itself, either copy the record to another table or update a
third table, and the delete the original record. Basically, it's a data
preparation area table where the records are imported with no error
check, and then validated against a set of conditions.

THE QUESTION IS: which approach should I follow? Is there a sort of
FORALL loop in T-SQL that I can execute on every record in the table?
Or shall I rely on other tools (VB Scripting, or...)? SQL Server 2000

Thanx to allIn SQL your goal should generally be to avoid processing data a row at
a time. This example may help you:

INSERT INTO AnotherTable (col1, col2, ...)
SELECT col1, col2, ...
FROM YourTable
WHERE ... ? /* unspecified */

UPDATE AnotherTable
SET ... ? /* unspecified */
WHERE ... /* uspecified */

DELETE FROM YourTable
WHERE ...

etc.

Hopefully this gives you the idea. If you need more help, please refer
to the following article which describes the best way to post a fuller
description of your problem:
http://www.aspfaq.com/etiquette.asp?id=5006

--
David Portas
SQL Server MVP
--|||David,
thanks for prompt response. I understand the process is somehow
reversed from a row-by-row cycle. It seems to me somehow inefficient,
though, to run a query against a third table twice in order to UPDATE
the first time and DELETE the second.

Cheers

David Portas wrote:
> In SQL your goal should generally be to avoid processing data a row at
> a time. This example may help you:
> INSERT INTO AnotherTable (col1, col2, ...)
> SELECT col1, col2, ...
> FROM YourTable
> WHERE ... ? /* unspecified */
> UPDATE AnotherTable
> SET ... ? /* unspecified */
> WHERE ... /* uspecified */
> DELETE FROM YourTable
> WHERE ...
> etc.
> Hopefully this gives you the idea. If you need more help, please refer
> to the following article which describes the best way to post a fuller
> description of your problem:
> http://www.aspfaq.com/etiquette.asp?id=5006|||It depends. INSERT, DELETE and UPDATE are separate operations so will
always require separate statements whatever method you use. The main
question is how best to optimise the WHERE selection criteria. You
didn't specify what those criteria are so I can only guess. It is at
least likely that putting the criteria in WHERE clauses, with suitable
indexes, will be much more efficient than the overhead of looping
through each row in a cursor. Understand that SQL is a declarative
language, unlike procedural languages that you are probably more
familiar with.Cursor processing (row by row) is rarely a good idea in
SQL and when manipulating data should usually be a last resort only.

--
David Portas
SQL Server MVP
--|||>> I'm a newbie to T-SQL but not to programming.<<

You missed the point of a declarative, set-oriented language and it
will take you about a year to un-learn your old mental models. I have
a book entitled SQL PROGRAMMING STYLE that deals with this problem in a
few chapters.

>> What I want to do is process every record [sic] in a table, and
depending on the values in the record [sic] itself, either copy the
record [sic] to another table or update a third table, and the delete
the original record [sic] . <<

This is a common problem; rows are not records, tables are not files
and columns are not fields. Totally differrent concepts. I will give
you the long painful details that I post on this topic if you ask.
Have you ever worked with LISP, FP, Prolog or a language that was not
based on procedures? This is a new way of thinking.

>> Basically, it's a data preparation area table where the records
[sic] are imported with no error check, and then validated against a
set of conditions. <<

This is usually done with an ETL tool these days, but you can do this
with a few statements. If you have an implementation of the new SQL-99
MERGE statement, then you can probably do it in one statement. It will
be a very complex statement, tho :)

BEGIN
UPDATE RawData
SET <column edits>;
DELETE GoodStuff
WHERE EXISTS
(SELECT *
FROM RawData
WHERE <<conditions>>;
INSERT INTO GoodStuff
SELECT a, b, c, ..
FROM RawData
WHERE <<conditions>>;
END;

First scrub the raw data, then remove the old data that is going to be
replaced, and finally insert the new clean data.|||moco (makelovenotspamdaldegan@.email.it) writes:
> thanks for prompt response. I understand the process is somehow
> reversed from a row-by-row cycle. It seems to me somehow inefficient,
> though, to run a query against a third table twice in order to UPDATE
> the first time and DELETE the second.

And with a loop there will be many queries against the table.

But as they say, seeing is believing. Go ahead and bench-mark. You will
be amazed...

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment