> On Mon, 20 Mar 2000 18:07:43 -0500, "Steve Anderson"
> <steve.ander...@homemail.com> wrote:
> > we have an interesting problem. I want to prevent certain records in
a
> >dataset (TQuery) from showing up in a report. I also do not want them to
be
> >included in any sum calculations. Adding a field in the database to
> >distinguish these records is not really an option right now, so I'm
trying a
> >coding solution. Do you guys have any ideas? I did attempt putting a
> >conditional Next in the AfterScroll event of the TQuery object, but it
> >seemed to corrupt/mixup the dataset pointers because I was getting
> >unexpected results (wrong record skipped and another record repeated). My
> >theory was probably flawed because I was causing event recursion. Anyhow,
do
> >you guys know how to do this? Any advice would be appreciated. Thank you,
> First you have to be able to define what makes a given row a duplicate
> or not a duplicate. Then you need to express that in SQL. Take the
> dataset below for example.
> FirstName LastName DateIn
> --------- -------- ----------
> Joe Smith 1/1/1999
> Joe Smith 3/1/2000
> John Jones 2/4/1999
> Joe Henry 5/6/1998
> Brighton Smith 12/31/1999
> Mary Jones 7/24/1997
> Mary Jones 2/1/2000
> There are two sets of duplicated records based on the combination of
> the FirstName and LastName fields: Joe Smith and Mary Jones. The
> difference between the record in each group is the value of the DateIn
> field. So this could be used to specify which is the duplicate (the
> ones in a group with less than the latest date) and which one to keep
> (the one with the latest date).
> A SELECT statement can use in its WHERE clause another SELECT acting
> as a subquery to define the criteria on which to restrict the effects
> of the data retrieval to a subset of the overall dataset. This
> situation requires two subqueries. The first is one for finding the
> groups where there are two or more records with the same FirstName and
> LastName. The second is one to determine the highest date for each
> group. When these two criteria are combined, they allow us to delete
> all but one record for each group of duplicates, the one with the
> latest date.
> The SELECT statement below retrieves all rows where there are two or
> more duplicates. Any records not included in this result set are those
> where they are the only record with a given combination of FirstName
> and LastName, i.e., not duplicates.
> SELECT K.*
> FROM "KillDupes.db" K
> WHERE EXISTS
> (
> SELECT K1.LastName, K1.FirstName, COUNT(K1.DateIn)
> FROM "KillDupes.db" K1
> WHERE (K.LastName = K1.LastName) AND
> (K.FirstName = K1.FirstName)
> GROUP BY K1.LastName, K1.FirstName
> HAVING (COUNT(K1.DateIn) > 1)
> )
> This statement retrieves the records with other than the highest date
> for their respective groups (these groups being based on the
> combination of the LastName and FirstName fields).
> SELECT K.*
> FROM "KillDupes.db" K
> WHERE NOT K.DateIn = ANY
> (
> SELECT MAX(K2.DateIn)
> FROM "KillDupes.db" K2
> WHERE (K.LastName = K2.LastName) AND
> (K.FirstName = K2.FirstName)
> )
> Now, combined the two SELECT queries act criteria to determine whether
> a record is deleted or not. To be included in this final result set a
> record must meet BOTH criteria. It must be in a group of two or more
> records AND its DateIn field must not contain the highest date for
> that group.
> SELECT *
> FROM "KillDupes.db" K
> /* Criteria #1 - grouping */
> WHERE EXISTS
> (
> SELECT K1.LastName, K1.FirstName, COUNT(K1.DateIn)
> FROM "KillDupes.db" K1
> WHERE (K.LastName = K1.LastName) AND
> (K.FirstName = K1.FirstName)
> GROUP BY K1.LastName, K1.FirstName
> HAVING COUNT(K1.DateIn) > 1
> )
> /* Criteria #2 - date field */
> AND NOT K.DateIn = ANY
> (
> SELECT MAX(K2.DateIn)
> FROM "KillDupes.db" K2
> WHERE ((K.LastName = K2.LastName) AND
> (K.FirstName = K2.FirstName))
> )
> ///////////////////////////////////////////////////////////
> Steve Koterski "No matter how cynical you get,
> Felton, CA it is impossible to keep up."
> -- Lily Tomlin