tag:blogger.com,1999:blog-2377727724851115192.post776384389201872901..comments2024-03-23T15:56:15.328+05:30Comments on Data is everywhere, but?: Dropping DuplicatesDinesh Asankahttp://www.blogger.com/profile/15933617650741277563noreply@blogger.comBlogger5125tag:blogger.com,1999:blog-2377727724851115192.post-2535540892315999582011-02-05T17:18:57.630+05:302011-02-05T17:18:57.630+05:30You need read MS-SQL book online, and understand T...You need read MS-SQL book online, and understand T-SQL;)<br /><br />http://msdn.microsoft.com/en-us/library/aa299742(v=SQL.80).aspx <br /><br />Dropping statement for deleting a table not a duplicates in a row? <br /><br />- You can delete duplicated row by folowing http://support.microsoft.com/kb/139444 <br />temp table can be created by Select into statement<br /><br />-Use DISTINCTROW <br /><br />Posted by Osama Abu-ArishehDinesh Asankahttps://www.blogger.com/profile/15933617650741277563noreply@blogger.comtag:blogger.com,1999:blog-2377727724851115192.post-87827961281838283592011-02-04T15:48:48.431+05:302011-02-04T15:48:48.431+05:30Michael Austin described one way to remove duplica...Michael Austin described one way to remove duplicates in MS SQL Server. There are other ways of accomplishing the same goal and that is the beauty of most products in our industry. Dinesh described a way of doing it using MongoDB and Kenneth Ambrose made some recommendations for Oracle.<br /><br />My only advice is to make sure you analyze the whole data row before removing duplicate key values and that entire rows are duplicate and not just the columns to be used as a unique constraint. <br /><br />By Sal YoungDinesh Asankahttps://www.blogger.com/profile/15933617650741277563noreply@blogger.comtag:blogger.com,1999:blog-2377727724851115192.post-30449137860877930132011-02-04T15:47:58.356+05:302011-02-04T15:47:58.356+05:30I don't think I'd want to do it that way w...I don't think I'd want to do it that way with a large table- would at least double the TLOG use and the physical space used by the table's data. <br /><br />There are ways to dedupe without making a new copy of the data. Research rownum and windowing functions... <br /><br />Having said that any database designer who creates DDL that allows dupes in a table should be fired anyway. <br />regards <br /><br />By Kenneth AmbroseDinesh Asankahttps://www.blogger.com/profile/15933617650741277563noreply@blogger.comtag:blogger.com,1999:blog-2377727724851115192.post-39188130887514998112011-02-04T14:19:44.612+05:302011-02-04T14:19:44.612+05:30• Here's a SQL Server method:
1. Create a te...• Here's a SQL Server method: <br /><br />1. Create a temp table with the same structure as the source table. <br />2. Create a unique index on the temp table with "ignore dupe key" <br />3. Insert the source table into the temp table (duplicates will be removed) <br />4. Truncate the source table <br />5. Insert the temp table into the source table. <br /><br />By Michael AustinDinesh Asankahttps://www.blogger.com/profile/15933617650741277563noreply@blogger.comtag:blogger.com,1999:blog-2377727724851115192.post-16512912698584140042011-02-04T14:19:12.443+05:302011-02-04T14:19:12.443+05:30Here is a general SQL method that will log deletes...Here is a general SQL method that will log deletes and that should work with modern, ANSI compliant databases supporting standard OLAP functions. In this example tab1 has 3 nullable smallint columns, named not very imaginatively col1, col2, col3. I ran this in DB2 Personal Edition V9.7.1 just now: <br /><br />delete from <br />(select col1,col2,col3, <br />row_number() over (order by col1,col2,col3) RN, <br />rank() over (order by col1, col2, col3) RNK <br />from tab1) temp <br />where RN != RNK <br /><br />Here is a general SQL method that will log deletes and that should work with modern, ANSI compliant databases supporting standard OLAP functions. In this example tab1 has 3 nullable smallint columns, named not very imaginatively col1, col2, col3. I ran this in DB2 Personal Edition V9.7.1 just now: <br /><br />delete from <br />(select col1,col2,col3, <br />row_number() over (order by col1,col2,col3) RN, <br />rank() over (order by col1, col2, col3) RNK <br />from tab1) temp <br />where RN != RNK <br /><br />You can easily modify this query to select and verify qualifying rows before they are deleted. In fact, in DB2, you can do both report the deleted rows and run the delete at the same time. No idea if there is a SQL Server equivalent as I eschew Micro$loth.Note: this is a general method, posted for interest only, and I would not use it in DB2 for tables of significant cardinality.<br /><br />By Alex LevyDinesh Asankahttps://www.blogger.com/profile/15933617650741277563noreply@blogger.com