Wednesday, October 30, 2013

Data Model Capabilities

In SQL Server 2012, there is a new data model called Tabular, which is an addition to the existing Multidimensional model.

So what are the limits for Tabular data model?

Feature                        Multidimensional    Tabular

Calculated measures            Yes                 Yes
Distinct count                 Yes                 Yes (via DAX)
KPIs                           Yes                 Yes (not trends)
Linked objects                 Yes                 No
Many-to-many relationships     Yes                 No
Parent-child hierarchies       Yes                 Yes (via DAX)
Semi-additive measures         Yes                 Yes
Translations                   Yes                 No
User-defined hierarchies       Yes                 Yes
Custom assemblies              Yes                 No
Custom rollups                 Yes                 No
Writeback                      Yes                 No
Data mining                    Yes                 No

If you require data mining, writeback, or translations within a data model, these are available only in multidimensional models.

Thursday, October 17, 2013

Microsoft Launches SQL Server 2014 Preview 2

On Wednesday Microsoft announced the release of SQL Server 2014 Community Technology Preview 2 (CTP2).

SQL Server 2014 was announced a few months ago. It carries forward much of the technology of SQL Server 2012, which shipped about two years ago. However, Microsoft is stressing several new capabilities in SQL Server 2014 CTP2: in-memory online transaction processing (OLTP) technology that formerly went by the code name "Hekaton," and the ability to use SQL Server 2014 with Windows Azure for scalability, as well as backup and disaster recovery support.

Microsoft releases SQL Server 2014 CTP2

Microsoft SQL Server 2014 Community Technology Preview 2

Sunday, October 13, 2013

Ford drives in the right direction with big data

Nowadays, Ford uses big data to find out what its customers want and to develop better cars faster. For a product built from 20,000 to 25,000 different parts, big data seems to be the only way forward, and Ford is betting heavily on it.

Ford actually opened a lab in Silicon Valley to improve its cars with big data. To improve its cars' fuel consumption, safety, quality and emissions, Ford gathers data from over four million cars through in-car sensors and remote application management software. All data is analysed in real time, giving engineers valuable information to notice and solve issues as they occur, and to understand how the car responds in different road and weather conditions and to any other forces that could affect it.

Ford is also installing numerous sensors in its cars to monitor behaviour. It installs over 74 sensors per car, including sonar, cameras, radar, accelerometers, temperature sensors and rain sensors. As a result, its Energi line of plug-in hybrid cars generates over 25 gigabytes of data every hour. This data is sent back to the factory for real-time analysis and returned to the driver via a mobile app. The cars in its testing facility generate up to 250 gigabytes of data per hour from smart cameras and sensors.


Tuesday, October 8, 2013

Getting Started with Data Mining in SQL Server

As database professionals, we typically work in a field of exact science. For example, a common practice in business intelligence (BI) solutions is creating duplicate copies of data sets, then comparing the results from the different sources to make sure they're the same. If you extract five years' worth of data from an application's database and put it into a data mart, the results in the data mart must be the same as the results in the application's database, even if the table structures were changed and older records were archived. You might build a cube or semantic model and again check to make sure the results are exactly the same as the source system. If the numbers don't add up, the results are rejected because you know that something is wrong and must be corrected. I have to confess that not getting a conclusive result when working on a tough data problem sometimes keeps me up at night.

Read more

Saturday, September 28, 2013

Get a SQL Server Environment in 5 minutes!

Do you need a cloud SQL Server environment? CloudShare can provide one.


An account stays active for two weeks. If you need to connect remotely, you need to change the SQL Server port to one of the following port numbers:

3389, 8000, 30015, 50013, 50014

Monday, September 16, 2013

September 2013 meet-up of the SS SLUG

The September 2013 meet-up of the SS SLUG is happening this week. See the image below for details.


Tuesday, September 10, 2013

Weka 3: Data Mining Software in Java

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

Found only on the islands of New Zealand, the Weka is a flightless bird with an inquisitive nature.

Weka is open source software issued under the GNU General Public License.

Data Mining with Weka, a 5-week MOOC starting on September 9th 2013, is now open for enrolment.

Thursday, August 22, 2013

Five Myths About Big Data

Samuel Arbesman, an applied mathematician and network scientist, is a senior scholar at the Ewing Marion Kauffman Foundation and the author of “The Half-Life of Facts.” Follow him on Twitter: @Arbesman.

Big data holds the promise of harnessing huge amounts of information to help us better understand the world. But when talking about big data, there’s a tendency to fall into hyperbole. It is what compels contrarians to write such tweets as “Big Data, n.: the belief that any sufficiently large pile of s--- contains a pony.” Let’s deflate the hype.

1. “Big data” has a clear definition.

The term “big data” has been in circulation since at least the 1990s, when it is believed to have originated in Silicon Valley. IBM offers a seemingly simple definition: Big data is characterized by the four V’s of volume, variety, velocity and veracity. But the term is thrown around so often, in so many contexts — science, marketing, politics, sports — that its meaning has become vague and ambiguous.

2. Big data is new.

It’s true that today we can mine massive amounts of data — textual, social, scientific and otherwise — using complex algorithms and computer power. But big data has been around for a long time. It’s just that exhaustive datasets were more exhausting to compile and study in the days when “computer” meant a person who performed calculations.

Vast linguistic datasets, for example, go back nearly 800 years. Early biblical concordances — alphabetical indexes of words in the Bible, along with their context — allowed for some of the same types of analyses found in modern-day textual data-crunching.

The sciences also have been using big data for some time. In the early 1600s, Johannes Kepler used Tycho Brahe’s detailed astronomical dataset to elucidate certain laws of planetary motion. Astronomy in the age of the Sloan Digital Sky Survey is certainly different and more awesome, but it’s still astronomy.

3. Big data is revolutionary.

When a phenomenon or an effect is large, we usually don’t need huge amounts of data to recognize it (and science has traditionally focused on these large effects). As things become more subtle, bigger data helps. It can lead us to smaller pieces of knowledge: how to tailor a product or how to treat a disease a little bit better. If those bits can help lots of people, the effect may be large. But revolutionary for an individual? Probably not.

4. Bigger data is better.

In science, some admittedly mind-blowing big-data analyses are being done. In business, companies are being told to “embrace big data before your competitors do.” But big data is not automatically better.

Really big datasets can be a mess. Unless researchers and analysts can reduce the number of variables and make the data more manageable, they get quantity without a whole lot of quality. Give me some quality medium data over bad big data any day.

5. Big data means the end of scientific theories.

Chris Anderson argued in a 2008 Wired essay that big data renders the scientific method obsolete: Throw enough data at an advanced machine-learning technique, and all the correlations and relationships will simply jump out. We’ll understand everything.

But you can’t just go fishing for correlations and hope they will explain the world. If you’re not careful, you’ll end up with spurious correlations. Even more important, to contend with the “why” of things, we still need ideas, hypotheses and theories. If you don’t have good questions, your results can be silly and meaningless.


Monday, June 24, 2013

NoSQL and Related systems by feature


I got this from a data science course on the web.

EC -> Eventual Consistency

Saturday, June 22, 2013

Huge collection of Free Microsoft SQL Server eBooks

5 Tips for a Smooth SSIS Upgrade to SQL Server 2012
A Hitchiker's Guide to Microsoft StreamInsight Queries
Data Mining Extensions (DMX) Reference
Data Quality Services

High Availability Solutions
Master Data Services
Monitor and Tune for Performance
Multidimensional Expressions (MDX) Reference

SQL Server Distributed Replay
Transact-SQL Data Definition Language (DDL) Reference
XQuery Language Reference
Extracting and Loading SharePoint Data in SQL Server Integration Services

Integration Services: Extending Packages with Scripting
Multidimensional Model Programming
Optimized Bulk Loading of Data into Oracle
Planning Disaster Recovery for Microsoft SQL Server Reporting Services in SharePoint Integrated Mode

SQL Server 2012 Tutorials: Reporting Services
SQL Server 2012 Tutorials: Writing Transact-SQL Statements
SQL Server Community FAQs Manual

Copy only backup feature Modification in SQL Server 2012

Copy-only backups were introduced in SQL Server 2005 so that DBAs have an option of taking ad-hoc backups without breaking the backup sequence. In SQL Server 2005, copy-only backup was available only as a T-SQL option; in SQL Server 2008, the feature came to the UI.

In SQL Server 2012, there is another UI change!

Copy-only backup has no meaning for differential backups, since an ad-hoc differential backup does not affect any backup sequence.

However, in SQL Server 2008 you can still take a copy-only differential backup.

In SQL Server 2012, when you select the Differential backup type, the copy-only option is disabled.

You can read more about copy-only backups here.

Saturday, June 15, 2013

New Features in SQL Server 2014

Although it doesn’t seem possible so soon, Microsoft announced SQL Server 2014 at this year’s TechEd 2013 conference in New Orleans. Quentin Clark, Microsoft Corporate Vice President for SQL Server, said that Microsoft is getting ready for the upcoming SQL Server 2014 release. Some of the most important new features in SQL Server 2014 include the following:

In-Memory OLTP Engine
The new In-Memory OLTP Engine (formerly code-named Hekaton) will provide OLTP performance improvements by moving selected tables into memory. The In-Memory OLTP Engine works with commodity hardware and won't require any application code changes. A built-in wizard will help you choose which tables go in memory and select the stored procedures that will be compiled into machine code for high-performance execution.

EdgeNet, an early adopter, saw a 7x performance increase with no code changes. Microsoft claims that some applications can achieve a 50x performance boost using the new In-Memory OLTP engine.

Improved Scalability
SQL Server 2014 will have the ability to scale up to 640 logical processors and 4TB of memory in a physical environment. It can scale to 64 virtual processors and 1TB of memory when running in a virtual machine (VM). New buffer pool enhancements increase performance by extending SQL Server’s in-memory buffer pool to SSDs for faster paging.

Windows Azure Integrated Backup
The new backup option is integrated into SQL Server Management Studio (SSMS). It lets you back up a SQL Server database to Windows Azure. You can also use it to quickly restore database backups to an Azure VM.

Azure Integrated AlwaysOn Availability Groups
AlwaysOn Availability Groups have also been integrated with Azure, providing AlwaysOn capabilities in the cloud. AlwaysOn Azure integration enables you to create asynchronous Availability Group replicas in Azure for disaster recovery.

Like the new Azure backup feature, the Azure AlwaysOn Availability options are completely integrated into SSMS. Other enhancements to AlwaysOn Availability Groups include the ability to have up to eight replicas—up from four in SQL Server 2012.

Business Intelligence and Data Visualization Enhancements
SQL Server 2014 will include the new data visualization tool code-named Data Explorer. Data Explorer enables data analysis in Microsoft Excel, and it can work with a wide variety of sources, including relational, structured, and semi-structured data such as OData, Hadoop, and the Azure Marketplace.

The new feature, code-named GEOFlow, will be able to provide visual data mapping in Excel. You can download the previews for Data Explorer and GEOFlow from the Microsoft website. Other BI enhancements include the ability for Power View to work against multidimensional cube data in addition to tabular data models.

Improved Integration with Windows Server 2012
SQL Server 2014 will also provide support for Windows Server 2012’s new Storage Spaces feature. Storage Spaces enables you to create pools of tiered storage that can improve application availability and performance. SQL Server 2014’s Resource Governor can take advantage of Windows Server 2012’s automated storage tiering. Plus, you can use the Resource Governor to manage and limit application IO utilization.

Microsoft skipped the traditional R2 release that it normally puts out between major releases because of the significant changes it made to the database engine to support the new In-Memory OLTP Engine. A preview of the SQL Server 2014 release is expected in late June and the general availability of SQL Server 2014 is planned for early 2014. You can read more in the Microsoft SQL Server Blog.

Tuesday, June 11, 2013

Avoiding Parameter Sniffing in SQL Server

Parameter sniffing is when SQL Server compiles a stored procedure's execution plan using the first parameter values it is executed with, and then reuses that plan for subsequent executions regardless of the parameters. Read more about parameter sniffing and how to avoid it from here.
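One common mitigation is to recompile on every execution so the plan always matches the current parameter value. A minimal sketch (the table, procedure and parameter names here are hypothetical):

```sql
CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerID int
AS
BEGIN
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
    -- Forces a fresh plan per execution, at the cost of extra compilation
    OPTION (RECOMPILE);
END
```

Copying the parameter into a local variable, or using OPTIMIZE FOR, are other common workarounds.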

Tuesday, June 4, 2013

SQL Server 2014 & the Data Platform



SQL Server 2014 due out in late 2013, in-memory OLTP a big feature

SQL Server 2014, due out late this year, has an in-memory capability for online transaction processing (OLTP) with table-level granularity and doesn't require expensive hardware like other in-memory databases, according to Microsoft.

Microsoft announced SQL Server 2014 at its TechEd conference in New Orleans this week. A technology preview download will be available this month, and the database is expected to be generally available in late 2013. Redmond had already said last fall that in-memory would be a feature of the next version and previewed it during the Professional Association for SQL Server (PASS) Summit in Seattle under the codename "Hekaton." What's new this week is Microsoft's announcement of the release date.


Saturday, June 1, 2013


When creating an index in many database systems, we use the CREATE INDEX statement. In MongoDB, however, the command is ensureIndex. The reason is that ensureIndex ensures there is an index on the given attribute, whereas CREATE INDEX creates an index with a given name on the given columns.

Also, if you execute CREATE INDEX twice, the second execution will fail; executing ensureIndex multiple times will not.
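For example, in the mongo shell (the collection and field names are illustrative; this assumes a running MongoDB instance):

```javascript
// Safe to run repeatedly: the index is created only if it does not already exist
db.zipcodes.ensureIndex({ state: 1 });
db.zipcodes.ensureIndex({ state: 1 }); // second call is a no-op, not an error
```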

Monday, May 27, 2013

Compress Backup Information in Backup and Restore Events

In SSMS, there is a report called “Backup and Restore Events”.


If you look at the latest backups, the reported backup sizes are almost identical (~339 MB). However, the actual physical file size is 79 MB.

So the report and the physical file size differ.

You can measure the file size by running a query against the msdb backup history tables.

SELECT
    bs.backup_finish_date,
    mf.physical_device_name,
    bs.backup_size / 1024 AS backup_size_KB,
    bs.compressed_backup_size / 1024 AS compressed_backup_size_KB
FROM dbo.backupset AS bs
INNER JOIN dbo.backupmediafamily AS mf
    ON bs.media_set_id = mf.media_set_id
WHERE bs.database_name = 'AdventureWorks2012'
ORDER BY bs.backup_finish_date DESC;


So here you can see that the correct value is obtained by querying the compressed_backup_size column rather than the backup_size column. It seems the SSMS report uses the backup_size column rather than compressed_backup_size, which looks like a bug!

Sunday, May 26, 2013

How Vertica Was the Star of the Obama Campaign, and Other Revelations

The 2012 Obama re-election campaign has important implications for organizations that want to make better use of big data. The hype about its use of data is certainly justified, but a lesser-noticed aspect of the campaign ran against another kind of data hype we’ve all heard: the Silicon Valley hype around Hadoop that goes too far and claims an unreasonably large role for Hadoop. One of the most critical contributors to the Obama campaign’s success was the direct access it had to a massive database of voter data stored in Vertica.


Wednesday, May 22, 2013

Monday, May 20, 2013

Before Triggers in SQL Server

There are lots of questions asking whether there are before triggers in SQL Server. There is nothing called a BEFORE trigger in SQL Server.

What is the requirement for the Before Trigger?

Let us say you want to verify some values against another table before inserting a row. In SQL Server, you can use an INSTEAD OF trigger for this.

CREATE TRIGGER tr_data_before ON Table_Data
INSTEAD OF INSERT AS ...

However, in Oracle and DB2 there are Before Triggers.

Sunday, May 19, 2013

What Operation Type Invoked a Trigger?

This was a question brought to my attention, and I didn't have an answer. After searching, I realized that there is no direct function for it, so I came up with the following solution.

I used the inserted and deleted virtual tables.

Operation   inserted                 deleted
INSERT      rows just inserted       (empty)
DELETE      (empty)                  rows just deleted
UPDATE      modified row contents    original row contents

With the above cases, I came up with following trigger.

CREATE TRIGGER trg_data_ins_del_upd
ON Data
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    DECLARE @ins int, @del int

    SELECT @ins = COUNT(*) FROM inserted
    SELECT @del = COUNT(*) FROM deleted

    IF @ins > 0 AND @del > 0
        INSERT INTO Operation (Operation) VALUES ('Update')
    ELSE IF @ins > 0
        INSERT INTO Operation (Operation) VALUES ('Insert')
    ELSE IF @del > 0
        INSERT INTO Operation (Operation) VALUES ('Delete')
END

However, I feel that in the above case you are better off having three separate triggers than loading everything into one trigger.

Wednesday, May 15, 2013

Infrastructure as a Service for SQL Server

SQL Server virtual images are available for SQL Server 2008 R2 and SQL Server 2012 in three editions: Web, Standard and Enterprise. If you have an existing SQL Server license with Software Assurance, as an additional benefit you can move your existing license to Windows Azure and simply pay for compute and storage.

Prices are determined based on

  • Size of the database
    • Web Edition if size of database < 5 GB
    • Business Edition if size of database < 150 GB
    • Least unit for which prices are charged is a GB. Example: If a database is 2.4 GB,  charges will be levied for 3 GB
  • Computing power required
  • Version of License

Your free trial includes:

Virtual machines & cloud services: 750 compute hours per month

Monday, May 13, 2013

Maintenance Plan Blunders - 1

Maintenance Plans seem to be somewhat popular with novice DBAs, simply because you don't have to think much to configure one.

However, today I came across two “interesting” cases. I will explain one of them here.



According to this plan, it takes a full backup and, just after the full backup, a differential backup!

A differential backup contains all the data pages modified since the last full backup. Since these plans run at midnight, it is highly unlikely that there are any modifications between the full backup and the differential backup. This means the differential backup should be scheduled in a different plan.

Friday, May 3, 2013

Search in MongoDB

I came across this incident while working with MongoDB. I am not sure whether it is the expected behavior, so I thought of posting it.

I have a collection called zipcodes which has following attributes.


So, I needed to get the count for two states, and I executed the following script.


It returned 1596, which was incorrect. Then, to dig into the details, I started by interchanging the values, and I got a different result.


Then I removed the first row and executed it again.


Oops, I got the same value as before. Then I realized that whenever we pass values for the same attribute, only the last one is taken. So for the above case I should use $in.

This behavior is the same for insert, update, find, etc.
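The behavior can be seen with a plain JavaScript object literal, which is how the shell builds the query document (the field values here are illustrative):

```javascript
// When the same key appears twice in an object literal, the last value silently wins
const badQuery = { state: "NY", state: "CA" };
console.log(JSON.stringify(badQuery)); // {"state":"CA"}

// The correct way to match either value is the $in operator
const goodQuery = { state: { $in: ["NY", "CA"] } };
console.log(JSON.stringify(goodQuery)); // {"state":{"$in":["NY","CA"]}}
```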

Wednesday, May 1, 2013

ME Bank picks 'cheaper' Microsoft stack over Linux

The bank, which is one of Australia's smaller banks with 250,000 customers, does not have any bank branches. It is currently undertaking an AU$57 million technology transformation program.

ME Bank's enterprise architect Jem Richards said that after the bank examined the costs of the two platforms, including the cost of providing tech support, it ultimately decided that the Windows Server 2012 and SQL Server 2012 option would be cheaper.

"Although the alternative Linux-based platform is essentially free to deploy, based on our past experience, we knew that it would cost more to support than Windows. This made the overall costs of the two operating systems approximately the same," he said in a statement.

"In addition, Microsoft and Windows community specialists were readily available to help us configure Temenos T24, whereas finding the relevant skills for the alternative platform with Temenos was proving to be a lot more difficult."

Richards said that the unnamed Linux platform would cost AU$100,000 more to upgrade, while the Microsoft licence would require no additional cost over five years.


UI Features in RoboMongo

I posted about the RoboMongo tool around a month back. I thought of writing about two of its UI features.


IntelliSense

This tool provides IntelliSense, so users have the option of selecting a command rather than typing it.


Query Formatting

When large queries are typed, it will format the queries with different colors as shown below.


Saturday, April 27, 2013



Schema free

Schema free, document database. Forget about tables, rows, mappings or complicated data-layers. RavenDB is a document-oriented database you can just dump all your objects into. Queries are amazingly fast and flexible.

It scales

Sharding, replication and multi-tenancy are supported out-of-the-box. Scaling out is as easy as it gets.

Transactional

ACID transactions are fully supported, even between different nodes. If you put data in, it is going to stay there. We care about your data and we keep it safe.

High performance

RavenDB is a very fast persistence layer for every type of data model. Skip creating complicated mappings or multi-layer DALs; just persist your entities. It Just Works, and it does the Right Thing.

Variety of APIs

RavenDB comes with a carefully designed client API for .NET, Silverlight, JavaScript and REST over HTTP.

Feature rich and extensible

Built with extensibility in mind, RavenDB can be easily extended via bundles. Many integration points ensure you can always squeeze more out of RavenDB. You aren't shackled to a One Size Fits None solution.

Carefully designed

Every bit of code was carefully considered. RavenDB was designed with best practices in mind, and it ensures that everything Just Works.

Get the details from

Monday, April 22, 2013

Reduce Dev Time and Improve Performance with MongoDB Training

Learning best practices for MongoDB development and administration can help reduce development time for your application, and increase reliability and performance. Get yourself and your team up to speed quickly with 10gen's comprehensive courses in person and online:

  • MongoDB for Developers: Learn best practices for document-based data modeling, queries and commands, map/reduce and basic administration.
  • MongoDB for Administrators: Diagnose performance issues, import and export data from MongoDB and establish the proper backup and restore routines.

Sunday, April 21, 2013

Free ebook: Introducing Microsoft SQL Server 2012


The final and complete version of Introducing Microsoft SQL Server 2012, by Ross Mistry and Stacia Misner, is now ready as a free download! You can download the PDF version of this title here.

1.   SQL Server 2012 Editions and Engine Enhancements

2.   High-Availability and Disaster-Recovery Enhancements

3.   Performance and Scalability

4.   Security Enhancements

5.   Programmability and Beyond-Relational Enhancements


6.   Integration Services

7.   Data Quality Services

8.   Master Data Services

9.   Analysis Services and PowerPivot

10.   Reporting Services

INCLUDE Index Column Order

You can create an index with included columns as shown below (the included column names here are illustrative):

USE [AdventureWorks2012]
GO
CREATE UNIQUE NONCLUSTERED INDEX [AK_Product_ProductNumber]
ON [Production].[Product] ([ProductNumber] ASC)
INCLUDE ([Name], [Color])
GO

So in this index, two columns are listed in the INCLUDE clause. The question is: does the column order matter?

Since the columns in the INCLUDE clause only need to be present to cover the columns in the SELECT list, and are not sorted, the order of those columns is irrelevant.

However, if you look at the UI for the INCLUDE index, you will see the following screen.


In the above screen, you can see there is an option to change the order using the Move Up and Move Down buttons.

So the question is: if the order of the columns is irrelevant, why is there an option to change it?


Like the series on SET options, I thought of doing another series on CREATE INDEX options in the coming weeks.

These are the available options,


Saturday, April 20, 2013

Why MongoDB doesn’t need Transactions and Constraints

Eyebrows may be raised when we say MongoDB doesn't support transactions and constraints. First, let us look at why we need transactions and constraints in relational databases.

Let us think about a simple blog system in relational databases.


So in a relational database, one logical transaction needs to be saved across three tables. There can therefore be a scenario where the post is saved but the tags are not. To avoid this, you can run all the statements in a transaction.

Also, there can be another scenario where a user inserts into the tags table without inserting into the posts table. Constraints are used to avoid such situations.

How do we save this data in MongoDB?


So in MongoDB, tags and comments are embedded in the post document, so the above scenarios do not exist.
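A sketch of such an embedded post document as a JavaScript object (the field names and values are illustrative):

```javascript
// One post document carries its own tags and comments,
// so a single insert saves everything at once
const post = {
  title: "NoSQL blog post",
  tags: ["mongodb", "transactions"],          // embedded; no separate tags table
  comments: [
    { author: "reader1", text: "Nice post" }  // embedded; no foreign key needed
  ]
};
console.log(post.tags.length); // 2
```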

Monday, April 15, 2013

How Does a DBA fit in ITIL process

DBAs find a lot of problems and need to implement processes to fill the gaps. The concept of making money is often lost between internal organizations because of a lack of communication and, especially, a lack of process. ITIL is recognized as the de facto standard for IT service management. It is a best-practices framework and has a strong relationship with the ISO 9000 quality framework. Today, big organizations like HP, EDS and IBM strictly follow the ITIL process and share it with their clients so the clients can rely on them. In the current job market, it is practically mandatory for support engineers to attend ITIL training and complete the Foundation exam. ITIL has two major groups: service delivery (what services IT must provide to the business) and service support (how IT ensures access and service).

View it now

Sunday, April 14, 2013

DROP and REMOVE Mongo documents

In MongoDB you have two options to remove all the documents in a collection.

db.users.drop()
db.users.remove({})

Though the output of the above two statements (removing all the documents in a collection) is the same, there are a couple of differences between the two options.

remove deletes documents one by one, whereas drop removes the whole collection at once. So if you are clearing a large collection, drop will be much faster than remove.

However, drop also removes the metadata related to the collection; for example, it removes the indexes associated with the collection.

So if you want to remove all the documents in a large collection, the better option is to use drop and then recreate the indexes.
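A minimal shell sketch of that pattern (the collection and index definitions are illustrative; this assumes a running MongoDB instance):

```javascript
// Drop the whole collection: fast, but indexes and other metadata go with it
db.users.drop();

// Recreate the indexes you still need afterwards
db.users.ensureIndex({ email: 1 }, { unique: true });
```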

Saturday, April 6, 2013

Viewing Query Plan from Activity Monitor

There are lots of options for viewing a query plan. Activity Monitor has an option to display Recent Expensive Queries.


You can view the query plan by right-clicking the relevant query, as shown below.