Friday, 25 May 2018

6 PHASES IN DATA ANALYTICS LIFECYCLE



6-PHASES IN DATA ANALYTICS


Phase 1- Discovery

           In Phase 1, the team learns the business domain, including relevant history such as whether the organization or business unit has attempted similar projects in the past from which they can learn. The team assesses the resources available to support the project in terms of people, technology, time, and data. Important activities in this phase include framing the business problem as an analytics challenge that can be addressed in subsequent phases and formulating initial hypotheses (IHs) to test and begin learning the data. 

Phase 2- Data preparation 

              Phase 2 requires the presence of an analytic sandbox, in which the team can work with data and perform analytics for the duration of the project. The team needs to execute extract, load, and transform (ELT) or extract, transform, and load (ETL) to get data into the sandbox. ELT and ETL are sometimes abbreviated together as ETLT. Data should be transformed in the ETLT process so the team can work with it and analyze it. In this phase, the team also needs to familiarize itself with the data thoroughly and take steps to condition the data.
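To make the difference between ETL and ELT concrete, here is a minimal sketch. It assumes an in-memory SQLite database standing in for the analytic sandbox; the table names, the sample rows, and the helper functions are made up for illustration and are not part of any particular tool.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system.
RAW_ROWS = [("  Alice ", "42.50"), ("BOB", "17"), ("carol", "n/a")]


def clean(name, amount):
    """Condition one record: normalize the name, parse the amount (None if invalid)."""
    try:
        value = float(amount)
    except ValueError:
        value = None
    return name.strip().lower(), value


def etl(conn):
    """ETL: transform in application code first, then load only the cleaned rows."""
    cleaned = [clean(name, amount) for name, amount in RAW_ROWS]
    cleaned = [row for row in cleaned if row[1] is not None]
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)


def elt(conn):
    """ELT: load the raw rows untouched, then transform inside the sandbox with SQL."""
    conn.execute("CREATE TABLE sales_raw (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", RAW_ROWS)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT LOWER(TRIM(name)) AS name, CAST(amount AS REAL) AS amount "
        "FROM sales_raw WHERE amount GLOB '[0-9]*'"
    )


sandbox = sqlite3.connect(":memory:")  # stands in for the analytic sandbox
etl(sandbox)
elt(sandbox)
etl_rows = sandbox.execute("SELECT name, amount FROM sales_etl ORDER BY name").fetchall()
elt_rows = sandbox.execute("SELECT name, amount FROM sales_elt ORDER BY name").fetchall()
```

Both routes end with the same conditioned table, but with ELT the untouched raw rows stay in the sandbox, so the team can go back to them if a transformation turns out to be wrong.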

Phase 3- Model planning

         Phase 3 is model planning, where the team determines the methods, techniques, and workflow it intends to follow for the subsequent model building phase. The team explores the data to learn about the relationships between variables and subsequently selects key variables and the most suitable models.
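One simple way to explore relationships between variables is to compute the correlation of each candidate variable with the target. The sketch below assumes a Pearson correlation and an arbitrary cut-off of 0.5; the variable names and numbers are invented for illustration, and a real project would choose its own measure and threshold.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(var_x * var_y)


# Hypothetical target and candidate variables.
sales = [10, 20, 30, 40, 50]
features = {
    "ad_spend": [1, 2, 3, 4, 5],
    "discount": [5, 4, 3, 2, 1],
    "store_id": [3, 1, 4, 1, 5],
}

THRESHOLD = 0.5  # assumed cut-off for a "key" variable
selected = [
    name for name, values in features.items()
    if abs(pearson(values, sales)) >= THRESHOLD
]
print(selected)
```

Here ad_spend and discount move strongly with sales (in opposite directions), while store_id does not, so only the first two would be carried forward as key variables.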

Phase 4- Model building

         In Phase 4, the team develops data sets for testing, training, and production purposes. In addition, in this phase the team builds and executes models based on the work done in the model planning phase. The team also considers whether its existing tools will suffice for running the models, or if it will need a more robust environment for executing models and workflows (for example, fast hardware and parallel processing, if applicable).
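Developing separate data sets for training and testing usually starts with a random split. The following is a minimal sketch using only the standard library; the function name, the 20% test share, and the fixed seed are assumptions chosen for the example.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(10))  # hypothetical data set of ten records
train, test = train_test_split(records)
print(len(train), len(test))  # prints: 8 2
```

The model is built on the training rows only, and the held-out test rows are used afterwards to check how well it generalizes.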

Phase 5- Communicate results

          In Phase 5, the team, in collaboration with major stakeholders, determines if the results of the project are a success or a failure based on the criteria developed in Phase 1. The team should identify key findings, quantify the business value, and develop a narrative to summarize and convey findings to stakeholders.

Phase 6- Operationalize

         In Phase 6, the team delivers final reports, briefings, code, and technical documents. In addition, the team may run a pilot project to implement the models in a production environment.

          These are the six phases of the data analytics lifecycle. Following them helps a team carry out a successful analysis of its data.

Sunday, 20 May 2018

RapidMiner-Studio Installation in UBUNTU 16.04

INSTALLATION OF RAPIDMINER-STUDIO IN UBUNTU 16.04

Installation of Java:
Step 1: Open the terminal by pressing Ctrl+Alt+T and then type
     sudo apt-get update




Step 2: After the update is finished, type
     sudo apt-get install software-properties-common






Step 3: Add the Java repository by typing
      sudo add-apt-repository ppa:webupd8team/java




Step 4: Update the package list once again.
     sudo apt-get update



Step 5: Install Java 8 by running the command
     sudo apt-get install oracle-java8-installer


 

Step 6: To check whether Java is installed on the system, type this command in the terminal
       java -version





Step 7: To check whether the Java path is set, type
        echo $JAVA_HOME
in the terminal.

If the variable is set, this prints the Java installation directory. Empty output means JAVA_HOME is not set.
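The two checks from Step 6 and Step 7 can also be done from a small script before launching RapidMiner Studio. This is only a sketch: java_status is a hypothetical helper name, and it reports what it finds without changing anything on the system.

```python
import os
import shutil


def java_status(environ=None):
    """Report where `java` is on PATH and what JAVA_HOME is set to (None if missing)."""
    environ = os.environ if environ is None else environ
    return {
        "java_on_path": shutil.which("java", path=environ.get("PATH")),
        "java_home": environ.get("JAVA_HOME") or None,
    }


status = java_status()
print("java executable:", status["java_on_path"] or "not found on PATH")
print("JAVA_HOME:", status["java_home"] or "not set")
```

If the first line reports that java was not found, repeat the installation steps above. If only JAVA_HOME is missing, Java is installed but the variable has not been exported in the current shell.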



Installation of RapidMiner Studio


Step 1: Download RapidMiner Studio from the website using the link below
https://my.rapidminer.com/nexus/account/index.html#downloads
and then select the Linux distribution.







Step 2: Extract the downloaded file, right-click the extracted folder, and select the Open in Terminal option.




Step 3: List the files in the directory with dir, then type ./RapidMiner-Studio.sh and click OK.




Step 4: RapidMiner Studio opens. Accept the license terms and click Next.



Step 5: Enter your details to sign up for an account for educational purposes.




Step 6: After you verify your email address, RapidMiner Studio opens.

 

(Note: Every time you want to use RapidMiner Studio, open the extracted folder in a terminal and type ./RapidMiner-Studio.sh.)

Saturday, 19 May 2018

Configure VPN service in UBUNTU 16.04


Starting VPN service:

Step 1: Open the terminal by pressing Ctrl+Alt+T.


Step 2: Install OpenVPN by typing sudo apt-get install openvpn in the terminal.


Step 3: Visit the VPNBook website at https://www.vpnbook.com/ . Select the OpenVPN tab and download any one server bundle by clicking its server link.


Step 4: Extract the compressed file downloaded from the VPNBook website, right-click the extracted folder, and choose Open in Terminal.


Step 5: Type dir in the terminal to list the available .ovpn files.


Step 6: Now type sudo openvpn --config followed by the name of any one of the available .ovpn files, then press Enter. (Root privileges are needed to create the VPN network interface.)


Step 7: OpenVPN prompts you for the auth username and auth password. Both are listed on the same https://www.vpnbook.com/ page.

Step 8: Press Enter and the VPN connection starts.

                  (Note: Please don't close the terminal, or your VPN connection will be terminated.)

Stopping VPN service:

Step 1: Stop the VPN service properly, otherwise you may run into problems with your Internet connection afterwards. To stop it correctly, type the following commands in the terminal.
                     1.sudo killall openvpn
                     2.sudo /etc/init.d/openvpn stop
                     3.sudo service network-manager restart



           I hope this helps you access blocked websites at your workplace or college.

Tuesday, 30 January 2018

5V's of Big Data


5V's of Big Data:

        For me, V is for Victory. To succeed at analyzing big data, we need to know the 5 V's of data science: Volume, Velocity, Variety, Veracity, and Value.
           Let us look at each of these 5 V's in turn.

Volume:

       Why do we need data science at all? Its main use is to handle large amounts of data. So the first question is: how much data are we dealing with? That is measured by Volume, which is why Volume comes first in the world of data science.

Velocity:

       We may start with a particular amount of data, but in data science new data keeps arriving all the time. To manage it, we need the term velocity. Velocity measures the speed of incoming data, which has to be brought up to date before data mining.

Variety:

       As I said earlier, data is raw fact. Nowadays we process many different kinds of data, such as audio, video, and text, so before mining or processing we have to know what kind of data we are dealing with.
        Based on its kind, data is classified into three categories.
Structured data         : Data with a fixed schema and labelled fields. Example: rows in a database table or spreadsheet.
Unstructured data     : Data with no predefined structure that a program cannot read field by field. Examples: audio, video, free text.
Semi-Structured data: Data with some organizing markers but no rigid schema. The best example is a log file.
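To see why a log file counts as semi-structured, here is a small sketch that turns one log line into a structured record. The log layout (date, time, level, then a free-text message) and the sample line are assumptions made up for this example; real log formats vary.

```python
import re

# Assumed layout for illustration: "<date> <time> <LEVEL> <free-text message>"
LOG_PATTERN = re.compile(r"(?P<date>\S+) (?P<time>\S+) (?P<level>\w+) (?P<message>.*)")


def parse_log_line(line):
    """Turn one semi-structured log line into a structured record (dict), or None."""
    match = LOG_PATTERN.fullmatch(line.strip())
    return match.groupdict() if match else None


record = parse_log_line("2018-01-30 10:15:02 ERROR disk quota exceeded for user 42")
print(record["level"], "-", record["message"])
```

The first three fields follow a predictable pattern, which is the "structure" part, while the message at the end is free text that needs further processing before it can be analyzed.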

Veracity:

     After collecting the data, we need to remember one more term: accuracy. We process data to get meaningful information out of it. If the data we process is wrong, the entire analysis is wasted. To avoid that, we need Veracity in data science. Although it is often not counted among the most important V's, we still have to maintain the accuracy of our data.

Value:

     We get nothing from analyzing useless data. Any data we use for analysis should have some value for the question we are asking. In short, the data must be worth something to be worth analyzing.

            

 

Monday, 29 January 2018

Data Science

 

Why Data Science?

     To see why data science matters, take the example of Amazon. You would like to buy a product on Amazon, but you feel it is too expensive. So you visit the page often and check the price of that product.

 

Why am I seeing Amazon ads in my Facebook news feed, and for exactly what I searched for on Amazon!! 😠😠😠😠

   Amazon uses a marketing strategy called promotional marketing.

 

Hey, what is promotional marketing???

     Promotional marketing is the use of any special offer intended to raise a customer's interest and influence a purchase, and to make a particular product or company stand out among its competitors. 

 

Still don't get it??? 😀😀😀😀

     They track how much time you spend looking at a product or an ad. If they decide to turn you into a customer, they use email marketing, promotional marketing, or ads like the ones we see on Facebook. Now let us take a short look at data science and the data scientist.

 

What is Data Science?

   Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured.
     Today, more and more organizations are opening up their doors to big data and unlocking its power—increasing the value of a data scientist who knows how to tease actionable insights out of gigabytes of data.

 

Who is a Data Scientist?

      A person employed to analyse and interpret complex digital data, such as the usage statistics of a website, especially in order to assist a business in its decision-making. "Silicon Valley technology companies are hiring data scientists to help them glean insights from the terabytes of data that they collect everyday" 

 

So can anyone become a Data Scientist 😊😊??

      My answer to that question is definitely yes, if you have the following skills:
  • Math & Statistics
  • Programming and Database
  • Domain Knowledge and soft skills
  • Communication and Visualization
   
    
     If you already have those skills, CONGRATULATIONS! Apply for the post of Data Analyst, Data Scientist, Data Steward, or Data Engineer in any company. We will see how these roles differ in upcoming posts.


